Convex optimization | Jun's Blog

Lagrange dual problem and conjugate function

An optimization problem has two components: the objective function \( f_{0} : R^{n} \to R \) and the constraints. The objective function and the constraints hold each other in check and balance at a saddle point of the Lagrangian, i.e. at the optimal point. The Lagrange dual problem approaches the same optimization problem from below: its optimal value is a lower bound on the optimal value of the original problem. The dual problem can be expressed in terms of the conjugate function

\[ f^{*} (y) = \sup_{x} \left( y^{T} x - f (x) \right) \]
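As a small illustration of the conjugate function, the sketch below approximates the supremum by a maximum over a grid for the one-dimensional example f(x) = x^2 / 2, whose conjugate is known in closed form to be y^2 / 2. The example function, the grid range, and the name `conjugate_on_grid` are assumptions made for this illustration only.

```python
def f(x):
    """Example objective f(x) = x^2 / 2 (chosen only for illustration)."""
    return 0.5 * x * x


def conjugate_on_grid(func, y, lo=-10.0, hi=10.0, n=20001):
    """Approximate f*(y) = sup_x (x*y - func(x)) by a maximum over a uniform grid.

    The result is only meaningful when the maximizer lies inside [lo, hi].
    """
    step = (hi - lo) / (n - 1)
    return max((lo + i * step) * y - func(lo + i * step) for i in range(n))


if __name__ == "__main__":
    for y in (-2.0, 0.0, 1.5):
        # Grid estimate next to the closed-form value y^2 / 2.
        print(y, conjugate_on_grid(f, y), 0.5 * y * y)
```

For each fixed x the expression x*y - f(x) is linear in y, so the conjugate is a pointwise maximum of linear functions and is convex even when f is not.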

Approximation

The purpose of approximation is to find the optimal point \( x^{*} \), i.e. the point where \( \nabla F (x^{*}) = 0 \). To move toward it we need a step (search direction) \( \Delta x \) and a step size \( t \). A Taylor approximation is a polynomial whose argument is the step and whose parameters are the derivatives at the start point. The first-order Taylor approximation adds one term to the value at the start point \( (x_{0}, F (x_{0})) \). The added term \( \nabla F (x)^{T} \Delta x \) consists of a parameter (the gradient \( \nabla F (x) \)) and an argument (the step \( \Delta x \)).
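The roles of the search direction and the step size can be sketched with a plain gradient descent loop. The choice of the negative gradient as the direction, the fixed step size, and the quadratic objective `F` are assumptions of this example, not something stated in the post.

```python
def F(x):
    """Hypothetical objective F(x1, x2) = (x1 - 1)^2 + 2 * (x2 + 3)^2."""
    return (x[0] - 1.0) ** 2 + 2.0 * (x[1] + 3.0) ** 2


def grad_F(x):
    """Gradient of F, derived by hand for the quadratic above."""
    return [2.0 * (x[0] - 1.0), 4.0 * (x[1] + 3.0)]


def gradient_descent(x0, t=0.1, tol=1e-8, max_iter=10000):
    """Repeat x <- x + t * dx with dx = -grad F(x) until the gradient is nearly zero."""
    x = list(x0)
    for _ in range(max_iter):
        g = grad_F(x)
        if sum(gi * gi for gi in g) ** 0.5 < tol:
            break  # gradient is (numerically) zero: x is close to x*
        dx = [-gi for gi in g]                        # search direction
        x = [xi + t * dxi for xi, dxi in zip(x, dx)]  # step of size t
    return x


if __name__ == "__main__":
    print(gradient_descent([0.0, 0.0]))  # approaches the minimizer (1, -3)
```

A fixed step size works here only because the example is a well-conditioned quadratic; a line search is the usual way to choose t in general.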

Taylor series

\[ f (x) = \sum_{k = 0}^{\infty} c_{k} x^{k} = c_{0} + c_{1} x + c_{2} x^{2} + \dots \]

Expanding around a point \( x \) gives an approximation that is a polynomial in the step \( h \); the derivatives of \( f \) at \( x \) are its parameters.

\[ f (x \pm h) = f (x) \pm h f' (x) + \frac{h^{2}}{2} f'' (x) \pm \frac{h^{3}}{6} f''' (x) + O (h^{4}) \]
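A quick numerical check of the remainder term: the sketch compares f(x + h) with its third-order Taylor polynomial for f = exp, chosen here only because every derivative of exp is exp itself. If the remainder is of order h^4, halving h should shrink the error by roughly 2^4 = 16.

```python
import math


def taylor3_exp(x, h):
    """Third-order Taylor polynomial of exp around x, evaluated at x + h."""
    fx = math.exp(x)  # f, f', f'' and f''' all equal exp(x)
    return fx + h * fx + h ** 2 / 2 * fx + h ** 3 / 6 * fx


def error(x, h):
    """Absolute difference between exp(x + h) and its third-order approximation."""
    return abs(math.exp(x + h) - taylor3_exp(x, h))


if __name__ == "__main__":
    x = 0.5
    for h in (0.2, 0.1, 0.05):
        print(h, error(x, h))
    # The ratio of errors for h and h / 2 should be close to 16.
    print("ratio:", error(x, 0.2) / error(x, 0.1))
```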

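Let’s think about \( \sin (x) \), expanded around 0.

\[ f (x) = \sin (x), \quad f (0) = 0, \qquad f' (x) = \cos (x), \quad f' (0) = 1, \qquad f'' (x) = - \sin (x), \quad f'' (0) = 0 \]

Thus,
\[\begin{align*} \sin(x) &= 0 + \frac{1}{1!} x + \frac{0}{2!} x^{2} + \dots \end{align*}\]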
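The derivatives of sin at 0 cycle through 0, 1, 0, -1, so only odd powers survive in the series. A minimal sketch of the partial sums follows; the function name `sin_series` and the default number of terms are assumptions of this example.

```python
import math


def sin_series(x, n_terms=8):
    """Partial sum of the series of sin around 0: sum_k (-1)^k x^(2k+1) / (2k+1)!."""
    return sum((-1) ** k * x ** (2 * k + 1) / math.factorial(2 * k + 1)
               for k in range(n_terms))


if __name__ == "__main__":
    for n in (1, 2, 3, 8):
        # More terms bring the partial sum closer to math.sin.
        print(n, sin_series(0.5, n), math.sin(0.5))
```

With a single term the approximation is just x, which is the familiar small-angle approximation.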

Duality

This is a summary of the Boyd convex optimization book. The steepest descent method is an algorithm for unconstrained convex optimization. The normalized steepest descent direction \( x_{nsd} \) is the vector in the unit ball of a chosen norm that extends farthest in the direction \( - \nabla f (x) \); in other words, it maximizes the inner product with \( - \nabla f (x) \) over that unit ball. The first-order Taylor approximation \( f (x + v) \approx f (x) + \nabla f (x)^{T} v \) therefore decreases the most when \( v = x_{nsd} \). The direction \( x_{nsd} \) is then unnormalized into \( x_{sd} \).
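To make the dependence on the norm concrete, the sketch below computes the normalized steepest descent direction for the Euclidean norm and for the l1 norm, using the closed forms from the steepest descent section of the book. The function names and the sample gradient are assumptions of this example, and a nonzero gradient is assumed throughout.

```python
import math


def nsd_l2(grad):
    """Normalized steepest descent direction for the Euclidean norm: -g / ||g||_2."""
    norm = math.sqrt(sum(g * g for g in grad))  # assumes grad is nonzero
    return [-g / norm for g in grad]


def nsd_l1(grad):
    """For the l1 norm: a signed unit coordinate vector along the largest |g_i|."""
    i = max(range(len(grad)), key=lambda k: abs(grad[k]))
    direction = [0.0] * len(grad)
    direction[i] = -math.copysign(1.0, grad[i])
    return direction


def decrease(grad, v):
    """Inner product of v with -grad: the first-order decrease along v."""
    return -sum(g * vi for g, vi in zip(grad, v))


if __name__ == "__main__":
    g = [3.0, -4.0, 1.0]  # sample gradient, chosen arbitrarily
    print(nsd_l2(g), decrease(g, nsd_l2(g)))  # decrease equals ||g||_2
    print(nsd_l1(g), decrease(g, nsd_l1(g)))  # decrease equals ||g||_inf
```

In both cases the achieved decrease is the dual norm of the gradient. The book's unnormalized direction scales the normalized one by that dual norm, so for the Euclidean norm it is simply the negative gradient.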

Strong convexity and implications

This is a summary of the Boyd convex optimization book. The strong convexity assumption is useful for analyzing iterative minimization algorithms such as gradient descent, steepest descent, and Newton’s method. The bounds on the smallest and largest eigenvalues of the Hessian, \( m I \preceq \nabla^{2} f (x) \preceq M I \), together with the norm of the gradient \( \| \nabla f (x) \|_{2} \), bound the optimal value \( p^{*} \). The condition number of the sublevel set satisfies \( \mathrm{cond} (C_{\alpha}) \leq \frac{M}{m} \), where \( C_{\alpha} \) is the \( \alpha \)-sublevel set.
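The bounds on the optimal value can be checked numerically on a simple case. The sketch assumes the book's inequalities f(x) - ||grad f(x)||^2 / (2m) <= p* <= f(x) - ||grad f(x)||^2 / (2M) and a hypothetical diagonal quadratic f(x) = 0.5 * sum(p_i * x_i^2), for which m and M are the smallest and largest p_i and p* = 0.

```python
def check_bounds(p_diag, x):
    """Return (lower, p_star, upper) for f(x) = 0.5 * sum(p_i * x_i^2), all p_i > 0.

    The Hessian is diag(p), so m = min(p_i) and M = max(p_i); the minimum value
    p* = 0 is attained at x = 0.
    """
    m, M = min(p_diag), max(p_diag)
    fx = 0.5 * sum(p * xi * xi for p, xi in zip(p_diag, x))
    grad_sq = sum((p * xi) ** 2 for p, xi in zip(p_diag, x))  # ||grad f(x)||_2^2
    return fx - grad_sq / (2 * m), 0.0, fx - grad_sq / (2 * M)


if __name__ == "__main__":
    lower, p_star, upper = check_bounds([1.0, 10.0], [1.0, 1.0])
    print(lower, p_star, upper)  # the optimal value lies between the two bounds
```

In this example M / m = 10, and the gap between the two bounds widens as that ratio grows.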

Convex set

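There is an analogy between a line segment and a convex set, and it helps in understanding convex sets. A line, a line segment, and a one-sided line (ray) are analogous to an affine set, a convex set, and a cone, respectively. A line through \( x_{1} \) and \( x_{2} \) is \( \{ y \mid y = \theta_{1} x_{1} + \theta_{2} x_{2}, \ \theta_{1} + \theta_{2} = 1 \} \) with \( \theta_{1}, \theta_{2} \in R \). It is a line segment if \( \theta_{1}, \theta_{2} \geq 0 \), and a one-sided line extending beyond an endpoint if either \( \theta_{1} < 0 \) or \( \theta_{2} < 0 \).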
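The effect of the coefficients can be sketched by evaluating the combination for a few values. The helper names `combine` and `on_segment` and the sample points are assumptions of this example.

```python
def combine(x1, x2, theta1):
    """Return y = theta1 * x1 + theta2 * x2 with theta2 = 1 - theta1."""
    theta2 = 1.0 - theta1
    return [theta1 * a + theta2 * b for a, b in zip(x1, x2)]


def on_segment(theta1):
    """True when both coefficients are nonnegative, i.e. y lies between x1 and x2."""
    return 0.0 <= theta1 <= 1.0


if __name__ == "__main__":
    x1, x2 = [0.0, 0.0], [2.0, 2.0]
    for theta1 in (-0.5, 0.0, 0.5, 1.0, 1.5):
        # Negative theta1 lands beyond x2; theta1 > 1 (negative theta2) lands beyond x1.
        print(theta1, combine(x1, x2, theta1), on_segment(theta1))
```

A set is convex when it contains every such point with `on_segment` true, and affine when it contains the point for every real value of theta1.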