# Acceleration convergence using a potential function

Posted on June 4, 2020
We cover the analysis of acceleration from this paper. The main motivation is to fill in some of the details that have been left out, most likely for clarity.

### 1.1 Rewriting in terms of Mirror descent

We can write NAG:

$$\begin{aligned}
y_{k+1} &= x_k - \tfrac{1}{\beta} \nabla f(x_k) \\
x_{k+1} &= y_{k+1} + \tfrac{k}{k+3} \left( y_{k+1} - y_k \right)
\end{aligned}$$

as

$$\begin{aligned}
x_{k+1} &= \tau_k z_k + (1 - \tau_k)\, y_k \\
y_{k+1} &= x_{k+1} - \tfrac{1}{\beta} \nabla f(x_{k+1}) \\
z_{k+1} &= z_k - \eta_k \nabla f(x_{k+1})
\end{aligned}$$

where $f$ is $\beta$-smooth, and $y_0 = z_0 = x_0$ (see e.g. this paper). This is the scheme we will be analysing: $y$ takes a gradient (primal) step, $z$ takes a mirror-descent (dual) step, and $x$ couples the two.
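The scheme above can be sketched in a few lines of NumPy. The concrete step sizes ($\lambda_k = k^2/4$, with $\eta_k$ and $\tau_k$ derived from it) are my own choices, anticipating the analysis below; the function and variable names are not from the post.

```python
import numpy as np

def accelerated_gd(grad, x0, beta, T):
    """Sketch of the coupled primal/dual scheme, assuming the
    step-size choice lambda_k = k^2/4 from the convergence analysis."""
    y = np.asarray(x0, dtype=float)
    z = y.copy()
    for k in range(T):
        lam_next = (k + 1) ** 2 / 4          # lambda_{k+1}
        dlam = (2 * k + 1) / 4               # lambda_{k+1} - lambda_k
        tau = dlam / lam_next                # coupling weight tau_k
        x = tau * z + (1 - tau) * y          # x_{k+1} = tau_k z_k + (1 - tau_k) y_k
        g = grad(x)
        y = x - g / beta                     # primal (gradient) step
        z = z - (dlam / beta) * g            # dual (mirror) step, eta_k = dlam / beta
    return y
```

On a smooth convex quadratic this converges to the minimiser at the accelerated rate derived below.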

### 1.2 Convergence

The general strategy for using a potential function goes like this:

- Define a potential function $\Phi_k$.
- Find a bound for the difference $\Phi_{k+1} - \Phi_k$.
- Get convergence by telescoping the difference, $\Phi_T = \Phi_0 + \sum_{k=0}^{T-1} \left( \Phi_{k+1} - \Phi_k \right)$.

##### Acceleration

For acceleration we start with the potential

$$\Phi_k = \lambda_k \left( f(y_k) - f(x^*) \right) + \frac{\beta}{2} \| z_k - x^* \|^2, \tag{1}$$

with weights $\lambda_k \ge 0$ to be chosen. This suggests that we will apply primal analysis to $y_k$ (the function-value term) and dual analysis to $z_k$ (the distance term).
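As a quick numerical sanity check that this potential is non-increasing along the iterates: the toy quadratic and the step sizes $\lambda_k = k^2/4$ below are my own choices, anticipating the analysis that follows.

```python
import numpy as np

# Track Phi_k = lambda_k*(f(y_k) - f(x*)) + (beta/2)*||z_k - x*||^2 along
# the accelerated iterates on a toy quadratic with minimiser x* = 0, f(x*) = 0.
A = np.diag([1.0, 10.0])
beta = 10.0                      # largest eigenvalue = smoothness constant
f = lambda v: 0.5 * v @ A @ v

y = np.array([1.0, 1.0])
z = y.copy()
phis = []
for k in range(50):
    lam = k ** 2 / 4
    phis.append(lam * f(y) + beta / 2 * (z @ z))   # record Phi_k
    lam_next = (k + 1) ** 2 / 4
    dlam = lam_next - lam
    x = (dlam / lam_next) * z + (1 - dlam / lam_next) * y
    g = A @ x
    y = x - g / beta             # primal step
    z = z - (dlam / beta) * g    # dual step

# The analysis below proves Phi_{k+1} <= Phi_k; check it numerically.
assert all(b <= a + 1e-9 for a, b in zip(phis, phis[1:]))
```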

We want to bound the difference $\Phi_{k+1} - \Phi_k$, which we can expand as

$$\Phi_{k+1} - \Phi_k = \lambda_{k+1} \left( f(y_{k+1}) - f(x^*) \right) - \lambda_k \left( f(y_k) - f(x^*) \right) + \frac{\beta}{2} \left( \| z_{k+1} - x^* \|^2 - \| z_k - x^* \|^2 \right).$$

Using $\|a - b\|^2 = \|a\|^2 - 2 \langle a, b \rangle + \|b\|^2$ (for the Euclidean norm) and the update for $z$,

$$z_{k+1} = z_k - \eta_k \nabla f(x_{k+1}),$$

we rewrite

$$\frac{\beta}{2} \left( \| z_{k+1} - x^* \|^2 - \| z_k - x^* \|^2 \right) = - \beta \eta_k \langle \nabla f(x_{k+1}), z_k - x^* \rangle + \frac{\beta \eta_k^2}{2} \| \nabla f(x_{k+1}) \|^2.$$

*Interpretation: we want $\| \nabla f(x_{k+1}) \|$ to be small for the dual (distance) term to improve the most.*

$\beta$-smoothness and the update rule for $y$,

$$y_{k+1} = x_{k+1} - \tfrac{1}{\beta} \nabla f(x_{k+1}),$$

gives us

$$f(y_{k+1}) \le f(x_{k+1}) - \frac{1}{2\beta} \| \nabla f(x_{k+1}) \|^2.$$

*Interpretation: we want $\| \nabla f(x_{k+1}) \|$ to be large for the primal (function-value) term to improve the most.*

This lets us bound

$$\Phi_{k+1} - \Phi_k \le \lambda_{k+1} \left( f(x_{k+1}) - f(x^*) \right) - \lambda_k \left( f(y_k) - f(x^*) \right) - \beta \eta_k \langle \nabla f(x_{k+1}), z_k - x^* \rangle + \left( \frac{\beta \eta_k^2}{2} - \frac{\lambda_{k+1}}{2\beta} \right) \| \nabla f(x_{k+1}) \|^2.$$

By picking the mirror step size to satisfy $\eta_k \le \sqrt{\lambda_{k+1}}/\beta$, so that the last term disappears, this reduces to

$$\Phi_{k+1} - \Phi_k \le \lambda_{k+1} \left( f(x_{k+1}) - f(x^*) \right) - \lambda_k \left( f(y_k) - f(x^*) \right) - \beta \eta_k \langle \nabla f(x_{k+1}), z_k - x^* \rangle.$$

By convexity, $f(x_{k+1}) - f(x^*) \le \langle \nabla f(x_{k+1}), x_{k+1} - x^* \rangle$ and $f(x_{k+1}) - f(y_k) \le \langle \nabla f(x_{k+1}), x_{k+1} - y_k \rangle$, so

$$\Phi_{k+1} - \Phi_k \le (\lambda_{k+1} - \lambda_k) \langle \nabla f(x_{k+1}), x_{k+1} - x^* \rangle + \lambda_k \langle \nabla f(x_{k+1}), x_{k+1} - y_k \rangle - \beta \eta_k \langle \nabla f(x_{k+1}), z_k - x^* \rangle.$$

Using that the update is $x_{k+1} = \tau_k z_k + (1 - \tau_k) y_k$, which gives $x_{k+1} - y_k = \frac{\tau_k}{1 - \tau_k} (z_k - x_{k+1})$, we choose the step sizes such that the terms cancel:

$$\beta \eta_k = \lambda_{k+1} - \lambda_k, \qquad \frac{\tau_k}{1 - \tau_k} = \frac{\lambda_{k+1} - \lambda_k}{\lambda_k}.$$

The right-hand sides of the inner products combine and cancel,

$$(\lambda_{k+1} - \lambda_k) \langle \nabla f(x_{k+1}), (x_{k+1} - x^*) + (z_k - x_{k+1}) \rangle - \beta \eta_k \langle \nabla f(x_{k+1}), z_k - x^* \rangle = 0,$$

so $\Phi_{k+1} - \Phi_k \le 0$.
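To see that admissible step sizes exist, one standard choice consistent with the two conditions (an assumption on my part, not spelled out here) is $\lambda_k = k^2/4$:

$$\beta \eta_k = \lambda_{k+1} - \lambda_k = \frac{2k+1}{4} \le \frac{k+1}{2} = \sqrt{\lambda_{k+1}},$$

so $\eta_k \le \sqrt{\lambda_{k+1}}/\beta$ holds, and the coupling weight becomes $\tau_k = \frac{\lambda_{k+1} - \lambda_k}{\lambda_{k+1}} = \frac{2k+1}{(k+1)^2}$.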

Now we have the bound $\Phi_{k+1} \le \Phi_k$, so $\Phi_T \le \Phi_0$. Choosing $\lambda_k = k^2/4$ (which satisfies the step-size conditions), we directly get from eq. 1

$$f(y_T) - f(x^*) \le \frac{\Phi_T}{\lambda_T} \le \frac{\Phi_0}{\lambda_T} = \frac{2 \beta \| x_0 - x^* \|^2}{T^2}.$$

So for the error to be less than $\epsilon$ we have to run for at least $T = \Omega\!\left( \sqrt{\beta / \epsilon}\, \| x_0 - x^* \| \right)$ iterations. This is a quadratic speed-up over vanilla gradient descent on smooth functions, which requires $\Omega\!\left( \beta \| x_0 - x^* \|^2 / \epsilon \right)$ steps.
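As an empirical check of the $O(1/T^2)$ guarantee: the toy quadratic, dimensions, and horizon below are my own choices.

```python
import numpy as np

# Verify f(y_T) - f(x*) <= 2*beta*||x0 - x*||^2 / T^2 on a toy quadratic
# (minimiser x* = 0), running the scheme with lambda_k = k^2/4.
A = np.diag(np.linspace(0.1, 100.0, 20))
beta = 100.0                           # largest eigenvalue = smoothness constant
f = lambda v: 0.5 * v @ A @ v

x0 = np.ones(20)
y = x0.copy()
z = x0.copy()
T = 400
for k in range(T):
    lam_next = (k + 1) ** 2 / 4
    dlam = (2 * k + 1) / 4
    x = (dlam / lam_next) * z + (1 - dlam / lam_next) * y
    g = A @ x
    y = x - g / beta                   # primal step
    z = z - (dlam / beta) * g          # dual step

bound = 2 * beta * (x0 @ x0) / T ** 2  # the proven rate; = 0.025 here
assert f(y) <= bound
```

In practice the final error is far below the bound; the bound is what the telescoped potential guarantees.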

The crucial observation is that we bound the dual (mirror) term in terms of the norm $\| z_k - x^* \|$ and the primal term in terms of the gradient $\| \nabla f(x_{k+1}) \|$. This technique is further exploited in the Mirr+Grad paper for general mirror maps in the aggressive update, $z_{k+1} = \arg\min_z \left\{ \eta_k \langle \nabla f(x_{k+1}), z \rangle + D_\phi(z, z_k) \right\}$, where $D_\phi$ is the Bregman divergence of the mirror map $\phi$.