# 13.3 Linear Optimization¶

## 13.3.1 Optimizer Selection¶

Two different types of optimizers are available for linear problems: the default is an interior-point method, and the alternative is the simplex method (primal or dual). The optimizer can be selected using the parameter `iparam.optimizer`.

**The Interior-point or the Simplex Optimizer?**

Given a linear optimization problem, which optimizer is the best: the simplex or the interior-point optimizer? It is impossible to provide a general answer to this question. However, the interior-point optimizer behaves more predictably: it tends to use between 20 and 100 iterations, almost independently of problem size, but cannot perform warm-start. On the other hand the simplex method can take advantage of an initial solution, but is less predictable from cold-start. The interior-point optimizer is used by default.

**The Primal or the Dual Simplex Variant?**

**MOSEK** provides both a primal and a dual simplex optimizer. Predicting which simplex optimizer is faster is impossible; however, in recent years the dual optimizer has seen several algorithmic and computational improvements which, in our experience, make it faster on average than the primal version. Still, much depends on the problem structure and size. Setting the `iparam.optimizer` parameter to `optimizertype.free_simplex` instructs **MOSEK** to choose one of the simplex variants automatically.

To summarize, if you want to know which optimizer is faster for a given problem type, it is best to try all the options.

## 13.3.2 The Interior-point Optimizer¶

The purpose of this section is to provide information about the algorithm employed in the **MOSEK** interior-point optimizer for linear problems and about its termination criteria.

### 13.3.2.1 The homogeneous primal-dual problem¶

In order to keep the discussion simple it is assumed that **MOSEK** solves linear optimization problems of standard form

\[
\begin{array}{lc}
\mbox{minimize} & c^T x \\
\mbox{subject to} & Ax = b, \\
 & x \geq 0.
\end{array}
\tag{1}
\]

This is in fact what happens inside **MOSEK**; for efficiency reasons **MOSEK** converts the problem to standard form before solving, then converts it back to the input form when reporting the solution.
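The conversion to standard form can be illustrated for inequality constraints by appending slack variables. The function below is a sketch under that assumption, not MOSEK's internal implementation:

```python
# Sketch (not MOSEK internals): convert  min c^T x  s.t.  A x <= b, x >= 0
# to standard form  min c'^T z  s.t.  A' z = b, z >= 0  by adding one
# nonnegative slack variable per inequality.

def to_standard_form(A, b, c):
    m = len(A)
    # A' = [A | I], c' = [c | 0]: slack w_i absorbs the gap b_i - (Ax)_i.
    A_std = [row + [1.0 if i == j else 0.0 for j in range(m)]
             for i, row in enumerate(A)]
    c_std = c + [0.0] * m
    return A_std, b, c_std

# Example: a single inequality x1 + 2*x2 <= 4.
A_std, b_std, c_std = to_standard_form([[1.0, 2.0]], [4.0], [-1.0, -1.0])

# The feasible point x = (2, 1) maps to z = (2, 1, 0) with A' z = b.
z = [2.0, 1.0, 0.0]
assert sum(a * zi for a, zi in zip(A_std[0], z)) == b_std[0]
```

A solution of the standard-form problem is mapped back by simply dropping the slack components.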

Since it is not known beforehand whether problem (1) has an optimal solution, is primal infeasible or is dual infeasible, the optimization algorithm must deal with all three situations. This is the reason why **MOSEK** solves the so-called homogeneous model

\[
\begin{array}{rcl}
Ax - b\tau & = & 0, \\
A^T y + s - c\tau & = & 0, \\
-c^T x + b^T y - \kappa & = & 0, \\
x, s, \tau, \kappa & \geq & 0,
\end{array}
\tag{2}
\]

where \(y\) and \(s\) correspond to the dual variables in (1), and \(\tau\) and \(\kappa\) are two additional scalar variables. Note that the homogeneous model (2) always has a solution since

\[
(x,y,s,\tau,\kappa) = (0,0,0,0,0)
\]

is a solution, although not a very interesting one. Any solution

\[
(x^*,y^*,s^*,\tau^*,\kappa^*)
\]

to the homogeneous model (2) satisfies

\[
(x^*)^T s^* + \tau^* \kappa^* = 0.
\]

Moreover, there is always a solution that has the property \(\tau^* + \kappa^* > 0\).

First, assume that \(\tau^* > 0\). It follows that

\[
\begin{array}{rcl}
A \frac{x^*}{\tau^*} & = & b, \\
A^T \frac{y^*}{\tau^*} + \frac{s^*}{\tau^*} & = & c, \\
-c^T \frac{x^*}{\tau^*} + b^T \frac{y^*}{\tau^*} & = & 0, \\
\frac{x^*}{\tau^*}, \frac{s^*}{\tau^*} & \geq & 0.
\end{array}
\]

This shows that \(\frac{x^*}{\tau^*}\) is a primal optimal solution and \((\frac{y^*}{\tau^*},\frac{s^*}{\tau^*})\) is a dual optimal solution; this is reported as the optimal interior-point solution since

\[
(x,y,s) = \left( \frac{x^*}{\tau^*}, \frac{y^*}{\tau^*}, \frac{s^*}{\tau^*} \right)
\]

is a primal-dual optimal solution (see Section 12.1 for the mathematical background on duality and optimality).
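To make the scaling step concrete, here is a small numerical check, using made-up data rather than actual MOSEK output, that dividing a homogeneous-model solution with \(\tau^* > 0\) by \(\tau^*\) yields a feasible primal-dual pair with zero duality gap:

```python
# Numerical check with made-up data (not MOSEK output): scaling a
# homogeneous-model solution with tau* > 0 by tau* gives a primal-dual
# optimal pair for  min c^T x  s.t.  Ax = b, x >= 0.
A = [[1.0, 1.0]]                 # one equality constraint, two variables
b = [1.0]
c = [1.0, 2.0]

# A homogeneous solution (x*, y*, s*, tau*, kappa*) with tau* = 2 > 0.
xh, yh, sh, tau, kappa = [2.0, 0.0], [2.0], [0.0, 2.0], 2.0, 0.0

def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

# It solves the homogeneous model:  Ax - b*tau = 0,
# A^T y + s - c*tau = 0,  -c^T x + b^T y - kappa = 0.
assert dot(A[0], xh) - b[0] * tau == 0.0
assert all(A[0][j] * yh[0] + sh[j] - c[j] * tau == 0.0 for j in range(2))
assert -dot(c, xh) + dot(b, yh) - kappa == 0.0

# Scale by tau*: the result is primal and dual feasible with zero gap.
x = [v / tau for v in xh]
y = [v / tau for v in yh]
s = [v / tau for v in sh]
assert dot(A[0], x) == b[0] and min(x) >= 0.0                 # primal feasible
assert all(A[0][j] * y[0] + s[j] == c[j] for j in range(2))   # dual feasible
gap = dot(c, x) - dot(b, y)
assert gap == 0.0                                             # zero duality gap
```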

On the other hand, if \(\kappa^* > 0\) then

\[
\begin{array}{rcl}
Ax^* & = & 0, \\
A^T y^* + s^* & = & 0, \\
-c^T x^* + b^T y^* & = & \kappa^*, \\
x^*, s^* & \geq & 0.
\end{array}
\]

This implies that at least one of

\[
c^T x^* < 0
\tag{3}
\]

or

\[
b^T y^* > 0
\tag{4}
\]

is satisfied. If (3) is satisfied then \(x^*\) is a certificate of dual infeasibility, whereas if (4) is satisfied then \(y^*\) is a certificate of primal infeasibility.
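The two certificate conditions can be checked numerically. The following sketch uses hand-picked data (assumptions for illustration, not MOSEK output) for the standard-form setting:

```python
# Hand-picked examples (illustrative, not MOSEK output) of infeasibility
# certificates for problems of the form  min c^T x  s.t.  Ax = b, x >= 0.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# Dual infeasibility certificate: x* >= 0 with A x* = 0 and c^T x* < 0.
# The ray x* = (1, 1) keeps Ax = 0 while driving the objective to -inf.
A = [[1.0, -1.0]]
c = [-1.0, -1.0]
x_cert = [1.0, 1.0]
assert all(dot(row, x_cert) == 0.0 for row in A)         # A x* = 0
assert dot(c, x_cert) < 0                                # c^T x* < 0

# Primal infeasibility certificate: y* with A^T y* <= 0 and b^T y* > 0.
# The constraints x1 = 1 and x1 = 2 are contradictory; y* = (-1, 1) proves it.
A2 = [[1.0], [1.0]]
b2 = [1.0, 2.0]
y_cert = [-1.0, 1.0]
At_y = [sum(A2[i][j] * y_cert[i] for i in range(2)) for j in range(1)]
assert all(v <= 0.0 for v in At_y)                       # A^T y* <= 0
assert dot(b2, y_cert) > 0                               # b^T y* > 0
```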

In summary, by computing an appropriate solution to the homogeneous model, all information required for a solution to the original problem is obtained. A solution to the homogeneous model can be computed using a primal-dual interior-point algorithm [And09].

### 13.3.2.2 Interior-point Termination Criterion¶

For efficiency reasons it is not practical to solve the homogeneous model exactly. Hence, an exact optimal solution or an exact infeasibility certificate cannot be computed and a reasonable termination criterion has to be employed.

In the \(k\)-th iteration of the interior-point algorithm a trial solution

\[
(x^k, y^k, s^k, \tau^k, \kappa^k)
\]

to the homogeneous model is generated, where

\[
x^k, s^k, \tau^k, \kappa^k > 0.
\]

**Optimal case**

Whenever the trial solution satisfies the criterion

\[
\begin{array}{rcl}
\left\| A \frac{x^k}{\tau^k} - b \right\|_{\infty} & \leq & \varepsilon_p \left( 1 + \left\| b \right\|_{\infty} \right), \\
\left\| A^T \frac{y^k}{\tau^k} + \frac{s^k}{\tau^k} - c \right\|_{\infty} & \leq & \varepsilon_d \left( 1 + \left\| c \right\|_{\infty} \right), \\
\min\left( \frac{(x^k)^T s^k}{(\tau^k)^2}, \left| \frac{c^T x^k}{\tau^k} - \frac{b^T y^k}{\tau^k} \right| \right) & \leq & \varepsilon_g \max\left( 1, \frac{\min\left( |c^T x^k|, |b^T y^k| \right)}{\tau^k} \right),
\end{array}
\tag{5}
\]

the interior-point optimizer is terminated and

\[
\left( \frac{x^k}{\tau^k}, \frac{y^k}{\tau^k}, \frac{s^k}{\tau^k} \right)
\]

is reported as the primal-dual optimal solution. The interpretation of (5) is that the optimizer is terminated if

- \(\frac{x^k}{\tau^k}\) is approximately primal feasible,
- \(\left\lbrace \frac{y^k}{\tau^k},\frac{s^k}{\tau^k} \right\rbrace\) is approximately dual feasible, and
- the duality gap is almost zero.
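A simplified check in the spirit of these three conditions can be sketched as follows; the scaling factors below are illustrative assumptions, and MOSEK's exact criterion may differ:

```python
# Illustrative sketch of a relative termination test for a trial point
# (x, y, s, tau) of the homogeneous model. The exact scaling MOSEK uses
# may differ from what is shown here.

def inf_norm(v):
    return max(abs(a) for a in v) if v else 0.0

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def looks_optimal(A, b, c, x, y, s, tau, eps_p=1e-8, eps_d=1e-8, eps_g=1e-8):
    n = len(c)
    xs = [xi / tau for xi in x]
    ys = [yi / tau for yi in y]
    ss = [si / tau for si in s]
    # Primal residual  || A x/tau - b ||_inf, relative to || b ||_inf.
    pres = inf_norm([dot(row, xs) - bi for row, bi in zip(A, b)])
    # Dual residual  || A^T y/tau + s/tau - c ||_inf, relative to || c ||_inf.
    aty = [sum(A[i][j] * ys[i] for i in range(len(A))) for j in range(n)]
    dres = inf_norm([aty[j] + ss[j] - c[j] for j in range(n)])
    # Relative duality gap.
    gap = abs(dot(c, xs) - dot(b, ys))
    return (pres <= eps_p * (1 + inf_norm(b))
            and dres <= eps_d * (1 + inf_norm(c))
            and gap <= eps_g * (1 + abs(dot(c, xs))))

# The exact optimum of  min x1 + 2*x2  s.t.  x1 + x2 = 1, x >= 0  passes.
assert looks_optimal([[1.0, 1.0]], [1.0], [1.0, 2.0],
                     [1.0, 0.0], [1.0], [0.0, 1.0], 1.0)
```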

**Dual infeasibility certificate**

On the other hand, if the trial solution satisfies

\[
-\varepsilon_i c^T x^k > \frac{\left\| c \right\|_{\infty}}{\max\left( 1, \left\| b \right\|_{\infty} \right)} \left\| A x^k \right\|_{\infty}
\]

then the problem is declared dual infeasible and \(x^k\) is reported as a certificate of dual infeasibility. The motivation for this stopping criterion is as follows: First assume that \(\left\| A x^k \right\|_\infty = 0\); then \(x^k\) is an exact certificate of dual infeasibility. Next assume that this is not the case, i.e.

\[
\left\| A x^k \right\|_{\infty} > 0,
\]

and define

\[
\bar{x} := \varepsilon_i \frac{\max\left( 1, \left\| b \right\|_{\infty} \right)}{\left\| A x^k \right\|_{\infty} \left\| c \right\|_{\infty}} x^k.
\]

It is easy to verify that

\[
\left\| A \bar{x} \right\|_{\infty} = \varepsilon_i \frac{\max\left( 1, \left\| b \right\|_{\infty} \right)}{\left\| c \right\|_{\infty}}
\quad \mbox{and} \quad
-c^T \bar{x} > 1,
\]

which shows \(\bar{x}\) is an approximate certificate of dual infeasibility, where \(\varepsilon_{i}\) controls the quality of the approximation. A smaller value means a better approximation.

**Primal infeasibility certificate**

Finally, if

\[
\varepsilon_i b^T y^k > \frac{\left\| b \right\|_{\infty}}{\max\left( 1, \left\| c \right\|_{\infty} \right)} \left\| A^T y^k + s^k \right\|_{\infty}
\]

then \(y^k\) is reported as a certificate of primal infeasibility.

### 13.3.2.3 Adjusting optimality criteria and near optimality¶

It is possible to adjust the tolerances \(\varepsilon_{p}\), \(\varepsilon_{d}\), \(\varepsilon_{g}\) and \(\varepsilon_{i}\) using parameters; see the table below for details.

| Tolerance | Parameter name |
|---|---|
| \(\varepsilon_{p}\) | `dparam.intpnt_tol_pfeas` |
| \(\varepsilon_{d}\) | `dparam.intpnt_tol_dfeas` |
| \(\varepsilon_{g}\) | `dparam.intpnt_tol_rel_gap` |
| \(\varepsilon_{i}\) | `dparam.intpnt_tol_infeas` |

The default values of the termination tolerances are chosen such that for a majority of problems appearing in practice it is not possible to achieve much better accuracy. Therefore, tightening the tolerances usually is not worthwhile. However, an inspection of (5) reveals that the quality of the solution depends on \(\left\| b \right\|_{\infty}\) and \(\left\| c \right\|_{\infty}\); the smaller the norms are, the better the solution accuracy.

The interior-point method as implemented by **MOSEK** will converge toward optimality and primal and dual feasibility at the same rate [And09]. This means that if the optimizer is stopped prematurely then it is very unlikely that either the primal or dual solution is feasible. Another consequence is that in most cases all the tolerances, \(\varepsilon_p\), \(\varepsilon_d\), \(\varepsilon_g\) and \(\varepsilon_i\), have to be relaxed together to achieve an effect.

In some cases the interior-point method terminates having found a solution not too far from meeting the optimality condition (5). A solution is defined as *near optimal* if scaling the termination tolerances \(\varepsilon_{p}\), \(\varepsilon_{d}\), \(\varepsilon_{g}\) and \(\varepsilon_{i}\) by the same factor \(\varepsilon_{n}\in [1.0,+\infty)\) makes condition (5) satisfied. A near optimal solution is therefore of lower quality but still potentially valuable. If, for instance, the solver stalls, i.e. it can make no more significant progress towards the optimal solution, a near optimal solution could be available and be good enough for the user. Near infeasibility certificates are defined similarly. The value of \(\varepsilon_n\) can be adjusted with the parameter `dparam.intpnt_co_tol_near_rel`.

The basis identification discussed in Section 13.3.2.4 requires an optimal solution to work well; hence basis identification should be turned off if the termination criterion is relaxed.

To conclude the discussion in this section, relaxing the termination criterion is usually not worthwhile.

### 13.3.2.4 Basis Identification¶

An interior-point optimizer does not return an optimal basic solution unless the problem has a unique primal and dual optimal solution. Therefore, the interior-point optimizer has an optional post-processing step that computes an optimal basic solution starting from the optimal interior-point solution. More information about the basis identification procedure may be found in [AY96]. In the following we provide an overall idea of the procedure.

There are some cases in which a basic solution could be more valuable:

- a basic solution is often more accurate than an interior-point solution,
- a basic solution can be used to warm-start the simplex algorithm in case of reoptimization,
- a basic solution is in general more sparse, i.e. more variables are fixed to zero. This is particularly appealing when solving continuous relaxations of mixed integer problems, as well as in all applications in which sparser solutions are preferred.

To illustrate how the basis identification routine works, we use the following trivial example:

\[
\begin{array}{lc}
\mbox{minimize} & x + y \\
\mbox{subject to} & x + y = 1, \\
 & x, y \geq 0.
\end{array}
\]

It is easy to see that all feasible solutions are also optimal. In particular, there are two basic solutions, namely

\[
(x_1^*, y_1^*) = (1, 0), \qquad (x_2^*, y_2^*) = (0, 1).
\]

The interior-point algorithm will actually converge to the center of the optimal set, i.e. to \((x^*,y^*)=(1/2,1/2)\) (to see this in **MOSEK** deactivate Presolve).
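Assuming the trivial example is the LP \(\min\ x + y\) subject to \(x + y = 1,\ x, y \geq 0\) (consistent with the optimal set described here), the difference between the interior-point solution and a basic solution is easy to verify numerically. This is a small sketch, not MOSEK's actual basis identification code:

```python
# Small check on the trivial example: the interior-point limit (1/2, 1/2)
# is feasible and optimal but not basic, while (1, 0) and (0, 1) are the
# two optimal basic solutions. (A sketch, not MOSEK's identification code.)

def feasible(x, y, tol=1e-12):
    return abs(x + y - 1.0) <= tol and x >= -tol and y >= -tol

candidates = [(1.0, 0.0), (0.0, 1.0), (0.5, 0.5)]
for x, y in candidates:
    assert feasible(x, y)
    assert abs((x + y) - 1.0) <= 1e-12   # the objective x + y equals 1 everywhere

# A basic solution for one equality constraint in two variables has at
# most one nonzero entry; (1/2, 1/2) therefore cannot be basic.
def is_basic(x, y):
    return sum(1 for v in (x, y) if v != 0.0) <= 1

assert is_basic(1.0, 0.0) and is_basic(0.0, 1.0)
assert not is_basic(0.5, 0.5)
```

The sparsity difference visible here (one zero entry in the basic solutions versus none in the center) is exactly why basic solutions appeal in the applications listed above.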

In practice, when the algorithm gets close to the optimal solution, it is possible to construct in polynomial time an initial basis for the simplex algorithm from the current interior-point solution. This basis is used to warm-start the simplex algorithm, which then provides the optimal basic solution. In most cases the constructed basis is optimal, or only very few simplex iterations are required to make it optimal, and hence the final *clean-up* phase is short. However, for some ill-conditioned problems the additional simplex clean-up phase may take a lot of time.

By default **MOSEK** performs basis identification. However, if a basic solution is not needed, the basis identification procedure can be turned off; the parameter `iparam.intpnt_basis` controls whether basis identification is performed.

The type of simplex algorithm to be used (primal/dual) can be tuned with the parameter `iparam.bi_clean_optimizer`, and the maximum number of iterations can be set with `iparam.bi_max_iterations`.

Finally, it should be mentioned that there is no guarantee on which basic solution will be returned.

### 13.3.2.5 The Interior-point Log¶

Below is a typical log output from the interior-point optimizer:

```
Optimizer - threads : 1
Optimizer - solved problem : the dual
Optimizer - Constraints : 2
Optimizer - Cones : 0
Optimizer - Scalar variables : 6 conic : 0
Optimizer - Semi-definite variables: 0 scalarized : 0
Factor - setup time : 0.00 dense det. time : 0.00
Factor - ML order time : 0.00 GP order time : 0.00
Factor - nonzeros before factor : 3 after factor : 3
Factor - dense dim. : 0 flops : 7.00e+001
ITE PFEAS DFEAS GFEAS PRSTATUS POBJ DOBJ MU TIME
0 1.0e+000 8.6e+000 6.1e+000 1.00e+000 0.000000000e+000 -2.208000000e+003 1.0e+000 0.00
1 1.1e+000 2.5e+000 1.6e-001 0.00e+000 -7.901380925e+003 -7.394611417e+003 2.5e+000 0.00
2 1.4e-001 3.4e-001 2.1e-002 8.36e-001 -8.113031650e+003 -8.055866001e+003 3.3e-001 0.00
3 2.4e-002 5.8e-002 3.6e-003 1.27e+000 -7.777530698e+003 -7.766471080e+003 5.7e-002 0.01
4 1.3e-004 3.2e-004 2.0e-005 1.08e+000 -7.668323435e+003 -7.668207177e+003 3.2e-004 0.01
5 1.3e-008 3.2e-008 2.0e-009 1.00e+000 -7.668000027e+003 -7.668000015e+003 3.2e-008 0.01
6 1.3e-012 3.2e-012 2.0e-013 1.00e+000 -7.667999994e+003 -7.667999994e+003 3.2e-012 0.01
```

The first line displays the number of threads used by the optimizer and the second line tells that the optimizer chose to solve the dual problem rather than the primal problem. The next line displays the problem dimensions as seen by the optimizer, and the `Factor...` lines show various statistics. This is followed by the iteration log.

Using the same notation as in Section 13.3.2 the columns of the iteration log have the following meaning:

`ITE`
: Iteration index \(k\).

`PFEAS`
: \(\left\| Ax^k - b \tau^k \right\|_{\infty}\). The numbers in this column should converge monotonically towards zero but may stall at a low level due to rounding errors.

`DFEAS`
: \(\left\| A^T y^{k} + s^k - c \tau^k \right\|_{\infty}\). The numbers in this column should converge monotonically towards zero but may stall at a low level due to rounding errors.

`GFEAS`
: \(| - c^T x^k + b^T y^{k} - \kappa^k|\). The numbers in this column should converge monotonically towards zero but may stall at a low level due to rounding errors.

`PRSTATUS`
: This number converges to \(1\) if the problem has an optimal solution, and to \(-1\) if it does not.

`POBJ`
: \(c^T x^k / \tau^k\). An estimate of the primal objective value.

`DOBJ`
: \(b^T y^k / \tau^k\). An estimate of the dual objective value.

`MU`
: \(\frac{ (x^k)^T s^k + \tau^k \kappa^k }{n+1}\). The numbers in this column should always converge to zero.

`TIME`
: Time spent since the optimization started.
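For post-processing, iteration lines like those in the log above can be parsed mechanically. The column names are taken from the header shown in the log; the parser itself is an illustrative sketch:

```python
# Sketch: parse interior-point iteration lines like those in the log above
# into dictionaries keyed by the header names.

HEADER = ["ITE", "PFEAS", "DFEAS", "GFEAS", "PRSTATUS", "POBJ", "DOBJ", "MU", "TIME"]

def parse_iteration(line):
    parts = line.split()
    if len(parts) != len(HEADER) or not parts[0].isdigit():
        return None                                  # not an iteration line
    row = dict(zip(HEADER, parts))
    return {k: (int(v) if k == "ITE" else float(v)) for k, v in row.items()}

it0 = parse_iteration("0  1.0e+000  8.6e+000  6.1e+000  1.00e+000  "
                      "0.000000000e+000 -2.208000000e+003  1.0e+000  0.00")
it6 = parse_iteration("6  1.3e-012  3.2e-012  2.0e-013  1.00e+000  "
                      "-7.667999994e+003 -7.667999994e+003  3.2e-012  0.01")
assert it0["ITE"] == 0 and it6["MU"] < it0["MU"]     # MU converges towards zero
```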

## 13.3.3 The Simplex Optimizer¶

An alternative to the interior-point optimizer is the simplex optimizer. The simplex optimizer uses a different method that allows exploiting an initial guess for the optimal solution to reduce the solution time. Depending on the problem it may be faster or slower to use an initial guess; see section 13.3.1 for a discussion. **MOSEK** provides both a primal and a dual variant of the simplex optimizer.

### 13.3.3.1 Simplex Termination Criterion¶

The simplex optimizer terminates when it finds an optimal basic solution or an infeasibility certificate. A basic solution is optimal when it is primal and dual feasible; see Section 12.1 for a definition of the primal and dual problem. Due to the fact that computations are performed in finite precision **MOSEK** allows violations of primal and dual feasibility within certain tolerances. The user can control the allowed primal and dual tolerances with the parameters `dparam.basis_tol_x` and `dparam.basis_tol_s`.

Setting the parameter `iparam.optimizer` to `optimizertype.free_simplex` instructs **MOSEK** to select automatically between the primal and the dual simplex optimizers. Hence, **MOSEK** tries to choose the best optimizer for the given problem and the available solution. The same parameter can also be used to force one of the variants.

### 13.3.3.2 Starting From an Existing Solution¶

When using the simplex optimizer it may be possible to reuse an existing solution and thereby reduce the solution time significantly. When a simplex optimizer starts from an existing solution it is said to perform a *warm-start*. If the user is solving a sequence of optimization problems by solving the problem, making modifications, and solving again, **MOSEK** will warm-start automatically.

By default **MOSEK** uses presolve when performing a warm-start. If the optimizer only needs very few iterations to find the optimal solution it may be better to turn off the presolve.

### 13.3.3.3 Numerical Difficulties in the Simplex Optimizers¶

Though **MOSEK** is designed to minimize numerical instability, completely avoiding it is impossible when working in finite precision. **MOSEK** treats a “numerically unexpected behavior” event inside the optimizer as a *set-back*. The user can define how many set-backs the optimizer accepts; if that number is exceeded, the optimization will be aborted. Set-backs are a way to escape long sequences where the optimizer tries to recover from an unstable situation.

Examples of set-backs are: repeated singularities when factorizing the basis matrix, repeated loss of feasibility, degeneracy problems (no progress in objective) and other events indicating numerical difficulties. If the simplex optimizer encounters a lot of set-backs the problem is usually badly scaled; in such a situation try to reformulate it into a better scaled problem. Then, if a lot of set-backs still occur, trying one or more of the following suggestions may be worthwhile:

- Raise the tolerances for allowed primal or dual feasibility: increase the values of `dparam.basis_tol_x` and `dparam.basis_tol_s`.
- Raise or lower the pivot tolerance: change the `dparam.simplex_abs_tol_piv` parameter.
- Switch optimizer: try another optimizer.
- Switch off crash: set both `iparam.sim_primal_crash` and `iparam.sim_dual_crash` to 0.
- Experiment with other pricing strategies: try different values for the pricing selection parameters.
- If you are using warm-starts, in rare cases switching off this feature may improve stability. This is controlled by the `iparam.sim_hotstart` parameter.
- Increase the maximum number of set-backs allowed, controlled by `iparam.sim_max_num_setbacks`.
- If the problem repeatedly becomes infeasible, try switching off the special degeneracy handling. See the parameter `iparam.sim_degen` for details.

### 13.3.3.4 The Simplex Log¶

Below is a typical log output from the simplex optimizer:

```
Optimizer - solved problem : the primal
Optimizer - Constraints : 667
Optimizer - Scalar variables : 1424 conic : 0
Optimizer - hotstart : no
ITER DEGITER(%) PFEAS DFEAS POBJ DOBJ TIME TOTTIME
0 0.00 1.43e+05 NA 6.5584140832e+03 NA 0.00 0.02
1000 1.10 0.00e+00 NA 1.4588289726e+04 NA 0.13 0.14
2000 0.75 0.00e+00 NA 7.3705564855e+03 NA 0.21 0.22
3000 0.67 0.00e+00 NA 6.0509727712e+03 NA 0.29 0.31
4000 0.52 0.00e+00 NA 5.5771203906e+03 NA 0.38 0.39
4533 0.49 0.00e+00 NA 5.5018458883e+03 NA 0.42 0.44
```

The first lines summarize the problem the optimizer is solving. This is followed by the iteration log, with the following meaning:

`ITER`
: Number of iterations.

`DEGITER(%)`
: Ratio of degenerate iterations (in percent).

`PFEAS`
: Primal feasibility measure reported by the simplex optimizer. The numbers should be 0 if the problem is primal feasible (when the primal variant is used).

`DFEAS`
: Dual feasibility measure reported by the simplex optimizer. The numbers should be 0 if the problem is dual feasible (when the dual variant is used).

`POBJ`
: An estimate of the primal objective value (when the primal variant is used).

`DOBJ`
: An estimate of the dual objective value (when the dual variant is used).

`TIME`
: Time spent since this instance of the simplex optimizer was invoked (in seconds).

`TOTTIME`
: Time spent since the optimization started (in seconds).