11.4 Multiprocessor Scheduling

In this case study we consider a simple scheduling problem in which a set of jobs must be assigned to a set of identical machines. We want to minimize the makespan of the overall processing, i.e. the latest machine termination time.

The main aims of this case study are

  • to show how to define a Integer Linear Programming model,

  • to take advantage of Fusion operators to compactly express sets of constraints,

  • to provide the solver with an incumbent integer feasible solution.

Mathematical formulation

We are given a set of jobs \(J\) with \(|J|=n\) to be assigned to a set \(M\) of identical machines with \(|M|=m\). Each job \(j\in J\) has a processing time \(T_j>0\) and can be assigned to any machine. Our aim is to find the job scheduling that minimizes the overall makespan, i.e. the maximum completion time among all machines.

Formally, we introduce a binary variable \(x_{ij}\) that takes value \(1\) if the job \(j\) is assigned to the machine \(i\), zero otherwise. The only constraint we need to set is the requirement that a job must be assigned to a single machine. The optimization model takes the following form:

(11.17)\[\begin{split}\begin{array}{llr} & \min \max_{i\in M} \sum_{j\in J} T_j x_{ij} &\\ \mbox{s.t.} & \sum_{i\in M} x_{ij} = 1, & j\in J,\\ & x_{ij} \in \{0,1\} & \forall i\in M, j \in J. \end{array}\end{split}\]

Model (11.17) can be easily transformed into an integer linear programming model as follows:

(11.18)\[\begin{split}\begin{array}{llr} & \min t &\\ \mbox{s.t.} & \sum_{i\in M} x_{ij} = 1, & j\in J,\\ & t \geq \sum_{j\in J} T_j x_{ij}, & i\in M,\\ & x_{ij} \in \{0,1\}, & \forall i\in M, j \in J. \end{array}\end{split}\]

The implementation of this model in Fusion is straightforward:

  Model::t M = new Model("Multi-processor scheduling"); auto _M = finally([&]() { M->dispose(); });

  Variable::t x = M->variable("x", new_array_ptr<int, 1>({m, n}), Domain::binary());
  Variable::t t = M->variable("t", 1, Domain::unbounded());

  M->constraint( Expr::sum(x, 0), Domain::equalsTo(1.) );
  M->constraint( Expr::sub( Var::repeat(t, m), Expr::mul(x, T) ) , Domain::greaterThan(0.) );

  M->objective( ObjectiveSense::Minimize, t );

Most of the code is self-explanatory. The only critical point is

  M->constraint( Expr::sub( Var::repeat(t, m), Expr::mul(x, T) ) , Domain::greaterThan(0.) );

that implements the set of constraints

\[t \geq \sum_{j\in J} T_j x_{ij}, \quad i\in M.\]

To fit in Fusion we restate the constraints as

\[t - \sum_{j\in J} T_j x_{ij} \geq 0, \quad i\in M,\]

which corresponds in matrix form to

(11.19)\[t \mathbf{1} - x T \geq 0.\]

The function Var.repeat creates a vector of length \(m\), as required for (11.19). The same result can be obtained via matrix multiplication, i.e. using Expr.mul, but in this particular case Var.repeat is faster as it only performs a logical operation.

Longest Processing Time first rule (LPT)

The multiprocessor scheduling is known to be an NP-complete problem (see [GJ79]). Nevertheless there are effective heuristics, with provable worst case bounds, that are able to provide a good integer solution quickly. In particular, we will use the so-called Longest Processing Time first rule (LPT, proposed in [Gra69]).

The informal algorithm sketch is the following:

  • while M is not empty do

    • let k be the machine with the smallest load so far,

    • let i be the job in M with the longest completion time,

    • assign job i to machine k,

    • update the load of machine k,

    • remove i from M.

This simple algorithm is a \(\frac{1}{3}(4-\frac{1}{m})\) approximation. So for \(m=1\) we get the optimal solution (indeed there is no choice with a single machine); for \(m\rightarrow\infty\) the approximation factor is no worse than \(4/3\) (again see [Gra69]).

A simple implementation is given below.

  //LPT heuristic
  auto schedule = std::shared_ptr<ndarray<double, 1> >(new ndarray<double, 1>(m, 0.));
  auto init = std::shared_ptr<ndarray<double, 1> >(new ndarray<double, 1>(n * m, 0.));

  for (int i = 0; i < n; i++)
  {
    auto pos = std::distance(schedule->begin(), std::min_element(schedule->begin(), schedule->end()));
    (*schedule)[pos] += (*T)[i];
    (*init)[pos * n + i] = 1;
  }

An efficient implementation of the LPT rule is beyond the scope of this section. The important part is that the scheduling produced by the LPT algorithm can be used as incumbent solution for the MOSEK mixed-integer linear programming solver. The availability of an integer feasible solution can significantly improve the performance of the solver.

To input the solution we only need to use the Variable.setLevel method, as shown below

  x->setLevel(init);

We can test the program with and without providing the initial LPT solution. Our random datasets consists of a mix of tasks with long and short processing times and we accept a solution at relative optimality tolerance \(0.01\). Some results are shown in the table below.

Table 11.1 Sample test results for the makespan problem.

\(n\)

\(m\)

long tasks

short tasks

No LPT

With LPT

1000

8

\(20\%\)

\(80\%\)

13.36s

1.23s

1000

8

\(80\%\)

\(20\%\)

1.35s

1.24s

100

12

\(20\%\)

\(80\%\)

16.37s

0.11s

100

12

\(80\%\)

\(20\%\)

16.62s

10.01s

20

20

\(0\%\)

\(100\%\)

10.38s

21.88s

We can see that depending on the structure and parameters of the problem it may pay off to provide an initial LPT solution. Therefore it is always recommended to test the mixed-integer solver with different settings to find the most efficient setup for a given problem.

Listing 11.10 Complete code for the LPT scheduling example. Click here to download.
#include <iostream>
#include <random>
#include <sstream>

#include "fusion.h"

using namespace mosek::fusion;
using namespace monty;


int main(int arc, char** argv)
{
  double lb = 1.0;          //Bounds for the length of a short task
  double ub = 5.;

  int     n = 30;           //Number of tasks
  int     m = 6;            //Number of processors

  double  sh = 0.8;         //The proportion of short tasks
  int     n_short = (int)(sh * n);
  int     n_long = n - n_short;

  auto gen = std::bind(std::uniform_real_distribution<double>(lb, ub), std::mt19937(0));

  auto T = std::shared_ptr<ndarray<double, 1> >(new ndarray<double, 1>(n));
  for (int i = 0; i < n_short; i++) (*T)[i] = gen();
  for (int i = n_short; i < n; i++) (*T)[i] = 20 * gen();
  std::sort(T->begin(), T->end(), std::greater<double>());

  Model::t M = new Model("Multi-processor scheduling"); auto _M = finally([&]() { M->dispose(); });

  Variable::t x = M->variable("x", new_array_ptr<int, 1>({m, n}), Domain::binary());
  Variable::t t = M->variable("t", 1, Domain::unbounded());

  M->constraint( Expr::sum(x, 0), Domain::equalsTo(1.) );
  M->constraint( Expr::sub( Var::repeat(t, m), Expr::mul(x, T) ) , Domain::greaterThan(0.) );

  M->objective( ObjectiveSense::Minimize, t );

  //LPT heuristic
  auto schedule = std::shared_ptr<ndarray<double, 1> >(new ndarray<double, 1>(m, 0.));
  auto init = std::shared_ptr<ndarray<double, 1> >(new ndarray<double, 1>(n * m, 0.));

  for (int i = 0; i < n; i++)
  {
    auto pos = std::distance(schedule->begin(), std::min_element(schedule->begin(), schedule->end()));
    (*schedule)[pos] += (*T)[i];
    (*init)[pos * n + i] = 1;
  }

  //Comment this line to switch off feeding in the initial LPT solution
  x->setLevel(init);

  M->setLogHandler([ = ](const std::string & msg) { std::cout << msg << std::flush; } );

  M->setSolverParam("mioTolRelGap", .01);
  M->solve();

  std::cout << "initial solution: \n";
  for (int i = 0; i < m; i++)
  {
    std::cout << "M " << i << " [";
    for (int y = 0; y < n; y++)
      std::cout << int( (*init)[i * n + y] ) << ", ";
    std::cout << "]\n";
  }

  std::cout << "MOSEK solution:\n";
  for (int i = 0; i < m; i++)
  {
    std::cout << "M " << i << " [";
    for (int y = 0; y < n; y++)
      std::cout << int((*(x->index(i, y)->level()))[0]) << ", ";
    std::cout << "]\n";
  }

  return 0;
}