parallelizing the dual revised simplex method - maths.ed.ac.uk · parallelizing the dual revised...

20
Parallelizing the dual revised simplex method Qi Huangfu 1 Julian Hall 2 1 FICO 2 School of Mathematics, University of Edinburgh Birmingham 9 September 2016

Upload: lamnhan

Post on 28-Aug-2019

225 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Parallelizing the dual revised simplex method - maths.ed.ac.uk · Parallelizing the dual revised simplex method Qi Huangfu1 Julian Hall2 1FICO 2School of Mathematics, University of

Parallelizing the dual revised simplex method

Qi Huangfu1 Julian Hall2

1FICO

2School of Mathematics, University of Edinburgh

Birmingham

9 September 2016

Page 2: Parallelizing the dual revised simplex method - maths.ed.ac.uk · Parallelizing the dual revised simplex method Qi Huangfu1 Julian Hall2 1FICO 2School of Mathematics, University of

Overview

Background

Two parallel schemes

Single iteration parallelismMultiple iteration parallelism

An open source parallel simplex solver

Qi Huangfu and Julian Hall Parallelizing the dual revised simplex method 2 / 20

Page 3: Parallelizing the dual revised simplex method - maths.ed.ac.uk · Parallelizing the dual revised simplex method Qi Huangfu1 Julian Hall2 1FICO 2School of Mathematics, University of

Linear programming (LP)

minimize cTx

subject to Ax = b x ≥ 0

Background

Fundamental model in optimaldecision-making

Solution techniques

◦ Simplex method (1947)◦ Interior point methods (1984)

Large problems have

◦ 103–108 variables◦ 103–108 constraints

Matrix A is (usually) sparse

Example

STAIR: 356 rows, 467 columns and 3856 nonzeros

Qi Huangfu and Julian Hall Parallelizing the dual revised simplex method 3 / 20

Page 4: Parallelizing the dual revised simplex method - maths.ed.ac.uk · Parallelizing the dual revised simplex method Qi Huangfu1 Julian Hall2 1FICO 2School of Mathematics, University of

Solving LP problems

minimize fP = cTx maximize fD = bTy

subject to Ax = b x ≥ 0 (P) subject to ATy + s = c s ≥ 0 (D)

Optimality conditions

For a partition B ∪N of the variable set with nonsingular basis matrix B in

BxB + NxN = b for (P) and

[BT

NT

]y +

[sB

sN

]=

[cB

cN

]for (D)

with xN = 0 and sB = 0Primal basic variables xB given by b = B−1b

Dual non-basic variables sN given by cTN = cT

N − cTB B

−1N

Partition is optimal if there is

Primal feasibility b ≥ 0Dual feasibility cN ≥ 0

Qi Huangfu and Julian Hall Parallelizing the dual revised simplex method 4 / 20

Page 5: Parallelizing the dual revised simplex method - maths.ed.ac.uk · Parallelizing the dual revised simplex method Qi Huangfu1 Julian Hall2 1FICO 2School of Mathematics, University of

Simplex algorithm: Each iteration

RHS

aq

aTp

cTN

apq

cq

bp

b

N

B

Dual algorithm: Assume cN ≥ 0 Seek b ≥ 0

Scan bi , i ∈ B, for a good candidate p to leave B CHUZR

Scan cj/apj , j ∈ N , for a good candidate q to leave N CHUZC

Update: Exchange p and q between B and NUpdate b := b − θpaq θp = bp/apq UPDATE-PRIMAL

Update cTN := cTN − θd aTp θd = cq/apq UPDATE-DUAL

Qi Huangfu and Julian Hall Parallelizing the dual revised simplex method 5 / 20

Page 6: Parallelizing the dual revised simplex method - maths.ed.ac.uk · Parallelizing the dual revised simplex method Qi Huangfu1 Julian Hall2 1FICO 2School of Mathematics, University of

Simplex method: Computation

Standard simplex method: Major computational component

Update of tableau:N := N − 1

apqaqa

Tp

where N = B−1N

Hopelessly inefficient for sparse LP problems

Prohibitively expensive for large LP problems

Revised simplex method: Major computational components

πTp = eT

p B−1 BTRAN aTp = πT

p N PRICE

aq = B−1aq FTRAN Invert B INVERT

Qi Huangfu and Julian Hall Parallelizing the dual revised simplex method 6 / 20

Page 7: Parallelizing the dual revised simplex method - maths.ed.ac.uk · Parallelizing the dual revised simplex method Qi Huangfu1 Julian Hall2 1FICO 2School of Mathematics, University of

Simplex method: Algorithmic enhancements

Row selection: Dual steepest edge (DSE)

Weight bi by wi : measure of ‖B−Te i‖2Requires additional FTRAN but can reduce iteration count significantly

Column selection: Bound-flipping ratio test (BFRT)

Minimizes the dual objective whilst remaining dual feasible

Dual variables may change sign if corresponding primal variables can flip bounds

Requires additional FTRAN but can reduce iteration count significantly

Qi Huangfu and Julian Hall Parallelizing the dual revised simplex method 7 / 20

Page 8: Parallelizing the dual revised simplex method - maths.ed.ac.uk · Parallelizing the dual revised simplex method Qi Huangfu1 Julian Hall2 1FICO 2School of Mathematics, University of

Exploiting parallelism: Background

Data parallel standard simplex method

Good parallel efficiency of N := N − 1

apqaqa

Tp was achieved

Only relevant for dense LP problems

Data parallel revised simplex method

Only immediate parallelism is in forming πTp N

When n� m significant speed-up was achieved Bixby and Martin (2000)

Task parallel revised simplex method

Overlap computational components for different iterationsWunderling (1996), H and McKinnon (1995-2005)

Modest speed-up was achieved on general sparse LP problems

Qi Huangfu and Julian Hall Parallelizing the dual revised simplex method 8 / 20

Page 9: Parallelizing the dual revised simplex method - maths.ed.ac.uk · Parallelizing the dual revised simplex method Qi Huangfu1 Julian Hall2 1FICO 2School of Mathematics, University of

Single iteration parallelism: Dual revised simplex method

Computational components appear sequential

Each has highly-tuned sparsity-exploiting serial implementation

Exploit “slack” in data dependencies

Qi Huangfu and Julian Hall Parallelizing the dual revised simplex method 9 / 20

Page 10: Parallelizing the dual revised simplex method - maths.ed.ac.uk · Parallelizing the dual revised simplex method Qi Huangfu1 Julian Hall2 1FICO 2School of Mathematics, University of

Single iteration parallelism: Computational scheme

Parallel PRICE to form aTp = πTp N

Other computational componentsserial

Overlap any independent calculations

Only four worthwhile threads unlessn� m so PRICE dominates

More than Bixby and Martin (2000)

Better than Forrest (2012)

Huangfu and H (2014)

Qi Huangfu and Julian Hall Parallelizing the dual revised simplex method 10 / 20

Page 11: Parallelizing the dual revised simplex method - maths.ed.ac.uk · Parallelizing the dual revised simplex method Qi Huangfu1 Julian Hall2 1FICO 2School of Mathematics, University of

Single iteration parallelism: clp vs hsol vs sip

1 2 3 4 50

20

40

60

80

100

clp hsol sip (8 cores)

Performance on spectrum of 30 significant LP test problems

sip on 8 cores is 1.15 times faster than hsol

hsol (sip on 8 cores) is 2.29 (2.64) times faster than clp

Qi Huangfu and Julian Hall Parallelizing the dual revised simplex method 11 / 20

Page 12: Parallelizing the dual revised simplex method - maths.ed.ac.uk · Parallelizing the dual revised simplex method Qi Huangfu1 Julian Hall2 1FICO 2School of Mathematics, University of

Multiple iteration parallelism

sip has too little work to be performed in parallel to get good speedup

Perform standard dual simplex minor iterations for rows in set P (|P| � m)

Suggested by Rosander (1975) but never implemented efficiently in serial

RHS

aTP

cTN

b

bPB

N

Task-parallel multiple BTRAN to form πP = B−1ePData-parallel PRICE to form aTp (as required)

Task-parallel multiple FTRAN for primal, dual and weight updates

Huangfu and H (2011–2014)Qi Huangfu and Julian Hall Parallelizing the dual revised simplex method 12 / 20

Page 13: Parallelizing the dual revised simplex method - maths.ed.ac.uk · Parallelizing the dual revised simplex method Qi Huangfu1 Julian Hall2 1FICO 2School of Mathematics, University of

Multiple iteration parallelism: cplex vs pami vs hsol

1 2 3 4 50

20

40

60

80

100

cplex pami (8 cores) hsol pami

pami is less efficient than hsol in serial

pami speedup more than compensates

pami performance approaching cplex

Qi Huangfu and Julian Hall Parallelizing the dual revised simplex method 13 / 20

Page 14: Parallelizing the dual revised simplex method - maths.ed.ac.uk · Parallelizing the dual revised simplex method Qi Huangfu1 Julian Hall2 1FICO 2School of Mathematics, University of

Multiple iteration parallelism: cplex vs xpress

1 1.25 1.5 1.75 20

20

40

60

80

100

cplex xpress xpress (8 cores)

pami ideas incorporated in FICO Xpress (Huangfu 2014)

Qi Huangfu and Julian Hall Parallelizing the dual revised simplex method 14 / 20

Page 15: Parallelizing the dual revised simplex method - maths.ed.ac.uk · Parallelizing the dual revised simplex method Qi Huangfu1 Julian Hall2 1FICO 2School of Mathematics, University of

hsol: An open source parallel simplex solver

hsol and its variants (sip and pami) are high performance codes

Useful open-source resource?

Reliable?Well-written?Better than clp?

Limitations

No presolveNo crashNo advanced basis start

Qi Huangfu and Julian Hall Parallelizing the dual revised simplex method 15 / 20

Page 16: Parallelizing the dual revised simplex method - maths.ed.ac.uk · Parallelizing the dual revised simplex method Qi Huangfu1 Julian Hall2 1FICO 2School of Mathematics, University of

hsol: Performance and reliability

Extended testing using 159 test problems

98 Netlib

16 Kennington

4 Industrial

41 Mittelmann

Exclude 7 which are “hard”

Reliability on 152 test problems

hsol, sip and pami fail on cycle

pami with 8 threads fails on stat96v1

Qi Huangfu and Julian Hall Parallelizing the dual revised simplex method 16 / 20

Page 17: Parallelizing the dual revised simplex method - maths.ed.ac.uk · Parallelizing the dual revised simplex method Qi Huangfu1 Julian Hall2 1FICO 2School of Mathematics, University of

hsol: Performance and reliability

Performance

Benchmark against clp (v1.16) and cplex (v12.5)

Dual simplex

No presolve

No crash

Ignore results for 82 LPs with minimum solution time below 0.1s

Qi Huangfu and Julian Hall Parallelizing the dual revised simplex method 17 / 20

Page 18: Parallelizing the dual revised simplex method - maths.ed.ac.uk · Parallelizing the dual revised simplex method Qi Huangfu1 Julian Hall2 1FICO 2School of Mathematics, University of

hsol: Performance and reliability

1 2 3 4 50

20

40

60

80

100

clp hsol sip8 pami8 cplex

Qi Huangfu and Julian Hall Parallelizing the dual revised simplex method 18 / 20

Page 19: Parallelizing the dual revised simplex method - maths.ed.ac.uk · Parallelizing the dual revised simplex method Qi Huangfu1 Julian Hall2 1FICO 2School of Mathematics, University of

hsol: Addressing limitations

Presolve

Prototype implementation of a full range of presolve techniquesGalabova (2016)

Crash

Standard crash procedures are simple to implement

Questionable value for dual simplex:dual steepest edge is expensive to initialise when B 6= I

Advanced basis start

Essential for hsol to be used in MIP and SLP solvers

Only complication is dual steepest edge

Qi Huangfu and Julian Hall Parallelizing the dual revised simplex method 19 / 20

Page 20: Parallelizing the dual revised simplex method - maths.ed.ac.uk · Parallelizing the dual revised simplex method Qi Huangfu1 Julian Hall2 1FICO 2School of Mathematics, University of

Conclusions

Two parallel schemes for general LP problems

Meaningful performance improvementHave led to publicised advances in a leading commercial solver

High performance open-source parallel simplex solver in preparation

Slides: http://www.maths.ed.ac.uk/hall/NLAO 16/

Paper: Q. Huangfu and J. A. J. HallParallelizing the dual revised simplex methodTechnical Report ERGO-14-011, School of Mathematics, University of Edinburgh, 2014Accepted for publication in Mathematical Programming Computation

Qi Huangfu and Julian Hall Parallelizing the dual revised simplex method 20 / 20