
Maximization Problems with Submodular Objective Functions

Moran Feldman

Publication List
• Improved Approximations for k-Exchange Systems. Moran Feldman, Joseph (Seffi) Naor, Roy Schwartz and Justin Ward, ESA 2011.
• A Unified Continuous Greedy Algorithm for Submodular Maximization. Moran Feldman, Joseph (Seffi) Naor and Roy Schwartz, FOCS 2011.
• A Tight Linear Time (1/2)-Approximation for Unconstrained Submodular Maximization. Niv Buchbinder, Moran Feldman, Joseph (Seffi) Naor and Roy Schwartz, FOCS 2012.


Outline
• Preliminaries
  – What is a submodular function?
• Unconstrained Submodular Maximization
• More Preliminaries
• Maximizing a Submodular Function over a Polytope Constraint


Set Functions

Definition
Given a ground set N, a set function f : 2^N → ℝ assigns a number to every subset of the ground set.

Intuition
• Consider a player participating in an auction on a set N of elements.
• The utility of the player from buying a subset N' ⊆ N of elements is given by a set function f.

Modular Functions
A set function is modular (linear) if there is a fixed utility v_u for every element u ∈ N, and the utility of a subset N' ⊆ N is given by:

$f(N') = \sum_{u \in N'} v_u$


Properties of Set Functions

Motivation
• Modularity is often too strong a property.
• Many weaker properties have been defined.
• A modular function has all these properties.

Normalization
• No utility is gained from the set of no elements.
• f(∅) = 0.

Monotonicity
• More elements cannot give less utility.
• For every two sets A ⊆ B ⊆ N, f(A) ≤ f(B).

Subadditivity
• Two sets of elements give no more utility together than separately.
• For every two sets A, B ⊆ N, f(A) + f(B) ≥ f(A ∪ B).
• Intuition: I want to hang a single painting in the living room, but I have no use for a second painting.


Properties of Set Functions (cont.)

Difficulty
• Subadditivity is often too weak a property.

Solution
• A stronger property called submodularity.
• The marginal contribution of an element u to a set A decreases when adding elements to A.

Notation
Given a set A and an element u, f_u(A) is the marginal contribution of u to A:

$f_u(A) = f(A \cup \{u\}) - f(A)$

Formal Definition (two equivalent forms)
• For sets A ⊆ B ⊆ N, and u ∉ B: f_u(A) ≥ f_u(B)
• For sets A, B ⊆ N: f(A) + f(B) ≥ f(A ∪ B) + f(A ∩ B)
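The two forms of the definition can be checked by brute force on a small ground set. The sketch below is only an illustration: the helper names (`subsets`, `marginal`, `is_submodular`, `has_diminishing_returns`) and the small coverage instance `COVER` are assumptions made for this example and do not come from the talk.

```python
from itertools import combinations

def subsets(ground):
    """Yield every subset of `ground` as a frozenset."""
    items = list(ground)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)

def marginal(f, u, A):
    """Marginal contribution f_u(A) = f(A ∪ {u}) - f(A)."""
    return f(A | {u}) - f(A)

def is_submodular(f, ground):
    """Check f(A) + f(B) >= f(A ∪ B) + f(A ∩ B) for all pairs A, B."""
    all_sets = list(subsets(ground))
    return all(f(A) + f(B) >= f(A | B) + f(A & B)
               for A in all_sets for B in all_sets)

def has_diminishing_returns(f, ground):
    """Check f_u(A) >= f_u(B) for all A ⊆ B and every u outside B."""
    all_sets = list(subsets(ground))
    return all(marginal(f, u, A) >= marginal(f, u, B)
               for B in all_sets for A in all_sets if A <= B
               for u in ground if u not in B)

# Hypothetical instance: each element of the ground set covers a few items.
COVER = {"a": {1, 2}, "b": {2, 3}, "c": {3, 4, 5}}

def coverage(S):
    """Number of items covered by the elements of S."""
    return len(set().union(*(COVER[u] for u in S)))

print(is_submodular(coverage, COVER), has_diminishing_returns(coverage, COVER))
```

Both checks enumerate all pairs of subsets, so they are only practical for very small ground sets. A function such as |S|² fails the first check, which makes it a handy negative example.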


Submodular Function - Example

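[Figure: an example set function. Values appearing in the figure: 0, 5, 6, 7, 10, 8, 11, 0, with one set marked "Too heavy"; marginal values shown: 5, 4, −8.]

The example function is:
• Normalized
• Nonmonotone
• Submodular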


Where can One Find Submodular Set Functions?

Question
Submodularity looks peculiar. Is it common in real life?

Answer
• Yes, submodular set functions pop up often in many settings.
• Submodular functions represent "economy of scale", which makes them useful in economics.
• Examples of submodular functions in combinatorial settings:

Ground Set            Submodular Function
Nodes of a graph      The number of edges leaving a set of nodes.
Collection of sets    The number of elements in the union of a sub-collection.
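The two combinatorial examples in the table can be written down directly. This is a minimal sketch; the function names and the instances `PATH` and `SETS` are made up for illustration, and the graph is taken to be undirected.

```python
def cut_value(edges, S):
    """Number of undirected edges with exactly one endpoint in S."""
    S = set(S)
    return sum(1 for u, v in edges if (u in S) != (v in S))

def coverage_value(collection, chosen):
    """Number of elements in the union of the chosen sets."""
    covered = set()
    for name in chosen:
        covered |= collection[name]
    return len(covered)

# Hypothetical instances: a path on four nodes and three small sets.
PATH = [(1, 2), (2, 3), (3, 4)]
SETS = {"a": {1, 2}, "b": {2, 3}, "c": {3, 4, 5}}

print(cut_value(PATH, {2}), cut_value(PATH, {1, 2, 3, 4}))
print(coverage_value(SETS, ["a", "b"]))
```

On this path the cut function is non-monotone: the single node 2 cuts two edges, while the full node set cuts none. The coverage function is monotone, since adding a set can never shrink the union.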


Unconstrained Submodular Maximization

Instance
A non-negative submodular function f : 2^N → ℝ⁺.

Objective
Find a subset S ⊆ N maximizing f(S).

Most Recent Previous Works
• 0.41 approximation via simulated annealing [Gharan and Vondrak 11]
• 0.42 approximation via a combination of simulated annealing and the Structural Continuous Greedy Algorithm [Feldman et al. 11]
• 0.5 hardness [Feige et al. 07]


The Players
• The algorithm we present manages two solutions (sets):
  – X: initially the empty set.
  – Y: initially the set of all elements (N).
• The elements are ordered in an arbitrary order: u_1, u_2, …, u_n.
• The algorithm has n iterations, one for each element. In iteration i the algorithm decides whether to add u_i to X or to remove it from Y.
• The following notation is defined using these players:
  – The contribution of adding u_i to X. Formally: a_i = f(X ∪ {u_i}) − f(X)
  – The contribution of removing u_i from Y. Formally: b_i = f(Y \ {u_i}) − f(Y)


More on the Algorithm

[Figure: the elements u_1, u_2, u_3, u_4, u_5, each placed in one of two rows labelled "In" and "Out".]

• Above we can see the initial configuration of the algorithm.
  – The set Y is red. Initially it contains all the elements.
  – The set X is green. Initially it contains no elements.
• Assume the algorithm now performs the following moves:
  – Add u_1 to X.
  – Remove u_2 from Y.
  – Add u_3 to X.
  – Add u_4 to X.
  – Remove u_5 from Y.
• Notice: when the algorithm terminates, X = Y is the output of the algorithm.


Notation
• X_i and Y_i: the values of the sets X and Y after i iterations.
• OPT_i: the optimal set such that X_i ⊆ OPT_i ⊆ Y_i.
• V: the value of the algorithm's output.

For i = 0:
OPT_0 = OPT
X_0 = ∅, Y_0 = N

For i = n:
OPT_n = X_n = Y_n
f(OPT_n) = f(X_n) = f(Y_n) = V

Observations
• As i increases, f(OPT_i) deteriorates from f(OPT) to V.
• V ≥ [f(X_n) − f(X_0) + f(Y_n) − f(Y_0)] / 2, i.e., V is at least half the increase in f(X) + f(Y).


Roadmap
• To analyze the algorithm we show that for some constant c and every iteration i:

  f(OPT_{i−1}) − f(OPT_i) ≤ c ∙ [f(X_i) − f(X_{i−1}) + f(Y_i) − f(Y_{i−1})]   (*)

  The left-hand side is the deterioration of OPT; the bracketed term is the improvement in f(X) + f(Y).
• Summing over all the iterations, we get:
  Deterioration in OPT ≤ c ∙ Improvement in f(X) + f(Y)
  f(OPT) − V ≤ 2c ∙ V
  V ≥ f(OPT) / [1 + 2c]
• The value of c depends on how the algorithm decides whether to add u_i to X or remove it from Y.
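The summation step can be spelled out as a pair of telescoping sums. The derivation below is a worked restatement, not taken from the slides; it uses only inequality (*), the identities f(OPT_n) = f(X_n) = f(Y_n) = V, and the non-negativity of f (so that f(X_0), f(Y_0) ≥ 0).

```latex
\begin{align*}
\sum_{i=1}^{n} \bigl[ f(OPT_{i-1}) - f(OPT_i) \bigr]
  &= f(OPT_0) - f(OPT_n) = f(OPT) - V \\
\sum_{i=1}^{n} \bigl[ f(X_i) - f(X_{i-1}) + f(Y_i) - f(Y_{i-1}) \bigr]
  &= f(X_n) - f(X_0) + f(Y_n) - f(Y_0) \le 2V \\
f(OPT) - V \le 2c \cdot V
  \quad &\Longrightarrow \quad V \ge \frac{f(OPT)}{1 + 2c}
\end{align*}
```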


Deterministic Rule (a non-standard greedy algorithm)
If a_i ≥ b_i
• then add u_i to X,
• else remove u_i from Y.

Lemma
Always, a_i + b_i ≥ 0.

Lemma
If the algorithm adds u_i to X (respectively, removes u_i from Y), and OPT_{i−1} ≠ OPT_i, then the deterioration in OPT is at most b_i (respectively, a_i).

Proof
Since OPT_{i−1} ≠ OPT_i, we know that u_i ∉ OPT_{i−1}. Hence:

f(OPT_{i−1}) − f(OPT_i) ≤ f(OPT_{i−1}) − f(OPT_{i−1} ∪ {u_i}) ≤ f(Y \ {u_i}) − f(Y) = b_i


Deterministic Rule - Analysis
• Assume a_i ≥ b_i, and the algorithm adds u_i to X (the other case is analogous).
• Clearly a_i ≥ 0 since a_i + b_i ≥ 0.
• If u_i ∈ OPT_{i−1}, then:
  – f(OPT_{i−1}) − f(OPT_i) = 0
  – f(X_i) − f(X_{i−1}) + f(Y_i) − f(Y_{i−1}) = a_i ≥ 0
  – Inequality (*) holds for every non-negative c.
• If u_i ∉ OPT_{i−1}, then:
  – f(OPT_{i−1}) − f(OPT_i) ≤ b_i
  – f(X_i) − f(X_{i−1}) + f(Y_i) − f(Y_{i−1}) = a_i ≥ b_i
  – Inequality (*) holds for c = 1.
• The approximation ratio is: 1 / [1 + 2c] = 1/3.
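The deterministic rule is short enough to sketch in full. The code below is an illustrative reading of the slides, not the authors' implementation; the names `deterministic_double_greedy`, `brute_force_opt` and the path-graph cut function `cut` are assumptions made for the example.

```python
from itertools import combinations

def deterministic_double_greedy(f, ground):
    """One pass over the elements, growing X from ∅ and shrinking Y from N."""
    X = frozenset()
    Y = frozenset(ground)
    for u in ground:
        a = f(X | {u}) - f(X)   # contribution of adding u to X
        b = f(Y - {u}) - f(Y)   # contribution of removing u from Y
        if a >= b:
            X = X | {u}
        else:
            Y = Y - {u}
    return X                    # X == Y once every element was processed

def brute_force_opt(f, ground):
    """Best value over all subsets (exponential time, for tiny examples only)."""
    items = list(ground)
    return max(f(frozenset(c))
               for r in range(len(items) + 1)
               for c in combinations(items, r))

# Hypothetical instance: the cut function of the path 1-2-3-4.
PATH = [(1, 2), (2, 3), (3, 4)]

def cut(S):
    return sum(1 for u, v in PATH if (u in S) != (v in S))

S = deterministic_double_greedy(cut, [1, 2, 3, 4])
print(sorted(S), cut(S), brute_force_opt(cut, [1, 2, 3, 4]))
```

The algorithm uses a constant number of value-oracle calls per element, which is where the linear running time in the paper title comes from. On this small instance the output happens to be optimal; the analysis only promises a third of the optimum.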


Random Rule
1. If b_i ≤ 0, then add u_i to X and quit.
2. If a_i ≤ 0, then remove u_i from Y and quit.
3. With probability a_i / (a_i + b_i) add u_i to X, otherwise (with probability b_i / (a_i + b_i)), remove u_i from Y.

Three cases: one for each line of the algorithm.

Case 1 (b_i ≤ 0)
• Clearly a_i ≥ 0 since a_i + b_i ≥ 0.
• Thus, f(X_i) − f(X_{i−1}) + f(Y_i) − f(Y_{i−1}) = a_i ≥ 0.
• If u_i ∈ OPT_{i−1}, then f(OPT_{i−1}) − f(OPT_i) = 0.
• If u_i ∉ OPT_{i−1}, then f(OPT_{i−1}) − f(OPT_i) ≤ b_i ≤ 0.
• Inequality (*) holds for every non-negative c.


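Random Rule - Analysis

Case 2 (a_i ≤ 0)
Analogous to the previous case.

Case 3 (a_i, b_i > 0)
The improvement (in expectation over the random choice) is:

$\mathbb{E}[f(X_i) - f(X_{i-1}) + f(Y_i) - f(Y_{i-1})] = \frac{a_i}{a_i + b_i} \cdot a_i + \frac{b_i}{a_i + b_i} \cdot b_i = \frac{a_i^2 + b_i^2}{a_i + b_i}$

Assume, w.l.o.g., u_i ∉ OPT_{i−1}; the deterioration in OPT (in expectation) is:

$\mathbb{E}[f(OPT_{i-1}) - f(OPT_i)] \le \frac{a_i}{a_i + b_i} \cdot b_i + \frac{b_i}{a_i + b_i} \cdot 0 = \frac{a_i b_i}{a_i + b_i}$

Inequality (*) holds for c = 0.5, since:

$2 a_i b_i \le a_i^2 + b_i^2$

The approximation ratio is: 1 / [1 + 2c] = 1/2.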
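The three lines of the random rule translate directly into code. This is a hedged sketch rather than the authors' implementation: the function name, the optional `rng` parameter and the path-graph cut function are assumptions introduced for the example.

```python
import random

def randomized_double_greedy(f, ground, rng=None):
    """Random rule: add u with probability a / (a + b) when both are positive."""
    rng = rng or random.Random()
    X = frozenset()
    Y = frozenset(ground)
    for u in ground:
        a = f(X | {u}) - f(X)   # contribution of adding u to X
        b = f(Y - {u}) - f(Y)   # contribution of removing u from Y
        if b <= 0:              # line 1: removing does not help, so add
            add = True
        elif a <= 0:            # line 2: adding does not help, so remove
            add = False
        else:                   # line 3: both positive, randomize
            add = rng.random() < a / (a + b)
        if add:
            X = X | {u}
        else:
            Y = Y - {u}
    return X

# Hypothetical instance: the cut function of the path 1-2-3-4 (optimum value 3).
PATH = [(1, 2), (2, 3), (3, 4)]

def cut(S):
    return sum(1 for u, v in PATH if (u in S) != (v in S))

values = [cut(randomized_double_greedy(cut, [1, 2, 3, 4], random.Random(seed)))
          for seed in range(10)]
print(values)
```

The 1/2 guarantee holds in expectation over the coin flips, so on general instances a single run may fall below half of the optimum. On this particular path only the first element is decided by a coin flip, and both outcomes lead to a cut of value 3.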


Polytope Constraints

• We abuse notation and identify a set S with its characteristic vector in [0, 1]^N.
  – Given a set S, we use S also to denote its characteristic vector.
  – Given a vector x ∈ {0, 1}^N, we denote by x also the set whose characteristic vector is x.
• Using this notation, we can define LP-like problems:

  [Example program: max f(x) subject to linear constraints on x_1, x_2, x_3 and x ∈ {0, 1}^N; the coefficients are not recoverable from the extracted text.]

• More generally, maximizing a submodular function subject to a polytope P constraint is the problem:

  $\max f(x) \quad \text{s.t.} \quad x \in P \cap \{0, 1\}^N$


Relaxation
• The last program:
  – requires integer solutions.
  – generalizes "integer programming".
  – is unlikely to have a reasonable approximation.
• We need to relax it.
  – We replace the constraint x ∈ {0, 1}^N with x ∈ [0, 1]^N.
  – The difficulty is to extend the objective function to fractional vectors.
• The multilinear extension F (a.k.a. extension by expectation) [Calinescu et al. 07].
  – Given a vector x ∈ [0, 1]^N, let R(x) denote a random set containing every element u ∈ N with probability x_u independently.
  – F(x) = E[f(R(x))].
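The definition F(x) = E[f(R(x))] can be evaluated exactly by enumeration on tiny ground sets, or estimated by sampling R(x). The sketch below shows both; the function names, the dictionary representation of x and the `samples` parameter are assumptions made for illustration.

```python
import random
from itertools import combinations

def multilinear_exact(f, x):
    """Exact F(x): sum of Pr[R(x) = S] * f(S) over all subsets S (small n only)."""
    items = list(x)
    total = 0.0
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            S = frozenset(combo)
            prob = 1.0
            for u in items:
                prob *= x[u] if u in S else 1.0 - x[u]
            total += prob * f(S)
    return total

def multilinear_sampled(f, x, samples=1000, rng=None):
    """Monte Carlo estimate of F(x): average f over independent draws of R(x)."""
    rng = rng or random.Random()
    total = 0.0
    for _ in range(samples):
        R = frozenset(u for u in x if rng.random() < x[u])
        total += f(R)
    return total / samples

# For the modular function f(S) = |S|, F(x) is simply the sum of the x_u.
x = {"a": 0.5, "b": 0.25}
print(multilinear_exact(len, x))
print(multilinear_sampled(len, x, samples=20000, rng=random.Random(0)))
```

On a 0/1 vector both functions return f of the corresponding set, which is what makes F an extension of f.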


Introducing the Problem

Maximizing a Submodular Function over a Polytope Constraint
• Asks to find (or approximate) the optimal solution for a relaxation of the form presented.
• Formally, for as large a constant c as possible:
  – Given the program

    $\max F(x) \quad \text{s.t.} \quad x \in P \cap [0, 1]^N$

    find a feasible x such that: F(x) ≥ c ∙ f(OPT).

Motivation
• For many polytopes, a fractional solution can be rounded without losing too much in the objective.
  – Matroid Polytopes – no loss [Calinescu et al. 11].
  – Constant number of knapsacks – (1 – ε) loss [Kulik et al. 09].
  – Unsplittable flow in trees – O(1) loss [Chekuri et al. 11].


The Continuous Greedy Algorithm

The Algorithm [Vondrak 08]
• Let δ > 0 be a small number.
1. Initialize: y(0) ← ∅ and t ← 0.
2. While t < 1 do:
3.    For every u ∈ N, let w_u = F(y(t) ∨ u) − F(y(t)).
4.    Find a solution x in P ∩ [0, 1]^N maximizing w ∙ x.
5.    y(t + δ) ← y(t) + δ ∙ x
6.    Set t ← t + δ
7. Return y(t)

Remark
• Calculation of w_u can be tricky for some functions.
• Assuming one can evaluate f, the value of w_u can be approximated via sampling.
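A small sketch of the loop above, under simplifying assumptions that are not part of the talk: the polytope is the cardinality polytope {x ∈ [0, 1]^N : Σ x_u ≤ k}, for which step 4 reduces to picking the k largest positive weights, and F is computed exactly by enumeration instead of by sampling. The names `multilinear`, `continuous_greedy` and the instance `COVER` are hypothetical.

```python
from itertools import combinations

def multilinear(f, y):
    """Exact F(y) = E[f(R(y))] by enumerating all subsets (small n only)."""
    items = list(y)
    total = 0.0
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            S = frozenset(combo)
            prob = 1.0
            for u in items:
                prob *= y[u] if u in S else 1.0 - y[u]
            total += prob * f(S)
    return total

def continuous_greedy(f, ground, k, steps=100):
    """Continuous greedy over {x in [0,1]^N : sum(x) <= k}, with delta = 1/steps."""
    delta = 1.0 / steps
    y = {u: 0.0 for u in ground}
    for _ in range(steps):                      # t = 0, delta, 2*delta, ... < 1
        base = multilinear(f, y)
        # w_u = F(y ∨ u) - F(y): raise coordinate u to 1 and compare.
        w = {u: multilinear(f, {**y, u: 1.0}) - base for u in ground}
        # Step 4 for this polytope: the k largest positive weights.
        best = sorted((u for u in ground if w[u] > 0),
                      key=lambda u: w[u], reverse=True)[:k]
        for u in best:
            y[u] = min(1.0, y[u] + delta)       # y <- y + delta * x
    return y

# Hypothetical monotone instance: a coverage function with disjoint sets.
COVER = {"a": {1, 2, 3}, "b": {4, 5}, "c": {6}}

def coverage(S):
    return len(set().union(*(COVER[u] for u in S)))

y = continuous_greedy(coverage, list(COVER), k=2)
print(y, multilinear(coverage, y))
```

Counting steps instead of accumulating t avoids floating-point drift in the loop condition. For a general solvable polytope, step 4 would be a linear program rather than a sort.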


The Continuous Greedy Algorithm - Demonstration

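[Figure: the path of the algorithm through the points y(0), y(0.01), y(0.02), y(0.03), y(0.04), with the step directions labelled with x and the indices 1 to 4.]

Observations
• The algorithm is somewhat like gradient descent.
• The algorithm moves only in positive directions because the extension F is guaranteed to be concave in such directions.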


The Continuous Greedy Algorithm - Results

Theorem [Calinescu et al. 11]
• Assuming,
  – f is a normalized monotone submodular function.
  – P is a solvable polytope.
• The continuous greedy algorithm gives 1 – 1/e – o(n^{−1}) approximation.
• By guessing the element of OPT with the maximal marginal contribution, one can get an optimal 1 – 1/e approximation.

Proof Idea
• Show that w ∙ x ≥ w ∙ OPT is large.
• Show that the improvement in each iteration is ≥ δ ∙ w ∙ x.
• Sum up the improvements over all the iterations.

The Continuous Greedy Algorithm - Rethinking

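Question
What is the difference between gradient descent and the continuous greedy?

Answer
• In gradient descent the direction chosen is the direction x maximizing x ∙ ∇F.
• In the continuous greedy the direction chosen is the direction x maximizing x ∙ w, where w_u is the marginal contribution of u.

[Figure: the values F(y(t) ∧ (N \ u)), F(y(t)) and F(y(t) ∨ u), with the quantities y_u(t) and 1 − y_u(t) marked; it contrasts the marginal contribution with the derivative (gradient).]

Marginal contribution: $w_u = F(y(t) \vee u) - F(y(t))$
Derivative (gradient): $\partial_u F(y(t)) = F(y(t) \vee u) - F(y(t) \wedge (N \setminus u)) = \frac{w_u}{1 - y_u(t)}$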

The Measured Continuous Greedy Algorithm

Idea
To guarantee a gain of δ ∙ w ∙ x, it is enough to increase y_u by: δ ∙ x_u ∙ (1 – y_u(t)).

The Algorithm
• Let δ > 0 be a small number.
1. Initialize: y(0) ← ∅ and t ← 0.
2. While t < T do:
3.    For every u ∈ N, let w_u = F(y(t) ∨ u) − F(y(t)).
4.    Find a solution x in P ∩ {0, 1}^N maximizing w ∙ x.
5.    For every u ∈ N, y_u(t + δ) ← y_u(t) + δ ∙ x_u ∙ (1 – y_u(t)).
6.    Set t ← t + δ
7. Return y(t)
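The only change from the continuous greedy is the damped update in step 5. The sketch below illustrates it under assumptions that are not part of the talk: the polytope is the (down-monotone) cardinality polytope with at most k elements, F is computed exactly by enumeration, and the non-monotone objective is the cut function of a small path. All names are hypothetical.

```python
import math
from itertools import combinations

def multilinear(f, y):
    """Exact F(y) = E[f(R(y))] by enumerating all subsets (small n only)."""
    items = list(y)
    total = 0.0
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            S = frozenset(combo)
            prob = 1.0
            for u in items:
                prob *= y[u] if u in S else 1.0 - y[u]
            total += prob * f(S)
    return total

def measured_continuous_greedy(f, ground, k, T=1.0, steps=100):
    """Measured continuous greedy over sets of size <= k, with delta = T/steps."""
    delta = T / steps
    y = {u: 0.0 for u in ground}
    for _ in range(steps):
        base = multilinear(f, y)
        w = {u: multilinear(f, {**y, u: 1.0}) - base for u in ground}
        # Integral direction x: the k largest positive weights.
        best = sorted((u for u in ground if w[u] > 0),
                      key=lambda u: w[u], reverse=True)[:k]
        for u in best:
            y[u] += delta * (1.0 - y[u])   # step 5: damped by (1 - y_u)
    return y

# Hypothetical non-monotone instance: the cut function of the path 1-2-3-4.
PATH = [(1, 2), (2, 3), (3, 4)]

def cut(S):
    return sum(1 for u, v in PATH if (u in S) != (v in S))

y = measured_continuous_greedy(cut, [1, 2, 3, 4], k=2)
print(y, multilinear(cut, y), 3 / math.e)
```

Because of the factor (1 − y_u), no coordinate can grow much beyond 1 − e^{−T}, which is the property used in the analysis that follows.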


The Measured Continuous Greedy Algorithm - Results

Theorem
• Assuming,
  – f is a non-negative submodular function.
  – P is a solvable down-monotone polytope.
• The approximation ratio of the measured continuous greedy algorithm with T = 1 is 1/e – o(n^{−1}).

Remarks
• For monotone f, we get an approximation ratio of 1 − e^{−T}. For T = 1, this is the ratio of the continuous greedy.
• The solution is no longer a convex combination of points of P:
  – The sum of the coefficients is at most T.
  – For T ≤ 1, the output is in P since P is down-monotone.


The Measured Continuous Greedy Algorithm - Analysis

Helper Lemma
Given x ∈ [0, 1]^N and λ ∈ [0, 1], let T_λ(x) be the set containing every element u ∈ N such that x_u ≥ λ.
• For every x ∈ [0, 1]^N:

  $F(x) \ge \int_0^1 f(T_\lambda(x)) \, d\lambda$

Proof
Omitted due to time constraints.

Lemma 1
There is a good direction, i.e., w ∙ x ≥ e^{−t} ∙ f(OPT) – F(y(t)).

Proof
• Notice that the increase in y_u(t) is at most δ ∙ (1 – y_u(t)).
• If δ is infinitely small, y_u(t) is upper bounded by the solution of the following differential equation:

  $\frac{dy}{dt} = 1 - y(t), \quad y(0) = 0 \quad \Longrightarrow \quad y(t) = 1 - e^{-t}$

• For small δ, y_u(t) is almost upper bounded by 1 – e^{−t}.


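The Measured Continuous Greedy Algorithm – Analysis (cont.)

• OPT itself is a feasible direction. Its weight is:

  $w \cdot OPT = \sum_{u \in OPT} w_u = \sum_{u \in OPT} \bigl[ F(y(t) \vee u) - F(y(t)) \bigr] \ge F(y(t) \vee OPT) - F(y(t))$

  $\ge \int_{1 - e^{-t}}^{1} f(T_\lambda(y(t) \vee OPT)) \, d\lambda - F(y(t)) \ge e^{-t} \cdot f(OPT) - F(y(t))$

Lemma 2
The improvement is related to w ∙ x, i.e., F(y(t + δ)) ≥ F(y(t)) + δ ∙ w ∙ x.

Proof
• For small δ:

  $F(y(t + \delta)) - F(y(t)) \approx \sum_{u \in N} \partial_u F(y(t)) \cdot \delta \cdot x_u \cdot (1 - y_u(t)) = \sum_{u \in N} \delta \cdot w_u \cdot x_u = \delta \cdot w \cdot x$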


Proof of the Theorem
• Combining the last two lemmata gives:

  F(y(t + δ)) − F(y(t)) ≥ δ ∙ [e^{−t} ∙ f(OPT) – F(y(t))]

• Let g(t) = F(y(t)). For small δ the last inequality becomes the differential equation:

  dg/dt = e^{−t} ∙ f(OPT) – g(t), g(0) = 0

• The solution of this equation is:

  g(t) = t ∙ e^{−t} ∙ f(OPT)

• Hence, at time t = 1, the value of the solution is at least g(1) = e^{−1} ∙ f(OPT).
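As a quick check that is not part of the original slides, the stated solution can be substituted back into the differential equation, and the stopping time that maximizes it can be read off from its derivative:

```latex
\begin{align*}
g(t) &= t \, e^{-t} f(OPT), \qquad g(0) = 0 \\
\frac{dg}{dt} &= e^{-t} f(OPT) - t \, e^{-t} f(OPT) = e^{-t} f(OPT) - g(t) \\
\frac{d}{dT}\bigl( T e^{-T} \bigr) &= (1 - T) \, e^{-T} = 0
  \quad \Longleftrightarrow \quad T = 1,
  \qquad g(1) = e^{-1} f(OPT)
\end{align*}
```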


Result for Monotone Functions

• For non-monotone functions, the approximation ratio is T ∙ e^{−T}, which is maximized for T = 1.
• For monotone functions, the approximation ratio is 1 – e^{−T}, which improves as T increases.
• In general, the solution produced for T > 1:
  – Is always within the cube [0, 1]^N.
  – Might be outside the polytope.
• However, for some polytopes, somewhat larger values of T can be used.


The Submodular Welfare Problem

Instance
• A set P of n players
• A set Q of m items
• A normalized monotone submodular utility function w_j : 2^Q → ℝ⁺ for each player.

Objective
• Let Q_j ⊆ Q denote the set of items the jth player gets.
• The utility of the jth player is w_j(Q_j).
• Distribute the items among the players, maximizing the sum of utilities.


The Polytope

Problem Representation
• Each item is represented by n elements in the ground set (each one corresponding to its assignment to a different player).
• The objective is the sum of the utility functions of the players, each applied to the items allocated to it.
• The polytope requires that every item be allocated to at most one player.

Analyzing a Single Constraint
• All constraints are of the form:

  $\sum_{i=1}^{n} x_i \le 1$

• Let t_i be the amount of time in which the algorithm increases x_i. Notice that x_i = 1 – e^{−t_i}.
• By definition, $\sum_{i=1}^{n} t_i \le T$, and it can be shown that for T ≤ −n ∙ ln(1 – n^{−1}) it always holds that:

  $\sum_{i=1}^{n} x_i = \sum_{i=1}^{n} \bigl( 1 - e^{-t_i} \bigr) \le 1$


Approximating the Submodular Welfare Problem

• Apply the measured continuous greedy till time: T = −n ∙ ln(1 – 1/n)
• By the previous analysis, the solution produced will be in the polytope.
• The expected value of the solution is at least (as a fraction of the optimum):

  $1 - e^{-T} = 1 - e^{n \ln(1 - 1/n)} = 1 - (1 - 1/n)^n$

• Round the solution using the natural rounding:
  – Assign each item to each player with probability equal to the corresponding variable.
• This approximation ratio is tight for every n. [Vondrak 06]
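The stopping time and the resulting ratio are easy to tabulate. The helper names below are made up for this sketch, and it assumes n ≥ 2, since the logarithm is undefined for n = 1.

```python
import math

def stopping_time(n):
    """T = -n * ln(1 - 1/n), the stopping time used for n players (n >= 2)."""
    return -n * math.log(1.0 - 1.0 / n)

def welfare_ratio(n):
    """1 - e^{-T} for that stopping time, which simplifies to 1 - (1 - 1/n)^n."""
    return 1.0 - (1.0 - 1.0 / n) ** n

for n in (2, 3, 10, 100):
    print(n, round(stopping_time(n), 4), round(welfare_ratio(n), 4))
```

For two players the ratio is 3/4. As n grows, T decreases towards 1 and the ratio decreases towards 1 − 1/e, so it never falls below the monotone continuous greedy bound.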


Open Problems
• The measured continuous greedy algorithm:
  – Provides tight approximation for monotone functions [Nemhauser and Wolsey 78].
  – Is this also the case for non-monotone functions?
  – The current approximation ratio of e^{−1} is a "natural" number.
• Uniform Matroids
  – The approximability depends on k/n.
    • For k/n = 0.5, we have 0.5 approximation.
    • Recently, a hardness of a bit less than 0.5 was shown for k/n approaching 0. [Gharan and Vondrak 11]
  – What is the correct approximation ratio as a function of k/n?
  – We know it is always possible to beat the e^{−1} ratio.
