combining adaptive and dynamic local search for satisfiability

24
Journal on Satisfiability, Boolean Modeling and Computation * (2007) 01-25 Combining Adaptive and Dynamic Local Search for Satisfiability Duc Nghia Pham [email protected] John Thornton [email protected] Charles Gretton [email protected] Abdul Sattar [email protected] SAFE Program, Queensland Research Lab, NICTA Ltd., Australia and IIIS, Griffith University, QLD, Australia Abstract In this paper we describe a stochastic local search (SLS) procedure for finding models of satisfiable propositional formulae. This new algorithm, gNovelty + , draws on the fea- tures of two other WalkSAT family algorithms: AdaptNovelty + and G 2 WSAT, while also successfully employing a hybrid clause weighting heuristic based on the features of two dynamic local search (DLS) algorithms: PAWS and (R)SAPS. gNovelty + was a Gold Medal winner in the random category of the 2007 SAT competi- tion. In this paper we present a detailed description of the algorithm and extend the SAT competition results via an empirical study of the effects of problem structure, parameter tuning and resolution preprocessors on the performance of gNovelty + . The study compares gNovelty + with three of the most representative WalkSAT-based solvers: AdaptG 2 WSAT0, G 2 WSAT and AdaptNovelty + , and two of the most representative DLS solvers: RSAPS and PAWS. Our new results augment the SAT competition results and show that gNovelty + is also highly competitive in the domain of solving structured satisfiability problems in comparison with other SLS techniques. Keywords: SAT-solver, local search, clause weighting, adaptive heuristic Submitted October 2007; revised January 2008; published 1. Introduction The satisfiability (SAT) problem is one of the best known and well-studied problems in com- puter science, with many practical applications in domains such as theorem proving, hard- ware verification and planning. The techniques used to solve SAT problems can be divided into two main areas: complete search techniques based on the well-known Davis-Putnam- Logemann-Loveland (DPLL) algorithm [3] and stochastic local search (SLS) techniques evolving out of Selman and Kautz’s 1992 GSAT algorithm [18]. As for SLS techniques, there have been two successful but distinct avenues of development: the WalkSAT family of algorithms [14] and the various dynamic local search (DLS) clause weighting approaches (e.g. [15]). Since the early 1990s, the state-of-the-art in SAT solving has moved forward from only being able to solve problems containing hundreds of variables to the routine solution of c 2007 Delft University of Technology and the authors.

Upload: independent

Post on 16-Nov-2023

0 views

Category:

Documents


0 download

TRANSCRIPT

Journal on Satisfiability, Boolean Modeling and Computation * (2007) 01-25

Combining Adaptive and Dynamic Local Search forSatisfiability

Duc Nghia Pham [email protected]

John Thornton [email protected]

Charles Gretton [email protected]

Abdul Sattar [email protected]

SAFE Program, Queensland Research Lab, NICTA Ltd., AustraliaandIIIS, Griffith University, QLD, Australia

Abstract

In this paper we describe a stochastic local search (SLS) procedure for finding modelsof satisfiable propositional formulae. This new algorithm, gNovelty+, draws on the fea-tures of two other WalkSAT family algorithms: AdaptNovelty+ and G2WSAT, while alsosuccessfully employing a hybrid clause weighting heuristic based on the features of twodynamic local search (DLS) algorithms: PAWS and (R)SAPS.

gNovelty+ was a Gold Medal winner in the random category of the 2007 SAT competi-tion. In this paper we present a detailed description of the algorithm and extend the SATcompetition results via an empirical study of the effects of problem structure, parametertuning and resolution preprocessors on the performance of gNovelty+. The study comparesgNovelty+ with three of the most representative WalkSAT-based solvers: AdaptG2WSAT0,G2WSAT and AdaptNovelty+, and two of the most representative DLS solvers: RSAPS andPAWS. Our new results augment the SAT competition results and show that gNovelty+

is also highly competitive in the domain of solving structured satisfiability problems incomparison with other SLS techniques.

Keywords: SAT-solver, local search, clause weighting, adaptive heuristic

Submitted October 2007; revised January 2008; published

1. Introduction

The satisfiability (SAT) problem is one of the best known and well-studied problems in com-puter science, with many practical applications in domains such as theorem proving, hard-ware verification and planning. The techniques used to solve SAT problems can be dividedinto two main areas: complete search techniques based on the well-known Davis-Putnam-Logemann-Loveland (DPLL) algorithm [3] and stochastic local search (SLS) techniquesevolving out of Selman and Kautz’s 1992 GSAT algorithm [18]. As for SLS techniques,there have been two successful but distinct avenues of development: the WalkSAT familyof algorithms [14] and the various dynamic local search (DLS) clause weighting approaches(e.g. [15]).

Since the early 1990s, the state-of-the-art in SAT solving has moved forward from onlybeing able to solve problems containing hundreds of variables to the routine solution of

c©2007 Delft University of Technology and the authors.

D.N. Pham et al.

problems with millions of variables. One of the key reasons for this success has been thekeen competition between researchers and the public availability of the source code of thebest techniques. Nowadays the SAT community organises regular competitions on large setsof benchmark problems and awards prizes to the best performing algorithms in differentproblem categories. In this paper we introduce the current 2007 SAT competition1. GoldMedal winner in the satisfiable random problem category: gNovelty+.

gNovelty+ evolved from a careful analysis of the SLS solvers that participated in the2005 SAT competition and was initially designed only to compete on random SAT problems.It draws on the strengths of two WalkSAT variants which respectively came first and secondin the random category of the 2005 SAT competition: R+AdaptNovelty+ [1] and G2WSAT[12]. In addition, gNovelty+ connects the two branches of SLS (WalkSAT and DLS) byeffectively exploiting a hybrid clause weighting heuristic based on ideas taken from the twomain approaches to clause weighting DLS algorithms: additive weighting (e.g. PAWS [21])and multiplicative weighting (e.g. (R)SAPS [11]).

In the remainder of the paper we describe in more detail the techniques used in G2WSAT,R+AdaptNovelty+, PAWS and (R)SAPS before discussing the strengths and weaknessesof these solvers based on the results from the 2005 SAT competition and our own study.We then provide a full explanation of the execution of gNovelty+ followed by an experi-mental evaluation on a range of random and structured problems. As the performance ofgNovelty+ on random problems is now a matter of public record,2. this evaluation examinesthe performance of gNovelty+ on a broad benchmark set of structured problems, testingthe effects of parameter tuning and resolution preprocessing in comparison with a range ofstate-of-the-art SLS solvers. Finally, we present our conclusions and outline some directionsfor future research.

2. Preliminaries

In this section, we briefly describe and summarise the key techniques used in four SLSsolvers that represent the state-of-the-art in the two main streams of SLS development: theWalkSAT family and clause weighting DLS solvers.

2.1 AdaptNovelty+

During the mid-1990s, Novelty [14] was considered to be one of the most competitive tech-niques in the WalkSAT family. Starting from a random truth assignment to the problemvariables, Novelty repeatedly changes single variable assignments (i.e. it makes a flip move)until a solution is found. The cost of flipping a variable (i.e. flipping the assignment of thatvariable) is defined as the number of unsatisfied clauses after x is flipped. In more detail,at each search step Novelty greedily selects the best variable x from a random unsatisfiedclause c such that flipping x leads to the minimal number of unsatisfied clauses. If thereis more than one variable with the same flip cost, the least recently flipped variable willbe selected. In addition, if x is the most recently flipped variable, then the second best

1. http://www.satcompetition.org2. More detailed results from the competition are available at http://www.satcompetition.org/

2

Combining Adaptive and Dynamic Local Search for Satisfiability

variable from clause c will be selected with a fixed noise probability p. This flip selectionprocedure is outlined in lines 10-13 of Algorithm 1.

Although Novelty generally achieves better results than other WalkSAT variants intro-duced during its time [14], due to its deterministic variable selection3. it may loop indefinitelyand fail to return a solution even where one exists [7, 12]. We refer the reader to [7] foran example instance that is satisfiable but for which Novelty is unable to find a solutionregardless of the noise parameter setting. Hoos [7] solved this problem by adding a ran-dom walk behaviour (lines 7-9 in Algorithm 1) to the Novelty procedure. The resultingNovelty+ algorithm randomly flips a variable from a randomised unsatisfied clause c witha walk probability wp and behaves exactly as Novelty otherwise.

Algorithm 1 AdaptNovelty+(F, wp=0.01)1: randomly generate an assignment A;2: while not timeout do3: if A satisfies F then4: return A as the solution;5: else6: randomly select an unsatisfied clause c;7: if within a walking probability wp then8: randomly select a variable x in c;9: else

10: greedily select the best variable x in c, breaking ties by selecting the least recently flippedpromising variable;

11: if x is the most recently flipped variable in c AND within a noise probability p then12: re-select x as the second best variable;13: end if14: end if15: update A with the flipped value of x;16: adaptively adjust the noise probability p;17: end if18: end while19: return ‘no solution found’;

It was shown that the performance of every WalkSAT variant (including Novelty andNovelty+) critically depends on the setting of the noise parameter p which, in turn, controlsthe level of greediness of the search [8, 14]. This means that without extensive empiricaltuning, the average case performance of a WalkSAT algorithm is quite poor. Hoos [8] ad-dressed this problem by proposing an adaptive version of WalkSAT that dynamically adjuststhe noise value based on the automatic detection of search stagnation. This AdaptNovelty+

version of Novelty+ (outlined in Algorithm 1) starts with p = 0 (i.e. the solver is completelygreedy in selecting the next move). If the search enters a stagnation stage (i.e. it encountersa local minimum where none of the considered moves yields fewer unsatisfied clauses thanthe current assignment), then the noise value is gradually increased to allow the selectionof non-greedy moves that will allow the search to overcome its stagnation. Once the localminimum is escaped, the noise value is reduced to again make the search more greedy. Hoos[8] demonstrated experimentally that this adaptive noise mechanism is effective both withNovelty+ and the other WalkSAT variants.

3. Novelty only selects the next move from the two best variables of a randomly selected unsatisfied clause.

3

D.N. Pham et al.

2.2 G2WSAT

More recently Li and Huang [12] proposed a new heuristic to solve the problem of deter-minism in Novelty (discussed in the previous section). Rather than using a Novelty+-typerandom walk [7], they opted for a solution based on the timestamping of variables to makethe selection process more diversified. The resulting Novelty++ heuristic (lines 9-14 in Al-gorithm 2) selects the least recently flipped variable from a randomly selected clause c forthe next move with a diversification probability dp, otherwise it performs as Novelty. Liand Huang [12] further improved Novelty++ by combining the greedy heuristic in GSAT[18] with a variant of tabu search [6] as follows: during the search, all variables that, ifflipped, do not strictly minimise the objective function are considered tabu (i.e. they can-not be selected for flipping during the greedy phase). Once a variable x is flipped, onlythose variables that become promising as a consequence of flipping x (i.e. that will strictlyimprove the objective function if flipped) will lose their tabu status and become available forgreedy variable selection. The resulting G2WSAT solver (outlined in Algorithm 2) alwaysselects the most promising non-tabu variable for the next move, if such variable is available.If there is more than one variable with the best score, G2WSAT selects the least recentlyflipped one, and if the search hits a local minimum, G2WSAT disregards the tabu list andperforms as Novelty++ until it escapes.

Algorithm 2 G2WSAT(F, dp, p)1: randomly generate an assignment A;2: while not timeout do3: if A satisfies F then4: return A as the solution;5: else6: if there exist promising variables then7: greedily select the most non-tabu promising variable x, breaking ties by selecting the least

recently flipped promising variable;8: else9: randomly select an unsatisfied clause c;

10: if within a diversification probability dp then11: select the least recently flipped variable x in c;12: else13: select a variable x in c according to the Novelty heuristic;14: end if15: end if16: update A with the flipped value of x;17: update the tabu list;18: end if19: end while20: return ‘no solution found’;

2.3 (R)SAPS

As opposed to the previously discussed SLS algorithms (that use a count of unsatisfiedclauses as the search objective function) DLS algorithms associate weights with clauses of agiven formula and use the sum of weights of unsatisfied clauses as the objective function forthe selection of the next move. Typically, clause weights are initialised to 1 and are dynam-

4

Combining Adaptive and Dynamic Local Search for Satisfiability

ically adjusted during the search to help in avoiding or escaping local minima. Dependingon how clause weights are updated, DLS solvers can be divided into two main categories:multiplicative weighting and additive weighting. Algorithm 3 sketches out the basics of theScaling and Probabilistic Smoothing (SAPS) algorithm [11], which is arguably the currentbest DLS solver in the multiplicative category. At each search step, SAPS greedily attemptsto flip the most promising variable that strictly improves the weighted objective function.If no promising variable exists, SAPS randomly selects a variable for the next move withwalk probability wp. Otherwise, with probability (1−wp), SAPS multiplies the weights ofall unsatisfied clauses by a factor α > 1 and consequently directs future search to traversean assignment that will satisfy currently unsatisfied clauses. After updating weights, withsmooth probability sp clause weights are probabilistically smoothed and reduced to the av-erage clause weight by a factor ρ. This smoothing phase helps the search forget the earlierweighting decisions, as these past effects are generally no longer helpful to escape futurelocal minima.

Algorithm 3 (R)SAPS(F, wp=0.01, sp, α=1.3, ρ=0.8)1: initialise the weight of each clause to 1;2: randomly generate an assignment A;3: while not timeout do4: if A satisfies F then5: return A as the solution;6: else7: if there exist promising variables then8: greedily select a promising variable x that occurs in an unsatisfied clauses, breaking ties by

randomly selecting;9: else if within a walk probability wp then

10: randomly select a variable x;11: end if12: if x has been selected then13: update A with the flipped value of x;14: else15: scale the weights of unsatisfied clauses by a factor α;16: with probability sp smooth the weights of all clauses by a factor ρ;17: end if18: if in reactive mode then19: adaptively adjust the smooth probability sp;20: end if21: end if22: end while23: return ‘no solution found’;

SAPS has four parameters and its performance critically depends on finding the rightsettings for these parameters. Hutter, Tompkins & Hoos [11] attempted to dynamicallyadjust the value of the smooth probability sp using the same approach as AdaptNovelty+

[7], while holding the other three parameters (wp, α and ρ) fixed. Their experimental studyshowed that the new RSAPS solver can achieve similar and sometimes better results incomparison to SAPS [11]. However, the other parameters in RSAPS, especially ρ, still needto be manually tuned in order to achieve optimal performance [9, 20].

5

D.N. Pham et al.

2.4 PAWS

Recently, Thornton et al. [21] were the first to closely investigate the performance differencebetween additive and multiplicative weighting DLS solvers. Part of this study included thedevelopment of the Pure Additive Weighting Scheme (PAWS), which is now one of thebest DLS algorithms in the additive weighting category. The basics of PAWS are outlinedin Algorithm 4. Instead of performing a random walk when no promising variable existsas SAPS does, PAWS randomly selects and flips a flat-move variable with a fixed flat-move probability fp = 0.15.4. Otherwise, with probability (1 − fp), the weights of allunsatisfied clauses are increased by 1. After a fixed number winc of weight increases, PAWSdeterministically reduces the weights of all weighted clauses by 1. The experimental resultsconducted in [21, 20] demonstrated the overall superiority of PAWS over SAPS for solvinglarge and difficult problems.

Algorithm 4 PAWS(F, fp=0.15, winc)1: initialise the weight of each clause to 1;2: randomly generate an assignment A;3: while not timeout do4: if A satisfies F then5: return A as the solution;6: else7: if there exist promising variables then8: greedily select a promising variable x, breaking ties by randomly selecting;9: else if there exist flat-move variables AND within a flat-move probability fp then

10: randomly select a flat-move variable x;11: end if12: if x has been selected then13: update A with the flipped value of x;14: else15: increase the weights of unsatisfied clauses by 1;16: if has updated weights for winc times then17: reduce the weights of all weighted clauses by 1;18: end if19: end if20: end if21: end while22: return ‘no solution found’;

3. gNovelty+: An ‘Overall’ Solver for Random Problems

3.1 Observations from the 2005 SAT Competition

The initial development of gNovelty+ focussed on preparing for the 2007 SAT competi-tion. This meant concentrating on the random problem category, where SLS solvers havetraditionally outperformed complete solvers. Consequently we paid considerable atten-tion to the best performing techniques from this category in the 2005 SAT competition:

4. A flat-move variable is one that, if flipped, will cause no change to the objective function.

6

Combining Adaptive and Dynamic Local Search for Satisfiability

R+AdaptNovelty+, G2WSAT and R+PAWS.5. Table 1 summarises the performance ofthese solvers on random SAT instances in the first phase of the 2005 SAT competition.Note that R+AdaptNovelty+ and R+PAWS are variants of AdaptNovelty+ and PAWS, re-spectively, where resolution is used to preprocess the input problem before the main solveris called.

Large Size Problems Medium Size Problems

Solvers 3-sat 5-sat 7-sat 3-sat 5-sat 7-sat

R+AdaptNovelty+ 22 32 19 35 35 35

G2WSAT 37 2 14 35 35 35

R+PAWS 33 1 12 35 35 35

Table 1. The number of random instances solved by R+AdaptNovelty+, G2WSAT and R+PAWSin the first phase of the 2005 SAT competition.

From Table 1, it is clear that R+AdaptNovelty+ was able to win the 2005 competitionbecause of its superior performance on the large 5-sat and 7-sat instances. As the res-olution preprocessor employed by R+AdaptNovelty+ (and also R+PAWS) only operateson clauses of length ≤ 3 in the input and only adds resolvent clauses of length ≤ 3 to theproblem, this competition winning performance must be credited to the AdaptNovelty+

heuristic rather than to the effects of resolution.The large 3-sat instance results in Table 1 clearly show that R+AdaptNovelty+ was

outperformed by G2WSAT and R+PAWS. As AdaptNovelty+ limits its variable selectionto a single randomly selected unsatisfied clause while G2WSAT and PAWS pick the mostpromising variable from all unsatisfied clauses, we conjectured that the superior perfor-mance of G2WSAT and R+PAWS on 3-sat was due to this more aggressive greediness.

However, when considering the SAT competition results we should bear in mind thateach solver was only run once on each instance. This means that the random effects ofdifferent starting positions could have distorted the underlying average performance of eachalgorithm. In order to verify our observations, we therefore conducted our own experimentsin which each solver was run 100 times per instance to minimise any starting position effects.We used the original AdaptNovelty+ and PAWS algorithms6. to eliminate any advantagethese solvers may have obtained from the resolution preprocessor on the 3-sat instances.Figure 1 plots the head-to-head comparisons of these solvers on 12 3-sat instances, 12 5-satinstances and 10 7-sat instances randomly selected from the large benchmark set used inthe 2005 competition. All experiments (including those presented in subsequent sections)were performed on cluster of 16 computers, each with a single AMD Opteron 252 2.6GHzprocessor with 2GB of RAM, and each run was timed out at 600 seconds. More detailedresults are reported in Table 2.

These results confirm our conjectures that a more greedy heuristic (e.g. G2WSAT orPAWS) performs better on random 3-sat instances while a less greedy approach such as

5. R+PAWS was ranked third in the first phase of the 2005 SAT competition. However, due to the compe-tition rule that authors can have only one solver competing in the final phase, R+PAWS was withdrawnfrom the final phase of the competition as it was submitted by the same authors of R+AdaptNovelty+.

6. The winc parameter of PAWS in this experiment was set to 10 as it was in the competition.

7

D.N. Pham et al.

10−2

100

102

10−2

100

102

3−SAT

AdaptNovelty+

G2 W

SA

T

10−2

100

102

10−2

100

102

3−SAT

AdaptNovelty+

PA

WS

10−2

100

102

10−2

100

102

3−SAT

G2WSAT

PA

WS

10−2

100

102

10−2

100

102

5−SAT

AdaptNovelty+

G2 W

SA

T

10−2

100

102

10−2

100

102

5−SAT

AdaptNovelty+

PA

WS

10−2

100

102

10−2

100

102

5−SAT

G2WSAT

PA

WS

10−2

100

102

10−2

100

102

7−SAT

AdaptNovelty+

G2 W

SA

T

10−2

100

102

10−2

100

102

7−SAT

AdaptNovelty+

PA

WS

10−2

100

102

10−2

100

102

7−SAT

G2WSAT

PA

WS

Figure 1. Head-up comparison between AdaptNovelty+, G2WSAT and PAWS on selected ran-dom 3-SAT, 5-SAT and 7-SAT instances from the 2005 SAT competition.

AdaptNovelty+ is better on random 5-sat and 7-sat instances. The results also show thatPAWS without resolution preprocessing outperforms G2WSAT on 3-sat instances. Thisresult is consistent with the findings in [1] where resolution preprocessing was shown to harmthe performance of local search solvers on random problems. The outstanding performanceof PAWS further suggests that clause weighting provides useful guidance for random 3-SATinstances.

3.2 The Design of gNovelty+

On the basis of the preceding observations, we developed a new overall solver for ran-dom problems, called gNovelty+. We based this solver on G2WSAT as it provides agood framework for combining the strengths of the three solvers. We first replaced theNovelty++ heuristic in G2WSAT with the AdaptNovelty+ heuristic to enhance performanceon the 5-sat and 7-sat instances. We then moved the random walk step inherited from

8

Combining Adaptive and Dynamic Local Search for Satisfiability

Algorithm 5 gNovelty+(F, wp=0.01, sp, p)1: initialise the weight of each clause to 1;2: randomly generate an assignment A;3: while not timeout do4: if A satisfies F then5: return A as the solution;6: else7: if within a walking probability wp then8: randomly select a variable x that appears in an unsatisfied clause;9: else if there exist promising variables then

10: greedily select a non-tabu promising variable x, breaking ties by selecting the least recentlyflipped promising variable;

11: else12: greedily select the most promising variable x from a random unsatisfied clause c, breaking ties

by selecting the least recently flipped promising variable;13: if x is the most recently flipped variable in c AND within a noise probability p then14: re-select x as the second most promising variable;15: end if16: update the weights of unsatisfied clauses;17: with probability sp smooth the weights of all weighted clauses;18: end if19: update A with the flipped value of x;20: update the tabu list;21: adaptively adjust the noise probability p;22: end if23: end while24: return ‘no solution found’;

AdaptNovelty+ to the top of the solver to provide a better balance between diversificationand greediness. Finally, we integrated the additive clause weighting scheme from PAWS intogNovelty+. We selected the additive scheme as it is computationally cheaper and providesbetter guidance than its multiplicative counterpart. As shown in Table 2, RSAPS (whichimplements multiplicative weighting) performs significantly worse on random instances.However, we replaced the deterministic weight smoothing phase from PAWS with a linearversion of the probabilistic weight smoothing phase from SAPS. This gave us more flexibil-ity in controlling the greediness of gNovelty+ which proved to be useful in our experimentalstudy.

The basics of gNovelty+ are sketched out in Algorithm 5. It starts with a full randomassignment of values to all variables of the input problem and initialises all clause weightsto one. At each search step, gNovelty+ performs a random walk with a walk probability wpfixed to 0.01.7. With probability (1− wp), gNovelty+ selects the most promising non-tabuvariable that is also the least recently flipped, based on a weighted objective function thataims to minimise the sum of weights of all unsatisfied clauses. If no such promising variableexists, the next variable is selected using a heuristic based on AdaptNovelty that againuses the weighted objective function. After an AdaptNovelty step, gNovelty+ increasesthe weights of all currently unsatisfied clauses by 1. At the same time, with a smoothing

7. Hoos [7] empirically showed that setting wp to 0.01 is enough to make an SLS solver become “proba-bilistically approximately complete”.

9

D.N. Pham et al.

probability sp, gNovelty+ will reduce the weight of all weighted clauses by 1.8. It is alsoworth noting that gNovelty+ initialises and updates its tabu list of promising variablesin the same manner as G2WSAT with the following exception: all variables that becomepromising during the weight updating phase are removed from the tabu list. In addition,gNovelty+ only uses the tabu list when doing greedy variable selection and disregards thelist when it performs a random walk or an AdaptNovelty step.

We manually tuned the parameter sp of gNovelty+ on the small random 3-sat, 5-satand 7-sat instances by varying its value from 0 to 1 in steps of 0.1. It should be notedthat setting sp = 0 will stop gNovelty+ from performing its probabilistic weight smoothingphase, while setting sp = 1 will effectively turn off all gNovelty+’s clause weighting phases.It turned out that sp = 0.4 is the best setting for gNovelty+ on the random 3-sat instances,while sp = 1 was the best setting for the random 5-sat and 7-sat instances. We thereforeran gNovelty+ with these two sp settings on the 34 random problems reported in Figure 1 toevaluate its performance against its three predecessors. The detailed performance of thesetwo versions are reported in Table 2. The previously reported results of AdaptNovelty+,G2WSAT and PAWS are also included for comparison purposes. To give an idea of therelative performance of a multiplicative weighting algorithm, we included the results forRSAPS in Table 2 as well.

Overall these random problem results show that the performance of gNovelty+ closelyreflects the relative performance of the predecessor algorithms on which it is based. Firstly,on the 3-sat instances where PAWS dominates AdaptNovelty+ and G2WSAT, it is also thecase that gNovelty+ with weight (sp = 0.4) dominates its counterpart gNovelty+ withoutweight (sp = 1.0). Conversely, on the 5-sat and 7-sat results, where AdaptNovelty+

strongly dominates G2WSAT and PAWS, the gNovelty+ version without weight (sp = 1.0)performs significantly better than gNovelty+ with weight (sp = 0.4).

In addition, if we compare the best version of gNovelty+ against the best versionof its predecessors (i.e. gNovelty+(sp = 0.4) versus PAWS on the 3-sat instances andgNovelty+(sp = 1.0) versus AdaptNovelty+ on the 5-sat and 7-sat instances), the resultsshow that gNovelty+ is at least as good and often better than its counterparts when theproblems become bigger and harder. More specifically, gNovelty+ dominates all other solverson the bigger 5-sat problems (k5-v600 and k5-v800 instances) and is dominant interchange-ably with PAWS on the larger 3-sat k3-v6000 and k3-v8000 instances and AdaptNovelty+

on the 7-sat k7-v140 and k7-v160 instances. The runtime distributions (RTDs) in Figure2 further confirm that gNovelty+ has achieved our goal of becoming the best overall solveracross the three random problem categories.

Given the above results, we entered gNovelty+ into the 2007 SAT competition and setit to automatically adjust the value of its parameter sp depending on the input problemsize. If gNovelty+ detects that the input formula is a random 3-sat instance, it will runwith a smooth probability of sp = 0.4. Otherwise, it will reset sp back to 1.0. On thisbasis, gNovelty+ was able to win the Gold Medal for the Random SAT category of thecompetition.9.

8. A clause is weighted if its weight is greater than 1.9. http://www.satcompetition.org

10

Combining Adaptive and Dynamic Local Search for Satisfiability

Instances G2WSAT AdaptNovelty+ PAWS RSAPS gNovelty+ gNovelty+

(sp=0.4) (sp=1.0)

#flips #secs #flips #secs #flips #secs #flips #secs #flips #secs #flips #secs

k3-v4000-1672 450769

0.5000.796

8,0279,097

5.5906.388

296F466F

0.430F0.571F 0% success 693

1,0120.8651.247

18,49825,357

25.70036.303

k3-v4000-1674 9821,641

1.0351.655

13,76915,525

9.78010.988

458F705F

0.615F0.853F 0% success 1,316

1,8161.5602.161

36,98856,273

52.19079.284

k3-v4000-1680 560714

0.5500.689

8,30111,314

5.8557.902

326F493F

0.455F0.586F 0% success 859

1,1521.0251.362

18,83531,849

26.10544.762

k3-v4000-1681 1,2463,253

1.2402.911

20,08524,202

14.17017.131

667F1,203F

0.980F1.452F 0% success 1,953

2,3882.4452.937 94% success

k3-v6000-1682 1,8135,088

2.3505.842

34,91235,309

27.85028.277

755F1,247F

1.300F1.792F 0% success 3,059

3,6614.7205.513 88% success

k3-v6000-1683 90% success 59% success 96% success 0% success 29,138F36,750F

41.770F51.945F 2% success

k3-v6000-1684 4,67531,960

5.87534.043

81,38594,014

65.53075.126

1,550F3,773F

2.740F4.860F 0% success 5,997

8,9668.380

12.423 48% success

k3-v6000-1686 44,94492,358

46.09092.432 98% success 99% success 0% success 13,221F

19,550F19.545F28.457F 7% success

k3-v8000-1693 5,81329,910

8.19536.031

111,619126,067

102.240115.953

1,645F5,025F

3.255F7.306F 0% success 10,029

14,93717.91526.749 18% success

k3-v8000-1694 62% success 24% success 93% success 0% success 62,294F77,401F

104.160F126.752F 0% success

k3-v8000-1698 64,57892,255

72.910101.316 99% success 2,391F

10,251F4.930F

14.429F 0% success 17,23926,729

30.22045.820 3% success

k3-v8000-1701 83% success 54% success 98% success 0% success 47,290F58,711F

77.040F95.475F 1% success

k5-v500-1533 26,92634,975

88.675114.784

8,59810,069

16.29519.023F 6% success 0% success 2% success 3,317F

4,949F13.115F19.543

k5-v500-1537 11,89912,922

40.92044.209

1,6082,883

3.005F5.406F 35% success 1% success 17% success 1,133F

1,506F4.2955.757

k5-v500-1540 87% success 19,04026,131

36.180F49.562F 0% success 0% success 2% success 9,673F

13,817F38.45554.948

k5-v500-1541 11,28512,064

38.92041.344

1,3261,883

2.5403.598F 62% success 1% success 27% success 652F

995F2.520F3.868

k5-v600-1542 59% success 25,06941,388

49.29581.377 1% success 0% success 0% success 9,813F

13,311F41.375F56.013F

k5-v600-1544 45% success 31,19043,091

61.12084.402 0% success 0% success 0% success 13,561F

16,206F56.655F67.709F

k5-v600-1547 42% success 24,42743,789

47.560F85.215 0% success 0% success 0% success 13,894F

20,203F58.295

84.856Fk5-v600-1550 77% success 22,280

32,75843.54064.161 0% success 0% success 0% success 8,919F

11,992F37.255F50.094F

k5-v700-1552 42% success 97% success 0% success 0% success 0% success 12,073F17,648F

53.555F77.710F

k5-v700-1557 39% success 36,79651,272

73.570102.555 0% success 0% success 0% success 9,350F

11,692F40.925F51.131F

k5-v700-1558 55% success 23,39434,032

46.67568.051 0% success 0% success 0% success 7,207F

10,130F31.095F43.839F

k5-v700-1561 5% success 59% success 0% success 0% success 0% success 79% successFk7-v120-1583 99% success 3,989

6,40720.440F32.861

7,64313,564

49.41587.850 66% success 99% success 3,347F

4,560F21.935

29.881Fk7-v120-1584 97% success 11,652

15,93559.92081.898 94% success 37% success 90% success 6,100F

10,387F39.620F67.414F

k7-v120-1587 4,0136,204

28.34043.656

1,9352,709

9.92513.932

5,8517,779

36.05547.738 81% success 7,515

12,75247.99581.470

1,391F2,065F

8.940F13.231F

k7-v120-1591 11,54613,707

80.91595.852

6,6328,914

33.83545.474

7,18412,256

44.46075.892 61% success 98% success 4,830F

7,085F28.580F41.964F

k7-v140-1592 97% success 4,631F8,545F

24.495F45.177F 88% success 28% success 71% success 6,992

9,33448.19564.418

k7-v140-1597 58% success 95% success 48% success 8% success 37% success 96% successFk7-v140-1599 85% success 12,905F

18,652F69.960F

101.019F 71% success 19% success 50% success 99% success

k7-v140-1601 59% success 88% successF 55% success 12% success 32% success 81% success

k7-v160-1604 27% success 56% success 18% success 2% success 5% success 60% successFk7-v160-1606 7% success 37% successF 12% success 3% success 3% success 22% success

Table 2. Results on random k-SAT instances shown in the form: medianmean . Best results are marked

with F and flip counts are reported in thousands. On problems where a solver was timed out forsome runs, we report the percentage of success of that solver instead of its CPU time and flip count.

In the remainder of the paper, we focus on answering the question whether gNovelty+ (analgorithm designed specifically for random problems) has a wider field of useful application.To answer this, we devised an extensive experimental study to test gNovelty+ in comparisonwith other state-of-the-art SLS SAT solvers and across a range of benchmark structuredproblems.

11

D.N. Pham et al.

0 200 400 600 0%

20%

40%

60%

80%

100%3−SAT

CPU seconds

Sol

ved

inst

ance

s (%

)

gNovelty+

AdaptNovelty+

G2WSATPAWS

0 200 400 600 0%

20%

40%

60%

80%

100%5−SAT

CPU seconds

Sol

ved

inst

ance

s (%

)

0 200 400 600 0%

20%

40%

60%

80%

100%7−SAT

CPU seconds

Sol

ved

inst

ance

s (%

)

Figure 2. Runtime distribution of 4 solvers on random instances. The smooth probability ofgNovelty+ is set to 0.4 for 3-sat instances and 1.0 for 5-sat and 7-sat instances.

4. Experimental Setup and Benchmark Sets

As the performance of gNovelty+ in the SAT random category is already a matter of pub-lic record,10. we based our experimental study on a range of structured benchmark prob-lems that have been used in previous SLS comparison studies.11. Our problem test setcomprises of four circuit synthesis formula problems (2bitadd 11, 2bitadd 12, 3bitadd 31and 3bitadd 32), three all-interval series problems (ais10 to ais14), two blocksworld plan-ning problems (bw large.c and bw large.d), four Beijing scheduling problems (enddr2-1,enddr2-8, ewddr2-1 and ewddr2-8), two “flat” graph colouring problems (flat200-med andflat200-har), four large DIMACS graph colouring problems (g125.17 to g250.29), two lo-gistics planning problems (logistics.c and logistics.d), five 16-bit parity function learningproblems (par16-1-c to par16-5-c), and five hard quasi-group problems (qg1-08 to qg7-13).

As gNovelty+ combines the strengths of solvers from the WalkSAT series and DLSalgorithms, for comparison purposes we selected algorithms from each of the four possiblecategories, i.e. manual WalkSAT (G2WSAT [12]), adaptive WalkSAT (AdaptNovelty+ [8]),manual clause weighting (PAWS [20]) and adaptive clause weighting (RSAPS [11]). Inaddition, we included AdaptG2WSAT0 [13], an adaptive version of G2WSAT, as it camesecond in the random SAT category of the 2007 SAT competition. It should be noted thatthese algorithms have consistently dominated other local search techniques in the recentSAT competitions (where the majority of modern SAT solvers developed by the researchcommunity have competed). We therefore consider them to be a fair representation ofthe state-of-the-art. While other SAT solvers have been developed that may also haveproved competitive (e.g. commercial solvers), the lack of availability of their source codehas precluded their inclusion in the current work.

For this experimental study, we manually tuned the parameters of PAWS, G2WSAT andgNovelty+ to obtain optimal performance for each category of the problem set.12. Thesesettings are shown in Table 3 (note, only one parameter setting per algorithm was allowedfor each of the eight problem categories). Here we not only manipulated the gNovelty+ sp

10. See http://www.cril.univ-artois.fr/SAT07/slides-contest07.pdf11. See http://www.satlib.org12. The other three solvers (AdaptNovelty+, RSAPS and AdaptG2WSAT0) can automatically adapt the

values of their parameters during the search.

12

Combining Adaptive and Dynamic Local Search for Satisfiability

parameter but on some categories we also manually tuned the noise parameter of its Noveltycomponent. For G2WSAT we used the optimal settings for the noise and dp parameterspublished in [12, 13], and for PAWS we tuned the winc parameter.

Method Parameter Problem Category

bitadd ais bw large e*ddr flat200 g logistics par16 qg

gNovelty+ p adapted adapted 0.08 adapted adapted 0.10 adapted 0.05 0.02

sp 0.00 0.00 1.00 0.00 0.00 1.00 0.00 0.10 0.00

G2WSAT p 0.50 0.20 0.20 0.40 0.50 0.30 0.20 0.50 0.40

dp 0.05 0.05 0.00 0.45 0.06 0.01 0.05 0.01 0.03

PAWS winc 9 52 4 59 74 4 100 40 10

Table 3. Optimal parameter settings for each problem category.

5. Structured Problem Results

Table 4 shows the results obtained after manually tuning gNovelty+, G2WSAT and PAWSin comparison to the default adaptive behaviour of AdaptNovelty+, AdaptG2WSAT0 andRSAPS. Here the results for the best performing algorithm on each problem are shown inbold, with all results reporting the mean and median of 100 runs of each algorithm on eachinstance (each run was timed out after 600 seconds). In order to have a fair comparison, wedisabled the unit propagation preprocessor used in G2WSAT and AdaptG2WSAT0 in thetwo studies presented in this section. The results of all solvers in association with differentpreprocessors are discussed in later sections.

A brief overview shows that gNovelty+ has the best results for all bitadd, ais, bw large,e*ddr and logistics problems. In addition, it has the best results on the three hardestquasigroup problems (RSAPS won on two other instances) and is about equal with G2WSATon the flat graph colouring problems. Of the other algorithms, PAWS is the best for theparity problems, G2WSAT is the best for the two harder large graph instances while PAWSand RSAPS each won on one easier instance. On this basis gNovelty+ emerges as the bestalgorithm both in terms of the number of problems (19) and the number of problem classes(6) in which it dominates.

An even clearer picture emerges when we look at the overall proportion of runs thatcompleted within 600 seconds. Here, gNovelty+ achieves a 99.90% success rate comparedwith 88.90% for AdaptG2WSAT0, 88.32% for AdaptNovelty+, 84.13% for PAWS, 77.39%for G2WSAT and 72.06% for RSAPS. This observation is reinforced in the RTDs on theleft-hand of Figure 3 where the gNovelty+ curve dominates over the entire time range.

Overall, gNovelty+ not only outperforms the other techniques in the greatest numberof problem classes, it is within an order of magnitude of the best performing algorithmsin all remaining cases. It is this robust average case performance (that gNovelty+ alsodemonstrated in the SAT competition) that argues strongly for its usefulness as a generalpurpose solver.

However, if such robust behaviour depends critically on manually tuned parameter set-tings then the case for gNovelty+ must weaken. To evaluate this we tested gNovelty+ onthe same problem set with a default sp value of 0 (meaning clause weights are increased in

13

D.N. Pham et al.

Instances gNovelty+ AdaptG2WSAT0 G2WSAT AdaptNovelty+ RSAPS PAWS

#flips #secs #flips #secs #flips #secs #flips #secs #flips #secs #flips #secs

2bitadd 11 0.9280.947

0.000F0.001F

2.4582.348

0.0000.003

0.510F0.677F

0.0000.002

1.5921.693

0.0000.001

374577

0.2300.368 86% success

2bitadd 12 0.5760.625F

0.000F0.001F

1.7161.882

0.0000.003

0.394F200

0.0000.136

1.5211.554

0.0000.001

59111

0.0400.071

50,062127,135

30.10575.140

3bitadd 31 18F20F

0.060F0.068F

549817

1.1201.625 0% success 151

1630.4400.467 0% success 0% success

3bitadd 32 15F16F

0.060F0.058F

262318

0.5350.652 0% success 131

1350.4500.466 0% success 0% success

ais10 8.116F12F

0.010F0.012F

135200

0.1200.174

6688

0.0650.086

1,2911,937

1.1351.671

1318

0.0200.020

1320

0.0150.022

ais12 48F64F

0.060F0.080F

3,1274,320

3.4454.754

8791,428

1.0451.698

24,31535,292

28.14040.833

103151

0.1350.202

104192

0.1500.262

ais14 334F414F

0.435F0.537F

13,45523,628

15.85027.803

3,00810,800

3.87013.667 81% success 629

8920.8851.265

1,2671,677

1.8252.455

bw large.c 800F1,277F

1.440F2.222F

3,3174,930

3.9905.902

1,9913,297

2.0253.261

6,3509,355

6.5509.699

3,6065,361

10.72015.921

9911,604

1.4602.268

bw large.d 937F1,131F

2.645F3.098F

10,60015,714

21.55030.726

2,8914,296

4.7006.763

21,46230,616

37.05555.237 70% success 1,029

1,3712.7703.508

enddr2-1 39F44F

0.190F0.196F

2,4042,510

3.9804.110

223322

0.9151.187

5,7857,040

9.07511.187

6172

0.5050.534

4758

0.5000.519

enddr2-8 30F34F

0.170F0.180F

2,0882,164

3.6553.715

134158

0.5850.643

4,2875,173

6.9058.284

4954

0.4950.509

4144

0.5000.505

ewddr2-1 32F34F

0.190F0.192F

1,9392,079

3.4353.686

114134

0.5700.613

4,6606,193

7.97010.352

4854

0.5500.550

4547

0.5500.549

ewddr2-8 30F32F

0.200F0.198F

1,7571,828

3.3653.454

90103

0.5350.564

4,9596,515

8.69010.959

4850

0.6000.602

4344

0.5800.584

flat200-med 164241

0.0700.099

182242

0.0650.087

133F169F

0.050F0.065F

260392

0.0850.131

287377

0.1500.196

248348

0.1350.185

flat200-har 2,5764,037

1.040F1.638F

3,4185,476

1.2101.950

4,05016,903

1.4755.967

17,87922,572

6.0557.580

3,1434,200

1.5952.149

2,403F3,344F

1.2801.761

g125.17 6871,066

3.0654.792

8041,161

3.1754.546

528747

2.230F2.978F

9811,264

4.1105.408 2% success 492F

694F2.3103.415

g125.18 1315

0.0900.098

5251

0.1700.169

7.718F9.872F

0.0900.100

3536

0.1000.100

1,0571,787

4.9958.379

1113

0.080F0.084F

g250.15 2.5852.668

0.1100.112

2.8892.909

0.2100.208

2.3812.410

0.5450.545

3.3114.033

0.1100.118

2.208F2.219F

0.080F0.082F

2.2392.247

0.0900.086

g250.29 638704

12.72514.956

716765

8.2359.147

247F297F

5.455F6.259F

755895

9.96011.979 0% success 263

3208.0709.237

logistics.c 6.332F6.873F

0.010F0.007F

2,5503,631

1.3851.996

5265

0.0400.045

122152

0.0700.091

6.8117.814

0.0100.008

1012

0.0100.012

logistics.d 2732F

0.040F0.042F

111137

0.0900.109

86107

0.0900.102

170196

0.1000.114

23F33

0.0400.051

3042

0.0500.063

par16-1-c 6,9439,621

2.9204.064

13,75918,975

5.3007.402

5,689118,873

2.37048.676

17,18632,062

5.87010.978 73% success 1,557F

2,470F0.860F1.368F

par16-2-c 28,29138,826

11.97516.493

129,544190,040

51.65575.896 96% success 159,405

260,24053.41587.516 39% success 3,675F

4,805F2.020F2.668F

par16-3-c 18,13627,626

7.75011.787

40,23353,219

16.28521.554

46,13457,719

19.33024.175

66,942102,783

23.07535.486 42% success 2,613F

4,106F1.455F2.313F

par16-4-c 11,14616,938

4.8107.216

23,98439,982

9.44515.905

55,693156,597

23.06064.789

80,419112,780

27.90538.674 69% success 1,035F

2,183F0.570F1.211F

par16-5-c 11,83017,545

5.0207.436

23,48037,596

9.50515.329 69% success 90,820

126,32931.22543.841 41% success 3,169F

4,092F1.740F2.239F

qg1-08 647F920F

15.645F22.476F 99% success 30% success 99% success 59% success 80% success

qg2-08 2,545F3,295F

51.120F69.991F 52% success 3% success 43% success 36% success 20% success

qg5-11 99% success 0% success 0% success 1% success 2,287F3,288F

24.575F35.405F 22% success

qg6-09 7263,090

2.2359.322 5% success 1% success 14% success 29F

44F0.110F0.173F

8331,263

3.2854.995

qg7-13 98% successF 0% success 0% success 0% success 3% success 0% success

Table 4. Optimally tuned results on structured problems shown in the form: medianmean . Best results

are marked with F and flip counts are reported in thousands. On problems where a solver wastimed out for some runs, we report the percentage of success of that solver instead of its CPU timeand flip count.

each local minimum but never decreased) and with the noise parameter p adaptively ad-justed during the search.13. These results and the results of the default parameter values forG2WSAT (dp = 0.05 and p = 0.5) and PAWS (winc = 10) are shown in Table 5. To give anidea of the relative performance of these default setting algorithms against the other threeadaptive ones, the results of AdaptNovelty+, AdaptG2WSAT0 and RSAPS from Table 4are also reported again in Table 5.

13. Although gNovelty+’s noise parameter was also adjusted in Table 3, performance was not greatly im-proved, with the main benefits coming from adjusting sp.

14

Combining Adaptive and Dynamic Local Search for Satisfiability

0 100 200 300 400 500 600 30%

40%

50%

60%

70%

80%

90%

100%Optimally tuned

CPU seconds

Sol

ved

inst

ance

s (%

)

gNovelty+

AdaptG2WSAT0

G2WSAT

AdaptNovelty+

PAWSRSAPS

0 100 200 300 400 500 600 30%

40%

50%

60%

70%

80%

90%

100%Default setting

CPU seconds

Sol

ved

inst

ance

s (%

)

Figure 3. Run-time distributions over the complete data set.

In this second comparison, gNovelty+ remains the champion both in terms of the num-ber of problem classes (bitadd, ais, bw large, e*ddr, logistics and qg) and the numberof instances (19). Table 5 also shows that the performance of gNovelty+, G2WSAT andPAWS (especially the later two) is substantially reduced without parameter tuning, withAdaptG2WSAT0 taking over from PAWS as the winner on all parity problems and beatingG2WSAT on the two harder large graph instances. AdaptNovelty+ further dominates on theother large graph instances previously won by PAWS. Consequently, AdaptG2WSAT0 nowhas the best overall success rate of 88.90% followed by AdaptNovelty+ at 88.32%, the de-fault valued gNovelty+ at 82.23%, RSAPS at (72.06%), with G2WSAT (70.68%) and PAWS(52.32%) coming last (this is also illustrated in the RTDs in Figure 3). Looking in more de-tail, we can see that the main negative impact of a fixed parameter on gNovelty+ has comefrom its failure on the parity problems. Similarly, AdaptG2WSAT0 and AdaptNovelty+ failmainly on the quasi-group problems. If we put these two data sets aside, then the defaultgNovelty+ shows a clear advantage over AdaptG2WSAT0 and AdaptNovelty+, dominatingon five of the remaining seven problem classes.

6. Results with Pre-processing Enhancement

Although preprocessing has a generally negative effect on SLS solvers when solving randomproblems, it is now well understood that it can produce significant benefits on structuredproblems [16]. For this reason we decided to test the effects of the two most promising tech-niques, HyPre [2] and SatELite [5], on the performance of gNovelty+ and its competitors.We also included a simple UnitProp preprocessor as it is cheaper to compute and has beenused by G2WSAT and AdaptG2WSAT0. In detail, these preprocessors simplify an inputformula before passing the reduced formula to a particular solver as follows:

UnitProp simply applies the well-known unit propagation procedure [17] to the inputformula until saturation.

15

D.N. Pham et al.

Instances gNovelty+ AdaptG2WSAT0 G2WSAT AdaptNovelty+ RSAPS PAWS

#flips #secs #flips #secs #flips #secs #flips #secs #flips #secs #flips #secs

2bitadd 11 0.9280.947

0.000F0.001F

2.4582.348

0.0000.003

0.510F0.677F

0.0000.002

1.5921.693

0.0000.001

374577

0.2300.368 86% success

2bitadd 12 0.5760.625F

0.000F0.001F

1.7161.882

0.0000.003

0.394F200

0.0000.136

1.5211.554

0.0000.001

59111

0.0400.071

50,062127,135

30.10575.140

3bitadd 31 18F20F

0.060F0.068F

549817

1.1201.625 0% success 151

1630.4400.467 0% success 0% success

3bitadd 32 15F16F

0.060F0.058F

262318

0.5350.652 0% success 131

1350.4500.466 0% success 0% success

ais10 8.116F12F

0.010F0.012F

135200

0.1200.174

140187

0.1400.184

1,2911,937

1.1351.671

1318

0.0200.020

6895

0.0800.110

ais12 48F64F

0.060F0.080F

3,1274,320

3.4454.754

5,1288,732

6.24510.673

24,31535,292

28.14040.833

103151

0.1350.202

9171,424

1.4102.166

ais14 334F414F

0.435F0.537F

13,45523,628

15.85027.803 95% success 81% success 629

8920.8851.265

4,2057,108

6.79511.530

bw large.c 832F1,131F

1.485F2.044F

3,3174,930

3.9905.902 49% success 6,350

9,3556.5509.699

3,6065,361

10.72015.921

7,2389,440

17.31022.493

bw large.d 4,802F5,499F

14.515F16.738F

10,60015,714

21.55030.726 0% success 21,462

30,61637.05555.237 70% success 39% success

enddr2-1 39F44F

0.190F0.196F

2,4042,510

3.9804.110

1,5521,805

4.0204.565

5,7857,040

9.07511.187

6172

0.5050.534 20% success

enddr2-8 30F34F

0.170F0.180F

2,0882,164

3.6553.715

1,3151,383

3.4603.530

4,2875,173

6.9058.284

4954

0.4950.509 29% success

ewddr2-1 32F34F

0.190F0.192F

1,9392,079

3.4353.686

1,1651,231

3.0953.225

4,6606,193

7.97010.352

4854

0.5500.550 28% success

ewddr2-8 30F32F

0.200F0.198F

1,7571,828

3.3653.454

1,1311,164

3.2053.276

4,9596,515

8.69010.959

4850

0.6000.602 38% success

flat200-med 164241

0.0700.099

182242

0.0650.087

140F193

0.050F0.073F

260392

0.0850.131

287377

0.1500.196

145181F

0.0800.100

flat200-har 2,576F4,037F

1.040F1.638F

3,4185,476

1.2101.950

4,26813,931

1.5704.921

17,87922,572

6.0557.580

3,1434,200

1.5952.149

5,9238,168

3.1154.304

g125.17 3,5254,229

15.89018.808

804F1,161F

3.175F4.546F 99% success 981

1,2644.1105.408 2% success 6% success

g125.18 185185

0.9700.951

5251

0.1700.169

11F15F

0.1300.144

3536

0.100F0.100F

1,0571,787

4.9958.379

1925

0.1500.195

g250.15 2.2532.283

0.0800.084

2.8892.909

0.2100.208

2.4732.488

0.4100.407

3.3114.033

0.1100.118

2.208F2.219F

0.080F0.082F

2.2112.236

0.1100.113

g250.29 4,9845,008

136.745141.147

716F765F

8.235F9.147F 43% success 755

8959.960

11.979 0% success 0% success

logistics.c 6.332F6.873F

0.010F0.007F

2,5503,631

1.3851.996

4455

0.0300.041

122152

0.0700.091

6.8117.814

0.0100.008

118158

0.1000.128

logistics.d 2732F

0.040F0.042F

111137

0.0900.109

1,7752,946

1.5202.560

170196

0.1000.114

23F33

0.0400.051

275386

0.2750.375

par16-1-c 20% success 13,759F18,975F

5.300F7.402F 99% success 17,186

32,0625.87010.978 73% success 23% success

par16-2-c 10% success 129,544F190,040F

51.655F75.896F 98% success 159,405

260,24053.41587.516 39% success 1% success

par16-3-c 5% success 40,233F53,219F

16.285F21.554F

62,74597,922

26.60041.142

66,942102,783

23.07535.486 42% success 5% success

par16-4-c 9% success 23,984F39,982F

9.445F15.905F

99,333158,001

41.29065.417

80,419112,780

27.90538.674 69% success 16% success

par16-5-c 5% success 23,480F37,596F

9.505F15.329F 73% success 90,820

126,32931.22543.841 41% success 3% success

qg1-08 853F1,134F

18.900F24.939F 99% success 14% success 99% success 59% success 83% success

qg2-08 3,155F4,093F

68.675F91.038F 52% success 4% success 43% success 36% success 23% success

qg5-11 5,0126,863

39.11550.497 0% success 0% success 1% success 2,287F

3,288F24.575F35.405F 22% success

qg6-09 3432,281

1.1557.319 5% success 17% success 14% success 29F

44F0.110F0.173F

9681,229

4.0355.102

qg7-13 74% successF 0% success 0% success 0% success 3% success 0% success

Table 5. Default parameter setting results on structured problems shown in the form: medianmean .

Best results are marked with F and flip counts are reported in thousands. On problems where asolver was timed out for some runs, we report the percentage of success of that solver instead of itsCPU time and flip count.

HyPre [2] focuses on reasoning with binary clauses by implementing the HypBinRes pro-cedure, a restricted version of hyper-resolution [17] that only runs on binary clauses.It also uses the implication graph concept and the HypBinRes rule to infer new binaryclauses and avoid the space explosion of computing a full transitive closure. In addi-tion, HyPre incrementally applies unit and equality reductions to infer more binaryclauses and hence improve its performance.

SatELite [5] uses the (self-)subsumption rule and information about functionally depen-dent variables to further improve the simplification power of the Variable Elimination

16

Combining Adaptive and Dynamic Local Search for Satisfiability

Resolution (VER) procedure [4] (a process to eliminate a variable x by replacing allclauses containing x and x with their resolvents). Like its predecessor, NiVER [19],SatELite implements the VER process only if there is no increase in the number ofliterals after variable elimination.

We combined the three preprocessors with each of the default-valued algorithms reportedin the previous section, and tested these combinations on all problems that were able tobe simplified by a particular preprocessor. These results are summarised in the followingsections. Each combination was run 100 times on each instance and each run was timedout after 600 seconds.

6.1 Results on UnitProp-simplified Problems

Instances gNovelty+ AdaptG2WSAT0 G2WSAT AdaptNovelty+ RSAPS PAWS

#flips #secs #flips #secs #flips #secs #flips #secs #flips #secs #flips #secs

enddr2-1 22F26F

0.120F0.127F

206428

0.4650.776

555676

1.1351.386

346546

0.6651.012

3141

0.2550.277 27% success

enddr2-8 18F20F

0.100F0.103F

142165

0.3650.398

183203

0.4900.531

307365

0.5900.688

2632

0.2400.245 24% success

ewddr2-1 19F19F

0.100F0.103F

134190

0.3500.428

202261

0.5200.617

217268

0.4300.517

2428

0.2400.244 44% success

ewddr2-8 14F15F

0.090F0.089F

118123

0.3100.318

8198

0.3000.326

205225

0.3900.422

2123

0.2200.223 40% success

qg1-08 597757

2.8653.661

307F445F

1.045F1.517F

1,2441,553

5.4706.867

467675

1.2801.861

2,2803,285

9.24013.302

583873

2.1603.213

qg2-08 1,278F1,679F

6.5158.658

1,4362,026

5.095F7.180F

8,1269,988

36.60544.804

2,2883,019

6.4208.488

6,3708,453

26.23534.857

2,4363,109

9.72512.848

qg5-11 6589F

0.3500.470F 52% success 1,076

3,3324.800

12.0534,189

12,53110.30528.065

62F102

0.320F0.518

75115

0.3950.612

qg6-09 3.3434.335

0.0100.009

1,1462,664

1.3703.143

8.28516

0.0200.029

416630

0.5650.870

2.9553.282

0.0100.007

2.277F3.105F

0.010F0.007F

qg7-13 580F741F

4.835F5.789F 22% success 41% success 73% success 4,155

5,31836.45546.256

5,2007,557

45.17066.118

Table 6. Default parameter setting results on structured problems preprocessed with UnitProp-agation preprocessor: median

mean . Best results are marked with F and flip counts are reported inthousands. On problems where a solver was timed out for some runs, we report the percentageof success of that solver instead of its CPU time and flip count. The time taken to preprocess aproblem instance is included in the CPU time of each solver. Results on the problems where thepreprocessor makes no change to the CNF formulae are omitted.

The results in Table 6 show that UnitProp only had an effect on the e*ddr and qgproblems and that gNovelty+ remains the dominant algorithm on these simplified instances.Specifically, gNovelty+ had the best time performance on all 4 of the e*ddr problems and2 of the 5 qg problems, with AdaptG2WSAT0 and PAWS dominating on the remaining qgproblems.

UnitProp had a beneficial effect for all algorithms on these problems (compared tothe non-preprocessed results of Table 5 and graphed in Figure 4), producing significantimprovements for AdaptNovelty+ and the G2WSAT algorithms on the e*ddr problems andacross the board improvements on the qg problems. Overall, the benefits of UnitProp forgNovelty+ are less dramatic than for other techniques. However, this can be explained bythe fact that gNovelty+ was already doing well on these problems (without preprocessing),and that margin for improvement was consequently smaller.

17

D.N. Pham et al.

Instances gNovelty+ AdaptG2WSAT0 G2WSAT AdaptNovelty+ RSAPS PAWS

#flips #secs #flips #secs #flips #secs #flips #secs #flips #secs #flips #secs

2bitadd 11 0.260F0.275F

0.000F0.000F

1.8671.960

0.0000.004

0.4660.603

0.0000.003

3.6794.415

0.0000.002

0.2730.286

0.0000.000

0.4620.948

0.0000.001

2bitadd 12 0.228F0.231F

0.000F0.000F

1.5141.692

0.0000.004

0.392400

0.0000.306

2.9673.255

0.0000.002

0.2520.252

0.0000.000

0.2900.470

0.0000.001

3bitadd 31 12F12F

0.430F0.433F

3537

0.4900.490

1617

0.5000.502

369413

0.8150.866

1820

0.4500.454 0% success

3bitadd 32 9.457F9.428F

0.440F0.440F

3031

0.5000.497

13113

0.5100.832

255276

0.6900.711

1313

0.4500.450 7% success

ais10 11F13F

0.010F0.011F

92131

0.0700.100

85116

0.0800.097

1,0171,524

0.7101.060

1317

0.0100.015

2842

0.0300.039

ais12 64F77F

0.080F0.087F

1,1821,568

1.1001.467

1,3751,864

1.4401.934

12,92318,165

11.84016.480

98141

0.1100.157

320516

0.3800.617

ais14 372F449F

0.445F0.533F

15,31221,691

16.84024.058 94% success 85% success 669

8630.8351.092

3,8756,373

5.6209.241

bw large.c 38F58

1.440F1.460F

1,8312,600

2.9253.595

1,4412,210

2.9603.750

4,7166,993

5.4807.503

4356F

1.4501.466

74103

1.4901.529

bw large.d 508F725F

11.890F12.449F

22,06125,059

57.25561.824 53% success 96% success 2,218

3,79919.98526.924

1,3021,842

16.25518.665

enddr2-1 31F35F

6.010F6.016F

1,4001,475

8.0408.145

9631,065

7.6507.817

4,6145,747

12.44514.091

4861

6.1506.183 57% success

enddr2-8 21F22F

5.820F5.825F

450531

6.6106.716

416427

6.6406.662

571864

6.6156.999

3338

5.9505.965 48% success

ewddr2-1 21F23F

5.440F5.437F

635703

6.4356.527

458498

6.3806.447

1,9903,117

8.1059.667

3135

5.5705.578 52% success

ewddr2-8 15F17F

5.020F5.022F

300347

5.5905.631

180218

5.4405.501

364502

5.4655.641

2628

5.1505.151 56% success

g125.17 3,5464,512

13.08516.252

717F988F

1.875F2.520F

16,30820,545

62.78077.955

9311,488

2.5554.135 1% success 8% success

g125.18 160173

1.0701.107

4849

0.2300.230

9.607F13F

0.2000.213

3738

0.170F0.170F

1,3061,911

6.4209.370

1923

0.2050.226

g250.15 2.2612.291

0.5300.529

2.8852.933

0.6800.677

2.4102.475

0.7700.773

3.2173.599

0.520F0.526F

2.182F2.200F

0.5300.530

2.2162.225

0.5400.536

g250.29 4,7964,910

73.06074.602

748F866F

6.400F7.388F 67% success 778

8928.825

10.354 0% success 0% success

logistics.c 1.347F1.486

0.020F0.022F

8.0779.357

0.0300.031

2.8983.306

0.0300.027

6.6877.715

0.0200.025

1.3691.478F

0.0200.022

3.2425.152

0.0200.025

logistics.d 3.726F3.884F

0.850F0.851F

2729

0.9000.900

6.7857.123

0.8900.889

3437

0.8700.867

3.7803.915

0.8500.854

4.4727.067

0.8600.857

par16-1-c 16% success 14,787F20,756F

6.005F8.453F 99% success 18,151

29,8636.375

10.512 86% success 16% success

par16-2-c 4% success 83,478F122,795F

33.030F49.337F 95% success 124,928

176,08244.13061.248 54% success 1% success

par16-3-c 5% success 36,517F42,956F

14.395F17.183F

64,22487,618

27.46037.512

63,33992,704

22.71533.117 70% success 4% success

par16-4-c 6% success 18,058F31,792F

7.480F13.064F

76,621147,878

32.83062.198

32,43855,159

11.34519.297 79% success 7% success

par16-5-c 3% success 25,234F42,119F

10.215F17.261F 71% success 52,581

75,99718.14026.487 68% success 0% success

qg1-08 553673

2.9503.532

366F483F

1.610F2.019F

1,0961,582

5.3357.459

472813

1.6202.529

1,7432,495

7.18510.133

576813

2.4253.311

qg2-08 1,4371,796

7.4609.346

868F1,326F

3.500F5.121F

5,4897,129

25.52033.435

1,8412,557

5.5407.463

6,8948,319

28.68534.478

1,6882,287

6.9409.336

qg5-11 2028

0.3200.352

1,5592,913

3.7456.807

5188

0.4800.645

5701,558

1.6153.923

2027

0.3350.359

15F22F

0.315F0.351F

qg6-09 0.001F0.001F

0.050F0.050F

0.0010.001

0.0500.052

0.0010.001

0.0500.051

0.0010.009

0.0500.050

0.0010.002

0.0500.050

0.0010.002

0.0500.050

qg7-13 250F355F

2.310F3.178F 95% success 975

1,2995.9608.057 99% success 404

6883.7005.858

407589

3.6505.166

Table 7. Default parameter setting results on structured problems preprocessed with HyPrepreprocessor: median

mean . Best results are marked with F and flip counts are reported in thousands.The time taken to preprocess a problem instance is included in the CPU time of each solver. Onproblems where a solver was timed out for some runs, we report the percentage of success of thatsolver instead of its CPU time and flip count. Results on the problems where the preprocessormakes no change to the CNF formulae are omitted.

6.2 Results on HyPre-simplified Problems

Table 7 shows that HyPre was able to simplify all the problems in our test set with theexception of the two flat graph colouring problems. Again gNovelty+ remains dominant,having the best performance on all the bitadd, ais, bw large, e*ddr and logistics problemsand 2 of the 5 qg problems. Of the other algorithms, AdaptG2WSAT0 remained dominanton all the parity and 2 of the 4 large graph colouring problems, while improving over itsnon-preprocessed performance to win on 2 of the qg problems. Finally, AdaptNovelty+ also

18

Combining Adaptive and Dynamic Local Search for Satisfiability

improved over its non-preprocessed performance to win on 2 large graph colouring problemsand PAWS improved to win on qg5-11.

As with UnitProp, the HyPre simplified formulae have been generally easier to solvethan the original problems. However, the overhead of HyPre has outweighed these benefitson those problems that can already be solved relatively quickly without preprocessing. Forexample, the small improvements in the number of flips for gNovelty+ on the e*ddr problemshas been obtained at the cost of a more than 10 times increase in execution time (see Table5 and the comparative graphs in Figure 4).

6.3 Results on SatELite-simplified Problems

The SatELite results in Table 8 show a similar pattern to the HyPre results, with gNovelty+

dominating the bitadd, ais, bw large, e*ddr and logistics problems and 3 of the 5 qg prob-lems. This time, however, AdaptNovelty+ clearly dominated the parity problems (achievingthe best results of all the methods and preprocessing combinations tried on this problemclass) and further dominated on 3 of the 4 large graph colouring problems and 1 of the flatgraph colouring problems. This made AdaptNovelty+ the second best performing SatElite-enhanced algorithm (behind gNovelty+).

SatELite had the widest range of application of the three preprocessing techniques andwas able to simplify all 31 problem instances. However, like HyPre, despite generallyimproving the flip rates of most algorithms on most problems, the overhead of using SatELitecaused a deterioration in time performance on many instances. This is shown more clearly inFigure 4 where SatELite consistently appears as one of the worse options for any algorithmon the bitadd and large graph (g) problems.

6.4 Evaluation of Preprocessing

Overall it is not immediately clear whether preprocessing is a useful general purpose additionfor our algorithms. Of the three techniques, only UnitProp has a consistently positive effect,even though this is limited to two problem classes (e*ddr and qg). Although both SatELiteand HyPre have positive effects on certain problems for certain algorithms, neither of themis able to provide an overall improvement for all tested solvers across the whole benchmarkset. For instance, HyPre is generally helpful on the qg problems and SatELite is helpful onthe flat graph colouring problems. But arrayed against these gains are the unpredictableworsening effects of the more complex preprocessors on other problem classes. For instance,consider the negative effect of SatELite on AdaptG2WSAT0 on the bitadd and the largegraph colouring problems.

If we take the entire picture presented in Figure 4 two observations emerge. Firstly,gNovelty+ achieves the best overall performance regardless of the preprocessor used, andsecondly, of the preprocessors, only UnitProp is able to improve the overall performance ofgNovelty+. Therefore, our final recommendation from the preprocessor study would be touse gNovelty+ in conjunction with UnitProp.

19

D.N. Pham et al.

Instances gNovelty+ AdaptG2WSAT0 G2WSAT AdaptNovelty+ RSAPS PAWS

#flips #secs #flips #secs #flips #secs #flips #secs #flips #secs #flips #secs

2bitadd 11 1.1161.330

0.042F0.043F

7.84816

0.0520.051

0.679F0.993F

0.0420.044

1.8533.950

0.0420.044

2032

0.0620.066

5,3328,425

3.8026.075

2bitadd 12 0.6860.772F

0.049F0.050F

4.64610

0.0590.056

0.421F500

0.0490.389

1.5111.569

0.0490.050

5.17911

0.0490.057

9502,864

0.7242.069

3bitadd 31 32F41F

2.228F2.257F 0% success 0% success 249

2982.8032.887 0% success 0% success

3bitadd 32 17F18F

2.351F2.359F 1% success 0% success 147

1492.8112.811 0% success 0% success

ais10 8.504F10F

0.044F0.050F

1215

0.0540.059

1013

0.0540.056

1827

0.0640.077

1114

0.0540.061

2337

0.0840.117

ais12 62F73F

0.168F0.191F

78127

0.2030.292

108138

0.2680.331

138187

0.3480.451

7796

0.2530.299

324381

1.0131.199

ais14 447572F

1.010F1.272F

3,1044,795

5.9259.119

1,2671,832

2.6753.858

6891,008

1.5102.181

417F696

1.2902.098 9% success

bw large.c 960F1,113F

2.175F2.414F

4,1005,697

5.2207.218 39% success 5,475

8,3356.975

10.5574,4316,050

11.68516.166

5,4289,368

13.97524.912

bw large.d 4,479F5,690F

13.51016.524F

5,6048,965

11.405F17.520 0% success 21,199

25,73440.26048.444 61% success 36% success

enddr2-1 1216F

1.043F1.049F

104120

1.1981.213

4151

1.1731.186

36,88555,561

44.47866.173

12F18

1.0531.066 53% success

enddr2-8 10F12F

1.071F1.071F

101107

1.1911.201

3138

1.1811.192

19,86126,945

23.70631.876

1013

1.0811.090 53% success

ewddr2-1 6.8338.426

1.108F1.112F

5658

1.1981.198

1215

1.1981.203

3,7738,515

5.28810.578

6.041F7.381F

1.1181.122 75% success

ewddr2-8 5.734F6.554F

1.149F1.146F

5557

1.2391.241

1012

1.2291.232

1,4104,468

2.7196.056

5.9736.713

1.1591.157 69% success

flat200-med 106F147F

0.1140.135

195298

0.1490.193

266416

0.1890.269

125163

0.109F0.126F

398621

0.3290.489

114160

0.1290.168

flat200-har 1,180F1,822F

0.729F1.105F

2,3063,881

1.1541.891

2,8864,399

1.5442.329

1,6742,347

0.8091.117

3,4084,695

2.4093.309

1,4652,014

1.1241.516

g125.17 89% success 0% success 0% success 1,190F1,759F

16.811F23.931F 0% success 9% success

g125.18 277284

8.4098.511

1,1031,590

19.38427.250

5681

3.6744.391

35F36F

2.389F2.403F

1,5452,249

29.74444.530 91% success

g250.15 2.8062.935

8.9288.934

357375

23.83324.305

3.4653.551

9.5189.521

4.0324.181

8.9388.945

2.631F2.676F

8.888F8.892F 98% success

g250.29 66% success 0% success 0% success 827F958F

68.974F75.745F 0% success 0% success

logistics.c 2.9103.293

0.273F0.276F

7293

0.3230.332

9.26011

0.2830.285

2228

0.2930.291

2.538F2.982F

0.2730.276

7.47911

0.2830.283

logistics.d 1416

0.552F0.557F

380513

0.8570.973

707942

1.2921.535

130172

0.6370.669

12F16F

0.5520.559

3157

0.5820.606

par16-1-c 81% success 8,21113,494

4.3397.152

11,28416,358

6.4949.037

5,870F8,113F

2.794F3.848F 99% success 115,457

146,79660.47477.891

par16-2-c 35% success 40,27068,949

21.56336.520

33,09547,439

18.46826.128

32,693F44,368F

15.663F21.372F 90% success 48% success

par16-3-c 42% success 17,316F23,186

9.36512.566F

18,54923,013F

10.82013.027

18,58825,850

9.240F12.791 97% success 88% success

par16-4-c 47% success 14,469F20,584F

7.58110.873

23,39931,199

13.13117.307

15,54922,520

7.566F10.864F 99% success 69,676

105,25638.69657.880

par16-5-c 40% success 14,23918,460F

7.71510.017

19,50633,987

11.02518.486

11,341F19,486

5.425F9.299F 93% success 84% success

qg1-08 366522

2.3813.207

283F413F

1.371F1.806F

1,1541,687

5.2367.412

498662

1.7412.164

1,8222,671

7.56610.970

455721

2.0563.005

qg2-08 1,370F1,738F

7.3889.243

1,4942,037

5.788F7.670F

8,34910,682

37.90348.264

2,4623,111

7.5989.500

4,5178,122

19.48334.707

2,1423,307

9.13314.147

qg5-11 7183F

0.505F0.573F 48% success 1,092

1,9914.6457.081

5,48811,791

12.47526.332

67F108

0.5350.750

76123

0.5700.807

qg6-09 2.732F3.856

0.062F0.060F

1,2372,523

1.5373.069

8.04615

0.0720.079

367720

0.5671.048

2.8703.546F

0.0620.060

2.9254.241

0.0620.062

qg7-13 658F823F

5.254F6.669F 23% success 30% success 76% success 3,601

5,09935.32444.504

4,4816,718

37.52456.072

Table 8. Default parameter setting results on structured problems preprocessed with SatELitepreprocessor: median

mean . Best results are marked with F and flip counts are reported in thousands.The time taken to preprocess a problem instance is included in the CPU time of each solver. Onproblems where a solver was timed out for some runs, we report the percentage of success of thatsolver instead of its CPU time and flip count.

7. Discussion and Conclusions

The experimental evidence of this paper and the 2007 SAT competition demonstrates thatgNovelty+ is a highly competitive algorithm for random SAT problems. In addition, theseresults show that gNovelty+, with parameter tuning, can dominate several of the previouslybest performing SLS algorithms on a range of structured problems. If parameter tuningis ruled out (as it would be in most real-world problem scenarios), then gNovelty+ still

20

Combining Adaptive and Dynamic Local Search for Satisfiability

10−2

10−1

100

101

102

103

gNovelty+C

PU

sec

onds

10−2

10−1

100

101

102

103

AdaptG2WSAT0

CP

U s

econ

ds

10−2

10−1

100

101

102

103

G2WSAT

CP

U s

econ

ds

10−2

10−1

100

101

102

103

AdaptNovelty+

CP

U s

econ

ds

10−2

10−1

100

101

102

103

PAWS

CP

U s

econ

ds

10−2

10−1

100

101

102

103

RSAPSC

PU

sec

onds

bitadd ais bw e*ddr flat g log par16 qg

NoPrepUnitPropHyPreSatELite

bitadd ais bw e*ddr flat g log par16 qg

bitadd ais bw e*ddr flat g log par16 qg bitadd ais bw e*ddr flat g log par16 qg

bitadd ais bw e*ddr flat g log par16 qg bitadd ais bw e*ddr flat g log par16 qg

Figure 4. Comparing the performance of solvers (default settings) on the whole benchmark setwith NoPrep, UnitProp, HyPre and SatELite. Data is mean CPU time (logarithmic scale).

performs well, and only lost to its closest rival, AdaptG2WSAT0, on one structured problemclass.

Once again, as with PAWS and SAPS, the addition of a clause weighting heuristic togNovelty+ has required the addition of a sensitive weight decay parameter to get competitiveresults. Nevertheless, the situation with gNovelty+’s parameter does differ from SAPS andPAWS in that highly competitive performance can be obtained from a relatively small setof parameter values (i.e. 0.0, 0.1, 0.4 and 1.0). In contrast, SAPS and PAWS require muchfiner distinctions in parameter values to get even acceptable results [20]. This smaller setof values means that the process of tuning the smoothing parameter sp of gNovelty+ isconsiderably simpler than for other clause weight techniques. More importantly, the robustbehaviour of gNovelty+ indicates that it may be easier to devise an automatic adaptingmechanism for sp. To date, procedures for automatically adapting weight decay parametershave not produced the fastest algorithms.14. In future work, it therefore appears promising

14. Although machine learning techniques that are trained on test sets of existing instances and then appliedto unseen instances have proved useful for setting SAPS and Novelty parameters [10]

21

D.N. Pham et al.

to try and develop a simple heuristic that will effectively adapt sp in the structured problemdomain.

Finally we examined the effects of preprocessing on the performance of the algorithmsused in the study. Here we found that two of the best known modern preprocessing tech-niques (HyPre and SatELite) produced mixed results and had an overall negative impacton execution time across the whole problem set. These results appear to go against otherwork in [16] that found HyPre and SatELite to be generally beneficial for local search onSAT. However, in the current study many of the problems were solved quickly relative tothe overhead of using the more complex preprocessors. If we only consider flip rates thenHyPre and SatELite did show a generally positive effect. This means that for problemswhere execution times become large relative to the overhead of preprocessing, we wouldexpect both HyPre and SatELite to show greater improvements. Nevertheless, within theconfines of the current study, the simpler UnitProp preprocessing method (in conjunctionwith gNovelty+) had the best overall results: even though UnitProp only had positive effectson two problem classes this was balanced by the fact that its overhead on other problemswas relatively insignificant.

In conclusion, we have introduced gNovelty+, a new hybrid SLS solver that won therandom SAT category in the 2007 SAT competition. We have extended the SAT resultsand shown that gNovelty+ is also effective in solving structured SAT problems. In fact,gNovelty+ has not only outperformed five of the strongest current SLS SAT solvers, ithas also demonstrated significant robustness in solving a wide range of diverse problems.In achieving this performance, we have highlighted gNovelty+’s partial dependence on thesetting of its sp smoothing parameter. This leads us to recommend that future work shouldconcentrate on the automatic adaptation of this parameter.

Acknowledgements

We thankfully acknowledge the financial support from NICTA and the Queensland gov-ernment. NICTA is funded by the Australian Government’s Backing Australia’s Abilityinitiative, and in part through the Australian Research Council.

References

[1] Anbulagan, Duc Nghia Pham, John Slaney, and Abdul Sattar. Old resolution meetsmodern SLS. In Proceedings of the Twentieth National Conference on Artificial Intel-ligence (AAAI-05), pages 354–359, 2005.

[2] Fahiem Bacchus and Jonathan Winter. Effective preprocessing with hyper-resolutionand equality reduction. In Proceedings of the Sixth International Conference on Theoryand Applications of Satisfiability Testing (SAT-03), volume 2919 of Lecture Notes inComputer Science (LNCS), pages 341–355, 2003.

[3] Martin Davis, George Logemann, and Donald Loveland. A machine program for the-orem proving. Communications of the ACM, 5(7):394–397, 1962.

[4] Martin Davis and Hilary Putnam. A computing procedure for quantification theory.Journal of the ACM, 7:201–215, 1960.

22

Combining Adaptive and Dynamic Local Search for Satisfiability

[5] Niklas Een and Armin Biere. Effective preprocessing in SAT through variable andclause elimination. In Proceedings of the Eighth International Conference on Theoryand Applications of Satisfiability Testing (SAT-05), volume 3569 of Lecture Notes inComputer Science (LNCS), pages 61–75, 2005.

[6] Fred Glover. Tabu search - part 1. ORSA Journal on Computing, 1(3):190–206, 1989.

[7] Holger H. Hoos. On the run-time behaviour of stochastic local search algorithms forSAT. In Proceedings of the Sixteenth National Conference on Artificial Intelligence(AAAI-99), pages 661–666, 1999.

[8] Holger H. Hoos. An adaptive noise mechanism for WalkSAT. In Proceedings of theEighteenth National Conference on Artificial Intelligence (AAAI-02), pages 635–660,2002.

[9] Holger H. Hoos and Thomas Stutzle. Stochastic Local Search: Foundations and Appli-cations. Morgan Kaufmann, San Francisco, CA, 2005.

[10] Frank Hutter, Youssef Hamadi, Holger H. Hoos, and Kevin Leyton-Brown. Perfor-mance prediction and automated tuning of randomized and parametric algorithms.In Proceedings of the Twelfth International Conference on Principles and Practice ofConstraint Programming (CP-06), volume 4204 of Lecture Notes in Computer Science(LNCS), pages 213–228, 2006.

[11] Frank Hutter, Dave A.D. Tompkins, and Holger H. Hoos. Scaling and probabilisticsmoothing: Efficient dynamic local search for SAT. In Proceedings of the Eighth Inter-national Conference on Principles and Practice of Constraint Programming (CP-02),volume 2470 of Lecture Notes in Computer Science (LNCS), pages 233–248, 2002.

[12] Chu Min Li and Wen Qi Huang. Diversification and determinism in local search forsatisfiability. In Proceedings of the Eighth International Conference on Theory and Ap-plications of Satisfiability Testing (SAT-05), volume 3569 of Lecture Notes in ComputerScience (LNCS), pages 158–172, 2005.

[13] Chu Min Li, Wanxia Wei, and Harry Zhang. Combining adaptive noise and look-aheadin local search for SAT. In Proceedings of the Third International Workshop on LocalSearch Techniques in Constraint Satisfaction (LSCS-06), pages 2–16, 2006.

[14] David A. McAllester, Bart Selman, and Henry A. Kautz. Evidence for invariantsin local search. In Proceedings of the Fourteenth National Conference on ArtificialIntelligence (AAAI-97), pages 321–326, 1997.

[15] Paul Morris. The Breakout method for escaping from local minima. In Proceedings ofthe Eleventh National Conference on Artificial Intelligence (AAAI-93), pages 40–45,1993.

[16] Duc Nghia Pham. Modelling and Exploiting Structures in Solving Propositional Satis-fiability Problems. PhD thesis, Griffith University, Queensland, Australia, 2006.

23

D.N. Pham et al.

[17] John Alan Robinson. Automated deduction with hyper-resolution. Journal of Com-puter Mathematics, 1(3):227–234, 1965.

[18] Bart Selman, Hector Levesque, and David Mitchell. A new method for solving hardsatisfiability problems. In Proceedings of the Tenth National Conference on ArtificialIntelligence (AAAI-92), pages 440–446, 1992.

[19] Sathiamoorthy Subbarayan and Dhiraj K. Pradhan. NiVER: Non increasing variableelimination resolution for preprocessing SAT instances. In Proceedings of the SeventhInternational Conference on Theory and Applications of Satisfiability Testing (SAT-04), volume 3542 of Lecture Notes in Computer Science (LNCS), pages 276–291, 2004.

[20] John R. Thornton. Clause weighting local search for SAT. Journal of AutomatedReasoning, 35(1-3):97–142, 2005.

[21] John R. Thornton, Duc Nghia Pham, Stuart Bain, and Valnir Ferreira Jr. Additive ver-sus multiplicative clause weighting for SAT. In Proceedings of the Twentieth NationalConference on Artificial Intelligence (AAAI-04), pages 191–196, 2004.

24