Exact Programming by Example
Dana Drachsler Cohen
Technion - Computer Science Department - Ph.D. Thesis PHD-2017-09 - 2017
Exact Programming by Example
Research Thesis
Submitted in partial fulfillment of the requirements
for the degree of Doctor of Philosophy
Dana Drachsler Cohen
Submitted to the Senate
of the Technion — Israel Institute of Technology
Sivan 5777, Haifa, June 2017
This research was carried out under the supervision of Prof. Eran Yahav, in the Faculty of
Computer Science.
Some results in this thesis have been published as articles by the author and research collaborators
in conferences and journals during the course of the author’s doctoral research period; the most
up-to-date versions are:
Nader Bshouty, Dana Drachsler-Cohen, Martin T. Vechev, and Eran Yahav. Learning disjunctions of predicates. In Proceedings of the 30th Conference on Learning Theory, COLT 2017, 2017.
Dana Drachsler-Cohen, Sharon Shoham, and Eran Yahav. Synthesis with abstract examples. In Computer Aided Verification - 29th International Conference, CAV 2017, 2017.
Dana Drachsler-Cohen, Martin T. Vechev, and Eran Yahav. Optimal learning of specifications from examples (in preparation). CoRR, abs/1608.00089, 2016.
ACKNOWLEDGEMENTS
First and foremost, I would like to thank Prof. Eran Yahav, who I have been fortunate to have
as my advisor. Thank you for your contagious enthusiasm that kept me optimistic throughout
my studies. Thank you for so many insightful discussions, especially those during late nights
before deadlines. Thank you for teaching me how to write in a simple and elegant way, how to
find and explain the essence of any complex idea, and how to shorten my sentences (though we
might have to keep working on this...). Thank you for teaching me to always pursue the most
interesting research questions and overcome any challenge along the way. But above all, thank
you for the endless belief in me. For all these and more, I will be forever grateful.
I would also like to thank my collaborators who contributed greatly to this thesis. To Prof.
Martin Vechev, thank you for the late hours and for the discussions and advice, for the short and
long term. To Prof. Nader H. Bshouty, thank you for the great help with the theoretical aspect
of this thesis; I have learned so much from you. Finally, to Prof. Sharon Shoham, thank you for
the long hours, for teaching me how to always look for ways to simplify ideas, algorithms and
proofs, and how to track obscure pitfalls and elegantly overcome them.
Last but not least, I would like to thank my parents, Ilana and Gabriel, my sister, Dorin,
and my beloved husband, Gal. Thank you for the support through the intense times, for always
putting things in perspective, and above all, for your unconditional love and belief in me. This
thesis is dedicated to you.
The generous financial help of the Technion is gratefully acknowledged.
Contents
List of Figures
Abstract 1
1 Introduction 3
1.1 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.1.1 Program Synthesis . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.1.2 Exact Learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
2 Preliminaries 11
2.1 Formulas . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.2 Exact Learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
3 Time-Series Patterns from Charts 13
3.1 The Challenge . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
3.2 Definitions and Problem Definition . . . . . . . . . . . . . . . . . . . . . . . . 15
3.2.1 Technical Analysis Terms . . . . . . . . . . . . . . . . . . . . . . . . 15
3.2.2 Problem Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
3.3 Learning Patterns from Charts . . . . . . . . . . . . . . . . . . . . . . . . . . 17
3.3.1 Learning through Examples . . . . . . . . . . . . . . . . . . . . . . . 17
3.3.2 Learning with an Initial Positive Example . . . . . . . . . . . . . . . . 18
3.4 Synthesizing Code from Formulas . . . . . . . . . . . . . . . . . . . . . . . . 25
3.4.1 The AmiBroker Trading Platform . . . . . . . . . . . . . . . . . . . . 26
3.4.2 Generating AFL Code . . . . . . . . . . . . . . . . . . . . . . . . . . 26
3.4.3 Supporting Numerical Constraints . . . . . . . . . . . . . . . . . . . . 28
3.5 Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
3.5.1 Common Patterns . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
3.5.2 The Efficiency of the Synthesis Process . . . . . . . . . . . . . . . . . 29
3.5.3 The Quality of the Synthesized Queries . . . . . . . . . . . . . . . . . 32
3.6 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
3.7 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
4 Learning Disjunctions and Conjunctions of Predicates 37
4.1 The Search Space . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
4.1.1 The Nodes of the Search Space . . . . . . . . . . . . . . . . . . . . . 38
4.1.2 The Edges of the Search Space . . . . . . . . . . . . . . . . . . . . . . 38
4.2 Searching the Space with Witnesses . . . . . . . . . . . . . . . . . . . . . . . 40
4.3 The D-SPEX Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
4.4 The C-SPEX Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
4.5 A Polynomial Time Algorithm for Variable Inequalities . . . . . . . . . . . . . 47
4.5.1 Acyclic Sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
4.5.2 Cyclic Sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
4.6 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
5 Learning a DNF of Predicates 55
5.1 The Search Space . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
5.2 Searching the Space with Witnesses . . . . . . . . . . . . . . . . . . . . . . . 56
5.3 Learning when Predicates are Closed under Negation . . . . . . . . . . . . . . 59
5.3.1 A Lower and Upper Bound . . . . . . . . . . . . . . . . . . . . . . . . 59
5.3.2 Learning with Representative Positive Examples . . . . . . . . . . . . 60
5.4 Learning when Predicates are Anti-closed under Negation . . . . . . . . . . . . 67
5.4.1 The Search Space . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
5.4.2 A Learning Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . 68
5.5 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
6 Synthesis with Abstract Examples 71
6.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
6.2 Abstract Specifications and Sequence Expressions . . . . . . . . . . . . . . . . 73
6.2.1 Abstract Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
6.2.2 Sequence Expressions . . . . . . . . . . . . . . . . . . . . . . . . . . 73
6.2.3 Sequence Expressions as Abstract Examples . . . . . . . . . . . . . . 75
6.3 An Algorithm for Learning Abstract Examples . . . . . . . . . . . . . . . . . 77
6.3.1 Input Generalization . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
6.3.2 Completion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
6.3.3 Guarantees . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
6.3.4 Running Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
6.4 Synthesis with Abstract Examples . . . . . . . . . . . . . . . . . . . . . . . . 82
6.5 Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
6.5.1 Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
6.5.2 Synthesis Framework Evaluation . . . . . . . . . . . . . . . . . . . . . 86
6.5.3 Abstract Example Specification Evaluation . . . . . . . . . . . . . . . 88
6.6 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
6.7 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
7 Conclusion 91
Hebrew Abstract i
List of Figures
1.1 Using Flash Fill to send meeting appointments. . . . . . . . . . . . . . . . . . 4
1.2 The differences between (classic) program synthesis, programming by example
(PBE), and exact programming by example. In program synthesis, an expert
user provides a specification in the form of a logical formula and the synthesizer
returns a program meeting the specification. In PBE, an end user provides a
set of input-output examples and the synthesizer returns a program consistent
with the examples, but possibly not fully capturing the user’s intent. In exact
PBE, the synthesizer learns the user’s intent by interacting with the user through
examples. Then, the synthesizer returns a program that captures the user’s intent. 5
3.1 (a) A price chart (from Yahoo Finance). (b) The head and shoulders pattern
(from [Inv]). (c) The complete synthesized program for the head and shoulders
pattern. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
3.2 Example of the head and shoulders pattern. . . . . . . . . . . . . . . . . . . . 15
3.3 The new patterns (figures taken from [Inv]). . . . . . . . . . . . . . . . . . . . 30
3.4 Recall as a function of the number of questions presented in the learning process. 35
6.1 SE grammar: σ ∈ Σ, x ∈ x, X ∈ X, k ∈ K, R ∈ R, f ∈ F . . . . . . . . . . . 74
6.2 Detailed results for B(8). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
Abstract
The vast majority of computer users do not know how to code, and thus can leverage
computers only to the extent provided by off-the-shelf software. The abundance of software
products targeting the same domains demonstrates that such software can be too complex for users or
not sufficiently suited to their needs. Programming by example (PBE) has flourished in recent
years to mitigate exactly this problem and enable users to write their own programs by describing
their intent only through examples, without writing or examining a single piece of code. An
inherent problem of PBE is that examples often under-specify the full intent of the users and thus
PBE algorithms must heuristically choose one program from many non-equivalent programs.
While this approach has been shown to be successful in some cases, it cannot guarantee that the
user’s intent will be fully captured, and is thus impractical in many cases.
In this work, we study the problem of learning the exact user intent from examples. We
model user intent as formulas over an arbitrary set of predicates and study several classes of
formulas. We start with conjunctive formulas capturing patterns in streams. We then study
conjunctive and disjunctive formulas over arbitrary predicates. We finally study the class of
disjunctive normal form (DNF) formulas, which implies that any formula can be learned. These
algorithms are inspired by exact learning algorithms that were shown to be successful in other
domains (e.g., learning automata). Our setting is novel in that previous works limited the
types of predicates, and with them the expressibility of user intent.
In the last part of this work, we define the notion of abstract examples and show that it
can help to drastically reduce the number of examples posed in the learning process. Abstract
examples provide a middle ground between concrete examples and formulas that describe
program behavior on multiple examples. We show an algorithm that describes a program
through abstract examples. This algorithm can extend previous PBE synthesizers with the ability
to communicate to the user a candidate program in an intuitive language. User acceptance
of a set of abstract examples covering the input domain implies that the candidate program
is guaranteed to capture the user’s intent. We exemplify this approach on the string and bit vector
domains.
Chapter 1
Introduction
Program synthesis is the task of automatically generating a (low-level) program from a (high-
level) specification. The specification is often declarative and does not explain how the program
should be implemented. Thus, synthesizers cannot syntactically translate specifications to
executable code, like compilers do. The first synthesizers coped with this challenge using
deductive and transformational methods [MW71]. The nice aspect of these synthesizers is
that the output programs are correct-by-construction. Their disadvantage, however, is that the
choice of rules is non-deterministic, and thus they are not guaranteed to terminate. Though some
techniques considered heuristics to improve the rules chosen [JNR02], modern synthesizers have
turned to constraint-solving approaches [SLJB08] or to enumerative approaches [SSA13]. In
these approaches, specifications are a set of constraints and any solution meeting the constraints
is considered valid. The premise of these approaches is that the specification is complete.
Namely, any solution meeting the specification is also globally correct, even on inputs not
covered by the specification.
At the same time, the setting of programming by example (PBE) has gained popularity [Gul10, LWDW03, DSPGMW10, HG11, Gul11, GHS12, SG12, YTM+13, AGK13, ZS13,
MTG+13, LG14, FCD15, BGHZ15, PG15, SG16, RBVK16]. In programming by example, the
specification is a set of input-output examples. Compared to other synthesis settings, where
the specification is a logical formula or an inefficient program implementation, PBE requires
no a priori knowledge on how to represent the specification. Thus, if in former settings, the
users had to be experts or programmers, in PBE the users can be any end user. This means
that the target audience is significantly larger, which makes the potential impact of PBE much
greater. The premise behind this setting is that users can convey their intent with a few examples.
Unfortunately, this is not true and examples inherently provide an under-specification of the
user’s intent. Thus, PBE algorithms can only guarantee to output a program consistent with the
provided examples, and cannot guarantee to capture user intent on unseen inputs. A user who
wishes to guarantee correctness for all possible inputs has to manually inspect the synthesized
program, an error-prone and challenging task.
Example Eli Gold is a crisis manager at a respected law firm. Due to a crisis, he has to meet all
office members personally. After setting up times and storing the meeting times in an Excel
Figure 1.1: Using Flash Fill to send meeting appointments.
spreadsheet (Fig. 1.1), Eli wants to send emails with a personal message notifying each member
of the time of the meeting. He starts typing the messages in Excel. While typing the third
message, Flash Fill [Gul11] (a PBE synthesizer integrated in Excel) synthesizes a program
and creates messages for all members on the list. In our example, the input-output examples
Flash Fill takes are the first two rows, with columns A–D serving as the input and column
E serving as the output. The rest of the rows are unseen inputs, which the synthesizer does
not consider when generating the output program but executes it on them when it completes.
Without describing the full details of the Flash Fill operation, suffice it to say that it considers
programs over strings that can copy a substring from the input, add a constant, and concatenate
strings. Flash Fill is designed to be invoked when it detects a pattern. With respect to the e-mail
from our crisis manager, this means that after the first two examples, Flash Fill has detected a
pattern. Thus, it synthesizes a program and fills in the missing outputs. At first glance, Flash
Fill seems to have learned the correct program. However, careful inspection reveals that instead
of the desired “Hi” greeting, the message’s first word is an “H” followed by the second letter
of the person’s first name. This demonstrates the importance of inspecting the synthesis result
before relying on it to handle additional examples (e.g., lines 4–8 in the Excel spreadsheet).
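The ambiguity in this example can be made concrete with a small sketch. The two candidate programs below are illustrative and hypothetical (they are not Flash Fill's actual hypothesis space): both agree on the two example rows, yet they diverge on an unseen name.

```python
# Two toy string programs, both consistent with the two provided examples
# "Diane ... 11:00" -> "Hi Diane, ..." and "Will ... 12:00" -> "Hi Will, ...".
# These are illustrative candidates, not Flash Fill's actual internals.

def intended(first_name: str, time: str) -> str:
    # The user's intent: a constant "Hi" greeting.
    return f"Hi {first_name}, please come to my office at {time}. -EG"

def learned(first_name: str, time: str) -> str:
    # A spurious program: "H" concatenated with the 2nd letter of the name.
    # It happens to agree on "Diane" (H+i) and "Will" (H+i).
    return f"H{first_name[1]} {first_name}, please come to my office at {time}. -EG"

# Both programs agree on the two provided examples...
assert intended("Diane", "11:00") == learned("Diane", "11:00")
assert intended("Will", "12:00") == learned("Will", "12:00")
# ...but diverge on an unseen input: "Ha Cary, ..." instead of "Hi Cary, ...".
assert intended("Cary", "15:00") != learned("Cary", "15:00")
```

This is exactly the failure mode above: any finite example set leaves many non-equivalent programs consistent with it.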
In this thesis, we show algorithms that are guaranteed to learn the user’s intent on all inputs,
while still enabling the user to communicate through examples. To this end, we formalize the
problem of learning user intent from examples as an instance of the exact learning problem1.
Exact learning is a field in computational learning theory that is usually associated with one
of the following models: (i) identification in the limit (Gold [Gol67]), (ii) PAC learning (Valiant [Val84]), and (iii) query learning (Angluin [Ang88]). In this thesis, we follow Angluin’s
model, which includes a teacher and a student. The teacher knows a concept and the student’s
goal is to learn this concept. To this end, Angluin defines two types of queries that the student
can pose: membership queries and equivalence queries. The secondary goal of the student
is to pose as few queries as possible. In our context, the teacher is the end user, the student
is the synthesizer, and the concept is a formula describing the user’s intent on all possible
inputs. A membership query is whether a given input-output pair satisfies the target formula.
An equivalence query (or, validation query) is whether a certain formula describes the user’s
intent. If the teacher accepts a validation query, the learning is complete. If not, the teacher
provides a counterexample. Though validation queries are inapplicable in PBE settings, we
note that there are works in synthesis that take this approach [ABJ+13, IGIS10, SL08]. In
these works, the teacher is realized as a verifier with a formal specification (rather than a user).
1This is why the thesis is titled Exact Programming by Example.
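The teacher-student interaction can be sketched in a few lines. Everything below is an illustrative assumption: a tiny finite domain, a teacher simulated as a known set, and a naive student that processes one counterexample per round. It is not one of the algorithms of this thesis, only a minimal instance of the membership/equivalence query model.

```python
# A minimal sketch of Angluin-style exact learning with membership and
# equivalence (validation) queries, over a tiny finite domain.

DOMAIN = list(range(8))                       # hypothetical input domain
TARGET = {x for x in DOMAIN if x % 2 == 0}    # the concept the teacher knows

def membership_query(x):
    # "Does input x belong to the target concept?" -> yes/no
    return x in TARGET

def equivalence_query(hypothesis):
    # "Is this hypothesis the target?" Returns (True, None) on acceptance,
    # otherwise (False, counterexample).
    diff = TARGET.symmetric_difference(hypothesis)
    return (True, None) if not diff else (False, min(diff))

def learn():
    hypothesis = set()
    while True:
        ok, counterexample = equivalence_query(hypothesis)
        if ok:
            return hypothesis
        # Classify the counterexample with a membership query and update.
        if membership_query(counterexample):
            hypothesis.add(counterexample)
        else:
            hypothesis.discard(counterexample)

assert learn() == TARGET
```

In the PBE reading, `membership_query` asks the user about one input-output pair, and `equivalence_query` asks the user to validate a candidate; the student's goal is to minimize the number of such questions.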
[Figure 1.2 shows three panels: program synthesis (a logical specification φ over strings, e.g., φ(X, Y, Z) = concat("Hi", concat(X, concat("please come to my office at", concat(Z, " -EG")))), and a program P ⊨ φ); PBE (two input-output examples, in1 → out1 = Diane Lockhart 11:00 → Hi Diane, please come to my office at 11:00. -EG and in2 → out2 = Will Gardner 12:00 → Hi Will, please come to my office at 12:00. -EG, and a program P consistent with them); and exact PBE (the same examples plus interaction, e.g., the query Cary Agos 15:00 → ? answered by Hi Cary, please come to my office at 15:00. -EG, yielding P ⊨ φ).]
Figure 1.2: The differences between (classic) program synthesis, programming by example (PBE), and exact programming by example. In program synthesis, an expert user provides a specification in the form of a logical formula and the synthesizer returns a program meeting the specification. In PBE, an end user provides a set of input-output examples and the synthesizer returns a program consistent with the examples, but possibly not fully capturing the user’s intent. In exact PBE, the synthesizer learns the user’s intent by interacting with the user through examples. Then, the synthesizer returns a program that captures the user’s intent.
The formal specification provides an efficient way to answer validation questions automatically. Our formulation of PBE as an instance of exact learning is novel. Except for a single
work [JGST10] that exhaustively presents membership queries until a single program remains,
no PBE synthesizers guarantee to learn the user’s intent, beyond the provided examples. While
the approach in [JGST10] guarantees to output a program that fully captures the user’s intent, it
has no non-trivial bounds on the number of membership queries posed. In contrast, in exact
learning the effectiveness of an algorithm is demonstrated by analyzing the membership query
complexity and comparing to the lower bound.
To illustrate the gap our approach addresses, consider Fig. 1.2, which continues our example
from Fig. 1.1. Program synthesis targets settings where an (expert) user provides a formal
specification, e.g., a logical formula (in Fig. 1.2, the specification is a first order formula over the
string theory) and the synthesizer looks for a program that meets this specification. While this
approach guarantees to synthesize programs meeting the specification, writing the specification
is not trivial. Programming by example targets settings where a user provides a set of input-
output examples, and the synthesizer looks for a program consistent with the examples. While
this approach provides an intuitive way to convey the user’s intent, it does not guarantee to
synthesize a program that captures the user’s intent on all possible inputs. Exact programming
by example targets the gap between these approaches – the user and synthesizer interact to learn
the user’s intent through examples. Then, the synthesizer synthesizes a program that guarantees
to capture the user’s intent.
In this thesis, we make the following contributions.
An Exact PBE Synthesizer for Time-Series Patterns (Chapter 3) We begin by showing
an exact PBE algorithm that learns patterns in time-series charts. Time-series charts are used
in many domains, including financial analysis ([Bul05]), medicine ([CF07]), and seismology
([MEMlT+10]). We formalize patterns as conjunctive formulas over variable inequalities (i.e.,
predicates of the form xi > xj over n variables) and study the problem of learning this class
of formulas with membership queries (that take the form of charts). In this setting, we assume
the learning begins with an initial positive example (i.e., chart). We then show how to extend
this algorithm with a synthesizer that takes a formula describing a pattern and generates an
executable program that detects this pattern in stock streams. We experimentally evaluate this
algorithm and show that it learns a range of popular chart patterns with few questions, and that
synthesized programs are able to detect popular pattern occurrences with an average precision
of 95% in real stock streams.
We continue by generalizing the learning problem to learn formulas over an arbitrary set
of predicates. These algorithms make it possible to split the exact PBE problem into two sub-
problems: (i) learning a formula describing the user’s intent on all inputs, and (ii) synthesizing a
program from that formula. We first study the class of disjunctions and conjunctions and then
study the class of disjunctive normal form (DNF) formulas.
Exact Learning Algorithms to Learn Disjunctions and Conjunctions (Chapter 4) In this
chapter, we study the learnability of the class of disjunctions over a set of predicates. In
this setting predicates may be dependent, and thus syntactically different formulas may be
semantically equivalent. Thus the challenge is to identify the non-equivalent formulas to avoid
posing redundant membership queries. Since it is expensive to compute whether two formulas
are equivalent, it is crucial to limit this kind of computation as much as possible. In this chapter,
we present an algorithm that traverses the space of non-equivalent formulas, but in a lazy fashion
– it computes members of the space only when doing so is required for the learning. We then
present the dual algorithm to learn the class of conjunctions. Lastly, we revisit the problem of
learning patterns from charts, but without the requirement that the user provide initial examples.
In this case, the class of formulas can express cyclic constraints (e.g., x1 < x2 ∧ x2 < x1). We
thus first study the class which does not permit cyclic constraints. We show that for this class,
learning can be done in polynomial time. We then study the general case, and show that learning
is equivalent to the problem of enumerating all the maximal acyclic subgraphs of a directed
graph, which is still an open problem ([ABC+12, BCL+13, Was16]).
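The acyclic/cyclic distinction has a natural graph reading, sketched below under our own simplified encoding (a plain DFS cycle check, not the chapter's learning algorithms): a set of inequalities {x_i > x_j} induces a directed graph with an edge i → j per predicate, and the set contains a cyclic constraint exactly when this graph has a directed cycle.

```python
# Cycle check for a set of variable inequalities, viewed as a digraph:
# an edge i -> j for each predicate x_i > x_j. Illustrative sketch only.

def is_acyclic(n, edges):
    adj = {i: [] for i in range(n)}
    for i, j in edges:
        adj[i].append(j)
    state = [0] * n  # 0 = unvisited, 1 = on DFS stack, 2 = done

    def dfs(u):
        state[u] = 1
        for v in adj[u]:
            # A back edge to a node on the stack closes a directed cycle.
            if state[v] == 1 or (state[v] == 0 and not dfs(v)):
                return False
        state[u] = 2
        return True

    return all(state[u] or dfs(u) for u in range(n))

assert is_acyclic(3, [(0, 1), (1, 2)])        # x0 > x1 > x2: consistent order
assert not is_acyclic(2, [(0, 1), (1, 0)])    # x0 > x1 and x1 > x0: cyclic
```

The general case then asks for all maximal acyclic subgraphs of such a digraph, which is the open enumeration problem cited above.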
Exact Learning Algorithms to Learn DNF Formulas (Chapter 5) We continue with a
study of the class of disjunctive normal form (DNF) formulas over arbitrary predefined predicates.
We begin with a general algorithm and then focus on two special settings where the set of
predicates is: (i) closed under negation, and (ii) “anti-closed” under negation. We show for each
setting an algorithm with better query complexity. In particular, for the first setting, we show an
algorithm optimal in the number of membership queries.
This chapter actually completes the topic of learning specifications from examples, as for
any formula there is an equivalent DNF formula. Thus, in the last chapter we show a different
approach to obtain exactness in PBE.
Synthesis from Abstract Examples (Chapter 6) In this chapter we show an exact learning
algorithm that takes a different approach from the previous chapters, where the task of learning
user intent is separate from the program synthesis task (and the former task is the primary
focus). However, PBE experts often believe that the program space itself should drive the
search to the target program. Thus, in the final chapter (which we believe opens a new field
for future work), we show how to obtain exactness while performing the search in the program
space. The main idea is to interact with the user through abstract examples, to be used by
the program synthesizer to communicate its behavior. The abstract examples serve as an
intuitive specification for candidate programs. Thus, through abstract examples, the final
candidate program is guaranteed to capture the user’s intent on all inputs. We have implemented
our approach and we experimentally show that our synthesizer communicates with the user
effectively by presenting on average 3 abstract examples until the user rejects false candidate
programs. Further, we show that a synthesizer that prunes the program space based on the
abstract examples reduces the overall number of required concrete examples in up to 96% of the
cases.
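The idea of an abstract example can be sketched concretely. The thesis's sequence expressions are richer than what follows; here we render one abstract example as a plain regular expression over the inputs, purely as an illustrative stand-in, to show how a single abstract example summarizes infinitely many concrete input-output examples.

```python
# One abstract example standing for a set of concrete examples.
# Illustrative stand-in: the real sequence-expression language of
# Chapter 6 is richer than a regular expression.
import re

# "Any name followed by a time maps to the greeting built from them":
abstract_input = re.compile(r"(\w+) (\d\d:\d\d)")

def covered(concrete_input, concrete_output):
    m = abstract_input.fullmatch(concrete_input)
    if not m:
        return False
    name, time = m.groups()
    return concrete_output == f"Hi {name}, please come to my office at {time}. -EG"

# A single abstract example covers many concrete examples at once:
assert covered("Cary 15:00", "Hi Cary, please come to my office at 15:00. -EG")
assert not covered("Cary 15:00", "Ha Cary, please come to my office at 15:00. -EG")
```

Accepting a set of such abstract examples that covers the whole input domain is what certifies that the candidate program matches the user's intent everywhere.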
1.1 Related Work
In this section, we survey works in program synthesis and exact learning.
1.1.1 Program Synthesis
Some consider the roots of synthesis to be in the works of mathematicians, who have been
developing algorithmic approaches to prove theorems and solve problems since the 30’s [Kol32,
Gre69, DP60]. However, program synthesis, with the aim of generating a program from some
specification, dates to the end of the 60’s [WL69], when Waldinger and Lee showed an algorithm
that takes a first-order logical formula and generates a LISP program meeting this specification.
Their main idea is to phrase LISP instructions as axioms, the formula as a theorem, and use a
theorem prover to find a “proof”. If found, the proof is processed to a program. This approach,
known later as deductive synthesis, was further developed in the 70’s, mostly by Waldinger
and Manna [MW71, MW79, MW80]. In parallel, Waldinger and Manna [MW75] also showed
an artificial intelligence approach to synthesize programs. The idea was to define rules that
gradually transform a goal specification into a program, where each step introduces new sub-
specifications (goals). Both deductive synthesis and goal-based synthesis rely on having a
complete specification, which is often difficult for users to provide. Thus, at the same time,
another approach to program synthesis was developed, where instead of giving the synthesizer
a formal specification, input-output examples were provided [Har74, SSG75, Sum77, Bie78].
Later this setting became known as programming by example. Another setting that was studied
was that of concrete execution examples, which explain how to obtain the output from the
input. This setting was first presented by Smith [Smi75] and implemented in a system called
Pygmalion. Later this setting became known as programming by demonstration. Since then,
works in synthesis have developed new algorithms and paradigms; however, the settings studied
remained mostly the same. A more recent setting assumes that in addition to a formal specification
a (limited) program syntax is provided [SLJB08, SSL11, AFSS16, BTGC16, ABJ+13]. This is
known as syntax-guided synthesis.
Moving forward to more recent years, the development of hardware and the increasing
size of CPU memory made previously impractical solutions viable. In particular, solutions
such as enumerating the program space [URD+13], succinctly representing all programs with
graphs [Gul11], and looking for solutions with constraint-solvers [SL08] have become effective
in domains that once were too large for the search to complete. A very partial list of the studied
domains includes domains for end users, such as string manipulation programs in spreadsheets [Gul11]; data extraction [LG14] and smartphone applications [LGS13]; domains used
by programmers, such as data-structures [SLJB08, SSL11] and SQL-queries [ZS13]; and the
domain of compilers and optimizations, such as optimization of bit manipulations [JGST10],
compilation to low-power architectures [PJS+14], compilation of general-purpose language programs to optimized DSLs [CKSL15], and optimization of programs interacting with databases
via ORMs [CSLM13].
In this thesis, we focus on the setting of programming by example. This setting has gained
increased popularity over the last fifteen years [Gul10, LWDW03, DSPGMW10, HG11, Gul11,
GHS12, SG12, YTM+13, AGK13, ZS13, MTG+13, LG14, FCD15, BGHZ15, PG15, SG16,
RBVK16] due to its simplicity, which makes it accessible to end users who are not programmers.
The vast majority of PBE algorithms synthesize programs consistent with the input-output
examples, which may not capture the user’s intent on unseen inputs. However, some works
guarantee to output the target program. For example, CEGIS [SL08] learns a program by
introducing equivalence queries upon finding a candidate program that is consistent with the
provided examples. By accepting the equivalence query, the user confirms that his intent has
been captured on all inputs. In oracle-guided synthesis [JGST10], the program space is assumed
to be finite (in fact, it is assumed to be a permutation of a fixed number of instructions). This
enables the posing of membership queries to prune the program space until only one program
remains (more precisely, until all programs that remain are equivalent). While correctness is
guaranteed, there is no guarantee on the number of queries posed, as in each step two non-
equivalent programs are chosen arbitrarily and then an input on which they return different
outputs is presented as a membership query.
1.1.2 Exact Learning
Learning from examples has been extensively studied in computational learning theory. As
mentioned, there are three main models: (i) identification in the limit (by Gold [Gol67]),
(ii) query learning (by Angluin [Ang88]), and (iii) PAC learning (by Valiant [Val84]). While
these models vary in their settings and goals, they all learn languages, and although their query
types differ slightly, each supports queries that ask the teacher whether a specific word belongs
to the target language. Our setting follows Angluin's setting, which defines the teacher-student
model and two types of queries: membership and equivalence (also called validation). In a
membership query the student picks a word in the domain and asks whether it is part of the
target language. The teacher responds with a yes or no. In an equivalence query the student
picks a language and asks whether it is the target language. If the teacher accepts that language,
the learning is complete. Otherwise, the teacher provides a counterexample, that is, a word that
belongs to one language but not to the other.
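To make the two query types concrete, here is a minimal sketch of a teacher over a finite language; the `Teacher` class and its methods are illustrative, not part of the formal model:

```python
# A minimal sketch of the teacher side of Angluin's model (names are ours).
class Teacher:
    def __init__(self, target):
        self.target = target  # the target language, as a finite set of words

    def membership(self, word):
        # Membership query: is this word in the target language?
        return word in self.target

    def equivalence(self, hypothesis):
        # Equivalence query: accept, or return a counterexample word on
        # which the hypothesis and the target disagree.
        diff = self.target.symmetric_difference(hypothesis)
        return (True, None) if not diff else (False, min(diff))

teacher = Teacher(target={"ab", "aab", "aaab"})
assert teacher.membership("ab")
ok, cex = teacher.equivalence({"ab", "aab"})
assert not ok and cex == "aaab"       # "aaab" belongs to one language only
```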
In this thesis, we focus on learning with membership queries only. The literature offers
many results for this setting across numerous applications, including group testing [DH00, DH06], blood
testing [Dor43], chemical leak testing, chemical reactions [AC08], electrical short detection,
codes, multi-access channel communications [BG07], molecular biology, VLSI testing, AIDS
screening, whole-genome shotgun sequencing [ABK+02], DNA physical mapping [GK98],
game theory [Pel02], and many other applications [DH00, ND00, BGV05, DH06, Cic13, BG07].
However, no prior work has studied the class we address: formulas over an arbitrary set of predicates.
The setting of learning with membership queries is also related to the notion of teaching
dimension. Goldman and Kearns [GK95] define the teaching dimension of a concept class as
the minimum number of examples a teacher must reveal to uniquely identify any concept in
the class. In particular, Goldman and Kearns have studied the teaching dimension for the class
of formulas over monomials and provided lower bounds on the required number of questions.
In this thesis, we study the class of formulas over an arbitrary set of predicates and analyze
the query complexity compared to the lower bound (i.e., the teaching dimension). Due to the
potential correlations between predicates, their approach is inapplicable in our setting.
Chapter 2
Preliminaries
In this section, we provide preliminaries and notations used throughout the thesis.
2.1 Formulas
In this section, we define common terminology related to sets and formulas.
Throughout the thesis, we focus on quantifier-free first-order logical formulas (or simply,
formulas) restricted to predicate symbols. Formally, we focus on the following class of formulas:
Definition 2.1.1. Let x0, x1, ... be an infinite set of variables and Q be a set of predicate symbols.
The set of formulas is defined inductively:
• Predicate symbols: If q is an n-ary predicate symbol, and xi1 , ..., xin are variables, then
q(xi1 , ..., xin) is a formula.
• If ϕ is a formula, then ¬ϕ is a formula.
• If ϕ and ψ are formulas, then ϕ ∨ ψ and ϕ ∧ ψ are formulas.
We follow the common semantics to evaluate the truth value of a formula. In the following,
we abuse the term predicate to refer both to its syntactic meaning (the symbol that is part of the
formula) and to its semantic meaning (a set).
A formula ϕ is called disjunctive (resp. conjunctive), or a disjunction (resp. conjunction), if
it is of the form ∨q∈Q q (resp. ∧q∈Q q), where Q is a set of literals. The set of literals over Q is
L = Q ∪ {¬q | q ∈ Q}. The set of literals L is closed under negation, where we equate ¬¬l with l.
A formula ϕ is called a DNF if it is of the form (∧q∈Q1 q) ∨ ... ∨ (∧q∈Qk q), where the Qi are
sets of literals. In the following, we denote by ∨Q the disjunctive formula ∨q∈Q q and by ∧Q the
conjunctive formula ∧q∈Q q, where Q is a set of literals.
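These definitions can be sketched programmatically; the representation of literals as (predicate, polarity) pairs and of assignments as sets of satisfied predicates is ours:

```python
# Literals are (predicate_name, positive?) pairs; an assignment is the set of
# predicate names that hold. This mirrors ∨Q, ∧Q and DNF from the text.
def holds(literal, assignment):
    pred, positive = literal
    return (pred in assignment) == positive

def disj(literals, assignment):      # ∨Q
    return any(holds(l, assignment) for l in literals)

def conj(literals, assignment):      # ∧Q
    return all(holds(l, assignment) for l in literals)

def dnf(clauses, assignment):        # (∧Q1) ∨ ... ∨ (∧Qk)
    return any(conj(c, assignment) for c in clauses)

a = {"q1"}                           # q1 holds, q2 does not
assert disj([("q1", True), ("q2", True)], a)
assert not conj([("q1", True), ("q2", True)], a)
assert dnf([[("q1", True), ("q2", False)], [("q2", True)]], a)
```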
2.2 Exact Learning
We follow an exact learning model that is heavily inspired by Angluin's exact learning
model [Ang88]. Given a domain D and a set Q of predicates over D, our goal is to learn a member
ψ in a certain class of formulas over Q, denoted Qclass, where class is either ∨ (the class
of disjunctions over Q), ∧ (the class of conjunctions over Q), or DNF (the class of DNF formulas
over Q). That member ψ, also called the target formula, defines a subset H (the hypothesis) of D
and thus has a single free variable. Formally, denote by M the model whose domain is D and
whose predicates are those in Q. Then, for all e ∈ D, M |= ψ[e] if and only if e ∈ H. If D is a
Cartesian product of size n, i.e., D = X × X × ... × X (n times), we define ψ with n free variables to
address individual elements in the tuples. We sometimes treat formulas as Boolean functions
over D, defined as follows: for all e ∈ D, ϕ(e) = 1 if M |= ϕ[e], and ϕ(e) = 0 otherwise.
We assume a teacher (also called the user) that has a target formula ψ ∈ Qclass and a
learner that knows Qclass but not ψ. The teacher can answer membership queries for the target
function, that is, given e ∈ D (from the learner), the user returns true if e |= ψ and false
otherwise. The goal of the learner (the learning algorithm) is to find the target formula ψ with a
minimum number of membership queries.
Following are a few notations used throughout the thesis. OPT(Qclass) denotes the minimum
worst-case number of membership queries required to learn a formula ψ in Qclass. Given
ψ ∈ Qclass, we denote by Q(ψ) the set of all predicates appearing in ψ. For example,
Q(R1 ∨ R2) = {R1, R2}. Given e ∈ D, we denote by Q(e) the set of all predicates satisfied by
e: Q(e) = {q ∈ Q | e |= q}.
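The notation Q(e) can be sketched as follows (the predicates shown are illustrative, with predicates given as Boolean functions):

```python
# Sketch of Q(e): the set of predicates (as named Boolean functions)
# satisfied by an element e of the domain.
def q_of(e, predicates):
    return {name for name, p in predicates.items() if p(e)}

preds = {"R1": lambda e: e > 0,       # illustrative predicates over integers
         "R2": lambda e: e % 2 == 0}
assert q_of(4, preds) == {"R1", "R2"}
assert q_of(-3, preds) == set()
```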
Chapter 3
Time-Series Patterns from Charts
In this chapter, we show an exact PBE algorithm that learns patterns in time-series charts.
Time-series charts are used in many domains including financial analysis ([Bul05]), medicine
([CF07]), and seismology ([MEMlT+10]). Experts use these charts to predict important events
(e.g., trend changes in a stock price) indicated by special patterns. Common patterns have been
studied extensively, and many software platforms (e.g., MetaTrader, MetaStock, and AmiBroker
for finance analysts) enable these experts to write a program that alerts upon detecting their
customized pattern. Unfortunately, writing programs is a complex task for these experts, who
are not programmers.
We present a novel, interactive synthesis approach that relieves analysts from programming,
and allows them instead to specify their intent directly via visual examples. Our approach is
based on two key ideas: (i) a logical fragment expressive enough to capture interesting chart
patterns, and (ii) an interactive algorithm which leverages our logical fragment to learn target
queries by presenting the user with a polynomial number of examples to be classified. Our
results apply to any application of time-series patterns; however, in the following we focus
on patterns in financial streams. In particular, to evaluate our approach, we implemented a
procedure that transforms the synthesized queries into directly executable programs in a popular
trading platform. Experimental results show that our synthesizer learns a range of popular chart
patterns with few questions, and that synthesized programs are able to detect popular pattern
occurrences with an average precision of 95% in real stock streams.
3.1 The Challenge
Technical analysis is used by millions of traders for trading various assets, including stocks,
futures, and commodities. Technical analysis tries to predict future price movement based on
past price changes visualized in charts (e.g., Fig. 3.1(a)) and on special forms known as patterns.
The occurrence of a pattern in a chart is used as a predictor of future price trends. For example,
the head and shoulders pattern in Fig. 3.1(b) predicts price decline.
To detect chart patterns, analysts use pattern queries: queries that take an input price stream
and report matches of the patterns in the stream. There are many trading platforms that
(c)
1: Price = Close;
2: thrs = 0.5;
3: P5 = Peak(Price, thrs, 1);
4: PB5 = LastValue(PeakBars(Price, thrs, 1));
5: P3 = Peak(Price, thrs, 2);
6: PB3 = LastValue(PeakBars(Price, thrs, 2));
7: P1 = Peak(Price, thrs, 3);
8: PB1 = LastValue(PeakBars(Price, thrs, 3));
9: P-1 = Peak(Price, thrs, 4);
10: PB-1 = LastValue(PeakBars(Price, thrs, 4));
11: P0 = LLV(Ref(Price, -PB1-1), PB-1-PB1-1);
12: P2 = LLV(Ref(Price, -PB3-1), PB1-PB3-1);
13: P4 = LLV(Ref(Price, -PB5-1), PB3-PB5-1);
14: P6 = LLV(Price, PB5);
15: Filter = P0 < P2 AND P2 < P1 AND P1 < P3 AND P2 < P4 AND P4 < P5 AND P5 < P3 AND P6 < P0;

Figure 3.1: (a) A price chart (from Yahoo Finance). (b) The head and shoulders pattern
(from [Inv]). (c) The complete synthesized program for the head and shoulders pattern.
provide built-in queries; however, analysts typically want to define patterns based on their own
viewpoint [LMW00], ideally via a quick and intuitive process that enables them to adapt queries
after obtaining preliminary results.
There are various domain-specific languages (DSLs) for writing pattern queries; however,
the task of writing queries is complex and error-prone, especially for analysts who are not
expert programmers. For example, Fig. 3.1(c) shows a query written in AFL, a DSL of the
well-known AmiBroker trading platform. This query detects the head and shoulders pattern by
locating seven peaks and lows (P0, ..., P6) in a price stream and checking whether they meet the
conditions that characterize the pattern. To write such queries, analysts are not only required to
know the language primitives (marked in bold in Fig. 3.1(c)), but also how to combine them
correctly in the query. For example, when LLV receives as input a mathematical expression
(Lines 11-13), each operand must be defined using the LastValue operation. Failing to do so
results in an incorrect query.
Current Approaches The interest in technical analysis queries led to the development of DSLs
in many trading platforms (e.g., MetaTrader, MetaStock, Amibroker, NinjaTrader). While these
DSLs offer tailored primitives, they are still strict programming languages and require familiarity
with programming. Microsoft’s StreamInsight [CGM10] allows analysts to express patterns via
state machines; however, they are still required to encode them in C#, which is non-trivial even
for programmers. CPL [ACK01] is a Haskell-based language designed to simplify programming
of chart pattern queries. Yet it requires familiarity with Haskell and functional programming,
which even experienced programmers may not have. All these approaches require analysts
to express chart patterns in strict programming languages. In contrast to these approaches,
we present a new synthesis approach that enables analysts to work directly with visual chart
examples and not with programs. To describe patterns, analysts provide visual chart examples
to our interactive synthesizer, called SyFi. SyFi uses visual examples to learn formulas that
Figure 3.2: Example of the head and shoulders pattern.
capture the patterns, and synthesizes efficient queries from the formulas. The elegance of this
approach is that analysts need not understand formulas or programming languages and only deal
with intuitive and familiar charts.
3.2 Definitions and Problem Definition
In this section, we provide definitions and state the problem addressed.
3.2.1 Technical Analysis Terms
Price Streams A price stream is a function mapping time points (such as date or hour) to
prices. For example, Fig. 3.2 shows a price stream at the resolution of days where each date
is mapped to the stock closing price on that date. To simplify presentation, we assume price
streams are stock closing prices and that time points are dates. Formally, a stream is a function
mapping natural numbers to real-valued prices: S : N → R.
Prices do not move in a straight line, but rather in zigzags (e.g., Fig. 3.1(a)). Still, prices
exhibit trends that are the overall direction of the price stream at a certain period of time.
Trends are determined by the notable peak and low points. Notable can be interpreted in many
ways; thus, to accommodate any interpretation, we henceforth assume an extremum function
ES : N → {0, 1}, defined over a stream S, that flags the extremum points:

ES(i) = 1 ⇔ S(i) is an extremum point.
Let i < j be points such that ES(i) = ES(j) = 1 and ES(i′) = 0 for all i < i′ < j.
We say that i and j show an uptrend if S(i) < S(j), a downtrend if S(i) > S(j), or a sideways
trend if S(i) = S(j).¹ For example, the points in the left part of Fig. 3.1(a) show an uptrend.
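A minimal sketch of this trend classification, using the threshold convention from the footnote (the threshold value and stream are illustrative):

```python
# Classify the trend between two consecutive extremum points i < j of a
# price stream S, using a comparison threshold (value is illustrative).
def trend(S, i, j, thrs=0.5):
    if S[j] > S[i] + thrs:
        return "up"
    if S[i] > S[j] + thrs:
        return "down"
    return "sideways"

S = [10.0, 12.0, 11.9, 9.0]          # an illustrative price stream
assert trend(S, 0, 1) == "up"
assert trend(S, 1, 2) == "sideways"  # |12.0 - 11.9| is within the threshold
assert trend(S, 2, 3) == "down"
```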
Line Charts Streams are inspected in bounded time frames, known as charts. A price chart is
a function mapping a finite set of consecutive dates to their corresponding prices. A line chart is
a chart that shows the line connecting the daily closing prices (e.g., Fig. 3.1(a)). We focus on
line charts because many analysts believe that the closing price is the most significant indicator
of price activity, and thus believe that line charts are a more indicative measure of this activity
¹Practically, one defines a threshold thrs and defines S(i) > S(j) if S(i) > S(j) + thrs, S(i) < S(j) if S(i) + thrs < S(j), and S(i) = S(j) if S(i) < S(j) + thrs and S(i) + thrs > S(j).
than other chart types. Formally, given a stream S, a chart from date d of size k is a function
mapping k consecutive dates starting from d to their prices, Sd..k : {0, ..., k−1} → R, such that:

Sd..k(i) = S(d + i)

When referring to an arbitrary chart, where the starting date is not important, we write S0..k.
Patterns Technical analysts predict future trends based on past trends that form into a known
pattern. Namely, a pattern is a sequence of trends. For example, the head and shoulders pattern
(Fig. 3.2) consists of three uptrends each followed by a downtrend. We note that while the trend
sequence is the main characteristic of a pattern, there are other characteristics, such as stock
volume ([Bul05]), which are ignored in this work. Formally, a pattern is a formula defined over
n variables, p0, ..., pn−1, that belongs to the class Qn∧, where:

Qn = {pi≺pj, ¬(pi≺pj) | 0 ≤ i, j < n}
For example, the following is a head and shoulders formula:
ϕHS(p0, . . . , p6) = p0≺p1 ∧ p2≺p1 ∧ p1≺p3 ∧ p5≺p3 ∧ p4≺p5 ∧ p6≺p5
The size of a pattern ϕP ∈ Qn∧, denoted by |ϕP |, is the maximal index of the variables. For
example, |p1≺p3| = 3.
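As a small sanity check (the concrete extremum values are ours), ϕHS can be evaluated directly by interpreting ≺ as < on the extremum values:

```python
# ϕ_HS from above, with ≺ interpreted as < on prices.
def phi_hs(p):
    p0, p1, p2, p3, p4, p5, p6 = p
    return (p0 < p1 and p2 < p1 and p1 < p3 and
            p5 < p3 and p4 < p5 and p6 < p5)

# Illustrative extremum values: shoulders (p1, p5) below the head (p3).
assert phi_hs([1.0, 3.0, 2.0, 5.0, 2.0, 3.0, 1.0])
assert not phi_hs([1.0, 3.0, 2.0, 2.5, 2.0, 3.0, 1.0])  # head not above shoulders
```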
A price chart Sd..k meets a pattern ϕP of size n if its extremum points are a model of ϕP:

Sd..k |= ∃i0, ..., in−1. [∧j∈{i0,...,in−1} 0 ≤ j < k] ∧ ∀j. [j ∈ {i0, ..., in−1} ⇔ ES(d + j) = 1] ∧ ϕP(Sd..k(i0), ..., Sd..k(in−1))
3.2.2 Problem Definition
We address the problem of synthesizing pattern queries from price charts. We split this problem
into two parts:
• Given a price chart, interactively learn a formula ϕP that captures the desired pattern.
• Given a formula, synthesize an executable query that detects charts that meet the pattern
in a stream.
The first task is exactly the problem of learning the class Qn∧ over the domain D = Rn, where
we assume a user who can answer membership queries. Technically, the membership queries
are charts – a sequence of real numbers uniquely defines a price chart – which implies that the
interaction is based on (visual) charts. The second task completes the first task by synthesizing
a query in a programming language. While this task is more straightforward, it requires
detecting charts that have n extremum points and reporting to the user when they meet the pattern.
3.3 Learning Patterns from Charts
In this section, we describe our algorithm for learning formulas from charts, called SyFi
(standing for synthesis of finance queries). We begin with the main insight that guides the
algorithm, then provide the algorithm, and finally illustrate it on an example.
3.3.1 Learning through Examples
In this section, we give a few results that guide our algorithm.
Lemma 3.3.1. Let ψ be the target pattern of size n and let S0..n be a chart. If S0..n |= ψ, then
for any q ∈ Qn such that S0..n ⊭ q, we have ψ ⊭ q. In particular, q ∉ Q(ψ).
Proof. Since S0..n |= ψ and S0..n ⊭ q, it follows that ψ ⊭ q. Since ψ is a
conjunction, if it does not logically imply q, then q ∉ Q(ψ).
Lemma 3.3.2. Let ψ be the target pattern of size n and let Q ⊆ Qn be a set of predicates such
that Q(ψ) ⊆ Q. If q ∈ Q is such that (∧(Q \ {q})) ⊭ ψ, then q ∈ Q(ψ).
Proof. Assume for contradiction that q ∉ Q(ψ), namely Q(ψ) ⊆ Q \ {q}. Then, it must be that
(∧(Q \ {q})) |= ψ, a contradiction.
These lemmas guide our algorithm. It maintains a set Q′ that is known to be a superset of
Q(ψ), and at each step it picks a predicate q and generates a chart. If this chart is a positive
example, then q is not in Q(ψ); otherwise, q is in Q(ψ). Namely, the high-level algorithm is:
Algorithm 1: High-level SyFi
1 Q′ = ?
2 for q ∈ Q′ do
3   Sq = model(∧(Q′ \ {q}) ∧ ¬q)
4   if Sq == null then ?
5   if ψ(Sq) = 1 then // Pose a membership query
6     Q′ = Q′ \ {q} // By Lemma 3.3.1, q ∉ Q(ψ)
7   else
8     // do nothing // By Lemma 3.3.2, q ∈ Q(ψ)
9 return ∧Q′ // At this point, Q′ = Q(ψ)
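A runnable sketch of this high-level loop on a toy predicate set, assuming the initialization question is resolved by passing Q′ explicitly; `find_model`, the toy predicates, and the naive search are our stand-ins for the `model(...)` call:

```python
from itertools import permutations

# Sketch of the high-level loop (Algorithm 1). `membership` stands for the
# user's oracle; `find_model` stands for model(...) and returns a chart
# satisfying (∧(Q' \ {q})) ∧ ¬q, or None if no such chart exists.
def high_level_syfi(Q0, membership, find_model):
    Q = set(Q0)
    for q in list(Q):
        chart = find_model(Q - {q}, q)
        if chart is None:
            continue                  # no separating chart for q
        if membership(chart):         # positive example: drop q (Lemma 3.3.1)
            Q.discard(q)
        # negative example: keep q (Lemma 3.3.2)
    return Q                          # stands for the conjunction ∧Q

# Toy instance: order predicates over 3-point charts (names are ours).
PREDS = {"p0<p1": lambda c: c[0] < c[1],
         "p1<p2": lambda c: c[1] < c[2],
         "p0<p2": lambda c: c[0] < c[2]}

def find_model(keep, negated):
    # Naive model finder: enumerate small charts instead of calling a solver.
    for c in permutations((1, 2, 3)):
        if all(PREDS[q](c) for q in keep) and not PREDS[negated](c):
            return c
    return None

target = lambda c: c[0] < c[1] < c[2]      # ψ = p0<p1 ∧ p1<p2
learned = high_level_syfi(PREDS, target, find_model)
```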
If for every q there exists a chart Sq, the lemmas guarantee that this algorithm returns ψ.
Thus, the questions that remain to be addressed are:
• How should Q′ be initialized?
• How can it be guaranteed that Sq exists for every q?
While a natural candidate for initializing Q′ is Qn (i.e., all possible predicates), doing so would
result in no Sq for any q (assuming |Q| > 2). This follows since Qn is closed under negation,
and thus starting from Qn means that Q′ contains a pair of a predicate and its negation.
Instead, we address both questions by assuming that the learning process starts from a
positive example provided by the user, which is the topic of the next section. This assumption
Algorithm 2: SyFi(Su0..n)
1 Q′ = Q(Su0..n) = {q ∈ Qn | Su0..n |= q}
2 Qψ = ∅ // The set of predicates logically implied by ψ
3 while Q′ \ Qψ ≠ ∅ do
4   pi≺pj = argmin over pi≺pj ∈ Q′\Qψ of |Su0..n(i) − Su0..n(j)|
5   S = model(∧(Q′ \ {pi≺pj}) ∧ ¬pi≺pj)
6   if ψ(S) = 1 then // Pose a membership query
7     Q′ = Q′ \ {pi≺pj} // By Lemma 3.3.1
8     S = model(∧(Q′ \ {¬pj≺pi}) ∧ pj≺pi)
9     if ψ(S) = 1 then // Pose a membership query
10      Q′ = Q′ \ {¬pj≺pi} // By Lemma 3.3.1
11    else
12      Qψ = Qψ ∪ {q′ ∈ Q′ | Qψ ∪ {¬pj≺pi} |= q′} // By Lemma 3.3.2
13  else
14    Qψ = Qψ ∪ {q′ ∈ Q′ | Qψ ∪ {pi≺pj} |= q′} // By Lemma 3.3.2
15 return ∧Q′ // At this point, ∧Q′ ≡ ψ
does not impose any burden on the user and it enables our algorithm to be linear in the size of
Qn if there are no equal points. In the next chapter, we provide a different solution that does
not require starting from an example but is not linear in the number of predicates and does not
support the case where there are equal points.
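The minimal-distance selection in line 4 of Algorithm 2 can be sketched as follows (the function name and sample values are ours):

```python
# Pick the predicate p_i ≺ p_j from the candidate set whose endpoints
# have minimal distance |S(i) - S(j)| in the initial chart S.
def pick_min_distance(candidates, S):
    # candidates: iterable of (i, j) index pairs standing for p_i ≺ p_j
    return min(candidates, key=lambda ij: abs(S[ij[0]] - S[ij[1]]))

S = [1.0, 3.0, 2.0, 5.0]
assert pick_min_distance([(0, 1), (2, 1), (1, 3)], S) == (2, 1)  # |2.0 - 3.0| = 1.0
```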
3.3.2 Learning with an Initial Positive Example
In this section, we address the questions that were raised when describing the high-level
algorithm. We address them by assuming the user provides an initial chart that meets the target
pattern. We denote this chart by Su0..n. We address both points by leveraging Su0..n:
• Initializing Q′: We initialize Q′ to {q ∈ Qn | Su0..n |= q}. By Lemma 3.3.1, it is
guaranteed that Q(ψ) ⊆ Q′.
• Guaranteeing that Sq ≠ null: We address this by imposing an order on the predicates
inspected, such that at each point the predicate considered is guaranteed to have
the required chart. The details are provided in the remainder of this section.
We begin by explaining our solution when there are no equal points in the initial chart (namely,
there are no i, j such that ¬pi≺pj, ¬pj≺pi ∈ Q(Su0..n)), and then explain how to extend this solution
when there are equal points.
Learning with No Equal Points If there are no equal points, then for every pair of points
i ≠ j, either pi≺pj, ¬pj≺pi ∈ Q′ or pj≺pi, ¬pi≺pj ∈ Q′. Our next lemma shows that if
one considers the predicate pi≺pj such that S(i) and S(j) are at minimal distance, then Spi≺pj
and S¬pj≺pi exist.
Lemma 3.3.3. Let ψ be the target pattern of size n, let Su0..n be the initial chart, and let
Q ⊆ Q(Su0..n) be a set such that:
• Q(ψ) ⊆ Q,
• ∧Q is satisfiable, and
• For every i ≠ j and for every k: if pi≺pj, ¬pk≺pi, pk≺pj ∈ Q, then Su0..n(i) <
Su0..n(k) < Su0..n(j).
If pi≺pj = argmin over pi≺pj ∈ Q′\Qψ of |Su0..n(i) − Su0..n(j)|, then the following are satisfiable:
1. ∧(Q \ {pi≺pj}) ∧ ¬pi≺pj
2. ∧(Q \ {pi≺pj, ¬pj≺pi}) ∧ pj≺pi
Proof. We first show that ϕ = (∧Q) ∧ ∀k. (k ≠ i, j → pk≺pi ∨ ¬pk≺pj) is satisfiable.
Since ∧Q is satisfiable, if ϕ is unsatisfiable then there exists k ≠ i, j such that ¬pk≺pi, pk≺pj ∈ Q.
Since Q ⊆ Q(Su0..n), then Su0..n(i) < Su0..n(k) < Su0..n(j). Namely, pi≺pj is not the argmin
over Q′\Qψ of |Su0..n(i) − Su0..n(j)|, a contradiction. Since ϕ is satisfiable, there is a chart
S |= ϕ.
1. Define:
   S′(i′) = S(j) if i′ = i, and S′(i′) = S(i′) otherwise.
We show that S′ |= ∧(Q \ {pi≺pj}) ∧ ¬pi≺pj. Let q ∈ (Q \ {pi≺pj}) ∪ {¬pi≺pj}.
• If q = pk≺pk′ or q = ¬pk≺pk′ with k, k′ ≠ i: S′ |= q since S |= q and
S′(k) = S(k), S′(k′) = S(k′).
• If q = ¬pi≺pj or q = ¬pj≺pi: S′ |= q since S′(i) = S′(j).
• If q = pk≺pi: Since S |= ∧Q, S(k) < S(i), and since S(i) < S′(i), it follows
that S′ |= q.
• If q = ¬pk≺pi for k ≠ j: In this case, by the definition of ϕ, S(k) ≥ S(j), and
thus S′(k) ≥ S(j) = S′(i); hence S′ |= q.
2. Define:
   S′(i′) = S(j) if i′ = i; S′(i′) = S(i) if i′ = j; S′(i′) = S(i′) otherwise.
We show that S′ |= ∧(Q \ {pi≺pj, ¬pj≺pi}) ∧ pj≺pi. Let q ∈ (Q \ {pi≺pj, ¬pj≺pi}) ∪ {pj≺pi}.
• If q = pk≺pk′ or q = ¬pk≺pk′ with k, k′ ≠ i, j: S′ |= q since S |= q and
S′(k) = S(k), S′(k′) = S(k′).
• If q = pj≺pi: Since S(i) < S(j), by the definition of S′, S′(i) > S′(j).
• If q = pk≺pi or q = ¬pk≺pj: These continue to hold by the definition of S′.
• If q = pk≺pj: By the definition of ϕ, S(k) < S(i), and thus in particular
S′(k) < S′(j) = S(i); hence S′ |= q.
• If q = ¬pk≺pi for k ≠ j: In this case, by the definition of ϕ, S(k) ≥ S(j), and
thus in particular S′(k) ≥ S(j) = S′(i); hence S′ |= q.
Lemma 3.3.4. Let Su0..n be a chart such that for every i ≠ j, if ¬pi≺pj ∈ Q, then ¬pj≺pi ∉ Q. Then, SyFi terminates and outputs a formula that is equivalent to the target ψ.
Proof. • SyFi terminates:
– At every iteration, either there are pi, pj such that pi≺pj ∈ Q′ \ Qψ or Q′ \ Qψ = ∅:
This follows because for every pi, pj, either pi≺pj ∈ Q′ or pj≺pi ∈ Q′, and each
iteration either removes from Q′, or adds to Qψ, the predicates pi≺pj and ¬pj≺pi, or pj≺pi and ¬pi≺pj.
– There are models in Lines 5 and 8: By Lemma 3.3.3, it suffices to show that
the lemma's preconditions are met. Initially, ∧Q′ is satisfied by Su0..n. Also, since
there are no equal points, for every i ≠ j and k, if pi≺pj, ¬pk≺pi, pk≺pj ∈ Q′,
then Su0..n(i) < Su0..n(k) < Su0..n(j). Thus, it suffices to show that Q(ψ) ⊆ Q′.
We show this by induction. Base: Follows since Q′ = Q(Su0..n) and by Lemma 3.3.1.
Step: A predicate q is removed only when an example satisfying ¬q is discovered
to be positive, and thus by Lemma 3.3.1, Q(ψ) ⊆ Q′ \ {q}.
• SyFi returns a formula equivalent to the target ψ: Since Q(ψ) ⊆ Q′ throughout the
execution, ∧Q′ |= ψ. We now show that ψ |= ∧Qψ throughout the execution; since
Q′ = Qψ when SyFi terminates, the claim follows. We show this by induction. Base:
Follows since Qψ = ∅. Step: A predicate q is added to Qψ either when:
– An example satisfying ∧(Q′ \ {q}) is discovered to be negative, and thus by
Lemma 3.3.2, q ∈ Q(ψ).
– The predicate is logically implied by Qψ ∪ {q′} for some q′ ∈ Q(ψ). By
transitivity, ψ |= ∧(Qψ ∪ {q′}) |= q.
From this lemma, we get the next theorem.
Theorem 3.1. Let Su0..n be a chart such that for every i ≠ j, if ¬pi≺pj ∈ Q, then ¬pj≺pi ∉ Q.
SyFi learns the target formula with at most |Qn| membership queries.
Learning with Equal Points We next extend the previous results to the case where Su0..n has
equal points. There are two challenges when addressing this setting:
1. Identifying the set of points that are equal in ψ.
2. Learning the relation of the other points to the equal points.
We begin by explaining the second challenge through an example. Assume an initial chart Su0..3
where two points, 0 and 1, are known to be equal in ψ (i.e., ¬p0≺p1, ¬p1≺p0 ∈ Q(ψ)) and the
third point is smaller in Su0..3. Then, by the initialization:

Q′ = {¬p0≺p1, ¬p1≺p0, p2≺p0, ¬p0≺p2, p2≺p1, ¬p1≺p2}
If we let SyFi run as defined in Algorithm 2, it would begin the loop and pick p2≺p0 or p2≺p1.
Assume it picks p2≺p0. Then, SyFi looks for an example satisfying the conjunction:

∧((Q′ \ {p2≺p0}) ∪ {¬p2≺p0}) = ∧{¬p0≺p1, ¬p1≺p0, ¬p2≺p0, ¬p0≺p2, p2≺p1, ¬p1≺p2}
This is equivalent to satisfying (p0 = p1) ∧ (p0 = p2) ∧ (p2≺p1), which is unsatisfiable. The
problem arises because if points are known to be equal (p0 = p1), negating only one predicate
that relates to one of them (p2≺p0) while leaving the equivalent predicate (p2≺p1) results in
an unsatisfiable formula. To avoid this situation, after obtaining the equal points in ψ (which we
describe shortly), we pick a representative for each set of equal points and remove all constraints
that pertain to the other points in T (except the ones involving the representative, which describe
T). Formally, if T is a set of equal points, we define the representative as the minimal point in
T, denoted min(T), and update Q′ as follows:

Q′T = Q′ ∩ {pi≺pj, ¬pi≺pj | i, j ∉ T ∨ i = min(T) ∨ j = min(T)}
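A sketch of this reduction on the running example (the tuple encoding of a predicate pi≺pj or its negation as (i, j, positive) is ours):

```python
# Sketch of the Q'_T reduction: keep only constraints whose endpoints are
# outside T or that involve T's representative min(T).
def restrict(Qp, T):
    r = min(T)                        # the representative min(T)
    return {(i, j, pos) for (i, j, pos) in Qp
            if (i not in T and j not in T) or i == r or j == r}

# The running example: points 0 and 1 are equal, point 2 is below them.
Qp = {(0, 1, False), (1, 0, False), (2, 0, True),
      (0, 2, False), (2, 1, True), (1, 2, False)}
reduced = restrict(Qp, {0, 1})        # drops the constraints involving point 1
```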
We next formalize this, extend the definition to multiple sets of equal points, and prove
that the resulting predicate set is logically equivalent to the original predicate set. We begin by
defining equal point sets and then provide the lemma.
Definition 3.3.5. Let ψ of size n be a target formula. A set T ⊆ {0, ..., n−1} is called an equal
point set of ψ if for all i, j ∈ T, ¬pi≺pj, ¬pj≺pi ∈ Q(ψ). T is a maximal equal point set of
ψ if T is an equal point set and for every i ∈ T, k ∉ T: ¬pi≺pk ∉ Q(ψ) or ¬pk≺pi ∉ Q(ψ). A
set {T1, ..., Tm} is a maximal equal set of ψ if every Ti is a maximal equal point set and no
other subset of {0, ..., n−1} is a maximal equal point set.
Definition 3.3.6. Let T ⊆ {0, ..., n−1} be a set of indices. Given a set Q′, we define

Q′T = Q′ ∩ {pi≺pj, ¬pi≺pj | i, j ∉ T ∨ i = min(T) ∨ j = min(T)}

Given a set of index sets {T1, ..., Tm}, we denote Q′T1,...,Tm = (((Q′T1)T2)...)Tm.
Lemma 3.3.7. Let ψ be a target formula, Su0..n a positive example, Q(Su0..n) the predicates
in Qn satisfied by Su0..n, and {T1, ..., Tm} the maximal equal point sets of ψ. Then,
∧(Q(Su0..n)T1,...,Tm) ≡ ∧Q(Su0..n).
Proof. We prove by induction on m. Base: m = 0 is trivial. Step: We show that
∧(Q(Su0..n)T1,...,Tm) ≡ ∧(Q(Su0..n)T1,...,Tm−1), and by transitivity and the induction hypothesis
we get the result. Since Q(Su0..n)T1,...,Tm ⊆ Q(Su0..n)T1,...,Tm−1, we have ∧(Q(Su0..n)T1,...,Tm−1) |=
∧(Q(Su0..n)T1,...,Tm). To show that ∧(Q(Su0..n)T1,...,Tm) |= ∧(Q(Su0..n)T1,...,Tm−1), we show that
for any q ∈ Q(Su0..n)T1,...,Tm−1, ∧(Q(Su0..n)T1,...,Tm) |= q. We split into cases:
• If q ∈ Q(Su0..n)T1,...,Tm: The claim clearly follows.
• If q ∉ Q(Su0..n)T1,...,Tm: By definition, q = pi≺pj or q = ¬pi≺pj such that either
i ∈ Tm \ {min(Tm)} or j ∈ Tm \ {min(Tm)} (or both). Assume w.l.o.g. that
q = pi≺pj and i ∈ Tm \ {min(Tm)}. Since Q(Su0..n)T1,...,Tm ⊆ Q(Su0..n), and i and
min(Tm) are equal points in ψ, it must be that pmin(Tm)≺pj ∈ Q(Su0..n). We show that
¬pi≺pmin(Tm), ¬pmin(Tm)≺pi, pmin(Tm)≺pj ∈ Q(Su0..n)T1,...,Tm, which implies the
claim. First, since T1, ..., Tm−1, Tm are maximal sets, i, min(Tm) ∉ T1 ∪ ... ∪ Tm−1.
Thus ¬pi≺pmin(Tm), ¬pmin(Tm)≺pi ∈ Q(Su0..n)T1,...,Tm−1, and in particular ¬pi≺pmin(Tm), ¬pmin(Tm)≺pi ∈ Q(Su0..n)T1,...,Tm. Since pi≺pj ∈ Q(Su0..n)T1,...,Tm−1, either
j ∉ T1 ∪ ... ∪ Tm−1 or j = min(Tj′) for some j′ ≤ m−1. In either case, it follows that
pmin(Tm)≺pj ∈ Q(Su0..n)T1,...,Tm−1 and thus pmin(Tm)≺pj ∈ Q(Su0..n)T1,...,Tm.
We next address the first question of how to obtain T1, ..., Tm. If it is known that all equal
points in the given chart are also equal in the target ψ, then the maximal equal set is the set of
equivalence classes {[i]=Su0..n | 0 ≤ i < n}, where (i, j) ∈ =S if and only if S(i) = S(j). If it
is unknown, then we run a procedure called getEqualPointChart (described shortly) that
takes a chart Su0..n and outputs a chart S which is identical, except that points that are equal in
Su0..n but not equal in ψ are not equal in S. This enables us to assume that the maximal equal
set is the set of equivalence classes {[i]=Su0..n | 0 ≤ i < n}. Equipped with this, we show an
extension of SyFi that learns any formula in Q∧:
Algorithm 3: SyFiwEqualPoints(Su0..n)
1 Su0..n = getEqualPointChart(Su0..n)
2 T1, ..., Tm = {[i]=Su0..n | 0 ≤ i < n}
3 Q′ = Q(Su0..n)T1,...,Tm
4 Qψ = {¬pi≺pj, ¬pj≺pi | ¬pi≺pj, ¬pj≺pi ∈ Q′} // Logically implied by ψ
5 while Q′ \ (Qψ ∪ ⋃k∈{1,...,m} {¬pi≺pj, ¬pj≺pi | i, j ∈ Tk}) ≠ ∅ do
6   [SyFi (Algorithm 2), Lines 4–14]
7 return ∧Q′
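The equivalence-class computation in line 2 of Algorithm 3 can be sketched as follows (the function name and sample chart are ours):

```python
# Sketch of line 2: group chart indices into equivalence classes under
# equal prices, i.e., {[i] | 0 <= i < n} with i ~ j iff S(i) = S(j).
def equal_point_sets(S):
    classes = {}
    for i, v in enumerate(S):
        classes.setdefault(v, set()).add(i)
    return list(classes.values())

S = [2.0, 3.0, 2.0, 5.0]             # indices 0 and 2 share a price
```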
Before describing getEqualPointChart, we prove that this algorithm meets the precon-
ditions of the lemmas presented in the previous section and thus SyFiwEqualPoints learns
the class Q∧. To facilitate the proof, we show the following lemma:
Lemma 3.3.8. Given the target formula ψ, a positive example Su0..n, and a maximal equal set
{T1, ..., Tm} of ψ, there exists ψ′ such that ψ′ ≡ ψ and Q(ψ′) ⊆ Q(Su0..n)T1,...,Tm.
Proof. We construct ψ′. We begin with Q(ψ′) = Q(Su0..n)T1,...,Tm ∩ Q(ψ) (so ψ |= ψ′) and
add predicates as follows, maintaining ψ |= ψ′. Let q ∈ Q(ψ) \ Q(ψ′). Then, there exist i, j such
that q = pi≺pj or q = ¬pi≺pj, and k such that w.l.o.g. i ∈ Tk. We split into cases:
• If j /∈ T1 ∪ ... ∪ Tm, then pmin(Tk) ≺ pj (or ¬pmin(Tk) ≺ pj) is in Q(Su0..n)T1,...,Tm .
Also, ¬pi≺ pmin(Tk),¬pmin(Tk)≺ pi are in Q(Su0..n)T1,...,Tm . Thus, we add these three
predicates to Q(ψ′). Since q ∈ Q(ψ), ψ logically implies these three predicates and thus
it continues to hold that ψ |= ψ′.
• If there exists k′ such that j ∈ Tk′, we split into cases:
– If k′ ≠ k: pmin(Tk)≺pmin(Tk′) (or ¬pmin(Tk)≺pmin(Tk′)) is in Q(Su0..n)T1,...,Tm.
Also, ¬pi≺pmin(Tk), ¬pmin(Tk)≺pi, ¬pj≺pmin(Tk′), ¬pmin(Tk′)≺pj are in
Q(Su0..n)T1,...,Tm. Thus, we add these five predicates to Q(ψ′). Since q ∈ Q(ψ), ψ
logically implies these five predicates, and thus it continues to hold that ψ |= ψ′.
– If k′ = k (i.e., q is in fact ¬pi≺pj): ¬pmin(Tk)≺pj, ¬pj≺pmin(Tk) are in
Q(Su0..n)T1,...,Tm. Also, ¬pi≺pmin(Tk), ¬pmin(Tk)≺pi are in Q(Su0..n)T1,...,Tm.
Thus, we add these four predicates to Q(ψ′). Since q ∈ Q(ψ), ψ logically implies
these four predicates, and thus it continues to hold that ψ |= ψ′.
Lastly, we show that ψ′ |= ψ. Let S |= ψ′. We show that for every q ∈ Q(ψ), S |= q, which
implies S |= ψ. Let q ∈ Q(ψ). If q ∈ Q(ψ′), the claim follows. Otherwise, consider the
predicates that were added to Q(ψ′) when q was considered during the construction. Since
S |= ψ′, S satisfies these predicates, and thus S |= q.
Lemma 3.3.9. Given the target formula ψ and an initial chart Su0..n such that the maximal equal
set of ψ equals {T1, ..., Tm} = {[i]=Su0..n | 0 ≤ i < n}, SyFiwEqualPoints terminates
and outputs a formula equivalent to ψ.
Proof. • SyFiwEqualPoints completes:
– At every iteration, either there are pi, pj such that pi≺pj ∈ Q′ \ Qψ, or
Q′ \ (Qψ ∪ ⋃k∈{1,...,m}{¬pi≺pj, ¬pj≺pi | i, j ∈ Tk}) = ∅. This follows because for every
constraint involving pi, pj, one of the following is true:
∗ They belong to a maximal equal point set and thus are not in Q′ or are in
⋃k∈{1,...,m}{¬pi≺pj, ¬pj≺pi | i, j ∈ Tk}. In either case, they are not in
Q′ \ (Qψ ∪ ⋃k∈{1,...,m}{¬pi≺pj, ¬pj≺pi | i, j ∈ Tk}).
∗ Otherwise, they are not in the same maximal equal point set and are in Q′. Thus,
they are not equal in Su0..n and hence either pi≺pj ∈ Q′ or pj≺pi ∈ Q′. In that
case, some iteration removes from Q′ or adds to Qψ the predicates pi≺pj and ¬pj≺pi,
or pj≺pi and ¬pi≺pj, thus reducing the size of Q′ \ (Qψ ∪ ⋃k∈{1,...,m}{¬pi≺pj, ¬pj≺pi | i, j ∈ Tk}).
– There are models in Lines 5 and 8: From Lemma 3.3.3, it is sufficient to show that
the lemma preconditions are met. Initially, Q′ is satisfiable by Su0..n. Also, for every
i ≠ j and k, if pi≺pj, ¬pk≺pi, pk≺pj ∈ Q′, then i, j, k are not in T1, ..., Tm or
belong to different sets. Thus, Su0..n(i) < Su0..n(k) < Su0..n(j). Thus, it remains to
show that Q(ψ) ⊆ Q′. To show that, we use ψ′ ≡ ψ from Lemma 3.3.8, for which
Q(ψ′) ⊆ Q′. The proof is then by induction, identical to the proof of Lemma 3.3.4
(except that it shows that Q(ψ′) ⊆ Q′ throughout the execution).
• SyFiwEqualPoints returns a formula equivalent to the target ψ: Since Q(ψ′) ⊆ Q′
throughout the execution, ∧Q′ |= ψ′ (where ψ′ is the formula from Lemma 3.3.8).
We now show that ψ′ |= ∧(Qψ ∪ ⋃k∈{1,...,m}{¬pi≺pj, ¬pj≺pi | i, j ∈ Tk})
throughout the execution, and since when SyFiwEqualPoints completes Q′ =
Qψ ∪ ⋃k∈{1,...,m}{¬pi≺pj, ¬pj≺pi | i, j ∈ Tk}, the claim follows. We show
this by induction. Base: Follows since Qψ = ∅ and since T1, ..., Tm are maximal equal
sets of ψ and thus of ψ′. Step: A predicate q is added to Qψ either when:
– An example satisfying ∧(Qn \ {q}) is discovered as negative; thus, from
Lemma 3.3.2, q ∈ Q(ψ′).
– The predicate is logically implied by Qψ ∪ {q′} for q′ satisfying q′ ∈ Q(ψ). By
transitivity, ψ |= ∧(Qψ ∪ {q′}) |= q.
We conclude this section by explaining the getEqualPointChart procedure, which takes a
chart Su0..n and returns a chart S as similar to Su0..n as possible, but in which points are equal only if
required by the target formula ψ. The getEqualPointChart procedure (Algorithm 4)
Algorithm 4: getEqualPointChart(Su0..n)
1  S = Su0..n
2  while true do
3    for T ∈ {[i]=S | 0 ≤ i < n} do
4      for i = 1; i ≤ |T|/2; i++ do
5        for T′ ∈ {T′ ∈ 2^T | |T′| = i} do
6          S′ = model(∧[Q(S) \ {¬pk≺pk′ | k ∈ T′, k′ ∈ T \ T′} ∪ {pk≺pk′ | k ∈ T′, k′ ∈ T \ T′}])
7          if ψ(S′) = 0 then // A membership query
8            S′ = model(∧[Q(S) \ {¬pk′≺pk | k ∈ T′, k′ ∈ T \ T′} ∪ {pk′≺pk | k ∈ T′, k′ ∈ T \ T′}])
9            if ψ(S′) = 0 then continue // A membership query
10         S = S′
11         goto Line 2
12   break
13 return S
starts from Su0..n and the equivalence classes of =Su0..n. It then starts a loop that iterates over the
equivalence classes and checks for each whether it has a subset whose values may differ from
those of the other points in the class (but the points in the subset are equal to one another). To
this end, an inner loop checks all possible subsets of the equivalence class (by symmetry, it is
sufficient to check subsets up to half of the class size). For each subset, two charts are generated:
one where the subset has value smaller than the other points and one where the subset has
value greater than the other points. For each chart a membership query is posed. If a chart is
a positive example, then it serves as the new chart to proceed from and the operation restarts.
Eventually, the algorithm completes when all the equal points of the current chart have to be
equal in charts satisfying ψ. To prove correctness, we show that (i) there are models in Lines 6
and 8, (ii) getEqualPointChart completes, and (iii) the final chart meets the requirement.
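The loop structure described above can be sketched in Python. This is a simplification, not SyFi's implementation: the chart is a plain list of numeric values, the membership oracle `is_positive` stands in for the queries ψ(S′) in Lines 7 and 9, and instead of the solver calls model(...) in Lines 6 and 8, the candidate chart is built directly by shifting the subset T′ by half the minimal difference (mirroring the models constructed in Lemma 3.3.10). The function and parameter names are illustrative.

```python
from itertools import combinations

def get_equal_point_chart(chart, is_positive):
    """Sketch of Algorithm 4: split equal points apart whenever the
    membership oracle accepts the resulting chart."""
    S = list(chart)
    restart = True
    while restart:
        restart = False
        # equivalence classes of =_S: indices grouped by value
        classes = {}
        for idx, val in enumerate(S):
            classes.setdefault(val, []).append(idx)
        # half the minimal positive difference between distinct points
        diffs = [b - a for a in S for b in S if b > a]
        delta = min(diffs) / 2 if diffs else 1.0
        for T in classes.values():
            for size in range(1, len(T) // 2 + 1):
                for T_sub in combinations(T, size):
                    for shift in (-delta, delta):  # subset below, then above
                        cand = [v + shift if i in T_sub else v
                                for i, v in enumerate(S)]
                        if is_positive(cand):      # a membership query
                            S, restart = cand, True
                            break
                    if restart:
                        break
                if restart:
                    break
            if restart:
                break
    return S

# Toy target formula requiring only p0 = p1; p2, p3 need not be equal:
oracle = lambda s: s[0] == s[1]
print(get_equal_point_chart([5.0, 5.0, 5.0, 5.0], oracle))
# → [5.0, 5.0, 4.0, 4.5]: p0, p1 stay equal, p2 and p3 are separated
```

As in the algorithm, a successful split restarts the outer loop, and the procedure returns once no subset of any equivalence class can be separated.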
Lemma 3.3.10. For every S, there are models in Lines 6 and 8.
Proof. We show a model for Line 6; the model for Line 8 is similar. Denote the minimal
difference between different points in S by min, i.e., min = min{S(i)−S(j) | S(i)−S(j) > 0}.
We define S′ as follows:
S′(i) = S(i)−min/2 if i ∈ T′, and S′(i) = S(i) otherwise.
We show that S′ |= ∧[Q(S) \ {¬pk≺pk′ | k ∈ T′, k′ ∈ T \ T′} ∪ {pk≺pk′ | k ∈ T′, k′ ∈ T \ T′}].
Let q ∈ Q(S) \ {¬pk≺pk′ | k ∈ T′, k′ ∈ T \ T′} ∪ {pk≺pk′ | k ∈ T′, k′ ∈ T \ T′}:
• If q = pk≺pk′ such that k ∈ T′, k′ ∈ T \ T′: Then, S(k) = S(k′) and thus S′(k) =
S(k)−min/2 < S(k′) = S′(k′).
• If q = pk≺ pk′ or q = ¬pk≺ pk′ for k, k′ /∈ T : Follows since S |= q and since by the
definition of S′, S′(k) = S(k) and S′(k′) = S(k′).
• If q = pk ≺ pk′ or q = ¬pk′ ≺ pk such that k ∈ T ′ and k′ /∈ T : Follows since
S′(k) = S(k)−min/2 < S(k) < S(k′) = S′(k′).
• If q = ¬pk ≺ pk′ or q = pk′ ≺ pk such that k ∈ T ′ and k′ /∈ T : Follows since
S′(k) = S(k)−min/2 > S(k)−min ≥ S(k′) = S′(k′).
• If q = ¬pk≺pk′ such that k, k′ ∈ T ′: Follows since S′(k) = S(k) = S(k′) = S′(k′).
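To make the construction concrete, the following sketch checks it on a small numeric chart. The chart, the class T, and the subset T′ are arbitrary illustrative choices, not taken from the thesis:

```python
# Numeric sanity check of the model construction above: points in T' are
# lowered by min/2, which puts them strictly below the rest of T while
# preserving every other strict order and equality in S.

S = [3.0, 5.0, 5.0, 5.0, 8.0]   # chart; T = {1, 2, 3} share the value 5
T = [1, 2, 3]
T_sub = [1]                      # the subset T' to separate

min_diff = min(b - a for a in S for b in S if b > a)  # = 2.0 here
S2 = [v - min_diff / 2 if i in T_sub else v for i, v in enumerate(S)]

# pk < pk' holds for k in T', k' in T \ T'
assert all(S2[k] < S2[kp] for k in T_sub for kp in T if kp not in T_sub)
# order between points outside T is unchanged
assert (S2[0] < S2[4]) == (S[0] < S[4])
# points strictly below T stay strictly below T' (a min/2 shift cannot cross them)
assert S2[0] < S2[1]
print(S2)  # → [3.0, 4.0, 5.0, 5.0, 8.0]
```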
Lemma 3.3.11. getEqualPointChart terminates.
Proof. By the previous lemma, getEqualPointChart cannot get stuck when looking for
models. Further, at each iteration of the outer loop, either S is replaced with S′, which has fewer
equal points, or S is returned. Thus, the number of iterations is bounded by the number of pairs
of equal points in the initial chart, and hence the procedure is guaranteed to terminate.
Lemma 3.3.12. Let S be a chart returned by getEqualPointChart. Then, for every 0 ≤ i ≠ j < n, if S(i) = S(j), then ¬pi≺pj, ¬pj≺pi ∈ Q(ψ).
Proof. Assume by contradiction that S is returned and there is a pair S(i) = S(j) such that
¬pi ≺ pj /∈ Q(ψ) or ¬pj ≺ pi /∈ Q(ψ). Consider T = {j′ | S(i) = S(j′)}, which is
inspected by getEqualPointChart, and consider all subsets that are equal point sets of ψ:
Ts = {{j′ ∈ T | ¬pi′≺pj′, ¬pj′≺pi′ ∈ Q(ψ)} | i′ ∈ T}. Since S is a positive example, it
satisfies all predicates in Q(ψ) and thus Q(ψ) ⊆ Q(S). We define a partial order < over Ts
as follows: T1 < T2 if there exist i′ ∈ T1 and j′ ∈ T2 such that ψ ⊭ ¬pi′ ≺ pj′. Let T′ be
a minimal element of Ts with respect to <. Namely, Q(ψ) ⊆ Q(S) \ {¬pk≺pk′ | k ∈ T′, k′ ∈ T \ T′}.
This implies that if |T′| ≤ |T|/2, then the S′ defined in Line 6 is a positive example, and otherwise
the S′ defined in Line 8 is a positive example. Thus S will be replaced with S′. Since each iteration
reduces the number of equal points, S will not be reconsidered, and thus will not be returned – a
contradiction.
The above algorithm implies that the query complexity is dominated by the maximal number
of equal points in the original chart, which is bounded by the chart size, n. This provides us
with the following theorem.
Theorem 3.2. The number of membership queries posed by SyFiwEqualPoints is at most
|Qn| + 2^(n/2), where n is the initial chart size.
3.4 Synthesizing Code from Formulas
In this section, we present the query synthesis process, taking the formula we learned from
visual examples, and realizing it as a program for a trading platform. Here, we show one way to
synthesize a query to detect a pattern. We show a program synthesized in AFL, the programming
language of a popular trading platform called AmiBroker. This approach is mostly technical,
and in particular one could design a different synthesizer that compiles patterns (i.e., formulas) to
executable code. We provide the details for completeness of the presentation.
We begin with a high-level description of the query structure. We then provide a short
background on AmiBroker (Section 3.4.1), followed by an explanation of how to synthesize
queries from formulas (Section 3.4.2), and then a description of how to extend the queries with
quantitative constraints (Section 3.4.3).
3.4.1 The AmiBroker Trading Platform
AmiBroker is a popular high-speed trading platform that supports writing and executing pattern
queries. Analysts can use the queries for real-time trading by configuring the query to buy or
sell when the pattern is detected.
Queries are written in a DSL called AFL (other platforms offer their own DSLs, with
similar primitives). AFL is an array-based language and as such its primitives are arrays,
functions receive and return arrays, and expressions (Boolean or computational) consist of
and are evaluated to arrays. Array indices are non-positive and they end at index zero (in the
examples below, the rightmost value is the cell at index zero). In addition, the index zero plays a
special role as arrays are often identified with the value that appears at that cell. For example,
the expression A>3 for A = [2, 4] can be treated as evaluated to true, instead of the array
[false, true]. Thus, and for simplicity’s sake, the functions below are described as returning
values instead of arrays (whereas actually they return arrays whose cells at index zero contain
these values). The one case in which arrays cannot be treated as values is discussed at the end
of Section 3.4.2.
We next present the AFL primitives and functions that appear in the synthesized queries.
The Close array contains the stock closing prices where the value at index zero contains today’s
price and values at negative indices refer to historical prices. Similarly, Open, High, and Low
are arrays containing the opening, high, and low prices. The function Ref(A,n) returns the
value of A n days from today; for example, Ref(Close,-2) returns the closing price two days
ago. The function LLV(A, n) returns the lowest low in A over the last n days; for instance,
LLV([1, 4, 2, 3],3) returns 2. The function LLVBars(A, n) returns the number of days since
the lowest low value was reached over the last n days; e.g., LLVBars([1, 4, 2, 3],3) returns 1.
The function Peak(A, t, n) returns the nth most recent peak in A, where n ≥ 1 and t is the
threshold in percentages for identifying peaks. The function PeakBars(A, t, n) returns the
number of days since the nth recent peak in A. The function LastValue(A) is unique in that it
does not return an array, but rather the value of A at index zero.
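To illustrate these semantics, here is a toy Python model of some of the primitives, reproducing the examples given above. An "array" is a Python list whose rightmost element is index zero (today). The implementations are illustrative sketches of the documented behavior, not AmiBroker's actual code, and the lowercase names are our own:

```python
# Toy model of AFL array primitives; rightmost list element is index zero.

def ref(A, n):
    """Value of A n days from today; n is non-positive, e.g. ref(close, -2)."""
    return A[len(A) - 1 + n]

def llv(A, n):
    """Lowest value of A over the last n days."""
    return min(A[-n:])

def llv_bars(A, n):
    """Days since the lowest value over the last n days was last reached."""
    window = A[-n:]
    low = min(window)
    last_pos = max(i for i, v in enumerate(window) if v == low)
    return (n - 1) - last_pos

print(llv([1, 4, 2, 3], 3))       # → 2, as in the text
print(llv_bars([1, 4, 2, 3], 3))  # → 1, as in the text
print(ref([10, 20, 30], -2))      # → 10 (the value two days ago)
```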
3.4.2 Generating AFL Code
The synthesizeAFL (Algorithm 5) operation synthesizes AFL code from a pattern formula.
It takes as arguments the pattern formula, ϕP , the price stream, Stream, and the threshold for
identifying peaks, K. SynthesizeAFL consists of the following steps:
• Splitting the points in ϕP into peaks and lows (Lines 1–4).
• Generating the query header (Line 5).
• Generating the code finding the peaks (Lines 6–10).
Algorithm 5: synthesizeAFL(ϕP, Stream, K)
1  lastO = max{i | pi ∈ ϕP and i is odd}
2  lastE = max{i | pi ∈ ϕP and i is even}
3  peaks = odd peaks? {lastO, ..., 3, 1, -1} : {lastE, ..., 2, 0}
4  lows = odd peaks? {0, 2, ..., lastE} : {1, 3, ..., lastO}
5  q += “Price = Stream; thrs = K;”
6  n = 1
7  for p in peaks do
8    q += “Pp = Peak(Price, thrs, n);”
9    q += “PBp = LastValue(PeakBars(Price, thrs, n));”
10   n++
11 last = max{i | pi ∈ ϕP}
12 for l in lows do
13   prc = (l < last)? “Ref(Price, -PBl+1-1)” : “Price”
14   n = (l < last)? “PBl−1-PBl+1-1” : “PBl−1”
15   q += “Pl = LLV(prc, n);”
16 q += “Filter = ”
17 for pi < pj in ϕP do
18   q += “Pi < Pj AND”
19 for ¬(pi < pj) in ϕP do
20   q += “Pi ≥ Pj AND”
• Generating the code finding the lows (Lines 11–15).
• Generating the code checking whether these peaks and lows satisfy ϕP (Lines 16–20).
We next explain these steps.
Splitting into Peaks and Lows (Lines 1–4) Since the pattern formula ϕP refers only to the
pattern’s peaks and lows, either the odd points in ϕP are the peaks and the even points are the
lows, or vice versa. Thus, to split the points, we check whether the odd points are the peaks
(e.g., by checking whether p1 is greater than p0) and accordingly initialize the peaks and lows
sets to the peaks’ and lows’ indices. Since lows are later defined relative to their surrounding
peaks, if p0 is a low, a new point, p−1, is added as a peak.
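This split (Lines 1–4 of Algorithm 5) can be sketched as follows; the function name `split_points` and its signature are illustrative, not part of SyFi:

```python
# Sketch of the peaks/lows split from Lines 1-4 of Algorithm 5.

def split_points(indices, odd_are_peaks):
    """indices: the point indices appearing in the pattern formula."""
    last_odd = max(i for i in indices if i % 2 == 1)
    last_even = max(i for i in indices if i % 2 == 0)
    if odd_are_peaks:
        peaks = list(range(last_odd, 0, -2)) + [-1]  # lastO, ..., 3, 1, -1
        lows = list(range(0, last_even + 1, 2))      # 0, 2, ..., lastE
    else:
        peaks = list(range(last_even, -1, -2))       # lastE, ..., 2, 0
        lows = list(range(1, last_odd + 1, 2))       # 1, 3, ..., lastO
    return peaks, lows

# Head and shoulders: seven points p0..p6 whose odd points are the peaks.
peaks, lows = split_points(range(7), odd_are_peaks=True)
print(peaks)  # → [5, 3, 1, -1]  (p-1 added as a peak since p0 is a low)
print(lows)   # → [0, 2, 4, 6]
```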
Generating the Query Header (Line 5) The second step begins to generate the query, which
will be stored in q. This step generates the header, which consists of the price stream to scan
(Price) and the threshold for identifying peaks (thrs). Price is set to the parameter Stream,
which could be any price stream supported by AFL, such as Close, Open, Low, or High, while
thrs is set to the parameter K.
Finding the Peaks (Lines 6–10) The third step scans the set peaks from the most recent peak
to the oldest one and defines each of them using Peak and PeakBars. As mentioned, Peak
and PeakBars take as arguments the stream, Price; the threshold for identifying peaks, thrs;
and the ordinal number of the peak, n, which begins at 1 (the most recent peak) and is increased
by one after each peak definition. In the query generated, Pp is the peak’s price and PBp is the
number of days since the peak was reached. The PBp definition uses the LastValue operation;
we postpone the explanation for why we use this to the end of this section. Also, for simplicity
of presentation, the edge case where the last peak is the last point in ϕP is omitted. In this case,
Peak cannot be used for technical reasons, and another function is used instead.
Finding the Lows (Lines 11–15) The fourth step scans the set lows and defines each of them
between the two peaks surrounding it using the function LLV. LLV searches for the lowest low
point in the stream prc over the last n days. Thus, to define a low l between its surrounding
peaks, Pl−1 and Pl+1, we define prc to be the price stream that ends at the price Pl+1 (exclusive)
and search for the lowest low in this stream that appears after Pl−1. Namely, prc is the stream
Price shifted in -PBl+1-1 days (obtained using Ref) and n is equal to PBl−1-PBl+1-1 (i.e.,
the number of days between Pl−1 and Pl+1). A special case arises if l is the last point in ϕP , in
which case Pl+1 is undefined. Instead, we search for the lowest low that appears after the peak
Pl−1 and thus prc is Price and n is PBl-1.
Checking against ϕP (Lines 16–20) The final step generates the pattern formula. To detect
patterns, AFL allows a formula to be assigned to a variable called Filter, and whenever the
formula is satisfied, a notification is sent to the user. Thus, the final step translates ϕP to code
and assigns it to Filter (for simplicity’s sake, the code presented includes a redundant AND at the
end of the formula). We note that Filter can be replaced with Buy or Sell and then AmiBroker
will buy or sell stocks when the formula is satisfied.
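The final step (Lines 16–20) amounts to string generation; the following sketch shows the idea. The representation of predicates as index pairs and the name `emit_filter` are our own, and unlike Algorithm 5, which appends a redundant trailing AND for simplicity, this sketch joins the conjuncts:

```python
# Sketch of the translation of the pattern formula into the Filter assignment.

def emit_filter(pos, neg):
    """pos: pairs (i, j) with pi < pj in the formula; neg: negated pairs."""
    conjuncts = [f"P{i} < P{j}" for i, j in pos]
    conjuncts += [f"P{i} >= P{j}" for i, j in neg]
    return "Filter = " + " AND ".join(conjuncts) + ";"

# Two equal-height tops P1, P3 (neither strictly below the other),
# with the middle low P2 below P1:
print(emit_filter(pos=[(2, 1)], neg=[(1, 3), (3, 1)]))
# → Filter = P2 < P1 AND P1 >= P3 AND P3 >= P1;
```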
A Note on LastValue We next explain why it is necessary to add LastValue in the instructions
that compute PBp (Line 9). Computational expressions that appear inside LLV and Ref (as
generated in Lines 13–14) may be evaluated to unexpected values if the operands are arrays and
not numbers (i.e., arrays cannot be treated as values, as mentioned in Section 3.4.1). Thus, we
need to access values in the arrays in order to use them in LLV and Ref, and we obtain them
using the LastValue operation.
3.4.3 Supporting Numerical Constraints
The generated code captures the pattern’s formation, but sometimes the user wishes to express
numerical constraints such as: (i) the minimal difference between two points required to consider
them as not equal, (ii) the maximum (or minimum) number of days between two notable peaks
or lows, and (iii) the ratio between two price points.
All these constraints can be easily added to the queries SyFi synthesizes by adding constraints
concerning the Pi and PBi variables. The user can express numerical constraints by configuring
a set of parameters. The parameters may apply to a specific pair of points or to all pairs. We
next show how to express the above numerical constraints.
To express constraints of type (i), SyFi replaces constraints of the form Pi<Pj with
Pi·(1 + diff) < Pj. The parameter that defines diff globally for all pairs is called Klow, and it is
described here since we refer to it in the next section. Klow is the fraction of thrs (the threshold
for identifying peaks) required to determine that one point is lower than the other one, namely,
diff = (thrs · Klow)/100 (the product is divided by 100 since thrs is given in percentages).
To express constraints of type (ii), SyFi adds constraints of the form PBi − PBj ≥ M
where M is an integer. Though the PBi were defined only for the peaks, they can also be defined
for the lows using the LLVBars operation.
To express constraints of type (iii), SyFi adds constraints of the form Pi · r ≥ Pj where r is
a real number.
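The three rewrites can be sketched as simple string generators. The function names and the string-based representation are illustrative, and the diff formula follows the definition given above, which is our reading of the original:

```python
# Sketch of the three numerical-constraint rewrites from this section.

def min_difference(i, j, thrs, k_low):
    """Type (i): replace Pi < Pj with Pi*(1+diff) < Pj."""
    diff = thrs * k_low / 100  # thrs in percentages (assumed reading)
    return f"P{i} * {1 + diff} < P{j}"

def min_days_between(i, j, m):
    """Type (ii): the two points are at least m days apart."""
    return f"PB{i} - PB{j} >= {m}"

def price_ratio(i, j, r):
    """Type (iii): bound the ratio between two price points by r."""
    return f"P{i} * {r} >= P{j}"

print(min_difference(0, 1, thrs=0.5, k_low=1/3))
print(min_days_between(2, 0, 5))  # → PB2 - PB0 >= 5
print(price_ratio(1, 3, 1.5))     # → P1 * 1.5 >= P3
```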
3.5 Evaluation
In this section, we evaluate the effectiveness of SyFi by investigating answers to the following
research questions:
• How long does the synthesis process take to learn common technical analysis patterns?
• How precisely do learned formulas capture the patterns in real stock streams?
We begin by describing the common patterns (Section 3.5.1), continue with a study of the
synthesis process (Section 3.5.2), and conclude with a study of its effectiveness in detecting
patterns in stock prices (Section 3.5.3). Experiments were run on a Sony Vaio PC with Intel i7
processor and 16GB RAM.
3.5.1 Common Patterns
Many of the common patterns are described in textbooks, with most of them being similar to
each other up to slight modifications. Thus, to evaluate the effectiveness of SyFi in capturing
patterns, we selected six basic patterns, of which we believe all the others to be variations.
Hence, we believe that identifying them successfully implies that SyFi is general enough to
capture a wide range of patterns. The selected patterns are: (i) head and shoulders, (ii) cup with
handle, (iii) double tops, (iv) symmetrical triangle, (v) rectangle, and (vi) flag. The last five
patterns are illustrated in Fig. 3.3; for further reading see [Bul12, Bul05, Inv].
3.5.2 The Efficiency of the Synthesis Process
To study the efficiency of the synthesis process, we measured how long it took to learn the
formulas that capture the patterns precisely, namely the formulas that are satisfied by all pattern
occurrences and only by them.
The Evaluated Factors SyFi’s synthesis process consists of the learning process (Section 3.3)
and the code generation process (Section 3.4). Since the code generation completes instantly,
we focus on the learning process. The duration of the learning process is affected by the number
of questions presented to the user and the time it takes to synthesize the charts. Thus, we study
both factors on the different patterns.
The Experiments To study the learning process, we conducted several experiments. In each
experiment, we defined a priori the goal formula that described the pattern, created an example
(shown in Fig. 3.3), and let SyFi learn the formula from the example (interactively). The result
of the experiment was the overall learning time and the number of questions posed.
Coping with Subjectivity Pattern definitions are subjective. To overcome the challenge of
evaluating subjective definitions, which might lead to inconclusive results, we ran several experiments
for each pattern, each with a different formula (but with the same example). The different
[Figure 3.3: for each pattern – Head and Shoulders, Cup with Handle, Two Tops, Symmetrical Triangle, Flag, and Rectangle – a schematic figure is shown alongside an example chart.]
Figure 3.3: The new patterns (figures taken from [Inv]).
definitions, taken from textbooks and online forums, span a range of possible definitions, from
the most permissive to the most restrictive. We next provide a general description of the patterns
and the definitions used.
Head and Shoulders Three peaks, the middle is the highest.
(1) Most permissive – three peaks, middle one is the highest.
(2) (1) with shoulders higher than all lows.
(3) (2) where p0, p6 are lower than the other points.
(4) (3) with ascending “neckline” (p0≺p2≺p4) and p6≺p0.
(5) Most restrictive – the given chart is the only valid chart.
Cup with Handle A rise, followed by a cup-shape, then a decline (“the handle”), and finally
another rise.
(1) Most permissive – all four parts exist.
(2) (1) with significant rise: p5 is higher than the other points.
(3) (2) with p0 lower than the other points.
(4) (3) with handle not lower than the cup (¬(p4≺p2)).
Pattern               |Sall|  Def.  |Spat|  Q's  Avg. Time (std. dev.)  Max Time
Head and Shoulders      42    (1)      6    28   0.074 (0.05)           0.15
                              (2)     10    24   0.09  (0.059)          0.176
                              (3)     10    16   0.086 (0.054)          0.166
                              (4)      7     9   0.103 (0.061)          0.166
                              (5)      6     6   0.115 (0.054)          0.165
Cup with Handle         30    (1)      5    25   0.043 (0.031)          0.1
                              (2)      6    18   0.055 (0.038)          0.119
                              (3)      7    13   0.054 (0.045)          0.127
                              (4)      6    11   0.054 (0.035)          0.109
                              (5)      6    10   0.068 (0.045)          0.141
Two Tops                20    (1)      5    11   0.076 (0.027)          0.107
                              (2)      5     9   0.086 (0.035)          0.131
                              (3)      5     6   0.089 (0.041)          0.134
                              (4)      6     6   0.189 (0.184)          0.562
Symmetrical Triangle    42    (1)      7    15   0.083 (0.076)          0.259
                              (2)      7     7   0.207 (0.111)          0.357
Flag                    42    (1)      7     9   0.094 (0.054)          0.166
                              (2)      6     6   0.107 (0.048)          0.146
Rectangle               20    (1)      6     9   0.159 (0.105)          0.376
                              (2)      6     6   0.183 (0.183)          0.56

Table 3.1: Learning Process Evaluation Results.
(5) Most restrictive – the given chart is the only valid chart.
Two Tops Two peaks of equal height.
(1) Most permissive – there are two equal height tops.
(2) (1) with middle low (p2) not lower than the other lows.
(3) (2) with last point (p4) lower than the other points.
(4) Most restrictive – the given chart is the only valid chart.
The next patterns are captured by constraints that leave little room for different definitions and
thus only two are listed.
Symmetrical Triangle Descending peaks (p1 ≻ p3 ≻ p5), ascending lows (p2≺p4≺p6), and
p2≺p0, p0≺p1.
(1) Most permissive – p0 appears between p1 and p2.
(2) Most restrictive – the given chart is the only valid chart.
Flag A pole followed by descending peaks (p1 ≻ p3 ≻ p5), descending lows (p2 ≻ p4 ≻ p6), and p0
lower than all points.
(1) Most permissive – p2 and p5 may be equal.
(2) Most restrictive – the given chart is the only valid chart.
Rectangle Peaks (p1, p3) are equal, lows (p2, p4) are equal, and p0 is not higher than p1.
(1) Most permissive – p0 is not higher than p1.
(2) Most restrictive – the given chart is the only valid chart.
Results The results are shown in Table 3.1. The table shows the pattern name (Pattern); the
total number of predicates that may appear in the formula (|Sall|); the number of the definition
used (Def.); the number of predicates in the learned formula (|Spat|); the number of questions
presented to the user (Q’s); and the average and maximum time (Avg. Time, Max Time), in seconds,
that passed between the user’s response and the next chart display. The standard deviation is shown
in brackets.
Question Analysis Table 3.1 shows that there are relatively few questions and that their number
correlates with the number of irrelevant constraints satisfied by the example. Namely, the more
restrictive the tested pattern definition, the fewer questions were needed. In particular,
the most restrictive definitions were learned within 10 questions. The table also shows that SyFi
required fewer questions (up to 15) to learn patterns that were more restrictive (triangle, flag,
and rectangle). For patterns that consisted of several parts (head and shoulders and cup with
handle), SyFi required more questions to learn (up to 30). Nevertheless, we believe that even
if the number of questions reaches 30, the overall learning process can be completed by the
analysts quickly as classifying visual charts is a simple and intuitive task.
Time Analysis In all experiments, the average time was < 0.2 seconds and the maximum time
was < 0.6 seconds. These times are not noticeable to users and thus we believe users will not
observe delays during the learning process.
3.5.3 The Quality of the Synthesized Queries
Although SyFi learns the precise formulas that capture patterns, the queries it synthesizes contain
parameters which affect the detection of patterns in charts. In this section, we evaluate the
quality of SyFi’s final outcome – the queries.
The Experiments To evaluate the query quality, we conducted several experiments. In each
experiment, we ran one query over 10 stock streams (taken from [YF]), each containing the
closing prices over the last six years. Thus, each query was evaluated over more than 15,000
charts.
For each pattern, we evaluated one query, the one that detected the most popular definition.
As opposed to the previous section where the definition that was used affected the evaluation
results (the number of questions), in this section the evaluation results (the detection quality)
are affected by the query parameters, which are independent of the pattern definition. The
definitions used were: head and shoulders-(4), cup with handle-(4), two tops-(3), symmetrical
triangle-(1), flag-(1), and rectangle-(1).
In the experiments, thrs (Section 3.4.2) was set to 0.5 and Klow (Section 3.4.3) was set
to 1/3. These values were chosen after studying the values used by technical analysis users (as
described in online forums) and examining the data. We did not examine different parameters
because our goal is to show that the synthesized queries detect patterns well. In real-world
scenarios, such parameters are tuned by the analysts.
The Evaluated Factors To evaluate the detection quality, we measured precision and recall.
Precision is the percentage of detected charts that were pattern occurrences, while recall is the
Pattern:         Head and Shoulders       Cup with Handle          Two Tops                 Symmetrical Triangle
Stock    Pat.  Pre.  Rec.  Rec.0   Pat.  Pre.  Rec.  Rec.0   Pat.  Pre.  Rec.  Rec.0   Pat.  Pre.  Rec.  Rec.0
Symbol   Oc.   (%)   (%)   (%)     Oc.   (%)   (%)   (%)     Oc.   (%)   (%)   (%)     Oc.   (%)   (%)   (%)
AAPL       7   100   100    0      128    86    96    1        6   100    83   33        5   100    80    0
GOOGL     10   100    90    0      112    92    99    4       11   100    55    9        5   100    80    0
MSFT      30   100    93   20      104    87    93   11        2   100    50    0        7   100    71    0
AXP        9   100    67    0      102    88    99   15       12   100    58    0        4   100    50    0
BA         7   100    57    0      116    89   100    3       13   100    92   15        3   100    67    0
CAT        7   100   100    0      119    89   100    2        6   100   100    0        0     -     -    -
CSCO       5   100    80    0       65    59    98    2        8   100   100   13        2   100    50    0
CVX       13   100   100    0      107    88    99   13       17   100    94   29        3   100   100    0
DD        14   100    79    0       87    88    97    7       11   100    73    0        2   100    50    0
DIS       10   100    60    0       91    77    98    4       10   100   100   70        1   100   100    0
Summary  112   100    83    2     1031    84    98    6       96   100    81   17       32   100    72    0

Table 3.2: Detection statistics: number of pattern occurrences, precision, recall, and recall
without SyFi’s learning.
percentage of detected pattern occurrences from all pattern occurrences. Formally, precision
equals TPTP+FP ·100 and recall equals TP
TP+FN ·100 where TP (true positive) is the number of
detected pattern occurrences, FP (false positive) is the number of detected charts that were not
pattern occurrences, and FN (false negative) is the number of pattern occurrences that were
missed. To determine which occurrences were pattern occurrences, we manually classified
streams based on the pattern formulas. We did not consider the values of thrs and Klow during
the manual classification, and instead determined visually whether peaks were significant and
whether two points were equal. We believe such classification simulates the way analysts detect
patterns.
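The precision/recall computation described above can be written out directly; the example numbers below are hypothetical, chosen only to exercise the formulas:

```python
# Precision and recall, in percent, from TP/FP/FN counts as defined above.

def precision_recall(tp, fp, fn):
    precision = 100 * tp / (tp + fp)
    recall = 100 * tp / (tp + fn)
    return precision, recall

# Hypothetical counts: 83 detected occurrences, no false alarms, 17 missed.
p, r = precision_recall(tp=83, fp=0, fn=17)
print(p, r)  # → 100.0 83.0
```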
Results The results are shown in Table 3.2 and Table 3.3: Pat. Oc. is the number of pattern
occurrences (classified manually), that is, charts that meet the pattern, Pre. is precision, Rec. is
recall, and Rec.0 is the recall that would have been obtained had we not applied the learning
process before generating the query (i.e., the formula used is the conjunction of all predicates
over the ≺-predicate satisfied by the initial chart example). Cells containing “-” could not be
computed, either because there were no pattern occurrences or because no charts were detected
by the query.
Precision Analysis Tables 3.2 and 3.3 show that the precision is mostly high (on average, 95%),
namely most detected charts are indeed pattern occurrences. The only exception is the cup with
handle pattern. In this pattern, we observed that capturing patterns through extremum points and
the ≺ predicate cannot capture cup shapes. To extend the query to capture cup shapes, SyFi can
be extended to allow users to mark in the chart example the points that form cup shapes, and
it would add cup shape constraints to the query accordingly. Even without such an extension,
precision is still relatively high (84%). The flag pattern also encounters low precision at times.
Close inspection revealed that SyFi missed occurrences in which the points had very close
values, namely a lower Klow would have made it possible to detect these charts.
Recall Analysis Tables 3.2 and 3.3 show that the recall is relatively high (on average, 77%) and
Pattern:        Flag                     Rectangle
Stock    Pat.  Pre.  Rec.  Rec.0   Pat.  Pre.  Rec.  Rec.0
Symbol   Oc.   (%)   (%)   (%)     Oc.   (%)   (%)   (%)
AAPL       7   100    71   71        8   100    75   25
GOOGL      4   100    75    0        3   100    67   67
MSFT       4   100    25    0        5   100    60   60
AXP        7   100    57    0        6   100    67   67
BA         2     0     0    0        6   100    83   67
CAT        9    89    89   33        5   100    80   80
CSCO       2     -     0    0        5   100   100   60
CVX        1   100   100    0        6   100    67   50
DD         2   100    50   50        7   100    86   43
DIS        6   100    67    0        7   100    86   57
Summary   44    88    53   15       58   100    77   58

Table 3.3: Detection statistics (continued).
is significantly better than Rec.0 (on average, 16%), which is the recall that would have been
obtained without the learning process. We inspected all pattern occurrences that the query did
not detect and observed that the most common reason was that thrs was too low (especially
in the head and shoulders and two tops). Because thrs was low, peaks that visually looked
insignificant were considered significant by the query and thus the query did not check the more
significant peaks, which were required to satisfy the pattern formula. The second reason for
missed occurrences was that Klow was too high. This affected especially the rectangle and two
tops patterns, both of which consist of points of the same height. Flag and triangle were also
affected by the high Klow as some pattern occurrences were missed when the points had very
close prices.
To improve recall, analysts may tune thrs and Klow. Yet, there is some trade-off between
precision and recall, and thus each analyst has to decide which metric is more important. Here,
we chose to show that very high precision could be obtained while maintaining relatively high
recall. This choice is due to our belief that the common approach is to prefer precision over
recall because too many false alarms will result in analysts ignoring the query reports.
Time Analysis To evaluate the efficiency of the queries, we measured the time it took the
query to scan the 10 stock streams. The results are summarized in Table 3.4, which shows the
average time (in seconds) taken for the queries to complete on one stream. Standard deviation
is provided in brackets. The table shows that the queries generated are highly efficient and
complete scanning 1500 charts in a few seconds.
Partial Learning We next study whether the learning process can be beneficial even if the
learning is not run to completion. To this end, we define that learning stopped after the kth
question outputs the formula that is the conjunction of the predicates in Sall \ Snopat. This set is
guaranteed to include all pattern predicates, but it may also include irrelevant predicates (which
were not learned yet)2.
2 The alternative is to generate a conjunction from the predicates in Spat. However, this is likely to result in too many false reports, especially when Spat is empty, in which case every chart will be reported by SyFi.
Pattern Avg. Time (std. dev.)
Head and Shoulders 3.425 (0.245)
Cup with Handle 3.11 (0.274)
Two Tops 2.287 (0.18)
Symmetrical Triangle 3.169 (0.271)
Flag 2.986 (0.269)
Rectangle 2.492 (0.175)
Table 3.4: Pattern detection times for sets of 1500 charts.
[Figure 3.4 shows three panels — Head and Shoulders, Cup with Handle, and Two Tops — each plotting recall (%) against the number of questions.]
Figure 3.4: Recall as a function of the number of questions presented in the learning process.
Fig. 3.4 shows the graphs of recall as the number of questions varies for three patterns. The
graphs show that recall improves as more questions are presented. Further, the graphs show that
partial learning may obtain good recall and is thus preferable to no learning. However, there
is no common behavior for the rate at which the recall is improved. For example, recall of head
and shoulders is low until the learning is almost complete, while that of two tops reaches its
maximum after one question, and that of cup with handle improves consistently.
3.6 Related Work
Queries over Finance Streams Several works aim to help technical analysts. Many trading
software platforms provide domain-specific languages for writing queries where the user defines
the query and the system is responsible for the sliding window mechanism, e.g., MetaTrader,
MetaStock, NinjaTrader, and Microsoft’s StreamInsight [CGM10].
Recently, AmiBroker added a feature that supports writing queries in natural language.
However, this feature is limited to a small set of English phrases provided by AmiBroker that
does not cover all of AmiBroker's instructions. Thus, this feature cannot be used for writing pattern
queries. Another tool designed to help analysts is Stat! [BCD+13], an interactive tool that
enables analysts to write queries in StreamInsight and shows the results of the current query
at each step as it is gradually built. CPL [ACK01] is a Haskell-based high-level language designed for chart
pattern queries. Its unique features include support for fuzzy constraints and pattern composition.
Composition simplifies the encoding of complex patterns by first defining their segments and
then composing them to form the pattern. This approach is applicable only for pattern definitions
that do not have constraints pertaining to pairs of points from different segments. As shown
in Section 3.5, many definitions contain such constraints.
Queries over Streams Many other languages support queries for streams. SASE [WDR06] is a
system designed for RFID streams (Radio Frequency Identification) that offers a user-friendly
language and can handle large volumes of data. Cayuga [BDG+07] is a system for detecting
complex patterns in streams, whose language is based on the Cayuga algebra. SPL [HAG+13]
is IBM’s stream processing language supporting pattern detection. ActiveSheets [VTR+14] is
a platform that extends Microsoft Excel with abilities to process real-time streams from within
spreadsheets. ActiveSheets enables users to process streams using Excel formulas and it can be
used to detect patterns in stock streams by defining corresponding automata and encoding their
states and transitions in the spreadsheet.
3.7 Conclusion
We presented SyFi, a tool for synthesizing pattern queries over finance streams. SyFi receives
an example chart and interacts with the analyst by presenting a series of charts to learn the
pattern formula. SyFi then produces programs that execute over real-time trading platforms and
detect pattern occurrences in price streams. We showed that SyFi learns common patterns and
synthesizes efficient queries that detect these patterns in real stock streams with high precision
and recall.
Chapter 4
Learning Disjunctions and Conjunctions of Predicates
In the previous chapter, we showed an exact learning algorithm that interacts with a user to
learn his intent, which was modeled as a conjunction over a particular set of predicates. In this
chapter, we consider a more general setting, where the user’s intent is a disjunctive (and dually,
a conjunctive) formula over arbitrary predefined predicates. More formally, let Q be a set of
predicates over a domain D. Our goal is to learn the class Q∨ = {∨q∈P q | P ⊆ Q} of all
disjunctions of predicates in Q. We give a learning algorithm D-SPEX that learns any function in
Q∨ with polynomially many queries. We then show that, given some computational complexity
conditions on the set of predicates, D-SPEX runs in polynomial time.
We demonstrate the above on the class of conjunctions over QI, where QI is the set of
variable inequalities, i.e., predicates of the form xi > xj over n variables. If the set is acyclic
(∧QI ≢ false), we show that learning can be done in polynomial time. If the set is cyclic
(∧QI ≡ false), we show that learning is equivalent to the problem of enumerating all the
maximal acyclic subgraphs of a directed graph, which is still an open problem ([ABC+12,
BCL+13, Was16]).
We begin this chapter with notations and main definitions. We then provide our algorithm
that learns disjunctions, discuss complexity, and describe conditions under which D-SPEX is
polynomial. We then provide the dual algorithm to learn conjunctions. Finally, we discuss the
case where the class of predicates is conjunctions over variable inequalities.
4.1 The Search Space
In this section, we describe the search space of the learning problem. We begin with defining
the nodes in the search space. To this end, we define an equivalence relation over the set
of disjunctions and the representatives of the equivalence classes. The nodes are then these
representatives. We continue with defining a partial order over the disjunctions, which defines
the edges between the nodes. We finally present related notions (descendant, ascendant, and
lowest/ greatest common descendant/ ascendant) that are translated later to the search paths.
4.1.1 The Nodes of the Search Space
Clearly, the nodes should correspond to the elements in Q∨. However, formulas in Q∨ may
be equivalent. To reduce the size of the search space, we define a node for each set of
equivalent formulas. To define the node, we first define an equivalence relation over Q∨ and the
representatives of the equivalence classes. Then, the nodes are the representative elements.
Let Q be a set of predicates over the domain D. The equivalence relation ≡ over Q∨ is
defined as follows: two disjunctions ϕ1, ϕ2 ∈ Q∨ are equivalent (ϕ1 ≡ ϕ2) if ϕ1 is logically
equal to ϕ2. We denote equivalence classes by [ϕ], where ϕ ∈ Q∨. Notice that if [ϕ1] = [ϕ2],
then [ϕ1 ∨ ϕ2] = [ϕ1] = [ϕ2]. We define for every [ϕ] the representative element to be
Gϕ = ∨q∈P q, where P ⊆ Q is the maximum-size set that satisfies ∨P ≡ ϕ. We denote by
G(Q∨) the set of all representative elements. That is, G(Q∨) = {Gϕ | ϕ ∈ Q∨}.
Example 1. Consider the domain D = {1, 2} × {1, 2} and the set Q = {x1 ≥ 1, x1 ≥ 2, x2 ≥ 1, x2 ≥ 2}. There are 16 formulas in Q∨ and five representative formulas: G(Q∨) = {(x1 ≥ 1) ∨ (x1 ≥ 2) ∨ (x2 ≥ 1) ∨ (x2 ≥ 2), (x1 ≥ 2) ∨ (x2 ≥ 2), (x1 ≥ 2), (x2 ≥ 2), false} (where
false is a contradiction).
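The five representatives can be checked mechanically. Below is a minimal Python sketch (the encoding of D and Q as data is ours, purely for illustration) that enumerates all 16 disjunctions of Example 1, groups them into equivalence classes by their meaning over D, and picks the maximum-size predicate set of each class as its representative:

```python
from itertools import combinations, product

# Example 1, encoded as data (encoding ours): D = {1,2} x {1,2} and
# Q = {x1>=1, x1>=2, x2>=1, x2>=2}, each predicate a Boolean function on D.
D = list(product([1, 2], repeat=2))
Q = {"x1>=1": lambda e: e[0] >= 1, "x1>=2": lambda e: e[0] >= 2,
     "x2>=1": lambda e: e[1] >= 1, "x2>=2": lambda e: e[1] >= 2}

def semantics(P):
    """A disjunction's meaning: the subset of D it satisfies."""
    return frozenset(e for e in D if any(Q[q](e) for q in P))

# Group the 16 subsets of Q into equivalence classes by their meaning; the
# representative of a class is its maximum-size predicate set.
subsets = [frozenset(c) for r in range(len(Q) + 1) for c in combinations(Q, r)]
classes = {}
for P in subsets:
    classes.setdefault(semantics(P), set()).add(P)
representatives = {max(ps, key=len) for ps in classes.values()}

print(len(subsets), len(representatives))  # 16 5
```

Running it prints `16 5`: the sixteen formulas collapse to the five representatives listed in the example.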
The following facts hold immediately from the above definitions:
Lemma 4.1.1. Let Q be a set of predicates. Then,
1. The size of the search space is |G(Q∨)|.
2. For every ϕ ∈ Q∨: Gϕ ≡ ϕ.
3. For every ϕ ∈ G(Q∨) and q ∈ Q \ Q(ϕ): ϕ ∨ q ≢ ϕ.
4.1.2 The Edges of the Search Space
In this section, we define a partial order over Q∨. This partial order defines a Hasse diagram
over G(Q∨), which serves as the search space and in particular describes the edges between
the nodes in G(Q∨). The partial order, denoted by ⇒, is defined as follows: ϕ1⇒ϕ2 if ϕ1
logically implies ϕ2, i.e., ϕ1 |= ϕ2. Consider the Hasse diagram H(Q∨) of G(Q∨) for this
partial order. The maximum (top) element in the diagram is Gmax = ∨q∈Qq. The minimum
(bottom) element is Gmin ≡ ∨q∈Øq, i.e., a contradiction.
In a Hasse diagram, G1 is a descendant (resp. ascendant) of G2 if there is a (nonempty)
downward path from G2 to G1 (resp. from G1 to G2), i.e., G1⇒G2 (resp. G2⇒G1) and
G1 6= G2. G1 is an immediate descendant of G2 in H(Q∨) if G1⇒G2, G1 6= G2 and there is
no G such that G 6= G1, G 6= G2 and G1⇒G⇒G2. G1 is an immediate ascendant of G2 if G2
is an immediate descendant of G1. We now show some preliminary results.
Properties of the Hasse Diagram
Lemma 4.1.2. Let G1 be an immediate descendant of G2 and ϕ ∈ Q∨. If G1⇒ϕ⇒G2, then
G1 ≡ ϕ or G2 ≡ ϕ.
Proof. Since ϕ ≡ Gϕ, G1⇒Gϕ⇒G2. By the definition of immediate descendant, G1 = Gϕ or
G2 = Gϕ.
Lemma 4.1.3. If G1 is a descendant of G2, then Q(G1) ⊊ Q(G2).
Proof. • Q(G1) ⊆ Q(G2): Assume there is q ∈ Q(G1) \ Q(G2). If G1 is a descendant
of G2, then G1⇒G2 and thus G1 ∨ G2 ≡ G2. By Lemma 4.1.1, q ∨ G2 ≢ G2, which
contradicts G1 ∨ G2 ≡ G2.
• Q(G1) ⊊ Q(G2): Assume otherwise; then G1 = G2 and thus G1 is not a descendant of
G2.
We denote by De(G) and As(G) the sets of all the immediate descendants and immediate
ascendants of G, respectively. We further denote by DE(G), AS(G) the sets of all G’s
descendants and ascendants, respectively.
Lowest Common Ascendant and Greatest Common Descendant For G1 and G2, we define
their lowest common ascendant (resp. greatest common descendant) G = lca(G1, G2)
(resp. G = gcd(G1, G2)) to be the formula G ∈ G(Q∨) that is the minimum (resp. maximum)
element in AS(G1) ∩ AS(G2) (resp. DE(G1) ∩ DE(G2)). This gives us the following lemma.
Lemma 4.1.4. Let G1, G2 ∈ G(Q∨) and ϕ ∈ Q∨.
1. If G1⇒ϕ⇒lca(G1, G2) and G2⇒ϕ⇒lca(G1, G2), then ϕ ≡ lca(G1, G2).
2. If gcd(G1, G2)⇒ϕ⇒G1 and gcd(G1, G2)⇒ϕ⇒G2, then ϕ ≡ gcd(G1, G2).
Proof. 1. Gϕ ≡ ϕ and Gϕ ∈ G(Q∨). Since G1⇒Gϕ⇒lca(G1, G2) and
G2⇒Gϕ⇒lca(G1, G2), by the definition of lca, Gϕ ≡ lca(G1, G2). Bullet 2. is similar.
We next characterize lca and gcd.
Lemma 4.1.5. Let G1, G2 ∈ G(Q∨). Then, lca(G1, G2) ≡ G1 ∨G2. In particular, if G1, G2
are two distinct immediate descendants of G, then G1 ∨G2 ≡ G.
Proof. Since G1⇒lca(G1, G2) and G2⇒lca(G1, G2), we get G1 ∨ G2⇒lca(G1, G2). Since
G1⇒(G1 ∨ G2)⇒lca(G1, G2) and G2⇒(G1 ∨ G2)⇒lca(G1, G2), by Lemma 4.1.4, we get
G1 ∨G2 ≡ lca(G1, G2).
Note that this does not imply that the predicates in these formulas are the same; namely, it
does not imply that Q(G1 ∨ G2) = Q(G1) ∪ Q(G2) = Q(lca(G1, G2)). In particular, G1 ∨ G2
is not necessarily in G(Q∨). However, for the gcd, its predicates are exactly the
intersection of the predicates of G1 and G2, which is our next lemma.
Lemma 4.1.6. Let G1, G2 ∈ G(Q∨). Then, Q(G1) ∩Q(G2) = Q(gcd(G1, G2)).
In particular, if G1, G2 ∈ G(Q∨), then ∨(Q(G1) ∩Q(G2)) ∈ G(Q∨).
Also, if G1, G2 are two distinct immediate ascendants of G, then Q(G1) ∩Q(G2) = Q(G).
Proof. • Q(gcd(G1, G2)) ⊆ Q(G1) ∩ Q(G2): Follows since, by Lemma 4.1.3,
Q(gcd(G1, G2)) ⊆ Q(G1) and Q(gcd(G1, G2)) ⊆ Q(G2).
• Q(G1) ∩ Q(G2) ⊆ Q(gcd(G1, G2)): Since Q(gcd(G1, G2)) ⊆ Q(G1) ∩ Q(G2),
gcd(G1, G2) = ∨Q(gcd(G1, G2)) ⇒ ∨(Q(G1) ∩ Q(G2)). Since ∨(Q(G1) ∩ Q(G2))⇒G1 and ∨(Q(G1) ∩ Q(G2))⇒G2, by Lemma 4.1.4 we get gcd(G1, G2) =
∨(Q(G1) ∩ Q(G2)). Thus, Q(G1) ∩ Q(G2) ⊆ Q(gcd(G1, G2)).
If G1 and G2 are two distinct immediate ascendants of G:
• Q(G) ⊆ Q(G1) ∩ Q(G2) = Q(gcd(G1, G2)): Follows since Q(G) ⊆ Q(G1) and
Q(G) ⊆ Q(G2).
• Q(gcd(G1, G2)) ⊆ Q(G): Q(gcd(G1, G2)) ⊆ Q(G1) ∩ Q(G2) and Q(G) ⊆ Q(G1) ∩ Q(G2). Since G is an immediate descendant of G1 and G2, gcd(G1, G2)⇒G⇒G1 and
gcd(G1, G2)⇒G⇒G2, so by Lemma 4.1.4 we get gcd(G1, G2) = G.
4.2 Searching the Space with Witnesses
In this section, we describe how the search space is traversed. We begin with defining a key
term called witness. Let G1 and G2 be elements in G(Q∨). An element e ∈ D is a witness
for G1 and G2 if G1(e) 6= G2(e) (here we treat formulas as Boolean functions, as described in
Chapter 2).
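Over a finite domain a witness can be found by simply scanning D; in general the role of this scan is played by an SMT solver, which returns a model of a formula on which the two candidates disagree. A small illustrative sketch (the predicate encodings are ours):

```python
from itertools import product

# Witness finding over a finite domain (encoding ours); in general a witness
# is obtained as a model from an SMT solver rather than by scanning D.
D = list(product([1, 2], repeat=2))

def disj(preds):
    """Interpret a list of predicate functions as their disjunction."""
    return lambda e: any(q(e) for q in preds)

def witness(G1, G2):
    """Return some e in D with G1(e) != G2(e), or None if none exists."""
    return next((e for e in D if G1(e) != G2(e)), None)

G1 = disj([lambda e: e[0] >= 2, lambda e: e[1] >= 2])   # (x1>=2) v (x2>=2)
G2 = disj([lambda e: e[0] >= 2])                         # (x1>=2)
e = witness(G1, G2)
print(e)  # (1, 2): satisfies G1 but not G2
```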
We begin with providing a few properties of a witness. The first lemma describes which
predicates are satisfied by the witness.
Lemma 4.2.1. Let G1 be an immediate descendant of G2. If e ∈ D is a witness for G1 and G2,
then:
1. G1(e) = 0 and G2(e) = 1.
2. For every q ∈ Q(G1), q(e) = 0.
3. For every q ∈ Q(G2) \ Q(G1), q(e) = 1.
Proof. Since G1⇒G2 it must be that G2(e) = 1 and G1(e) = 0. Namely, for every q ∈ Q(G1),
q(e) = 0. Let q ∈ Q(G2) \ Q(G1). Consider G1 ∨ q. By bullet 3 in Lemma 4.1.1, G1 ∨ q ≢ G1.
Since G1⇒G1 ∨ q⇒G2, by Lemma 4.1.2, G1 ∨ q ≡ G2. Therefore, 1 = G2(e) = G1(e) ∨ q(e) = q(e).
Our next lemma states that given a node, it has a different witness for every immediate
descendant.
Lemma 4.2.2. Let De(G) = {G1, G2, . . . , Gt} be the set of immediate descendants of G. If e
is a witness for G1 and G, then e is not a witness for Gi and G for all i > 1. That is, G1(e) = 0,
G(e) = 1, and G2(e) = · · · = Gt(e) = 1.
Proof. By Lemma 4.2.1 G(e) = 1 and G1(e) = 0. For any Gi, i ≥ 2, G1 and Gi are
immediate descendants of G and thus, by Lemma 4.1.5, G ≡ G1 ∨Gi. Therefore, 1 = G(e) =
G1(e) ∨Gi(e) = Gi(e).
Algorithm 6: D-SPEX
1 return Learn(Gmax, ∅)
2 Function Learn(G, T):
3   Q = Q(G)  // The set of predicates that the target ψ contains
4   Flag = true  // Indicates whether the target is suspected to be G
5   for G′ ∈ getAllImmDe(G) do
6     if ∃P ∈ T. Q(G′) ⊆ P then continue  // G′ was eliminated by an ancestor
7     e = model(G ∧ ¬G′)  // get a witness for G and G′
8     if ψ(e) = 0 then  // pose membership query
9       Q = Q ∩ Q(G′); Flag = false  // The target is G′ or its descendant
10    else
11      T = T ∪ {Q(G′)}  // Eliminate G′ and all its descendants
12  if Flag then return G
13  Learn(∨Q, T)
Finally, we show how the witness enables the space to be searched for the target formula.
Lemma 4.2.3. Let G′ be an immediate descendant of G, e ∈ D be a witness for G and G′, and
G′′ be a descendant of G.
1. If G′′(e) = 0, G′′ is a descendant of G′ or equal to G′. In particular, Q(G′′) ⊆ Q(G′).
2. If G′′(e) = 1, G′′ is not a descendant of G′ nor equal to G′. In particular, Q(G′′) ⊄ Q(G′).
Proof. Since G′′ is a descendant of G, we have Q(G′′) ⊊ Q(G). By Lemma 4.2.1, for every
q ∈ Q(G′), q(e) = 0 and for every q ∈ Q(G) \ Q(G′), q(e) = 1. Thus, if G′′(e) = 0, then no
q ∈ Q(G) \ Q(G′) is in Q(G′′) (otherwise, G′′(e) = 1). Therefore, Q(G′′) ⊆ Q(G′) and G′′
is a descendant of G′ or equal to G′. Otherwise, if G′′(e) = 1, then G′′ is not a descendant of
G′ nor equal to G′ (since if it were, it must have been that G′′(e) = 0).
4.3 The D-SPEX Algorithm
In this section, we present our algorithm, called D-SPEX, that learns the class Q∨. Our algorithm
relies on the results from the previous section. To find the target formula ψ (more precisely,
its representative Gψ in G(Q∨)), D-SPEX starts from the maximal element in G(Q∨) and
traverses the Hasse diagram downwards. At each step, D-SPEX considers an element G, checks
its witnesses with its immediate descendants, and poses a membership query for each. The
witness is obtained by obtaining a satisfying example for the formula G ∧ ¬G′ (e.g., using an
SMT solver). If ψ and G agree on the witness of G and G′, then by Lemma 4.2.3, ψ cannot be
G′ or its descendant, and thus these are pruned from the search space. Otherwise, if ψ and G′
agree on the witness, then ψ must be G′ or its descendant, and thus all other elements in G(Q∨)
are pruned.
The D-SPEX algorithm is depicted in Algorithm 6. D-SPEX calls the recursive algorithm
Learn, which takes a candidate G and a set T of subsets of Q that stores the already eliminated
elements from Q∨. Learn computes Q, a set of predicates over which ψ (i.e., Gψ) is defined
(i.e., Q(Gψ) ⊆ Q). During the execution, Q may be reduced. If not, then G = ∨Q ≡ ψ. Learn
begins by initializing Q to the predicates in G, i.e., Q(G). Then, it examines the immediate
descendants of G whose ancestors have not been eliminated. When considering G′, a witness
e is obtained and Learn poses a membership query to learn ψ(e). If ψ(e) = 0 (recall that
G(e) = 1 since e is a witness), then G 6≡ ψ and ψ is inferred to be a descendant of G′ and is
thus over the predicates in Q(G′). Thus, Q is reduced. Otherwise, ψ is not a descendant of G′,
and thus G′ and its descendants are eliminated from the search space by adding Q(G′) to T .
Finally, if G and ψ agreed on all witnesses (evident by the Flag variable), then G is returned.
Intuitively, correctness follows since an invariant of the execution is that Gψ is G or one of its
descendants, and if G and ψ agreed on all witnesses, then by Lemma 4.2.3 Gψ is not any of G’s
descendants. Otherwise, if G and ψ did not agree on all witnesses (Flag is false), then Gψ is
inferred to be one of G’s descendants (by Lemma 4.2.3). More precisely, Gψ is a descendant
of the children that agreed with ψ on their witnesses. By the definition of gcd, ψ must be
that gcd or its descendant. Thus, Learn is invoked on their gcd, which by Lemma 4.1.6, is
the disjunction of their common predicates (stored in Q). Note that by the same lemma, this
disjunction is part of the Hasse Diagram (i.e., ∨Q ∈ G(Q∨)).
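To make the traversal concrete, here is a brute-force Python sketch of the Learn loop on the finite setting of Example 1. It is ours and deliberately simplified: witnesses and immediate descendants are computed by exhaustive enumeration rather than by an SMT solver, and the T bookkeeping that avoids re-exploring eliminated branches is omitted.

```python
from itertools import combinations, product

# Brute-force sketch of D-SPEX on Example 1 (encoding ours).
D = list(product([1, 2], repeat=2))
Q = {"x1>=1": lambda e: e[0] >= 1, "x1>=2": lambda e: e[0] >= 2,
     "x2>=1": lambda e: e[1] >= 1, "x2>=2": lambda e: e[1] >= 2}

def sem(P):
    """The meaning of the disjunction over predicate set P."""
    return frozenset(e for e in D if any(Q[q](e) for q in P))

# Representatives: for each equivalence class, the maximum-size predicate set.
classes = {}
for r in range(len(Q) + 1):
    for c in combinations(Q, r):
        classes.setdefault(sem(frozenset(c)), set()).add(frozenset(c))
reps = {max(ps, key=len) for ps in classes.values()}

def imm_descendants(G):
    """Representatives strictly below G with no representative in between."""
    below = [G2 for G2 in reps if sem(G2) < sem(G)]
    return [G2 for G2 in below
            if not any(sem(G2) < sem(G3) < sem(G) for G3 in below)]

def d_spex(psi):
    """Learn the representative of target psi using membership queries psi(e)."""
    G = max(reps, key=len)                     # G_max, top of the Hasse diagram
    while True:
        Qset, flag = set(G), True
        for G2 in imm_descendants(G):
            e = next(x for x in D if x in sem(G) - sem(G2))  # witness for G, G2
            if not psi(e):                     # membership query answered 0
                Qset &= G2                     # target is G2 or a descendant
                flag = False
        if flag:                               # agreed on all witnesses
            return G
        G = frozenset(Qset)                    # recurse on the gcd's predicates

print(sorted(d_spex(lambda e: e[0] >= 2)))     # ['x1>=2']
```

For the target ψ = (x1 ≥ 2), the search descends from Gmax to (x1 ≥ 2) ∨ (x2 ≥ 2) and then to (x1 ≥ 2), where all witnesses agree with ψ and the search stops.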
We now analyze D-SPEX’s complexity.
Theorem 4.1. If the immediate descendants of any G ∈ G(Q∨) can be found in time t, then
D-SPEX learns the target formula in time t · |Q| and with at most |Q| · maxG∈G(Q∨) |De(G)| membership queries.
The complexity proofs follow directly from the height of the Hasse diagram (|Q|) and the
maximal number of immediate descendants (maxG∈G(Q∨) |De(G)|). The fact that D-SPEX
learns the target formula follows from the following lemma.
Lemma 4.3.1. Let ψ be the target formula. If Learn returns G, then Gψ = G (?). Otherwise,
if Learn calls Learn(∨Q, T), then:
1. Q(Gψ) ⊆ Q. That is, Gψ is a descendant of ∨Q or equal to ∨Q.
2. Q(Gψ) ⊄ P for all P ∈ T. That is, Gψ is not a descendant of, or equal to, any ∨P for
P ∈ T.
Proof. The proof is by induction. Obviously, the induction hypothesis is true for (Gmax, ∅).
Assume the induction hypothesis is true for (∨Q, T). That is, Q(Gψ) ⊆ Q and Q(Gψ) ⊄ P
for all P ∈ T. Let G′1, . . . , G′ℓ be all the immediate descendants of ∨Q. If Q(G′i) ⊆ P for
some P ∈ T, then G′i and all its descendants G′′ satisfy Q(G′′) ⊆ Q(G′i) ⊆ P and thus Gψ is not
G′i or a descendant of G′i.
Assume now that Q(G′i) ⊄ P for all P ∈ T. Let e(i) be a witness for ∨Q and G′i. If
ψ(e(i)) = 1, then by Lemma 4.2.3 Gψ is not a descendant of G′i and not equal to G′i. This
implies that Q(Gψ) ⊄ Q(G′i), which is why Q(G′i) is added to T. This proves bullet 2.
If ψ(e(i)) = 1 for all i, then Gψ = G. This follows since, by Lemma 4.2.3, ψ is not any of
G’s descendants, and thus by the induction hypothesis it must be G. This is the case when the
Flag variable does not change to false and D-SPEX outputs G. This proves (?).
If ψ(e(i)) = 0, then by Lemma 4.2.3, Gψ is a descendant of G′i or equal to G′i. Let I
be the set of all indices i for which ψ(e(i)) = 0. Then, Gψ is a descendant of (or equal to)
all G′i, i ∈ I, and therefore Gψ is a descendant of or equal to gcd({G′i}i∈I). By Lemma 4.1.6,
Q(gcd({G′i}i∈I)) = ∩i∈I Q(G′i). Thus, D-SPEX takes the new Q to be ∩i∈I Q(G′i). This
proves bullet 1.
We now prove the lower bound.
Theorem 4.2. Any learning algorithm that learns Q∨ must pose at least
max(log |G(Q∨)|, maxG∈G(Q∨) |De(G)|) membership queries. In particular, D-SPEX
poses at most |Q| · OPT(Q∨) membership queries.
Proof. • OPT(Q∨) ≥ log |G(Q∨)|: The number of different formulas in Q∨ is |G(Q∨)|, and thus from the information-theoretic lower bound we get OPT(Q∨) ≥ ⌈log |G(Q∨)|⌉.
• OPT(Q∨) ≥ maxG∈G(Q∨) |De(G)|: Let G′ be such that m = |De(G′)| =
maxG∈G(Q∨) |De(G)|. Let G1, . . . , Gm be the immediate descendants of G′. If the
target formula is either G′ or one of its immediate descendants, then any learning
algorithm must pose a membership query e(i) such that G′(e(i)) = 1 and Gi(e(i)) = 0.
Without such an assignment the algorithm cannot distinguish between G′ and Gi. By
Lemma 4.2.2, e(i) is a witness only to Gi and therefore the algorithm requires at least m
membership queries.
Finding All Immediate Descendants of G A missing detail in D-SPEX is how to find the
immediate descendants of G in the Hasse diagram H(S(G)) (in Line 5). In this section, we
explain how to obtain them. We first characterize the elements in H(S(G)) (compared to
the other elements in Q∨), which is required because the immediate descendants are part
of H(S(G)). We then give a characterization of the immediate descendants (compared to
other descendants), which leads to an operation that computes an immediate descendant from
a descendant. We finally show how to compute descendants that lead to obtaining different
immediate descendants. This completes the description of how D-SPEX can obtain all immediate
descendants.
By the definition of a representative, for every ϕ ∈ Q∨: Gϕ = ∨q⊨ϕ q. To decide whether
ϕ ∈ Q∨ is a representative, i.e., whether ϕ ∈ G(Q∨), we use the following lemma.
Lemma 4.3.2. Let ϕ ∈ Q∨. ϕ ∈ G(Q∨) if and only if for every q ∈ Q \ Q(ϕ): ϕ ∨ q ≢ ϕ.
Proof. Follows from the definition of G(Q∨) (Lemma 4.1.1).
The next lemma shows how to decide whether G′ is an immediate descendant of G.
Algorithm 7: GetAllImmDe(G)
1 Function GetImmDe(G, G′′):
2   Q′ = Q(G′′)
3   while ∃q ∈ Q(G) \ Q′ : (∨Q′) ∨ q ≢ G do Q′ = Q′ ∪ {q}
4   return Q′
5 De = {GetImmDe(G, false)}
6 e = model(G ∧ ∧_{i=1}^{m} ∨_{q∈Q(G)\Q(Gi)} ¬q)
7 while e ≠ ⊥ do
8   De = De ∪ {GetImmDe(G, ∨{q ∈ Q(G) | e ⊨ ¬q})}
9   e = model(G ∧ ∧_{i=1}^{m} ∨_{q∈Q(G)\Q(Gi)} ¬q)
10 return De
Lemma 4.3.3. Let G, G′ ∈ G(Q∨). G′ is an immediate descendant of G if and only if G′ is a
descendant of G and for every q ∈ Q(G) \ Q(G′) we have G′ ∨ q ≡ G.
Further, if G′ is a descendant of G and for some q ∈ Q(G) \ Q(G′) we have G′ ∨ q ≢ G,
then GG′∨q is a descendant of G and an ascendant of G′.
Proof. Only if: Let G′ be an immediate descendant of G, i.e., G′⇒G. Let q ∈ Q(G) \ Q(G′).
Since G′⇒(G′ ∨ q)⇒G and G′ ≢ G′ ∨ q (since G′ ∈ G(Q∨)), we get from Lemma 4.1.2 that
G′ ∨ q ≡ G.
If: Suppose G′ is a descendant of G and for every q ∈ Q(G) \ Q(G′) we have G′ ∨ q ≡ G.
If G′ is not an immediate descendant of G, then let G′′ be a descendant of G and an immediate
ascendant of G′. Take any q ∈ Q(G′′) \ Q(G′) ⊊ Q(G) \ Q(G′). Then, as before, by
Lemma 4.1.2, G′ ∨ q ≡ G′′. However, G′′ ≢ G and thus G′ ∨ q ≢ G – a contradiction. This
also proves the last statement of the lemma.
The above lemma guides the computation of an immediate descendant from a descendant:
predicates from Q are added to the descendant as long as the resulting formula is not equivalent
to G. We phrase this in an operation called GetImmDe (Algorithm 7). GetImmDe takes G and a
descendant G′′ of G (which can even be the contradiction false), initializes Q′ = Q(G′′), and
repeatedly extends Q′ as follows while possible: for q ∈ Q(G) \ Q′, if (∨Q′) ∨ q ≢ G, q is added
to Q′.
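A finite-domain sketch of GetImmDe (ours; equivalence to G is decided by scanning D instead of querying an SMT solver):

```python
from itertools import product

# GetImmDe over a finite domain (encoding ours): grow a descendant of G by
# adding predicates of Q(G) while the result stays inequivalent to G
# (Lemma 4.3.3); equivalence is decided by scanning D instead of an SMT call.
D = list(product([1, 2], repeat=2))
Q = {"x1>=1": lambda e: e[0] >= 1, "x1>=2": lambda e: e[0] >= 2,
     "x2>=1": lambda e: e[1] >= 1, "x2>=2": lambda e: e[1] >= 2}

def sem(P):
    """The meaning of the disjunction over predicate set P."""
    return frozenset(e for e in D if any(Q[q](e) for q in P))

def get_imm_de(G, G2):
    """From a descendant G2 of G (possibly false, i.e. the empty set),
    return an immediate descendant of G."""
    Qp = set(G2)
    changed = True
    while changed:
        changed = False
        for q in sorted(set(G) - Qp):
            if sem(Qp | {q}) != sem(G):   # (vQ') v q is not equivalent to G
                Qp.add(q)
                changed = True
    return frozenset(Qp)

print(sorted(get_imm_de(frozenset(Q), frozenset())))  # ['x1>=2', 'x2>=2']
```

Starting from false under Gmax of Example 1 it returns {x1 ≥ 2, x2 ≥ 2}, which is indeed Gmax's only immediate descendant there.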
GetImmDe can be used to obtain the first immediate descendant by calling it with the
contradiction false. Then the question is how to obtain a descendant for which GetImmDe will
return a different immediate descendant. More generally, the question is how to obtain a new
immediate descendant after computing a set of immediate descendants, or determine there are
no more immediate descendants. We first give intuition and then formalize it.
Let G′ be an immediate descendant of G. By Lemma 4.1.3, the predicates of any descendant
of G′ are contained in Q(G′) and none are in Q(G) \ Q(G′). Thus, a descendant of G which is
not a descendant of G′ can be constructed by looking for a descendant of G that satisfies one of
the predicates in Q(G) \ Q(G′). This ensures, by the operation of GetImmDe, that the resulting
immediate descendant will also contain that predicate. Technically, a descendant of G might
be found by looking for an element satisfying ∨_{q∈Q(G)\Q(G′)} q. However, that element may
also satisfy the rest of the predicates in Q(G) and thus GetImmDe would result in returning G
itself. To ensure that we find an element that is a descendant of G, we look for a witness of G
and an (unknown) descendant. Namely, we look for an element that satisfies G and falsifies that
descendant’s predicates. Though we do not know that descendant’s predicates, we know that
they intersect with Q(G) \ Q(G′). In general, we know that given a set of immediate descendants
{G1, . . . , Gm}, that descendant’s predicates intersect with each Q(G) \ Q(Gi). This
guides us to look for an example satisfying G ∧ ∧_{i=1}^{m} ∨_{q∈Q(G)\Q(Gi)} ¬q. Given a satisfying
example, we construct the descendant by collecting the predicates whose negation was satisfied
by the example. We next formalize this.
Lemma 4.3.4. Let G1, . . . , Gm be immediate descendants of G. There is no other immediate
descendant of G if and only if
G ∧ ∧_{i=1}^{m} ∨_{q∈Q(G)\Q(Gi)} ¬q    (4.1)
is unsatisfiable. If (4.1) is satisfiable, then for any example e of (4.1), we have that ∨{q ∈ Q(G) | e ⊨ ¬q} is a descendant of G but is not equal to, and is not a descendant of, any Gi, i =
1, . . . , m.
Proof. Only if: Suppose G ∧ ∧_{i=1}^{m} ∨_{q∈Q(G)\Q(Gi)} ¬q is satisfiable, and let e be a satisfying
example. Namely, for every i there is qi ∈ Q(G) \ Q(Gi) such that qi(e) = 0. Denote
Ge = ∨{q ∈ Q(G) | e ⊨ ¬q}. Since qi ∉ Q(Gi), Ge is not a descendant of Gi (Lemma 4.1.3).
Q(Ge) ⊊ Q(G), since there exists q ∈ Q(G) such that e ⊨ q, and thus e ⊭ ¬q and q ∉ Q(Ge).
Thus, since Ge ≢ G and Ge⇒G, Ge is a descendant of G. Since Ge is not a descendant of any
Gi, there must be another immediate descendant of G.
If: Assume that there is another immediate descendant of G, denoted by G′; we show that
(4.1) is satisfiable. Let e be a witness for G and G′. Then, by Lemma 4.2.1, e ⊨ G and for
every q ∈ Q(G′) we have q(e) = 0. Since Q(G′) ⊈ Q(Gi), (Q(G)\Q(Gi)) ∩ Q(G′) is not
empty. Choose qi ∈ (Q(G)\Q(Gi)) ∩ Q(G′). Then, qi ∈ Q(G)\Q(Gi) and since qi ∈ Q(G′),
qi(e) = 0 and ¬qi(e) = 1. Therefore, (4.1) is satisfiable.
4.4 The C-SPEX Algorithm
The C-SPEX algorithm that learns conjunctions over Q is dual to D-SPEX. C-SPEX learns the
class Q∧ = {∧Q′ | Q′ ⊆ Q}. The changes to D-SPEX (Algorithm 6) are: (i) the witness
is obtained by taking an example satisfying ¬G ∧ G′, (ii) the membership-query condition is changed
from 0 to 1, and (iii) Learn is invoked on ∧Q instead of ∨Q. By duality (De Morgan’s law),
all our results are true for learning Q∧ (after swapping ∨ with ∧). In this section, we provide
the lemmas where the changes are more than swapping ∨ with ∧.
Witnesses We begin with lemmas that provide characteristics of the witnesses.
Lemma 4.4.1. Let G1 be an immediate descendant of G2. If e ∈ D is a witness for G1 and G2,
then:
1. G1(e) = 1 and G2(e) = 0.
2. For every q ∈ Q(G1), q(e) = 1.
3. For every q ∈ Q(G2) \ Q(G1), q(e) = 0.
Proof. Since G1⇒G2 it must be that G2(e) = 0 and G1(e) = 1. Namely, for every q ∈ Q(G1),
q(e) = 1. Let q ∈ Q(G2) \ Q(G1). Consider G1 ∧ q. By bullet 3 in (the dual) Lemma 4.1.1,
G1 ∧ q ≢ G1. Since G2⇒G1 ∧ q⇒G1, by (the dual) Lemma 4.1.2, G1 ∧ q ≡ G2. Therefore,
0 = G2(e) = G1(e) ∧ q(e) = q(e).
Lemma 4.4.2. Let De(G) = {G1, G2, . . . , Gt} be the immediate descendants of G. If e is a
witness for G1 and G, then e is not a witness for Gi and G for all i > 1. That is, G1(e) = 1,
G(e) = 0, and G2(e) = · · · = Gt(e) = 0.
Proof. By the previous lemma (Lemma 4.4.1), G(e) = 0 and G1(e) = 1. For any Gi, i ≥ 2, G1
and Gi are immediate descendants of G and thus, by (the dual) Lemma 4.1.5, G ≡ G1 ∧ Gi. Therefore, 0 = G(e) = G1(e) ∧ Gi(e) = Gi(e).
C-SPEX We continue with the lemmas pertaining to the correctness of C-SPEX.
Lemma 4.4.3. Let G′ be an immediate descendant of G, e ∈ D be a witness for G and G′, and
G′′ be a descendant of G.
1. If G′′(e) = 1, G′′ is a descendant of G′ or equal to G′. In particular, Q(G′′) ⊆ Q(G′).
2. If G′′(e) = 0, G′′ is not a descendant of G′ nor equal to G′. In particular, Q(G′′) ⊄ Q(G′).
Proof. Since G′′ is a descendant of G, we have Q(G′′) ⊊ Q(G). By Lemma 4.4.1, for every
q ∈ Q(G′), q(e) = 1 and for every q ∈ Q(G) \ Q(G′), q(e) = 0. Thus, if G′′(e) = 1, then no
q ∈ Q(G) \ Q(G′) is in Q(G′′) (otherwise, G′′(e) = 0). Therefore, Q(G′′) ⊆ Q(G′) and G′′
is a descendant of G′ or equal to G′. Otherwise, if G′′(e) = 0, then G′′ is not a descendant of
G′ nor equal to G′ (since if it were, it must have been that G′′(e) = 1).
Lemma 4.4.4. Let ψ be the target formula. If Learn returns G, then Gψ = G (?). Otherwise,
if Learn calls Learn(∧Q, T), then:
1. Q(Gψ) ⊆ Q. That is, Gψ is a descendant of ∧Q or equal to ∧Q.
2. Q(Gψ) ⊄ P for all P ∈ T. That is, Gψ is not a descendant of, or equal to, any ∧P for
P ∈ T.
Proof. The proof is by induction. The induction hypothesis is true for (Gmax, ∅). Assume the
induction hypothesis is true for (∧Q, T) (Q(Gψ) ⊆ Q and Q(Gψ) ⊄ P for all P ∈ T). Let
G′1, . . . , G′ℓ be all the immediate descendants of ∧Q. If Q(G′i) ⊆ P for some P ∈ T, G′i and
all its descendants G′′ satisfy Q(G′′) ⊆ Q(G′i) ⊆ P, and thus Gψ is not G′i or a descendant of
G′i.
Assume now that Q(G′i) ⊄ P for all P ∈ T. Let e(i) be a witness for ∧Q and G′i. If
ψ(e(i)) = 0, then by Lemma 4.4.3 Gψ is not a descendant of G′i and not equal to G′i. This
implies that Q(Gψ) ⊄ Q(G′i), which is why Q(G′i) is added to T. This proves bullet 2.
If ψ(e(i)) = 0 for all i, then Gψ = G. This follows since by Lemma 4.4.3, ψ is not any of
G’s descendants, and thus by the induction hypothesis it must be G. This is the case when the
Flag variable does not change to false and C-SPEX outputs G. This proves (?).
If ψ(e(i)) = 1, then by Lemma 4.4.3, Gψ is a descendant of G′i or equal to G′i. Let I be
the set of all indices i for which ψ(e(i)) = 1. Then, Gψ is a descendant of (or equal to) all G′i,
i ∈ I, and therefore Gψ is a descendant of or equal to gcd({G′i}i∈I). By (the dual) Lemma 4.1.6,
Q(gcd({G′i}i∈I)) = ∩i∈I Q(G′i). Thus, C-SPEX takes the new Q to be ∩i∈I Q(G′i). This
proves bullet 1.
Immediate Descendants Finally, we provide the dual lemmas pertaining to obtaining the
immediate descendants. Proofs are identical when swapping ∨ with ∧.
Lemma 4.4.5. Let ϕ ∈ Q∧. ϕ ∈ G(Q∧) if and only if for every q ∈ Q \ Q(ϕ): ϕ ∧ q ≢ ϕ.
Lemma 4.4.6. Let G,G′ ∈ G(Q∧). G′ is an immediate descendant of G if and only if G′ is a
descendant of G and, for every q ∈ Q(G) \ Q(G′), we have G′ ∧ q ≡ G.
Further, if G′ is a descendant of G and for some q ∈ Q(G) \ Q(G′) we have G′ ∧ q ≢ G,
then GG′∧q is a descendant of G and an ascendant of G′.
Lemma 4.4.7. Let G1, . . . , Gm be immediate descendants of G. There is no other immediate
descendant of G if and only if (¬G) ∧ ∧_{i=1}^{m} ∨_{q∈Q(G)\Q(Gi)} q is unsatisfiable. If it is satisfiable,
then for any satisfying example e, we have that ∧{q ∈ Q(G) | e ⊨ q} is a descendant of G but is
not equal to, and is not a descendant of, any Gi, i = 1, . . . , m.
Proof. Only if: Suppose (¬G) ∧ ∧_{i=1}^{m} ∨_{q∈Q(G)\Q(Gi)} q is satisfiable, and let e be a satisfying
example. Namely, for every i there is qi ∈ Q(G) \ Q(Gi) such that qi(e) = 1. Denote
Ge = ∧{q ∈ Q(G) | e ⊨ q}. Since qi ∉ Q(Gi), Ge is not a descendant of Gi (the dual
Lemma 4.1.3). Ge is not equivalent to G, since there exists q ∈ Q(G) such that e ⊭ q
(as e ⊨ ¬G) and thus q ∉ Q(Ge). Since Ge is a descendant of G (G⇒Ge and G ≢ Ge) and not a
descendant of any Gi, there must be another immediate descendant of G.
If: Assume that there is another immediate descendant of G, denoted by G′; we show that
the formula is satisfiable. Let e be a witness for G and G′. Then, by Lemma 4.4.1, e ⊭ G and
for every q ∈ Q(G′) we have q(e) = 1. Since Q(G′) ⊈ Q(Gi), (Q(G)\Q(Gi)) ∩ Q(G′) is not
empty. Choose qi ∈ (Q(G)\Q(Gi)) ∩ Q(G′). Then, qi ∈ Q(G)\Q(Gi) and since qi ∈ Q(G′),
qi(e) = 1. Therefore, the formula is satisfiable.
4.5 A Polynomial Time Algorithm for Variable Inequalities
In this section, we study the learnability of conjunctions over variable inequality predicates.
An application of this class was studied in the previous chapter. There, the learning algorithm
started from a positive example, whereas here we make no such assumption. In this section, we
fix the domain to D = Rn (where R is the set of real numbers) and denote an example as a
tuple e = (xe1, . . . , xen). The predicates we consider are pairwise inequalities
over these variables, i.e., they take the form xi > xj. More formally, given I ⊆ [n]2, where
[n] = {1, 2, . . . , n}, the set of predicates we consider is QI = {xi > xj | (i, j) ∈ I}. We
assume throughout this section that (i, i) ∉ I for all i.
We first focus on the subset of conjunctions that do not imply cyclic constraints
(e.g., the conjunction x1 > x2 ∧ x2 > x1 has a cyclic constraint). We give a polynomial-time
algorithm for learning this class. We then study the general case, where any conjunction
over QI is allowed. We show that in this case the learning problem is equivalent to the open
problem of enumerating all the maximal acyclic subgraphs of a given directed graph.
The main idea of the proofs is to represent a conjunction as a directed graph, where the
nodes are the variables and the edges are the constraints. Before introducing the graph, we
provide notations. For a set J ⊆ [n]2, we define ϕJ = ∧_{(i,j)∈J} (xi > xj). For ϕ ∈ QI∧, we define
I(ϕ) = {(i, j) | xi > xj appears in ϕ}. Note that I(ϕJ) = J. For example, I((x1 > x2) ∧ (x3 >
x1)) = {(1, 2), (3, 1)}.
Given a set I ⊆ [n]2, its directed graph is GI = ([n], I). The reachability matrix of I,
denoted by R(I), is an n× n matrix where R(I)i,j is 1 if there is a (directed) path from i to j
in GI ; and 0 otherwise. We say that I is acyclic (resp. cyclic) if the graph GI is acyclic (resp.
cyclic). We say that an assignment to the variables e ∈ Rn is a topological sorting of I if for
every (i, j) ∈ I we have xei > xej . It is known that I has a topological sorting if and only if I is
acyclic. Also, it is known that a topological sorting for an acyclic set can be found in linear time
(see [Knu97], Volume 1, section 2.2.3 and [CSRL01]).
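For concreteness, the reachability matrix and a topological sorting can be computed as in the following minimal Python sketch. The function names are ours, not the thesis's; reachability is computed with Warshall's O(n³) algorithm rather than anything optimized, and the returned assignment plays the role of e = (xe1, . . . , xen).

```python
from itertools import product

def reachability(n, edges):
    # R[i][j] = 1 iff there is a directed path from i to j in ([n], edges).
    R = [[0] * n for _ in range(n)]
    for i, j in edges:
        R[i][j] = 1
    for k, i, j in product(range(n), repeat=3):  # Warshall's algorithm
        if R[i][k] and R[k][j]:
            R[i][j] = 1
    return R

def topological_sorting(n, edges):
    # Returns an assignment e in R^n with e[i] > e[j] for every edge (i, j),
    # or None if the graph is cyclic (Kahn's algorithm).
    indeg = [0] * n
    succ = [[] for _ in range(n)]
    for i, j in edges:
        indeg[j] += 1
        succ[i].append(j)
    order, queue = [], [v for v in range(n) if indeg[v] == 0]
    while queue:
        v = queue.pop()
        order.append(v)
        for w in succ[v]:
            indeg[w] -= 1
            if indeg[w] == 0:
                queue.append(w)
    if len(order) < n:
        return None  # a cycle remains
    e = [0.0] * n
    for rank, v in enumerate(order):
        e[v] = float(n - rank)  # earlier nodes get larger values
    return e
```

For instance, `topological_sorting(3, [(0, 1), (1, 2)])` returns an assignment with e[0] > e[1] > e[2], i.e., a model of the (0-indexed) conjunction x0 > x1 ∧ x1 > x2.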
We now study the properties of the graph and the reachability matrix, and then the learnability
of QI∧, first when I is acyclic and then when I may be cyclic. Our first lemma states the connection
between satisfiability of a conjunction and topological sortings of the graph.
Lemma 4.5.1. Let ϕ ∈ QI∧, and e ∈ Rn. Then, e |= ϕ if and only if e is a topological sorting
of GI(ϕ).
Proof. If e |= ϕ, then for every (i, j) ∈ I(ϕ), xei > xej, and thus e is a topological sorting of GI(ϕ).
For the other direction, if e is a topological sorting, consider (i, j) ∈ I(ϕ). By the definition of GI(ϕ), it has the edge (i, j).
Since e is a topological sorting, we have xei > xej, and thus e satisfies the conjunct xi > xj. Hence e |= ϕ.
This lemma implies the following corollary.
Corollary 4.3. Let ϕ ∈ QI∧. ϕ is satisfiable if and only if GI(ϕ) has a topological sorting.
In particular, a satisfying assignment e ∈ Rn can be found in linear time.
The next lemma connects equivalence of formulas with equality of the reachability matri-
ces. This will later enable us to focus on reachability matrices when looking for immediate
descendants of a given node: An immediate descendant of G is a formula G′ such that for any
q ∈ Q(G) \ Q(G′), G′ ∧ q ≡ G – this lemma reduces the problem of checking this equivalence
to checking the reachability matrices of G′ ∧ q and G.
Lemma 4.5.2. Let ϕ1, ϕ2 ∈ QI∧. Then, if R(I(ϕ2))=R(I(ϕ1)), we have ϕ1 ≡ ϕ2. Further, if
I is acyclic, then ϕ1 ≡ ϕ2 if and only if R(I(ϕ2))=R(I(ϕ1)).
Proof. 1) Assume R(I(ϕ2)) = R(I(ϕ1)). Suppose on the contrary that ϕ2 ≢ ϕ1. Then, there
is an example e such that ϕ2(e) = 1 and ϕ1(e) = 0 (w.l.o.g.). Since ϕ1(e) = 0, e is not a
topological sorting of GI(ϕ1) (Lemma 4.5.1). Therefore, there is an edge i → j in GI(ϕ1) such
that xei ≤ xej, even though xi > xj is in ϕ1. Since R(I(ϕ2))i,j = R(I(ϕ1))i,j = 1, there is a path from i
to j in GI(ϕ2). We now show that ϕ2(e) = 0 and thus reach a contradiction. Since R(I(ϕ2))i,j = 1,
there is a path i = i1 → i2 → · · · → iℓ = j from i to j in GI(ϕ2). Therefore, ϕ2 contains
ϕ′ = [xi1 > xi2 ] ∧ [xi2 > xi3 ] ∧ · · · ∧ [xiℓ−1 > xiℓ ]. Since ϕ2⇒ϕ′⇒[xi1 > xiℓ ] = [xi > xj ] and
e satisfies xei ≤ xej, we get ϕ2(e) = 0.
2) Assume I is acyclic and ϕ1 ≡ ϕ2. Suppose on the contrary that there are i, j such that
w.l.o.g. R(I(ϕ1))i,j = 0 and R(I(ϕ2))i,j = 1. Since I is acyclic and R(I(ϕ2))i,j = 1, there is
no path from j to i in GI (and therefore none in GI(ϕ1)). Since R(I(ϕ1))i,j = 0, there is also no path
from i to j in GI(ϕ1). Therefore, we can merge the vertices i and j in GI(ϕ1) (unify them into a
single vertex) and get an acyclic graph G′. Using a topological sorting of G′ we get a satisfying
assignment e for ϕ1 that satisfies xei = xej. Namely, ϕ1(e) = 1. We now show that ϕ2(e) = 0
and thus reach a contradiction. Since R(I(ϕ2))i,j = 1, there is a path i = i1 → i2 → · · · → iℓ = j
from i to j in GI(ϕ2). Therefore, ϕ2 contains ϕ′ = [xi1 > xi2 ] ∧ [xi2 > xi3 ] ∧ · · · ∧ [xiℓ−1 > xiℓ ].
Since ϕ2⇒ϕ′⇒[xi1 > xiℓ ] = [xi > xj ] and e satisfies xei = xej, we get
ϕ2(e) = 0.
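For acyclic I, the lemma turns equivalence checking into a comparison of transitive closures. A small illustrative sketch (the helper names are ours), assuming formulas are given as their edge sets I(ϕ):

```python
def closure(n, edges):
    # Transitive closure as the set of reachable pairs (i, j), computed by
    # one DFS per source node.
    adj = {v: [] for v in range(n)}
    for i, j in edges:
        adj[i].append(j)
    reach = set()
    for s in range(n):
        stack, seen = [s], set()
        while stack:
            v = stack.pop()
            for w in adj[v]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        reach |= {(s, t) for t in seen}
    return reach

def equivalent(n, I_phi1, I_phi2):
    # Lemma 4.5.2: for acyclic I, phi1 and phi2 are equivalent iff their
    # reachability matrices (here: closure sets) coincide.
    return closure(n, I_phi1) == closure(n, I_phi2)
```

For example, x1 > x2 ∧ x2 > x3 ∧ x1 > x3 is equivalent to x1 > x2 ∧ x2 > x3, since the dropped edge is implied by the remaining path.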
4.5.1 Acyclic Sets
In this section, we study the case when I is acyclic. To make C-SPEX polynomial, we have
to guarantee that the following are polynomial: (i) computing the immediate descendants and
(ii) computing witnesses. We show how to obtain the immediate descendants in quadratic time
(in n) and the witnesses in linear time. Finally, we show that the number of membership queries
is at most |I|.
We show that the immediate descendants of a graph G are the subgraphs obtained by removing
a single edge (r, s) such that s becomes unreachable from r. A witness for G and such an
immediate descendant can be obtained by computing a topological sorting of the descendant that
violates the order between r and s (thus, this topological sorting is not a topological sorting
for G). Before we describe this, we first characterize the members of G(QI∧)
through their reachability matrices.
Lemma 4.5.3. Let I be an acyclic set and ϕ ∈ QI∧. ϕ ∈ G(QI∧) if and only if, for every
(i, j) ∈ I\I(ϕ), there is no path from i to j in GI(ϕ).
Proof. If: If for every (i, j) ∈ I\I(ϕ) there is no path from i to j in GI(ϕ), then for every
(i, j) ∈ I\I(ϕ), R(I(ϕ) ∪ {(i, j)}) ≠ R(I(ϕ)). By Lemma 4.5.2 this implies that ϕ ∧ (xi >
xj) ≢ ϕ. By Lemma 4.4.5 the result follows.
Only if: Now let ϕ ∈ G(QI∧). By Lemma 4.4.5, for every xi > xj not in ϕ we have
ϕ ∧ (xi > xj) 6≡ ϕ. Therefore, there is an assignment e that satisfies xei ≤ xej and ϕ(e) = 1.
As before, if there is a path in GI(ϕ) from i to j, then we get a contradiction.
We now show how to determine the immediate descendants of G in polynomial time.
Lemma 4.5.4. Let I be acyclic. The immediate descendants of G ∈ G(QI∧) are all Gr,s =
ϕ_{I(G)\{(r,s)}} where (r, s) ∈ I(G) and there is no path from r to s in G_{I(G)\{(r,s)}}.
In particular, for all G ∈ G(QI∧), we have |De(G)| ≤ |I(G)| ≤ |I|.
Proof. On the one hand, (r, s) ∈ I(G), and thus R(I(G))r,s = 1. On the other hand, since
there is no path from r to s in G_{I(G)\{(r,s)}}, we have R(I(Gr,s))r,s = 0. Therefore, R(I(G)) ≠
R(I(Gr,s)), and by Lemma 4.5.2, we get G ≢ Gr,s. By Lemma 4.4.6, Gr,s is an immediate
descendant of G.
To show that there is no other immediate descendant, we use Lemma 4.4.7. Note that
Q(G)\Q(Gr,s) = {xr > xs}, and thus by Lemma 4.4.7 it is sufficient to prove that ¬G ∧
∧_{(i,j)∈J} (xi > xj) is unsatisfiable, where J = {(i, j) ∈ I(G) | there is no path from i to
j in G_{I(G)\{(i,j)}}}. To prove this, it is sufficient to show that G ≡ ∧_{(i,j)∈J} (xi > xj). By
Lemma 4.5.2, it is sufficient to show that R(I(G)) = R(J).
If R(J)i,j = 1, then R(I(G))i,j = 1, since GJ is a subgraph of GI(G). If R(I(G))i,j = 1,
there is a path p from i to j in GI(G). Let (r, s) ∈ I(G)\J. Then, (r, s) ∈ I(G) and there is a
path (other than the edge r → s) r → v1 → v2 → · · · → vℓ = s in GI(G). We now show that there is a
path from i to j in G_{I(G)\{(r,s)}}. This is true because if the path p (in GI(G)) contains the edge
r → s, then we can replace this edge with the path r → v1 → v2 → · · · → vℓ = s and get a
new path from i to j in G_{I(G)\{(r,s)}}. Therefore, R(I(G)\{(r, s)})i,j = 1. By repeating this argument for
the other edges in I(G)\J, we get R(J)i,j = 1.
Corollary 4.4. The immediate descendants of G can be found in polynomial time.
Proof. By Lemma 4.5.4, this involves finding a path between every two nodes in the directed
graph GI(G), which can be done in polynomial time (e.g., using Dijkstra's algorithm).
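Following Lemma 4.5.4, the immediate descendants can be enumerated by trying to drop each edge and testing whether an alternative path survives. A minimal sketch (the names are ours; a plain DFS is used for the path test):

```python
def has_path(edges, src, dst):
    # Is there a directed path from src to dst along `edges`? (DFS)
    adj = {}
    for i, j in edges:
        adj.setdefault(i, []).append(j)
    stack, seen = [src], set()
    while stack:
        v = stack.pop()
        if v == dst:
            return True
        for w in adj.get(v, ()):
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return False

def immediate_descendants(I_G):
    # Lemma 4.5.4: the immediate descendants of G are obtained by dropping a
    # single edge (r, s) such that s is unreachable from r in the remainder.
    descendants = []
    for rs in I_G:
        rest = [e for e in I_G if e != rs]
        if not has_path(rest, rs[0], rs[1]):
            descendants.append(rest)
    return descendants
```

For I(G) = {(0, 1), (1, 2), (0, 2)}, dropping (0, 2) is not allowed (the path 0 → 1 → 2 remains), so only two immediate descendants exist.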
We now show how to find a witness.
Lemma 4.5.5. Let I be acyclic, G ∈ G(QI∧), and Gr,s = ϕ_{I(G)\{(r,s)}} be an immediate
descendant of G. A witness for G and Gr,s can be found in linear time.
Proof. By Lemma 4.5.4, (r, s) ∈ I(G) and there is no path from r to s in G_{I(G)\{(r,s)}}.
Therefore, if we merge vertices r and s in G_{I(G)\{(r,s)}}, we get an acyclic graph G′. Then, a
topological sorting e of G′ is a satisfying assignment for Gr,s that satisfies xer = xes. Since
[xr > xs] ∈ Q(G), we get G(e) = 0. Therefore, e is a witness for G and Gr,s.
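The witness construction of Lemma 4.5.5 (merge r and s, then topologically sort) can be sketched as follows; the names are ours, and the sketch favors clarity over the linear-time bound.

```python
def witness(I_G, r, s):
    # Lemma 4.5.5 sketch: drop the edge (r, s), contract s into r, topologically
    # sort the contracted graph, and give r and s the same value.
    # Assumes (r, s) is removable, so the contracted graph is acyclic.
    edges = {(r if i == s else i, r if j == s else j)
             for i, j in I_G if (i, j) != (r, s)}
    nodes = {v for e in I_G for v in e}
    rep = {v: (r if v == s else v) for v in nodes}
    verts = set(rep.values())
    # Kahn's algorithm on the contracted graph.
    indeg = {v: 0 for v in verts}
    succ = {v: [] for v in verts}
    for i, j in edges:
        indeg[j] += 1
        succ[i].append(j)
    order, queue = [], [v for v in verts if indeg[v] == 0]
    while queue:
        v = queue.pop()
        order.append(v)
        for w in succ[v]:
            indeg[w] -= 1
            if indeg[w] == 0:
                queue.append(w)
    val = {v: float(len(order) - k) for k, v in enumerate(order)}
    return {v: val[rep[v]] for v in nodes}  # assignment e with e[r] == e[s]
```

For G with I(G) = {(0, 1), (1, 2)} and the removable edge (0, 1), the returned assignment satisfies x1 > x2 while violating x0 > x1 (the two are equal), so it separates G from G0,1.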
Corollary 4.5. The class QI∧ is learnable in polynomial time with at most |I|2 membership
queries.
Proof. Follows from Theorem 4.1 and Corollary 4.4.
We now show that the number of membership queries is in fact lower, namely |I|.
Theorem 4.6. Let I ⊆ [n]2 be acyclic. The class QI∧ is learnable in polynomial time with at
most |I| membership queries.
Proof. Let ψ be the target function. Consider an execution of Learn(G, T). Let G = q1 ∧ q2 ∧ · · · ∧ qt, where qi ∈ QI. By Lemma 4.5.4 and w.l.o.g., G(i) = q1 ∧ q2 ∧ · · · ∧ qi−1 ∧ qi+1 ∧ · · · ∧ qt, for i = 1, . . . , ℓ, are all the immediate descendants of G. Namely, Q = {qi | i = 1, . . . , t}. Let e(i) be the witness for G and G(i), i = 1, . . . , ℓ. If ψ(e(i)) = 1, then Learn removes qi from Q and qi never returns to Q. If ψ(e(i)) = 0, then the set {q1, q2, . . . , qi−1, qi+1, . . . , qt} is added to T, which means that C-SPEX never considers a descendant that does not contain qi.
Namely, for every qi there is at most one membership query that is posed by C-SPEX.
4.5.2 Cyclic Sets
In this section, we consider the general case, that is, where I ⊆ [n]2 can be any set. We first
show that if I is cyclic, then Gmax ≡ false and its immediate descendants are the maximal
acyclic subgraphs. Thus, obtaining them is equivalent to enumerating all maximal acyclic
subgraphs of GI.
Lemma 4.5.6. Let I ⊆ [n]2 be any set with cycles. Then:
1. Gmax ≡ false is in QI∧.
2. The immediate descendants of Gmax are all ∧(i,j)∈J [xi > xj ] where GJ is a maximal
acyclic subgraph of GI .
Proof. 1. I has a cycle, namely, there exists a cycle of constraints xi1 → xi2 → · · · → xic → xi1. Thus, Gmax⇒(xi1 > xi1) ≡ false, and thus Gmax ≡ false.
2. • If GJ is a maximal acyclic subgraph of GI, then adding any edge in I\J to GJ
creates a cycle. This implies that for any xi > xj ∈ S(ϕI)\S(ϕJ), we have
ϕJ ∧ (xi > xj) ≡ false ≡ Gmax. By Lemma 4.4.6, ϕJ is an immediate descendant
of Gmax.
• If ϕJ is an immediate descendant of Gmax, then J is acyclic, because otherwise
ϕJ ≡ false ≡ Gmax. If GJ is not a maximal acyclic subgraph of GI, then there is an
edge (i, j) such that J ∪ {(i, j)} is acyclic. Then, either ϕ_{J∪{(i,j)}} ≡ ϕJ – in which
case ϕJ is not in G(Q∧) and thus not an immediate descendant – or ϕ_{J∪{(i,j)}} ≢ ϕJ
– in which case Gmax⇒ϕ_{J∪{(i,j)}}⇒ϕJ and Gmax ≢ ϕ_{J∪{(i,j)}} ≢ ϕJ, and therefore
ϕJ is not an immediate descendant of Gmax.
Corollary 4.7. Finding all the immediate descendants of Gmax is equivalent to enumerating
all the maximal acyclic subgraphs of GI .
Let G be any directed graph and denote by N(G) the number of the maximal acyclic
subgraphs of G. The following lemma follows immediately from Theorem 5.8 and Lemma 4.5.6.
Lemma 4.5.7. OPT(QI∧) ≥ N(GI).
The problem of enumerating all the maximal acyclic subgraphs of a directed graph is still
an open problem ([ABC+12, BCL+13, Was16]). We next show that learning a function in QI∧ (where I ⊆ [n]2) in polynomial time is possible if and only if the enumeration problem can be
solved in polynomial time.
Theorem 4.8. There is a polynomial time learning algorithm (poly(OPT(QI∧), n, |I|)) that,
for an input I ⊆ [n]2, learns ϕ ∈ QI∧ if and only if there is an algorithm that, for an input
directed graph G = (V, E), enumerates all the maximal acyclic subgraphs of G in polynomial
time (poly(N(G), |V |, |E|)).
Proof. If: Let A be an algorithm that, for an input G, which is a directed graph, enumerates
all the maximal acyclic subgraphs in polynomial time (poly(N(G), |V |, |E|)). The first step
of C-SPEX finds all the immediate descendants of Gmax. By Lemma 4.5.6, this is equivalent
to enumerating all the maximal acyclic subgraphs of GI . This can be done by A in time
poly(N(GI), n, |I|). For every immediate descendant G′ of Gmax ≡ false, any topological
sorting of GI(G′) is a witness for Gmax and G′. Once C-SPEX calls Learn on one of the immediate
descendants of Gmax, the algorithm proceeds as in the acyclic case. This algorithm runs in
poly(N(GI), n, |I|) time and poses at most N(GI) + |I| membership queries. By Lemma 4.5.7,
the algorithm runs in poly(OPT(QI∧), n, |I|) and poses at most OPT(QI∧) + |I| queries.
Only if: Let B be a learning algorithm that runs in poly(OPT(QI∧), n, |I|) time. By the
above argument, it follows that:
OPT(QI∧) ≤ N(GI) + |I|. (4.2)
Let G = ([n], E) be any directed graph. We run the learning algorithm with the target ϕE . For
any membership query posed by the algorithm, we answer 0 until the algorithm outputs the
hypothesis Gmax ≡ false. Let A be the set of all membership queries that are posed by the
algorithm. Then:
1. |A| = poly(N(G), n, |E|): This follows since the algorithm runs in poly(OPT(QI∧), n, |I|)
time and, by (4.2), this is poly(N(G), n, |E|). Thus, the number of membership queries is
poly(N(G), n, |E|).
2. If G′ is a maximal acyclic subgraph of G, then there is an example e ∈ A such that
E(G′) = {(i, j) ∈ E | xei > xej}, where E(G′) is the set of edges of G′: There is an
example e ∈ A that satisfies ϕE(G′)(e) = 1, because otherwise the algorithm cannot
distinguish between ϕE(G′) and Gmax, which violates the correctness of the algorithm.
Now, since ϕE(G′)(e) = 1, we must have E(G′) ⊆ {(i, j) ∈ E | xei > xej}. Since E(G′)
is maximal (adding another edge would create a cycle), E(G′) = {(i, j) ∈ E | xei > xej}.
The algorithm that enumerates all the maximal acyclic subgraphs of G = (V, E) proceeds
as follows. For each e ∈ A, it defines Ee := {(i, j) ∈ E | xei > xej}. If Ge = ([n], Ee) is a
maximal acyclic subgraph, then it lists Ge. Testing whether Ge = ([n], Ee) is maximal can
be done in polynomial time (e.g., by checking edge-by-edge in E). Thus, the overall algorithm
runs in poly(N(G), |V |, |E|) time.
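The edge-by-edge maximality test used by the enumeration algorithm can be sketched as follows (the names are ours; acyclicity is checked with Kahn's algorithm):

```python
def is_acyclic(n, edges):
    # Kahn's algorithm: a directed graph is acyclic iff every node gets ordered.
    indeg = [0] * n
    succ = [[] for _ in range(n)]
    for i, j in edges:
        indeg[j] += 1
        succ[i].append(j)
    queue = [v for v in range(n) if indeg[v] == 0]
    count = 0
    while queue:
        v = queue.pop()
        count += 1
        for w in succ[v]:
            indeg[w] -= 1
            if indeg[w] == 0:
                queue.append(w)
    return count == n

def is_maximal_acyclic(n, E, E_sub):
    # E_sub is a maximal acyclic subgraph of ([n], E) iff it is acyclic and
    # adding any remaining edge of E creates a cycle (checked edge-by-edge).
    if not is_acyclic(n, E_sub):
        return False
    return all(not is_acyclic(n, E_sub + [e]) for e in E if e not in E_sub)
```

For example, within the two-cycle {(0, 1), (1, 0)}, the single edge {(0, 1)} is a maximal acyclic subgraph, whereas {(0, 1)} inside the three-cycle {(0, 1), (1, 2), (2, 0)} is not, since (1, 2) can still be added.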
4.6 Conclusion
In this chapter, we studied the learnability of disjunctions (Q∨) and conjunctions (Q∧) over
a set of predicates Q. We presented two algorithms, D-SPEX and C-SPEX, which pose at most
|Q| · OPT(Q∨) membership queries. We further showed a class that C-SPEX can learn in
polynomial time.
Chapter 5
Learning a DNF of Predicates
In the previous chapter, we introduced algorithms for the general but limited setting of learning
a disjunction or a conjunction over an arbitrary set of predicates. In this chapter, we extend this
setting further and study the problem of learning a disjunctive normal form (DNF) formula (or,
dually, a conjunctive normal form formula) describing the user's intent through membership queries.
DNF formulas are disjunctions of cubes, where a cube is a conjunction of predicates. As in the
previous chapter, the formulas are over arbitrary predefined predicates. More formally, let Q be
a set of predicates over a domain D. Our goal is to learn the class QDNF = {∨_{C∈S} ∧_{q∈C} q |
S ⊆ 2^Q}. We further focus on two settings:
• Predicates that are closed under negation: for all q ∈ Q, ¬q ∈ Q.
• Predicates that are “anti-closed” under negation: for all q ∈ Q, ¬q /∈ Q.
We show for each setting an improved algorithm. For the first setting, we show an algorithm
optimal in the number of membership queries.
5.1 The Search Space
In this section, we describe the search space. Associating each node with a DNF formula
would result in an exponential blow-up. Instead, we associate each node with a cube, and the goal is
to find a set of cubes (nodes) such that the target formula is equivalent to their disjunction.
Formally, our search space is identical to the one presented in Section 4.1: the nodes are the set
of non-equivalent formulas over Q, G(Q∧), and the edges are defined by the Hasse diagram
over the partial order ⇒, where G1⇒G2 if G1 logically implies G2. All notions of (immediate)
descendants/ascendants and lca/gcd are identical. Given this search space, our problem
definition can be stated as follows:
Definition 5.1.1. Given a target DNF formula ψ over Q, find a set of nodes S in the Hasse
diagram H(G(Q∧)) of ⇒, such that ∨_{C∈S} C ≡ ψ.
Algorithm 8: DNF-SPEX
1   return Learn({Gmax}, ∅, ∅)
2   Function Learn(Gs, S, T):
3       if Gs = ∅ then return S
4       NewGs = ∅
5       for G ∈ Gs do
6           Flag = true
7           for G′ ∈ getAllImmDe(G) do
8               if ∃P ∈ T. Q(G′) ⊆ P then continue   // G′ was eliminated by an ancestor
9               e = model(¬G ∧ G′)                   // get a witness for G and G′
10              if ψ(e) = 1 then                     // pose membership query
11                  NewGs = NewGs ∪ {G′}             // a cube is G′ or one of its descendants
12                  Flag = false
13              else
14                  T = T ∪ {Q(G′)}                  // eliminate G′ and all its descendants
15          if Flag then S = S ∪ {G}                 // G is one of the cubes
16      return Learn(NewGs, S, T)
5.2 Searching the Space with Witnesses
The notion of a witness, as defined in Section 4.2, and the lemmas proved there, remain
correct (as the space has not changed, only the goal). In the previous chapter we relied on
the fact that if the target is implied by two (non-comparable) nodes, then the target is implied by
their gcd (which enabled Learn to be invoked on the intersection of these nodes’ predicates in
D-SPEX, Line 13). Here, however, the goal is to find a set of nodes, and it may be that the target
contains these two nodes but is not implied by their gcd. Thus, C-SPEX cannot be used directly
to learn the cubes of a DNF formula. Instead, we consider a new algorithm, called DNF-SPEX,
which modifies C-SPEX for the new goal.
DNF-SPEX, depicted in Algorithm 8, traverses the space G(Q∧) to find the cubes of the
target formula ψ (more precisely, to find a set of cubes whose disjunction is equivalent to ψ).
As in the previous chapter, DNF-SPEX invokes Learn to learn the target. Learn takes three
sets: (i) Gs, a set of nodes from which the cubes are reachable; (ii) S, a set of nodes that are
known to be cubes; and (iii) T, a set of pruned nodes, i.e., nodes such that neither they
nor their descendants are part of the cubes. The first invocation of Learn, called by
DNF-SPEX, is on a set Gs that contains the maximal element of G(Q∧) and on empty sets for
S and T. At each step, Learn examines all members of Gs. For each such element G, it checks
the witnesses of G and its immediate descendants, and poses a membership query for each. If ψ
and G agree on the witness of G and an immediate descendant G′, then by Lemma 4.4.3, ψ is
not implied by G′ or its descendants, and thus these are pruned from the search space by adding
G′ to T. Otherwise, if ψ and G′ agree on the witness, then G′ or one of its descendants is a
cube of ψ. Thus, G′ is added to NewGs (for the next call of Learn).
If G and ψ agreed on all witnesses (evident by the Flag variable), then G is inferred to
be a cube in ψ. As before, correctness follows since an invariant of the execution is that one
of ψ's cubes is G or one of its descendants, and if G and ψ agreed on all witnesses, then by
Lemma 4.4.3 none of G's descendants is a cube of ψ. Eventually, Learn is invoked on
NewGs, S, and T. Learn terminates when NewGs is empty, at which point it is guaranteed
that S contains all cubes, i.e., ψ ≡ ∨_{C∈S} ∧C.
We now analyze DNF-SPEX’s complexity.
Theorem 5.1. If the immediate descendants of any G ∈ G(Q∧) can be found in time t, then
DNF-SPEX learns the target formula in time t · |G(Q∧)|, posing at most |G(Q∧)| membership
queries.
The complexity proofs follow directly from the size of the search space. While this may
seem to be a naïve algorithm (and indeed we will show improvements for some classes), this
complexity should be compared to the size of the search space of all DNF formulas, which is
|G(QDNF)| = Ω(2^{width(G(Q∧))})¹, where width stands for the width of the Hasse diagram of
G(Q∧) (i.e., the maximal number of pairwise non-comparable nodes).
The fact that DNF-SPEX learns the target formula follows from the following lemma.
Lemma 5.2.1. Let ψ be the target formula. If Learn returns S, then ψ ≡ ∨_{C∈S} ∧C.
Otherwise, if Learn calls Learn(Gs, S, T), then for every G ∈ Gs:
1. G |= ψ; that is, there is a cube in ψ that is G or a descendant of G.
2. For all P ∈ T, ∧P ⊭ ψ. That is, no cube in ψ is ∧P or a descendant of ∧P,
for P ∈ T.
3. For every C ∈ S, C |= ψ and no descendant of C logically implies ψ.
Proof. The proof is by induction. Initially, the induction hypothesis holds for ({Gmax}, ∅, ∅).
Assume the induction hypothesis holds for (Gs, S, T). Let G ∈ Gs and let G′1, . . . , G′ℓ be all
its immediate descendants. If Q(G′i) ⊆ P for some P ∈ T, then by the induction hypothesis,
G′i and all its descendants G′′ satisfy G′′ ⊭ ψ (since ∧P ⊭ ψ, there is e |= ∧P such that
e ⊭ ψ; on the other hand, Q(G′′) ⊆ Q(G′i) ⊆ P and thus ∧P |= G′i |= G′′). Thus, no cube is
G′i or a descendant of G′i.
Assume now that Q(G′i) ⊈ P for all P ∈ T. Let e(i) be a witness for G and G′i. If
ψ(e(i)) = 0, then by Lemma 4.4.3 there is no cube in ψ that is G′i or a descendant of
G′i, which is why Q(G′i) is added to T. This proves bullet
2 of the lemma.
If ψ(e(i)) = 0 for all i, then G is a cube in ψ. This follows since, by Lemma 4.4.3, no cube
is a descendant of G, and thus by the induction hypothesis it must be
that G is a cube in ψ. This is the case where the Flag variable does not change to false and
DNF-SPEX adds G to S. This proves bullet 3.
If ψ(e(i)) = 1, then by Lemma 4.4.3, there is a cube that is G′i or a descendant of
G′i, and thus G′i |= ψ. Hence, DNF-SPEX adds G′i to NewGs. This proves bullet 1.
¹This follows since the DNF formula space is at least of magnitude ∑_{i=0}^{width(G(Q∧))} C(width(G(Q∧)), i), where i stands for the number of cubes and C(·, ·) denotes the binomial coefficient.
We finally show that if Learn returns S, then ψ ≡ ∨_{C∈S} ∧C. For every C ∈ S,
C |= ψ (by the induction hypothesis), and thus ∨_{C∈S} ∧C |= ψ. We now prove that for every
e |= ψ, e |= ∨_{C∈S} ∧C. Let e be an example such that e |= ψ, and consider
Ce = ∧{q ∈ Q | e |= q}. We show that DNF-SPEX must have
considered Ce, and thus either Ce ∈ S or a descendant of Ce, denoted G′′, is in S; in either case,
e |= Ce |= G′′ |= ∨_{C∈S} ∧C.
• First, Ce ∈ G(Q∧): For every q ∉ Q(Ce), e ⊭ Ce ∧ q (by the definition of Ce). Thus,
Ce ≢ Ce ∧ q, and thus, by Lemma 4.4.5, Ce ∈ G(Q∧).
• Second, Ce |= ψ: Since e |= ψ, there is a cube C in ψ such that e |= C, and by the definition
of Ce, Ce |= C.
• Third, consider a path from Gmax to Ce: Gmax = G0 |= G1 |= . . . |= Gk = Ce. We
show:
– Each Gi is considered by DNF-SPEX (i.e., at some point it is in Gs): By
induction. The base case is trivial. Assume towards contradiction that Gi is not considered. Then,
there exists P ∈ T such that Q(Gi) ⊆ P. However, in this case Gi ⊭ ψ, and
thus in particular Ce ⊭ ψ (as a descendant of Gi), which contradicts the
previous bullet.
– Either Ce ∈ S or a descendant G′′ of Ce is in S: We prove that if Ce ∈ Gs, then
either Ce is added to S or a descendant of Ce is added to S. If Ce is added to S,
we are done. Thus, assume it is not added to S. Namely, Flag is set to false, and
thus an immediate descendant of Ce is added to NewGs. Then, by the same argument,
either that descendant is added to S or one of its descendants is added to NewGs. We continue
with this argument until Gs contains Gmin. Gmin must be added to S (it has no
immediate descendants and thus Flag remains true), and thus the claim follows.
We now prove the lower bound.
Theorem 5.2. Any learning algorithm that learns QDNF must pose at least log(|G(Q∧)|) membership
queries.
Proof. The number of different formulas in QDNF is at least |G(Q∧)|, and thus from the
information-theoretic lower bound we get OPT(QDNF) ≥ ⌈log |G(Q∧)|⌉.
Note that QDNF can be equal to G(Q∧) for some predicate sets. For example, consider
Q = {(x = 1), (y = 1), (x = 1 ∨ y = 1)}. In this case, G(Q∧) = {(x = 1 ∧ y = 1), (x =
1), (y = 1), (x = 1 ∨ y = 1), true}, and every disjunction over Q is equivalent to some formula
in G(Q∧).
We next show that if further information is available about the search space, we can obtain
improved algorithms and bounds on the number of membership queries.
5.3 Learning when Predicates are Closed under Negation
In this section, we consider the special case where the predicate set is closed under negation.
Namely, for every q ∈ Q: ¬q ∈ Q. We begin this section with a few terms and then outline
our contributions. We say that two examples e1, e2 ∈ D are equivalent with respect to Q if
for all q ∈ Q, e1 |= q ⇔ e2 |= q. If two examples are equivalent with respect to Q, we write
e1 ≡Q e2. A non-equivalent example set is a maximal subset of D such that no pair of examples
is equivalent. Formally, E ⊆ D is a non-equivalent example set if and only if:
• ∀e1, e2 ∈ E : e1 ≢Q e2, and
• ∀e ∈ D \ E : ∃e′ ∈ E : e ≡Q e′.
We next show that, to learn a target DNF formula in this class, the classification of all
non-equivalent examples must be known. We then consider a setting oriented towards program
synthesis applications, where one is given a set of representative positive examples. For this
setting, we show an algorithm optimal in the number of membership queries.
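For a finite domain, the relation ≡Q can be computed by comparing predicate "signatures", and a non-equivalent example set is one representative per realized signature. A minimal sketch (the helper names and the toy predicates are ours):

```python
def signature(Q, e):
    # The Q-signature of an example: which predicates it satisfies.
    # Two examples are equivalent w.r.t. Q iff their signatures coincide.
    return tuple(q(e) for q in Q)

def non_equivalent_set(Q, D):
    # One representative per signature realized in the (finite) domain D.
    reps = {}
    for e in D:
        reps.setdefault(signature(Q, e), e)
    return list(reps.values())

# Toy example over D = {0, 1, 2, 3} with two predicates (ours, for
# illustration only): all four signatures happen to be realized.
Q = [lambda e: e >= 2, lambda e: e % 2 == 0]
E = non_equivalent_set(Q, range(4))
```

Here E has one example per signature; any further example from the domain is ≡Q-equivalent to some member of E.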
5.3.1 A Lower and Upper Bound
In this section, we show that without further assumptions, to learn a DNF formula all non-
equivalent examples must be observed.
Lemma 5.3.1. Let ψ be a target (unknown) DNF formula and E a non-equivalent example set.
Any learning algorithm A that learns ψ from membership queries has to pose, for every e ∈ E,
a membership query a ∈ D such that e ≡Q a. In particular, A has to pose at least |E| membership
queries.
Proof. Let A be such a learning algorithm. Assume towards contradiction that there exists e ∈ E
such that for every membership query a that A posed, e ≢Q a. Let ϕ be the formula A learned.
We show that possibly ψ ≢ ϕ, even though ψ is consistent with the membership queries A
observed. This proves that A cannot distinguish between two non-equivalent formulas, and thus
it may return an incorrect formula.
Let EA be the set of examples A observed that were discovered to be positive examples. We
split into cases:
• If e |= ϕ: Define ψ = ∨_{a∈EA} ∧Ca, where Ca = {q ∈ Q | a |= q}. For every positive
example A observed, ψ returns 1. We now prove:
– e ⊭ ψ: Since for every a ∈ EA, a ≢Q e, we have e ⊭ ∧Ca for every a ∈ EA. This follows
since if a ≢Q e, there exists q ∈ Q such that a |= q and e ⊭ q (this is true because
Q is closed under negation), and thus q ∈ Ca and e ⊭ ∧Ca.
– For every negative example that A observed, ψ returns 0: Let e′ be a negative
example. Then, for every a ∈ EA, a ≢Q e′. As before, e′ ⊭ ∧Ca for every a ∈ EA.
Thus, e′ ⊭ ψ.
Namely, ψ is consistent with the membership queries A observed and e ⊭ ψ. Since
e |= ϕ, ψ ≢ ϕ.
• If e ⊭ ϕ: Define ψ = ∨_{a∈EA∪{e}} ∧Ca, where Ca is defined as before. The proof is similar to
the former case: ψ is consistent with all positive and negative examples (none of which is
equivalent to e with respect to Q) and e |= ψ. Since e ⊭ ϕ, ϕ ≢ ψ.
This lower bound is in fact a tight bound, since if a non-equivalent example set is provided,
the target formula can be inferred immediately. This is our next corollary.
Corollary 5.3. Let ψ be a target (unknown) DNF formula and E a non-equivalent example set.
There exists a learning algorithm that learns ψ with |E| membership queries.
Proof. The algorithm acts as follows. It poses a membership query for every e ∈ E. Then, if
EP ⊆ E is the set of positive examples (i.e., ψ(e) = 1), the algorithm returns ϕ = ∨_{e∈EP} ∧Ce,
where Ce = {q ∈ Q | e |= q}. We now prove that the algorithm is correct on all inputs. Let
e ∈ D. There exists a ∈ E such that e ≡Q a, and thus e |= ψ ⇔ a |= ψ. If a is a positive
example, then Ca is part of ϕ, and since e ≡Q a, e |= ϕ. Otherwise, if a is negative, then since
for every a′ ∈ EP, a ≢Q a′, we have a ⊭ ∧Ca′ and thus e ⊭ ∧Ca′; in particular, e ⊭ ϕ.
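The algorithm of Corollary 5.3 can be sketched directly: query every representative and return the disjunction of the positive examples' cubes. The names and the toy target below are ours; predicates are modeled as Python callables, and Q is closed under negation as the section assumes.

```python
def learn_dnf(Q, E, psi):
    # Corollary 5.3 sketch: pose a membership query for each example in a
    # non-equivalent example set E; the hypothesis is the disjunction of the
    # cubes C_e = {q in Q | e |= q} over the positive examples.
    cubes = [[q for q in Q if q(e)] for e in E if psi(e)]  # membership queries
    def phi(x):
        return any(all(q(x) for q in C) for C in cubes)
    return phi

# Toy target (ours): "x >= 2 or x is even", with Q closed under negation.
Q = [lambda e: e >= 2, lambda e: e < 2,
     lambda e: e % 2 == 0, lambda e: e % 2 != 0]
psi = lambda e: e >= 2 or e % 2 == 0
# [0, 1, 2, 3] realizes all four signatures, so it is a non-equivalent
# example set for this Q.
phi = learn_dnf(Q, [0, 1, 2, 3], psi)
```

The learned phi agrees with psi on the whole domain, since every example is ≡Q-equivalent to one of the four representatives.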
5.3.2 Learning with Representative Positive Examples
In this section, we consider a special setting oriented for program synthesis. A common as-
sumption of programming by example is that the user provides a set of representative positive
examples. Typically, this notion is interpreted intuitively and implies that the provided examples
indicate the desired behavior on all examples: all positive examples “resemble” one of the repre-
sentative examples and none of the negative examples resemble them. Here, we formalize this
notion and then show an algorithm that learns a target DNF formula from a set of representative
examples. If the set of representative examples is minimal, then the algorithm is optimal in the
number of examples it needs for learning. We begin with the main definitions.
Let ψ be a DNF formula and let Sψ be the set of cubes in ψ, that is, Sψ ⊆ 2^Q and
∨_{C∈Sψ} ∧C ≡ ψ. A set of examples E ⊆ D is called a representative set for ψ if for every
C ∈ Sψ there is e ∈ E such that e |= C. Note that examples in E may satisfy more than one
cube (so |E| may be smaller than |Sψ|). A representative set for ψ is minimal if no strict subset
of it is a representative set for ψ.
The search space, G(Q∧), does not allow pruning in this setting. Consider the set of
predicates Q = {x1, ¬x1, . . . , xn, ¬xn}, where the xi are Boolean variables. In this example,
the number of immediate descendants of Gmax is 2^{|Q|/2}. This is because every subset of
Q that contains, for each i, either xi or ¬xi (but not both) is not equivalent to Gmax, but adding to it any
other predicate makes it equivalent to Gmax. Thus, the current algorithm will pose a
membership query for Gmax and each of its immediate descendants, resulting in presenting all
non-equivalent examples.
Instead, we consider a different search space that enables pruning in our setting. The
main insight is that instead of looking for all cubes of the target formula – which cannot
be done without a non-equivalent example set – we look for all cubes satisfied by a given
positive example. To this end, we define a new search space where nodes correspond to non-
equivalent examples from D and we organize them in a way that enables pruning – based on
the given positive example. More specifically, given a positive example e ∈ D, we search for all cubes that are satisfied by e and that logically imply the target DNF ψ, i.e., all sets C ⊆ Q such that e |= ∧C |= ψ. To guarantee that e |= ∧C, the search considers subsets of Q(e) = {q ∈ Q | e |= q}. This is performed by an on-the-fly traversal of a directed graph Ge = (V, Ee) where each node v ∈ V is associated with such a subset C ⊆ Q(e). The traversal ensures that for every node that remains in the graph, the set C associated with it satisfies ∧C |= ψ.
We first present this new search space. We then show how it can be used to optimally search
for the set of conjunctions satisfied by a single positive example. Finally, we show an algorithm
to learn a DNF formula from a representative set. If the representative set is minimal, we prove
that this algorithm is optimal in the number of membership queries.
A New Search Space
Given an example e, we define a new search space, denoted by Ge = (V,Ee). To describe the
nodes, we first divide the predicates into two sets:
• The core predicate set, denoted by Qc, which is the maximal subset of Q that is not closed under negation: ∀q ∈ Qc : ¬q /∈ Qc, and for every q ∈ Q \ Qc : ∃q′ ∈ Qc : q ≡ ¬q′.
• The negated predicate set, denoted by Q¬, which contains all the predicates' negations: Q¬ = Q \ Qc.
In the following, we use the standard names and refer to Q as a literal set, write L = Qc ∪ Q¬, and write l for elements (literals) of L. To further simplify writing, we assume all predicates in Q¬ take the form ¬q where q ∈ Qc. For a literal q ∈ Qc, we say that q is its positive form and ¬q is its negative form. Another notation we use is L(e), for e ∈ D, which is the set of literals satisfied by e: L(e) = {l ∈ L | e |= l}.
Nodes The set V of nodes in Ge is the set of all subsets of L in which every predicate appears exactly once, either in its positive form or its negative form. Namely,

V = {v ⊆ L | ∀l ∈ L. l ∈ v ↔ ¬l ∉ v}
Nodes in V are classified as positive, negative, or unsatisfiable by associating them with concrete examples. For a node v ∈ V and an example e′ ∈ D, if e′ |= ∧v, we say that e′ is an example corresponding to v. If e′ |= ψ (the target DNF formula), v is positive; if e′ ⊭ ψ, v is negative; if there is no e′ ∈ D such that e′ |= ∧v, v is unsatisfiable. Note that a node may have multiple corresponding concrete examples in D; however, they are all equivalent modulo L. Thus, in the following we refer to corresponding examples in singular form (e.g., we say the corresponding example of a node).
We say that nodes v ≠ v′ ∈ V are logically equivalent if ∧v ≡ ∧v′. As we show next, this can only happen if v and v′ are unsatisfiable.
Lemma 5.3.2. Logically equivalent nodes in V are unsatisfiable: if v ≠ v′ and ∧v ≡ ∧v′, then ∧v is unsatisfiable.

Proof. Since v ≠ v′ ∈ V, there exists l ∈ v such that ¬l ∈ v′. Assume towards a contradiction that there exists e |= ∧v. Then, on the one hand, e |= ∧v |= l. On the other hand, ∧v ≡ ∧v′ and thus e |= ∧v′ |= ¬l, a contradiction.
Corollary 5.4. For every e′ ∈ D, there exists a single node v ∈ V such that L(e′) = v.
Note that the set of nodes in Ge, their corresponding examples, and their classification are
independent of e. Next, we define the set of edges, as well as the cube associated with each
node, which depend on e.
Associated Cubes and Edges
To define the set of edges Ee, we first define a labeling of nodes. Given an example e, the e-label of a node v ∈ V, denoted Ce(v), is the set of literals from L(e) that appear in v. That is,
Ce(v) = v ∩ L(e)
Therefore, for every v ∈ V , e |= ∧Ce(v), which ensures that the e-label of each node represents
a candidate set C that satisfies e |= ∧C |= ψ. We next show that each node has a unique e-label
and that for each subset C ⊆ L(e) there is a node with that e-label.
Lemma 5.3.3. If v ≠ v′, then Ce(v) ≠ Ce(v′).

Proof. Since v ≠ v′, there exists l ∈ v such that ¬l ∈ v′. Assume w.l.o.g. that l ∈ L(e). Then, l ∈ Ce(v) and l /∈ Ce(v′).
Lemma 5.3.4. For every C ⊆ L(e) there exists a node v ∈ V such that Ce(v) = C.
Proof. Given C ⊆ L(e), we define v = C ∪ Q1 ∪ Q2 where Q1 = {¬q ∈ L | q ∈ L(e) \ C} and Q2 = {q ∈ L | ¬q ∈ L(e) \ C}. We show that v ∈ V:
• If q /∈ v, then ¬q ∈ v: if q /∈ v, then q /∈ C and q /∈ Q2. If ¬q ∈ C, we are done. Otherwise, since ¬q /∈ C and ¬q /∈ L(e) \ C, then ¬q /∈ L(e). This implies that q ∈ L(e) and thus q ∈ L(e) \ C, which implies ¬q ∈ Q1 and ¬q ∈ v.
• If ¬q /∈ v, then q ∈ v: similar.
• There are no q, ¬q ∈ v: assume towards a contradiction that q, ¬q ∈ v. Since C ⊆ L(e), it cannot be that both q, ¬q ∈ C. Since L(e) cannot contain both q and ¬q, it cannot be that q, ¬q ∈ L(e) \ C, and thus it cannot be that both ¬q ∈ Q1 and q ∈ Q2. If q ∈ C and ¬q ∈ Q1, then q ∈ L(e) \ C – a contradiction. If ¬q ∈ C and q ∈ Q2, then ¬q ∈ L(e) \ C – a contradiction.
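Lemmas 5.3.3 and 5.3.4 together say that the e-labeling is a bijection between V and the subsets of L(e). This can be checked by brute force on a small instance; the encoding below (Boolean predicates, nodes as full assignments) is ours, for illustration only.

```python
from itertools import combinations, product

n = 3
e = (1, 0, 1)                         # a fixed example over 3 Boolean variables
V = list(product((0, 1), repeat=n))   # a node picks one literal per variable
L_e = frozenset((i, e[i]) for i in range(n))   # L(e): literals satisfied by e
label = lambda v: frozenset((i, v[i]) for i in range(n) if v[i] == e[i])

labels = [label(v) for v in V]
assert len(set(labels)) == len(V)     # Lemma 5.3.3: e-labels are distinct
powerset = {frozenset(c) for r in range(n + 1) for c in combinations(L_e, r)}
assert set(labels) == powerset        # Lemma 5.3.4: every C ⊆ L(e) is a label
print(len(V), len(powerset))          # → 8 8
```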
Using the labeling, we define the set of edges, Ee, in a way that allows checking whether
∧Ce(v) |= ψ. The idea is to define the edges such that the set of ancestors of a node v in Ge
represents all the ways of extending Ce(v) into maximal cubes (where every predicate appears exactly once). This ensures that ∧Ce(v) |= ψ if and only if v and all its ancestors are positive or unsatisfiable.
Formally, we define:
Ee = {(v, v′) ∈ V × V | Ce(v′) ⊊ Ce(v) and ¬∃v′′ ∈ V. Ce(v′) ⊊ Ce(v′′) ⊊ Ce(v)}
We next define some terminology. We define the parents of v through a function parentse : V → P(V):

parentse(v) = {v′ ∈ V | (v′, v) ∈ Ee}

The ancestors of a node v, ancestorse(v), is the minimal set containing parentse(v) and closed under the parentse function. The descendants of v, descendantse(v), is the set {v′ ∈ V | v ∈ ancestorse(v′)}.
Observation 5.3.5. The following hold:
1. ancestors(v) = {v′ ∈ V | Ce(v) ⊊ Ce(v′)} = {v′ ∈ V | Ce(v) ⊊ v′}
2. descendants(v) = {v′ ∈ V | Ce(v′) ⊊ Ce(v)} = {v′ ∈ V | v′ ∩ L(e) ⊊ Ce(v)}
Lemma 5.3.6. Let v be a node such that every v′ ∈ {v} ∪ ancestors(v) is either unsatisfiable or positive. Then, ∧Ce(v) |= ψ.

Proof. Assume towards a contradiction that ∧Ce(v) ⊭ ψ. That is, there exists e′ |= ∧Ce(v) such that e′ ⊭ ψ. Consider the node v′ = L(e′). Since e′ |= ∧Ce(v), Ce(v) ⊆ L(e′), and since Ce(v) ⊆ L(e), Ce(v) ⊆ Ce(v′). Namely, v′ is a node in {v} ∪ ancestors(v) whose corresponding example is negative. This contradicts the assumption.
Lemma 5.3.7. Let v be a negative node and v′ ∈ {v} ∪ descendants(v). Then, ∧Ce(v′) ⊭ ψ.

Proof. Let e′ be a corresponding example of v, i.e., e′ ⊭ ψ (since v is negative). Since e′ |= ∧v and ∧v |= ∧Ce(v), we get ∧Ce(v) ⊭ ψ. Also, for every v′ ∈ descendants(v), Ce(v′) ⊆ Ce(v), and thus ∧Ce(v) |= ∧Ce(v′) and we get ∧Ce(v′) ⊭ ψ.
Finally, we show that the graph Ge has no cycles.

Observation 5.3.8. Ge has no cycles.

Proof. If there were a cycle containing v, then v would be an ancestor of itself, and thus Ce(v) ⊊ Ce(v) – a contradiction.
An Algorithm for Learning Cubes from an Example
In this section, we present Cube-SPEX, an algorithm for learning cubes from an example. The algorithm takes a positive example e and learns all implicants of ψ that are over the predicates in Q. Cube-SPEX (Algorithm 9) maintains a set of nodes, Φ, which is
Algorithm 9: Cube-SPEX(e, L)
1  Φ = V
2  Visited = ∅
3  while Φ \ Visited ≠ ∅ do
4      pick v ∈ Φ \ Visited
5      Visited = Visited ∪ {v}
6      e′ = model(v)
7      if e′ == ⊥ then continue
8      if ψ(e′) == 0 then
9          Φ = Φ \ ({v} ∪ descendants(v))   // descendants(v) = {v′ | Ce(v′) ⊊ Ce(v)}
10 return Φ
initially V. At the end of the execution, v ∈ Φ if and only if ∧Ce(v) |= ψ. The algorithm presents a corresponding example for each node that was not pruned from Φ. If a corresponding example is negative, the node and its descendants are pruned (as, by Lemma 5.3.7, none of them logically implies ψ). Eventually, every node v that remains in Φ has the property that v and its ancestors are positive or unsatisfiable nodes. From Lemma 5.3.6, ∧Ce(v) |= ψ.
The algorithm operates as follows. While there is an unpruned node v that has not been
considered yet, a corresponding example is obtained. If there is no such example, v remains in
Φ. Otherwise, a membership query is posed to the oracle. If the example is negative, v and its
descendants are pruned. When all nodes have been considered or pruned, the set Φ is returned. We
note that a clean-up step can be performed afterwards to return only the nodes whose Ce(v) is minimal:

Φcleaned = {v ∈ Φ | ¬∃v′ ∈ Φ. Ce(v′) ⊊ Ce(v)}

The final set of cubes is then {∧Ce(v) | v ∈ Φcleaned}. As another clean-up step (which is computationally more expensive), equivalent labels can be removed from Φcleaned to get a minimal number of conjunctions with the same semantic meaning.
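The following is a minimal executable sketch of Cube-SPEX, under simplifying assumptions that are ours: the predicates are Boolean variables and their negations (so every node is satisfiable and model(v) is immediate), a node is identified with the assignment corresponding to it, and the membership oracle ψ is a function standing in for the user. It also applies the BFS order discussed below (supersets before subsets) and the first clean-up step.

```python
from itertools import combinations

def subsets(s):
    return [frozenset(c) for r in range(len(s) + 1)
            for c in combinations(sorted(s), r)]

def cube_spex(e, n, psi):
    """Return the minimal label sets C ⊆ L(e) with e |= ∧C |= psi.
    A node is the unique assignment agreeing with e exactly on its label."""
    def node_of(C):                       # model(v) for the node labeled C
        return tuple(e[i] if i in C else 1 - e[i] for i in range(n))
    labels = sorted(subsets(range(n)), key=len, reverse=True)  # BFS order
    pruned, phi = set(), []
    for C in labels:
        if C in pruned:
            continue
        if psi(node_of(C)):               # membership query: positive
            phi.append(C)
        else:                             # negative: prune C and descendants
            pruned.update(subsets(C))
    # clean-up: keep only the minimal labels
    return [C for C in phi if not any(D < C for D in phi)]

# target ψ = (x0 ∧ x1) ∨ (¬x0 ∧ x2), positive example e = (1, 1, 1)
psi = lambda v: (v[0] and v[1]) or ((not v[0]) and v[2])
print(sorted(map(sorted, cube_spex((1, 1, 1), 3, psi))))  # → [[0, 1], [1, 2]]
```

The two returned labels encode the cubes x0 ∧ x1 and x1 ∧ x2, the implicants of ψ satisfied by e.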
Lemma 5.3.9 (Soundness). At the end of the execution, for every v ∈ Φ, ∧Ce(v) |= ψ.
Proof. From Lemma 5.3.6, it is sufficient to show that v and its ancestors are positive or unsatisfiable. Assume towards a contradiction that there exists v′ ∈ {v} ∪ ancestors(v) that is negative, i.e., its corresponding example is negative. We show that in this case v must have been pruned from Φ, and thus get a contradiction. Initially, v′ ∈ Φ. Thus, either it is explored by Cube-SPEX or it is pruned before it is explored. If v′ was explored, then its corresponding example is negative and thus v is pruned from Φ, because it is equal to v′ or is a descendant of v′. If v′ was not explored, then another node v′′ caused v′ to be pruned, namely v′ ∈ descendants(v′′). However, in this case v ∈ descendants(v′′) as well, and is thus pruned, too.
Lemma 5.3.10 (Completeness). At the end of the execution, for every C ⊆ L such that e |= ∧C |= ψ, there exists v ∈ V such that v ∈ Φ and Ce(v) = C.
Proof. Let C be such a subset. Since e |= ∧C, by definition C ⊆ L(e). From Lemma 5.3.4,
there exists v ∈ V such that Ce(v) = C. Initially, v ∈ Φ. From Lemma 5.3.7, since ∧C |= ψ,
there is no ancestor of v that is negative and v is also not negative. Thus, v is never pruned
from Φ.
Lemma 5.3.11. If e′ is presented by Cube-SPEX, Cube-SPEX has not previously presented an
example e′′ such that e′ ≡L e′′.
Proof. Assume towards a contradiction that Cube-SPEX presented an example e′′ ≡L e′ before presenting e′. Then, L(e′) = L(e′′). From Corollary 5.4, both correspond to a single node v′. This means that v′ is considered twice by Cube-SPEX; however, every node considered is added to Visited and is thus not considered again – a contradiction.
Optimal Learning Algorithm from an Example
In this section, we optimize the search to leverage pruning as much as possible. The idea is to
search the space in a BFS order, namely by examining (and potentially pruning) sets before considering their subsets. Thus, subsets are only examined, and pose a membership query, if all
their ancestors are in Φ. Namely, we change the unspecified pick of a node v (Line 4) to pick
nodes in a BFS order. We prove that in this case, the number of membership queries is minimal.
We call this variation BFS-Cube-SPEX.
Lemma 5.3.12. Let ψ be the target formula, e be a positive example, and Seψ be the set of cubes
from ψ satisfied by e. Further, let E be a non-equivalent example set and A be a learning
algorithm that learns for every e maximal cubes from Q∧ satisfied by e. Denote by EA the set
of queries that A posed. Then, for every e′ ∈ E, one of the following is true:
• There exists e′′ ∈ EA such that e′′ ≡Q e′.
• There exists e′′ ∈ EA such that Ce(e′′) |= Ce(e′) and e′′ ⊭ ∨_{C∈Seψ} ∧C.
Proof. Let e′ ∈ E and assume there is no e′′ ∈ EA such that e′ ≡Q e′′, and no e′′ ∈ EA such that Ce(e′′) |= Ce(e′) and e′′ ⊭ ∨_{C∈Seψ} ∧C. We split into cases:
• If A returns a conjunction c such that e′ |= c: let EP be the set of positive examples that A observed. By our assumption, for every e′′ ∈ EP: e′′ ≢Q e′. Thus, we set ψ = ∨_{e′′∈EP} ∧L(e′′). This hypothesis aligns with all of A's queries (positive and negative). However, e′ ⊭ ψ and e′ |= c. Thus c ⊭ ψ and c is incorrect.
• If A returns no conjunction c such that e′ |= c: let EP and EN be the sets of positive and negative examples A observed, respectively. By our assumption, for every e′′ ∈ EP: e′′ ≢Q e′, and for every e′′ ∈ EN: either Ce(e′′) ⊭ Ce(e′) or e′′ |= ∨_{C∈Seψ} ∧C. Let v be the node whose e-label is Ce(v) = L(e) ∩ L(e′). Then, we set ψ = ∨_{e′′∈EP∪ancestors(v)∪{e′}} ∧L(e′′). This hypothesis aligns with all of A's queries (positive and negative). Namely, e |= ∧Ce(v) and e′ |= ∧Ce(v). Since A did not return a conjunction satisfied by e′ that is satisfied by e, it did not return ∧Ce(v). Thus, A did not return a maximal set of cubes over Q∧.
From the above lemma, it follows that to show that BFS-Cube-SPEX poses a minimal number of queries, it is sufficient to show that if it presented an example which was discovered to be negative, it was not redundant. That is, there was no e′′ ∈ EA such that Ce(e′′) |= Ce(e′) and e′′ ⊭ ∨_{C∈Seψ} ∧C, and thus by the lemma, this query could not be avoided. We prove this in the following lemma.
Lemma 5.3.13. If BFS-Cube-SPEX poses a membership query for a node v, then every v′ ∈ ancestors(v) is positive or unsatisfiable.
Proof. Let v be a node for which BFS-Cube-SPEX poses a membership query. Assume in
contradiction that there is a node v′ ∈ ancestors(v) that is negative. Then, by the BFS order
and since G has no cycles (Observation 5.3.8), v′ is considered before v. Thus, v′ is pruned
along with its descendants, including v. Thus, BFS-Cube-SPEX does not explore v or present a
membership query for v.
Learning DNF Formulas from Representative Sets
In this section, we provide DNF-SPEX, an algorithm that learns a DNF formula from a representative set. DNF-SPEX (Algorithm 10) is straightforward: it iterates over the positive examples in the representative set and for each of them learns a maximal set of cubes via Cube-SPEX. To avoid posing equivalent queries, DNF-SPEX maintains sets EP and EN of positive and negative examples, to which Cube-SPEX also adds examples. When Cube-SPEX considers a node v, it checks whether there is already an example corresponding to v in EP or EN.
Algorithm 10: Neg-Closed-DNF-SPEX(E)
1  EN = ∅
2  EP = E
3  S = ∅
4  for e ∈ E do
5      S = S ∪ Cube-SPEX(e, L, EP, EN)
6  clean(S)
7  return ∨_{C∈S} ∧C
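A self-contained sketch of this outer loop, again over Boolean predicates; the brute-force implicant search inside stands in for Cube-SPEX, and the query caching via EP/EN is omitted – both are our simplifications for illustration.

```python
from itertools import combinations, product

def dnf_spex(E, n, psi):
    """Learn a DNF equivalent to psi from a representative set E (sketch).
    For each positive example, collect the cubes over its literals that
    imply psi; brute force here stands in for Cube-SPEX."""
    def implies(C, e):
        # does the cube {x_i = e_i | i ∈ C} imply psi?
        free = [i for i in range(n) if i not in C]
        for bits in product((0, 1), repeat=len(free)):
            w = list(e)
            for i, b in zip(free, bits):
                w[i] = b
            if not psi(tuple(w)):
                return False
        return True
    S = set()
    for e in E:
        if psi(e):
            for r in range(n + 1):
                for C in combinations(range(n), r):
                    if implies(set(C), e):
                        S.add(frozenset((i, e[i]) for i in C))
    return S   # each cube is a set of (index, value) literals

# target: (x0 ∧ x1) ∨ (¬x0 ∧ x2); E holds one example per cube
psi = lambda v: (v[0] and v[1]) or ((not v[0]) and v[2])
S = dnf_spex([(1, 1, 0), (0, 0, 1)], 3, psi)
learned = lambda v: any(all(v[i] == b for i, b in C) for C in S)
print(all(learned(v) == bool(psi(v))
          for v in product((0, 1), repeat=3)))   # → True
```

The learned disjunction agrees with ψ on the entire domain, as Theorem 5.5 guarantees.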
Theorem 5.5. At the end of the execution, ψ ≡ ∨_{C∈S} ∧C.
Proof. By the soundness of Cube-SPEX, ∨_{C∈S} ∧C |= ψ. We now prove ψ |= ∨_{C∈S} ∧C. Let e |= ψ. Namely, there exists a cube C in ψ such that e |= ∧C. Since E is a representative set, there exists e′ ∈ E such that e′ |= ∧C. Since e′ |= ∧C |= ψ, by the completeness of Cube-SPEX, it returns the conjunction ∧C, and thus C ∈ S.
Theorem 5.6. If E is a minimal representative set and BFS-Cube-SPEX is executed, Neg-Closed-DNF-SPEX learns ψ with a minimal number of examples.
Proof. Let e be a considered example. If e is presented by BFS-Cube-SPEX, then no equivalent
example has already been considered, and by the optimality of BFS-Cube-SPEX, this example
is required for correctness. If e is part of E, then since E is minimal, there exists a cube C in ψ such that C is not satisfied by any other example in E. Namely, for every e′ ∈ E \ {e}, e′ ⊭ ∧C. In particular, ∧L(e′) ⊭ ∧C. Thus, any cube returned by invoking BFS-Cube-SPEX on e′ is implied by ∧L(e′) and thus does not imply ∧C. Namely, without e, Neg-Closed-DNF-SPEX will return S such that ψ ⊭ ∨_{C′∈S} ∧C′. Thus, e is necessary.
5.4 Learning when Predicates are Anti-closed under Negation
In this section, we study another class of predicate sets – those that are anti-closed under negation. A predicate set is anti-closed under negation if for every q ∈ Q, there is no q′ ∈ Q such that ¬q ≡ q′. The unique aspect of this class is that it enables pruning that follows from the inability to express negation of predicates. We illustrate this class with the following example. Let D be the set of natural numbers and consider the set of predicates Q = {q×2, q×3, q×5}, where q×n is satisfied by all elements of D that are multiples of n. Assume a target formula ψ over Q. If 2 is a positive example, which satisfies q×2, ¬q×3, and ¬q×5, it must be that q×2 |= ψ. This follows because if 2 is a positive example, there must be a cube over Q in ψ that is satisfied by 2. Since the cube {q×2} and the tautology true are the only cubes over Q satisfied by 2, we get q×2 |= ψ.
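The claim about the example 2 can be checked mechanically; the encoding of the predicates as divisors below is ours, for illustration.

```python
from itertools import combinations

Q = (2, 3, 5)   # predicates q×2, q×3, q×5: divisibility by n
sat = lambda e, C: all(e % n == 0 for n in C)   # e satisfies the cube C ⊆ Q

cubes = [set(c) for r in range(len(Q) + 1) for c in combinations(Q, r)]
print([sorted(C) for C in cubes if sat(2, C)])  # → [[], [2]] (true and {q×2})
```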
This intuition guides our algorithm for learning DNF formulas in this class: when considering whether a cube C implies ψ, our algorithm presents an example that satisfies C and the negations of the predicates in Q \ C. If such an example exists and is positive, then ∧C logically implies ψ, and so do cubes containing C's predicates. Otherwise, if ∧C ⊭ ψ, then neither C nor its subsets logically imply ψ. We begin with defining the search space and then provide the algorithm.
5.4.1 The Search Space
We define the search space similarly to the one defined in the previous section, but with respect to the predicate set Q ∪ {¬q | q ∈ Q}. The labeling function is now defined differently (and independently of concrete examples):
C(v) = v ∩Q
The corresponding examples and the edges are defined identically.
We begin with the main lemma that guides the pruning of the search space.
Lemma 5.4.1. Let ψ be a target formula, v a node, and e ∈ D a corresponding example for v.
1. If ψ(e) = 0, then ∧C(v) ⊭ ψ and for every descendant v′ of v: ∧C(v′) ⊭ ψ.
2. If ψ(e) = 1, then ∧C(v) |= ψ and for every ancestor v′ of v: ∧C(v′) |= ψ.

Proof. 1. If ψ(e) = 0: since e is a corresponding example for v, in particular e |= ∧C(v), and thus ∧C(v) ⊭ ψ. For every descendant v′ of v, C(v′) ⊆ C(v), and thus ∧C(v) |= ∧C(v′); in particular e |= ∧C(v′) and ∧C(v′) ⊭ ψ.
2. If ψ(e) = 1: either ψ is a tautology, in which case the claim trivially holds, or there exists a cube C ⊆ Q in ψ such that e |= ∧C. Since e does not satisfy any of the predicates in Q \ C(v), it must be that C ⊆ C(v). Since ∧C |= ψ and ∧C(v) |= ∧C, it follows that ∧C(v) |= ψ. Also, since for every ancestor v′ of v, C(v) ⊆ C(v′), we have ∧C(v′) |= ∧C(v), and the claim follows.
We now characterize when nodes logically imply a target formula.
Lemma 5.4.2. Let ψ be a target formula and v a node. Each one of the following implies
∧C(v) |= ψ:
• v corresponds to a positive example.
• v is unsatisfiable and for every v′ ∈ ancestors(v): ∧C(v′) |= ψ.
• There exists a descendant v′ of v such that ∧C(v′) |= ψ.
Further, if ∧C(v) |= ψ then for every v′ ∈ ancestors(v): ∧C(v′) |= ψ.
Proof.
• If v corresponds to a positive example, then by Lemma 5.4.1, ∧C(v) |= ψ.
• If v is unsatisfiable and every ancestor of v logically implies ψ, then since v is unsatisfiable, ∧C(v) ≡ ∨_{v′∈ancestors(v)} ∧C(v′). Since each ∧C(v′) logically implies ψ, ∧C(v) also logically implies ψ.
• If there exists a descendant of v that logically implies ψ, then by Lemma 5.4.1, ∧C(v) |= ψ.
• Assume towards a contradiction that ∧C(v) |= ψ but there is v′ ∈ ancestors(v) with ∧C(v′) ⊭ ψ. Then, it must be that the corresponding example of v′ is negative (Lemma 5.4.1). Thus, from Lemma 5.4.1, ∧C(v) ⊭ ψ – a contradiction.
5.4.2 A Learning Algorithm
In this section, we present our algorithm for learning a DNF formula over a set of predicates that is anti-closed under negation. The algorithm (Algorithm 11) is almost identical to Cube-SPEX (Algorithm 9), but with one difference: if a corresponding example is positive, then all of the node's ancestors are added to Visited, indicating that they are known to logically imply ψ. This means that these nodes will not be inspected at a later point.
We now prove soundness and completeness. We then provide a bound on the number of
membership queries posed.
Lemma 5.4.3 (Soundness). At the end of the execution, for every v ∈ Φ, ∧C(v) |= ψ.
Proof. From Lemma 5.4.2 it is sufficient to show that one of the following holds:
• v corresponds to a positive example.
• v is unsatisfiable and for every v′ ∈ ancestors(v): ∧C(v′) |= ψ.
• There exists a descendant v′ of v such that ∧C(v′) |= ψ.
Algorithm 11: Neg-Anti-Closed-DNF-SPEX
1  Φ = V
2  Visited = ∅
3  while Φ \ Visited ≠ ∅ do
4      pick v ∈ Φ \ Visited
5      Visited = Visited ∪ {v}
6      e′ = model(v)
7      if e′ == ⊥ then continue
8      if ψ(e′) == 0 then
9          Φ = Φ \ ({v} ∪ descendants(v))
10     else
11         Visited = Visited ∪ ancestors(v)
12 return ∨_{v∈Φ} ∧C(v)
If v remained in Φ by the end of the execution, it was not pruned from Φ and was added
to V isited. A node can be added to V isited either when it is explored or when one of its
descendants adds it to V isited. In the former case, since v remained in Φ, it means that v
corresponded to a positive example or was unsatisfiable. If v was unsatisfiable but was not
pruned, it means that all of its ancestors logically imply ψ (as otherwise, it would have been
pruned). In the latter case, if v was added to V isited by a descendant, then the descendant
corresponded to a positive example. In either case, the claim follows.
Lemma 5.4.4 (Completeness). At the end of the execution, for every C ⊆ Q such that ∧C |= ψ
there exists v ∈ V such that v ∈ Φ and C(v) = C.
Proof. Let C be such a subset. A lemma similar to Lemma 5.3.4 shows that there exists v ∈ V such that C(v) = C. Initially, v ∈ Φ. From Lemma 5.4.2, v has no ancestor that is negative, and v itself is not negative. Thus, v is never pruned from Φ.
Corollary 5.7. Neg-Anti-Closed-DNF-SPEX learns the target formula.
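A minimal executable sketch of the algorithm on the divisibility predicates from the beginning of this section; the encoding (predicates as divisors, model as a bounded search, ψ as a function standing in for the oracle) is ours, for illustration.

```python
from itertools import combinations, product

Q = (2, 3, 5)   # predicates q×2, q×3, q×5 over the naturals

def model(signs, bound=1000):
    # a witness e with (e % n == 0) iff the chosen sign of q×n; None if ⊥
    for e in range(1, bound):
        if all((e % n == 0) == b for n, b in zip(Q, signs)):
            return e
    return None

def anti_closed_spex(psi):
    """Sketch of Neg-Anti-Closed-DNF-SPEX: a positive answer settles the
    node and all its ancestors (superset labels); a negative answer prunes
    the node and all its descendants (subset labels)."""
    nodes = sorted(product((True, False), repeat=len(Q)),
                   key=lambda s: -sum(s))                  # BFS order
    label = lambda s: frozenset(n for n, b in zip(Q, s) if b)
    subs = lambda C: {frozenset(c) for r in range(len(C) + 1)
                      for c in combinations(sorted(C), r)}
    sups = lambda C: {C | D for D in subs(frozenset(Q) - C)}
    pruned, phi = set(), set()
    for s in nodes:
        C = label(s)
        if C in pruned or C in phi:
            continue
        e = model(s)
        if e is None:
            phi.add(C)             # unsatisfiable node remains in Φ
        elif psi(e):               # membership query answered "positive"
            phi |= sups(C)         # C and all its ancestors imply ψ
        else:
            pruned |= subs(C)      # neither C nor its subsets imply ψ
    return phi                     # the cubes C ⊆ Q with ∧C |= ψ

# target ψ = q×2 ∨ (q×3 ∧ q×5)
psi = lambda e: e % 2 == 0 or (e % 3 == 0 and e % 5 == 0)
print(sorted(map(sorted, anti_closed_spex(psi))))
# → [[2], [2, 3], [2, 3, 5], [2, 5], [3, 5]]
```

The returned cubes are exactly the implicants of ψ over Q; the minimal ones ({q×2} and {q×3, q×5}) recover ψ itself.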
Lower Bound
In this section, we show that if pick is implemented as a binary search, then the number of membership queries posed is at most log(|Q|) · OPT.
Theorem 5.8. Any learning algorithm that learns QNeg-Anti-Closed-DNF must pose at least max(log |G(Q∧)|, |Sψ| · (1 + maxC∈G(Q∧) |De(C)|)) membership queries, where Sψ is the minimal set of cubes of the target formula ψ; that is, for every C ∈ Sψ, ∨_{C′∈Sψ\{C}} ∧C′ ≢ ψ. In particular, Neg-Anti-Closed-DNF-SPEX poses at most log(|Q|) · OPT(QNeg-Anti-Closed-DNF) membership queries.
Proof. • OPT(QNeg-Anti-Closed-DNF) ≥ log |G(Q∧)|: the number of different formulas in QNeg-Anti-Closed-DNF is at least |G(Q∧)|, and thus the result follows from the information-theoretic lower bound.
• OPT(QNeg-Anti-Closed-DNF) ≥ |Sψ| · (1 + maxC∈G(Q∧) |De(C)|): let C ∈ Sψ, and let C1, . . . , Cm be the immediate descendants of C. Any learning algorithm must pose a query for C (which is positive) and a query for each immediate descendant (which is negative). Without such queries, the algorithm cannot distinguish between C and the Ci. By our construction, every such example is unique per node. Therefore, the algorithm requires at least 1 + |De(C)| membership queries for C. In total, it requires at least |Sψ| · (1 + maxC∈G(Q∧) |De(C)|) membership queries.
To find every C ∈ Sψ (and its descendants), Neg-Anti-Closed-DNF-SPEX poses at most log(|Q|) queries, and thus the claim follows.
5.5 Conclusion
In this chapter, we studied the learnability of DNFs (QDNF ) over a set of predicates Q. We
showed how to extend C-SPEX to learn DNFs. We then focused on two sub-classes, those
whose predicates are closed under negation, and those whose predicates are “anti-closed” under
negation. We showed for each sub-class an algorithm that poses fewer membership queries than
the extension of C-SPEX.
Chapter 6
Synthesis with Abstract Examples
So far, we have focused on learning specifications that can be used to synthesize executable
programs. Namely, the search was in a specification space. PBE experts often believe that the
program space should be the one to drive the search to the target program. Their motivation
is Occam’s razor principle, which in our context of program synthesis implies that the user’s
intent is likely to be captured by a short program. In this chapter, we show that this approach
can be taken without sacrificing exactness. We present a novel synthesis framework that enables
us to extend PBE synthesizers (under some assumptions) with the ability to communicate a
candidate program’s behavior through a few abstract examples. The abstract examples serve
as an intuitive specification for candidate programs. Thus, through abstract examples, the user
is guaranteed that the final candidate program captures his intent on all inputs. The abstract
examples are a new form of examples that represent a potentially unbounded set of concrete
examples. An abstract example captures how part of the input space is mapped to corresponding
outputs by the synthesized program. Our framework uses a generalization algorithm to compute
abstract examples, which are then presented to the user. The user can accept an abstract example,
or provide a counterexample, in which case the synthesizer will explore a different program.
When the user accepts a set of abstract examples that covers the entire input space, the synthesis
process is completed.
We have implemented our approach and we experimentally show that our synthesizer
communicates with the user effectively by presenting on average 3 abstract examples until
the user rejects false candidate programs. Further, we show that a synthesizer that prunes the
program space based on the abstract examples reduces the overall number of required concrete
examples in up to 96% of the cases.
6.1 Overview
In this section, we provide an informal overview of abstract examples and their use in our
interactive synthesis framework. Our interactive synthesis framework communicates with a user
only through abstract membership queries—asking the user whether an abstract example of the
current candidate program should be accepted or rejected— and guarantees that the synthesized
program is correct on all inputs. Abstract examples are a new form of examples that represent
a potentially unbounded set of concrete examples of a candidate program. Abstract examples
are natural for a user to understand and inspect (similarly to examples), and at the same time
enable validation of the synthesis result without enumerating all concrete examples (which is
only possible for a finite domain, and even then is often prohibitively expensive). In fact, an
abstract membership question can also be viewed as a partial validation question. Instead of
presenting the user with a program and asking him to determine whether or not it is correct (a
validation question), we present an abstract example, which describes (declaratively) how the
candidate program transforms part of the input space. In this way, abstract examples allow us to
perform exact synthesis without a predefined specification.
Throughout the synthesis process, as the synthesizer explores the space of candidate pro-
grams to find the one that matches the user’s intent, it presents to the user abstract examples of
candidate programs. The user can accept an abstract example, or provide a counterexample, in
which case the synthesizer will explore a different candidate program. By accepting an abstract
example, the user confirms the behavior of the candidate program on part of the input space.
That is, the synthesizer learns the desired behavior for an unbounded number of concrete inputs.
Thus, it can prune every program that does not meet the confirmed abstract example. This
pruning is correct even if later the candidate program is rejected by another abstract example.
Generally, pruning based on an abstract example removes more programs than pruning based
on a concrete example. Thus, our synthesizer is likely to converge faster to the target program
compared to the current alternative (see Section 6.5). When the user accepts a set of abstract
examples that covers the entire input space, our synthesizer returns the corresponding candidate
program and the synthesis process is completed.
A key ingredient of our synthesizer is a generalization algorithm, called L-SEP. L-SEP
takes a concrete example and a candidate program, and generalizes the example to a maximally
general abstract example consistent with the candidate program. We illustrate this on our motivating example from the Introduction (Chapter 1), where the candidate program is the one synthesized by Flash Fill (which returns “H” followed by the second letter of the person’s first name, etc.) and the initial concrete example is the first member on the list (i.e., Diane). Our
generalization algorithm produces the following abstract example:
a0a1A2 B C → Ha1 a0a1A2, please come to my office at C . -EG
This example describes the program behavior on the cells in columns A, B, and C, for the
case where the string in cell A has at least two characters, denoted by a0 and a1, followed by
a string sequence of arbitrary size (including 0), denoted by A2. For such inputs, the example
describes the output as a sequence consisting of: (i) the string “H” followed by a1, (ii) the entire
string at A followed by a comma, (iii) the string: “please come to my office at”, (iv) the string at
C, and (v) the string: “. -EG”.
This abstract example is presented to the user. The user rejects it and provides a concrete
counterexample (e.g., line 4 in the Excel spreadsheet). Thus, the synthesizer prunes the space
of candidate programs and generates a new candidate program. Eventually, the synthesizer
generates the target program (as a candidate program), and our synthesizer presents the following
abstract example:
A B C → Hi A, please come to my office at C . -EG
This time, the user accepts it. Since this abstract example covers the entire input space, the
synthesizer infers that this program captures the user’s intent on all inputs and returns it. In
general, covering the input space may require multiple abstract examples.
6.2 Abstract Specifications and Sequence Expressions
In this section, we define the key terms pertaining to abstract examples. We then present a
special class of abstract examples for programs that manipulate strings. For simplicity’s sake,
from here on we assume that programs take one input. This is not a limitation, as multiple inputs
(or outputs) can be joined with a predefined delimiter (e.g., the inputs in the motivating example
can be considered as one string separated by spaces).
6.2.1 Abstract Examples
Program Semantics The semantics of a program P is a function over a domain D: ⟦P⟧ : D → D. We equate ⟦P⟧ with its input-output pair set: {(in, ⟦P⟧(in)) | in ∈ D}.
Abstract Examples An abstract example ae defines a set ⟦ae⟧ ⊆ D × D, which represents a partial function: if (in, out1), (in, out2) ∈ ⟦ae⟧, then out1 = out2. An abstract example ae is an abstract example for program P if ⟦ae⟧ ⊆ ⟦P⟧. We define the domain of ae to be dom(ae) = {in ∈ D | ∃out. (in, out) ∈ ⟦ae⟧}.
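These definitions can be checked concretely on a finite toy domain; everything below (the domain, the candidate program, and the abstract example) is a hypothetical encoding for illustration, not the thesis’s representation.

```python
from itertools import product

# Toy domain: strings over {a, b} of length at most 3.
D = [''.join(s) for n in range(4) for s in product('ab', repeat=n)]

P = lambda w: w.upper()                      # a candidate program
# Abstract example: "inputs starting with 'a' map to their uppercase form".
ae = {(w, w.upper()) for w in D if w.startswith('a')}

is_partial_fn = len({i for i, _ in ae}) == len(ae)
is_for_P = all(P(i) == o for i, o in ae)     # checks ⟦ae⟧ ⊆ ⟦P⟧
print(is_partial_fn, is_for_P)               # → True True
```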
Abstract Example Specifications An abstract example specification of P is a set A of abstract examples for P such that ⋃_{ae∈A} dom(ae) = D. Note that A need not be finite and the example domains need not be disjoint.
6.2.2 Sequence Expressions
In this work, we focus on programs that manipulate strings, i.e., D = Σ∗ for a finite alphabet Σ.
Thus, it is desirable to represent abstract examples as expressions that represent collections of
concrete strings and can be readily interpreted by humans. A prominent candidate for this goal
is regular expressions, which are widely used to succinctly represent a set of strings. However,
regular expressions are restricted to constant symbols (from Σ). Thus, they cannot relate outputs
to inputs, which is desirable when describing partial functions (abstract examples). To obtain
this property, we introduce a new language, Sequence Expressions (SE), that extends regular
expressions with the ability to relate the outputs to their inputs via shared variables. We begin
this section with a review of regular expressions, and then introduce the two types of sequence
expressions: input SEs, for describing inputs, and output SEs, for describing outputs.
(a) Input SE:   S_I ::= S_I · S_I | ε | σ | x_R | X_R | σ^k
(b) Output SE:  S_O ::= S_O · S_O | ε | σ | x | f(x) | X | f(X) | σ^k
Figure 6.1: SE grammar: σ ∈ Σ, x ∈ x, X ∈ X, k ∈ K, R ∈ R, f ∈ F .
Regular Expressions (RE) The set of regular languages over a finite alphabet Σ is the minimal
set containing ε, σ1, ..., σ|Σ| that is closed under concatenation, union, and Kleene star. A
regular expression r is a text representation of a regular language over the symbols in Σ and the
operators ·, |,∗ (concatenation, or, and Kleene star).
Input SE Syntax Fig. 6.1(a) shows the grammar of input SEs. In contrast to RE, SEs are
extended with three kinds of variables that later help to relate the output to the input:
• Character variables, denoted x ∈ x, used to denote an arbitrary letter from Σ.
• Sequence variables, denoted X ∈ X, used to denote a sequence of arbitrary size.
• Star variables, denoted k ∈ K, used instead of the Kleene star to indicate the number of
consecutive repeating occurrences of a symbol. For example, 0^k has the same meaning as
the RE 0*.
To eliminate ambiguity, in our examples we underline letters from Σ. For example, xXa
represents the set of words that have at least two letters and end with an a (a ∈ Σ).
We limit each variable (i.e., x,X, k) to appear at most once in an input SE. We also limit
the use of a Kleene star to single letters from the alphabet. Also, since the goal of each SE is to
describe a single behavior of the program, we exclude the ‘or’ operator. Instead, we extend the
grammar so that ‘or’ can be expressed, to some extent, via predefined predicates that put constraints
on the variables. We denote these predicates by R ∈ R, and their meaning (i.e., the set of
words that satisfy them) by ⟦R⟧ ⊆ Σ*. We note that we do not impose restrictions on the set R;
however, our algorithm relies on an SMT solver, and thus predicates in R have to be encodable
as formulas.
Some examples of predicates and their meanings are: ⟦num⟧ = {w ∈ Σ* | w consists of digits only}, ⟦anum⟧ = {w ∈ Σ* | w consists of letters and digits only}, ⟦del⟧ = {., \t, ;}, and ⟦no_del⟧ = (Σ \ ⟦del⟧)*.
We assume that the predicate T satisfied by any string (i.e., ⟦T⟧ = Σ*) is always in
R. We abbreviate x_T, X_T to x, X. In the following, we refer to σ, x_R, X_R, and σ^k as the
atomic constructs. Given an input SE se, we denote by x_se, X_se, and K_se the sets of variables in se.
Input SE Semantics  To define the semantics, we first define interpretations of an SE, which
depend on assignments. An assignment env for an input SE se maps every x ∈ x_se to a letter in
Σ, every X ∈ X_se to a sequence in Σ*, and every k ∈ K_se to a natural number (including 0). We
denote by env[se] the sequence over Σ obtained by substituting the variables with their interpretations.
Formally: (i) env[ε] = ε, (ii) env[σ] = σ, (iii) env[x_R] = env(x), (iv) env[X_R] = env(X),
(v) env[σ^k] = σ^{env(k)}, and (vi) env[S1 · S2] = env[S1] · env[S2] (where · denotes string concatenation).
An assignment is valid if for every x_R and X_R in se, env(x), env(X) ∈ ⟦R⟧. In the
following we always refer to valid assignments.
The semantics of an input SE se, denoted by ⟦se⟧, is the set of strings obtained by the set
of all valid assignments, i.e., ⟦se⟧ = {s ∈ Σ* | ∃env. env[se] = s}. For example, ⟦σ⟧ = {σ}, ⟦x⟧ = Σ, ⟦X⟧ = Σ*, and ⟦σ^k⟧ = {ε, σ, σσ, ...}.
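As a sanity check, the interpretation env[se] of cases (i)–(vi) above can be sketched in Python (the tuple encoding of atomic constructs is our own illustration, not the thesis's):

```python
# Interpreting an input SE under an assignment, following cases (i)-(vi).
# Atoms are encoded as tuples (our own encoding): ("lit", σ), ("char", x),
# ("seq", X), ("star", σ, k); an SE is a list of atoms (concatenation).

def interpret(se, env):
    out = []
    for atom in se:
        kind = atom[0]
        if kind == "lit":            # (ii) env[σ] = σ
            out.append(atom[1])
        elif kind == "char":         # (iii) env[x_R] = env(x)
            out.append(env[atom[1]])
        elif kind == "seq":          # (iv) env[X_R] = env(X)
            out.append(env[atom[1]])
        elif kind == "star":         # (v) env[σ^k] = σ repeated env(k) times
            out.append(atom[1] * env[atom[2]])
    return "".join(out)              # (vi) concatenation; the empty SE gives ε

# Example: the SE x X a from the text, under env = {x ↦ 'b', X ↦ 'cd'}.
se = [("char", "x"), ("seq", "X"), ("lit", "a")]
assert interpret(se, {"x": "b", "X": "cd"}) == "bcda"
# The SE 0^k with env(k) = 3 yields "000", an element of [[0^k]] = {ε, 0, 00, ...}.
assert interpret([("star", "0", "k")], {"k": 3}) == "000"
```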
Output SE Fig. 6.1(b) shows the grammar of output SEs. Output SEs are defined with respect
to an input SE and can only refer to its variables. Formally, given an input SE se, an output
SE over se is restricted to the variables in x_se, X_se, and K_se. Unlike input SEs, an output SE is
allowed to have multiple occurrences of the same variable, and variables are not constrained
by predicates. In addition, output SEs can express invocations of unary functions over the
variables. Namely, the grammar is extended with f(x) and f(X), where x ∈ x_se and X ∈ X_se,
and f : Σ→ Σ∗ is a function.
An interpretation of an output SE is defined with respect to an assignment, similarly to the
interpretation of an input SE. We extend the interpretation definition for the functions as follows:
env[f(x)] = f(env(x)) and if env(X) = σ1 · · ·σn then env[f(X)] = f(σ1) · · · f(σn), i.e.,
env[f(X)] is the concatenation of the results of invoking f on the characters of the interpretation
of X . (If env(X) = ε, env[f(X)] = ε.)
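A hedged sketch of this extended interpretation (again with our own atom encoding; f is applied character-wise, so a character variable is simply the length-one case):

```python
# Interpreting output-SE function atoms: env[f(x)] = f(env(x)) and env[f(X)]
# is the concatenation of f over the characters of env(X). Encoding is ours.

def interpret_out(se, env, funcs):
    out = []
    for atom in se:
        kind = atom[0]
        if kind == "lit":
            out.append(atom[1])
        elif kind in ("char", "seq"):
            out.append(env[atom[1]])
        elif kind == "fun":          # ("fun", f, var): f applied char-wise
            f = funcs[atom[1]]
            out.append("".join(f(c) for c in env[atom[2]]))
    return "".join(out)

# f_lowercase applied to a character and a sequence variable, in the spirit
# of the email example below.
funcs = {"lowercase": lambda c: c.lower()}
env = {"x0": "D", "X2": "Lockhart"}
se_out = [("fun", "lowercase", "x0"), ("lit", "."),
          ("fun", "lowercase", "X2"), ("lit", "@lockhart-gardner.com")]
assert interpret_out(se_out, env, funcs) == "d.lockhart@lockhart-gardner.com"
```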
Input-Output SE Pairs  An input-output SE (interchangeably, an SE pair) is a pair io = se_in → se_out consisting of an input SE, se_in, and an output SE, se_out, defined over se_in. Given io = se_in → se_out, we denote in(io) = se_in and out(io) = se_out. The semantics of io is the set of pairs ⟦io⟧ = {(s_in, s_out) ∈ Σ* × Σ* | ∃env. s_in = env[in(io)] ∧ s_out = env[out(io)]}. The domain of io is dom(io) = ⟦in(io)⟧.
Example  An input-output SE for the pattern of column D based on columns A, B in Fig. 1.1 is:
x0^{no_del} X1^{no_del} ␣ X2 → f_lowercase(x0).f_lowercase(X2)@lockhart-gardner.com
where x0 is a character variable, X1 and X2 are sequence variables, and ␣ denotes a column
delimiter (taken from Σ). The predicate no_del is satisfied by words that do not contain a
delimiter. The semantics of this SE pair is the set of all word pairs whose first element is a
string consisting of a first name, a delimiter, and a last name, and the second element is the
email address, which is the sequence of the first letter of the first name in lower case, a dot, the
lower-cased last name, and the suffix “@lockhart-gardner.com”.
6.2.3 Sequence Expressions as Abstract Examples
SE pairs provide an intuitive means to describe relations between outputs and inputs. In this work,
we focus on learning abstract examples that can be described with SE pairs. For simplicity’s
sake, in the following we ignore predicates and functions (i.e., R and F). Our definitions and
algorithms can be easily extended to arbitrary (but finite) sets R and F.
We say that an input-output SE io is an abstract example if ⟦io⟧ describes a partial function.
Note that in general, an SE pair is not necessarily an abstract example. For example, the pair
io_XY = XY → XaY can be interpreted to (bbb, babb) (by env1 = {X ↦ b, Y ↦ bb}) and to
(bbb, bbab) (by env2 = {X ↦ bb, Y ↦ b}). Thus, ⟦io_XY⟧ is not a partial function and hence
not an abstract example.
Given a program P, we say that an input-output SE io is an abstract example for P if ⟦io⟧ ⊆ ⟦P⟧. Since ⟦P⟧ is a function, this requirement subsumes the abstract example requirement.
Given an input SE se_in, we say that an output SE se_out over se_in is a completion of se_in for P
if se_in → se_out is an abstract example for P.
Example We next exemplify how SEs can provide an abstract example specification to describe
a program behavior. Assume a user has a list of first names and middle names (space delimited),
some of which are only initials, and the goal is to create a greeting message of the form “Dear
<name>”. The name in the greeting is the first string if it is identified as a name, i.e., has at
least two letters; otherwise, the name is the entire string. For example: (i) Adam→ Dear Adam,
(ii) Adam R.→ Dear Adam, (iii) A. Robert→ Dear A. Robert (iv) A.R.→ Dear A.R.. In this
example, we assume the predicate set contains the predicates R = {T, name, other}, where
⟦name⟧ = {A, a, ..., Z, z}+ \ {A, a, ..., Z, z} (i.e., sequences of at least two letters), and ⟦other⟧ = (Σ \ {␣})* \ ⟦name⟧. An abstract
example specification is: (i) X0^name → Dear X0, (ii) X0^name ␣ X1 → Dear X0, (iii) X0^other → Dear X0, (iv) X0^other ␣ X1 → Dear X0 ␣ X1.
Discussion  While SEs can capture many program behaviors, they have limitations. One limitation
is that an SE can only describe relations between output characters and input characters,
but not among input characters. For example, it cannot capture inputs that are palindromes or
inputs of the form XX (e.g., abab). This limitation arises because we chose input SEs to be
(a subset of) regular expressions, which cannot capture such languages. Also, tasks that are
not string manipulations are likely to have a specification that contains (many) trivial abstract
examples (i.e., concrete input-output examples). For example, consider a program that takes
two digits and returns their product. Some abstract examples describing it are X 1 → X
and 1 X → X. However, the specification also contains 9 2 → 18, 9 3 → 27, ..., 9 9 → 81.
Moreover, an abstract example specification consists of a set of independent abstract examples,
with no particular order. As a result, describing if-else rules requires encoding the negation of
the “if” condition explicitly in order to obtain the same case splitting as an if-else structure.
Generalization Order We next define a partial order between SEs that are abstract examples.
This order is leveraged by our algorithm in the next section. We call this order the generalization
order and if an abstract example is greater than another one, we say it is more general or abstract.
We begin by defining a partial order ⪯ on the atomic constructs of SEs: σ ⪯ x, σ ⪯ σ^k,
x ⪯ X, and σ^k ⪯ X (where σ ∈ Σ, x ∈ x, X ∈ X, and k ∈ K). That is, a concrete letter σ is
the most specific construct, x and σ^k are two (incomparable) generalizations of it, and a
sequence variable X is the most general construct.
We say that an input SE se′ is more general than se, written se ⪯ se′, if its atomic constructs
are pointwise more general than the atomic constructs of se. Namely, for se = a1 · · · an and
se′ = a′1 · · · a′n (where the ai and a′i are atomic constructs), se ⪯ se′ if for every 1 ≤ i ≤ n,
ai ⪯ a′i. If se ⪯ se′ ∧ se ≠ se′, we write se ≺ se′. For example, abc ≺ ab^k c ≺ xYZ. In
addition, we define that for any atomic construct a, a ⋠ ε and ε ⋠ a. The generalization order
implies the following:
Lemma 6.2.1. Let se, se′ be two input SEs. If se ⪯ se′, then ⟦se⟧ ⊆ ⟦se′⟧.
The proof follows directly from the definition of ⪯ and the semantics of an input SE. Note that
the converse does not necessarily hold. For example, ⟦XY⟧ = ⟦Z⟧, but XY ⋠ Z and Z ⋠ XY.
In fact, ⪯ may only relate SEs of the same length. In practice, we partly support generalizations
beyond ⪯ (see Section 6.3).
The generalization order on input SEs induces a generalization order on input-output SEs:
io ⪯ io′ if in(io) ⪯ in(io′). If io and io′ are abstract examples for the same program P,
this implies that ⟦io⟧ ⊆ ⟦io′⟧. Moreover, in that case, ⟦io⟧ ⊆ ⟦io′⟧ if and only if ⟦in(io)⟧ ⊆ ⟦in(io′)⟧. This observation enables our algorithm to focus on generalizing the input SE instead
of generalizing the pair as a whole.
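The pointwise check behind ⪯ can be sketched as follows (the encoding is our own; only the atomic order σ ⪯ x, σ ⪯ σ^k, x ⪯ X, σ^k ⪯ X is taken from the text):

```python
# Pointwise generalization check on input SEs (sketch; encoding is ours).
# Atomic order: σ ⪯ x, σ ⪯ σ^k (same letter), x ⪯ X, σ^k ⪯ X.

def atom_leq(a, b):
    if a == b:
        return True
    if b[0] == "seq":                          # X generalizes everything
        return True
    if a[0] == "lit":
        if b[0] == "char":                     # σ ⪯ x
            return True
        if b[0] == "star" and b[1] == a[1]:    # σ ⪯ σ^k for the same letter
            return True
    return False

def se_leq(se1, se2):
    """se1 ⪯ se2: same length and pointwise more general constructs."""
    return (len(se1) == len(se2)
            and all(atom_leq(a, b) for a, b in zip(se1, se2)))

# The chain abc ≺ ab^k c ≺ xYZ from the text.
abc  = [("lit", "a"), ("lit", "b"), ("lit", "c")]
abkc = [("lit", "a"), ("star", "b", "k"), ("lit", "c")]
xYZ  = [("char", "x"), ("seq", "Y"), ("seq", "Z")]
assert se_leq(abc, abkc) and se_leq(abkc, xYZ)
assert not se_leq(xYZ, abc)       # the order is not symmetric
```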
6.3 An Algorithm for Learning Abstract Examples
In this section, we describe L-SEP, our algorithm for automatically Learning an SE Pair. This
pair is an abstract example for a given program and it generalizes a given concrete example. In
Section 6.4, we will use L-SEP repeatedly in order to generate an abstract example specification.
L-SEP (Algorithm 12) takes as input a program P (e.g., the program Flash Fill learned)
and a (concrete) input in (e.g., Diane). These two define the initial SE with which to begin:
(in, ⟦P⟧(in)) (namely, the concrete example). The algorithm outputs an input-output SE,
io = s_in → s_out, such that (in, ⟦P⟧(in)) ∈ ⟦io⟧ ⊆ ⟦P⟧. Namely, io generalizes (or abstracts)
the concrete example and is consistent with P. L-SEP's goal is to find an io that is maximal
with respect to ⪯.
The high-level operation of L-SEP is as follows. First, it sets io = in → ⟦P⟧(in). Then, it
gradually generalizes io as long as this results in pairs that are abstract examples for P. The
main insight of L-SEP is that instead of generalizing io as a whole, it generalizes the input SE,
in(io), and then checks whether there is a completion of in(io) for P, namely an output SE
over in(io) such that the resulting pair is an abstract example for P. This is justified by the
property that io ⪯ io′ if and only if in(io) ⪯ in(io′).
6.3.1 Input Generalization
We now explain the pseudo-code of L-SEP. After initializing io by setting s_in = in and
s_out = ⟦P⟧(in), L-SEP stores in InCands the set of candidates generalizing s_in (which are
the input components of io's generalizations). Then, a loop attempts to generalize s_in as long as
InCands ≠ ∅. Each iteration picks a minimal element s′_in from InCands, which is a candidate
to generalize s_in. To determine whether s′_in can generalize s_in, findCompletion is called.
If it succeeds, it returns s′_out such that s′_in → s′_out is an abstract example for P. If it fails, ⊥
is returned. Either way, the search space InCands is pruned: if the generalization succeeds,
the candidates are pruned to those generalizing s′_in; otherwise, the candidates generalizing
s′_in are removed. If the generalization succeeds, s_in and s_out are updated to s′_in and s′_out.
Our next lemma states that if findCompletion returns ⊥, pruning InCands does not
Algorithm 12: L-SEP(P, in)
 1  s_in = in; s_out = ⟦P⟧(in)
 2  InCands = {s ∈ SE_in | s ⪰ s_in}
 3  while InCands ≠ ∅ do
 4      s′_in = pick a minimal element from InCands
 5      s′_out = findCompletion(P, s′_in)    // if it succeeds, ⟦s′_in → s′_out⟧ ⊆ ⟦P⟧
 6      if s′_out ≠ ⊥ then
 7          s_in = s′_in; s_out = s′_out
 8          InCands = InCands ∩ {s ∈ SE_in | s ⪰ s_in}
 9      else
10          InCands = InCands \ {s ∈ SE_in | s ⪰ s′_in}
11  return (s_in, s_out)
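The pruning structure of this loop can be illustrated on a toy lattice (this is a sketch of the loop's logic only, not of the SE machinery: candidates are modeled as sets of input positions that have been generalized, and has_completion stands in for findCompletion; it must be closed under specialization, as Lemma 6.3.1 below guarantees for the real check):

```python
# Skeleton of L-SEP's candidate-pruning loop on a toy lattice (sketch).
from itertools import combinations

def lsep_skeleton(n, has_completion):
    """n = input length; a candidate is a frozenset of generalized positions.
    has_completion must hold for every specialization of a candidate it
    accepts (the analogue of Lemma 6.3.1)."""
    all_cands = {frozenset(c) for r in range(n + 1)
                 for c in combinations(range(n), r)}
    current = frozenset()                       # start from the concrete input
    in_cands = {c for c in all_cands if c > current}
    while in_cands:
        cand = min(in_cands, key=len)           # pick a minimal element
        if has_completion(cand):                # generalization succeeds
            current = cand
            in_cands = {c for c in in_cands if c > current}
        else:                                   # prune all its generalizations
            in_cands = {c for c in in_cands if not c >= cand}
    return current

# Toy: positions 0 and 2 of a length-3 input can be generalized; generalizing
# position 1 breaks the completion, so every superset of {1} is pruned.
result = lsep_skeleton(3, lambda cand: 1 not in cand)
assert result == frozenset({0, 2})              # a maximal feasible candidate
```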
remove input SEs that have a completion for P . The lemma guarantees that L-SEP cannot miss
abstract examples for P because of this pruning.
Lemma 6.3.1. If s′′_in ⪰ s′_in and s′_in has no completion for P, then s′′_in has no completion for P.
Proof sketch. By induction on the number of generalization steps required to get from s′_in to
s′′_in. The base case is trivial. Assume the last generalization step replaces a′_i in s′_in with a′′_i in s′′_in. If
s′′_in has a completion s′′_out for P, then substituting a′′_i in s′′_out by a′_i yields a completion for s′_in,
contradicting our assumption.
InCands  For ease of presentation, L-SEP defines InCands as the set of all generalizations
of in that remain to be checked; initially it contains all generalizations. However,
the size of this set is exponential in the length of in, and thus in practice L-SEP does not
maintain it explicitly. Instead, it maintains two sets: MinCands, which records the minimal
generalizations of the current candidate s_in that remain to be checked, and Pruned, which
records the minimal generalizations that were overruled (and hence none of their generalizations
need to be inspected). In Line 2 and Line 8, L-SEP initializes MinCands based on the
current candidate s_in by computing all of its minimal generalizations. In Line 10 it removes
from MinCands the generalization that was last checked and failed, and also records this
generalization in Pruned to indicate that none of its generalizations need to be inspected.
Pruned is used immediately after initializing MinCands in Line 8 to remove from MinCands
any generalization that generalizes a member of Pruned – this efficiently implements the update
of InCands in Line 10. Using this representation of InCands, we can now establish:
Lemma 6.3.2. The number of iterations of L-SEP is O(|in|² · |R|²).
Proof. The number of iterations is at most the maximal size of MinCands multiplied by the
number of initializations of MinCands based on a new candidate s_in in Line 8. The size
of MinCands computed for some s_in is at most |in| · (|R| + 1). This follows since
a minimal generalization of s_in differs from s_in in a single construct that is more general
than the corresponding construct in s_in (with respect to the partial order of constructs). The
number of initializations of MinCands in Line 8 is bounded by the length of the longest (possible) chain of
generalizations. This follows because each such initialization is triggered by the update of s_in
to a more general SE. Since the longest chain of generalizations has length at most |in| · (|R| + 1), the
number of iterations is O(|in|² · |R|²).
Lemma 6.3.2 implies that MinCands and Pruned provide a polynomial representation of
InCands (even though the latter is exponential). Further, the use of these sets enables L-SEP
to run in polynomial time because they provide a quadratic bound on the number of iterations,
and because findCompletion is also polynomial, as we shortly prove.
Picking a Minimal Generalization  We now discuss how L-SEP picks a minimal generalization
of s_in in Line 4. One option is to do so arbitrarily. However, this greedy approach may result
in a sub-optimal maximal generalization, namely, a maximal generalization that concretizes
to fewer concrete inputs than some other possible maximal generalization. On the other hand,
to obtain an optimal generalization, all generalizations that have a completion have to be
computed and only then can the best one be picked by comparing the number of concretizations.
Unfortunately, this approach results in an exponential time complexity and is thus impractical.
Instead, our implementation of L-SEP takes an intermediate approach: it considers all minimal
generalizations that have a completion and picks one that concretizes to a maximal number of
inputs. To avoid counting the number of inputs (which may be computationally expensive), our
implementation employs the following heuristic. It syntactically compares the generalizations
by comparing, in each of them, the construct that is not in s_in (i.e., where the generalization took
place). It then picks the generalization whose construct is maximal with respect to the order
X > σ^k > x. If there are generalized constructs that are not comparable w.r.t. this order (e.g.,
σ^{k1} vs. σ^{k2}), one is picked arbitrarily.
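This heuristic order can be sketched directly (the encoding of construct kinds is our own):

```python
# The heuristic order on generalized constructs: X > σ^k > x (sketch).

RANK = {"seq": 2, "star": 1, "char": 0}   # our own encoding of construct kinds

def pick_generalization(cands):
    """cands: the constructs where generalization took place, one per minimal
    generalization that has a completion. Picks one whose construct is maximal
    w.r.t. X > σ^k > x; incomparable ties (e.g., two star constructs) are
    broken arbitrarily -- here, by first occurrence."""
    return max(cands, key=lambda c: RANK[c[0]])

assert pick_generalization([("char", "x0"), ("star", "a", "k0"),
                            ("seq", "X0")])[0] == "seq"
# Two star constructs are incomparable; one is picked arbitrarily.
assert pick_generalization([("star", "a", "k1"), ("star", "b", "k2")])[0] == "star"
```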
6.3.2 Completion
findCompletion (Algorithm 13) takes P and an input generalization s′_in, and returns a
completion of s′_in for P if one exists, or ⊥ otherwise.
In contrast to the input SE search, if a certain candidate s′′_out is not a completion of s′_in for P, this does
not imply that its generalizations are not completions of s′_in either. Thus, a pruning procedure
similar to the one in L-SEP may miss completions. Consider, for example, a program
P whose abstract example specification is {xX → bX}. Assume that while L-SEP looks for a
completion for s′_in = ax, it considers s′_out = ba, which is not a completion. Pruning SEs that are
more general than s′_out would prune the completion bx. Likewise, pruning elements that
are more specific than a candidate that is not a completion may prune completions.
Since the former pruning cannot be used when searching for the output SE, findCompletion
searches differently, by making gradual attempts to construct a completion s′_out construct
by construct. If an attempt fails, it backtracks and attempts a different construction. This is
implemented via the recursive function findOutputPrefix. At each step, a current prefix
s_out^pref (initially ε) is extended with a single atomic construct sym (i.e., σ, x, X, or σ^k). Then, it
checks whether the current extended construction is partially consistent with P (Line 7). If the
Algorithm 13: findCompletion(P, s′_in)
 1  return findOutputPrefix(P, s′_in, ε)
 2  Function findOutputPrefix(P, s′_in, s_out^pref):
 3      if ⟦s′_in → s_out^pref⟧ ⊆ ⟦P⟧ then return s_out^pref
 4      Cands = {s ∈ SE_out(s′_in) | s is an atomic construct}
 5      while Cands ≠ ∅ do
 6          sym = pick and remove a minimal element from Cands
 7          if ⟦s′_in → s_out^pref · sym⟧ ⊆ {(in, op) | ∃os ∈ Σ*. (in, op · os) ∈ ⟦P⟧} then
 8              s_out^pref = s_out^pref · sym
 9              s′_out = findOutputPrefix(P, s′_in, s_out^pref)
10              if s′_out ≠ ⊥ then return s′_out
11      return ⊥
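The backtracking structure of this algorithm can be illustrated in a toy setting (one character variable over Σ = {a, b}, a brute-force partial-consistency check over all assignments, and a candidate program that prepends 'H'; the encoding and the toy program are our own, not the thesis's):

```python
# Backtracking construction of an output SE prefix (sketch of Algorithm 13).

SIGMA = "ab"

def interp(se, env):
    return "".join(env[a[1]] if a[0] == "char" else a[1] for a in se)

def partially_consistent(se_in, prefix, program):
    """env[prefix] must be a prefix of P(env[se_in]) for every assignment."""
    return all(program(interp(se_in, {"x": c}))
               .startswith(interp(prefix, {"x": c})) for c in SIGMA)

def fully_consistent(se_in, se_out, program):
    return all(program(interp(se_in, {"x": c})) == interp(se_out, {"x": c})
               for c in SIGMA)

ATOMS = [("lit", "H"), ("lit", "a"), ("lit", "b"), ("char", "x")]

def find_output_prefix(se_in, prefix, program):
    if fully_consistent(se_in, prefix, program):    # a completion is found
        return prefix
    for sym in ATOMS:
        ext = prefix + [sym]
        if partially_consistent(se_in, ext, program):  # otherwise: prune ext
            res = find_output_prefix(se_in, ext, program)
            if res is not None:
                return res
    return None                                     # plays the role of ⊥

P = lambda s: "H" + s
completion = find_output_prefix([("char", "x")], [], P)
assert completion == [("lit", "H"), ("char", "x")]
```

The recursion terminates because a prefix longer than the program's output can never pass the partial-consistency check, mirroring Lemma 6.3.4.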
check fails, this extended prefix is discarded, thereby pruning its extensions from the search
space. Otherwise, further extension of the extended prefix is attempted. We next define partial
consistency.
Definition 6.3.3. An SE pair s′_in → s_out^pref is partially consistent with P if for every assignment
env, env[s_out^pref] is a prefix of ⟦P⟧(env[s′_in]).
When s′_in is clear from the context, we say that s_out^pref is partially consistent with P.
By the semantics definition, a pair s′_in → s_out^pref is partially consistent with P if and only
if ⟦s′_in → s_out^pref⟧ ⊆ {(in, op) | ∃os ∈ Σ*. (in, op · os) ∈ ⟦P⟧} (which is the check of Line 7).
Partial consistency is a necessary condition (albeit not a sufficient one) for s_out^pref · sym to be a prefix
of a completion s′_out. Thus, if s_out^pref · sym is not partially consistent, there is no need to check
its extensions. Note that even if a certain prefix s_out^pref · sym is partially consistent, it may be
that this prefix cannot be further extended (namely, the suffixes cannot be realized by an SE).
In this case, this prefix will be discarded in later iterations, and a different attempt to
extend s_out^pref will be made. This extension process terminates when an extension results in a
completion, in which case it is returned, or when all extensions fail, in which case ⊥ is returned.
Lemma 6.3.4. The recursion depth of Algorithm 13 is bounded by the length of ⟦P⟧(in).
Proof. Denote by n the length of ⟦P⟧(in). Assume toward a contradiction that the recursion depth
exceeds n; namely, the current prefix s_out^pref is strictly longer than n. We show that in this case,
the partial consistency check is guaranteed to fail. To this end, we show an assignment env
to s′_in such that env[s_out^pref] is not a prefix of ⟦P⟧(env[s′_in]). Consider the assignment env that
maps each variable in s′_in to its original value in in (namely, env[s′_in] = in). This assignment
maps each variable to exactly one letter. By our assumption, the length of env[s_out^pref] is greater
than n. Thus, env[s_out^pref] (of length > n) cannot be a prefix of ⟦P⟧(in) (of length n).
6.3.3 Guarantees
Lemma 6.3.2 and Lemma 6.3.4 ensure that both the input generalization and the completion
algorithms terminate in polynomial time. Thus, the overall runtime of L-SEP is polynomial.
Finally, we discuss the guarantees of these algorithms.
Lemma 6.3.5. findCompletion is sound and complete: if it returns s′_out, then s′_out is a
completion of s′_in for P, and if it returns ⊥, then s′_in has no completion for P.
Soundness follows since findOutputPrefix returns s′_out only after validating that
⟦s′_in → s′_out⟧ ⊆ ⟦P⟧. Completeness follows since s′_out is constructed gradually and every
possible extension is examined.
Lemma 6.3.6. L-SEP is sound and complete: for every (in, out) pair, an SE pair is returned,
and if L-SEP returns an SE pair, then it is an abstract example for P .
Soundness is guaranteed from findCompletion. Completeness follows since even if all
generalizations fail, L-SEP returns the concrete example as an SE pair.
Theorem 6.1. L-SEP returns an abstract example io for P such that (in, ⟦P⟧(in)) ∈ ⟦io⟧ and
io is maximal w.r.t. ⪯.
This follows from Lemma 6.3.1, Lemma 6.3.5, and the fact that L-SEP terminates only when InCands
is empty (i.e., when there are no more input generalizations to explore).
We note that in our implementation, findCompletion runs heuristics instead of the
expensive backtracking. In this case, maximality is no longer guaranteed.
6.3.4 Running Example
We next exemplify L-SEP on the (shortened) example from the Overview (Section 6.1), where
we start from the concrete example in = Diane and we wish to obtain the abstract example
x0x1Y → Hx1 x0x1Y. L-SEP starts with s_in = Diane and s_out = Hi Diane. It then picks
a minimal candidate that generalizes s_in. A minimal candidate differs from s_in in one atomic
construct at some position i. By ⪯, if s_in[i] = σ, then s′_in[i] is x or σ^k.
Assume that L-SEP first tests the minimal candidate s′_in = D^{k0}iane. To test it, L-SEP
calls findCompletion to look for a completion. The completion is defined over s′_in and in
particular can use the variable k0. Then, findCompletion invokes findOutputPrefix(P,
D^{k0}iane, ε). In the first call of findOutputPrefix, all extensions of the current prefix, ε,
except for H, fail the partial consistency check. This follows since the output of P always
starts with an ‘H’ (and not, e.g., with H^{k0}). Thus, a recursive call is invoked (only) for the
output SE prefix H. In this call, all extensions (i.e., Hσ or Hσ^{k0}) fail. For example, Hi fails since
the output prefix is not always “Hi” (e.g., P(DDiane) = HD DDiane). Since the prefix H cannot be
extended further, ⊥ is returned. This indicates that the input generalization s′_in = D^{k0}iane fails.
Thus, L-SEP removes from InCands all generalizations whose first construct generalizes D^{k0}.
L-SEP then tests another minimal generalization: s′_in = x0iane. It then calls
findCompletion (which can use x0). As before, (only) the prefix SE H is found partially
consistent. Next, a second call attempts to extend H. This time, the extension Hi succeeds
because for all interpretations of x0iane, the output prefix is “Hi”. The recursion continues, until
obtaining and returning the completion Hi x0iane.
When L-SEP learns that s′_in is a feasible generalization, it updates s_in and s_out, and prunes
InCands to candidates generalizing x0iane (for example, InCands contains x0x1ane). Eventually,
s_in is generalized to s′_in = x0x1X2X3X4 with the completion s′_out = Hx1 x0x1X2X3X4.
In a postprocessing step (performed when L-SEP is done), X2X3X4 is simplified to Y, resulting
in the abstract example x0x1Y → Hx1 x0x1Y. Note that this last “generalization” is no longer
according to ⪯.
6.4 Synthesis with Abstract Examples
In this section, we present our framework for synthesis with abstract examples. We assume
the existence of an oracle O (e.g., a user) with a fixed target program P_tar. Our framework
is parameterized by a synthesizer S that takes concrete or abstract examples and returns a
consistent program. Note that the guarantee to eventually output a program equivalent to P_tar is the
responsibility of our framework and not of S. Nonetheless, candidate programs are provided by S.
Goal The goal of our framework is to learn a program equivalent to the target program. Note
that this is different from the traditional goal of PBE synthesizers, which learn a program that
agrees with the target program at least on the observed inputs. More formally, our goal is to
learn a program P′ such that ⟦P_tar⟧ = ⟦P′⟧, whereas PBE synthesizers that are given a set
of input-output examples E ⊆ D × D can only guarantee to output a program P′′ such that
⟦P_tar⟧ ∩ E = ⟦P′′⟧ ∩ E.
Interaction Model  We assume that the oracle O can accept abstract examples or reject them and
provide a counterexample. If the oracle accepts an abstract example io, then ⟦io⟧ ⊆ ⟦P_tar⟧. If it
returns a counterexample cex = (in′, out′), then (i) (in′, out′) ∈ ⟦P_tar⟧, (ii) (in′, out′) ∉ ⟦io⟧,
and (iii) in′ ∈ ⟦in(io)⟧.
Operation Our framework (Algorithm 14) takes an initial (nonempty) set of input-output
examples E ⊆ D ×D. This set may be extended during the execution. The algorithm consists
of two loops: an outer one that searches for a candidate program and an inner one that computes
abstract examples for a given candidate program. The inner loop terminates when one of the
abstract examples is rejected (in which case a new iteration of the outer loop begins) or when the
input space is covered (in which case the candidate program is returned along with the abstract
example specification).
The algorithm begins by initializing A to the empty set. This set accumulates abstract
examples that eventually form an abstract example specification of Ptar. Then the outer loop
begins (Lines 2–10). Each iteration starts by asking the synthesizer for a program P consistent
with the current set of concrete examples in E and abstract examples in A. Then, the inner loop
begins (Lines 4–9). At each inner iteration, an input in is picked and L-SEP(P, in) is invoked.
When an abstract example io is returned, it is presented to the oracle. If the oracle provides a
counterexample cex = (in′, out′), then ⟦P⟧ ≠ ⟦P_tar⟧ (see Lemma 6.4.1). In this case, E is
extended with cex, and a new outer iteration begins. If the oracle accepts the abstract example,
Algorithm 14: synthesisWithAbstractExamples(E)
 1  A = ∅                                    // initialize the set of abstract examples
 2  while true do
 3      P = S(E, A)                          // obtain a program consistent with the examples
 4      while ⋃_{io∈A} ⟦in(io)⟧ ≠ D do       // A does not cover D
 5          let in ∈ D \ ⋃_{io∈A} ⟦in(io)⟧   // obtain an uncovered input
 6          io = L-SEP(P, in)                // learn an abstract example
 7          cex = O(io)                      // ask the oracle
 8          if cex = ⊥ then A = A ∪ {io}     // the abstract example is correct
 9          else E = E ∪ {cex}; break        // add a counterexample
10  return (P, A)
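The loop structure of Algorithm 14 can be illustrated on a finite toy domain (the candidate space, oracle, and the stubbed generalizer below are our own simplifications; in particular, the stub returns concrete examples, which are trivially sound "abstract examples", so only the shape of the interaction is shown):

```python
# Sketch of the CEGIS-style loop of Algorithm 14 on a finite toy domain.

D = ["a", "b", "ab", "ba"]                     # finite input domain
P_tar = lambda s: s[::-1]                      # target: reverse the string

CANDIDATES = [lambda s: s, lambda s: s + s, lambda s: s[::-1]]  # toy space

def synthesizer(E):
    """Return the first candidate consistent with the concrete examples E."""
    return next(p for p in CANDIDATES if all(p(i) == o for i, o in E))

def oracle(io):
    """Accept io (return None) or return a counterexample from its domain."""
    for inp, out in io:
        if P_tar(inp) != out:
            return (inp, P_tar(inp))
    return None

def generalize(P, inp):
    """Stub for L-SEP: here, just the concrete example (always sound)."""
    return {(inp, P(inp))}

def synthesize(E):
    A = set()
    while True:                                          # outer loop
        P = synthesizer(E)
        while {i for io in A for i, _ in io} != set(D):  # A does not cover D
            inp = next(i for i in D
                       if i not in {j for io in A for j, _ in io})
            io = frozenset(generalize(P, inp))
            cex = oracle(io)
            if cex is None:
                A.add(io)                                # accepted: keep it
            else:
                E = E | {cex}                            # rejected: new round
                break
        else:
            return P, A                                  # D is covered

P, A = synthesize({("ab", "ba")})
assert all(P(i) == P_tar(i) for i in D)
```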
io, it is added to A (since acceptance establishes that it is an abstract example for P_tar). The idea is that
the synthesizer extends its set of examples with more examples (potentially an infinite number).
This (potentially) enables faster convergence to Ptar (in case additional outer iterations are
needed). If the inner loop terminates without encountering counterexamples, then A covers
the input domain D. At this point it is guaranteed that ⟦P⟧ = ⟦P_tar⟧ (see Theorem 6.2). Thus,
P is returned, along with the abstract example specification A. Note that A has already been
validated and need not be inspected again.
We remark that although abstract examples can help the synthesizer to converge faster to
the target program, the convergence speed (and the number of counterexamples required to
converge) still depends on the synthesizer (which is a parameter to our framework) and not on
L-SEP or our synthesis framework.
Lemma 6.4.1. If O(io) = (in′, out′) (≠ ⊥), then ⟦P⟧ ≠ ⟦P_tar⟧.
Proof. From the oracle properties, (in′, out′) ∈ ⟦P_tar⟧, (in′, out′) ∉ ⟦io⟧, and in′ ∈ ⟦in(io)⟧.
Thus, there exists out′′ ≠ out′ such that (in′, out′′) ∈ ⟦io⟧. Since, by construction, ⟦io⟧ ⊆ ⟦P⟧,
it follows that (in′, out′′) ∈ ⟦P⟧. Thus, ⟦P⟧ ≠ ⟦P_tar⟧.
Theorem 6.2. Upon termination, Algorithm 14 returns a program P s.t. ⟦P⟧ = ⟦P_tar⟧.
Proof. Upon termination, for every in ∈ D there exists io ∈ A s.t. in ∈ ⟦in(io)⟧. By
construction, ⟦io⟧ ⊆ ⟦P⟧, and thus (in, ⟦P⟧(in)) ∈ ⟦io⟧. By the oracle properties, ⟦io⟧ ⊆ ⟦P_tar⟧; thus (in, ⟦P_tar⟧(in)) ∈ ⟦io⟧. Altogether, ⟦P⟧(in) = ⟦P_tar⟧(in).
We emphasize that the interaction with the oracle (user) takes place only after both a
candidate program and an abstract example have been obtained; the goal of the interaction is
to determine whether the candidate program is correct. Rejection of the abstract example by
the user means rejection of the candidate program, in which case the PBE synthesizer S looks
for a new candidate program. In particular, the goal of the interaction is not to confirm the
correctness of the abstract examples – L-SEP always returns (without any interaction) a correct
generalization with respect to the candidate program.
E                                         P(x)               Abstract Examples       Counterexample?
{(10101,10111)}                           P(x) = 10111       X → 10111               1 → 11
{(10101,10111), (1,11)}                   P(x) = OR(x, 2)    X0x1x2 → X01x2          0 → 1
{(10101,10111), (1,11), (0,1)}            P(x) = OR(x+1, 1)  X00x2 → X0x21           No
                                                             X00 → X01               No
                                                             X00x11 → X0x1x̄11        11 → 111
{(10101,10111), (1,11), (0,1), (11,111)}  P(x) = OR(x+1, x)  X001^k → X01^k1         No

Table 6.1: A running example for learning a program that flips the rightmost 0 bit with our synthesis framework. The target program is P_tar(x) = OR(x + 1, x).
Example  We next exemplify our synthesis framework in the bit-vector domain. We consider
a program space P defined inductively as follows. The identity function and all constant
functions are in P. For every op ∈ {Not, Neg} and P ∈ P, op(P) ∈ P, and for every
op ∈ {AND, OR, +, −, SHL, XOR, ASHR} and P1, P2 ∈ P, op(P1, P2) ∈ P. We assume a
naïve synthesizer that enumerates the program space by considering programs of increasing
size and returning the first program consistent with the examples. In this setting, we consider
the task of flipping the rightmost 0 bit, e.g., 10101 → 10111 (taken from the SyGuS competition
[AFSS16]). While this task is easy to explain intuitively through examples, phrasing
it as a logical formula is cumbersome. Assume a user provides to Algorithm 14 the set of
examples E = {(10101, 10111)}. Table 6.1 shows the execution steps taken by our synthesis
framework: E shows the current set of examples, P(x) shows the candidate program
synthesized by the naïve synthesizer, Abstract Examples shows the abstract examples computed
by L-SEP, and Counterexample? is either No, if the user accepts the current abstract
example (to its left), or an input-output example pair contradicting the current abstract example.
In this example, L-SEP uses the set of functions F = {fneg} in the output SE, where
fneg(0) = 1 and fneg(1) = 0, and we abbreviate fneg(y) with ȳ. Further, since the bit-vector domain
consists of vectors of a fixed size (namely, Σ^n for a fixed n instead of Σ*), the SE semantics
in this domain is defined as the suffixes of size n of its (normal) interpretation. Formally,
⟦se⟧_n = {s ∈ Σ^n | ∃env. s is a suffix of env[se]}. The semantics of an input-output SE is
defined similarly. In the example, the first two programs are eliminated immediately by the user,
whereas the third program is eliminated only after showing the third abstract example describing
it. This enables the synthesizer to prune a significant portion of the search space. Note that
since abstract examples are interpreted over fixed sized vectors (as explained above), the last
abstract example covers the input space: if k = n, the input isn times︷ ︸︸ ︷11...1; if k = 0, the input takes
the form of b0...bn−10 (where the bi-s are bits); and if 0 < k < n, the input takes the form of
b0...bn−k−101k.
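For intuition, one can check directly that OR(x + 1, x) indeed flips the rightmost 0 bit of an 8-bit vector. The following Python sketch is our own sanity check against a reference implementation; the framework itself reasons symbolically rather than by enumeration:

```python
# Reference check that OR(x + 1, x) flips the rightmost 0 bit of an
# 8-bit vector (our own illustration, not part of the framework).
N = 8
MASK = (1 << N) - 1

def flip_rightmost_zero(x):
    """Reference semantics: flip the rightmost 0 bit of an N-bit vector."""
    for i in range(N):
        if not (x >> i) & 1:          # found the rightmost 0 bit
            return x | (1 << i)
    return x                          # all ones: nothing to flip

def p_tar(x):
    """The target program from Table 6.1: Ptar(x) = OR(x + 1, x)."""
    return ((x + 1) & MASK) | x

assert p_tar(0b10101) == 0b10111      # the example from the text
assert all(p_tar(x) == flip_rightmost_zero(x) for x in range(1 << N))
```

The equivalence holds because x + 1 turns the trailing block of 1-bits into 0-bits and sets the rightmost 0 bit; OR-ing with x restores the trailing 1-bits.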
Leveraging Counterexamples for Learning Abstract Examples A limitation of L-SEP is that
it only generalizes the existing characters of the concrete input. For example, consider a
candidate program generated by S that returns the first and last character of the string, which
can be summarized by the abstract example x0X1x2 → x0x2. In the process of generating
an abstract example specification for the candidate program, if the first example provided by
Algorithm 14 to L-SEP for generalization is ab, then it is generalized to x0x1 → x0x1. On the
other hand, if the first example is acb, then it is generalized to x0X1x2 → x0x1, whose domain
is a strict superset of the former’s domain. This exemplifies that some inputs may provide
better generalizations than others. Although eventually our framework will learn the better
generalizations, if Algorithm 14 starts from the less generalizing examples, then its termination
is delayed, and unnecessary questions are presented to the oracle (in our example, x0x1 → x0x1
will be presented, followed by x0X1x2 → x0x2, both of which are accepted, but the former
perhaps could have been avoided). We believe that the way to avoid this delay in the algorithm’s
termination is to pick “good” examples. We leave the question of how to identify them to future
work, but note that if the oracle is assumed to provide “good” examples (e.g., representative),
then Line 5 can be changed to first look for an uncovered input in E.
6.5 Evaluation
In this section, we discuss our implementation and evaluate L-SEP and our synthesis framework.
We evaluate our algorithms in two domains: strings and bit vectors (of size 8). The former
domain is suitable for end users, as targeted by approaches like Flash Fill or learning regular
expressions. The latter domain is of interest to the synthesis community (evident by the SyGuS
competition [AFSS16]). We begin with our implementation and then discuss the experiments.
All experiments ran on a Sony Vaio PC with Intel(R) Core(TM) i7-3612QM processor and
8GB RAM.
6.5.1 Implementation
We implemented our algorithms in Java. We next provide the main details.
Program Spaces The program space we consider for bit vectors is the one defined in the example
at the end of Section 6.4. The program space P we consider for the string domain is defined
inductively as follows. The identity function and all constant functions are in P . For every
P1, P2 ∈ P , concat(P1, P2) ∈ P . For P ∈ P and integers i1, i2, Extract(P, i1, i2) ∈ P . For
P1, P2 ∈ P , and a condition e over string programs and integer symbols, ITE(e, P1, P2) ∈ P .
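This inductive program space can be sketched in Python, with each constructor modeled as a higher-order function. The names below are illustrative only; the actual implementation is in Java:

```python
# A sketch of the string program space: identity, constants, concat,
# Extract, and ITE. (Our own illustration of the grammar above.)
identity = lambda s: s

def const(c):
    return lambda s: c                          # a constant function

def concat(p1, p2):
    return lambda s: p1(s) + p2(s)              # concat(P1, P2)

def extract(p, i1, i2):
    return lambda s: p(s)[i1:i2]                # Extract(P, i1, i2)

def ite(e, p1, p2):
    return lambda s: p1(s) if e(s) else p2(s)   # ITE(e, P1, P2)

# Example: return the first letter of the input followed by "!".
first_excl = concat(extract(identity, 0, 1), const("!"))
assert first_excl("dana") == "d!"
```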
SE Spaces In the bit vector domain we consider F = {fneg}, where fneg(b) = 1 − b.
findCompletion To answer the containment queries (Lines 3 and 7), we use the Z3 SMT-
solver [DMB08]. To this end, we encode the candidate program P and the SEs as formulas.
Roughly speaking, an SE is encoded as a conjunction of sequence predicates, each encoding a
single atomic construct. A sequence predicate extends the equality predicate with a start position
and is denoted by t1 =_i t2. An interpretation d1, d2 for t1, t2 satisfies t1 =_i t2 if, starting from the
i-th character of d1, the next |d2| characters are equal to d2. The term t1 is either a unique variable
tin, representing the input (for input SEs), or P(tin) (for output SEs). The term t2 can be (i) σ
(a letter from Σ), (ii) σ^k where k is a star variable, or (iii) a character or sequence variable. For
example, X0 a b^{k2} x3 is encoded as:

tin =_0 X0 ∧ tin =_{|X0|} a ∧ (∀i. 1 + |X0| ≤ i < 1 + |X0| + k2 → tin =_i b) ∧ tin =_{1+|X0|+k2} x3.

Note that the positions can be a function of the variables. In the string
domain, the formulas are encoded in string theory (except for i and k2, which are integers). In
the bit vector domain, entities are encoded as bit vectors and =_i is implemented with masks.
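The sequence predicate has a simple operational reading. The following pure-Python sketch (our own reading of the definition, not the SMT encoding itself) checks the conjuncts encoding X0 a b^{k2} x3 against one concrete interpretation:

```python
# t1 =_i t2 holds iff, starting from the i-th character of d1,
# the next |d2| characters equal d2.
def seq_eq(d1, i, d2):
    return i + len(d2) <= len(d1) and d1[i:i + len(d2)] == d2

# One concrete interpretation: X0 = "cd", k2 = 3, x3 = "e",
# so the input is "cd" + "a" + "bbb" + "e".
t_in, X0, k2, x3 = "cdabbbe", "cd", 3, "e"
assert seq_eq(t_in, 0, X0)                                 # t_in =_0 X0
assert seq_eq(t_in, len(X0), "a")                          # t_in =_{|X0|} a
assert all(seq_eq(t_in, i, "b")                            # the run of b's
           for i in range(1 + len(X0), 1 + len(X0) + k2))
assert seq_eq(t_in, 1 + len(X0) + k2, x3)                  # t_in =_{1+|X0|+k2} x3
```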
Synthesis Framework To check whether A covers the input domain and obtain an uncovered
input in if not, we encode the abstract examples in A as formulas. We then check whether one
of the concrete examples from E does not satisfy any of these formulas. If so, it is taken as in.
Otherwise, we check whether there is another input that does not satisfy the formulas, and if so
it is taken as in; otherwise the input domain is covered.
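The coverage check above can be read as follows. This Python sketch uses brute force over 8-bit inputs and hypothetical predicates standing in for the abstract examples' domains, whereas the implementation encodes both as formulas for an SMT-solver:

```python
# Brute-force reading of the coverage check. The two predicates in A are
# hypothetical abstract-example domains, used only for illustration.
A = [
    lambda x: x % 2 == 1,         # domain of one abstract example: odd inputs
    lambda x: x == 0b11111110,    # domain of another abstract example
]
E = [(0b10101, 0b10111)]          # concrete examples seen so far

def uncovered_input(A, E, domain=range(256)):
    # First try the concrete examples from E, then the rest of the domain.
    for x in [i for i, _ in E] + list(domain):
        if not any(a(x) for a in A):
            return x              # an uncovered input
    return None                   # the input domain is covered

assert uncovered_input(A, E) == 0                            # 0 is even: uncovered
assert uncovered_input(A + [lambda x: x % 2 == 0], E) is None
```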
Synthesizer Our synthesizer is a naïve one that enumerates the program space by considering
programs of increasing size and returning the first program consistent with the examples.
Technically, we check consistency by submitting the formula P(in) = out to an SMT-solver
for every (in, out) ∈ E. Likewise, P is checked for consistency with the abstract examples
by encoding them as formulas and testing whether they imply the formula of P. More
sophisticated PBE synthesizers, such as Flash Fill, can in many cases be extended to handle
abstract examples in a straightforward manner.
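A toy version of such a naïve enumerative synthesizer for the bit vector domain can be sketched in Python, over a small subset of the operators and with direct evaluation in place of the SMT-based consistency check (both simplifications are ours):

```python
from itertools import count, product

# Enumerate programs by size; return the first one consistent with E.
MASK = 0xFF  # 8-bit domain

def programs(size):
    """All programs of exactly the given size, as (description, function)."""
    if size == 1:
        yield "x", lambda x: x                    # the identity
        for c in (0, 1, 2):                       # a few constants
            yield str(c), (lambda c: lambda x: c)(c)
        return
    for left in range(1, size - 1):               # the operator counts as size 1
        for (d1, p1), (d2, p2) in product(programs(left),
                                          programs(size - 1 - left)):
            yield f"OR({d1},{d2})", (lambda a, b: lambda x: a(x) | b(x))(p1, p2)
            yield f"({d1}+{d2})", (lambda a, b: lambda x: (a(x) + b(x)) & MASK)(p1, p2)

def synthesize(E):
    for size in count(1):
        for desc, p in programs(size):
            if all(p(i) == o for i, o in E):
                return desc, p

# The examples accumulated in Table 6.1 suffice to pin down a program
# equivalent to OR(x + 1, x):
E = [(0b10101, 0b10111), (0b1, 0b11), (0b0, 0b1), (0b11, 0b111)]
desc, p = synthesize(E)
assert all(p(i) == o for i, o in E)
assert p(0b1100) == 0b1101            # flips the rightmost 0 bit
```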
6.5.2 Synthesis Framework Evaluation
In this section, we evaluate our synthesis framework on the bit vector domain. We consider
three experimental questions: (1) Do abstract examples reduce the number of concrete examples
required from the user? (2) Do abstract examples enable better pruning for the synthesizer?
(3) How many abstract examples are presented to the user before he rejects a program? To
answer these questions, we compare our synthesis framework (denoted AE) to a baseline
that implements the current popular alternative ([SL08]), which guarantees that a synthesized
program is correct. The baseline acts as follows. It looks for the first program that is consistent
with the provided examples and then asks the oracle whether this program is correct. The oracle
checks whether there is an input for which the synthesized program and the target program return
different outputs. If so, the oracle provides this input and its correct output to the synthesizer,
which in turn looks for a new program. If there is no such input, the oracle reports success,
and the synthesis completes. We assume a knowledgeable user (oracle), implemented by an
SMT-solver, which is oblivious to whether the program is easy for a human to understand,
making the comparison especially challenging.
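The baseline loop can be sketched as follows, with brute force over the 8-bit domain standing in for the SMT-based oracle and a small fixed pool of candidates standing in for the enumerative synthesizer (both simplifications are ours):

```python
# Counterexample-guided baseline: propose the first consistent program,
# ask the oracle for a counterexample, repeat until none exists.
MASK = 0xFF
candidates = [
    ("x",           lambda x: x),
    ("OR(x, 2)",    lambda x: x | 2),
    ("OR(x+1, 1)",  lambda x: ((x + 1) & MASK) | 1),
    ("OR(x+1, x)",  lambda x: ((x + 1) & MASK) | x),
]
target = candidates[-1][1]                   # Ptar(x) = OR(x + 1, x)

E = []
while True:
    # First candidate consistent with the examples gathered so far.
    desc, p = next((d, q) for d, q in candidates
                   if all(q(i) == o for i, o in E))
    cex = next((x for x in range(256) if p(x) != target(x)), None)
    if cex is None:                          # oracle reports success
        break
    E.append((cex, target(cex)))             # oracle returns a counterexample

assert desc == "OR(x+1, x)"
```

Note that the baseline interacts only through concrete counterexamples; unlike AE, it never shows the user a description of the candidate's behavior on multiple inputs at once.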
Benchmarks We consider three benchmarks, B(4), B(6), and B(8), each consisting of 50
programs. A program is in B(n) if the baseline required at least n examples to find it. To find such
programs, we randomly select programs of size 4; for each, we execute the baseline (to find it), and
if it required at least n examples, we add it to B(n) and execute our synthesis framework (AE)
to find the same (or an equivalent) program.
Consistency of Examples The convergence of these algorithms is highly dependent on the
examples the oracle provides. To guarantee a fair comparison, we make sure that the same
                                              B(4)             B(6)             B(8)
                                          AE    baseline   AE    baseline   AE    baseline
#Concrete examples (candidate programs)   4.42  5.64       5.50  7.68       6.62  10.26
Spec-final                                11.04            9.36             13.22
#AE-intermediate                          1.98             2.00             3.23
%Better than baseline                     68%              76%              96%
%Equal to baseline                        30%              22%              2%
%Worse than baseline                      2%               2%               2%

Table 6.2: Experimental results on the bit vector domain.
Figure 6.2: Detailed results for B(8).
examples are presented to both algorithms whenever possible. To this end, we use a cache
that stores the examples observed by the baseline. When our algorithm asks the oracle for an
example, it first looks for an example in the cache. Only if none meets its requirements does it
ask (an SMT-solver) for a new concrete example.
Results Table 6.2 summarizes the results. It reports the following:
• #Concrete examples: the average number of concrete examples the oracle provided, which
is also the number of candidate programs.
• Spec-final: the average size of the final abstract example specification (after removing
implied abstract examples).
• #AE-intermediate: the average number of abstract examples shown to the user before he
rejected the corresponding candidate program.
• %Better/equal/worse than baseline: the percentage of all programs in the benchmark
that required fewer/the same number of/more (concrete) examples than the baseline.
We observed that the time to generate a single abstract example is a few seconds (≈ 6 seconds).
Results indicate that our synthesis framework (AE), which prunes the program space based
on the abstract examples, improves on the baseline in terms of the number of examples the user
needs to provide. This becomes more significant as the number of required examples increases:
AE improves on the baseline by 22% on B(4), by 30% on B(6), and by 37% on B(8). Moreover, in
each benchmark AE performed worse than the baseline only in a single case – and the common
case was that it performed better (in B(8), AE performed better on all cases except two).
Fig. 6.2 provides detailed evidence of the improvement: it shows for each experiment (the
x-axis) the number of concrete examples each algorithm required (the y-axis). The figure
illustrates that the improvement can be significant. For example, in the 47th experiment, AE
reduced the number of examples from 17 to 7.
The String Program                                                       #Abstract Examples
Concatenates the string "Dear" to the last name.                                   1
Concatenates the first letter of the first name to the last name.                  1
Concatenates the first letter of the first name to the last name and to
"@lockhart-gardner.com".                                                           1
Generates the message presented in the motivating example.                         2
Concatenates the first two characters of the first name to the third and
fourth characters of the last name and to the second digit of the
meeting time.                                                                      6.57

Table 6.3: Experimental results on the string domain.
The number of concrete examples is also the number of candidate programs generated by the
synthesizer. Thus, the lower number of examples indicates that the abstract examples improve
the pruning of the program space. Namely, abstract examples help the overall synthesis to
converge faster to the target program.
6.5.3 Abstract Example Specification Evaluation
In this section, we evaluate our generalization algorithm, L-SEP, in the string domain and check
how well it succeeds in learning small specifications. To this end, we fix a program and a
concrete example to start with and run L-SEP. We repeat this with uncovered inputs until the
set of abstract examples covers the string domain. We then check how many abstract examples
were computed.
The programs we considered are related to the motivating example. For each program, we
run five experiments. Each experiment uses a different Excel row (lawyer) as the first concrete
example. We note that our implementation assumes that the names and meeting times are
non-empty strings and are space-delimited. Table 6.3 reports the programs and the average
number of abstract examples. Results indicate that the average number of abstract examples
required to describe the entire string domain is low.
6.6 Related Work
In this section, we survey the work closely related to ours.
Learning Specifications Learning regular languages from examples has been extensively studied
in computational learning theory, under different models: (i) identification in the limit
(Gold [Gol67]), (ii) query learning (Angluin [Ang88]), and (iii) PAC learning (Valiant [Val84]).
Our setting is closest to Angluin’s setting, which defines a teacher-student model and two types
of queries: membership (concrete examples) and equivalence (validation). The literature has
many results for this setting, including learning automata, context-free grammars, and regular
expressions (see [Sak97]). In the context of learning regular expressions, current algorithms
impose restrictions on the target regular expression. For example, [BC94] allows at most one
union operator, [Kin10] disallows unions and allows loops only up to depth 2, [Fer09] assumes that
input samples are finite and Kleene stars are not nested, and [BNST06] assumes that expressions
consist of chains that have at most one occurrence of every symbol. In contrast, we learn an
extended form of regular expressions but we also impose some restrictions. In the context of
learning specifications, [TLHL11] learns specifications for programs in the form of logical
formulas, which are not intuitive for most users. Symbolic transducers [VHL+12, BB13]
describe input-output specifications, but these are more natural for describing functions over
streams than for describing input manipulations.
Least General Generalization L-SEP takes the approach of least general generalization to
compute an abstract example. The approach of least general generalization was first introduced
by Plotkin [Plo70], who pioneered inductive logic programming and showed how to generalize
formulas. This approach was later used to synthesize programs from examples in a PBE
setting [MF90, RGMF14]. In contrast, we use this approach not to learn the low-level program,
but the high-level specification in the form of abstract examples.
Pre/Post- Condition Inference Learning specifications is related to finding the weakest pre-
conditions, strongest post-conditions, and inductive invariants [Dij75, GT07, Riv05, CCL11,
CCFL13, GLMN14]. Current inference approaches are mostly for program analysis and aim to
learn the conditions under which a bad behavior cannot occur. Our goal is different: we learn
the (good and bad) behaviors of the program and present them through a high-level language.
Applications of Regular Expressions There are many applications of regular expressions, for
example in data filtering (e.g., [WGS16]), learning XML file schemes (DTD) (e.g., [Fer09,
BNST06]), and program boosting (e.g., [CDL+15]). All of these learn expressions that are
consistent with the provided examples and have no guarantee on the target expression. In
contrast, we learn expressions that precisely capture program specifications.
6.7 Conclusion
We presented a novel synthesizer that interacts with the user via abstract examples and is
guaranteed to return a program that is correct on all inputs. The main idea is to use abstract
examples to describe a program behavior on multiple concrete inputs. To that end, we showed
L-SEP, an algorithm that generates maximal abstract examples. L-SEP enables our synthesizer
to describe candidate programs’ behavior through abstract examples. We implemented our
synthesizer and experimentally showed that it required few abstract examples to reject false
candidates and reduced the overall number of concrete examples required.
Chapter 7
Conclusion
In this thesis, we studied the problem of exact programming by example. In programming by
example, a user provides a set of input-output examples, and a synthesizer generates a program
consistent with these examples. The premise of programming by example is that these examples
capture the user’s intent, and thus the synthesizer will return a program that captures it even on
unseen inputs. Unfortunately, this is typically not the case and examples often under-specify the
user’s intent, especially when they are few and the input domain is large or infinite. Previous
approaches in programming by example either assumed that the user can inspect the final
program (directly or by looking at its outputs on new inputs) and provide more examples if
the outcome is incorrect, or they exhaustively presented membership queries to the user until
converging to a single program without providing bounds on the number of queries.
In this research, we formalized the problem of learning the user’s intent from examples as
an instance of exact learning. We captured user intent as a formula over arbitrary predicates
and limited the student’s (i.e., the synthesizer’s) queries to membership queries. We began by
studying a novel domain for program synthesis – patterns in time-series charts. We formalized
patterns as conjunctions over variable inequalities and showed an exact learning algorithm that
learns the pattern from charts. We then generalized this algorithm to algorithms that learn the
class of conjunctions and disjunctions over arbitrary predicates. The crux of these algorithms
is to identify non-equivalent formulas, which is crucial for reducing the search space size and
lowering the number of queries posed. Finally, we turned to the most general class: DNF
formulas over arbitrary predicates. We showed algorithms to learn this class and further studied
two important sub-classes: DNF formulas over predicates that are closed under negation, and
DNF formulas over predicates that are anti-closed under negation. Since any formula has a
representation as a DNF formula over the same predicates, this implies that any user intent can
be learned from examples with algorithms that minimize the number of membership queries
posed. In the final chapter, we investigated a different approach to guarantee exactness while
interacting through abstract examples. Abstract examples provide a middle ground between
membership queries and validation queries: they provide the same guarantee as validation
queries, while enjoying the simplicity of examples. We demonstrated how synthesizers can
benefit from abstract examples, both in communicating a candidate program’s specification
in an intuitive language and in pruning the program space quickly to speed the convergence
towards the target program.
The novelty of our research can be summarized as follows. Previous works in programming
by example did not learn the exact user intent or did not provide a bound on the number of
queries presented. Previous works in exact learning have focused on specific types of predicates,
and thus could not capture any user intent. Thus, our work is a contribution to both program
synthesis and exact learning, and a demonstration of their tight connection. We hope this
research will inspire others to pursue this exciting field of exact programming by example. Some
interesting directions for future study are:
• Improving query complexity: Some of the algorithms presented are not optimal, which
leaves room for improved algorithms with better query complexity.
• Improving query complexity for special predicate sets: We presented general algorithms
to learn formulas over arbitrary predicate sets. As Chapter 3 demonstrates, fixing a
predicate set may yield algorithms with better query complexity. As many program
synthesis works focus on specific domains, their authors may design domain-specific learning
algorithms with better query complexity than our general-purpose algorithms.
• Learning the important predicates: An inherent assumption of our algorithms is that a
predicate set is provided. This raises the question of how to obtain the predicates. Fixing
a domain can significantly help in this task; however, there is a tradeoff between the
expressiveness of the predicate set (i.e., its ability to separate elements of the input domain) and
the size of the predicate set, which is an important factor in the number of queries posed.
• Developing abstract examples: Finally, we have shown that abstract examples can serve
as a middle ground between validation queries and membership queries. However, the
effectiveness of abstract examples is highly dependent on their representation. We have
shown one representation that is suitable to describe string-manipulation programs. An
interesting future direction is to find succinct representations of concrete examples in
other domains.
Bibliography
[ABC+12] Vicente Acuña, Étienne Birmelé, Ludovic Cottret, Pierluigi Crescenzi,
Fabien Jourdan, Vincent Lacroix, Alberto Marchetti-Spaccamela, Andrea
Marino, Paulo Vieira Milreu, Marie-France Sagot, and Leen Stougie.
Telling stories: Enumerating maximal directed acyclic graphs with a
constrained set of sources and targets. Theoretical Computer Science,
457:1–9, 2012.
[ABJ+13] Rajeev Alur, Rastislav Bodík, Garvit Juniwal, Milo M. K. Martin,
Mukund Raghothaman, Sanjit A. Seshia, Rishabh Singh, Armando Solar-
Lezama, Emina Torlak, and Abhishek Udupa. Syntax-guided synthesis.
In Formal Methods in Computer-Aided Design, FMCAD 2013, Portland,
OR, USA, October 20-23, 2013, pages 1–8, 2013.
[ABK+02] Noga Alon, Richard Beigel, Simon Kasif, Steven Rudich, and Benny
Sudakov. Learning a hidden matching. In Proceedings of the 43rd
Symposium on Foundations of Computer Science, FOCS ’02, pages 197–
206, Washington, DC, USA, 2002. IEEE Computer Society.
[AC08] Dana Angluin and Jiang Chen. Learning a hidden graph using O(log n)
queries per edge. Journal of Computer and System Sciences, 74(4):546–556,
2008. Carl Smith Memorial Issue.
[ACK01] Saswat Anand, Wei-Ngan Chin, and Siau-Cheng Khoo. Charting patterns
on price history. In Proceedings of the Sixth ACM SIGPLAN International
Conference on Functional Programming (ICFP ’01), Firenze (Florence),
Italy, September 3-5, 2001, pages 134–145, 2001.
[AFSS16] Rajeev Alur, Dana Fisman, Rishabh Singh, and Armando Solar-Lezama.
SyGuS-Comp 2016: Results and analysis. In Proceedings Fifth Workshop
on Synthesis, SYNT@CAV 2016, Toronto, Canada, July 17-18, 2016,
pages 178–202, 2016.
[AGK13] Aws Albarghouthi, Sumit Gulwani, and Zachary Kincaid. Recursive
program synthesis. In Computer Aided Verification - 25th International
Conference, CAV 2013, Saint Petersburg, Russia, July 13-19, 2013.
Proceedings, pages 934–950, 2013.
[Ang88] Dana Angluin. Queries and concept learning. Machine Learning,
2(4):319–342, 1988.
[BB13] Matko Botinčan and Domagoj Babić. Sigma*: Symbolic learning of
input-output specifications. In Proceedings of the 40th Annual ACM
SIGPLAN-SIGACT Symposium on Principles of Programming Languages,
POPL '13, pages 443–456, New York, NY, USA, 2013. ACM.
[BC94] Alvis Brazma and Karlis Cerans. Efficient learning of regular expressions
from good examples. In 5th International Workshop on Algorithmic
Learning Theory, ALT ’94, Reinhardsbrunn Castle, Germany, October
10-15, 1994, Proceedings, pages 76–90, 1994.
[BCD+13] Mike Barnett, Badrish Chandramouli, Robert DeLine, Steven Drucker,
Danyel Fisher, Jonathan Goldstein, Patrick Morrison, and John Platt.
Stat!:an interactive analytics environment for big data. In Proceedings of
the ACM SIGMOD International Conference on Management of Data,
SIGMOD 2013, New York, NY, USA, June 22-27, 2013, pages 1013–1016,
2013.
[BCL+13] Michele Borassi, Pierluigi Crescenzi, Vincent Lacroix, Andrea Marino,
Marie-France Sagot, and Paulo Vieira Milreu. Telling stories fast. In
Experimental Algorithms: 12th International Symposium, SEA 2013,
Rome, Italy, June 5-7, 2013. Proceedings, pages 200–211, 2013.
[BDG+07] Lars Brenna, Alan Demers, Johannes Gehrke, Mingsheng Hong, Joel
Ossher, Biswanath Panda, Mirek Riedewald, Mohit Thatte, and Walker
White. Cayuga: A high-performance event processing engine. In
Proceedings of the ACM SIGMOD International Conference on Management of
Data, Beijing, China, June 12-14, 2007, pages 1100–1102, 2007.
[BG07] E. Biglieri and L. Györfi. Multiple Access Channels: Theory and Practice.
IOS Press, Amsterdam, The Netherlands, 2007.
[BGHZ15] Daniel W. Barowy, Sumit Gulwani, Ted Hart, and Benjamin Zorn.
Flashrelate: Extracting relational data from semi-structured spreadsheets
using examples. In Proceedings of the 36th ACM SIGPLAN Conference
on Programming Language Design and Implementation, Portland, OR,
USA, June 15-17, 2015, pages 218–228, 2015.
[BGV05] Annalisa De Bonis, Leszek Gąsieniec, and Ugo Vaccaro. Optimal
two-stage algorithms for group testing problems. SIAM Journal on Computing,
34(5):1253–1270, 2005.
[Bie78] Alan W. Biermann. The inference of regular lisp programs from examples.
IEEE Transactions on Systems, Man, and Cybernetics, 8(8):585–600,
1978.
[BNST06] Geert Jan Bex, Frank Neven, Thomas Schwentick, and Karl Tuyls.
Inference of concise DTDs from XML data. In Proceedings of the 32nd
International Conference on Very Large Data Bases, Seoul, Korea, Sep-
tember 12-15, 2006, pages 115–126, 2006.
[BTGC16] James Bornholt, Emina Torlak, Dan Grossman, and Luis Ceze.
Optimizing synthesis with metasketches. In Proceedings of the 43rd Annual
ACM SIGPLAN-SIGACT Symposium on Principles of Programming Lan-
guages, POPL 2016, St. Petersburg, FL, USA, January 20 - 22, 2016,
pages 775–788, 2016.
[Bul05] Thomas N. Bulkowski. Encyclopedia of Chart Patterns. Wiley, 2nd
edition, 2005.
[Bul12] T. N. Bulkowski. Visual Guide to Chart Patterns. Bloomberg Financial,
2012.
[CCFL13] Patrick Cousot, Radhia Cousot, Manuel Fähndrich, and Francesco
Logozzo. Automatic inference of necessary preconditions. In Verification,
Model Checking, and Abstract Interpretation, 14th International
Conference, VMCAI 2013, Rome, Italy, January 20-22, 2013. Proceedings,
pages 128–148, 2013.
[CCL11] Patrick Cousot, Radhia Cousot, and Francesco Logozzo. Precondition
inference from intermittent assertions and application to contracts on
collections. In Verification, Model Checking, and Abstract Interpretation
- 12th International Conference, VMCAI 2011, Austin, TX, USA, January
23-25, 2011. Proceedings, pages 150–168, 2011.
[CDL+15] Robert A. Cochran, Loris D'Antoni, Benjamin Livshits, David
Molnar, and Margus Veanes. Program boosting: Program synthesis via
crowd-sourcing. In Proceedings of the 42nd Annual ACM SIGPLAN-
SIGACT Symposium on Principles of Programming Languages, POPL
2015, Mumbai, India, January 15-17, 2015, pages 677–688, 2015.
[CF07] Mooi Choo Chuah and Fen Fu. ECG Anomaly Detection via Time Series
Analysis, pages 123–135. Springer Berlin Heidelberg, 2007.
[CGM10] Badrish Chandramouli, Jonathan Goldstein, and David Maier. High-
performance dynamic pattern matching over disordered streams. PVLDB,
3(1):220–231, 2010.
[Cic13] Ferdinando Cicalese. Group testing. In Fault-Tolerant Search Algorithms,
pages 139–173. Springer, 2013.
[CKSL15] Alvin Cheung, Shoaib Kamil, and Armando Solar-Lezama. Bridging
the gap between general-purpose and domain-specific compilers with
synthesis. In 1st Summit on Advances in Programming Languages,
SNAPL 2015, May 3-6, 2015, Asilomar, California, USA, pages 51–62,
2015.
[CSLM13] Alvin Cheung, Armando Solar-Lezama, and Samuel Madden. Optimizing
database-backed applications with query synthesis. In ACM SIGPLAN
Conference on Programming Language Design and Implementation,
PLDI ’13, Seattle, WA, USA, June 16-19, 2013, pages 3–14, 2013.
[CSRL01] Thomas H. Cormen, Clifford Stein, Ronald L. Rivest, and Charles E.
Leiserson. Introduction to Algorithms. McGraw-Hill Higher Education,
2nd edition, 2001.
[DH00] D. Du and F. Hwang. Combinatorial Group Testing and Its Applications.
Applied Mathematics. World Scientific, 2000.
[DH06] D. Du and F. Hwang. Pooling Designs and Nonadaptive Group Testing:
Important Tools for DNA Sequencing. Series on applied mathematics.
World Scientific, 2006.
[Dij75] Edsger W. Dijkstra. Guarded commands, nondeterminacy and formal
derivation of programs. Commun. ACM, 18(8), 1975.
[DMB08] Leonardo de Moura and Nikolaj Bjørner. Z3: An efficient SMT solver.
In Tools and Algorithms for the Construction and Analysis of Systems,
14th International Conference, TACAS 2008, Held as Part of the Joint
European Conferences on Theory and Practice of Software, ETAPS 2008,
Budapest, Hungary, March 29-April 6, 2008. Proceedings, pages 337–
340, 2008.
[Dor43] Robert Dorfman. The detection of defective members of large
populations. The Annals of Mathematical Statistics, 14(4):436–440, 1943.
[DP60] Martin Davis and Hilary Putnam. A computing procedure for
quantification theory. J. ACM, 7(3):201–215, July 1960.
[DSPGMW10] Anish Das Sarma, Aditya Parameswaran, Hector Garcia-Molina, and
Jennifer Widom. Synthesizing view definitions from data. In
Database Theory - ICDT 2010, 13th International Conference, Lausanne,
Switzerland, March 23-25, 2010, Proceedings, pages 89–103, 2010.
[FCD15] John K. Feser, Swarat Chaudhuri, and Isil Dillig. Synthesizing data
structure transformations from input-output examples. In Proceedings of
the 36th ACM SIGPLAN Conference on Programming Language Design
and Implementation, Portland, OR, USA, June 15-17, 2015, pages 229–
239, 2015.
[Fer09] Henning Fernau. Algorithms for learning regular expressions from
positive data. Inf. Comput., 2009.
[GHS12] Sumit Gulwani, William R. Harris, and Rishabh Singh. Spreadsheet data
manipulation using examples. Commun. ACM, 55(8):97–105, 2012.
[GK95] S.A. Goldman and M.J. Kearns. On the complexity of teaching. J.
Comput. Syst. Sci., 50(1):20–31, February 1995.
[GK98] Vladimir Grebinski and Gregory Kucherov. Reconstructing a Hamiltonian
cycle by querying the graph: Application to DNA physical mapping.
Discrete Appl. Math., 88(1-3):147–165, November 1998.
[GLMN14] Pranav Garg, Christof Löding, P. Madhusudan, and Daniel Neider. ICE:
A robust framework for learning invariants. In Computer Aided
Verification - 26th International Conference, CAV 2014, Held as Part of the
Vienna Summer of Logic, VSL 2014, Vienna, Austria, July 18-22, 2014.
Proceedings, pages 69–87, 2014.
[Gol67] E. Mark Gold. Language identification in the limit. Information and
Control, 10(5):447–474, 1967.
[Gre69] Cordell Green. Application of theorem proving to problem solving.
In Proceedings of the 1st International Joint Conference on Artificial
Intelligence, IJCAI’69, pages 219–239, 1969.
[GT07] Sumit Gulwani and Ashish Tiwari. Computing procedure summaries for
interprocedural analysis. In Programming Languages and Systems, 16th
European Symposium on Programming, ESOP 2007, Held as Part of the
Joint European Conferences on Theory and Practice of Software, ETAPS
2007, Braga, Portugal, March 24 - April 1, 2007, Proceedings, pages
253–267, 2007.
[Gul10] Sumit Gulwani. Dimensions in program synthesis. In Proceedings of
the 12th International ACM SIGPLAN Conference on Principles and
Practice of Declarative Programming, July 26-28, 2010, Hagenberg,
Austria, pages 13–24, 2010.
[Gul11] Sumit Gulwani. Automating string processing in spreadsheets using
input-output examples. In Proceedings of the 38th ACM SIGPLAN-
SIGACT Symposium on Principles of Programming Languages, POPL
2011, Austin, TX, USA, January 26-28, 2011, pages 317–330, 2011.
[HAG+13] M. Hirzel, H. Andrade, B. Gedik, G. Jacques-Silva, R. Khandekar,
V. Kumar, M. Mendell, H. Nasgaard, S. Schneider, R. Soulé, and K.-L. Wu.
IBM streams processing language: Analyzing big data in motion. IBM J.
Res. Dev., 57(3-4), 2013.
[Har74] Steven Hardy. Automatic induction of lisp functions. In Proceedings of
the 1st Summer Conference on Artificial Intelligence and Simulation of
Behaviour, AISB’74, pages 50–62, 1974.
[HG11] William R. Harris and Sumit Gulwani. Spreadsheet table transformations
from examples. In Proceedings of the 32nd ACM SIGPLAN Conference
on Programming Language Design and Implementation, PLDI 2011, San
Jose, CA, USA, June 4-8, 2011, pages 317–328, 2011.
[IGIS10] Shachar Itzhaky, Sumit Gulwani, Neil Immerman, and Mooly Sagiv. A
simple inductive synthesis methodology and its applications. In Proceedings
of the 25th Annual ACM SIGPLAN Conference on Object-Oriented
Programming, Systems, Languages, and Applications, OOPSLA 2010,
October 17-21, 2010, Reno/Tahoe, Nevada, USA, pages 36–46, 2010.
[Inv] Investopedia. http://www.investopedia.com/university/technical/techanalysis8.asp.
[JGST10] Susmit Jha, Sumit Gulwani, Sanjit A. Seshia, and Ashish Tiwari. Oracle-
guided component-based program synthesis. In Proceedings of the 32nd
ACM/IEEE International Conference on Software Engineering - Volume
1, ICSE 2010, Cape Town, South Africa, 1-8 May 2010, pages 215–224,
2010.
[JNR02] Rajeev Joshi, Greg Nelson, and Keith Randall. Denali: A goal-directed
superoptimizer. In Proceedings of the 2002 ACM SIGPLAN Conference
on Programming Language Design and Implementation (PLDI), Berlin,
Germany, June 17-19, 2002, pages 304–314, 2002.
[Kin10] Efim B. Kinber. Learning regular expressions from representative examples
and membership queries. In Grammatical Inference: Theoretical
Results and Applications, 10th International Colloquium, ICGI 2010,
Valencia, Spain, September 13-16, 2010. Proceedings, pages 94–108,
2010.
[Knu97] Donald E. Knuth. The Art of Computer Programming, Volume 1 (3rd
Ed.): Fundamental Algorithms. Addison Wesley Longman Publishing
Co., Inc., Redwood City, CA, USA, 1997.
[Kol32] A. Kolmogoroff. Zur Deutung der intuitionistischen Logik. Mathematische
Zeitschrift, 35(1):58–65, 1932.
[LG14] Vu Le and Sumit Gulwani. Flashextract: A framework for data extraction
by examples. In ACM SIGPLAN Conference on Programming Language
Design and Implementation, PLDI ’14, Edinburgh, United Kingdom -
June 09 - 11, 2014, pages 542–553, 2014.
[LGS13] Vu Le, Sumit Gulwani, and Zhendong Su. Smartsynth: synthesizing
smartphone automation scripts from natural language. In The 11th Annual
International Conference on Mobile Systems, Applications, and Services,
MobiSys’13, Taipei, Taiwan, June 25-28, 2013, pages 193–206, 2013.
[LMW00] Andrew W. Lo, Harry Mamaysky, and Jiang Wang. Foundations of
technical analysis: Computational algorithms, statistical inference, and
empirical implementation. The Journal of Finance, 55(4):1705–1765,
2000.
[LWDW03] Tessa A. Lau, Steven A. Wolfman, Pedro Domingos, and Daniel S. Weld.
Programming by demonstration using version space algebra. Machine
Learning, 53(1-2):111–156, 2003.
[MEMlT+10] A. Morales-Esteban, F. Martínez-Álvarez, A. Troncoso, J.L. Justo, and
C. Rubio-Escudero. Pattern recognition to forecast seismic time series.
Expert Systems with Applications, 37(12):8333 – 8342, 2010.
[MF90] S. Muggleton and C. Feng. Efficient induction of logic programs. In
First Conference on Algorithmic Learning Theory, pages 368–381, 1990.
[MTG+13] Aditya Krishna Menon, Omer Tamuz, Sumit Gulwani, Butler W. Lampson,
and Adam Kalai. A machine learning framework for programming
by example. In Proceedings of the 30th International Conference on
Machine Learning, ICML 2013, Atlanta, GA, USA, 16-21 June 2013,
pages 187–195, 2013.
[MW71] Zohar Manna and Richard J. Waldinger. Toward automatic program
synthesis. Commun. ACM, 14(3):151–165, March 1971.
[MW75] Zohar Manna and Richard Waldinger. Knowledge and reasoning in
program synthesis. Artificial Intelligence, 6(2):175 – 208, 1975.
[MW79] Z. Manna and R. Waldinger. Synthesis: Dreams => programs. IEEE
Trans. Softw. Eng., 5(4):294–328, July 1979.
[MW80] Zohar Manna and Richard Waldinger. A deductive approach to program
synthesis. ACM Trans. Program. Lang. Syst., 2(1):90–121, January 1980.
[ND00] Hung Q Ngo and Ding-Zhu Du. A survey on combinatorial group testing
algorithms with applications to DNA library screening. DIMACS Series
in Discrete Mathematics and Theoretical Computer Science, 2000.
[Pel02] Andrzej Pelc. Searching games with errors—fifty years of coping with
liars. Theor. Comput. Sci., 270(1-2):71–109, January 2002.
[PG15] Oleksandr Polozov and Sumit Gulwani. Flashmeta: A framework for
inductive program synthesis. In Proceedings of the 2015 ACM SIGPLAN
International Conference on Object-Oriented Programming, Systems,
Languages, and Applications, OOPSLA 2015, part of SPLASH 2015,
Pittsburgh, PA, USA, October 25-30, 2015, pages 107–126, 2015.
[PJS+14] Phitchaya Mangpo Phothilimthana, Tikhon Jelvis, Rohin Shah, Nishant
Totla, Sarah Chasins, and Rastislav Bodik. Chlorophyll: Synthesis-
aided compiler for low-power spatial architectures. In ACM SIGPLAN
Conference on Programming Language Design and Implementation,
PLDI ’14, Edinburgh, United Kingdom - June 09 - 11, 2014, pages
396–407, 2014.
[Plo70] G. D. Plotkin. A note on inductive generalization. Machine Intelligence,
5, 1970.
[RBVK16] Veselin Raychev, Pavol Bielik, Martin Vechev, and Andreas Krause.
Learning programs from noisy data. In Proceedings of the 43rd Annual
ACM SIGPLAN-SIGACT Symposium on Principles of Programming Lan-
guages, POPL 2016, St. Petersburg, FL, USA, January 20 - 22, 2016,
pages 761–774, 2016.
[RGMF14] Mohammad Raza, Sumit Gulwani, and Natasa Milic-Frayling. Program-
ming by example using least general generalizations. In Proceedings of
the Twenty-Eighth AAAI Conference on Artificial Intelligence, July 27
-31, 2014, Quebec City, Quebec, Canada., pages 283–290, 2014.
[Riv05] Xavier Rival. Understanding the origin of alarms in Astrée. In Static
Analysis, 12th International Symposium, SAS 2005, London, UK, September
7-9, 2005, Proceedings, pages 303–319, 2005.
[Sak97] Yasubumi Sakakibara. Recent advances of grammatical inference. Theo-
retical Computer Science, 185(1):15 – 45, 1997.
[SG12] Rishabh Singh and Sumit Gulwani. Learning semantic string transforma-
tions from examples. PVLDB, 5(8):740–751, 2012.
[SG16] Rishabh Singh and Sumit Gulwani. Transforming spreadsheet data types
using examples. In Proceedings of the 43rd Annual ACM SIGPLAN-
SIGACT Symposium on Principles of Programming Languages, POPL
2016, St. Petersburg, FL, USA, January 20 - 22, 2016, pages 343–356,
2016.
[SL08] Armando Solar-Lezama. Program synthesis by sketching. ProQuest,
2008.
[SLJB08] Armando Solar-Lezama, Christopher Grant Jones, and Rastislav Bodik.
Sketching concurrent data structures. In Proceedings of the ACM
SIGPLAN 2008 Conference on Programming Language Design and
Implementation, Tucson, AZ, USA, June 7-13, 2008, pages 136–148, 2008.
[Smi75] David Canfield Smith. Pygmalion: A Creative Programming Environ-
ment. PhD thesis, Stanford, CA, USA, 1975. AAI7525608.
[SSA13] Eric Schkufza, Rahul Sharma, and Alex Aiken. Stochastic superoptimization.
In Architectural Support for Programming Languages and
Operating Systems, ASPLOS ’13, Houston, TX, USA - March 16 - 20,
2013, pages 305–316, 2013.
[SSG75] David E. Shaw, William R. Swartout, and C. Cordell Green. Inferring
lisp programs from examples. In Proceedings of the 4th International
Joint Conference on Artificial Intelligence - Volume 1, IJCAI’75, pages
260–267, 1975.
[SSL11] Rishabh Singh and Armando Solar-Lezama. Synthesizing data structure
manipulations from storyboards. In SIGSOFT/FSE'11 19th ACM
SIGSOFT Symposium on the Foundations of Software Engineering (FSE-19)
and ESEC’11: 13th European Software Engineering Conference (ESEC-
13), Szeged, Hungary, September 5-9, 2011, pages 289–299, 2011.
[Sum77] Phillip D. Summers. A methodology for lisp program construction from
examples. J. ACM, 24(1):161–175, January 1977.
[TLHL11] Stavros Tripakis, Ben Lickly, Thomas A. Henzinger, and Edward A. Lee.
A theory of synchronous relational interfaces. ACM Trans. Program.
Lang. Syst., 33(4), 2011.
[URD+13] Abhishek Udupa, Arun Raghavan, Jyotirmoy V. Deshmukh, Sela Mador-
Haim, Milo M.K. Martin, and Rajeev Alur. Transit: Specifying protocols
with concolic snippets. In ACM SIGPLAN Conference on Programming
Language Design and Implementation, PLDI ’13, Seattle, WA, USA, June
16-19, 2013, pages 287–296, 2013.
[Val84] L. G. Valiant. A theory of the learnable. Commun. ACM, Nov. 1984.
[VHL+12] Margus Veanes, Pieter Hooimeijer, Benjamin Livshits, David Molnar, and
Nikolaj Bjørner. Symbolic finite state transducers: Algorithms and
applications. In Proceedings of the 39th ACM SIGPLAN-SIGACT Symposium
on Principles of Programming Languages, POPL 2012, Philadelphia,
Pennsylvania, USA, January 22-28, 2012, pages 137–150, 2012.
[VTR+14] Mandana Vaziri, Olivier Tardieu, Rodric Rabbah, Philippe Suter, and
Martin Hirzel. Stream processing with a spreadsheet. In ECOOP 2014
- Object-Oriented Programming - 28th European Conference, Uppsala,
Sweden, July 28 - August 1, 2014. Proceedings, pages 360–384. 2014.
[Was16] Kunihiro Wasa. Enumeration of enumeration algorithms. CoRR,
abs/1605.05102, 2016.
[WDR06] Eugene Wu, Yanlei Diao, and Shariq Rizvi. High-performance complex
event processing over streams. In Proceedings of the ACM SIGMOD
International Conference on Management of Data, Chicago, Illinois,
USA, June 27-29, 2006, pages 407–418, 2006.
[WGS16] Xinyu Wang, Sumit Gulwani, and Rishabh Singh. FIDEX: filtering
spreadsheet data using examples. In Proceedings of the 2016 ACM
SIGPLAN International Conference on Object-Oriented Programming,
Systems, Languages, and Applications, OOPSLA 2016, part of SPLASH
2016, Amsterdam, The Netherlands, October 30 - November 4, 2016,
pages 195–213, 2016.
[WL69] Richard J. Waldinger and Richard C. T. Lee. Prow: A step toward
automatic program writing. In Proceedings of the 1st International Joint
Conference on Artificial Intelligence, IJCAI’69, pages 241–252, San
Francisco, CA, USA, 1969. Morgan Kaufmann Publishers Inc.
[YF] Yahoo!-Finance. finance.yahoo.com.
[YTM+13] Kuat Yessenov, Shubham Tulsiani, Aditya Krishna Menon, Robert C.
Miller, Sumit Gulwani, Butler W. Lampson, and Adam Kalai. A colorful
approach to text processing by example. In The 26th Annual ACM
Symposium on User Interface Software and Technology, UIST’13, St.
Andrews, United Kingdom, October 8-11, 2013, pages 495–504, 2013.
[ZS13] Sai Zhang and Yuyin Sun. Automatically synthesizing SQL queries
from input-output examples. In 2013 28th IEEE/ACM International
Conference on Automated Software Engineering, ASE 2013, Silicon
Valley, CA, USA, November 11-15, 2013, pages 224–234, 2013.
Abstract

Most computer users do not know how to program, and thus their use of computers is limited to the supply of existing programs. The abundance of programs intended to help with the same problem domain demonstrates that existing programs are either too complicated for users or do not answer their needs satisfactorily. Program synthesis techniques, and in particular programming by example, have flourished in recent years with the goal of preventing exactly these problems and allowing users to write programs of their own by describing their intent with examples, without writing or inspecting a single code snippet. The goal of program synthesis is to generate a program (in a low-level language) from a specification (in a high-level language). The specification is usually descriptive and does not explain how it should be implemented. Synthesizers therefore cannot syntactically translate specifications into executable code, as compilers do. The first synthesizers addressed this challenge through deduction and transformation techniques [MW71]. The advantage of such synthesizers is that, by their mode of operation, the generated programs are guaranteed to be correct, that is, to implement the specification. Their drawback is that their operation depends on inconclusive rules, and thus the process is not guaranteed to terminate. Although some approaches introduced heuristics to improve the choice of rules [JNR02], modern synthesizers have moved to constraint-solving techniques [SLJB08] or to enumeration techniques over the program space [SSA13]. In these approaches, the specification is a set of constraints, and every solution (program) that satisfies these constraints is considered a valid solution. That is, the assumption of these techniques is that the specification is complete: every solution that satisfies the specification is correct on every possible input of the program, even if the specification did not describe the desired behavior on it explicitly.

In parallel, the approach of programming by example (PBE) has gained popularity [Gul10, LWDW03, DSPGMW10, HG11, Gul11, GHS12, SG12, YTM+13, AGK13, ZS13, MTG+13, LG14, FCD15, BGHZ15, PG15, SG16, RBVK16]. In PBE, the specification is a set of input-output examples. Compared to other program synthesis techniques, in which the specification is a logical formula or an inefficient implementation of a program, PBE requires no prior knowledge of how to express the specification mathematically. Thus, whereas in the previous techniques the users had to be experts or programmers, in PBE the user can be anyone. That is, the potential impact of PBE is far greater. The assumption of this approach is that users can express their intent with a small number of examples. Unfortunately, this is not necessarily true, and examples, by their very definition, form a partial description of the user's intent. PBE algorithms can therefore only guarantee that they generate a program consistent with the examples the user provided; they cannot guarantee to capture the user's intent on inputs they have not observed. A user who wants to guarantee correctness on all possible inputs must inspect the program manually, a difficult and error-prone task.

In this thesis, we present algorithms that guarantee to learn the user's intent on all possible inputs while still allowing the user to communicate through examples. To this end, we formulate the problem of learning the user's intent from examples as a problem in the field of exact learning (hence the thesis title, Exact Programming by Example). Exact learning is a field of computational learning theory that is usually associated with one of the following three areas: learning in the limit [Gol67], PAC learning [Val84], and learning from queries [Ang88]. In this thesis, we follow the last model. In this model, there is a teacher and a student. The teacher knows a concept, and the student's goal is to learn that concept. To this end, the student may pose one of two kinds of queries: membership queries and equivalence queries. The student's secondary goal is to ask as few queries as possible. In our setting, the teacher is the user, the student is the synthesizer, and the concept is a formula expressing the user's intent on all possible inputs. A membership query asks whether an input-output pair matches the user's intent. An equivalence query (or validation query) asks whether a given formula describes the user's intent. If the teacher answers an equivalence query with 'yes', the learning is complete; otherwise, the teacher provides a counterexample. Although equivalence queries are infeasible in the PBE setting, there are synthesis works that do allow such queries [ABJ+13, IGIS10, SL08]. In these works, the teacher is realized by a verifier equipped with a formal specification (rather than by a user). The formal specification provides an efficient way to answer equivalence queries automatically.

Our formulation of PBE as an exact learning problem is novel. Except for a single work [JGST10], which poses membership queries until the space of consistent programs contains a single program, none of the PBE synthesizers guarantees to learn the user's intent beyond the examples it was given. While the approach of that single work guarantees to learn a program expressing the user's intent, it has no nontrivial bound on the number of membership queries it poses. In contrast, works in exact learning must demonstrate the efficiency of the algorithm by analyzing its membership query complexity and comparing it to a lower bound. The contributions of this thesis are:

1. An exact PBE synthesizer for time-series patterns: We begin by presenting an exact PBE synthesizer that learns patterns in charts displaying data values as a function of time (e.g., stock prices). We represent patterns as conjunctive formulas over inequalities and present a learning algorithm for this class that relies on membership queries only (which are translated into visual charts). We assume that the learning begins from a single positive example (i.e., a chart). We show how to extend this algorithm with a synthesizer that, given a formula describing a pattern, generates a program that detects this pattern in stock prices. We evaluated the algorithm empirically and showed that it learns a variety of popular patterns and detects them with 95% accuracy.

2. Exact learning algorithms for the classes of conjunctions and disjunctions over a set of predicates: The predicates may be dependent, and thus syntactically different formulas may be logically equivalent. The challenge is therefore to identify the inequivalent formulas in order to avoid posing redundant membership queries. Since checking whether formulas are logically equivalent is expensive, it is important to limit such checks. We present an algorithm that searches the space of inequivalent formulas lazily: it computes elements of the space only when this is required for the learning.

3. Exact learning algorithms for the class of disjunctive normal form (DNF) formulas over an arbitrary set of predicates: We begin with a general algorithm and then focus on two subclasses: (1) a class in which the predicates are closed under negation, and (2) a class in which the predicates are "anti-closed" under negation. For each subclass, we present an algorithm whose membership query complexity is better than that of the general algorithm. In particular, we present an optimal algorithm for the first subclass.

4. Synthesis from abstract examples: We present an exact algorithm that follows a different approach from the previous algorithms. Previous approaches separated the task of learning the user's intent from the task of generating the program (and studied mainly the first task). PBE experts often believe that the program space should drive the search for the program to generate. Therefore, as a last contribution (which we believe opens a new direction for future work), we present a different approach to guaranteeing exactness while searching the program space. The key idea is to communicate with the user through abstract examples: the synthesizer picks a program that seems suitable and describes it to the user with a small number of abstract examples. The abstract examples serve as an intuitive specification of the programs that the synthesizer considers during its search. Through the abstract examples, the user is guaranteed that the final program the synthesizer generates captures their intent on all possible inputs.
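The teacher-student query protocol described in the abstract can be illustrated with a minimal sketch. The code below shows exact learning of a monotone conjunction over boolean variables using membership queries only, starting from a single positive example, in the spirit of the setting the abstract describes; the `Teacher` class and all names here are illustrative assumptions, not an API from the thesis.

```python
# Illustrative sketch (not the thesis's actual algorithm or API): exact
# learning of a monotone conjunction over boolean variables, using one
# membership query per variable, starting from a single positive example.

class Teacher:
    """Knows a hidden conjunction, given as a set of variable indices."""
    def __init__(self, target_vars):
        self.target = frozenset(target_vars)

    def membership_query(self, assignment):
        """Does this assignment (tuple of bools) satisfy the hidden conjunction?"""
        return all(assignment[i] for i in self.target)


def learn_conjunction(teacher, positive_example):
    """Recover the hidden conjunction exactly, with membership queries only.

    For each variable that is True in the positive example, flip it to False
    and query: if the flipped example is still positive, the variable is
    irrelevant; if it becomes negative, the variable is in the conjunction.
    """
    learned = set()
    for i, value in enumerate(positive_example):
        if not value:
            continue  # a False variable cannot appear in a satisfied conjunction
        flipped = tuple(v if j != i else False
                        for j, v in enumerate(positive_example))
        if not teacher.membership_query(flipped):
            learned.add(i)
    return learned
```

For a hidden conjunction x0 AND x2 over three variables, `learn_conjunction(Teacher({0, 2}), (True, True, True))` recovers `{0, 2}` with exactly three membership queries, one per variable set in the positive example.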
This research was carried out under the supervision of Prof. Eran Yahav, in the Faculty of Computer Science.

Some results in this thesis have been published as articles by the author and research collaborators in conferences and journals during the course of the author's doctoral research period, the most up-to-date versions of which being:

Nader Bshouty, Dana Drachsler-Cohen, Martin T. Vechev, and Eran Yahav. Learning disjunctions of predicates. In Proceedings of the 30th Conference on Learning Theory, COLT 2017, 2017.
Dana Drachsler-Cohen, Sharon Shoham, and Eran Yahav. Synthesis with abstract examples. In Computer Aided Verification - 29th International Conference, CAV 2017, 2017.
Dana Drachsler-Cohen, Martin T. Vechev, and Eran Yahav. Optimal learning of specifications from examples (in preparation). CoRR, abs/1608.00089, 2016.

Acknowledgements

First and foremost, I would like to thank Prof. Eran Yahav, whom I have been very fortunate to have as my advisor. Thank you for the contagious enthusiasm that kept me optimistic throughout my studies. Thank you for many enriching discussions, especially during the long nights before deadlines. Thank you for teaching me how to write in a simple and elegant way, how to find and explain the essence of any complex idea, and how to shorten my sentences (though we may need to keep working on that...). Thank you for teaching me to always pursue the most interesting research questions and to overcome every challenge along the way. Above all, thank you for your endless faith in me. For that, and for much more, I will always be grateful.

I also thank the collaborators who contributed greatly to this thesis. To Prof. Martin Vechev, thank you for the long hours and for the discussions and advice, both near-term and long-term. To Prof. Nader Bshouty, thank you for the extensive help with the theoretical part of this thesis; I have learned a great deal from you. Finally, to Prof. Sharon Shoham, thank you for the long hours, and for teaching me to always look for ways to simplify ideas, algorithms, and proofs, and to identify subtle points and resolve them elegantly.

Finally, I thank my parents, Ilana and Gabriel, my sister, Dorin, and my dear husband, Gal. Thank you for your support during the busy periods, for always putting things in perspective, and above all for your love and faith in me. This thesis is dedicated to you.

I thank the Technion for the generous financial support during my studies.
Exact Programming by Example

Research Thesis

Submitted in partial fulfillment of the requirements
for the degree of Doctor of Philosophy

Dana Drachsler Cohen

Submitted to the Senate
of the Technion — Israel Institute of Technology
Sivan 5777 Haifa June 2017
Exact Programming by Example

Dana Drachsler Cohen