Data generation, the hard parts

Eric Torreborre / FP-Syd Data generation The hard parts

Upload: eric-torreborre

Post on 16-Jan-2017


TRANSCRIPT

Page 1: Data generation, the hard parts

Eric Torreborre / FP-Syd

Data generation

The hard parts

Page 2: Data generation, the hard parts
Page 3: Data generation, the hard parts
Page 4: Data generation, the hard parts

NOT SO SIMPLE

Page 5: Data generation, the hard parts
Page 6: Data generation, the hard parts

Recursive data structures

Polymorphic functions

Constrained data

Page 7: Data generation, the hard parts
Page 8: Data generation, the hard parts
Page 9: Data generation, the hard parts
Page 10: Data generation, the hard parts
Page 11: Data generation, the hard parts

Recursive data structures

Page 12: Data generation, the hard parts

Agile Estimating and Planning
Agile Management
Agile Product Management with Scrum
Agile Product Planning and Analysis
Agile Project Management: Creating Innovative Products
Agile Project Management For Dummies
Agile Software Development, Principles, Patterns, and Practices
Agile Software Development with Scrum
Agile Software Development with Distributed Teams
Agile Testing

Page 13: Data generation, the hard parts

Agile
  + Estimating and Planning
  + Management
  + Product
      + Management with Scrum
      + Planning and Analysis
  + Project Management
      + Creating Innovative Products
      + For Dummies
  + Software Development
      + Principles, Patterns, and Practices
      + with Scrum
      + with Distributed Teams
  + Testing

Page 14: Data generation, the hard parts

Generate trees

Page 15: Data generation, the hard parts
Page 16: Data generation, the hard parts
Page 17: Data generation, the hard parts

Generate trees

Depth?

Width?

Balanced?

Coverage?

Composition?

Uniformity?

Constraints?

Performance?
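
To make these questions concrete, here is a minimal Python sketch of a naive recursive generator. It is not taken from the slides; `naive_tree`, `size` and `depth` are hypothetical names. The generator bounds depth and width only, so the total size of a tree is not controlled directly and can differ a lot between samples.

```python
import random


def naive_tree(rng, max_depth, max_width):
    """Generate a rose tree as a nested list of children; a leaf is []."""
    if max_depth == 0:
        return []
    # Each node draws its number of children independently.
    width = rng.randint(0, max_width)
    return [naive_tree(rng, max_depth - 1, max_width) for _ in range(width)]


def size(tree):
    """Total number of nodes in the tree."""
    return 1 + sum(size(child) for child in tree)


def depth(tree):
    """Number of nodes on the longest path from the root to a leaf."""
    return 1 + max((depth(child) for child in tree), default=0)


if __name__ == "__main__":
    rng = random.Random(0)
    sizes = sorted(size(naive_tree(rng, 4, 3)) for _ in range(20))
    # The same depth and width bounds produce very different sizes.
    print(sizes)
```

Depth and width are easy to bound this way, but balance, coverage and uniformity are left to chance.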

Page 18: Data generation, the hard parts

Well-typed programs

Page 19: Data generation, the hard parts

Size and dimension

Boltzmann model

Combinatorial species
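
As an illustration of the Boltzmann model, here is a small Python sketch. It assumes the standard binary-tree specification B = 1 + X · B · B, with size counted in internal nodes. That example, the function name `boltzmann_binary_tree` and the token encoding are assumptions for illustration, not the code from the talk. Under a Boltzmann model of parameter x, a structure of size n is drawn with probability x^n / B(x), so two structures of the same size are equally likely.

```python
import math
import random


def boltzmann_binary_tree(x, rng):
    """Draw a binary tree from the Boltzmann model of parameter x.

    The tree is returned as a preorder list of tokens: "N" for an
    internal node (size 1) and "L" for a leaf (size 0).
    """
    if not 0.0 < x < 0.25:
        raise ValueError("x must lie strictly between 0 and 1/4")
    # Closed form of the generating function B(x) = 1 + x * B(x)^2.
    b = (1.0 - math.sqrt(1.0 - 4.0 * x)) / (2.0 * x)
    # A subtree is an internal node with probability x * B(x)^2 / B(x).
    p_node = x * b
    tokens = []
    pending = 1  # number of subtrees still to generate
    while pending > 0:
        pending -= 1
        if rng.random() < p_node:
            tokens.append("N")
            pending += 2  # an internal node has two subtrees
        else:
            tokens.append("L")
    return tokens


if __name__ == "__main__":
    rng = random.Random(1)
    sample = boltzmann_binary_tree(0.24, rng)
    print(sample.count("N"), "internal nodes")
```

The size of each sample is random. Moving x closer to 1/4 raises the expected size, which is how such a sampler is usually tuned towards a target size.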

Page 20: Data generation, the hard parts
Page 21: Data generation, the hard parts

Same size bound

Page 22: Data generation, the hard parts

505 values on average for n=100

Page 23: Data generation, the hard parts

18 constructors

P = 1 / 89

Page 24: Data generation, the hard parts

P=1/4

P = 1 / 9

Page 25: Data generation, the hard parts
Page 26: Data generation, the hard parts

generating function

System of equations

solution + singularity

size in O(n)
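
A standard worked example of these steps, assuming binary trees counted by their internal nodes (the example is not from the slides):

```latex
\begin{align*}
B(z) &= 1 + z\,B(z)^2
  && \text{equation read off the specification } B = 1 + X \cdot B \cdot B \\
B(z) &= \frac{1 - \sqrt{1 - 4z}}{2z}
  && \text{solution, with a singularity at } \rho = \tfrac{1}{4} \\
\mathbb{E}_x[N] &= \frac{x\,B'(x)}{B(x)} = \frac{x\,B(x)}{\sqrt{1 - 4x}}
  && \text{expected size under the Boltzmann model, } 0 < x < \rho
\end{align*}
```

The last line uses B'(x) = B(x)^2 / (1 - 2x B(x)) and 1 - 2x B(x) = \sqrt{1 - 4x}. The expected size grows without bound as x approaches the singularity, so choosing x close to 1/4 targets large trees.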

Page 27: Data generation, the hard parts

Enumerate structures

Sample uniformly
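
One way to read these two steps: once the structures of a given size can be counted and indexed, picking a uniform random index gives a uniform sample. The Python sketch below does this for binary trees, an example chosen here and not taken from the slides. The names `count`, `unrank` and `sample_uniform` are hypothetical.

```python
import random
from functools import lru_cache


@lru_cache(maxsize=None)
def count(n):
    """Number of binary trees with n internal nodes."""
    if n == 0:
        return 1
    return sum(count(k) * count(n - 1 - k) for k in range(n))


def unrank(n, k):
    """Return the k-th binary tree with n internal nodes, 0 <= k < count(n).

    A leaf is None and an internal node is a pair (left, right).
    """
    if n == 0:
        return None
    for left_size in range(n):
        right_size = n - 1 - left_size
        block = count(left_size) * count(right_size)
        if k < block:
            # Split the index into one index per subtree.
            left_index, right_index = divmod(k, count(right_size))
            return (unrank(left_size, left_index),
                    unrank(right_size, right_index))
        k -= block
    raise IndexError("index out of range")


def sample_uniform(n, rng):
    """Every tree with n internal nodes is returned with equal probability."""
    return unrank(n, rng.randrange(count(n)))


if __name__ == "__main__":
    print(sample_uniform(5, random.Random(0)))
```

The cost is in the counting table: it has to be computed up to the requested size before any sample can be drawn.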

Page 28: Data generation, the hard parts
Page 29: Data generation, the hard parts

Set of labels

Family of structures

Page 30: Data generation, the hard parts
Page 31: Data generation, the hard parts

[Figure: three diagrams, each drawn on the labels 1 to 6]

Page 32: Data generation, the hard parts

[Figure: three diagrams, each drawn on the labels a to f]

Page 33: Data generation, the hard parts

[Figure: two diagrams, each drawn on the labels 1 to 6]

Page 34: Data generation, the hard parts

Regular species

Page 35: Data generation, the hard parts

0

Page 36: Data generation, the hard parts

1

Page 37: Data generation, the hard parts

X

11

Page 38: Data generation, the hard parts

[Figure: F and G structures on the labels 1 to 5]

Page 39: Data generation, the hard parts

X

1

0

X1

Page 40: Data generation, the hard parts

X

1

1

X1

Page 41: Data generation, the hard parts

1 1

1

1

Page 42: Data generation, the hard parts

n 11 1 …

Page 43: Data generation, the hard parts

n 11 1 …

Page 44: Data generation, the hard parts

n 11 1 …

Page 45: Data generation, the hard parts

[Figure: F and G structures on the labels 1 to 5]

Page 46: Data generation, the hard parts

X 0

Page 47: Data generation, the hard parts

X

1

1

X1

1

Page 48: Data generation, the hard parts

[Figure: structures built from X on the labels 1 and 2, such as X2 X1]

Page 49: Data generation, the hard parts

[Figure: structures built from X on the labels 1, 2 and 3, with X1, X2 and X3 in different orders]

Page 50: Data generation, the hard parts
Page 51: Data generation, the hard parts

L X1 L

L X1 L

L X

L X X

L X X X

Page 52: Data generation, the hard parts

L

[Figure: the labels 1 to 4 arranged as lists, including 1 2 3 4, 2 1 3 4 and 3 2 1 4]

Page 53: Data generation, the hard parts

L X1 L

LX

1

1

Page 54: Data generation, the hard parts

[Figure: a structure on the labels 1 to 11]

No symmetries

Page 55: Data generation, the hard parts

GF

[Figure: F and G structures on the labels 1 to 5]

Page 56: Data generation, the hard parts

R LX R

[Figure: a structure on the labels 1 to 7]

Page 57: Data generation, the hard parts

F G

Page 58: Data generation, the hard parts

F'

[Figure: two diagrams on the labels 1 to 5, the second marked F]

Page 59: Data generation, the hard parts

F '

L L L'

C L'

Page 60: Data generation, the hard parts

F |n|

1 23

n5

F

1

23

4

5 n

… …

… /= n

Page 61: Data generation, the hard parts

Non regular species

Page 62: Data generation, the hard parts

E

[Figure: E structures on the labels 1 to 5]

Page 63: Data generation, the hard parts

C

[Figure: C structures on the labels 1 to 5]

Page 64: Data generation, the hard parts

E

CEP

Page 65: Data generation, the hard parts

C

P CE

L 'C

L P

Page 66: Data generation, the hard parts

GF

[Figure: F and G structures on the labels 1 to 5]

Page 67: Data generation, the hard parts

GF

EE |2|

EX |2|

Page 68: Data generation, the hard parts

in code?
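
The code shown on the following slides did not survive in this transcript. As a stand-in, here is a minimal Python sketch of how the regular species above could look in code. It models a species as a function from a tuple of labels to the list of structures on those labels. All names (`zero`, `one`, `x`, `add`, `mul`, `lists`) are hypothetical, and the list equation L = 1 + X · L is the usual species definition, assumed to be what the "L X1 L" slide stands for.

```python
from itertools import combinations


def zero(labels):
    """The species 0: no structure on any set of labels."""
    return []


def one(labels):
    """The species 1: a single structure, on the empty set of labels only."""
    return [()] if len(labels) == 0 else []


def x(labels):
    """The species X: a single structure, on exactly one label."""
    return [labels[0]] if len(labels) == 1 else []


def add(f, g):
    """Sum: a structure is either an f structure or a g structure."""
    return lambda labels: ([("inl", s) for s in f(labels)] +
                           [("inr", s) for s in g(labels)])


def splits(labels):
    """All ways to split the labels into a left part and a right part."""
    for r in range(len(labels) + 1):
        for picked in combinations(range(len(labels)), r):
            chosen = set(picked)
            left = tuple(labels[i] for i in picked)
            right = tuple(l for i, l in enumerate(labels) if i not in chosen)
            yield left, right


def mul(f, g):
    """Product: split the labels, f structure on one part, g on the other."""
    return lambda labels: [(a, b)
                           for left, right in splits(labels)
                           for a in f(left)
                           for b in g(right)]


def lists(labels):
    """Lists, assuming the recursive definition L = 1 + X * L."""
    return add(one, mul(x, lists))(labels)


if __name__ == "__main__":
    print(len(mul(x, x)((1, 2))))    # structures of X * X on two labels
    print(len(lists((1, 2, 3))))     # lists on three labels
```

Enumerating every structure is only practical for small label sets, but it makes the combinators easy to check against the diagrams.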

Page 69: Data generation, the hard parts
Page 70: Data generation, the hard parts
Page 71: Data generation, the hard parts
Page 72: Data generation, the hard parts
Page 73: Data generation, the hard parts
Page 74: Data generation, the hard parts

Maths…

Page 75: Data generation, the hard parts

"seems it is doable

to find such a function, but needs

work."

"we have a noneasy

question"

Page 76: Data generation, the hard parts

My strategy

Page 77: Data generation, the hard parts

number of partitions having p sets

Page 78: Data generation, the hard parts

int partitions of 6

change of representation

3-int partitions of n

Page 79: Data generation, the hard parts

Given an index k
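
The sketch below shows one way an index k can be turned into a partition. It reads "3-int partitions of n" as integer partitions of n into exactly 3 parts, which is an assumption about the slide, and it is not the implementation from the talk. The names `count_parts` and `unrank_partition` are hypothetical. The idea is to count first, then use the counts to decide which branch the index falls into.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def count_parts(n, p):
    """Number of integer partitions of n into exactly p positive parts."""
    if n == 0 and p == 0:
        return 1
    if n <= 0 or p <= 0:
        return 0
    # Either the smallest part is 1 (remove it), or every part is at
    # least 2 (subtract 1 from each of the p parts).
    return count_parts(n - 1, p - 1) + count_parts(n - p, p)


def unrank_partition(n, p, k):
    """Return the k-th partition of n into exactly p parts.

    Requires 0 <= k < count_parts(n, p). Parts are listed from the
    largest to the smallest.
    """
    if not 0 <= k < count_parts(n, p):
        raise IndexError("index out of range")
    if p == 0:
        return []
    with_one = count_parts(n - 1, p - 1)
    if k < with_one:
        # The index falls among the partitions whose smallest part is 1.
        return unrank_partition(n - 1, p - 1, k) + [1]
    # Otherwise every part is at least 2.
    return [part + 1 for part in unrank_partition(n - p, p, k - with_one)]


if __name__ == "__main__":
    for k in range(count_parts(6, 3)):
        print(k, unrank_partition(6, 3, k))
```

Drawing k uniformly from the valid range then gives a uniform sample among the partitions of n into p parts.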

Page 80: Data generation, the hard parts
Page 81: Data generation, the hard parts

Proper notion of size

Uniformity

Species combinators for constraints

Page 82: Data generation, the hard parts

Eric Torreborre / FP-Syd

Data generation

The hard parts

Thanks!