JavaGenes Evolving Molecules and Molecular Force Fields Al Globus Deepak Srivastava Sandy Johan A Work In Progress

Upload: daniel-silva

Post on 27-Mar-2015


Page 1

JavaGenes
Evolving Molecules and Molecular Force Fields

Al Globus
Deepak Srivastava
Sandy Johan

A Work In Progress

Page 2

Molecules to Evolve

[Figure: structural diagrams of the target molecules; only the atom labels Cl, N, and O survive extraction.]

Page 3

Graph Crossover Problem

Any edge may be a member of one or more cycles.
Graph fragments produced by division may have more than one crossover point ("broken edges").
When two fragments are combined, they may have different numbers of broken edges to be merged.

Our crossover operator
• Operates on any connected graph.
• Divides graphs at randomly generated cut sets.
• Can evolve arbitrary cyclic structures given at least some cycles in the initial population.
• Always produces connected undirected graphs.
• Almost always produces connected directed graphs.

Page 4

Crossover

Strings: parents abcd and wxyz are each cut in two and the tails are swapped, giving children abyz and wxcd.

[Figure: the same operation illustrated for strings, trees, and graphs.]
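The string case on this slide can be sketched in a few lines of Python. This is only an illustration of one-point crossover on equal-length strings; the function name and the explicit `cut` argument are assumptions made for the example, not part of JavaGenes.

```python
from typing import Tuple


def one_point_crossover(parent_a: str, parent_b: str, cut: int) -> Tuple[str, str]:
    """Swap the tails of two equal-length strings at index `cut`."""
    if len(parent_a) != len(parent_b):
        raise ValueError("parents must have equal length")
    child_1 = parent_a[:cut] + parent_b[cut:]
    child_2 = parent_b[:cut] + parent_a[cut:]
    return child_1, child_2


# The slide's example: abcd and wxyz cut at the midpoint.
print(one_point_crossover("abcd", "wxyz", cut=2))  # ('abyz', 'wxcd')
```

Graphs have no single cut position like this, which is why the graph operator needs cut sets and broken edges.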

Page 5

Graph Crossover

Rip two parents apart
Combine into a child

Page 6

Molecule Division

Choose an initial random bond
Repeat
• Find the shortest path between the initial bond's atoms.
• Remove and remember a random bond from this path. These bonds are called "broken edges."
Until a cut set is found, i.e., no path exists between the initial bond's vertices.
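The division loop can be sketched as follows. This is a minimal Python illustration, not the JavaGenes code: the molecule is reduced to an undirected adjacency dictionary, bond orders and elements are ignored, the graph is assumed to contain at least one bond, and the names `shortest_path` and `divide` are invented for the example.

```python
import random
from collections import deque


def shortest_path(adj, start, goal):
    """Breadth-first search; returns the vertex list start..goal, or None."""
    parent = {start: None}
    queue = deque([start])
    while queue:
        v = queue.popleft()
        if v == goal:
            path = []
            while v is not None:
                path.append(v)
                v = parent[v]
            return path[::-1]
        for w in adj[v]:
            if w not in parent:
                parent[w] = v
                queue.append(w)
    return None


def divide(adj, rng):
    """Remove random bonds until the initial bond's atoms are disconnected.

    Returns (adjacency after division, list of broken edges).
    """
    adj = {v: set(ns) for v, ns in adj.items()}  # work on a copy
    bonds = sorted({tuple(sorted((v, w))) for v in adj for w in adj[v]})
    a, b = rng.choice(bonds)  # the initial random bond
    broken = []
    while True:
        path = shortest_path(adj, a, b)
        if path is None:  # cut set found: no path between a and b
            break
        u, v = rng.choice(list(zip(path, path[1:])))
        adj[u].remove(v)
        adj[v].remove(u)
        broken.append((u, v))
    return adj, broken


# A six-membered ring needs two broken edges before it falls apart.
ring = {i: {(i - 1) % 6, (i + 1) % 6} for i in range(6)}
fragments, broken_edges = divide(ring, random.Random(0))
print(broken_edges)
```

On the first pass the shortest path is the initial bond itself, so that bond is always the first broken edge.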

Page 7

Fragment Recombination

Repeat
• Select a random broken edge. Determine which fragment it is associated with.
• If at least one broken edge in the other fragment exists
– choose one at random
– merge the broken edges into one bond, respecting valence by reducing the order of the bond if necessary
• Else flip a coin
– heads: attach the broken edge to a random atom in the other fragment (respecting valence)
– tails: discard the broken edge
Until each broken edge has been processed exactly once
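A simplified Python sketch of this loop is given below. It rests on several assumptions that are not in the slides: every bond is treated as a single bond, so the "reduce the order" step is omitted; valence is modelled only as a cap on the neighbor count; each fragment arrives with a list of "stubs", meaning atoms that lost a bond during division; and atom names are unique across both fragments. The names `recombine`, `stubs`, and `max_valence` are hypothetical.

```python
import random


def recombine(frag_a, stubs_a, frag_b, stubs_b, max_valence, rng):
    """Join two fragments into one child adjacency dict.

    frag_*:  {atom: set(neighbors)}
    stubs_*: atoms carrying a dangling half-bond, one entry per broken edge
    """
    child = {v: set(ns) for frag in (frag_a, frag_b) for v, ns in frag.items()}
    atoms = [list(frag_a), list(frag_b)]
    stubs = [list(stubs_a), list(stubs_b)]

    def bond(x, y):
        child[x].add(y)
        child[y].add(x)

    while stubs[0] or stubs[1]:
        # Select a random broken edge; `side` is the fragment that owns it.
        side = rng.choice([s for s in (0, 1) for _ in stubs[s]])
        x = stubs[side].pop(rng.randrange(len(stubs[side])))
        other = 1 - side
        if stubs[other]:
            # Merge with a random broken edge of the other fragment.
            y = stubs[other].pop(rng.randrange(len(stubs[other])))
            bond(x, y)
        elif rng.random() < 0.5:
            # Heads: attach to a random atom that still has free valence.
            free = [y for y in atoms[other]
                    if len(child[y]) < max_valence[y] and y not in child[x]]
            if free:
                bond(x, rng.choice(free))
        # Tails: the broken edge is discarded.
    return child


frag_a = {"a1": {"a2"}, "a2": {"a1"}}
frag_b = {"b1": {"b2"}, "b2": {"b1"}}
valence = {atom: 4 for atom in ("a1", "a2", "b1", "b2")}
print(recombine(frag_a, ["a2"], frag_b, ["b1"], valence, random.Random(0)))
```

Because each stub is popped from its list before it is handled, every broken edge is processed exactly once, matching the loop's exit condition.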

Page 8

Molecule Fitness Function

All-pairs-shortest-path distance
• Assign extended types to each atom
– Extended type = (element, |single bonds|, |double bonds|, |triple bonds|)
• Find shortest bond path between each pair of atoms
• Create bag: one item per atom pair
– item = (type1, type2, path length)
– bag = set with repeated items
• distance = 1 - |intersection| / |union|
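The distance can be sketched in Python with `collections.Counter` standing in for the bag. The molecule encoding (a list of element symbols plus `(i, j, order)` bond triples), the canonical ordering of the two types within an item, and the function names are assumptions made for this illustration. Molecules are assumed connected, so unreachable pairs simply contribute no item.

```python
from collections import Counter, deque


def extended_types(elements, bonds):
    """(element, #single, #double, #triple) for every atom."""
    counts = [[0, 0, 0] for _ in elements]
    for i, j, order in bonds:
        counts[i][order - 1] += 1
        counts[j][order - 1] += 1
    return [(el, *c) for el, c in zip(elements, counts)]


def bag(elements, bonds):
    """One (type1, type2, path length) item per atom pair, with repeats."""
    types = extended_types(elements, bonds)
    adj = [[] for _ in elements]
    for i, j, _ in bonds:
        adj[i].append(j)
        adj[j].append(i)
    items = Counter()
    for source in range(len(elements)):
        dist = {source: 0}  # breadth-first search from `source`
        queue = deque([source])
        while queue:
            v = queue.popleft()
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        for target, d in dist.items():
            if target > source:  # count each pair once
                t1, t2 = sorted((types[source], types[target]))
                items[(t1, t2, d)] += 1
    return items


def distance(mol_a, mol_b):
    """1 - |intersection| / |union| over the two bags (0 is a perfect match)."""
    bag_a, bag_b = bag(*mol_a), bag(*mol_b)
    union = sum((bag_a | bag_b).values())  # max of counts
    if union == 0:
        return 0.0
    return 1.0 - sum((bag_a & bag_b).values()) / union  # min of counts


# Carbon skeletons only, hydrogens omitted for brevity.
propane = (["C"] * 3, [(0, 1, 1), (1, 2, 1)])
butane = (["C"] * 4, [(0, 1, 1), (1, 2, 1), (2, 3, 1)])
print(distance(propane, propane), distance(propane, butane))
```

`Counter` intersection keeps the minimum count of each item and union keeps the maximum, which is the usual multiset reading of the formula on the slide.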

Page 9

Finding Small Molecules

[Figure: small target molecule; only nitrogen atom labels survive extraction.]

Page 10

Finding Larger Molecules

[Figure: larger target molecule; only the atom labels Cl, N, and O survive extraction.]

Page 11

JavaGenes in Action

Finding [figure: target molecule] with all-pairs-shortest-path and Tanimoto index fitness function (0 is perfect)

Page 12

Molecular Dynamics and Mechanics

Newton's laws of motion in a potential field
Discover common conformations during dynamics
Discover minimum energy conformations (e.g., protein folding problem)
Began in the 1960s with two-body potentials for inert gas modeling
In the 1980s extended to metals and bonded systems (upper-right corner of periodic table)
Our studies focus on evolving potentials for reactive systems (bonds break and form)

Page 13

Molecular Potentials

Energy = sum of 2-body terms + sum of 3-body terms + …
Stillinger-Weber SiF potential function
• 2-body(r) = A (B r^(-p) - r^(-q)) * cutoff
– cutoff = exp(C / (r - a)) for r < a, 0 otherwise
• 3-body(r_ij, r_jk, theta) = (alpha + lambda (cos(theta) - cos(theta_0))^2) * cutoff
– cutoff = exp(gamma (1/(r_ij - a1) + 1/(r_jk - a1)))
• FFF additional term = delta (r_ij r_jk)^(-m) * cutoff
– cutoff = exp(beta (1/(r_ij - a2) + 1/(r_jk - a2)))
Discovering parameters can require months or years
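The terms above translate directly into code. The Python sketch below transcribes them under a few assumptions that the slide does not state: the three-body and FFF cutoffs are taken to be zero once either distance reaches the cutoff radius (by analogy with the two-body cutoff), and the usage lines choose C = 1. Function and argument names are illustrative.

```python
import math


def two_body(r, A, B, p, q, a, C):
    """A (B r^-p - r^-q) * exp(C / (r - a)) for r < a, else 0."""
    if r >= a:
        return 0.0
    return A * (B * r ** -p - r ** -q) * math.exp(C / (r - a))


def pair_cutoff(r_ij, r_jk, strength, reach):
    """exp(strength * (1/(r_ij - reach) + 1/(r_jk - reach))).

    Returning 0 beyond `reach` is an assumption, see the note above.
    """
    if r_ij >= reach or r_jk >= reach:
        return 0.0
    return math.exp(strength * (1.0 / (r_ij - reach) + 1.0 / (r_jk - reach)))


def three_body(r_ij, r_jk, theta, alpha, lam, gamma, a1, theta0):
    """(alpha + lambda (cos(theta) - cos(theta0))^2) * cutoff."""
    angular = alpha + lam * (math.cos(theta) - math.cos(theta0)) ** 2
    return angular * pair_cutoff(r_ij, r_jk, gamma, a1)


def fff_term(r_ij, r_jk, delta, m, beta, a2):
    """delta (r_ij r_jk)^-m * cutoff, the additional FFF term."""
    return delta * (r_ij * r_jk) ** -m * pair_cutoff(r_ij, r_jk, beta, a2)


# Two-body values for Si as listed in parentheses on the results slide;
# C = 1.0 is an assumption made for this example.
si_pair = dict(A=7.049556277, B=0.6022245584, p=4, q=0, a=1.8)
print(two_body(2 ** (1 / 6), C=1.0, **si_pair))
```

Writing each term as a function of a flat parameter list is what makes the potential easy to encode as a row of floating point numbers.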

Page 14

Evolving Molecular Force Fields

Chromosome
• 2D ragged array of floating point numbers
– SiSi, SiF, FF, SiSiSi, SiSiF, SiFSi, FSiF, FFSi, FFF
• 5-63 parameters
Transmission operators
• Interval crossover
• Mutation
Fitness Function
• RMS difference between an individual's energies and the "correct" energies for n molecules
• "Correct" energies
– Currently: energies generated with the force field with published parameters
– Next step: energies generated by higher quality quantum codes
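The fitness measure can be sketched as a plain root-mean-square difference between two energy lists. In this Python illustration the candidate force field is assumed to have already produced one energy per test molecule; the function name and the sample numbers are made up for the example.

```python
import math


def rms_difference(predicted, reference):
    """Root-mean-square difference between two equal-length energy lists."""
    if len(predicted) != len(reference) or not reference:
        raise ValueError("need two non-empty lists of equal length")
    squared = sum((p - r) ** 2 for p, r in zip(predicted, reference))
    return math.sqrt(squared / len(reference))


# Hypothetical energies for n = 3 molecules; lower fitness is better.
candidate_energies = [-4.10, -3.95, -4.30]
correct_energies = [-4.00, -4.00, -4.20]
print(rms_difference(candidate_energies, correct_energies))
```

A fitness of 0 would mean the evolved parameters reproduce every reference energy exactly.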

Page 15

Interval Crossover

For each allele:
1. Construct an interval from parental values: lower parental value (1.1), higher parental value (2.1)
2. Construct larger interval (100% larger): (.6) to (2.6)
3. Choose a random number from the larger interval (1.3)
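The three steps can be sketched in Python as below. The sketch assumes the larger interval is widened symmetrically about the parental interval, which matches the slide's numbers (1.1 to 2.1 becomes 0.6 to 2.6), and that the random number is drawn uniformly, which the slide does not specify. A flat list stands in for one row of the ragged chromosome, and the names are illustrative.

```python
import random


def expanded_interval(x, y, expansion=1.0):
    """Parental interval widened by `expansion` (1.0 means 100% larger)."""
    low, high = min(x, y), max(x, y)
    pad = expansion * (high - low) / 2.0
    return low - pad, high + pad


def interval_crossover(parent_a, parent_b, rng, expansion=1.0):
    """Child allele = random number from each expanded parental interval."""
    child = []
    for x, y in zip(parent_a, parent_b):
        low, high = expanded_interval(x, y, expansion)
        child.append(rng.uniform(low, high))
    return child


print(expanded_interval(1.1, 2.1))  # roughly (0.6, 2.6)
print(interval_crossover([1.1, 4.0], [2.1, 4.0], random.Random(0)))
```

When both parents carry the same value the interval has zero width, so the child inherits that value unchanged and only mutation can move it.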

Page 16

Si potential results

population = 1000
generations = 3000
fitness function: 100 random 5-body Si tetrahedra
31 runs. Best run results (published values in parentheses):
• A = 7.151346144801161 (7.049556277)
• B = 0.6007865398735448 (0.6022245584)
• p = 3.9825158463763977 (4)
• q = 0.014970062068368135 (0)
• a = 1.797123919332413 (1.8)
• alpha = 0.1442970771852687 (0)
• lambda = 27.783092740584205 (21)
• gamma = 1.328091763076223 (1.2)
• a1 = 1.8173559091012945 (1.8)

Page 17

Future Plans

Hill climbing
Use experimental data for new fitness functions
Feed results from easy to hard evolution (parameter counts in parentheses)
• SiSi (5), SiF (6), FF (6)
• SiSiSi (9), SiSiF (10), SiFSi (10), FSiF (10), FFSi (10), FFF (14)
• Full SiF (63)

Page 18

Condor

Cycle-scavenging batch system for single workstation jobs
• Desktop machines, nights, weekends, etc.
• University of Wisconsin
• In production since 1986
• Unix workstations
250 SGI and 50 Sun workstations at code IN
Good for
• parameter studies
• stochastic algorithms (e.g., GA)
One JavaGenes job per Condor job