If the singularity arrives, will it be by design or evolution?

Post on 20-Jun-2015

DESCRIPTION

Gives an in-depth introduction to evolutionary algorithms, particularly genetic algorithms and genetic programming along with a number of links to relevant sites, papers and books.

TRANSCRIPT

IF THE SINGULARITY ARRIVES, WILL IT BE BY DESIGN OR EVOLUTION?

Bill Worzel billwzel@gmail.com

Evolution Enterprises http://evolver.biz

Data Day Texas, 11 Jan 2014, Austin, TX

Monday, January 13, 14

NATURE HAS MANY ROOMS

• Animals solve the problem of survival in many ways

• Most are adapted to specific ecological niches

• Genetics forms the common language of living creatures

EVOLUTIONARY ALGORITHMS (EA) BORROW FROM NATURE

• Based on natural selection and population dynamics

• Create a population of solutions

• Preferentially select and recombine better individuals to find better solutions

AN ELEGANT SEARCH

• EAs combine global search with local search

• Randomly generated individuals test many niches

• Selection and recombination home in on the best neighborhoods

GENETIC ALGORITHMS (GA)

• GAs encode information and then combine and mutate individuals

• In simplest case, encoding is a bit string mapped to variable values

• Initial population of individuals is created randomly

[Figure: a bit string (101001011010) whose segments map to the variables P/E and Trend, shown as one member of a population]
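
The encoding step can be sketched as follows. This is a minimal illustration, not the author's code: the slide only labels the variables P/E and Trend, so the 6-bit field widths, the value ranges, and the names `decode`, `FIELDS`, and `random_population` are all assumptions.

```python
import random

def decode(bits, fields):
    """Map consecutive slices of a bit string to real values in [lo, hi]."""
    values = {}
    pos = 0
    for name, width, lo, hi in fields:
        raw = int(bits[pos:pos + width], 2)          # slice read as an unsigned integer
        values[name] = lo + (hi - lo) * raw / (2 ** width - 1)
        pos += width
    return values

def random_population(size, length, rng):
    """Create `size` random bit strings of the given length."""
    return ["".join(rng.choice("01") for _ in range(length)) for _ in range(size)]

# Hypothetical layout: 6 bits for a P/E ratio in [0, 63], 6 bits for a trend in [-1, 1].
FIELDS = [("pe_ratio", 6, 0.0, 63.0), ("trend", 6, -1.0, 1.0)]

decoded = decode("101001011010", FIELDS)
population = random_population(size=8, length=12, rng=random.Random(0))
```

With this layout the slide's bit string decodes to a P/E of 41 and a slightly negative trend; a different layout would give different values for the same bits.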

SELECTION & FITNESS

• A subset of individuals is selected at random from the population

• Fitness of each is calculated

• The best pair is combined to produce offspring

[Figure: four selected individuals with fitness 32, 16, 18 and 90; the two fittest (90 and 32) are combined]
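
The selection step described above resembles what is usually called tournament selection. A minimal sketch, assuming individuals are hashable labels and fitness is a lookup; the function name `select_parents` and the labels are hypothetical, while the fitness values are the ones shown on the slide:

```python
import random

def select_parents(population, fitness, group_size, rng):
    """Pick a random subset, score each member, return the two fittest."""
    group = rng.sample(population, group_size)
    ranked = sorted(group, key=fitness, reverse=True)
    return ranked[0], ranked[1]

# Fitness values from the slide, attached to hypothetical labels.
scores = {"a": 32, "b": 16, "c": 18, "d": 90}
parents = select_parents(list(scores), scores.get, group_size=4, rng=random.Random(1))
```

Here the group is the whole population, so the result does not depend on the random draw; with a smaller `group_size` weaker individuals occasionally get to reproduce, which helps preserve diversity.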

CROSSOVER & MUTATION

• Crossover combines bit strings

• Mutation changes bits

• Both operations are stochastic

• Offspring replace parents or weaker individuals in population

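[Figure: parents 101001011010 and 011101101110 are cut at a crossover point and recombined to give offspring 011101111010 and 101011001110; one bit of the second offspring has been changed by mutation]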
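
The two operators can be sketched on the slide's own bit strings. The slide does not state where the cut or the flipped bit falls; a cut after bit 7 and a flip of bit 5 (index 4) are assumptions chosen because they reproduce the offspring shown. In a real GA both positions would be drawn at random.

```python
def one_point_crossover(a, b, point):
    """Swap the tails of two equal-length bit strings at `point`."""
    return b[:point] + a[point:], a[:point] + b[point:]

def flip_bit(bits, index):
    """Return a copy of `bits` with the bit at `index` inverted."""
    flipped = "1" if bits[index] == "0" else "0"
    return bits[:index] + flipped + bits[index + 1:]

parent_a = "101001011010"
parent_b = "011101101110"

child_1, child_2 = one_point_crossover(parent_a, parent_b, point=7)
child_2_mutated = flip_bit(child_2, index=4)
```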

BUILDING BLOCKS AND SCHEMAS

• Building block hypothesis states that GAs find good simple components that confer better fitness on individuals

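• The Schema Theorem shows that better building blocks accumulate to produce the best individuals: $E(m(H,t+1)) \ge \frac{m(H,t)\,f(H)}{a_t}\,[1-p]$, where $m(H,t)$ is the number of individuals matching schema $H$ at generation $t$, $f(H)$ is their average fitness, $a_t$ is the average fitness of the population, and $p$ is the probability that crossover or mutation disrupts $H$.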
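
A worked instance of the bound may help. It reads the symbols in the usual way for Holland's schema theorem ($m(H,t)$ is the count of individuals matching schema $H$ at generation $t$, $f(H)$ their average fitness, $a_t$ the population's average fitness, $p$ the chance that crossover or mutation disrupts $H$). The numbers $m(H,t)=10$, $f(H)=1.2$, $a_t=1.0$ and $p=0.1$ are purely illustrative assumptions:

```latex
E(m(H,t+1)) \ge \frac{m(H,t)\,f(H)}{a_t}\,[1-p]
            = \frac{10 \times 1.2}{1.0} \times (1 - 0.1)
            = 10.8
```

Under these assumed values, a schema that is 20% fitter than average is still expected to gain copies after losing 10% of them to disruption.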

CASE STUDY: AGRICULTURAL MODELING

• Decision support software for farmers: with a large number of new hybrids, which should they choose?

• Needed to integrate agronomic, weather, economic, personal factors

• GA not as an optimizer but as an optionizer in a multi-objective space
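
One common way to present options rather than a single optimum in a multi-objective space is to keep the non-dominated (Pareto) set. The sketch below assumes that reading of "optionizer"; the slides do not say how the case study did it, and the hybrid names and the two objectives (yield, negated cost) are invented for illustration.

```python
def dominates(a, b):
    """True if `a` is at least as good as `b` everywhere and better somewhere (maximizing)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_front(options):
    """Keep every option that no other option dominates."""
    return {
        name: score
        for name, score in options.items()
        if not any(dominates(other, score) for other in options.values())
    }

# Hypothetical hybrids scored as (yield, -cost); both objectives are maximized.
hybrids = {
    "hybrid_a": (180, -95),
    "hybrid_b": (170, -80),
    "hybrid_c": (165, -90),
    "hybrid_d": (185, -120),
}
front = pareto_front(hybrids)
```

In this made-up data `hybrid_c` drops out because `hybrid_b` yields more and costs less; the remaining three are genuine trade-offs left for the farmer to weigh.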

GENETIC PROGRAMMING (GP)

• GP evolves computer programs (usually functions)

• Essentially a program that produces programs as its output

• Extends idea of combining bit strings to parse trees
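
To make the parse-tree idea concrete, here is a minimal sketch of a program represented as a nested tuple and evaluated recursively. The representation (tuples of `(operator, left, right)`, strings as variables, numbers as constants) is an assumption for illustration, not the format of any particular GP system.

```python
import operator

# Function set: every function here takes exactly two arguments.
FUNCTIONS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def evaluate(tree, env):
    """Evaluate a parse tree against a dict of variable bindings."""
    if isinstance(tree, tuple):
        op, left, right = tree
        return FUNCTIONS[op](evaluate(left, env), evaluate(right, env))
    if isinstance(tree, str):
        return env[tree]      # variable lookup
    return tree               # numeric constant

program = ("+", ("*", "x", "x"), 1)   # represents x*x + 1
value = evaluate(program, {"x": 3})
```

Because the program is plain data, swapping or replacing a sub-tuple yields another valid program, which is what the tree-based crossover and mutation operators exploit.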

GP OVERVIEW

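[Figure: the GP cycle. Input data and GP parameters feed a program population. Select a mating group, select the two best programs, crossover and mutate, then replace the least fit with the offspring. Terminate? If no, repeat the cycle; if yes, output results. Steps marked "?" in the figure are stochastic processes.]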
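
The cycle can be sketched as a short steady-state loop. To keep the example small it evolves bit lists rather than programs, and it always flips exactly one bit as its mutation; both are simplifying assumptions, as are the function name and all parameter defaults.

```python
import random

def steady_state_cycle(fitness, target, length=16, pop_size=20,
                       group_size=4, max_steps=2000, seed=0):
    """Run the select / recombine / replace loop; return (initial best score, final best)."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(length)]
                  for _ in range(pop_size)]
    initial_best = max(fitness(ind) for ind in population)
    for _ in range(max_steps):
        # Terminate? Stop once some individual reaches the target fitness.
        if max(fitness(ind) for ind in population) >= target:
            break
        # Select a mating group at random and rank it (stochastic).
        group = rng.sample(range(pop_size), group_size)
        group.sort(key=lambda i: fitness(population[i]), reverse=True)
        best, second, worst = group[0], group[1], group[-1]
        # Crossover and mutate the two best (stochastic).
        cut = rng.randrange(1, length)
        child = population[best][:cut] + population[second][cut:]
        child[rng.randrange(length)] ^= 1
        # Replace the least fit member of the group with the offspring.
        population[worst] = child
    return initial_best, max(population, key=fitness)

# Toy fitness: count of 1 bits, so the target is an all-ones individual.
initial_best, final_best = steady_state_cycle(fitness=sum, target=16)
```

Since only the weakest member of each group is overwritten, the best fitness in the population never decreases in this sketch.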

CONSTRUCTING TREES

• Randomly assemble a population of function trees as constrained by GP parameters

From: ‘A Field Guide To Genetic Programming’
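
Random tree construction might look like the following sketch, written in the style of the "grow" initialisation described in the Field Guide. The function set, terminal set, `leaf_prob`, and the depth limit stand in for the "GP parameters" and are assumptions.

```python
import random

FUNCTIONS = ("+", "-", "*")   # all binary
TERMINALS = ("x", 1, 2)

def random_tree(rng, max_depth, leaf_prob=0.3):
    """Grow a random tree no deeper than `max_depth`."""
    if max_depth == 0 or rng.random() < leaf_prob:
        return rng.choice(TERMINALS)
    return (rng.choice(FUNCTIONS),
            random_tree(rng, max_depth - 1, leaf_prob),
            random_tree(rng, max_depth - 1, leaf_prob))

def depth(tree):
    """Depth of a tree; a bare terminal has depth 0."""
    if not isinstance(tree, tuple):
        return 0
    return 1 + max(depth(child) for child in tree[1:])

rng = random.Random(42)
population = [random_tree(rng, max_depth=4) for _ in range(50)]
```

The depth limit is the constraint that keeps the initial population from containing arbitrarily large programs.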

CROSSOVER (RECOMBINATION)

From: ‘A Field Guide To Genetic Programming’

MUTATION

From: ‘A Field Guide To Genetic Programming’
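
The two tree operators pictured on the crossover and mutation slides can be sketched together. This assumes the nested-tuple representation `(operator, left, right)` and uniformly random node choice; real systems often bias the choice toward internal nodes, and all names here are hypothetical.

```python
import random

def subtrees(tree, path=()):
    """Yield (path, subtree) for every node; a path is a tuple of child indices."""
    yield path, tree
    if isinstance(tree, tuple):
        for i, child in enumerate(tree[1:], start=1):
            yield from subtrees(child, path + (i,))

def replace_at(tree, path, new):
    """Return a copy of `tree` with the node at `path` replaced by `new`."""
    if not path:
        return new
    i = path[0]
    return tree[:i] + (replace_at(tree[i], path[1:], new),) + tree[i + 1:]

def subtree_crossover(mum, dad, rng):
    """Replace a random node of `mum` with a random subtree of `dad`."""
    cut, _ = rng.choice(list(subtrees(mum)))
    _, donor = rng.choice(list(subtrees(dad)))
    return replace_at(mum, cut, donor)

def subtree_mutation(tree, rng, terminals=("x", 1, 2)):
    """Replace a random node with a small random subtree (here, a single terminal)."""
    cut, _ = rng.choice(list(subtrees(tree)))
    return replace_at(tree, cut, rng.choice(terminals))

mum = ("+", ("*", "x", "x"), 1)
dad = ("-", "x", 2)
rng = random.Random(7)
child = subtree_crossover(mum, dad, rng)
mutant = subtree_mutation(mum, rng)
```

Both operators return new trees and leave the parents untouched, so parents can remain in the population alongside their offspring.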

THE DEVIL IN THE DETAILS

• How do you correct syntax errors?

• Type coherence?

• Control overfitting?

• Computationally intensive

BUT HEAVEN’S ON OUR SIDE

• Naturally parallel algorithm - linear speedup, mostly not iterative

• Sub-populations may be run asynchronously in parallel: m*n/p where m is individuals in a sub-population, n is the number of sub-populations, and p is number of processors

• Matches up well with cloud computing
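
A rough sketch of running sub-populations independently is shown below. It uses a thread pool only to show the structure: in CPython, threads will not give the speedup described above for CPU-bound work, so a process pool or separate machines would be needed in practice. The toy bit-counting fitness and all names and parameters are assumptions.

```python
import random
from concurrent.futures import ThreadPoolExecutor

def evolve_island(seed, size=20, length=16, steps=200):
    """Evolve one sub-population of bit lists; return its best individual."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(length)] for _ in range(size)]
    for _ in range(steps):
        # Rank three random members: two parents and the one to replace.
        a, b, c = sorted(rng.sample(range(size), 3),
                         key=lambda i: sum(pop[i]), reverse=True)
        cut = rng.randrange(1, length)
        child = pop[a][:cut] + pop[b][cut:]
        child[rng.randrange(length)] ^= 1
        pop[c] = child
    return max(pop, key=sum)

def run_islands(n_islands=4, workers=2):
    """Run n sub-populations on p workers; each island has its own seed."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(evolve_island, range(n_islands)))

champions = run_islands()
```

Because each island owns its random generator and shares no state, the islands can finish in any order without affecting the results.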

THE SKGP

• Uses purely functional combinators to represent programs

• Efficient, powerful, reusable code

• Algorithm becomes superlinear in parallel application because of code reuse

COMBINATORS

• Applicative algebra, derived from Lambda calculus, binds left-to-right

• Sxyz = xz(yz)

• Kxy = x

• Ix = x

• Bxyz = x(yz)

• Cxyz = xzy

VARIABLE ABSTRACTION

• D.A. Turner showed that all bound variables could be removed completely using combinators (Turner 1979, A New Implementation Technique for Applicative Languages, Software–Practice and Experience, vol 9, 31-49 )

• Essentially this provides a way to create expressions that are combinators applied to data, with no reference to variables

EXAMPLE COMBINATOR FUNCTION

Example: ‘S(S(K +)(K 1))I’ is the function that adds 1, so S(S(K +)(K 1))I applied to 3 is:

S(S(K +)(K 1))I 3

S(K +)(K 1) 3 (I 3)

K + 3 (K 1 3)(I 3)

+ (K 1 3)(I 3)

+ 1 (I 3)

+ 1 3

4
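
The combinator definitions and the add-1 example can be checked with curried Python functions. This is only a sketch of the algebra, not of the SKGP: `plus`, `minus`, and `double` are assumed helper primitives. Python also evaluates arguments eagerly, so a recursive definition such as factorial would need lazy evaluation that this sketch does not provide.

```python
# Curried combinators, following the slide's definitions.
S = lambda x: lambda y: lambda z: x(z)(y(z))   # Sxyz = xz(yz)
K = lambda x: lambda y: x                      # Kxy  = x
I = lambda x: x                                # Ix   = x
B = lambda x: lambda y: lambda z: x(y(z))      # Bxyz = x(yz)
C = lambda x: lambda y: lambda z: x(z)(y)      # Cxyz = xzy

# Curried primitives (assumed helpers, not from the slides).
plus = lambda a: lambda b: a + b
minus = lambda a: lambda b: a - b
double = lambda n: 2 * n

add_one = S(S(K(plus))(K(1)))(I)       # S(S(K +)(K 1))I
four = add_one(3)

composed = B(add_one)(double)(5)       # add_one(double(5))
swapped = C(minus)(1)(10)              # minus 10 1
```

Note that `add_one` mentions no variable at all: the argument is routed to where it is needed purely by S, K, and I.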

COMBINATOR FUNCTIONS QUICKLY BECOME COMPLEX

Here is the function for factorial:

def fac = S(S(S(K cond)(S(S(K =)(K 0))I))(K 1))(S(S(K *)I)(S(K fac)(S(S(K -)I)(K 1))))

Evaluation is left as an “exercise for the reader.”

THE SKGP

• Implements programs as graphs, using combinators with GP to produce purely functional (combinator) expressions

• Combinators have the property of being ‘structure altering operators’

• There is evidence that GP can be limited in its search ability without such a capability

Daida, unpublished, based on Daida 2004, Demonstrating Constraints to Diversity with a Tunably Difficult Problem for Genetic Programming

CHURCH-ROSSER THEOREM

• The Church-Rosser Theorem says pure function evaluation can be order independent: Regardless of order of evaluation, result will be the same

• Because of this, each functional piece, when evaluated, can be stored for re-use since order of evaluation does not matter

• Because GP shares pieces across generations, reuse gives super-linear speed up: you don’t have to recompute each component
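
The reuse argument can be illustrated with a cache keyed on (subtree, input). This is a simplified stand-in for whatever the SKGP does internally, which the slides do not detail: it assumes programs are nested tuples `(operator, left, right)` over a single variable `x`, and the class name is hypothetical.

```python
import operator

FUNCTIONS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

class CachingEvaluator:
    """Evaluate pure expression trees, storing each subtree's result for reuse."""

    def __init__(self):
        self.cache = {}
        self.hits = 0

    def evaluate(self, tree, x):
        if not isinstance(tree, tuple):
            return x if tree == "x" else tree    # variable or constant
        key = (tree, x)
        if key in self.cache:
            self.hits += 1                       # result reused, not recomputed
            return self.cache[key]
        op, left, right = tree
        value = FUNCTIONS[op](self.evaluate(left, x), self.evaluate(right, x))
        self.cache[key] = value
        return value

shared = ("*", "x", "x")           # a building block inherited by both programs
parent = ("+", shared, 1)
child = ("-", shared, "x")

ev = CachingEvaluator()
parent_value = ev.evaluate(parent, 3)
hits_after_parent = ev.hits
child_value = ev.evaluate(child, 3)
```

Caching is safe here only because the expressions are pure: the same subtree on the same input always yields the same value, whatever order things are evaluated in.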

CASE STUDY: MODELING THE MODEL

• Modeling chemical kinetics for NASA

• NASA had a set of first principle models used to simulate combustion of jet fuel and its exhaust gases: accurate but very slow

• By using the simulator to train the SKGP, it was able to produce a highly accurate function for predicting output gas amounts across a wide range of values

• The resulting function was 2370x faster than running the simulation

• The function was a highly accurate empirical solution of the PDEs

CASE STUDY: LISTENING TO DATA

• Collaboration with Dr. Richard Cote and USC to study bladder cancer

• Is there a molecular signature that matches T-stage of tumors? No! Attempt produced complicated, poorly performing functions

• Examining data showed that tumors with local metastasis were consistently misclassified

• Is there a signature in tumor that indicates local mets? Yes! Produced a set of concise, highly accurate, biologically sensitive functions that could identify when a tumor had metastasized

SOME APPLICATIONS

• Inferential sensors (Dow Chemical)

• Financial modeling (Analytic Research Foundation, State Street Global Advisors)

• Antenna design (NASA)

• Analog circuit layout (Solido Design)

• Solid State Memory management (NVMdurance)

OPEN SOURCE SOLUTIONS

• Java: ECJ - a well known Java implementation from one of the well known researchers in GP

• Python: DEAP - an “all-in one package” written in Python

• Clojure: PushGP - a stack-based version of GP with many nice features, also developed by a respected GP researcher

PROPRIETARY

• Evolver by Evolution Enterprises: http://evolver.biz

• Data Modeler by Evolved Analytics: http://www.evolved-analytics.com/

GP REFERENCES

• J. Koza, Genetic Programming I-IV, Morgan Kaufmann and Kluwer.

• R. Poli, W.B. Langdon and N.F. McPhee, A Field Guide to Genetic Programming

• <Various> Genetic Programming Theory and Practice I-XI, 2002-2013

• Mitra et al., The use of genetic programming in the analysis of quantitative gene expression profiles for identification of nodal status in bladder cancer, BMC Cancer, 6(159), 2006

POSSIBLE FUTURES

• Some immediate areas of application include Smart Grid and energy efficient designs, intrusion detection, discovery of protein-gene-SNP networks

• Since evolutionary algorithms give a multi-dimensional analysis in the form of a population of solutions they provide more information than a single solution

• EAs can continuously analyze data as it comes in, adapting to a changing environment while still providing high performance solutions

• There is still a gap to bridge from functions to full programs, though functional methods reduce it and could lead to functional co-applications (an ecology of functions)

“THE BEST WAY TO PREDICT THE FUTURE IS TO INVENT IT.”

-ALAN KAY
