Automatic Synthesis of Statistical and Machine Learning Algorithms
Johann Schumann, SGT, Inc., NASA ARC
Example: Analysis of terminal traffic pattern
• Given: trajectory data for landing and takeoff from the airport
• Clustering of trajectories to find structure in the data
Let's use our favorite "multivariate clustering" software/tool:
• Matlab
• R
• Weka
• AutoBayes
• …
Example: Analysis of terminal traffic pattern
Result: Num_classes = 5
Oops… expecting only 4 classes
Need a special PDF (probability density function) to handle angles
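The slide does not say why the fifth class appears. One plausible reading, stated here as an assumption, is that headings wrap around at 0°/360°, so a Gaussian treats tracks heading just east and just west of north as far apart. The sketch below uses made-up headings to compare an arithmetic mean with a circular mean, and evaluates the von Mises density, the circular counterpart of the Gaussian. All function names are illustrative and none of this is AutoBayes code.

```python
import math

def arithmetic_mean_deg(angles):
    """Plain average; ignores the wrap-around at 0/360 degrees."""
    return sum(angles) / len(angles)

def circular_mean_deg(angles):
    """Average the unit vectors, then convert back to an angle in [0, 360)."""
    s = sum(math.sin(math.radians(a)) for a in angles)
    c = sum(math.cos(math.radians(a)) for a in angles)
    return math.degrees(math.atan2(s, c)) % 360.0

def bessel_i0(kappa, terms=50):
    """Modified Bessel function I0 via its power series."""
    total, term = 0.0, 1.0
    for k in range(terms):
        if k > 0:
            term *= (kappa / 2.0) ** 2 / (k * k)
        total += term
    return total

def von_mises_pdf(theta, mu, kappa):
    """Von Mises density (angles in radians); periodic with period 2*pi."""
    return math.exp(kappa * math.cos(theta - mu)) / (2.0 * math.pi * bessel_i0(kappa))

# Hypothetical headings scattered around north.
headings = [350.0, 355.0, 5.0, 10.0]
naive = arithmetic_mean_deg(headings)   # lands due south
circ = circular_mean_deg(headings)      # lands at north (up to rounding)
print(naive, circ)
```

With these four headings the arithmetic mean is 180°, the opposite of where the tracks actually point, while the circular mean sits at north.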
Declarative problem specification for the task
• Declare constants and parameters
• The class frequency is a probability, i.e., it adds up to 1
• c is the class assignment
• The data are Gaussian distributed
• What do we want?
  • Estimates of the parameters
  • Return the class assignment
model cluster1 as 'simple cluster analysis'.
const nat N as 'data points'. where 0 < N.
const nat C. where 0 < C. where C << N.
double µ(0..C-1), s(0..C-1).
double phi(0..C-1) as 'class frequency'.
where 0 = sum(_i := 0 .. C-1, phi(_i))-1.
output double c(0..N-1).
c(_) ~ discrete(vector(_i := 0 .. C-1, phi(_i))).
data double x(0..N-1).
x(_i) ~ gauss(µ(c(_i)), s(c(_i))).
max pr({x} | {phi, s, µ}) for {phi, s, µ}.
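To give a feel for what solving this specification involves, here is a minimal hand-written sketch of expectation-maximization (EM) for a one-dimensional Gaussian mixture. EM is one of the clustering variants named later in the talk. This is not AutoBayes output: the initialization, iteration count, and synthetic data are assumptions made for illustration. The variables `phi`, `mu`, `sigma`, and `c` correspond to `phi`, `µ`, `s`, and `c` in the specification.

```python
import math
import random

def gauss_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def em_gaussian_mixture(x, C, iterations=100):
    """Estimate phi, mu, sigma by EM and return the most likely class per point."""
    n = len(x)
    xs = sorted(x)
    # Assumed initialization: means at evenly spaced quantiles, narrow equal widths.
    mu = [xs[int((j + 0.5) * n / C)] for j in range(C)]
    spread = (xs[-1] - xs[0]) / (4.0 * C) or 1.0
    sigma = [spread] * C
    phi = [1.0 / C] * C
    for _ in range(iterations):
        # E-step: responsibility of each class for each data point.
        resp = []
        for xi in x:
            w = [phi[j] * gauss_pdf(xi, mu[j], sigma[j]) for j in range(C)]
            total = sum(w) or 1e-300
            resp.append([wj / total for wj in w])
        # M-step: re-estimate class frequency, mean, and width.
        for j in range(C):
            nj = sum(r[j] for r in resp)
            if nj < 1e-12:
                continue  # empty class: keep its previous parameters
            phi[j] = nj / n
            mu[j] = sum(r[j] * xi for r, xi in zip(resp, x)) / nj
            var = sum(r[j] * (xi - mu[j]) ** 2 for r, xi in zip(resp, x)) / nj
            sigma[j] = max(math.sqrt(var), 1e-6)
    c = [max(range(C), key=lambda j: r[j]) for r in resp]
    return phi, mu, sigma, c

# Synthetic data: two well-separated clusters (illustrative only).
random.seed(0)
x = [random.gauss(0.0, 1.0) for _ in range(200)] + \
    [random.gauss(10.0, 1.0) for _ in range(200)]
phi, mu, sigma, c = em_gaussian_mixture(x, C=2)
print(phi, mu, sigma)
```

The declarative specification states only the model and the goal. Everything in this sketch (the E-step, the M-step, the initialization) is what a synthesis tool has to derive.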
Corrected declarative specification
• Modifications to handle angles (changes in red on the slide)
model ac_track as 'AC track analysis'.
const nat N as '# tracks'. where 0 < N.
const nat C. where 0 < C. where C << N.
double phi(0..C-1) as 'class frequency'.
where 0 = sum(_i := 0 .. C-1, phi(_i))-1.
double l_x(0..C-1), m_x(0..C-1).
output double c(0..N-1).
c(_) ~ discrete(vector(_i := 0 .. C-1, phi(_i))).
data double x(0..N-1).
x(_i) ~ vonmises(l_x(c(_i)), m_x(c(_i))).
max pr({x} | {phi, l_hd, m_hd}) for {phi, l_hd, m_hd}.
This is the actual input of the AutoBayes tool

Program Synthesis
• AutoBayes generated an entirely different algorithm
• 880 lines of C++ versus 582 lines
• Highly documented code
• Interfaces to Matlab or Octave

The AutoBayes Synthesis tool
• Tool for the automated generation of customized data analysis and machine learning algorithms
• Bayesian approach for problem break-down and solution
• Fully declarative input specifications
• Target languages: C, C++
• Schema-based program synthesis
Schema-based synthesis: Example
• We need an algorithm for the problem: max -5*x**2 + 3*x + 7 for x.
Then we can apply the textbook technique:
1. calculate the derivative: df/dx
2. set it to 0: 0 = df/dx
3. solve that equation for x
4. the solution is the desired maximum
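Worked through for the example above, the four steps give the following. The second-derivative check in the last line is not one of the listed steps, but it confirms that the stationary point is a maximum.

```latex
\begin{align*}
f(x) &= -5x^2 + 3x + 7 \\
\frac{df}{dx} &= -10x + 3 \\
0 &= -10x + 3 \;\Longrightarrow\; x = \tfrac{3}{10} \\
f\!\left(\tfrac{3}{10}\right) &= -5 \cdot \tfrac{9}{100} + \tfrac{9}{10} + 7 = 7.45 \\
\frac{d^2 f}{dx^2} &= -10 < 0
\end{align*}
```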
Schema for univariate optimization
1. build the derivative: df/dx
2. set it to 0: 0 = df/dx
3. solve that equation for x and generate code for that
4. the solution is the desired maximum

schema(max F for X, C) :-
    % INPUT (Problem), OUTPUT (Code fragment)
    length(X, 1),
    % calculate the first derivative
    simplify(deriv(F, X), DF),
    % solve the equation
    solve(true, 0 = DF, X, S),
    C = S
..
Schema for univariate optimization

schema(max F for X, C) :-
    % INPUT (Problem), OUTPUT (Code fragment)
    % guards
    length(X, 1),
    % calculate the first derivative
    simplify(deriv(F, X), DF),
    % solve the equation
    solve(true, x, 0 = DF, S),
    % is that really a maximum? (second derivative must be negative)
    simplify(deriv(DF, X), DDF),
    solve(true, x, 0 > DDF, _),
    C = S.
Schema for univariate optimization

schema(max F for X, C) :-
    % INPUT (Problem), OUTPUT (Code fragment)
    % guards
    length(X, 1),
    % calculate the first derivative
    simplify(deriv(F, X), DF),
    % solve the equation
    solve(true, x, 0 = DF, S),
    % possibly more checks
    % is that really a maximum?
    simplify(deriv(DF, X), DDF),
    ( solve(true, x, 0 > DDF, _)
    -> true
    ;  writeln('Proof obligation not solved')
    ),
    C = S.
Schema for univariate optimization

schema(max F for X, C) :-
    % INPUT (Problem), OUTPUT (Code fragment)
    % guards
    length(X, 1),
    % calculate the first derivative
    simplify(deriv(F, X), DF),
    % solve the equation
    solve(true, x, 0 = DF, S),
    % possibly more checks
    % is that really a maximum?
    simplify(deriv(DF, X), DDF),
    ( solve(true, x, 0 > DDF, _)
    -> true
    ;  writeln('Proof obligation not solved automatically')
    ),
    % emit an annotated code fragment that binds the solution to a fresh variable
    XP = ['The maximum for', expr(F), 'is calculated ...'],
    V = pv_fresh,
    C = let(assign(V, S, [comment(XP)]), V).
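The shape of this schema (guard, derivation, proof obligation, result) can be imitated in a few lines of Python. The sketch below is only an analogue under strong assumptions: it handles quadratic polynomials given as coefficient lists, it returns a number rather than a code fragment, and every function name is hypothetical. It is not how AutoBayes is implemented.

```python
def deriv(coeffs):
    """Derivative of a polynomial given as [c0, c1, c2, ...] (lowest power first)."""
    return [k * c for k, c in enumerate(coeffs)][1:]

def schema_max_univariate(coeffs):
    """Return (x_star, notes), or None if the guard rejects the problem."""
    # Guard (assumption of this sketch): only proper quadratics are handled,
    # so the derivative is linear and can be solved in closed form.
    if len(coeffs) != 3 or coeffs[2] == 0:
        return None
    df = deriv(coeffs)            # [c1, 2*c2]
    x_star = -df[0] / df[1]       # solve 0 = c1 + 2*c2*x
    ddf = deriv(df)               # [2*c2]
    notes = []
    # Proof obligation: the second derivative must be negative for a maximum.
    if not ddf[0] < 0:
        notes.append("Proof obligation not solved automatically")
    return x_star, notes

# The slide's example: max -5*x**2 + 3*x + 7 for x.
solution, notes = schema_max_univariate([7.0, 3.0, -5.0])
# A convex quadratic: the stationary point exists but is not a maximum.
bad_solution, bad_notes = schema_max_univariate([7.0, 3.0, 5.0])
# A cubic fails the guard, so this schema does not apply.
rejected = schema_max_univariate([1.0, 0.0, 0.0, 2.0])
print(solution, notes, bad_notes, rejected)
```

As in the Prolog version, a failed obligation does not abort the schema. It is reported so that a human can check it.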
AutoBayes Applications
• NASA Open Source: https://software.nasa.gov/software/ARC-16276-1
• Manual: https://ntrs.nasa.gov/archive/nasa/casi.ntrs.nasa.gov/20080042409.pdf
Educational
Astrophysics, Software Testing
Air Traffic Control
Earth Science / Geodata
Algorithm Design via Synthesis
• An appropriate machine-learning algorithm is the basis for each application
• Some domains require specialized algorithms or algorithm variants for the specific task at hand. Examples:
  • Clustering, Optimization, Filtering
• Some program synthesis systems (like AutoBayes) can generate a customized algorithm from scratch or from schemas
• Algorithm variants can be generated easily for the same task, e.g.
  • K-means, EM, kd-trees, … for clustering
  • Gradient descent, Nelder-Mead, BFGS, Levenberg-Marquardt, … for optimization
  • Matrix-inverse, Schmidt, Bierman update, etc. for Kalman filter
Algorithm Instantiation via Synthesis
• Many ML tasks require substantial mathematical pre-calculations
  • E.g., discretization and linearization of the state-transition matrix in a Kalman filter
• Such calculations need to be done before the results can be "plugged" into the ML algorithm
• Algorithms can often be improved using such results (e.g., when a matrix is diagonal)
• Examples of such systems:
  • AutoFilter: synthesis of KF and EKF algorithms
  • *kf: synthesis of Kalman filter software [1]
[1] T. McClure. On the Automatic Generation of Recursive Attitude Determination Algorithms, AAS 2017
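As a small illustration of such a pre-calculation, consider discretizing a continuous-time model x' = A x into x[k+1] = F x[k] with F = exp(A·dt). The constant-velocity model below is a hypothetical example, not taken from the talk. Because A·A = 0 for this model, the exponential series terminates and F = I + A·dt exactly. A tool that derives this symbolically can emit the closed form instead of a runtime matrix exponential. The sketch checks the closed form against a truncated series.

```python
def mat_mul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def expm_series(a, dt, terms=12):
    """Truncated power series for exp(a * dt)."""
    n = len(a)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        # term_k = term_{k-1} * (a * dt) / k
        term = [[v * dt / k for v in row] for row in mat_mul(term, a)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

# Hypothetical constant-velocity model with state (position, velocity).
A = [[0.0, 1.0],
     [0.0, 0.0]]
dt = 0.1

F_numeric = expm_series(A, dt)
F_closed = [[1.0, dt],
            [0.0, 1.0]]   # I + A*dt, valid because A*A = 0
print(F_numeric, F_closed)
```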
Workflow by Synthesis
• Many machine-learning tasks require a number of individual activities
  • Reading and parsing of data
  • Data transformation and preprocessing
  • Training and testing
  • Assessment of results
  • Visualization
  • …
• Workflow tool for ML
  • Reduce amount of custom code
  • Increase flexibility
• Example:
  • Weka Knowledge-flow (not synthesis per se)
Architecture by Synthesis
• Modern hardware architectures enable the fast and efficient execution of machine-learning algorithms
  • GPU, MPI cluster, Epiphany, FPGA, TensorFlow, neuromorphic, …
• Very high effort to implement an ML algorithm for a specific hardware architecture
• Very easy to get things wrong
• Program synthesis tools can generate efficient and customized algorithms for a number of different target architectures
  • From a single high-level algorithm description
• Example:
  • SpiralGen (spiralgen.com)
Algorithm Optimization by Synthesis
• System requirements can constrain time, power, and hardware alternatives
• What is the "best" algorithm and hardware architecture for the given task under the given requirements?
• Synthesis techniques can help to solve that problem:
  • Fast and effortless generation of multiple algorithms and architectures for the same ML task
  • Automated generation of test data, evaluation and test environments
What else can synthesis do for ML?
• Automatic generation of algorithm design documentation and detailed mathematical derivations
• Automatic generation of multiple artifacts for debugging, testing, verification and validation
[Figure: Algorithm derivation generated by AutoBayes (excerpt)]
Synthesis for ML: All problems solved?
Synthesis systems for ML applications have huge advantages, but
• They are very hard and costly to develop and maintain
• The learning curve for the user is steep; tools are often too research-oriented
• Generated code is difficult to integrate, understand, and modify
Future
• Strongly increased need for ML algorithms on specialized hardware/software platforms (UAS, (Cube)Sat, rovers, …)
• Availability of special hardware for ML (TensorFlow, neuromorphic, highly parallel, low power)
• Rad-hard by algorithm
• Advanced, fast-changing ML tasks during the mission
These make synthesis for machine-learning algorithms and software an exciting prospect.