Robots and Automatic Genome Annotation
Ross D. King
Department of Computer Science
University of Wales, Aberystwyth
-
Talk Plan
Data Mining based gene function prediction
The Robot Scientist
Automating annotation and experimentation
-
Data Mining Prediction
We have developed a method for predicting the functional class of gene products based on data mining.
The idea is to learn a reliable predictive function from examples of genes whose products have a known function.
This function is then applied to genes whose functional class is unknown.
Applied to: E. coli, M. tuberculosis, S. cerevisiae, A. thaliana.
We call this approach Data Mining Prediction (DMP).
-
Classification schemes (MIPS/GO)
Hierarchy of classes:
1,0,0,0 "METABOLISM"
1,1,0,0 "amino acid metabolism"
1,1,1,0 "amino acid biosynthesis"
1,1,4,0 "regulation of amino acid metabolism"
1,1,7,0 "amino acid transport"
1,1,10,0 "amino acid degradation (catabolism)"
1,1,99,0 "other amino acid metabolism activities"
1,2,0,0 "nitrogen and sulfur metabolism"
1,3,0,0 "nucleotide metabolism"
1,4,0,0 "phosphate metabolism"
1,5,0,0 "C-compound and carbohydrate metabolism"
1,6,0,0 "lipid, fatty-acid and isoprenoid metabolism"
1,7,0,0 "metabolism of vitamins, cofactors, and prosthetic groups"
1,20,0,0 "secondary metabolism"
... and ORFs may have multiple functions too!
-
Sequence Data
478 attributes in total

field             description                                     type
aa_rat_X          % of amino acid X in the protein                real
seq_len           length of the protein sequence                  int
aa_rat_pair_X_Y   % of the amino acids X and Y consecutively      real
mol_wt            molecular weight of the protein                 int
theo_pI           theoretical pI (isoelectric point)              real
atomic_comp_X     atomic composition of X (C,H,N,O,S)             real
aliphatic_index   aliphatic index                                 real
hydro             grand average of hydropathy                     real
strand            the DNA strand                                  'w' or 'c'
position          the number of exons (no. of start positions)    int
cai               codon adaptation index                          real
motifs            number of PROSITE motifs                        int
tmSpans           number of transmembrane spans                   int
chromosome        chromosome number                               1..16, mit
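As an illustration of how the simplest of these attributes might be derived, here is a minimal Python sketch that computes seq_len, aa_rat_X and aa_rat_pair_X_Y from a protein sequence. The function name and the choice of denominator for the pair percentages (number of adjacent pairs) are assumptions made for this example, not details given on the slide.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues

def sequence_attributes(seq):
    """Compute seq_len, aa_rat_X and aa_rat_pair_X_Y for one protein.

    Assumption: pair percentages are taken over the len(seq) - 1
    adjacent residue pairs; the slide does not state the denominator.
    """
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        raise ValueError("empty sequence")
    attrs = {"seq_len": n}
    singles = Counter(seq)
    for aa in AMINO_ACIDS:
        attrs[f"aa_rat_{aa}"] = 100.0 * singles[aa] / n
    pairs = Counter(seq[i:i + 2] for i in range(n - 1))
    n_pairs = max(n - 1, 1)
    for x in AMINO_ACIDS:
        for y in AMINO_ACIDS:
            attrs[f"aa_rat_pair_{x}_{y}"] = 100.0 * pairs[x + y] / n_pairs
    return attrs

# Truncated YAL001C fragment as shown on the homology slide.
example = sequence_attributes("mvltiypdelvqivsdkiasnkgkitlnqlwdisgkyfdlsdk")
print(example["seq_len"], round(example["aa_rat_K"], 1))
```

Emitting all 400 pair attributes, including zeros, keeps the attribute vector the same width for every ORF, which is convenient for a propositional learner.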
-
Homology data
YAL001C: mvltiypdelvqivsdkiasnkgkitlnqlwdisgkyfdlsdk....
sfc3: keyword(membrane) length(358) dbref(prosite) dbref(embl)
We look up the associated information from SwissProt
-
Predicted Secondary Structure Data
mvltiypdelvqivsdkiasnkgkitlnqlwdisgkyfdlsdkkvk...
cbbbbccaaaaaaaaaaaacccccbbbbaaaaaacccbbccccccb...
We record length and relative positions of the secondary structure elements.
This is relational data.
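A minimal sketch of how a per-residue prediction string could be turned into secondary structure elements with their lengths and positions. It assumes the letters a, b and c stand for alpha-helix, beta-strand and coil; the function names are hypothetical.

```python
from itertools import groupby

def structure_elements(ss):
    """Run-length encode a per-residue prediction string.

    Returns (kind, start, length) tuples in sequence order, where start
    is the 0-based residue index at which the element begins.
    """
    elements = []
    start = 0
    for kind, run in groupby(ss):
        length = len(list(run))
        elements.append((kind, start, length))
        start += length
    return elements

def has_element(ss, kind, length):
    """True if the prediction contains an element of this kind and length."""
    return any(k == kind and n == length for k, _, n in structure_elements(ss))

for element in structure_elements("cbbbbccaaa"):
    print(element)
```

Each tuple can be stored as one fact about the ORF, which is what makes the data relational: an ORF has a variable number of elements rather than a fixed set of columns.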
-
Expression Data
Spellman et al (1998), Roth et al (1998)
DeRisi et al (1997), Eisen et al (1998)
Gasch et al (2000, 2001), Chu et al (1998)
Microarray experiments to measure expression changes in yeast under a variety of conditions, including cell cycle, heat shock, diauxic shift.
Short time series data, numerical-valued
-
Phenotype Data
Data from knockout gene growth experiments
Many missing data
Data taken from 3 sources (TRIPLES, MIPS, EUROFAN)
s = sensitive (less growth)
w = wild-type (no observable effect)
r = resistant (more growth)
n = no data

growth medium      deleted ORF
                   YAL001C  YAL019W  YAL021C  YAL029C
calcofluor white   w        n        n        n
sorbitol           n        s        n        w
benomyl            n        w        n        w
H2O2               w        w        n        r
...
-
What are the Machine Learning Issues?
Large volume of data
Missing data
Accurate results required
Intelligible results required
Class hierarchy
Multiple labels
Relational data
-
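Data Mining Prediction (DMP)
Entire database: 2/3 data for rule creation, 1/3 test data
Data for rule creation: 2/3 training data, 1/3 validation data
Training data -> rule generation (PolyFARM, C4.5) -> all rules
All rules + validation data -> select best rules -> best rules
Best rules + test data -> measure rule accuracy -> results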
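The nested 2/3 : 1/3 partition in this diagram can be sketched in a few lines of Python. This is only an illustration of the split: the function names, the random shuffling and the seed are assumptions, and the rule learners (PolyFARM, C4.5) are not modelled.

```python
import random

def split_two_thirds(items, rng):
    """Shuffle and split a collection into a 2/3 part and a 1/3 part."""
    items = list(items)
    rng.shuffle(items)
    cut = (2 * len(items)) // 3
    return items[:cut], items[cut:]

def dmp_split(orfs, seed=0):
    """Partition ORFs into training, validation and test sets.

    Entire database -> 2/3 rule creation, 1/3 test;
    rule creation   -> 2/3 training,      1/3 validation.
    """
    rng = random.Random(seed)
    rule_creation, test = split_two_thirds(orfs, rng)
    training, validation = split_two_thirds(rule_creation, rng)
    return training, validation, test

training, validation, test = dmp_split([f"orf{i}" for i in range(90)])
print(len(training), len(validation), len(test))
```

Keeping the test set out of both rule generation and rule selection is what allows the final accuracy figures to be read as estimates on unseen ORFs.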
-
Application to Bacterial Genomes
Successful for both M. tuberculosis and E. coli.
Of the ORFs with no assigned function, >40% were predicted to have a function at one or more levels of the class hierarchy.
It was found that many of the predictive rules were more general than possible using sequence homology.
References
King et al. (2000) KDD 2000
King et al. (2000) Yeast (Comparative and Functional Genomics)
King et al. (2001) Bioinformatics
-
Summary Results (Bacteria)
Using voting (2 or more rules agree on a prediction)
Level 2: 128 ORFs predicted - 87.5% accuracy
Level 3: 23 ORFs predicted - 91.3% accuracy

All predictions
Level 2: 335 ORFs predicted - 64.5% accuracy
Level 3: 204 ORFs predicted - 44.6% accuracy
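The voting filter can be illustrated with a short Python sketch. It assumes each rule that fires for an ORF contributes one predicted class label; the function name and the string encoding of classes are hypothetical.

```python
from collections import Counter

def vote(rule_predictions, min_votes=2):
    """Keep only the classes predicted by at least min_votes rules.

    rule_predictions holds one class label per rule that fired for a
    single ORF. Several classes can pass, since ORFs may have multiple
    functions.
    """
    counts = Counter(rule_predictions)
    return sorted(label for label, n in counts.items() if n >= min_votes)

fired = ["Transport/binding proteins", "Transport/binding proteins", "Energy metabolism"]
print(vote(fired))
```

Requiring agreement trades coverage for accuracy, which matches the pattern in the figures above: fewer ORFs receive a prediction, but the predictions that remain are more often correct.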
-
Example Rule (level 2 E. coli)
If
the ORF is not predicted to have a β-strand of length 3, and
a homologous protein from class Chytridiomycetes was found
Then its functional class is "Cell processes, Transport/binding proteins"

12/13 (86%) correct on Test Set. The probability of this result occurring by chance is estimated at 4 x 10^-7.
24 ORFs of unknown function are predicted by the rule.
16 of these ORFs now have a putative or confirmed function: 93.8% accurate predictions.
-
Experimental Confirmation
The original bacterial ORF predictions were made over three years ago.
In the intervening time many more ORFs have been sequenced, making traditional homology-based prediction methods more accurate and sensitive, and the functions of some ORFs have been determined by wet biology.
The E. coli genome has recently been re-annotated by Monica Riley's group.
-
Wet Biology Confirmation
A number of predictions have been confirmed or falsified by new wet experimental data.
This new data is biased towards hard classes. Despite this, the results are still good:
Level 2: 23 predictions - 47.8% accuracy
Level 3: 23 predictions - 43.4% accuracy
This is very much better than random, as there are many classes.
-
Confirmation of Wet Predictions

ORF    Rule  Predicted Class                 Confirmed Function                                              Result
b0805  8     Cell envelope                   Outer membrane protein                                          C
b1519  15    Degradation of small molecules  Trans-aconitate methyltransferase                               C
b1533  43    Transport/binding proteins      Cysteine pathway metabolite transport                           C
b1981  42    Transport/binding proteins      Shikimate and dehydroshikimate transport protein                C
b1981  56    Transport/binding proteins      Shikimate and dehydroshikimate transport protein                C
b2210  15    Degradation of small molecules  Malate:quinone oxidoreductase                                   C
b2392  43a   Transport/binding proteins      High-affinity manganese transporter                             C
b2392  43b   Transport/binding proteins      High-affinity manganese transporter                             C
b2392  54    Transport/binding proteins      High-affinity manganese transporter                             C
b2924  45    Transport/binding proteins      Component of the MscS mechanosensitive channel new gene family  C
b3839  43    Transport/binding proteins      Essential component of translocase                              C
b0103  42    Transport/binding proteins      dephospho-CoA kinase                                            W
b0103  41    Transport/binding proteins      dephospho-CoA kinase                                            W
b0103  43    Transport/binding proteins      dephospho-CoA kinase                                            W
b1822  15    Degradation of small molecules  23S rRNA m1G745 methyltransferase                               W
b2530  35    Global regulatory functions     cysteine desulfurase                                            W
b2392  14    Degradation of small molecules  High-affinity manganese transporter                             W
b2889  50    Energy metabolism carbon        Isopentenyl diphosphate isomerase                               W
b3222  54    Transport/binding proteins      ManNAc kinase                                                   W
b3223  39    Ribosome constituents           ManNAc epimerase                                                W
b3337  28    Laterally acquired elements     regulatory or redox component                                   W
b3338  39    Ribosome constituents           Periplasmic endochitinase                                       W
b3569  32    Laterally acquired elements     transcriptional regulator of xylose utilization                 W
b3955  8     Cell envelope                   Required for invasion of brain microvascular endothelial cells  EF
b3955  18    Energy metabolism carbon        Required for invasion of brain microvascular endothelial cells  EA
b3955  20    Energy metabolism carbon        Required for invasion of brain microvascular endothelial cells  EA
-
Results (Yeast)
Many rules from each data type
Rules at each level of hierarchy
Some classes are much easier to predict than others (for example "protein synthesis" at 71-93%, "energy" at 20-47%)
Good levels of accuracy on held out test data
Many predictions for ORFs of unknown function (some function at some level is predicted for 96% of the ORFs of unknown function)
Some rules explainable by biology -> scientific knowledge discovery
Clare & King (2003) Bioinformatics suppl. 2, 42-49
-
Accuracy Table

Datatype   Level 1  Level 2  Level 3  Level 4  all
Seq        55       55       33       0        71
Struc      49       43       0        0        58
Hom        65       38       69       20       55
Expr       42       37       35       0        75
Phen       75       40       7        0        68
-
Extension to Arabidopsis Genome
Collaborative project with the Institute of Grassland and Environmental Research and the University of Nottingham.
Large increase in data: 6,000 -> 25,000 ORFs.
Large amount of micro-array data from the Nottingham Arabidopsis Stock Centre.
250 million Prolog facts, 200,000 attributes, file sizes almost 2Gb.
7,964 gene function predictions with an expected accuracy >70%; 2,974 with an expected accuracy >90%.
We are currently growing 14 knockout varieties of Arabidopsis to test a sample of these predictions.
-
Availability
All rules and data available at http://www.aber.ac.uk/compsci/Research/bio/dss/
All predictions available at http://www.genepredictions.org
-
The Robot Scientist
-
The Robot Scientist Concept
Diagram labels: Background Knowledge, Machine Learning, Analysis, Consistent Hypothesis, Final Theory, Experiment(s) selection, Robot, Experiment(s), Results
The robot scientist project aims to develop a computer system that is capable of originating its own experiments, physically doing them, interpreting the results, and then repeating the cycle.
-
Motivation: Technological
In many areas of science our ability to generate data is outstripping our ability to analyse the data.
One scientific area where this is true is functional genomics, where data is now being generated on an industrial scale.
The analysis of scientific data needs to become as industrialised as its generation.
-
The Application Domain
Functional genomics
In yeast (S. cerevisiae) ~30% of the 6,000 genes still have no known function.
EUROFAN 2 has knocked out each of the 6,000 genes in mutant strains.
Task: to determine the function of a gene by auxotrophic growth experiments comparing mutants and wild type.
-
Logical Cell Model
We have built a logical model of the known metabolic pathways (coded in Prolog), taken from KEGG and other bioinformatic sources. This is essentially a directed graph, with metabolites as nodes and enzymes as arcs.
If a path can be found from cell inputs (metabolites in the growth medium) to all the cell outputs (essential compounds), then the cell can grow.
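The growth test can be sketched as graph reachability. The example below is a Python illustration rather than the Prolog model itself: the pathway is collapsed to a handful of single-substrate arcs, and every gene label except YDR060C (used for chorismate -> prephenate, as in the hypothesis slides) is a hypothetical placeholder.

```python
from collections import deque

# (substrate, product, gene coding for the enzyme): a toy fragment only.
ARCS = [
    ("shikimate", "chorismate", "GENE_A"),      # GENE_* labels are placeholders
    ("chorismate", "prephenate", "YDR060C"),
    ("prephenate", "tyrosine", "GENE_B"),
    ("prephenate", "phenylalanine", "GENE_C"),
    ("chorismate", "anthranilate", "GENE_D"),
    ("anthranilate", "tryptophan", "GENE_E"),
]
ESSENTIAL = {"tyrosine", "phenylalanine", "tryptophan"}

def reachable(medium, knocked_out=frozenset()):
    """Metabolites reachable from the medium, skipping knocked-out arcs."""
    seen = set(medium)
    queue = deque(medium)
    while queue:
        metabolite = queue.popleft()
        for substrate, product, gene in ARCS:
            if substrate == metabolite and gene not in knocked_out and product not in seen:
                seen.add(product)
                queue.append(product)
    return seen

def can_grow(medium, knocked_out=frozenset()):
    """The cell grows iff every essential compound is reachable."""
    return ESSENTIAL <= reachable(medium, knocked_out)

print(can_grow({"shikimate"}))                           # wild type
print(can_grow({"shikimate"}, {"YDR060C"}))              # knockout
print(can_grow({"shikimate", "prephenate"}, {"YDR060C"}))  # knockout fed prephenate
```

Real reactions can need several substrates at once, so a faithful model would fire an arc only when all of its inputs are available; the single-substrate form is kept here for brevity.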
-
AAA Model System
We started using the aromatic amino-acid (AAA) pathway in yeast as a model system to prove the principle of the Robot Scientist.
9 metabolites can be used off the shelf
15 knockout mutants from EUROFAN
The mutant can grow iff all three aromatic amino-acids can be synthesised (tyrosine, phenylalanine, tryptophan). Based on a pathway from glycerate-2-phosphate.
-
Figure: Phenylalanine, Tyrosine, and Tryptophan Pathways for S. cerevisiae
Diagram labels: Growth Medium, Metabolite import
Metabolites (nodes): Glycerate-2-Phosphate, Phosphoenolpyruvate, D-Erythrose-4-Phosphate, 3-deoxy-D-arabino-heptulosonate-7-phosphate, 3-Dehydroquinate, 3-Dehydroshikimate, 5-Dehydroshikimate, Shikimate, Shikimate 3-phosphate, 5-o-1-carboxyvinyl-3-phosphoshikimate, Chorismate, Prephenate, p-Hydroxyphenylpyruvate, TYROSINE, Phenylpyruvate, PHENYLALANINE, Anthranilate, TRYPTOPHAN, N-5-Phospho-β-D-ribosylanthranilate, 1-(2-Carboxylphenylamino)-1-deoxy-D-ribulose-5-phosphate, (3-Indolyl)-glycerolphosphate, Indole
ORFs (arc labels): YBR249C, YDR035W, YGR254W, YHR174W, YMR323W, YDR127W, YPR060C, YBR166C, YHR137W, YGL202W, YNL316C, YGL148W, YDR354W, YDR007W, YKL211C, YGL026C, YER090W
KEGG compound IDs: C00631, C00074, C00279, C04961, C00944, C02637, C02652, C00493, C03175, C01269, C00251, C00254, C01179, C00166, C03506, C01302, C00108, C04302, C00463, C00078, C00079, C00082
-
Experimental Methodology
Experiments consist of making particular growth media and testing if the mutants can grow (add metabolites to a basic defined medium).
A mutant is auxotrophic if it cannot grow on a defined medium that the wild type can grow on.
By observing the pattern of chemicals that recover growth, the function of the knocked-out gene can be inferred.
-
Inferring Hypotheses
In the philosophy of science it has often been argued that only humans can make the leaps of imagination necessary to form hypotheses.
We use Abductive Logic Programming to infer missing arcs/labels in our metabolic graph. With these missing arcs/labels filled in, we can explain (deductively) all the experimental results.
Reiser et al. (2001) ETAI 5, 233-244.
-
The Form of the Hypotheses
The form of the hypotheses we can infer is currently quite simple. Each hypothesis binds a particular gene to an enzyme that catalyses a reaction.
A correct hypothesis would be that YDR060C codes for the enzyme for the reaction chorismate -> prephenate.
An incorrect hypothesis would be that it codes for the enzyme for the reaction chorismate -> anthranilate.
We have also demonstrated how more complex abductive hypotheses could be formed.
-
A Discriminating Experiment
Hypothesis 1: YDR060C codes for the enzyme for the reaction chorismate -> prephenate.
Hypothesis 2: YDR060C codes for the enzyme for the reaction chorismate -> anthranilate.
These can be distinguished by growing the knockout YDR060C on prephenate or anthranilate. Note that these two experiments will have differing monetary cost.
-
Figure (repeated): Phenylalanine, Tyrosine, and Tryptophan Pathways for S. cerevisiae
-
Inferring Experiments
Given a set of hypotheses we wish to infer an experiment that will efficiently discriminate between them.
Assume:
Every experiment has an associated cost.
Each hypothesis has a probability of being correct.
The task:
To choose a series of experiments which minimise the expected cost of eliminating all but one hypothesis.
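For a handful of hypotheses the task can be solved by exhaustive recursion, as in the Python sketch below. This is an illustration of the objective only, not the ASE strategy itself: it assumes every experiment has a yes/no outcome (growth or no growth) that each hypothesis predicts with certainty, and the hypothesis names, experiment names, probabilities and costs are all made up.

```python
from functools import lru_cache

def best_plan(priors, costs, predicts):
    """Return (expected cost, first experiment) for eliminating all but one hypothesis.

    priors:   {hypothesis: probability of being correct}
    costs:    {experiment: cost}
    predicts: {experiment: {hypothesis: True if growth is predicted}}
    """
    @lru_cache(maxsize=None)
    def expected_cost(alive):
        if len(alive) <= 1:
            return 0.0, None
        total = sum(priors[h] for h in alive)
        best = (float("inf"), None)
        for exp, cost in costs.items():
            grows = frozenset(h for h in alive if predicts[exp][h])
            fails = alive - grows
            if not grows or not fails:
                continue  # this experiment cannot discriminate here
            p_grow = sum(priors[h] for h in grows) / total
            ec = (cost
                  + p_grow * expected_cost(grows)[0]
                  + (1 - p_grow) * expected_cost(fails)[0])
            if ec < best[0]:
                best = (ec, exp)
        return best

    return expected_cost(frozenset(priors))

priors = {"h1": 0.5, "h2": 0.3, "h3": 0.2}
costs = {"e_cheap": 1.0, "e_mid": 2.0, "e_dear": 10.0}
predicts = {
    "e_cheap": {"h1": True, "h2": False, "h3": False},
    "e_mid": {"h1": False, "h2": True, "h3": False},
    "e_dear": {"h1": True, "h2": True, "h3": False},
}
print(best_plan(priors, costs, predicts))
```

The search is exponential in the number of hypotheses, so it is only practical for toy cases; it is meant to make the cost recursion concrete.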
-
Comparison of different experimental strategies
ASE - Expected cost minimization.
Naive - Choose cheapest experiment.
Random - Randomly choose experiments.
The cost of a series of experiments is a function of the time taken and money spent. Time is Money.
-
The Robot
Biomek 200
-
Closing the Loop
We have physically implemented all aspects of the Robot Scientist system.
To the best of our knowledge this is the first active learning system that both explicitly forms hypotheses and experiments, and physically does real experiments.
-
Accuracy v Time
At the end of the 5th iteration: ASE 80.1%, Naive 74.0%, Random 72.2%.
ASE was significantly more accurate than either Naive (p < 0.05) or Random (p < 0.07) using a paired t-test.
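For reference, the statistic behind a paired t-test is the mean of the per-item differences divided by its standard error. The Python sketch below computes it with the standard library only; the accuracy values are hypothetical and are not taken from the experiments.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(a, b):
    """Return (t statistic, degrees of freedom) for paired samples a and b."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))  # stdev is the sample (n - 1) form
    return t, n - 1

# Hypothetical per-gene accuracies (%) for two strategies.
strategy_a = [82.0, 95.0, 55.0, 90.0]
strategy_b = [81.0, 93.0, 52.0, 86.0]
t, df = paired_t(strategy_a, strategy_b)
print(round(t, 3), df)
```

The resulting t value is compared against the critical value of Student's t distribution for n - 1 degrees of freedom. Pairing by gene removes the large gene-to-gene variation in difficulty from the comparison.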
Figure: RS Accuracy. Classification Accuracy (%) against Iterations for ase, random and naive.
Figure: RS cost vs accuracy. Classification Accuracy (%) against Log 10 Cost for ase, random and naive.

Average accuracy (%) and average cost per iteration, as plotted:

Iteration  ASE accuracy   ASE cost    Random accuracy  Random cost  Naive accuracy  Naive cost
0          57.3581        0           57.3581          0            57.3581         0
1          67.171875      10          67.348921875     805.46875    67.265625       10
2          76.135996875   57.6875     72.5854125       3607.875     68.550709375    39
3          79.544753125   184.03125   71.5776875       5641.65625   73.828478125    78
4          80.47376875    255.625     73.032365625     7381.9375    73.886753125    130
5          80.109184375   313.8125    72.1643125       9021.9375    73.93518125     182
YDR007W1ase5319100YDR007W1random51062393.1481YDR007W1naive518288.3333YER090W-0.598575-1.98785-210YER090W010.12345-17.067910YER090W0.59857512.1113-15.067900
YDR035W1ase531946.6667YDR035W1random5728100YDR035W1naive518255YGL026C5.918525-2.380602.6666750YGL026C03.703700.8148250YGL026C-5.9185256.08430-1.851850
YDR354W1ase5319100YDR354W1random567046.6667YDR354W1naive518295.5556YKL211C4.56385-8.9797253.02082560YKL211C05.30865-2.39197560YKL211C-4.5638514.288375-5.412800
YER090W1ase531984.4444YER090W1random51054584.4444YER090W1naive518296.1111YNL316C-5.254475-1.2646256.419751.555550YNL316C006.419755.2901250YNL316C5.2544751.26462503.7345750
YGL026C1ase547986.6667YGL026C1random51004981.1111YGL026C1naive518286.6667
YKL211C1ase518288.3333YKL211C1random51042595.5556YKL211C1naive518288.3333Sum-1.41637529.8210535.3318512.40584.02775Sum-0.7561.4323-14.952123.577025-3.3041Sum0.66637531.61125-50.2839511.171225-7.33185
YNL316C1ase527082.2222YNL316C1random51113669.3827YNL316C1naive518269.3827
YBR166C2ase5218100YBR166C2random5237878.5185YBR166C2naive518269.3827Mean-0.1770468753.727631254.416481251.5507250.50346875Mean-0.093757.6790375-1.86901252.947128125-0.4130125Mean0.0832968753.95140625-6.285493751.396403125-0.91648125
YDR007W2ase518288.3333YDR007W2random51172387.2222YDR007W2naive518288.3333
YDR035W2ase518255YDR035W2random51035947.2222YDR035W2naive518255StDev4.017230198312.17648273645.17775028127.91709926361.424024669StDev0.26516504297.59289673668.02895296415.40243555231.0206760074StDev4.08932520767.73736717476.58414747034.80923540462.4394271476
YDR354W2ase531988.3333YDR354W2random51979646.6667YDR354W2naive518288.3333
YER090W2ase547980YER090W2random5986985.3704YER090W2naive518296.1111sqrt(8)2.8284271247Ttest-0.12465409220.86587675332.41257200230.55400500951Ttest-12.8605153883-0.65841283111.5429591058-1.1445118229Ttest0.05761320721.4444531798-2.70013104890.8212582965-1.0626267029
YGL026C2ase547986.6667YGL026C2random51011480.4444YGL026C2naive518279.2593
YKL211C2ase5319100YKL211C2random52900574.4444YKL211C2naive518288.3333
YNL316C2ase527082.2222YNL316C2random5167564.5455YNL316C2naive518254.4444confidence fig for 7 degrees freedom1.895confidence fig for 7 degrees freedom1.895confidence fig for 7 degrees freedom1.895
YBR166C3ase527082.2222YBR166C3random51005464.5556YBR166C3naive518269.3827
YDR007W3ase5319100YDR007W3random532851.1111YDR007W3naive518288.3333
YDR035W3ase518255YDR035W3random5983117.3333YDR035W3naive518255
YDR354W3ase5319100YDR354W3random5103775.9259YDR354W3naive518274.321
YER090W3ase537459.7531YER090W3random51042547.1111YER090W3naive518259.7531
YGL026C3ase537479.2593YGL026C3random51101281.2963YGL026C3naive518279.2593
YKL211C3ase537474.321YKL211C3random556077.2222YKL211C3naive518274.321
YNL316C3ase537469.3827YNL316C3random51948578.5185YNL316C3naive518269.3827
YBR166C4ase5218100YBR166C4random5344100YBR166C4naive518254.4444
YDR007W4ase5319100YDR007W4random598446.6667YDR007W4naive518274.321
YDR035W4ase518255YDR035W4random51028833.3333YDR035W4naive518255
YDR354W4ase5319100YDR354W4random51710100YDR354W4naive518288.3333
YER090W4ase537459.7531YER090W4random52057985.3704YER090W4naive518259.7531
YGL026C4ase537479.2593YGL026C4random51060674.8485YGL026C4naive518279.2593
YKL211C4ase537474.321YKL211C4random51077695.3333YKL211C4naive518274.321
YNL316C4ase518242.7778YNL316C4random51128564.5556YNL316C4naive518242.7778
-
Accuracy v Money
Given a spend of £10^2.26 (about £182), the classification accuracies were: ASE 79.5%, Naive 73.9%, Random 57.4%. ASE was significantly more accurate than either Naive (p < 0.05) or Random (p < 0.001).
[Figure: RS Accuracy. Classification accuracy (%) against iterations for ASE, Random and Naive.]
[Figure: RS cost vs accuracy. Classification accuracy (%) against log10 cost (£) for ASE, Random and Naive.]

Average values plotted in the two figures (from the embedded worksheet):

Iteration   ASE acc. (%)   ASE cost   Random acc. (%)   Random cost   Naive acc. (%)   Naive cost
0           57.36          0          57.36             0             57.36            0
1           67.17          10         67.35             805.5         67.27            10
2           76.14          57.7       72.59             3607.9        68.55            39
3           79.54          184.0      71.58             5641.7        73.83            78
4           80.47          255.6      73.03             7381.9        73.89            130
5           80.11          313.8      72.16             9021.9        73.94            182
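The significance claim on this slide can be checked with a paired t-test over the eight genes. The sketch below is a minimal reconstruction, not the authors' script: it uses the per-gene accuracies that the embedded worksheet records for ASE at day 3, Random at day 0 and Naive at day 5 (the points of roughly equal spend), and the worksheet's own critical value of 1.895 for 7 degrees of freedom. The function and variable names are assumptions made for illustration.

```python
import math
import statistics

# Per-gene accuracies (%) from the embedded worksheet:
# (ASE at day 3, Random at day 0, Naive at day 5).
ACCURACY = {
    "YBR166C": (79.012325, 46.1988, 65.648125),
    "YDR007W": (97.083325, 63.7427, 84.830225),
    "YDR035W": (52.916675, 53.4503, 55.0),
    "YDR354W": (100.0, 63.7424, 86.6358),
    "YER090W": (70.98765, 63.1579, 77.9321),
    "YGL026C": (82.963, 56.9591, 81.11115),
    "YKL211C": (84.243825, 63.7427, 81.32715),
    "YNL316C": (69.151225, 47.8363, 58.9969),
}

# Critical value quoted in the worksheet for 7 degrees of freedom.
T_CRITICAL_7DF = 1.895


def paired_t(first, second):
    """Paired t statistic: mean difference over its standard error."""
    diffs = [a - b for a, b in zip(first, second)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    return mean_diff / (sd_diff / math.sqrt(len(diffs)))


ase = [row[0] for row in ACCURACY.values()]
random_ = [row[1] for row in ACCURACY.values()]
naive = [row[2] for row in ACCURACY.values()]

t_ase_random = paired_t(ase, random_)
t_ase_naive = paired_t(ase, naive)

print(f"ASE vs Random: t = {t_ase_random:.3f}")
print(f"ASE vs Naive:  t = {t_ase_naive:.3f}")
print("Both exceed the critical value:",
      t_ase_random > T_CRITICAL_7DF and t_ase_naive > T_CRITICAL_7DF)
```

Both statistics agree with the TTEST row in the worksheet (about 4.84 and 2.04), and both exceed 1.895.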
-
Time and Money
Cost is a positive function of time and money. ASE dominates on both, so ASE dominates for any reasonable cost function.
For example, to reach an accuracy of ~70%, ASE requires fewer trial iterations than Random at one hundredth of the cost, and almost half the number of iterations of Naive at a third of the cost.
King et al. (2004) Nature. 427, 247-252.
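The dominance argument can be illustrated with the averaged accuracy and cost figures from the presentation's embedded worksheet. The sketch below is an assumption-laden illustration rather than the published analysis: it looks up the first whole iteration at which each strategy reaches a target accuracy, and treats a strategy as dominating when it needs no more iterations and no more money. `first_reaching`, `dominates` and `total_cost` are hypothetical helper names.

```python
# (average accuracy %, average cumulative cost) per iteration 0..5,
# taken from the embedded worksheet.
RESULTS = {
    "ase": [(57.3581, 0.0), (67.171875, 10.0), (76.135996875, 57.6875),
            (79.544753125, 184.03125), (80.47376875, 255.625),
            (80.109184375, 313.8125)],
    "random": [(57.3581, 0.0), (67.348921875, 805.46875),
               (72.5854125, 3607.875), (71.5776875, 5641.65625),
               (73.032365625, 7381.9375), (72.1643125, 9021.9375)],
    "naive": [(57.3581, 0.0), (67.265625, 10.0), (68.550709375, 39.0),
              (73.828478125, 78.0), (73.886753125, 130.0),
              (73.93518125, 182.0)],
}


def first_reaching(strategy, target):
    """Return (iteration, cost) at the first iteration reaching the target accuracy."""
    for iteration, (accuracy, cost) in enumerate(RESULTS[strategy]):
        if accuracy >= target:
            return iteration, cost
    return None


def dominates(a, b, target):
    """True if strategy a needs no more iterations and no more money than b."""
    reach_a, reach_b = first_reaching(a, target), first_reaching(b, target)
    if reach_a is None:
        return False
    if reach_b is None:
        return True
    return reach_a[0] <= reach_b[0] and reach_a[1] <= reach_b[1]


def total_cost(strategy, target, time_weight, money_weight):
    """One possible cost function: a positive weighted sum of time and money."""
    iterations, money = first_reaching(strategy, target)
    return time_weight * iterations + money_weight * money


for name in RESULTS:
    print(name, first_reaching(name, 70.0))
print(dominates("ase", "random", 70.0), dominates("ase", "naive", 70.0))
```

Because ASE is no worse on either axis, any cost function that increases with both time and money ranks it first at this target. This coarse, whole-iteration lookup does not reproduce exactly the ratios quoted on the slide.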
-
Human Comparisons
We wanted to compare the performance of the Robot Scientist with that of humans.
We adapted the simulator to allow humans to choose experiments and interpret the results of cycles of experimentation.
We compared nine graduate computer scientists and biologists.
No significant difference was found between the best humans and the Robot.
-
Robotic Annotation
-
New Biological Knowledge
So far with the Robot Scientist we have only shown that we can automatically rediscover known biological knowledge.
We wish to extend this result to the discovery of new biological knowledge.
To do this we need to combine the robot scientist with conventional genome annotation bioinformatics, and DMP.
-
Robotic Annotation
One way of thinking about genome annotation is as a hypothesis formation process. Hypothesis formation is perhaps the hardest part of automating science.
Our idea is to incorporate bioinformatic annotation methods with genome annotation. The bioinformatic methods will generate the hypotheses which the robot scientist will experimentally test.
-
Genome Scale Model of Yeast Metabolism
We have extended our model of aromatic amino acid metabolism to cover most of what is known about yeast metabolism.
Includes 1,166 ORFs (940 known, 226 inferred).
Growth is predicted if there is a path from the growth medium to the defined end-points.
83% accuracy (based on 914 strain/medium predictions).
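The growth criterion on this slide can be read as a reachability test over reactions. The following is a minimal sketch under stated assumptions, not the authors' model: a reaction is assumed to fire once all of its substrates are available, a knockout is assumed to remove every reaction of that gene, and the gene and metabolite names are hypothetical placeholders.

```python
def reachable_metabolites(medium, reactions):
    """Forward-chain from the medium until no reaction adds a new metabolite."""
    available = set(medium)
    changed = True
    while changed:
        changed = False
        for substrates, products in reactions:
            if set(substrates) <= available and not set(products) <= available:
                available |= set(products)
                changed = True
    return available


def predicts_growth(medium, gene_reactions, end_points, knocked_out=()):
    """Predict growth if every end-point metabolite is reachable from the medium."""
    active = [
        reaction
        for gene, reactions in gene_reactions.items()
        if gene not in knocked_out
        for reaction in reactions
    ]
    return set(end_points) <= reachable_metabolites(medium, active)


# Hypothetical toy pathway: nutrient -> intermediate -> end_product.
GENE_REACTIONS = {
    "GENE_A": [(("nutrient",), ("intermediate",))],
    "GENE_B": [(("intermediate",), ("end_product",))],
}
MEDIUM = {"nutrient"}
END_POINTS = {"end_product"}

print(predicts_growth(MEDIUM, GENE_REACTIONS, END_POINTS))
print(predicts_growth(MEDIUM, GENE_REACTIONS, END_POINTS, knocked_out={"GENE_B"}))
print(predicts_growth(MEDIUM | {"end_product"}, GENE_REACTIONS, END_POINTS,
                      knocked_out={"GENE_B"}))
```

The last call mimics an auxotrophy experiment: supplementing the medium with the missing end product restores predicted growth of the knockout.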
-
The Model is Incomplete
It is not possible to find a path from the inputs (growth medium) to all the end-point metabolites using only reactions encoded by known genes.
This suggests automated strategies for determining the identity of the missing genes - new biological knowledge.
One strategy is to take the EC enzyme class of each missing reaction, identify genes that code for this EC class in other organisms, then find homologous genes in yeast.
The predictions can be tested automatically by the robot.
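The EC-based strategy amounts to two chained lookups: from a missing reaction's EC class to genes in other organisms, then from those genes to yeast homologues. The sketch below assumes both mappings are already available as dictionaries; in practice they would come from annotation databases and a homology search, neither of which is modelled here. All identifiers are hypothetical placeholders.

```python
def candidate_yeast_genes(missing_ec_classes, ec_to_foreign_genes, yeast_homologues):
    """Map each missing EC class to yeast ORFs homologous to genes of that class."""
    candidates = {}
    for ec_class in missing_ec_classes:
        orfs = set()
        for foreign_gene in ec_to_foreign_genes.get(ec_class, ()):
            orfs.update(yeast_homologues.get(foreign_gene, ()))
        candidates[ec_class] = sorted(orfs)
    return candidates


# Hypothetical inputs for illustration only.
MISSING = ["EC_CLASS_1", "EC_CLASS_2"]
EC_TO_FOREIGN = {
    "EC_CLASS_1": ["orgA_gene1", "orgB_gene7"],
    "EC_CLASS_2": ["orgA_gene3"],
}
YEAST_HOMOLOGUES = {
    "orgA_gene1": ["YEAST_ORF_X"],
    "orgB_gene7": ["YEAST_ORF_X", "YEAST_ORF_Y"],
    # orgA_gene3 has no detectable yeast homologue in this toy example.
}

hypotheses = candidate_yeast_genes(MISSING, EC_TO_FOREIGN, YEAST_HOMOLOGUES)
print(hypotheses)
```

Each non-empty entry is a hypothesis of the form "this ORF encodes the missing enzyme", which is the kind of prediction the slide proposes testing by robot.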
-
Confirmation of DMP Yeast Predictions
The yeast gene YBR147W is currently of unknown function.
It is predicted to have a function in metabolism by two DMP rules with expected accuracies of >80%.
It is predicted to have a function in amino-acid metabolism by two rules with expected accuracies of 50% and 60% respectively.
Using our robot scientist auxotrophic methodology we have recovered growth of the knockout with: aspartic acid, tyrosine, leucine, valine, phenylalanine, cystine, arginine.
-
Conclusions
Machine learning can be used to accurately predict gene function.
Simple forms of scientific reasoning and experimentation can be fully automated.
Developing robotic systems capable of generating new biological knowledge will require a synthesis of traditional genome annotation techniques, machine learning, and a Robot Scientist-like methodology.
-
The Three Objects of the Intellect
The True
The Beautiful
The Beneficial
-
Acknowledgements
DMP
Andreas Karwath, Aberystwyth
Amanda Clare, Aberystwyth
Paul Wise, Aberystwyth
Luc Dehaspe, Leuven
Robot Scientist
Ken Whelan, Aberystwyth
Philip Reiser, Aberystwyth
Ffion Jones, Aberystwyth
Ugis Sarkans, Aberystwyth (EBI)
Douglas Kell, Manchester (Aberystwyth)
Steve Oliver, Manchester
Stephen Muggleton, Imperial College (York)
Chris Bryant, Robert Gordons (York)
David Page, Wisconsin
BBSRC, EPSRC
PharmDM - Commercial Support
-
Relational vs Propositional
Propositional: single table, fixed number of columns/attributes
Relational: multiple tables, multiple values
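The distinction can be made concrete with a small data-structure sketch. The attribute names `seq_len`, `mol_wt` and `tm_spans` follow the sequence attributes listed earlier in the talk. The second table of homology hits, the E-value threshold and the `propositionalise` helper are assumptions introduced purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class OrfRow:
    """Propositional view: exactly one fixed-width row per ORF."""
    orf: str
    seq_len: int
    mol_wt: int
    tm_spans: int


@dataclass
class HomologyHit:
    """Relational view: a second table with any number of rows per ORF."""
    orf: str
    hit_id: str
    e_value: float


def propositionalise(rows, hits, threshold=1e-10):
    """Flatten the relational table into one extra column by counting strong hits."""
    counts = {}
    for hit in hits:
        if hit.e_value <= threshold:
            counts[hit.orf] = counts.get(hit.orf, 0) + 1
    return [
        {"orf": r.orf, "seq_len": r.seq_len, "mol_wt": r.mol_wt,
         "tm_spans": r.tm_spans, "n_hits": counts.get(r.orf, 0)}
        for r in rows
    ]


# Hypothetical example values.
rows = [OrfRow("ORF_1", 310, 34000, 6), OrfRow("ORF_2", 250, 28000, 0)]
hits = [
    HomologyHit("ORF_1", "hit_a", 1e-30),
    HomologyHit("ORF_1", "hit_b", 1e-12),
    HomologyHit("ORF_1", "hit_c", 1e-3),
]
table = propositionalise(rows, hits)
print(table)
```

Flattening to a single table loses the individual hits, which is why relational data with multiple values per ORF does not fit a purely propositional learner without some such aggregation.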
-
Expression Data Rule
If in the micro-array experiment (sorbitol incubation) the ORF expression is > -0.25
and in the micro-array experiment (nitrogen depletion) the ORF expression is -1.06
then the function of this ORF is "pheromone response, mating type determination, sex-specific proteins"
Accuracy on training data: 11/12 (92%)
Accuracy on the test data: 3/4 (75%)
21 predictions made
-
Structure Rule
80% accurate on test data
Most matching ORFs belong to the Mitochondrial Carrier Family
These have 6 long transmembrane alpha-helices of about 20-30 amino acids
Why do we notice alpha-helices of length 10-14?
-
Alignment
YJL133W -------NEYNPLIHCLC----GSISGSTCAAITTPLDCIKTVLQIRG------------ 251
YKR052C -------NSYNPLIHCLC----GGISGATCAALTTPLDCIKTVLQVRG------------ 241
YIL006W ----NNTNSINLQRLIMA----SSVSKMIASAVTYPHEILRTRMQLKS------------ 310
YBR104W ----LTRNEIPPWKLCLF----GAFSGTMLWLTVYPLDVVKSIIQNDD------------ 271
YGR096W ----KTTAAHKKWELATLNHSAGTIGGVIAKIITFPLETIRRRMQFMNSKHLEK------ 250
YJR095W -----QMDVLPSWETSCI----GLISGAIGPFSNAPLDTIKTRLQKDK------------ 246
YKL120W -----LMKDGPALHLTAS-----TISGLGVAVVMNPWDVILTRIYNQK------------ 261
YLR348C -----FDASKNYTHLTAS-----LLAGLVATTVCSPADVMKTRIMNGS------------ 239
YMR166C ----DGRDGELSIPNEILT---GACAGGLAGIITTPMDVVKTRVQTQQPPSQSNKSYSVT 300
YDL198C ------DYSQATWSQNFIS---SIVGACSSLIVSAPLDVIKTRIQNRN------------ 242
YGR257C ----RFASKDANWVHFINSFASGCISGMIAAICTHPFDVGKTRWQISMMN---------- 302
YDL119C FIHYNPEGGFTTYTSTTVNTTSAVLSASLATTVTAPFDTIKTRMQLEP------------ 255

YJL133W -SQTVSLEIMRKADTFSKAASAIYQVYGWKGFWRGWKPRIVANMPATAISWTAYECAKHF 310
YKR052C -SETVSIEIMKDANTFGRASRAILEVHGWKGFWRGLKPRIVANIPATAISWTAYECAKHF 300
YIL006W -DIPDSIQRR-----LFPLIKATYAQEGLKGFYSGFTTNLVRTIPASAITLVSFEYFRNR 364
YBR104W -LRKPKYKNS-----ISYVAKTIYAKEGIRAFFKGFGPTMVRSAPVNGATFLTFELVMRF 325
YGR096W FSRHSSVYGSYKGYGFARIGLQILKQEGVSSLYRGILVALSKTIPTTFVSFWGYETAIHY 310
YJR095W ---SISLEKQSGMKKIITIGAQLLKEEGFRALYKGITPRVMRVAPGQAVTFTVYEYVREH 303
YKL120W ----GDLYKG-----PIDCLVKTVRIEGVTALYKGFAAQVFRIAPHTIMCLTFMEQTMKL 312
YLR348C ----GDHQP------ALKILADAVRKEGPSFMFRGWLPSFTRLGPFTMLIFFAIEQLKKH 289
YMR166C HPHVTNGRPAALSNSISLSLRTVYQSEGVLGFFSGVGPRFVWTSVQSSIMLLLYQMTLRG 360
YDL198C ---FDNPESG------LRIVKNTLKNEGVTAFFKGLTPKLLTTGPKLVFSFALAQSLIPR 293
YGR257C ---NSDPKGGNRSRNMFKFLETIWRTEGLAALYTGLAARVIKIRPSCAIMISSYEISKKV 359
YDL119C ----SKFTNS------FNTFTSIVKNENVLKLFSGLSMRLARKAFSAGIAWGIYEELVKR 305
-
Alignment
YJL133W -------cccccaaaaaa----aaaaaaaaaaacccaaaaaaaaaacc------------ 251
YKR052C -------cccccaaaaaa----aaaaaaaaaaacccaaaaaaaaaacc------------ 241
YIL006W ----ccccccccaaaaaa----aaaaaaaaaaacccaaaaaaaaaacc------------ 310
YBR104W ----ccccccccaaaaaa----aaaaaaaaaaacccaaaaaaaaaacc------------ 271
YGR096W ----cccccccccccccbaaaaaaaaaaaaaaacccaaaaaaaaaacccccccc------ 250
YJR095W -----cccccccaaaaaa----aaaaaaaaaaacccaaaaaaaaaccc------------ 246
YKL120W -----ccccccaaaaaaa-----aaaaaaaaaacccaaaaaaaaaacc------------ 261
YLR348C -----ccccccaaaaaaa-----aaaaaaaaaacccaaaaaaaaaacc------------ 239
YMR166C ----cccccccccaaaaaa---aaaaaaaaaaacccaaaaaaaaaacccccccccccccc 300
YDL198C ------cccccccaaaaaa---aaaaaaaaaaacccaaaaaaaaaacc------------ 242
YGR257C ----ccccccccccccaaaaaaaaaaaaaaaaacccaaaaaaaaaacccc---------- 302
YDL119C ccccccccccccccaaaaaaaaaaaaaaaaaaacccaaaaaaaaaacc------------ 255

YJL133W -ccccccccccccccaaaaaaaaaaaccccaaaaccaaaaaaacaaaaaaaaaaaaaaaa 310
YKR052C -ccccccccccccccaaaaaaaaaaacccaaaaaccaaaaaaaccaaaaaaaaaaaaaaa 300
YIL006W -ccccccccc-----aaaaaaaaaaaccccaaacccaaaaaaaccaaaaaaaaaaaaaaa 364
YBR104W -ccccccccc-----aaaaaaaaaaacccaaaaaccaaaaaaaccaaaaaaaaaaaaaaa 325
YGR096W cccccccccccccccaaaaaaaaaaacccaaaaaccaaaaaaaccaaaaaaaaaaaaaaa 310
YJR095W ---ccccccccccccaaaaaaaaaaacccaaaaaccaaaaaaaccaaaaaaaaaaaaaaa 303
YKL120W ----cccccc-----aaaaaaaaaaacccaaaaaccaaaaaaaccaaaaaaaaaaaaaaa 312
YLR348C ----ccccc------aaaaaaaaaaacccaaaaaccaaaaaaaccaaaaaaaaaaaaaaa 289
YMR166C cccccccccccccccaaaaaaaaaaacccaaaaaccaaaaaaaccaaaaaaaaaaaaaaa 360
YDL198C ---cccccca------aaaaaaaaaacccaaaaacccaaaaaaaaaaaaaaaaaaaaaaa 293
YGR257C ---ccccccccccccaaaaaaaaaaacccaaaaaccaaaaaaaccaaaaaaaaaaaaaaa 359
YDL119C ----ccccca------aaaaaaaaaacccaaaaacccaaaaaaccaaaaaaaaaaaaaaa 305
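Helix lengths such as the 10-14 range asked about on the Structure Rule slide can be read off a predicted secondary-structure string by run-length counting. The sketch below assumes the coding `a` = alpha-helix, `b` = beta-strand, `c` = coil and `-` = alignment gap, which the slides do not state explicitly. The function `helix_lengths` and the toy string are hypothetical. Gaps are removed first so that a helix interrupted only by alignment padding is counted as one segment.

```python
from itertools import groupby


def helix_lengths(aligned_ss):
    """Lengths of consecutive 'a' runs, ignoring alignment gaps ('-')."""
    ungapped = aligned_ss.replace("-", "")
    return [len(list(run)) for state, run in groupby(ungapped) if state == "a"]


# Hypothetical toy row: a 12-residue helix split by a gap, then a 10-residue helix.
toy = "ccc" + "a" * 6 + "----" + "a" * 6 + "ccc" + "a" * 10 + "cc"

lengths = helix_lengths(toy)
short_helices = [n for n in lengths if 10 <= n <= 14]
print(lengths, short_helices)
```

Whether gaps should be dropped before counting depends on the question being asked; counting on the gapped string would instead split the first helix into two runs of 6.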
-
Types of Logic
Deduction
Rule: If a cell grows, then it can synthesise tryptophan.
Fact: Cell cannot synthesise tryptophan.
Therefore: Cell cannot grow.
Given the rule P → Q, and the fact ¬Q, infer the fact ¬P (modus tollens)
Abduction
Rule: If a cell grows, then it can synthesise tryptophan.
Fact: Cell cannot grow.
Hypothesis: Cell cannot synthesise tryptophan.
Given the rule P → Q, and the fact ¬P, infer the fact ¬Q
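The difference between the two inference patterns can be checked with a truth table. The sketch below takes P as "the cell grows" and Q as "the cell can synthesise tryptophan", and tests whether the conclusion holds in every case where the premises hold. The helper names `implies` and `is_valid` are hypothetical. The deductive step (from P → Q and ¬Q to ¬P) comes out valid. The abductive step (from P → Q and ¬P to ¬Q) does not, so its conclusion is a hypothesis that has to be tested by experiment.

```python
from itertools import product


def implies(p, q):
    """Material implication P -> Q."""
    return (not p) or q


def is_valid(premises, conclusion):
    """True if the conclusion holds in every truth assignment satisfying all premises."""
    return all(
        conclusion(p, q)
        for p, q in product([True, False], repeat=2)
        if all(premise(p, q) for premise in premises)
    )


# Deduction (modus tollens): P -> Q, not Q  |-  not P
modus_tollens_valid = is_valid([implies, lambda p, q: not q],
                               lambda p, q: not p)

# Abduction as on the slide: P -> Q, not P  |-  not Q
abduction_valid = is_valid([implies, lambda p, q: not p],
                           lambda p, q: not q)

print(modus_tollens_valid, abduction_valid)
```

The counterexample for the abductive step is P false and Q true: a cell that can synthesise tryptophan but fails to grow for some other reason.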