VectorBase
VectorBase probe mapping
[Diagram: vectorbase, Automatic Annotation, Manual Annotation, Array data, CHADO, XML, browser]
Integration aims
• Contigview
  – View alignment track
• Geneview
  – Reporters
  – Experiments
  – Expression patterns?
• Reporter page
  – Positions mapped
  – Genes overlapped
  – Experiments used in
contigview
Linked to location: Feature track
Link to BASE
Detail popup: start-end, name, %id…
geneview
Associated with gene: Reporter, Experiments(?)
DAS vs. e!
• Slow
• Delayed data flow
• Limited scope for adaptation
  – a load of links…
DAS - request:response
[Diagram: browser, request, DAS server, DB]
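The request:response exchange can be sketched as an HTTP GET answered with XML. This is a minimal sketch, assuming DAS 1.x conventions (a `features` command with `segment=REF:start,stop` and a DASGFF reply). The data source name `mmc1` and the sample reply are hypothetical and written by hand; they are not taken from the VectorBase server.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

DAS_BASE = "http://base.vectorbase.org:8080/das"  # server URL given in the slides


def features_url(source, segment, start, stop, base=DAS_BASE):
    """Build a DAS 'features' request URL for one genomic segment."""
    # Keep ':' and ',' unescaped so the segment reads REF:start,stop.
    query = urlencode({"segment": f"{segment}:{start},{stop}"}, safe=":,")
    return f"{base}/{source}/features?{query}"


def parse_features(xml_text):
    """Extract (id, start, end) tuples from a DASGFF response body."""
    root = ET.fromstring(xml_text)
    return [
        (feat.get("id"), int(feat.findtext("START")), int(feat.findtext("END")))
        for feat in root.iter("FEATURE")
    ]


# Hand-written sample reply (illustrative only, not real server output).
SAMPLE_RESPONSE = """<DASGFF><GFF><SEGMENT id="2L" start="1" stop="1000">
<FEATURE id="reporter_1"><START>100</START><END>600</END></FEATURE>
</SEGMENT></GFF></DASGFF>"""

# "mmc1" is a hypothetical data source name.
url = features_url("mmc1", "2L", 1, 1000)
features = parse_features(SAMPLE_RESPONSE)
```

Every track drawn this way costs the browser one such round trip per view, which may be part of why the slides describe DAS as slow.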
data flow
[Diagram: Automatic Annotation, Manual Annotation, Array data, CHADO, XML, DAS, browser; steps numbered 1 to 4]
name
chr::start-end::strand
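The `chr::start-end::strand` string can be split back into its parts with a few lines of code. This is a sketch only: the slides do not say how strand is encoded, so it is kept as an opaque string here, and the `Location` type and function names are hypothetical.

```python
from typing import NamedTuple


class Location(NamedTuple):
    chrom: str
    start: int
    end: int
    strand: str  # encoding not given in the slides; kept as text


def parse_location(text):
    """Split 'chr::start-end::strand' into a Location."""
    chrom, span, strand = text.split("::")
    start, end = span.split("-")
    return Location(chrom, int(start), int(end), strand)


def format_location(loc):
    """Inverse of parse_location."""
    return f"{loc.chrom}::{loc.start}-{loc.end}::{loc.strand}"


# Example values are made up for illustration.
loc = parse_location("2L::1000-1500::+")
```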
DAS vs. e!
DAS
• Slow
• Delayed data flow
• Limited scope for adaptation
  – a load of links…
e!
• Fast
• Data generated with new assembly
• Fully integrated with Ensembl
  – Pages are extendable & adaptable
cf affy_feature
affy feature
featureview
e! features
est sequences
• DNA_Align_Feature (> 97% identity, > 90% coverage)
  – ContigView: display track
  – FeatureView: positions in genome
• Xref: is the feature associated with any external databases (e.g. EMBL)?
  – GeneView: display information with associated gene
  – FeatureView: positions in genome, other features the EST is associated with
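The "> 97% identity, > 90% coverage" cut-off can be sketched as a simple filter. This assumes coverage means aligned EST bases divided by full EST length, which the slides do not define; the `EstAlignment` record and the sample values are hypothetical.

```python
from dataclasses import dataclass

MIN_IDENTITY = 97.0  # percent, from the slide
MIN_COVERAGE = 90.0  # percent, from the slide


@dataclass
class EstAlignment:
    est_id: str
    est_length: int          # full EST length in bp
    aligned_length: int      # EST bases included in the alignment
    percent_identity: float

    @property
    def percent_coverage(self):
        # Assumed definition: aligned bases over total EST length.
        return 100.0 * self.aligned_length / self.est_length


def passes_thresholds(aln):
    """True when the alignment exceeds both cut-offs."""
    return aln.percent_identity > MIN_IDENTITY and aln.percent_coverage > MIN_COVERAGE


# Made-up alignments for illustration.
alignments = [
    EstAlignment("est_a", 500, 480, 98.5),  # 96% coverage, passes
    EstAlignment("est_b", 500, 300, 99.0),  # 60% coverage, fails
    EstAlignment("est_c", 500, 490, 95.0),  # identity too low, fails
]
kept = [a.est_id for a in alignments if passes_thresholds(a)]
```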
probe types
• MMC1
  – Spotted array: cDNA clones
• MMC2
  – Spotted array: PCR products
• Affy
  – Short oligo, tiling path
• Agilent
  – Long oligo, tiling path?
Already handled by Ensembl
1 - est2reporter
MMC1 (clones)
exonerate: EST2genes
DNA_Align_Feature (ESTs)
collapse ESTs into clones
Misc_Feature (REPORTER)
ContigView: display all features in ‘probes’ track
FeatureView: positions in genome
2 - est2reporter
[Diagram: est1 (300bp) and est2 (500bp) placed on 2L and 3R]
len (est2): 500 = 500
len (est2) + len (est1): 500 + 300 = 800
3 - est2reporter
[Diagram: est1 (50bp) and est2 (500bp) placed on 2L and 3R]
len (est2): 500 = 500
len (est2) + len (est1): 500 + 50 = 550
pcr2reporter
MMC1 (clones)
exonerate: EST2genes
DNA_Align_Feature (ESTs)
collapse ESTs into clones
MMC2 (primers)
e-PCR
Marker_Feature (PCR primers)
assess significance?
Misc_Feature (REPORTER)
ContigView: display all features in ‘probes’ track
FeatureView: positions in genome
sts pipeline
[Diagram: markers placed on 2L and 3R]
550bp => map weight 0
100bp => map weight 1
800bp => map weight 1
id      left_primer              right_primer           distance  name                accession           species
1.1.1   GATTACAACATCCAGAAGGAGTC  GTAGTACTTGAGGACAGCAAG  104       ENSANGG00000020724  ENSANGG00000020724  A.gambiae
1.1.10  GCCTTTGCCGGGCTGC         TTCGGGGGTTTCGAGCAG     497       ENSANGG00000002666  ENSANGG00000002666  A.gambiae
1.1.11  GATAATTCAGCGCTACACATTA   CACCGTAATGCTAACATCGAA  146       ENSANGG00000002705  ENSANGG00000002705  A.gambiae
1.1.11  GATAATTCAGCGCTACACATTA   CACCGTAATGCTAACATCGAA  146       ENSANGG00000020019  ENSANGG00000020019  A.gambiae
1.1.12  CAGTGCTACGTGAAGAATGA     TCCGCTGTCGAGGGAAC      473       ENSANGG00000003095  ENSANGG00000003095  A.gambiae
1.1.2   TCGTCCAACAGTTTCTCCTAC    GATCGTTTGCTGCTTGCATA   449       ENSANGG00000000521  ENSANGG00000000521  A.gambiae
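The primer table can be read into records before being handed to e-PCR. This is a minimal sketch, assuming whitespace-separated columns in the order shown in the header row; the `StsRecord` type and function names are hypothetical, and only rows from the slide are used as input.

```python
from dataclasses import dataclass


@dataclass
class StsRecord:
    sts_id: str
    left_primer: str
    right_primer: str
    distance: int  # expected product size in bp
    name: str
    accession: str
    species: str


def parse_sts(text):
    """Parse whitespace-separated primer rows, skipping the header line."""
    records = []
    for line in text.strip().splitlines():
        fields = line.split()
        if len(fields) != 7 or fields[0] == "id":
            continue
        sts_id, left, right, distance, name, accession, species = fields
        records.append(StsRecord(sts_id, left, right, int(distance), name, accession, species))
    return records


STS_TEXT = """\
id left_primer right_primer distance name accession species
1.1.1 GATTACAACATCCAGAAGGAGTC GTAGTACTTGAGGACAGCAAG 104 ENSANGG00000020724 ENSANGG00000020724 A.gambiae
1.1.11 GATAATTCAGCGCTACACATTA CACCGTAATGCTAACATCGAA 146 ENSANGG00000002705 ENSANGG00000002705 A.gambiae
1.1.11 GATAATTCAGCGCTACACATTA CACCGTAATGCTAACATCGAA 146 ENSANGG00000020019 ENSANGG00000020019 A.gambiae
"""

records = parse_sts(STS_TEXT)

# One primer pair can be listed against more than one gene (see 1.1.11).
genes_by_id = {}
for rec in records:
    genes_by_id.setdefault(rec.sts_id, []).append(rec.name)
```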
1 - reporter2gene
[Diagram: reporters overlapping a gene on 2L]
ENSANG00012345:
  MMC2: 9713
  MMC1: 4A3B-AAG-D08
2 - reporter2gene
MMC1 (clones)
exonerate: EST2genes
DNA_Align_Feature (ESTs)
collapse ESTs into clones
MMC2 (primers)
e-PCR
Marker_Feature (PCR primers)
assess significance?
Misc_Feature (REPORTER)
ContigView: display all features in ‘probes’ track
FeatureView: positions in genome
Xref
Xref criteria? % alignment overlap w. exon boundaries
GeneView: display reporters (& experiments?) in geneview
FeatureView: positions in genome
Links to genes
Links to experiments
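One way to read the "% alignment overlap w. exon boundaries" criterion is as the share of a reporter alignment that falls inside a gene's exons. This is a sketch under that assumption, with inclusive coordinates and non-overlapping exons. The 50% cut-off is hypothetical; the slides leave the criterion open.

```python
def overlap_length(a_start, a_end, b_start, b_end):
    """Number of bases shared by two inclusive intervals."""
    return max(0, min(a_end, b_end) - max(a_start, b_start) + 1)


def percent_in_exons(aln_start, aln_end, exons):
    """Percentage of alignment bases lying inside any exon.

    Exons are (start, end) pairs, assumed not to overlap each other.
    """
    aln_len = aln_end - aln_start + 1
    covered = sum(overlap_length(aln_start, aln_end, s, e) for s, e in exons)
    return 100.0 * covered / aln_len


MIN_EXON_OVERLAP = 50.0  # hypothetical threshold, not from the slides


def should_xref(aln_start, aln_end, exons, threshold=MIN_EXON_OVERLAP):
    """Decide whether a reporter alignment earns an Xref to the gene."""
    return percent_in_exons(aln_start, aln_end, exons) >= threshold


# Made-up gene with two 100bp exons.
exons = [(100, 199), (300, 399)]
spanning = percent_in_exons(150, 349, exons)    # half exonic, half intronic
intergenic = percent_in_exons(500, 600, exons)  # no exon overlap
```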
MMC1 locations
MMC1 locations table
locations  arraymap.v1  no est seqs  anoest_v7.5  no est seqs
0          11315        3060         2573         3060
1          4779         0            8042         0
2          484          0            3252         0
3          14           0            1587         0
4          21           0            603          0
5          3            0            197          0
6          0            0            90           0
7          0            0            46           0
8          3            0            37           0
9          0            0            22           0
10         2            0            14           0
>10        2            0            160          0
MMC1 genes
MMC1 genes table
hits  anoest_v7.1  arraymap.v1
-1    5433         12615
0     9049         1760
1     4215         4881
2     718          326
3     123          27
4     70           14
5     18           14
6     16           6
7     4            6
8     8            3
9     1            3
10    7            1
>10   21           27
MMC2 locations
MMC2 genes
MMC2 table
genes  count
-1     615
0      1913
1      8699
2      587
3      95
4      36
5      17
6      3
7      2
9      4
10     1
>10    23

aligns  count
0       615
1       9673
2       630
3       214
4       168
5       164
6       110
7       64
8       43
9       34
10      24
>10     256
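A quick consistency check on the MMC2 counts, reading the flattened rows as (bucket, count) pairs: both histograms should account for the same set of reporters. This is a sketch only; the dictionary and function names are hypothetical, and the meaning of the -1 bucket is not stated in the slides.

```python
# Counts as read from the MMC2 table (bucket -> number of reporters).
GENES_PER_REPORTER = {
    "-1": 615, "0": 1913, "1": 8699, "2": 587, "3": 95, "4": 36,
    "5": 17, "6": 3, "7": 2, "9": 4, "10": 1, ">10": 23,
}
ALIGNS_PER_REPORTER = {
    "0": 615, "1": 9673, "2": 630, "3": 214, "4": 168, "5": 164,
    "6": 110, "7": 64, "8": 43, "9": 34, "10": 24, ">10": 256,
}


def total(histogram):
    """Total number of reporters across all buckets."""
    return sum(histogram.values())


def fraction(histogram, bucket):
    """Share of reporters falling in one bucket (0.0 if the bucket is absent)."""
    return histogram.get(bucket, 0) / total(histogram)


genes_total = total(GENES_PER_REPORTER)    # 11995
aligns_total = total(ALIGNS_PER_REPORTER)  # 11995
single_gene_share = fraction(GENES_PER_REPORTER, "1")
```

Both columns sum to 11995, which supports this reading of the table. The 615 reporters with no alignment match the 615 in the -1 gene bucket, so -1 may mark unmapped reporters, though the slides do not say so.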
MMC DAS tracks
http://base.vectorbase.org:8080/das
MMC listings
http://base.vectorbase.org:8080/MMC1.jsp
http://base.vectorbase.org:8080/MMC2.jsp
fin
Thanks…
– Bob, George, Fotis
– Dan, Karyn, Martin @ EBI
– Ian Sealy @ Sanger
– Informatics support @ Sanger
EST quality
TOTAL: 19280
• OK: 17186 (89.14%)
• POOR: 2094 (10.86%)
  – repetitive: 2094
  – short: 36
– avg length (bp): 576.39
– avg repeat %: 1.05%
general runnable
[Diagram: e! DB, runnableDB, runnable; executables: Exonerate, BLAT, ePCR]
get object from db
DB record >> perl object
process object
run executable
perl object >> DB record
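The split between the two layers can be sketched as follows: one class wraps the external program, the other moves objects between the database and that wrapper. This is an illustrative Python analogue, not the Ensembl Perl API. A dictionary stands in for the e! DB, and a one-line Python command stands in for Exonerate, BLAT or ePCR; all names are hypothetical.

```python
import subprocess
import sys


class Runnable:
    """Wraps one external program: process the object, run the executable."""

    def __init__(self, command):
        self.command = command  # argv list for the external program

    def run(self, sequence):
        result = subprocess.run(
            self.command, input=sequence,
            capture_output=True, text=True, check=True,
        )
        return result.stdout.strip()


class RunnableDB:
    """Moves data between the database and a Runnable."""

    def __init__(self, db, runnable):
        self.db = db  # a plain dict stands in for the e! DB
        self.runnable = runnable

    def fetch_input(self, key):
        # DB record >> object
        return self.db["input"][key]

    def write_output(self, key, value):
        # object >> DB record
        self.db["output"][key] = value

    def process(self, key):
        obj = self.fetch_input(key)
        self.write_output(key, self.runnable.run(obj))


# Stand-in "executable": upper-cases whatever it reads on stdin.
fake_aligner = [sys.executable, "-c", "import sys; print(sys.stdin.read().upper())"]

db = {"input": {"est1": "acgt"}, "output": {}}
RunnableDB(db, Runnable(fake_aligner)).process("est1")
```

Keeping database access out of the wrapper means the same wrapper can serve different record types, which fits the MMC1 and MMC2 variants of this diagram.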
MMC1 runnable
[Diagram: e! DB, runnableDB, runnable; executables: Exonerate, BLAT, ePCR]
DB record >> dna align object
dna align object >> DB record
MMC1_xref reporter XREF
MMC2 runnable
[Diagram: e! DB, runnableDB, runnable; executables: Exonerate, BLAT, ePCR]
DB record >> marker object
marker object >> DB record
MMC2_xref reporter XREF