VectorBase probe mapping

Upload: amberlynn-marshall

Post on 17-Jan-2016


Page 1: VectorBase probe mapping

Page 2:

[Diagram with components labelled: Automatic Annotation, browser, Array data, CHADO, Manual Annotation, XML, vectorbase]

Page 3: Integration aims

• Contigview
– View alignment track

• Geneview
– Reporters
– Experiments
– Expression patterns?

• Reporter page
– Positions mapped
– Genes overlapped
– Experiments used in

Page 4: contigview

• Feature track: linked to location
• Detail popup: start-end, name, % identity…
• Link to BASE

Page 5: geneview

• Associated with gene:
– Reporter
– Experiments (?)

Page 6: DAS vs. e! (Ensembl)

DAS:
• Slow
• Delayed data flow
• Limited scope for adaptation
– a load of links…

Page 7: DAS - request:response

[Diagram labels: browser, request, DAS server, DB]
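The request half of this exchange is a plain HTTP GET. The following is a minimal sketch, assuming the DAS 1.x convention of a `features` command with a `segment=reference:start,stop` query. The data source name `mmc1` and the coordinates are hypothetical; only the server root is taken from the MMC DAS tracks slide.

```python
from urllib.parse import urlencode


def das_features_url(server_root, source, segment, start, stop):
    """Build a DAS 'features' request URL for one genomic segment."""
    # Keep ':' and ',' unescaped so the segment reads as in the DAS convention.
    query = urlencode({"segment": f"{segment}:{start},{stop}"}, safe=":,")
    return f"{server_root.rstrip('/')}/{source}/features?{query}"


# 'mmc1' is a hypothetical data source name, used only for illustration.
url = das_features_url("http://base.vectorbase.org:8080/das", "mmc1", "2L", 1, 100000)
print(url)
```

The browser would issue this request and parse the XML response returned by the DAS server; that step is omitted here because the response layout depends on the server.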

Page 8: data flow

[Diagram with components labelled: Automatic Annotation, browser, Array data, CHADO, Manual Annotation, XML, DAS. Data flow steps are numbered 1 to 4. Feature label: name, chr::start-end::strand]

Page 9: DAS vs. e! (Ensembl)

DAS:
• Slow
• Delayed data flow
• Limited scope for adaptation
– a load of links…

e!:
• Fast
• Data generated with new assembly
• Fully integrated with Ensembl
– Pages are extendable & adaptable

Page 10: affy feature

cf. affy_feature

Page 11: featureview

Page 12: e! features

• EST sequences
– Alignment thresholds: > 97% identity, > 90% coverage

• DNA_Align_Feature
– ContigView: display track
– FeatureView: positions in genome

• Xref
– Is the feature associated with any external databases? (e.g. EMBL)
– GeneView: display information with associated gene
– FeatureView: positions in genome, other features the EST is associated with
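The two thresholds on this slide (> 97% identity, > 90% coverage) amount to a simple filter applied to each EST alignment before it is stored. A minimal sketch, assuming each alignment carries a percent identity and a percent coverage of the EST; the `Alignment` record and the example values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Alignment:
    est_id: str
    seq_region: str
    percent_identity: float  # identity over the aligned region, in percent
    coverage: float          # percent of the EST length that aligned


def passes_thresholds(aln, min_identity=97.0, min_coverage=90.0):
    """Keep an alignment only if it exceeds both thresholds from the slide."""
    return aln.percent_identity > min_identity and aln.coverage > min_coverage


# Illustrative values only; they are not taken from the slides.
alignments = [
    Alignment("est1", "2L", 99.2, 95.0),
    Alignment("est2", "3R", 96.5, 98.0),  # fails identity
    Alignment("est3", "2L", 98.0, 80.0),  # fails coverage
]
kept = [a.est_id for a in alignments if passes_thresholds(a)]
print(kept)
```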

Page 13: probe types

• MMC1
– Spotted array: cDNA clones

• MMC2
– Spotted array: PCR products

• Affy
– Short oligo tiling path

• Agilent
– Long oligo, tiling path?

Already handled by Ensembl

Page 14: 1 - est2reporter

• MMC1 (clones)
– exonerate: EST2genes
– DNA_Align_Feature (ESTs)
– collapse ESTs into clones

• Misc_Feature (REPORTER)
– ContigView: display all features in ‘probes’ track
– FeatureView: positions in genome

Page 15: 2 - est2reporter

[Diagram: est1 (300bp) and est2 (500bp) aligned to chromosome arms 2L and 3R]

• len(est2): 500 = 500
• len(est2) + len(est1): 500 + 300 = 800

Page 16: 3 - est2reporter

[Diagram: est1 (50bp) and est2 (500bp) aligned to chromosome arms 2L and 3R]

• len(est2): 500 = 500
• len(est2) + len(est1): 500 + 50 = 550
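The arithmetic on the est2reporter slides can be read as summing the lengths of a clone's ESTs at each candidate location. A minimal sketch of that sum, using the 50bp and 500bp lengths from this slide; the clone name and the `locus_1` / `locus_2` labels are hypothetical, and the slides do not state how the totals are then used to choose between locations.

```python
from collections import defaultdict


def collapse_ests(alignments):
    """Sum aligned EST lengths per (clone, locus) pair."""
    totals = defaultdict(int)
    for clone, _est, locus, length in alignments:
        totals[(clone, locus)] += length
    return dict(totals)


# est2 (500bp) aligns at both loci; est1 (50bp) aligns only at the second.
alignments = [
    ("cloneA", "est2", "locus_1", 500),
    ("cloneA", "est2", "locus_2", 500),
    ("cloneA", "est1", "locus_2", 50),
]
totals = collapse_ests(alignments)
print(totals)
```

With est1 at 300bp instead of 50bp, the second total becomes 800, matching the previous slide.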

Page 17: pcr2reporter

• MMC1 (clones)
– exonerate: EST2genes
– DNA_Align_Feature (ESTs)
– collapse ESTs into clones

• MMC2 (primers)
– e-PCR
– Marker_Feature (PCR primers)
– assess significance?

• Misc_Feature (REPORTER)
– ContigView: display all features in ‘probes’ track
– FeatureView: positions in genome

Page 18: sts pipeline

[Diagram: products mapped to chromosome arms 2L and 3R]

• 550bp => map weight 0
• 100bp => map weight 1
• 800bp => map weight 1

id      left_primer              right_primer           distance  name                accession           species
1.1.1   GATTACAACATCCAGAAGGAGTC  GTAGTACTTGAGGACAGCAAG  104       ENSANGG00000020724  ENSANGG00000020724  A.gambiae
1.1.10  GCCTTTGCCGGGCTGC         TTCGGGGGTTTCGAGCAG     497       ENSANGG00000002666  ENSANGG00000002666  A.gambiae
1.1.11  GATAATTCAGCGCTACACATTA   CACCGTAATGCTAACATCGAA  146       ENSANGG00000002705  ENSANGG00000002705  A.gambiae
1.1.11  GATAATTCAGCGCTACACATTA   CACCGTAATGCTAACATCGAA  146       ENSANGG00000020019  ENSANGG00000020019  A.gambiae
1.1.12  CAGTGCTACGTGAAGAATGA     TCCGCTGTCGAGGGAAC      473       ENSANGG00000003095  ENSANGG00000003095  A.gambiae
1.1.2   TCGTCCAACAGTTTCTCCTAC    GATCGTTTGCTGCTTGCATA   449       ENSANGG00000000521  ENSANGG00000000521  A.gambiae
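The primer table on this slide is whitespace-delimited with seven fields per row, so it can be read with a plain split. A minimal sketch, assuming the column order given in the slide's header row; the function name is hypothetical. It also picks out primer pairs listed against more than one gene.

```python
from collections import defaultdict

# Rows copied from the slide; columns are:
# id left_primer right_primer distance name accession species
STS_ROWS = """\
1.1.1 GATTACAACATCCAGAAGGAGTC GTAGTACTTGAGGACAGCAAG 104 ENSANGG00000020724 ENSANGG00000020724 A.gambiae
1.1.10 GCCTTTGCCGGGCTGC TTCGGGGGTTTCGAGCAG 497 ENSANGG00000002666 ENSANGG00000002666 A.gambiae
1.1.11 GATAATTCAGCGCTACACATTA CACCGTAATGCTAACATCGAA 146 ENSANGG00000002705 ENSANGG00000002705 A.gambiae
1.1.11 GATAATTCAGCGCTACACATTA CACCGTAATGCTAACATCGAA 146 ENSANGG00000020019 ENSANGG00000020019 A.gambiae
1.1.12 CAGTGCTACGTGAAGAATGA TCCGCTGTCGAGGGAAC 473 ENSANGG00000003095 ENSANGG00000003095 A.gambiae
1.1.2 TCGTCCAACAGTTTCTCCTAC GATCGTTTGCTGCTTGCATA 449 ENSANGG00000000521 ENSANGG00000000521 A.gambiae
"""


def parse_sts(text):
    """Parse whitespace-delimited STS rows into dictionaries."""
    records = []
    for line in text.splitlines():
        if not line.strip():
            continue
        sts_id, left, right, distance, name, accession, species = line.split()
        records.append({
            "id": sts_id,
            "left_primer": left,
            "right_primer": right,
            "distance": int(distance),
            "name": name,
            "accession": accession,
            "species": species,
        })
    return records


records = parse_sts(STS_ROWS)

# Group gene names by primer-pair id to find pairs listed against several genes.
genes_by_id = defaultdict(set)
for rec in records:
    genes_by_id[rec["id"]].add(rec["name"])
multi_gene_ids = sorted(i for i, genes in genes_by_id.items() if len(genes) > 1)
print(multi_gene_ids)  # ['1.1.11']
```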

Page 19: 1 - reporter2gene

[Diagram: gene on chromosome arm 2L with overlapping reporters]

ENSANG00012345:
– MMC2: 9713
– MMC1: 4A3B-AAG-D08

Page 20: 2 - reporter2gene

• MMC1 (clones)
– exonerate: EST2genes
– DNA_Align_Feature (ESTs)
– collapse ESTs into clones

• MMC2 (primers)
– e-PCR
– Marker_Feature (PCR primers)
– assess significance?

• Misc_Feature (REPORTER)
– ContigView: display all features in ‘probes’ track
– FeatureView: positions in genome

• Xref
– Criteria? % alignment overlap w. exon boundaries
– GeneView: display reporters (& experiments?) in geneview
– FeatureView: positions in genome, links to genes, links to experiments
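The slide leaves the Xref criterion open ("% alignment overlap w. exon boundaries"). One possible reading is the fraction of a reporter alignment that lies inside a gene's exons. The following is only a sketch under that assumption: coordinates are 1-based and inclusive, exons are assumed not to overlap each other, and the function name and example coordinates are hypothetical.

```python
def exon_overlap_fraction(aln_start, aln_end, exons):
    """Fraction of an alignment (1-based, inclusive) that falls inside exons.

    Assumes the exon intervals do not overlap one another.
    """
    aln_len = aln_end - aln_start + 1
    covered = 0
    for ex_start, ex_end in exons:
        lo = max(aln_start, ex_start)
        hi = min(aln_end, ex_end)
        if lo <= hi:
            covered += hi - lo + 1
    return covered / aln_len


# A 100bp alignment overlapping two exons by 50bp and 20bp.
fraction = exon_overlap_fraction(100, 199, [(50, 149), (180, 300)])
print(fraction)
```

A reporter could then be cross-referenced to the gene when this fraction exceeds a chosen cut-off; the slides do not specify one.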

Page 21: MMC1 locations

Page 22: MMC1 locations table

locations  arraymap.v1  no est seqs  anoest_v7.5  no est seqs
0          11315        3060         2573         3060
1          4779         0            8042         0
2          484          0            3252         0
3          14           0            1587         0
4          21           0            603          0
5          3            0            197          0
6          0            0            90           0
7          0            0            46           0
8          3            0            37           0
9          0            0            22           0
10         2            0            14           0
>10        2            0            160          0

Page 23: MMC1 genes

Page 24: MMC1 genes table

hits  anoest_v7.1  arraymap.v1
-1    5433         12615
0     9049         1760
1     4215         4881
2     718          326
3     123          27
4     70           14
5     18           14
6     16           6
7     4            6
8     8            3
9     1            3
10    7            1
>10   21           27

Page 25: MMC2 locations

Page 26: MMC2 genes

Page 27: MMC2 table

genes  count
-1     615
0      1913
1      8699
2      587
3      95
4      36
5      17
6      3
7      2
9      4
10     1
>10    23

aligns  count
0       615
1       9673
2       630
3       214
4       168
5       164
6       110
7       64
8       43
9       34
10      24
>10     256

Page 28: MMC DAS tracks

http://base.vectorbase.org:8080/das

Page 29: MMC listings

http://base.vectorbase.org:8080/MMC1.jsp
http://base.vectorbase.org:8080/MMC2.jsp

Page 30: fin

Thanks…

– Bob, George, Fotis
– Dan, Karyn, Martin @ EBI
– Ian Sealy @ Sanger
– Informatics support @ Sanger

Page 31: EST quality

TOTAL: 19280

• OK: 17186 (89.14%)

• POOR: 2094 (10.86%)
– repetitive: 2094
– short: 36

– avg length (bp): 576.39
– avg repeat %: 1.05%
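The OK and POOR percentages follow directly from the counts on this slide. A short check, using only those numbers:

```python
total, ok, poor = 19280, 17186, 2094

# OK and POOR partition the full EST set.
assert ok + poor == total

ok_pct = round(100 * ok / total, 2)
poor_pct = round(100 * poor / total, 2)
print(ok_pct, poor_pct)  # 89.14 10.86
```

Note that the repetitive (2094) and short (36) counts add up to more than the POOR total, so the two flags presumably overlap rather than partition the POOR set.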

Page 32: general runnable

• e! DB

• runnableDB
– get object from db
– DB record >> perl object
– perl object >> DB record

• runnable
– process object
– run executable

• Executables: Exonerate, BLAT, ePCR
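The split on this slide is between a database-facing wrapper (record to object and back) and a database-free wrapper that runs an executable and processes its output. A minimal sketch of that separation follows. It is written in Python for illustration only and is not the Ensembl Perl API; the class and method names are hypothetical, a dictionary stands in for the e! DB, and a small Python one-liner stands in for Exonerate, BLAT or ePCR.

```python
import subprocess
import sys


class Runnable:
    """Runs an external executable on one input and processes its output."""

    def __init__(self, command):
        self.command = command  # executable plus fixed arguments

    def run(self, query):
        result = subprocess.run(
            self.command + [query], capture_output=True, text=True, check=True
        )
        return self.process(result.stdout)

    def process(self, raw_output):
        # Turn raw program output into result objects (here: non-empty lines).
        return [line for line in raw_output.splitlines() if line]


class RunnableDB:
    """Fetches input from the database, delegates to a Runnable, stores results."""

    def __init__(self, db, runnable):
        self.db = db
        self.runnable = runnable

    def fetch_input(self, key):
        return self.db["input"][key]            # DB record >> object

    def write_output(self, key, features):
        self.db["output"][key] = features       # object >> DB record

    def run(self, key):
        query = self.fetch_input(key)
        features = self.runnable.run(query)
        self.write_output(key, features)


# Stand-in executable: upper-cases its argument.
fake_aligner = [sys.executable, "-c", "import sys; print(sys.argv[1].upper())"]
db = {"input": {"est1": "acgt"}, "output": {}}
RunnableDB(db, Runnable(fake_aligner)).run("est1")
print(db["output"])
```

Keeping the executable wrapper free of database code means the same wrapper can be reused with different object types, which is how the MMC1 and MMC2 variants on the following slides differ.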

Page 33: MMC1 runnable

• e! DB

• runnableDB
– DB record >> dna align object
– dna align object >> DB record

• runnable

• Executables: Exonerate, BLAT, ePCR

• MMC1_xref: reporter XREF

Page 34: MMC2 runnable

• e! DB

• runnableDB
– DB record >> marker object
– marker object >> DB record

• runnable

• Executables: Exonerate, BLAT, ePCR

• MMC2_xref: reporter XREF