Biological Databases [S2], Enrique Blanco. Number of slides: 5. Expected time: 15 minutes.
Biological Databases [S2] - ENRIQUE BLANCO 2013 [email protected]
1. COMPUTER ANALYSIS
>sequence
TACGTACGTAGCTAGCTAGCTACGTAGCTAGCTAGCTACGTAGCTAATGTCGAAGTAACGTACGATCGTAGCTAGCTAGCTGATGCTATCGTAGCTAGCTGATGCATGCGCTAAACACATCGCTTTGGCACGAGCTAGCTAGCTACTACAGCACGGGGGCACGTAGTGCAGCTAGCAGCCGCCGCATCGCCCCCCGATCGATCGTAGCCGACGATCTACTACGTAGCGACTGACTGATCGATGAGGATCGTGAGCTAGCGTGCTAGCTAGCTAGCTAGCTAGCTAGCTAGCAGCTACGTACGTAGCTAGCTACGAGCAGCTAGCTAGCTACGAC
+ =
GENE1 FEATURE1 FEATURE2 … FEATUREm
GENE2 FEATURE1 FEATURE2 … FEATUREm
…
GENEn FEATURE1 FEATURE2 … FEATUREm
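As a minimal sketch of this step, the code below turns FASTA text into one row of features per sequence. The two features used here (length and GC fraction) and the record names are assumptions chosen for illustration; the slide does not say which features are computed.

```python
def parse_fasta(text):
    """Return a dict mapping each record name to its full sequence."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # The record name is the first word after ">"
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            records[name].append(line.upper())
    return {n: "".join(parts) for n, parts in records.items()}


def features(seq):
    """Two illustrative features: sequence length and GC fraction."""
    gc = sum(1 for base in seq if base in "GC")
    return len(seq), (gc / len(seq) if seq else 0.0)


# Hypothetical input with two short records
fasta = ">gene1\nACGT\nGGCC\n>gene2\nATAT\n"

# One row per sequence: (name, length, GC fraction)
table = [(name, *features(seq)) for name, seq in parse_fasta(fasta).items()]

for row in table:
    print("\t".join(str(value) for value in row))
```

Each printed line is one row of the table: a name followed by its feature columns.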
2. ONE DATA SET
GENE1 FEATURE1 FEATURE2 … FEATUREm
GENE2 FEATURE1 FEATURE2 … FEATUREm
…
GENEn FEATURE1 FEATURE2 … FEATUREm
SORT
REARRANGE
ADD/CONVERT/EXTRACT (new column FEATUREm+1)
FILTER
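The four operations on a single data set can be sketched with plain Python lists and dictionaries. The column names (start, end, length) and the values are hypothetical; any tabular tool would serve equally well.

```python
# Hypothetical data set: one record per gene
rows = [
    {"gene": "GENE1", "start": 500, "end": 900},
    {"gene": "GENE2", "start": 100, "end": 250},
    {"gene": "GENE3", "start": 300, "end": 1300},
]

# SORT: order the records by start coordinate
by_start = sorted(rows, key=lambda r: r["start"])

# ADD: derive a new column (FEATUREm+1) from existing ones
with_length = [{**r, "length": r["end"] - r["start"]} for r in by_start]

# FILTER: keep only the records that meet a condition
long_genes = [r for r in with_length if r["length"] >= 400]

# REARRANGE: choose which columns to keep and in which order
columns = ["length", "gene"]
rearranged = [[r[c] for c in columns] for r in long_genes]

for row in rearranged:
    print("\t".join(str(value) for value in row))
```

Each step takes a table and returns a new table, so the steps can be applied in any order that makes sense for the question being asked.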
3. TWO DATA SETS
GENE1 FEATURE1 FEATURE2
GENE2 FEATURE1 FEATURE2
…
GENEm FEATURE1 FEATURE2

GENE1 FEATURE1 FEATURE3
GENE2 FEATURE1 FEATURE3
…
GENEn FEATURE1 FEATURE3
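One common way to combine two data sets is to join them on the gene identifier. The slide does not name a specific operation, so the sketch below is an assumption, and the gene names and feature values are hypothetical.

```python
# Hypothetical data sets keyed by gene name
# set_a: gene -> (FEATURE1, FEATURE2); set_b: gene -> (FEATURE1, FEATURE3)
set_a = {
    "GENE1": ("chr1", 120),
    "GENE2": ("chr2", 340),
    "GENE3": ("chr2", 90),
}
set_b = {
    "GENE2": ("chr2", "kinase"),
    "GENE3": ("chr2", "receptor"),
    "GENE4": ("chr5", "unknown"),
}

# Genes present in both sets, and genes unique to each one
shared = sorted(set_a.keys() & set_b.keys())
only_a = sorted(set_a.keys() - set_b.keys())
only_b = sorted(set_b.keys() - set_a.keys())

# JOIN: one row per shared gene with FEATURE1, FEATURE2 and FEATURE3
joined = [(g, set_a[g][0], set_a[g][1], set_b[g][1]) for g in shared]

for row in joined:
    print("\t".join(str(value) for value in row))
```

The lists only_a and only_b hold the genes that appear in just one of the two sets, which is often as informative as the join itself.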
4. PIPELINES/WORKFLOWS
>chromosome1
TACGTACGTAGCTAGCTAGCTACGTAGCTAGCTAGCTACGTAGCTAATGTCGAAGTAACGTACGATCGTAGCTAGCTAGCTGATGCTATCGTAGCTAGCTGATGCATGCGCTAAACACATCGCTTTGGCACGAGCTAGCTAGCTACTACAGCACGGGGGCACGTAGTGCAGCTAGCAGCCGCCGCATCGCCCCCCGATCGATCGTAGCCGACGATCTACTACGTAGCGACTGACTGATCGATGAGGATCGTGAGCTAGCGTGCTAGCTAGCTAGCTAGCTAGCTAGCTAGCAGCTACGTACGTAGCTAGCTACGAGCAGCTAGCTAGCTACGAC
>chromosome22
TACGTACGTAGCTAGCTAGCTACGTAGCTAGCTAGCTACGTAGCTAATGTCGAAGTAACGTACGATCGTAGCTAGCTAGCTGATGCTATCGTAGCTAGCTGATGCATGCGCTAAACACATCGCTTTGGCACGAGCTAGCTAGCTACTACAGCACGGGGGCACGTAGTGCAGCTAGCAGCCGCCGCATCGCCCCCCGATCGATCGTAGCCGACGATCTACTACGTAGCGACTGACTGATCGATGAGGATCGTGAGCTAGCGTGCTAGCTAGCTAGCTAGCTAGCTAGCTAGCAGCTACGTACGTAGCTAGCTACGAGCAGCTAGCTAGCTACGAC
…
>sequence
TACGTACGTAGCTAGCTAGCTACGTAGCTAGCTAGCTACGTAGCTAATGTCGAAGTAACGTACGATCGTAGCTAGCTAGCTGATGCTATCGTAGCTAGCTGATGCATGCGCTAAACACATCGCTTTGGCACGAGCTAGCTAGCTACTACAGCACGGGGGCACGTAGTGCAGCTAGCAGCCGCCGCATCGCCCCCCGATCGATCGTAGCCGACGATCTACTACGTAGCGACTGACTGATCGATGAGGATCGTGAGCTAGCGTGCTAGCTAGCTAGCTAGCTAGCTAGCTAGCAGCTACGTACGTAGCTAGCTACGAGCAGCTAGCTAGCTACGAC
+
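A pipeline can be pictured as a list of steps where the output of each step is the input of the next. The sketch below is only an illustration under that assumption: the step names, the GC threshold and the short sequences are hypothetical, not part of the slides.

```python
from functools import reduce


def run_pipeline(data, steps):
    """Apply each step in order, feeding its output to the next step."""
    return reduce(lambda value, step: step(value), steps, data)


def split_records(text):
    """FASTA text -> list of (name, sequence) pairs."""
    records = []
    for block in text.split(">")[1:]:
        lines = block.strip().splitlines()
        name = lines[0].split()[0]
        sequence = "".join(line.strip() for line in lines[1:])
        records.append((name, sequence))
    return records


def add_gc(records):
    """Append the GC fraction to every (name, sequence) pair."""
    return [
        (name, seq, (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0)
        for name, seq in records
    ]


def keep_gc_rich(records, threshold=0.5):
    """Keep the records whose GC fraction reaches the threshold."""
    return [record for record in records if record[2] >= threshold]


def names_only(records):
    """Reduce every record to its name."""
    return [record[0] for record in records]


# Hypothetical input with two short chromosomes
fasta = ">chromosome1\nGGCCGA\n>chromosome22\nATATAC\n"

steps = [split_records, add_gc, keep_gc_rich, names_only]
result = run_pipeline(fasta, steps)
print(result)
```

Because the steps are ordinary functions, one step can be swapped or removed without touching the others, which is the main appeal of a workflow.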
5. GALAXY
https://main.g2.bx.psu.edu/