biodb 2011-05

Copyright OpenHelix. No use or reproduction without express written consent 1

Upload: bioinformaticsinstitute

Post on 20-Feb-2017


TRANSCRIPT

Page 1: Biodb 2011-05


Page 2: Biodb 2011-05


Important note to slide users:

To maintain the color schemes/cues and the animations, if you import these slides into other slide sets please click the checkbox in the PowerPoint Insert/Reuse window that maintains slide format. Otherwise important information may be lost.

Mac users / PC users

Page 3: Biodb 2011-05

Version 3

ENCODE Data Available through The UCSC Genome Browser

Materials prepared by Mary Mangan, Ph.D., and Warren C. Lathe, Ph.D.
www.openhelix.com

Updated: Q1 2011

Page 4: Biodb 2011-05


ENCODE DCC at UCSC

ENCODE at UCSC: http://encodeproject.org

Introduction
ENCODE Data Types
Find and Use ENCODE Data
ENCODE Downloads
Additional ENCODE Topics
Summary
Exercises

Page 5: Biodb 2011-05

ENCODE: www.genome.gov/10005107

ENCyclopedia of DNA Elements, NHGRI
Consortium of international researchers
UCSC is the Data Coordination Center


Page 6: Biodb 2011-05

ENCODE Background

Pilot phase, or phase I: www.genome.gov/26525202
Selected regions of the genome: 1%, 30 Mb


Page 7: Biodb 2011-05

ENCODE Discoveries

“Marker” papers: Nature and an issue of Genome Research
Changes to our conceptual framework for the genome


Page 8: Biodb 2011-05

ENCODE Pilot Data and Beyond

ENCODE portal: http://genome.ucsc.edu/ENCODE/
Pilot ENCODE browser: genome.ucsc.edu/ENCODE/pilot.html


Page 9: Biodb 2011-05

ENCODE Next Phase: Production Phase

UCSC is the DCC for human and mouse data
The portal is available: genome.ucsc.edu/ENCODE/
New aspects of the Production Phase projects


Page 10: Biodb 2011-05

ENCODE Production Phase Focus

ENCODE is now genome-wide
Specific cell types and new technologies being applied
Project focus topics selected, then supplemented


chromatin

transcriptome/genes

promoters/regulatory sites

DNase sites

Page 11: Biodb 2011-05

ENCODE Data is Flowing!

Data are being submitted to the UCSC DCC by data providers
“Wranglers” ensure metadata is present
Quality checks occur, then data is released for use


Page 12: Biodb 2011-05

ENCODE DCC at UCSC


Page 13: Biodb 2011-05

ENCODE Data Types

Mapping data
Genes
Expression
Regulation
Variation


ENCODE tracks are identified with an icon

Page 14: Biodb 2011-05

Mapability Data

Mapability for unique regions
The higher the peak, the more unique the region
Cleavage intensity for structural profiling


Broad: 36-mers
Duke: 20- to 35-mers
Rosetta: 35-mers
UMass: 15-mers

more unique / not unique
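
To make the "higher the peak, the more unique" idea concrete, here is a minimal Python sketch. It is an illustration only, not the method used by the Broad, Duke, Rosetta, or UMass tracks: the function name `kmer_uniqueness`, the scoring rule (1 divided by the number of times a k-mer occurs), and the toy sequence are all assumptions made for this example, and only the forward strand is considered.

```python
from collections import Counter


def kmer_uniqueness(sequence: str, k: int) -> list[float]:
    """Score each k-mer start position as 1 / (occurrences of that k-mer).

    Hypothetical scoring rule for illustration: 1.0 means the k-mer occurs
    once in the sequence (unique); smaller values mean it is repeated.
    """
    sequence = sequence.upper()
    # Every k-mer in the sequence, indexed by its start position.
    kmers = [sequence[i:i + k] for i in range(len(sequence) - k + 1)]
    counts = Counter(kmers)
    return [1.0 / counts[kmer] for kmer in kmers]


if __name__ == "__main__":
    toy = "ACGTACGTTT"  # toy sequence, not real genome data
    for start, score in enumerate(kmer_uniqueness(toy, 4)):
        print(start, toy[start:start + 4], score)
```

In the toy sequence the 4-mer ACGT occurs twice, so both of its start positions score 0.5, while every other 4-mer scores 1.0. Longer k-mers are more often unique, which is one reason tracks built from different read lengths can differ.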

Page 15: Biodb 2011-05

GENCODE http://www.sanger.ac.uk/PostGenomics/encode/

GENCODE for assessment of protein-coding genes

Page 16: Biodb 2011-05

Expression Data: RNA Localization

RNA molecules: location in various cell types and fractions

http://en.wikipedia.org/wiki/MRNA

Page 17: Biodb 2011-05

Expression Data: Presence of RNA or Exons

RNAs of various types
Special look for long mRNAs and exons


http://en.wikipedia.org/wiki/MRNA

Page 18: Biodb 2011-05

Regulation Data

Regulation data
Structure: modifications, open vs. closed chromatin


Image from NIH

Page 19: Biodb 2011-05

Regulation Data II

Transcription factor binding sites (TFBS)
RNA-binding proteins


TATA bound to DNA

Page 20: Biodb 2011-05

Variation Data

Copy Number Variation (CNV) Data

Page 21: Biodb 2011-05

Super-Tracks

New strategies to integrate and display data
Super-Tracks provide multiple data types to view
See Track Description page for details, options, and keys


Page 22: Biodb 2011-05

ENCODE DCC at UCSC


Page 23: Biodb 2011-05

General Organization

Tracks identified with icon
Also available in Table Browser
Description pages have options, settings, filters, display keys, metadata, and references

Configuration: choices, options, filters

Display key, techniques, references, contacts

click

Page 24: Biodb 2011-05

ENCODE Data Policy: genome.ucsc.edu/ENCODE/terms.html

Non-scoop window
“Ft. Lauderdale agreement”


Page 25: Biodb 2011-05

Awareness of Embargo Dates

Track description pages, Table Browser interface
Download pages
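
A short Python sketch of how an embargo ("non-scoop") check could be worked out from a release date. The nine-month window length used as the default is an assumption of this sketch and should be checked against the policy at genome.ucsc.edu/ENCODE/terms.html; the function names and the example dates are hypothetical. The authoritative restriction date for any data set is the one shown on its track description, Table Browser, or download page.

```python
import calendar
from datetime import date


def add_months(start: date, months: int) -> date:
    """Return the date `months` later, clamping to the last day of the month."""
    month_index = start.month - 1 + months
    year = start.year + month_index // 12
    month = month_index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start.day, last_day))


def restricted_until(released: date, window_months: int = 9) -> date:
    """End of the assumed non-scoop window for data released on `released`."""
    return add_months(released, window_months)


def is_restricted(released: date, today: date, window_months: int = 9) -> bool:
    """True while `today` falls before the end of the assumed window."""
    return today < restricted_until(released, window_months)


if __name__ == "__main__":
    released = date(2010, 5, 31)  # hypothetical release date
    print(restricted_until(released))
    print(is_restricted(released, date(2011, 1, 15)))
```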


Page 26: Biodb 2011-05

ChIP-seq Data for TFBS

Yale TFBS
Sample display near TP53 in “dense” visibility mode
ChIP-seq graphic adapted from: wikipedia.org/wiki/ChIP-on-chip


TP53

stronger signals
cell types + antibodies
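
A view like this one can also be opened through a Genome Browser link that sets the assembly, the position, and a track's visibility mode. The Python sketch below only builds such a link and does not contact the server. The assembly name `hg19` and the track identifier `wgEncodeYaleChIPseq` are assumptions made for illustration; check the track description page for the identifier that applies to the assembly you are using.

```python
from urllib.parse import urlencode

HGTRACKS = "http://genome.ucsc.edu/cgi-bin/hgTracks"
VISIBILITY_MODES = {"hide", "dense", "squish", "pack", "full"}


def browser_link(db: str, position: str, track_visibility: dict[str, str]) -> str:
    """Build a Genome Browser URL for an assembly, a position, and track modes."""
    for track, mode in track_visibility.items():
        if mode not in VISIBILITY_MODES:
            raise ValueError(f"unknown visibility mode for {track}: {mode}")
    params = {"db": db, "position": position}
    params.update(track_visibility)  # e.g. {"someTrack": "dense"}
    return HGTRACKS + "?" + urlencode(params)


if __name__ == "__main__":
    # Assembly and track name are assumed values for this example.
    print(browser_link("hg19", "TP53", {"wgEncodeYaleChIPseq": "dense"}))
```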

Page 27: Biodb 2011-05

Description Page, Upper

See description page for more display options
Choose tracks and view styles


display mode

peak configure

download

Page 28: Biodb 2011-05

Description Page, Lower

Display conventions explained
Methods and references


Page 29: Biodb 2011-05

ENCODE DCC at UCSC


Page 30: Biodb 2011-05

Downloads and Release Log

Release log for a handy list of available data
Download is offered; FTP recommended


Release log

Human

Mouse

Page 31: Biodb 2011-05

ENCODE DCC at UCSC


Page 32: Biodb 2011-05

New Features

Mouse data
Proteomics data
Publications
Questions? UCSC mailing list, or ENCODE at NHGRI


encode-announce mailing list: https://lists.soe.ucsc.edu/mailman/listinfo/encode-announce

UCSC Genome Browser discussion list: http://genome.ucsc.edu/contacts.html

Page 33: Biodb 2011-05

modENCODE: modencode.org

A separate modENCODE: www.genome.gov/26524507
C. elegans and D. melanogaster
modENCODE DCC: www.modencode.org


Science 24 December 2010: Vol. 330

new

February 2011 issue

Page 34: Biodb 2011-05

ENCODE DCC at UCSC


Page 35: Biodb 2011-05

Summary

Encyclopedia of DNA Elements
Data Coordination Center at UCSC Genome Browser


Page 36: Biodb 2011-05

ENCODE DCC at UCSC


Page 37: Biodb 2011-05


Hands-on session for ENCODE at UCSC

Exercises on the handouts
We will walk through them together
2 styles: questions only, and step-by-step
When we have finished the formal exercises, we can help you investigate issues that you want to understand for your research

Page 38: Biodb 2011-05


Notice:

The materials and slides offered are for non-commercial use only. Reproduction, distribution and/or use for commercial purposes is strictly prohibited.

Copyright 2010, OpenHelix, LLC

http://www.openhelix.com/ENCODE

Page 39: Biodb 2011-05
