![Page 1: Using Entrez The Life Sciences Search Engine. Searching NCBI Databases Efficiently Knowing how to retrieve the exact information you need in an efficient](https://reader035.vdocuments.mx/reader035/viewer/2022070307/551b640c550346d31b8b59d8/html5/thumbnails/1.jpg)
Using Entrez
The Life Sciences Search Engine
![Page 2: Using Entrez The Life Sciences Search Engine. Searching NCBI Databases Efficiently Knowing how to retrieve the exact information you need in an efficient](https://reader035.vdocuments.mx/reader035/viewer/2022070307/551b640c550346d31b8b59d8/html5/thumbnails/2.jpg)
Searching NCBI Databases Efficiently
• Knowing how to retrieve the exact information you need in an efficient way is the fundamental and most important skill in Bioinformatics.
• Every NCBI database is designed and created for some specific purposes.
• A common mistake Bioinformatics novices make is searching for information in an inappropriate database.
• Entrez links among and within databases, making it easier to search for information.
![Page 3: Using Entrez The Life Sciences Search Engine. Searching NCBI Databases Efficiently Knowing how to retrieve the exact information you need in an efficient](https://reader035.vdocuments.mx/reader035/viewer/2022070307/551b640c550346d31b8b59d8/html5/thumbnails/3.jpg)
What is Entrez?
• Entrez is an NCBI retrieval system designed for searching several linked databases.
• Entrez is a search tool for integrated access to the biological literature and sequence data.
• Entrez is extremely powerful, enabling the user to quickly move between the different specialized databases.
![Page 4: Using Entrez The Life Sciences Search Engine. Searching NCBI Databases Efficiently Knowing how to retrieve the exact information you need in an efficient](https://reader035.vdocuments.mx/reader035/viewer/2022070307/551b640c550346d31b8b59d8/html5/thumbnails/4.jpg)
Entrez
• Entrez is divided into sites for nucleotide, protein, structure, genomes, OMIM, and more. You can use limits (such as RefSeq) to focus your Entrez search.
• When you conduct a search via Entrez, your query generates this screen, telling you the number of hits to your query.
![Page 5: Using Entrez The Life Sciences Search Engine. Searching NCBI Databases Efficiently Knowing how to retrieve the exact information you need in an efficient](https://reader035.vdocuments.mx/reader035/viewer/2022070307/551b640c550346d31b8b59d8/html5/thumbnails/5.jpg)
The Entrez System
![Page 6: Using Entrez The Life Sciences Search Engine. Searching NCBI Databases Efficiently Knowing how to retrieve the exact information you need in an efficient](https://reader035.vdocuments.mx/reader035/viewer/2022070307/551b640c550346d31b8b59d8/html5/thumbnails/6.jpg)
The Big Picture
[Diagram of Entrez and the resources it connects: LocusLink, Nucleotide, Protein, OMIM, PubMed, SNP, MGC, UCSC, GDB, e!, HGMD, UniGene, Homologene, MapViewer, Structure, 3D Domains, CDD, Books, PopSet, Genome, Taxonomy, ProbeSet, UniSTS]
![Page 7: Using Entrez The Life Sciences Search Engine. Searching NCBI Databases Efficiently Knowing how to retrieve the exact information you need in an efficient](https://reader035.vdocuments.mx/reader035/viewer/2022070307/551b640c550346d31b8b59d8/html5/thumbnails/7.jpg)
Entrez and LocusLink
• Entrez doesn’t link to all the databases that contain sequences, however!
• LocusLink has its own groups of links to specialty databases, since it doesn’t cover all the genomes yet.
![Page 8: Using Entrez The Life Sciences Search Engine. Searching NCBI Databases Efficiently Knowing how to retrieve the exact information you need in an efficient](https://reader035.vdocuments.mx/reader035/viewer/2022070307/551b640c550346d31b8b59d8/html5/thumbnails/8.jpg)
Entrez: Database Integration
[Diagram: Genomes, Taxonomy, PubMed abstracts, Nucleotide sequences, Protein sequences, and 3-D Structure, connected by links labeled Word weight, VAST, BLAST, and Phylogeny]
![Page 9: Using Entrez The Life Sciences Search Engine. Searching NCBI Databases Efficiently Knowing how to retrieve the exact information you need in an efficient](https://reader035.vdocuments.mx/reader035/viewer/2022070307/551b640c550346d31b8b59d8/html5/thumbnails/9.jpg)
The (ever) Expanding Entrez System
[Diagram: Entrez surrounded by Journals, UniGene, PubMed, Nucleotide, Protein, SNP, Genome, Books, ProbeSet, OMIM, CDD, Taxonomy, 3D Domains, UniSTS, PopSet, and Structure]
![Page 10: Using Entrez The Life Sciences Search Engine. Searching NCBI Databases Efficiently Knowing how to retrieve the exact information you need in an efficient](https://reader035.vdocuments.mx/reader035/viewer/2022070307/551b640c550346d31b8b59d8/html5/thumbnails/10.jpg)
Entrez Databases
• PubMed: Biomedical literature
• Books: Online textbooks
• Nucleotide: GenBank, EMBL, DDBJ, RefSeq, PDB
• Protein: [GenBank, EMBL, DDBJ], RefSeq, SWISS-PROT, PIR, PRF, PDB
• Genome: Complete genomes
• Taxonomy: Organisms in NCBI sequence databases
• Structure: MMDB (experimental 3D structures)
• Domains: CDD (conserved protein domains)
• 3D Domains: Compact 3D protein domains in MMDB
• OMIM: Online Mendelian Inheritance in Man
• SNP: Single nucleotide polymorphisms
• UniSTS: Sequence Tagged Site markers
• ProbeSet: Gene expression and microarray datasets
• PopSet: Population study datasets
• UniGene: Gene-based expressed sequence clusters
![Page 11: Using Entrez The Life Sciences Search Engine. Searching NCBI Databases Efficiently Knowing how to retrieve the exact information you need in an efficient](https://reader035.vdocuments.mx/reader035/viewer/2022070307/551b640c550346d31b8b59d8/html5/thumbnails/11.jpg)
Nucleotide Database
• The Nucleotide database contains sequence data from GenBank, EMBL, and DDBJ, the members of the tripartite, international collaboration of sequence databases.
• EMBL is the European Molecular Biology Laboratory, whose nucleotide sequence database is maintained at Hinxton Hall, UK;
• DDBJ is the DNA Database of Japan in Mishima, Japan.
• Sequence data are also incorporated from the Genome Sequence Data Base (GSDB), Santa Fe, NM.
• Patent sequences are incorporated through arrangements with the U.S. Patent and Trademark Office (USPTO) and via the collaborating international databases from other international patent offices.
![Page 12: Using Entrez The Life Sciences Search Engine. Searching NCBI Databases Efficiently Knowing how to retrieve the exact information you need in an efficient](https://reader035.vdocuments.mx/reader035/viewer/2022070307/551b640c550346d31b8b59d8/html5/thumbnails/12.jpg)
Entrez Nucleotides
Primary
• GenBank / EMBL / DDBJ 35,116,960
Derivative
• RefSeq 259,219
• Third Party Annotation 3,182
• PDB 4,703
Total 35,384,248
![Page 13: Using Entrez The Life Sciences Search Engine. Searching NCBI Databases Efficiently Knowing how to retrieve the exact information you need in an efficient](https://reader035.vdocuments.mx/reader035/viewer/2022070307/551b640c550346d31b8b59d8/html5/thumbnails/13.jpg)
Database Searching with Entrez
• Using limits and field restriction to find plant g6pdh
• Linking and neighboring with g6pdh
![Page 14: Using Entrez The Life Sciences Search Engine. Searching NCBI Databases Efficiently Knowing how to retrieve the exact information you need in an efficient](https://reader035.vdocuments.mx/reader035/viewer/2022070307/551b640c550346d31b8b59d8/html5/thumbnails/14.jpg)
Entrez Nucleotides
glucose 6 phosphate dehydrogenase
The G6PD enzyme catalyzes the oxidation of glucose-6-phosphate to 6-phosphogluconolactone (which is then hydrolyzed to 6-phosphogluconate), while reducing nicotinamide adenine dinucleotide phosphate (NADP+) to NADPH. In terms of electron transfer, glucose-6-phosphate loses two electrons and NADP+ gains two electrons to become NADPH. This is the first step in the pentose phosphate pathway. This pathway, or shunt, as it is sometimes called, produces the 5-carbon sugar ribose, which is an essential component of RNA and, in its deoxy form, of DNA.
![Page 15: Using Entrez The Life Sciences Search Engine. Searching NCBI Databases Efficiently Knowing how to retrieve the exact information you need in an efficient](https://reader035.vdocuments.mx/reader035/viewer/2022070307/551b640c550346d31b8b59d8/html5/thumbnails/15.jpg)
![Page 16: Using Entrez The Life Sciences Search Engine. Searching NCBI Databases Efficiently Knowing how to retrieve the exact information you need in an efficient](https://reader035.vdocuments.mx/reader035/viewer/2022070307/551b640c550346d31b8b59d8/html5/thumbnails/16.jpg)
Limits Are Helpful
• Limits allow restriction of a search to a defined subset of the database.
• Limits can be set to restrict a search to a particular database field (e.g., the Author field).
• Limits can be set to search everything but a particular type of data (e.g., “exclude patent records”).
• Alternatively, limits can be set to search only a particular type of data (e.g., Genomic RNA/DNA) or to search only data from a particular source database (e.g., EMBL). Date limits and sequence length limits are also possible.
• The contents of each Entrez database differ, and therefore the Limits available for each database differ.
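The limits described above can also be written directly into the query text as field tags. The following is a minimal sketch, not an official NCBI tool: the helper name `build_entrez_query` is hypothetical, and the property values `biomol_mrna` and `gbdiv_pat` are assumed to be indexed values of the Properties field (check them in Preview/Index before relying on them).

```python
def build_entrez_query(text, organism=None, title_only=False,
                       properties=(), exclude_properties=()):
    """Compose an Entrez query string from a phrase plus optional limits.

    Hypothetical helper for illustration; it only builds text and does
    not contact NCBI.
    """
    # Restrict the phrase to the Title field, or search all fields.
    field = "[Title]" if title_only else "[All Fields]"
    clauses = [f'"{text}"{field}']

    # Limit to an organism (or a higher taxon such as "green plants").
    if organism:
        clauses.append(f"{organism}[Organism]")

    # "Search only a particular type of data" limits.
    clauses.extend(f"{value}[Properties]" for value in properties)
    query = " AND ".join(clauses)

    # "Search everything but a particular type of data" limits.
    for value in exclude_properties:
        query += f" NOT {value}[Properties]"
    return query


query = build_entrez_query(
    "glucose 6 phosphate dehydrogenase",
    organism="green plants",
    title_only=True,
    properties=["biomol_mrna"],        # assumed value: mRNA molecule type
    exclude_properties=["gbdiv_pat"],  # assumed value: patent records
)
print(query)
```

Pasting the printed string into the Entrez search box should behave like setting the same options on the Limits page, provided the assumed property values exist in the index.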
![Page 17: Using Entrez The Life Sciences Search Engine. Searching NCBI Databases Efficiently Knowing how to retrieve the exact information you need in an efficient](https://reader035.vdocuments.mx/reader035/viewer/2022070307/551b640c550346d31b8b59d8/html5/thumbnails/17.jpg)
glucose 6 phosphate dehydrogenase
Entrez Nucleotides: Limits & Preview/Index
Try using the Limits and Preview/Index functions to hone your search and find the plant G6PD genes.
![Page 18: Using Entrez The Life Sciences Search Engine. Searching NCBI Databases Efficiently Knowing how to retrieve the exact information you need in an efficient](https://reader035.vdocuments.mx/reader035/viewer/2022070307/551b640c550346d31b8b59d8/html5/thumbnails/18.jpg)
glucose 6 phosphate dehydrogenase
Entrez Nucleotides: Limits
Available fields: Accession, All Fields, Author Name, EC/RN Number, Feature key, Filter, Gene Name, Issue, Journal Name, Keyword, Modification Date, Organism, Page Number, Primary Accession, Properties, Protein Name, Publication Date, SeqID String, Sequence Length, Substance Name, Text Word, Title Word, Uid, Volume
Field Restriction
Exclude bulk sequences
![Page 19: Using Entrez The Life Sciences Search Engine. Searching NCBI Databases Efficiently Knowing how to retrieve the exact information you need in an efficient](https://reader035.vdocuments.mx/reader035/viewer/2022070307/551b640c550346d31b8b59d8/html5/thumbnails/19.jpg)
glucose 6 phosphate dehydrogenase
Entrez Nucleotides: Limits
Title == Definition
Exclude Bulk Sequences
mRNA molecule type
Nuclear gene
![Page 20: Using Entrez The Life Sciences Search Engine. Searching NCBI Databases Efficiently Knowing how to retrieve the exact information you need in an efficient](https://reader035.vdocuments.mx/reader035/viewer/2022070307/551b640c550346d31b8b59d8/html5/thumbnails/20.jpg)
Document Summaries: Limits
![Page 21: Using Entrez The Life Sciences Search Engine. Searching NCBI Databases Efficiently Knowing how to retrieve the exact information you need in an efficient](https://reader035.vdocuments.mx/reader035/viewer/2022070307/551b640c550346d31b8b59d8/html5/thumbnails/21.jpg)
Adding Terms: Preview/Index
Term added: green plants
Available fields: Accession, All Fields, Author Name, EC/RN Number, Feature key, Filter, Gene Name, Issue, Journal Name, Keyword, Modification Date, Organism, Page Number, Primary Accession, Properties, Protein Name, Publication Date, SeqID String, Sequence Length, Substance Name, Text Word, Title Word, Uid, Volume
![Page 22: Using Entrez The Life Sciences Search Engine. Searching NCBI Databases Efficiently Knowing how to retrieve the exact information you need in an efficient](https://reader035.vdocuments.mx/reader035/viewer/2022070307/551b640c550346d31b8b59d8/html5/thumbnails/22.jpg)
Plant cytosolic g6pdh mRNAs
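The same kind of search can be issued programmatically. The slides only show the web interface, so the sketch below is an assumption layered on top of them: it uses NCBI's E-utilities `esearch` endpoint with its `db`, `term`, and `retmax` parameters, and it only builds the request URL rather than sending it. The helper name `esearch_url` is hypothetical.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# Base address of the E-utilities search service (not covered in the slides).
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def esearch_url(db, term, retmax=20):
    """Return the URL for an Entrez search of `db` with query `term`."""
    params = {"db": db, "term": term, "retmax": retmax}
    return f"{ESEARCH}?{urlencode(params)}"


# A query in the spirit of the plant g6pdh example; the exact wording is illustrative.
term = '"glucose 6 phosphate dehydrogenase"[Title] AND green plants[Organism]'
url = esearch_url("nuccore", term)
print(url)

# Decoding the URL again shows the parameters survive the round trip.
decoded = parse_qs(urlparse(url).query)
print(decoded["db"], decoded["retmax"])
```

Fetching the URL (for example with `urllib.request.urlopen`) would return the hit count and a list of record identifiers; that step is left out here so the sketch runs without network access.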
![Page 23: Using Entrez The Life Sciences Search Engine. Searching NCBI Databases Efficiently Knowing how to retrieve the exact information you need in an efficient](https://reader035.vdocuments.mx/reader035/viewer/2022070307/551b640c550346d31b8b59d8/html5/thumbnails/23.jpg)
Database Neighbors and Interlinking
• What makes Entrez more powerful than many services is that most of its records are linked to other records, both within a given database (such as Nucleotide) and between databases.
• Links within a database are called “neighbors” (e.g., Nucleotide neighbors).
![Page 24: Using Entrez The Life Sciences Search Engine. Searching NCBI Databases Efficiently Knowing how to retrieve the exact information you need in an efficient](https://reader035.vdocuments.mx/reader035/viewer/2022070307/551b640c550346d31b8b59d8/html5/thumbnails/24.jpg)
Links Between Databases
• Protein and Nucleotide neighbors are determined by performing similarity searches using the BLAST algorithm to compare the entry amino acid or DNA sequence to all other amino acid or DNA sequences in the database. We will discuss more about BLAST later.
• Nucleotide sequence records in the Nucleotide database are linked to the PubMed citation of the article in which the sequences were published.
• Protein sequence records are linked to the nucleotide sequence from which the protein was translated.
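The links listed above can be requested outside the browser as well. This is a sketch under assumptions the slides do not state: it relies on the E-utilities `elink` endpoint with its `dbfrom`, `db`, and `id` parameters, the helper name `elink_url` is hypothetical, and the record identifier `12345` is a placeholder, not a real g6pdh record.

```python
from urllib.parse import urlencode

# Base address of the E-utilities link service (not covered in the slides).
ELINK = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi"


def elink_url(dbfrom, db, uid):
    """Return the URL asking for records in `db` linked to `uid` in `dbfrom`."""
    return f"{ELINK}?{urlencode({'dbfrom': dbfrom, 'db': db, 'id': uid})}"


PLACEHOLDER_UID = "12345"  # hypothetical identifier, for illustration only

# Links between databases: nucleotide record to its PubMed citation,
# and protein record to the nucleotide sequence it was translated from.
to_pubmed = elink_url("nuccore", "pubmed", PLACEHOLDER_UID)
to_nucleotide = elink_url("protein", "nuccore", PLACEHOLDER_UID)

# Neighbors: source and target database are the same.
neighbors = elink_url("nuccore", "nuccore", PLACEHOLDER_UID)

for link in (to_pubmed, to_nucleotide, neighbors):
    print(link)
```

Using the same database for `dbfrom` and `db` mirrors the distinction the slides draw between neighbors (within a database) and links (between databases).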
![Page 25: Using Entrez The Life Sciences Search Engine. Searching NCBI Databases Efficiently Knowing how to retrieve the exact information you need in an efficient](https://reader035.vdocuments.mx/reader035/viewer/2022070307/551b640c550346d31b8b59d8/html5/thumbnails/25.jpg)
Plant cytosolic g6pdh mRNAs
Formats: Summary, Brief, GenBank, ASN.1, FASTA, GI list
Links and neighbors (related records): LinkOut, PubMed Links, Protein Links, Nucleotide Neighbors, PopSet Links, Structure Links, Genome Links, Taxonomy Links, OMIM Links
![Page 26: Using Entrez The Life Sciences Search Engine. Searching NCBI Databases Efficiently Knowing how to retrieve the exact information you need in an efficient](https://reader035.vdocuments.mx/reader035/viewer/2022070307/551b640c550346d31b8b59d8/html5/thumbnails/26.jpg)
LinkOut
• LinkOut is a feature of Entrez that is designed to provide users with links from PubMed and other Entrez databases to a wide variety of relevant web-accessible online resources:
– Full-text publications
– Other biological databases
– Consumer health information
– Research tools
• The goal is to facilitate access to relevant online resources beyond the Entrez system to extend, clarify, or supplement information found in the Entrez databases.
![Page 27: Using Entrez The Life Sciences Search Engine. Searching NCBI Databases Efficiently Knowing how to retrieve the exact information you need in an efficient](https://reader035.vdocuments.mx/reader035/viewer/2022070307/551b640c550346d31b8b59d8/html5/thumbnails/27.jpg)
Protein Database
• The Protein database includes proteins from translated regions of DNA in GenBank as well as sequences from PIR.
• The entry includes:
– The name of the protein
– How the protein sequence was derived
– An accession and a PID number
– The number of amino acids
![Page 28: Using Entrez The Life Sciences Search Engine. Searching NCBI Databases Efficiently Knowing how to retrieve the exact information you need in an efficient](https://reader035.vdocuments.mx/reader035/viewer/2022070307/551b640c550346d31b8b59d8/html5/thumbnails/28.jpg)
Protein Entry
The entry also includes:
• Structural information for the protein (if known)
– Helices and β-sheets
– Domains
– Etc.
• The sequence of amino acids comprising the protein
![Page 29: Using Entrez The Life Sciences Search Engine. Searching NCBI Databases Efficiently Knowing how to retrieve the exact information you need in an efficient](https://reader035.vdocuments.mx/reader035/viewer/2022070307/551b640c550346d31b8b59d8/html5/thumbnails/29.jpg)
Setting Protein Database search limits
• Choose Protein from the drop-down menu
– Can do a Boolean search
– Or can set LIMITS
• Fields (e.g., Author, Journal, etc.)
• Gene Location (genomic, mitochondrial, etc.)
• Segmented Sequence
• Only from (Database to check)
• Modification date
![Page 30: Using Entrez The Life Sciences Search Engine. Searching NCBI Databases Efficiently Knowing how to retrieve the exact information you need in an efficient](https://reader035.vdocuments.mx/reader035/viewer/2022070307/551b640c550346d31b8b59d8/html5/thumbnails/30.jpg)
Linking Between Databases
• Sometimes you will pull up a record and you have no idea what organism the gene you are looking at is from.
• For example, in the following record, what is Medicago sativa?
![Page 31: Using Entrez The Life Sciences Search Engine. Searching NCBI Databases Efficiently Knowing how to retrieve the exact information you need in an efficient](https://reader035.vdocuments.mx/reader035/viewer/2022070307/551b640c550346d31b8b59d8/html5/thumbnails/31.jpg)
Entrez GenBank / GenPept
![Page 32: Using Entrez The Life Sciences Search Engine. Searching NCBI Databases Efficiently Knowing how to retrieve the exact information you need in an efficient](https://reader035.vdocuments.mx/reader035/viewer/2022070307/551b640c550346d31b8b59d8/html5/thumbnails/32.jpg)
Taxonomy to the Rescue
• Entrez lets you click a live link from the record and determine what organism Medicago sativa is.
• It is alfalfa.
• You can also tell what it is related to taxonomically, because sometimes the common name isn’t very useful either!
![Page 33: Using Entrez The Life Sciences Search Engine. Searching NCBI Databases Efficiently Knowing how to retrieve the exact information you need in an efficient](https://reader035.vdocuments.mx/reader035/viewer/2022070307/551b640c550346d31b8b59d8/html5/thumbnails/33.jpg)
Taxonomy Link
![Page 34: Using Entrez The Life Sciences Search Engine. Searching NCBI Databases Efficiently Knowing how to retrieve the exact information you need in an efficient](https://reader035.vdocuments.mx/reader035/viewer/2022070307/551b640c550346d31b8b59d8/html5/thumbnails/34.jpg)
Advanced Neighbors: BLink
![Page 35: Using Entrez The Life Sciences Search Engine. Searching NCBI Databases Efficiently Knowing how to retrieve the exact information you need in an efficient](https://reader035.vdocuments.mx/reader035/viewer/2022070307/551b640c550346d31b8b59d8/html5/thumbnails/35.jpg)
What is BLink
• BLink: BLAST Link
• Someone has done a BLAST search already, and you can just retrieve it!
• BLink displays the graphical output of pre-computed blastp results against the protein non-redundant (nr) database.
![Page 36: Using Entrez The Life Sciences Search Engine. Searching NCBI Databases Efficiently Knowing how to retrieve the exact information you need in an efficient](https://reader035.vdocuments.mx/reader035/viewer/2022070307/551b640c550346d31b8b59d8/html5/thumbnails/36.jpg)
This graphical output includes:
• Alignment of up to 200 BLAST hits on the query sequence
• Best Hits to each organism
• List of known protein domains in the query sequence
• Filter hits by selecting the BLAST cutoff score
• Distribution of hits by taxonomic grouping
• Display of similar sequences with known 3D structure
• Filter hits by database and/or by taxonomic grouping
• Display a taxonomic tree of all organisms with similar sequences
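Two of the options above, filtering by cutoff score and showing the best hit for each organism, are simple operations on a list of hits. The sketch below illustrates the idea only; it is not how BLink is implemented, and the `Hit` record, the accessions, the organism names, and the scores are all made-up placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One precomputed BLAST hit (hypothetical record layout)."""
    accession: str
    organism: str
    score: float


def filter_by_score(hits, cutoff):
    """Keep only hits whose score is at least `cutoff`."""
    return [hit for hit in hits if hit.score >= cutoff]


def best_hit_per_organism(hits):
    """Return a dict mapping each organism to its highest-scoring hit."""
    best = {}
    for hit in hits:
        current = best.get(hit.organism)
        if current is None or hit.score > current.score:
            best[hit.organism] = hit
    return best


# Placeholder data, invented for illustration only.
hits = [
    Hit("HYP_0001", "organism A", 310.0),
    Hit("HYP_0002", "organism A", 120.0),
    Hit("HYP_0003", "organism B", 95.0),
    Hit("HYP_0004", "organism C", 40.0),
]

kept = filter_by_score(hits, cutoff=90.0)
best = best_hit_per_organism(kept)
for organism, hit in best.items():
    print(organism, hit.accession, hit.score)
```

Applying the cutoff first and then grouping by organism matches the order in which a user would narrow the display: drop weak hits, then look at one representative per organism.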
![Page 37: Using Entrez The Life Sciences Search Engine. Searching NCBI Databases Efficiently Knowing how to retrieve the exact information you need in an efficient](https://reader035.vdocuments.mx/reader035/viewer/2022070307/551b640c550346d31b8b59d8/html5/thumbnails/37.jpg)
PopSet Links
• The PopSet database contains aligned sequences submitted as a set resulting from a population, phylogenetic, or mutation study.
• These alignments describe such events as evolution and population variation.
• The PopSet database contains both nucleotide and protein sequence data.
![Page 38: Using Entrez The Life Sciences Search Engine. Searching NCBI Databases Efficiently Knowing how to retrieve the exact information you need in an efficient](https://reader035.vdocuments.mx/reader035/viewer/2022070307/551b640c550346d31b8b59d8/html5/thumbnails/38.jpg)
Protein Neighbors->PopSet Links
![Page 39: Using Entrez The Life Sciences Search Engine. Searching NCBI Databases Efficiently Knowing how to retrieve the exact information you need in an efficient](https://reader035.vdocuments.mx/reader035/viewer/2022070307/551b640c550346d31b8b59d8/html5/thumbnails/39.jpg)
Protein Neighbors->Genome Links
![Page 40: Using Entrez The Life Sciences Search Engine. Searching NCBI Databases Efficiently Knowing how to retrieve the exact information you need in an efficient](https://reader035.vdocuments.mx/reader035/viewer/2022070307/551b640c550346d31b8b59d8/html5/thumbnails/40.jpg)
PopSet search results
• The results of a PopSet search
• The PopSet database includes alignments of genes from multiple organisms OR different gene families OR mutational analyses
![Page 41: Using Entrez The Life Sciences Search Engine. Searching NCBI Databases Efficiently Knowing how to retrieve the exact information you need in an efficient](https://reader035.vdocuments.mx/reader035/viewer/2022070307/551b640c550346d31b8b59d8/html5/thumbnails/41.jpg)
PopSet Entry
• The PopSet entry includes:
– The title of the paper/study
– The length of the sequence(s) aligned
– The number of aligned sequences
![Page 42: Using Entrez The Life Sciences Search Engine. Searching NCBI Databases Efficiently Knowing how to retrieve the exact information you need in an efficient](https://reader035.vdocuments.mx/reader035/viewer/2022070307/551b640c550346d31b8b59d8/html5/thumbnails/42.jpg)
PopSet Entry without alignment
• The PopSet entry without an alignment includes:
– Title of the study
– The number of sequences included
– Links to the sequences
![Page 43: Using Entrez The Life Sciences Search Engine. Searching NCBI Databases Efficiently Knowing how to retrieve the exact information you need in an efficient](https://reader035.vdocuments.mx/reader035/viewer/2022070307/551b640c550346d31b8b59d8/html5/thumbnails/43.jpg)
Entrez Structures
![Page 44: Using Entrez The Life Sciences Search Engine. Searching NCBI Databases Efficiently Knowing how to retrieve the exact information you need in an efficient](https://reader035.vdocuments.mx/reader035/viewer/2022070307/551b640c550346d31b8b59d8/html5/thumbnails/44.jpg)
Protein structures are also stored in databases
http://bmbiris.bmb.uga.edu/wampler/tutorial/prot0.html is a useful review tutorial.
![Page 45: Using Entrez The Life Sciences Search Engine. Searching NCBI Databases Efficiently Knowing how to retrieve the exact information you need in an efficient](https://reader035.vdocuments.mx/reader035/viewer/2022070307/551b640c550346d31b8b59d8/html5/thumbnails/45.jpg)
Entrez links to structure databases
• The Structure database or Molecular Modeling Database (MMDB) contains experimental data from crystallographic and NMR structure determinations.
• The data for MMDB are obtained from the Protein Data Bank (PDB).
• The NCBI has cross-linked structural data to bibliographic information, to the sequence databases, and to the NCBI taxonomy.
• Use Cn3D, the NCBI 3D structure viewer, for easy interactive visualization of molecular structures from Entrez.
![Page 46: Using Entrez The Life Sciences Search Engine. Searching NCBI Databases Efficiently Knowing how to retrieve the exact information you need in an efficient](https://reader035.vdocuments.mx/reader035/viewer/2022070307/551b640c550346d31b8b59d8/html5/thumbnails/46.jpg)
Structure Search results
• Protein structures are also stored in a database
• Search as before
• Your search results are similar
![Page 47: Using Entrez The Life Sciences Search Engine. Searching NCBI Databases Efficiently Knowing how to retrieve the exact information you need in an efficient](https://reader035.vdocuments.mx/reader035/viewer/2022070307/551b640c550346d31b8b59d8/html5/thumbnails/47.jpg)
Structure Entry
• The Structure entry has links to the other databases
• It also lets you download a file to open with a structure viewer program
![Page 48: Using Entrez The Life Sciences Search Engine. Searching NCBI Databases Efficiently Knowing how to retrieve the exact information you need in an efficient](https://reader035.vdocuments.mx/reader035/viewer/2022070307/551b640c550346d31b8b59d8/html5/thumbnails/48.jpg)
• Proteins with similar structures and functions have been identified in the databases
![Page 49: Using Entrez The Life Sciences Search Engine. Searching NCBI Databases Efficiently Knowing how to retrieve the exact information you need in an efficient](https://reader035.vdocuments.mx/reader035/viewer/2022070307/551b640c550346d31b8b59d8/html5/thumbnails/49.jpg)
BLink: Advanced Protein Neighbors
![Page 50: Using Entrez The Life Sciences Search Engine. Searching NCBI Databases Efficiently Knowing how to retrieve the exact information you need in an efficient](https://reader035.vdocuments.mx/reader035/viewer/2022070307/551b640c550346d31b8b59d8/html5/thumbnails/50.jpg)
BLink: Related Structures
![Page 51: Using Entrez The Life Sciences Search Engine. Searching NCBI Databases Efficiently Knowing how to retrieve the exact information you need in an efficient](https://reader035.vdocuments.mx/reader035/viewer/2022070307/551b640c550346d31b8b59d8/html5/thumbnails/51.jpg)
Viewing Structure in Cn3D
• You can download Cn3D (a structural viewer program) from NCBI
• This will allow you to view the structures from the Structure database
![Page 52: Using Entrez The Life Sciences Search Engine. Searching NCBI Databases Efficiently Knowing how to retrieve the exact information you need in an efficient](https://reader035.vdocuments.mx/reader035/viewer/2022070307/551b640c550346d31b8b59d8/html5/thumbnails/52.jpg)
Cn3D Text Window
• The Text window of Cn3D will align two or more proteins so you can compare the structure of multiple proteins
![Page 53: Using Entrez The Life Sciences Search Engine. Searching NCBI Databases Efficiently Knowing how to retrieve the exact information you need in an efficient](https://reader035.vdocuments.mx/reader035/viewer/2022070307/551b640c550346d31b8b59d8/html5/thumbnails/53.jpg)
BLink: Human Homologue
![Page 54: Using Entrez The Life Sciences Search Engine. Searching NCBI Databases Efficiently Knowing how to retrieve the exact information you need in an efficient](https://reader035.vdocuments.mx/reader035/viewer/2022070307/551b640c550346d31b8b59d8/html5/thumbnails/54.jpg)
Human RefSeqs: Genome Reagents
![Page 55: Using Entrez The Life Sciences Search Engine. Searching NCBI Databases Efficiently Knowing how to retrieve the exact information you need in an efficient](https://reader035.vdocuments.mx/reader035/viewer/2022070307/551b640c550346d31b8b59d8/html5/thumbnails/55.jpg)
MMDB: Molecular Modeling Database
• Derived from experimentally determined PDB records
• Value added to PDB records including:
– Addition of explicit chemical graph information
– Validation
– Inclusion of Taxonomy, Citation, and other information
– Conversion to ASN.1 data description language
• Structure neighbors determined by Vector Alignment Search Tool (VAST)
![Page 56: Using Entrez The Life Sciences Search Engine. Searching NCBI Databases Efficiently Knowing how to retrieve the exact information you need in an efficient](https://reader035.vdocuments.mx/reader035/viewer/2022070307/551b640c550346d31b8b59d8/html5/thumbnails/56.jpg)
Structure Summary
Cn3D viewer
Conserved Domains
3D Domain Neighbors
Structure Neighbors
![Page 57: Using Entrez The Life Sciences Search Engine. Searching NCBI Databases Efficiently Knowing how to retrieve the exact information you need in an efficient](https://reader035.vdocuments.mx/reader035/viewer/2022070307/551b640c550346d31b8b59d8/html5/thumbnails/57.jpg)
Cn3D 4.1
![Page 58: Using Entrez The Life Sciences Search Engine. Searching NCBI Databases Efficiently Knowing how to retrieve the exact information you need in an efficient](https://reader035.vdocuments.mx/reader035/viewer/2022070307/551b640c550346d31b8b59d8/html5/thumbnails/58.jpg)
Cn3D 4.1: Structural Alignment
Casein kinase S. pombe
Src Kinase H. sapiens
Conserved ATP binding site
![Page 59: Using Entrez The Life Sciences Search Engine. Searching NCBI Databases Efficiently Knowing how to retrieve the exact information you need in an efficient](https://reader035.vdocuments.mx/reader035/viewer/2022070307/551b640c550346d31b8b59d8/html5/thumbnails/59.jpg)
Cn3D: Simple Homology Modeling
human
swordtail
![Page 60: Using Entrez The Life Sciences Search Engine. Searching NCBI Databases Efficiently Knowing how to retrieve the exact information you need in an efficient](https://reader035.vdocuments.mx/reader035/viewer/2022070307/551b640c550346d31b8b59d8/html5/thumbnails/60.jpg)
Using Cn3D to model domains
![Page 61: Using Entrez The Life Sciences Search Engine. Searching NCBI Databases Efficiently Knowing how to retrieve the exact information you need in an efficient](https://reader035.vdocuments.mx/reader035/viewer/2022070307/551b640c550346d31b8b59d8/html5/thumbnails/61.jpg)
Other services and databases from the NCBI
• LocusLink links to all available information from NCBI and beyond for a few well-characterized model organisms.
• LocusLink is a great starting point: it collects key information on each gene/protein from major databases. It now covers 8 organisms.
• RefSeq provides a curated, optimal accession number for each DNA (NM_006744) or protein (NP_007635).
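RefSeq accession numbers carry their record type in the two-letter prefix, so a small function can tell them apart. The sketch below is illustrative only: it covers just the prefixes that appear in these slides (NM_ for mRNA, NP_ and XP_ for protein), the function name `classify_refseq` is hypothetical, and RefSeq defines further prefixes that are deliberately left out.

```python
import re

# Only the prefixes mentioned in this deck; RefSeq has others.
PREFIX_MEANING = {
    "NM": "mRNA",
    "NP": "protein (mRNA-based)",
    "XP": "protein (genome-based)",
}

# Two capital letters, an underscore, digits, and an optional ".version".
_ACCESSION = re.compile(r"^([A-Z]{2})_(\d+)(?:\.(\d+))?$")


def classify_refseq(accession):
    """Return the record type for a RefSeq accession, or None if unrecognized."""
    match = _ACCESSION.match(accession)
    if match is None:
        return None
    return PREFIX_MEANING.get(match.group(1))


for acc in ("NM_006744", "NP_007635", "U12345"):
    print(acc, "->", classify_refseq(acc))
```

Returning `None` for anything unrecognized keeps the function honest about its limited prefix table instead of guessing.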
![Page 62: Using Entrez The Life Sciences Search Engine. Searching NCBI Databases Efficiently Knowing how to retrieve the exact information you need in an efficient](https://reader035.vdocuments.mx/reader035/viewer/2022070307/551b640c550346d31b8b59d8/html5/thumbnails/62.jpg)
Locus Links
• Results of a LocusLink search include:
– Locus ID
– Species
– Locus symbol
– Locus name
– Locus location
– Links
• Protein Database
• OMIM
• Reference Sequence
• Related GenBank Sequences
• Homologene Data
• UniGene
• Variation Data
![Page 63: Using Entrez The Life Sciences Search Engine. Searching NCBI Databases Efficiently Knowing how to retrieve the exact information you need in an efficient](https://reader035.vdocuments.mx/reader035/viewer/2022070307/551b640c550346d31b8b59d8/html5/thumbnails/63.jpg)
LocusLink: Selected Higher Genomes
[Screenshot labels: OMIM, RefSeq, GenBank, dbSNP, UniGene, Full report, PubMed, HomoloGene, Map Viewer, Protein]
![Page 64: Using Entrez The Life Sciences Search Engine. Searching NCBI Databases Efficiently Knowing how to retrieve the exact information you need in an efficient](https://reader035.vdocuments.mx/reader035/viewer/2022070307/551b640c550346d31b8b59d8/html5/thumbnails/64.jpg)
Protein Database
• The Protein database contains sequence data from the translated coding regions from DNA sequences in GenBank, EMBL, and DDBJ as well as protein sequences submitted to:
– Protein Information Resource (PIR)
– SWISS-PROT
– Protein Research Foundation (PRF)
– Protein Data Bank (PDB) (sequences from solved structures)
![Page 65: Using Entrez The Life Sciences Search Engine. Searching NCBI Databases Efficiently Knowing how to retrieve the exact information you need in an efficient](https://reader035.vdocuments.mx/reader035/viewer/2022070307/551b640c550346d31b8b59d8/html5/thumbnails/65.jpg)
NCBI Protein Databases
• GenPept: GenBank, EMBL, DDBJ CDS translations
• RefSeq: mRNA-based (NP_) and genome-based (XP_)
• Swiss-Prot: curated, high-quality protein reviews
• PIR: Protein Information Resource, Georgetown University
• PRF: Protein Research Foundation
• PDB: Protein Data Bank, sequences from structures
![Page 66: Using Entrez The Life Sciences Search Engine. Searching NCBI Databases Efficiently Knowing how to retrieve the exact information you need in an efficient](https://reader035.vdocuments.mx/reader035/viewer/2022070307/551b640c550346d31b8b59d8/html5/thumbnails/66.jpg)
Entrez Protein
• GenPept (GB, EMBL, DDBJ) 3,442,298
• RefSeq 856,191
• Third Party Annotation 3,834
• Swiss Prot 144,508
• PIR 282,821
• PRF 12,079
Total 3,442,298
BLAST nr 1,642,191
![Page 67: Using Entrez The Life Sciences Search Engine. Searching NCBI Databases Efficiently Knowing how to retrieve the exact information you need in an efficient](https://reader035.vdocuments.mx/reader035/viewer/2022070307/551b640c550346d31b8b59d8/html5/thumbnails/67.jpg)
Protein Link
BLAST Link
Conserved Domains
![Page 68: Using Entrez The Life Sciences Search Engine. Searching NCBI Databases Efficiently Knowing how to retrieve the exact information you need in an efficient](https://reader035.vdocuments.mx/reader035/viewer/2022070307/551b640c550346d31b8b59d8/html5/thumbnails/68.jpg)
Related Proteins: Redundancy
Redundant Sequences
![Page 69: Using Entrez The Life Sciences Search Engine. Searching NCBI Databases Efficiently Knowing how to retrieve the exact information you need in an efficient](https://reader035.vdocuments.mx/reader035/viewer/2022070307/551b640c550346d31b8b59d8/html5/thumbnails/69.jpg)
Sequence from MutL structure
Related Proteins: Links
![Page 70: Using Entrez The Life Sciences Search Engine. Searching NCBI Databases Efficiently Knowing how to retrieve the exact information you need in an efficient](https://reader035.vdocuments.mx/reader035/viewer/2022070307/551b640c550346d31b8b59d8/html5/thumbnails/70.jpg)
BLink: non-redundant relatives
Arabidopsis homolog
Conserved Domain
![Page 71: Using Entrez The Life Sciences Search Engine. Searching NCBI Databases Efficiently Knowing how to retrieve the exact information you need in an efficient](https://reader035.vdocuments.mx/reader035/viewer/2022070307/551b640c550346d31b8b59d8/html5/thumbnails/71.jpg)
MLH1 Domain Structure: CDD
ATPase Domain
Mismatch Repair Domain
![Page 72: Using Entrez The Life Sciences Search Engine. Searching NCBI Databases Efficiently Knowing how to retrieve the exact information you need in an efficient](https://reader035.vdocuments.mx/reader035/viewer/2022070307/551b640c550346d31b8b59d8/html5/thumbnails/72.jpg)
MLH1: ATPase Domain
![Page 73: Using Entrez The Life Sciences Search Engine. Searching NCBI Databases Efficiently Knowing how to retrieve the exact information you need in an efficient](https://reader035.vdocuments.mx/reader035/viewer/2022070307/551b640c550346d31b8b59d8/html5/thumbnails/73.jpg)
1BGQ: ATPase Domain in Cn3D
Yeast HSP90
ATP binding site helix
![Page 74: Using Entrez The Life Sciences Search Engine. Searching NCBI Databases Efficiently Knowing how to retrieve the exact information you need in an efficient](https://reader035.vdocuments.mx/reader035/viewer/2022070307/551b640c550346d31b8b59d8/html5/thumbnails/74.jpg)
Variations Human MLH1
![Page 75: Using Entrez The Life Sciences Search Engine. Searching NCBI Databases Efficiently Knowing how to retrieve the exact information you need in an efficient](https://reader035.vdocuments.mx/reader035/viewer/2022070307/551b640c550346d31b8b59d8/html5/thumbnails/75.jpg)
BLink
Finding structural models
![Page 76: Using Entrez The Life Sciences Search Engine. Searching NCBI Databases Efficiently Knowing how to retrieve the exact information you need in an efficient](https://reader035.vdocuments.mx/reader035/viewer/2022070307/551b640c550346d31b8b59d8/html5/thumbnails/76.jpg)
Mapping Variation Onto Structure
Bacterial DNA mismatch repair proteins
Loads sequence alignment and structure in Cn3D
![Page 77: Using Entrez The Life Sciences Search Engine. Searching NCBI Databases Efficiently Knowing how to retrieve the exact information you need in an efficient](https://reader035.vdocuments.mx/reader035/viewer/2022070307/551b640c550346d31b8b59d8/html5/thumbnails/77.jpg)
Mapping Variation Onto Structure
Conserved Asn
Asn – Ile
Ile – Val
![Page 78: Using Entrez The Life Sciences Search Engine. Searching NCBI Databases Efficiently Knowing how to retrieve the exact information you need in an efficient](https://reader035.vdocuments.mx/reader035/viewer/2022070307/551b640c550346d31b8b59d8/html5/thumbnails/78.jpg)
NCBI Genome Databases
• The Genome database provides views for a variety of genomes, complete chromosomes, sequence maps with contigs, and integrated genetic and physical maps.
![Page 79: Using Entrez The Life Sciences Search Engine. Searching NCBI Databases Efficiently Knowing how to retrieve the exact information you need in an efficient](https://reader035.vdocuments.mx/reader035/viewer/2022070307/551b640c550346d31b8b59d8/html5/thumbnails/79.jpg)
Microbial Genomes
ZWF
![Page 80: Using Entrez The Life Sciences Search Engine. Searching NCBI Databases Efficiently Knowing how to retrieve the exact information you need in an efficient](https://reader035.vdocuments.mx/reader035/viewer/2022070307/551b640c550346d31b8b59d8/html5/thumbnails/80.jpg)
Genome search results
• The Genome database includes full (and some partial) genomes from viruses to complex organisms
![Page 81: Using Entrez The Life Sciences Search Engine. Searching NCBI Databases Efficiently Knowing how to retrieve the exact information you need in an efficient](https://reader035.vdocuments.mx/reader035/viewer/2022070307/551b640c550346d31b8b59d8/html5/thumbnails/81.jpg)
Genome Entry
• Genome entries include:
– Maps of the genome
– Links to the sequence
– The organism for the genome
![Page 82: Using Entrez The Life Sciences Search Engine. Searching NCBI Databases Efficiently Knowing how to retrieve the exact information you need in an efficient](https://reader035.vdocuments.mx/reader035/viewer/2022070307/551b640c550346d31b8b59d8/html5/thumbnails/82.jpg)
Genes Database: All Genomes
Coming soon!
![Page 83: Using Entrez The Life Sciences Search Engine. Searching NCBI Databases Efficiently Knowing how to retrieve the exact information you need in an efficient](https://reader035.vdocuments.mx/reader035/viewer/2022070307/551b640c550346d31b8b59d8/html5/thumbnails/83.jpg)
Genes Database: All Genomes
![Page 84: Using Entrez The Life Sciences Search Engine. Searching NCBI Databases Efficiently Knowing how to retrieve the exact information you need in an efficient](https://reader035.vdocuments.mx/reader035/viewer/2022070307/551b640c550346d31b8b59d8/html5/thumbnails/84.jpg)
Genes Database: All Genomes
![Page 85: Using Entrez The Life Sciences Search Engine. Searching NCBI Databases Efficiently Knowing how to retrieve the exact information you need in an efficient](https://reader035.vdocuments.mx/reader035/viewer/2022070307/551b640c550346d31b8b59d8/html5/thumbnails/85.jpg)
But wait! There’s more!
• There is even more at NCBI than I have covered here.
• The NCBI site map also serves as a guide to NCBI resources. Each link leads to a brief description of the resource on the site map page, then to the resource itself. http://www.ncbi.nlm.nih.gov/Sitemap/
![Page 86: Using Entrez The Life Sciences Search Engine. Searching NCBI Databases Efficiently Knowing how to retrieve the exact information you need in an efficient](https://reader035.vdocuments.mx/reader035/viewer/2022070307/551b640c550346d31b8b59d8/html5/thumbnails/86.jpg)
There are many bioinformatics servers outside NCBI.
• Try ExPASy’s sequence retrieval system at http://www.expasy.ch/
• (ExPASy = Expert Protein Analysis System)
• Or try ENSEMBL at www.ensembl.org for a premier human genome web browser.