Advanced SRS Course 12/12/02
-Linking-Subentries-Applications
Linking in SRS
Types of Links
• Hyperlinks
-links between entries which are displayed as hypertext
-useful for examining entries that are referenced directly from
entries
• Query links
-allow you to construct queries using the relationships between
databanks
-require SRS to search through entries or indices in other
databanks looking for matches
Links between Databases
SWISS-PROT
EMBL
PDB
InterPro
PROSITE
PFAM
BLOCKS
Advantages of SRS Linking
Links are bi-directional
• Direct link from ‘A’ to ‘B’
• Direct link from ‘B’ to ‘C’
• Multistep link from ‘A’ to ‘C’
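As a minimal sketch of the idea (not SRS's own implementation), bi-directional and multistep links can be pictured as paths in an undirected graph of databanks. The link table `DIRECT_LINKS` and the helper names below are hypothetical and only mirror the A, B, C example on the slide.

```python
from collections import deque

# Hypothetical direct links between databanks (illustration only, not the
# real SRS link table). Each pair is stored once and used in both directions.
DIRECT_LINKS = [("A", "B"), ("B", "C")]


def build_graph(links):
    """Build an undirected adjacency map, because links are bi-directional."""
    graph = {}
    for left, right in links:
        graph.setdefault(left, set()).add(right)
        graph.setdefault(right, set()).add(left)
    return graph


def link_path(graph, start, goal):
    """Return the shortest chain of databanks from start to goal, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for neighbour in sorted(graph.get(path[-1], ())):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(path + [neighbour])
    return None


graph = build_graph(DIRECT_LINKS)
print(link_path(graph, "A", "C"))  # multistep link that passes through B
print(link_path(graph, "C", "A"))  # the same link followed in reverse
```

A breadth-first search is used here only because it finds the chain with the fewest steps; how SRS itself chooses a route through the database network is not stated in these slides.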
Database Network Graph
The link page
• Two forms of the link page:
-type you see if you initiate linking from either the query
manager or the query result page
-type you see if you initiate linking from an individual
entry page
• The difference is at the top of these pages: one provides a “find all entries” option, the other does not
• (see next 2 slides)
From query manager or query result page
Find all entries options
• In the selected databanks which are linked to the current query
- this returns entries from other databanks which have links
with entries in the current query
• In the current query which are linked to all selected databanks
- this limits the query so that it includes only the entries (from the
original query) which are linked to all of the selected databanks
• In the current query which are not linked to any of the selected
databanks
- this limits the query so that it includes only the entries (from the
original query) which do not have links to the specified databanks
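The three “find all entries” options can be read as three different set operations. The sketch below assumes a hypothetical in-memory link table (`links`) and made-up entry IDs; it illustrates the logic only and does not use any SRS interface.

```python
# Hypothetical data: the current query, and for each selected databank a map
# from query entry to the entries it links to in that databank.
current_query = {"P1", "P2", "P3", "P4"}
links = {
    "EMBL": {"P1": {"E10"}, "P2": {"E11", "E12"}},
    "PDB": {"P1": {"D7"}, "P3": {"D8"}},
}


def in_selected_databanks(query, links):
    """Option 1: entries from the selected databanks linked to the query."""
    return {
        target
        for per_entry in links.values()
        for entry, targets in per_entry.items()
        if entry in query
        for target in targets
    }


def linked_to_all(query, links):
    """Option 2: query entries that have links to every selected databank."""
    return {e for e in query if all(e in per_entry for per_entry in links.values())}


def linked_to_none(query, links):
    """Option 3: query entries that have no link to any selected databank."""
    return {e for e in query if not any(e in per_entry for per_entry in links.values())}


print(sorted(in_selected_databanks(current_query, links)))
print(sorted(linked_to_all(current_query, links)))
print(sorted(linked_to_none(current_query, links)))
```

Note that option 1 changes the databank of the result, while options 2 and 3 only narrow the original query.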
From an individual entry page
No ‘find all entries’ options available
Linking from query manager page
• Can link from a single query or from multiple queries
• Two ways to link your queries from the query manager page:
- tick the checkbox that corresponds to a query set and click
the LINK button
- use the text box beside the Expression button
Expression linking
• Useful alternative to using the linking pages
• Can be used to search for a link between two or more sets
of results or between a set of results and a databank
Linking operators
• < entries in the set or databank to the left of the operator are
returned if they have a link to any entries in the set or databank
to the right of the operator
• > entries in the set or databank to the right of the operator are
returned if they have a link to any entries in the set or databank to
the left of the operator
Linking operations
Operator   Expression      Entries returned
<          Q1 < Q2         In Q1 that link to Q2
>          Q1 > Q2         In Q2 that link to Q1
combined with logical operators:
< &        Q1<Q2 & Q3      In Q1 that link to Q2 & Q3
< |        Q1<Q2 | Q3      In Q1 that link to Q2 or Q3
< !        Q1<Q2 ! Q3      In Q1 that link to Q2 but not Q3
Set A: A1 A2 A3 A4 A5 A6
Set B: B1 B2 B3 B4 B5
A > B is B2 B3 B4 (all entries in B that have links to A)
A < B is A1 A2 A5 A6 (all entries in A that have links to B)
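The behaviour of the two link operators can be sketched with plain Python lists. The individual links in `LINKS` are an assumption: the slide gives only the results, so the pairs below are one hypothetical set of links that is consistent with them. The function names are also invented for this illustration.

```python
A = ["A1", "A2", "A3", "A4", "A5", "A6"]
B = ["B1", "B2", "B3", "B4", "B5"]

# Hypothetical links between members of A and B, chosen only so that the
# results match the slide.
LINKS = {("A1", "B2"), ("A2", "B3"), ("A5", "B3"), ("A6", "B4")}


def linked(x, y):
    """Links are bi-directional, so check the pair in both orders."""
    return (x, y) in LINKS or (y, x) in LINKS


def left_link(left, right):
    """left < right: members of the left set that link to the right set."""
    return [x for x in left if any(linked(x, y) for y in right)]


def right_link(left, right):
    """left > right: members of the right set that link to the left set."""
    return [y for y in right if any(linked(x, y) for x in left)]


print("A > B:", right_link(A, B))
print("A < B:", left_link(A, B))
```

In both cases the operator points at the set whose entries are returned.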
Subentries
• Necessary when there is repeated structured information within an entry
FT DOMAIN 1 12 LUMENAL (POTENTIAL).
FT TRANSMEM 13 33 POTENTIAL.
FT DOMAIN 34 55 CYTOPLASMIC (POTENTIAL).
FT TRANSMEM 56 76 POTENTIAL.
FT DOMAIN 77 95 LUMENAL (POTENTIAL).

DR EMBL; L44581; AAA99933.1; -.
DR EMBL; L44582; AAA99934.1; -.
DR EMBL; L44583; AAA99935.1; -.
DR EMBL; L44584; AAA99936.1; -.
DR EMBL; L44585; AAA99937.1; -.
…..
Use subentries to:
• Search for entries containing one or more subentries with certain values and obtain a list of entry references
• Search for subentries with certain values and obtain a list of subentry references
Subentries have a double function:
• They are part of the entry and often require data from other fields in order for their meaning to be resolved and displayed
• Example: a SWISS-PROT feature requires part of the entry’s sequence to be displayed
• They can be regarded as databanks themselves and can be indexed and queried independently from the entries
• Example: search all the transmembrane segments with a given range of length
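To make the “subentries as a databank” idea concrete, here is a minimal sketch that turns feature (FT) lines into independent records and searches them by length. It reuses the FT lines shown earlier; the entry ID `ENTRY_1`, the function names, and the whitespace-based parsing are assumptions for illustration (real SWISS-PROT feature lines are column-formatted, and SRS uses its own parsers and indices).

```python
FT_LINES = """\
FT DOMAIN 1 12 LUMENAL (POTENTIAL).
FT TRANSMEM 13 33 POTENTIAL.
FT DOMAIN 34 55 CYTOPLASMIC (POTENTIAL).
FT TRANSMEM 56 76 POTENTIAL.
FT DOMAIN 77 95 LUMENAL (POTENTIAL).
"""


def feature_subentries(entry_id, text):
    """Split an entry's FT lines into one record per feature subentry."""
    subentries = []
    for line in text.splitlines():
        parts = line.split(None, 4)
        if len(parts) < 4 or parts[0] != "FT":
            continue
        start, end = int(parts[2]), int(parts[3])
        subentries.append({
            "parent": entry_id,           # keeps the tie to the parent entry
            "FtKey": parts[1],
            "start": start,
            "end": end,
            "length": end - start + 1,    # positions are inclusive
            "description": parts[4] if len(parts) > 4 else "",
        })
    return subentries


def search_features(subentries, ft_key, min_length, max_length):
    """Return the subentries with the given key and a length in the range."""
    return [
        s for s in subentries
        if s["FtKey"] == ft_key and min_length <= s["length"] <= max_length
    ]


features = feature_subentries("ENTRY_1", FT_LINES)
for hit in search_features(features, "TRANSMEM", 20, 25):
    print(hit["parent"], hit["FtKey"], hit["start"], hit["end"], hit["length"])
```

Each record keeps a reference to its parent entry, which reflects the double function described above: the subentry can be queried on its own but still belongs to an entry.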
Subentries available
• Protein databases have 5 subentries:
– Reference
– Comment
– Links
– Feature
– Counter
• Nucleotide databases have 3 subentries:
– Reference
– Features
– Counters
Controlled vocabularies
• Some of the fields belonging to the subentries have a predetermined number of keys (as specified by the database documentation). These fields have a controlled vocabulary, and when you use the extended query forms you can select a value from a drop-down menu. Examples are:
– CommentType
– DbName
– FtKey
– CountItem
The Counter subentry
• This is a special subentry created by SRS on the fly.
• It counts the number of times particular feature keys, comment types and links to a certain database occur within an entry
• It can be used to answer questions like:
– How many entries have 3 or more links to EMBL?
– How many entries have more than 8 disulphide bridges?
– How many entries have 2 or more comments about function?
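The kind of tally the Counter subentry provides can be sketched with Python's `collections.Counter`. The entry IDs and the way the example lines are divided between two entries are hypothetical; the FT and DR lines themselves are taken from the excerpt shown earlier. This is an illustration of the counting idea, not of how SRS builds the subentry.

```python
from collections import Counter

# Hypothetical entries assembled from the FT and DR lines shown earlier.
ENTRIES = {
    "ENTRY_1": (
        "FT DOMAIN 1 12 LUMENAL (POTENTIAL).\n"
        "FT TRANSMEM 13 33 POTENTIAL.\n"
        "DR EMBL; L44581; AAA99933.1; -.\n"
        "DR EMBL; L44582; AAA99934.1; -.\n"
        "DR EMBL; L44583; AAA99935.1; -.\n"
    ),
    "ENTRY_2": "DR EMBL; L44584; AAA99936.1; -.\n",
}


def count_items(text):
    """Count feature keys and linked database names within one entry."""
    counts = Counter()
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue
        code, item = parts[0], parts[1].rstrip(";")
        if code == "FT":
            counts[("FtKey", item)] += 1
        elif code == "DR":
            counts[("DbName", item)] += 1
    return counts


def entries_with_at_least(entries, kind, item, minimum):
    """Return IDs of entries in which the item occurs at least minimum times."""
    return [
        entry_id
        for entry_id, text in entries.items()
        if count_items(text)[(kind, item)] >= minimum
    ]


print(entries_with_at_least(ENTRIES, "DbName", "EMBL", 3))
print(entries_with_at_least(ENTRIES, "FtKey", "TRANSMEM", 1))
```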
Subentry fields
• In the standard query form each subentry field name is preceded by the name of the subentry to which it belongs:
• Reference:authors
• Feature:FtKey
• Links:DbName
• The extended query form is divided up into sections. The top section contains the fields belonging to the entry, and below this are the subentries and the fields that they contain
Links with sets containing subentries
• Two types:
– Simple Links
– Parent Links
• It is not possible to combine sets of entries with sets of subentries using the logical operators, but link operators may be used between sets of entries and sets of subentries
[swissprot-org:human] > [swissprot-ftkey:transmem]
gives a set of transmembrane segment subentries found in human proteins
[swissprot-org:human] < [swissprot-ftkey:transmem]
returns all human entries that have a transmembrane segment
Simple links
Parent Links
• Sometimes it is necessary to do an explicit conversion from subentries to entries. This can be done using the operand parent. This method looks for links from the subentries to their respective parent entries and retrieves a set containing parent entries.
[swissprot-ftkey:transmem] > parent
gives the parent entries for the set of subentries from SWISS-PROT that have transmembrane sequence features
• Logical operators can then be used to combine the set of parent entries with another set of entries
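A minimal sketch of why the parent conversion is needed: a set of subentries and a set of entries hold different kinds of items, so they only become comparable once the subentries are mapped to their parent entries. The entry IDs, the tuple layout, and the `parent` function below are hypothetical stand-ins for what SRS does internally.

```python
# Hypothetical subentry set: (parent entry ID, feature key, start, end).
transmem_subentries = {
    ("P_A", "TRANSMEM", 13, 33),
    ("P_A", "TRANSMEM", 56, 76),
    ("P_C", "TRANSMEM", 5, 25),
}

# Hypothetical entry set, e.g. the result of an organism query.
human_entries = {"P_A", "P_B"}


def parent(subentries):
    """Convert a set of subentries into the set of their parent entries."""
    return {subentry[0] for subentry in subentries}


parents = parent(transmem_subentries)

# Both operands are now sets of entries, so logical operators apply.
print(sorted(human_entries & parents))  # human entries with a transmem feature
print(sorted(human_entries - parents))  # human entries without one
print(sorted(human_entries | parents))  # entries in either set
```

Two subentries from the same entry collapse to a single parent, so the parent set can be smaller than the subentry set.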
Types of entries
Query Form
Using entry…..
Feature that is 10 aa in length
Using feature…..
Only returns transmembrane regions of exactly 10 aa
Applications
Applications in SRS
[Figure: workflow diagram. Labels: SWISSPROT; upload user-owned data; sequence; query; launch / run BLAST; BLAST results (text file); BLAST; indexing; linking; Pathway; Prosite]
Protein Applications in SRS
• Homology and similarity tools:
- BLASTP: database search tool
- FASTA: database search tool
- MPSrch
• Protein function analysis tools:
- PPSearch
- BLASTProdom
- ScanRegExp
- FingerPrintScan
- PfScan
- InterProScan
- MPSrch
• Sequence analysis tools:
- ClustalW
Nucleotide applications in SRS
• Homology and similarity tools:
- BLASTN
- NFASTA
- FASTX
- FASTY
• Sequence analysis tools:
- NClustalW
- RestrictionMap