BEL Framework Resources (namespaces, equivalences, documents)
August 2012
This work is licensed under the Creative Commons Attribution 3.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by/3.0/ or send a letter to Creative Commons, 444 Castro Street, Suite 900, Mountain View, California, 94041, USA.
Overview
• The BEL Framework accesses resource files during compilation
  – For checking and equivalencing namespace values
  – For augmenting the KAM to increase connectivity
• A set of resource files is maintained by OpenBEL
• Flexible: can be substituted or augmented with user-provided documents
Contents
• Resource locations
• Resources
  – Namespaces
  – Equivalences
  – Annotations
  – BEL Documents
• Creating and Using Custom Namespaces
Resource Locations
• Resources provided by the BEL Framework can be found here:
  – http://resource.belframework.org/belframework/1.0/
• Can also be downloaded from GitHub
  – https://github.com/OpenBEL/openbel-framework-resources
The BEL Framework Configuration Includes a Resource Index
• Provides locations for namespace, equivalence, and augmentation documents
• Can be used as provided or modified to work with custom namespaces, equivalences, etc.
• Default Resource Index:
  – http://resource.belframework.org/belframework/1.0/index.xml
BEL Namespaces
• OpenBEL supports 32 namespaces for:
  – genes/RNAs/proteins
  – protein families
  – named complexes
  – biological processes
  – chemicals
• Namespace documents (.belns) have a specific format
  – Include entity encodings to enforce BEL function semantics
• Users can provide custom namespaces
Supported Namespaces
• Genes, RNAs, microRNAs, proteins (6 namespaces)
  – Entrez Gene Ids (human, mouse, and rat only)
  – HUGO gene symbols
  – MGI gene symbols
  – RGD gene symbols
  – SwissProt accession numbers
  – SwissProt names
• Affymetrix Probe Sets (9 namespaces)
  – Human, mouse, and rat probe set identifiers
• Protein families (3 namespaces)
  – Selventa Protein Families (human, mouse, rat)
Supported Namespaces
• Chemicals (3 namespaces)
  – ChEBI names
  – ChEBI Ids
  – Selventa legacy chemicals
• Biological processes and pathologies (5 namespaces)
  – GO names
  – GO Ids
  – MeSH Phenomena and Processes [G]
  – MeSH Diseases [C]
  – Selventa legacy diseases
Supported Namespaces
• Named Complexes (5 namespaces)
  – Selventa Named Complexes (human, mouse, rat)
  – GO Cellular Components names
  – GO Cellular Components Ids
• Cellular locations (3 namespaces)
  – MeSH Cellular structures [A11.284]
  – GO Cellular Components names
  – GO Cellular Components Ids
BEL Namespace Documents
• Namespaces are .belns files
  – Text files with header information and values
• Values include encoding information
  – Which BEL functions are valid to apply to this entity
Namespace Entity Encoding

Encoding Value    Valid BEL Functions
B                 bp(), path()
O                 path()
R                 r(), m()
M                 m()
P                 p()
G                 g()
A                 a(), r(), m(), p(), g(), complex()
C                 complex()

• Example values - HGNC namespace
  – A2ML1-AS1 (A2ML1 antisense RNA 1), encoded as "GR", is a valid value for a gene or RNA abundance, but not protein abundance
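The encoding table can be read as a lookup from encoding letter to permitted BEL functions. The sketch below is not the BEL Framework's own code. It assumes that a multi-letter encoding such as "GR" permits the union of the functions allowed by each letter, which matches the A2ML1-AS1 example; the names ENCODING_FUNCTIONS, allowed_functions, and is_valid are hypothetical.

```python
# Encoding letters and the BEL functions they permit, copied from the encoding table.
ENCODING_FUNCTIONS = {
    "B": {"bp", "path"},
    "O": {"path"},
    "R": {"r", "m"},
    "M": {"m"},
    "P": {"p"},
    "G": {"g"},
    "A": {"a", "r", "m", "p", "g", "complex"},
    "C": {"complex"},
}


def allowed_functions(encoding):
    """Return the union of BEL functions permitted by each letter of an encoding.

    Assumption: letters combine by union (not confirmed by the slides).
    """
    allowed = set()
    for letter in encoding:
        allowed |= ENCODING_FUNCTIONS[letter]
    return allowed


def is_valid(function, encoding):
    """Check whether a BEL function may be applied to a value with this encoding."""
    return function in allowed_functions(encoding)


# A2ML1-AS1 in the HGNC namespace is encoded as "GR".
print(is_valid("g", "GR"))  # gene abundance
print(is_valid("r", "GR"))  # RNA abundance
print(is_valid("p", "GR"))  # protein abundance
```

Under this reading, p(HGNC:"A2ML1-AS1") would be rejected while g() and r() applied to the same value would be accepted.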
BEL Equivalence Files
• A BEL Equivalence File (.beleq) is associated with each BEL namespace
• Each namespace value in the equivalence file is associated with a universally unique identifier (UUID)
  – 32 hexadecimal digits
• Values with the same UUID are equivalenced
  – Terms containing the same functions are coalesced to a single node during compilation
• Values in a namespace file are not required to be included in the associated equivalence file
Example: Equivalences for MGI namespace
Examples of BEL Equivalencing
• The following three protein abundance terms are equivalent:
  – p(HGNC:AKT1)
    • The abundance of the protein designated by HUGO gene symbol ‘AKT1’
  – p(EGID:207)
    • The abundance of the protein designated by EntrezGene Id 207 (AKT1 Human)
  – p(SPAC:P31749)
    • The abundance of the protein designated by SwissProt Id P31749 (AKT1 Human)
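A minimal sketch of how shared UUIDs could coalesce the three AKT1 terms into one node. It is not the compiler's actual implementation: the equivalences dictionary and node_key function are hypothetical, and the UUID is a placeholder generated at run time rather than a value taken from a real .beleq file.

```python
import uuid

# Placeholder UUID generated at run time; real .beleq files carry fixed UUIDs.
akt1_uuid = uuid.uuid4().hex  # 32 hexadecimal digits

# Hypothetical in-memory view of three equivalence files: (namespace, value) -> UUID.
equivalences = {
    ("HGNC", "AKT1"): akt1_uuid,
    ("EGID", "207"): akt1_uuid,
    ("SPAC", "P31749"): akt1_uuid,
}


def node_key(function, namespace, value):
    """Return a key that is identical for equivalent terms.

    Values without an equivalence entry fall back to their own namespace and
    value, since equivalence files need not cover every namespace value.
    """
    identity = equivalences.get((namespace, value), (namespace, value))
    return (function, identity)


terms = [("p", "HGNC", "AKT1"), ("p", "EGID", "207"), ("p", "SPAC", "P31749")]
keys = {node_key(*term) for term in terms}
print(len(keys))  # the three protein abundance terms share one key
```

Because the function is part of the key, g(HGNC:AKT1) and p(HGNC:AKT1) stay separate nodes even though they use the same namespace value.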
Examples of BEL Equivalencing
• The following two biological process terms are equivalent:
  – bp(MESHPP:apoptosis)
    • The biological process designated by the MeSH Phenomena and Processes heading ‘apoptosis’
  – bp(GOID:0006915)
    • The biological process designated by the GO Id 0006915 (apoptotic process)
BEL Annotations
• BEL Annotations and BEL Terms are completely separate
• Annotations are associated with BEL Statements to express context information about the statement
  – Source of the knowledge
    • Citation, Evidence
  – Biological system
    • Cell line, Body part, Species
• 22 Annotation Types are provided with the BEL Framework
  – 2 reserved types: Citation and Evidence
  – 20 additional types defined by .belanno documents
• Additional Annotation Types can be defined by the user
  – Require a unique name within the BEL document and either a domain of allowable values (as a list or .belanno document) or a regular expression
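The two kinds of domain for a user-defined annotation type, a list of allowable values or a regular expression, can be sketched as two validators. This is an illustration only: the annotation names "Tissue" and "Dosage", their domains, and the choice of whole-string matching for the regular expression are assumptions, not behavior confirmed for the BEL Framework.

```python
import re


def make_list_validator(allowed_values):
    """Build a validator for an annotation whose domain is an explicit list."""
    allowed = set(allowed_values)
    return lambda value: value in allowed


def make_pattern_validator(pattern):
    """Build a validator for an annotation whose domain is a regular expression.

    Assumption: the whole value must match the pattern.
    """
    compiled = re.compile(pattern)
    return lambda value: compiled.fullmatch(value) is not None


# Hypothetical user-defined annotation types; names and domains are illustrative only.
annotation_types = {
    "Tissue": make_list_validator(["Lung", "Liver"]),
    "Dosage": make_pattern_validator(r"[0-9]+ mg"),
}

print(annotation_types["Tissue"]("Lung"))
print(annotation_types["Dosage"]("ten mg"))
```

Keying the validators by annotation name mirrors the requirement that each name be unique within a BEL document.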
Annotations Can Be Applied to Individual BEL Statements or Groups of Statements
Source: PMID 1234567

Causal relationships demonstrated in lung fibroblasts, reported in PMID 1234567
Cell Type: Fibroblast
Tissue: Lung
p(X) increases r(Y);
kin(p(X)) increases p(Z);

Causal relationship demonstrated in liver endothelial cells, reported in PMID 1234567
Cell Type: Endothelial Cell
Tissue: Liver
p(X) increases r(Y);

Each Statement is distinct: these Statements have different sets of contexts
Citation Annotation Format
• The Citation annotation is composed of a comma-separated list containing up to 6 fields.
  – SET Citation = {"PubMed","Cell","16962653","2006-10-07","Jacinto E|Facchinetti V|Liu D|Soto N|Wei S|Jung SY|Huang Q|Qin J|Su B",""}

Field  Required  Contents
1      Yes       Type of Citation. One of the following strings: “Book”, “PubMed”, “Journal”, “Online Reference”, or “Other”.
2      Yes       Name of the Citation. This is typically the journal reference or book name.
3      Yes       Reference. An identifier that can be used to link to the citation. For books this is usually the ISBN; for PubMed citations it is the PubMed ID; for other types it could be a URL pointing to the reference, such as a Wikipedia page.
4      No        Date of publication in ISO 8601 format (YYYY-MM-DD).
5      No        Authors. A “|”-delimited list of authors for the reference.
6      No        Comments. Optional information, such as an abstract, that can be stored along with the reference. Limit is 4000 characters.
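The field rules in the table can be checked mechanically. The following is a sketch under stated assumptions, not the BEL Framework's validator: check_citation and its messages are hypothetical, and it assumes the six fields have already been split out of the SET Citation line into a list of strings.

```python
from datetime import datetime

CITATION_TYPES = {"Book", "PubMed", "Journal", "Online Reference", "Other"}
MAX_COMMENT_LENGTH = 4000


def check_citation(fields):
    """Return a list of problems found in a Citation field list (empty if none)."""
    if not 3 <= len(fields) <= 6:
        return ["expected between 3 and 6 fields"]
    # Pad the optional fields so all six can be unpacked uniformly.
    padded = list(fields) + [""] * (6 - len(fields))
    ctype, name, reference, published, authors, comment = padded
    problems = []
    if ctype not in CITATION_TYPES:
        problems.append("unknown citation type: " + ctype)
    if not name:
        problems.append("citation name is required")
    if not reference:
        problems.append("reference is required")
    if published:
        try:
            datetime.strptime(published, "%Y-%m-%d")
        except ValueError:
            problems.append("date is not in YYYY-MM-DD format: " + published)
    if len(comment) > MAX_COMMENT_LENGTH:
        problems.append("comment exceeds 4000 characters")
    return problems


citation = ["PubMed", "Cell", "16962653", "2006-10-07",
            "Jacinto E|Facchinetti V|Liu D|Soto N|Wei S|Jung SY|Huang Q|Qin J|Su B", ""]
print(check_citation(citation))
authors = citation[4].split("|")  # field 5 is a "|"-delimited author list
print(len(authors))
```

Returning a list of problems rather than raising on the first one lets a caller report every issue in a citation at once.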
BEL Resource Documents
• BEL Resource Documents are used in compilation Phase III for network augmentation
  – BEL documents
  – Relevant assertions are identified and added to the network
• Include:
  – Gene Scaffolding
    • g(EG:123) transcribedTo r(EG:123)
    • r(EG:123) translatedTo p(EG:123)
  – Protein Family membership
    • p(PFH:"AKT Family") hasMembers list(p(HGNC:AKT1), p(HGNC:AKT2), p(HGNC:AKT3))
  – Named Complex components
    • complex(NCH:"9-1-1 Complex") hasComponents list(p(HGNC:HUS1),p(HGNC:RAD1),p(HGNC:RAD9A))
  – Orthology (2.0.0 and future)
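The scaffolding and membership statements follow a regular pattern, so they can be generated from identifiers. The sketch below only builds the statement strings shown in the list above. The function names are hypothetical, and this is not how the distributed resource documents are necessarily produced.

```python
def gene_scaffolding(namespace, value):
    """Return the two gene scaffolding statements for one namespace value."""
    entity = f"{namespace}:{value}"
    return [
        f"g({entity}) transcribedTo r({entity})",
        f"r({entity}) translatedTo p({entity})",
    ]


def family_membership(family, members):
    """Return a hasMembers statement linking a protein family to its members."""
    member_terms = ", ".join(f"p({member})" for member in members)
    return f"p({family}) hasMembers list({member_terms})"


for statement in gene_scaffolding("EG", "123"):
    print(statement)

print(family_membership('PFH:"AKT Family"', ["HGNC:AKT1", "HGNC:AKT2", "HGNC:AKT3"]))
```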
Creating Custom Namespaces
• Allows use of a vocabulary not specifically supported by the BEL Framework
  – Including equivalencing to other namespaces
• Detailed directions can be found here:
  – http://openbel-framework.readthedocs.org/en/latest/tutorials/building_custom_namespaces.html
• Requires:
  – Namespace file in .belns format
  – URL for the .belns file
  – Customized resource index
  – Updated BEL Framework configuration file pointing to new resource index
• Optional:
  – Equivalence file in .beleq format