Clinical Genomics Work Group (HL7)
Mukesh Sharma
Washington University in St. Louis
Agenda
• Clinical Genomics Work Group
• Family History Project
• Genetic Variation
• Cytogenetics LOINC codes
• Gene Expression DAM
• Genomic Specimen Model Project
• New Models for Future Ballot
• Useful Links
The HL7 Clinical Genomics (CG) Work Group
• Established as a SIG in 2003
• Mission: To enable the standard use of patient-related genetic data, such as DNA sequence variations and gene expression levels, for healthcare purposes (‘personalized medicine’) as well as for clinical trials & research
• Work Products and Contributions to HL7 Processes
The Work Group will collect, review, develop and document clinical genomics use cases in order to determine what data needs to be exchanged. The WG will review existing genomics standard formats such as BSML (Bioinformatics Sequence Markup Language), MAGE-ML (Microarray and Gene Expression Markup Language), LSID (Life Science Identifier) and others. This group will recommend enhancements to and/or extensions of HL7's normative standards for exchange of information about clinical genomic orders and observations.
In addition, Clinical Genomics will seek to assure that related or supportive standards produced by other HL7 groups are robust enough to accommodate their use in both research and clinical care. The group will also monitor information interchange standards developed outside HL7, and attempt to harmonize the information content and representation of such standards with the HL7 content and representation.
CG Work Group Leadership (Co-Chairs)
• Joyce Hernandez
Merck & Co. Inc.
• Kevin Hughes, MD
Partners HealthCare System, Inc.
• Amnon Shabo, PhD
IBM
• Mollie Ullman-Cullere
Dana-Farber Cancer Institute
Formal Relationships with Other HL7 Groups
The CG Work Group coordinates with a large number of other Work Groups in order to accomplish its mission. The strongest relationships are with:
• Orders and Observations
• Clinical Statement
• Clinical Decision Support
• Regulated Clinical Research Information Management
• Patient Care
• Electronic Health Records
• Modeling and Methodology
• Structured Documents
WG meetings/Balloting Cycles
• 3 times annually
• January, May, September
• 2010 meetings
• January 17–22, 2010 meeting at Pointe Hilton Squaw Peak, Phoenix, AZ
• May 17–20, 2010 meeting at Windsor Barra Hotel and Congressos, Rio de Janeiro, Brazil
• October 3–8, 2010 meeting at Cambridge, MA
Clinical Genomics Work Group Meeting
October 2010 Update
Open Floor
Discussed FDA regulations and the group's concern about reporting raw data
• FDA wants raw data to be part of the medical record, but the data is very expensive to store.
• Some members raised concerns that, e.g. for next-generation sequencing, they do not have the space to store raw data, quality scores, etc.
Overview of Activities
Three Tracks
v3:
• Family History (Pedigree) Topic
• Genetic Variations Topic
• Gene Expression Topic
• CMETs defined by the Domain
v2:
• v2 Implementation Guides*
* The IG “Genetic Test Result Reporting to EHR” is modeled after the HL7 Version 2.5.1 Implementation Guide: Orders And Observations; Interoperable Laboratory Result Reporting To EHR (US Realm), Release 1
CDA:
• A CDA Implementation Guide for Genetic Testing Reports
Common:
• Domain Analysis Models for the various topics
• A Domain Information Model (v3) describing the common semantics
• Semantic alignment among the various specs
Normative (V3); DSTU (CDA); Informative (V2)
HL7 Clinical Genomics: The v3 Track
[Diagram: Domain Information Model: Genome; Family History; Gene Expression; Genetic Variation; Phenotype (utilizing the HL7 Clinical Statement). The models are connected by arrows labeled "constrain" and "utilize".]
Family History
Background
• HL7 and ANSI approved pedigree model
• Numerous implementations within care setting
• Deployed by Surgeon General’s My Family Health Portrait and MS Health Vault
Status
• Several groups developing compliant family history tools have confirmed need for compliance testing framework; therefore…
• Canonical Pedigree project to develop tools to test compliance to Pedigree standard and interoperability
• Hosted Web Service, using Pedigree Standard, provides hereditary cancer risk assessments
Genetic Variation
Background
• Approved CMET: passed normative ballot, under reconciliation
• Published HL7 v2.5.1 Lab Reporting Implementation Guide (IG) for structured clinical genetic test results
Status
• Genetic Test Report Project using Clinical Document Architecture (CDA)
• Release 2 of the 2.5.1 IG, expanding to new clinical scenarios (e.g. tumor genetic profile) and genetic test definition
• Genetic test orders will be a collaborative modeling effort (e.g. Clinical Genomics, Orders & Observations, Laboratory)
• Starting analysis for scope expansion to whole genome sequencing
• Starting analysis for utility of data set in clinical/research data warehouse
Cytogenetics LOINC Codes 1
Background
• CG has a Genetic Variation Implementation Guide that covers genetic mutations located within a gene. Need to report larger genetic changes found in cytogenetic testing.
• Develop LOINC codes for representing cytogenetics test results
• Develop prototype V2 interface based on the LOINC panel structure
  • In Intermountain Healthcare’s DEV environment
  • Potentially real/live interface between ARUP Laboratories and Intermountain Healthcare
Status
• Officially submitted to LOINC for approval
  • Three panels (total 43 codes)
    • Chromosome Analysis G Banded Panel
    • Chromosome Analysis FISH Panel
    • Chromosome Analysis Microarray Copy Number Change Panel
  • Additional 11 codes
• Drafting HL7 V2 Implementation Guide for Cytogenetics
  • Sample messages, etc.
• Detailed data models and associated terminology are created in Intermountain Healthcare’s development environment
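To make the LOINC panel structure more concrete, the sketch below renders one cytogenetics result as an HL7 v2 OBX segment. It is a minimal illustration and not the Work Group's draft interface: the `Observation` class and `build_obx` helper are hypothetical names, the identifier `XXXXX-X` is a placeholder rather than a real LOINC code, and the karyotype value is made-up sample data.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    set_id: int
    value_type: str    # HL7 v2 data type of the value, e.g. "ST" (string)
    code: str          # placeholder identifier, NOT a real LOINC code
    display: str       # human-readable name of the observation
    value: str
    status: str = "F"  # "F" marks a final result


def build_obx(obs: Observation) -> str:
    """Render one observation as a pipe-delimited OBX segment."""
    identifier = f"{obs.code}^{obs.display}^LN"  # "LN" names the LOINC coding system
    fields = [
        "OBX",
        str(obs.set_id),     # OBX-1 set ID
        obs.value_type,      # OBX-2 value type
        identifier,          # OBX-3 observation identifier
        "",                  # OBX-4 sub-ID (unused here)
        obs.value,           # OBX-5 observation value
        "", "", "", "", "",  # OBX-6 to OBX-10 left empty in this sketch
        obs.status,          # OBX-11 observation result status
    ]
    return "|".join(fields)


# Hypothetical sample result for a G-banded chromosome analysis.
karyotype = Observation(1, "ST", "XXXXX-X", "Karyotype (placeholder)", "46,XX")
segment = build_obx(karyotype)
print(segment)
```

A real message would also need MSH, PID, OBR and related segments, plus escaping of delimiter characters inside values; both are omitted here.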
Cytogenetics LOINC Codes 2
Next Step
• HL7 standard development
• Target to ballot the v2 IG in January 2011 ballot cycle
• Develop the cytogenetics section of the CDA Genetic Test Report (GTR)
• Prototyping implementation, eventually real implementation
• Real practical challenges
Genomic Specimen Model Project
Background
• CG has started a Specimen Process Step Project
• Discussion with Orders and Observations (O&O) at the January 2010 meeting concluded that the requirements should be captured in the O&O Specimen Model
• O&O will enhance the Universal Specimen CMET. The scope will be updated and the project named Specimen CMET enhancement phase 2
• CG will drop the Specimen Process Step Project and place a change request on the O&O site to make sure that CG use cases are captured in the specimen model
Update
• Scope: Project will detail specimen collection, procedure(s) done on specimen and specimen storage that will affect the quality of the specimen.
• Requirements represented in specimen CMET
• Requirements not represented in specimen CMET
Requirements Represented In Specimen CMET
Specimen Handling and Processing
• Type of preservatives used and amount. Examples: additives used to preserve RNA/DNA
• Special handling such as flash freezing
Storage
• Type of storage used for collected specimen and any genetic extracted material.
Specimen Access
• Unique identifiers assigned to all materials (both collected and derived) to help manage access to specimens.
Specimen Type
• Whether fluid, tissue, cell or molecular specimen?
Specimen Quantity
• Quantity and/or size of specimen collected.
• In the Specimen model, Natural class is available to capture this information
Requirements Represented In Specimen CMET
Specimen Characteristics
• RNA/DNA characteristics: e.g. purity values (A260/A230 and A260/A280), RNA integrity number (RIN), etc.
• QC needs to be done by the specimen core lab
O&O: Captured in ObservationEvent. Need an implementation guide for details. May be separated out from ObservationEvent in future.
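The purity values listed above are simple ratios of absorbance readings. The following sketch computes them and applies a pass/fail check. The `AbsorbanceReading` class, the function names and the 1.8 cutoffs are illustrative assumptions (acceptance thresholds vary by laboratory and by nucleic acid type); none of them comes from the Specimen CMET.

```python
from dataclasses import dataclass


@dataclass
class AbsorbanceReading:
    a230: float  # absorbance at 230 nm
    a260: float  # absorbance at 260 nm
    a280: float  # absorbance at 280 nm


def purity_ratios(reading: AbsorbanceReading) -> dict:
    """Return the two purity ratios named in the requirements."""
    if reading.a230 <= 0 or reading.a280 <= 0:
        raise ValueError("absorbance at 230 nm and 280 nm must be positive")
    return {
        "A260/A280": reading.a260 / reading.a280,
        "A260/A230": reading.a260 / reading.a230,
    }


def passes_qc(reading: AbsorbanceReading,
              min_260_280: float = 1.8,   # assumed cutoff, lab-specific
              min_260_230: float = 1.8) -> bool:  # assumed cutoff, lab-specific
    """Check both ratios against the (assumed) minimum values."""
    ratios = purity_ratios(reading)
    return (ratios["A260/A280"] >= min_260_280
            and ratios["A260/A230"] >= min_260_230)


# Made-up sample readings.
good = AbsorbanceReading(a230=0.25, a260=0.5, a280=0.25)
poor = AbsorbanceReading(a230=0.25, a260=0.5, a280=0.5)
print(purity_ratios(good), passes_qc(good))
print(purity_ratios(poor), passes_qc(poor))
```

In a message, each ratio would travel as its own observation attached to the specimen, which is consistent with the O&O note that these values are captured in ObservationEvent.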
Requirements Not Represented In Specimen CMET 1
Genetic Consent Form
• Linking up with the Genetic Consent Form.
• Form signed by the patient to allow genetic/genomic testing and in some cases to permit long-term storage of genetic samples for further research.
• Need to know that there is consent; duration that it allows the specimen to be used for (indefinite or restricted to particular duration or protocols).
• Consent could be withdrawn and as a result the specimen is pulled out and destroyed.
O&O:
• Bullets 1 and 2: Present in the current model as part of the clinical statement.
• Bullet 3: Needs to be handled in Medical Records, as Medical Records owns consent.
• Consent cannot currently be tied to a specific specimen.
• In future, could be captured in the SpecimenProcessStep messaging, including a provision to destroy the specimen. The CMET itself does not deal with this activity.
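The consent requirements above (existence, duration, withdrawal followed by destruction) can be sketched as a small data structure. This is a hypothetical illustration only: the slides state that consent cannot currently be tied to a specific specimen, so the `GeneticConsent` and `Specimen` classes and the `enforce_withdrawal` helper are assumed names showing what such a link might look like, not part of any HL7 model.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class GeneticConsent:
    signed_on: date
    expires_on: Optional[date] = None    # None models an indefinite consent
    withdrawn_on: Optional[date] = None  # set when the patient withdraws

    def permits_use(self, on: date) -> bool:
        """True if the specimen may be used on the given day."""
        if on < self.signed_on:
            return False
        if self.withdrawn_on is not None and on >= self.withdrawn_on:
            return False
        return self.expires_on is None or on <= self.expires_on


@dataclass
class Specimen:
    specimen_id: str
    consent: GeneticConsent
    destroyed: bool = False


def enforce_withdrawal(specimen: Specimen, today: date) -> bool:
    """Mark the specimen as destroyed once consent has been withdrawn."""
    withdrawn = specimen.consent.withdrawn_on
    if withdrawn is not None and today >= withdrawn:
        specimen.destroyed = True
    return specimen.destroyed


# Made-up example: a time-limited consent and a withdrawn consent.
limited = GeneticConsent(signed_on=date(2010, 1, 1), expires_on=date(2012, 1, 1))
withdrawn = Specimen("SP-1", GeneticConsent(date(2010, 1, 1),
                                            withdrawn_on=date(2010, 6, 1)))
print(limited.permits_use(date(2011, 6, 1)))
print(enforce_withdrawal(withdrawn, date(2010, 7, 1)))
```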
Requirements Not Represented In Specimen CMET 2
Specimen Management
• Specimen Collection
• Two use cases:
  • i) Patient comes in and we take 2 or more specimens
  • ii) Patient comes in and we take 1 specimen
• We need to capture the relationship between multiple specimens collected at one time (use case i)
• The universal CMET only has one entry point (SpecimenChoice) i.e. all CMETs are starting from Specimen
• Suggested Action
• In the SpecimenChoice Box add the SpecimenCollectionGroup class (especially for use case i)
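The suggested action can be pictured as a grouping object that holds every specimen taken at one visit. The sketch below is an assumption-laden illustration: only the name `SpecimenCollectionGroup` comes from the slide, while its attributes, the `Specimen` class and the `siblings_of` helper are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Specimen:
    specimen_id: str
    specimen_type: str  # e.g. fluid, tissue, cell or molecular


@dataclass
class SpecimenCollectionGroup:
    """Groups the specimens collected from one patient at one time."""
    group_id: str
    patient_id: str
    specimens: List[Specimen] = field(default_factory=list)

    def add(self, specimen: Specimen) -> None:
        self.specimens.append(specimen)

    def siblings_of(self, specimen_id: str) -> List[str]:
        """IDs of the other specimens collected in the same group."""
        return [s.specimen_id for s in self.specimens
                if s.specimen_id != specimen_id]


# Use case i: two specimens taken at the same visit (made-up identifiers).
visit = SpecimenCollectionGroup("G-1", "P-42")
visit.add(Specimen("SP-1", "tissue"))
visit.add(Specimen("SP-2", "fluid"))
print(visit.siblings_of("SP-1"))

# Use case ii: a single specimen is simply a group of size one.
single = SpecimenCollectionGroup("G-2", "P-43", [Specimen("SP-3", "fluid")])
print(single.siblings_of("SP-3"))
```

Treating use case ii as a group of one keeps a single entry point for both cases, which is one possible reading of adding the class inside the SpecimenChoice box.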
Requirements Not Represented In Specimen CMET 3
[Figure: Current Model (left) and Proposed Change (right)]
Gene Expression DAM Update
• Currently reviewing results of the last ballot (informative ballot in May 2010)
• Next steps:
  • Finish NCI Generic Assay (IRWG/ICR)
  • Changes to GE DAM
    • Add “generic” classes from Generic Assay
    • Bring over additional BRIDG classes
    • Apply suggested changes from the ballot (use case, BRIDG compatibility)
Clinical Genomics DAM (50,000 foot level view)
class Complete Diagram
A-Phenotype
AminoAcid
+ name: String
ArrayDataType
+ name: String
+ version: String
Legend
HL7 CG Elements
Joyce Additions
NCI Model Elements
BRIDG 2.1
Modified for CG
MIAME-MAGE
MAGE-TAB
Expression
- classCode: String
- id: Integer
- code: String
- negationInd: Boolean
- text: String
- effectiveTime: String
- uncertaintycode: String
- value: String
- interpretationCode: String
- methodCode: String
This is the Domain Information Model for the HL7 Clinical Genomics Work Group. It consists of the following topics:
1. Gene Expression
2. Genetic Variation
3. Genotype
4. Sequence
5. Proteomics
6. Links to Clinical Phenotypes
Entry point for the Gene Expression CMET POCG_RM000031UV
AssociatedProperty
- classCode: String
- id: Integer
- code: String
- text: String
- effectiveTime: String
- value: String
- methodCode: String
Gene
+ symbol: String
+ fullName: StringColumn
+ genbankAccession: String
+ genbankAccessionVersion: String
+ ensemblgeneID: String
+ unigeneclusterID: String
+ entrezgeneID: String
Chromosome
+ chormosomeNumber: Integer
DNA
- name
Need definitions.
Use this class for inherent data about the locus, e.g. chromosome no.
RNA
- name
Nucleotides
+ nucleotideName: String
Intron
+ length: Integer
+ intronClass: String
Exon
+ length: Integer
+ intronClass: String
Nucleobases
+ shortName: String
Phosphate
+ name: String
Ester
+ name: String
Sugar
+ name: String
Codon
+ codonId: Integer
Usha: Relationship should be from Gene to DNA. Portions of DNA correspond to a Gene. Chromosomes would have a bunch of genes.
GeneticLocus
- id: Integer
- text: String
- methodCode: String
- chromosomePosition: Integer
- cellType: String
This class is a placeholder for specifying a locus on the genome, i.e., a position of a particular given sequence in the subject’s genome. Note that the semantics of the locus (e.g., gene) is defined by data assigned in the code & value attributes of this class, and also by placing additional data relating to this locus into the classes (and CMETs) associated with this class.
Genome
- classCode: String
- id: Integer
- code: String
- text: String
- effectiveTime: String
- confidentialityCode: String
- value: String
- interpretationCode: String
- methodCode: String
GeneticLoci
- classCode: String
- id: Integer
- code: String
- negationInd: Boolean
- titile: String
- text: String
- statusCode: String
- effectiveTime: String
- confidentialityCode: String
- reasonCode: String
- value: String
- interpretationCode: String
- methodCode: String
LargeDuplicaiton
- classCode: String
- id: Integer
- code: String
- negationInd: Boolean
- text: String
- effectiveTime: String
- uncertaintycode: String
- confidentialitycode: String
- value: String
- interpretationCode: String
- methodCode: String
GeneticDocument
- classCode: String
- id: Integer
- code: String
- title: String
- text: String
- statusCode: String
- effectiveTime: String
- confidentialityCode: String
- languageCode: String
- setId: Integer
- versionNumber: Integer
LargeDeletion
- classCode: String
- id: Integer
- code: String
- text: String
- effectiveTime: String
- confidentialitycode: String
- uncertaintycode: String
- value: String
- interpretationCode: String
- methodCode: String
Sequence
- classCode: String
- id: Integer
- code: String
- negationInd: Boolean
- titile: String
- text: String
- statusCode: String
- effectiveTime: String
- reasonCode: String
- value: String
- interpretationCode: String
- methodCode: String
Need definitions.
Use the value attribute to encapsulate raw data relating to the entire set of loci. For example, SNP genotyping of a large number of genes/markers.
Cytogenetics
- classCode: String
- id: Integer
- code: String
- negationInd: Boolean
- text: String
- effectiveTime: String
- confidentialitycode: String
- uncertaintycode: String
- value: String
- interpretationCode: String
- methodCode: String
???OtherNonLocusData
- classCode: String
- id: Integer
- code: String
- negationInd: Boolean
- text: String
- effectiveTime: String
- confidentialitycode: String
- uncertaintycode: String
- value: String
- interpretationCode: String
- methodCode: String
Need definitions.
GenotypeFinding
- normalizedXIntensity: float
- normalizedYIntensity: float
- rawXIntensity: float
- rawYIntensity: float
- call: String
IndividualAllele
- classCode: String
- id: Integer
- code: String
- negationInd: Boolean
- titile: String
- text: String
- statusCode: String
- effectiveTime: String
- reasonCode: String
- value: String
- interpretationCode: String
- methodCode: String
SequenceVariation
- classCode: String
- id: Integer
- code: String
- negationInd: Boolean
- titile: String
- text: String
- statusCode: String
- effectiveTime: String
- reasonCode: String
- value: String
- interpretationCode: String
- methodCode: String
DeterminantPepetide
- classCode: String
- id: Integer
- text: String
- effectiveTime: String
- value: String
- methodCode: String
AssociatedObservation
- id: Integer
- name: String
- copyNumber: Integer
- zygosity: String
- dominancy: String
- geneFamily: String
Polypeptide
- classCode: String
- id: Integer
- text: String
- effectiveTime: String
- value: String
- methodCode: String
Need definitions.
Should we leave this out and just add classes as needed?
Entry path to the broadest path of the genetic variation model.
ViralGenetics
- classCode: String
- id: Integer
- code: String
- negationInd: Boolean
- text: String
- effectiveTime: String
- confidentialitycode: String
- uncertaintycode: String
- value: String
- interpretationCode: String
- methodCode: String
Added as a placeholder (future expansion).
Need definitions.
SNPAssay
- designAlleles: String
- designScore: Float
- designSequence: String
- designStrand: String
- id: Long
- status: String
- vendorAssayId: String
- version: String
SNPPanel
- assayCount: Integer
- description: String
- id: Long
- name: String
- technology: String
- vendor: String
- vendorPanelId: String
- version: String
SNP Design classes
Material
+ id: Integer
+ description: String
+ name: String
+ formcode: String
ExtractedNon-GeneticSample
- extractedSsampleId: Integer
- extractedAmount: Integer
- extractedAmountUOM: String
- extractionMethod: String
::Material
+ id: Integer
+ description: String
+ name: String
+ formcode: String
ExtractedGeneticSample
+ geneticSampleId: Integer
+ geneticSampleType: String
+ extractedAmount: Integer
+ extractionMethod: String
+ GeneticSampleType: int
+ hybridization: int
+ authorizationLink: url
::Material
+ id: Integer
+ description: String
+ name: String
+ formcode: String
OriginalBioSpecimen
+ amount: int
+ unitofMeasure: String
+ statusCode: String
+ statusDateRange: String
::Material
+ id: Integer
+ description: String
+ name: String
+ formcode: String
::Node
+ experimentId: Integer
::DimensionElement
+ node: String
::Identifiable
+ id: URI
+ name: String
+ properties: String
+ description: String
HandlingDocument
- Id: int
- text: String
SpecimenCharacteristics
+ Id: int
+ color: String
+ clarity: String
+ condition: String
Collection
- collectionMethod: int
- id: int
::HandlingDocument
- Id: int
- text: String
Storage
+ id: Integer
+ flashFrozenMethod: String
+ temp: Integer
+ storageMethod: String
::HandlingDocument
- Id: int
- text: String
Transportation
- id: Integer
::HandlingDocument
- Id: int
- text: String
Assume this is a generic list of all material. Specific material used and tracked within the conduct of a study and/or clinical care would be uniquely identified via other classes (i.e. extracted or resectioned samples). The identifier is used only for the original biological specimen.
ArrayGroup
- arraySpacingX: float
- arraySpacingY: float
- barcode: String
- length: float
- numArrays: Integer
- orientationMark: enum(top,bottom,left,right)
- width: float
Array
+ arrayIdentifier: String
+ arrayXOrigin: Integer
+ arrayYOrigin: Integer
+ originRelativeTo: String
ArrayDesign
+ id: Integer
+ version: String
+ comment: String
+ substrateType: String
+ surfaceType: String
+ sequecnePolymerType: String
+ contactId: Integer
::Identifiable
+ id: URI
+ name: String
+ properties: String
+ description: String
ArrayManufacture
- manufacurungDate: String
- tolerance: Integer
Gene expression Design classes
Do we need separate classes for Array Design (GE versus Genetic variation)? The attributes I have added are from the new MAGE-TAB model. Do we still need number of features (this came from the old version of the model)?
LabeledExtract
- flourescentLabelingSubstance: String
- flourescentLabelingSubstanceAmount: float
- flourescentLabelingSubstanceUnits: float
Hybridization
- name: String
- amountOfMaterial: float
ArrayManufactureDeviation
This area of the MAGE model seems to be placeholders. There are relationships to both FeatureDefect and ZoneDefect, both of which do not have attributes.
FeatureDefect: Stores the defect information for a feature. This class points to PositionDelta, which has coordinate information (delta X,Y). PositionDelta points to DistantUnit, which contains additional measurement data. FeatureDefect points to an OntologyEntry, which contains controlled vocabulary. The link constrains the vocabulary entries to represent only "defectType".
ZoneDefect: Stores the defect information for a zone. This class points to the Zone class, which does have lower-Right X,Y and upper-Left X,Y coordinates, plus a row identifier.
Channel
- channel_no: Integer
BioAssayTreatment
+ bioAssayProcess: String
::Node
+ experimentId: Integer
::DimensionElement
+ node: String
::Identifiable
+ id: URI
+ name: String
+ properties: String
+ description: String
ImageAcquistion
+ imageAcquistionMethod
Image
+ name: String
+ url: String
::DataFile
+ id: String
+ name: String
+ dataFileType: String
+ dataFormat: String
::Data
+ uri: String
+ datatype: String
Should we embed the image as blob, rather than point to it? Or provide both options? ANS: Too huge to store in the database. It is rare to go back to them. But some folks want to keep the images. They could be kept in secondary.
DerivedBioAssayData
Need Array Manufacturing control data. Not chip but chip by overall.
NOTE: Image is scanned at different wavelengths.
FactorValue
- value: String
::Measurement
+ value: String
+ minvalue: String
+ maxvalue: String
+ unit: String
::Identifiable
+ id: URI
+ name: String
+ properties: String
+ description: String
Lab_Experiment
+ id: Integer
+ title: String
+ description: String
+ date: date
+ assayType: String
+ experimentalDesigns: String
+ formatVersion: String
+ publicIdentifier: String
+ sdrfFile: String
::Identifiable
+ id: URI
+ name: String
+ properties: String
+ description: String
Factor
+ type: String
::Identifiable
+ id: URI
+ name: String
+ properties: String
+ description: String
ExperimentalDesigns
+ type: String
+ description: String
QualityControl
+ type: String
+ qualityControlDescription: String
Replicatetypes
+ replicateType: String
+ replicateDescription: String
NormalizationTypes
+ normalizationType
+ normalizationDescription: String
These classes need to be harmonized to the Study Design portion of the BRIDG model.
GenomicProtocol
+ id: Integer
+ name: String
+ type: String
+ description: String
+ hardware: String
+ software: String
+ contact: String
+ url: String
+ publicProtocolUrl: String
Assume there can be multiple "experiments" for complex studies. ???
Also the new version calls this an Investigation. When talking about genomics testing a lot of SMEs use the term "Experiment". Investigation can also be connected more easily to the term study, which already has a broader scope since it represents the "clinical trial" used in the research context.
Another factor is that the MGED ontology makes references to "ExperimentalProtocol" in a number of places, so it might be better to keep a known term.
Which terms does the team prefer? Is there a term that could fit both research and healthcare use?
This class will need specific harmonization to the Study class in BRIDG. Hardware/software requirements for the arrays need to be added. These should probably be normalized into separate classes.
For CG DAM model reviewers: we need more examples. Is there other software required other than the Reporter?
ProtocolApplications
+ edgeId: Integer
+ order: Integer
+ notes: String
::Identifiable
+ id: URI
+ name: String
+ properties: String
+ description: String
Is this the proper way to identify an individual channel?
Germ line/Somatic needs to be validated by a lab test. Should it just be represented as part of a test and taken out of the bio-specimen?
ImageFile
+ name: String
+ status: String
+ type: String
Raw ArrayData
DataFile
+ id: String
+ name: String
+ dataFileType: String
+ dataFormat: String
::Data
+ uri: String
+ datatype: String
Need definition on what type of data is carried here and which function in the process populates it.
Validate that "ordered" means sequenced and does not represent "ordered" from lab.
Feature
+ featureId: Integer
+ blockCol: Integer
+ blockRow: Integer
+ col: Integer
+ row: Integer
+ reporterid: Integer
::DesignElement
+ id: Integer
+ compositid: Integer
+ arrayDesignId: Integer
+ featureID: Integer
::DimensionElement
+ node: String
::Identifiable
+ id: URI
+ name: String
+ properties: String
+ description: String
Reporter
+ id: Integer
+ controlType: String
+ sequence: String
+ databaseEntry: String
::DesignElement
+ id: Integer
+ compositid: Integer
+ arrayDesignId: Integer
+ featureID: Integer
::DimensionElement
+ node: String
::Identifiable
+ id: URI
+ name: String
+ properties: String
+ description: String
ReporterGroup
+ reporterGroupId: Integer
+ name: String
::DesignElement
+ id: Integer
+ compositid: Integer
+ arrayDesignId: Integer
+ featureID: Integer
::DimensionElement
+ node: String
::Identifiable
+ id: URI
+ name: String
+ properties: String
+ description: String
Need to differentiate between frozen and fixed. For breast cancer. Containers need to be added. Example: Non-frozen and frozen tissue samples need to be included. Unfixed tissue sections (slide type and slide mount. In healthcare)
Add class to handle the change of state of the material.
Typically called protocol of treatment.
In MAGE this is the actual Image and everything that was done to get it.
Can do another treatment and get another image. Actual steps are not kept for all images. Usually only recorded for the last image.
JH: MAGE-TAB model conflicts with these statements. It has an Assay Class as part of the sdrf package and an Image class as part of the data package. I will rename this Bio-Assay class to just Assay.
NOTE: Mollie: Wants to constrain model for clinical environment at a later point.
Hardware
+ description: String
+ version: String
::Identifiable
+ id: URI
+ name: String
+ properties: String
+ description: String
Software
+ description: String
+ version: String
::Identifiable
+ id: URI
+ name: String
+ properties: String
+ description: String
Edge
+ id: Integer
+ experimentIdentifier: String
+ input: String
+ output: String
::Identifiable
+ id: URI
+ name: String
+ properties: String
+ description: String
Need examples for EDGE data, primarily for input and output. Couldn't find any at the magetab and tabemage sites.
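As one possible reading of the Edge class, each edge links an input node to an output node, so a set of edges describes the path from source material to processed data. The sketch below is a hypothetical illustration, not MAGE-TAB sample data: the node names are invented, the attribute names are snake_case versions of the Edge attributes listed above, and `processing_order` is an assumed helper that applies a standard topological sort (Kahn's algorithm).

```python
from collections import defaultdict, deque
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Edge:
    id: int
    experiment_identifier: str
    input: str   # name of the upstream node
    output: str  # name of the downstream node


def processing_order(edges: List[Edge]) -> List[str]:
    """Order nodes so that every input precedes its outputs (Kahn's algorithm)."""
    successors = defaultdict(list)
    indegree = defaultdict(int)
    for edge in edges:
        successors[edge.input].append(edge.output)
        indegree[edge.output] += 1
        indegree.setdefault(edge.input, 0)
    ready = deque(sorted(n for n, d in indegree.items() if d == 0))
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for nxt in successors[node]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)
    if len(order) != len(indegree):
        raise ValueError("edges contain a cycle")
    return order


# Invented example: one sample followed from source to normalized data.
edges = [
    Edge(1, "EXP-1", "source-1", "extract-1"),
    Edge(2, "EXP-1", "extract-1", "labeled-extract-1"),
    Edge(3, "EXP-1", "labeled-extract-1", "hybridization-1"),
    Edge(4, "EXP-1", "hybridization-1", "scan-1"),
    Edge(5, "EXP-1", "scan-1", "normalization-1"),
]
order = processing_order(edges)
print(order)
```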
Node
+ experimentId: Integer
::DimensionElement
+ node: String
::Identifiable
+ id: URI
+ name: String
+ properties: String
+ description: String
DimensionElement
+ node: String
::Identifiable
+ id: URI
+ name: String
+ properties: String
+ description: String
Performer
+ id: Integer
+ personID: Integer
+ protocolID: Integer
Person
+ firstname: String
+ lastname: String
+ midinitials: String
+ affiliation: String
::Contact
+ Address: String
+ phone: String
+ email: String
+ fax: String
+ tollFreePhone: String
::Identifiable
+ id: URI
+ name: String
+ properties: String
+ description: String
PersonRole
+ role: String
::Person
+ firstname: String
+ lastname: String
+ midinitials: String
+ affiliation: String
::Contact
+ Address: String
+ phone: String
+ email: String
+ fax: String
+ tollFreePhone: String
::Identifiable
+ id: URI
+ name: String
+ properties: String
+ description: String
Publication
+ id: Integer
+ pubMedID: String
+ title: String
+ publicationDOI: String
+ authorlist: String
+ status: String
Contact
+ Address: String
+ phone: String
+ email: String
+ fax: String
+ tollFreePhone: String
::Identifiable
+ id: URI
+ name: String
+ properties: String
+ description: String
Assay
+ arrayIdentifier: String
+ arrayDataFiles: String
+ derivedArrayDataFiles: String
+ arrayDataMatricFiles: String
+ derivedArrayDataMatrixFiles: String
TechnologyType
+ technologyType: String
CompositElement
+ id: Integer
+ reporterID: Integer
+ databaseEntry: String
::DesignElement
+ id: Integer
+ compositid: Integer
+ arrayDesignId: Integer
+ featureID: Integer
::DimensionElement
+ node: String
::Identifiable
+ id: URI
+ name: String
+ properties: String
+ description: String
DesignElement
+ id: Integer
+ compositid: Integer
+ arrayDesignId: Integer
+ featureID: Integer
::DimensionElement
+ node: String
::Identifiable
+ id: URI
+ name: String
+ properties: String
+ description: String
Question on EBI example: http://www.ebi.ac.uk/miamexpress/help/array_designs.html#ADF
Should the CompositeSequenceComment be represented as a database entry (Ontology Term / Value pair) or as a variable?
Data
+ uri: String
+ datatype: String
DataElement
+ id: Integer
+ datamatrixId: Integer
+ col: Integer
+ row: Integer
+ rowQuantitationType: String
+ index_: Integer
+ secondayKey: String
DataMatrix
Need more information on how this is implemented. Description seems to indicate calculation. See MGED section below:
class QuantitationType
definition: The QuantitationType provides a method for calculating a single datum of the BioAssayData matrix.
superclasses: QuantitationTypePackage
properties: unique_identifier MO_67; class_role abstract; class_source mage
constraints: restriction: has_scale has-class Scale; restriction: has_type has-class DataType
Name: Complete Diagram
Author: hernajoy
Version: 1.0
Created: 2006-01-11 12:00:00 AM
Updated: 2010-06-17 7:10:15 PM
DATA MATRIX EXAMPLE from: http://tab2mage.sourceforge.net/docs/magetab_docs.html#datamatrix
Bio-Specimen-Characteristics
+ term: String
::Identifiable
+ id: URI
+ name: String
+ properties: String
+ description: String
This is equivalent to the Material class in the MAGE-TAB model.
Material in this model applies to BRIDG and HL7 expanded scope, which goes beyond biologic material.
Assume this class needs to represent the many-to-many associations between the following MGED concepts. These associations attempt to group mathematical functions into nodes.
1. Nodes
2. Node Values
3. Node Value Types
4. BioAssays
5. BioAssayDataCluster
Normalization
+ derviedArrayDataFile: String
::Node
+ experimentId: Integer
::DimensionElement
+ node: String
::Identifiable
+ id: URI
+ name: String
+ properties: String
+ description: String
Scan
+ arrayDataFiles: String
+ derivedArrayDataFiles: String
+ arrayDataMatricFiles: String
+ derivedArrayDataMatrixFiles: String
::Node
+ experimentId: Integer
::DimensionElement
+ node: String
::Identifiable
+ id: URI
+ name: String
+ properties: String
+ description: String
Measurement
+ value: String
+ minvalue: String
+ maxvalue: String
+ unit: String
::Identifiable
+ id: URI
+ name: String
+ properties: String
+ description: String
Need sample data for this class.
ProtocolParameter
::Identifiable
+ id: URI
+ name: String
+ properties: String
+ description: String
ParameterValue
+ protocolParameterId: Integer
+ protocolApplication: Integer
::Measurement
+ value: String
+ minvalue: String
+ maxvalue: String
+ unit: String
::Identifiable
+ id: URI
+ name: String
+ properties: String
+ description: String
Source
+ contactid: Integer
::OriginalBioSpecimen
+ amount: int
+ unitofMeasure: String
+ statusCode: String
+ statusDateRange: String
::Material
+ id: Integer
+ description: String
+ name: String
+ formcode: String
::Node
+ experimentId: Integer
::DimensionElement
+ node: String
::Identifiable
+ id: URI
+ name: String
+ properties: String
+ description: String
TreatedSample
::ExtractedGeneticSample
+ geneticSampleId: Integer
+ geneticSampleType: String
+ extractedAmount: Integer
+ extractionMethod: String
+ GeneticSampleType: int
+ hybridization: int
+ authorizationLink: url
::Material
+ id: Integer
+ description: String
+ name: String
+ formcode: String
NameValueType
+ id: Integer
+ name: String
+ type: String
Definition of Experiment
SPECIMEN HANDLING
ARRAY DESIGN
RELATIONSHIPS BETWEEN: (Samples, Arrays and Data)
GENE EXPRESSION DATA
Usha: May not need sugar and phosphate data.
Specimen Handling
+ type: String
+ name: String
+ amount: Integer
::HandlingDocument
- Id: int
- text: String
Shipper
- dateShipped: String
- senderType: String
- senderName: String
::Transportation
- id: Integer
::HandlingDocument
- Id: int
- text: String
Receiver
- dateRecieved: String
- receiverType: String
- receiverrName: String
::Transportation
- id: Integer
::HandlingDocument
- Id: int
- text: String
SpecimenContainer
+ containerType: String
+ risk: String
+ handling: String
+ capacityQuantity: Integer
+ heightQuantity: Integer
+ diameterQuantity: Integer
+ capType: String
+ separatorType: String
+ barrierQuantity: Integer
+ bottomDeltaQuantity: Integer
::Identifiable
+ id: URI
+ name: String
+ properties: String
+ description: String
CellSource
+ Type: String
::OriginalBioSpecimen
+ amount: int
+ unitofMeasure: String
+ statusCode: String
+ statusDateRange: String
::ResultInterpretation
+ id: Integer
::Material
+ id: Integer
+ description: String
+ name: String
+ formcode: String
::Node
+ experimentId: Integer
::DimensionElement
+ node: String
::Identifiable
+ id: URI
+ name: String
+ properties: String
+ description: String
ResultInterpretation
+ id: Integer
[Diagram residue: association labels and multiplicities between the classes above, e.g. "contains * / part of 1", "binds * / bound by 1", "produces / produced by", "represented by / represents", "specified by 1 / specifies *", "defined by 1 / defines *", "ships to 0..*", "belongs to / contains", and the links arrayDataFilesLink, derivedArrayDataFilesLink, arrayDataMatrixFilesLink, derivedArrayDataMatrixFilesLink and printingProtocol. The classes each association connects cannot be recovered from the transcript.]
Diagram packages: GeneticVariation, Bio-Specimen, Gene Expression
Color Coding Scheme (legend, class Gene Expression)
• HL7 CG Elements
• Joyce Additions
• NCI Model Elements
• BRIDG 2.1
• Modified for CG
• MIAME-MAGE
• MAGE-TAB
CG DAM Views
• Process Models
  • Specimen Handling and Collection (based on NCI public protocol)
  • Genomics Testing Process (high level)
  • Future – interaction diagrams for message flows per Use Case
• Gene Expression – Whole Model
  • Bio-specimen
  • Experiment Definition (gene-expression-specific protocol, not entire study)
  • Array Design
  • Common Classes
  • Data
  • Relationships
Generic Assay Overview
Generic Assay Overview 1
[Diagram: classes Study, Experiment, Data, Protocol, Equipment, Software and ExperimentalItem, connected by many-valued (*) associations.]
Study: A detailed examination or analysis designed to discover facts about a system under investigation. Systems may include intact organisms, biologic specimens, and natural or synthetic materials.
Experiment: A coordinated set of actions and observations designed to generate data, with the ultimate goal of discovery or hypothesis testing.
Protocol: A rule which guides how an activity should be performed.
ExperimentalItem: Items used in the execution of an experiment. These include specimens (samples either taken from nature or created for the purpose of study, which are to be the subject of an experiment) and the reagents and supplies used in the execution of an experiment. Instruments, analysis tools, and general-purpose resources (common reagents, lab equipment, personnel) are not ExperimentalItems.
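The four definitions above can be read as a small object model. The Python sketch below is one possible reading, not the DAM itself: the slide shows only "*" multiplicities, so which class owns which list is an assumption, and the field names, the ItemKind enum and the specimens() helper are hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class ItemKind(Enum):
    # The definition names specimens, reagents and supplies; instruments,
    # analysis tools and general-purpose resources are excluded.
    SPECIMEN = "specimen"
    REAGENT = "reagent"
    SUPPLY = "supply"


@dataclass
class Protocol:
    name: str
    rule: str  # "a rule which guides how an activity should be performed"


@dataclass
class ExperimentalItem:
    name: str
    kind: ItemKind


@dataclass
class Experiment:
    name: str
    protocols: List[Protocol] = field(default_factory=list)
    items: List[ExperimentalItem] = field(default_factory=list)


@dataclass
class Study:
    title: str
    experiments: List[Experiment] = field(default_factory=list)

    def specimens(self) -> List[ExperimentalItem]:
        """Collect every specimen used across the study's experiments."""
        return [
            item
            for experiment in self.experiments
            for item in experiment.items
            if item.kind is ItemKind.SPECIMEN
        ]


# Made-up example data, for illustration only.
study = Study(
    title="Example expression study",
    experiments=[
        Experiment(
            name="Array hybridization",
            protocols=[Protocol("Labelling", "Label extracts before hybridization")],
            items=[
                ExperimentalItem("Tumor biopsy", ItemKind.SPECIMEN),
                ExperimentalItem("Labelling dye", ItemKind.REAGENT),
            ],
        )
    ],
)
```

Restricting ItemKind to three values mirrors the exclusion in the ExperimentalItem definition: equipment and software stay separate classes rather than item kinds.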
Generic Assay Overview 2
Notes:
1. ProcessedData has an association to Finding; it is not included on the diagram to keep things focused
   1. Isn’t the result of an analytical experiment what we’ve called ProcessedData?
   2. Do we need a distinction between Data and ProcessedData? Can we have a self-association on Data to handle both in the DAM?
2. Software needs to be defined
3. What about an association from ExperimentalItem to ExperimentalStudy?
New v3 Models for Future Ballot
• Domain Information Model (Genome)
  • Allows non-locus-specific data (e.g., large deletions, cytogenetics, etc.) to be represented
  • Links to the locus-specific models, i.e., GeneticLoci & GeneticLocus
• Query Model
  • Based on the HL7 V3 Query by Parameter Infrastructure
  • Adds selected attributes from the Clinical Genomics models as parameters of the query message
Useful links
• HL7.org: http://www.hl7.org/
• HL7 Wiki: http://wiki.hl7.org/index.php?title=Main_Page
• Clinical Genomics Wiki: http://wiki.hl7.org/index.php?title=CG
• HL7 Standards: http://www.hl7.org/implement/standards/index.cfm
• HL7v3 Ballot Site: http://www.hl7.org/v3ballot/html/welcome/environment/index.htm
• ICR (IRWG) Wiki: https://wiki.nci.nih.gov/x/kQiG
• ICR (IRWG) comments on CG Gene Expression DAM: https://wiki.nci.nih.gov/x/FZZ9AQ
• Clinical Genomics Oct 2010 Meeting Slides: http://www.hl7.org/Special/committees/clingenomics/docs.cfm?wg_id=7&wg_docs_subfolder_name=presentations
Questions?
CG Gene Expression DAM May 2010 Ballot; Model Details
• Subpackages
  • Array Design: classes e.g. Array, ArrayDesign, ArrayGroup, Reporter etc.
  • Common Classes: Identifiable, OntologySource, OntologyTerm etc.
  • Data: DataFile, DataMatrix, Image, ImageAcquistion etc.
  • Design Element: DesignElement, DimensionElement etc.
  • Experiment Definition: GenomicProtocol, LabExperiment, NormalizationTypes, ProtocolParameter etc.
  • Relationship: relationships between Samples, Arrays and Data
  • Bio-Specimen Diagrams: classes e.g. BioSpecimen, Bio-Specimen-Characteristics, Specimen Handling etc.
Clinical Genomics DAM May 2010 Ballot; Terminology
• Terminology: definitions from NCI EVS team for a number of terms needed for genetic sample type entries
• nDNA (Nuclear DNA)
• pDNA (plasmid DNA)
• RNA (Ribonucleic acid)
• RNAP (RNA polymerase)
• mRNA (Messenger Ribonucleic Acid)
• snRNA (Small nuclear RNA)
• miRNA (microRNA)
• ssRNA (single-stranded RNA)
• dsRNA (double-stranded RNA)
• snoRNA (small nucleolar RNA)
• tRNA (Transfer RNA)
• hnRNA (heterogeneous nuclear RNA)
• RNP (Ribonucleoprotein)
• snRNP (small nuclear ribonucleoproteins)
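A system that accepts genetic sample type entries could hold the abbreviations above as a small lookup table. The Python sketch below is illustrative only: the abbreviations and expansions come from the list above, while the SAMPLE_TYPE_TERMS name and the expand() helper are hypothetical, and no NCI EVS codes or definitions are implied.

```python
# Abbreviation -> expansion, copied from the terminology list above.
SAMPLE_TYPE_TERMS = {
    "nDNA": "Nuclear DNA",
    "pDNA": "plasmid DNA",
    "RNA": "Ribonucleic acid",
    "RNAP": "RNA polymerase",
    "mRNA": "Messenger Ribonucleic Acid",
    "snRNA": "Small nuclear RNA",
    "miRNA": "microRNA",
    "ssRNA": "single-stranded RNA",
    "dsRNA": "double-stranded RNA",
    "snoRNA": "small nucleolar RNA",
    "tRNA": "Transfer RNA",
    "hnRNA": "heterogeneous nuclear RNA",
    "RNP": "Ribonucleoprotein",
    "snRNP": "small nuclear ribonucleoproteins",
}


def expand(abbreviation: str) -> str:
    """Return the expansion of a sample type abbreviation.

    Lookup is case-sensitive because case is part of the standard
    abbreviations (e.g. mRNA, tRNA).
    """
    try:
        return SAMPLE_TYPE_TERMS[abbreviation]
    except KeyError:
        raise ValueError(f"Unknown genetic sample type: {abbreviation!r}") from None
```

Rejecting unknown abbreviations, rather than passing them through, keeps free-text variants out of a field that is meant to carry a controlled term.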
CG Gene Expression DAM May 2010 Ballot
• Model available at http://www.hl7.org/v3ballot/html/domains/uvcg/uvcg_GeneExpressionDAM.htm#POCG_DO000000UV-GeneExpressionDam-ic. 
• Comments submitted by IRWG (ICR WS) on May 7, 2010: https://wiki.nci.nih.gov/display/ICR/IRWG+Review+of+HL7+CG+DAM+2.0
• Review of the ballot results on the Gene Expression DAM
  • Received 16 Negative and 30 Affirmative votes
  • Negatives from: CDISC, NCI, FDA and Siemens