2013-03-03 etl specifications v4.0 saftinet


  • 8/19/2019 2013-03-03 ETL Specifications v4.0 SAFTINet

    1/47

    SAFTINet ETL Specifications Document Page 1

    Scalable Architecture for Federated Therapeutic Inquiries Network (SAFTINet)

    ETL Specifications Document

    Version 4.0

    March 3rd, 2013


    LICENSE

    © 2011 Foundation for the National Institutes of Health (FNIH).

    Licensed under the Apache License, Version 2.0 (the "License"); you may not use this document except in compliance with the License. You may obtain a copy of the License at http://omop.fnih.org/publiclicense .

    Unless required by applicable law or agreed to in writing, documentation and software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. Any redistributions of this work or any derivative work or modification based on this work should be accompanied by the following source attribution: "This work is based on work by the Observational Medical Outcomes Partnership (OMOP) and used under license from the FNIH at http://omop.fnih.org/publiclicense."

    Any scientific publication that is based on this work should include a reference to http://omop.fnih.org .

    This document was created specifically for the Scalable Architecture for Federated Translational Inquiries Network (SAFTINet) project, in collaboration with OMOP. It reflects changes to the OMOP CDMv2 to create OMOP CDMv3, which were made in collaboration with FNIH OMOP and the SCANNER (Scalable National Network for Effectiveness Research) project (http://scanner.ucsd.edu/).

    SAFTINet is supported by grant number R01HS019908 from the Agency for Healthcare Research and Quality.


    TABLE OF CONTENTS

    1.0 Introduction
    2.0 Definition of terms
    3.0 Assumptions
    4.0 Source Data Mapping Approach
        4.1 Change to Existing Tables
        4.2 Table Name: ORGANIZATION
        4.3 Table Name: CARE_SITE
        4.4 Table Name: PROVIDER
        4.5 Table Name: X_DEMOGRAPHIC
        4.6 Table Name: VISIT_OCCURRENCE
        4.7 Table Name: DRUG_OCCURRENCE
        4.8 Table Name: CONDITION_OCCURRENCE
        4.9 Table Name: PROCEDURE_OCCURRENCE
        4.10 Table Name: OBSERVATION
    5.0 Appendix A: Table Specific Rules
    6.0 Appendix B: Row Filters
    7.0 Appendix C: Sending data using flatfiles


    Change Record

    Date | Author | Version | Change Reference
    02-Nov-2009 | Vicki Fan, Mark Khayter | 1.0 | Original OMOP ETL Template Document
    04-Oct-2011 | Patrick Hosokawa | 2.0 | Document adapted to SAFTINet ETL data model, flowcharts added to detail data flow from ETL model to grid model
    20-Dec-2011 | Patrick Hosokawa | 2.1 | Document updated to 12/20/11 ETL data model
    17-Mar-2012 | Patrick Hosokawa | 2.2 | Document updated to 3/17/12 ETL data model
    06-Aug-2012 | Patrick Hosokawa | 4.0 | Change section removed, Appendix B updated, final move to CDMv4. Added data on labs provided to Appendix B.
    03-Mar-2013 | Patrick Hosokawa | 4.1 | Additions to Appendix B, added Appendix C for flatfile instructions


    1.0 Introduction

    This document reflects the requirements, assumptions, business rules and transformations for the implementation of OMOP CDM V3, as recommended for SAFTINet.

    The purpose of this document is two-fold:

    1.  Describe ETL mapping of data from SAFTINet partners into the Common Data Model.
    2.  Serve as a blueprint for equivalent ETL mapping processes for other data sources into the CDM.

    In each section, the tables and their mappings are individually reviewed along with any source-specific rules and exceptions.

    The intended audiences for this document are the SAFTINet team and partner ETL technical personnel. Sections of the document are targeted specifically towards each audience with appropriate focus and level of detail.

    2.0 Definition of terms

    as a string data type, and allowed to have one of two known code values: "M" for male, "F" for female -- and NULL for records where gender is unknown or not applicable (or arguably "U" for unknown as a sentinel value). The data domain for the gender column is: "M", "F".

    In database technology, domain refers to the description of an attribute's allowed values. The physical description is a set of values the attribute can have, and the semantic, or logical, description is the meaning of the attribute.

    Drug: In pharmacology, a drug is "a chemical substance used in the treatment, cure, prevention, or diagnosis of disease or used to otherwise enhance physical or mental well-being." Drugs may be prescribed for a limited duration, or on a regular basis for chronic disorders.

    Drug Exposure (entity): The Drug Exposure entity contains individual records that suggest drug utilization by the person. Drug Exposure indicators store key information about each person's medication and the timing thereof, including the drug (captured as a standard Concept code in the CDM), quantity, beginning date of medication, number of days supply, period of exposure, and prescription refill data. Drug Exposures are stored in the DRUG_EXPOSURE table.

    Encrypted Unique Identifiers: Output of a de-identification process used to hash the identity of subjects, providing them with a unique but de-identified identifier.

    Electronic Health Record (EHR): Electronic health record refers to an individual person's medical record in digital format. It may be made up of electronic medical records from many locations and/or sources. The EHR is a longitudinal electronic record of person health information generated by one or more encounters in any care delivery setting. Included in this information are person demographics, progress notes, problems, medications, vital signs, past medical history, immunizations, laboratory data and radiology reports. The EHR has the ability to generate a complete record of a clinical person encounter, as well as supporting other care-related activities directly or indirectly via interface, including evidence-based decision support, quality management, and outcomes reporting.

    Electronic Medical Record (EMR): An electronic medical record is a computerized legal medical record created in an organization that delivers care, such as a hospital or outpatient setting. Electronic medical records tend to be a part of a local stand-alone health information system that allows storage, retrieval and manipulation of records. This document will reference EHR moving forward even if certain data sources internally use the EMR definition.

    Extract, Transform, Load (ETL): Process of getting data out of one data store (Extract), modifying it (Transform), and inserting it into a different data store (Load).

    Generic Product Identifier (GPI): A proprietary unique identifier for a drug used by the commercial Medi-Span® formulary database.


    Grid-enabled network: A collection of grid nodes (virtual organizations) capable of responding to/with grid query/response services.

    Grid Node: A grid-enabled database containing data “owned” by a specific health care entity or virtual organization.

    Grid Portal: Contains the set of services that allow queries to be sent, give access to authorized users, and administer query and response activities.

    Healthcare Common Procedure Coding System (HCPCS): HCPCS Level I codes are managed by the AMA (licensing fees apply). The HCPCS Level II codes are managed by CMS (Centers for Medicare & Medicaid Services). The Level II codes include: alphanumeric HCPCS procedure and modifier codes, their long and short descriptions, and applicable Medicare administrative, coverage, and pricing data. These codes are used for Medicare outpatient services.

    International Classification of Diseases, 9th Revision, Clinical Modification (ICD-9-CM): The official system of assigning codes to diagnoses and procedures associated with hospital utilization in the United States.

    Investigator: Any authorized clinician or researcher, or person designated to act on their behalf (e.g., research assistant, statistician) who has been authenticated for access to query and response functionality on the grid-enabled network.

    Logical Observation Identifiers Names and Codes (LOINC): Universal code names and identifiers for medical terminology related to the Electronic Health Record, which assist in the electronic exchange and gathering of clinical results (such as laboratory tests, clinical observations, outcomes management and research).

    Limited Data Set: As defined by HIPAA, limited data sets are data sets stripped of certain direct identifiers that are specified in the Privacy Rule. They are not de-identified information under the Privacy Rule. A limited data set is PHI that excludes the following direct identifiers of the individual or of relatives, employers, or household members of the individual: (1) names; (2) postal address information, other than town or city, state, and ZIP code; (3) telephone numbers; (4) fax numbers; (5) e-mail addresses; (6) social security numbers; (7) medical record numbers; (8) health plan beneficiary numbers; (9) account numbers; (10) certificate/license numbers; (11) vehicle identifiers and serial numbers, including license plate numbers; (12) device identifiers and serial numbers; (13) web URLs; (14) Internet Protocol (IP) address numbers; (15) biometric identifiers, including fingerprints and voiceprints; and (16) full-face photographic images and any comparable images. Importantly, unlike de-identified data, PHI in limited data sets may include the following: city, state and ZIP codes; all elements of dates (such as admission and discharge dates); and unique codes or identifiers not listed as direct identifiers. Recognizing that institutions, IRBs and investigators are frequently faced with applying both the Common Rule and the HIPAA Privacy Rule, OHRP does not consider a Limited Data Set (as defined under the HIPAA Privacy Rule) to constitute individually identifiable information under 45 CFR 46.102(f)(2).

    Local Reference Value: The specific value used in the partner’s data to refer to any given concept. This value will be mapped to a standardized concept value for translation by ROSITA.

    National Drug Codes (NDC): Unique identifiers assigned to individual drugs. NDCs are used primarily as an inventory code and for prescriptions.

    Observation (entity): The Observation table contains all general observations that are tracked as attributes, including source Observation code, matching standard Concept Code, date of the Observation, type of Observation, type of result, number/text/Concept code, and reference range for numeric results. Observation entities are recorded in the Observation table.

    Observational Medical Outcomes Partnership (OMOP): A public-private partnership designed to protect human health by improving the monitoring of drugs for safety and effectiveness.

    Organization (entity): The Organization table is the highest level of the partner care infrastructure hierarchy. Each organization may have multiple care sites. Providers will work at one or more care sites.

    Person (entity): A Person entity is one of the basic dimensions of analysis. It presents the framework for active drug surveillance. The Person entity is Concept driven: its attribute values are stored as standard Concept codes rather than original (i.e., “raw”) source values. It is stored in the logical X_Demographic table.

    Primary Care Physician: A physician designated as responsible to provide specific care to a patient, including evaluation and treatment as well as referral to specialists.

    Procedure Occurrence (entity): A Procedure Occurrence records individual instances of medical procedures extracted from source data. Procedures are recorded in various data sources in different forms with varying levels of standardization such as CPT-4, ICD-9-CM, and HCPCS procedure codes. These are stored in the PROCEDURE_OCCURRENCE table.

    Protected Health Information (PHI): Protected health information (PHI) under HIPAA includes any individually identifiable health information. Identifiable refers not only to data that is explicitly linked to a particular individual (that is, identified information). It also includes health information with data items which reasonably could be expected to allow individual identification. De-identified information is that from which all potentially identifying information has been removed.

    Provider (entity): The Provider table contains information on local care providers including type and specialty. Providers are assigned to an individual care site.

    Query: A request for data based on the query specifications “sent” via a grid services portal to a specified grid network.

    ROSITA: A software package designed to transition SAFTINet data from the partner XML download to a grid database compatible form. This package will translate local source codes into OMOP concepts and will remove PHI other than dates of birth, dates of service, and zip codes.


    RxNorm: A standardized nomenclature for clinical drugs and drug delivery devices, produced by the National Library of Medicine. In RxNorm, the name of a clinical drug combines its ingredients, strengths, and/or form. RxNorm provides normalized names for clinical drugs and links its names to many of the drug vocabularies commonly used in pharmacy management and drug interaction software, including those of First DataBank, Micromedex, MediSpan, Gold Standard Alchemy, and Multum. By providing links between these vocabularies, RxNorm can mediate messages between systems not using the same software and vocabulary.

    Subject: A patient, client or person of interest in the use cases described whose clinical and demographic data are contained within the virtual organization(s).

    Systematized Nomenclature of Medicine - Clinical Terms (SNOMED-CT): SNOMED-CT is one of a suite of designated standards for use in U.S. Federal Government systems for the electronic exchange of clinical health information, and is also a required standard in interoperability specifications of the U.S. Healthcare Information Technology Standards Panel. SNOMED-CT is also being implemented internationally as a standard within other IHTSDO Member countries.

    Terminology: Technical or special terms used in a business or special subject area.

    Virtual organization (aka Partner): Any entity or group of entities (e.g., clinic, network of clinics, agency or agencies) whose data is represented by a single grid node and available through grid services for query/response activities.

    Visit Occurrence (entity): The Visit Occurrence entity contains the information available in the source data about person visits to healthcare providers, including inpatient, outpatient, and ER visits. Visits are recorded in various data sources in different forms with varying levels of standardization. The detail level of the classification and description of the visit differs by data source. Visit Occurrence entities are recorded in the VISIT_OCCURRENCE table.

    Vocabulary: A computerized list (as of items of data or words) used for reference (as for information retrieval or word processing).


    3.0 Assumptions

    The design follows the agreed-upon general project assumptions:

    -  Electronic Medical Data: EMR is a subset of EHR. This document will reference EHR moving forward even if a specific data source internally uses the Electronic Medical Record (EMR) definition.

    -  Financial Information: The CDM model makes use of financial information such as Fees, Payments, Deductibles, Copayment, etc. from payer source data, such as Medicaid.

    -  Plan Detail Information: The model potentially makes use of fields related to Plan or Coverage details such as Benefit Plan, Plan Indicator, etc. of the administrative information in the claims data. The model makes use of medical coverage period and eligibility for prescription drugs.

    -  Cleansing and Validation: The selected data fields will be handled (whether loaded directly or as part of a transformation) with a validation plan which is to be determined later.

    -  Data Privacy: ETL from EHR/CDW will contain clear text direct patient identifiers and dates. ROSITA will encrypt all clear text direct patient identifiers. A random identifier (called a GUID) that is unrelated to any patient identifier will be associated with each patient record. Birth dates and dates of service will remain unchanged. Zip codes will also be forwarded to the grid node unchanged, together with a second variable that includes only the 3-digit zip (the leftmost 3 digits). The resulting data exported to the grid node will therefore be a limited data set containing encrypted direct identifiers with unchanged dates and both 5-digit and 3-digit zip codes. The grid node will have no access to any clear text direct patient identifiers from the EHR/CDW.

       Under the assumption that payer data will be provided with clear text direct identifiers, ROSITA will perform record linkage to link the clinical record with the financial record using clear text identifiers. If a match is made, the same GUID assigned to the clinical data will be assigned to the financial data. Otherwise, a new GUID will be generated that is unrelated to any patient identifier. Dates will remain unchanged. The resulting data exported to the grid node will be consistent with a Limited Data Set containing encrypted direct identifiers, unchanged dates, a 5- and 3-digit zip code, and a GUID random identifier. The grid node will have no access to any clear text direct patient identifiers from payer (e.g. Medicaid) data.

    -  Concept Identifiers: Data are represented through standard concept identifiers using a standardized terminology. During ETL, source data representations (raw data codes) will be translated to standard concept identifiers through a mapping process. If no standard concept identifier is available, the concept identifier field will contain ‘0’ as a value.
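    The Data Privacy assumption can be illustrated with a minimal Python sketch. It is not the ROSITA implementation: the function name, the record keys (patient_id, birth_date, zip), the keyed SHA-256 hash standing in for the unspecified encryption step, and exact matching on patient_id as a stand-in for record linkage are all assumptions made for illustration.

```python
import hashlib
import hmac
import uuid


def deidentify(record, guid_by_patient, secret=b"hypothetical-site-secret"):
    """Return a limited-data-set view of one patient record.

    record: dict with hypothetical keys 'patient_id', 'birth_date', 'zip'.
    guid_by_patient: dict shared across sources, so a clinical record and a
    linked payer record for the same patient receive the same GUID.
    """
    patient_id = record["patient_id"]
    # Random GUID, unrelated to any patient identifier; reused on a match.
    if patient_id not in guid_by_patient:
        guid_by_patient[patient_id] = str(uuid.uuid4())
    # Keyed hash used here as a placeholder for the encryption step (assumption).
    encrypted_id = hmac.new(
        secret, patient_id.encode("utf-8"), hashlib.sha256
    ).hexdigest()
    zip_code = record.get("zip") or ""
    return {
        "guid": guid_by_patient[patient_id],
        "encrypted_patient_id": encrypted_id,
        "birth_date": record["birth_date"],  # dates pass through unchanged
        "zip": zip_code,                     # 5-digit zip, unchanged
        "zip3": zip_code[:3],                # leftmost three digits
    }
```

    Passing the same guid_by_patient dictionary when processing payer data is what lets a matched financial record inherit the GUID already assigned to the clinical record.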


    4.0 Source Data Mapping Approach 

    This section covers the high-level assumptions and approach to extraction, transformation and loading (ETL) of raw source data into the Common Data Model (CDM). The assumptions and approach are defined with a special focus on claims and EHR data. The section covers each of the major tables in the CDM separately, elaborating the distinct handling required for each.

    Unless a field is marked ‘Required’ in the field listing, missing attributes will not disqualify data from being loaded into the Common Data Model. Missing attributes for Concept Identifiers will be populated with the value zero (0) in the CDM, while the rest of the missing attributes will be populated with NULL.
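    A minimal Python sketch of this missing-attribute rule follows. The function name and the shape of the field specification are hypothetical, and the rule is read here as meaning that a missing Required attribute disqualifies the row.

```python
def fill_missing(row, fields):
    """Apply the missing-attribute rule to one source row.

    fields maps each destination field name to a (kind, required) pair,
    where kind is 'concept' for concept identifier fields and 'other'
    for everything else (a hypothetical specification format).
    Returns the cleaned row, or None when a required attribute is missing.
    """
    cleaned = {}
    for name, (kind, required) in fields.items():
        value = row.get(name)
        if value is None or value == "":
            if required:
                return None  # row is not loaded
            # Concept identifiers default to 0, all other attributes to NULL.
            value = 0 if kind == "concept" else None
        cleaned[name] = value
    return cleaned
```

    In Python, None plays the role of the database NULL.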

    The Source Field and Applied Rule fields are left blank for the partners to fill in. The Source Field should be filled in with the equivalent field in the partner’s source data. The Applied Rule field should contain any specialized rules (e.g. filtering, translation, combination of categories) that the partner implements when filling in the field.

    In the flowcharts, the colors red, yellow, and green are used in the following manner.

    Left Side (ETL View): represents desired source data
    -  Red: Field is not brought forward into the grid.
    -  Green: Field is brought forward into the grid.

    Right Side (Grid View): represents desired grid-facing data
    -  Red: Field is generated by the ROSITA application. It is not derived from any ETL data field.
    -  Yellow: Field is generated from ETL data, but does not exist as a field in the ETL data.
    -  Green: Field is brought forward from ETL data unchanged.

    The arrows indicate that the field on the right (in yellow) is generated from the field on the left (green for those fields brought forward, red otherwise).

    The grid-facing data model (right side of the flowcharts) closely matches the OMOP v3 data model. However, the SAFTINet grid model has a few extra fields needed specifically for SAFTINet. All fields present in SAFTINet but not in the OMOP model use the prefix X_ (e.g. X_Organization_Source).


    4.1 Changes to existing tables

    Table | Changed Field | Change
    Visit_Occurrence | x_visit_occurrence_source_identifier | Changed from visit_occurrence_source_identifier; the new x_ prefix is so the field can pass through to the grid
    Drug_Exposure | x_visit_occurrence_source_identifier | Changed from visit_occurrence_source_identifier; the new x_ prefix is so the field can pass through to the grid
    Condition_Occurrence | x_visit_occurrence_source_identifier | Changed from visit_occurrence_source_value; the new x_ prefix is so the field can pass through to the grid
    Procedure_Occurrence | x_visit_occurrence_source_identifier | Changed from visit_occurrence_source_value; the new x_ prefix is so the field can pass through to the grid
    Observation | x_visit_occurrence_source_identifier | Changed from visit_occurrence_source_value; the new x_ prefix is so the field can pass through to the grid


    4.2 Table Name: ORGANIZATION

    The Organization table is the highest level of the partner care infrastructure hierarchy. Each organization may have multiple care sites. Providers can work at one or more care sites. Address information submitted with the organization will be used to create a new location record which will be linked to the organization record via the Location_ID field.

    The field mapping is performed as follows:

    Destination Field | Data Type / Required | Source Field | Applied Rule | Comment
    organization_source_value | String (50) / Required |  |  | Local reference value for organization, used to create the organization_id field on the grid-facing record. This value will also be used in other records to refer to the organization.
    x_data_source_type | String (20) / Required |  |  | Data Source Identifier (EHR / CDW / Medicaid)
    place_of_service_source_value | String (50) / Required |  |  | The type of organization. If the organization type is not defined in the source data, refer to the place_of_service_type section of the Concept ID Table. Used to create place_of_service_concept_id.
    organization_address_1 | String (50) |  |  | First line of the address
    organization_address_2 | String (50) |  |  | Second line of the address
    organization_city | String (50) |  |  | City portion of the address
    organization_state | String (2) |  |  | State portion of the address
    organization_zip | String (9) |  |  | Zip code of the address
    organization_county | String (20) |  |  | County portion of the address
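    A partner could check outgoing ORGANIZATION records against the lengths and Required flags listed above before submission. The following Python sketch is one hedged way to do so; the dictionary layout, function name and message wording are illustrative assumptions, not part of the specification.

```python
# Field name -> (maximum length, required), transcribed from the table above.
ORGANIZATION_FIELDS = {
    "organization_source_value": (50, True),
    "x_data_source_type": (20, True),
    "place_of_service_source_value": (50, True),
    "organization_address_1": (50, False),
    "organization_address_2": (50, False),
    "organization_city": (50, False),
    "organization_state": (2, False),
    "organization_zip": (9, False),
    "organization_county": (20, False),
}


def validate_organization(record):
    """Return a list of problems found in one ORGANIZATION record."""
    problems = []
    for name, (max_len, required) in ORGANIZATION_FIELDS.items():
        value = record.get(name)
        if value is None or value == "":
            if required:
                problems.append(f"{name}: required value is missing")
            continue
        if len(str(value)) > max_len:
            problems.append(f"{name}: longer than {max_len} characters")
    return problems
```

    An empty list means the record satisfies the length and Required constraints; it says nothing about whether the values themselves are correct.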


    4.2.1 Example of ORGANIZATION source / destination data

    Notes:

    1.  The organization_source_value field will be compared to the current set of locations. If the value does not already occur in the table (new location) a row will be added to the table and a new ID (location_id) will be generated. Either a newly generated value or a pre-existing value (if the record is found) of the location table Primary Key will be placed into location_id.

    2.  x_zip_deidentified will be generated from organization_zip. This field was created specifically for person locations to support the creation of ‘Safe Harbor’ Limited Data Sets.

    3.  x_location_type will be derived from the XML record type (Organization in this case).

    ETL View

    Organization Table - XML
    organization_source_value [1] | UC Internal Medicine
    x_data_source_type | EHR
    place_of_service_source_value | Academic Practice
    organization_address_1 | 13199 E Montview Blvd
    organization_address_2 | Suite 300, Mail Stop F443
    organization_city | Aurora
    organization_state | CO
    organization_zip | 80045
    organization_county | Arapahoe

    Grid View

    Organization Table - Grid
    organization_id | 22770494
    organization_source_value | UC Internal Medicine
    x_data_source_type | EHR
    place_of_service_concept_id | 3389
    place_of_service_source_value | Academic Practice
    location_id | 39458
    x_grid_node_id | 1

    Location Table - Grid
    location_id | 39458
    location_source_value | UC Internal Medicine
    x_data_source_type | EHR
    address_1 | 13199 E Montview Blvd
    address_2 | Suite 300, Mail Stop F443
    city | Aurora
    state | CO
    zip | 80045
    x_zip_deidentified [2] | 800
    county | Arapahoe
    x_location_type [3] | Organization
    x_grid_node_id | 1

    ETL View colors: Green = brought forward into grid model; Red = removed in processing.
    Grid View colors: Green = brought forward from ETL; Yellow = generated from ETL field; Red = generated locally or from multiple ETL fields.
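    The location handling described in the notes of this example (reuse an existing location_id when the source value is already known, otherwise add a row) can be sketched in Python as below. The class, its in-memory storage and the counter-based IDs are assumptions made for illustration; taking the first three digits for x_zip_deidentified follows the 80045 to 800 example rather than a stated rule.

```python
import itertools


class LocationTable:
    """In-memory stand-in for the grid Location table (illustrative only)."""

    def __init__(self, first_id=1):
        self._next_id = itertools.count(first_id)
        self.rows = {}  # location_source_value -> row dict

    def get_or_create(self, record, location_type, prefix):
        """Return the location row for a source record, adding it if new.

        prefix is 'organization' or 'care_site', matching the field names
        of the submitted record.
        """
        source_value = record[f"{prefix}_source_value"]
        if source_value in self.rows:
            return self.rows[source_value]  # pre-existing primary key is reused
        zip_code = record.get(f"{prefix}_zip") or ""
        row = {
            "location_id": next(self._next_id),
            "location_source_value": source_value,
            "zip": zip_code,
            "x_zip_deidentified": zip_code[:3],
            "x_location_type": location_type,
        }
        self.rows[source_value] = row
        return row
```

    Submitting the same organization twice therefore yields one location row and the same location_id both times.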


    4.3 Table Name: CARE_SITE

    The Care Site table refers to the lower level of the provider care hierarchy. Individual provider care locations will be stored in this table.

    The field mapping is performed as follows:

    Destination Field | Data Type / Required | Source Field | Applied Rule | Comment
    care_site_source_value | String (50) / Required |  |  | Local reference value for care site, used to create the care_site_id field on the grid-facing record. This value will also be used in other records to refer to the care site.
    x_data_source_type | String (20) / Required |  |  | Data Source Identifier (EHR / CDW / Medicaid)
    organization_source_value | String (50) / Required |  |  | Local reference value for organization. This value will be matched against the organization table to obtain the corresponding organization_id.
    place_of_service_source_value | String (50) |  |  | The type of care site. If the care site type is not defined in the source data, refer to the place_of_service_type section of the Concept ID Table. Used to create place_of_service_concept_id.
    x_care_site_name | String (50) |  |  | Name of the clinic (care site)
    care_site_address_1 | String (50) |  |  | First line of the address
    care_site_address_2 | String (50) |  |  | Second line of the address
    care_site_city | String (50) |  |  | City portion of the address
    care_site_state | String (2) |  |  | State portion of the address
    care_site_zip | String (9) |  |  | Zip code of the address
    care_site_county | String (20) |  |  | County portion of the address


    4.3.1 Example of CARE SITE source / destination data

    Notes:

    1.  The care_site_source_value field will be compared to the current set of locations. If the value does not already occur in the table (new location) a row will be added to the table and a new ID (location_id) will be generated. Either a newly generated value or a pre-existing value (if the record is found) of the location table Primary Key will be placed into location_id.

    2.  x_zip_deidentified will be generated from care_site_zip. This field was created specifically for person locations to support the creation of ‘Safe Harbor’ Limited Data Sets.

    3.  x_location_type will be derived from the XML record type (Care Site in this case).

    ETL View

    Care Site Table - XML
    care_site_source_value [1] | UC Internal Medicine
    x_data_source_type | EHR
    organization_source_value | University of Colorado
    place_of_service_source_value | Internal Medicine
    x_care_site_name | Eastside Clinic
    care_site_address_1 | 13199 E Montview Blvd
    care_site_address_2 | Suite 300, Mail Stop F443
    care_site_city | Aurora
    care_site_state | CO
    care_site_zip | 80045
    care_site_county | Arapahoe

    Grid View

    Care Site Table - Grid
    care_site_id | 22770494
    care_site_source_value | UC Internal Medicine
    x_data_source_type | EHR
    location_id | 49382
    organization_id | 382392
    place_of_service_concept_id | 39458
    place_of_service_source_value | Internal Medicine
    x_care_site_name | Eastside Clinic
    x_grid_node_id | 1

    Location Table - Grid
    location_id | 49382
    location_source_value | UPI Building
    x_data_source_type | EHR
    address_1 | 13199 E Montview Blvd
    address_2 | Suite 300, Mail Stop F443
    city | Aurora
    state | CO
    zip | 80045
    x_zip_deidentified [2] | 800
    county | Arapahoe
    x_location_type [3] | Care Site
    x_grid_node_id | 1

    ETL View colors: Green = brought forward into grid model; Red = removed in processing.
    Grid View colors: Green = brought forward from ETL; Yellow = generated from ETL field; Red = generated locally or from multiple ETL fields.


    4.4 Table Name: PROVIDER 

    The Provider table contains information on local care providers including type and specialty. Providers are assigned to an individual care site.

    The field mapping is performed as follows:

    Destination Field Data Type Source Field Applied Rule Comment

    provider_source_value String (50) / Required Local reference value for provider, used to create the provider_id field on the grid facing record. This value will also be used in other records to refer to the provider.

    x_data_source_type String (20) / Required Data Source Identifier (EHR / CDW / Medicaid)

    npi String (50) Provider NPI

    dea String (50) Provider DEA Number

    specialty_source_value String (50) Provider type as recorded at the source (e.g. Physician, NP, MA, etc.). If the provider type is not defined in the source data, refer to the Health Care Provider Specialty section of the Concept ID Table. Used to create specialty_concept_id.

    x_provider_first String (75) Provider First Name

    x_provider_middle String (75) Provider Middle Name (or initial)

    x_provider_last String (75) Provider Last Name

    care_site_source_value String (50) Local reference value for Care Site. This value will be matched against the Care Site table to obtain the corresponding care_site_id.

    x_organization_source_value String (50) / Required Local reference value for Organization. This value will be matched against the Care Site table to obtain the corresponding organization_id.
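The matching of a local source value to a grid-facing ID, as described for care_site_source_value above, can be sketched as a simple lookup. This is a minimal illustration, not the SAFTINet implementation: the function name, the dictionary structure, and the choice to key the lookup on x_data_source_type together with the source value are all assumptions.

```python
from typing import Optional

# Hypothetical lookup built while loading the Care Site table:
# (x_data_source_type, care_site_source_value) -> care_site_id
care_site_ids = {("EHR", "UC Internal Medicine"): 22770494}


def resolve_care_site_id(provider: dict, lookup: dict) -> Optional[int]:
    """Return the care_site_id for a provider record, or None if unmatched."""
    source_value = provider.get("care_site_source_value")
    if not source_value:
        return None  # care_site_source_value is not a required field
    key = (provider["x_data_source_type"], source_value.strip())
    return lookup.get(key)


provider = {
    "provider_source_value": "349302",
    "x_data_source_type": "EHR",
    "care_site_source_value": "UC Internal Medicine",
}
care_site_id = resolve_care_site_id(provider, care_site_ids)
```

Returning None for an unmatched value leaves the decision (reject the record, or load it without a care site) to the caller; the specification does not state which is intended.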


    4.4.1 Example of PROVIDER source / destination data

    Provider Table - XML

    provider_source_value 349302

    x_data_source_type EHR

    npi 34930302

    dea 49492

    specialty_source_value General Practitioner

    x_provider_first Marcus

    x_provider_middle W

    x_provider_last Welby

    care_site_source_value UC Internal Medicine

    x_organization_source_value University of Colorado

    ETL View Grid View

    Provider Table - Grid

    provider_id 2399450

    provider_source_value 349302

    x_data_source_type EHR

    npi 34930302

    dea 49492

    specialty_source_value General Practitioner

    specialty_concept_id 20302

    x_provider_first Marcus

    x_provider_middle W

    x_provider_last Welby

    care_site_id 22770494

    x_organization_id 3939

    x_grid_node_id 1

    Green – Brought forward into grid model / Red  – Removed in processing /

    Blue – Item under discussion

    Green – Brought forward from ETL / Yellow  – Generated from ETL field / Red – 

    Generated locally or from multiple ETL fields / Blue – Item under discussion


    4.5 Table Name: X_Demographic

    The X_Demographic table stores information about individual patients. The PHI elements of this record will be stripped out in the transformation to the grid model. Address information will be limited and used to create a new location record.

    The field mapping is performed as follows:

    Destination Field Data Type Source Field Applied Rule Comment

    person_source_value String (50) /

    Required

    Person unique identifier at the source (MRN). Used

    to create the person_id field on the grid facing

    record. This value will also be used in other records

    to refer to the person.

    x_data_source_type String (20) / Required Data Source Identifier (EHR / CDW / Medicaid)

    medicaid_id_number String (50) Medicaid ID Number

    ssn String (50) Social Security Number

    last String (75) Last Name

    middle String (75) Middle Name or Initial

    first String (75) First Name

    address_1 String (50) The first line of the person's actual address.

    address_2 String (50) The second line of the person's actual address.

    city String (50) The city portion of the person's actual address.

    state String (2) The state portion of the person's actual address.

    zip String (9) Zip code of the person's actual address.

    county String (20) The county portion of the person’s address as

    recorded at source.

    year_of_birth Number(4) /

    Required

    Year of birth

    month_of_birth Number (2) Month of birth

    day_of_birth Number (2) Day of birth

    gender_source_value String (50) Local reference value for gender of the person.

    Used to create gender_concept_id

    race_source_value String (50) Local reference value for race of the person. Used

    to create race_concept_id.

    ethnicity_source_value String (50) Local reference value for ethnicity of the person.

    Used to create ethnicity_concept_id.


    provider_source_value String (50) Local reference value for patient’s primary provider

    (if any). This value will be matched against the

    Provider table to obtain the corresponding

    provider_id.

    care_site_source_value String (50) Local reference value for the patient’s primary Care

    Site (if any). This value will be matched against the

    Care Site table to obtain the corresponding

    care_site_id.

    x_organization_source_value String (50) / Required Local reference value for patient’s organization. This value will be matched against the Organization table to obtain the corresponding organization_id.


    4.5.1 Example of X_Demographic source / destination data

    X_Demographic Table - XML

    person_source_value 29201082

    x_data_source_type EHR

    medicaid_id_number 3903432

    ssn 999-99-9999

    last Doe

    middle D

    first John

    address_1 123 Fake St

    address_2 Apt 566

    city Aurora

    state CO

    zip 80045

    county Arapahoe

    year_of_birth 1965

    month_of_birth 2

    day_of_birth 9

    gender_source_value Male

    race_source_value White

    ethnicity_source_value Non-Hispanic

    provider_source_value 35346346

    care_site_source_value UC Internal Medicine

    x_organization_source_value University of Colorado

    ETL View

    Person Table - Grid

    person_id 22770494

    person_source_value (2)

    location_id (1) 49382

    year_of_birth 1965

    month_of_birth 2

    day_of_birth 9

    gender_concept_id 675

    gender_source_value Male

    race_concept_id 344

    race_source_value White

    ethnicity_concept_id 202

    ethnicity_source_value Non-Hispanic

    provider_id (3) 34235556

    care_site_id 22770494

    x_organization_id 382392

    x_grid_node_id 1

    GRID View

    Location Table - Grid

    location_id (1) 39458

    location_source_value

    x_data_source_type EHR

    address_1 (4)

    address_2 (4)

    city Aurora

    state CO

    zip 80045

    x_zip_deidentified (5) 800

    county Arapahoe

    x_location_type (6) 34344

    x_grid_node_id 1

    Green – Brought forward into grid model / Red – Removed in processing

    Green – Brought forward from ETL / Yellow – Generated from ETL field / Red – Generated locally or from multiple ETL fields
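The split of one X_Demographic record into a grid-facing person record and a location record can be sketched as follows. This is an illustrative sketch only: the function name is hypothetical, the IDs are passed in rather than generated, and the derived fields (the concept IDs, x_zip_deidentified, x_location_type) are left out. Blanking person_source_value and the address lines follows the numbered notes to this example.

```python
def split_demographic(demo: dict, person_id: int, location_id: int):
    """Return (person, location) grid-facing records for one X_Demographic record."""
    person = {
        "person_id": person_id,
        "person_source_value": "",  # kept blank on the grid side (note 2)
        "location_id": location_id,
    }
    # Only non-identifying fields are copied; names, SSN and Medicaid ID are dropped.
    for name in ("year_of_birth", "month_of_birth", "day_of_birth",
                 "gender_source_value", "race_source_value",
                 "ethnicity_source_value"):
        person[name] = demo.get(name)

    location = {
        "location_id": location_id,
        "location_source_value": "",
        "x_data_source_type": demo["x_data_source_type"],
        "address_1": "",  # street address is not passed through (note 4)
        "address_2": "",
    }
    for name in ("city", "state", "zip", "county"):
        location[name] = demo.get(name)
    return person, location


demo = {
    "person_source_value": "29201082", "x_data_source_type": "EHR",
    "ssn": "999-99-9999", "last": "Doe", "first": "John",
    "address_1": "123 Fake St", "address_2": "Apt 566",
    "city": "Aurora", "state": "CO", "zip": "80045", "county": "Arapahoe",
    "year_of_birth": 1965, "month_of_birth": 2, "day_of_birth": 9,
    "gender_source_value": "Male", "race_source_value": "White",
    "ethnicity_source_value": "Non-Hispanic",
}
person, location = split_demographic(demo, person_id=22770494, location_id=49382)
```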


    1.  The location ID value is not linked to a location_source_value in this case. When the address information is transferred to the location table, the resulting ID value will be placed in the person record for reference.

    2.  The grid version of the person table contains a blank field for person_source_value to comply with the OMOP standard. The value for person_source_value on the ETL side will not be carried forward due to privacy concerns.

    3.  The grid facing provider_id will be derived from the ETL field provider_source_value.

    4.  When creating the location table, the local values for person address will not be passed through to the grid. They are labeled green because in other instances, such as Organization and Care Site, they do move forward to the grid facing database.

    5.  x_zip_deidentified will be generated from zip. This field was created specifically for person locations to support the creation of ‘Safe Harbor’ Limited Data Sets.

    6.  x_location_type will be derived from the XML record type (Person in this case).
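The derivation of x_zip_deidentified from zip (80045 becomes 800 in the example) can be sketched as keeping the first three digits. The function name and the restricted_prefixes parameter are hypothetical. The HIPAA Safe Harbor method additionally replaces the three-digit prefix with 000 for areas containing 20,000 or fewer people; whether SAFTINet applies that step is not stated here, so it is shown as an optional input.

```python
def deidentify_zip(zip_code: str, restricted_prefixes: frozenset = frozenset()) -> str:
    """Reduce a 5- or 9-digit zip to its 3-digit prefix.

    restricted_prefixes is an assumed, caller-supplied set of low-population
    prefixes that should be reported as "000".
    """
    digits = "".join(ch for ch in zip_code if ch.isdigit())
    if len(digits) < 5:
        return ""  # too short to be a usable zip code
    prefix = digits[:3]
    return "000" if prefix in restricted_prefixes else prefix


x_zip_deidentified = deidentify_zip("80045")
```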


    4.6 Table Name: VISIT_OCCURRENCE

    The Visit Occurrence table contains a record for each patient-provider encounter. The provider, patient and location are all stored as well as

    the type of visit.

    The field mapping is performed as follows:

    Destination Field Data Type Source Field Applied Rule Comment

    x_visit_occurrence_source_identifier String (50) / Required Local reference value for visit, used to create the visit_occurrence_id field on the grid facing record.

    x_data_source_type String (20) /

    Required

    Data Source Identifier (EHR / CDW / Medicaid)

    person_source_value String (50) /

    Required

    Person unique identifier at the source (MRN). This

    value will be matched against the Person table to

    obtain the corresponding person_id.

    visit_start_date DATE/

    Required

    The date on which the Visit started

    visit_end_date DATE /

    Required

    The date on which the Visit ended

    place_of_service_source_value String (50) Visit type (office visit, med refill, face-to-face, telephone, etc.). If the visit type is not defined in the source data, refer to the Visit_Type section of the Concept ID Table. Used to create place_of_service_concept_id.

    x_provider_source_value String (50) Local reference value for the provider conducting the visit. This value will be matched against the Provider table to obtain the corresponding provider_id.

    care_site_source_value String (50) Local reference value for the Care Site of the visit.

    This value will be matched against the Care Site

    table to obtain the corresponding care_site_id.
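A pre-load check of the required VISIT_OCCURRENCE fields can be sketched as below. This is a sketch under stated assumptions: the function name is hypothetical, dates are assumed to arrive as M/D/YYYY strings as in the example records of this document, and the rule that the end date may not precede the start date is an added sanity check rather than a rule stated in the specification.

```python
from datetime import datetime

REQUIRED_VISIT_FIELDS = (
    "x_visit_occurrence_source_identifier",
    "x_data_source_type",
    "person_source_value",
    "visit_start_date",
    "visit_end_date",
)


def validate_visit(visit: dict) -> list:
    """Return a list of problems found in one visit record (empty if none)."""
    errors = [f"missing required field: {name}"
              for name in REQUIRED_VISIT_FIELDS if not visit.get(name)]
    if errors:
        return errors
    try:
        start = datetime.strptime(visit["visit_start_date"], "%m/%d/%Y")
        end = datetime.strptime(visit["visit_end_date"], "%m/%d/%Y")
    except ValueError:
        return ["visit dates must be in M/D/YYYY form"]
    if end < start:
        errors.append("visit_end_date precedes visit_start_date")
    return errors


visit = {
    "x_visit_occurrence_source_identifier": "349302",
    "x_data_source_type": "EHR",
    "person_source_value": "2302202",
    "visit_start_date": "5/23/2011",
    "visit_end_date": "5/25/2011",
}
problems = validate_visit(visit)
```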


    4.6.1 Example of VISIT OCCURRENCE source / destination data

    Visit Occurrence Table - XML

    x_visit_occurrence_source_identifier 349302

    x_data_source_type EHR

    person_source_value 2302202

    visit_start_date 5/23/2011

    visit_end_date 5/25/2011

    place_of_service_source_value Physical

    x_provider_source_value 20302340

    care_site_source_value UC Internal Medicine

    ETL View Grid View

    Visit Occurrence Table - Grid

    visit_occurrence_id 3203402

    x_data_source_type EHR

    person_id 30205202

    visit_start_date 5/23/2011

    visit_end_date 5/25/2011

    place_of_service_concept_id 302023003

    place_of_service_source_value Physical

    x_provider_id 04594020

    care_site_id 202033

    x_grid_node_id 1

    Green – Brought forward into grid model / Red – Removed in processing

    Green – Brought forward from ETL / Yellow – Generated from ETL field / Red – Generated locally or from multiple ETL fields


    quantity Number (8,2) The quantity of drug recorded in the corresponding

    Drug Exposure Instance

    days_supply Number (4) The number of days' supply of the medication

    recorded in the corresponding Drug Exposure

    Instance.

    x_drug_name String (255) /

    Required

    Drug name taken verbatim from source field

    x_drug_strength String (50) Strength (taken verbatim) (e.g. 20, 1000, 2-4, 1)

    sig String (500) Sig (if available)

    provider_source_value String (50) / Required Local reference value for prescribing/administering provider (if any). This value will be matched against the Provider table to obtain the corresponding provider_id.

    x_visit_occurrence_source_identifier String (50) Local reference value for the visit where the drug was prescribed/administered. This value will be matched against the Visit Occurrence table to obtain the corresponding visit_occurrence_id.

    relevant_condition_source_value String (50) Associated Diagnosis Source Code. This is the code for the condition for which the drug was given. This value is independent and will not be matched against the Condition Occurrence table.


    4.7.1 Example of DRUG EXPOSURE source / destination data

    Drug Exposure Table - ETL

    drug_exposure_source_identifier 30003400

    x_data_source_type EHR

    person_source_value 2302202

    drug_source_value 4594930302

    drug_source_value_vocabulary NDC

    drug_exposure_start_date 4/19/2011

    drug_exposure_end_date 5/19/2011

    drug_type_source_value Prescription

    stop_reason Regimen Completed

    refills 1

    quantity 60

    days_supply 30

    x_drug_name Amoxicillin

    x_drug_strength 500

    sig

    provider_source_value 239292

    x_visit_occurrence_source_identifier 3499202

    relevant_condition_source_value 393821

    ETL View Grid View

    Drug Exposure Table - Grid

    drug_exposure_id 9947839

    x_data_source_type EHR

    person_id 30205202

    drug_concept_id 499506

    drug_source_value 4594930302

    drug_exposure_start_date 4/19/2011

    drug_exposure_end_date 5/19/2011

    drug_type_concept_id 983921

    stop_reason Regimen Completed

    refills 1

    quantity 60

    days_supply 30

    x_drug_name Amoxicillin

    x_drug_strength 500

    sig

    prescribing_provider_id 3935050

    visit_occurrence_id 040200

    relevant_condition_concept_id 059439333

    x_grid_node_id 1

    Green – Brought forward into grid model / Red – Removed in processing

    Green – Brought forward from ETL / Yellow – Generated from ETL field / Red – Generated locally or from multiple ETL fields


    4.8 Table Name: CONDITION_OCCURRENCE

    The Condition Occurrence table contains a record for each patient condition. The codes associated with the conditions as well as the

    associated person, provider, and visits/encounters are also recorded.

    The field mapping is performed as follows:

    Destination Field Data Type Source Field Applied Rule Comment

    condition_occurrence_source_identifier String (50) / Required Source Condition Primary Key; could be a unique record identifier. Used to create the condition_occurrence_id field on the grid facing record.

    x_data_source_type String (20) /

    Required

    Data Source Identifier (EHR / CDW / Medicaid)

    person_source_value String (50) / Required Person unique identifier at the source (MRN). This value will be matched against the Person table to obtain the corresponding person_id.

    condition_source_value String (50) /

    Required

    Local diagnosis code (e.g. ICD-9, SNOMED etc…).

    Used to create condition_concept_id

    condition_source_value_vocabulary String (50) / Required Type of code (e.g. ICD-9) used for condition.

    x_condition_source_desc String (50) Source Diagnosis Text Description

    condition_start_date Date / Required Onset Date

    x_condition_update_date Date Date condition was updated/reviewed

    condition_end_date Date Resolved Date – Leave blank for unresolved conditions.

    condition_type_source_value String (50) / Required Type of condition as recorded in source data (e.g. chief complaint, problem list, etc.). If the condition type is not defined in the source data, refer to the Condition_Occurrence section of the Concept ID Table. Used to create condition_type_concept_id.

    stop_reason String (20) The reason, if available, that the condition was no

    longer recorded, as indicated in the source data.

    Valid values include discharged, resolved etc… 

    associated_provider_source_value String (50) Provider ID from the source - Provider of record. This value will be matched against the Provider table to obtain the corresponding provider_id.


    x_visit_occurrence_source_identifier String (50) Local reference value for visit. This value will be matched against the Visit Occurrence table to obtain the corresponding visit_occurrence_id.
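Creating condition_concept_id from condition_source_value depends on the vocabulary as well as the code, since the same code string can exist in more than one coding system. A minimal sketch, assuming a hypothetical in-memory map keyed on (vocabulary, code); the single entry reuses the values from the example record in this document, and the normalization of "ICD-9" to "ICD9" is an assumption made because both spellings appear in this specification.

```python
from typing import Optional

# Hypothetical map: (vocabulary, source code) -> concept ID
concept_map = {("ICD9", "162.9"): 884934}


def map_condition_concept(record: dict, concepts: dict) -> Optional[int]:
    """Return condition_concept_id for a record, or None if no mapping exists."""
    vocabulary = record["condition_source_value_vocabulary"].strip().upper().replace("-", "")
    code = record["condition_source_value"].strip()
    return concepts.get((vocabulary, code))


condition = {
    "condition_source_value": "162.9",
    "condition_source_value_vocabulary": "ICD9",
}
condition_concept_id = map_condition_concept(condition, concept_map)
```

Unmapped codes return None here so that they can be reported back to the data partner rather than silently loaded.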


    4.8.1 Example of CONDITION OCCURRENCE source / destination data

    Condition Occurrence Table - ETL

    condition_occurrence_source_identifier 30003400

    x_data_source_type EHR

    person_source_value 393030

    condition_source_value 162.9

    condition_source_value_vocabulary ICD9

    x_condition_source_desc Malignant Neop

    condition_start_date 4/19/2011

    x_condition_update_date 10/19/2011

    condition_end_date

    condition_type_source_value Chief Complaint

    stop_reason

    associated_provider_source_value 392904

    x_visit_occurrence_source_identifier 403030

    ETL View Grid View

    Condition Occurrence Table - Grid

    condition_occurrence_id 8349393

    x_data_source_type EHR

    person_id 94849303

    condition_concept_id 884934

    condition_source_value 162.9

    x_condition_source_desc Malignant Neop

    condition_start_date 4/19/2011

    x_condition_update_date 10/19/2011

    condition_end_date

    condition_type_concept_id 499404

    stop_reason

    associated_provider_id 39304

    visit_occurrence_id 90493023

    x_grid_node_id 1

    Green – Brought forward into grid model / Red – Removed in processing

    Green – Brought forward from ETL / Yellow – Generated from ETL field / Red – Generated locally or from multiple ETL fields


    4.9 Table Name: PROCEDURE_OCCURRENCE

    The Procedure Occurrence table contains a record for each procedure. The type of procedure as well as the associated person and visit are

    recorded.

    The field mapping is performed as follows:

    Destination Field Data Type Source Field Applied Rule Comment

    procedure_occurrence_source_identifier String (50) / Required Source Procedure Primary Key. Used to create the procedure_occurrence_id field on the grid facing record.

    x_data_source_type String (20) /

    Required

    Data Source Identifier (EHR / CDW / Medicaid)

    person_source_value String (50) /

    Required

    Person unique identifier at the source (MRN). This

    value will be matched against the Person table to

    obtain the corresponding person_id.

    procedure_source_value String (50) /

    Required

    The Procedure Code as captured from the source

    data. Values include CPT-4, ICD-9-CM (Procedure),

    HCPCS, and other procedure codes. Used to create

    procedure_concept_id.

    procedure_source_value_vocabulary String (50) / Required Type of code (e.g. CPT) used for procedure.

    procedure_date DATE / Required The date on which the procedure began (or was

    performed)

    procedure_type_source_value String (50) The procedure type as stored in source. If the procedure type is not defined in the source data, refer to the Procedure Occurrence section of the Concept ID Table. Used to create procedure_type_concept_id.

    provider_record_source_value String (50) Local reference value for Provider. This value will be matched against the Provider table to obtain the corresponding provider_id.

    x_visit_occurrence_source_identifier String (50) Local reference value for visit. This value will be matched against the Visit Occurrence table to obtain the corresponding visit_occurrence_id.

    relevant_condition_source_value String (50) First Associated Diagnosis Code. Used to create relevant_condition_concept_id.


    4.9.1 Example of PROCEDURE OCCURRENCE source / destination data

    Procedure Occurrence Table - ETL

    procedure_occurrence_source_identifier 9848493

    x_data_source_type EHR

    person_source_value 594928

    procedure_source_value 49750

    procedure_source_value_vocabulary CPT

    procedure_date 4/19/2011

    procedure_type_source_value Inpatient header

    provider_record_source_value 23902023

    x_visit_occurrence_source_identifier 2302320

    relevant_condition_source_value 20230

    ETL View Grid View

    Procedure Occurrence Table - Grid

    procedure_occurrence_id 393948230

    x_data_source_type EHR

    person_id 3493030

    procedure_concept_id 39949023

    procedure_source_value 49750

    procedure_date 4/19/2011

    procedure_type_concept_id 884934

    associated_provider_id 34040222

    visit_occurrence_id 20923042

    relevant_condition_concept_id 23032009

    x_grid_node_id 1

    Green – Brought forward into grid model / Red – Removed in processing

    Green – Brought forward from ETL / Yellow – Generated from ETL field / Red – Generated locally or from multiple ETL fields


    4.10 Table Name: OBSERVATION

    The Observation table contains records for labs and for measurements such as height and weight. It is also where information from Past Medical History, Past Surgical History, Allergy, and Social/Personal History is stored.

    The field mapping is performed as follows:

    Destination Field Data Type Source Field Applied Rule Comment

    observation_source_identifier String (50) / Required Source Primary Key for Observation Record. Used to create the obs_occurrence_id field on the grid facing record.

    x_data_source_type String (20) /

    Required

    Data Source Identifier (EHR / CDW / Medicaid)

    person_source_value String (50) /

    Required

    Person unique identifier at the source (MRN). This

    value will be matched against the Person table to

    obtain the corresponding person_id.

    observation_source_value String (50) / Required The Observation Code as it appears in the source data. Used to create obs_concept_id.

    observation_source_value_vocabulary String (50) / Required Vocabulary used for the observation.

    observation_date Date / Required The date of the Observation

    observation_time Time The time of the observation

    value_as_number NUMBER(14,3) The observation result stored as a numeric value.

    This is applicable to observations where the result is

    expressed as a numeric value.

    value_as_string String (60) The observation result stored as character string. It

    is applicable to the observations where the result is

    expressed as a character string. Used to create

    obs_value_as_concept_id.

    unit_source_value String (50) Unit of measure for Observation result when

    measured as a numeric value. Used to create

    unit_concept_id

    range_low NUMBER(14,3) The lower limit of the numeric range of the

    Observation value. It is not applicable if the

    observation results are non-numeric or categorical,

    and must be in the same units of measure as the

    observation value


    range_high NUMBER(14,3) The upper limit of the numeric range of the

    Observation value. It is not applicable if the

    observation results are non-numeric or categorical,

    and must be in the same units of measure as the

    observation value

    observation_type_source_value String (50) / Required Type of observation (e.g. PRO, Lab, History of, Social History, Allergies). If the observation type is not defined in the source data, refer to the Observation section of the Concept ID Table. Used to create observation_type_concept_id.

    associated_provider_source_value String (50) Provider ID from the source. This value will be matched against the Provider table to obtain the corresponding provider_id.

    x_visit_occurrence_source_identifier String (50) Local reference value for visit. This value will be matched against the Visit Occurrence table to obtain the corresponding visit_occurrence_id.

    relevant_condition_source_value String (50) First Associated Diagnosis Code. Used to create relevant_condition_concept_id.

    x_obs_comment String (500) Contains Result Comments – do not use this field

    for now 
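The rules for the result and range fields above (ranges apply only to numeric results, and the low limit should not exceed the high limit) can be checked before load. A minimal sketch with hypothetical function names; treating a missing limit as "no bound" is an assumption, not something the specification states.

```python
def check_observation(obs: dict) -> list:
    """Return a list of inconsistencies in the result and range fields."""
    problems = []
    number = obs.get("value_as_number")
    low, high = obs.get("range_low"), obs.get("range_high")
    if number is None and (low is not None or high is not None):
        problems.append("range given for a non-numeric result")
    if low is not None and high is not None and low > high:
        problems.append("range_low exceeds range_high")
    return problems


def within_range(obs: dict) -> bool:
    """True if the numeric result lies inside the stated range (missing bound = open)."""
    number = obs["value_as_number"]
    low, high = obs.get("range_low"), obs.get("range_high")
    return (low is None or number >= low) and (high is None or number <= high)


observation = {
    "observation_source_value": "BP_Systolic",
    "value_as_number": 148,
    "unit_source_value": "mmHg",
    "range_low": 50,
    "range_high": 200,
}
issues = check_observation(observation)
in_range = within_range(observation)
```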


    4.10.1 Example of OBSERVATION source / destination data

    Observation Table - ETL

    observation_source_identifier 40230320

    x_data_source_type EHR

    person_source_value 20202302

    observation_source_value BP_Systolic

    observation_source_value_vocabulary University Lab

    observation_date 7/12/2011

    observation_time 4:53:00 PM

    value_as_number 148

    value_as_string

    unit_source_value mmHg

    range_low 50

    range_high 200

    observation_type_source_value Lab Value

    associated_provider_source_value 930392

    x_visit_occurrence_source_identifier 2020200

    relevant_condition_source_value 401.2

    x_obs_comment

    ETL View Grid View

    Observation Table - Grid

    observation_id 23902323

    x_data_source_type EHR

    person_id 3903030

    observation_concept_id 102190

    observation_source_value 8393929

    observation_date 7/12/2011

    observation_time 4:53:00 PM

    value_as_number 148

    value_as_string

    value_as_concept_id

    unit_concept_id 020333

    unit_source_value mmHg

    range_low 50

    range_high 200

    observation_type_concept_id 2032002

    associated_provider_id 939393

    visit_occurrence_id 2002303

    relevant_condition_concept_id 302023

    x_obs_comment

    x_grid_node_id 1

    Green – Brought forward into grid model / Red – Removed in processing

    Green – Brought forward from ETL / Yellow – Generated from ETL field / Red – Generated locally or from multiple ETL fields


     Appendix A: Table Specific Rules

    Person Table

    o  Recordset should consist of all information (including inpatient and outpatient visits) about any patients with activity (outpatient visits) at a participating primary care site within the past 5 years (back to 1/1/2007 for initial SAFTINet load)

    o  For any patient seen within the past 5 years we request data retrospectively as described below.


     Appendix B: Row filters

    This section details the types of data that will go into each table. For each table, the rightmost columns list the general data domains (e.g. Lab values) along with the specific concepts (e.g. Blood Pressure) within each domain that should be gathered for the table. When a date is listed with a concept, please gather all records after that date. For most concepts, this will mean gathering the last 5 years of data (2007-2012), though some concepts, such as colonoscopy and pneumovax, go back further.
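The per-concept minimum date rule described above can be sketched as a small row filter. The dictionary holds two entries taken from the table in this appendix; the function name is hypothetical, concepts absent from the table are assumed to be out of scope, and the minimum date itself is treated as included.

```python
from datetime import date

# Concept -> minimum date; None stands for "All Records / No Date Limit".
MINIMUM_DATES = {
    "Blood Pressure - Systolic": date(2007, 1, 1),
    "Smoking Status": None,
}


def passes_row_filter(concept: str, record_date: date, minimum_dates: dict) -> bool:
    """True if a record for this concept should be included in the extract."""
    if concept not in minimum_dates:
        return False  # assumption: unlisted concepts are not collected
    minimum = minimum_dates[concept]
    return minimum is None or record_date >= minimum


keep = passes_row_filter("Blood Pressure - Systolic", date(2011, 7, 12), MINIMUM_DATES)
```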

    Organization One record per grouping of care sites operating under a single health care hierarchy 

    Care Site Include a record for any location where care is provided (examples include clinics, mobile units and "home-health care"). Multiple separate care sites in a single building could be grouped together or not, depending on the partner's preference.

    Provider Include a record for every provider who appears in the "provider" table OR the subset of the table that can be linked to a claim, a visit, or a prescription, whichever is easiest. If filtering, include all providers who have been active since 1/1/2007 even if not currently active.

    Person Include a record for each person who has had some sort of contact with the participating clinics since 1/1/2007 (regardless of current activity status). This set of persons can be used to filter the rest of the clinical data: only pull data related to this set of patients.
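Using the person set to filter the rest of the clinical data can be sketched in two steps: build the set of persons with any contact on or after the cutoff, then keep only clinical records that refer to one of them. The function names, the contact_date key, and the sample records are hypothetical.

```python
from datetime import date


def select_cohort(contacts: list, cutoff: date = date(2007, 1, 1)) -> set:
    """Return person_source_value for every person with contact on or after cutoff."""
    return {c["person_source_value"] for c in contacts if c["contact_date"] >= cutoff}


def filter_to_cohort(records: list, cohort: set) -> list:
    """Keep only clinical records that belong to a person in the cohort."""
    return [r for r in records if r["person_source_value"] in cohort]


contacts = [
    {"person_source_value": "A1", "contact_date": date(2010, 3, 4)},
    {"person_source_value": "B2", "contact_date": date(2005, 6, 1)},
]
observations = [
    {"person_source_value": "A1", "observation_source_value": "SBP"},
    {"person_source_value": "B2", "observation_source_value": "SBP"},
]
cohort = select_cohort(contacts)
kept = filter_to_cohort(observations, cohort)
```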


    For the following four tables, we wish to collect the specified record types. Please check the ‘Collected?’ column for

    any record types that will be included in the source data file. Also, please list the local source value for that type.

    Example: If the local tag for Systolic BP that will go into the observation_source_value field is ‘SBP’, put that in the

    local name column where systolic BP is listed.  

    Record Type Minimum Date Result Type Collected? Local Name

    Drug Exposure Include a record for each prescription / fill / drug administration.

    Prescription 

    Medication List 

    Administered Drugs 

    Fulfillment 

    Condition Occurrence Include a record for each entry on the problem list as well as a record for each encounter level diagnosis code. Generally, these will be ICD-9 codes.

    Problem list 

    Visit-level diagnosis codes 

    ICD-9 codes from claims record 

    Observation Data that do not fit in another table belong here. The Observation table contains data from the following categories: lab observations (i.e. test results), general clinical findings, signs, and symptoms, along with other domains listed below.

    Vital Signs 

    Height  1/1/2007 

    Height Percentile (for children)  1/1/2007 

    Weight  1/1/2007

    Weight Percentile (for children)  1/1/2007 

    Pulse oximetry  1/1/2007 

    Pulse  1/1/2007 

    Blood Pressure - Systolic  1/1/2007 

    Blood Pressure - Diastolic  1/1/2007 

    Social History 

    Smoking Status (Current/Past/Former/Second Hand Exposure)  All Records / No Date Limit

    Drinking Status  All Records / No Date Limit

    Past Medical History (To be defined) 


    Past Surgical History (To be defined) 

    Lab Results

    Cholesterol  1/1/2007 

    LDL  1/1/2007 

    Alanine transaminase  1/1/2007 

    Albumin  1/1/2007 

    Alkaline Phosphatase  1/1/2007 

    Aspartate aminotransferase  1/1/2007 

    Bilirubin (Total, Indirect and Direct)  1/1/2007

    Blood Urea Nitrogen_Serum  1/1/2007

    Calcium-Serum  1/1/2007

    CBC_% lymphocytes  1/1/2007

    CBC_% Neutrophils  1/1/2007 

    CBC_White Blood Cell Count  1/1/2007 

    Chlamydia trachomatis DNA assay (procedure)  1/1/2007

    Chol HDL  1/1/2007 

    Chol_LDL, calculated  1/1/2007 

    Chol_LDL, measured directly  1/1/2007 

    Chol_Total 1/1/2007 

    Creatinine_Serum  1/1/2007 

    Free T4  1/1/2007 

    Glucose, Fasting_Serum  1/1/2007 

    Glucose, Random_Serum  1/1/2007 

    Glucose_Serum  1/1/2007

    Hemoglobin A1c  1/1/2007

    Hemoglobin_Serum  1/1/2007 

    Hepatitis B core antibody  1/1/2007 

    Hepatitis B e antibody  1/1/2007 

    Hepatitis B e antigen  1/1/2007 

    Hepatitis B surface antibody  1/1/2007 

    Hepatitis B surface antigen  1/1/2007 

    Hepatitis C antibody  1/1/2007 

    Hepatitis C antigen  1/1/2007 

    INR  1/1/2007 

    Platelet Count  1/1/2007 

    Potassium  1/1/2007 


    Prostate specific antigen measurement (procedure)  1/1/2007 

    Pulmonary Function Test 1/1/2007

    Sodium  1/1/2007 

    Triglycerides  1/1/2007 

    TSH  1/1/2007 

    Urinary Protein  1/1/2007 

    Urine microalbumin/creatinine ratio measurement (procedure)  1/1/2007 

    Urine protein/creatinine ratio measurement (procedure)  1/1/2007 

    Urine_Microalbuminuria measurement (procedure)  1/1/2007 

    Urine_Protein measurement (procedure)  1/1/2007 

    Creatinine_phosphokinase  1/1/2007 

    GFR, estimated  1/1/2007 

    influenza assay  1/1/2007 

    influenza rapid assay (poct)  1/1/2007 

    pertussis test  1/1/2007 

    respiratory syncytial test  1/1/2007 

    FEV1, pre, number  1/1/2007 

    FEV1, pre, percent  1/1/2007 

    FEV1, post, number  1/1/2007 

    FEV1, post, percent  1/1/2007 

    FVC, pre, number  1/1/2007 

    FVC, pre, percent  1/1/2007 

    FVC, post, number  1/1/2007 

    FVC, post, percent  1/1/2007 

    PFT: Peak expiratory flow  1/1/2007 

    Allergies 

    Family History

    Family History of CVD 1/1/2007

    Patient Reported Outcomes

    Medication Adherence Survey 

    MAS 1a 1/1/2007  Yes/No

    MAS 1b 1/1/2007  Yes/No 

    MAS 1c 1/1/2007  Yes/No 

    MAS 1d 1/1/2007  Yes/No 


    PHQ-9 Q7 score  1/1/2007 

    PHQ-9 Q8 score  1/1/2007 

    PHQ-9 Q9 score  1/1/2007 

    PHQ-9 Total score  1/1/2007 

    Demographic Information 

    Highest Education Level Achieved  All Records / No Date Limit 

    Language Preference  All Records / No Date Limit 

    Imputed Race / Ethnicity  All Records / No Date Limit 

    Person % Fed Poverty level  1/1/2007 

    Person family size  1/1/2007 

    Family income  1/1/2007 

    Person relationship status  1/1/2007 

    Person Practice Status (active or moved or gone elsewhere)  Most Recent / No Date Limit 

    Procedure Occurrence  Include a record for each procedure performed on a patient (CPT-4, ICD-9-CM (Procedures), and HCPCS codes). If you want to filter the procedure table, at least include the following procedures: 

    Procedures

    Bone mineral density (DEXA scan)  1/1/2007 

    Colonoscopy 

    Diabetic Eye Exam  1/1/2007 

    Diabetic Foot Exam  1/1/2007 

    Double contrast barium enema  1/1/2007 

    Mammogram  1/1/2007 

    Pap Smear  1/1/2007 

    Pulmonary Function Test  1/1/2007 

    Spirometry  1/1/2007 

    Mechanical Ventilation 1/1/2007

    Continuous nebulized therapy 1/1/2007

    Endotracheal intubation 1/1/2007

    Critical Care 1/1/2007

    Fecal occult blood test  1/1/2007 

    Immunizations


    Pneumovax 

    Other Immunizations  1/1/2007 

    Education

    Education Nutrition  1/1/2007 

    Education Weight loss management  1/1/2007 

    1.  ACT and C-ACT categories should be one of the following:

    1 = ACT in control (Total score > 19)

    2 = ACT poorly controlled (Total score 16-19)

    3 = ACT very poorly controlled (Total score ≤ 15)
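    The category rule above can be sketched in Python as follows. This is an illustrative helper only, not part of ROSITA; the function name act_category is hypothetical, and the sketch assumes that category 3 covers every total score below 16, so a score of exactly 15 falls in category 3.

```python
def act_category(total_score: int) -> int:
    """Map an ACT or C-ACT total score to the category codes 1-3.

    Assumption (not stated explicitly in the specification): category 3
    covers every total score below 16, so a score of 15 maps to 3.
    """
    if total_score > 19:
        return 1  # in control
    if total_score >= 16:
        return 2  # poorly controlled (16-19)
    return 3      # very poorly controlled


if __name__ == "__main__":
    for score in (22, 17, 12):
        print(score, act_category(score))
```

    The sketch does not check that the score lies inside the valid range of the questionnaire; a real ETL step would probably reject out-of-range values before categorizing them.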


     Appendix C: Sending data using flatfiles

    Some users may wish to send their data in a standard flatfile as opposed to the current XML. ROSITA is being modified to handle such files. The basic file should be a .txt style text file with columns arranged in the order listed in this document. Individual column values should be separated by a pipe ‘|’ character. A total of 9 files should be loaded in the initial round, one for each table in Sections 4.2-4.10. The files will be processed in the same fashion as the current XML files (see the ROSITA Admin Guide for further details).

    Example: 1 row from a sample Organization file

    This record (from Section 4.2):

    Organization Table - XML

    organization_source_value1  UC Internal Medicine

    x_data_source_type EHR

    place_of_service_source_value Academic Practice

    organization_address_1 13199 E Montview Blvd

    organization_address_2 Suite 300, Mail Stop F443

    organization_city Aurora

    organization_state CO

    organization_zip 80045

    organization_county Arapahoe

    Should be represented as follows in the file (the actual text should be all on one line):

    UC Internal Medicine|EHR|Academic Practice|13199 E Montview Blvd|Suite 300, Mail Stop F443|Aurora|CO|80045|Arapahoe
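    As a rough illustration, the following Python sketch builds the pipe-delimited line for the example record above. The column order is taken from the example; the names ORGANIZATION_COLUMNS and to_flatfile_row are hypothetical, and writing a missing column as an empty field is an assumption, since the specification does not say how missing values should be represented.

```python
# Column order assumed from the Organization example record (Section 4.2).
ORGANIZATION_COLUMNS = [
    "organization_source_value",
    "x_data_source_type",
    "place_of_service_source_value",
    "organization_address_1",
    "organization_address_2",
    "organization_city",
    "organization_state",
    "organization_zip",
    "organization_county",
]


def to_flatfile_row(record: dict, columns: list) -> str:
    """Join a record's values in column order using the pipe delimiter.

    Assumption: a column missing from the record becomes an empty field.
    """
    return "|".join(str(record.get(name, "")) for name in columns)


record = {
    "organization_source_value": "UC Internal Medicine",
    "x_data_source_type": "EHR",
    "place_of_service_source_value": "Academic Practice",
    "organization_address_1": "13199 E Montview Blvd",
    "organization_address_2": "Suite 300, Mail Stop F443",
    "organization_city": "Aurora",
    "organization_state": "CO",
    "organization_zip": "80045",
    "organization_county": "Arapahoe",
}

row = to_flatfile_row(record, ORGANIZATION_COLUMNS)
print(row)
```

    Keeping the column list in one place makes it harder for the order of values in the file to drift from the order given in the table definitions.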

    Users should apply the following rules when generating flatfiles:

    -  Send a separate file for each data table

    -  Files should be named using the following convention [table name].txt

    -  Column values should be separated with the | character used as a delimiter

    -  Files should contain one record per row. No header row is needed; the first row should be actual data

    -  Quotation marks occurring within column values should be ‘escaped’ so the processor can locate them. This should be done with the \ character; the end result should look like \" 


    -  Backslash marks occurring within column values should also be ‘escaped’ with a second backslash; the end result should look like \\

    -  Datetime values should use the format YYYY-MM-DDThh:mm:ssZ, as in 2012-01-09T12:00:00Z (for example, 4:15:00 PM on 2012-01-09 is written 2012-01-09T16:15:00Z), and dates should use the format YYYY-MM-DD
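    The escaping and date rules above could be applied along the lines of the following Python sketch. It is not ROSITA code: the function names are hypothetical, only double quotation marks are escaped, datetime values are assumed to be in UTC already, and missing values are written as empty fields. The rules do not say how a | character inside a column value should be handled, so the sketch does not attempt to handle it.

```python
from datetime import date, datetime


def escape_value(text: str) -> str:
    """Escape backslashes first, then double quotation marks."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def format_value(value) -> str:
    """Render one column value as flatfile text."""
    if value is None:
        return ""  # assumption: missing values become empty fields
    # datetime is a subclass of date, so it must be checked first.
    if isinstance(value, datetime):
        # Assumption: the value is already expressed in UTC.
        return value.strftime("%Y-%m-%dT%H:%M:%SZ")
    if isinstance(value, date):
        return value.strftime("%Y-%m-%d")
    return escape_value(str(value))


def build_line(values) -> str:
    """Format every value and join the results with the pipe delimiter."""
    return "|".join(format_value(v) for v in values)


line = build_line([
    'Said "hello"',
    "C:\\data",
    datetime(2012, 1, 9, 16, 15, 0),
    date(2012, 1, 9),
])
print(line)
```

    Backslashes are escaped before quotation marks so that the backslash added in front of each quotation mark is not doubled by the second replacement.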