Most Common Issues in Define.xml files NJ CDISC User Group Sergiy Sirichenko October 21, 2015


Page 1

Most Common Issues in Define.xml files
NJ CDISC User Group

Sergiy Sirichenko
October 21, 2015

Page 2

Abbreviations
› CT – CDISC Controlled Terminology
› VLM – Value Level Metadata

Page 3

Major problems in Define.xml
› Usage of outdated Define.xml v1.0
› Inconsistency in metadata
› Missing study-specific metadata
› Lack of expertise

Page 4

Outdated Define.xml v1.0 is still used
› Define.xml v1.0 has many limitations in the standard itself
› “The first” versions are never perfect
› Define.xml v1.0 is 11 years old
  › Is anybody still using SDTM IG 3.1.1?
› Define.xml v2.0 is robust enough to handle current submission needs
  › A separate presentation or webinar will be dedicated to this topic

Page 5

Lack of structural consistency in v1.0
› Structural consistency of metadata in Define.xml v2.0 helps prevent errors
› Example: the variable's Source value determines which other attributes are expected
  › “CRF” -> Pages are expected
  › “Derived” -> Computational Algorithm is expected
› Define.xml v1.0 allows entering CRF pages for derived variables, leaving expected attributes missing, etc.
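The dependency between Source and the other attributes can be checked mechanically. The following is a minimal sketch, assuming variable metadata has already been read into plain dictionaries; the keys `origin`, `pages` and `method` are hypothetical names chosen for illustration, not Define.xml attribute names.

```python
def check_origin_consistency(variable):
    """Return a list of problems for one variable's metadata (a plain dict)."""
    problems = []
    origin = variable.get("origin")
    pages = variable.get("pages") or []   # hypothetical key: CRF page numbers
    method = variable.get("method")       # hypothetical key: computational algorithm

    if origin == "CRF" and not pages:
        problems.append("Origin is CRF but no CRF pages are given")
    if origin == "Derived":
        if not method:
            problems.append("Origin is Derived but no computational algorithm is given")
        if pages:
            problems.append("Origin is Derived but CRF pages are given")
    return problems


variables = [
    {"name": "AETERM", "origin": "CRF", "pages": [12, 41]},
    {"name": "AESTDY", "origin": "Derived", "pages": [12]},
]
for v in variables:
    for problem in check_origin_consistency(v):
        print(v["name"], "-", problem)
```

Only the two rules named on the slide are encoded here; a real check would cover the remaining Source values as well.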

Page 6

Limited and confusing VLM in v1.0
› In v1.0, Value Level Metadata does not provide a reference to the variable it applies to
› Cannot handle multiple conditions
  › A confusing and complex hierarchical VLM structure is used instead
  › Example:
    › LB domain has VLM assigned to LBCAT
    › LBCAT has VLM for LBSPEC, LBSPEC -> LBMETHOD, etc.
    › Properties of the LBORRES (or other?) variable are described at some point of this tree structure
› V2.0 has an explicit single expression with multiple conditions assigned to a particular variable
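To make the v2.0 idea concrete, here is a minimal sketch of value-level metadata modelled as one target variable plus a flat list of conditions that must all hold. The class and field names are hypothetical and do not reproduce the Define.xml element names; the test code, specimen, data type and length are made-up sample values.

```python
from dataclasses import dataclass


@dataclass
class Condition:
    variable: str   # variable tested by the condition, e.g. "LBSPEC"
    values: tuple   # values that satisfy the condition

    def matches(self, record):
        return record.get(self.variable) in self.values


@dataclass
class ValueLevelItem:
    target: str       # variable the metadata applies to, e.g. "LBORRES"
    conditions: list  # all conditions must hold (logical AND)
    data_type: str
    length: int

    def applies_to(self, record):
        return all(c.matches(record) for c in self.conditions)


# One explicit expression with two conditions, attached directly to LBORRES.
glucose_in_serum = ValueLevelItem(
    target="LBORRES",
    conditions=[
        Condition("LBTESTCD", ("GLUC",)),
        Condition("LBSPEC", ("SERUM",)),
    ],
    data_type="float",
    length=8,
)

record = {"LBTESTCD": "GLUC", "LBSPEC": "SERUM", "LBORRES": "5.4"}
print(glucose_in_serum.applies_to(record))  # True
```

Compared with the v1.0 tree (LBCAT -> LBSPEC -> LBMETHOD), the target variable is stated once and the conditions are evaluated together rather than by walking a hierarchy.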

Page 7

Some sponsors try to mimic v2.0
› To use functionality of v2.0
› Example:
  › V1.0 does not have attributes for NCI Codes
  › Sponsor added NCI Codes as a part of the Decode value
  › V2.0:

    Permitted Value (Code)
    mmol/L [C64387]
    ng/mL [*]

  › V1.0:

    Code Value   Code Text
    mmol/L       mmol/L [C64387]
    ng/mL        ng/mL [*]

  › It’s invalid usage of the v1.0 standard!
  › Why not switch to v2.0 instead?

Page 8

Some sponsors use a custom stylesheet
› Often done to mimic the functionality of v2.0
› Regulatory reviewers like consistency, so please use the standard stylesheet provided by CDISC

Page 9

Non-relevant metadata
› Variable Role is used for standards development, but does not add any value to study metadata
› Example:
  › STUDYID and USUBJID can only be “Identifier”
  › Does anyone actually use this info?
  › The Define.xml 2.0 stylesheet doesn’t display it

Page 10

Order of datasets and variables
› Alphabetical
  › Example: AE, CM, DM, …
  › Correct: logical order as defined by the standard: by Class, then by domain name
› Random
  › Example:

    Order #  Variable  Label
    1        AECAT     Category for Adverse Event
    2        AEDECOD   Dictionary-Derived Term
    3        AEGRPID   Group ID
    4        AESEQ     Sequence Number
    5        AETERM    Reported Term for the Adverse Event
    6        DOMAIN    Domain Abbreviation
    7        STUDYID   Study Identifier
    8        USUBJID   Unique Subject Identifier
    9        AEBODSYS  Body System or Organ Class
    10       AEOUT     Outcome of Adverse Event
    …

  › Correct: as variables are present in the dataset
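Both ordering rules are easy to apply programmatically. The sketch below is illustrative only: `CLASS_ORDER` is an assumed class sequence that should be confirmed against the standard version in use, and the dictionary keys are hypothetical.

```python
# Assumed class sequence for illustration; confirm against the standard in use.
CLASS_ORDER = [
    "TRIAL DESIGN", "SPECIAL PURPOSE", "INTERVENTIONS",
    "EVENTS", "FINDINGS", "FINDINGS ABOUT", "RELATIONSHIP",
]
CLASS_RANK = {name: i for i, name in enumerate(CLASS_ORDER)}


def order_datasets(datasets):
    """Sort datasets by class, then by domain name; unknown classes go last."""
    return sorted(
        datasets,
        key=lambda d: (CLASS_RANK.get(d["class"], len(CLASS_RANK)), d["name"]),
    )


def order_variables(define_variables, dataset_columns):
    """Sort variable names to match their column position in the dataset."""
    position = {name: i for i, name in enumerate(dataset_columns)}
    return sorted(define_variables, key=lambda name: position.get(name, len(position)))


datasets = [
    {"name": "AE", "class": "EVENTS"},
    {"name": "CM", "class": "INTERVENTIONS"},
    {"name": "DM", "class": "SPECIAL PURPOSE"},
]
print([d["name"] for d in order_datasets(datasets)])

alphabetical = ["AECAT", "AEDECOD", "AESEQ", "AETERM", "DOMAIN", "STUDYID", "USUBJID"]
columns = ["STUDYID", "DOMAIN", "USUBJID", "AESEQ", "AETERM", "AEDECOD", "AECAT"]
print(order_variables(alphabetical, columns))
```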

Page 11

Missing or invalid Origin
› No references to CRF pages
  › Example: Origin=“CRF”, instead of “CRF Page 12, 41, 57”
› Inconsistencies in Origin/Comments
  › Example:
    › RFSTDTC has Origin = “CRF”
    › No annotations on CRF (as expected)
    › Comments: “First dose of study medication” (it looks like a Derived variable)
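A quick way to find Origins that lack page references is to try to parse the page numbers out of the Origin text. This is a minimal sketch that assumes the Origin is written in the form used in the example above (“CRF Page 12, 41, 57”); other spellings would need their own pattern.

```python
import re


def crf_pages(origin):
    """Return page numbers from an Origin such as 'CRF Page 12, 41, 57'.

    An empty list means no page reference could be found.
    """
    match = re.fullmatch(r"CRF Pages? (\d+(?:, ?\d+)*)", origin.strip())
    if match is None:
        return []
    return [int(page) for page in match.group(1).split(",")]


for origin in ["CRF Page 12, 41, 57", "CRF"]:
    pages = crf_pages(origin)
    print(repr(origin), "->", pages if pages else "no CRF pages referenced")
```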

Page 12

Missing or invalid Derivations
› Example 1:
  › AGE: “Calculation: = Min DOV - BRTHDTC in AGEU”
  › What is DOV? How can I use a Character value (BRTHDTC) in an arithmetical formula? How were missing or partially missing dates handled?
  › Derivations should be provided in terms of available data
› Example 2:
  › “ZX021_AE_DURATION”
  › ???
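For contrast, a derivation stated in terms of available data answers those questions explicitly. The sketch below is one possible rule, not the sponsor's actual algorithm: it assumes AGE is the number of whole years from BRTHDTC to RFSTDTC and that AGE is left missing when either date is incomplete. Both assumptions would have to be written into the define.xml derivation text.

```python
import re
from datetime import date


def parse_complete_date(dtc):
    """Convert the date part of an ISO 8601 character value to a date.

    Returns None for missing or partial values such as '1980' or '1980-05'.
    """
    if not dtc or not re.fullmatch(r"\d{4}-\d{2}-\d{2}", dtc[:10]):
        return None
    try:
        return date.fromisoformat(dtc[:10])
    except ValueError:  # e.g. month 13
        return None


def derive_age_years(brthdtc, rfstdtc):
    """AGE in whole years at RFSTDTC (assumed reference date), or None."""
    birth = parse_complete_date(brthdtc)
    reference = parse_complete_date(rfstdtc)
    if birth is None or reference is None:
        return None
    had_birthday = (reference.month, reference.day) >= (birth.month, birth.day)
    return reference.year - birth.year - (0 if had_birthday else 1)


print(derive_age_years("1980-05-15", "2015-10-21T08:30"))  # 35
print(derive_age_years("1980", "2015-10-21"))              # None
```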

Page 13

Invalid Value Level Metadata
› VLM should be described at the same level of detail as regular variables:
  › Codelist, DataType, Length, Origin, Derivation, etc.
  › A common issue is missing or invalid metadata for Value Level
› Consider VLM as new variables with properties independent from the “host” variable
  › Example: Treatment Emergent Flag in SUPPAE has length=1, not 200 as for the QVAL variable
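The actual length of each value-level item can be taken from the data rather than inherited from QVAL. A minimal sketch, assuming the SUPPAE records are available as dictionaries; `AEOTHSP` and its value are hypothetical sample data.

```python
def qval_lengths(supp_records):
    """Return the longest QVAL observed for each QNAM."""
    lengths = {}
    for record in supp_records:
        qnam = record["QNAM"]
        lengths[qnam] = max(lengths.get(qnam, 0), len(record["QVAL"]))
    return lengths


suppae = [
    {"QNAM": "AETRTEM", "QVAL": "Y"},
    {"QNAM": "AETRTEM", "QVAL": "N"},
    {"QNAM": "AEOTHSP", "QVAL": "Hospitalised overnight"},  # hypothetical item
]
for qnam, length in qval_lengths(suppae).items():
    print(qnam, length)
```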

Page 14

Duplicate records
› Code List
  › Term
› Variables
  › Order Number
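Duplicates of either kind can be found with a simple count. A minimal sketch, assuming the codelist terms or the variables' order numbers have been collected into a list:

```python
from collections import Counter


def duplicates(values):
    """Return the values that occur more than once, sorted."""
    return sorted(value for value, count in Counter(values).items() if count > 1)


print(duplicates(["mg", "mL", "mg", "g"]))  # duplicate codelist term
print(duplicates([1, 2, 3, 3, 4]))          # duplicate order number
```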

Page 15

External dictionaries
› Info on external dictionaries (MedDRA, WHODrug) is not provided correctly
  › As comments to variable (not machine-readable)
› ISO8601 is defined as an External Dictionary
  › It’s a data format associated with all date, datetime, etc. variables. No specific reference to ISO8601 is needed if Data Type is defined correctly

Page 16

Missing study-specific metadata
› Study-specific information is crucial for reviewers
› However, it is missing in most submission packages
› The value of define.xml, SDRG, and aCRFs is to explain what is unique in this particular study

Page 17

Missing Codelists
› Codelists are limited to variables which are assigned to standard CT
› Study-specific Codelists are commonly missing for variables such as:
  › Category (--CAT), Subcategory (--SCAT)
  › EXTRT, ARMCD, --TESTCD/--TEST, QNAM, TPT
  › RDOMAIN in CO and RELREC domains
  › XXTOX, …

Page 18

Merged Codelists
› Due to confusion between Standard CT Codelist and study Variable Codelist
› Example:
  › Define.xml has one codelist (UNIT) assigned to all --DOSU, --VAMTU, --ORRESU, --STRESU variables
  › This codelist includes all unique terms across all study “units” variables and has 450 items, while, for example, the EXDOSU variable is populated with only one term, “mg”
  › A reference to a 450-term codelist is not relevant
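A merged codelist tends to show up as a large gap between the terms it lists and the terms a given variable actually uses. The sketch below reports that gap per variable; the function name and sample values are hypothetical, and a large count of unused terms is only a prompt for review, since a codelist may legitimately contain options that were offered but never collected.

```python
def codelist_usage(codelist_terms, values_by_variable):
    """For each variable, list the codelist terms present in its data
    and count the codelist terms it never uses."""
    terms = set(codelist_terms)
    report = {}
    for variable, values in values_by_variable.items():
        used = terms & set(values)
        report[variable] = {"used": sorted(used), "unused_count": len(terms - used)}
    return report


unit_codelist = ["mg", "mL", "mmol/L", "ng/mL", "beats/min"]
data = {
    "EXDOSU": ["mg", "mg", "mg"],
    "LBSTRESU": ["mmol/L", "ng/mL"],
}
for variable, usage in codelist_usage(unit_codelist, data).items():
    print(variable, usage)
```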

Page 19

What is define.xml Codelist?
› A Define.xml Codelist describes the data collection process and should be limited to all terms used for data collection of a specific data element (a particular Variable or Value Level)
  › For example, LBSTRESU, EGORRESU, EXDOSU usually have separate Codelists based on the same (UNIT) standard CT
› If data is collected as free text, then a Codelist may not be applicable
  › Common examples are CMDOSU, CMDOSFRQ, CELOC, etc.

Page 20

Missing terms in Codelist
› Term is present in data
  › SD0037 check
  › Programming error
    › Due to misspelling, leading space characters, etc.
    › Due to missing Decoded value for some items
      › CodeList vs. EnumeratedItem
› Codelist was populated based on collected data, but some options from the CRF page are not included
  › Example: Only race “WHITE” is collected, while 6 options are present on the CRF
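Both directions of the problem can be checked with set comparisons. This is a minimal sketch, not the SD0037 rule itself: the first function lists data values absent from the codelist and suggests a codelist term that differs only by case or surrounding spaces; the second lists CRF options absent from the codelist. The race options are illustrative sample values.

```python
def terms_missing_from_codelist(data_values, codelist_terms):
    """Map each data value not in the codelist to a near-matching term, or None."""
    allowed = set(codelist_terms)
    normalized = {term.strip().upper(): term for term in allowed}
    findings = {}
    for value in set(data_values) - allowed:
        findings[value] = normalized.get(value.strip().upper())
    return findings


def crf_options_missing_from_codelist(crf_options, codelist_terms):
    """Return CRF options that the codelist does not include, sorted."""
    return sorted(set(crf_options) - set(codelist_terms))


print(terms_missing_from_codelist(["mg", " mg", "MG", "tablet"], ["mg", "mL"]))

race_on_crf = [  # illustrative CRF options
    "AMERICAN INDIAN OR ALASKA NATIVE", "ASIAN", "BLACK OR AFRICAN AMERICAN",
    "NATIVE HAWAIIAN OR OTHER PACIFIC ISLANDER", "WHITE", "OTHER",
]
print(crf_options_missing_from_codelist(race_on_crf, ["WHITE"]))
```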

Page 21

Missing or invalid Value Level Metadata

› Content of SUPPQUAL domains must be described

Page 22

Missing description of --SPID
› --SPID is often a Key Variable in a domain
› A clear and detailed description is required to understand study data
  › Why was --SPID introduced? How was it derived? …
› Sponsors often copy the Notes text from the CDISC IG. This is a completely invalid approach! Study-specific information is expected.
  › SDTM IG text: “Sponsor-defined reference number. Perhaps pre-printed on the CRF as an explicit line identifier or defined in the sponsor’s operational database. Example: Line number on a CRF Page.”

Page 23

Missing description of variables
› Study-specific variables are the most important
  › RFPENDTC, RFSTDTC, RFXSTDTC, --GRPID, --LNKID, --SPID, …
› SDTM text is not a variable description!
  › See the --SPID slide as an example

Page 24

Invalid Key Variables
› The list of variables is too long
  › Example: “STUDYID, USUBJID, EXSPID, EXTRT, EXCAT, EXDOSTXT, EXDOSU, EXDOSFRM, EXDOSFRQ, EXDOSTOT, EXROUTE, EXSTDTC, EXENDTC, EXSTDY, EXENDY, EXTPT, EXTPTNUM, EXTPTREF, VISIT”
› Inconsistency between Key Variables and domain Structure
  › Example: Structure: “One record per event”
    › Key Variable: “USUBJID, AETERM, AEDECOD, AESTDTC, AESEV, AESER, AEACN, VISIT”
› Usage of --SEQ as Key Variable
  › Example: “USUBJID, AESEQ”
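Whether a proposed list of Key Variables really identifies a record can be tested against the data. A minimal sketch, assuming the dataset is available as a list of dictionaries; the records are made-up sample data.

```python
from collections import Counter


def duplicate_keys(records, key_variables):
    """Return key-value combinations that identify more than one record."""
    counts = Counter(tuple(r.get(k) for k in key_variables) for r in records)
    return [key for key, count in counts.items() if count > 1]


ae = [
    {"USUBJID": "001", "AETERM": "HEADACHE", "AESTDTC": "2015-01-10"},
    {"USUBJID": "001", "AETERM": "HEADACHE", "AESTDTC": "2015-02-03"},
    {"USUBJID": "002", "AETERM": "NAUSEA", "AESTDTC": "2015-01-12"},
]
print(duplicate_keys(ae, ["USUBJID", "AETERM"]))             # not unique
print(duplicate_keys(ae, ["USUBJID", "AETERM", "AESTDTC"]))  # unique
```

Starting from a short candidate list and adding a variable only when duplicates remain is one way to avoid the overly long key lists shown above.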

Page 25

Non-compliance with eCTD
› Define.xml file is located in a different folder than the datasets
  › Example:
    › define.xml in …\tabulation
    › Data in …\tabulation\sdtm
› File name is not “define.xml”
  › Example:
    › “define_study_001_sdtm.xml”
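Both problems can be caught before submission with a small folder check. This is a minimal sketch under the assumption that datasets are `.xpt` files sitting directly in the folder being checked; the function name is hypothetical.

```python
from pathlib import Path


def check_define_location(dataset_dir):
    """Report define.xml placement and naming problems for one dataset folder."""
    folder = Path(dataset_dir)
    problems = []
    has_datasets = any(folder.glob("*.xpt"))
    if has_datasets and not (folder / "define.xml").is_file():
        problems.append("define.xml not found in the same folder as the datasets")
    for candidate in folder.glob("*.xml"):
        # Flag files that look like a define file but carry a different name.
        if candidate.name != "define.xml" and "define" in candidate.name.lower():
            problems.append(f"unexpected file name: {candidate.name}")
    return problems
```

Calling `check_define_location` on the folder that holds the datasets returns an empty list when a file named exactly `define.xml` sits next to them.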

Page 26

Contact info:

Sergiy [email protected]