SYSTEMATIC REVIEW AND META-ANALYSIS
MODULE I: SYSTEMATIC REVIEW
SEMESTER 3/2017
FOR DOCTOR OF PHILOSOPHY PROGRAM
IN CLINICAL EPIDEMIOLOGY AND MASTER OF
SCIENCE PROGRAM IN MEDICAL EPIDEMIOLOGY
FACULTY OF MEDICINE RAMATHIBODI HOSPITAL
MAHIDOL UNIVERSITY
PROF. DR. AMMARIN THAKKINSTIAN
WWW.CEB-RAMA.ORG
CONTENTS
1. INTRODUCTION AND BACKGROUND
2. REVIEW METHODOLOGY
2.1 Rationale and background
2.2 Formulate review questions and objectives
2.3 Locate/identify studies
2.3.1 Define search terms and strategies
2.3.2 Define source of relevant studies
2.3.3 Define the software & version (if possible) for search engines
2.4 Selection of studies
2.4.1 Define inclusion & exclusion criteria
2.4.2 Selection of studies
2.5 Data extraction
2.5.1 General characteristics of studies and subjects
2.5.2 Data for pooling
2.6 Risk of bias assessment
2.7 Statistical analysis
2.8 Register review proposal
2.9 Reporting systematic review and meta-analysis
2.10 Interpret results
2.11 Grading evidence
3. ASSIGNMENTS
4. REFERENCES
5. TEST BOOK
Objectives
Students should be able to:
- Develop a review proposal for a systematic review and meta-analysis.
- Conduct a systematic review and meta-analysis.
APPENDICES: READING SECTION
Appendix I: Moher D, Liberati A, Tetzlaff J, Altman DG. Preferred reporting items for
systematic reviews and meta-analyses: the PRISMA statement. Ann Intern
Med. 2009;151(4):264-9, W64.
Appendix II: Higgins JP, Altman DG, Gotzsche PC, Juni P, Moher D, Oxman AD et al.
The Cochrane Collaboration's tool for assessing risk of bias in randomised
trials. BMJ 2011; 343: d5928.
Appendix III: Chapter 8: Assessing risk of bias in included studies from
http://handbook.cochrane.org/ (part II)
Appendix IV: The PRISMA statement for reporting systematic reviews and meta-
analyses of studies that evaluate health care interventions: explanation
and elaboration. PLoS Med 6 (7): e1000100, 2009.
Appendix V: Newcastle-Ottawa quality assessment:
(http://www.ohri.ca/programs/clinical_epidemiology/oxford.asp)
• Scale
• Manual
Appendix VI: Stewart L, Moher D, Shekelle P. Why prospective registration of systematic
reviews makes sense. Systematic reviews. 2012;1:7.
Appendix VII: Whiting PF, Rutjes AW, Westwood ME, Mallett S, Deeks JJ, Reitsma JB,
et al. QUADAS-2: a revised tool for the quality assessment of diagnostic
accuracy studies. Ann Intern Med. 2011;155(8):529-36.
Appendix VIII: Hutton B, Salanti G, Caldwell DM, Chaimani A, Schmid CH, Cameron C,
et al. The PRISMA extension statement for reporting of systematic reviews
incorporating network meta-analyses of health care interventions: checklist
and explanations. Ann Intern Med. 2015;162(11):777-84.
Appendix IX: Stewart LA, Clarke M, Rovers M, Riley RD, Simmonds M, Stewart G,
et al. Preferred Reporting Items for Systematic Review and Meta-Analyses of
individual participant data: the PRISMA-IPD Statement. JAMA.
2015;313(16):1657-65.
Appendix X: Stewart L, Moher D, Shekelle P: Why prospective registration of systematic
reviews makes sense. Systematic reviews 2012, 1:7.
Appendix XI: GRADE handbook
Appendix XII: Puhan MA, Schunemann HJ, Murad MH, Li T, Brignardello-Petersen R,
Singh JA, Kessels AG, Guyatt GH, Group GW: A GRADE Working Group
approach for rating the quality of treatment effect estimates from network
meta-analysis. BMJ 2014, 349:g5630.
1. INTRODUCTION AND BACKGROUND
From Egger M, Smith GD, Altman DG. Systematic reviews in health care: Meta-analysis in
context (1)
A systematic review is defined as a review that has been conducted using a systematic approach
in order to minimize biases and random errors. Several terms are used in the literature
to refer to a systematic review, including systematic review itself, overview,
research synthesis, pooling, and meta-analysis. A systematic review need not
always apply meta-analysis to pool effect sizes (e.g., treatment effect, exposure effect, or
genetic effect) across studies. Where pooling the effect size is the primary
objective and there are sufficient data for pooling, meta-analysis is the
statistical tool for estimating the effect sizes. Whether the estimated effect size is valid and
precise depends on the review methodology, including identification of studies, selection of
studies, data extraction, and the method of estimation. Thus, a good meta-analysis should be
built on a good review methodology; otherwise the result of the meta-analysis will be
biased.
There are numerous rationales for performing a systematic review and meta-analysis, as
outlined below:
1.1 A systematic review allows a more objective appraisal of evidence (i.e., it allows readers
to replicate the review) than the traditional narrative review, and may help resolve
uncertainty when original studies or narrative reviews disagree. In addition, a systematic
review provides quantitative conclusions if meta-analysis is applied. Conversely, a narrative
review is more subjective and more prone to selection bias, may be limited to a single study or
a few selected studies, may give unhelpful descriptions (e.g., no clear evidence), and might be
based on treatment effects or exposure-outcome relationships that are too weak or too strong.
1.2 Meta-analysis, if applicable and appropriate, has more statistical power to detect the
treatment or exposure effect, and thus decreases the probability of false negative results. As a
result, efficacious treatment can be applied to patients sooner.
1.3 Exploratory analyses may suggest which groups of patients are more or less likely
to respond well to a treatment.
1.4 A systematic review may demonstrate a lack of adequate evidence or a gap in knowledge,
indicating that further studies are needed.
2. REVIEW METHODOLOGY
As with primary research such as observational studies or randomized
controlled trials, a systematic review (known as secondary research) should be performed based
on good review methods. Therefore, a research proposal must be developed prior to
conducting a review to make sure that the review methods are robust and transparent, and
minimize bias as much as possible. In addition, the review proposal reassures investigators
that they will not miss important issues before or while conducting the review.
Furthermore, the review proposal should be registered to make it more transparent; see the
sources and benefits of registration below. The outline of the research proposal is as follows:
2.1 Rationale and background
The investigator needs to introduce and describe the background of the disease or
event of interest to help readers become familiar with what will be studied. For instance,
the magnitude of the disease (e.g., prevalence or incidence, where appropriate) across the world,
or by ethnicity (Caucasian, African-American, Asian, etc.), region (Asia, Europe, America,
Africa, etc.), or locality should be described if data are available. The impact or burden of the
disease on patients (e.g., morbidity, disability, economic loss, quality of life,
mortality) and on the country should also be described.
If the review aims to assess treatment efficacy, information about the treatments of interest
should be described. For instance: how many treatment regimens are available, how long the
treatments have been used in patients and at what dosages, what relevant clinical
outcomes are usually assessed after administering the treatments, what the mechanism linking
treatment to outcome/s is, and whether the treatments have adverse events and, if so, what they are.
For genetic association studies, the study factor is a gene rather than a treatment. Gene
identification by genome-wide association studies (known as GWAS) should be
mentioned. In addition, the genes' location and function should be described along with the source
of information. The most popular and very useful source is the Online Mendelian Inheritance in
Man (OMIM), which is a comprehensive, authoritative compendium of human genes and
genetic phenotypes by the McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins
University School of Medicine. It is freely available and updated daily, and can be reached
from PubMed by selecting OMIM as the database. A specific gene or polymorphism can then be
searched from there.
The question of interest can be about diagnostic performance. For instance, what is the
performance of the Berlin and STOP-BANG questionnaires in diagnosing obstructive sleep
apnea (OSA) in pregnancy, compared with the standard test of polysomnography (2)? Details
of these questionnaires (i.e., how many domains/items there are, how each item is rated, and
how the results are interpreted) should be described. Diagnostic parameters such as sensitivity,
specificity, positive/negative predictive values, positive/negative likelihood ratios, and area
under the receiver operating characteristic (ROC) curve, also known as the C-statistic, will be pooled.
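As a minimal sketch of how the diagnostic parameters listed above are obtained from a single study, the following Python snippet computes them from a 2x2 table of index test results against the reference standard. The class and function names are hypothetical, the counts are invented for illustration only, and the sketch does not cover the pooling step or the ROC curve.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class TwoByTwo:
    """Counts from one diagnostic study: index test vs. reference standard."""
    tp: int  # index test positive, reference positive
    fp: int  # index test positive, reference negative
    fn: int  # index test negative, reference positive
    tn: int  # index test negative, reference negative


def diagnostic_accuracy(t: TwoByTwo) -> Dict[str, float]:
    """Return the usual accuracy measures for one 2x2 table."""
    sens = t.tp / (t.tp + t.fn)
    spec = t.tn / (t.tn + t.fp)
    return {
        "sensitivity": sens,
        "specificity": spec,
        "ppv": t.tp / (t.tp + t.fp),
        "npv": t.tn / (t.tn + t.fn),
        "lr_positive": sens / (1 - spec),
        "lr_negative": (1 - sens) / spec,
    }


# Hypothetical counts, not taken from any cited study.
example = diagnostic_accuracy(TwoByTwo(tp=40, fp=20, fn=10, tn=80))
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

The sketch assumes no cell combination gives a zero denominator; a study with, say, perfect specificity would need a continuity correction or separate handling before the likelihood ratios are computed.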
Exposure effects from various types of observational studies can also be the question of interest
for a systematic review and meta-analysis. For instance, previous studies have reported
discrepant effects of education and income on cardiovascular diseases. Khaing et al (3)
therefore conducted a systematic review and meta-analysis that aimed to pool effect sizes
of education and income on cardiovascular diseases, including only cohort studies.
Anothaisintawee et al (4) conducted a systematic review and meta-analysis to assess
effects of sleep disturbances on diabetes, including short (<6 h) and long (>8 h) sleeping time,
insomnia (difficulty initiating or maintaining sleep), OSA, and abnormal sleep timing. In addition,
pooled effect sizes of these sleep parameters were compared with those of other known risk
factors for diabetes, e.g., body mass index, family history of diabetes, and physical inactivity.
How these exposures (e.g., education, income, sleep disturbances) relate to disease
outcomes or disease mechanisms should be described.
With the growth of research on prediction models, whether for risk of
disease occurrence or for disease prognosis, systematic reviews and meta-analyses in this area are
growing as well. The systematic review may aim to collect the relevant prediction models that
are available for a specific event/disease, and to describe how those models were developed and
whether they have been both internally and externally validated. It is also important to know how
many important variables/predictors were included in each model and what they were, which will be
useful for further application or further research. In addition, the review may also aim to assess
the performance of those prediction models by applying meta-analysis to pool their
performance measures. For instance, Anothaisintawee et al (5) conducted a systematic review of
risk prediction models for breast cancer to explore how many risk prediction models are
available, what variables were included in the models, whether those models had been validated
internally or externally, and how they performed. Results of such a review help
identify a gap in knowledge for conducting further studies. Wilasrusmee et al (6) also
conducted a systematic review of diagnostic scores for appendicitis, which aimed to explore
the history of the scores' development: how many scores had been developed and were available,
how they were developed in terms of what variables were considered in univariate and
multivariate analyses, what types of models had been used for development, and how model
performance had been assessed in terms of calibration and discrimination. In addition, this review
also applied meta-analysis to pool the C-statistic and the observed/expected (O/E) ratio across studies.
The next paragraph should describe what is already known or has been reported about the
efficacy of the treatments of interest, or the association between exposures/genes and the disease
of interest. Are the treatment/exposure/gene effects controversial, i.e., do treatments work well in
some patients but not in other groups of patients, or is exposure to the risk factor
associated with disease in some studies but not in others? What are the
possible reasons for the controversial results? For treatment efficacy, could it be due to different
types of patients (disease severity, age, underlying diseases), different dosages or routes of
administration (local, oral, intravenous), or other additional treatments? For observational
studies, different study designs and measurements of exposure/outcome could also play a role.
The effect of ethnicity, which is known as population stratification, and the minor allele
frequency should also be considered in genetic association studies.
For a diagnostic test: what test is usually used as the reference/standard/gold test, how is the
test performed, does it need a specialist, and is it invasive or costly? What new
or less invasive tests are available? What are the benefits of applying the new/index/studied test?
How does its performance in previous evidence compare with the reference/standard/gold test?
If at least one systematic review on the same topic has been finished and published, or
is not yet finished but still ongoing according to the PROSPERO web site, we should try very
hard to find reasons for conducting our review. The general review methods of the previous
review/s, including location and selection of studies, data extraction, and proper statistical
analysis, should be assessed. In addition, the number of studied factors (i.e.,
treatments/exposures/genes) and relevant clinical outcomes should be critically appraised.
Possible reasons supporting our additional review are as
follows: all/most treatment regimens are considered whereas the previous review/s considered
only one or a few treatment comparisons; more relevant clinical outcomes are considered
than in the previous review/s; more than one polymorphism is considered and the genetic mode of
effect/s will be determined; additional individual studies have been published since the
previous review/s; etc. Collect all possible reasons to show why this additional systematic
review and meta-analysis is needed. In addition, what further information will be added
to the previous review/s should be mentioned. If we cannot find good reasons to
conduct our systematic review, we should not replicate this topic.
2.2 Formulate review questions and objectives
Formulation of the research question and objective/s is the heart of a review because it
leads to clearly defined review objectives and further review methods, including
locating/identifying and selecting studies, data extraction, and the statistical analysis plan.
The research question should at least consist of study factors (e.g., treatment,
intervention, studied test/s, gene, exposure), outcomes of interest, and types of patients. If
the review aims to assess treatment efficacy, the research question should also be
made more specific by considering one more component, i.e., the comparator. This combination
of components is known as PICO in Evidence-based Medicine (EBM), in which P, I,
C, and O refer to patients, intervention, comparator, and outcome, respectively.
Examples of PICO:
- Is occurrence of myocardial infarction (O) higher with Rosiglitazone (I) than with Metformin
(C) in diabetic patients (P)? This PICO leads to the objective of comparing MI rates
between Rosiglitazone and Metformin groups in diabetic patients.
- Is renin-angiotensin blockade (I) better at reducing micro-/macro-albuminuria, end-stage
renal disease, and death (O) than other antihypertensive drugs (C) in diabetic
patients (P)?
- Is the complement factor H gene associated with age-related macular degeneration (O)
in the general population (P)?
- What is the performance of the Berlin and STOP-BANG questionnaires in diagnosing
obstructive sleep apnea (OSA) in pregnancy, compared with the standard test of
polysomnography?
- Are sleep disturbances risk factors for diabetes?
2.3 Locate/identify studies
2.3.1 Define search terms and strategies
Search terms can be constructed based on PICO. All relevant search terms should be collected
and listed under each domain of P, I, C, and O. Investigators need to collect words or
terminology (or key words) that have the same meaning and have been used interchangeably.
Once the search terms have been collected, search strategies should then be constructed by
combining these search terms within the same domain and across different domains, following the
operators that are specific to each search engine. Different search engines may require different
combining operators; the search strategies may therefore differ and should be clearly
described for every search engine used. Basically, search terms are combined using ‘OR’ within
each domain of PICO and ‘AND’ between domains of PICO.
Search strategies can be sensitive or specific depending on the research question and the
availability of relevant studies. If not many relevant studies are available for the topic, a
sensitive search strategy may be required, combining search terms from only a few domains. For
instance,
- Combining search terms for only P and I, ignoring C and O, to assess treatment efficacy;
- Combining only type 1 diabetes for P and sleep disturbance for I/E, ignoring O, to assess the
association between sleep disturbance and glycemic control in type 1 diabetes.
Specific search strategies are stricter, i.e., they must include at least 3 or all domains of
interest. For instance, combining search terms for P, I, C, and O for treatment efficacy retrieves
studies that had not only the specific patients and intervention but also the comparator and outcome
of interest. Including comparator/s in the search strategy makes it more specific and results in
fewer identified studies than ignoring comparator and/or outcome/s. If investigators
have prior knowledge of how research in this area is growing, and a number of studies
have already been conducted and are available in the literature, adding comparator/s will make the
process of study selection more efficient. Conversely, omitting comparator/s from the
search strategy makes it very sensitive and results in many identified studies for selection.
This is appropriate if investigators know that not many studies have been published, or if
investigators would like to collect all possible comparators.
In summary, sensitive search strategies identify a large number of studies for review, many of
which may turn out to be irrelevant. Conversely, specific search strategies yield a small number
of identified studies and may miss some relevant studies. Investigators should justify which type
of search strategy suits their topic.
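The combination rule described in this section (OR within a PICO domain, AND between domains) and the sensitive-versus-specific choice can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `build_search` and the term lists are hypothetical, quoting multi-word terms is a common convention rather than a rule of any particular engine, and real engines such as PubMed or Scopus add their own field tags and controlled vocabulary that this sketch ignores.

```python
from typing import Dict, List, Optional, Sequence


def build_search(domains: Dict[str, List[str]],
                 use: Optional[Sequence[str]] = None) -> str:
    """Combine terms with OR inside each domain and AND between domains."""
    selected = list(domains) if use is None else list(use)
    blocks = []
    for name in selected:
        terms = domains.get(name, [])
        if not terms:
            continue  # skip a domain with no terms instead of emitting "()"
        # Quote multi-word terms so they are treated as phrases.
        quoted = ['"%s"' % t if " " in t else t for t in terms]
        blocks.append("(" + " OR ".join(quoted) + ")")
    return " AND ".join(blocks)


# Illustrative terms only, loosely based on the Rosiglitazone/Metformin example.
pico_terms = {
    "P": ["diabetes", "diabetes mellitus"],
    "I": ["rosiglitazone"],
    "C": ["metformin"],
    "O": ["myocardial infarction", "heart attack"],
}

specific = build_search(pico_terms)                   # all four domains
sensitive = build_search(pico_terms, use=["P", "I"])  # P and I only
print(specific)
print(sensitive)
```

Dropping domains from `use` widens the search, which mirrors the sensitive strategy; keeping all four narrows it, which mirrors the specific strategy.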
2.3.2 Define source of relevant studies
All sources of information used for locating studies should be clearly described. These are
mainly electronic databases (at least two), reference lists, contact with authors, and
personal communication with experts in the field. In addition, the search period must be
specified. Updated searches should also be planned if the review will take time. The frequency
of updating depends on how fast information on the topic is growing: it can be
every two weeks or every month if it is a hot issue with studies published
monthly; otherwise it can be every 3 to 6 months. Updates can be set up to run automatically in the
PubMed and Scopus search engines. A few main medical and health science electronic databases
are commonly used, as follows:
a) MEDLINE
- Since 1949 to present
- Over 16 million references
- Since 2005, between 2,000-4,000 completed references are added each day (Tuesday
through Saturday)
- Covers 5200 worldwide journals in 40 languages
- Uses Medical Subject Headings (MeSH) for indexing
- Includes biomedicine and health science journals
- English abstracts for 79% of references
- 90% are English language articles
- 47% of journals covered are published in the US
- PubMed is available free of charge
b) EMBASE
- Over 12 million records from 1974-present
- More than 600,000 records added annually
- Covers over 4,800 active peer-reviewed journals published in 70 countries/ 30 languages
- Uses EMTREE for indexing
- Includes English abstracts for 80% of references
- Daily update, within two weeks of receipt of the original journal
- Comprehensive inclusion of drug-related information
- Produced by Elsevier, no free version available
c) Scopus
- Launched in November 2004
- 18,000 titles
– 16,500 peer-reviewed journals (1,200 Open Access journals)
– 600 trade publications
– 350 book series
– 3.6 million conference papers (~10%) from proceedings and journals
• Medical Science ~2.9%
• Biological Science ~ 2.7%
• Chemical Science ~ 1.9%
– 41 million records
• 21 million records with references back to 1996
• 20 million records 1823-1996
– 318 million scientific web pages
– 23 million patent records from five patent offices
• World Intellectual Property Organization (WIPO)
• European Patent Office
• US Patent Office
• Japanese Patent Office
• UK Intellectual Property Office
d) The Cochrane Controlled Trials Register (CCTR)
e) ClinicalTrials.gov is a registry of federally and privately supported clinical trials,
conducted in the United States and around the world. It is run by the United States
National Library of Medicine (NLM) at the National Institutes of Health. Currently
there are 130,000 trials with locations in 170 countries. The web site provides
information about a trial's purpose, who may participate, locations, and phone numbers
for more details.
f) Nonliterary sources (from Systematic Reviews 2014, 3:74):
- The American clinicaltrials.gov
- The World Health Organization's International Clinical Trials Registry Platform
at: http://www.who.int/ictrp/en/
- The European clinicaltrialsregister.eu
2.3.3 Define the software & version (if possible) for search engines
– PubMed
– OVID
– Scopus
– Silver Platter
2.4 Selection of studies
2.4.1 Define inclusion & exclusion criteria
Eligibility criteria for selection of studies should be clearly described in the review proposal.
Inclusion criteria are defined mostly on the basis of PICO plus additional characteristics of the
study, such as study design, type of studied patients, and type of report (e.g., language, year,
publication status), as follows:
- P: Types of Patients/subjects
Clearly describe what types of patients/subjects your review will focus on. For instance:
general patients or patients with specific diseases; which age group, e.g., children, adults
(aged ≥ 18 years), or the elderly; and whether only women (with/without menopause), only men,
or both genders will be considered in the review. This helps frame the scope of the studied
population and identify to whom the results of the review will later be applied.
In addition, the review may be interested in specific group/s of that disease, e.g.,
moderate-severe hypertension, moderate-severe obstructive sleep apnea, type 1 or type 2
diabetes, diabetes with glycated hemoglobin of 6.5-7.5, chronic kidney disease stage III or
higher, cleft lip with or without cleft palate, etc.
- I: Treatment/intervention for RCTs, exposure/gene for observational studies, index/studied
test for diagnostic studies
What treatments will this review consider? Will the review focus on only specific dosages or the
whole range of possible dosages? For instance, the review of reno-protective effects (7) included
studies if their interventions were any type of angiotensin converting enzyme inhibitor (ACEI)
and their comparator was placebo or any type of active anti-hypertensive drug. The review
of the management of chronic prostatitis/chronic pelvic pain syndrome (CP/CPPS) (8) included
studies if the interventions were any type of alpha-blocker or antibiotic compared with
placebo or any type of active treatment.
For observational studies, what is your studied factor (exposed and unexposed), and how
has it been defined or measured? For instance, studies will be included if they assessed
effects of any type of sleep disturbance, measured by either questionnaire or objective measures,
on type 2 diabetes, including duration of sleep, sleep difficulty, and OSA (4). For diagnostic
tests, studies will be selected if they performed and assessed the index/studied test against the
reference/standard test. For instance, diagnostic studies were included if they used at least
one of the OSA screening questionnaires (e.g., Berlin questionnaire, STOP-BANG
questionnaire, ESS, etc.) and at least one of the standard objective sleep tests such as
polysomnography, Watch-PAT, or home monitoring (2).
For genetic studies, studies will be included if they assessed the association between any of
the loci/polymorphisms of interest and the disease outcome. For instance, the review of the
association between the complement factor H (CFH) Y402H polymorphism and age-related
macular degeneration (9) included studies if they reported alleles T and C, or TT, TC, and
CC genotypes, for CFH Y402H. For the review of the IRF6 gene and non-syndromic cleft lip
with/without cleft palate (10), studies were included if they had studied any of the polymorphisms
(i.e., rs2235371 G>A, rs642961 G>A, and rs2013162 C>A) in the IRF6 gene.
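Either form of reporting can serve an allele-level comparison because allele counts follow directly from genotype counts: each subject carries two alleles, and a heterozygote contributes one of each. A minimal sketch, with a hypothetical function name and invented counts used purely for illustration:

```python
from typing import Dict


def allele_counts(n_tt: int, n_tc: int, n_cc: int) -> Dict[str, float]:
    """Derive T and C allele counts from TT, TC, and CC genotype counts."""
    t = 2 * n_tt + n_tc  # two T alleles per TT subject, one per TC subject
    c = 2 * n_cc + n_tc  # two C alleles per CC subject, one per TC subject
    return {"T": t, "C": c, "freq_C": c / (t + c)}


# Hypothetical genotype counts for one group of one study.
counts = allele_counts(n_tt=50, n_tc=40, n_cc=10)
print(counts)
```

The reverse derivation is not possible, so a study that reports only allele counts cannot contribute to genotype-level comparisons.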
- C: Comparator
For treatment efficacy, comparators can be placebo, a specific standard treatment, or
even an active treatment. Thus, studies will be eligible if they used placebo, standard
treatment, or active treatment as the comparator, as described above. For instance, studies
that used any type of active treatment as comparator were included in the review of
Bell’s palsy management (11), i.e., dual treatment with Acyclovir plus Prednisolone versus
monotherapy with Prednisolone or Acyclovir. The comparator can sometimes be any of the
active interventions of interest, particularly when a network meta-analysis is the
primary aim. For instance, the comparator can be any type of antihypertensive drug or
placebo for the review of reno-protective effects of renin-angiotensin system blockade in type
2 diabetic patients (7). The comparator can be placebo or any type of antibiotic prophylaxis for
prevention of surgical-site infection after groin hernia surgery (12). The comparator can be any
type of supplementation (i.e., calcium, vitamin D, calcium plus vitamin D) in the review of
prevention of preeclampsia (13).
- O: Outcomes
Eligible studies must have reported at least one of the outcome/s of interest. For instance, a
review may include only studies that reported any of the adverse outcomes of interest, e.g.,
gastro-intestinal (GI) bleeding, gastric ulcer, or myocardial infarction. The review of
Interleukin-10 polymorphism and graft outcomes in renal transplantation (14) included studies
whose outcomes were graft failure or chronic allograft nephropathy. For the reno-protective
effects of ACEIs (7), studies were included if they reported any of the following outcomes of
interest: micro-albuminuria, macro-albuminuria, albuminuria regression, and ESRD. The review of
CP/CPPS (8) included studies that reported any of the following scores (i.e., total symptom
scores, pain, voiding, and quality of life) or treatment responsiveness. For diagnostic tests,
studies were included if they used any of the standard tests, including polysomnography,
Watch-PAT, or home monitoring, for diagnosis of OSA (2).
- Study designs
A review can be limited to randomized controlled trials if the primary aim is to assess
treatment efficacy (or adverse events), given there are sufficient data for pooling, say ≥ 3 to 5
studies for direct meta-analysis or at least one study for network meta-analysis. Other
methodological features, e.g., randomization, concealment, blinding, and measurement of outcomes,
may sometimes be considered in the inclusion criteria.
For genetic studies, included studies can be any type of population-based observational
study (e.g., cross-sectional, case-control, or cohort study), whether community-based
or hospital-based; these are known as genetic association studies. Reviews in
genetics can sometimes be of family-based (e.g., sib-pair) studies, for which the method
of pooling data differs from that for genetic association studies. Thus, investigators should
clearly describe what type of studies they intend to focus on.
Pooling effect sizes of risk/prognostic factors can sometimes focus on cohort studies only, if
there are sufficient data for pooling. The cohort design supports an exposure-effect
relationship more strongly than case-control or cross-sectional designs.
For instance, the pooled effects of mean platelet volume on cardiovascular events (15) and disease
prognosis (16) were based on data from cohort studies only.
- Report’s characteristics
The review can be limited based on type of report. For instance, the authors might
choose only studies of the treatments of interest published after a specific point in
time. Or the authors might use only full papers, either published or unpublished,
written in English, or in any other language for which reviewers
for that specific language are available.
- Studies will be excluded if there are insufficient data for pooling. However, this decision
should be made only after 2-3 attempts to contact the authors. Contacting authors should
not be neglected and should be planned in the review proposal. Collecting e-mail addresses of
corresponding authors and first authors during the selection process is very useful and handy
when required.
- Coding for ineligibility should be designed in the review proposal. This allows
reviewers to code ineligible studies consistently throughout the review process. In addition,
it makes it possible to summarize reasons for ineligibility and report them in the flow of study
selection. A meeting among reviewers (if there is more than one reviewer) should be called to
standardize and clarify the eligibility criteria and the coding for ineligibility. One study can
have more than one reason for ineligibility, but the most important reason should be assigned.
2.4.2 Selection of studies
Identified/located studies will be selected based on the inclusion and exclusion criteria. Some
degree of subjectivity in deciding whether to select a study sometimes arises during the
selection process. Therefore, two reviewers should assess the eligibility of candidate studies,
if possible, to minimize selection bias.
The selection process should also be clearly described. It should start with screening the
titles and abstracts of all identified studies. Full papers will be retrieved if eligibility
cannot be determined from screening the title/abstract. If a full paper is not accessible, the
corresponding or first author of that article should be contacted to request it. Excluding
studies because the full paper is inaccessible should be avoided. If a study does not meet the
inclusion criteria, the reason should be documented so that reasons for ineligibility can be
summarized. In addition, an exclusion log denoting the reason for exclusion of each study should
be kept. To be consistent across the whole review, reasons for ineligibility and exclusion
must be planned and coded in the review proposal. This will make computerization and analysis of
the data easier once the review has been finished.
Finally, selection results should be computerized separately by the two reviewers, if
applicable. The two data sets will be compared using STATA or other statistical software to
identify disagreements, i.e., studies selected by reviewer 1 but not by reviewer 2, or vice
versa. A Kappa statistic should be estimated and reported. Investigators should plan for a panel
meeting, which should include not only the two reviewers but also a third party who is a
supervisor or a person with more experience in doing systematic reviews. Disagreements will be
resolved by consensus and discussion at this meeting.
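The agreement check between two reviewers can be sketched in a few lines. The following is a minimal illustration in Python (an assumption of this sketch; the text suggests STATA or other statistical software), using hypothetical include/exclude decisions. The function name `cohen_kappa` and the example data are illustrative only.

```python
def cohen_kappa(rater1, rater2):
    """Cohen's kappa for two reviewers' decisions on the same list of studies."""
    if len(rater1) != len(rater2) or len(rater1) == 0:
        raise ValueError("both reviewers must rate the same non-empty list of studies")
    n = len(rater1)
    categories = set(rater1) | set(rater2)
    # Observed agreement: proportion of studies with identical decisions.
    observed = sum(a == b for a, b in zip(rater1, rater2)) / n
    # Chance agreement: product of the two reviewers' marginal proportions.
    expected = sum((rater1.count(c) / n) * (rater2.count(c) / n) for c in categories)
    if expected == 1.0:
        return 1.0  # both reviewers used one identical category throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical decisions for 10 candidate studies (not real data).
reviewer1 = ["include", "include", "exclude", "exclude", "include",
             "exclude", "exclude", "exclude", "include", "exclude"]
reviewer2 = ["include", "exclude", "exclude", "exclude", "include",
             "exclude", "exclude", "include", "include", "exclude"]

kappa = cohen_kappa(reviewer1, reviewer2)
# Positions (0-based) of the studies to bring to the panel meeting.
disagreements = [i for i, (a, b) in enumerate(zip(reviewer1, reviewer2)) if a != b]
print(f"kappa = {kappa:.2f}; disagreements at positions {disagreements}")
```

The list of disagreeing positions is what the panel meeting would work through; the kappa value itself only summarizes agreement beyond chance.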
2.5 Data extraction
A data extraction form (DEF) should be designed before the study commences; its structure
and contents should cover the following:
2.5.1 General characteristics of studies and subjects
Characteristics of studies should be extracted to give readers an overview of the included
studies. General characteristics include the first author (and the corresponding author, for
contact if required) along with e-mail address, journal name, and publication year (these can be
downloaded and linked from EndNote). More specific characteristics include the country of the
study setting, study design (e.g., cohort, case-control, cross-sectional study, RCT; community-
vs hospital-based), study period, ethnicity (particularly in genetic studies, because gene
frequencies may differ accordingly), type of study subjects (e.g., adults, children, or women),
target population (general patients or a specific disease), range of follow-up for
cohorts/RCTs, and prevalence/proportion of underlying diseases (e.g., obesity, hypertension,
diabetes, dyslipidemia, chronic kidney disease).
In addition, information about study factors and outcomes must be extracted, including type of
treatment (drug class, dosage/day, course of treatment, route) for RCTs, exposure and type of
measurement for observational studies, type of polymorphism for genetic studies, and type of
outcome and method of measurement/diagnosis. Furthermore, the type of outcome data (e.g.,
continuous, dichotomous, or both) and the type of reported data should be recorded, i.e.,
frequency/summary data or statistical parameters (e.g., mean difference, regression coefficient,
odds ratio (OR), risk ratio (RR), or hazard ratio (HR)).
These characteristics of studies and subjects will allow investigators to explore causes of
heterogeneity, if present, and to perform subgroup analyses to identify specific
patients/subjects who may gain more or less benefit from treatments, or subgroups of subjects
who may be at higher risk from exposure to a specific risk factor or gene, etc.
2.5.2 Data for pooling
Tables of treatments/exposures/genes/index tests and outcomes of interest should be constructed.
This will allow reviewers to understand clearly and consistently which specific data need to be
extracted for pooling. For continuous outcome data, the number of patients and the mean along
with its standard deviation (SD) of the outcome values by treatment/exposure/genotype/allele
group are required. For dichotomous outcome data, the frequencies of the contingency table of
treatment/exposure/genotype/allele/index test by outcome group are required.
Sometimes these summary/aggregate data are not reported in articles, but statistical parameters
such as a beta coefficient or mean difference for a continuous outcome, an OR or RR for a
dichotomous outcome, or an HR for a time-to-event outcome, along with their 95% confidence
intervals (CI), are reported instead. The beta coefficient of group 1 vs group 2 is equivalent
to the mean difference mean1 − mean2; thus it can be combined with mean differences calculated
from summary/aggregate data of means and SDs. Likewise, ORs or RRs reported as summary
statistics can be combined with ORs or RRs estimated from frequency data.
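To make the two routes to a poolable effect size concrete, the following Python sketch computes a mean difference from means and SDs, a log OR from a 2x2 table, and recovers a log ratio and its standard error from a reported estimate with its 95% CI. It is a minimal illustration under stated assumptions: the function names are hypothetical, the mean-difference standard error uses the unpooled-variance formula, the CI is assumed to be symmetric on the log scale, and zero cells are not handled.

```python
import math


def mean_difference(n1, mean1, sd1, n2, mean2, sd2):
    """Mean difference (group 1 minus group 2) and its standard error."""
    md = mean1 - mean2
    se = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)  # unpooled variances (assumption)
    return md, se


def log_odds_ratio(events1, nonevents1, events2, nonevents2):
    """Log odds ratio (group 1 vs group 2) and its standard error from a 2x2 table."""
    log_or = math.log((events1 * nonevents2) / (nonevents1 * events2))
    se = math.sqrt(1 / events1 + 1 / nonevents1 + 1 / events2 + 1 / nonevents2)
    return log_or, se


def log_ratio_from_ci(estimate, lower, upper, z=1.96):
    """Log of a reported OR/RR/HR and its standard error recovered from the 95% CI."""
    return math.log(estimate), (math.log(upper) - math.log(lower)) / (2 * z)


# Hypothetical study reporting summary data for a continuous outcome.
md, se_md = mean_difference(50, 10.0, 2.0, 50, 8.0, 2.0)

# Hypothetical study reporting frequencies: 20/100 events vs 10/100 events.
log_or, se_or = log_odds_ratio(20, 80, 10, 90)

# The same study, had it reported only the OR with its 95% CI.
ci_low = math.exp(log_or - 1.96 * se_or)
ci_high = math.exp(log_or + 1.96 * se_or)
log_or2, se_or2 = log_ratio_from_ci(math.exp(log_or), ci_low, ci_high)
print(md, se_md, math.exp(log_or), se_or, se_or2)
```

Both routes end with an estimate and a standard error on the same scale, which is what allows studies reporting different kinds of data to enter one pooled analysis.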
In genetic association studies, genotype effects are sometimes considered as dominant (aa+Aa vs
AA, if A and a are the major and minor alleles), recessive (aa vs Aa+AA), or additive (0, 1, 2
for AA, Aa, aa) effects. If genotype frequencies for AA, Aa, and aa across outcome groups/values
are available, these data should be extracted. If they are not, the reported effects under the
assigned gene model should be extracted, but whether they can be pooled depends on the mode of
gene effect suggested by pooling the summary data.
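Extracting the full genotype frequencies keeps every gene model available at the pooling stage. The Python sketch below collapses hypothetical genotype counts into the dominant and recessive contrasts defined above and returns the corresponding OR; the function name, the dictionary layout, and the counts are assumptions made for illustration.

```python
def genotype_odds_ratio(cases, controls, model):
    """OR for a gene model, given genotype counts keyed by 'AA', 'Aa', 'aa'.

    A is the major allele and a the minor allele, as in the text.
    """
    if model == "dominant":        # aa + Aa vs AA
        exposed, reference = ("Aa", "aa"), ("AA",)
    elif model == "recessive":     # aa vs Aa + AA
        exposed, reference = ("aa",), ("AA", "Aa")
    else:
        raise ValueError("model must be 'dominant' or 'recessive'")
    case_exposed = sum(cases[g] for g in exposed)
    control_exposed = sum(controls[g] for g in exposed)
    case_reference = sum(cases[g] for g in reference)
    control_reference = sum(controls[g] for g in reference)
    return (case_exposed * control_reference) / (control_exposed * case_reference)


# Hypothetical genotype frequencies from one case-control study.
cases = {"AA": 40, "Aa": 40, "aa": 20}
controls = {"AA": 50, "Aa": 40, "aa": 10}

or_dominant = genotype_odds_ratio(cases, controls, "dominant")
or_recessive = genotype_odds_ratio(cases, controls, "recessive")
print(or_dominant, or_recessive)
```

A study that reports only one of these ORs cannot be re-expressed under the other model, which is why the raw genotype counts are the preferred data to extract.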
For time-to-event outcomes, most studies report HRs along with 95% CIs of the
treatment/exposure effects on the outcome of interest. These can be directly combined by
meta-analysis under the assumption that the HR is constant over time (proportional hazards),
which sometimes does not hold. In that case pooling HRs may be invalid, and pooling may need to
use other parametric models or restricted mean survival time (17). This requires individual
patient data (IPD) on time and event from each study.
What should be sought from the included studies is the Kaplan-Meier (KM) curve, from which data
on both the y-axis and x-axis are extracted, along with the number of patients at risk at each
time point shown underneath. These data can be converted to individual patient data (raw data)
using the ‘ipdfc’ command in STATA (17). This command requires the following data:
• ts = time at any point on the x-axis
• s = survival probability or failure probability at the corresponding time ts
• trisk = time at which the number at risk is reported
• nrisk = number of patients at risk at the corresponding trisk
where ts and s can be extracted from the KM curve using graph-digitizing software such as
DigitizeIt (17), WebPlotDigitizer (18) (https://automeris.io/WebPlotDigitizer/), or similar.
The latter is freeware; read/watch the tutorial on its website to see how it works. Once ts and
s are extracted, the data should be saved as a csv file and imported into STATA; trisk and nrisk
are then added manually. The data are then ready to be converted to IPD.
For a systematic review of prediction models, data extraction forms should be specifically
designed for the research methods of this type of study, drawing on our experience with this
type of review (5, 6), the guideline (19), and the CHARMS checklist (20), and should include the
following:
• Type of model (diagnosis vs prognosis), if both are considered in one review
• Time at which the outcome of interest would be predicted by the prediction model
• Setting, which refers to the time point at which the model is intended to be used
• Type of study phase, i.e., development (derivation) vs validation
• Type of patients, e.g., general vs specific disease
• Study design, i.e., cohort (retrospective, prospective, or bidirectional) or cross-sectional
study
• Prevalence/incidence of the outcome of interest in that setting
• Number of variables considered in univariate and multivariate analyses, for the development
phase
• Number of variables included in the final prediction model, and what they were
• Proportion exposed for each variable included in the prediction model in that setting
• Model performance, including
-Calibration, by statistical test and graph
-Discrimination, by C-statistic
Data for pooling by meta-analysis should also be considered, if possible. The required data can
be the C-statistic along with its 95% CI or standard error for discrimination performance, and
observed/expected (O/E) data for calibration performance.
Once the DEF is designed, the first reviewer should pilot it with 3-5 studies and then modify
or adjust it accordingly to make it more applicable. The data extraction process requires at
least two reviewers working independently, or at least one reviewer to extract the data while
another reviewer checks the validity of the extraction. Once data extraction has been completed,
the data should be computerized separately by the reviewers. Validation should then be performed
using the same technique as mentioned above for the selection of studies. This should be done
independently by a statistician who is not involved in the data extraction process, if possible.
A Kappa statistic should also be estimated. A meeting between the reviewers and a third party
should be set up to resolve disagreements.
2.6 Risk of bias assessment
Assessment of quality or risk of bias considers the internal and external validity of a study.
In general, an individual study should at least have internal validity before its results are
generalized. Any systematic bias, such as selection bias, performance bias, detection bias,
measurement bias, or attrition bias, will distort internal validity. Issues such as patients’
characteristics (inclusion/exclusion criteria), setting (primary versus referral center),
treatment regimens, or outcome measurements will affect external validity.
A few quality assessment scales have been developed and used for RCTs, such as the Chalmers
(Chalmers TC) and Jadad scales. Later on, Higgins and Altman (see Cochrane handbook 2011,
Higgins et al 2011 in Appendix) encouraged reviewers to assess the risk of bias of individual
studies based on types of bias, see appendix. These include 6 domains: selection bias,
performance bias, attrition bias, detection bias, reporting bias, and other bias. Seven sources
of bias are assessed: random sequence generation and allocation concealment for selection bias;
blinding of participants and health care team for performance bias; blinding of outcome
assessors for detection bias; incomplete outcome data for attrition bias; selective outcome
reporting for reporting bias; and anything else for other sources of bias. Each item is checked
for whether it is prone to bias, and the possible answers are low or high risk of bias. If there
is insufficient information to make a decision, the answer can be ‘unclear’. Sometimes a few
items may not apply to RCTs addressing specific questions, in which case the answer should be
‘not applicable’.
A risk of bias assessment form for genetic association studies has been developed by
Thakkinstian et al (21). General epidemiological issues and genetic association issues are
considered and assessed. The general epidemiological issues are selection bias, information
bias, confounding bias, and selective outcome reporting, whereas the genetic issues are multiple
testing and the Hardy-Weinberg equilibrium test. Each item is again graded as low, high, or
unclear risk of bias. Investigators should plan and describe in the review proposal what they
mean by low, high, or unclear for each item.
For general observational studies, the Newcastle-Ottawa quality assessment scale can be used
for assessing the quality of a study
(http://www.ohri.ca/programs/clinical_epidemiology/oxford.asp, see appendix). The original
scale is designed for both cohort and case-control studies. A cross-sectional study can use the
same form as a cohort study, modified where appropriate. Three domains are considered and
assessed, i.e., selection, comparability, and exposure (for case-control studies) or outcome
(for cohort studies). The scale should be adapted for proper use in each review.
Quality In Prognosis Studies (QUIPS) is a more specific tool that has recently been designed
for prognostic studies (22). The tool consists of 6 domains, which are study participants, study
attrition, prognostic factor measurement, outcome measurement, study confounding, and
statistical analysis and reporting. Each domain is graded as low, moderate, or high risk of bias.
The revised tool for quality assessment of diagnostic accuracy studies (QUADAS-2) has been used
for systematic reviews of diagnostic studies (23). The tool consists of four domains:
patient selection, index test, reference standard, and flow and timing. Each domain is graded as
low, high, or unclear risk of bias.
Investigators are required to discuss how to assess the risk of bias for each item and what
information should be drawn from the study to aid grading. This should be clearly described in
the review proposal, and investigators should make sure that reviewers understand well how to
do so. The evaluation process consists of 2 parts. First, reviewers grade whether the study is
at low, high, or unclear risk of bias for each item based on supporting information from the
study. If an item cannot be assessed, it should be documented as ‘NA’. Second, the supporting
information or relevant trial characteristics used in making the decision for each source of
bias should be documented.
The quality scores or sources of bias might be associated with treatment effects and can thus
be used for exploring sources of heterogeneity. This process should also be performed by two
reviewers. Blinding reviewers to authors, institutions, or journals is sometimes useful.
Disagreements between the two reviewers should be resolved by consensus and discussion. Levels
of agreement for each domain and across all domains may be assessed using Kappa statistics.
2.7 Statistical analysis
As for primary studies, the statistical analysis plan should be clearly described in the
review proposal. This lets the investigator team understand and be assured of how the data will
be analysed to answer their research questions. For a direct meta-analysis, the statistical
analysis plan should cover pooling methods for all types of outcome (e.g., prevalence,
continuous outcome, dichotomous outcome). Assessment of heterogeneity by testing and by
estimating the degree of heterogeneity (I²) should be mentioned. Exploration of sources of
heterogeneity should be planned in case heterogeneity is present; this will be done by
meta-regression or subgroup analysis. The study and patient characteristics to be used for this
exploration should be clearly described. Finally, publication bias assessment by Egger’s test
and funnel plot should also be planned. A contour-enhanced funnel plot will be used if there is
evidence of funnel asymmetry from either the test or the plot.
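As an illustration of what the pooling and heterogeneity steps of such a plan compute, the Python sketch below pools study effects by inverse-variance weighting and reports Cochran's Q and I². The choice of the DerSimonian-Laird estimator for the random-effects model is an assumption of this sketch (the text does not name a pooling method), and the function name and input values are hypothetical.

```python
import math


def pool_effects(effects, std_errors):
    """Inverse-variance pooling with Cochran's Q, I^2 and DerSimonian-Laird tau^2.

    effects: study estimates on an additive scale (e.g., log OR, mean difference).
    """
    if len(effects) != len(std_errors) or len(effects) < 2:
        raise ValueError("need matching effects and standard errors for >= 2 studies")
    weights = [1.0 / se ** 2 for se in std_errors]
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum_w
    # Cochran's Q: weighted squared deviations from the fixed-effect estimate.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    # DerSimonian-Laird between-study variance (truncated at zero).
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    re_weights = [1.0 / (se ** 2 + tau2) for se in std_errors]
    random = sum(w * y for w, y in zip(re_weights, effects)) / sum(re_weights)
    return {
        "fixed": fixed,
        "random": random,
        "random_se": math.sqrt(1.0 / sum(re_weights)),
        "Q": q,
        "df": df,
        "I2": i2,
        "tau2": tau2,
    }


# Hypothetical log ORs and standard errors from three studies.
result = pool_effects([0.1, 0.3, 0.5], [0.1, 0.1, 0.1])
print(result)
```

In practice a dedicated meta-analysis package would be used; the point here is only which quantities the proposal commits to reporting.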
Relevant dummy tables and figures should be designed in advance. The steps of analysis, dummy
tables, and figures are as follows:
– The first figure in the results should be a flow chart of study identification and selection,
showing the number of included studies and the reasons for excluding studies
– Table describing study and patient characteristics
– Table displaying frequency data from the cross-tabulation between treatments and
outcomes of individual studies for dichotomous outcomes; number of patients, mean,
and SD for continuous outcomes
– Forest plots for the main poolings; these can incorporate summary/aggregate data (e.g.,
frequency data or mean & SD). If this is done, the table of these data can be omitted
or moved to the supplementary tables.
– Explore possible sources of heterogeneity by meta-regression or subgroup analysis
– Perform sensitivity analyses
– Examine publication bias
Advanced meta-analyses for genetic association studies, diagnostic tests, and network
meta-analysis require more specific pooling methods. These should be clearly described according
to the details in Modules 2-3.
2.8 Register review proposal
As mentioned previously, a good systematic review and meta-analysis should be based on good
review methods, which are robust, transparent, free of or minimally affected by bias,
reproducible, and even auditable. Registering the review proposal is therefore encouraged, as
for an RCT. Some benefits of registration are as follows (24), see more detail in Appendix 6:
2.8.1 Establish that we are doing this review topic. Other researchers who are interested in
the same research question may be able to collaborate or become involved in this review. As a
result, duplicate reviews addressing the same question should be reduced.
2.8.2 Registration should increase potential communication among interested researchers
2.8.3 Ensure that the review methods are transparent, robust, and reproducible, and adhere
to what is stated in the review proposal
Once the review proposal has been developed, investigators can register it at the following
sources:
- National Institutes of Health (NIH) at
http://nihlibrary.campusguides.com/content.php?pid=252593&sid=2085601
- Campbell Collaboration at http://www.campbellcollaboration.org/
Registers systematic reviews of the effects of social interventions
- Cochrane Collaboration at http://www.cochrane.org/
An international organization that produces and disseminates systematic reviews of health
care interventions, focused only on RCTs and diagnostic tests
- International prospective register of systematic reviews (PROSPERO) at
http://www.crd.york.ac.uk/prospero/
Systematic reviews and meta-analyses not only of RCTs but also of all sorts of observational
studies can be registered at PROSPERO.
2.9 Reporting systematic review and meta-analysis
Reporting of results should follow the PRISMA guidelines (25), which suggest how to
report the results of a systematic review and summary/aggregate meta-analysis from
background to discussion. In addition, the PRISMA guidelines have been extended to
network meta-analysis and individual patient data meta-analysis. Details of the PRISMA
guidelines are listed in the appendices.
2.10 Interpret results
– Consider limitations, including publication and related biases
– Consider strength of evidence
– Consider applicability
– Consider numbers-needed-to-treat for benefit/harm
– Consider economic implications
– Consider implications for future research
2.11 Grading evidence
Read more detail in Appendix XI-XII
The evidence synthesized from a systematic review and meta-analysis should finally be graded.
This allows readers/users to feel confident in applying the results to clinical practice. A
working group called the Grading of Recommendations, Assessment, Development and Evaluation
(GRADE) (26) (www.gradeworkinggroup.org), which consists of health care methodologists,
guideline developers, clinicians, health services researchers, health economists, public health
officers, and other interested members, has developed and updated methods for rating evidence
and clinical practice guidelines; this section focuses only on grading evidence from systematic
reviews and meta-analyses. The GRADE handbook (available at www.gradeworkinggroup.org)
describes the process of rating the quality of the best available evidence, which is a
transparent and structured process for developing and presenting evidence summaries. The
guideline considers a wide range of research questions including therapy, diagnosis, screening,
and prevention.
As in conducting a systematic review and meta-analysis, grading usually starts with a clearly
defined research question in terms of PICO, i.e., target patients, alternative
treatment/intervention strategies, and patient-important outcomes.
The quality of evidence for each pooled outcome is rated according to the factors outlined in
the GRADE approach, including five factors that may lead to rating down the quality of evidence
and three factors that may lead to rating up, see detail below. A GRADE evidence profile and a
summary of findings table should be developed and reported.
The grade evidence profile consists of
- A list of the outcomes
- The number of included studies and their study designs
- Judgments about each of the quality of evidence factors assessed including
o Risk of bias, inconsistency, indirectness, imprecision, other considerations (e.g.,
publication bias and factors that increase the quality of evidence)
- The assumed risk; a measure of the typical burden of the outcomes, i.e. illustrative risk
or also called baseline risk, baseline score, or control group risk
- The corresponding risk; a measure of the burden of the outcomes after the intervention
is applied, i.e. the risk of an outcome in treated/exposed people based on the relative
magnitude of an effect and assumed (baseline) risk
- The relative effect; for dichotomous outcomes the table will usually provide risk ratio,
odds ratio, or hazard ratio
- The absolute effect; for dichotomous outcomes the number of fewer or more events in
treated/exposed group as compared to the control group
- Rating of the overall quality of evidence for each outcome (which may vary by
outcome)
- Classification of the importance of each outcome
- Footnotes, if needed, to provide explanations about information in the table such as
elaboration on judgements about the quality of evidence
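The relationship between the assumed risk, the relative effect, the corresponding risk, and the absolute effect in the list above can be shown with a small calculation. The Python sketch below handles only a risk ratio; the function name, the "per 1000" scaling, and the example values are assumptions for illustration, and the number needed to treat is added because it is listed under interpreting results.

```python
def absolute_effect(assumed_risk, risk_ratio, per=1000):
    """Corresponding risk and absolute effect implied by a risk ratio.

    assumed_risk: baseline (control group) risk as a proportion between 0 and 1.
    """
    if not 0.0 <= assumed_risk <= 1.0:
        raise ValueError("assumed_risk must be a proportion between 0 and 1")
    corresponding_risk = assumed_risk * risk_ratio
    if corresponding_risk > 1.0:
        raise ValueError("risk ratio implies a corresponding risk above 1")
    difference = corresponding_risk - assumed_risk
    return {
        "corresponding_risk": corresponding_risk,
        # Negative values mean fewer events per `per` treated/exposed people.
        "difference_per_n": difference * per,
        # Number needed to treat (benefit or harm); undefined when no difference.
        "nnt": (1.0 / abs(difference)) if difference != 0 else None,
    }


# Hypothetical example: baseline risk of 20% and a pooled RR of 0.75.
effect = absolute_effect(0.20, 0.75)
print(effect)
```

With these hypothetical inputs the corresponding risk is 15%, i.e., 50 fewer events per 1000 people, and about 20 people would need to be treated to prevent one event.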
Rating the quality of evidence reflects how confident we are in the pooled results of our
systematic review and meta-analysis. Each outcome of interest should be rated separately, since
the quality may differ among outcomes. There are four rating levels, as follows:
Table 1. Rating scale for quality of evidence assessment
Rating scale | Definition
High | We are highly confident that the true effect size lies close to our estimate. This means our pooled effect size is a very precise estimate, or the 95% confidence interval (CI) is very narrow.
Moderate | We are moderately confident in the estimated effect size. The true effect may be close to the estimated effect size, but the 95% CI is quite wide and thus there is a possibility that it is substantially different.
Low | We have little confidence in the estimated effect size because the estimate is very imprecise (i.e., very wide CI). The true effect may be substantially different from our estimated effect size.
Very low | We have very little confidence in the estimated effect size, which has a very wide CI. The true effect is likely to be substantially different from our estimate.
Rating the quality of evidence usually begins with the study designs of the studies included in
the systematic review, i.e., whether they are randomised controlled trials or observational
studies. Then, five factors may be used to rate down and three factors may be used to rate up
the quality of evidence, as follows:
Table 2. Five factors for rating down and three factors for rating up the quality of evidence
Down-rate factor | Consequence
Limitations in study design or execution (risk of bias) | ↓ 1 or 2 levels
Inconsistency of results | ↓ 1 or 2 levels
Indirectness of evidence | ↓ 1 or 2 levels
Imprecision | ↓ 1 or 2 levels
Publication bias | ↓ 1 or 2 levels
Up-rate factor | Consequence
Large magnitude of effect | ↑ 1 or 2 levels
All plausible confounding would reduce the demonstrated effect or increase the effect if no effect was observed | ↑ 1 level
Dose-response gradient | ↑ 1 level
For research questions about therapy or treatment/intervention efficacy, the quality of evidence
from each study design is rated as follows:
- Randomized controlled trials without important limitations provide high quality evidence
- Observational studies without special strengths or with important limitations
provide low quality evidence
- Non-randomised controlled trials or quasi-experimental designs without important
limitations may provide high quality evidence, but will automatically be rated down for
risk of bias, e.g., lack of randomisation, lack of concealment or blinding.
- Case series and case reports are observational studies that mostly contain only one
patient group receiving the intervention, with no comparator or an unclear source of control
group. Thus, they will usually warrant downgrading from low to very low quality
evidence.
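The rating arithmetic (start from the study design, then move down or up by levels) can be sketched as follows. This Python fragment is only an illustration: the function name and the design labels are assumptions, and actual GRADE rating is a judgement made per outcome rather than a mechanical sum.

```python
LEVELS = ["very low", "low", "moderate", "high"]

# Starting level by study design, as described in the text.
START = {"rct": "high", "observational": "low"}


def grade_rating(design, levels_down=0, levels_up=0):
    """Quality of evidence after rating down/up from the design's starting level."""
    if design not in START:
        raise ValueError("design must be 'rct' or 'observational'")
    if levels_down < 0 or levels_up < 0:
        raise ValueError("levels must be non-negative")
    index = LEVELS.index(START[design]) - levels_down + levels_up
    # The scale has four levels, so the result is clamped to its ends.
    index = max(0, min(index, len(LEVELS) - 1))
    return LEVELS[index]


# Hypothetical outcomes: an RCT-based pooling rated down twice, and an
# observational pooling rated up once for a large effect.
print(grade_rating("rct", levels_down=2))
print(grade_rating("observational", levels_up=1))
```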
Quality of evidence factors
a) Risk of bias
Results of risk of bias assessments for individual studies can be used directly to rate down
the quality of evidence of a systematic review/meta-analysis. For RCTs, these include
randomisation sequence generation, allocation concealment, lack of blinding, incomplete
patient and outcome data, selective outcome reporting, and other biases. For observational
studies, the items of risk of bias assessment depend on the scale used. However, most scales
basically assess study limitations in terms of representativeness of study patients or failure
to apply eligibility criteria appropriately, ascertainment bias in measuring exposure and
outcome, confounding bias from failure to adequately control for confounders, and incomplete or
inadequate follow-up time.
Results of risk of bias assessment should be used directly to reflect study limitations. For
instance, in the GRADE approach, low risk of bias would indicate no limitation, unclear risk of
bias would indicate either no limitation or serious limitation, and high risk of bias would
indicate either serious or very serious limitation. Suggestions on how to assess study
limitations are described below:
Table 3. How to assess study limitations
Risk of bias | Across studies | Interpretation | Considerations | GRADE assessment of study limitations
Low | Most information is from studies at low risk of bias. | Plausible bias unlikely to seriously alter the results. | No apparent limitations. | No serious limitations, do not downgrade
Unclear | Most information is from studies at low or unclear risk of bias. | Plausible bias that raises some doubt about the results. | Potential limitations are unlikely to lower confidence in the estimate of effect. | No serious limitations, do not downgrade
Unclear | (as above) | (as above) | Potential limitations are likely to lower confidence in the estimate of effect. | Serious limitations, downgrade one level
High | The proportion of information from studies at high risk of bias is sufficient to affect the interpretation of results. | Plausible bias that seriously weakens confidence in the results. | Crucial limitation for one criterion, or some limitations for multiple criteria, sufficient to lower confidence in the estimate of effect. | Serious limitations, downgrade one level
High | (as above) | (as above) | Crucial limitation for one or more criteria sufficient to substantially lower confidence in the estimate of effect. | Very serious limitations, downgrade two levels
b) Inconsistency
Inconsistency refers to unexplained heterogeneity of the pooled effect size (e.g., OR, RR,
HR, mean difference), which can be assessed by estimating I². If the source of
heterogeneity cannot be explained/identified, the quality of evidence should be downgraded
by one or two levels, depending on the magnitude of the inconsistency in the results.
If the source of heterogeneity can be identified, leading to subgroup analysis, rating of
evidence should be performed within subgroups rather than downgrading the overall pooled results.
Criteria used for inconsistency are as follows:
- Statistical criteria by Cochran's Q test (P < 0.1) or the I² rule as follows: < 40% may be
low, 30-60% may be moderate, 50-90% may be substantial, and 75-100% may be
considerable. The I² criterion needs to be considered together with the other two criteria
below because the estimate of I² depends on the number of included studies. For instance,
there may be high variation in point estimates of effect size across studies even though the
Q test is non-significant or I² is very low.
- Wide variation in point estimates of effect size across studies
- Minimal or no overlap of CIs, which suggests that the variation is more than what one would
expect by chance alone
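Because the I² bands quoted above overlap, a single I² value can fall into two bands, which is one reason the statistical criterion is read together with the other two. The short Python sketch below simply lists every band that contains a given value; the function name and the lower bound of 0 for the first band are assumptions for illustration.

```python
# Overlapping I^2 bands (in percent), with labels as used in the text.
I2_BANDS = [
    (0, 40, "low"),
    (30, 60, "moderate"),
    (50, 90, "substantial"),
    (75, 100, "considerable"),
]


def i2_bands(i2_percent):
    """Return the labels of all bands that contain the given I^2 value."""
    if not 0 <= i2_percent <= 100:
        raise ValueError("I^2 must be between 0 and 100")
    return [label for low, high, label in I2_BANDS if low <= i2_percent <= high]


print(i2_bands(55))
print(i2_bands(20))
```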
c) Indirectness of evidence
Direct evidence is regarded as good quality evidence; directness is judged by the target
patients, interventions, and outcomes of interest. Sources of indirectness can be as follows:
- Difference in study patients
If the included studies differ in patient spectrum, i.e., differ in disease
severity, or were conducted in mixed populations of children and adults, or were
conducted in specific diseases such as diabetes or hypertension, but the results will be
applied to general adults, the evidence is said to be indirect.
- Difference in intervention
Differences between studies in the intervention of interest, drug class, or dosage per day
can be used to downgrade the evidence.
- Difference in the outcome measures
This may be different methods of outcome measurement, different times of measurement,
or a surrogate versus a patient-important outcome, for instance, measuring HbA1C vs CVE
complications in diabetes. Evidence using a surrogate outcome can be downgraded by one or two
levels. Knowing the disease mechanism will help in deciding whether to downgrade; if the
surrogate outcome is far from the patient-important outcome, the evidence should be
downgraded by two levels. For instance, bone mineral density for fracture or calcium
calcification score for myocardial infarction should be downgraded by one level; use of
a sleep questionnaire and polysomnography to measure sleep quality in pregnancy may be
downgraded by two levels.
- Indirect comparison
If the treatment effect of C vs B is indirectly estimated from B vs A and C vs A, the indirect
effect of C vs B should be downgraded by one level, and by two levels if the consistency
assumption is violated.
d) Imprecision of the outcome
The optimal information size (OIS) is used to determine whether the 95% CI for each outcome
is adequately precise. The OIS can be estimated by a conventional sample size calculation for
that specific hypothesis test. If the total number of patients included in the systematic
review is less than the OIS, i.e., the conventional sample size for a single adequately powered
study, consider rating down for imprecision. The criteria for rating a dichotomous outcome are
as follows:
- If the OIS criterion is not met (i.e., the total number of patients included in the
systematic review is lower than the conventional sample size calculation n), rate
down for imprecision, unless the sample size is very large (at least 2000, and perhaps
4000 patients).
- If the OIS criterion is met (total number of patients ≥ conventional n) and the 95% CI
excludes no effect (i.e., the CI around the RR excludes 1.0), do not rate down for imprecision.
- If the OIS criterion is met and the 95% CI overlaps no effect (i.e., the CI includes an RR of
1.0), rate down for imprecision if the CI fails to exclude important benefit or
important harm. For instance, if the lower CI limit indicates large benefit or the upper CI
limit indicates high risk, the evidence may be rated down by one level. In addition, if the
lower CI limit indicates large benefit and the upper CI limit indicates high risk (or vice
versa), rating down by two levels may be considered.
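The imprecision criteria for a dichotomous outcome can be sketched as a two-step calculation. In this Python illustration, `ois` denotes the conventional sample size for a single adequately powered two-group study and `total_n` the total number of patients in the review. The sample size formula (two-sided test of two proportions, normal approximation), the thresholds for important benefit and harm (RR of 0.75 and 1.25), and the function names are all assumptions of this sketch, and the exception for very large samples is not modelled.

```python
import math
from statistics import NormalDist


def conventional_sample_size(p_control, p_treatment, alpha=0.05, power=0.80):
    """Total sample size (both groups) to compare two proportions."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p_control * (1 - p_control) + p_treatment * (1 - p_treatment)
    n_per_group = (z_alpha + z_beta) ** 2 * variance / (p_control - p_treatment) ** 2
    return 2 * math.ceil(n_per_group)


def imprecision_levels(total_n, ois, ci_lower, ci_upper, benefit=0.75, harm=1.25):
    """Suggested number of levels to rate down for imprecision of a risk ratio."""
    if total_n < ois:
        return 1                     # OIS criterion not met
    if ci_lower > 1.0 or ci_upper < 1.0:
        return 0                     # criterion met and CI excludes no effect
    # CI includes 1.0: count how many important effects it fails to exclude.
    return int(ci_lower <= benefit) + int(ci_upper >= harm)


# Hypothetical planning values: 20% risk in controls, 15% with treatment.
ois = conventional_sample_size(0.20, 0.15)
print(ois, imprecision_levels(2500, ois, 0.70, 1.40))
```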
e) Publication bias
If there is evidence of publication bias suggested by the funnel plot, Egger’s test, and a
contour-enhanced funnel plot, the evidence should be rated down by one level.
Evidence from a systematic review and meta-analysis can sometimes be rated up if it meets the
following criteria:
a) Large effect size
- If the effect size (e.g., RR, OR, HR, mean difference) is large, which is defined as > 2
or < 0.5, rating up by one level may be considered
- If the effect size is very large, i.e., OR/RR/HR > 5 or < 0.2, rating up by two levels may
be considered. However, this rule might not apply to the OR if the outcome
is not rare; the OR may need to be converted to an RR before applying it.
The effect size should be judged not only by the point estimate but also by the interval
estimate, i.e., the 95% CI
b) Dose-response effect
c) Effect of plausible residual confounding
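The large-effect thresholds and the OR-to-RR caveat can be illustrated together. The Python sketch below converts an OR to an approximate RR using the baseline risk in the reference group (a commonly used conversion formula, stated here as an assumption because the text does not name one) and then applies the > 2 / > 5 thresholds to a ratio measure. Function names and example values are hypothetical.

```python
def or_to_rr(odds_ratio, baseline_risk):
    """Approximate RR from an OR, given the risk in the reference group."""
    if not 0.0 <= baseline_risk < 1.0:
        raise ValueError("baseline_risk must be in [0, 1)")
    return odds_ratio / (1.0 - baseline_risk + baseline_risk * odds_ratio)


def large_effect_levels(ratio):
    """Levels to consider rating up for a ratio measure (RR, OR, or HR)."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    magnitude = max(ratio, 1.0 / ratio)   # treat protective and harmful effects alike
    if magnitude > 5:
        return 2
    if magnitude > 2:
        return 1
    return 0


# Hypothetical pooled OR of 6 for a common outcome (40% risk in the reference group).
pooled_or = 6.0
approx_rr = or_to_rr(pooled_or, 0.40)
print(large_effect_levels(pooled_or), approx_rr, large_effect_levels(approx_rr))
```

In this hypothetical case the OR alone would suggest rating up by two levels, while the converted RR of about 2 would not meet even the first threshold, which is the situation the caveat about non-rare outcomes warns against.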
3. ASSIGNMENTS
Topics | Score | Due Date
Assignment I: Searching for review topic/Literature review (explore review topic; literature review for background, rationale, research questions, review objectives) | 10% | Jun 4, 2018
Assignment II: Perform locate and select studies (literature review; search terms and strategies; inclusion and exclusion criteria; perform locate and select studies) | 10% | Jun 26, 2018
Assignment III: Design data extraction form & risk of bias assessment (design data extraction forms and quality or risk of bias assessment; perform data extraction & assess quality of studies) | 10% | Jul 3, 2018
Assignment IV: Statistical analysis plan (data analysis plan; dummy tables & figures) | 10% | Jul 12, 2018
Assignment V: Construct review proposal | 10% | Jul 26, 2018
Assignment VI: Register review proposal at PROSPERO | 10% | -
Assignment VII: Writing a manuscript | 40% | Sep 20, 2018 (4 p.m.)
4. REFERENCES
1. Egger M, Smith G, Altman D. Systematic reviews in health care: meta-analysis in context.
Second ed. London: BMJ Publishing Group; 2001.
2. Tantrakul V, Numthavaj P, Guilleminault C, McEvoy M, Panburana P, Khaing W, et al.
Performance of screening questionnaires for obstructive sleep apnea during pregnancy: A
systematic review and meta-analysis. Sleep medicine reviews. 2017;36:96-106.
3. Khaing W, Vallibhakara SA, Attia J, McEvoy M, Thakkinstian A. Effects of education and
income on cardiovascular outcomes: A systematic review and meta-analysis. Eur J Prev Cardiol.
2017;24(10):1032-42.
4. Anothaisintawee T, Reutrakul S, Van Cauter E, Thakkinstian A. Sleep disturbances
compared to traditional risk factors for diabetes development: Systematic review and meta-
analysis. Sleep medicine reviews. 2015;30:11-24.
5. Anothaisintawee T, Teerawattananon Y, Wiratkapun C, Kasamesup V, Thakkinstian A.
Risk prediction models of breast cancer: a systematic review of model performances. Breast Cancer
Res Treat. 2012;133(1):1-10.
6. Wilasrusmee C, Anothaisintawee T, Poprom N, McEvoy M, Attia J, Thakkinstian A.
Diagnostic Scores for Appendicitis: A Systematic Review of Scores’ Performance. British Journal
of Medicine & Medical Research. 2014;4(2):711-30.
7. Vejakama P, Thakkinstian A, Lertrattananon D, Ingsathit A, Ngarmukos C, Attia J. Reno-
protective effects of renin-angiotensin system blockade in type 2 diabetic patients: a systematic
review and network meta-analysis. Diabetologia. 2012;55(3):566-78.
8. Anothaisintawee T, Attia J, Nickel JC, Thammakraisorn S, Numthavaj P, McEvoy M, et al.
Management of chronic prostatitis/chronic pelvic pain syndrome: a systematic review and network
meta-analysis. JAMA. 2011;305(1):78-86.
9. Thakkinstian A, Han P, McEvoy M, Smith W, Hoh J, Magnusson K, et al. Systematic
review and meta-analysis of the association between complement factor H Y402H polymorphisms
and age-related macular degeneration. Hum Mol Genet. 2006;15(18):2784-90.
10. Wattanawong K, Rattanasiri S, McEvoy M, Attia J, Thakkinstian A. Association between
IRF6 and 8q24 polymorphisms and nonsyndromic cleft lip with or without cleft palate: Systematic
review and meta-analysis. Birth Defects Res A Clin Mol Teratol. 2016;106(9):773-88.
11. Numthavaj P, Thakkinstian A, Dejthevaporn C, Attia J. Corticosteroid and antiviral therapy
for Bell's palsy: a network meta-analysis. BMC Neurol. 2011;11:1.
12. Boonchan T, Wilasrusmee C, McEvoy M, Attia J, Thakkinstian A. Network meta-analysis
of antibiotic prophylaxis for prevention of surgical-site infection after groin hernia surgery. Br J
Surg. 2017;104(2):e106-e17.
13. Khaing W, Vallibhakara SA, Tantrakul V, Vallibhakara O, Rattanasiri S, McEvoy M, et al.
Calcium and Vitamin D Supplementation for Prevention of Preeclampsia: A Systematic Review
and Network Meta-Analysis. Nutrients. 2017;9(10).
14. Thakkinstian A, Dmitrienko S, Gerbase-Delima M, McDaniel DO, Inigo P, Chow KM, et
al. Association between cytokine gene polymorphisms and outcomes in renal transplantation: a
meta-analysis of individual patient data. Nephrol Dial Transplant. 2008;23(9):3017-23.
15. Sansanayudh N, Anothaisintawee T, Muntham D, McEvoy M, Attia J, Thakkinstian A.
Mean platelet volume and coronary artery disease: a systematic review and meta-analysis.
International journal of cardiology. 2014;175(3):433-40.
16. Sansanayudh N, Numthavaj P, Muntham D, Yamwong S, McEvoy M, Attia J, et al.
Prognostic effect of mean platelet volume in patients with coronary artery disease. A systematic
review and meta-analysis. Thrombosis and haemostasis. 2015;114(6):1299-309.
17. Wei Y, Royston P. Reconstructing time-to-event data from published Kaplan–Meier curves.
The Stata journal. 2017;17(4):786-802.
18. Rohatgi A. WebPlotDigitizer. 4.1 ed. Austin, Texas, USA; 2018.
19. Debray TP, Damen JA, Snell KI, Ensor J, Hooft L, Reitsma JB, et al. A guide to systematic
review and meta-analysis of prediction model performance. BMJ. 2017;356:i6460.
20. Moons KG, de Groot JA, Bouwmeester W, Vergouwe Y, Mallett S, Altman DG, et al.
Critical appraisal and data extraction for systematic reviews of prediction modelling studies: the
CHARMS checklist. PLoS medicine. 2014;11(10):e1001744.
21. Thakkinstian A, McKay GJ, McEvoy M, Chakravarthy U, Chakrabarti S, Silvestri G, et al.
Systematic review and meta-analysis of the association between complement component 3 and age-
related macular degeneration: a HuGE review and meta-analysis. American journal of
epidemiology. 2011;173(12):1365-79.
22. Hayden JA, van der Windt DA, Cartwright JL, Cote P, Bombardier C. Assessing bias in
studies of prognostic factors. Ann Intern Med. 2013;158(4):280-6.
23. Whiting PF, Rutjes AW, Westwood ME, Mallett S, Deeks JJ, Reitsma JB, et al. QUADAS-
2: a revised tool for the quality assessment of diagnostic accuracy studies. Ann Intern Med.
2011;155(8):529-36.
24. Stewart L, Moher D, Shekelle P. Why prospective registration of systematic reviews makes
sense. Systematic reviews. 2012;1:7.
25. Moher D, Liberati A, Tetzlaff J, Altman DG. Preferred reporting items for systematic
reviews and meta-analyses: the PRISMA statement. J Clin Epidemiol. 2009;62(10):1006-12.
26. Puhan MA, Schunemann HJ, Murad MH, Li T, Brignardello-Petersen R, Singh JA, et al. A
GRADE Working Group approach for rating the quality of treatment effect estimates from network
meta-analysis. BMJ. 2014;349:g5630.
5. TEXTBOOKS
1. Egger M, Smith GD, Altman DG. Systematic reviews in health care: Meta-analysis in
context. 2nd ed. London: BMJ Books; 2001.
2. Chalmers I, Altman DG. Systematic reviews. London: BMJ Publishing Group; 1995.
3. Petitti DB. Meta-analysis, decision analysis, and cost-effectiveness analysis.
Oxford University Press; 1994; pages 91-130, 194-196.
4. Higgins JPT, Green S. Cochrane Handbook for Systematic Reviews of Interventions.
Version 5.1.0; 2011.