SYSTEMATIC REVIEW AND META-ANALYSIS

MODULE I: SYSTEMATIC REVIEW

SEMESTER 3/2017

FOR DOCTOR OF PHILOSOPHY PROGRAM

IN CLINICAL EPIDEMIOLOGY AND MASTER OF

SCIENCE PROGRAM IN MEDICAL EPIDEMIOLOGY

FACULTY OF MEDICINE RAMATHIBODI HOSPITAL

MAHIDOL UNIVERSITY

PROF.DR.AMMARIN THAKKINSTIAN

[email protected]

WWW.CEB-RAMA.ORG

CONTENTS

1. INTRODUCTION AND BACKGROUND

2. REVIEW METHODOLOGY

2.1 Rationale and background

2.2 Formulate review questions and objectives

2.3 Locate/identify studies

2.3.1 Define search terms and strategies

2.3.2 Define source of relevant studies

2.3.3 Define the software & version (if possible) for search engines

2.4 Selection of studies

2.4.1 Define inclusion & exclusion criteria

2.4.2 Selection of studies

2.5 Data extraction

2.5.1 General characteristics of studies and subjects

2.5.2 Data for pooling

2.6 Risk of bias assessment

2.7 Statistical analysis

2.8 Register review proposal

2.9 Reporting systematic review and meta-analysis

2.10 Interpret results

2.11 Grading evidence

3. ASSIGNMENTS

4. REFERENCES

5. TEST BOOK

Objectives: Students should be able to

- Develop a review proposal for systematic review and meta-analysis.

- Perform and conduct a systematic review and meta-analysis.

APPENDICES: READING SECTION

Appendix I: Moher D, Liberati A, Tetzlaff J, Altman DG. Preferred reporting items for

systematic reviews and meta-analyses: the PRISMA statement. Ann Intern

Med. 2009;151(4):264-9, W64.

Appendix II: Higgins JP, Altman DG, Gotzsche PC, Juni P, Moher D, Oxman AD et al.

The Cochrane Collaboration's tool for assessing risk of bias in randomised

trials. BMJ 2011; 343: d5928.

Appendix III: Chapter 8: Assessing risk of bias in included studies from

http://handbook.cochrane.org/ (part II)

Appendix IV: The PRISMA statement for reporting systematic reviews and meta-

analyses of studies that evaluate health care interventions: explanation

and elaboration. PLoS Med 6 (7): e1000100, 2009.

Appendix V: Newcastle-Ottawa quality assessment:

(http://www.ohri.ca/programs/clinical_epidemiology/oxford.asp)

• Scale

• Manual

Appendix VI: Stewart L, Moher D, Shekelle P. Why prospective registration of systematic

reviews makes sense. Systematic reviews. 2012;1:7.

Appendix VII: Whiting PF, Rutjes AW, Westwood ME, Mallett S, Deeks JJ, Reitsma JB,

et al. QUADAS-2: a revised tool for the quality assessment of diagnostic

accuracy studies. Ann Intern Med. 2011;155(8):529-36.

Appendix VIII: Hutton B, Salanti G, Caldwell DM, Chaimani A, Schmid CH, Cameron C,

et al. The PRISMA extension statement for reporting of systematic reviews

incorporating network meta-analyses of health care interventions: checklist

and explanations. Ann Intern Med. 2015;162(11):777-84.

Appendix IX: Stewart LA, Clarke M, Rovers M, Riley RD, Simmonds M, Stewart G,

et al. Preferred Reporting Items for Systematic Review and Meta-Analyses of

individual participant data: the PRISMA-IPD Statement. JAMA.

2015;313(16):1657-65.

Appendix X: Stewart L, Moher D, Shekelle P: Why prospective registration of systematic

reviews makes sense. Systematic reviews 2012, 1:7.

Appendix XI: GRADE handbook

Appendix XII: Puhan MA, Schunemann HJ, Murad MH, Li T, Brignardello-Petersen R,

Singh JA, Kessels AG, Guyatt GH, Group GW: A GRADE Working Group

approach for rating the quality of treatment effect estimates from network

meta-analysis. BMJ 2014, 349:g5630.

1. INTRODUCTION AND BACKGROUND

From Egger M, Smith GD, Altman DG. Systematic reviews in health care: Meta-analysis in

context (1)

A systematic review is defined as a review that has been conducted using a systematic approach

in order to minimize biases and random errors. Several terms have been used in the

literature to refer to a systematic review, including systematic review itself, overview,

research synthesis, pooling, and meta-analysis. A systematic review does not necessarily

apply meta-analysis to pool effect sizes (e.g., treatment effect, exposure effect, or

genetic effect) across studies. Where pooling the effect size is the primary objective

and there are sufficient data for pooling, meta-analysis is the statistical tool for

estimating the effect sizes. Whether the estimated effect size is valid and precise

depends on the review methodology, including identification of studies, selection of

studies, data extraction, and the method of estimation. Thus, a good meta-analysis should be

based on a good review methodology; otherwise, the result of the meta-analysis will be

biased.

There are several rationales for performing a systematic review and meta-analysis, as

outlined below:

1.1 A systematic review allows a more objective appraisal of evidence (i.e., readers can

replicate it) than the traditional narrative review, and may help resolve uncertainty

when original studies or narrative reviews disagree. In addition, a systematic

review provides quantitative conclusions if meta-analysis is applied. Conversely, a narrative

review is more subjective and more prone to selection bias; it may be limited to a single

study or a few selected studies, may give unhelpful descriptions (e.g., no clear evidence),

and might be based on treatment effects or exposure-outcome relationships that are too

weak or too strong.

1.2 Meta-analysis, if applicable and appropriate, has more statistical power to detect the

treatment or exposure effect, and thus decreases the probability of false negative results. As a

result, efficacious treatments can be applied to patients sooner.

1.3 Exploratory analyses may suggest which groups of patients are more or less likely

to respond well to a treatment.

1.4 A systematic review may demonstrate a lack of adequate evidence or a gap in knowledge,

indicating that further studies are needed.
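The gain in statistical power described in 1.2 can be illustrated with a minimal sketch. It assumes a fixed-effect inverse-variance model purely for illustration (the statistical analysis section of this handout covers the choice of model), and the three log odds ratios and standard errors are hypothetical values, not data from any study.

```python
import math


def pool_fixed_effect(estimates, std_errors):
    """Inverse-variance fixed-effect pooling of study-level estimates."""
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * y for w, y in zip(weights, estimates)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return pooled, pooled_se


# Hypothetical log odds ratios and standard errors from three small trials.
log_or = [-0.45, -0.35, -0.40]
se = [0.30, 0.25, 0.35]

pooled, pooled_se = pool_fixed_effect(log_or, se)

# z statistics: each trial alone versus the pooled estimate.
z_single = [abs(y) / s for y, s in zip(log_or, se)]
z_pooled = abs(pooled) / pooled_se

print("single-trial |z|:", [round(z, 2) for z in z_single])
print("pooled |z|:", round(z_pooled, 2))
```

With these hypothetical inputs, no single trial reaches |z| > 1.96, whereas the pooled estimate does, because the pooled standard error is smaller than that of any individual trial.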

2. REVIEW METHODOLOGY

In the same manner as primary research such as observational studies or randomized

controlled trials, a systematic review (known as secondary research) should be performed based

on good review methods. Therefore, a research proposal must be developed prior to

conducting a review to make sure that the review methods are robust and transparent, and

minimize bias as much as possible. In addition, the review proposal will reassure investigators

that they will not miss important issues before or while conducting the review.

Furthermore, the review proposal should be registered to make it more transparent; see

sources and benefits of registration below. The outline of the research proposal is as follows:

2.1 Rationale and background

The investigator needs to introduce and describe the background of the disease or event

of interest to help readers become more familiar with what will be studied. For instance, the

magnitude of the disease (e.g., prevalence or incidence, where appropriate) across the world,

or by ethnicity (Caucasian, African-American, Asian, etc.), region (Asia, Europe, America,

Africa, etc.), or locality should be described if data are available. The impact or burden of

the disease on patients (e.g., morbidity, disability, economic loss, quality of life,

mortality) and on the country should also be described.

If the review is aimed at assessing treatment efficacy, information about the treatments of

interest should be described. For instance, how many treatment regimens are available, how long

the treatments have been used on patients and at what dosages, what relevant clinical

outcomes are usually assessed after administering the treatments, what is the mechanism by

which the treatment affects the outcome/s, and whether the treatments have adverse events

and, if so, what they are.

For genetic association studies, the study factor is a gene rather than a treatment.

Identification of the gene by genome-wide association studies (GWAS) should be

mentioned. In addition, the genes’ location and function should be described along with the

source of information. The most popular and very useful source is the Online Mendelian Inheritance in

Man (OMIM), which is a comprehensive, authoritative compendium of human genes and

genetic phenotypes by the McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins

University School of Medicine. It is freely available and updated daily, and can be reached

from PubMed by selecting OMIM as the database. A specific gene or polymorphism can then be

searched from there.

The question of interest can be about diagnostic performance. For instance, what is the

performance of the Berlin and STOP-BANG questionnaires in the diagnosis of obstructive sleep

apnea (OSA) compared with the standard test of polysomnography in pregnancy (2)? Details

of these questionnaires (i.e., how many domains/items there are, how each item is rated,

how the results are interpreted) should be described. Diagnostic parameters such as sensitivity,

specificity, positive/negative predictive values, positive/negative likelihood ratios, and area

under the receiver operating characteristic (ROC) curve, also known as the C-statistic, will be pooled.
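As a minimal sketch of how the diagnostic parameters listed above are obtained from one study, the function below computes them from a 2x2 table of index test against reference test. The function name and the counts are hypothetical and chosen only for illustration.

```python
def diagnostic_accuracy(tp, fp, fn, tn):
    """Diagnostic parameters from a 2x2 table (index test vs reference test).

    tp/fn: reference-positive subjects with a positive/negative index test.
    fp/tn: reference-negative subjects with a positive/negative index test.
    Assumes no denominator is zero (e.g., specificity is not exactly 1).
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "lr_positive": sensitivity / (1 - specificity),
        "lr_negative": (1 - sensitivity) / specificity,
    }


# Hypothetical counts for a screening questionnaire against polysomnography.
result = diagnostic_accuracy(tp=40, fp=20, fn=10, tn=130)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

Each included study would contribute one such 2x2 table; how the study-level parameters are then pooled is a separate step that this sketch does not cover.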

Exposure effects from various types of observational studies can also be the question of

interest for a systematic review and meta-analysis. For instance, previous studies have reported

discrepant effects of education and income on cardiovascular diseases. Khaing et al (3)

therefore conducted a systematic review and meta-analysis, which aimed to pool effect sizes

of education and income on cardiovascular diseases by including only cohort studies.

Anothaisintawee et al (4) conducted a systematic review and meta-analysis to assess

effects of sleep disturbances on diabetes, including short (<6 h) and long (>8 h) sleeping time,

insomnia (difficulty initiating or maintaining sleep), OSA, and abnormal sleep timing. In addition,

pooled effect sizes of these sleep parameters were further compared with other known risk

factors of diabetes, e.g., body mass index, family history of diabetes, and physical inactivity.

How these exposures (e.g., education, income, sleep disturbances) relate to disease

outcomes or disease mechanisms should be described.

With the growth of research on prediction models, which can be models of risk of

disease occurrence or of disease prognosis, systematic reviews and meta-analyses in this area are

growing as well. The systematic review may aim to collect the relevant prediction models that

are available for a specific event/disease, and to describe how those models were developed and

whether they have been validated both internally and externally. It is also important to know

how many variables/predictors were included in the model and what they were, which will be

useful for further application or further research. In addition, the review may also aim to assess

the performance of those previous prediction models by applying meta-analysis to pool their

performance. For instance, Anothaisintawee et al (5) conducted a systematic review of

risk prediction models for breast cancer to explore how many risk prediction models were

available, what variables were included in the models, whether those models had been validated

either internally or externally, and how they performed. Results of this review would help

identify a gap in knowledge for conducting further study. Wilasrusmee et al (6) also

conducted a systematic review of diagnostic scores for appendicitis, which aimed to explore

the history of the scores’ development, such as how many scores had been developed and were

available, how they were developed in terms of what variables were considered in univariate and

multivariate analyses, what types of models had been used for development, and how the models’

performance had been assessed in terms of calibration and discrimination. In addition, this review

also applied meta-analysis to pool the C-statistic and the observed/expected (O/E) ratio across studies.
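The two performance measures mentioned above can be sketched for a single validation data set as follows. The C-statistic (discrimination) is computed here by simple pairwise concordance and the O/E ratio (calibration) as observed events over the sum of predicted risks; both function names and the six subjects are hypothetical, and individual studies may have computed these measures differently.

```python
def c_statistic(predicted, observed):
    """Share of event/non-event pairs in which the subject with the event
    has the higher predicted risk; tied predictions count as 0.5."""
    events = [p for p, y in zip(predicted, observed) if y == 1]
    non_events = [p for p, y in zip(predicted, observed) if y == 0]
    concordant = 0.0
    for p_event in events:
        for p_non_event in non_events:
            if p_event > p_non_event:
                concordant += 1.0
            elif p_event == p_non_event:
                concordant += 0.5
    return concordant / (len(events) * len(non_events))


def oe_ratio(predicted, observed):
    """Observed number of events divided by the expected number
    (the sum of the predicted risks)."""
    return sum(observed) / sum(predicted)


# Hypothetical predicted risks and observed outcomes (1 = event) for six subjects.
predicted = [0.1, 0.2, 0.3, 0.6, 0.7, 0.9]
observed = [0, 0, 1, 0, 1, 1]

print("C-statistic:", round(c_statistic(predicted, observed), 3))
print("O/E ratio:", round(oe_ratio(predicted, observed), 3))
```

A C-statistic of 0.5 corresponds to no discrimination and 1.0 to perfect discrimination, while an O/E ratio close to 1 indicates that the model predicts about as many events as were observed.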

The next paragraph should describe what is already known or has been reported about the

efficacy of the treatments of interest or the association between exposures/genes and the disease

of interest. Are the treatment/exposure/gene effects controversial, i.e., treatments may work well in

some patients but not in other groups of patients, or exposure to a risk factor may be

associated with disease in some studies but not in others? What are

possible reasons for the controversial results? For treatment efficacy, could it be due to different

types of patients (disease severity, age, underlying diseases), different dosages or routes of

administration (local, oral, intravenous), or other additional treatments? For observational

studies, different study designs and measurements of exposure/outcome could also play a role.

The effect of ethnicity, which is known as population stratification, and the minor allele

frequency should also be considered in genetic association studies.
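As a minimal sketch of why minor allele frequency is worth tabulating by population, the functions below derive it from genotype counts. The genotype counts and population labels are hypothetical; only the counting rule (two alleles per subject) is standard.

```python
def allele_frequency(n_ref_hom, n_het, n_alt_hom):
    """Frequency of the alternative allele from genotype counts.

    Each subject carries two alleles: heterozygotes contribute one copy of
    the alternative allele and alternative homozygotes contribute two.
    """
    n_subjects = n_ref_hom + n_het + n_alt_hom
    return (2 * n_alt_hom + n_het) / (2 * n_subjects)


def minor_allele_frequency(n_ref_hom, n_het, n_alt_hom):
    """Frequency of whichever allele is less common in the sample."""
    p_alt = allele_frequency(n_ref_hom, n_het, n_alt_hom)
    return min(p_alt, 1 - p_alt)


# Hypothetical genotype counts (reference homozygote, heterozygote,
# alternative homozygote) in control groups from two populations.
genotype_counts = {
    "Population A": (360, 480, 160),
    "Population B": (810, 180, 10),
}

maf = {name: minor_allele_frequency(*counts)
       for name, counts in genotype_counts.items()}
for name, value in maf.items():
    print(f"{name}: MAF = {value:.2f}")
```

Large differences in minor allele frequency between populations, as in this hypothetical example, are one reason to consider ethnicity when pooling genetic association studies.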

For a diagnostic test, what test is usually used as the reference/standard/gold test, how is

the test performed, does it need a specialist to perform it, and is it invasive or costly? What new

or less invasive tests are available? What are the benefits of applying the new/index/studied test?

How does its performance in previous evidence compare with the reference/standard/gold test?

If at least one systematic review on the same topic has been finished and published, or

is not yet finished but is ongoing according to the PROSPERO web site, we should try very

hard to find reasons for conducting our review. The general review methods of the previous

review/s, including location and selection of studies, data extraction, and proper statistical

analysis, should be assessed. In addition, the number of studied factors (i.e.,

treatments/exposures/genes) and relevant clinical outcomes should be critiqued. Reasons that

can support an additional review include the following: all/most treatment regimens are

considered whereas the previous review/s considered only one or a few treatment comparisons;

more relevant clinical outcomes are considered compared with the previous review/s; more than

one polymorphism is considered and genetic mode/s of effect will be determined; additional

individual studies have been published since the previous review/s; etc. Collect all possible

reasons to show why this additional systematic review and meta-analysis is needed. In addition,

what further information will be added to the previous review/s should be mentioned. If we

cannot find good reasons to conduct our systematic review, we should not replicate this topic.

2.2 Formulate review questions and objectives

Formulation of the research question and objective/s is the heart of a review because it

leads to clearly defined review objectives and further review methods, including

location/identification and selection of studies, data extraction, and the statistical analysis plan.

The research question should consist of at least the study factors (e.g., treatment,

intervention, studied test/s, gene, exposure), outcomes of interest, and types of patients. If

the review is aimed at assessing treatment efficacy, the research question should be

more specific by considering one more component, i.e., the comparator. The combination

of these components is known as PICO in Evidence-based Medicine (EBM), in which P, I,

C, and O refer to patients, intervention, comparator, and outcome, respectively.

Examples of PICO:

- Is the occurrence of myocardial infarction (O) higher with Rosiglitazone (I) than Metformin

(C) in diabetic patients (P)? This PICO leads to the objective of comparing MI rates

between Rosiglitazone and Metformin groups in diabetic patients.

- Is renin-angiotensin blockade (I) better at reducing micro-/macro-albuminuria, end-stage

renal disease, and death (O) than other antihypertensive drugs (C) in diabetic

patients (P)?

- Is the complement factor H gene associated with age-related macular degeneration (O)

in the general population (P)?

- What is the performance of the Berlin and STOP-BANG questionnaires in the diagnosis of

obstructive sleep apnea (OSA) compared with the standard test of polysomnography in

pregnancy?

- Are sleep disturbances risk factors for diabetes?
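One way to keep the PICO components explicit while drafting a proposal is to record them in a small structure, as in the sketch below. The class name, field names, and the wording of the generated objective are hypothetical conveniences, not part of any standard.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ReviewQuestion:
    """Components of a review question (hypothetical helper)."""
    patients: str                     # P
    study_factor: str                 # I: intervention, exposure, gene, or index test
    outcome: str                      # O
    comparator: Optional[str] = None  # C: needed for treatment-efficacy questions

    def objective(self) -> str:
        """Turn the components into a one-sentence objective."""
        if self.comparator:
            return (f"To compare {self.outcome} between {self.study_factor} "
                    f"and {self.comparator} in {self.patients}.")
        return (f"To assess the association between {self.study_factor} "
                f"and {self.outcome} in {self.patients}.")


efficacy = ReviewQuestion(
    patients="diabetic patients",
    study_factor="Rosiglitazone",
    outcome="myocardial infarction rates",
    comparator="Metformin",
)
exposure = ReviewQuestion(
    patients="the general population",
    study_factor="sleep disturbances",
    outcome="diabetes",
)

print(efficacy.objective())
print(exposure.objective())
```

Leaving the comparator empty for exposure or genetic questions mirrors the text above, where the comparator is the additional component required for treatment efficacy.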

2.3 Locate/identify studies

2.3.1 Define search terms and strategies

Search terms can be constructed based on PICO. All relevant search terms should be collected

and listed according to each domain of P, I, C, and O. Investigators need to collect words or

terminology (or key words) that have the same meaning and have been used interchangeably.

Once the search terms have been collected, search strategies should then be constructed by

combining these search terms within the same and different domains following functions that

are specifically used for each search engine. Different search engines may require different

combined functions and thus the search strategies may be different and should be clearly

described for all search engines used. Basically, search terms are combined using ‘OR’ within

each domain of PICO and ‘AND’ between domains of PICO.

Search strategies can be sensitive or specific depending on the research question and the

availability of relevant studies. If not many relevant studies are available for the topic, sensitive

search strategies may be required, combining search terms of only a few domains. For instance,

- Combining search terms for only P and I, ignoring C and O, to assess treatment efficacy;

- Combining only type 1 diabetes for P and sleep disturbance for I/E, ignoring O, to assess

the association between sleep disturbance and glycemic control in type 1 diabetes.

Specific search strategies are stricter, i.e., they must consist of at least 3 or all domains of

interest. For instance, combining search terms of P, I, C, and O for treatment efficacy retrieves

studies that had not only the specific patients and intervention but also the comparator and outcome

of interest. Including comparator/s in search strategies will be more specific and result in

fewer identified studies than ignoring the comparator and/or outcome/s. If investigators

have prior knowledge of how research in this area is growing, and some studies

have already been conducted and are available in the literature, adding comparator/s will make the

process of study selection flow more efficiently. Conversely, omitting comparator/s from

search strategies will be very sensitive and result in many identified studies for selection.

This is good if investigators know that not many studies have been published, or if

investigators would like to collect all possible comparators.

In summary, sensitive search strategies will identify a large number of studies to review,

many of which may not be relevant. Conversely, specific search strategies will yield a small

number of identified studies and may miss some relevant studies. Investigators should justify

which type of search strategy suits their topic.
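The combination rule and the sensitive/specific trade-off described in this section can be sketched as follows. The function name and the search terms are hypothetical, and the output uses generic Boolean syntax only; real search engines add their own field tags, subject headings, and truncation rules, which must be written per engine.

```python
def build_query(terms_by_domain, domains):
    """Combine terms with OR within each domain and AND between domains."""
    blocks = []
    for domain in domains:
        terms = terms_by_domain[domain]
        blocks.append("(" + " OR ".join(f'"{term}"' for term in terms) + ")")
    return " AND ".join(blocks)


# Hypothetical search terms collected for each PICO domain.
terms = {
    "P": ["type 2 diabetes", "diabetes mellitus"],
    "I": ["rosiglitazone", "thiazolidinedione"],
    "C": ["metformin"],
    "O": ["myocardial infarction", "heart attack"],
}

# Sensitive strategy: only P and I are combined.
sensitive = build_query(terms, ["P", "I"])
# Specific strategy: all four domains are combined.
specific = build_query(terms, ["P", "I", "C", "O"])

print(sensitive)
print(specific)
```

Because every additional domain adds another AND clause, the specific query can only return a subset of what the sensitive query returns.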

2.3.2 Define source of relevant studies

All sources of information used for locating studies should be clearly described. These should

mainly be at least two electronic databases, reference lists, contact with authors, and

personal communication with experts in the field. In addition, the search period must be

specified. Updated searches should also be planned if the review will take time. The frequency

of updating depends on the rate at which information on the topic increases: it can be

every two weeks or every month if it is a hot issue and some studies are published

monthly; otherwise it can be every 3 to 6 months. This can be set up automatically in the

PubMed and Scopus search engines. A few main medical and health science electronic databases

are commonly used, as follows:

a) MEDLINE

- Since 1949 to present

- Over 16 million references

- Since 2005, between 2,000-4,000 completed references are added each day (Tuesday

through Saturday)

- Covers 5200 worldwide journals in 40 languages

- Uses medical subject heading (MeSH) for index

- Includes biomedicine and health science journals

- English abstracts for 79% of references

- 90% are English language articles

- 47% of journals covered are published in the US

- PubMed is available free of charge

b) EMBASE

- Over 12 million records from 1974-present

- More than 600,000 records added annually

- Covers over 4,800 active peer-reviewed journals published in 70 countries/ 30 languages

- Uses EMTREE for indexing

- Includes English abstracts for 80% of references

- Daily update, within two weeks of receipt of the original journal

- Comprehensive inclusion of drug-related information

- Produced by Elsevier, no free version available

c) Scopus

- launched in November 2004

- 18,000 titles

– 16,500 peer-reviewed journals (1,200 Open Access journals)

– 600 trade publications

– 350 book series

– 3.6 million conference papers (~10%) from proceedings and journals

• Medical Science ~2.9%

• Biological Science ~ 2.7%

• Chemical Science ~ 1.9%

– 41 million records

• 21 million records with references back to 1996

• 20 million records 1823-1996

– 318 million scientific web pages

– 23 million patent records from five patent offices

• World Intellectual Property Organization (WIPO)

• European Patent Office

• US Patent Office

• Japanese Patent Office

• UK Intellectual Property Office

d) The Cochrane Controlled Trials Register (CCTR)

e) ClinicalTrials.gov is a registry of federally and privately supported clinical trials,

conducted in the United States and around the world. It is run by the United States

National Library of Medicine (NLM) at the National Institutes of Health. Currently

there are 130,000 trials with locations in 170 countries. The web site provides

information about a trial's purpose, who may participate, locations, and phone numbers

for more details.

f) Nonliterary sources are (from Systematic Reviews 2014, 3:74)

- The American clinicaltrials.gov

- The World Health Organization's International Clinical Trials Registry Platform

at: http://www.who.int/ictrp/en/

- The European clinicaltrialsregister.eu.

2.3.3 Define the software & version (if possible) for search engines

– PubMed

– OVID

– Scopus

– Silver Platter

2.4 Selection of studies

2.4.1 Define inclusion & exclusion criteria

Eligibility criteria for selection of studies should be clearly described in the review proposal.

Inclusion criteria can be defined mostly based on PICO and additional characteristics of the

study, study design, type of studied patients, and type of reports (e.g., language, year,

publication status) as follows:

- P: Types of Patients/subjects

Clearly describe what types of patients/subjects your review will focus on. For instance,

general patients or patients with specific diseases; which age group e.g., children, adults

(aged ≥ 18 years), or the elderly; only women with/without menopause, or only men, or both

genders will be considered in the review. This will help in framing the scope of the studied

population, and will also help in targeting whom the results of the review will later be applied

to. In addition, the review may be interested in specific group/s of that disease, e.g.,

moderate-severe hypertension, moderate-severe obstructive sleep apnea, type 1 or type 2

diabetes, diabetes with glycated hemoglobin of 6.5-7.5%, chronic kidney disease stage III or

higher, cleft lip with or without cleft palate, etc.

- I: Treatment/intervention for RCT, exposure/gene for observational studies, index/studied

test for diagnostic study

What treatments will this review consider? Will the review focus on only specific dosages or all

ranges of possible dosages? For instance, the review of reno-protective effects (7) included

studies if their interventions were any type of angiotensin converting enzyme inhibitors (ACEIs)

and their comparator was placebo or any type of active anti-hypertensive drug. The review

of management of chronic prostatitis/chronic pelvic pain syndrome (CP/CPPS) (8) included

studies if the interventions were any type of alpha-blocker or antibiotic compared with

placebo or any type of active treatment.

For observational studies, what is your studied factor (exposed and unexposed), and how has

it been defined or measured? For instance, studies will be included if they assessed

effects of any type of sleep disturbance, measured either by questionnaire or by objective

measures, on type 2 diabetes, including duration of sleep, sleep difficulty, and OSA (4). For

diagnostic tests, studies will be selected if they performed and assessed the index/studied test

against the reference/standard test. For instance, diagnostic studies were included if they used

at least one of the OSA screening questionnaires (e.g., Berlin questionnaire, STOP-BANG

questionnaire, ESS, etc.) and at least one of the standard objective sleep tests such as

polysomnography, Watch-PAT, or home monitoring (2).

For a genetic study, studies will be included if they assessed the association between any

locus/polymorphism of interest and the disease outcome. For instance, the review of the

association between the complement factor H (CFH) Y402H polymorphism and age-related

macular degeneration (9) included studies if they reported alleles T and C, or TT, TC, and

CC genotypes, for CFH Y402H. For the review of the IRF6 gene and non-syndromic cleft lip

with/without cleft palate (10), studies were included if they had studied any of the polymorphisms

(i.e., rs2235371 G>A, rs642961 G>A, and rs2013162 C>A) in the IRF6 gene.

- C: Comparator

For treatment efficacy, comparators can be placebo, a specific standard treatment, or

even an active treatment. Thus, studies will be eligible if they used placebo, standard

treatment, or active treatment as the comparator, as described above. For instance, studies

that used any type of active treatment as comparator were included in the review of

Bell’s palsy management (11), i.e., dual treatment of Acyclovir plus Prednisolone versus

monotherapy of Prednisolone or Acyclovir. The comparator can sometimes be any of the

active interventions of interest, particularly when a network meta-analysis is the

primary aim. For instance, the comparator can be any type of antihypertensive drug or

placebo for the review of reno-protective effects of renin-angiotensin system blockade in type

2 diabetic patients (7). The comparator can be placebo or any type of antibiotic prophylaxis for

prevention of surgical-site infection after groin hernia surgery (12). The comparator can be any

type of supplementation (i.e., calcium, vitamin D, calcium plus vitamin D) in the review of

prevention of preeclampsia (13).

- O: Outcomes

Eligible studies must have reported at least one outcome of interest. For instance, the review

will include only studies that reported any of the adverse outcomes, e.g., gastro-intestinal (GI)

bleeding, gastric ulcer, or myocardial infarction. The review of Interleukin-10 polymorphism

and graft outcomes in renal transplantation (14) included studies whose outcomes

were graft failure or chronic allograft nephropathy. For the reno-protective effects of ACEIs

(7), studies were included if they reported any of the following outcomes of interest:

micro-albuminuria, macro-albuminuria, albuminuria regression, and ESRD. The review of CP/CPPS

(8) included studies that reported any of the following scores (i.e., total symptom, pain,

voiding, and quality of life scores) or treatment responsiveness. For diagnostic tests, studies were

included if they used any of the standard tests, including polysomnography, Watch-PAT, or home

monitoring, for diagnosis of OSA (2).

- Study designs

The review can be limited to only randomized controlled trials if the primary aim is to assess

treatment efficacy (or adverse events), given that there are sufficient data for pooling, say ≥ 3 to 5

studies for direct meta-analysis or at least one study for network meta-analysis. Other

methodological features, e.g., randomization, concealment, blinding, and measurement of outcomes,

may sometimes be considered in the inclusion criteria.

For genetic studies, included studies can be any type of population-based observational

study (e.g., cross-sectional, case-control, or cohort study), either community-based or

hospital-based; these are known as genetic association studies. Reviews in

genetic areas can sometimes cover family-based (e.g., sib-pair) studies, for which the method

of pooling data differs from that for genetic association studies. Thus, investigators should clearly

describe what type of studies they intend to focus on.

Pooling effect sizes of risk/prognostic factors can sometimes focus on only cohort studies if

there are sufficient data for pooling. An exposure-effect relationship can be claimed with the

cohort design, and the evidence is thus stronger than exposure-effect estimates from case-control

or cross-sectional studies. For instance, pooling of the effects of mean platelet volume on

cardiovascular events (15) and disease prognosis (16) was based on only data from cohort studies.

- Report’s characteristics

The review can be limited by type of report. For instance, the authors might choose only studies of the treatments of interest reported after a specific point in time. Or the authors might use only full papers, either published or unpublished, written in English or in any other language for which reviewers are available.

- Studies will be excluded if there are insufficient data for pooling. However, this decision should be made only after 2-3 attempts to contact the authors. Contacting authors should not be neglected and should be planned in the review proposal. Collecting e-mail addresses of the corresponding and first authors during the selection process is very useful when this is required.

- Coding for ineligibility should be designed in the review proposal. This allows reviewers to code ineligible studies consistently throughout the review process. In addition, it makes it possible to summarize the reasons for ineligibility and report them in the flow of study selection. A meeting among reviewers (if there is more than one) should be called to standardize the understanding of the eligibility criteria and the coding for ineligibility. One study can have more than one reason for ineligibility, but the most important reason should be assigned.

2.4.2 Selection of studies

Identified/located studies will be selected based on the inclusion and exclusion criteria. Some degree of subjectivity in deciding whether to select a study sometimes arises during the selection process. Therefore, two reviewers should, if possible, assess the eligibility of candidate studies to minimize selection bias.

The selection process should also be clearly described. It should start with screening the titles and abstracts of all identified studies. Full papers will be retrieved if eligibility cannot be determined from the title/abstract. If a full paper is not accessible, the corresponding or first author of that article should be contacted to request it. Excluding studies because the full paper is inaccessible should be avoided. If a study does not meet the inclusion criteria, the reason should be documented so that reasons for ineligibility can be summarized. In addition, an exclusion log giving the reason for exclusion of each study should be kept. To be consistent across the whole review, reasons for ineligibility and exclusion must be planned and coded in the review proposal. This makes computerization and analysis of the data easy once the review has been finished.

Finally, the selection results should be computerized separately by the two reviewers, if applicable. The two data sets will be compared using STATA or other statistical software to identify disagreements, i.e., studies selected by reviewer 1 but not by reviewer 2, or vice versa. A Kappa statistic should be calculated and reported. Investigators should plan a panel meeting, which should include not only the two reviewers but also a third party, such as a supervisor or a person with more experience in systematic reviews. Disagreements will be resolved by discussion and consensus at this meeting.
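The agreement check described above can be sketched as follows. This is a minimal Python illustration only (the module itself works in STATA); the function name and the ten include/exclude decisions are hypothetical.

```python
def cohen_kappa(ratings1, ratings2):
    """Cohen's kappa for two reviewers rating the same studies."""
    if len(ratings1) != len(ratings2) or not ratings1:
        raise ValueError("both reviewers must rate the same non-empty set of studies")
    n = len(ratings1)
    # Observed agreement: proportion of studies with identical decisions
    observed = sum(a == b for a, b in zip(ratings1, ratings2)) / n
    # Chance agreement: product of the reviewers' marginal proportions per category
    categories = set(ratings1) | set(ratings2)
    expected = sum((ratings1.count(c) / n) * (ratings2.count(c) / n) for c in categories)
    if expected == 1.0:
        return 1.0  # both reviewers gave the same single decision to every study
    return (observed - expected) / (1 - expected)


# Hypothetical decisions for 10 studies: 1 = include, 0 = exclude
reviewer1 = [1, 1, 0, 0, 1, 0, 0, 0, 1, 0]
reviewer2 = [1, 0, 0, 0, 1, 0, 0, 1, 1, 0]

kappa = cohen_kappa(reviewer1, reviewer2)
# Positions (0-based) of the studies to bring to the panel meeting
disagreements = [i for i, (a, b) in enumerate(zip(reviewer1, reviewer2)) if a != b]
```

The list of disagreements is what the panel meeting would work through; kappa summarizes agreement beyond chance.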

2.5 Data extraction

A data extraction form (DEF) should be designed before the study commences; its structure and contents should cover the following:

2.5.1 General characteristics of studies and subjects

Characteristics of studies should be extracted to give readers an idea of the included studies. General characteristics include the first author (and the corresponding author, for contact if required) along with e-mail address, journal name, and publication year (these can be downloaded and linked from EndNote). More specific characteristics include country of study setting, study design (e.g., cohort, case-control, cross-sectional study, RCT, community- vs hospital-based), study period, ethnicity (particularly in genetic studies, because gene frequencies may differ accordingly), type of subjects studied (e.g., adults, children, or women), target population (general patients or a specific disease), range of follow-up for cohort studies/RCTs, and prevalence/proportion of underlying diseases (e.g., obesity, hypertension, diabetes, dyslipidemia, chronic kidney disease).

In addition, information about study factors and outcomes must be extracted, including type of treatment (drug class, dosage/day, course of treatment, route) for RCTs, exposure and type of measurement for observational studies, type of polymorphism for genetic studies, and type of outcome and method of measurement/diagnosis. Furthermore, the type of outcome data (e.g., continuous, dichotomous, or both) and the type of reported data, i.e., frequency/summary data or statistical parameters (e.g., mean difference, regression coefficient, odds ratio (OR), risk ratio (RR), or hazard ratio (HR)), should be recorded.

These characteristics of studies and subjects allow investigators to explore causes of heterogeneity, if present, and to perform subgroup analyses to identify specific patients/subjects who may gain more or less benefit from treatments, or subgroups of subjects who may be at higher risk from a specific risk factor or gene.

2.5.2 Data for pooling

Tables of treatment/exposure/gene/index test and outcomes of interest should be constructed. This allows reviewers to understand clearly and consistently what specific data need to be extracted for pooling. For continuous outcome data, the number of patients and the mean with its standard deviation (SD) of outcome values by treatment/exposure/genotype/allele group need to be extracted. For dichotomous outcome data, the frequencies of the contingency table of treatment/exposure/genotype/allele/index test by outcome group are required.

Sometimes these summary/aggregate data are not reported in articles; instead, statistical parameters such as the beta coefficient or mean difference for continuous outcomes, OR or RR for dichotomous outcomes, and HR for time-to-event outcomes are reported along with their 95% confidence intervals (CI). The beta coefficient of group 1 vs group 2 is equivalent to the mean difference, mean1 − mean2; thus it can be combined with mean differences calculated from summary/aggregate data of means and SDs. Likewise, ORs or RRs reported as summary statistics should be combined with ORs or RRs estimated from frequency data.
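To make the two kinds of extracted data comparable, each study is usually reduced to an effect estimate and its standard error. The sketch below shows the standard large-sample formulas in Python; the function names and example numbers are hypothetical, and no continuity correction for zero cells is applied.

```python
import math


def mean_difference(n1, mean1, sd1, n2, mean2, sd2):
    """Mean difference (group 1 minus group 2) and its standard error."""
    md = mean1 - mean2
    se = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return md, se


def log_odds_ratio(a, b, c, d):
    """Log OR and SE from a 2x2 table.

    a, b = events / non-events in the treated (exposed) group
    c, d = events / non-events in the control (unexposed) group
    All four cells must be non-zero.
    """
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return log_or, se


def log_ratio_from_ci(estimate, lower, upper, z=1.96):
    """Log of a reported OR/RR/HR and its SE recovered from the 95% CI."""
    return math.log(estimate), (math.log(upper) - math.log(lower)) / (2 * z)


# Study reporting frequencies: 10/100 events (treated) vs 20/100 (control)
lor_freq, se_freq = log_odds_ratio(10, 90, 20, 80)
# Study reporting only OR = 2.0 (95% CI 1.0 to 4.0)
lor_ci, se_ci = log_ratio_from_ci(2.0, 1.0, 4.0)
# Study reporting n, mean, SD per group
md, se_md = mean_difference(50, 10.0, 2.0, 50, 9.0, 2.0)
```

Both routes give a (log effect, SE) pair, which is the form a pooling routine needs.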

In genetic association studies, genotype effects are sometimes considered as dominant (aa+Aa vs AA, where A and a are the major and minor alleles), recessive (aa vs Aa+AA), or additive (0, 1, 2 for AA, Aa, aa) effects. If genotype frequencies for AA, Aa, and aa across outcome groups/values are available, extract these data. If not, extract the reported effects under the assigned genetic model; whether they can be pooled depends on the mode of gene effect suggested by pooling the summary data.
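Extracting the full genotype frequencies keeps every model open, because the dominant and recessive contrasts can be derived afterwards. A small Python sketch under that assumption follows; the function names and counts are hypothetical.

```python
def collapse(counts, model):
    """Collapse genotype counts (n_aa, n_Aa, n_AA) into (risk group, reference group)."""
    n_aa, n_Aa, n_AA = counts
    if model == "dominant":    # aa + Aa vs AA
        return n_aa + n_Aa, n_AA
    if model == "recessive":   # aa vs Aa + AA
        return n_aa, n_Aa + n_AA
    raise ValueError("model must be 'dominant' or 'recessive'")


def genotype_odds_ratio(case_counts, control_counts, model):
    """Odds ratio of being a case for the risk group vs the reference group."""
    case_risk, case_ref = collapse(case_counts, model)
    ctrl_risk, ctrl_ref = collapse(control_counts, model)
    return (case_risk * ctrl_ref) / (case_ref * ctrl_risk)


# Hypothetical genotype counts in the order (aa, Aa, AA)
cases = (20, 50, 30)
controls = (10, 40, 50)

or_dominant = genotype_odds_ratio(cases, controls, "dominant")
or_recessive = genotype_odds_ratio(cases, controls, "recessive")
```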

For time-to-event outcomes, most studies report HRs with 95% CIs for treatment/exposure effects on the outcome of interest. These can be directly combined in a meta-analysis under the assumption that the HR is constant over time (proportional hazards), which sometimes does not hold. In that case pooling HRs may be invalid, and pooling may need to use other parametric models or the restricted mean survival time (17). This requires individual patient data (IPD) on time and event from each study.

What we should therefore extract from included studies is the data from the Kaplan-Meier (KM) curve, on both the y-axis and x-axis, along with the numbers of patients at risk at each time point shown underneath. These data can be converted to individual patient data (raw data) using the ‘ipdfc’ command in STATA (17). This command requires the following data:

• ts = time at any point of X–axis

• s = survival probability or failure probability at corresponding time ts

• trisk = time at risk

• nrisk = number patients at risk at corresponding trisk

where ts and s can be extracted from the KM curve using graph digitizing software such as DigitizeIt (17) or WebPlotDigitizer (18) (https://automeris.io/WebPlotDigitizer/). The latter is freeware; see the tutorial on its website for how it works. Once ts and s are extracted, the data should be saved as a csv file and imported into STATA; trisk and nrisk are then added manually. The data are then ready to be converted to IPD.

For systematic reviews of prediction models, data extraction forms should be designed specifically for the research methods of this type of study, following our experience with this type of review (5, 6), the guideline (19), and the CHARMS checklist (20), and should include the following:


• Type of model (diagnosis vs prognosis) if considering both in one review

• Time that the outcome of interest would be predicted by that prediction model

• Setting, i.e., the time point at which the model is intended to be used

• Type of study phase, i.e., development (derivation) vs validation

• Type of patients, e.g., general vs specific disease

• Study design, i.e., cohort (retrospective, prospective, bidirectional cohorts), cross-

sectional studies

• Prevalence/incidence of the outcome of interest in that setting

• Number of variables considered in univariate and multivariate analyses, for the development phase

• Number of variables included in the final prediction model, and what they were

• Proportion exposed for each variable included in the prediction model in that setting

• Model performance, including

- Calibration, by statistical test and graph

- Discrimination, by C-statistic

Data for pooling in a meta-analysis should also be extracted if possible. Required data are the C-statistic with its 95% CI or standard error for discrimination performance, and observed/expected (O/E) data for calibration performance.
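When a study reports a C-statistic with only its 95% CI, a standard error can be approximated for pooling. The Python sketch below assumes a symmetric (Wald-type) interval on the original scale, which is an assumption and not something every study satisfies; the names and numbers are hypothetical.

```python
def se_from_ci(lower, upper, z=1.96):
    """Approximate SE from a symmetric 95% CI: width divided by 2 * 1.96."""
    return (upper - lower) / (2 * z)


def oe_ratio(observed_events, expected_events):
    """Observed/expected ratio; 1.0 indicates perfect calibration-in-the-large."""
    return observed_events / expected_events


# Hypothetical validation study: C-statistic 0.75 (95% CI 0.70 to 0.80),
# 45 observed events against 50 predicted by the model
c_stat_se = se_from_ci(0.70, 0.80)
calibration = oe_ratio(45, 50)
```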

Once the DEF is designed, the first reviewer should pilot it with 3-5 studies and then revise it accordingly to make it more applicable. Data extraction requires at least two reviewers working independently, or at least one reviewer extracting the data while another checks the validity of the extraction. Once data extraction is complete, the data should be computerized separately by the reviewers. Validation should then be performed using the same technique as described above for selection of studies. If possible, this should be done independently by a statistician who is not involved in the data extraction process. A Kappa statistic should also be estimated. A meeting between the reviewers and a third party should be set up to resolve disagreements.

2.6 Risk of bias assessment

Assessing quality or risk of bias concerns the internal and external validity of a study. In general, an individual study should at least have internal validity before its results are generalized. Any systematic bias, such as selection bias, performance bias, detection bias, measurement bias, or attrition bias, will distort internal validity. Issues such as patients’ characteristics (inclusion/exclusion criteria), setting (primary versus referral center), treatment regimens, or outcome measurements will affect external validity.

A few quality assessment scales have been developed and used for RCTs, such as the Chalmers (Chalmers TC) and Jadad scales. Later, Higgins and Altman (see Cochrane handbook 2011, Higgins et al 2011 in Appendix) encouraged reviewers to assess the risk of bias of individual studies by type of bias, see appendix. These include 6 domains: selection bias, performance bias, attrition bias, detection bias, reporting bias, and other bias. Seven sources of bias are assessed: random sequence generation and allocation concealment for selection bias; blinding of participants and health care team for performance bias; blinding of outcome assessors for detection bias; incomplete outcome data for attrition bias; selective outcome reporting for reporting bias; and anything else for other sources of bias. Each item is checked for whether it is prone to bias, and the possible answers are low or high risk of bias. If there is insufficient information to make a decision, the answer can be ‘unclear’. Sometimes a few items may not apply to RCTs for specific questions, in which case the answer should be ‘not applicable’.

A risk of bias assessment form for genetic association studies has been developed by Thakkinstian et al (21). Both general epidemiological issues and genetic association issues are assessed. The general epidemiological issues are selection bias, information bias, confounding bias, and selective outcome reporting, whereas the genetic issues are multiple testing and the Hardy-Weinberg equilibrium test. Each item is again graded as low, high, or unclear risk of bias. Investigators should plan and describe in the review proposal what they mean by low, high, or unclear for each item.
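The Hardy-Weinberg equilibrium item is usually judged from genotype counts, typically in the control group. A minimal Python sketch of the one-degree-of-freedom chi-square test follows; the function name and counts are hypothetical, and an exact test may be preferable when counts are small.

```python
import math


def hwe_chi_square(n_AA, n_Aa, n_aa):
    """Chi-square test (1 df) of Hardy-Weinberg equilibrium from genotype counts."""
    n = n_AA + n_Aa + n_aa
    p = (2 * n_AA + n_Aa) / (2 * n)   # frequency of allele A
    q = 1 - p                         # frequency of allele a
    if p in (0.0, 1.0):
        raise ValueError("only one allele observed: the test is undefined")
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_AA, n_Aa, n_aa)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper-tail probability of a chi-square variable with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical control groups: one in equilibrium, one with no heterozygotes
chi2_ok, p_ok = hwe_chi_square(25, 50, 25)
chi2_bad, p_bad = hwe_chi_square(50, 0, 50)
```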

For general observational studies, the Newcastle-Ottawa quality assessment scale can be used for assessing the quality of a study (http://www.ohri.ca/programs/clinical_epidemiology/oxford.asp, see appendix). The original scale is designed for both cohort and case-control studies. A cross-sectional study can use the same form as a cohort study, modified where appropriate. Three domains are assessed, i.e., selection, comparability, and exposure (or outcome, for cohort studies). The scale should be modified to suit each review.

Quality in Prognosis Studies (QUIPS) is a more specific tool that has recently been designed for prognostic studies (22). It consists of 6 domains: study participants, study attrition, prognostic factor measurement, outcome measurement, study confounding, and statistical analysis and reporting. Each domain is graded as low, moderate, or high risk of bias.


The revised tool for quality assessment of diagnostic accuracy studies (QUADAS-2) is used for systematic reviews of diagnostic studies (23). The tool consists of four domains: patient selection, index test, reference standard, and flow and timing. Each domain is graded as low, high, or unclear risk of bias.

Investigators are required to discuss how to assess risk of bias for each item, and what information should be sought from the study to aid grading. This should be clearly described in the review proposal, and reviewers should understand well how to do it. The evaluation consists of 2 parts. First, reviewers grade the study as low, high, or unclear risk of bias for each item based on supporting information from the study. If an item cannot be assessed, document ‘NA’. Second, the supporting information or relevant trial characteristics used in deciding on each source of bias should be documented.

The quality scores or sources of bias might be associated with treatment effects and can thus be used to explore sources of heterogeneity. This process should also be performed by two reviewers. Blinding reviewers to authors, institutions, or journals is sometimes useful. Disagreements between the two reviewers should be resolved by discussion and consensus. Levels of agreement for each domain and across all domains may be assessed using Kappa statistics.

2.7 Statistical analysis

As with primary studies, the statistical analysis plan should be clearly described in the review proposal. This lets the investigator team understand, and be assured of, how the data will be analysed to answer their research questions. For a direct meta-analysis, the statistical analysis plan should cover pooling methods for all types of outcome (e.g., prevalence, continuous outcome, dichotomous outcome). Assessing heterogeneity by a test and by estimating the degree of heterogeneity (I²) should be mentioned. Exploration of the sources of heterogeneity, by meta-regression or subgroup analysis, should be planned in case heterogeneity is present. The study and patient characteristics to be used for this exploration should be clearly described. Finally, publication bias assessment by Egger’s test and funnel plot should also be planned. A contour-enhanced funnel plot will be used if there is evidence of funnel asymmetry from either the test or the plot.
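As an illustration of the heterogeneity step in such a plan, the sketch below pools study effects by inverse-variance (fixed-effect) weighting and derives Cochran's Q and I². It is a minimal Python example with hypothetical names and data; the pooling model actually planned (e.g., random effects) may differ.

```python
import math


def pool_fixed_effect(effects, std_errors):
    """Inverse-variance pooled effect with Cochran's Q and I-squared (%)."""
    weights = [1 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, effects)) / total_weight
    pooled_se = math.sqrt(1 / total_weight)
    # Q: weighted squared deviations of study effects from the pooled effect
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    # I-squared: share of total variation attributed to heterogeneity, floored at 0
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return pooled, pooled_se, q, i2


# Hypothetical log risk ratios and standard errors from two studies
pooled, pooled_se, q, i2 = pool_fixed_effect([0.2, 0.4], [0.1, 0.1])
```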

Relevant dummy tables and figures should be designed in advance. The steps of analysis, dummy tables, and figures are as follows:

– The first figure for result should be a flow chart of identifying and selecting studies,

numbers of included studies, lists of reasons for excluded studies

– Table describing study’s and patients’ characteristics

– Table displaying frequency data of cross-tabulation between treatments and

outcomes of individual studies for dichotomous outcome; number of patients, mean,

and SD for continuous outcome

– Forest plots for the main poolings; these can incorporate the summary/aggregate data (e.g., frequency data or mean & SD). If so, the table of these data can be omitted or moved to the supplementary tables.

– Explore possible sources of heterogeneity by meta-regression or subgroup analysis

– Perform sensitivity analyses

– Examine publication bias

Advanced meta-analyses for genetic association studies, diagnostic tests, and network meta-analysis require more specific pooling methods. These should be clearly described according to the details in modules 2-3.

2.8 Register review proposal

As mentioned previously, a good systematic review and meta-analysis should be based on good review methods, which are robust, transparent, free of (or with minimal) bias, reproducible, and even auditable. Registration of the review proposal is therefore encouraged, as it is for an RCT. Some benefits of registration are as follows (24); see more detail in Appendix 6:

2.8.1 Establish that we are doing this review topic. Other researchers who are interested in the same research question may be able to collaborate or become involved in this review. As a result, duplicate reviews addressing the same question should be reduced.

2.8.2 Registration should increase potential communication among interested researchers

2.8.3 Ensure that the review methods are transparent, robust, reproducible, and adhere to what is stated in the review proposal

Once the review proposal has been developed, investigators can register it at the following sources:

- National Institute of Health (NIH) at

http://nihlibrary.campusguides.com/content.php?pid=252593&sid=2085601

- Campbell Collaboration at http://www.campbellcollaboration.org/

Register systematic reviews of the effects of social interventions

- Cochrane Collaboration at http://www.cochrane.org/

An international organization that produces and disseminates systematic reviews of health care interventions, focused only on RCTs, and of diagnostic tests

- International prospective register of systematic reviews (PROSPERO) at

http://www.crd.york.ac.uk/prospero/





Systematic reviews and meta-analyses not only of RCTs but also of all sorts of observational studies can be registered at PROSPERO.

2.9 Reporting systematic review and meta-analysis

Reporting of results should follow the PRISMA guidelines (25), which suggest how to report the results of a systematic review and summary/aggregate meta-analysis, from background to discussion. In addition, the PRISMA guidelines have been extended to network meta-analysis and individual patient data meta-analysis. Details of the PRISMA guidelines are listed in the appendices.

2.10 Interpret results

– Consider limitations, including publication and related biases

– Consider strength of evidence

– Consider applicability

– Consider numbers-needed-to-treat for benefit/harm

– Consider economic implications

– Consider implications for future research
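For the numbers-needed-to-treat item above, the usual calculation is the reciprocal of the absolute risk difference. A minimal Python sketch follows; the function name and risks are hypothetical, and in practice the result is rounded up to a whole patient.

```python
def number_needed_to_treat(control_risk, treated_risk):
    """NNT (benefit or harm) as the reciprocal of the absolute risk difference."""
    risk_difference = control_risk - treated_risk
    if risk_difference == 0:
        raise ValueError("no risk difference: NNT is undefined")
    return 1 / abs(risk_difference)


# Hypothetical pooled risks: 20% events without treatment, 15% with treatment
nnt = number_needed_to_treat(0.20, 0.15)
```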

2.11 Grading evidence

Read more detail in Appendix XI-XII

The evidence synthesized from a systematic review and meta-analysis should finally be graded. This allows readers/users to feel confident in applying the results to clinical practice. A working group called the Grading of Recommendations, Assessment, Development and Evaluation (GRADE) (26) (www.gradeworkinggroup.org), consisting of health care methodologists, guideline developers, clinicians, health services researchers, health economists, public health officers, and other interested members, has developed and updated methods for rating evidence and clinical practice guidelines; here we focus only on grading evidence from systematic reviews and meta-analyses. The GRADE handbook (available at www.gradeworkinggroup.org) describes the process of rating the quality of the best available evidence, a transparent and structured process for developing and presenting evidence summaries. The guideline covers a wide range of research questions, including therapy, diagnosis, screening, and prevention.

Similar to conducting a systematic review and meta-analysis, grading usually starts with a clearly defined research question in terms of PICO, i.e., target patients, alternative treatment/intervention strategies, and patient-important outcomes.

The quality of evidence for each pooled outcome is rated according to the factors outlined in the GRADE approach, including five factors that may lead to rating down the quality of evidence and three factors that may lead to rating up, see details below. A GRADE evidence profile and a summary of findings table should be developed and reported.

The grade evidence profile consists of

- A list of the outcomes

- The number of included studies and their study designs

- Judgments about each of the quality of evidence factors assessed including

o Risk of bias, inconsistency, indirectness, imprecision, other considerations (e.g.,

publication bias and factors that increase the quality of evidence)

- The assumed risk; a measure of the typical burden of the outcomes, i.e. illustrative risk

or also called baseline risk, baseline score, or control group risk


- The corresponding risk; a measure of the burden of the outcomes after the intervention

is applied, i.e. the risk of an outcome in treated/exposed people based on the relative

magnitude of an effect and assumed (baseline) risk

- The relative effect; for dichotomous outcomes the table will usually provide risk ratio,

odds ratio, or hazard ratio

- The absolute effect; for dichotomous outcomes the number of fewer or more events in

treated/exposed group as compared to the control group

- Rating of the overall quality of evidence for each outcome (which may vary by

outcome)

- Classification of the importance of each outcome

- Footnotes, if needed, to provide explanations about information in the table such as

elaboration on judgements about the quality of evidence
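The assumed risk, corresponding risk, and absolute effect in the profile are linked by simple arithmetic when the relative effect is a risk ratio. The Python sketch below illustrates this; the function names and numbers are hypothetical, and an odds ratio or hazard ratio would need a different conversion.

```python
def corresponding_risk(assumed_risk, risk_ratio):
    """Risk in the treated/exposed group implied by the assumed (baseline) risk."""
    return assumed_risk * risk_ratio


def absolute_effect_per_1000(assumed_risk, risk_ratio):
    """Events per 1000 people in the treated group minus the control group."""
    return (corresponding_risk(assumed_risk, risk_ratio) - assumed_risk) * 1000


# Hypothetical outcome: assumed risk 20%, pooled RR 0.75
treated_risk = corresponding_risk(0.20, 0.75)
difference = absolute_effect_per_1000(0.20, 0.75)
```

A negative difference reads as fewer events per 1000 treated people.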

Rating the quality of evidence reflects how confident we are in the pooled results of our systematic review and meta-analysis. Each outcome of interest should be rated separately, as the quality may differ among outcomes. There are four rating levels, as follows:

Table 1. Rating scale for quality of evidence assessment

Rating scale | Definition
High | We are highly confident that the true effect size lies close to our estimate. This means our pooled effect size is a very precise estimate, i.e., the 95% confidence interval (CI) is very narrow.
Moderate | We are moderately confident in the estimated effect size. The true effect may be close to the estimated effect size, but the 95% CI is quite wide and thus there is a possibility that it is substantially different.
Low | We have little confidence in the estimated effect size because the estimate is very imprecise (i.e., very wide CI). The true effect may be substantially different from our estimated effect size.
Very low | We have very little confidence in the estimated effect size, which has a very wide CI. The true effect is likely to be substantially different from our estimate.

Rating the quality of evidence usually begins with the study designs of the studies included in the systematic review, i.e., whether they are randomised controlled trials or observational studies.

Then, five factors may be used to rate down, and three factors to rate up, the quality of evidence, as follows:

Table 2. Five factors for rating down and three factors for rating up the quality of evidence

Down-rate factor | Consequence
Limitations in study design or execution (risk of bias) | ↓ 1 or 2 levels
Inconsistency of results | ↓ 1 or 2 levels
Indirectness of evidence | ↓ 1 or 2 levels
Imprecision | ↓ 1 or 2 levels
Publication bias | ↓ 1 or 2 levels

Up-rate factor | Consequence
Large magnitude of effect | ↑ 1 or 2 levels
All plausible confounding would reduce the demonstrated effect or increase the effect if no effect was observed | ↑ 1 level
Dose-response gradient | ↑ 1 level

For research questions about therapy or treatment/intervention efficacy, the quality of evidence from each study design is rated as follows:

- Randomized controlled trials without important limitations provide high quality evidence

- Observational studies without special strengths or with important limitations provide low quality evidence

- Non-randomised controlled trials or quasi-experimental designs without important limitations may provide high quality evidence, but will automatically be rated down for risk of bias, e.g., lack of randomisation, concealment, or blinding.

- Case series and case reports are observational studies that mostly contain only one patient group receiving the intervention, with no comparator or an unclear source of control group. Thus, they will usually warrant downgrading from low to very low quality evidence.
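The rating logic described above (a starting level set by study design, then moved down or up by a number of levels) can be sketched as simple bookkeeping. The Python example below is illustrative only: the function and argument names are hypothetical, and the judgement of how many levels to move remains with the reviewers.

```python
LEVELS = ["very low", "low", "moderate", "high"]


def grade_quality(design, levels_down=0, levels_up=0):
    """Final GRADE level from the starting level and total levels rated down/up.

    design: "rct" starts at high; "observational" starts at low.
    """
    start = {"rct": 3, "observational": 1}[design]
    # The rating cannot go below "very low" or above "high"
    index = max(0, min(len(LEVELS) - 1, start - levels_down + levels_up))
    return LEVELS[index]


# RCTs rated down one level for risk of bias
rct_rating = grade_quality("rct", levels_down=1)
# Observational studies rated up one level for a large effect
obs_rating = grade_quality("observational", levels_up=1)
# Case series: observational evidence rated down further
case_series_rating = grade_quality("observational", levels_down=1)
```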

Quality of evidence factors

a) Risk of bias

Results of risk of bias assessments for individual studies can be used directly to rate down the quality of evidence of a systematic review/meta-analysis. For RCTs, these cover randomisation sequence generation, allocation concealment, lack of blinding, incomplete accounting of patients and outcomes, selective outcome reporting, and other biases. For observational studies, the items depend on which scale has been used to assess risk of bias. However, most scales basically assess study limitations in terms of representativeness of study patients or failure to apply eligibility criteria appropriately, ascertainment bias in measuring exposure and outcome, confounding bias from failure to adequately control for confounders, and incomplete or inadequate follow-up.

Results of risk of bias should be used directly to reflect study limitations. For instance, in the GRADE approach, low risk of bias would indicate no limitation, unclear risk of bias would indicate either no limitation or serious limitation, and high risk of bias would indicate either serious or very serious limitation. Suggestions on how to assess study limitations are given below:

Table 3. How to assess study limitations

Risk of bias | Across studies | Interpretation | Considerations | GRADE assessment of study limitations
Low | Most information is from studies at low risk of bias. | Plausible bias unlikely to seriously alter the results. | No apparent limitations. | No serious limitations, do not downgrade
Unclear | Most information is from studies at low or unclear risk of bias. | Plausible bias that raises some doubt about the results. | Potential limitations are unlikely to lower confidence in the estimate of effect. | No serious limitations, do not downgrade
Unclear | (as above) | (as above) | Potential limitations are likely to lower confidence in the estimate of effect. | Serious limitations, downgrade one level
High | The proportion of information from studies at high risk of bias is sufficient to affect the interpretation of results. | Plausible bias that seriously weakens confidence in the results. | Crucial limitation for one criterion, or some limitations for multiple criteria, sufficient to lower confidence in the estimate of effect. | Serious limitations, downgrade one level
High | (as above) | (as above) | Crucial limitation for one or more criteria sufficient to substantially lower confidence in the estimate of effect. | Very serious limitations, downgrade two levels

b) Inconsistency

Inconsistency refers to unexplained heterogeneity of the pooled effect size (e.g., OR, RR, HR, mean difference), which can be assessed by estimating I². If the source of heterogeneity cannot be explained/identified, the quality of evidence should be downgraded by one or two levels, depending on the magnitude of the inconsistency in the results. If the source of heterogeneity can be identified, leading to subgroup analysis, the evidence should be rated within subgroups rather than downgrading the overall pooled results.

Criteria used for inconsistency are as follows:

- Statistical criteria: Cochran's Q test (P < 0.1) or the I² rule as follows: < 40% may be low, 30-60% may be moderate, 50-90% may be substantial, and 75-100% may be considerable. The I² criterion needs to be considered together with the other two criteria below because the estimate of I² depends on the number of included studies. For instance, there may be high variation in point estimates across studies while the Q test is non-significant or I² is very low.

- Wide variation in point estimates of effect size across studies

- Minimal or no overlap of CIs, which suggests variation beyond what one would expect by chance alone

c) Indirectness of evidence

Directness of evidence is a mark of good quality evidence, and is judged by the target patients, interventions, and outcomes of interest. Sources of indirectness can be as follows:

- Difference in study patients

If included studies differ in patient spectrum (e.g., in disease severity), were conducted in mixed populations of children and adults, or were conducted in specific disease/s such as diabetes or hypertension while the results will be applied to general adults, the evidence is said to be indirect.

- Difference in intervention

Differences between studies in the intervention of interest, drug class, or dosage per day can be grounds for downgrading the evidence.

- Difference in the outcome measures

This may be different methods of outcome measurement, different times of measurement, or surrogate versus patient-important outcomes, for instance, HbA1c vs cardiovascular event (CVE) complications in diabetes. Use of a surrogate outcome can be downgraded by one or two levels. Knowing the disease mechanism helps in deciding how far to downgrade: if the surrogate outcome is far removed from the patient-important outcome, it should be downgraded by two levels. For instance, bone mineral density for fracture, or calcification score for myocardial infarction, should be downgraded by one level; use of a sleep questionnaire and polysomnography to measure sleep quality in pregnancy may be downgraded by two levels.

- Indirect comparison

If the treatment effect of C vs B is estimated indirectly from B vs A and C vs A, the indirect effect of C vs B should be downgraded by one level, and by two levels if the consistency assumption is violated.
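One common way to obtain such an indirect estimate is the adjusted indirect comparison (Bucher) method, named here as an assumption since the text does not specify a method. On the log scale the effect of C vs B is the difference of the two direct effects against the common comparator A, and the variances add. A minimal Python sketch with hypothetical names and numbers:

```python
import math


def indirect_comparison(log_c_vs_a, se_c_vs_a, log_b_vs_a, se_b_vs_a):
    """Indirect log effect of C vs B through common comparator A, with its SE."""
    log_c_vs_b = log_c_vs_a - log_b_vs_a
    se_c_vs_b = math.sqrt(se_c_vs_a ** 2 + se_b_vs_a ** 2)
    return log_c_vs_b, se_c_vs_b


# Hypothetical pooled risk ratios: C vs A = 0.60 (SE 0.3), B vs A = 0.80 (SE 0.4)
log_rr, se = indirect_comparison(math.log(0.60), 0.3, math.log(0.80), 0.4)
rr_c_vs_b = math.exp(log_rr)
ci_lower = math.exp(log_rr - 1.96 * se)
ci_upper = math.exp(log_rr + 1.96 * se)
```

The indirect SE is always larger than either direct SE, which is one reason indirect estimates are treated with more caution.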

d) Imprecision of the outcome

The optimal information size (OIS) is used to determine whether 95% CI for each outcome

is adequate precision. The OIS can be estimated according to conventional sample size

estimation for that specific hypothesis testing. If the estimated OIS is less than the

conventional estimated sample size for a single adequately power study, considering rating

down for imprecision. The criteria for rating of dichotomous outcome is as follow:

- If the OIS criterion is not met (i.e., the total number of patients included in the systematic review is lower than the sample size n from a conventional sample size calculation), rate down for imprecision, unless the sample size is very large (at least 2000, and perhaps 4000 patients).

- If the OIS criterion is met (total number of patients ≥ conventional n) and the 95% CI excludes no effect (i.e., the CI around the RR excludes 1.0), do not rate down for imprecision.

- If the OIS criterion is met and the 95% CI overlaps no effect (i.e., the CI includes an RR of 1.0), rate down for imprecision if the CI fails to exclude important benefit or important harm. For instance, if the lower confidence limit indicates a large benefit or the upper confidence limit indicates a high risk, the evidence may be rated down by one level. If the lower limit indicates a large benefit and the upper limit also indicates a high risk (or vice versa), consider rating down by two levels.
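The criteria above can be sketched in code. This is a minimal illustration and not a validated tool: the OIS uses the standard normal-approximation sample size formula for comparing two proportions, and the thresholds for important benefit (RR 0.75) and important harm (RR 1.25), the 2000-patient cut-off, the fall-through to the CI checks when the sample is very large, and all function names are assumptions made for this example.

```python
from math import ceil
from statistics import NormalDist


def ois_two_proportions(p_control, rrr, alpha=0.05, power=0.80):
    """Total sample size (both arms) to detect a relative risk reduction.

    p_control: expected event risk in the control group.
    rrr: relative risk reduction to detect (0.25 means RR = 0.75).
    """
    p_treat = p_control * (1 - rrr)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p_control * (1 - p_control) + p_treat * (1 - p_treat)
    n_per_arm = (z_alpha + z_beta) ** 2 * variance / (p_control - p_treat) ** 2
    return 2 * ceil(n_per_arm)


def levels_to_rate_down(total_n, ois, ci_low, ci_high,
                        benefit=0.75, harm=1.25, very_large_n=2000):
    """Number of levels to rate down for imprecision (dichotomous outcome, RR)."""
    # OIS criterion not met and the review is not very large: rate down.
    if total_n < ois and total_n < very_large_n:
        return 1
    # CI excludes no effect (RR = 1.0): do not rate down.
    if ci_low > 1.0 or ci_high < 1.0:
        return 0
    # CI overlaps no effect: one level for each important effect it fails to exclude.
    return int(ci_low <= benefit) + int(ci_high >= harm)


ois = ois_two_proportions(p_control=0.20, rrr=0.25)
print(ois)                                          # 1806
print(levels_to_rate_down(1200, ois, 0.70, 0.95))   # OIS not met
print(levels_to_rate_down(2500, ois, 0.60, 1.40))   # CI spans benefit and harm
```

The benefit and harm thresholds depend on the clinical question, so they are left as parameters rather than fixed values.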

e) Publication bias

If there is evidence of publication bias, as suggested by a funnel plot, Egger’s test, and a contour-enhanced funnel plot, the evidence should be rated down by one level.

Evidence from a systematic review and meta-analysis can sometimes be rated up if it meets the following criteria:

a) Large effect size

- If the effect size (e.g., RR, OR, HR, mean difference) is large, defined as > 2 or < 0.5, rating the evidence up by one level may be considered.

- If the effect size is very large, i.e., OR/RR/HR > 5 or < 0.2, rating up by two levels may be considered. However, this rule may not apply to the OR if the outcome is not rare; the OR may need to be converted to an RR before the rule is applied.

Both the point estimate and the interval estimate (i.e., the 95% CI) of the effect size should be considered.

b) Dose-response effect

c) Effect of plausible residual confounding
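For criterion a), the large-effect thresholds and the OR-to-RR conversion can be sketched as below. The text does not say which conversion to use; this sketch assumes the widely used approximation RR = OR / (1 - p0 + p0 × OR), where p0 is the outcome risk in the control group. Taking the most conservative rating across the point estimate and both confidence limits is also an assumption, as are the function names and example values.

```python
def or_to_rr(odds_ratio, baseline_risk):
    """Approximate RR from an OR, given the outcome risk in the control group."""
    return odds_ratio / (1 - baseline_risk + baseline_risk * odds_ratio)


def levels_to_rate_up(ratio):
    """Levels to rate up for a ratio measure (RR, OR or HR)."""
    if ratio > 5 or ratio < 0.2:
        return 2   # very large effect
    if ratio > 2 or ratio < 0.5:
        return 1   # large effect
    return 0


def rate_up_for_large_effect(point, ci_low, ci_high):
    """Most conservative rating over the point estimate and the 95% CI limits."""
    return min(levels_to_rate_up(value) for value in (point, ci_low, ci_high))


# Hypothetical example: OR = 6.0 for a common outcome (control risk 30%).
rr = or_to_rr(6.0, 0.30)
print(round(rr, 2))                 # 2.4
print(levels_to_rate_up(6.0))       # 2 levels if the OR is used directly
print(levels_to_rate_up(rr))        # 1 level after conversion to RR
```

The example shows why the conversion matters: with a common outcome, an OR that looks very large corresponds to an RR that is only large.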

3. ASSIGNMENTS

| Topics | Score | Due Date |
|---|---|---|
| Assignment I: Searching for review topic/Literature review. Explore review topic; literature review for background, rationale, research questions, review objectives | 10% | Jun 4, 2018 |
| Assignment II: Perform locate and select studies. Literature review; search terms and strategies; inclusion and exclusion criteria; perform locate and select studies | 10% | Jun 26, 2018 |
| Assignment III: Design data extraction form & risk of bias assessment. Design data extraction forms and quality or risk of bias assessment; perform data extraction & assess quality of studies | 10% | Jul 3, 2018 |
| Assignment IV: Statistical analysis plan. Data analysis plan; dummy tables & figures | 10% | Jul 12, 2018 |
| Assignment V: Construct review proposal | 10% | Jul 26, 2018 |
| Assignment VI: Register review proposal at PROSPERO | 10% | - |
| Assignment VII: Writing a manuscript | 40% | Sep 20, 2018 (4 p.m.) |

4. REFERENCES

1. Egger M, Smith G, Altman D. Systematic reviews in health care: meta-analysis in context.

Second ed. London: BMJ Publishing Group; 2001.

2. Tantrakul V, Numthavaj P, Guilleminault C, McEvoy M, Panburana P, Khaing W, et al.

Performance of screening questionnaires for obstructive sleep apnea during pregnancy: A

systematic review and meta-analysis. Sleep medicine reviews. 2017;36:96-106.

3. Khaing W, Vallibhakara SA, Attia J, McEvoy M, Thakkinstian A. Effects of education and

income on cardiovascular outcomes: A systematic review and meta-analysis. Eur J Prev Cardiol.

2017;24(10):1032-42.

4. Anothaisintawee T, Reutrakul S, Van Cauter E, Thakkinstian A. Sleep disturbances

compared to traditional risk factors for diabetes development: Systematic review and meta-

analysis. Sleep medicine reviews. 2015;30:11-24.

5. Anothaisintawee T, Teerawattananon Y, Wiratkapun C, Kasamesup V, Thakkinstian A.

Risk prediction models of breast cancer: a systematic review of model performances. Breast Cancer

Res Treat. 2012;133(1):1-10.

6. Wilasrusmee C, Anothaisintawee T, Poprom N, McEvoy M, Attia J, Thakkinstian A.

Diagnostic Scores for Appendicitis: A Systematic Review of Scores’ Performance. British Journal

of Medicine & Medical Research. 2014;4(2):711-30.

7. Vejakama P, Thakkinstian A, Lertrattananon D, Ingsathit A, Ngarmukos C, Attia J. Reno-

protective effects of renin-angiotensin system blockade in type 2 diabetic patients: a systematic

review and network meta-analysis. Diabetologia. 2012;55(3):566-78.

8. Anothaisintawee T, Attia J, Nickel JC, Thammakraisorn S, Numthavaj P, McEvoy M, et al.

Management of chronic prostatitis/chronic pelvic pain syndrome: a systematic review and network

meta-analysis. JAMA. 2011;305(1):78-86.

9. Thakkinstian A, Han P, McEvoy M, Smith W, Hoh J, Magnusson K, et al. Systematic

review and meta-analysis of the association between complement factor H Y402H polymorphisms

and age-related macular degeneration. Hum Mol Genet. 2006;15(18):2784-90.

10. Wattanawong K, Rattanasiri S, McEvoy M, Attia J, Thakkinstian A. Association between

IRF6 and 8q24 polymorphisms and nonsyndromic cleft lip with or without cleft palate: Systematic

review and meta-analysis. Birth Defects Res A Clin Mol Teratol. 2016;106(9):773-88.

11. Numthavaj P, Thakkinstian A, Dejthevaporn C, Attia J. Corticosteroid and antiviral therapy

for Bell's palsy: a network meta-analysis. BMC Neurol. 2011;11:1.

12. Boonchan T, Wilasrusmee C, McEvoy M, Attia J, Thakkinstian A. Network meta-analysis

of antibiotic prophylaxis for prevention of surgical-site infection after groin hernia surgery. Br J

Surg. 2017;104(2):e106-e17.

13. Khaing W, Vallibhakara SA, Tantrakul V, Vallibhakara O, Rattanasiri S, McEvoy M, et al.

Calcium and Vitamin D Supplementation for Prevention of Preeclampsia: A Systematic Review

and Network Meta-Analysis. Nutrients. 2017;9(10).

14. Thakkinstian A, Dmitrienko S, Gerbase-Delima M, McDaniel DO, Inigo P, Chow KM, et

al. Association between cytokine gene polymorphisms and outcomes in renal transplantation: a

meta-analysis of individual patient data. Nephrol Dial Transplant. 2008;23(9):3017-23.

15. Sansanayudh N, Anothaisintawee T, Muntham D, McEvoy M, Attia J, Thakkinstian A.

Mean platelet volume and coronary artery disease: a systematic review and meta-analysis.

International journal of cardiology. 2014;175(3):433-40.

16. Sansanayudh N, Numthavaj P, Muntham D, Yamwong S, McEvoy M, Attia J, et al.

Prognostic effect of mean platelet volume in patients with coronary artery disease. A systematic

review and meta-analysis. Thrombosis and haemostasis. 2015;114(6):1299-309.

17. Wei Y, Royston P. Reconstructing time-to-event data from published Kaplan–Meier curves.

The Stata journal. 2017;17(4):786-802.

18. Rohatgi A. WebPlotDigitizer. 4.1 ed. Austin, Texas, USA; 2018.

19. Debray TP, Damen JA, Snell KI, Ensor J, Hooft L, Reitsma JB, et al. A guide to systematic

review and meta-analysis of prediction model performance. BMJ. 2017;356:i6460.

20. Moons KG, de Groot JA, Bouwmeester W, Vergouwe Y, Mallett S, Altman DG, et al.

Critical appraisal and data extraction for systematic reviews of prediction modelling studies: the

CHARMS checklist. PLoS medicine. 2014;11(10):e1001744.

21. Thakkinstian A, McKay GJ, McEvoy M, Chakravarthy U, Chakrabarti S, Silvestri G, et al.

Systematic review and meta-analysis of the association between complement component 3 and age-

related macular degeneration: a HuGE review and meta-analysis. American journal of

epidemiology. 2011;173(12):1365-79.

22. Hayden JA, van der Windt DA, Cartwright JL, Cote P, Bombardier C. Assessing bias in

studies of prognostic factors. Ann Intern Med. 2013;158(4):280-6.

23. Whiting PF, Rutjes AW, Westwood ME, Mallett S, Deeks JJ, Reitsma JB, et al. QUADAS-

2: a revised tool for the quality assessment of diagnostic accuracy studies. Ann Intern Med.

2011;155(8):529-36.

24. Stewart L, Moher D, Shekelle P. Why prospective registration of systematic reviews makes

sense. Systematic reviews. 2012;1:7.

25. Moher D, Liberati A, Tetzlaff J, Altman DG. Preferred reporting items for systematic

reviews and meta-analyses: the PRISMA statement. J Clin Epidemiol. 2009;62(10):1006-12.

26. Puhan MA, Schunemann HJ, Murad MH, Li T, Brignardello-Petersen R, Singh JA, et al. A

GRADE Working Group approach for rating the quality of treatment effect estimates from network

meta-analysis. BMJ. 2014;349:g5630.

5. TEXTBOOKS

1. Egger M, Smith GD, Altman DG. Systematic reviews in health care: Meta-analysis in context. 2nd ed. London: BMJ Books; 2001.

2. Chalmers I, Altman DG. Systematic reviews. London: BMJ Publishing Group; 1995.

3. Petitti DB. Meta-analysis, decision analysis, and cost-effectiveness analysis. Oxford: University Press; 1994; pages 91-130, 194-196.

4. Higgins JPT, Green S. Cochrane Handbook for Systematic Reviews of Interventions. Version 5.1.0; 2011.
