Study Design 101: Meta-Analysis


A subset of systematic reviews; a method for systematically combining pertinent qualitative and quantitative study data from several selected studies to develop a single conclusion that has greater statistical power. This conclusion is statistically stronger than the analysis of any single study, due to increased numbers of subjects, greater diversity among subjects, or accumulated effects and results.

Meta-analysis would be used for the following purposes:

  • To establish statistical significance with studies that have conflicting results
  • To develop a more correct estimate of effect magnitude
  • To provide a more complex analysis of harms, safety data, and benefits
  • To examine subgroups with individual numbers that are not statistically significant

If the individual studies are randomized controlled trials (RCTs), combining several selected RCT results yields the highest level of evidence on the evidence hierarchy, followed by systematic reviews, which analyze all available studies on a topic.

Advantages

  • Greater statistical power
  • Confirmatory data analysis
  • Greater ability to extrapolate to general population affected
  • Considered an evidence-based resource

Disadvantages

  • Difficult and time consuming to identify appropriate studies
  • Not all studies provide adequate data for inclusion and analysis
  • Requires advanced statistical techniques
  • Heterogeneity of study populations

Design pitfalls to look out for

The studies pooled for review should be similar in type (i.e. all randomized controlled trials).

Are the studies being reviewed all the same type of study or are they a mixture of different types?

The analysis should include published and unpublished results to avoid publication bias.

Does the meta-analysis include any appropriate relevant studies that may have had negative outcomes?

Fictitious Example

Do individuals who wear sunscreen have fewer cases of melanoma than those who do not? A MEDLINE search conducted using the terms melanoma, sunscreening agents, and zinc oxide identified 8 randomized controlled studies, each with between 100 and 120 subjects. All of the studies showed a protective association between wearing sunscreen and melanoma. The subjects from all eight studies (860 in total) were pooled and statistically analyzed to estimate the effect of wearing sunscreen on melanoma risk. This meta-analysis showed a 50% reduction in melanoma diagnoses among sunscreen wearers.
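The pooling step in this fictitious example can be sketched in a few lines of Python. Everything here is invented for illustration: the per-trial event counts below are hypothetical, and the pooling is a simple fixed-effect, inverse-variance combination of log risk ratios.

```python
import math

# Illustrative only: eight invented trials mirroring the fictitious
# sunscreen example. Each entry is ((events, total) sunscreen arm,
# (events, total) control arm).
trials = [
    ((4, 55), (9, 55)),
    ((3, 50), (7, 52)),
    ((5, 60), (11, 58)),
    ((2, 50), (5, 50)),
    ((4, 58), (8, 60)),
    ((3, 52), (6, 50)),
    ((5, 55), (10, 57)),
    ((2, 48), (6, 50)),
]

log_rrs, weights = [], []
for (e1, n1), (e0, n0) in trials:
    rr = (e1 / n1) / (e0 / n0)            # risk ratio for this trial
    var = 1/e1 - 1/n1 + 1/e0 - 1/n0       # approximate variance of log(RR)
    log_rrs.append(math.log(rr))
    weights.append(1 / var)               # inverse-variance weight

pooled_log_rr = sum(w * y for w, y in zip(weights, log_rrs)) / sum(weights)
pooled_rr = math.exp(pooled_log_rr)
print(f"Pooled risk ratio: {pooled_rr:.2f}")
```

Because every hypothetical trial has fewer melanoma cases in the sunscreen arm, the pooled risk ratio comes out below 1, i.e. in favour of sunscreen.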

Real-life Examples

Goyal, A., Elminawy, M., Kerezoudis, P., Lu, V., Yolcu, Y., Alvi, M., & Bydon, M. (2019). Impact of obesity on outcomes following lumbar spine surgery: A systematic review and meta-analysis. Clinical Neurology and Neurosurgery, 177, 27-36. https://doi.org/10.1016/j.clineuro.2018.12.012

This meta-analysis was interested in determining whether obesity affects the outcome of spinal surgery. Some previous studies have shown higher perioperative morbidity in patients with obesity while other studies have not shown this effect. This study looked at surgical outcomes including "blood loss, operative time, length of stay, complication and reoperation rates and functional outcomes" between patients with and without obesity. A meta-analysis of 32 studies (23,415 patients) was conducted. There were no significant differences for patients undergoing minimally invasive surgery, but patients with obesity who had open surgery experienced higher blood loss and longer operative times (though not clinically meaningful differences), as well as higher complication and reoperation rates. Further research is needed to explore this issue in patients with morbid obesity.

Nakamura, A., van Der Waerden, J., Melchior, M., Bolze, C., El-Khoury, F., & Pryor, L. (2019). Physical activity during pregnancy and postpartum depression: Systematic review and meta-analysis. Journal of Affective Disorders, 246, 29-41. https://doi.org/10.1016/j.jad.2018.12.009

This meta-analysis explored whether physical activity during pregnancy prevents postpartum depression. Seventeen studies were included (93,676 women) and analysis showed a "significant reduction in postpartum depression scores in women who were physically active during their pregnancies when compared with inactive women." Possible limitations or moderators of this effect include intensity and frequency of physical activity, type of physical activity, and timepoint in pregnancy (e.g. trimester).

Related Terms

Systematic Review

A document often written by a panel that provides a comprehensive review of all relevant studies on a particular clinical or health-related topic/question.

Publication Bias

A phenomenon in which studies with positive results have a better chance of being published, are published earlier, and are published in journals with higher impact factors. Therefore, conclusions based exclusively on published studies can be misleading.

Now test yourself!

1. A Meta-Analysis pools together the sample populations from different studies, such as Randomized Controlled Trials, into one statistical analysis and treats them as one large sample population with one conclusion.

a) True
b) False

2. One potential design pitfall of Meta-Analyses that is important to pay attention to is:

a) Whether it is evidence-based.
b) If the authors combined studies with conflicting results.
c) If the authors appropriately combined studies so they did not compare apples and oranges.
d) If the authors used only quantitative data.

Creative Commons License

  • Last Updated: Sep 25, 2023 10:59 AM
  • URL: https://guides.himmelfarb.gwu.edu/studydesign101



Meta-Analysis – Guide with Definition, Steps & Examples

Published by Owen Ingram on April 26, 2023; revised on April 26, 2023

“A meta-analysis is a formal, epidemiological, quantitative study design that uses statistical methods to generalise the findings of the selected independent studies.”

Meta-analysis and systematic review are two of the most rigorous strategies in research. When researchers start looking for the best available evidence concerning their research question, they are advised to begin at the top of the evidence pyramid. Evidence in the form of meta-analyses or systematic reviews addressing important questions is significant in academics because it informs decision-making.

What is Meta-Analysis?

Meta-analysis estimates the overall effect of an intervention or exposure by systematically synthesising the results of individual independent studies. Meta-analysis is not only about reaching a wider population by combining several smaller studies: it involves systematic methods for evaluating inconsistencies among participants, variability (also known as heterogeneity), and findings, and for checking how sensitive the results are to the chosen systematic review protocol.

When Should you Conduct a Meta-Analysis?

Meta-analysis has become a widely used research method in medical sciences and other fields for several reasons. The technique summarises the results of independent studies identified through a systematic review.

The Cochrane Handbook explains that “an important step in a systematic review is the thoughtful consideration of whether it is appropriate to combine the numerical results of all, or perhaps some, of the studies. Such a meta-analysis yields an overall statistic (together with its confidence interval) that summarizes the effectiveness of an experimental intervention compared with a comparator intervention” (section 10.2).

A researcher or a practitioner should choose meta-analysis when the following outcomes are desirable. 

To generate new hypotheses or settle controversies arising from conflicting studies. Meta-analysis makes it possible to quantify and evaluate variable results and to identify the extent of conflict in the literature.

To find research gaps left unfilled and address questions not posed by individual studies. Primary research studies involve specific types of participants and interventions. Reviewing studies with variable characteristics and methodologies allows the researcher to gauge the consistency of findings across a wider range of participants and interventions, and to explore the reasons for differences in effect.

To provide convincing evidence. Estimating effects from a larger pooled sample of participants and interventions can provide convincing evidence. Many academic studies are based on very small datasets, so their estimated intervention effects are not fully reliable in isolation.

Elements of a Meta-Analysis

Deeks et al. (2019), Haidich (2010), and Grant & Booth (2009) explored the characteristics, strengths, and weaknesses of conducting a meta-analysis. These are briefly explained below.

Characteristics: 

  • A systematic review must be completed first, because it identifies and summarises the findings of the individual studies to be synthesised.
  • A meta-analysis can only be conducted on studies gathered through a systematic review.
  • The studies selected for statistical pooling should be similar in terms of population, intervention, and comparison.

Strengths: 

  • A meta-analysis takes place after the systematic review. The end product is a comprehensive quantitative analysis that is complex but reliable.
  • It gives weight and value to existing studies that hold little practical value on their own.
  • Policy-makers and academics cannot base decisions on individual research studies. Meta-analysis provides them with a thorough, consolidated analysis of the evidence for making informed decisions.

Criticisms: 

  • A meta-analysis relies on studies exploring similar topics, and finding sufficiently similar studies can be challenging.
  • When the individual studies are biased, or when reporting practices or specific research methodologies introduce bias, the meta-analysis results can be misleading.

Steps of Conducting the Meta-Analysis 

The process of conducting the meta-analysis has remained a topic of debate among researchers and scientists. However, the following 5-step process is widely accepted. 

Step 1: Research Question

The first step in conducting clinical research involves identifying a research question and proposing a hypothesis. The potential clinical significance of the research question is then explained, and the study design and analytical plan are justified.

Step 2: Systematic Review 

The purpose of a systematic review (SR) is to address a research question by identifying all relevant studies that meet the required quality standards for inclusion. While established journals typically serve as the primary source for identified studies, it is important to also consider unpublished data to avoid publication bias or the exclusion of studies with negative results.

While some meta-analyses may limit their focus to randomized controlled trials (RCTs) for the sake of obtaining the highest quality evidence, other experimental and quasi-experimental studies may be included if they meet the specific inclusion/exclusion criteria established for the review.

Step 3: Data Extraction

After selecting studies for the meta-analysis, researchers extract summary data or outcomes, as well as sample sizes and measures of data variability for both intervention and control groups. The choice of outcome measures depends on the research question and the type of study, and may include numerical or categorical measures.

For instance, numerical means may be used to report differences in scores on a questionnaire or changes in a measurement, such as blood pressure. In contrast, risk measures like odds ratios (OR) or relative risks (RR) are typically used to report differences in the probability of belonging to one category or another, such as vaginal birth versus cesarean birth.
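As a minimal sketch of these two outcome measures, the following Python computes a relative risk and an odds ratio from a single hypothetical 2x2 table (all counts are invented for illustration):

```python
# Hypothetical 2x2 table: cesarean births (the "event") among an exposed
# group and a control group. Counts are invented.
exposed_events, exposed_n = 30, 100    # cesarean births among exposed
control_events, control_n = 20, 100    # cesarean births among controls

# Relative risk (RR): ratio of event probabilities.
risk_exposed = exposed_events / exposed_n
risk_control = control_events / control_n
relative_risk = risk_exposed / risk_control

# Odds ratio (OR): ratio of event odds (events / non-events).
odds_exposed = exposed_events / (exposed_n - exposed_events)
odds_control = control_events / (control_n - control_events)
odds_ratio = odds_exposed / odds_control

print(f"RR = {relative_risk:.2f}, OR = {odds_ratio:.2f}")
```

Note that the OR (about 1.71 here) is further from 1 than the RR (1.50): odds ratios exaggerate relative risks when the event is common, which is one reason the choice of measure matters.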

Step 4: Standardisation and Weighting Studies

After gathering all the required data, the fourth step involves computing suitable summary measures from each study for further examination. These measures are typically referred to as Effect Sizes and indicate the difference in average scores between the control and intervention groups. For instance, it could be the variation in blood pressure changes between study participants who used drug X and those who used a placebo.

Since the units of measurement often differ across the included studies, standardization is necessary to create comparable effect size estimates. Standardization is accomplished by determining, for each study, the average score for the intervention group, subtracting the average score for the control group, and dividing the result by the relevant measure of variability in that dataset.

In some cases, the results of certain studies must carry more significance than others. Larger studies, as measured by their sample sizes, are deemed to produce more precise estimates of effect size than smaller studies. Additionally, studies with less variability in data, such as smaller standard deviation or narrower confidence intervals, are typically regarded as higher quality in study design. A weighting statistic that aims to incorporate both of these factors, known as inverse variance, is commonly employed.
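The standardization and weighting steps described above can be sketched as follows. The summary statistics are invented, and the effect size is a plain Cohen's d with its usual approximate variance; real analyses often apply a small-sample correction (Hedges' g).

```python
import math

# Hypothetical summary data for one study: change in blood pressure.
mean_treat, sd_treat, n_treat = -8.0, 6.0, 40   # drug X group
mean_ctrl,  sd_ctrl,  n_ctrl  = -3.0, 5.5, 42   # placebo group

# Pooled standard deviation, then the standardized mean difference
# (Cohen's d): difference in means divided by pooled variability.
sp = math.sqrt(((n_treat - 1) * sd_treat**2 + (n_ctrl - 1) * sd_ctrl**2)
               / (n_treat + n_ctrl - 2))
d = (mean_treat - mean_ctrl) / sp

# Approximate variance of d, and the inverse-variance weight: larger,
# less variable studies get smaller variance and hence larger weight.
var_d = ((n_treat + n_ctrl) / (n_treat * n_ctrl)
         + d**2 / (2 * (n_treat + n_ctrl)))
weight = 1 / var_d

print(f"SMD = {d:.2f}, weight = {weight:.1f}")
```

The weight computed this way is exactly the "inverse variance" statistic mentioned above: it grows with sample size and shrinks with data variability.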

Step 5: Absolute Effect Estimation

The ultimate step in conducting a meta-analysis is to choose and utilize an appropriate model for comparing Effect Sizes among diverse studies. Two popular models for this purpose are the Fixed Effects and Random Effects models. The Fixed Effects model relies on the premise that each study is evaluating a common treatment effect, implying that all studies would have estimated the same Effect Size if sample variability were equal across all studies.

Conversely, the Random Effects model posits that the true treatment effects in individual studies may vary from each other, and endeavors to consider this additional source of interstudy variation in Effect Sizes. The existence and magnitude of this latter variability is usually evaluated within the meta-analysis through a test for ‘heterogeneity.’
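A minimal sketch of both models, using invented per-study log risk ratios and variances; the between-study variance for the random-effects model is estimated with the common DerSimonian-Laird method.

```python
# Hypothetical per-study effect sizes (log risk ratios) and variances.
effects   = [-0.60, 0.10, -0.30, 0.05, -0.45]
variances = [0.04, 0.09, 0.06, 0.12, 0.05]

# Fixed-effect model: weight each study by the inverse of its variance.
w_fixed = [1 / v for v in variances]
fixed = sum(w * y for w, y in zip(w_fixed, effects)) / sum(w_fixed)

# DerSimonian-Laird estimate of between-study variance (tau^2),
# based on Cochran's Q statistic.
q = sum(w * (y - fixed) ** 2 for w, y in zip(w_fixed, effects))
df = len(effects) - 1
c = sum(w_fixed) - sum(w ** 2 for w in w_fixed) / sum(w_fixed)
tau2 = max(0.0, (q - df) / c)

# Random-effects model: add tau^2 to each study's variance before
# weighting, which spreads weight more evenly across studies.
w_rand = [1 / (v + tau2) for v in variances]
random_eff = sum(w * y for w, y in zip(w_rand, effects)) / sum(w_rand)

print(f"fixed = {fixed:.3f}, tau^2 = {tau2:.4f}, random = {random_eff:.3f}")
```

With these invented, heterogeneous effects tau^2 comes out positive, so the two models give different pooled estimates; if the studies agreed closely, tau^2 would be zero and the models would coincide.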

Forest Plot

The results of a meta-analysis are often presented visually in a “Forest Plot”. For each study included in the analysis, this plot displays a horizontal line indicating the standardized Effect Size estimate and its 95% confidence interval, here expressed as a risk ratio. Figure A provides an example of a hypothetical Forest Plot in which drug X reduces the risk of death in all three studies.

However, the first study was larger than the other two, and the estimates from the two smaller studies were not statistically significant on their own: the confidence-interval lines emanating from their boxes cross the value of 1. The size of each box represents the relative weight assigned to that study by the meta-analysis. The combined estimate of the drug’s effect, represented by the diamond, is more precise, with the diamond’s position and width indicating the combined risk-ratio estimate and its 95% confidence-interval limits.


Figure-A: Hypothetical Forest Plot
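As a toy illustration of what a forest plot encodes, the following Python renders a text-based version for three invented studies; real analyses use tools such as RevMan or R. A confidence-interval line that crosses the '|' marker at RR = 1 is not statistically significant.

```python
# Text-based forest plot sketch. Studies and intervals are invented.
studies = [
    ("Study 1", 0.55, 0.40, 0.75),   # (name, risk ratio, 95% CI low, high)
    ("Study 2", 0.70, 0.45, 1.10),
    ("Study 3", 0.65, 0.38, 1.12),
]

def row(name, rr, lo, hi, lo_axis=0.2, hi_axis=1.4, width=40):
    """Render one line: dashes for the confidence interval, 'o' at the
    point estimate, and '|' marking the null value RR = 1."""
    def pos(x):
        return round((x - lo_axis) / (hi_axis - lo_axis) * (width - 1))
    line = [" "] * width
    for i in range(pos(lo), pos(hi) + 1):
        line[i] = "-"
    line[pos(rr)] = "o"
    line[pos(1.0)] = "|"
    return f"{name:>8}  {''.join(line)}  {rr:.2f} [{lo:.2f}, {hi:.2f}]"

for s in studies:
    print(row(*s))
```

In the output, Study 1's interval stops short of the '|' (significant on its own), while the intervals for Studies 2 and 3 reach past it, mirroring the situation described for Figure A.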

Relevance to Practice and Research 

Evidence Based Nursing commentaries often include recently published systematic reviews and meta-analyses, as they can provide new insights and strengthen recommendations for effective healthcare practices. Additionally, they can identify gaps or limitations in current evidence and guide future research directions.

The quality of the data available for synthesis is a critical factor in the strength of conclusions drawn from meta-analyses, and this is influenced by the quality of individual studies and the systematic review itself. However, meta-analysis cannot overcome issues related to underpowered or poorly designed studies.

Therefore, clinicians may still encounter situations where the evidence is weak or uncertain, and where higher-quality research is required to improve clinical decision-making. While such findings can be frustrating, they remain important for informing practice and highlighting the need for further research to fill gaps in the evidence base.

Methods and Assumptions in Meta-Analysis 

Ensuring the credibility of findings is imperative in all types of research, including meta-analyses. To validate the outcomes of a meta-analysis, the researcher must confirm that the research techniques used were accurate in measuring the intended variables. Typically, researchers establish the validity of a meta-analysis by testing the outcomes for homogeneity or the degree of similarity between the results of the combined studies.

Homogeneity is preferred in meta-analyses because it allows the data to be combined without further adjustment. To determine homogeneity, researchers assess its opposite, heterogeneity. Two widely used statistics for evaluating heterogeneity in research results are Cochran’s Q and I² (the I-squared index).
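A minimal sketch of these two statistics, using invented per-study effects and variances: Q is the weighted sum of squared deviations from the fixed-effect pooled estimate, and I² re-expresses the excess of Q over its degrees of freedom as a percentage of Q.

```python
# Heterogeneity diagnostics on invented per-study effects and variances.
effects   = [0.05, 0.60, 0.15, 0.80, 0.30]
variances = [0.03, 0.05, 0.04, 0.08, 0.06]

weights = [1 / v for v in variances]
pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)

# Cochran's Q: weighted squared deviations from the pooled estimate.
q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
df = len(effects) - 1

# I^2: the share of total variability attributable to between-study
# heterogeneity rather than chance (floored at 0).
i2 = max(0.0, (q - df) / q) * 100

print(f"Q = {q:.2f} on {df} df, I^2 = {i2:.0f}%")
```

An I² near 0% suggests the studies are telling a consistent story; values above roughly 50% are conventionally read as substantial heterogeneity, arguing for a random-effects model or subgroup exploration.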

Difference Between Meta-Analysis and Systematic Reviews

Meta-analysis and systematic reviews are both research methods used to synthesise evidence from multiple studies on a particular topic. However, there are some key differences between the two.

Systematic reviews involve a comprehensive and structured approach to identifying, selecting, and critically appraising all available evidence relevant to a specific research question. This process involves searching multiple databases, screening the identified studies for relevance and quality, and summarizing the findings in a narrative report.

Meta-analysis, on the other hand, involves using statistical methods to combine and analyze the data from multiple studies, with the aim of producing a quantitative summary of the overall effect size. Meta-analysis requires the studies to be similar enough in terms of their design, methodology, and outcome measures to allow for meaningful comparison and analysis.

Therefore, systematic reviews are broader in scope and summarize the findings of all studies on a topic, while meta-analyses are more focused on producing a quantitative estimate of the effect size of an intervention across multiple studies that meet certain criteria. In some cases, a systematic review may be conducted without a meta-analysis if the studies are too diverse or the quality of the data is not sufficient to allow for statistical pooling.

Software Packages For Meta-Analysis

Meta-analysis can be done through software packages, including free and paid options. One of the most commonly used software packages for meta-analysis is RevMan by the Cochrane Collaboration.

Assessing the Quality of Meta-Analysis 

Assessing the quality of a meta-analysis involves evaluating the methods used to conduct the analysis and the quality of the studies included. Here are some key factors to consider:

  • Study selection: The studies included in the meta-analysis should be relevant to the research question and meet predetermined criteria for quality.
  • Search strategy: The search strategy should be comprehensive and transparent, including databases and search terms used to identify relevant studies.
  • Study quality assessment: The quality of included studies should be assessed using appropriate tools, and this assessment should be reported in the meta-analysis.
  • Data extraction: The data extraction process should be systematic and clearly reported, including any discrepancies that arose.
  • Analysis methods: The meta-analysis should use appropriate statistical methods to combine the results of the included studies, and these methods should be transparently reported.
  • Publication bias: The potential for publication bias should be assessed and reported in the meta-analysis, including any efforts to identify and include unpublished studies.
  • Interpretation of results: The results should be interpreted in the context of the study limitations and the overall quality of the evidence.
  • Sensitivity analysis: Sensitivity analysis should be conducted to evaluate the impact of study quality, inclusion criteria, and other factors on the overall results.

Overall, a high-quality meta-analysis should be transparent in its methods and should clearly report the limitations of the included studies and the overall quality of the evidence.


Examples of Meta-Analysis

  • Stanley, T. D., & Jarrell, S. B. (1989). Meta-regression analysis: A quantitative method of literature surveys. Journal of Economic Surveys, 3(2), 161-170.
  • Datta, D. K., Pinches, G. E., & Narayanan, V. K. (1992). Factors influencing wealth creation from mergers and acquisitions: A meta-analysis. Strategic Management Journal, 13, 67-84.
  • Glass, G. (1983). Synthesising empirical research: Meta-analysis. In S. A. Ward & L. J. Reed (Eds.), Knowledge structure and use: Implications for synthesis and interpretation. Philadelphia: Temple University Press.
  • Wolf, F. M. (1986). Meta-analysis: Quantitative methods for research synthesis. Sage University Paper No. 59.
  • Hunter, J. E., Schmidt, F. L., & Jackson, G. B. (1982). Meta-analysis: Cumulating research findings across studies. Beverly Hills, CA: Sage.

Frequently Asked Questions

What is a meta-analysis in research?

Meta-analysis is a statistical method used to combine results from multiple studies on a specific topic. By pooling data from various sources, meta-analysis can provide a more precise estimate of the effect size of a treatment or intervention and identify areas for future research.

Why is meta-analysis important?

Meta-analysis is important because it combines and summarizes results from multiple studies to provide a more precise and reliable estimate of the effect of a treatment or intervention. This helps clinicians and policymakers make evidence-based decisions and identify areas for further research.

What is an example of a meta-analysis?

A meta-analysis of studies evaluating physical exercise’s effect on depression in adults is an example. Researchers gathered data from 49 studies involving a total of 2669 participants. The studies used different types of exercise and measures of depression, which made it difficult to compare the results.

Through meta-analysis, the researchers calculated an overall effect size and determined that exercise was associated with a statistically significant reduction in depression symptoms. The study also identified that moderate-intensity aerobic exercise, performed three to five times per week, was the most effective. The meta-analysis provided a more comprehensive understanding of the impact of exercise on depression than any single study could provide.

What is the definition of meta-analysis in clinical research?

Meta-analysis in clinical research is a statistical technique that combines data from multiple independent studies on a particular topic to generate a summary or “meta” estimate of the effect of a particular intervention or exposure.

This type of analysis allows researchers to synthesise the results of multiple studies, potentially increasing the statistical power and providing more precise estimates of treatment effects. Meta-analyses are commonly used in clinical research to evaluate the effectiveness and safety of medical interventions and to inform clinical practice guidelines.

Is meta-analysis qualitative or quantitative?

Meta-analysis is a quantitative method used to combine and analyze data from multiple studies. It involves the statistical synthesis of results from individual studies to obtain a pooled estimate of the effect size of a particular intervention or treatment. Therefore, meta-analysis is considered a quantitative approach to research synthesis.


  • Review Article
  • Published: 08 March 2018

Meta-analysis and the science of research synthesis

  • Jessica Gurevitch,
  • Julia Koricheva,
  • Shinichi Nakagawa &
  • Gavin Stewart

Nature, volume 555, pages 175–182 (2018)

Meta-analysis is the quantitative, scientific synthesis of research results. Since the term and modern approaches to research synthesis were first introduced in the 1970s, meta-analysis has had a revolutionary effect in many scientific fields, helping to establish evidence-based practice and to resolve seemingly contradictory research outcomes. At the same time, its implementation has engendered criticism and controversy, in some cases general and others specific to particular disciplines. Here we take the opportunity provided by the recent fortieth anniversary of meta-analysis to reflect on the accomplishments, limitations, recent advances and directions for future developments in the field of research synthesis.




Cardinale, B. J . et al. Effects of biodiversity on the functioning of trophic groups and ecosystems. Nature 443 , 989–992 (2006)

Rey Benayas, J. M ., Newton, A. C ., Diaz, A. & Bullock, J. M. Enhancement of biodiversity and ecosystem services by ecological restoration: a meta-analysis. Science 325 , 1121–1124 (2009)

Article   ADS   PubMed   CAS   Google Scholar  

Leimu, R ., Mutikainen, P. I. A ., Koricheva, J. & Fischer, M. How general are positive relationships between plant population size, fitness and genetic variation? J. Ecol. 94 , 942–952 (2006)

Hillebrand, H. On the generality of the latitudinal diversity gradient. Am. Nat. 163 , 192–211 (2004)

Gurevitch, J. in The Handbook of Meta-analysis in Ecology and Evolution (eds Koricheva, J . et al.) Ch. 19 , 313–320 (Princeton Univ. Press, 2013)

Rustad, L . et al. A meta-analysis of the response of soil respiration, net nitrogen mineralization, and aboveground plant growth to experimental ecosystem warming. Oecologia 126 , 543–562 (2001)

Adams, D. C. Phylogenetic meta-analysis. Evolution 62 , 567–572 (2008)

Hadfield, J. D . & Nakagawa, S. General quantitative genetic methods for comparative biology: phylogenies, taxonomies and multi-trait models for continuous and categorical characters. J. Evol. Biol. 23 , 494–508 (2010)

Lajeunesse, M. J. Meta-analysis and the comparative phylogenetic method. Am. Nat. 174 , 369–381 (2009)

Rosenberg, M. S ., Adams, D. C . & Gurevitch, J. MetaWin: Statistical Software for Meta-Analysis with Resampling Tests Version 1 (Sinauer Associates, 1997)

Wallace, B. C . et al. OpenMEE: intuitive, open-source software for meta-analysis in ecology and evolutionary biology. Methods Ecol. Evol. 8 , 941–947 (2016)

Gurevitch, J ., Morrison, J. A . & Hedges, L. V. The interaction between competition and predation: a meta-analysis of field experiments. Am. Nat. 155 , 435–453 (2000)

Adams, D. C ., Gurevitch, J . & Rosenberg, M. S. Resampling tests for meta-analysis of ecological data. Ecology 78 , 1277–1283 (1997)

Gurevitch, J . & Hedges, L. V. Statistical issues in ecological meta-analyses. Ecology 80 , 1142–1149 (1999)

Schmid, C. H . & Mengersen, K. in The Handbook of Meta-analysis in Ecology and Evolution (eds Koricheva, J . et al.) Ch. 11 , 145–173 (Princeton Univ. Press, 2013)

Eysenck, H. J. Exercise in mega-silliness. Am. Psychol. 33 , 517 (1978)

Simberloff, D. Rejoinder to: Don’t calculate effect sizes; study ecological effects. Ecol. Lett. 9 , 921–922 (2006)

Cadotte, M. W ., Mehrkens, L. R . & Menge, D. N. L. Gauging the impact of meta-analysis on ecology. Evol. Ecol. 26 , 1153–1167 (2012)

Koricheva, J ., Jennions, M. D. & Lau, J. in The Handbook of Meta-analysis in Ecology and Evolution (eds Koricheva, J . et al.) Ch. 15 , 237–254 (Princeton Univ. Press, 2013)

Lau, J ., Ioannidis, J. P. A ., Terrin, N ., Schmid, C. H . & Olkin, I. The case of the misleading funnel plot. Br. Med. J. 333 , 597–600 (2006)

Vetter, D ., Rucker, G. & Storch, I. Meta-analysis: a need for well-defined usage in ecology and conservation biology. Ecosphere 4 , 1–24 (2013)

Mengersen, K ., Jennions, M. D. & Schmid, C. H. in The Handbook of Meta-analysis in Ecology and Evolution (eds Koricheva, J. et al.) Ch. 16 , 255–283 (Princeton Univ. Press, 2013)

Patsopoulos, N. A ., Analatos, A. A. & Ioannidis, J. P. A. Relative citation impact of various study designs in the health sciences. J. Am. Med. Assoc. 293 , 2362–2366 (2005)

Kueffer, C . et al. Fame, glory and neglect in meta-analyses. Trends Ecol. Evol. 26 , 493–494 (2011)

Cohnstaedt, L. W. & Poland, J. Review Articles: The black-market of scientific currency. Ann. Entomol. Soc. Am. 110 , 90 (2017)

Longo, D. L. & Drazen, J. M. Data sharing. N. Engl. J. Med. 374 , 276–277 (2016)

Gauch, H. G. Scientific Method in Practice (Cambridge Univ. Press, 2003)

Science Staff. Dealing with data: introduction. Challenges and opportunities. Science 331 , 692–693 (2011)

Nosek, B. A . et al. Promoting an open research culture. Science 348 , 1422–1425 (2015)

Article   CAS   ADS   PubMed   PubMed Central   Google Scholar  

Stewart, L. A . et al. Preferred reporting items for a systematic review and meta-analysis of individual participant data: the PRISMA-IPD statement. J. Am. Med. Assoc. 313 , 1657–1665 (2015)

Saldanha, I. J . et al. Evaluating Data Abstraction Assistant, a novel software application for data abstraction during systematic reviews: protocol for a randomized controlled trial. Syst. Rev. 5 , 196 (2016)

Tipton, E. & Pustejovsky, J. E. Small-sample adjustments for tests of moderators and model fit using robust variance estimation in meta-regression. J. Educ. Behav. Stat. 40 , 604–634 (2015)

Mengersen, K ., MacNeil, M. A . & Caley, M. J. The potential for meta-analysis to support decision analysis in ecology. Res. Synth. Methods 6 , 111–121 (2015)

Ashby, D. Bayesian statistics in medicine: a 25 year review. Stat. Med. 25 , 3589–3631 (2006)

Article   MathSciNet   PubMed   Google Scholar  

Senior, A. M . et al. Heterogeneity in ecological and evolutionary meta-analyses: its magnitude and implications. Ecology 97 , 3293–3299 (2016)

McAuley, L ., Pham, B ., Tugwell, P . & Moher, D. Does the inclusion of grey literature influence estimates of intervention effectiveness reported in meta-analyses? Lancet 356 , 1228–1231 (2000)

Koricheva, J ., Gurevitch, J . & Mengersen, K. (eds) The Handbook of Meta-Analysis in Ecology and Evolution (Princeton Univ. Press, 2013) This book provides the first comprehensive guide to undertaking meta-analyses in ecology and evolution and is also relevant to other fields where heterogeneity is expected, incorporating explicit consideration of the different approaches used in different domains.

Lumley, T. Network meta-analysis for indirect treatment comparisons. Stat. Med. 21 , 2313–2324 (2002)

Zarin, W . et al. Characteristics and knowledge synthesis approach for 456 network meta-analyses: a scoping review. BMC Med. 15 , 3 (2017)

Elliott, J. H . et al. Living systematic reviews: an emerging opportunity to narrow the evidence-practice gap. PLoS Med. 11 , e1001603 (2014)

Vandvik, P. O ., Brignardello-Petersen, R . & Guyatt, G. H. Living cumulative network meta-analysis to reduce waste in research: a paradigmatic shift for systematic reviews? BMC Med. 14 , 59 (2016)

Jarvinen, A. A meta-analytic study of the effects of female age on laying date and clutch size in the Great Tit Parus major and the Pied Flycatcher Ficedula hypoleuca . Ibis 133 , 62–67 (1991)

Arnqvist, G. & Wooster, D. Meta-analysis: synthesizing research findings in ecology and evolution. Trends Ecol. Evol. 10 , 236–240 (1995)

Hedges, L. V ., Gurevitch, J . & Curtis, P. S. The meta-analysis of response ratios in experimental ecology. Ecology 80 , 1150–1156 (1999)

Gurevitch, J ., Curtis, P. S. & Jones, M. H. Meta-analysis in ecology. Adv. Ecol. Res 32 , 199–247 (2001)

Lajeunesse, M. J. phyloMeta: a program for phylogenetic comparative analyses with meta-analysis. Bioinformatics 27 , 2603–2604 (2011)

CAS   PubMed   Google Scholar  

Pearson, K. Report on certain enteric fever inoculation statistics. Br. Med. J. 2 , 1243–1246 (1904)

Fisher, R. A. Statistical Methods for Research Workers (Oliver and Boyd, 1925)

Yates, F. & Cochran, W. G. The analysis of groups of experiments. J. Agric. Sci. 28 , 556–580 (1938)

Cochran, W. G. The combination of estimates from different experiments. Biometrics 10 , 101–129 (1954)

Smith, M. L . & Glass, G. V. Meta-analysis of psychotherapy outcome studies. Am. Psychol. 32 , 752–760 (1977)

Glass, G. V. Meta-analysis at middle age: a personal history. Res. Synth. Methods 6 , 221–231 (2015)

Cooper, H. M ., Hedges, L. V . & Valentine, J. C. (eds) The Handbook of Research Synthesis and Meta-analysis 2nd edn (Russell Sage Foundation, 2009). This book is an important compilation that builds on the ground-breaking first edition to set the standard for best practice in meta-analysis, primarily in the social sciences but with applications to medicine and other fields.

Rosenthal, R. Meta-analytic Procedures for Social Research (Sage, 1991)

Hunter, J. E ., Schmidt, F. L. & Jackson, G. B. Meta-analysis: Cumulating Research Findings Across Studies (Sage, 1982)

Gurevitch, J ., Morrow, L. L ., Wallace, A . & Walsh, J. S. A meta-analysis of competition in field experiments. Am. Nat. 140 , 539–572 (1992). This influential early ecological meta-analysis reports multiple experimental outcomes on a longstanding and controversial topic that introduced a wide range of ecologists to research synthesis methods.

O’Rourke, K. An historical perspective on meta-analysis: dealing quantitatively with varying study results. J. R. Soc. Med. 100 , 579–582 (2007)

Shadish, W. R . & Lecy, J. D. The meta-analytic big bang. Res. Synth. Methods 6 , 246–264 (2015)

Glass, G. V. Primary, secondary, and meta-analysis of research. Educ. Res. 5 , 3–8 (1976)

DerSimonian, R . & Laird, N. Meta-analysis in clinical trials. Control. Clin. Trials 7 , 177–188 (1986)

Lipsey, M. W . & Wilson, D. B. The efficacy of psychological, educational, and behavioral treatment. Confirmation from meta-analysis. Am. Psychol. 48 , 1181–1209 (1993)

Chalmers, I. & Altman, D. G. Systematic Reviews (BMJ Publishing Group, 1995)

Moher, D . et al. Improving the quality of reports of meta-analyses of randomised controlled trials: the QUOROM statement. Quality of reporting of meta-analyses. Lancet 354 , 1896–1900 (1999)

Higgins, J. P. & Thompson, S. G. Quantifying heterogeneity in a meta-analysis. Stat. Med. 21 , 1539–1558 (2002)

Download references

Acknowledgements

We dedicate this Review to the memory of Ingram Olkin and William Shadish, founding members of the Society for Research Synthesis Methodology who made tremendous contributions to the development of meta-analysis and research synthesis and to the supervision of generations of students. We thank L. Lagisz for help in preparing the figures. We are grateful to the Center for Open Science and the Laura and John Arnold Foundation for hosting and funding a workshop, which was the origination of this article. S.N. is supported by Australian Research Council Future Fellowship (FT130100268). J.G. acknowledges funding from the US National Science Foundation (ABI 1262402).

Author information

Authors and affiliations.

Department of Ecology and Evolution, Stony Brook University, Stony Brook, 11794-5245, New York, USA

Jessica Gurevitch

School of Biological Sciences, Royal Holloway University of London, Egham, TW20 0EX, Surrey, UK

Julia Koricheva

Evolution and Ecology Research Centre and School of Biological, Earth and Environmental Sciences, University of New South Wales, Sydney, 2052, New South Wales, Australia

Shinichi Nakagawa

Diabetes and Metabolism Division, Garvan Institute of Medical Research, 384 Victoria Street, Darlinghurst, Sydney, 2010, New South Wales, Australia

School of Natural and Environmental Sciences, Newcastle University, Newcastle upon Tyne, NE1 7RU, UK

Gavin Stewart


Contributions

All authors contributed equally in designing the study and writing the manuscript, and so are listed alphabetically.

Corresponding authors

Correspondence to Jessica Gurevitch, Julia Koricheva, Shinichi Nakagawa or Gavin Stewart.

Ethics declarations

Competing interests.

The authors declare no competing financial interests.

Additional information

Reviewer Information Nature thanks D. Altman, M. Lajeunesse, D. Moher and G. Romero for their contribution to the peer review of this work.

Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article

Cite this article.

Gurevitch, J., Koricheva, J., Nakagawa, S. et al. Meta-analysis and the science of research synthesis. Nature 555, 175–182 (2018). https://doi.org/10.1038/nature25753


Received: 04 March 2017

Accepted: 12 January 2018

Published: 08 March 2018

Issue Date: 08 March 2018

DOI: https://doi.org/10.1038/nature25753



Systematic reviews vs meta-analysis: what’s the difference?

Posted on 24th July 2023 by Verónica Tanco Tellechea

""

You may hear the terms ‘systematic review’ and ‘meta-analysis’ used interchangeably. Although they are related, they are distinctly different. Learn more in this blog for beginners.

What is a systematic review?

According to Cochrane (1), a systematic review attempts to identify, appraise and synthesize all the empirical evidence to answer a specific research question. Thus, a systematic review is where you might find the most relevant, adequate, and current information regarding a specific topic. In the levels of evidence pyramid, systematic reviews are only surpassed by meta-analyses.

To conduct a systematic review, you will need, among other things: 

  • A specific research question, usually in the form of a PICO question.
  • Pre-specified eligibility criteria, to decide which articles will be included or discarded from the review. 
  • To follow a systematic method that will minimize bias.

You can find protocols that will guide you from both Cochrane and the Equator Network, among other places; if you are new to the topic, have a read of an overview about systematic reviews.

What is a meta-analysis?

A meta-analysis is a quantitative, epidemiological study design used to systematically assess the results of previous research (2). Meta-analyses are usually, though not always, based on randomized controlled trials. In essence, a meta-analysis is a statistical tool that allows researchers to combine outcomes from multiple studies mathematically.
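What this mathematical combination typically looks like is an inverse-variance weighted average of the per-study effect estimates. A minimal sketch in Python (the effect sizes and standard errors are invented for illustration, and `fixed_effect_pool` is just a hypothetical helper name):

```python
import math

def fixed_effect_pool(effects, ses):
    """Fixed-effect meta-analysis: inverse-variance weighted average."""
    weights = [1.0 / se ** 2 for se in ses]          # more precise studies weigh more
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    se_pooled = math.sqrt(1.0 / sum(weights))        # standard error of the pooled estimate
    ci95 = (pooled - 1.96 * se_pooled, pooled + 1.96 * se_pooled)
    return pooled, se_pooled, ci95

# Three hypothetical trials reporting a mean difference and its standard error
pooled, se_pooled, ci95 = fixed_effect_pool([0.30, 0.10, 0.25], [0.10, 0.15, 0.20])
```

Because the weights are inverse variances, one large, precise trial can dominate the result, which is one reason the quality and similarity of the included studies matter so much.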

When can a meta-analysis be implemented?

A meta-analysis can, in principle, always be conducted; however, it yields the most reliable results when the studies included in the systematic review are of good quality, have similar designs, and use similar outcome measures.

Why are meta-analyses important?

Outcomes from a meta-analysis may provide a more precise estimate of the effect under study because they merge outcomes from multiple studies. In a meta-analysis, data from various trials are combined to generate an average result (1), which is portrayed in a forest plot diagram. Meta-analyses also often include a funnel plot diagram to help detect publication bias visually.

Conclusions

A systematic review is an article that synthesizes the available evidence on a certain topic using a specific research question, pre-specified eligibility criteria for including articles, and a systematic method for its production. A meta-analysis, by contrast, is a quantitative, epidemiological study design used to assess the results of the articles included in a systematic review.

                       
  • Definition. Systematic review: synthesis of empirical evidence regarding a specific research question. Meta-analysis: statistical tool used with quantitative outcomes of various studies regarding a specific topic.
  • Results. Systematic review: synthesizes relevant and current information regarding a specific research question (qualitative). Meta-analysis: merges multiple outcomes from different studies and provides an average result (quantitative).

Remember: All meta-analyses involve a systematic review, but not all systematic reviews involve a meta-analysis.

If you would like some further reading on this topic, we suggest the following:

The systematic review – a S4BE blog article

Meta-analysis: what, why, and how – a S4BE blog article

The difference between a systematic review and a meta-analysis – a blog article via Covidence

Systematic review vs meta-analysis: what’s the difference? A 5-minute video from Research Masterminds.

  • About Cochrane reviews [Internet]. Cochranelibrary.com. [cited 2023 Apr 30]. Available from: https://www.cochranelibrary.com/about/about-cochrane-reviews
  • Haidich AB. Meta-analysis in medical research. Hippokratia. 2010;14(Suppl 1):29–37.




Case study meta-analysis in the social sciences. Insights on data quality and reliability from a large-N case survey

Affiliations.

  • 1 Research Group on Ecological Economics, Carl von Ossietzky University of Oldenburg, Oldenburg, Germany.
  • 2 Research Group Governance and Sustainability, Leuphana University of Lüneburg, Lüneburg, Germany.
  • 3 Waterways Centre for Freshwater Management, University of Canterbury, Christchurch, New Zealand.
  • 4 IRS Leibniz Institute for Research on Society and Space, Erkner, Germany.
  • 5 Faculty of Sustainability, Leuphana University of Lüneburg, Lüneburg, Germany.
  • PMID: 34318609
  • DOI: 10.1002/jrsm.1514

Meta-analytical methods face particular challenges in research fields such as social and political research, where studies often rest primarily on qualitative and case study research. In such contexts, where research findings are less standardized and amenable to structured synthesis, the case survey method has been proposed as a means of data generation and analysis. The method offers a meta-analytical tool to synthesize larger numbers of qualitative case studies, yielding data amenable to large-N analysis. However, resulting data is prone to specific threats to validity, including biases due to publication type, rater behaviour, and variable characteristics, which researchers need to be aware of. While these biases are well known in theory, and typically explored for primary research, their prevalence in case survey meta-analyses remains relatively unexplored. We draw on a case survey of 305 published qualitative case studies of public environmental decision-making, and systematically analyze these biases in the resultant data. Our findings indicate that case surveys can deliver high-quality and reliable results. However, we also find that these biases do indeed occur, albeit to a small degree or under specific conditions of complexity. We identify a number of design choices to mitigate biases that may threaten validity in case survey meta-analysis. Our findings are of importance to those using the case survey method - and to those who might apply insights derived by this method to inform policy and practice.

Keywords: case survey method; evidence-based governance; inter-rater reliability; meta-analysis; publication bias.

© 2021 The Authors. Research Synthesis Methods published by John Wiley & Sons Ltd.


Grants and funding

  • 263859/FP7 Ideas: European Research Council



Chapter 10: Analysing data and undertaking meta-analyses

Jonathan J Deeks, Julian PT Higgins, Douglas G Altman; on behalf of the Cochrane Statistical Methods Group

Key Points:

  • Meta-analysis is the statistical combination of results from two or more separate studies.
  • Potential advantages of meta-analyses include an improvement in precision, the ability to answer questions not posed by individual studies, and the opportunity to settle controversies arising from conflicting claims. However, they also have the potential to mislead seriously, particularly if specific study designs, within-study biases, variation across studies, and reporting biases are not carefully considered.
  • It is important to be familiar with the type of data (e.g. dichotomous, continuous) that result from measurement of an outcome in an individual study, and to choose suitable effect measures for comparing intervention groups.
  • Most meta-analysis methods are variations on a weighted average of the effect estimates from the different studies.
  • Studies with no events contribute no information about the risk ratio or odds ratio. For rare events, the Peto method has been observed to be less biased and more powerful than other methods.
  • Variation across studies (heterogeneity) must be considered, although most Cochrane Reviews do not have enough studies to allow for the reliable investigation of its causes. Random-effects meta-analyses allow for heterogeneity by assuming that underlying effects follow a normal distribution, but they must be interpreted carefully. Prediction intervals from random-effects meta-analyses are a useful device for presenting the extent of between-study variation.
  • Many judgements are required in the process of preparing a meta-analysis. Sensitivity analyses should be used to examine whether overall findings are robust to potentially influential decisions.
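The random-effects and heterogeneity points above can be sketched numerically. The following is a hedged illustration of the DerSimonian–Laird estimator (one common approach among several), computing Cochran's Q, the between-study variance tau², I², and an approximate prediction interval. The input data are invented, and the prediction interval uses a normal quantile rather than the t-based version usually recommended:

```python
import math

def random_effects(effects, ses):
    """DerSimonian-Laird random-effects meta-analysis (a sketch)."""
    w = [1.0 / se ** 2 for se in ses]
    fixed = sum(wi * e for wi, e in zip(w, effects)) / sum(w)
    # Cochran's Q: weighted squared deviations from the fixed-effect estimate
    q = sum(wi * (e - fixed) ** 2 for wi, e in zip(w, effects))
    df = len(effects) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)                    # between-study variance
    i2 = max(0.0, 100.0 * (q - df) / q) if q > 0 else 0.0
    w_star = [1.0 / (se ** 2 + tau2) for se in ses]  # random-effects weights
    mu = sum(wi * e for wi, e in zip(w_star, effects)) / sum(w_star)
    se_mu = math.sqrt(1.0 / sum(w_star))
    # Approximate 95% prediction interval for the effect in a new study
    half = 1.96 * math.sqrt(tau2 + se_mu ** 2)
    return mu, se_mu, tau2, i2, (mu - half, mu + half)

mu, se_mu, tau2, i2, pred = random_effects([0.1, 0.3, 0.5, 0.2], [0.1, 0.1, 0.1, 0.1])
```

Whenever tau² > 0 the prediction interval is wider than the confidence interval for the mean effect, which is what makes it useful for communicating the extent of between-study variation.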

Cite this chapter as: Deeks JJ, Higgins JPT, Altman DG (editors). Chapter 10: Analysing data and undertaking meta-analyses. In: Higgins JPT, Thomas J, Chandler J, Cumpston M, Li T, Page MJ, Welch VA (editors). Cochrane Handbook for Systematic Reviews of Interventions version 6.4 (updated August 2023). Cochrane, 2023. Available from www.training.cochrane.org/handbook.

10.1 Do not start here!

It can be tempting to jump prematurely into a statistical analysis when undertaking a systematic review. The production of a diamond at the bottom of a plot is an exciting moment for many authors, but results of meta-analyses can be very misleading if suitable attention has not been given to formulating the review question; specifying eligibility criteria; identifying and selecting studies; collecting appropriate data; considering risk of bias; planning intervention comparisons; and deciding what data would be meaningful to analyse. Review authors should consult the chapters that precede this one before a meta-analysis is undertaken.

10.2 Introduction to meta-analysis

An important step in a systematic review is the thoughtful consideration of whether it is appropriate to combine the numerical results of all, or perhaps some, of the studies. Such a meta-analysis yields an overall statistic (together with its confidence interval) that summarizes the effectiveness of an experimental intervention compared with a comparator intervention. Potential advantages of meta-analyses include the following:

  • To improve precision. Many studies are too small to provide convincing evidence about intervention effects in isolation. Estimation is usually improved when it is based on more information.
  • To answer questions not posed by the individual studies. Primary studies often involve a specific type of participant and explicitly defined interventions. A selection of studies in which these characteristics differ can allow investigation of the consistency of effect across a wider range of populations and interventions. It may also, if relevant, allow reasons for differences in effect estimates to be investigated.
  • To settle controversies arising from apparently conflicting studies or to generate new hypotheses. Statistical synthesis of findings allows the degree of conflict to be formally assessed, and reasons for different results to be explored and quantified.

Of course, the use of statistical synthesis methods does not guarantee that the results of a review are valid, any more than it does for a primary study. Moreover, like any tool, statistical methods can be misused.

This chapter describes the principles and methods used to carry out a meta-analysis for a comparison of two interventions for the main types of data encountered. The use of network meta-analysis to compare more than two interventions is addressed in Chapter 11. Formulae for most of the methods described are provided in the RevMan Web Knowledge Base under Statistical Algorithms and calculations used in Review Manager (documentation.cochrane.org/revman-kb/statistical-methods-210600101.html), and a longer discussion of many of the issues is available (Deeks et al 2001).

10.2.1 Principles of meta-analysis

The commonly used methods for meta-analysis are based on the following principles:

  • Meta-analysis is typically a two-stage process. In the first stage, a summary statistic is calculated for each study, to describe the observed intervention effect in the same way for every study. For example, the summary statistic may be a risk ratio if the data are dichotomous, or a difference between means if the data are continuous (see Chapter 6).

  • In the second stage, a summary (pooled) intervention effect estimate is calculated as a weighted average of the intervention effects estimated in the individual studies, where the weights reflect the amount of information each study contributes.

  • The combination of intervention effect estimates across studies may optionally incorporate an assumption that the studies are not all estimating the same intervention effect, but estimate intervention effects that follow a distribution across studies. This is the basis of a random-effects meta-analysis (see Section 10.10.4). Alternatively, if it is assumed that each study is estimating exactly the same quantity, then a fixed-effect meta-analysis is performed.
  • The standard error of the summary intervention effect can be used to derive a confidence interval, which communicates the precision (or uncertainty) of the summary estimate; and to derive a P value, which communicates the strength of the evidence against the null hypothesis of no intervention effect.
  • As well as yielding a summary quantification of the intervention effect, all methods of meta-analysis can incorporate an assessment of whether the variation among the results of the separate studies is compatible with random variation, or whether it is large enough to indicate inconsistency of intervention effects across studies (see Section 10.10).
  • The problem of missing data is one of the numerous practical considerations that must be thought through when undertaking a meta-analysis. In particular, review authors should consider the implications of missing outcome data from individual participants (due to losses to follow-up or exclusions from analysis) (see Section 10.12).
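The confidence interval and P value described in the principles above follow directly from the summary estimate and its standard error via the normal approximation. A minimal sketch, using a hypothetical summary effect and standard error:

```python
import math

def ci_and_p_value(estimate, se):
    """95% confidence interval and two-sided P value for a summary
    intervention effect, using the normal approximation."""
    ci = (estimate - 1.96 * se, estimate + 1.96 * se)
    z = abs(estimate) / se
    # two-sided P value from the standard normal distribution
    p = 2 * (1 - 0.5 * (1 + math.erf(z / math.sqrt(2))))
    return ci, p

# Hypothetical summary effect (e.g. a log risk ratio) with its standard error
(lo, hi), p = ci_and_p_value(0.5, 0.25)
```

A narrower standard error shrinks the interval and strengthens the evidence against the null hypothesis of no effect.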

Meta-analyses are usually illustrated using a forest plot. An example appears in Figure 10.2.a. A forest plot displays effect estimates and confidence intervals for both individual studies and meta-analyses (Lewis and Clarke 2001). Each study is represented by a block at the point estimate of intervention effect, with a horizontal line extending either side of the block. The area of the block indicates the weight assigned to that study in the meta-analysis, while the horizontal line depicts the confidence interval (usually with a 95% level of confidence). The area of the block and the confidence interval convey similar information, but each makes a different contribution to the graphic. The confidence interval depicts the range of intervention effects compatible with the study’s result. The size of the block draws the eye towards the studies with larger weight (usually those with narrower confidence intervals), which dominate the calculation of the summary result, presented as a diamond at the bottom.

Figure 10.2.a Example of a forest plot from a review of interventions to promote ownership of smoke alarms (DiGuiseppi and Higgins 2001). Reproduced with permission of John Wiley & Sons


10.3 A generic inverse-variance approach to meta-analysis

A common and simple version of the meta-analysis procedure is the inverse-variance method. This approach is implemented in its most basic form in RevMan, and is used behind the scenes in many meta-analyses of both dichotomous and continuous data.

The inverse-variance method is so named because the weight given to each study is chosen to be the inverse of the variance of the effect estimate (i.e. 1 over the square of its standard error). Thus, larger studies, which have smaller standard errors, are given more weight than smaller studies, which have larger standard errors. This choice of weights minimizes the imprecision (uncertainty) of the pooled effect estimate.

10.3.1 Fixed-effect method for meta-analysis

A fixed-effect meta-analysis using the inverse-variance method calculates a weighted average as:

$$\hat{\theta}_{IV} = \frac{\sum_i Y_i / \mathrm{SE}_i^2}{\sum_i 1 / \mathrm{SE}_i^2}, \qquad \mathrm{SE}\left(\hat{\theta}_{IV}\right) = \frac{1}{\sqrt{\sum_i 1 / \mathrm{SE}_i^2}}$$

where Y_i is the intervention effect estimated in the i-th study, SE_i is the standard error of that estimate, and the summation is across all studies. The basic data required for the analysis are therefore an estimate of the intervention effect and its standard error from each study. A fixed-effect meta-analysis is valid under an assumption that all effect estimates are estimating the same underlying intervention effect, which is referred to variously as a ‘fixed-effect’ assumption, a ‘common-effect’ assumption or an ‘equal-effects’ assumption. However, the result of the meta-analysis can be interpreted without making such an assumption (Rice et al 2018).
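As a minimal sketch of this calculation (the effect estimates and standard errors below are invented for illustration):

```python
import math

def fixed_effect_meta(estimates, std_errors):
    """Inverse-variance fixed-effect meta-analysis: weight each study by
    1/SE^2, then return the weighted average and its standard error."""
    weights = [1.0 / se**2 for se in std_errors]
    pooled = sum(w * y for w, y in zip(weights, estimates)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return pooled, pooled_se

# Three hypothetical studies (e.g. log odds ratios with standard errors)
pooled, pooled_se = fixed_effect_meta([0.2, 0.5, 0.1], [0.10, 0.25, 0.30])
```

Note how the study with the smallest standard error dominates the pooled result, and the pooled standard error is smaller than that of any single study.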

10.3.2 Random-effects methods for meta-analysis

A variation on the inverse-variance method is to incorporate an assumption that the different studies are estimating different, yet related, intervention effects (Higgins et al 2009). This produces a random-effects meta-analysis, and the simplest version is known as the DerSimonian and Laird method (DerSimonian and Laird 1986). Random-effects meta-analysis is discussed in detail in Section 10.10.4.
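A sketch of the DerSimonian and Laird calculation, with invented estimates and standard errors: the between-study variance tau² is estimated from Cochran's Q, and each study is then re-weighted by the inverse of its within-study variance plus tau².

```python
import math

def dersimonian_laird(estimates, std_errors):
    """Random-effects meta-analysis via the DerSimonian and Laird method."""
    w = [1.0 / se**2 for se in std_errors]
    fixed = sum(wi * y for wi, y in zip(w, estimates)) / sum(w)
    # Cochran's Q: weighted squared deviations from the fixed-effect estimate
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, estimates))
    k = len(estimates)
    c = sum(w) - sum(wi**2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c)   # between-study variance estimate
    w_star = [1.0 / (se**2 + tau2) for se in std_errors]
    pooled = sum(wi * y for wi, y in zip(w_star, estimates)) / sum(w_star)
    return pooled, math.sqrt(1.0 / sum(w_star)), tau2

pooled, se, tau2 = dersimonian_laird([0.1, 0.6, 0.3], [0.1, 0.1, 0.1])
```

When the study results are more variable than their standard errors can explain, tau² is positive and the pooled confidence interval widens relative to the fixed-effect analysis.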

10.3.3 Performing inverse-variance meta-analyses

Most meta-analysis programs perform inverse-variance meta-analyses. Usually the user provides summary data from each intervention arm of each study, such as a 2×2 table when the outcome is dichotomous (see Chapter 6, Section 6.4), or means, standard deviations and sample sizes for each group when the outcome is continuous (see Chapter 6, Section 6.5). This avoids the need for the author to calculate effect estimates, and allows the use of methods targeted specifically at different types of data (see Sections 10.4 and 10.5).

When the data are conveniently available as summary statistics from each intervention group, the inverse-variance method can be implemented directly. For example, estimates and their standard errors may be entered directly into RevMan under the ‘Generic inverse variance’ outcome type. For ratio measures of intervention effect, the data must be entered into RevMan as natural logarithms (for example, as a log odds ratio and the standard error of the log odds ratio). However, it is straightforward to instruct the software to display results on the original (e.g. odds ratio) scale. It is possible to supplement or replace this with a column providing the sample sizes in the two groups. Note that the ability to enter estimates and standard errors creates a high degree of flexibility in meta-analysis. It facilitates the analysis of properly analysed crossover trials, cluster-randomized trials and non-randomized trials (see Chapter 23), as well as outcome data that are ordinal, time-to-event or rates (see Chapter 6).
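For example, a log odds ratio and its standard error suitable for generic inverse-variance entry can be computed from a 2×2 table as follows (the counts are invented for illustration):

```python
import math

def log_odds_ratio(a, b, c, d):
    """Log odds ratio and its standard error from a 2x2 table:
    a/b = events/non-events (experimental group),
    c/d = events/non-events (comparator group)."""
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return log_or, se

# Hypothetical study: 10 events out of 100 vs 20 events out of 100
log_or, se = log_odds_ratio(10, 90, 20, 80)
odds_ratio = math.exp(log_or)   # back-transform to the odds ratio scale
```

Working on the log scale is what makes the ratio measure approximately normally distributed, which the inverse-variance machinery assumes.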

10.4 Meta-analysis of dichotomous outcomes

There are four widely used methods of meta-analysis for dichotomous outcomes, three fixed-effect methods (Mantel-Haenszel, Peto and inverse variance) and one random-effects method (DerSimonian and Laird inverse variance). All of these methods are available as analysis options in RevMan. The Peto method can only combine odds ratios, whilst the other three methods can combine odds ratios, risk ratios or risk differences. Formulae for all of the meta-analysis methods are available elsewhere (Deeks et al 2001).

Note that having no events in one group (sometimes referred to as ‘zero cells’) causes problems with computation of estimates and standard errors with some methods: see Section 10.4.4.

10.4.1 Mantel-Haenszel methods

When data are sparse, either in terms of event risks being low or study size being small, the estimates of the standard errors of the effect estimates that are used in the inverse-variance methods may be poor. Mantel-Haenszel methods are fixed-effect meta-analysis methods using a different weighting scheme that depends on which effect measure (e.g. risk ratio, odds ratio, risk difference) is being used (Mantel and Haenszel 1959, Greenland and Robins 1985). They have been shown to have better statistical properties when there are few events. As this is a common situation in Cochrane Reviews, the Mantel-Haenszel method is generally preferable to the inverse variance method in fixed-effect meta-analyses. In other situations the two methods give similar estimates.
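A sketch of the Mantel-Haenszel pooled odds ratio, with invented sparse data: each study contributes a·d/n to the numerator and b·c/n to the denominator, so the pooling does not depend on per-study variance estimates.

```python
def mantel_haenszel_or(tables):
    """Mantel-Haenszel pooled odds ratio from 2x2 tables (a, b, c, d),
    where a/b = events/non-events (experimental) and c/d = the comparator."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return num / den

# Two hypothetical studies with few events
or_mh = mantel_haenszel_or([(3, 97, 6, 94), (1, 49, 2, 48)])
```

Because no within-study standard error is needed, the method remains stable when event counts are small.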

10.4.2 Peto odds ratio method

Peto’s method can only be used to combine odds ratios (Yusuf et al 1985). It uses an inverse-variance approach, but uses an approximate method of estimating the log odds ratio, and uses different weights. An alternative way of viewing the Peto method is as a sum of ‘O – E’ statistics. Here, O is the observed number of events and E is an expected number of events in the experimental intervention group of each study under the null hypothesis of no intervention effect.

The approximation used in the computation of the log odds ratio works well when intervention effects are small (odds ratios are close to 1), events are not particularly common and the studies have similar numbers in experimental and comparator groups. In other situations it has been shown to give biased answers. As these criteria are not always fulfilled, Peto’s method is not recommended as a default approach for meta-analysis.

Corrections for zero cell counts are not necessary when using Peto’s method. Perhaps for this reason, this method performs well when events are very rare (Bradburn et al 2007); see Section 10.4.4.1. Also, Peto’s method can be used to combine studies with dichotomous outcome data with studies using time-to-event analyses where log-rank tests have been used (see Section 10.9).
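A sketch of the Peto one-step calculation (hypothetical counts): for each study, O is the observed number of events in the experimental group, E its expectation under the null hypothesis, and V the hypergeometric variance; the log odds ratio is approximated by Σ(O − E)/ΣV. Note that the zero cell below needs no correction.

```python
import math

def peto_odds_ratio(tables):
    """Peto one-step odds ratio from 2x2 tables (a, b, c, d):
    a/b = events/non-events (experimental), c/d = events/non-events (comparator)."""
    sum_o_minus_e, sum_v = 0.0, 0.0
    for a, b, c, d in tables:
        n = a + b + c + d
        e = (a + b) * (a + c) / n        # expected events under the null
        v = (a + b) * (c + d) * (a + c) * (b + d) / (n**2 * (n - 1))
        sum_o_minus_e += a - e
        sum_v += v
    log_or = sum_o_minus_e / sum_v
    return math.exp(log_or), 1.0 / math.sqrt(sum_v)   # OR, SE of log OR

# Hypothetical study with no events in the experimental arm
or_peto, se = peto_odds_ratio([(0, 100, 5, 95)])
```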

10.4.3 Which effect measure for dichotomous outcomes?

Effect measures for dichotomous data are described in Chapter 6, Section 6.4.1. The effect of an intervention can be expressed as either a relative or an absolute effect. The risk ratio (relative risk) and odds ratio are relative measures, while the risk difference and number needed to treat for an additional beneficial outcome are absolute measures. A further complication is that there are, in fact, two risk ratios. We can calculate the risk ratio of an event occurring or the risk ratio of no event occurring. These give different summary results in a meta-analysis, sometimes dramatically so.
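A toy example makes the point: with invented risks of 30% and 60%, the risk ratio of the event and the risk ratio of the non-event are not simply reciprocals of each other.

```python
# Hypothetical trial: 30/100 events (experimental) vs 60/100 (comparator)
rr_event = (30 / 100) / (60 / 100)      # risk ratio of the event occurring
rr_no_event = (70 / 100) / (40 / 100)   # risk ratio of no event occurring
# rr_event halves the risk, yet rr_no_event is 1.75, not the reciprocal 2
```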

The selection of a summary statistic for use in meta-analysis depends on balancing three criteria (Deeks 2002). First, we desire a summary statistic that gives values that are similar for all the studies in the meta-analysis and subdivisions of the population to which the interventions will be applied. The more consistent the summary statistic, the greater is the justification for expressing the intervention effect as a single summary number. Second, the summary statistic must have the mathematical properties required to perform a valid meta-analysis. Third, the summary statistic would ideally be easily understood and applied by those using the review. The summary intervention effect should be presented in a way that helps readers to interpret and apply the results appropriately. Among effect measures for dichotomous data, no single measure is uniformly best, so the choice inevitably involves a compromise.

Consistency Empirical evidence suggests that relative effect measures are, on average, more consistent than absolute measures (Engels et al 2000, Deeks 2002, Rücker et al 2009). For this reason, it is wise to avoid performing meta-analyses of risk differences, unless there is a clear reason to suspect that risk differences will be consistent in a particular clinical situation. On average there is little difference between the odds ratio and risk ratio in terms of consistency (Deeks 2002). When the study aims to reduce the incidence of an adverse event, there is empirical evidence that risk ratios of the adverse event are more consistent than risk ratios of the non-event (Deeks 2002). Selecting an effect measure based on what is the most consistent in a particular situation is not a generally recommended strategy, since it may lead to a selection that spuriously maximizes the precision of a meta-analysis estimate.

Mathematical properties The most important mathematical criterion is the availability of a reliable variance estimate. The number needed to treat for an additional beneficial outcome does not have a simple variance estimator and cannot easily be used directly in meta-analysis, although it can be computed from the meta-analysis result afterwards (see Chapter 15, Section 15.4.2 ). There is no consensus regarding the importance of two other often-cited mathematical properties: the fact that the behaviour of the odds ratio and the risk difference do not rely on which of the two outcome states is coded as the event, and the odds ratio being the only statistic which is unbounded (see Chapter 6, Section 6.4.1 ).

Ease of interpretation The odds ratio is the hardest summary statistic to understand and to apply in practice, and many practising clinicians report difficulties in using it. There are many published examples where authors have misinterpreted odds ratios from meta-analyses as risk ratios. Although odds ratios can be re-expressed for interpretation (as discussed below), there must be some concern that routine presentation of the results of systematic reviews as odds ratios will lead to frequent over-estimation of the benefits and harms of interventions when the results are applied in clinical practice. Absolute measures of effect are thought to be more easily interpreted by clinicians than relative effects (Sinclair and Bracken 1994), and allow trade-offs to be made between likely benefits and likely harms of interventions. However, they are less likely to be generalizable.

It is generally recommended that meta-analyses are undertaken using risk ratios (taking care to make a sensible choice over which category of outcome is classified as the event) or odds ratios. This is because it seems important to avoid using summary statistics for which there is empirical evidence that they are unlikely to give consistent estimates of intervention effects (the risk difference), and it is impossible to use statistics for which meta-analysis cannot be performed (the number needed to treat for an additional beneficial outcome). It may be wise to plan to undertake a sensitivity analysis to investigate whether choice of summary statistic (and selection of the event category) is critical to the conclusions of the meta-analysis (see Section 10.14).

It is often sensible to use one statistic for meta-analysis and to re-express the results using a second, more easily interpretable statistic. For example, often meta-analysis may be best performed using relative effect measures (risk ratios or odds ratios) and the results re-expressed using absolute effect measures (risk differences or numbers needed to treat for an additional beneficial outcome – see Chapter 15, Section 15.4. This is one of the key motivations for ‘Summary of findings’ tables in Cochrane Reviews: see Chapter 14). If odds ratios are used for meta-analysis they can also be re-expressed as risk ratios (see Chapter 15, Section 15.4). In all cases the same formulae can be used to convert upper and lower confidence limits. However, all of these transformations require specification of a value of baseline risk that indicates the likely risk of the outcome in the ‘control’ population to which the experimental intervention will be applied. Where the chosen value for this assumed comparator group risk is close to the typical observed comparator group risks across the studies, similar estimates of absolute effect will be obtained regardless of whether odds ratios or risk ratios are used for meta-analysis. Where the assumed comparator risk differs from the typical observed comparator group risk, the predictions of absolute benefit will differ according to which summary statistic was used for meta-analysis.
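A sketch of this re-expression (the pooled odds ratio of 0.6 and the assumed comparator risk of 20% are both hypothetical): convert the assumed comparator risk to odds, apply the odds ratio, and convert back to a risk.

```python
def risk_from_odds_ratio(odds_ratio, assumed_comparator_risk):
    """Re-express an odds ratio as the corresponding risk in the
    experimental group, for an assumed comparator group risk."""
    comparator_odds = assumed_comparator_risk / (1 - assumed_comparator_risk)
    experimental_odds = odds_ratio * comparator_odds
    return experimental_odds / (1 + experimental_odds)

experimental_risk = risk_from_odds_ratio(0.6, 0.20)
risk_difference = experimental_risk - 0.20   # absolute effect at this baseline risk
```

Repeating the calculation for several plausible baseline risks shows how the implied absolute effect changes with the assumed comparator group risk.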

10.4.4 Meta-analysis of rare events

For rare outcomes, meta-analysis may be the only way to obtain reliable evidence of the effects of healthcare interventions. Individual studies are usually under-powered to detect differences in rare outcomes, but a meta-analysis of many studies may have adequate power to investigate whether interventions do have an impact on the incidence of the rare event. However, many methods of meta-analysis are based on large sample approximations, and are unsuitable when events are rare. Thus authors must take care when selecting a method of meta-analysis (Efthimiou 2018).

There is no single risk at which events are classified as ‘rare’. Certainly risks of 1 in 1000 constitute rare events, and many would classify risks of 1 in 100 the same way. However, the performance of methods when risks are as high as 1 in 10 may also be affected by the issues discussed in this section. What is typical is that a high proportion of the studies in the meta-analysis observe no events in one or more study arms.

10.4.4.1 Studies with no events in one or more arms

Computational problems can occur when no events are observed in one or both groups in an individual study. Inverse variance meta-analytical methods involve computing an intervention effect estimate and its standard error for each study. For studies where no events were observed in one or both arms, these computations often involve dividing by a zero count, which yields a computational error. Most meta-analytical software routines (including those in RevMan) automatically check for problematic zero counts, and add a fixed value (typically 0.5) to all cells of a 2×2 table where the problems occur. The Mantel-Haenszel methods require zero-cell corrections only if the same cell is zero in all the included studies, and hence need to use the correction less often. However, in many software applications the same correction rules are applied for Mantel-Haenszel methods as for the inverse-variance methods. Odds ratio and risk ratio methods require zero cell corrections more often than difference methods, except for the Peto odds ratio method, which encounters computation problems only in the extreme situation of no events occurring in all arms of all studies.
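The fixed 0.5 correction can be sketched as follows (the counts are hypothetical); the correction is applied to all four cells of any table containing a zero:

```python
import math

def log_or_zero_corrected(a, b, c, d):
    """Log odds ratio and standard error, adding the conventional fixed
    0.5 to every cell of a 2x2 table when any cell count is zero."""
    if 0 in (a, b, c, d):
        a, b, c, d = a + 0.5, b + 0.5, c + 0.5, d + 0.5
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return log_or, se

# No events in the experimental arm: the uncorrected formula would divide by zero
log_or, se = log_or_zero_corrected(0, 50, 4, 46)
```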

Whilst the fixed correction meets the objective of avoiding computational errors, it usually has the undesirable effect of biasing study estimates towards no difference and over-estimating variances of study estimates (consequently inappropriately down-weighting their contribution to the meta-analysis). Where the sizes of the study arms are unequal (which occurs more commonly in non-randomized studies than randomized trials), the fixed correction will introduce a directional bias in the treatment effect. Alternative non-fixed zero-cell corrections have been explored by Sweeting and colleagues, including a correction proportional to the reciprocal of the size of the contrasting study arm, which they found preferable to the fixed 0.5 correction when arm sizes were not balanced (Sweeting et al 2004).

10.4.4.2 Studies with no events in either arm

The standard practice in meta-analysis of odds ratios and risk ratios is to exclude studies from the meta-analysis where there are no events in both arms. This is because such studies do not provide any indication of either the direction or magnitude of the relative treatment effect. Whilst it may be clear that events are very rare on both the experimental intervention and the comparator intervention, no information is provided as to which group is likely to have the higher risk, or on whether the risks are of the same or different orders of magnitude (when risks are very low, they are compatible with very large or very small ratios). Whilst one might be tempted to infer that the risk would be lowest in the group with the larger sample size (as the upper limit of the confidence interval would be lower), this is not justified as the sample size allocation was determined by the study investigators and is not a measure of the incidence of the event.

Risk difference methods superficially appear to have an advantage over odds ratio methods in that the risk difference is defined (as zero) when no events occur in either arm. Such studies are therefore included in the estimation process. Bradburn and colleagues undertook simulation studies which revealed that all risk difference methods yield confidence intervals that are too wide when events are rare, and have associated poor statistical power, which make them unsuitable for meta-analysis of rare events (Bradburn et al 2007). This is especially relevant when outcomes that focus on treatment safety are being studied, as the ability to identify correctly (or attempt to refute) serious adverse events is a key issue in drug development.

It is likely that outcomes for which no events occur in either arm may not be mentioned in reports of many randomized trials, precluding their inclusion in a meta-analysis. It is unclear, though, when working with published results, whether failure to mention a particular adverse event means there were no such events, or simply that such events were not included as a measured endpoint. Whilst the results of risk difference meta-analyses will be affected by non-reporting of outcomes with no events, odds and risk ratio based methods naturally exclude these data whether or not they are published, and are therefore unaffected.

10.4.4.3 Validity of methods of meta-analysis for rare events

Simulation studies have revealed that many meta-analytical methods can give misleading results for rare events, which is unsurprising given their reliance on asymptotic statistical theory. Their performance has been judged suboptimal either through results being biased, confidence intervals being inappropriately wide, or statistical power being too low to detect substantial differences.

In the following we consider the choice of statistical method for meta-analyses of odds ratios. Appropriate choices appear to depend on the comparator group risk, the likely size of the treatment effect and consideration of balance in the numbers of experimental and comparator participants in the constituent studies. We are not aware of research that has evaluated risk ratio measures directly, but their performance is likely to be very similar to corresponding odds ratio measures. When events are rare, estimates of odds and risks are near identical, and results of both can be interpreted as ratios of probabilities.

Bradburn and colleagues found that many of the most commonly used meta-analytical methods were biased when events were rare (Bradburn et al 2007). The bias was greatest in inverse variance and DerSimonian and Laird odds ratio and risk difference methods, and the Mantel-Haenszel odds ratio method using a 0.5 zero-cell correction. As already noted, risk difference meta-analytical methods tended to show conservative confidence interval coverage and low statistical power when risks of events were low.

At event rates below 1% the Peto one-step odds ratio method was found to be the least biased and most powerful method, and provided the best confidence interval coverage, provided there was no substantial imbalance between treatment and comparator group sizes within studies, and treatment effects were not exceptionally large. This finding was consistently observed across three different meta-analytical scenarios, and was also observed by Sweeting and colleagues (Sweeting et al 2004).

This finding was noted despite the method producing only an approximation to the odds ratio. For very large effects (e.g. risk ratio=0.2) when the approximation is known to be poor, treatment effects were under-estimated, but the Peto method still had the best performance of all the methods considered for event risks of 1 in 1000, and the bias was never more than 6% of the comparator group risk.

In other circumstances (i.e. event risks above 1%, very large effects at event risks around 1%, and meta-analyses where many studies were substantially imbalanced) the best performing methods were the Mantel-Haenszel odds ratio without zero-cell corrections, logistic regression and an exact method. None of these methods is available in RevMan.

Methods that should be avoided with rare events are the inverse-variance methods (including the DerSimonian and Laird random-effects method) (Efthimiou 2018). These directly incorporate the study’s variance in the estimation of its contribution to the meta-analysis, but these are usually based on a large-sample variance approximation, which was not intended for use with rare events. We would suggest that incorporation of heterogeneity into an estimate of a treatment effect should be a secondary consideration when attempting to produce estimates of effects from sparse data – the primary concern is to discern whether there is any signal of an effect in the data.

10.5 Meta-analysis of continuous outcomes

An important assumption underlying standard methods for meta-analysis of continuous data is that the outcomes have a normal distribution in each intervention arm in each study. This assumption may not always be met, although it is unimportant in very large studies. It is useful to consider the possibility of skewed data (see Section 10.5.3).

10.5.1 Which effect measure for continuous outcomes?

The two summary statistics commonly used for meta-analysis of continuous data are the mean difference (MD) and the standardized mean difference (SMD). Other options are available, such as the ratio of means (see Chapter 6, Section 6.5.1). Selection of summary statistics for continuous data is principally determined by whether studies all report the outcome using the same scale (when the mean difference can be used) or using different scales (when the standardized mean difference is usually used). The ratio of means can be used in either situation, but is appropriate only when outcome measurements are strictly greater than zero. Further considerations in deciding on an effect measure that will facilitate interpretation of the findings appear in Chapter 15, Section 15.5.

The different roles played in MD and SMD approaches by the standard deviations (SDs) of outcomes observed in the two groups should be understood.

For the mean difference approach, the SDs are used together with the sample sizes to compute the weight given to each study. Studies with small SDs are given relatively higher weight whilst studies with larger SDs are given relatively smaller weights. This is appropriate if variation in SDs between studies reflects differences in the reliability of outcome measurements, but is probably not appropriate if the differences in SD reflect real differences in the variability of outcomes in the study populations.
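As a sketch with made-up summary data, the mean difference, its standard error, and the resulting inverse-variance weight are:

```python
import math

def mean_difference(m1, sd1, n1, m2, sd2, n2):
    """Mean difference between two groups and its standard error,
    computed from per-group means, SDs and sample sizes."""
    md = m1 - m2
    se = math.sqrt(sd1**2 / n1 + sd2**2 / n2)
    return md, se

# Hypothetical study: experimental mean 12 (SD 4, n 16) vs comparator 10 (SD 3, n 9)
md, se = mean_difference(12.0, 4.0, 16, 10.0, 3.0, 9)
weight = 1.0 / se**2   # the study's inverse-variance weight
```

Smaller SDs (or larger sample sizes) shrink the standard error and therefore increase the study's weight, which is the behaviour described above.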

For the standardized mean difference approach, the SDs are used to standardize the mean differences to a single scale, as well as in the computation of study weights. Thus, studies with small SDs lead to relatively higher estimates of SMD, whilst studies with larger SDs lead to relatively smaller estimates of SMD. For this to be appropriate, it must be assumed that between-study variation in SDs reflects only differences in measurement scales and not differences in the reliability of outcome measures or variability among study populations, as discussed in Chapter 6, Section 6.5.1.2.
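A sketch of the SMD calculation with invented data; the small-sample adjustment shown is the commonly applied Hedges correction:

```python
import math

def standardized_mean_difference(m1, sd1, n1, m2, sd2, n2):
    """Standardized mean difference using the pooled within-group SD,
    with the Hedges small-sample correction."""
    sd_pooled = math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))
    d = (m1 - m2) / sd_pooled
    g = d * (1 - 3 / (4 * (n1 + n2) - 9))   # Hedges' correction factor
    return d, g

# Hypothetical study: means 12 vs 10, both SDs 4, 20 participants per group
d, g = standardized_mean_difference(12.0, 4.0, 20, 10.0, 4.0, 20)
```

Dividing by the pooled SD is what makes results measured on different scales comparable, and also why a study's SD directly scales its SMD.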

These assumptions of the methods should be borne in mind when unexpected variation of SDs is observed across studies.

10.5.2 Meta-analysis of change scores

In some circumstances an analysis based on changes from baseline will be more efficient and powerful than comparison of post-intervention values, as it removes a component of between-person variability from the analysis. However, calculation of a change score requires measurement of the outcome twice and in practice may be less efficient for outcomes that are unstable or difficult to measure precisely, where the measurement error may be larger than true between-person baseline variability. Change-from-baseline outcomes may also be preferred if they have a less skewed distribution than post-intervention measurement outcomes. Although sometimes used as a device to ‘correct’ for unlucky randomization, this practice is not recommended.

The preferred statistical approach to accounting for baseline measurements of the outcome variable is to include the baseline outcome measurements as a covariate in a regression model or analysis of covariance (ANCOVA). These analyses produce an ‘adjusted’ estimate of the intervention effect together with its standard error. These analyses are the least frequently encountered, but as they give the most precise and least biased estimates of intervention effects they should be included in the analysis when they are available. However, they can only be included in a meta-analysis using the generic inverse-variance method, since means and SDs are not available for each intervention group separately.

In practice an author is likely to discover that the studies included in a review include a mixture of change-from-baseline and post-intervention value scores. However, mixing of outcomes is not a problem when it comes to meta-analysis of MDs. There is no statistical reason why studies with change-from-baseline outcomes should not be combined in a meta-analysis with studies with post-intervention measurement outcomes when using the (unstandardized) MD method. In a randomized study, MD based on changes from baseline can usually be assumed to be addressing exactly the same underlying intervention effects as analyses based on post-intervention measurements. That is to say, the difference in mean post-intervention values will on average be the same as the difference in mean change scores. If the use of change scores does increase precision, the studies presenting change scores will appropriately be given higher weights in the analysis than they would have received if post-intervention values had been used, as they will have smaller SDs.

When combining the data on the MD scale, authors must be careful to use the appropriate means and SDs (either of post-intervention measurements or of changes from baseline) for each study. Since the mean values and SDs for the two types of outcome may differ substantially, it may be advisable to place them in separate subgroups to avoid confusion for the reader, but the results of the subgroups can legitimately be pooled together.

In contrast, post-intervention value and change scores should not in principle be combined using standard meta-analysis approaches when the effect measure is an SMD. This is because the SDs used in the standardization reflect different things. The SD when standardizing post-intervention values reflects between-person variability at a single point in time. The SD when standardizing change scores reflects variation in between-person changes over time, so will depend on both within-person and between-person variability; within-person variability in turn is likely to depend on the length of time between measurements. Nevertheless, an empirical study of 21 meta-analyses in osteoarthritis did not find a difference between combined SMDs based on post-intervention values and combined SMDs based on change scores (da Costa et al 2013). One option is to standardize SMDs using post-intervention SDs rather than change score SDs. This would lead to valid synthesis of the two approaches, but we are not aware that an appropriate standard error for this has been derived.

A common practical problem associated with including change-from-baseline measures is that the SD of changes is not reported. Imputation of SDs is discussed in Chapter 6, Section 6.5.2.8.

10.5.3 Meta-analysis of skewed data

Analyses based on means are appropriate for data that are at least approximately normally distributed, and for data from very large trials. If the true distribution of outcomes is asymmetrical, then the data are said to be skewed. Review authors should consider the possibility and implications of skewed data when analysing continuous outcomes (see MECIR Box 10.5.a). Skew can sometimes be diagnosed from the means and SDs of the outcomes. A rough check is available, but it is only valid if a lowest or highest possible value for an outcome is known to exist. Thus, the check may be used for outcomes such as weight, volume and blood concentrations, which have lowest possible values of 0, or for scale outcomes with minimum or maximum scores, but it may not be appropriate for change-from-baseline measures. The check involves calculating the observed mean minus the lowest possible value (or the highest possible value minus the observed mean), and dividing this by the SD. A ratio less than 2 suggests skew (Altman and Bland 1996). If the ratio is less than 1, there is strong evidence of a skewed distribution.
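The check can be sketched in a line or two (the mean and SD below are invented, for an outcome whose lowest possible value is 0):

```python
def skew_check_ratio(mean, sd, lowest_possible=0.0):
    """Altman-Bland rough check: (observed mean - lowest possible value) / SD.
    A ratio below 2 suggests skew; below 1, strong evidence of skew."""
    return (mean - lowest_possible) / sd

# Hypothetical summary data, e.g. length of hospital stay in days
ratio = skew_check_ratio(mean=4.0, sd=3.0)
```

Here the ratio falls below 2, so a normal distribution would imply an implausible proportion of negative values, suggesting the data are skewed.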

Transformation of the original outcome data may reduce skew substantially. Reports of trials may present results on a transformed scale, usually a log scale. Collection of appropriate data summaries from the trialists, or acquisition of individual patient data, is currently the approach of choice. Appropriate data summaries and analysis strategies for the individual patient data will depend on the situation. Consultation with a knowledgeable statistician is advised.

Where data have been analysed on a log scale, results are commonly presented as geometric means and ratios of geometric means. A meta-analysis may be then performed on the scale of the log-transformed data; an example of the calculation of the required means and SD is given in Chapter 6, Section 6.5.2.4 . This approach depends on being able to obtain transformed data for all studies; methods for transforming from one scale to the other are available (Higgins et al 2008b). Log-transformed and untransformed data should not be mixed in a meta-analysis.

MECIR Box 10.5.a Relevant expectations for conduct of intervention reviews

Addressing skewed data

Skewed data are sometimes not summarized usefully by means and standard deviations. While statistical methods are approximately valid for large sample sizes, skewed outcome data can lead to misleading results when studies are small.

10.6 Combining dichotomous and continuous outcomes

Occasionally authors encounter a situation where data for the same outcome are presented in some studies as dichotomous data and in other studies as continuous data. For example, scores on depression scales can be reported as means, or as the percentage of patients who were depressed at some point after an intervention (i.e. with a score above a specified cut-point). This type of information is often easier to understand, and more helpful, when it is dichotomized. However, deciding on a cut-point may be arbitrary, and information is lost when continuous data are transformed to dichotomous data.

There are several options for handling combinations of dichotomous and continuous data. Generally, it is useful to summarize results from all the relevant, valid studies in a similar way, but this is not always possible. It may be possible to collect missing data from investigators so that this can be done. If not, it may be useful to summarize the data in three ways: by entering the means and SDs as continuous outcomes, by entering the counts as dichotomous outcomes and by entering all of the data in text form as ‘Other data’ outcomes.

There are statistical approaches available that will re-express odds ratios as SMDs (and vice versa), allowing dichotomous and continuous data to be combined (Anzures-Cabrera et al 2011). A simple approach is as follows. Based on an assumption that the underlying continuous measurements in each intervention group follow a logistic distribution (which is a symmetrical distribution similar in shape to the normal distribution, but with more data in the distributional tails), and that the variability of the outcomes is the same in both experimental and comparator participants, the odds ratios can be re-expressed as a SMD according to the following simple formula (Chinn 2000):

SMD = (√3/π) × ln(OR) ≈ 0.5513 × ln(OR)

The standard error of the log odds ratio can be converted to the standard error of a SMD by multiplying by the same constant (√3/π=0.5513). Alternatively SMDs can be re-expressed as log odds ratios by multiplying by π/√3=1.814. Once SMDs (or log odds ratios) and their standard errors have been computed for all studies in the meta-analysis, they can be combined using the generic inverse-variance method. Standard errors can be computed for all studies by entering the data as dichotomous and continuous outcome type data, as appropriate, and converting the confidence intervals for the resulting log odds ratios and SMDs into standard errors (see Chapter 6, Section 6.3 ).
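The conversion in both directions can be sketched in a few lines. The constant and formulas follow Chinn (2000); the example odds ratio and standard error are hypothetical:

```python
import math

SQRT3_OVER_PI = math.sqrt(3) / math.pi  # ≈ 0.5513

def log_or_to_smd(log_or, se_log_or):
    """Re-express a log odds ratio and its SE as an SMD (Chinn 2000)."""
    return SQRT3_OVER_PI * log_or, SQRT3_OVER_PI * se_log_or

def smd_to_log_or(smd, se_smd):
    """Re-express an SMD and its SE as a log odds ratio (multiply by π/√3)."""
    return smd / SQRT3_OVER_PI, se_smd / SQRT3_OVER_PI

# Hypothetical study reporting OR = 2.0 with SE(log OR) = 0.30:
smd, se_smd = log_or_to_smd(math.log(2.0), 0.30)
```

Once every study is on a common scale (all SMDs or all log odds ratios) with standard errors, the estimates can be pooled with the generic inverse-variance method as described above.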

10.7 Meta-analysis of ordinal outcomes and measurement scales

Ordinal and measurement scale outcomes are most commonly meta-analysed as dichotomous data (if so, see Section 10.4 ) or continuous data (if so, see Section 10.5 ) depending on the way that the study authors performed the original analyses.

Occasionally it is possible to analyse the data using proportional odds models. This is the case when ordinal scales have a small number of categories, the numbers falling into each category for each intervention group can be obtained, and the same ordinal scale has been used in all studies. This approach may make more efficient use of all available data than dichotomization, but requires access to statistical software and results in a summary statistic for which it is challenging to find a clinical meaning.

The proportional odds model uses the proportional odds ratio as the measure of intervention effect (Agresti 1996) (see Chapter 6, Section 6.6 ), and can be used for conducting a meta-analysis in advanced statistical software packages (Whitehead and Jones 1994). Estimates of log odds ratios and their standard errors from a proportional odds model may be meta-analysed using the generic inverse-variance method (see Section 10.3.3 ). If the same ordinal scale has been used in all studies, but in some reports has been presented as a dichotomous outcome, it may still be possible to include all studies in the meta-analysis. In the context of the three-category model, this might mean that for some studies category 1 constitutes a success, while for others both categories 1 and 2 constitute a success. Methods are available for dealing with this, and for combining data from scales that are related but have different definitions for their categories (Whitehead and Jones 1994).

10.8 Meta-analysis of counts and rates

Results may be expressed as count data when each participant may experience an event, and may experience it more than once (see Chapter 6, Section 6.7 ). For example, ‘number of strokes’, or ‘number of hospital visits’ are counts. These events may not happen at all, but if they do happen there is no theoretical maximum number of occurrences for an individual. Count data may be analysed using methods for dichotomous data if the counts are dichotomized for each individual (see Section 10.4 ), using methods for continuous data (see Section 10.5 ) or time-to-event data (see Section 10.9 ), or as rate data.

Rate data occur if counts are measured for each participant along with the time over which they are observed. This is particularly appropriate when the events being counted are rare. For example, a woman may experience two strokes during a follow-up period of two years. Her rate of strokes is one per year of follow-up (or, equivalently 0.083 per month of follow-up). Rates are conventionally summarized at the group level. For example, participants in the comparator group of a clinical trial may experience 85 strokes during a total of 2836 person-years of follow-up. An underlying assumption associated with the use of rates is that the risk of an event is constant across participants and over time. This assumption should be carefully considered for each situation. For example, in contraception studies, rates have been used (known as Pearl indices) to describe the number of pregnancies per 100 women-years of follow-up. This is now considered inappropriate since couples have different risks of conception, and the risk for each woman changes over time. Pregnancies are now analysed more often using life tables or time-to-event methods that investigate the time elapsing before the first pregnancy.

Analysing count data as rates is not always the most appropriate approach and is uncommon in practice. This is because:

  • the assumption of a constant underlying risk may not be suitable; and
  • the statistical methods are not as well developed as they are for other types of data.

The results of a study may be expressed as a rate ratio , that is the ratio of the rate in the experimental intervention group to the rate in the comparator group. The (natural) logarithms of the rate ratios may be combined across studies using the generic inverse-variance method (see Section 10.3.3 ). Alternatively, Poisson regression approaches can be used (Spittal et al 2015).

In a randomized trial, rate ratios may often be very similar to risk ratios obtained after dichotomizing the participants, since the average period of follow-up should be similar in all intervention groups. Rate ratios and risk ratios will differ, however, if an intervention affects the likelihood of some participants experiencing multiple events.

It is possible also to focus attention on the rate difference (see Chapter 6, Section 6.7.1 ). The analysis again can be performed using the generic inverse-variance method (Hasselblad and McCrory 1995, Guevara et al 2004).

10.9 Meta-analysis of time-to-event outcomes

Two approaches to meta-analysis of time-to-event outcomes are readily available to Cochrane Review authors. The choice of which to use will depend on the type of data that have been extracted from the primary studies, or obtained from re-analysis of individual participant data.

If ‘O – E’ and ‘V’ statistics have been obtained (see Chapter 6, Section 6.8.2 ), either through re-analysis of individual participant data or from aggregate statistics presented in the study reports, then these statistics may be entered directly into RevMan using the ‘O – E and Variance’ outcome type. There are several ways to calculate these ‘O – E’ and ‘V’ statistics. Peto’s method applied to dichotomous data (Section 10.4.2 ) gives rise to an odds ratio; a log-rank approach gives rise to a hazard ratio; and a variation of the Peto method for analysing time-to-event data gives rise to something in between (Simmonds et al 2011). The appropriate effect measure should be specified. Only fixed-effect meta-analysis methods are available in RevMan for ‘O – E and Variance’ outcomes.

Alternatively, if estimates of log hazard ratios and standard errors have been obtained from results of Cox proportional hazards regression models, study results can be combined using generic inverse-variance methods (see Section 10.3.3 ).

If a mixture of log-rank and Cox model estimates are obtained from the studies, all results can be combined using the generic inverse-variance method, as the log-rank estimates can be converted into log hazard ratios and standard errors using the approaches discussed in Chapter 6, Section 6.8 .

10.10 Heterogeneity

10.10.1 What is heterogeneity?

Inevitably, studies brought together in a systematic review will differ. Any kind of variability among studies in a systematic review may be termed heterogeneity. It can be helpful to distinguish between different types of heterogeneity. Variability in the participants, interventions and outcomes studied may be described as clinical diversity (sometimes called clinical heterogeneity), and variability in study design, outcome measurement tools and risk of bias may be described as methodological diversity (sometimes called methodological heterogeneity). Variability in the intervention effects being evaluated in the different studies is known as statistical heterogeneity , and is a consequence of clinical or methodological diversity, or both, among the studies. Statistical heterogeneity manifests itself in the observed intervention effects being more different from each other than one would expect due to random error (chance) alone. We will follow convention and refer to statistical heterogeneity simply as heterogeneity .

Clinical variation will lead to heterogeneity if the intervention effect is affected by the factors that vary across studies; most obviously, the specific interventions or patient characteristics. In other words, the true intervention effect will be different in different studies.

Differences between studies in terms of methodological factors, such as use of blinding and concealment of allocation sequence, or if there are differences between studies in the way the outcomes are defined and measured, may be expected to lead to differences in the observed intervention effects. Significant statistical heterogeneity arising from methodological diversity or differences in outcome assessments suggests that the studies are not all estimating the same quantity, but does not necessarily suggest that the true intervention effect varies. In particular, heterogeneity associated solely with methodological diversity would indicate that the studies suffer from different degrees of bias. Empirical evidence suggests that some aspects of design can affect the result of clinical trials, although this is not always the case. Further discussion appears in Chapter 7 and Chapter 8 .

The scope of a review will largely determine the extent to which studies included in a review are diverse. Sometimes a review will include studies addressing a variety of questions, for example when several different interventions for the same condition are of interest (see also Chapter 11 ) or when the differential effects of an intervention in different populations are of interest. Meta-analysis should only be considered when a group of studies is sufficiently homogeneous in terms of participants, interventions and outcomes to provide a meaningful summary (see MECIR Box 10.10.a. ). It is often appropriate to take a broader perspective in a meta-analysis than in a single clinical trial. A common analogy is that systematic reviews bring together apples and oranges, and that combining these can yield a meaningless result. This is true if apples and oranges are of intrinsic interest on their own, but may not be if they are used to contribute to a wider question about fruit. For example, a meta-analysis may reasonably evaluate the average effect of a class of drugs by combining results from trials where each evaluates the effect of a different drug from the class.

MECIR Box 10.10.a Relevant expectations for conduct of intervention reviews


Meta-analyses of very diverse studies can be misleading, for example where studies use different forms of control. Clinical diversity does not indicate necessarily that a meta-analysis should not be performed. However, authors must be clear about the underlying question that all studies are addressing.

There may be specific interest in a review in investigating how clinical and methodological aspects of studies relate to their results. Where possible these investigations should be specified a priori (i.e. in the protocol for the systematic review). It is legitimate for a systematic review to focus on examining the relationship between some clinical characteristic(s) of the studies and the size of intervention effect, rather than on obtaining a summary effect estimate across a series of studies (see Section 10.11 ). Meta-regression may best be used for this purpose, although it is not implemented in RevMan (see Section 10.11.4 ).

10.10.2 Identifying and measuring heterogeneity

It is essential to consider the extent to which the results of studies are consistent with each other (see MECIR Box 10.10.b ). If confidence intervals for the results of individual studies (generally depicted graphically using horizontal lines) have poor overlap, this generally indicates the presence of statistical heterogeneity. More formally, a statistical test for heterogeneity is available. This Chi² (χ², or chi-squared) test is included in the forest plots in Cochrane Reviews. It assesses whether observed differences in results are compatible with chance alone. A low P value (or a large Chi² statistic relative to its degrees of freedom) provides evidence of heterogeneity of intervention effects (variation in effect estimates beyond chance).

MECIR Box 10.10.b Relevant expectations for conduct of intervention reviews

Assessing statistical heterogeneity

The presence of heterogeneity affects the extent to which generalizable conclusions can be formed. It is important to identify heterogeneity in case there is sufficient information to explain it and offer new insights. Authors should recognize that there is much uncertainty in measures such as I² and Tau² when there are few studies. Thus, use of simple thresholds to diagnose heterogeneity should be avoided.

Care must be taken in the interpretation of the Chi² test, since it has low power in the (common) situation of a meta-analysis when studies have small sample size or are few in number. This means that while a statistically significant result may indicate a problem with heterogeneity, a non-significant result must not be taken as evidence of no heterogeneity. This is also why a P value of 0.10, rather than the conventional level of 0.05, is sometimes used to determine statistical significance. A further problem with the test, which seldom occurs in Cochrane Reviews, is that when there are many studies in a meta-analysis, the test has high power to detect a small amount of heterogeneity that may be clinically unimportant.

Some argue that, since clinical and methodological diversity always occur in a meta-analysis, statistical heterogeneity is inevitable (Higgins et al 2003). Thus, the test for heterogeneity is irrelevant to the choice of analysis; heterogeneity will always exist whether or not we happen to be able to detect it using a statistical test. Methods have been developed for quantifying inconsistency across studies that move the focus away from testing whether heterogeneity is present to assessing its impact on the meta-analysis. A useful statistic for quantifying inconsistency is:

I² = ((Q − df) / Q) × 100%

In this equation, Q is the Chi² statistic and df is its degrees of freedom (Higgins and Thompson 2002, Higgins et al 2003). I² describes the percentage of the variability in effect estimates that is due to heterogeneity rather than sampling error (chance).
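A minimal implementation of this statistic, including the truncation at zero that is conventionally applied when Q is smaller than its degrees of freedom:

```python
def i_squared(q, df):
    """I-squared: the percentage of variability in effect estimates
    due to heterogeneity rather than chance, truncated at zero."""
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0

# Hypothetical forest-plot output: Q = 20.0 on 9 degrees of freedom.
print(i_squared(20.0, 9))   # about 55%
print(i_squared(5.0, 9))    # Q < df, so I-squared is truncated to 0%
```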

Thresholds for the interpretation of the I² statistic can be misleading, since the importance of inconsistency depends on several factors. A rough guide to interpretation in the context of meta-analyses of randomized trials is as follows:

  • 0% to 40%: might not be important;
  • 30% to 60%: may represent moderate heterogeneity*;
  • 50% to 90%: may represent substantial heterogeneity*;
  • 75% to 100%: considerable heterogeneity*.

*The importance of the observed value of I² depends on (1) magnitude and direction of effects, and (2) strength of evidence for heterogeneity (e.g. P value from the Chi² test, or a confidence interval for I²: uncertainty in the value of I² is substantial when the number of studies is small).

10.10.3 Strategies for addressing heterogeneity

Review authors must take into account any statistical heterogeneity when interpreting results, particularly when there is variation in the direction of effect (see MECIR Box 10.10.c ). A number of options are available if heterogeneity is identified among a group of studies that would otherwise be considered suitable for a meta-analysis.

MECIR Box 10.10.c  Relevant expectations for conduct of intervention reviews

Considering statistical heterogeneity when interpreting the results

The presence of heterogeneity affects the extent to which generalizable conclusions can be formed. If a fixed-effect analysis is used, the confidence intervals ignore the extent of heterogeneity. If a random-effects analysis is used, the result pertains to the mean effect across studies. In both cases, the implications of notable heterogeneity should be addressed. It may be possible to understand the reasons for the heterogeneity if there are sufficient studies.

  • Check again that the data are correct. Severe apparent heterogeneity can indicate that data have been incorrectly extracted or entered into meta-analysis software. For example, if standard errors have mistakenly been entered as SDs for continuous outcomes, this could manifest itself in overly narrow confidence intervals with poor overlap and hence substantial heterogeneity. Unit-of-analysis errors may also be causes of heterogeneity (see Chapter 6, Section 6.2 ).  
  • Do not do a meta -analysis. A systematic review need not contain any meta-analyses. If there is considerable variation in results, and particularly if there is inconsistency in the direction of effect, it may be misleading to quote an average value for the intervention effect.  
  • Explore heterogeneity. It is clearly of interest to determine the causes of heterogeneity among results of studies. This process is problematic since there are often many characteristics that vary across studies from which one may choose. Heterogeneity may be explored by conducting subgroup analyses (see Section 10.11.3 ) or meta-regression (see Section 10.11.4 ). Reliable conclusions can only be drawn from analyses that are truly pre-specified before inspecting the studies’ results, and even these conclusions should be interpreted with caution. Explorations of heterogeneity that are devised after heterogeneity is identified can at best lead to the generation of hypotheses. They should be interpreted with even more caution and should generally not be listed among the conclusions of a review. Also, investigations of heterogeneity when there are very few studies are of questionable value.  
  • Ignore heterogeneity. Fixed-effect meta-analyses ignore heterogeneity. The summary effect estimate from a fixed-effect meta-analysis is normally interpreted as being the best estimate of the intervention effect. However, the existence of heterogeneity suggests that there may not be a single intervention effect but a variety of intervention effects. Thus, the summary fixed-effect estimate may be an intervention effect that does not actually exist in any population, and therefore have a confidence interval that is meaningless as well as being too narrow (see Section 10.10.4 ).  
  • Perform a random-effects meta-analysis. A random-effects meta-analysis may be used to incorporate heterogeneity among studies. This is not a substitute for a thorough investigation of heterogeneity. It is intended primarily for heterogeneity that cannot be explained. An extended discussion of this option appears in Section 10.10.4 .  
  • Reconsider the effect measure. Heterogeneity may be an artificial consequence of an inappropriate choice of effect measure. For example, when studies collect continuous outcome data using different scales or different units, extreme heterogeneity may be apparent when using the mean difference but not when the more appropriate standardized mean difference is used. Furthermore, choice of effect measure for dichotomous outcomes (odds ratio, risk ratio, or risk difference) may affect the degree of heterogeneity among results. In particular, when comparator group risks vary, homogeneous odds ratios or risk ratios will necessarily lead to heterogeneous risk differences, and vice versa. However, it remains unclear whether homogeneity of intervention effect in a particular meta-analysis is a suitable criterion for choosing between these measures (see also Section 10.4.3 ).  
  • Exclude studies. Heterogeneity may be due to the presence of one or two outlying studies with results that conflict with the rest of the studies. In general it is unwise to exclude studies from a meta-analysis on the basis of their results as this may introduce bias. However, if an obvious reason for the outlying result is apparent, the study might be removed with more confidence. Since usually at least one characteristic can be found for any study in any meta-analysis which makes it different from the others, this criterion is unreliable because it is all too easy to fulfil. It is advisable to perform analyses both with and without outlying studies as part of a sensitivity analysis (see Section 10.14 ). Whenever possible, potential sources of clinical diversity that might lead to such situations should be specified in the protocol.

10.10.4 Incorporating heterogeneity into random-effects models

The random-effects meta-analysis approach incorporates an assumption that the different studies are estimating different, yet related, intervention effects (DerSimonian and Laird 1986, Borenstein et al 2010). The approach allows us to address heterogeneity that cannot readily be explained by other factors. A random-effects meta-analysis model involves an assumption that the effects being estimated in the different studies follow some distribution. The model represents our lack of knowledge about why real, or apparent, intervention effects differ, by considering the differences as if they were random. The centre of the assumed distribution describes the average of the effects, while its width describes the degree of heterogeneity. The conventional choice of distribution is a normal distribution. It is difficult to establish the validity of any particular distributional assumption, and this is a common criticism of random-effects meta-analyses. The importance of the assumed shape for this distribution has not been widely studied.

To undertake a random-effects meta-analysis, the standard errors of the study-specific estimates (SE_i in Section 10.3.1 ) are adjusted to incorporate a measure of the extent of variation, or heterogeneity, among the intervention effects observed in different studies (this variation is often referred to as Tau-squared, τ², or Tau²). The amount of variation, and hence the adjustment, can be estimated from the intervention effects and standard errors of the studies included in the meta-analysis.
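As a sketch, the widely used DerSimonian and Laird (1986) method-of-moments estimate of Tau² and the resulting adjusted study weights can be computed as follows (the study estimates and standard errors are hypothetical):

```python
import math

def dersimonian_laird_tau2(estimates, ses):
    """Method-of-moments estimate of the between-study variance Tau^2
    (DerSimonian and Laird 1986), truncated at zero."""
    w = [1.0 / se**2 for se in ses]
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))
    df = len(estimates) - 1
    c = sum(w) - sum(wi**2 for wi in w) / sum(w)
    return max(0.0, (q - df) / c)

def random_effects_weights(ses, tau2):
    """Random-effects weights: each SE^2 is inflated by Tau^2,
    which pulls the weights of small and large studies closer together."""
    return [1.0 / (se**2 + tau2) for se in ses]

# Hypothetical studies (e.g. mean differences) with their standard errors:
estimates = [0.0, 1.0]
ses = [0.1, 0.1]
tau2 = dersimonian_laird_tau2(estimates, ses)
weights = random_effects_weights(ses, tau2)
```

This illustrates the point made above: because Tau² is added to every study's squared standard error, the relative weight of smaller studies increases compared with a fixed-effect analysis.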

In a heterogeneous set of studies, a random-effects meta-analysis will award relatively more weight to smaller studies than such studies would receive in a fixed-effect meta-analysis. This is because small studies are more informative for learning about the distribution of effects across studies than for learning about an assumed common intervention effect.

Note that a random-effects model does not ‘take account’ of the heterogeneity, in the sense that it is no longer an issue. It is always preferable to explore possible causes of heterogeneity, although there may be too few studies to do this adequately (see Section 10.11 ).

10.10.4.1 Fixed or random effects?

A fixed-effect meta-analysis provides a result that may be viewed as a ‘typical intervention effect’ from the studies included in the analysis. In order to calculate a confidence interval for a fixed-effect meta-analysis the assumption is usually made that the true effect of intervention (in both magnitude and direction) is the same value in every study (i.e. fixed across studies). This assumption implies that the observed differences among study results are due solely to the play of chance (i.e. that there is no statistical heterogeneity).

A random-effects model provides a result that may be viewed as an ‘average intervention effect’, where this average is explicitly defined according to an assumed distribution of effects across studies. Instead of assuming that the intervention effects are the same, we assume that they follow (usually) a normal distribution. The assumption implies that the observed differences among study results are due to a combination of the play of chance and some genuine variation in the intervention effects.

The random-effects method and the fixed-effect method will give identical results when there is no heterogeneity among the studies.

When heterogeneity is present, a confidence interval around the random-effects summary estimate is wider than a confidence interval around a fixed-effect summary estimate. This will happen whenever the I² statistic is greater than zero, even if the heterogeneity is not detected by the Chi² test for heterogeneity (see Section 10.10.2 ).

Sometimes the central estimate of the intervention effect is different between fixed-effect and random-effects analyses. In particular, if results of smaller studies are systematically different from results of larger ones, which can happen as a result of publication bias or within-study bias in smaller studies (Egger et al 1997, Poole and Greenland 1999, Kjaergard et al 2001), then a random-effects meta-analysis will exacerbate the effects of the bias (see also Chapter 13, Section 13.3.5.6 ). A fixed-effect analysis will be affected less, although strictly it will also be inappropriate.

The decision between fixed- and random-effects meta-analyses has been the subject of much debate, and we do not provide a universal recommendation. Some considerations in making this choice are as follows:

  • Many have argued that the decision should be based on an expectation of whether the intervention effects are truly identical, preferring the fixed-effect model if this is likely and a random-effects model if this is unlikely (Borenstein et al 2010). Since it is generally considered to be implausible that intervention effects across studies are identical (unless the intervention has no effect at all), this leads many to advocate use of the random-effects model.
  • Others have argued that a fixed-effect analysis can be interpreted in the presence of heterogeneity, and that it makes fewer assumptions than a random-effects meta-analysis. They then refer to it as a ‘fixed-effects’ meta-analysis (Peto et al 1995, Rice et al 2018).
  • Under any interpretation, a fixed-effect meta-analysis ignores heterogeneity. If the method is used, it is therefore important to supplement it with a statistical investigation of the extent of heterogeneity (see Section 10.10.2 ).
  • In the presence of heterogeneity, a random-effects analysis gives relatively more weight to smaller studies and relatively less weight to larger studies. If there is additionally some funnel plot asymmetry (i.e. a relationship between intervention effect magnitude and study size), then this will push the results of the random-effects analysis towards the findings in the smaller studies. In the context of randomized trials, this is generally regarded as an unfortunate consequence of the model.
  • A pragmatic approach is to plan to undertake both a fixed-effect and a random-effects meta-analysis, with an intention to present the random-effects result if there is no indication of funnel plot asymmetry. If there is an indication of funnel plot asymmetry, then both methods are problematic. It may be reasonable to present both analyses or neither, or to perform a sensitivity analysis in which small studies are excluded or addressed directly using meta-regression (see Chapter 13, Section 13.3.5.6 ).
  • The choice between a fixed-effect and a random-effects meta-analysis should never be made on the basis of a statistical test for heterogeneity.

10.10.4.2 Interpretation of random-effects meta-analyses

The summary estimate and confidence interval from a random-effects meta-analysis refer to the centre of the distribution of intervention effects, but do not describe the width of the distribution. Often the summary estimate and its confidence interval are quoted in isolation and portrayed as a sufficient summary of the meta-analysis. This is inappropriate. The confidence interval from a random-effects meta-analysis describes uncertainty in the location of the mean of systematically different effects in the different studies. It does not describe the degree of heterogeneity among studies, as may be commonly believed. For example, when there are many studies in a meta-analysis, we may obtain a very tight confidence interval around the random-effects estimate of the mean effect even when there is a large amount of heterogeneity. A solution to this problem is to consider a prediction interval (see Section 10.10.4.3 ).

Methodological diversity creates heterogeneity through biases variably affecting the results of different studies. The random-effects summary estimate will only correctly estimate the average intervention effect if the biases are symmetrically distributed, leading to a mixture of over-estimates and under-estimates of effect, which is unlikely to be the case. In practice it can be very difficult to distinguish whether heterogeneity results from clinical or methodological diversity, and in most cases it is likely to be due to both, so these distinctions are hard to draw in the interpretation.

When there is little information, either because there are few studies or if the studies are small with few events, a random-effects analysis will provide poor estimates of the amount of heterogeneity (i.e. of the width of the distribution of intervention effects). Fixed-effect methods such as the Mantel-Haenszel method will provide more robust estimates of the average intervention effect, but at the cost of ignoring any heterogeneity.
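To make the Mantel-Haenszel approach mentioned above concrete, the pooled odds ratio can be sketched in a few lines of Python. This is an illustrative sketch only, not the RevMan implementation; the function name and the example counts are hypothetical:

```python
def mantel_haenszel_odds_ratio(tables):
    """Pooled odds ratio across 2x2 tables using Mantel-Haenszel weighting.

    Each table is (a, b, c, d): a, b = events / non-events in the
    intervention group; c, d = events / non-events in the comparator group.
    """
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return numerator / denominator

# Two hypothetical trials
tables = [(10, 90, 5, 95), (20, 180, 12, 188)]
pooled_or = mantel_haenszel_odds_ratio(tables)
```

The pooled estimate lies between the two study-specific odds ratios (about 2.11 and 1.74 here), weighted towards the larger study.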

10.10.4.3 Prediction intervals from a random-effects meta-analysis

An estimate of the between-study variance in a random-effects meta-analysis is typically presented as part of its results. The square root of this number (i.e. Tau) is the estimated standard deviation of underlying effects across studies. Prediction intervals are a way of expressing this value in an interpretable way.

To motivate the idea of a prediction interval, note that for absolute measures of effect (e.g. risk difference, mean difference, standardized mean difference), an approximate 95% range of normally distributed underlying effects can be obtained by creating an interval from 1.96×Tau below the random-effects mean, to 1.96×Tau above it. (For relative measures such as the odds ratio and risk ratio, an equivalent interval needs to be based on the natural logarithm of the summary estimate.) In reality, both the summary estimate and the value of Tau are associated with uncertainty. A prediction interval seeks to present the range of effects in a way that acknowledges this uncertainty (Higgins et al 2009). A simple 95% prediction interval can be calculated as:

M ± t(k−2) × √(Tau² + SE(M)²)

where M is the summary mean from the random-effects meta-analysis, t(k−2) is the 95% percentile of a t-distribution with k−2 degrees of freedom, k is the number of studies, Tau² is the estimated amount of heterogeneity and SE(M) is the standard error of the summary mean.

The term ‘prediction interval’ relates to the use of this interval to predict the possible underlying effect in a new study that is similar to the studies in the meta-analysis. A more useful interpretation of the interval is as a summary of the spread of underlying effects in the studies included in the random-effects meta-analysis.

Prediction intervals have proved a popular way of expressing the amount of heterogeneity in a meta-analysis (Riley et al 2011). They are, however, strongly based on the assumption of a normal distribution for the effects across studies, and can be very problematic when the number of studies is small, in which case they can appear spuriously wide or spuriously narrow. Nevertheless, we encourage their use when the number of studies is reasonable (e.g. more than ten) and there is no clear funnel plot asymmetry.
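The prediction interval calculation described above can be sketched directly from its ingredients. The sketch below assumes the critical value of the t-distribution with k minus 2 degrees of freedom is supplied by the user (here taken from standard tables); the function name and example numbers are hypothetical:

```python
import math

def prediction_interval_95(M, se_M, tau2, t_crit):
    """Approximate 95% prediction interval: M +/- t_crit * sqrt(tau2 + SE(M)^2).

    M      : random-effects summary mean (log scale for ratio measures)
    se_M   : standard error of the summary mean
    tau2   : estimated between-study variance (Tau squared)
    t_crit : critical value of a t-distribution with k - 2 degrees of
             freedom, where k is the number of studies
    """
    half_width = t_crit * math.sqrt(tau2 + se_M ** 2)
    return M - half_width, M + half_width

# Example: 10 studies, so 8 degrees of freedom; critical value ~2.306 from tables
low, high = prediction_interval_95(M=0.5, se_M=0.1, tau2=0.04, t_crit=2.306)
```

Note that the interval is much wider than the confidence interval for the mean alone, because it incorporates Tau squared as well as SE(M).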

10.10.4.4 Implementing random-effects meta-analyses

As introduced in Section 10.3.2 , the random-effects model can be implemented using an inverse-variance approach, incorporating a measure of the extent of heterogeneity into the study weights. RevMan implements a version of random-effects meta-analysis that is described by DerSimonian and Laird, making use of a ‘moment-based’ estimate of the between-study variance (DerSimonian and Laird 1986). The attraction of this method is that the calculations are straightforward, but it has a theoretical disadvantage in that the confidence intervals are slightly too narrow to encompass full uncertainty resulting from having estimated the degree of heterogeneity.
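The moment-based estimate of the between-study variance can be sketched as follows. This is an illustration of the DerSimonian and Laird formula, not the RevMan code; the function name and example data are hypothetical:

```python
def dersimonian_laird_tau2(effects, variances):
    """Moment-based (DerSimonian-Laird) estimate of between-study variance.

    effects   : per-study effect estimates (log scale for ratio measures)
    variances : per-study variances of those estimates
    """
    w = [1.0 / v for v in variances]   # fixed-effect (inverse-variance) weights
    sw = sum(w)
    fixed_mean = sum(wi * y for wi, y in zip(w, effects)) / sw
    # Cochran's Q: weighted squared deviations from the fixed-effect mean
    q = sum(wi * (y - fixed_mean) ** 2 for wi, y in zip(w, effects))
    k = len(effects)
    c = sw - sum(wi ** 2 for wi in w) / sw
    return max(0.0, (q - (k - 1)) / c)

# Hypothetical effect estimates and variances from three studies
tau2 = dersimonian_laird_tau2([0.0, 0.5, 1.0], [0.01, 0.01, 0.01])
```

The random-effects study weights are then 1 / (variance + tau2), which down-weights large studies relative to a fixed-effect analysis when tau2 is large.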

For many years, RevMan has implemented two random-effects methods for dichotomous data: a Mantel-Haenszel method and an inverse-variance method. Both use the moment-based approach to estimating the amount of between-studies variation. The difference between the two is subtle: the former estimates the between-study variation by comparing each study’s result with a Mantel-Haenszel fixed-effect meta-analysis result, whereas the latter estimates it by comparing each study’s result with an inverse-variance fixed-effect meta-analysis result. In practice, the difference is likely to be trivial.

There are alternative methods for performing random-effects meta-analyses that have better technical properties than the DerSimonian and Laird approach with a moment-based estimate (Veroniki et al 2016). Most notable among these is an adjustment to the confidence interval proposed by Hartung and Knapp and by Sidik and Jonkman (Hartung and Knapp 2001, Sidik and Jonkman 2002). This adjustment widens the confidence interval to reflect uncertainty in the estimation of between-study heterogeneity, and it should be used if available to review authors. An alternative option to encompass full uncertainty in the degree of heterogeneity is to take a Bayesian approach (see Section 10.13 ).

An empirical comparison of different ways to estimate between-study variation in Cochrane meta-analyses has shown that they can lead to substantial differences in estimates of heterogeneity, but seldom have major implications for estimating summary effects (Langan et al 2015). Several simulation studies have concluded that an approach proposed by Paule and Mandel should be recommended (Langan et al 2017); whereas a comprehensive recent simulation study recommended a restricted maximum likelihood approach, although noted that no single approach is universally preferable (Langan et al 2019). Review authors are encouraged to select one of these options if it is available to them.
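To illustrate one of the recommended alternatives, the Paule and Mandel estimate can be obtained by solving the generalized Q equation numerically. The sketch below uses simple bisection and is for illustration only; the function name, tolerance and example data are hypothetical:

```python
def paule_mandel_tau2(effects, variances, tol=1e-8):
    """Paule-Mandel estimate of between-study variance, found by solving
    the generalized Q equation Q_gen(tau2) = k - 1 by bisection."""
    k = len(effects)

    def gen_q(tau2):
        w = [1.0 / (v + tau2) for v in variances]
        mu = sum(wi * y for wi, y in zip(w, effects)) / sum(w)
        return sum(wi * (y - mu) ** 2 for wi, y in zip(w, effects))

    if gen_q(0.0) <= k - 1:      # no excess heterogeneity detected
        return 0.0
    lo, hi = 0.0, 1.0
    while gen_q(hi) > k - 1:     # expand until the root is bracketed
        hi *= 2.0
    while hi - lo > tol:         # gen_q decreases as tau2 increases
        mid = (lo + hi) / 2.0
        if gen_q(mid) > k - 1:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

# Hypothetical effect estimates and variances from two trials
tau2_pm = paule_mandel_tau2([0.0, 1.0], [0.01, 0.01])
```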

10.11 Investigating heterogeneity

10.11.1 Interaction and effect modification

Does the intervention effect vary with different populations or intervention characteristics (such as dose or duration)? Such variation is known as interaction by statisticians and as effect modification by epidemiologists. Methods to search for such interactions include subgroup analyses and meta-regression. All methods have considerable pitfalls.

10.11.2 What are subgroup analyses?

Subgroup analyses involve splitting all the participant data into subgroups, often in order to make comparisons between them. Subgroup analyses may be done for subsets of participants (such as males and females), or for subsets of studies (such as different geographical locations). Subgroup analyses may be done as a means of investigating heterogeneous results, or to answer specific questions about particular patient groups, types of intervention or types of study.

Subgroup analyses of subsets of participants within studies are uncommon in systematic reviews based on published literature because sufficient details to extract data about separate participant types are seldom published in reports. By contrast, such subsets of participants are easily analysed when individual participant data have been collected (see Chapter 26 ). The methods we describe in the remainder of this chapter are for subgroups of studies.

Findings from multiple subgroup analyses may be misleading. Subgroup analyses are observational by nature and are not based on randomized comparisons. False negative and false positive significance tests increase in likelihood rapidly as more subgroup analyses are performed. If their findings are presented as definitive conclusions there is clearly a risk of people being denied an effective intervention or treated with an ineffective (or even harmful) intervention. Subgroup analyses can also generate misleading recommendations about directions for future research that, if followed, would waste scarce resources.

It is useful to distinguish between the notions of ‘qualitative interaction’ and ‘quantitative interaction’ (Yusuf et al 1991). Qualitative interaction exists if the direction of effect is reversed, that is if an intervention is beneficial in one subgroup but is harmful in another. Qualitative interaction is rare. This may be used as an argument that the most appropriate result of a meta-analysis is the overall effect across all subgroups. Quantitative interaction exists when the size of the effect varies but not the direction, that is if an intervention is beneficial to different degrees in different subgroups.

10.11.3 Undertaking subgroup analyses

Meta-analyses can be undertaken in RevMan both within subgroups of studies as well as across all studies irrespective of their subgroup membership. It is tempting to compare effect estimates in different subgroups by considering the meta-analysis results from each subgroup separately. This should only be done informally by comparing the magnitudes of effect. Noting that either the effect or the test for heterogeneity in one subgroup is statistically significant whilst that in the other subgroup is not statistically significant does not indicate that the subgroup factor explains heterogeneity. Since different subgroups are likely to contain different amounts of information and thus have different abilities to detect effects, it is extremely misleading simply to compare the statistical significance of the results.

10.11.3.1 Is the effect different in different subgroups?

Valid investigations of whether an intervention works differently in different subgroups involve comparing the subgroups with each other. It is a mistake to compare within-subgroup inferences such as P values. If one subgroup analysis is statistically significant and another is not, then the latter may simply reflect a lack of information rather than a smaller (or absent) effect. When there are only two subgroups, non-overlap of the confidence intervals indicates statistical significance, but note that the confidence intervals can overlap to a small degree and the difference still be statistically significant.

A formal statistical approach should be used to examine differences among subgroups (see MECIR Box 10.11.a ). A simple significance test to investigate differences between two or more subgroups can be performed (Borenstein and Higgins 2013). This procedure consists of undertaking a standard test for heterogeneity across subgroup results rather than across individual study results. When the meta-analysis uses a fixed-effect inverse-variance weighted average approach, the method is exactly equivalent to the test described by Deeks and colleagues (Deeks et al 2001). An I² statistic is also computed for subgroup differences. This describes the percentage of the variability in effect estimates from the different subgroups that is due to genuine subgroup differences rather than sampling error (chance). Note that these methods for examining subgroup differences should be used only when the data in the subgroups are independent (i.e. they should not be used if the same study participants contribute to more than one of the subgroups in the forest plot).
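The test for subgroup differences described above treats each subgroup summary result as if it were a single study and applies a standard heterogeneity test across them. A minimal sketch, with hypothetical function name and example values:

```python
def subgroup_difference_q(estimates, standard_errors):
    """Heterogeneity test across subgroup summary results.

    Treats each subgroup's summary estimate as a single 'study' and
    computes Cochran's Q (to be compared with a chi-squared distribution
    on df degrees of freedom) plus an I-squared statistic for subgroup
    differences.
    """
    w = [1.0 / se ** 2 for se in standard_errors]
    pooled = sum(wi * e for wi, e in zip(w, estimates)) / sum(w)
    q = sum(wi * (e - pooled) ** 2 for wi, e in zip(w, estimates))
    df = len(estimates) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return q, df, i2

# Two hypothetical subgroup summaries: (estimate, standard error)
q, df, i2 = subgroup_difference_q([0.2, 0.6], [0.1, 0.1])
```

In this hypothetical example Q is compared with a chi-squared distribution on 1 degree of freedom; the large I² suggests that most of the between-subgroup variability is not attributable to chance.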

If fixed-effect models are used for the analysis within each subgroup, then these statistics relate to differences in typical effects across different subgroups. If random-effects models are used for the analysis within each subgroup, then the statistics relate to variation in the mean effects in the different subgroups.

An alternative method for testing for differences between subgroups is to use meta-regression techniques, in which case a random-effects model is generally preferred (see Section 10.11.4 ). Tests for subgroup differences based on random-effects models may be regarded as preferable to those based on fixed-effect models, due to the high risk of false-positive results when a fixed-effect model is used to compare subgroups (Higgins and Thompson 2004).

MECIR Box 10.11.a Relevant expectations for conduct of intervention reviews

Comparing subgroups

Concluding that there is a difference in effect in different subgroups on the basis of differences in the level of statistical significance within subgroups can be very misleading.

10.11.4 Meta-regression

If studies are divided into subgroups (see Section 10.11.2 ), this may be viewed as an investigation of how a categorical study characteristic is associated with the intervention effects in the meta-analysis. For example, studies in which allocation sequence concealment was adequate may yield different results from those in which it was inadequate. Here, allocation sequence concealment, being either adequate or inadequate, is a categorical characteristic at the study level. Meta-regression is an extension to subgroup analyses that allows the effect of continuous, as well as categorical, characteristics to be investigated, and in principle allows the effects of multiple factors to be investigated simultaneously (although this is rarely possible due to inadequate numbers of studies) (Thompson and Higgins 2002). Meta-regression should generally not be considered when there are fewer than ten studies in a meta-analysis.

Meta-regressions are similar in essence to simple regressions, in which an outcome variable is predicted according to the values of one or more explanatory variables . In meta-regression, the outcome variable is the effect estimate (for example, a mean difference, a risk difference, a log odds ratio or a log risk ratio). The explanatory variables are characteristics of studies that might influence the size of intervention effect. These are often called ‘potential effect modifiers’ or covariates. Meta-regressions usually differ from simple regressions in two ways. First, larger studies have more influence on the relationship than smaller studies, since studies are weighted by the precision of their respective effect estimate. Second, it is wise to allow for the residual heterogeneity among intervention effects not modelled by the explanatory variables. This gives rise to the term ‘random-effects meta-regression’, since the extra variability is incorporated in the same way as in a random-effects meta-analysis (Thompson and Sharp 1999).

The regression coefficient obtained from a meta-regression analysis will describe how the outcome variable (the intervention effect) changes with a unit increase in the explanatory variable (the potential effect modifier). The statistical significance of the regression coefficient is a test of whether there is a linear relationship between intervention effect and the explanatory variable. If the intervention effect is a ratio measure, the log-transformed value of the intervention effect should always be used in the regression model (see Chapter 6, Section 6.1.2.1 ), and the exponential of the regression coefficient will give an estimate of the relative change in intervention effect with a unit increase in the explanatory variable.
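A minimal sketch of a simple meta-regression follows. Note the assumptions: it uses fixed-effect (inverse-variance) weights and makes no allowance for residual heterogeneity, whereas random-effects meta-regression, as recommended above, would add the between-study variance to each study's variance before weighting. The function name and example data are hypothetical:

```python
def meta_regression(effects, variances, covariate):
    """Inverse-variance weighted simple meta-regression (fixed-effect
    weights; no allowance for residual heterogeneity).

    Returns (intercept, slope): the slope estimates the change in the
    intervention effect per unit increase in the covariate.
    """
    w = [1.0 / v for v in variances]
    sw = sum(w)
    xbar = sum(wi * x for wi, x in zip(w, covariate)) / sw
    ybar = sum(wi * y for wi, y in zip(w, effects)) / sw
    sxx = sum(wi * (x - xbar) ** 2 for wi, x in zip(w, covariate))
    sxy = sum(wi * (x - xbar) * (y - ybar)
              for wi, x, y in zip(w, covariate, effects))
    slope = sxy / sxx
    return ybar - slope * xbar, slope

# Hypothetical example: effect increases by 0.1 per unit of dose
intercept, slope = meta_regression([0.1, 0.2, 0.3], [0.05, 0.05, 0.05], [1, 2, 3])
```

For ratio measures, the effects supplied should be on the log scale, and exponentiating the slope gives the relative change in effect per unit increase in the covariate, as described above.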

Meta-regression can also be used to investigate differences for categorical explanatory variables as done in subgroup analyses. If there are J subgroups, membership of particular subgroups is indicated by using J minus 1 dummy variables (which can only take values of zero or one) in the meta-regression model (as in standard linear regression modelling). The regression coefficients will estimate how the intervention effect in each subgroup differs from a nominated reference subgroup. The P value of each regression coefficient will indicate the strength of evidence against the null hypothesis that the characteristic is not associated with the intervention effect.

Meta-regression may be performed using the ‘metareg’ macro available for the Stata statistical package, or using the ‘metafor’ package for R, as well as other packages.

10.11.5 Selection of study characteristics for subgroup analyses and meta-regression

Authors need to be cautious about undertaking subgroup analyses, and interpreting any that they do. Some considerations are outlined here for selecting characteristics (also called explanatory variables, potential effect modifiers or covariates) that will be investigated for their possible influence on the size of the intervention effect. These considerations apply similarly to subgroup analyses and to meta-regressions. Further details may be obtained elsewhere (Oxman and Guyatt 1992, Berlin and Antman 1994).

10.11.5.1 Ensure that there are adequate studies to justify subgroup analyses and meta-regressions

It is very unlikely that an investigation of heterogeneity will produce useful findings unless there is a substantial number of studies. A typical rule of thumb from simple regression analyses is that at least ten observations (i.e. ten studies in a meta-analysis) should be available for each characteristic modelled. However, even this will be too few when the covariates are unevenly distributed across studies.

10.11.5.2 Specify characteristics in advance

Authors should, whenever possible, pre-specify characteristics in the protocol that later will be subject to subgroup analyses or meta-regression. The plan specified in the protocol should then be followed (data permitting), without undue emphasis on any particular findings (see MECIR Box 10.11.b ). Pre-specifying characteristics reduces the likelihood of spurious findings, first by limiting the number of subgroups investigated, and second by preventing knowledge of the studies’ results influencing which subgroups are analysed. True pre-specification is difficult in systematic reviews, because the results of some of the relevant studies are often known when the protocol is drafted. If a characteristic was overlooked in the protocol, but is clearly of major importance and justified by external evidence, then authors should not be reluctant to explore it. However, such post-hoc analyses should be identified as such.

MECIR Box 10.11.b Relevant expectations for conduct of intervention reviews

Interpreting subgroup analyses

If subgroup analyses are conducted

Selective reporting, or over-interpretation, of particular subgroups or particular subgroup analyses should be avoided. This is a problem especially when multiple subgroup analyses are performed. This does not preclude the use of sensible and honest post hoc subgroup analyses.

10.11.5.3 Select a small number of characteristics

The likelihood of a false-positive result among subgroup analyses and meta-regression increases with the number of characteristics investigated. It is difficult to suggest a maximum number of characteristics to look at, especially since the number of available studies is unknown in advance. If more than one or two characteristics are investigated it may be sensible to adjust the level of significance to account for making multiple comparisons.

10.11.5.4 Ensure there is scientific rationale for investigating each characteristic

Selection of characteristics should be motivated by biological and clinical hypotheses, ideally supported by evidence from sources other than the included studies. Subgroup analyses using characteristics that are implausible or clinically irrelevant are not likely to be useful and should be avoided. For example, a relationship between intervention effect and year of publication is seldom in itself clinically informative, and if identified runs the risk of initiating a post-hoc data dredge of factors that may have changed over time.

Prognostic factors are those that predict the outcome of a disease or condition, whereas effect modifiers are factors that influence how well an intervention works in affecting the outcome. Confusion between prognostic factors and effect modifiers is common in planning subgroup analyses, especially at the protocol stage. Prognostic factors are not good candidates for subgroup analyses unless they are also believed to modify the effect of intervention. For example, being a smoker may be a strong predictor of mortality within the next ten years, but there may not be reason for it to influence the effect of a drug therapy on mortality (Deeks 1998). Potential effect modifiers may include participant characteristics (age, setting), the precise interventions (dose of active intervention, choice of comparison intervention), how the study was done (length of follow-up) or methodology (design and quality).

10.11.5.5 Be aware that the effect of a characteristic may not always be identified

Many characteristics that might have important effects on how well an intervention works cannot be investigated using subgroup analysis or meta-regression. These are characteristics of participants that might vary substantially within studies, but that can only be summarized at the level of the study. An example is age. Consider a collection of clinical trials involving adults ranging from 18 to 60 years old. There may be a strong relationship between age and intervention effect that is apparent within each study. However, if the mean ages for the trials are similar, then no relationship will be apparent by looking at trial mean ages and trial-level effect estimates. The problem is one of aggregating individuals’ results and is variously known as aggregation bias, ecological bias or the ecological fallacy (Morgenstern 1982, Greenland 1987, Berlin et al 2002). It is even possible for the direction of the relationship across studies to be the opposite of the direction of the relationship observed within each study.

10.11.5.6 Think about whether the characteristic is closely related to another characteristic (confounded)

The problem of ‘confounding’ complicates interpretation of subgroup analyses and meta-regressions and can lead to incorrect conclusions. Two characteristics are confounded if their influences on the intervention effect cannot be disentangled. For example, if those studies implementing an intensive version of a therapy happened to be the studies that involved patients with more severe disease, then one cannot tell which aspect is the cause of any difference in effect estimates between these studies and others. In meta-regression, co-linearity between potential effect modifiers leads to similar difficulties (Berlin and Antman 1994). Computing correlations between study characteristics will give some information about which study characteristics may be confounded with each other.

10.11.6 Interpretation of subgroup analyses and meta-regressions

Appropriate interpretation of subgroup analyses and meta-regressions requires caution (Oxman and Guyatt 1992).

  • Subgroup comparisons are observational. It must be remembered that subgroup analyses and meta-regressions are entirely observational in their nature. These analyses investigate differences between studies. Even if individuals are randomized to one group or other within a clinical trial, they are not randomized to go in one trial or another. Hence, subgroup analyses suffer the limitations of any observational investigation, including possible bias through confounding by other study-level characteristics. Furthermore, even a genuine difference between subgroups is not necessarily due to the classification of the subgroups. As an example, a subgroup analysis of bone marrow transplantation for treating leukaemia might show a strong association between the age of a sibling donor and the success of the transplant. However, this probably does not mean that the age of donor is important. In fact, the age of the recipient is probably a key factor and the subgroup finding would simply be due to the strong association between the age of the recipient and the age of their sibling.  
  • Was the analysis pre-specified or post hoc? Authors should state whether subgroup analyses were pre-specified or undertaken after the results of the studies had been compiled (post hoc). More reliance may be placed on a subgroup analysis if it was one of a small number of pre-specified analyses. Performing numerous post-hoc subgroup analyses to explain heterogeneity is a form of data dredging. Data dredging is condemned because it is usually possible to find an apparent, but false, explanation for heterogeneity by considering lots of different characteristics.  
  • Is there indirect evidence in support of the findings? Differences between subgroups should be clinically plausible and supported by other external or indirect evidence, if they are to be convincing.  
  • Is the magnitude of the difference practically important? If the magnitude of a difference between subgroups will not result in different recommendations for different subgroups, then it may be better to present only the overall analysis results.  
  • Is there a statistically significant difference between subgroups? To establish whether there is a different effect of an intervention in different situations, the magnitudes of effects in different subgroups should be compared directly with each other. In particular, statistical significance of the results within separate subgroup analyses should not be compared (see Section 10.11.3.1 ).  
  • Are analyses looking at within-study or between-study relationships? For patient and intervention characteristics, differences in subgroups that are observed within studies are more reliable than analyses of subsets of studies. If such within-study relationships are replicated across studies then this adds confidence to the findings.

10.11.7 Investigating the effect of underlying risk

One potentially important source of heterogeneity among a series of studies is when the underlying average risk of the outcome event varies between the studies. The underlying risk of a particular event may be viewed as an aggregate measure of case-mix factors such as age or disease severity. It is generally measured as the observed risk of the event in the comparator group of each study (the comparator group risk, or CGR). The notion is controversial in its relevance to clinical practice since underlying risk represents a summary of both known and unknown risk factors. Problems also arise because comparator group risk will depend on the length of follow-up, which often varies across studies. However, underlying risk has received particular attention in meta-analysis because the information is readily available once dichotomous data have been prepared for use in meta-analyses. Sharp provides a full discussion of the topic (Sharp 2001).

Intuition would suggest that participants are more or less likely to benefit from an effective intervention according to their risk status. However, the relationship between underlying risk and intervention effect is a complicated issue. For example, suppose an intervention is equally beneficial in the sense that for all patients it reduces the risk of an event, say a stroke, to 80% of the underlying risk. Then it is not equally beneficial in terms of absolute differences in risk in the sense that it reduces a 50% stroke rate by 10 percentage points to 40% (number needed to treat=10), but a 20% stroke rate by 4 percentage points to 16% (number needed to treat=25).
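The arithmetic in the worked example above can be captured in a short helper. This is a hypothetical illustration of the point that a constant risk ratio implies different absolute benefits at different underlying risks:

```python
def nnt_constant_risk_ratio(baseline_risk, risk_ratio):
    """Number needed to treat when the intervention multiplies the
    underlying (comparator group) risk by a constant risk ratio."""
    absolute_risk_reduction = baseline_risk * (1 - risk_ratio)
    return 1 / absolute_risk_reduction

# Matches the worked example: a risk ratio of 0.8 gives
# NNT 10 at 50% underlying risk, but NNT 25 at 20% underlying risk
nnt_high = nnt_constant_risk_ratio(0.5, 0.8)  # 10.0
nnt_low = nnt_constant_risk_ratio(0.2, 0.8)   # 25.0
```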

Use of different summary statistics (risk ratio, odds ratio and risk difference) will demonstrate different relationships with underlying risk. Summary statistics that show close to no relationship with underlying risk are generally preferred for use in meta-analysis (see Section 10.4.3 ).

Investigating any relationship between effect estimates and the comparator group risk is also complicated by a technical phenomenon known as regression to the mean. This arises because the comparator group risk forms an integral part of the effect estimate. A high risk in a comparator group, observed entirely by chance, will on average give rise to a higher than expected effect estimate, and vice versa. This phenomenon results in a false correlation between effect estimates and comparator group risks. There are methods, which require sophisticated software, that correct for regression to the mean (McIntosh 1996, Thompson et al 1997). These should be used for such analyses, and statistical expertise is recommended.

10.11.8 Dose-response analyses

The principles of meta-regression can be applied to the relationships between intervention effect and dose (commonly termed dose-response), treatment intensity or treatment duration (Greenland and Longnecker 1992, Berlin et al 1993). Conclusions about differences in effect due to differences in dose (or similar factors) are on stronger ground if participants are randomized to one dose or another within a study and a consistent relationship is found across similar studies. While authors should consider these effects, particularly as a possible explanation for heterogeneity, they should be cautious about drawing conclusions based on between-study differences. Authors should be particularly cautious about claiming that a dose-response relationship does not exist, given the low power of many meta-regression analyses to detect genuine relationships.

10.12 Missing data

10.12.1 Types of missing data

There are many potential sources of missing data in a systematic review or meta-analysis (see Table 10.12.a ). For example, a whole study may be missing from the review, an outcome may be missing from a study, summary data may be missing for an outcome, and individual participants may be missing from the summary data. Here we discuss a variety of potential sources of missing data, highlighting where more detailed discussions are available elsewhere in the Handbook .

Whole studies may be missing from a review because they are never published, are published in obscure places, are rarely cited, or are inappropriately indexed in databases. Thus, review authors should always be aware of the possibility that they have failed to identify relevant studies. There is a strong possibility that such studies are missing because of their ‘uninteresting’ or ‘unwelcome’ findings (that is, in the presence of publication bias). This problem is discussed at length in Chapter 13 . Details of comprehensive search methods are provided in Chapter 4 .

Some studies might not report any information on outcomes of interest to the review. For example, there may be no information on quality of life, or on serious adverse effects. It is often difficult to determine whether this is because the outcome was not measured or because the outcome was not reported. Furthermore, failure to report that outcomes were measured may be dependent on the unreported results (selective outcome reporting bias; see Chapter 7, Section 7.2.3.3 ). Similarly, summary data for an outcome, in a form that can be included in a meta-analysis, may be missing. A common example is missing standard deviations (SDs) for continuous outcomes. This is often a problem when change-from-baseline outcomes are sought. We discuss imputation of missing SDs in Chapter 6, Section 6.5.2.8 . Other examples of missing summary data are missing sample sizes (particularly those for each intervention group separately), numbers of events, standard errors, follow-up times for calculating rates, and sufficient details of time-to-event outcomes. Inappropriate analyses of studies, for example of cluster-randomized and crossover trials, can lead to missing summary data. It is sometimes possible to approximate the correct analyses of such studies, for example by imputing correlation coefficients or SDs, as discussed in Chapter 23, Section 23.1 , for cluster-randomized studies and Chapter 23, Section 23.2 , for crossover trials. As a general rule, most methodologists believe that missing summary data (e.g. ‘no usable data’) should not be used as a reason to exclude a study from a systematic review. It is more appropriate to include the study in the review, and to discuss the potential implications of its absence from a meta-analysis.

It is likely that in some, if not all, included studies, there will be individuals missing from the reported results. Review authors are encouraged to consider this problem carefully (see MECIR Box 10.12.a ). We provide further discussion of this problem in Section 10.12.3 ; see also Chapter 8, Section 8.5 .

Missing data can also affect subgroup analyses. If subgroup analyses or meta-regressions are planned (see Section 10.11 ), they require details of the study-level characteristics that distinguish studies from one another. If these are not available for all studies, review authors should consider asking the study authors for more information.

Table 10.12.a Types of missing data in a meta-analysis, with possible reasons

  • Missing studies: publication bias; search not sufficiently comprehensive.
  • Missing outcomes: outcome not measured; selective reporting bias.
  • Missing summary data: selective reporting bias; incomplete reporting.
  • Missing individuals: lack of intention-to-treat analysis; attrition from the study; selective reporting bias.
  • Missing study-level characteristics (for subgroup analysis or meta-regression): characteristic not measured; incomplete reporting.

MECIR Box 10.12.a Relevant expectations for conduct of intervention reviews

Addressing missing outcome data

Incomplete outcome data can introduce bias. In most circumstances, authors should follow the principles of intention-to-treat analyses as far as possible (this may not be appropriate for adverse effects or if trying to demonstrate equivalence). Risk of bias due to incomplete outcome data is addressed in the Cochrane risk-of-bias tool. However, statistical analyses and careful interpretation of results are additional ways in which the issue can be addressed by review authors. Imputation methods can be considered (accompanied by, or in the form of, sensitivity analyses).

10.12.2 General principles for dealing with missing data

There is a large literature on statistical methods for dealing with missing data. Here we briefly review some key concepts and make some general recommendations for Cochrane Review authors. It is important to think about why data may be missing. Statisticians often use the terms ‘missing at random’ and ‘not missing at random’ to represent different scenarios.

Data are said to be ‘missing at random’ if the fact that they are missing is unrelated to actual values of the missing data. For instance, if some quality-of-life questionnaires were lost in the postal system, this would be unlikely to be related to the quality of life of the trial participants who completed the forms. In some circumstances, statisticians distinguish between data ‘missing at random’ and data ‘missing completely at random’, although in the context of a systematic review the distinction is unlikely to be important. Data that are missing at random may not be important. Analyses based on the available data will often be unbiased, although based on a smaller sample size than the original data set.

Data are said to be ‘not missing at random’ if the fact that they are missing is related to the actual missing data. For instance, in a depression trial, participants who had a relapse of depression might be less likely to attend the final follow-up interview, and more likely to have missing outcome data. Such data are ‘non-ignorable’ in the sense that an analysis of the available data alone will typically be biased. Publication bias and selective reporting bias lead by definition to data that are ‘not missing at random’, and attrition and exclusions of individuals within studies often do as well.

The principal options for dealing with missing data are:

  1. analysing only the available data (i.e. ignoring the missing data);
  2. imputing the missing data with replacement values, and treating these as if they were observed (e.g. last observation carried forward, imputing an assumed outcome such as assuming all were poor outcomes, imputing the mean, imputing based on predicted values from a regression analysis);
  3. imputing the missing data and accounting for the fact that these were imputed with uncertainty (e.g. multiple imputation, simple imputation methods (as point 2) with adjustment to the standard error); and
  4. using statistical models to allow for missing data, making assumptions about their relationships with the available data.

Option 2 is practical in most circumstances and very commonly used in systematic reviews. However, it fails to acknowledge uncertainty in the imputed values and results, typically, in confidence intervals that are too narrow. Options 3 and 4 would require involvement of a knowledgeable statistician.
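As an illustration of option 2 and its main limitation, the following Python sketch (using made-up summary statistics) imputes a missing SD with the mean of the reported SDs before an inverse-variance fixed-effect pooling of mean differences. The resulting confidence interval treats the imputed value as if it had been observed, so it is typically too narrow:

```python
import math

# Hypothetical studies: mean difference, per-arm SD and per-arm n; study C's SD is missing.
studies = {
    "A": {"md": -2.0, "sd": 4.0, "n": 40},
    "B": {"md": -1.2, "sd": 5.0, "n": 60},
    "C": {"md": -1.8, "sd": None, "n": 50},  # SD not reported
}

# Simple imputation: borrow the average SD of the studies that did report one.
reported = [s["sd"] for s in studies.values() if s["sd"] is not None]
imputed_sd = sum(reported) / len(reported)
for s in studies.values():
    if s["sd"] is None:
        s["sd"] = imputed_sd  # treated as if observed (uncertainty is understated)

# Inverse-variance fixed-effect pooling of the mean differences.
weights, weighted = [], []
for s in studies.values():
    var = s["sd"] ** 2 * (2 / s["n"])  # variance of a mean difference with equal arms of size n
    weights.append(1 / var)
    weighted.append(s["md"] / var)
pooled = sum(weighted) / sum(weights)
se = math.sqrt(1 / sum(weights))
print(f"pooled MD = {pooled:.2f}, 95% CI ({pooled - 1.96*se:.2f}, {pooled + 1.96*se:.2f})")
```

A sensitivity analysis might repeat the pooling with other plausible imputed SDs to check whether the conclusion changes.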

Five general recommendations for dealing with missing data in Cochrane Reviews are as follows:

  • Whenever possible, contact the original investigators to request missing data.
  • Make explicit the assumptions of any methods used to address missing data: for example, that the data are assumed missing at random, or that missing values were assumed to have a particular value such as a poor outcome.
  • Follow the guidance in Chapter 8 to assess risk of bias due to missing outcome data in randomized trials.
  • Perform sensitivity analyses to assess how sensitive results are to reasonable changes in the assumptions that are made (see Section 10.14).
  • Address the potential impact of missing data on the findings of the review in the Discussion section.

10.12.3 Dealing with missing outcome data from individual participants

Review authors may undertake sensitivity analyses to assess the potential impact of missing outcome data, based on assumptions about the relationship between missingness in the outcome and its true value. Several methods are available (Akl et al 2015). For dichotomous outcomes, Higgins and colleagues propose a strategy involving different assumptions about how the risk of the event among the missing participants differs from the risk of the event among the observed participants, taking account of uncertainty introduced by the assumptions (Higgins et al 2008a). Akl and colleagues propose a suite of simple imputation methods, including a similar approach to that of Higgins and colleagues based on relative risks of the event in missing versus observed participants. Similar ideas can be applied to continuous outcome data (Ebrahim et al 2013, Ebrahim et al 2014). Particular care is required to avoid double counting events, since it can be unclear whether reported numbers of events in trial reports apply to the full randomized sample or only to those who did not drop out (Akl et al 2016).

Although there is a tradition of implementing ‘worst case’ and ‘best case’ analyses clarifying the extreme boundaries of what is theoretically possible, such analyses may not be informative for the most plausible scenarios (Higgins et al 2008a).
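The best-case/worst-case idea can be sketched in Python with hypothetical trial counts; the two extreme assumptions bound the risk ratio that any imputation of the missing outcomes could produce:

```python
# Hypothetical single trial with participants missing from each arm's results.
arms = {
    "experimental": {"events": 12, "observed": 80, "missing": 20},
    "control":      {"events": 24, "observed": 85, "missing": 15},
}

def rr(e_events, e_n, c_events, c_n):
    """Risk ratio of the experimental versus control arm."""
    return (e_events / e_n) / (c_events / c_n)

e, c = arms["experimental"], arms["control"]

# Available-case analysis: ignore the missing participants altogether.
available = rr(e["events"], e["observed"], c["events"], c["observed"])

# 'Best case' for the experimental intervention: assume no events among its
# missing participants, and an event for every missing control participant.
best = rr(e["events"], e["observed"] + e["missing"],
          c["events"] + c["missing"], c["observed"] + c["missing"])

# 'Worst case': the reverse assumptions.
worst = rr(e["events"] + e["missing"], e["observed"] + e["missing"],
           c["events"], c["observed"] + c["missing"])

print(f"risk ratio: available-case {available:.2f}, "
      f"best case {best:.2f}, worst case {worst:.2f}")
```

The intermediate strategies cited above (Higgins et al 2008a, Akl et al 2015) instead assume more plausible relative risks of the event in missing versus observed participants, and propagate the extra uncertainty.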

10.13 Bayesian approaches to meta-analysis

Bayesian statistics is an approach to statistics based on a different philosophy from that which underlies significance tests and confidence intervals. It is essentially about updating of evidence. In a Bayesian analysis, initial uncertainty is expressed through a prior distribution about the quantities of interest. Current data and assumptions concerning how they were generated are summarized in the likelihood. The posterior distribution for the quantities of interest can then be obtained by combining the prior distribution and the likelihood. The likelihood summarizes both the data from studies included in the meta-analysis (for example, 2×2 tables from randomized trials) and the meta-analysis model (for example, assuming a fixed effect or random effects). The result of the analysis is usually presented as a point estimate and 95% credible interval from the posterior distribution for each quantity of interest, which look much like classical estimates and confidence intervals. Potential advantages of Bayesian analyses are summarized in Box 10.13.a. Bayesian analysis may be performed using WinBUGS software (Smith et al 1995, Lunn et al 2000), within R (Röver 2017), or – for some applications – using standard meta-regression software with a simple trick (Rhodes et al 2016).

A difference between Bayesian analysis and classical meta-analysis is that the interpretation is directly in terms of belief: a 95% credible interval for an odds ratio is that region in which we believe the odds ratio to lie with probability 95%. This is how many practitioners actually interpret a classical confidence interval, but strictly in the classical framework the 95% refers to the long-term frequency with which 95% intervals contain the true value. The Bayesian framework also allows a review author to calculate the probability that the odds ratio has a particular range of values, which cannot be done in the classical framework. For example, we can determine the probability that the odds ratio is less than 1 (which might indicate a beneficial effect of an experimental intervention), or that it is no larger than 0.8 (which might indicate a clinically important effect). It should be noted that these probabilities are specific to the choice of the prior distribution. Different meta-analysts may analyse the same data using different prior distributions and obtain different results. It is therefore important to carry out sensitivity analyses to investigate how the results depend on any assumptions made.
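A minimal sketch of this updating with hypothetical numbers, using a normal approximation on the log odds-ratio scale and a conjugate normal prior (a real analysis would typically use the software mentioned above):

```python
import math

# Hypothetical data summary: log odds ratio estimate and its standard error
# (the 'likelihood'), combined with a weakly informative normal prior.
y, se = math.log(0.75), 0.12          # estimate from the trials
prior_mean, prior_sd = 0.0, 1.0       # prior on the log odds ratio

# Conjugate normal update: precision-weighted average of prior and data.
w_prior, w_data = 1 / prior_sd**2, 1 / se**2
post_mean = (w_prior * prior_mean + w_data * y) / (w_prior + w_data)
post_sd = math.sqrt(1 / (w_prior + w_data))

# 95% credible interval for the odds ratio.
lo, hi = (math.exp(post_mean - 1.96 * post_sd),
          math.exp(post_mean + 1.96 * post_sd))

def std_normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

# Direct probability statements, e.g. P(OR < 1) and P(OR <= 0.8).
p_benefit = std_normal_cdf((math.log(1.0) - post_mean) / post_sd)
p_important = std_normal_cdf((math.log(0.8) - post_mean) / post_sd)
print(f"OR {math.exp(post_mean):.2f} (95% CrI {lo:.2f} to {hi:.2f}); "
      f"P(OR<1)={p_benefit:.3f}, P(OR<=0.8)={p_important:.3f}")
```

Changing `prior_mean` and `prior_sd` and re-running is exactly the kind of sensitivity analysis over prior distributions that the text recommends.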

In the context of a meta-analysis, prior distributions are needed for the particular intervention effect being analysed (such as the odds ratio or the mean difference) and – in the context of a random-effects meta-analysis – on the amount of heterogeneity among intervention effects across studies. Prior distributions may represent subjective belief about the size of the effect, or may be derived from sources of evidence not included in the meta-analysis, such as information from non-randomized studies of the same intervention or from randomized trials of other interventions. The width of the prior distribution reflects the degree of uncertainty about the quantity. When there is little or no information, a ‘non-informative’ prior can be used, in which all values across the possible range are equally likely.

Most Bayesian meta-analyses use non-informative (or very weakly informative) prior distributions to represent beliefs about intervention effects, since many regard it as controversial to combine objective trial data with subjective opinion. However, prior distributions are increasingly used for the extent of among-study variation in a random-effects analysis. This is particularly advantageous when the number of studies in the meta-analysis is small, say fewer than five or ten. Libraries of data-based prior distributions are available that have been derived from re-analyses of many thousands of meta-analyses in the Cochrane Database of Systematic Reviews (Turner et al 2012).

Box 10.13.a Some potential advantages of Bayesian meta-analysis

Some potential advantages of Bayesian approaches over classical methods for meta-analyses are that they:

  • allow direct probability statements to be made about quantities of interest (for example, the probability that the intervention is beneficial);
  • allow external evidence, including subjective opinion, to be incorporated through prior distributions;
  • allow uncertainty in the amount of among-study heterogeneity to be reflected fully, which is especially useful when there are few studies; and
  • extend naturally to more complex modelling, for example of transitions between various clinical outcome states.

Statistical expertise is strongly recommended for review authors who wish to carry out Bayesian analyses. There are several good texts (Sutton et al 2000, Sutton and Abrams 2001, Spiegelhalter et al 2004).

10.14 Sensitivity analyses

The process of undertaking a systematic review involves a sequence of decisions. Whilst many of these decisions are clearly objective and non-contentious, some will be somewhat arbitrary or unclear. For instance, if eligibility criteria involve a numerical value, the choice of value is usually arbitrary: for example, defining groups of older people may reasonably have lower limits of 60, 65, 70 or 75 years, or any value in between. Other decisions may be unclear because a study report fails to include the required information. Some decisions are unclear because the included studies themselves never obtained the information required: for example, the outcomes of those who were lost to follow-up. Further decisions are unclear because there is no consensus on the best statistical method to use for a particular problem.

It is highly desirable to prove that the findings from a systematic review are not dependent on such arbitrary or unclear decisions by using sensitivity analysis (see MECIR Box 10.14.a). A sensitivity analysis is a repeat of the primary analysis or meta-analysis in which alternative decisions or ranges of values are substituted for decisions that were arbitrary or unclear. For example, if the eligibility of some studies in the meta-analysis is dubious because they do not contain full details, sensitivity analysis may involve undertaking the meta-analysis twice: the first time including all studies and, second, including only those that are definitely known to be eligible. A sensitivity analysis asks the question, ‘Are the findings robust to the decisions made in the process of obtaining them?’
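As a small illustration with made-up study estimates, the 'run the meta-analysis twice' idea might look like this in Python:

```python
import math

# Hypothetical log risk ratios with standard errors; two studies lack the
# details needed to confirm their eligibility.
studies = [
    {"id": 1, "lrr": -0.30, "se": 0.15, "eligible": "definite"},
    {"id": 2, "lrr": -0.10, "se": 0.20, "eligible": "definite"},
    {"id": 3, "lrr": -0.45, "se": 0.25, "eligible": "dubious"},
    {"id": 4, "lrr":  0.05, "se": 0.18, "eligible": "dubious"},
]

def pool(subset):
    """Inverse-variance fixed-effect pooled log risk ratio and its SE."""
    w = [1 / s["se"] ** 2 for s in subset]
    est = sum(wi * s["lrr"] for wi, s in zip(w, subset)) / sum(w)
    return est, math.sqrt(1 / sum(w))

primary = pool(studies)                                               # all studies
sensitivity = pool([s for s in studies if s["eligible"] == "definite"])
for label, (est, se) in (("all studies", primary), ("definite only", sensitivity)):
    print(f"{label}: RR {math.exp(est):.2f} "
          f"(95% CI {math.exp(est - 1.96*se):.2f} to {math.exp(est + 1.96*se):.2f})")
```

If the two pooled estimates and their interpretations agree, the findings are robust to the eligibility decision; if not, the uncertainty should be discussed.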

MECIR Box 10.14.a Relevant expectations for conduct of intervention reviews

Sensitivity analysis

It is important to be aware of whether results are robust, since this affects the strength of the conclusions that can be drawn from the review.

There are many decision nodes within the systematic review process that can generate a need for a sensitivity analysis. Examples include:

Searching for studies:

  • Should abstracts whose results cannot be confirmed in subsequent publications be included in the review?

Eligibility criteria:

  • Characteristics of participants: where a majority but not all people in a study meet an age range, should the study be included?
  • Characteristics of the intervention: what range of doses should be included in the meta-analysis?
  • Characteristics of the comparator: what criteria are required to define usual care to be used as a comparator group?
  • Characteristics of the outcome: what time point or range of time points are eligible for inclusion?
  • Study design: should blinded and unblinded outcome assessment be included, or should study inclusion be restricted by other aspects of methodological criteria?

What data should be analysed?

  • Time-to-event data: what assumptions of the distribution of censored data should be made?
  • Continuous data: where standard deviations are missing, when and how should they be imputed? Should analyses be based on change scores or on post-intervention values?
  • Ordinal scales: what cut-point should be used to dichotomize short ordinal scales into two groups?
  • Cluster-randomized trials: what values of the intraclass correlation coefficient should be used when trial analyses have not been adjusted for clustering?
  • Crossover trials: what values of the within-subject correlation coefficient should be used when this is not available in primary reports?
  • All analyses: what assumptions should be made about missing outcomes? Should adjusted or unadjusted estimates of intervention effects be used?

Analysis methods:

  • Should fixed-effect or random-effects methods be used for the analysis?
  • For dichotomous outcomes, should odds ratios, risk ratios or risk differences be used?
  • For continuous outcomes, where several scales have assessed the same dimension, should results be analysed as a standardized mean difference across all scales or as mean differences individually for each scale?
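For example, the cluster-randomization decision above can be probed with a short Python sketch (hypothetical trial values): the unadjusted standard error is inflated by the design effect 1 + (m - 1) * ICC over a range of assumed ICC values:

```python
import math

# Hypothetical cluster-randomized trial analysed as if individually randomized:
# unadjusted log risk ratio, its standard error, and the average cluster size m.
log_rr, se, cluster_size = math.log(0.70), 0.10, 25

# The design effect 1 + (m - 1) * ICC inflates the variance. Repeating the
# analysis across plausible ICC values shows how sensitive the result is.
results = {}
for icc in (0.001, 0.01, 0.05, 0.1):
    se_adj = se * math.sqrt(1 + (cluster_size - 1) * icc)
    ci = (math.exp(log_rr - 1.96 * se_adj), math.exp(log_rr + 1.96 * se_adj))
    results[icc] = ci
    print(f"ICC={icc:<5} RR 0.70, 95% CI ({ci[0]:.2f}, {ci[1]:.2f})")
```

With these made-up numbers the confidence interval excludes 1 at small ICCs but crosses it at ICC = 0.1, which is exactly the kind of dependence a sensitivity analysis is designed to expose.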

Some sensitivity analyses can be pre-specified in the study protocol, but many issues suitable for sensitivity analysis are only identified during the review process, when the individual peculiarities of the studies under investigation become apparent. When sensitivity analyses show that the overall result and conclusions are not affected by the different decisions that could be made during the review process, the results of the review can be regarded with a higher degree of certainty. Where sensitivity analyses identify particular decisions or missing information that greatly influence the findings of the review, greater resources can be deployed to try to resolve uncertainties and obtain extra information, possibly by contacting trial authors and obtaining individual participant data. If this cannot be achieved, the results must be interpreted with an appropriate degree of caution. Such findings may generate proposals for further investigations and future research.

Reporting of sensitivity analyses in a systematic review may best be done by producing a summary table. Rarely is it informative to produce individual forest plots for each sensitivity analysis undertaken.

Sensitivity analyses are sometimes confused with subgroup analysis. Although some sensitivity analyses involve restricting the analysis to a subset of the totality of studies, the two methods differ in two ways. First, sensitivity analyses do not attempt to estimate the effect of the intervention in the group of studies removed from the analysis, whereas in subgroup analyses, estimates are produced for each subgroup. Second, in sensitivity analyses, informal comparisons are made between different ways of estimating the same thing, whereas in subgroup analyses, formal statistical comparisons are made across the subgroups.

10.15 Chapter information

Editors: Jonathan J Deeks, Julian PT Higgins, Douglas G Altman; on behalf of the Cochrane Statistical Methods Group

Contributing authors: Douglas Altman, Deborah Ashby, Jacqueline Birks, Michael Borenstein, Marion Campbell, Jonathan Deeks, Matthias Egger, Julian Higgins, Joseph Lau, Keith O’Rourke, Gerta Rücker, Rob Scholten, Jonathan Sterne, Simon Thompson, Anne Whitehead

Acknowledgements: We are grateful to the following for commenting helpfully on earlier drafts: Bodil Als-Nielsen, Deborah Ashby, Jesse Berlin, Joseph Beyene, Jacqueline Birks, Michael Bracken, Marion Campbell, Chris Cates, Wendong Chen, Mike Clarke, Albert Cobos, Esther Coren, Francois Curtin, Roberto D’Amico, Keith Dear, Heather Dickinson, Diana Elbourne, Simon Gates, Paul Glasziou, Christian Gluud, Peter Herbison, Sally Hollis, David Jones, Steff Lewis, Tianjing Li, Joanne McKenzie, Philippa Middleton, Nathan Pace, Craig Ramsey, Keith O’Rourke, Rob Scholten, Guido Schwarzer, Jack Sinclair, Jonathan Sterne, Simon Thompson, Andy Vail, Clarine van Oel, Paula Williamson and Fred Wolf.

Funding: JJD received support from the National Institute for Health Research (NIHR) Birmingham Biomedical Research Centre at the University Hospitals Birmingham NHS Foundation Trust and the University of Birmingham. JPTH is a member of the NIHR Biomedical Research Centre at University Hospitals Bristol NHS Foundation Trust and the University of Bristol. JPTH received funding from National Institute for Health Research Senior Investigator award NF-SI-0617-10145. The views expressed are those of the author(s) and not necessarily those of the NHS, the NIHR or the Department of Health.

10.16 References

Agresti A. An Introduction to Categorical Data Analysis . New York (NY): John Wiley & Sons; 1996.

Akl EA, Kahale LA, Agoritsas T, Brignardello-Petersen R, Busse JW, Carrasco-Labra A, Ebrahim S, Johnston BC, Neumann I, Sola I, Sun X, Vandvik P, Zhang Y, Alonso-Coello P, Guyatt G. Handling trial participants with missing outcome data when conducting a meta-analysis: a systematic survey of proposed approaches. Systematic Reviews 2015; 4 : 98.

Akl EA, Kahale LA, Ebrahim S, Alonso-Coello P, Schünemann HJ, Guyatt GH. Three challenges described for identifying participants with missing data in trials reports, and potential solutions suggested to systematic reviewers. Journal of Clinical Epidemiology 2016; 76 : 147-154.

Altman DG, Bland JM. Detecting skewness from summary information. BMJ 1996; 313 : 1200.

Anzures-Cabrera J, Sarpatwari A, Higgins JPT. Expressing findings from meta-analyses of continuous outcomes in terms of risks. Statistics in Medicine 2011; 30 : 2967-2985.

Berlin JA, Longnecker MP, Greenland S. Meta-analysis of epidemiologic dose-response data. Epidemiology 1993; 4 : 218-228.

Berlin JA, Antman EM. Advantages and limitations of metaanalytic regressions of clinical trials data. Online Journal of Current Clinical Trials 1994; Doc No 134 .

Berlin JA, Santanna J, Schmid CH, Szczech LA, Feldman KA, Group A-LAITS. Individual patient- versus group-level data meta-regressions for the investigation of treatment effect modifiers: ecological bias rears its ugly head. Statistics in Medicine 2002; 21 : 371-387.

Borenstein M, Hedges LV, Higgins JPT, Rothstein HR. A basic introduction to fixed-effect and random-effects models for meta-analysis. Research Synthesis Methods 2010; 1 : 97-111.

Borenstein M, Higgins JPT. Meta-analysis and subgroups. Prev Sci 2013; 14 : 134-143.

Bradburn MJ, Deeks JJ, Berlin JA, Russell Localio A. Much ado about nothing: a comparison of the performance of meta-analytical methods with rare events. Statistics in Medicine 2007; 26 : 53-77.

Chinn S. A simple method for converting an odds ratio to effect size for use in meta-analysis. Statistics in Medicine 2000; 19 : 3127-3131.

da Costa BR, Nuesch E, Rutjes AW, Johnston BC, Reichenbach S, Trelle S, Guyatt GH, Jüni P. Combining follow-up and change data is valid in meta-analyses of continuous outcomes: a meta-epidemiological study. Journal of Clinical Epidemiology 2013; 66 : 847-855.

Deeks JJ. Systematic reviews of published evidence: Miracles or minefields? Annals of Oncology 1998; 9 : 703-709.

Deeks JJ, Altman DG, Bradburn MJ. Statistical methods for examining heterogeneity and combining results from several studies in meta-analysis. In: Egger M, Davey Smith G, Altman DG, editors. Systematic Reviews in Health Care: Meta-analysis in Context. 2nd ed. London (UK): BMJ Publication Group; 2001. p. 285-312.

Deeks JJ. Issues in the selection of a summary statistic for meta-analysis of clinical trials with binary outcomes. Statistics in Medicine 2002; 21 : 1575-1600.

DerSimonian R, Laird N. Meta-analysis in clinical trials. Controlled Clinical Trials 1986; 7 : 177-188.

DiGuiseppi C, Higgins JPT. Interventions for promoting smoke alarm ownership and function. Cochrane Database of Systematic Reviews 2001; 2 : CD002246.

Ebrahim S, Akl EA, Mustafa RA, Sun X, Walter SD, Heels-Ansdell D, Alonso-Coello P, Johnston BC, Guyatt GH. Addressing continuous data for participants excluded from trial analysis: a guide for systematic reviewers. Journal of Clinical Epidemiology 2013; 66 : 1014-1021 e1011.

Ebrahim S, Johnston BC, Akl EA, Mustafa RA, Sun X, Walter SD, Heels-Ansdell D, Alonso-Coello P, Guyatt GH. Addressing continuous data measured with different instruments for participants excluded from trial analysis: a guide for systematic reviewers. Journal of Clinical Epidemiology 2014; 67 : 560-570.

Efthimiou O. Practical guide to the meta-analysis of rare events. Evidence-Based Mental Health 2018; 21 : 72-76.

Egger M, Davey Smith G, Schneider M, Minder C. Bias in meta-analysis detected by a simple, graphical test. BMJ 1997; 315 : 629-634.

Engels EA, Schmid CH, Terrin N, Olkin I, Lau J. Heterogeneity and statistical significance in meta-analysis: an empirical study of 125 meta-analyses. Statistics in Medicine 2000; 19 : 1707-1728.

Greenland S, Robins JM. Estimation of a common effect parameter from sparse follow-up data. Biometrics 1985; 41 : 55-68.

Greenland S. Quantitative methods in the review of epidemiologic literature. Epidemiologic Reviews 1987; 9 : 1-30.

Greenland S, Longnecker MP. Methods for trend estimation from summarized dose-response data, with applications to meta-analysis. American Journal of Epidemiology 1992; 135 : 1301-1309.

Guevara JP, Berlin JA, Wolf FM. Meta-analytic methods for pooling rates when follow-up duration varies: a case study. BMC Medical Research Methodology 2004; 4 : 17.

Hartung J, Knapp G. A refined method for the meta-analysis of controlled clinical trials with binary outcome. Statistics in Medicine 2001; 20 : 3875-3889.

Hasselblad V, McCrory DC. Meta-analytic tools for medical decision making: A practical guide. Medical Decision Making 1995; 15 : 81-96.

Higgins JPT, Thompson SG. Quantifying heterogeneity in a meta-analysis. Statistics in Medicine 2002; 21 : 1539-1558.

Higgins JPT, Thompson SG, Deeks JJ, Altman DG. Measuring inconsistency in meta-analyses. BMJ 2003; 327 : 557-560.

Higgins JPT, Thompson SG. Controlling the risk of spurious findings from meta-regression. Statistics in Medicine 2004; 23 : 1663-1682.

Higgins JPT, White IR, Wood AM. Imputation methods for missing outcome data in meta-analysis of clinical trials. Clinical Trials 2008a; 5 : 225-239.

Higgins JPT, White IR, Anzures-Cabrera J. Meta-analysis of skewed data: combining results reported on log-transformed or raw scales. Statistics in Medicine 2008b; 27 : 6072-6092.

Higgins JPT, Thompson SG, Spiegelhalter DJ. A re-evaluation of random-effects meta-analysis. Journal of the Royal Statistical Society: Series A (Statistics in Society) 2009; 172 : 137-159.

Kjaergard LL, Villumsen J, Gluud C. Reported methodologic quality and discrepancies between large and small randomized trials in meta-analyses. Annals of Internal Medicine 2001; 135 : 982-989.

Langan D, Higgins JPT, Simmonds M. An empirical comparison of heterogeneity variance estimators in 12 894 meta-analyses. Research Synthesis Methods 2015; 6 : 195-205.

Langan D, Higgins JPT, Simmonds M. Comparative performance of heterogeneity variance estimators in meta-analysis: a review of simulation studies. Research Synthesis Methods 2017; 8 : 181-198.

Langan D, Higgins JPT, Jackson D, Bowden J, Veroniki AA, Kontopantelis E, Viechtbauer W, Simmonds M. A comparison of heterogeneity variance estimators in simulated random-effects meta-analyses. Research Synthesis Methods 2019; 10 : 83-98.

Lewis S, Clarke M. Forest plots: trying to see the wood and the trees. BMJ 2001; 322 : 1479-1480.

Lunn DJ, Thomas A, Best N, Spiegelhalter D. WinBUGS - A Bayesian modelling framework: Concepts, structure, and extensibility. Statistics and Computing 2000; 10 : 325-337.

Mantel N, Haenszel W. Statistical aspects of the analysis of data from retrospective studies of disease. Journal of the National Cancer Institute 1959; 22 : 719-748.

McIntosh MW. The population risk as an explanatory variable in research synthesis of clinical trials. Statistics in Medicine 1996; 15 : 1713-1728.

Morgenstern H. Uses of ecologic analysis in epidemiologic research. American Journal of Public Health 1982; 72 : 1336-1344.

Oxman AD, Guyatt GH. A consumers guide to subgroup analyses. Annals of Internal Medicine 1992; 116 : 78-84.

Peto R, Collins R, Gray R. Large-scale randomized evidence: large, simple trials and overviews of trials. Journal of Clinical Epidemiology 1995; 48 : 23-40.

Poole C, Greenland S. Random-effects meta-analyses are not always conservative. American Journal of Epidemiology 1999; 150 : 469-475.

Rhodes KM, Turner RM, White IR, Jackson D, Spiegelhalter DJ, Higgins JPT. Implementing informative priors for heterogeneity in meta-analysis using meta-regression and pseudo data. Statistics in Medicine 2016; 35 : 5495-5511.

Rice K, Higgins JPT, Lumley T. A re-evaluation of fixed effect(s) meta-analysis. Journal of the Royal Statistical Society Series A (Statistics in Society) 2018; 181 : 205-227.

Riley RD, Higgins JPT, Deeks JJ. Interpretation of random effects meta-analyses. BMJ 2011; 342 : d549.

Röver C. Bayesian random-effects meta-analysis using the bayesmeta R package. 2017. https://arxiv.org/abs/1711.08683.

Rücker G, Schwarzer G, Carpenter J, Olkin I. Why add anything to nothing? The arcsine difference as a measure of treatment effect in meta-analysis with zero cells. Statistics in Medicine 2009; 28 : 721-738.

Sharp SJ. Analysing the relationship between treatment benefit and underlying risk: precautions and practical recommendations. In: Egger M, Davey Smith G, Altman DG, editors. Systematic Reviews in Health Care: Meta-analysis in Context. 2nd ed. London (UK): BMJ Publication Group; 2001. p. 176-188.

Sidik K, Jonkman JN. A simple confidence interval for meta-analysis. Statistics in Medicine 2002; 21 : 3153-3159.

Simmonds MC, Tierney J, Bowden J, Higgins JPT. Meta-analysis of time-to-event data: a comparison of two-stage methods. Research Synthesis Methods 2011; 2 : 139-149.

Sinclair JC, Bracken MB. Clinically useful measures of effect in binary analyses of randomized trials. Journal of Clinical Epidemiology 1994; 47 : 881-889.

Smith TC, Spiegelhalter DJ, Thomas A. Bayesian approaches to random-effects meta-analysis: a comparative study. Statistics in Medicine 1995; 14 : 2685-2699.

Spiegelhalter DJ, Abrams KR, Myles JP. Bayesian Approaches to Clinical Trials and Health-Care Evaluation . Chichester (UK): John Wiley & Sons; 2004.

Spittal MJ, Pirkis J, Gurrin LC. Meta-analysis of incidence rate data in the presence of zero events. BMC Medical Research Methodology 2015; 15 : 42.

Sutton AJ, Abrams KR, Jones DR, Sheldon TA, Song F. Methods for Meta-analysis in Medical Research . Chichester (UK): John Wiley & Sons; 2000.

Sutton AJ, Abrams KR. Bayesian methods in meta-analysis and evidence synthesis. Statistical Methods in Medical Research 2001; 10 : 277-303.

Sweeting MJ, Sutton AJ, Lambert PC. What to add to nothing? Use and avoidance of continuity corrections in meta-analysis of sparse data. Statistics in Medicine 2004; 23 : 1351-1375.

Thompson SG, Smith TC, Sharp SJ. Investigating underlying risk as a source of heterogeneity in meta-analysis. Statistics in Medicine 1997; 16 : 2741-2758.

Thompson SG, Sharp SJ. Explaining heterogeneity in meta-analysis: a comparison of methods. Statistics in Medicine 1999; 18 : 2693-2708.

Thompson SG, Higgins JPT. How should meta-regression analyses be undertaken and interpreted? Statistics in Medicine 2002; 21 : 1559-1574.

Turner RM, Davey J, Clarke MJ, Thompson SG, Higgins JPT. Predicting the extent of heterogeneity in meta-analysis, using empirical data from the Cochrane Database of Systematic Reviews. International Journal of Epidemiology 2012; 41 : 818-827.

Veroniki AA, Jackson D, Viechtbauer W, Bender R, Bowden J, Knapp G, Kuss O, Higgins JPT, Langan D, Salanti G. Methods to estimate the between-study variance and its uncertainty in meta-analysis. Research Synthesis Methods 2016; 7 : 55-79.

Whitehead A, Jones NMB. A meta-analysis of clinical trials involving different classifications of response into ordered categories. Statistics in Medicine 1994; 13 : 2503-2515.

Yusuf S, Peto R, Lewis J, Collins R, Sleight P. Beta blockade during and after myocardial infarction: an overview of the randomized trials. Progress in Cardiovascular Diseases 1985; 27 : 335-371.

Yusuf S, Wittes J, Probstfield J, Tyroler HA. Analysis and interpretation of treatment effects in subgroups of patients in randomized clinical trials. JAMA 1991; 266 : 93-98.

For permission to re-use material from the Handbook (either academic or commercial), please see here for full details.

Organizing Your Social Sciences Research Assignments


Definition and Introduction

Case analysis is a problem-based teaching and learning method that involves critically analyzing complex scenarios within an organizational setting for the purpose of placing the student in a “real world” situation and applying reflection and critical thinking skills to contemplate appropriate solutions, decisions, or recommended courses of action. It is considered a more effective teaching technique than in-class role playing or simulation activities. The analytical process is often guided by questions provided by the instructor that ask students to contemplate relationships between the facts and critical incidents described in the case.

Cases generally include both descriptive and statistical elements and rely on students applying abductive reasoning to develop and argue for preferred or best outcomes [i.e., case scenarios rarely have a single correct or perfect answer based on the evidence provided]. Rather than emphasizing theories or concepts, case analysis assignments emphasize building a bridge of relevancy between abstract thinking and practical application and, by so doing, teach the value of both within a specific area of professional practice.

Given this, the purpose of a case analysis paper is to present a structured and logically organized format for analyzing the case situation. It can be assigned to students individually or as a small group assignment and it may include an in-class presentation component. Case analysis is predominantly taught in economics and business-related courses, but it is also a method of teaching and learning found in other applied social sciences disciplines, such as social work, public relations, education, journalism, and public administration.

Ellet, William. The Case Study Handbook: A Student's Guide. Revised Edition. Boston, MA: Harvard Business School Publishing, 2018; Rasche, Christoph and Achim Seisreiner. Guidelines for Business Case Analysis. University of Potsdam; Writing a Case Analysis. Writing Center, Baruch College; Volpe, Guglielmo. "Case Teaching in Economics: History, Practice and Evidence." Cogent Economics and Finance 3 (December 2015). doi:10.1080/23322039.2015.1120977.

How to Approach Writing a Case Analysis Paper

The organization and structure of a case analysis paper can vary depending on the organizational setting, the situation, and how your professor wants you to approach the assignment. Nevertheless, preparing to write a case analysis paper involves several important steps. As Hawes notes, a case analysis assignment “...is useful in developing the ability to get to the heart of a problem, analyze it thoroughly, and to indicate the appropriate solution as well as how it should be implemented” [p.48]. This statement encapsulates how you should approach preparing to write a case analysis paper.

Before you begin to write your paper, consider the following analytical procedures:

  • Review the case to get an overview of the situation. A case can be only a few pages in length; most often, however, it is lengthy and contains a significant amount of detailed background information and statistics, with multilayered descriptions of the scenario, the roles and behaviors of various stakeholder groups, and situational events. Therefore, a quick reading of the case will help you gain an overall sense of the situation and illuminate the types of issues and problems that you will need to address in your paper. If your professor has provided questions intended to help frame your analysis, use them to guide your initial reading of the case.
  • Read the case thoroughly. After gaining a general overview of the case, carefully read the content again with the purpose of understanding key circumstances, events, and behaviors among stakeholder groups. Look for information or data that appears contradictory, extraneous, or misleading. At this point, you should be taking notes as you read because this will help you develop a general outline of your paper. The aim is to obtain a complete understanding of the situation so that you can begin contemplating tentative answers to any questions your professor has provided or, if none have been provided, developing answers to your own questions about the case scenario and its connection to the course readings, lectures, and class discussions.
  • Determine key stakeholder groups, issues, and events and the relationships they all have to each other. As you analyze the content, pay particular attention to identifying individuals, groups, or organizations described in the case and identify evidence of any problems or issues of concern that impact the situation in a negative way. Other things to look for include identifying any assumptions being made by or about each stakeholder, potential biased explanations or actions, explicit demands or ultimatums, and the underlying concerns that motivate these behaviors among stakeholders. The goal at this stage is to develop a comprehensive understanding of the situational and behavioral dynamics of the case and the explicit and implicit consequences of each of these actions.
  • Identify the core problems. The next step in most case analysis assignments is to discern what the core [i.e., most damaging, detrimental, injurious] problems are within the organizational setting and to determine their implications. The purpose at this stage of preparing to write your analysis paper is to distinguish between the symptoms of core problems and the core problems themselves and to decide which of these must be addressed immediately and which problems do not appear critical but may escalate over time. Identify evidence from the case to support your decisions by determining what information or data is essential to addressing the core problems and what information is not relevant or is misleading.
  • Explore alternative solutions. As noted, case analysis scenarios rarely have only one correct answer. Therefore, it is important to keep in mind that the process of analyzing the case and diagnosing core problems, while based on evidence, is a subjective process open to various avenues of interpretation. This means that you must consider alternative solutions or courses of action by critically examining strengths and weaknesses, risk factors, and the differences between short-term and long-term solutions. For each possible solution or course of action, consider the consequences of implementing it and how these recommendations might lead to new problems. Also, consider thinking about your recommended solutions or courses of action in relation to issues of fairness, equity, and inclusion.
  • Decide on a final set of recommendations. The last stage in preparing to write a case analysis paper is to assert an opinion or viewpoint about the recommendations needed to help resolve the core problems as you see them and to make a persuasive argument for supporting this point of view. Prepare a clear rationale for your recommendations based on examining each element of your analysis. Anticipate possible obstacles that could derail their implementation. Consider any counter-arguments that could be made concerning the validity of your recommended actions. Finally, describe a set of criteria and measurable indicators that could be applied to evaluating the effectiveness of your implementation plan.

Use these steps as the framework for writing your paper. Remember that the more detailed you are in taking notes as you critically examine each element of the case, the more information you will have to draw from when you begin to write. This will save you time.

NOTE: If the process of preparing to write a case analysis paper is assigned as a student group project, consider having each member of the group analyze a specific element of the case, including drafting answers to the corresponding questions used by your professor to frame the analysis. This will help make the analytical process more efficient and ensure that the distribution of work is equitable. It can also help determine who is responsible for drafting each part of the final case analysis paper and, if applicable, the in-class presentation.

Framework for Case Analysis. College of Management. University of Massachusetts; Hawes, Jon M. "Teaching is Not Telling: The Case Method as a Form of Interactive Learning." Journal for Advancement of Marketing Education 5 (Winter 2004): 47-54; Rasche, Christoph and Achim Seisreiner. Guidelines for Business Case Analysis. University of Potsdam; Writing a Case Study Analysis. University of Arizona Global Campus Writing Center; Van Ness, Raymond K. A Guide to Case Analysis. School of Business. State University of New York, Albany; Writing a Case Analysis. Business School, University of New South Wales.

Structure and Writing Style

A case analysis paper should be detailed, concise, persuasive, clearly written, and professional in tone and in the use of language. As with other forms of college-level academic writing, declarative statements that convey information, provide a fact, or offer an explanation or any recommended courses of action should be based on evidence. If allowed by your professor, any external sources used to support your analysis, such as course readings, should be properly cited under a list of references. The organization and structure of case analysis papers can vary depending on your professor’s preferred format, but the structure generally follows the steps used for analyzing the case.

Introduction

The introduction should provide a succinct but thorough descriptive overview of the main facts, issues, and core problems of the case. The introduction should also include a brief summary of the most relevant details about the situation and organizational setting. This includes defining the theoretical framework or conceptual model on which any questions used to frame your analysis are based.

Following the rules of most college-level research papers, the introduction should then inform the reader how the paper will be organized. This includes describing the major sections of the paper and the order in which they will be presented. Unless you are told to do so by your professor, you do not need to preview your final recommendations in the introduction. Unlike most college-level research papers, the introduction does not include a statement about the significance of your findings because a case analysis assignment does not involve contributing new knowledge about a research problem.

Background Analysis

Background analysis can vary depending on any guiding questions provided by your professor and the underlying concept or theory that the case is based upon. In general, however, this section of your paper should focus on:

  • Providing an overarching analysis of problems identified from the case scenario, including identifying events that stakeholders find challenging or troublesome,
  • Identifying assumptions made by each stakeholder and any apparent biases they may exhibit,
  • Describing any demands or claims made by or forced upon key stakeholders, and
  • Highlighting any issues of concern or complaints expressed by stakeholders in response to those demands or claims.

These aspects of the case are often in the form of behavioral responses expressed by individuals or groups within the organizational setting. However, note that problems in a case situation can also be reflected in data [or the lack thereof] and in the decision-making, operational, cultural, or institutional structure of the organization. Additionally, demands or claims can be either internal or external to the organization [e.g., a case analysis involving a president considering arms sales to Saudi Arabia could include managing internal demands from White House advisors as well as demands from members of Congress].

Throughout this section, present all relevant evidence from the case that supports your analysis. Do not simply claim there is a problem, an assumption, a demand, or a concern; tell the reader what part of the case informed how you identified these background elements.

Identification of Problems

In most case analysis assignments, there are problems, and then there are problems. Each problem can reflect a multitude of underlying symptoms that are detrimental to the interests of the organization. The purpose of identifying problems is to teach students how to differentiate between problems that vary in severity, impact, and relative importance. Given this, problems can be described in three general forms: those that must be addressed immediately, those that should be addressed but whose impact is not severe, and those that do not require immediate attention and can be set aside for the time being.

All of the problems you identify from the case should be presented in this section of your paper, with an evidence-based description explaining how they vary in severity and importance. If the assignment asks you to conduct research to further support your assessment of the problems, include this in your explanation. Remember to cite those sources in a list of references. Use specific evidence from the case and apply appropriate concepts, theories, and models discussed in class or in relevant course readings to highlight and explain the key problems [or problem] that you believe must be solved immediately and describe the underlying symptoms and why they are so critical.

Alternative Solutions

This section is where you provide specific, realistic, and evidence-based solutions to the problems you have identified and make recommendations about how to alleviate the underlying symptomatic conditions impacting the organizational setting. For each solution, you must explain why it was chosen and provide clear evidence to support your reasoning. This can include, for example, course readings and class discussions as well as research resources, such as books, journal articles, research reports, or government documents. In some cases, your professor may encourage you to include personal, anecdotal experiences as evidence to support why you chose a particular solution or set of solutions. Using anecdotal evidence helps promote reflective thinking about the process of determining what qualifies as a core problem and relevant solution.

Throughout this part of the paper, keep in mind the entire array of problems that must be addressed and describe in detail the solutions that might be implemented to resolve these problems.

Recommended Courses of Action

In some case analysis assignments, your professor may ask you to combine the alternative solutions section with your recommended courses of action. However, it is important to know the difference between the two. A solution refers to the answer to a problem. A course of action refers to a procedure or deliberate sequence of activities adopted to proactively confront a situation, often in the context of accomplishing a goal. In this context, proposed courses of action are based on your analysis of alternative solutions. Your description and justification for pursuing each course of action should represent the overall plan for implementing your recommendations.

For each course of action, you need to explain the rationale for your recommendation in a way that confronts challenges, explains risks, and anticipates any counter-arguments from stakeholders. Do this by considering the strengths and weaknesses of each course of action framed in relation to how the action is expected to resolve the core problems presented, the possible ways the action may affect remaining problems, and how the recommended action will be perceived by each stakeholder.

In addition, you should describe the criteria needed to measure how well the implementation of these actions is working and explain which individuals or groups are responsible for ensuring your recommendations are successful. Always consider the law of unintended consequences as well: outline difficulties that may arise in implementing each course of action and describe how implementing the proposed courses of action [either individually or collectively] may lead to new problems [both large and small].

Throughout this section, you must consider the costs and benefits of recommending your courses of action in relation to uncertainties or missing information and the negative consequences of success.

Conclusion

The conclusion should be brief and introspective. Unlike a research paper, the conclusion in a case analysis paper does not include a summary of key findings and their significance, a statement about how the study contributed to existing knowledge, or an indication of opportunities for future research.

Begin by synthesizing the core problems presented in the case and the relevance of your recommended solutions. This can include an explanation of what you have learned about the case in the context of your answers to the questions provided by your professor. The conclusion is also where you link what you learned from analyzing the case with the course readings or class discussions. This can further demonstrate your understanding of the relationships between the practical case situation and the theoretical and abstract content of assigned readings and other course content.

Problems to Avoid

The literature on case analysis assignments often includes examples of difficulties students have with applying methods of critical analysis and effectively reporting the results of their assessment of the situation. A common reason cited by scholars is that the application of this type of teaching and learning method is limited to applied fields of social and behavioral sciences and, as a result, writing a case analysis paper can be unfamiliar to most students entering college.

After you have drafted your paper, proofread the narrative flow and revise any of these common errors:

  • Unnecessary detail in the background section. The background section should highlight the essential elements of the case based on your analysis. Focus on summarizing the facts and highlighting the key factors that become relevant in the other sections of the paper by eliminating any unnecessary information.
  • Analysis relies too much on opinion. Your analysis is interpretive, but the narrative must be connected clearly to evidence from the case and any models and theories discussed in class or in course readings. Any positions or arguments you make should be supported by evidence.
  • Analysis does not focus on the most important elements of the case. Your paper should provide a thorough overview of the case. However, the analysis should focus on providing evidence about what you identify as the key events, stakeholders, issues, and problems. Emphasize what you identify as the most critical aspects of the case to be developed throughout your analysis. Be thorough but succinct.
  • Writing is too descriptive. A paper with too much descriptive information detracts from your analysis of the complexities of the case situation. Questions about what happened, where, when, and by whom should only be included as essential information leading to your examination of questions related to why, how, and for what purpose.
  • Inadequate definition of a core problem and associated symptoms. A common error found in case analysis papers is recommending a solution or course of action without adequately defining or demonstrating that you understand the problem. Make sure you have clearly described the problem and its impact and scope within the organizational setting. Ensure that you have adequately described the root causes when describing the symptoms of the problem.
  • Recommendations lack specificity. Identify any use of vague statements and indeterminate terminology, such as “a particular experience” or “a large increase to the budget.” These statements cannot be measured and, as a result, there is no way to evaluate their successful implementation. Provide specific data and use direct language in describing recommended actions.
  • Unrealistic, exaggerated, or unattainable recommendations. Review your recommendations to ensure that they are based on the situational facts of the case. Your recommended solutions and courses of action must be based on realistic assumptions and fit within the constraints of the situation. Also note that the case scenario has already happened; therefore, any speculation or arguments about what could have occurred if the circumstances were different should be revised or eliminated.

Bee, Lian Song et al. "Business Students' Perspectives on Case Method Coaching for Problem-Based Learning: Impacts on Student Engagement and Learning Performance in Higher Education." Education & Training 64 (2022): 416-432; The Case Analysis. Fred Meijer Center for Writing and Michigan Authors. Grand Valley State University; Georgallis, Panikos and Kayleigh Bruijn. "Sustainability Teaching Using Case-Based Debates." Journal of International Education in Business 15 (2022): 147-163; Hawes, Jon M. "Teaching is Not Telling: The Case Method as a Form of Interactive Learning." Journal for Advancement of Marketing Education 5 (Winter 2004): 47-54; Dean, Kathy Lund and Charles J. Fornaciari. "How to Create and Use Experiential Case-Based Exercises in a Management Classroom." Journal of Management Education 26 (October 2002): 586-603; Klebba, Joanne M. and Janet G. Hamilton. "Structured Case Analysis: Developing Critical Thinking Skills in a Marketing Case Course." Journal of Marketing Education 29 (August 2007): 132-137, 139; Klein, Norman. "The Case Discussion Method Revisited: Some Questions about Student Skills." Exchange: The Organizational Behavior Teaching Journal 6 (November 1981): 30-32; Mukherjee, Arup. "Effective Use of In-Class Mini Case Analysis for Discovery Learning in an Undergraduate MIS Course." The Journal of Computer Information Systems 40 (Spring 2000): 15-23; Pessoa, Silvia et al. "Scaffolding the Case Analysis in an Organizational Behavior Course: Making Analytical Language Explicit." Journal of Management Education 46 (2022): 226-251; Ramsey, V. J. and L. D. Dodge. "Case Analysis: A Structured Approach." Exchange: The Organizational Behavior Teaching Journal 6 (November 1981): 27-29; Schweitzer, Karen. "How to Write and Format a Business Case Study." ThoughtCo. https://www.thoughtco.com/how-to-write-and-format-a-business-case-study-466324 (accessed December 5, 2022); Reddy, C. D. "Teaching Research Methodology: Everything's a Case." Electronic Journal of Business Research Methods 18 (December 2020): 178-188; Volpe, Guglielmo. "Case Teaching in Economics: History, Practice and Evidence." Cogent Economics and Finance 3 (December 2015). doi:10.1080/23322039.2015.1120977.

Writing Tip

Case Study and Case Analysis Are Not the Same!

Confusion often exists between what it means to write a paper that uses a case study research design and writing a paper that analyzes a case; they are two different types of approaches to learning in the social and behavioral sciences. Professors as well as educational researchers contribute to this confusion because they often use the term "case study" when describing the subject of analysis for a case analysis paper. But you are not studying a case for the purpose of generating a comprehensive, multi-faceted understanding of a research problem. Rather, you are critically analyzing a specific scenario to argue logically for recommended solutions and courses of action that lead to optimal outcomes applicable to professional practice.

To avoid any confusion, here are twelve characteristics that delineate the differences between writing a paper using the case study research method and writing a case analysis paper:

  • Case study is a method of in-depth research and rigorous inquiry; case analysis is a reliable method of teaching and learning. A case study is a modality of research that investigates a phenomenon for the purpose of creating new knowledge, solving a problem, or testing a hypothesis using empirical evidence derived from the case being studied. Often, the results are used to generalize about a larger population or within a wider context. The writing adheres to the traditional standards of a scholarly research study. A case analysis is a pedagogical tool used to teach students how to reflect and think critically about a practical, real-life problem in an organizational setting.
  • The researcher is responsible for identifying the case to study; a case analysis is assigned by your professor. As the researcher, you choose the case study to investigate in support of obtaining new knowledge and understanding about the research problem. The case in a case analysis assignment is almost always provided, and sometimes written, by your professor and either given to every student in class to analyze individually or to a small group of students, or students select a case to analyze from a predetermined list.
  • A case study is indeterminate and boundless; a case analysis is predetermined and confined. A case study can be almost anything [see below] as long as it relates directly to examining the research problem. This relationship is the only limit to what a researcher can choose as the subject of their case study. The content of a case analysis is determined by your professor and its parameters are well-defined and limited to elucidating insights of practical value applied to practice.
  • Case study is fact-based and describes actual events or situations; case analysis can be entirely fictional or adapted from an actual situation. The entire content of a case study must be grounded in reality to be a valid subject of investigation in an empirical research study. A case analysis only needs to set the stage for critically examining a situation in practice and, therefore, can be entirely fictional or adapted, in whole or in part, from an actual situation.
  • Research using a case study method must adhere to principles of intellectual honesty and academic integrity; a case analysis scenario can include misleading or false information. A case study paper must report research objectively and factually to ensure that any findings are understood to be logically correct and trustworthy. A case analysis scenario may include misleading or false information intended to deliberately distract from the central issues of the case. The purpose is to teach students how to sort through conflicting or useless information in order to come up with the preferred solution. Any use of misleading or false information in academic research is considered unethical.
  • Case study is linked to a research problem; case analysis is linked to a practical situation or scenario. In the social sciences, the subject of an investigation is most often framed as a problem that must be researched in order to generate new knowledge leading to a solution. Case analysis narratives are grounded in real life scenarios for the purpose of examining the realities of decision-making behavior and processes within organizational settings. Case analysis assignments include a problem or set of problems to be analyzed; however, the goal is centered around the act of identifying and evaluating courses of action leading to best possible outcomes.
  • The purpose of a case study is to create new knowledge through research; the purpose of a case analysis is to teach new understanding. Case studies are a choice of methodological design intended to create new knowledge about resolving a research problem. A case analysis is a mode of teaching and learning intended to create new understanding and an awareness of uncertainty applied to practice through acts of critical thinking and reflection.
  • A case study seeks to identify the best possible solution to a research problem; case analysis can have an indeterminate set of solutions or outcomes. Your role in studying a case is to discover the most logical, evidence-based ways to address a research problem. A case analysis assignment rarely has a single correct answer because one of the goals is to force students to confront the real life dynamics of uncertainty, ambiguity, and missing or conflicting information within professional practice. Under these conditions, a perfect outcome or solution almost never exists.
  • Case study is unbounded and relies on gathering external information; case analysis is a self-contained subject of analysis. Although the scope of a case study is defined by the research problem, the researcher is free to gather whatever information and data are necessary to investigate its relevance to understanding that problem. For a case analysis assignment, your professor will often ask you to examine solutions or recommended courses of action based solely on facts and information from the case.
  • Case study can be a person, place, object, issue, event, condition, or phenomenon; a case analysis is a carefully constructed synopsis of events, situations, and behaviors. The research problem dictates the type of case being studied and, therefore, the design can encompass almost anything tangible as long as it fulfills the objective of generating new knowledge and understanding. A case analysis is in the form of a narrative containing descriptions of facts, situations, processes, rules, and behaviors within a particular setting and under a specific set of circumstances.
  • Case study can represent an open-ended subject of inquiry; a case analysis is a narrative about something that has happened in the past. A case study is not restricted by time and can encompass an event or issue with no temporal limit or end. For example, the current war in Ukraine can be used as a case study of how medical personnel help civilians during a large military conflict, even though circumstances around this event are still evolving. A case analysis can be used to elicit critical thinking about current or future situations in practice, but the case itself is a narrative about something finite that has taken place in the past.
  • Multiple case studies can be used in a research study; case analysis involves examining a single scenario. Case study research can use two or more cases to examine a problem, often for the purpose of conducting a comparative investigation intended to discover hidden relationships, document emerging trends, or determine variations among different examples. A case analysis assignment typically describes a stand-alone, self-contained situation and any comparisons among cases are conducted during in-class discussions and/or student presentations.

The Case Analysis. Fred Meijer Center for Writing and Michigan Authors. Grand Valley State University; Mills, Albert J., Gabrielle Durepos, and Eiden Wiebe, editors. Encyclopedia of Case Study Research. Thousand Oaks, CA: SAGE Publications, 2010; Ramsey, V. J. and L. D. Dodge. "Case Analysis: A Structured Approach." Exchange: The Organizational Behavior Teaching Journal 6 (November 1981): 27-29; Yin, Robert K. Case Study Research and Applications: Design and Methods. 6th edition. Thousand Oaks, CA: Sage, 2017; Crowe, Sarah et al. "The Case Study Approach." BMC Medical Research Methodology 11 (2011). doi:10.1186/1471-2288-11-100; Yin, Robert K. Case Study Research: Design and Methods. 4th edition. Thousand Oaks, CA: Sage, 1994.

  • Last Updated: Jun 3, 2024 9:44 AM
  • URL: https://libguides.usc.edu/writingguide/assignments

Systematic Reviews and Meta Analysis

  • Getting Started
  • Guides and Standards
  • Review Protocols
  • Databases and Sources
  • Randomized Controlled Trials
  • Controlled Clinical Trials
  • Observational Designs
  • Tests of Diagnostic Accuracy
  • Software and Tools
  • Where do I get all those articles?
  • Collaborations
  • EPI 233/528
  • Countway Mediated Search
  • Risk of Bias (RoB)

Systematic review Q & A

What is a systematic review?

A systematic review is a guided filtering and synthesis of all available evidence addressing a specific, focused research question, generally about a specific intervention or exposure. The use of standardized, systematic methods and pre-selected eligibility criteria reduces the risk of bias in identifying, selecting, and analyzing relevant studies. A well-designed systematic review includes clear objectives, pre-selected criteria for identifying eligible studies, an explicit methodology, a thorough and reproducible search of the literature, an assessment of the validity or risk of bias of each included study, and a systematic synthesis, analysis, and presentation of the findings of the included studies. A systematic review may include a meta-analysis.

For details about carrying out systematic reviews, see the Guides and Standards section of this guide.

Is my research topic appropriate for systematic review methods?

A systematic review is best deployed to test a specific hypothesis about a healthcare or public health intervention or exposure. By focusing on a single intervention or a few specific interventions for a particular condition, the investigator can ensure a manageable results set. Moreover, examining a single or small set of related interventions, exposures, or outcomes will simplify the assessment of studies and the synthesis of the findings.

Systematic reviews are poor tools for hypothesis generation: for instance, to determine what interventions have been used to increase the awareness and acceptability of a vaccine, or to investigate the ways that predictive analytics have been used in health care management. In the first case, we don't know what interventions to search for and so have to screen all the articles about awareness and acceptability. In the second, there is no agreed-upon set of methods that make up predictive analytics, and health care management is far too broad. The search will necessarily be incomplete, vague, and very large all at the same time. In most cases, reviews without clearly and precisely specified populations, interventions, exposures, and outcomes will produce results sets that quickly outstrip the resources of a small team and offer no consistent way to assess and synthesize findings from the studies that are identified.

If not a systematic review, then what?

You might consider performing a scoping review. This framework allows iterative searching over a reduced number of data sources and imposes no requirement to assess individual studies for risk of bias. The framework includes built-in mechanisms to adjust the analysis as the work progresses and more is learned about the topic. A scoping review won't help you limit the number of records you'll need to screen (broad questions lead to large results sets) but may give you a means of dealing with a large set of results.

This tool can help you decide what kind of review is right for your question.

Can my student complete a systematic review during her summer project?

Probably not. Systematic reviews are a lot of work. Between creating the protocol, building and running a quality search, collecting all the papers, evaluating the studies that meet the inclusion criteria, and extracting and analyzing the summary data, a well-done review can require dozens to hundreds of hours of work spanning several months. Moreover, a systematic review requires subject expertise, statistical support, and a librarian to help design and run the search. Be aware that librarians sometimes have queues for their search time; it may take several weeks to complete and run a search. In addition, all guidelines for carrying out systematic reviews recommend that at least two subject experts screen the studies identified in the search. The first round of screening alone can consume one hour per screener for every 100-200 records. A systematic review is a labor-intensive team effort.

How can I know if my topic has been reviewed already?

Before starting out on a systematic review, check to see if someone has done it already. In PubMed you can use the systematic review subset to limit to a broad group of papers that is enriched for systematic reviews. You can invoke the subset by selecting it from the Article Types filters to the left of your PubMed results, or you can append AND systematic[sb] to your search. For example:

"neoadjuvant chemotherapy" AND systematic[sb]

The systematic review subset is very noisy, however. To quickly focus on systematic reviews (knowing that you may be missing some), simply search for the word systematic in the title:

"neoadjuvant chemotherapy" AND systematic[ti]

Any PRISMA-compliant systematic review will be captured by this method since including the words "systematic review" in the title is a requirement of the PRISMA checklist. Cochrane systematic reviews do not include 'systematic' in the title, however. It's worth checking the Cochrane Database of Systematic Reviews independently.
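
The two example queries above can also be issued programmatically through NCBI's E-utilities. A minimal sketch that only builds the request URL, without sending it (the `build_esearch_url` helper is a hypothetical convenience; the esearch endpoint and its `db`/`term`/`retmode` parameters are the standard E-utilities interface):

```python
from urllib.parse import urlencode

ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

def build_esearch_url(query: str) -> str:
    """Build (but do not send) a PubMed esearch URL for the given query."""
    return ESEARCH + "?" + urlencode({"db": "pubmed", "term": query, "retmode": "json"})

# Broad, noisy subset: enriched for systematic reviews.
broad = build_esearch_url('"neoadjuvant chemotherapy" AND systematic[sb]')
# Higher-precision variant: the word "systematic" in the title.
precise = build_esearch_url('"neoadjuvant chemotherapy" AND systematic[ti]')
print(broad)
```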

You can also search for protocols, which will indicate that another group has set out on a similar project. Many investigators register their protocols in PROSPERO, a registry of review protocols. Other published protocols, as well as Cochrane Review protocols, appear in the Cochrane Methodology Register, a part of the Cochrane Library.

  • Last Updated: Sep 4, 2024 4:04 PM
  • URL: https://guides.library.harvard.edu/meta-analysis
         



  • Systematic Review
  • Open access
  • Published: 02 September 2024

A systematic review and meta-analysis: comparing the efficacy of the Ilizarov technique alone with lengthening over a nail for lower extremity bone defects

  • Fei Tan,
  • Cuixian Yang,
  • Jiankang Zeng,
  • Jiahuan Li,
  • Peijie Li,
  • Yongjie Qiao,
  • Jing Wang,
  • Jiangming Zhang,
  • Dong Xie,
  • Shuo Ye &
  • Shenghu Zhou

BMC Musculoskeletal Disorders, volume 25, article number 699 (2024)


Background

The task faced by surgeons becomes significantly more challenging when they encounter lower extremity bone defects, arising from a variety of causes, that require lengthening. The most discussed and successful approaches are the Ilizarov technique and lengthening over a nail (LON): distraction osteogenesis is also widely performed with monolateral external fixators, and intramedullary lengthening nails have increasingly been used in the last decade.

Methods

The data were collected from PubMed, Cochrane Library, Embase, and the Web of Science for all available studies comparing the outcomes of the Ilizarov technique alone and the LON technique (from January 1, 1997, to November 30, 2023). The outcomes of interest encompassed the external fixation index (EFI) (month/cm), mean duration of follow-up (MFT) (month), length gained (LG) (cm), consolidation index (CIx) (month/cm), and bone healing index (BHI) (month/cm). Complications included the pin tract infection rate (PTI), axial deviation rate (AD), occurrence of intramedullary infection (II), and delayed consolidation rate (DC), as well as data categorized into three levels of problems, obstacles, and sequelae based on the severity of complications. Two reviewers independently assessed each study for quality and extracted data. Case–control and retrospective cohort studies were evaluated using the Newcastle–Ottawa Scale (NOS) to determine their methodological rigor. The Cochrane Collaboration's risk assessment tool was employed to perform quality evaluations for randomized controlled trials.

Results

This review included thirteen studies comprising a total of 629 patients. The external fixation index (month/cm) was significantly smaller for the LON technique than for the Ilizarov technique alone [Mean Difference (MD) = -29.59, 95% CI -39.68 to -19.49, P < 0.00001]. In terms of the mean follow-up time (month) (MD = -0.92, 95% CI -3.49 to 1.65, P = 0.57), length gained (cm) (MD = -0.87, 95% CI -2.80 to 1.07, P = 0.38), consolidation index (month/cm) (MD = 0.66, 95% CI -3.44 to 4.77, P = 0.75), and bone healing index (month/cm) (MD = -3.33, 95% CI -13.07 to 6.41, P = 0.5), no significant differences were observed. The LON technique exhibited a lower incidence of axial deviation [Odds Ratio (OR) = 0.06, 95% CI 0.03-0.16, P < 0.00001] and pin tract infection (OR = 0.30, 95% CI 0.18-0.50, P < 0.00001) compared to the Ilizarov technique alone. The remaining complications, such as the intramedullary infection rate (OR = 0.93, 95% CI 0.42-2.06, P = 0.85) and delayed consolidation rate (OR = 0.61, 95% CI 0.20-1.86, P = 0.38), did not exhibit statistically significant differences. Our findings demonstrated that the LON technique results in lower incidences of problems (38.5% vs. 58.6%) and sequelae (16.6% vs. 30.9%) when compared to the Ilizarov technique alone. However, the rates of obstacles (32.4% vs. 32.3%) were comparable between the two methods.

Conclusions

Our findings indicate that patients treated with the LON technique experienced significantly shorter external fixation durations and a lower incidence of complications (e.g., pin tract infections and axial deviation) compared to those treated with the Ilizarov technique alone. Other outcome metrics showed no significant differences between the two techniques. However, the LON technique offers substantial benefits, including reduced external fixation times and increased comfort, which enhance patient compliance. In conclusion, the LON technique is a safe, reliable, and effective method for treating tibial and femoral defects.


Introduction

Segmental long bone defects present significant challenges for limb reconstruction surgeons, necessitating bone transport procedures to address bone loss. Such defects may arise from open fractures [ 1 ], the excision of necrotic bone in chronic osteomyelitis [ 2 ], the need for truncation lengthening in cases of poliomyelitis deformities [ 3 ], and idiopathic short stature. A predominant cause is large segmental bone defects resulting from open trauma. Without further intervention, these defects are unlikely to heal. Unhealed defects severely impair the patient’s quality of life, lead to psychological issues, and hinder their integration into normal society.

Professor Ilizarov, a Soviet physician, pioneered the Ilizarov Technique in 1950 [ 4 , 5 , 6 ]. This method, founded on the principles of distraction osteogenesis, aims to lengthen limbs and correct discrepancies. It employs external fixators to enhance local vascular distribution while maintaining the limb’s weight-bearing capacity during bone transportation. The technique boasts a bone healing rate exceeding 90% and is extensively documented and utilized for bone transport and limb lengthening [ 7 , 8 ]. Nevertheless, the prolonged use of external fixators can cause significant discomfort, restricting patients’ ability to engage in early functional exercises and potentially leading to joint contractures [ 9 , 10 ]. This discomfort can psychologically affect patients, increasing frustration and reducing compliance. Moreover, the extended use of fine Kirschner wires that penetrate the limb and bone tissue is linked to a higher risk of pin tract infections [ 11 ].

Building upon the Ilizarov technique, various bone transfer methods have been developed to address bone defects in the lower extremities. These include the Lengthening Over Nailing (LON) technique [ 12 ], Lengthening and Then Nail (LATN) [ 13 ], and Lengthening and Then Plate (LATP) [ 14 ]. A study by Paley et al. [ 15 ] demonstrated the efficacy of the LON technique in treating lower extremity bone defects. This method reduces the duration of external fixator usage, offering a more convenient approach. Subsequent clinical applications have shown that employing the LON technique for bone defects caused by fractures or infections yields satisfactory results, notably decreasing the overall time required for external fixation, effectively repairing substantial bone defects, and minimizing related complications [ 16 , 17 , 18 ]. However, challenges persist with the LON technique, such as potential risks of intramedullary infection, intramedullary nail breakage, and locking screw failures [ 19 , 20 ].

Choosing the optimal treatment for lower-extremity bone defects continues to be a significant challenge for clinicians. While previous studies have compared the Ilizarov technique alone to the LATN technique, to date, no comprehensive meta-analysis or systematic review has specifically evaluated the efficacy and outcomes of the LON technique against the Ilizarov technique alone. The aim of this study is to conduct a comparative analysis of the efficacy and outcomes of the LON technique relative to the Ilizarov technique alone in treating lower extremity bone defects.

The PROSPERO registration number is CRD42023482000.

Search strategy

The present article was conducted in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. A comprehensive search was conducted across English-language databases, including PubMed, Web of Science, Cochrane Library, and Embase, using the Medical Subject Headings (MeSH) terms “Ilizarov Technique” and “Fracture Fixation, Intramedullary” to identify relevant literature. The search included data up to November 30, 2023, with no restriction on the language of publication. In addition, a comprehensive manual search was conducted to retrieve additional eligible studies from paper-based sources, and all gray literature, as well as references cited in included articles, were carefully screened.

Selection criteria

The criteria for inclusion were established as follows:

(1) The LON technique was used in the intervention group and only the Ilizarov technique was used in the control group; (2) the study was a randomized controlled trial (RCT), a case–control study (CCS), a retrospective clinical study (RCS), or a prospective clinical study; (3) at least one of the following outcome measures was reported, fully or partially: external fixation index (EFI), mean follow-up time (MFT), length gained (LG), consolidation index (CIx), bone healing index (BHI), and complications such as axial deviation (AD), pin tract infection (PTI), intramedullary infection (II), delayed consolidation (DC), and problems, obstacles, and sequelae data; and (4) the etiology of the bone defects could be open fractures, defects due to chronic osteomyelitis, limb length discrepancies, or idiopathic short stature osteotomy defects, provided any chronic osteomyelitis was completely controlled. All bone defects involved a lengthening of ≤ 8 cm.

The exclusion criteria were as follows:

(1) Non-peer-reviewed publications were excluded from consideration. (2) Certain study designs, such as non-human trials, observational studies, case reports, case series, review articles, and letters to the editor, were not included in this analysis. (3) Reviews, letters, and recommendations were also excluded from the scope of this study. (4) In cases where the full text could not be obtained or the original data were incomplete, those sources were not considered.

Data extraction

Two academics conducted independent literature reviews, with any disputes resolved by a third reviewer. The following data were extracted from the included studies: (1) first author names, study period, author country, study type, number of cases, and patients' age; (2) external fixation index (EFI), mean follow-up time (MFT), length gained (LG), consolidation index (CIx), bone healing index (BHI), as well as complications such as axial deviation (AD), pin tract infection (PTI), intramedullary infection (II), delayed consolidation (DC), and problems, obstacles, or sequelae. Secondly, we gathered a range of secondary outcome measures, including the etiology of long bone defects, sex, age, duration of surgery, amount of blood loss, average time to achieve full weight-bearing status, mean size of bone defects, excellent and good functional outcome assessment results, and evaluations for knee and ankle dorsiflexion contractures (Supplementary Material 1). The EFI is the ratio of the duration of external fixation (in months) to the total length of bone transported (in centimeters). Furthermore, in cases where the data were incomplete or ambiguous in the studies reviewed, efforts were made to contact the respective investigators for clarification.
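
As a worked illustration of the EFI ratio just described (the numbers are invented for illustration, not drawn from any included study):

```python
def external_fixation_index(fixation_months: float, length_gained_cm: float) -> float:
    """EFI = duration of external fixation (months) / length gained (cm)."""
    if length_gained_cm <= 0:
        raise ValueError("length gained must be positive")
    return fixation_months / length_gained_cm

# Illustrative only: 9 months in the frame to gain 6 cm of length.
print(external_fixation_index(9.0, 6.0))  # 1.5 month/cm
```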

Study quality assessment

The methodological quality of each included study was assessed by two independent reviewers using the Newcastle–Ottawa Scale (NOS) [ 21 ], a tool commonly employed for evaluating the quality of cohort and case–control studies. Within this assessment, one of the domains considered was case definition, which encompassed selecting study cohorts, comparing cohorts, and determining outcomes. A NOS score of ≥ 6 indicates high quality, with a maximum total score of 9. For randomized controlled trials, the Cochrane Collaboration’s risk assessment tool was utilized to conduct quality assessments.

Statistical analysis

The statistical data were calculated using Review Manager software (Cochrane Collaboration, Copenhagen, Denmark) and Stata 17 software (Stata Corp., College Station, TX, USA). The weighted mean difference (WMD) and 95% confidence interval (CI) were utilized for continuous variables (EFI, MFT, LG, CIx, and BHI), while odds ratios (OR) and 95% confidence intervals were employed for categorical variables (pin tract infection, axial deviation, intramedullary infection, and delayed consolidation). Weighted averages were employed to calculate aggregated data, which were subsequently presented as either numerical values or proportions based on the sample size of each individual study. We conducted meta-analyses for relevant outcomes in each study using inverse-variance statistics to combine effect sizes, applying logarithmic transformations when necessary. The combined effect sizes were reported with their respective 95% confidence intervals (CIs). Heterogeneity, the proportion of variation between studies, was quantified using the I² statistic. I² values below 50%, between 50 and 75%, and above 75% were categorized as low, moderate, and high heterogeneity, respectively. A P-value ≥ 0.05 and an I² value < 50% indicated no statistical heterogeneity among the studies. A random-effects model was employed in cases of moderate or high heterogeneity; otherwise, a fixed-effects model was utilized. Sensitivity analysis was conducted to assess the robustness of the findings where necessary. Forest plots were employed to illustrate the results of each study and the pooled estimates, while funnel plots were used to assess publication bias. Because funnel plots provide only a visual, and potentially biased, indication of publication bias, we also performed the Begg test as a supplementary assessment.
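
The fixed-effect inverse-variance pooling, Cochran's Q, I², and random-effects steps described above can be sketched as follows. The effect sizes are invented for illustration, and `pooled_inverse_variance` is a hypothetical helper implementing the standard DerSimonian-Laird formulas, not the RevMan/Stata implementation used by the authors:

```python
import math

def pooled_inverse_variance(effects, ses):
    """Fixed-effect inverse-variance pooling with Cochran's Q and I^2,
    plus a DerSimonian-Laird random-effects estimate with a Wald 95% CI."""
    w = [1.0 / s ** 2 for s in ses]                     # inverse-variance weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))  # Cochran's Q
    k = len(effects)
    i2 = max(0.0, (q - (k - 1)) / q) * 100.0 if q > 0 else 0.0
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0  # between-study variance
    wr = [1.0 / (s ** 2 + tau2) for s in ses]             # random-effects weights
    re_est = sum(wi * yi for wi, yi in zip(wr, effects)) / sum(wr)
    se_re = math.sqrt(1.0 / sum(wr))
    return fixed, re_est, (re_est - 1.96 * se_re, re_est + 1.96 * se_re), i2

# Illustrative mean differences (month/cm) and standard errors; NOT the study's data.
md = [-25.0, -32.0, -30.0]
se = [4.0, 5.0, 6.0]
fixed, pooled, ci, i2 = pooled_inverse_variance(md, se)
print(round(fixed, 2), round(i2, 1))  # -28.22 0.0
```

With no excess heterogeneity (Q below its degrees of freedom), tau² is truncated to zero and the random-effects estimate collapses to the fixed-effect one, matching the rule in the text that a fixed-effects model suffices at low I².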

Search results

The initial search yielded a total of 784 articles, with 248 retrieved from PubMed, 258 from Web of Science, 9 from Cochrane Library, and 266 from Embase. Furthermore, the manual search identified three studies that met the inclusion criteria. After removing duplicates, a total of 563 studies remained. Among these, 500 were excluded based on abstract and title screening. Subsequently, we retrieved the remaining 63 full-text papers for a more comprehensive analysis. Of these, 50 papers were excluded for various reasons: outcomes that did not meet the inclusion criteria (n = 33), inability to calculate results from the available data (n = 4), and missing data on key outcome indicators (n = 13). The meta-analysis finally included thirteen articles (Fig. 1).
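
The screening counts above can be checked arithmetically. Note that the four per-database totals sum to 781, so the stated total of 784 evidently includes the three manually identified studies:

```python
# Per-source record counts as reported in the text.
db_hits = 248 + 258 + 9 + 266      # PubMed, Web of Science, Cochrane Library, Embase
total = db_hits + 3                # plus three studies found by hand-searching
after_dedup = 563                  # records remaining after duplicate removal
full_text = after_dedup - 500      # 500 excluded on title/abstract screening
excluded_full_text = 33 + 4 + 13   # reasons given for full-text exclusion
included = full_text - excluded_full_text
print(total, full_text, included)  # 784 63 13
```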

Figure 1. Literature search procedure delineated in the form of a flow chart

Figure 1  shows the flowchart of the study selection process.

Demographic characteristics

Thirteen studies were conducted across eight countries: the United States (one study), Japan (one study), Turkey (two studies), the United Kingdom (one study), Russia (one study), China (three studies), South Korea (three studies), and Egypt (one study). In limb lengthening surgery, a total of 308 cases using the Ilizarov technique alone and 321 cases using the combined technique were identified. In the 13 included studies, follow-up ranged from 15.3 to 70 months, and the mean age of the patients was 30 years. The indications for limb lengthening were categorized into four main types: open fractures, defects due to chronic osteomyelitis, limb length discrepancies, and idiopathic short-stature osteotomy defects. Five studies, conducted by Paley [ 15 ], Burghardt et al. [ 22 ], specialized in limb lengthening for post-traumatic bone defects. Two studies, conducted by Eralp [ 23 ], Sen et al. [ 24 ], specialized in limb lengthening for bone defects caused by chronic osteomyelitis. Three studies, conducted by Song [ 25 ], Watanabe et al. [ 20 ], specialized in limb lengthening for lower limb inequality. Three studies, conducted by Park and Sun, specialized in limb lengthening for idiopathic short stature. For more information on the characteristics of the included studies, see Supplementary Table 1.

Quality assessment results

All studies [ 15 , 20 , 22 , 23 , 24 , 25 , 26 , 27 , 28 , 29 , 30 , 31 , 32 ], including 11 RCSs and 2 RCTs, were of high quality, with NOS scores ≥ 6, and involved 321 patients treated with the LON technique and 308 with the Ilizarov technique. The quality assessment and basic characteristics of the selected trials are shown in Tables 1 and 2. The pooled results comparing the LON technique with the Ilizarov technique are shown in Table 3. Because publication bias for some outcome indicators could not be judged directly from the funnel plots (Fig. 4 A, B, C, D, E), we performed the Begg test, which showed no publication bias for any outcome indicator.

Outcomes analysis

External Fixation Index (EFI)

The EFI was recorded in twelve studies [ 15 , 20 , 22 , 23 , 24 , 25 , 26 , 27 , 28 , 29 , 30 , 31 ]. Data were analyzed using a random-effects model; the forest plot showed high heterogeneity (I² = 96%) and a significant difference between the two techniques (MD = -29.59, [95% CI -39.68 to -19.49]; P < 0.00001). A meta-analysis of these twelve studies demonstrated that the LON technique exhibited a lower EFI than the Ilizarov technique alone (Fig. 2 A). This finding holds clinical significance, as it indicates that the duration of external fixation was shorter with the LON technique than with the Ilizarov technique. A visual assessment of the funnel plot indicated the presence of slight publication bias (Fig. 4 A); however, Begg's test was not statistically significant (P = 0.732).

figure 2

Comparison of external fixation index results ( A ), mean follow-up time results ( B ), length gained results ( C ), consolidation index results ( D ), and bone healing index results ( E ) between the LON and Ilizarov techniques

Mean Follow-up Time (MFT)

Twelve studies [ 15 , 20 , 22 , 23 , 24 , 25 , 27 , 28 , 29 , 30 , 31 , 32 ] involving 479 patients (233 in the LON group and 246 in the Ilizarov group) reported the MFT. The pooled results demonstrated no significant difference in MFT between the two treatment techniques (MD = -0.92, [95% CI -3.49 to 1.65]; I² = 61%; P = 0.48), with moderate heterogeneity (Fig. 2 B). A visual assessment of the funnel plot indicated the presence of slight publication bias (Fig. 4 B); however, Begg's test was not statistically significant (P = 0.945).

Length Gained (LG)

We performed a meta-analysis comparing LG between the LON and Ilizarov techniques, including ten studies [ 15 , 20 , 22 , 23 , 25 , 26 , 27 , 28 , 29 , 30 ]. The random-effects model analysis showed no significant difference between the two techniques (MD = -0.87, [95% CI -2.80 to 1.07]; I² = 97%; P = 0.38) (Fig. 2 C). A visual assessment of the funnel plot indicated the presence of slight publication bias (Fig. 4 C); however, Begg's test was not statistically significant (P = 1.00).

Consolidation Index (CIx)

Six studies [ 15 , 20 , 22 , 23 , 27 , 28 ] reported CIx, covering 233 patients: 126 in the LON group and 107 in the Ilizarov group. The pooled analysis showed no significant difference in CIx between the two techniques (MD = 0.66, [95% CI -3.44 to 4.77]; I² = 66%; P = 0.75) (Fig. 2 D).

Bone Healing Index (BHI)

Three studies [ 25 , 29 , 30 ] involving 110 patients (57 in the LON group and 53 in the Ilizarov group) reported the BHI. The pooled results demonstrated no significant difference in BHI between the two treatment techniques (MD = -3.33, [95% CI -13.07 to 6.41]; I² = 12%; P = 0.50), with low heterogeneity (Fig. 2 E).

Adverse events

Thirteen studies [ 15 , 20 , 22 , 23 , 24 , 25 , 26 , 27 , 28 , 29 , 30 , 31 , 32 ] reported adverse events. Table 4 summarizes the incidence of axial deviation (AD), pin tract infection (PTI), intramedullary infection (II), and delayed consolidation (DC).

Axial Deviation (AD)

Nine studies [ 15 , 20 , 22 , 24 , 25 , 27 , 28 , 29 , 30 ] compared the rate of AD between the LON and Ilizarov techniques. Meta-analysis of these nine studies showed that the rate of AD was significantly higher with the Ilizarov technique than with the LON technique (56/158 vs. 4/168; OR 0.06, [95% CI 0.03-0.16]; I² = 0%; P < 0.00001) (Fig. 3 A).
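
The pooled OR reported above comes from per-study weighting in the meta-analysis software; as a rough sanity check, a crude OR can be computed from the combined counts (4/168 LON vs. 56/158 Ilizarov). Collapsing counts across studies only approximates the pooled estimate and will not match it exactly; `crude_odds_ratio` is an illustrative helper:

```python
import math

def crude_odds_ratio(e1, n1, e2, n2, z=1.96):
    """Crude OR (group 1 vs. group 2) from pooled event counts, with a
    Wald 95% CI on the log scale. Approximates, but does not equal,
    a study-weighted (e.g., Mantel-Haenszel) pooled OR."""
    a, b = e1, n1 - e1   # group 1: events / non-events
    c, d = e2, n2 - e2   # group 2: events / non-events
    or_ = (a * d) / (b * c)
    half = z * math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return or_, (math.exp(math.log(or_) - half), math.exp(math.log(or_) + half))

# Axial deviation counts as pooled in the text: 4/168 (LON) vs. 56/158 (Ilizarov alone).
or_, ci = crude_odds_ratio(4, 168, 56, 158)
print(round(or_, 3))  # 0.044
```

The crude estimate points the same way as the reported pooled OR: the interval lies entirely below 1, favoring fewer axial deviations with LON.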

figure 3

Comparison of axial deviation results ( A ), pin tract infection results ( B ), intramedullary infection results ( C ), and delayed consolidation results ( D ) between the LON and Ilizarov techniques

Pin Tract Infection (PTI)

A total of eleven studies [ 15 , 20 , 24 , 25 , 26 , 27 , 28 , 29 , 30 , 31 , 32 ] provided data comparing the PTI rate between the LON and Ilizarov techniques. The meta-analysis revealed that the Ilizarov technique carried a dramatically higher risk of PTI than the LON technique (86/203 vs. 56/214; OR 0.30, [95% CI 0.18-0.50]; I² = 47%; P < 0.00001) (Fig. 3 B). A visual assessment of the funnel plot indicated the presence of slight publication bias (Fig. 4 D); however, Begg's test was not statistically significant (P = 0.119).

Intramedullary Infection (II)

Five studies [ 15 , 22 , 23 , 24 , 26 , 27 ] provided data comparing the intramedullary infection rate between the LON and Ilizarov techniques. The meta-analysis demonstrated no statistically significant difference between the two (12/108 vs. 13/114; OR 0.93, [95% CI 0.42-2.06]; I² = 43%; P = 0.85) (Fig. 3 C).

Delayed Consolidation (DC)

Twelve studies [ 15 , 20 , 22 , 23 , 24 , 25 , 26 , 27 , 28 , 29 , 30 , 31 ] reported delayed consolidation in LON and Ilizarov technique patients. The rate of delayed consolidation after surgery showed no statistical difference between the two groups (24/211 vs. 36/217; OR 0.61, [95% CI 0.20-1.86]; I² = 56%; P = 0.38) (Fig. 3 D). A visual assessment of the funnel plot indicated the presence of slight publication bias (Fig. 4 E); however, Begg's test was not statistically significant (P = 0.837).

figure 4

Funnel plots show the results of the external fixation index ( A ); mean follow-up time ( B ); length gained ( C ); pin tract infection ( D ); and delayed consolidation ( E )

Subgroup analysis

When total heterogeneity is high but within-group heterogeneity is low, the grouping factor is a significant source of heterogeneity. First, we categorized the etiology of the bone defects into defects due to open fractures, defects due to chronic osteomyelitis, and defects due to limb length discrepancy or idiopathic short stature osteotomies. To exclude these potential confounding factors, we performed subgroup analyses of EFI, LG, MFT, and DC. The results showed that the different etiologies were not a significant source of heterogeneity (Table 5). Second, we conducted subgroup analyses of EFI, LG, MFT, and DC by region; publication region was likewise not a significant source of heterogeneity (Table 5). Finally, heterogeneity may stem from other sources, such as the predominantly retrospective design of the included studies and the characteristics of the individual studies themselves.

Sensitivity analysis

Sensitivity analyses were used to test the stability of the combined results. As shown, sensitivity results were stable for the external fixation index (Fig. 5 A), mean follow-up time (Fig. 5 B), length gained (Fig. 5 C), and consolidation index (Fig. 5 D), axial deviation (Fig. 6 A), pin tract infection (Fig. 6 B), intramedullary infection (Fig. 6 C), and delayed consolidation (Fig. 6 D).

figure 5

Sensitivity analysis diagram of the external fixation index ( A ), mean follow-up time ( B ), length gained ( C ), and consolidation index ( D ) in high heterogeneity outcomes

figure 6

Sensitivity analysis diagram of the axial deviation ( A ), pin tract infection ( B ), intramedullary infection ( C ), and delayed consolidation ( D ) in high heterogeneity outcomes

However, by removing studies one at a time in RevMan, heterogeneity decreased to 25% for LG after removing Park (2008) [ 29 ] and to 21% for DC after removing Sun (2011) [ 30 ]; these two studies were the source of heterogeneity for their respective outcome indicators. The remaining outcome indicators, such as EFI, MFT, and CIx, had high heterogeneity, but their results were stable on subgroup and sensitivity analysis.

Discussion

Previous meta-analyses have highlighted the advantages of the LATN technique over the Ilizarov method alone. Sheridan et al. [ 33 ] utilized Kaplan–Meier survival curves to demonstrate that the mean time to external fixation removal in the Ilizarov-only group was 32.6 weeks (σ = 8.43, 95% CI 24.7-40.3), double that of the LATN group at 16.3 weeks (σ = 8.02, 95% CI 8.9-23.7), a statistically significant difference (P = 0.0015). Additionally, Xu (2017) et al. [ 34 ] reviewed four retrospective cohort studies, revealing that external fixation duration was significantly reduced in the LATN group compared to the Ilizarov group, along with enhanced postoperative bone healing and restoration of lower limb function. Thus, our study furthers this comparison by analyzing the modified LON technique against the Ilizarov method to provide objective evidence for treating lower extremity bone defects.

Of the 13 studies [ 15 , 20 , 22 , 23 , 24 , 25 , 26 , 27 , 28 , 29 , 30 , 31 , 32 ] reviewed, all focused on tibial or femoral defects. Our systematic evaluation found that the LON technique generally results in shorter external fixation times, a lower incidence of axial deviation, and fewer pin tract infections compared to the Ilizarov method alone. There were no significant differences between the two groups concerning lengthening length, mean follow-up time, consolidation time, and bone healing, aligning with the findings of 12 included studies [ 15 , 20 , 22 , 23 , 24 , 25 , 26 , 28 , 29 , 30 , 31 , 32 ]. Exceptionally, EL-Husseini et al. [ 27 ] reported a significantly longer mean consolidation time in the LON group.

The LON technique, specifically modified for tibial and femoral lengthening, is designed to shorten the duration of external fixation, promote early weight-bearing, and prevent breakage of the newly formed bone [ 10 , 35 ], thus enabling quicker resumption of daily activities. The primary difference between the two methods is that while the Ilizarov technique requires external fixation until consolidation is complete, the LON method requires it only during the distraction phase, thereby reducing the overall duration of fixation [ 36 ]. Throughout the bone lengthening literature, the external fixation index has been a reliable measure of healing time. In a 2023 study by Xu [ 31 ], the mean EFI for the LON technique was 0.58 ± 0.07 months/cm, significantly lower than the 0.73 ± 0.15 months/cm for the Ilizarov technique. Oh CW et al. [ 8 ] also reported shorter average external fixation times with the LON technique for bone transport in tibial or femoral defects, noting that the low EFI reflected immediate frame removal after the distraction phase, and concluding that the technique facilitates a swift return to daily activities with minimal complications. Similarly, our study demonstrated a much longer mean external fixation time with the Ilizarov technique.

Across the reviewed literature, external fixation durations were consistently shorter with the LON technique than with the Ilizarov method alone. To identify the factors driving this disparity, we investigated potential reasons for the reduced external fixation time associated with the LON technique. A plausible explanation is that reaming of the bone generates osteoinductive debris and restores, or even enhances, periosteal blood flow within days. This increased blood flow likely promotes periosteal bone formation, thereby facilitating and accelerating bone healing [37, 38]. Moreover, the insertion of intramedullary nails appears to stimulate the production of small bone marrow proteins, such as VEGFA-10 [39], which aid in the development of microvascular structures; mRNA corresponding to this protein has been detected in the marrow cavity. The combination of the Ilizarov method with intramedullary nailing has been shown to promote bone healing effectively. Rozbruch et al. [40] demonstrated that reaming of the medullary cavity positively affects distraction osteogenesis, although there is also evidence suggesting that medullary reaming can impair it.

In Guo’s 2012 study [28], excessive lengthening was found to cause malformation of the regenerating bone. The literature indicates that higher percentages of bone lengthening correlate with an increased risk of complications [41]. Kocaoglu et al. [42] established a threshold of 6 cm for total limb lengthening, beyond which the likelihood of complications increased. For defects exceeding 10 cm, trifocal bone transport significantly reduced external fixator time and related complications [43]. In the studies included here, the extent of lengthening was maintained within 8 cm, and a forest plot comparison revealed no significant difference in lengthening between the two groups, discounting length as a confounding factor and strengthening the validity of the results. The LON technique maintains axial alignment and preserves the normal mechanical axis of the lower extremity bones. It has been documented that premature removal of the frame can result in secondary axial deviation. In this study, the incidence of axial deviation was markedly lower in the LON group than in the Ilizarov group, suggesting that early intramedullary nail fixation may prevent axial deviation.

Guo’s 2012 study [28] revealed a significant difference in the rate of pin tract infections between the Ilizarov and LON techniques (47.8% vs. 15.7%), with the Ilizarov group at higher risk owing to prolonged use of external fixation, a finding consistent with our results (OR = 0.32, 95% CI 0.12–0.85, P = 0.02). The LON technique has been associated with a notable risk of intramedullary infection, with reported incidence rates between 3% and 15%. To minimize this complication, it is advisable to prevent contact between the wires and the intramedullary nail [15, 44]. Despite these precautions, deep infections can still occur [45]. In our study, the rate of intramedullary infection was comparable between the LON and Ilizarov techniques, with isolated infections observed in both. We hypothesized that these were due not to contact between wire and nail but to local spread of infection at the osteotomy site. Other studies have suggested that patients undergoing the LATN technique are more prone to infection at the time of intramedullary nail implantation [40], as prolonged external fixator wear may lead to pin tract infections, thereby increasing the risk of deep infection upon nail insertion [46, 47]. Therefore, perioperative adverse events may be reduced with proper preoperative management of LATN procedures: during the perioperative period, pin tract infections should be aggressively managed to prevent deep infections after LATN [48, 49]. However, Sheridan et al. [33] reported that neither the LATN nor the LATP technique resulted in higher rates of deep infection.
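For readers unfamiliar with how such an odds ratio and its confidence interval are obtained, the sketch below computes them from a single 2×2 table using the standard log-OR standard error; the counts are hypothetical, not the pooled data of this review:

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and 95% CI from a 2x2 table:
    a/b = events/non-events in group 1, c/d = events/non-events in group 2."""
    or_ = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # SE on the log scale
    lo = math.exp(math.log(or_) - z * se_log_or)
    hi = math.exp(math.log(or_) + z * se_log_or)
    return or_, lo, hi

# Hypothetical single study: 8/50 pin tract infections with LON vs. 20/50 with Ilizarov.
or_, lo, hi = odds_ratio_ci(8, 42, 20, 30)
print(f"OR = {or_:.2f} (95% CI {lo:.2f} to {hi:.2f})")
```

An OR below 1 with a confidence interval excluding 1, as in the pooled result above, indicates significantly fewer infections in the first group.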

This study also categorized the complications encountered with both techniques using the Paley scoring system. Among the 13 studies reviewed [15, 20, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32], Yang (2018) [32] and Xu (2023) [31] reported no problems, obstacles, or sequelae (Table 6). Our findings showed that the LON technique results in lower incidences of problems (38.5% vs. 58.6%) and sequelae (16.6% vs. 30.9%) than the Ilizarov technique alone, whereas the rates of obstacles (32.4% vs. 32.3%) were comparable (Table 6). A random-effects meta-analysis of “problems” by Sheridan et al. [33] confirmed a significantly higher relative risk with the Ilizarov technique (RR = 1.66, 95% CI 1.40–1.97, P < 0.001). Similarly, the relative risk of “sequelae” was significantly higher with the Ilizarov technique (RR = 1.79, 95% CI 1.28–2.49, P = 0.001). For “obstacles,” however, there was no significant difference between the techniques (RR = 0.97, 95% CI 0.85–1.10, P = 0.621), in line with our findings.
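The random-effects pooling used in such analyses can be illustrated with a DerSimonian-Laird sketch; the log relative risks and standard errors below are hypothetical stand-ins, not the data of the cited meta-analysis:

```python
import math

def dersimonian_laird(effects, ses):
    """Pool per-study effects (e.g. log relative risks) under a
    DerSimonian-Laird random-effects model."""
    w = [1 / se ** 2 for se in ses]                 # inverse-variance weights
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))  # Cochran's Q
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(effects) - 1)) / c)   # between-study variance
    w_star = [1 / (se ** 2 + tau2) for se in ses]   # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    return pooled, math.sqrt(1 / sum(w_star))

# Hypothetical log-RRs of "problems" from three studies.
log_rr, se = dersimonian_laird([0.41, 0.62, 0.35], [0.15, 0.20, 0.25])
print(f"pooled RR = {math.exp(log_rr):.2f}")
```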

Optimal bone healing and restoration of lower limb function are paramount in treating bone defects. Deniz G. et al. [50] achieved complete bone restoration (100%) and high functional recovery (90%) in limb lengthening using the LON technique. The time to resumption of previous activities was comparable between techniques, with no significant differences in physical activity limitations at the final follow-up, corroborating the bone healing outcomes observed in our study.

Several limitations must be considered when interpreting the results of this study. First, we were unable to obtain detailed information on potential confounders, such as the number of previous surgeries, the type of antibiotics used, or other unmeasured factors (e.g., the type of intramedullary nail). Second, some studies from more than a decade ago may have employed surgical concepts different from those currently in use; consequently, we could not analyze these risk factors or outcomes under uniform criteria. Moreover, most of the included studies were retrospective cohort studies, with only two randomized controlled trials. The small number of comparable studies resulted in insufficient data and variability in patient demographics, techniques, and reported outcomes. Additional prospective studies with tighter control of confounding factors are therefore essential to assess the clinical efficacy of the LON technique more accurately. These limitations are inherent to all studies of this design and could be mitigated through prospective data collection.

Availability of data and materials

All data generated or analyzed during this study are included in the Additional File.

Abbreviations

  • LON: Lengthening over a nail
  • LATN: Lengthening and then nailing
  • LATP: Lengthening and then plating
  • EFI: External fixator index
  • Mean follow-up time
  • Length gained
  • Consolidation index
  • Bone healing index
  • Pin tract infection
  • Intramedullary infection
  • Axial deviation
  • WMD: Weighted mean difference
  • CI: Confidence interval
  • NOS: Newcastle–Ottawa Scale
  • Case–control studies
  • Retrospective cohort studies
  • Prospective cohort studies
  • MD: Mean difference
  • RR: Relative risk

Masquelet AC, Kishi T, Benko PE. Very long-term results of post-traumatic bone defect reconstruction by the induced membrane technique. Orthop Traumatol Surg Res. 2019;105:159–66.


Antonini A, Rossello C, Salomone C, Iacoviello P, Chiarlone F, Burastero G. Bone Defect Management with Vascularized Fibular Grafts in the Treatment of Grade III-IV Osteomyelitis. Handchir Mikrochir Plast Chir. 2019;51:444–52.

Abulaiti A, Yilihamu Y, Yasheng T, Alike Y, Yusufu A. The psychological impact of external fixation using the Ilizarov or Orthofix LRS method to treat tibial osteomyelitis with a bone defect. Injury. 2017;48:2842–6.

Aktuglu K, Erol K, Vahabi A. Ilizarov bone transport and treatment of critical-sized tibial bone defects: a narrative review. J Orthop Traumatol. 2019;20:22.


Barker KL, Lamb SE, Simpson AH. Functional recovery in patients with nonunion treated with the Ilizarov technique. J Bone Joint Surg Br. 2004;86:81–5.


Codivilla A. On the means of lengthening, in the lower limbs, the muscles and tissues which are shortened through deformity. Clin Orthop Relat Res. 1904;1994:4–9.


Nauth A, Schemitsch E, Norris B, Nollin Z, Watson JT. Critical-Size Bone Defects: Is There a Consensus for Diagnosis and Treatment? J Orthop Trauma. 2018;32(Suppl 1):S7-s11.

Oh CW, Apivatthakakul T, Oh JK, Kim JW, Lee HJ, Kyung HS, et al. Bone transport with an external fixator and a locking plate for segmental tibial defects. Bone Joint J. 2013;95-B(12):1667–72.

Lan X, Zhang L, Tang P, Xia H, Li G. S-osteotomy with lengthening and then nailing compared with traditional Ilizarov method. Int Orthop. 2013;37:1995–2000.

Paley D. Problems, obstacles, and complications of limb lengthening by the Ilizarov technique. Clin Orthop Relat Res. 1990;250:81–104.


Danziger MB, Kumar A, DeWeese J. Fractures after femoral lengthening using the Ilizarov method. J Pediatr Orthop. 1995;15:220–3.

Popkov D, Journeau P, Popkov A, Haumont T, Lascombes P. Ollier’s disease limb lenghtening: should intramedullary nailing be combined with circular external fixation? Orthop Traumatol Surg Res. 2010;96:348–53.

Tsuchiya H, Morsy AF, Matsubara H, Watanabe K, Abdel-Wanis ME, Tomita K. Treatment of benign bone tumours using external fixation. J Bone Joint Surg Br. 2007;89:1077–83.


Kim H, Lee SK, Kim KJ, Ahn JH, Choy WS, Kim YI, et al. Tibial lengthening using a reamed type intramedullary nail and an Ilizarov external fixator. Int Orthop. 2009;33:835–41.

Paley D, Herzenberg JE, Paremain G, Bhave A. Femoral lengthening over an intramedullary nail. A matched-case comparison with Ilizarov femoral lengthening. J Bone Joint Surg Am. 1997;79:1464–80.

Catagni MA, Azzam W, Guerreschi F, Lovisetti L, Poli P, Khan MS, et al. Trifocal versus bifocal bone transport in treatment of long segmental tibial bone defects. Bone Joint J. 2019;101-b(2):162–9.

Hutson JJ Jr. Salvage of pilon fracture nonunion and infection with circular tensioned wire fixation. Foot Ankle Clin. 2008;13(29–68):vi.


Li Y, Shen S, Xiao Q, Wang G, Yang H, Zhao H, et al. Efficacy comparison of double-level and single-level bone transport with Orthofix fixator for treatment of tibia fracture with massive bone defects. Int Orthop. 2020;44:957–63.

Kocaoglu M, Eralp L, Bilen FE, Balci HI. Fixator-assisted acute femoral deformity correction and consecutive lengthening over an intramedullary nail. J Bone Joint Surg Am. 2009;91:152–9.

Watanabe K, Tsuchiya H, Sakurakichi K, Yamamoto N, Kabata T, Tomita K. Tibial lengthening over an intramedullary nail. J Orthop Sci. 2005;10:480–5.

Stang A. Critical evaluation of the Newcastle-Ottawa scale for the assessment of the quality of nonrandomized studies in meta-analyses. Eur J Epidemiol. 2010;25:603–5.

Burghardt RD, Manzotti A, Bhave A, Paley D, Herzenberg JE. Tibial lengthening over intramedullary nails: A matched case comparison with Ilizarov tibial lengthening. Bone Joint Res. 2016;5:1–10.


Eralp L, Kocaoğlu M, Polat G, Baş A, Dirican A, Azam ME. A comparison of external fixation alone or combined with intramedullary nailing in the treatment of segmental tibial defects. Acta Orthop Belg. 2012;78:652–9.

Sen C, Demirel M, Sağlam Y, Balcı HI, Eralp L, Kocaoğlu M. Acute shortening versus bone transport for the treatment of infected femur non-unions with bone defects. Injury. 2019;50:2075–83.

Song HR, Myrboh V, Oh CW, Lee ST, Lee SH. Tibial lengthening and concomitant foot deformity correction in 14 patients with permanent deformity after poliomyelitis. Acta Orthop. 2005;76:261–9.

Borzunov DY, Kolchin SN. Nonunion of the femoral shaft associated with limb shortening treated with a combined technique of external fixation over an intramedullary nail versus the Ilizarov method. Arch Orthop Trauma Surg. 2022;142:2185–92.

El-Husseini TF, Ghaly NA, Mahran MA, Al Kersh MA, Emara KM. Comparison between lengthening over nail and conventional Ilizarov lengthening: a prospective randomized clinical study. Strategies Trauma Limb Reconstr. 2013;8:97–101.

Guo Q, Zhang T, Zheng Y, Feng S, Ma X, Zhao F. Tibial lengthening over an intramedullary nail in patients with short stature or leg-length discrepancy: a comparative study. Int Orthop. 2012;36:179–84.

Park HW, Yang KH, Lee KS, Joo SY, Kwak YH, Kim HW. Tibial lengthening over an intramedullary nail with use of the Ilizarov external fixator for idiopathic short stature. J Bone Joint Surg Am. 2008;90:1970–8.

Sun XT, Easwar TR, Manesh S, Ryu JH, Song SH, Kim SJ, et al. Complications and outcome of tibial lengthening using the Ilizarov method with or without a supplementary intramedullary nail: a case-matched comparative study. J Bone Joint Surg Br. 2011;93:782–7.

Xu Y, Ma T, Ren C, Li M, Lu Y, Sun L, et al. Treatment of tibial large bone defects: A comparative study of bone transport over an intramedullary nail in combination with antibiotic-impregnated calcium sulphate versus bone transport alone with antibiotic-impregnated calcium sulphate. Injury. 2023;54(Suppl 2):S78-s85.

Yang H, Zhang Y, Han Q, Peng A, Zheng X, Qin S, et al. Effectiveness of minimally invasive osteotomy Ilizarov technique combined with intramedullary nail for femoral lengthening. Chin J Reparative Reconst Surg. 2018;32:1524–9.

Sheridan GA, Fragomen AT, Rozbruch SR. Integrated Limb Lengthening Is Superior to Classical Limb Lengthening: A Systematic Review and Meta-analysis of the Literature. J Am Acad Orthop Surg Glob Res Rev. 2020;4(6):e20.00054.


Xu WG. Comparison of Intramedullary Nail Versus Conventional Ilizarov Method for Lower Limb Lengthening: A Systematic Review and Meta-Analysis. Orthop Surg. 2017;9:159–66.

Pacicca DM, Patel N, Lee C, Salisbury K, Lehmann W, Carvalho R, et al. Expression of angiogenic factors during distraction osteogenesis. Bone. 2003;33:889–98.

Park HW, Jahng JS, Hahn SB, Park BM, Park YJ. Femoral lengthening and deformity correction by the ilizarov method. Journal of the Korean Orthopaedic Association. 1993;28:2206.

Brinker MR, O’Connor DP. Exchange nailing of ununited fractures. J Bone Joint Surg Am. 2007;89:177–88.

Danckwardt-Lillieström G. Reaming of the medullary cavity and its effect on diaphyseal bone. A fluorochromic, microangiographic and histologic study on the rabbit tibia and dog femur. Acta Orthop Scand Suppl. 1969;128:1–153.

Zhang S, Chen L, Zhang G, Zhang B. Umbilical cord-matrix stem cells induce the functional restoration of vascular endothelial cells and enhance skin wound healing in diabetic mice via the polarized macrophages. Stem Cell Res Ther. 2020;11:39.

Rozbruch SR, Kleinman D, Fragomen AT, Ilizarov S. Limb lengthening and then insertion of an intramedullary nail: a case-matched comparison. Clin Orthop Relat Res. 2008;466:2923–32.

Antoci V, Ono CM, Antoci V Jr, Raney EM. Axial deformity correction in children via distraction osteogenesis. Int Orthop. 2006;30:278–83.

Kocaoglu M, Eralp L, Kilicoglu O, Burc H, Cakmak M. Complications encountered during lengthening over an intramedullary nail. J Bone Joint Surg Am. 2004;86:2406–11.

Vidyadhara S, Rao SK. A novel approach to juxta-articular aggressive and recurrent giant cell tumours: resection arthrodesis using bone transport over an intramedullary nail. Int Orthop. 2007;31:179–84.

Shyam AK, Song HR, An H, Isaac D, Shetty GM, Lee SH. The effect of distraction-resisting forces on the tibia during distraction osteogenesis. J Bone Joint Surg Am. 2009;91:1671–82.

Simpson AH, Cole AS, Kenwright J. Leg lengthening over an intramedullary nail. J Bone Joint Surg Br. 1999;81:1041–5.

Cosco F, Risi M, Pompili M, Boriani S. External fixation and sequential nailing in the treatment of open diaphyseal fractures of the tibia. Chir Organi Mov. 2001;86:191–7.


Nowotarski PJ, Turen CH, Brumback RJ, Scarboro JM. Conversion of external fixation to intramedullary nailing for fractures of the shaft of the femur in multiply injured patients. J Bone Joint Surg Am. 2000;82:781–8.

Giannoudis PV, Smith RM, Bellamy MC, Morrison JF, Dickson RA, Guillou PJ. Stimulation of the inflammatory system by reamed and unreamed nailing of femoral fractures. An analysis of the second hit. J Bone Joint Surg Br. 1999;81:356–61.

Wu CC, Shih CH, Chen WJ, Tai CL. High success rate with exchange nailing to treat a tibial shaft aseptic nonunion. J Orthop Trauma. 1999;13:33–8.

Gulabi D, Erdem M, Cecen GS, Avci CC, Saglam N, Saglam F. Ilizarov fixator combined with an intramedullary nail for tibial nonunions with bone loss: is it effective? Clin Orthop Relat Res. 2014;472:3892–901.


Acknowledgements

The authors are grateful to all authors who provided information of their studies.

Informed consent

Not applicable.

This study was funded by Tutor Project of Gansu University of Traditional Chinese Medicine (2023YXKY015), Lanzhou Science and Technology Plan (2023–2-11), Gansu province Science and Technology Plan (22JR5RA009), and Lanzhou Science and Technology Plan (2023-ZD-170).

Author information

Fei Tan and Cuixian Yang contributed equally to this work.

Authors and Affiliations

Department of Joint Surgery, The 940th Hospital of Joint Logistic Support Force of Chinese People’s Liberation Army, Lanzhou, Gansu, China

Fei Tan, Jiankang Zeng, Jiahuan Li, Peijie Li, Yongjie Qiao, Jing Wang, Jiangming Zhang, Dong Xie, Shuo Ye & Shenghu Zhou

Gansu University of Chinese Medicine, Lanzhou, Gansu, China

Fei Tan, Jiankang Zeng, Jiahuan Li, Peijie Li, Jing Wang, Jiangming Zhang & Dong Xie

Zhengzhou University, Zhengzhou, Henan, China

Cuixian Yang


Contributions

Fei Tan, Cuixian Yang, and Shenghu Zhou conceived the study, participated in the study design, performed the statistical analysis, and drafted the manuscript. Fei Tan, Jiankang Zeng, Peijie Li, Jiangming Zhang, and Jiahuan Li contributed to data collection and statistical interpretation. Yongjie Qiao, Dong Xie, and Shenghu Zhou participated in the study design and oversaw the manuscript drafting process. Jing Wang and Shuo Ye helped with the language. All authors reviewed the manuscript.

Corresponding author

Correspondence to Shenghu Zhou .

Ethics declarations

Ethics approval and consent to participate, consent for publication.

Not applicable.

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Supplementary Material 1. Supplementary Material 2.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/ .


About this article

Cite this article.

Tan, F., Yang, C., Zeng, J. et al. A systematic review and meta-analysis:comparing the efficacy of the Ilizarov technique alone with lengthening over a nail for lower extremity bone defects. BMC Musculoskelet Disord 25 , 699 (2024). https://doi.org/10.1186/s12891-024-07799-y


Received : 31 March 2024

Accepted : 20 August 2024

Published : 02 September 2024

DOI : https://doi.org/10.1186/s12891-024-07799-y


  • Lower extremity lengthening
  • Meta-analysis

BMC Musculoskeletal Disorders

ISSN: 1471-2474


A brief introduction of meta‐analyses in clinical practice and research

Xiao‐meng Wang

1 Department of Epidemiology, School of Public Health, Southern Medical University, Guangzhou, Guangdong, China

Xi‐Ru Zhang

Zhi‐hao Li, Wen‐fang Zhong

Associated Data

Data sharing is not applicable to this article because no datasets were generated or analyzed during the current study.

With the explosive growth of medical information, it is almost impossible for healthcare providers to review and evaluate all relevant evidence to make the best clinical decisions. Meta‐analyses, which summarize all existing evidence and quantitatively synthesize individual studies, have become the best available evidence for informing clinical practice. This article introduces the common methods, steps, principles, strengths and limitations of meta‐analyses and aims to help healthcare providers and researchers obtain a basic understanding of meta‐analyses in clinical practice and research.


1. INTRODUCTION

With the explosive growth of medical information, it has become almost impossible for healthcare providers to review and evaluate all related evidence to inform their decision making. 1 , 2 Furthermore, the inconsistent and often even conflicting conclusions of different studies can confuse these individuals. Systematic reviews, which comprehensively and systematically summarize all relevant empirical evidence, were developed to resolve such situations. 3 Many systematic reviews contain a meta‐analysis, which uses statistical methods to combine the results of individual studies. 4 Through meta‐analyses, researchers can objectively and quantitatively synthesize results from different studies and increase the statistical strength and precision of effect estimates. 5 In the late 1970s, meta‐analyses began to appear regularly in the medical literature 6 ; subsequently, a plethora of meta‐analyses have emerged, and their number has grown exponentially over time. 7 When conducted properly, a meta‐analysis of medical studies is considered decisive evidence because it occupies a top level in the hierarchy of evidence. 8

An understanding of the principles, performance, advantages and weaknesses of meta‐analyses is important. Therefore, we aim to provide a basic understanding of meta‐analyses for clinicians and researchers in the present article by introducing the common methods, principles, steps, strengths and limitations of meta‐analyses.

2. COMMON META‐ANALYSIS METHODS

There are many types of meta‐analysis methods (Table  1 ). In this article, we mainly introduce five meta‐analysis methods commonly used in clinical practice.

Meta‐analysis methods

  • Aggregate data meta‐analysis: extracting summary results of studies available in published accounts
  • Individual participant data meta‐analysis: collecting individual participant‐level data from original studies
  • Cumulative meta‐analysis: adding studies to a meta‐analysis in a predetermined order
  • Network meta‐analysis: combining direct and indirect evidence to compare the effectiveness of different interventions
  • Meta‐analysis of diagnostic test accuracy: identifying and synthesizing evidence on the accuracy of tests
  • Prospective meta‐analysis: conducting a meta‐analysis of studies whose selection criteria, hypotheses and analyses are specified before the results are known
  • Sequential meta‐analysis: combining the methodology of cumulative meta‐analysis with formal sequential testing, evaluating the available evidence at consecutive interim steps during data collection
  • Meta‐analysis of adverse events: applying the basic meta‐analysis principles to the incidences of adverse events across studies

2.1. Aggregated data meta‐analysis

Although more information can be obtained from individual participant‐level data, it is usually impossible to obtain such data from every study included in a meta‐analysis: the data may have been lost or corrupted, or the principal investigator may no longer be reachable or may refuse to release them. Therefore, an aggregate data meta‐analysis (AD‐MA), which extracts the summary results available in published accounts, is the most commonly used of all the quantitative approaches. 9 One study found that > 95% of published meta‐analyses were AD‐MAs. 10 In addition, AD‐MA is the mainstay of systematic reviews conducted by the US Preventive Services Task Force, the Cochrane Collaboration and many professional societies. 9 Moreover, an AD‐MA can be completed relatively quickly at low cost, and the data are relatively easy to obtain. 11 , 12 However, AD‐MA offers very limited control over the data. One challenge is that the association between an individual participant‐level covariate and the effect of the interventions at the study level may not reflect individual‐level effect modification by that covariate. 13 It is also difficult to extract enough compatible data to undertake meaningful subgroup analyses in an AD‐MA. 14 Furthermore, AD‐MA is prone to ecological bias and to confounding from variables not included in the model, and may have limited power. 15

2.2. Individual participant data meta‐analysis

An individual participant data meta‐analysis (IPD‐MA) is considered the “gold standard” for meta‐analysis; this type of analysis collects individual participant‐level data from original studies. 15 Compared with AD‐MA, IPD‐MA has many advantages, including improved data quality, a greater variety of analytical types that can be performed and the ability to obtain more reliable results. 16 , 17

It is crucial to maintain the clustering of participants within studies in the statistical implementation of an IPD‐MA. Clusters can be retained during the analysis using a one‐step or a two‐step approach. 18 In the one‐step approach, the individual participant data from all studies are modeled simultaneously while accounting for the clustering of participants within studies. 19 This approach requires a model specific to the type of data being synthesized and an appropriate account of the meta‐analysis assumptions (e.g. fixed or random effects across studies). Cheng et al. 20 proposed using a one‐step IPD‐MA to handle binary rare events and found this method superior to the traditional inverse variance, Mantel–Haenszel and Yusuf‐Peto methods. In the two‐step approach, the individual participant data from each study are first analyzed separately to produce aggregate data for each study (e.g. a mean treatment effect estimate and its standard error), using a statistical method appropriate to the type of data (e.g. a linear regression model for continuous responses, or Cox regression for time‐to‐event data). In the second step, these aggregate data are combined into a summary effect using a suitable model, such as weighting studies by the inverse of the variance. 21 For example, using a two‐step IPD‐MA, Grams et al. 22 found that apolipoprotein‐L1 kidney‐risk variants were not associated with incident cardiovascular disease or death independent of kidney measures.

Compared with the two‐step approach, the one‐step IPD‐MA is recommended for small meta‐analyses 23 and, conveniently, requires specifying only one model; however, it demands careful distinction between within‐study and between‐study variability. 24 The two‐step IPD‐MA is more laborious, although it allows the use of traditional, well‐known meta‐analysis techniques in the second step, such as those used by the Cochrane Collaboration (e.g. the Mantel–Haenszel method).
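The second step of the two-step approach can be sketched with plain inverse-variance weighting; the per-study estimates below are hypothetical first-step outputs, not data from any cited analysis:

```python
import math

def inverse_variance_pool(estimates, ses):
    """Combine per-study effect estimates (the first-step outputs of a
    two-step IPD-MA) by weighting each with the inverse of its variance."""
    weights = [1 / se ** 2 for se in ses]
    pooled = sum(w * est for w, est in zip(weights, estimates)) / sum(weights)
    se_pooled = math.sqrt(1 / sum(weights))
    return pooled, se_pooled

# Hypothetical mean treatment effects and standard errors from three studies.
pooled, se = inverse_variance_pool([1.2, 0.8, 1.0], [0.3, 0.4, 0.2])
print(f"pooled effect = {pooled:.2f} (SE {se:.2f})")
```

Note how the most precise study (SE 0.2) receives the largest weight, pulling the pooled estimate toward its value.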

2.3. Cumulative meta‐analysis

Meta‐analyses are traditionally used retrospectively to review existing evidence. However, current evidence often undergoes several updates as new studies become available. Thus, updated data must be continuously obtained to simplify and digest the ever‐expanding literature. Therefore, cumulative meta‐analysis was developed, which adds studies to a meta‐analysis based on a predetermined order and then tracks the magnitude of the mean effect and its variance. 25 A cumulative meta‐analysis can be performed multiple times; not only can it obtain summary results and provide a comparison of the dynamic results, but also it can assess the impact of newly added studies on the overall conclusions. 26 For example, initial observational studies and systematic reviews and meta‐analyses suggested that frozen embryo transfer was better for mothers and babies; however, recent primary studies have begun to challenge these conclusions. 27 Maheshwari et al . 27 therefore conducted a cumulative meta‐analysis to investigate whether these conclusions have remained consistent over time and found that the decreased risks of harmful outcomes associated with pregnancies conceived from frozen embryos have been consistent in terms of direction and magnitude of effect over several years, with an increasing precision around the point estimates. Furthermore, continuously updated cumulative meta‐analyses may avoid unnecessary large‐scale randomized controlled trials (RCTs) and prevent wasted research efforts. 28
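The idea of re-pooling each time a new study arrives can be sketched as follows, using simple inverse-variance weights; the study effects, standard errors and years are all hypothetical:

```python
import math

# Hypothetical studies in publication order: (year, effect estimate, standard error).
studies = [(2005, 0.50, 0.30), (2010, 0.30, 0.25),
           (2015, 0.42, 0.20), (2020, 0.38, 0.15)]

# Cumulative meta-analysis: re-pool the evidence after each study is added.
for k in range(1, len(studies) + 1):
    subset = studies[:k]                      # evidence available up to study k
    w = [1 / se ** 2 for _, _, se in subset]
    pooled = sum(wi * eff for wi, (_, eff, _) in zip(w, subset)) / sum(w)
    se_pooled = math.sqrt(1 / sum(w))
    print(f"up to {subset[-1][0]}: pooled effect {pooled:.3f} (SE {se_pooled:.3f})")
```

The shrinking standard error across the loop iterations shows how precision accumulates, which is what lets a cumulative meta-analysis signal when further trials are unlikely to change the conclusion.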

2.4. Network meta‐analysis

Although RCTs can directly compare the effectiveness of interventions, most of them compare the effectiveness of an intervention with a placebo, and there is almost no direct comparison between different interventions. 29 , 30 Network meta‐analyses comprise a relatively recent development that combines direct and indirect evidence to compare the effectiveness between different interventions. 31 Evidence obtained from RCTs is considered as direct evidence, whereas evidence obtained through one or more common comparators is considered as indirect evidence. For example, when comparing interventions A and C, direct evidence refers to the estimate of the relative effects between A and C. When no RCTs have directly compared interventions A and C, these interventions can be compared indirectly if both have been compared with B (placebo or some standard treatments) in other studies (forming an A–B–C “loop” of evidence). 32 , 33

A valid network meta‐analysis can correctly combine the relative effects of more than two studies and obtain a consistent estimate of the relative effectiveness of all interventions in one analysis. 34 This meta‐analysis may lead to a greater accuracy of estimating intervention effectiveness and the ability to compare all available interventions to calculate the rank of different interventions. 34 , 35 For example, phosphodiesterase type 5 inhibitors (PDE5‐Is) are the first‐line therapy for erectile dysfunction, although there are limited available studies on the comparative effects of different types of PDE5‐Is. 36 Using a network meta‐analysis, Yuan et al . 36 calculated the absolute effects and the relative rank of different PDE5‐Is to provide an overview of the effectiveness and safety of all PDE5‐Is.

Notably, a network meta‐analysis should satisfy the transitivity assumption, in which there are no systematic differences between the available comparisons other than the interventions being compared 37 ; in other words, the participants could be randomized to any of the interventions in a hypothetical RCT consisting of all the interventions included in the network meta‐analysis.

2.5. Meta‐analysis of diagnostic test accuracy

Sensitivity and specificity are commonly used to assess diagnostic accuracy. However, diagnostic tests in clinical practice are rarely 100% specific or sensitive. 38 It is difficult to obtain accurate estimates of sensitivity and specificity in small diagnostic accuracy studies. 39 , 40 Even in a large sample size study, the number of cases may still be small as a result of the low prevalence. By identifying and synthesizing evidence on the accuracy of tests, the meta‐analysis of diagnostic test accuracy (DTA) provides insight into the ability of medical tests to detect the target diseases 41 ; it also can provide estimates of test performance, allow comparisons of the accuracy of different tests and facilitate the identification of sources of variability. 42 For example, the FilmArray® (Biomerieux, Marcy‐l'Étoile, France) meningitis/encephalitis (ME) panel can detect the most common pathogens in central nervous system infections, although reports of false positives and false negatives are confusing. 43 Based on meta‐analysis of DTA, Tansarli et al . 43 calculated that the sensitivity and specificity of the ME panel were both > 90%, indicating that the ME panel has high diagnostic accuracy.
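As background for the quantities a DTA meta-analysis pools, sensitivity and specificity follow directly from each study's 2×2 table of test results against the reference standard. The counts below are hypothetical, not taken from the cited ME panel study.

```python
# Hypothetical per-study 2x2 counts: (true positive, false positive,
# false negative, true negative).
studies = [
    (45, 5, 3, 120),
    (30, 8, 2, 90),
    (60, 4, 6, 150),
]

for i, (tp, fp, fn, tn) in enumerate(studies, start=1):
    sens = tp / (tp + fn)   # proportion of diseased correctly detected
    spec = tn / (tn + fp)   # proportion of non-diseased correctly excluded
    print(f"study {i}: sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```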

3. HOW TO PERFORM A META‐ANALYSIS

3.1. Frame a question

Researchers must formulate an appropriate research question at the beginning. A well‐formulated question will guide many aspects of the review process, including determining eligibility criteria, searching for studies, collecting data from included studies, structuring the syntheses and presenting results. 44 There are some tools that may facilitate the construction of research questions, including PICO, as used in clinical practice 45 ; PEO and SPICE, as used for qualitative research questions 46 , 47 ; and SPIDER, as used for mixed‐methods research. 48

3.2. Form the search strategy

It is crucial for researchers to formulate a search strategy in advance, together with inclusion and exclusion criteria and a standardized data extraction form. The inclusion and exclusion criteria depend on the elements of the established question, such as publication dates, research design, population and outcomes. Reasonable inclusion and exclusion criteria will reduce the risk of bias, increase transparency and make the review systematic. Broad criteria may increase the heterogeneity between studies, whereas narrow criteria may make it difficult to find studies; therefore, a compromise should be found. 49

3.3. Search the literature databases

To minimize bias and avoid hampering the interpretation of outcomes, the search strategy should be as comprehensive as possible, employing multiple databases such as PubMed, Embase, the Cochrane Central Register of Controlled Trials, Scopus, Web of Science and Google Scholar. 50 , 51 Removing language restrictions and actively searching non-English bibliographic databases may also help researchers to perform a comprehensive meta-analysis. 52

3.4. Select the articles

The selection or rejection of articles should be guided by the predefined criteria. 53 Two independent reviewers may screen the articles, and any disagreements should be resolved by consensus through discussion. First, the titles and abstracts of all retrieved papers should be read and the inclusion and exclusion criteria applied to determine whether each paper qualifies. Then, the full texts of the provisionally included articles should be reviewed to confirm or reject their inclusion. Finally, the reference lists of the included articles should be searched to widen the search as much as possible. 54

3.5. Data extraction

A pre-formed standardized data extraction form should be used to extract data from the included studies. All data should be carefully converted using uniform standards. Independent extraction by multiple researchers might also make the extracted data more accurate.

3.6. Assess quality of articles

Checklists and scales are often used to assess the quality of articles. For example, the Cochrane Collaboration's tool 55 is usually used to assess the quality of RCTs, whereas the Newcastle-Ottawa Scale 56 is one of the most common methods for assessing the quality of non-randomized trials. In addition, the Quality Assessment of Diagnostic Accuracy Studies 2 tool 57 is often used to evaluate the quality of diagnostic accuracy studies.

3.7. Test for heterogeneity

Several methods have been proposed to detect and quantify heterogeneity, such as Cochran's Q and I 2 values. Cochran's Q test is used to determine whether there is heterogeneity in primary studies or whether the variation observed is due to chance, 58 but it may be underpowered because of the inclusion of a small number of studies or low event rates. 59 Therefore, p < 0.10 (not 0.05) indicates the presence of heterogeneity given the low statistical strength and insensitivity of Cochran's Q test. 60 Another common method for testing heterogeneity is the I 2 value, which describes the percentage of variation across studies that is attributable to heterogeneity rather than chance; this value does not depend on the number of studies. 61 I 2 values of 25%, 50% and 75% are considered to indicate low, moderate and high heterogeneity, respectively. 60
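Both statistics can be computed directly from per-study effect estimates and variances, as in this minimal sketch; the effect sizes (e.g. log odds ratios) and variances are hypothetical.

```python
# Hypothetical per-study effect estimates and their variances.
effects = [0.10, 0.30, 0.25, 0.60, 0.45]
variances = [0.04, 0.02, 0.05, 0.03, 0.06]

weights = [1 / v for v in variances]   # inverse-variance weights
pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)

# Cochran's Q: weighted sum of squared deviations from the pooled effect,
# compared against a chi-squared distribution with k - 1 degrees of freedom.
Q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
df = len(effects) - 1

# I^2: percentage of total variation attributable to heterogeneity
# rather than chance (truncated at zero when Q < df).
I2 = max(0.0, (Q - df) / Q) * 100
print(f"Q = {Q:.2f} on {df} df, I^2 = {I2:.1f}%")
```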

3.8. Estimate the summary effect

Fixed effects and random effects models are commonly used to estimate the summary effect in a meta‐analysis. 62 Fixed effects models, which consider the variability of the results as “random variation”, simply weight individual studies by their precision (inverse of the variance). Conversely, random effects models assume a different underlying effect for each study and consider this an additional source of variation that is randomly distributed. A substantial difference in the summary effect calculated by fixed effects models and random effects models will be observed only if the studies are markedly heterogeneous (heterogeneity p < 0.10) and the random effects model typically provides wider confidence intervals than the fixed effect model. 63 , 64
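The contrast between the two models can be made concrete with one common random effects estimator, the DerSimonian-Laird method (the text above does not specify an estimator; this is one standard choice). The data are hypothetical; note the random effects standard error is never smaller than the fixed effects one.

```python
import math

effects = [0.10, 0.30, 0.25, 0.60, 0.45]   # hypothetical effect estimates
variances = [0.04, 0.02, 0.05, 0.03, 0.06]

# Fixed effects: weight each study by the inverse of its variance.
w_fixed = [1 / v for v in variances]
mu_fixed = sum(w * y for w, y in zip(w_fixed, effects)) / sum(w_fixed)
se_fixed = math.sqrt(1 / sum(w_fixed))

# Random effects (DerSimonian-Laird): estimate the between-study variance
# tau^2 from Cochran's Q, then add it to each study's within-study variance.
Q = sum(w * (y - mu_fixed) ** 2 for w, y in zip(w_fixed, effects))
df = len(effects) - 1
c = sum(w_fixed) - sum(w ** 2 for w in w_fixed) / sum(w_fixed)
tau2 = max(0.0, (Q - df) / c)

w_rand = [1 / (v + tau2) for v in variances]
mu_rand = sum(w * y for w, y in zip(w_rand, effects)) / sum(w_rand)
se_rand = math.sqrt(1 / sum(w_rand))

print(f"fixed:  {mu_fixed:.3f} +/- {1.96 * se_fixed:.3f}")
print(f"random: {mu_rand:.3f} +/- {1.96 * se_rand:.3f}")
```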

3.9. Evaluate sources of heterogeneity

Several methods have been proposed to explore the possible reasons for heterogeneity. According to factors such as ethnicity, the number of studies or clinical features, subgroup analyses can be performed that divide the total data into several groups to assess the impact of a potential source of heterogeneity. Sensitivity analysis is a common approach for examining the sources of heterogeneity on a case‐by‐case basis. 65 In sensitivity analysis, one or more studies are excluded at a time and the impact of removing each or several studies is evaluated on the summary results and the between‐study heterogeneity. Sequential and combinatorial algorithms are usually implemented to evaluate the change in between‐study heterogeneity as one or more studies are excluded from the calculations. 66 Moreover, a meta‐regression model can explain heterogeneity based on study‐level covariates. 67
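A leave-one-out sensitivity analysis of the kind described above can be sketched as follows, recomputing the fixed effects pooled estimate with each study removed in turn; the data are hypothetical.

```python
# Hypothetical per-study effect estimates and variances.
effects = [0.10, 0.30, 0.25, 0.60, 0.45]
variances = [0.04, 0.02, 0.05, 0.03, 0.06]

def pooled(ys, vs):
    """Inverse-variance weighted (fixed effects) pooled estimate."""
    ws = [1 / v for v in vs]
    return sum(w * y for w, y in zip(ws, ys)) / sum(ws)

overall = pooled(effects, variances)
for i in range(len(effects)):
    # Exclude study i and see how much the summary result shifts.
    ys = effects[:i] + effects[i + 1:]
    vs = variances[:i] + variances[i + 1:]
    print(f"without study {i + 1}: pooled = {pooled(ys, vs):.3f} "
          f"(overall {overall:.3f})")
```

A study whose removal shifts the pooled estimate markedly is a candidate source of heterogeneity.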

3.10. Assess publication bias

A funnel plot is a scatterplot that is commonly used to assess publication bias. In a funnel plot, the x ‐axis indicates the study effect and the y ‐axis indicates the study precision, such as the standard error or sample size. 68 , 69 If there is no publication bias, the plot will have a symmetrical inverted funnel; conversely, asymmetry indicates the possibility of publication bias.
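Beyond visual inspection, funnel plot asymmetry is often quantified with Egger's regression test (not described in the text above; named here as one standard technique): regress the standardized effect on precision, and an intercept far from zero suggests asymmetry. A minimal sketch with hypothetical data:

```python
# Hypothetical effect estimates and standard errors.
effects = [0.10, 0.30, 0.25, 0.60, 0.45]
ses = [0.20, 0.14, 0.22, 0.17, 0.24]

x = [1 / se for se in ses]                    # precision
y = [d / se for d, se in zip(effects, ses)]   # standardized effect

# Ordinary least squares by hand: intercept estimates small-study bias.
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
slope = (sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
         / sum((xi - xbar) ** 2 for xi in x))
intercept = ybar - slope * xbar
print(f"Egger intercept = {intercept:.2f}")
```

In practice the intercept's significance is judged with its standard error; this sketch shows only the point estimate.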

3.11. Present results

A forest plot is a valid and useful tool for summarizing the results of a meta‐analysis. In a forest plot, the results from each individual study are shown as a blob or square; the confidence interval, usually representing 95% confidence, is shown as a horizontal line that passes through the square; and the summary effect is shown as a diamond. 70

4. PRINCIPLES OF META‐ANALYSIS PERFORMANCE

Five important principles of meta-analysis performance should be emphasized. First, the search scope of a meta-analysis should be expanded as much as possible to capture all relevant research; it is important to remove language restrictions and actively search non-English bibliographic databases. Second, any meta-analysis should include studies selected based on strict criteria established in advance. Third, appropriate tools must be selected to evaluate the quality of evidence according to the different types of primary studies. Fourth, the most suitable statistical model should be chosen for the meta-analysis and a weighted mean estimate of the effect size calculated. Finally, the possible causes of heterogeneity should be identified and publication bias in the meta-analysis must be assessed.

5. STRENGTHS OF META‐ANALYSIS

Meta‐analyses have several strengths. First, a major advantage is their ability to improve the precision of effect estimates with considerably increased statistical power, which is particularly important when the power of the primary study is limited as a result of the small sample size. Second, a meta‐analysis has more power to detect small but clinically significant effects and to examine the effectiveness of interventions in demographic or clinical subgroups of participants, which can help researchers identify beneficial (or harmful) effects in specific groups of patients. 71 , 72 Third, meta‐analyses can be used to analyze rare outcomes and outcomes that individual studies were not designed to test (e.g. adverse events). Fourth, meta‐analyses can be used to examine heterogeneity in study results and explore possible sources in case this heterogeneity would lead to bias from “mixing apples and oranges”. 73 Furthermore, meta‐analyses can compare the effectiveness of various interventions, supplement the existing evidence, and then offer a rational and helpful way of addressing a series of practical difficulties that plague healthcare providers and researchers. Lastly, meta‐analyses may resolve disputes caused by apparently conflicting studies, determine whether new studies are necessary for further investigation and generate new hypotheses for future studies. 7 , 74

6. LIMITATIONS OF META‐ANALYSIS

6.1. Missing related research

The primary limitation of a meta‐analysis is missing related research. Even in the ideal case in which all relevant studies are available, a faulty search strategy can miss some of these studies. Small differences in search strategies can produce large differences in the set of studies found. 75 When searching databases, relevant research can be missed as a result of the omission of keywords. The search engine (e.g. PubMed, Google) may also affect the type and number of studies that are found. 76 Moreover, it may be impossible to identify all relevant evidence if the search scope is limited to one or two databases. 51 , 77 Finally, language restrictions and the failure to search non‐English bibliographic databases may also lead to an incomplete meta‐analysis. 52 Comprehensive search strategies for different databases and languages might help solve this issue.

6.2. Publication bias

Publication bias means that positive findings are more likely to be published and then identified through literature searches rather than ambiguous or negative findings. 78 This is an important and key source of bias that is recognized as a potential threat to the validity of results. 79 The real research effect may be exaggerated or even falsely positive if only published articles are included. 80 For example, based on studies registered with the US Food and Drug Administration, Turner et al . 81 reviewed 74 trials of 12 antidepressants to assess publication bias and its influence on apparent efficacy. It was found that antidepressant studies with favorable outcomes were 16 times more likely to be published than those with unfavorable outcomes, and the apparent efficacy of antidepressants increased between 11% and 69% when the non‐published studies were not included in the analysis. 81 Moreover, failing to identify and include non‐English language studies may also increase publication bias. 82 Therefore, all relevant studies should be identified to reduce the impact of publication bias on meta‐analysis.

6.3. Selection bias

Because many of the studies identified are not directly related to the subject of the meta‐analysis, it is crucial for researchers to select which studies to include based on defined criteria. Failing to evaluate, select or reject relevant studies based on stricter criteria regarding the study quality may also increase the possibility of selection bias. Missing or inappropriate quality assessment tools may lead to the inclusion of low‐quality studies. If a meta‐analysis includes low‐quality studies, its results will be biased and incorrect, which is also called “garbage in, garbage out”. 83 Strictly defined criteria for included studies and scoring by at least two researchers might help reduce the possibility of selection bias. 84 , 85

6.4. Unavailability of information

The best‐case scenario for meta‐analyses is the availability of individual participant data. However, most individual research reports only contain summary results, such as the mean, standard deviation, proportions, relative risk and odds ratio. In addition to the possibility of reporting errors, the lack of information can severely limit the types of analyses and conclusions that can be achieved in a meta‐analysis. For example, the unavailability of information from individual studies may preclude the comparison of effects in predetermined subgroups of participants. Therefore, if feasible, the researchers could contact the author of the primary study for individual participant data.

6.5. Heterogeneity

Although the studies included in a meta‐analysis have the same research hypothesis, there is still the potential for several areas of heterogeneity. 86 Heterogeneity may exist in various parts of the studies’ design and conduct, including participant selection, interventions/exposures or outcomes studied, data collection, data analyses and selective reporting of results. 87 Although the difference of the results can be overcome by assessing the heterogeneity of the studies and performing subgroup analyses, 88 the results of the meta‐analysis may become meaningless and even may obscure the real effect if the selected studies are too heterogeneous to be comparable. For example, Nicolucci et al . 89 conducted a review of 150 published randomized trials on the treatment of lung cancer. Their review showed serious methodological drawbacks and concluded that heterogeneity made the meta‐analysis of existing trials unlikely to be constructive. 89 Therefore, combining the data in meta‐analysis for studies with large heterogeneity is not recommended.

6.6. Misleading funnel plot

Funnel plots are appealing because they are a simple technique used to investigate the possibility of publication bias. However, their objective is to detect a complex effect, which can be misleading. For example, the lack of symmetry in a funnel plot can also be caused by heterogeneity. 90 Another problem with funnel plots is the difficulty of interpreting them when few studies are included. Readers may also be misled by the choice of axes or the outcome measure. 91 Therefore, in the absence of a consensus on how the plot should be constructed, asymmetrical funnel plots should be interpreted cautiously. 91

6.7. Inevitable subjectivity

Researchers must make numerous judgments when performing meta‐analyses, 92 which inevitably introduces considerable subjectivity into the meta‐analysis review process. For example, there is often a certain amount of subjectivity when deciding how similar studies should be before it is appropriate to combine them. To minimize subjectivity, at least two researchers should jointly conduct a meta‐analysis and reach a consensus.

The explosion of medical information and differences between individual studies make it almost impossible for healthcare providers to make the best clinical decisions. Meta‐analyses, which summarize all eligible evidence and quantitatively synthesize individual results on a specific clinical question, have become the best available evidence for informing clinical practice and are increasingly important in medical research. This article has described the basic concept, common methods, principles, steps, strengths and limitations of meta‐analyses to help clinicians and investigators better understand meta‐analyses and make clinical decisions based on the best evidence.

AUTHOR CONTRIBUTIONS

CM designed and directed the study. XMW and XRZ had primary responsibility for drafting the manuscript. CM, ZHL, WFZ and PY provided insightful discussions and suggestions. All authors critically reviewed the manuscript for important intellectual content.

CONFLICT OF INTEREST STATEMENT

The authors declare that they have no conflicts of interest.

ACKNOWLEDGEMENTS

This work was supported by the Project Supported by Guangdong Province Universities and Colleges Pearl River Scholar Funded Scheme (2019 to CM) and the Construction of High‐level University of Guangdong (G820332010, G618339167 and G618339164 to CM). The funders played no role in the study design or implementation; manuscript preparation, review or approval; or the decision to submit the manuscript for publication.

Wang X‐M, Zhang X‐R, Li Z‐H, Zhong W‐F, Yang P, Mao C. A brief introduction of meta‐analyses in clinical practice and research . J Gene Med . 2021; 23 :e3312. 10.1002/jgm.3312 [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]

Xiao‐Meng Wang and Xi‐Ru Zhang contributed equally to this work.

DATA AVAILABILITY STATEMENT

Incidence and risk factors of antituberculosis drug-induced liver injury in India: A systematic review and meta-analysis

  • Systematic review with meta-analysis
  • Published: 03 September 2024


  • Ramesh Kumar   ORCID: orcid.org/0000-0001-5136-4865 1 ,
  • Abhishek Kumar 1 ,
  • Rishabh Patel   ORCID: orcid.org/0009-0001-7061-4433 1 ,
  • Sabbu Surya Prakash   ORCID: orcid.org/0009-0006-5117-5443 1 ,
  • Sudhir Kumar   ORCID: orcid.org/0000-0001-7117-1382 1 ,
  • Himanshu Surya 1 &
  • Sudheer Marrapu 1  

Antituberculosis drug-induced liver injury (ATDILI) is a significant problem in tuberculosis treatment. This systematic review and meta-analysis aimed to evaluate the incidence and risk factors of ATDILI in adult patients with tuberculosis in India.

Five electronic databases were comprehensively searched for studies of Indian adult patients with tuberculosis that investigated the incidence and/or risk factors of ATDILI. The relevant data were pooled using a random- or fixed-effects model to calculate the pooled incidence, standardized mean difference (MD) or odds ratio (OR), each with a 95% confidence interval (CI).

Following the screening of 3221 records, 43 studies with 12,041 tuberculosis patients were finally included. Based on the random effect model, the pooled incidence of ATDILI was 12.6% (95% CI, 9.9–15.3%, p  < 0.001, I 2  = 95.1%). The pooled incidence was higher in patients with daily treatment regimen compared to the thrice weekly regimen (16.5% vs. 3.5%). The concurrent hepatitis B or C infection, alcohol consumption and underlying chronic liver disease were associated with high incidence of ATDILI. The pooled incidence of acute liver failure (ALF) among ATDILI patients was 6.78% (95% CI 3.9–9.5%). Female gender (OR 1.24), older age (MD 0.26), lean body mass index (OR 3.8), hypoalbuminemia (OR 3.09), N-acetyltransferase slow acetylator genotypes (OR 2.3) and glutathione S-transferases M null mutation (OR 1.6) were found to be associated with an increased risk of ATDILI. The pooled mortality rate of ATDILI patients was 1.72% (95% CI 0.4–3.0%) overall and 71.8% (95% CI 49.3–94.2%) in case of ALF.

Overall, 12.6% of tuberculosis patients in India developed ATDILI when a combination of first-line antituberculosis drugs was used. An average of 7% of ATDILI patients progressed to ALF, which had a high mortality rate. Older age, female gender, poor nutritional status and some genetic polymorphisms were identified as significant risk factors.

Graphical Abstract



Download references

Author information

Authors and Affiliations

Department of Gastroenterology, All India Institute of Medical Sciences, 4th Floor, OPD Block, Patna, 801 507, India

Ramesh Kumar, Abhishek Kumar, Rishabh Patel, Sabbu Surya Prakash, Sudhir Kumar, Himanshu Surya & Sudheer Marrapu


Contributions

RK, AK and RP: conceptualization, methodology, data curation, writing; SSP and SK: data curation, writing; HS and SM: data curation, reviewing and editing. All authors have made a significant contribution and have approved the final manuscript.

Corresponding author

Correspondence to Ramesh Kumar.

Ethics declarations

Conflict of interest

RK, AK, RP, SSP, SK, HS and SM declare no competing interests.

The authors are solely responsible for the data and the contents of the paper. In no way are the Honorary Editor-in-Chief, the Editorial Board Members, the Indian Society of Gastroenterology or the printer/publishers responsible for the results/findings and content of this article.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file1 (DOCX 33 KB)

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article

Kumar, R., Kumar, A., Patel, R. et al. Incidence and risk factors of antituberculosis drug-induced liver injury in India: A systematic review and meta-analysis. Indian J Gastroenterol (2024). https://doi.org/10.1007/s12664-024-01643-w

Received : 09 April 2024

Accepted : 21 June 2024

Published : 03 September 2024

DOI : https://doi.org/10.1007/s12664-024-01643-w


  • Antituberculosis drugs
  • Hepatotoxicity
  • Liver injury
  • Tuberculosis
