Misusing I² for inconsistency, overlooking OIS for imprecision, and ignoring the continuum of certainty ratings: common pitfalls in GRADE assessments

Abstract

Background: The GRADE framework guides ratings of certainty in evidence, including the possibility of rating down certainty in one or more of five domains. GRADE users face challenges in making certainty of evidence judgments, and these challenges can result in inappropriate ratings. Potential problems include over-reliance on the I² statistic, neglect of sample size adequacy, and binary decision-making.

Objectives: To demonstrate the application of three key principles: visual criteria for inconsistency judgments, sample size considerations for imprecision judgments, and domain and overall certainty judgments on a continuum.

Methods: We use examples from four meta-analyses: one evaluating nasal continuous positive airway pressure (NCPAP) in preterm infants, two comparing antimalarial regimens, and one comparing tooth brushing versus no tooth brushing for ventilator-associated pneumonia. These examples illustrate the application of visual criteria, sample size considerations, and certainty judgments along a continuum.

Results: In two examples, I² was high, but visual inspection showed consistent point estimates, overlapping confidence intervals, and estimates all on the same side of the threshold. Therefore, rating down for inconsistency was not justified. In two other examples, we calculated the Optimal Information Size (OIS) using a 25% relative risk reduction (RRR). In both, the sample size fell short of the OIS, warranting rating down for imprecision. In one antimalarial efficacy example, the sample size failed to meet the OIS based on a 25% RRR but would have been sufficient with a 30% RRR. This placed the concern at the lower end of serious. These domain-level judgments, made along a continuum, contributed to overall certainty ratings that lay at the upper end of their respective categories.

Conclusions: Applying GRADE principles through visual inspection of forest plots when evaluating consistency, OIS-based evaluation of precision, and continuum-based domain judgments supports more transparent and decision-relevant certainty ratings. Clearly describing the degree of concern within each domain enhances the clarity and utility of evidence summaries for decision-makers.
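The OIS comparison described in the Results can be illustrated with a minimal sketch. It assumes the OIS is computed as the conventional sample size for a two-arm trial with equal allocation, a two-sided alpha of 0.05, and 80% power. The function name, the alpha and power defaults, and the control-group risk of 20% are hypothetical choices for illustration and are not taken from the article.

```python
import math
from statistics import NormalDist


def optimal_information_size(control_risk, rrr, alpha=0.05, power=0.80):
    """Total sample size (both arms) needed to detect a given relative
    risk reduction, using the standard two-proportion formula.

    All defaults are illustrative assumptions, not values from the article.
    """
    p1 = control_risk                # assumed risk in the control arm
    p2 = control_risk * (1 - rrr)    # risk in the intervention arm under the RRR
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)           # quantile matching the power
    n_per_arm = (
        (z_alpha + z_beta) ** 2
        * (p1 * (1 - p1) + p2 * (1 - p2))
        / (p1 - p2) ** 2
    )
    return 2 * math.ceil(n_per_arm)


# Hypothetical control risk of 20%; compare the two RRR thresholds.
ois_25 = optimal_information_size(control_risk=0.20, rrr=0.25)
ois_30 = optimal_information_size(control_risk=0.20, rrr=0.30)
print(ois_25, ois_30)
```

Because a larger assumed effect needs fewer participants to detect, the OIS at a 30% RRR is smaller than at a 25% RRR. Under these assumed inputs, a pooled sample size falling between the two values would miss the 25% threshold but meet the 30% one, which is the pattern the abstract describes as a concern at the lower end of serious.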

Publication Date

2026-03-05

Publication Title

Journal of Clinical Epidemiology

Volume

194

ISSN

0895-4356

Acceptance Date

2026-03-02

Deposit Date

2026-04-01

Embargo Period

2027-03-05

This document is currently not available here.

This item is under embargo until 05 March 2027
