Overinterpretation and Misreporting of Diagnostic Accuracy Studies: Evidence of “Spin”

Published Online: https://doi.org/10.1148/radiol.12120527

Approximately three in 10 studies of the diagnostic accuracy of biomarkers or other medical tests published in journals with an impact factor of 4 or higher contain overinterpretation, and almost all studies contain practices that facilitate overinterpretation.

Purpose

To estimate the frequency of distorted presentation and overinterpretation of results in diagnostic accuracy studies.

Materials and Methods

MEDLINE was searched for diagnostic accuracy studies published between January and June 2010 in journals with an impact factor of 4 or higher. Articles included were primary studies of the accuracy of one or more tests in which the results were compared with a clinical reference standard. Two authors scored each article independently by using a pretested data-extraction form to identify actual overinterpretation and practices that facilitate overinterpretation, such as incomplete reporting of study methods or the use of inappropriate methods (potential overinterpretation). The frequency of overinterpretation was estimated in all studies and in a subgroup of imaging studies.

Results

Of the 126 articles, 39 (31%; 95% confidence interval [CI]: 23%, 39%) contained a form of actual overinterpretation, including 29 (23%; 95% CI: 16%, 30%) with an overly optimistic abstract, 10 (8%; 95% CI: 3%, 13%) with a discrepancy between the study aim and conclusion, and eight with conclusions based on selected subgroups. In our analysis of potential overinterpretation, authors of 89% (95% CI: 83%, 94%) of the studies did not include a sample size calculation, 88% (95% CI: 82%, 94%) did not state a test hypothesis, and 57% (95% CI: 48%, 66%) did not report CIs of accuracy measurements. In 43% (95% CI: 34%, 52%) of studies, authors were unclear about the intended role of the test, and in 3% (95% CI: 0%, 6%) they used inappropriate statistical tests. A subgroup analysis of imaging studies showed that 16 (30%; 95% CI: 17%, 43%) and 53 (100%; 95% CI: 92%, 100%) contained forms of actual and potential overinterpretation, respectively.
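
As a reading aid, the proportions and intervals reported for the 126 articles can be roughly checked with a short calculation. The sketch below assumes a normal-approximation (Wald) interval; the abstract does not state which interval method was used, so this is an illustrative assumption rather than the authors' procedure, and the function name `wald_ci` is hypothetical.

```python
import math


def wald_ci(events, total, z=1.96):
    """Normal-approximation (Wald) CI for a proportion, clipped to [0, 1].

    Assumption: z = 1.96 for a two-sided 95% interval. The source does not
    say which interval method the authors used.
    """
    p = events / total
    half_width = z * math.sqrt(p * (1 - p) / total)
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


# Counts taken from the Results paragraph (126 included articles).
reported = [
    ("actual overinterpretation", 39),
    ("overly optimistic abstract", 29),
    ("aim/conclusion discrepancy", 10),
]

for label, events in reported:
    p, lo, hi = wald_ci(events, 126)
    print(f"{label}: {p:.0%} (95% CI: {lo:.0%}, {hi:.0%})")
```

Under this assumption the three rows round to 31% (23%, 39%), 23% (16%, 30%), and 8% (3%, 13%). A Wald interval collapses to zero width when the observed proportion is 100%, so the imaging-subgroup interval of 92% to 100% cannot come from this formula and is not reproduced here.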

Conclusion

Overinterpretation and misreporting of results in diagnostic accuracy studies are frequent in journals with high impact factors.

© RSNA, 2013

Supplemental material: http://radiology.rsna.org/lookup/suppl/doi:10.1148/radiol.12120527/-/DC1

Article History

Received March 17, 2012; revision requested April 23; revision received August 21; accepted September 12; final version accepted October 15.
Published online: May 2013
Published in print: May 2013