Overinterpretation and Misreporting of Diagnostic Accuracy Studies: Evidence of “Spin”
Abstract
Approximately three in 10 studies of the diagnostic accuracy of biomarkers or other medical tests published in journals with an impact factor of 4 or higher contain a form of actual overinterpretation of their results, and almost all studies contain practices that facilitate overinterpretation.
Purpose
To estimate the frequency of distorted presentation and overinterpretation of results in diagnostic accuracy studies.
Materials and Methods
MEDLINE was searched for diagnostic accuracy studies published between January and June 2010 in journals with an impact factor of 4 or higher. Articles included were primary studies of the accuracy of one or more tests in which the results were compared with a clinical reference standard. Two authors scored each article independently by using a pretested data-extraction form to identify actual overinterpretation and practices that facilitate overinterpretation, such as incomplete reporting of study methods or the use of inappropriate methods (potential overinterpretation). The frequency of overinterpretation was estimated in all studies and in a subgroup of imaging studies.
Results
Of the 126 articles, 39 (31%; 95% confidence interval [CI]: 23%, 39%) contained a form of actual overinterpretation, including 29 (23%; 95% CI: 16%, 30%) with an overly optimistic abstract, 10 (8%; 95% CI: 3%, 13%) with a discrepancy between the study aim and conclusion, and eight with conclusions based on selected subgroups. In our analysis of potential overinterpretation, authors of 89% (95% CI: 83%, 94%) of the studies did not include a sample size calculation, 88% (95% CI: 82%, 94%) did not state a test hypothesis, and 57% (95% CI: 48%, 66%) did not report CIs of accuracy measurements. In 43% (95% CI: 34%, 52%) of studies, authors were unclear about the intended role of the test, and in 3% (95% CI: 0%, 6%) they used inappropriate statistical tests. A subgroup analysis of imaging studies showed that 16 (30%; 95% CI: 17%, 43%) and 53 (100%; 95% CI: 92%, 100%) contained forms of actual and potential overinterpretation, respectively.
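The proportions and intervals reported above can be checked with a short calculation. The sketch below is illustrative only: it assumes a normal-approximation (Wald) interval, which is not stated in this abstract as the authors' method, and the function name `wald_ci` is hypothetical.

```python
from statistics import NormalDist


def wald_ci(events, total, level=0.95):
    """Return (proportion, lower, upper) using the normal-approximation
    (Wald) interval, clipped to [0, 1].

    Assumption: the Wald method is used here for illustration; the
    abstract does not say which interval method the authors applied.
    """
    if total <= 0 or not 0 <= events <= total:
        raise ValueError("need 0 <= events <= total and total > 0")
    p = events / total
    # Two-sided critical value, about 1.96 for a 95% interval.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * (p * (1 - p) / total) ** 0.5
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


# Counts taken from the Results paragraph (126 articles in total).
rows = [
    ("actual overinterpretation", 39),
    ("overly optimistic abstract", 29),
    ("aim/conclusion discrepancy", 10),
]
for label, count in rows:
    p, lower, upper = wald_ci(count, 126)
    print(f"{label}: {count}/126 = {p:.0%} (95% CI: {lower:.0%}, {upper:.0%})")
```

Under this assumption, 39 of 126 gives 31% with an interval of 23% to 39%, in line with the reported figure. The Wald interval collapses to zero width when the proportion is 0% or 100% (for example, 53 of 53 imaging studies), so an exact binomial method would be needed there.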
Conclusion
Overinterpretation and misreporting of results in diagnostic accuracy studies are frequent in journals with high impact factors.
© RSNA, 2013
Supplemental material: http://radiology.rsna.org/lookup/suppl/doi:10.1148/radiol.12120527/-/DC1
Article History
Received March 17, 2012; revision requested April 23; revision received August 21; accepted September 12; final version accepted October 15.
Published online: May 2013
Published in print: May 2013