Flawed datasets cast doubt on AI tools predicting diabetes and stroke

April 15, 2026 at 10:18 AM


TL;DR Summary

Researchers found 124 papers that trained stroke- and diabetes-prediction models on two Kaggle datasets that may contain fabricated data. Some of these models are already in clinical use in Indonesia, Spain, and the US, and journals are investigating. Irregular patterns in the data, such as implausible completeness and duplicated values, cast doubt on the models' reliability. This has prompted calls for data-source disclosure and for removal of the dubious datasets to prevent flawed clinical decisions.

Topics: health #clinical-decision-making #data-provenance #diabetes #machine-learning #science #stroke


Original article: "Dozens of AI disease-prediction models were trained on dubious data" (Nature)
