Researchers at Mass General Brigham, one of the leading medical and research institutions in the US, have conducted an analysis showing that ChatGPT is approximately 72% accurate across a range of clinical decisions, including diagnosis and treatment selection. The results of the study have been published in the Journal of Medical Internet Research.
Using ChatGPT in clinical practice
“There are no specific standards for comparison, but we estimate that the model’s performance is comparable to that of a physician fresh out of medical school, such as an intern or resident. This points to the potential of large language models as an additional tool in the medical field,” said Marc Succi, MD, one of the authors of the study.
In the study, ChatGPT analysed 36 standardised clinical cases. Based on initial patient data such as age, gender and symptoms, the model was asked to propose a set of possible (differential) diagnoses and to judge whether the case was an emergency. It was then fed additional information and asked to make a final diagnosis and select a treatment plan.
By comparison, doctors in the US diagnose illnesses with an average accuracy of 77% on the first attempt and prescribe appropriate treatment in 68% of cases, so ChatGPT's performance is only slightly below that of practising physicians.