The study was conducted by Dr Hidde ten Berg, from the department of emergency medicine, and Dr Steef Kurstjens, from the department of clinical chemistry and haematology, both at Jeroen Bosch Hospital, 's-Hertogenbosch, The Netherlands.
Dr ten Berg told the Congress: “Like a lot of people, we have been trying out ChatGPT and we were intrigued to see how well it worked for examining some complex diagnostic cases. So, we set up a study to assess how well the chatbot worked compared to doctors with a collection of emergency medicine cases from daily practice.”
The research, which is published this month in the Annals of Emergency Medicine, included anonymised details on 30 patients who were treated at Jeroen Bosch Hospital’s emergency department in 2022. The researchers entered physicians’ notes on patients’ signs, symptoms and physical examinations into two versions of ChatGPT (the free 3.5 version and the subscriber 4.0 version). They also provided the chatbot with results of lab tests, such as blood and urine analysis. For each case, they compared the shortlist of likely diagnoses generated by the chatbot to the shortlist made by emergency medicine doctors and to the patient’s correct diagnosis.
They found a large overlap (around 60%) between the shortlists generated by ChatGPT and those made by the doctors. Doctors had the correct diagnosis within their top five likely diagnoses in 87% of the cases, compared to 97% for ChatGPT version 3.5 and 87% for version 4.0.
Dr ten Berg said: “We found that ChatGPT performed well in generating a list of likely diagnoses and suggesting the most likely option. We also found a lot of overlap with the doctors’ lists of likely diagnoses. Simply put, this indicates that ChatGPT was able to suggest medical diagnoses much like a human doctor would.
“For example, we included a case of a patient presenting with joint pain that was alleviated with painkillers, but redness, joint pain and swelling always recurred. In the previous days, the patient had a fever and sore throat. A few times there was a discolouration of the fingertips. Based on the physical exam and additional tests, the doctors thought the most likely diagnosis was probably rheumatic fever, but ChatGPT was correct with its most likely diagnosis of vasculitis.
“It’s vital to remember that ChatGPT is not a medical device and there are concerns over privacy when using ChatGPT with medical data. However, there is potential here for saving time and reducing waiting times in the emergency department. The benefit of using artificial intelligence could be in supporting doctors with less experience, or it could help in spotting rare diseases.”
MEDICA-tradefair.com; Source: European Society for Emergency Medicine