The ChatGPT chatbot performed at almost the same level as expert eye doctors

According to the Financial Times, a new study from the University of Cambridge's School of Clinical Medicine shows that GPT-4 performs almost as well as expert ophthalmologists in an eye assessment.

The researchers tested GPT-4 against the GPT-3.5, PaLM 2 and LLaMA large language models using 87 multiple-choice questions; five expert ophthalmologists, three trainee ophthalmologists and two unspecialized junior doctors took the same mock exam. The questions covered everything from light sensitivity to serious eye damage. Their answers were not publicly available, so the researchers believe the models could not have been trained on them beforehand.


GPT-4 scored higher than the trainees and junior doctors, answering 60 of the 87 questions correctly, while the junior doctors averaged only 37 correct answers. Although one expert ophthalmologist answered just 56 questions accurately, the specialists averaged 66.4 correct answers, keeping them ahead of the AI.


PaLM 2 answered 49 questions correctly and GPT-3.5 answered 42. LLaMA scored lowest among the large language models, with 28 correct responses.

The researchers noted that their study asked a limited number of questions, particularly in certain categories, so actual performance may vary. Large language models also have an inherent tendency to "hallucinate," or make things up. A hallucinated irrelevant fact is one thing, but a misdiagnosis of cataracts or cancer, or generally low accuracy in diagnosis, could have very dangerous consequences.
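For a sense of scale, here is a quick sketch that converts each score reported above into a percentage of the 87-question exam. The raw scores come from the article as reported; the percentage arithmetic is our own back-of-the-envelope calculation.

```python
# Convert each reported score on the 87-question exam into a
# percentage accuracy. Scores are as reported in the article;
# the percentages are derived arithmetic, not study figures.
TOTAL_QUESTIONS = 87

scores = {
    "Expert ophthalmologists (avg)": 66.4,
    "GPT-4": 60,
    "PaLM 2": 49,
    "GPT-3.5": 42,
    "Junior doctors (avg)": 37,
    "LLaMA": 28,
}

# Print from highest to lowest score for easy comparison.
for name, correct in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{name:32s} {correct:>5}/{TOTAL_QUESTIONS}  ({correct / TOTAL_QUESTIONS:.1%})")
```

By this measure, GPT-4's 60 correct answers work out to roughly 69% accuracy, against about 76% for the expert average and about 43% for the junior doctors.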