This research explores the use of generative Large Language Models (LLMs) in evidence-based medicine to help clinicians keep pace with rapidly advancing medical research. Real-world clinical cases were curated and converted into .json files for analysis by the LLMs. The models’ performance was evaluated on their ability to make clinical decisions similar to those of real-world clinicians. The study found GPT-4 to be the most capable of autonomous operation in a clinical setting, effectively ordering relevant investigations and conforming to clinical guidelines. However, limitations were observed in handling complex guidelines and diagnostic nuances.
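
To make the case-to-.json step more concrete, here is a minimal Python sketch of how a curated case might be stored as JSON, turned into a model prompt, and compared against the clinician's decisions. Everything in it is an assumption for illustration: the field names (`case_id`, `presentation`, `history`, `clinician_investigations`), the invented patient record, the helper functions, and the simple overlap score are hypothetical and are not taken from the paper, which may use a different schema and evaluation method. No LLM is called; a hard-coded list stands in for a model response.

```python
import json

# Hypothetical, made-up case record. Field names are illustrative only
# and are NOT the schema used in the paper.
case_json = """
{
  "case_id": "example-001",
  "presentation": "58-year-old with exertional chest pain for two weeks",
  "history": ["hypertension", "type 2 diabetes"],
  "clinician_investigations": ["ECG", "troponin", "lipid panel"]
}
"""


def load_case(raw):
    """Parse a JSON case and check the fields this sketch relies on."""
    case = json.loads(raw)
    required = {"case_id", "presentation", "history", "clinician_investigations"}
    missing = required - case.keys()
    if missing:
        raise ValueError(f"case is missing fields: {sorted(missing)}")
    return case


def build_prompt(case):
    """Build the model input, withholding the clinician's actual decisions."""
    history = ", ".join(case["history"]) or "none recorded"
    return (
        f"Presentation: {case['presentation']}\n"
        f"History: {history}\n"
        "Which investigations would you order?"
    )


def investigation_overlap(model_orders, clinician_orders):
    """Fraction of clinician-ordered investigations the model also ordered."""
    reference = {item.strip().lower() for item in clinician_orders}
    if not reference:
        return 0.0
    proposed = {item.strip().lower() for item in model_orders}
    return len(reference & proposed) / len(reference)


case = load_case(case_json)
prompt = build_prompt(case)

# Stand-in for a model response; an assumption, not real model output.
mock_model_orders = ["ECG", "Troponin", "chest X-ray"]
score = investigation_overlap(mock_model_orders, case["clinician_investigations"])
```

Keeping the clinician's decisions in the same record but out of the prompt is one simple way to compare model choices with real-world practice; a study of this kind would likely need richer scoring, for example checking conformance to clinical guidelines rather than exact string overlap.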

 

Publication date: 8 Jan 2024
Project Page: Not provided
Paper: https://arxiv.org/pdf/2401.02851