To the Editor: Following the release of the generative pre‐trained transformer (GPT) chatbot ChatGPT in November 2022, a wide range of large language models (LLMs), including ChatGPT‐3.5 (GPT‐3‐derived), ChatGPT‐4 and New Bing (both GPT‐4‐derived), have been made publicly available. It has been suggested that ChatGPT‐4 outperforms ChatGPT‐3.5 in answering medical examination questions,1 but it is unknown whether GPT‐4‐derived LLMs consistently outperform GPT‐3‐derived LLMs.
- 1. Nori H, King N, McKinney SM, et al. Capabilities of GPT‐4 on medical challenge problems [preprint]. arXiv 2303.13375; 20 Mar 2023. https://doi.org/10.48550/arXiv.2303.13375 (viewed Mar 2023).
- 2. OpenAI. GPT‐4 technical report [preprint]. arXiv 2303.08774; 15 Mar 2023. https://doi.org/10.48550/arXiv.2303.08774 (viewed Mar 2023).
- 3. Australian Medical Council Limited. MCQ trial examination [website]. Canberra: AMC, 2022. https://www.amc.org.au/assessment/mcq/mcq‐trial/ (viewed Mar 2023).
- 4. Kung TH, Cheatham M, Medenilla A, et al. Performance of ChatGPT on USMLE: potential for AI‐assisted medical education using large language models. PLOS Digit Health 2023; 2: e0000198.
- 5. Gilson A, Safranek CW, Huang T, et al. How does ChatGPT perform on the United States Medical Licensing Examination? The implications of large language models for medical education and knowledge assessment. JMIR Med Educ 2023; 9: e45312.
We thank Joshua Kovoor for providing editorial and statistical support for this piece.
No relevant disclosures.