Expert of Experts Verification and Alignment (EVAL) Framework for Large Language Models Safety in Gastroenterology | Institution for Social and Policy Studies

Author(s):

Maro Giuffre, Kisung You, Ziteng Pang, Simone Kresevic, Sunny Chung, Ryan Chen, Youngmin, Ko, Colleen, Chan, Theo Saarinen, Milos Ajcevic, Lory S. Croce, Guadalupe Garcia-Tsao, Ian Gralnek, Joseph J.Y. Sung, Alan Barkun, Loren Laine, Jasjeet Sekhon, Bradly Stadie, and Dennis Shung

ISPS ID:

isps25-34

Full citation:

Giuffrè, M., You, K., Pang, Z. et al. Expert of Experts Verification and Alignment (EVAL) Framework for Large Language Models Safety in Gastroenterology. npj Digit. Med. 8, 242 (2025). https://doi.org/10.1038/s41746-025-01589-z

Abstract:

Large language models generate plausible text responses to medical questions, but inaccurate responses pose significant risks in medical decision-making. Grading LLM outputs to determine the best model or answer is time-consuming and impractical in clinical settings; therefore, we introduce EVAL (Expert-of-Experts Verification and Alignment) to streamline this process and enhance LLM safety for upper gastrointestinal bleeding (UGIB). We evaluated OpenAI’s GPT-3.5/4/4o/o1-preview, Anthropic’s Claude-3-Opus, Meta’s LLaMA-2 (7B/13B/70B), and Mistral AI’s Mixtral (7B) across 27 configurations, including zero-shot baseline, retrieval-augmented generation, and supervised fine-tuning. EVAL uses similarity-based ranking and a reward model trained on human-graded responses for rejection sampling. Among the employed similarity metrics, Fine-Tuned ColBERT achieved the highest alignment with human performance across three separate datasets (ρ = 0.81–0.91). The reward model replicated human grading with 87.9% of cases across temperature settings and significantly improved accuracy through rejection sampling by 8.36% overall. EVAL offers scalable potential to assess accuracy for high-stakes medical decision-making.

Supplemental information:

Full article.

Related data:

Expert-generated questions are available in Table 3 of the manuscript, while expert free-text answers and real-world clinical questions can be found in the supplementary files.

Code can be provided based on personal requests. Please contact the corresponding author. The reward model has been uploaded on Hugging Face at the following link: https://huggingface.co/ZachariahPang/medical_reward_model.

Publication date:

2025

Publication type:

Peer Reviewed Article

Publication name:

npj Digital Medicine

Discipline:

Interdisciplinary

Area of study:

Health