Claude 3 Opus (5-shot) | 75.8 | The Claude 3 Model Family: Opus, Sonnet, Haiku | - |
Human Performance (single annotator) | 78.0 | PubMedQA: A Dataset for Biomedical Research Question Answering | |
Flan-PaLM (540B, Few-shot) | 79 | Large Language Models Encode Clinical Knowledge | - |
Flan-PaLM (62B, Few-shot) | 77.2 | Large Language Models Encode Clinical Knowledge | - |