| Model | Accuracy (%) | Paper | Code |
| --- | --- | --- | --- |
| LLaMA-3 8B + MoSLoRA (fine-tuned) | 81.0 | Mixture-of-Subspaces in Low-Rank Adaptation | |
| RoBERTa-large 355M (fine-tuned) | 76.7 | RoBERTa: A Robustly Optimized BERT Pretraining Approach | |
| BERT-large 340M (fine-tuned) | 64.5 | SocialIQA: Commonsense Reasoning about Social Interactions | |
| BERT-base 110M (fine-tuned) | 63.1 | SocialIQA: Commonsense Reasoning about Social Interactions | |