Albert Q. Jiang; Alexandre Sablayrolles; Arthur Mensch; Chris Bamford; Devendra Singh Chaplot; Diego de las Casas; Florian Bressand; Gianna Lengyel; Guillaume Lample; Lucile Saulnier; Lélio Renard Lavaud; Marie-Anne Lachaux; Pierre Stock; Teven Le Scao; Thibaut Lavril; Thomas Wang; Timothée Lacroix; William El Sayed

Abstract
We introduce Mistral 7B v0.1, a 7-billion-parameter language model engineered for superior performance and efficiency. Mistral 7B outperforms Llama 2 13B across all evaluated benchmarks, and Llama 1 34B in reasoning, mathematics, and code generation. Our model leverages grouped-query attention (GQA) for faster inference, coupled with sliding window attention (SWA) to effectively handle sequences of arbitrary length with a reduced inference cost. We also provide a model fine-tuned to follow instructions, Mistral 7B – Instruct, that surpasses the Llama 2 13B – Chat model on both human and automated benchmarks. Our models are released under the Apache 2.0 license.
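As a rough illustration of the two attention mechanisms named in the abstract, the sketch below builds a causal sliding-window mask and maps query heads to shared key/value heads. It is not the released implementation: the function names, the tiny sequence length, the window size, and the head counts are all hypothetical values chosen for readability, and the window is assumed to cover the current position plus the `window - 1` positions before it.

```python
def sliding_window_causal_mask(seq_len, window):
    """Return mask[i][j] == True when query position i may attend to key position j.

    Assumption for this sketch: position i sees itself and the (window - 1)
    positions immediately before it, and never any future position.
    """
    return [
        [(j <= i) and (i - j < window) for j in range(seq_len)]
        for i in range(seq_len)
    ]


def kv_head_for_query_head(q_head, n_q_heads, n_kv_heads):
    """Grouped-query attention: consecutive query heads share one key/value head."""
    group_size = n_q_heads // n_kv_heads
    return q_head // group_size


# Hypothetical toy sizes, chosen only so the pattern is easy to read.
mask = sliding_window_causal_mask(seq_len=6, window=3)
for row in mask:
    print("".join("x" if allowed else "." for allowed in row))

# With 8 query heads and 2 key/value heads (illustrative numbers),
# query heads 0-3 read KV head 0 and query heads 4-7 read KV head 1.
print([kv_head_for_query_head(h, n_q_heads=8, n_kv_heads=2) for h in range(8)])
```

Each row of the printed mask has at most `window` allowed positions, so the attention work per token stays bounded as the sequence grows. Sharing key/value heads across a group of query heads shrinks the key/value cache that must be kept during inference.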
Benchmarks
| Benchmark | Model (setting) | Metrics |
|---|---|---|
| answerability-prediction-on-peerqa | Mistral-IT-v02-7B-32k | Macro F1: 0.4703 |
| arithmetic-reasoning-on-gsm8k | Mistral 7B (maj@8) | Accuracy: 52.2; Parameters (Billions): 7 |
| code-generation-on-mbpp | Mistral 7B (3-shot) | Accuracy: 47.5 |
| common-sense-reasoning-on-arc-challenge | Mistral 7B (0-shot) | Accuracy: 55.5 |
| common-sense-reasoning-on-arc-easy | Mistral 7B (0-shot) | Accuracy: 80.0 |
| common-sense-reasoning-on-winogrande | Mistral 7B (0-shot) | Accuracy: 75.3 |
| math-word-problem-solving-on-math | Mistral 7B (maj@4) | Accuracy: 13.1; Parameters (Billions): 7 |
| multi-task-language-understanding-on-mmlu | Mistral 7B (5-shot) | Average (%): 60.1 |
| question-answering-on-natural-questions | Mistral 7B (5-shot) | EM: 28.8 |
| question-answering-on-peerqa | Mistral-v02-7B-32k | AlignScore: 0.0827; Prometheus-2 Answer Correctness: 3.4245; Rouge-L: 0.1922 |
| question-answering-on-piqa | Mistral 7B (0-shot) | Accuracy: 83.0 |
| question-answering-on-triviaqa | Mistral 7B (5-shot) | EM: 69.9 |
| zero-shot-video-question-answer-on-intentqa | Mistral (7B) | Accuracy: 50.4 |
| zero-shot-video-question-answer-on-next-gqa | Mistral (7B) | Acc@GQA: 9.2 |
| zero-shot-video-question-answer-on-next-qa | Mistral (7B) | Accuracy: 51.1 |
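
The maj@8 and maj@4 settings in the table refer to majority voting: several answers are sampled per problem and the most frequent one is scored against the reference. The sketch below shows the idea under stated assumptions. The function names and the sample data are hypothetical, answers are compared as exact strings, and ties go to the answer that was sampled first; the actual evaluation harness may normalize answers or break ties differently.

```python
from collections import Counter


def majority_vote(answers):
    """Return the most frequent answer among the sampled answers.

    Counter.most_common keeps first-seen order among equal counts,
    so a tie resolves to the answer that appeared first.
    """
    return Counter(answers).most_common(1)[0][0]


def maj_at_k_accuracy(sampled_answers, references):
    """Fraction of problems whose majority-voted answer equals the reference."""
    correct = sum(
        majority_vote(samples) == reference
        for samples, reference in zip(sampled_answers, references)
    )
    return correct / len(references)


# Hypothetical data: two problems with k = 4 sampled answers each.
sampled = [
    ["18", "18", "20", "18"],  # majority is "18"
    ["7", "9", "9", "5"],      # majority is "9"
]
references = ["18", "7"]
print(maj_at_k_accuracy(sampled, references))
```

In this toy run the first problem is scored as correct and the second as incorrect, since its majority answer differs from the reference even though one sample matched.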