WebInstruct-verified Multi-domain Reasoning Dataset
Date
4 days ago
License
Apache 2.0
WebInstruct-verified is a multi-domain reasoning dataset jointly released by the University of Waterloo and the Vector Institute in 2025. The associated paper is "General-Reasoner: Advancing LLM Reasoning Across All Domains", which aims to enhance LLMs' reasoning ability across diverse fields while retaining their strengths in mathematics.
This dataset contains approximately 230,000 reasoning questions covering a variety of answer formats, including multiple-choice questions and numerical expressions, in a balanced distribution. It primarily covers disciplines such as mathematics, physics, chemistry, and finance, as well as various humanities and social sciences.
Dataset characteristics:
- Zero RL training: Reinforcement learning is applied directly to the base LLM, bypassing the intermediate supervised fine-tuning stage.
- Diverse reasoning data: Over 230K high-quality, verifiable questions sourced from the web, filtered for answer verifiability across disciplines.
- Model-based verifier: A compact 1.5B generative verifier model for context-aware, chain-of-thought answer verification that outperforms traditional rule-based approaches.
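To illustrate why a model-based verifier can help, the following is a minimal sketch of a rule-based answer checker of the kind the last point compares against. It is not the verifier from the paper; the function names `normalize` and `rule_based_match` and the example answers are hypothetical, chosen only to show where string and numeric matching tends to break down.

```python
from fractions import Fraction


def normalize(answer: str) -> str:
    """Trim whitespace, lowercase, and drop trailing periods."""
    return answer.strip().lower().rstrip(".")


def rule_based_match(prediction: str, reference: str) -> bool:
    """Accept on exact string match or, failing that, equal numeric value."""
    pred, ref = normalize(prediction), normalize(reference)
    if pred == ref:
        return True
    try:
        # Fraction parses strings such as "0.5" and "1/2" exactly.
        return Fraction(pred) == Fraction(ref)
    except (ValueError, ZeroDivisionError):
        # Not a plain number, so the rules have nothing more to compare.
        return False


# Hypothetical (prediction, reference) pairs, not taken from the dataset.
examples = [
    ("0.5", "1/2"),                                   # same value, different form
    ("B", "b"),                                       # multiple-choice letter
    ("9.8 m/s^2", "9.8 meters per second squared"),   # same quantity, free text
]
results = [rule_based_match(p, r) for p, r in examples]
print(results)
```

The first two pairs are accepted, while the third is rejected even though both strings describe the same quantity. Cases like this, where equivalence depends on context rather than surface form, are the kind a generative verifier is intended to handle.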
