WildSpeech-Bench: Speech Understanding and Generation Benchmark Dataset

Date: 2 months ago
Size: 169.51 MB
Organization: Tencent AI Lab, Tencent
Paper URL: 2506.21875
License: CC BY 4.0

WildSpeech-Bench, released by Tencent in 2025, is the first benchmark for evaluating the speech-to-speech (S2S) capabilities of SpeechLLMs. It accompanies the paper "WildSpeech-Bench: Benchmarking End-to-End SpeechLLMs in the Wild" and aims to measure a model's end-to-end ability to understand speech input and generate speech output in real voice interaction scenarios.

The dataset contains 1,100 queries across five main categories, each corresponding to a common user intent: information queries, solution requests, opinion exchanges, text creation, and paralinguistic expressions. Of these, 1,000 queries come from general voice interaction scenarios (the first four categories), and the remaining 100 are characterized by paralinguistic features such as pauses, intonation, stuttering, and recognition of similar-sounding words. Each query is accompanied by diverse speech output examples that cover a wide range of speaker attributes (gender, age, voice variants), acoustic conditions, and noise environments, to simulate the diversity and challenges of natural voice interaction more realistically.

[Figure: Dataset construction process]

WildSpeech-Bench.torrent
  • WildSpeech-Bench/
    • README.md (1.83 KB)
    • README.txt (3.66 KB)
    • data/
      • WildSpeech-Bench.zip (169.51 MB)
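
The listing shows only the outer archive, not its internal layout. As a minimal sketch for inspecting the download before unpacking it, the Python snippet below counts the files in a zip archive and tallies them by extension. The function name `summarize_archive` is hypothetical, and nothing is assumed about the folder structure or audio format inside WildSpeech-Bench.zip.

```python
import zipfile
from collections import Counter
from pathlib import PurePosixPath


def summarize_archive(zip_path):
    """Return (file_count, total_uncompressed_bytes, Counter of extensions).

    Hypothetical helper for illustration; it reads only the zip's
    central directory and does not extract anything.
    """
    with zipfile.ZipFile(zip_path) as archive:
        # Skip directory entries so that only real files are counted.
        files = [info for info in archive.infolist() if not info.is_dir()]

    # Zip entry names always use forward slashes, hence PurePosixPath.
    extensions = Counter(
        PurePosixPath(info.filename).suffix.lower() or "(none)"
        for info in files
    )
    total_bytes = sum(info.file_size for info in files)
    return len(files), total_bytes, extensions
```

Calling `summarize_archive("WildSpeech-Bench/data/WildSpeech-Bench.zip")` (a path assumed from the listing above; adjust it to wherever the file was saved) returns the counts without writing anything to disk, which is a cheap way to check what the archive contains before extracting it.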
