HyperAI

CAS-VSR-W1k Lip Reading Recognition Dataset

Date

3 years ago

Organization

Publish URL

vipl.ict.ac.cn

License

Non-commercial use

CAS-VSR-W1k, formerly known as LRW-1000, is the largest publicly available Mandarin word-level lip reading dataset. It contains 1,000 word classes and roughly 700,000 samples from more than 2,000 speakers, covering more than 1,000,000 Chinese character instances in total.

Each class corresponds to the syllables of a Mandarin word made up of one or several Chinese characters. The dataset is designed to cover the natural variation across speech modes and imaging conditions, so that it reflects the challenges encountered in real-world applications.