Question Answering on NExT-QA (Open-Ended)
Evaluation Metrics
Accuracy
Confidence Score
Evaluation Results
Performance of each model on this benchmark
Model | Accuracy | Confidence Score | Paper Title | Repository |
---|---|---|---|---|
MovieChat | 49.9 | 2.7 | MovieChat: From Dense Token to Sparse Memory for Long Video Understanding | |
Video-ChatGPT | 54.6 | 3.2 | Video-ChatGPT: Towards Detailed Video Understanding via Large Vision and Language Models | |
VideoChat | 56.6 | 3.2 | VideoChat: Chat-Centric Video Understanding | |
Vista-LLaMA | 60.7 | 3.4 | Vista-LLaMA: Reliable Video Narrator via Equal Distance to Visual Tokens | |
Flash-VStream | 61.6 | 3.4 | Flash-VStream: Memory-Based Real-Time Understanding for Long Video Streams | |
MovieChat+ | 54.8 | 3.0 | MovieChat+: Question-aware Sparse Memory for Long Video Question Answering | |