Zero-Shot Video Question Answer on MSRVTT-QA
Metrics
Accuracy
Confidence Score
Results
Performance results of various models on this benchmark
Model Name | Accuracy | Confidence Score | Paper Title | Repository |
---|---|---|---|---|
Chat-UniVi-7B | 55.0 | 3.1 | - | - |
TS-LLaVA-34B | 66.2 | 3.6 | - | - |
BT-Adapter (zero-shot) | 51.2 | 2.9 | - | - |
Video-LLaVA-7B | 59.2 | 3.5 | - | - |
LLaMA-VID-7B (2 Token) | 57.7 | 3.2 | - | - |
IG-VLM | 63.8 | 3.5 | - | - |
Omni-VideoAssistant | 55.3 | 3.3 | - | - |
Elysium | 67.5 | 3.2 | - | - |
MovieChat | 52.7 | 2.6 | - | - |
SUM-shot+Vicuna | 56.8 | - | - | - |
CAT-7B | 62.1 | 3.5 | - | - |
VideoChat2 | 54.1 | 3.3 | - | - |
Vista-LLaMA-7B | 60.5 | 3.3 | - | - |
Tarsier (34B) | 66.4 | 3.7 | - | - |
Video-LaVIT | 59.3 | 3.3 | - | - |
Video Chat-7B | 45.0 | 2.5 | - | - |
VideoGPT+ | 60.6 | 3.6 | - | - |
Video-ChatGPT-7B | 49.3 | 2.8 | - | - |
PLLaVA (34B) | 68.7 | 3.6 | - | - |