HyperAI
HyperAI超神经
Zero-Shot Video Question Answering on IntentQA
Evaluation Metric
Accuracy
Results
Performance of each model on this benchmark
| Model | Accuracy (%) | Paper Title | Repository |
|---|---|---|---|
| ENTER | 71.5 | ENTER: Event Based Interpretable Reasoning for VideoQA | - |
| LVNet | 71.1 | Too Many Frames, Not All Useful: Efficient Strategies for Long-Form Video QA | |
| TS-LLaVA-34B | 67.9 | TS-LLaVA: Constructing Visual Tokens through Thumbnail-and-Sampling for Training-Free Video Large Language Models | |
| VidCtx (7B) | 67.1 | VidCtx: Context-aware Video Question Answering with Image Models | |
| VideoTree (GPT4) | 66.9 | VideoTree: Adaptive Tree-based Video Representation for LLM Reasoning on Long Videos | |
| IG-VLM | 65.3 | An Image Grid Can Be Worth a Video: Zero-shot Video Question Answering Using a VLM | |
| LLoVi (GPT-4) | 64.0 | A Simple LLM Framework for Long-Range Video Question-Answering | |
| SeViLA (4B) | 60.9 | Self-Chained Image-Language Model for Video Localization and Question Answering | |
| SlowFast-LLaVA-34B | 60.1 | SlowFast-LLaVA: A Strong Training-Free Baseline for Video Large Language Models | |
| LangRepo (12B) | 59.1 | Language Repository for Long Video Understanding | |
| LLoVi (7B) | 53.6 | A Simple LLM Framework for Long-Range Video Question-Answering | |
| Mistral (7B) | 50.4 | Mistral 7B | |
| Random | 20.0 | - | - |