HyperAI超神经

Visual Question Answering on ViP-Bench

Evaluation Metrics

GPT-4 score (bbox)
GPT-4 score (human)

Evaluation Results

Performance of each model on this benchmark

Comparison Table
Model                                         GPT-4 score (bbox)  GPT-4 score (human)
inst-it-boosting-multimodal-instance          50.5                49.0
instructblip-towards-general-purpose-vision   35.8                35.2
gpt-4-technical-report-1                      60.7                59.9
gpt4roi-instruction-tuning-large-language     35.1                -
kosmos-2-grounding-multimodal-large-language  26.9                -
qwen-vl-a-frontier-large-vision-language      45.3                -
improved-baselines-with-visual-instruction    41.8                42.9
shikra-unleashing-multimodal-llm-s            33.7                -
improved-baselines-with-visual-instruction    47.1                -
gpt-4-technical-report-1                      52.8                51.4
qwen-vl-a-frontier-large-vision-language      39.2                41.7
inst-it-boosting-multimodal-instance          45.1                48.2
making-large-language-models-better-data      48.3                48.2