Visual Reasoning on Bongard-OpenWorld
Evaluation Metrics
2-Class Accuracy
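As a rough illustration of the metric, the sketch below assumes that 2-class accuracy is the percentage of binary (positive vs. negative) query predictions that match the ground-truth labels, so that random guessing would score about 50. The function name `two_class_accuracy` and the 0/1 label encoding are hypothetical and are not taken from the benchmark's own evaluation code.

```python
from typing import Sequence


def two_class_accuracy(predictions: Sequence[int], labels: Sequence[int]) -> float:
    """Return the percentage of binary predictions that match the labels.

    Assumption: 1 marks a positive query and 0 a negative one.
    """
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if len(labels) == 0:
        raise ValueError("at least one example is required")
    # Count the positions where the predicted class equals the true class.
    correct = sum(1 for p, y in zip(predictions, labels) if p == y)
    return 100.0 * correct / len(labels)


if __name__ == "__main__":
    # Three of four hypothetical predictions are correct.
    print(two_class_accuracy([1, 0, 1, 1], [1, 0, 0, 1]))
```

Under this reading, the scores in the table are percentages on a 0 to 100 scale.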
Evaluation Results
Performance of each model on this benchmark
Model Name | 2-Class Accuracy | Paper Title | Repository |
---|---|---|---|
Human | 91.0 | Bongard-OpenWorld: Few-Shot Reasoning for Free-form Visual Concepts in the Real World | - |
ChatCaptioner + ChatGPT | 49.3 | Bongard-OpenWorld: Few-Shot Reasoning for Free-form Visual Concepts in the Real World | - |
Componential analysis - gpt-4o | 92.8 | Cognitive Paradigms for Evaluating VLMs on Visual Reasoning Task | - |
Componential analysis - gemini-2.0 | 93.6 | Cognitive Paradigms for Evaluating VLMs on Visual Reasoning Task | - |
BLIP-2 + ChatGPT (Fine-tuned) | 63.3 | Bongard-OpenWorld: Few-Shot Reasoning for Free-form Visual Concepts in the Real World | - |
InstructBLIP + ChatGPT + Neuro-Symbolic | 55.5 | Bongard-OpenWorld: Few-Shot Reasoning for Free-form Visual Concepts in the Real World | - |
SNAIL | 64.0 | Bongard-OpenWorld: Few-Shot Reasoning for Free-form Visual Concepts in the Real World | - |
Otter | 49.3 | Bongard-OpenWorld: Few-Shot Reasoning for Free-form Visual Concepts in the Real World | - |
InstructBLIP + GPT-4 | 63.8 | Bongard-OpenWorld: Few-Shot Reasoning for Free-form Visual Concepts in the Real World | - |