Visual Question Answering on IconQA
Evaluation Metrics
Reasoning (Alg.)
Reasoning (Com.)
Reasoning (Cou.)
Reasoning (Est.)
Reasoning (Fra.)
Reasoning (Geo.)
Reasoning (Mea.)
Reasoning (Pat.)
Reasoning (Pro.)
Reasoning (Sce.)
Reasoning (Sen.)
Reasoning (Spa.)
Reasoning (Tim.)
Sub-tasks (Blank)
Sub-tasks (Img.)
Sub-tasks (Txt.)
Evaluation Results
Performance of each model on this benchmark
Comparison Table
Model Name | Reasoning (Alg.) | Reasoning (Com.) | Reasoning (Cou.) | Reasoning (Est.) | Reasoning (Fra.) | Reasoning (Geo.) | Reasoning (Mea.) | Reasoning (Pat.) | Reasoning (Pro.) | Reasoning (Sce.) | Reasoning (Sen.) | Reasoning (Spa.) | Reasoning (Tim.) | Sub-tasks (Blank) | Sub-tasks (Img.) | Sub-tasks (Txt.) |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
iconqa-a-new-benchmark-for-abstract-diagram | 50.27 | 81.69 | 70.68 | 99.02 | 77.60 | 81.80 | 98.83 | 56.60 | 85.70 | 67.01 | 84.11 | 51.42 | 67.72 | 78.28 | 77.72 | 72.17 |
iconqa-a-new-benchmark-for-abstract-diagram | 28.02 | 48.19 | 33.63 | 40.46 | 33.06 | 38.03 | 38.07 | 33.66 | 40.76 | 35.37 | 45.25 | 37.14 | 48.09 | 28.45 | 41.64 | 36.86 |
iconqa-a-new-benchmark-for-abstract-diagram | 50.62 | 75.60 | 71.05 | 99.22 | 74.09 | 80.05 | 99.07 | 62.78 | 70.94 | 58.52 | 81.78 | 49.46 | 66.72 | 77.08 | 76.66 | 70.47 |
iconqa-a-new-benchmark-for-abstract-diagram | 31.73 | 45.26 | 37.64 | 62.29 | 32.48 | 38.71 | 64.02 | 36.29 | 37.51 | 35.47 | 45.25 | 37.52 | 47.37 | 46.65 | 41.56 | 36.02 |
iconqa-a-new-benchmark-for-abstract-diagram | 11.12 | 41.20 | 18.38 | 3.62 | 34.84 | 30.30 | 0.36 | 34.81 | 38.81 | 34.25 | 45.16 | 36.49 | 35.82 | 0.29 | 41.70 | 36.87 |
iconqa-a-new-benchmark-for-abstract-diagram | 50.55 | 84.95 | 71.13 | 99.02 | 75.81 | 82.61 | 98.91 | 59.22 | 87.65 | 66.72 | 86.10 | 53.38 | 69.99 | 79.27 | 79.67 | 72.69 |
iconqa-a-new-benchmark-for-abstract-diagram | 49.18 | 83.67 | 71.01 | 99.41 | 78.37 | 81.31 | 99.38 | 60.81 | 87.84 | 61.25 | 86.10 | 48.34 | 69.77 | 78.53 | 78.71 | 72.39 |
iconqa-a-new-benchmark-for-abstract-diagram | 47.32 | 82.73 | 68.94 | 99.08 | 76.20 | 79.86 | 98.99 | 54.79 | 84.87 | 62.49 | 83.25 | 49.70 | 68.00 | 74.52 | 77.36 | 71.25 |
iconqa-a-new-benchmark-for-abstract-diagram | 47.46 | 82.12 | 67.56 | 97.06 | 73.77 | 79.99 | 96.50 | 55.67 | 82.45 | 66.92 | 82.12 | 53.20 | 66.50 | 75.54 | 76.33 | 70.82 |
iconqa-a-new-benchmark-for-abstract-diagram | 56.73 | 87.00 | 77.81 | 98.24 | 82.13 | 81.87 | 97.98 | 68.75 | 95.73 | 62.39 | 92.49 | 55.62 | 77.98 | 83.62 | 82.66 | 75.19 |
iconqa-a-new-benchmark-for-abstract-diagram | 51.10 | 82.12 | 70.84 | 98.95 | 77.41 | 82.60 | 98.76 | 58.46 | 86.07 | 68.80 | 84.72 | 54.64 | 68.66 | 78.92 | 79.15 | 72.34 |
iconqa-a-new-benchmark-for-abstract-diagram | 50.00 | 80.65 | 65.01 | 99.54 | 72.43 | 80.07 | 99.46 | 55.01 | 83.75 | 58.22 | 84.54 | 45.78 | 68.28 | 73.03 | 75.92 | 68.51 |