Image Captioning On Nocaps Val Out Domain
Evaluation metrics
CIDEr
SPICE
Evaluation results
Performance of each model on this benchmark
| Model | CIDEr | SPICE | Paper Title | Repository |
| --- | --- | --- | --- | --- |
| BLIP-2 ViT-G FlanT5 XL (zero-shot) | 124.8 | 15.1 | BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models | |
| BLIP-2 ViT-G OPT 6.7B (zero-shot) | 124.4 | 14.8 | BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models | |
| BLIP-2 ViT-G OPT 2.7B (zero-shot) | 123.4 | 15.1 | BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models | |
| BLIP_ViT-L | 115.3 | 14.4 | BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation | |
| SimVLM | 115.2 | - | SimVLM: Simple Visual Language Model Pretraining with Weak Supervision | |
| BLIP_CapFilt-L | 111.5 | 14.2 | BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation | |
| LEMON_large | 111.3 | 14.0 | Scaling Up Vision-Language Pre-training for Image Captioning | - |
| OmniVL | 106.3 | 14.2 | OmniVL: One Foundation Model for Image-Language and Video-Language Tasks | - |
| Enc-Dec | 94.5 | 11.9 | Conceptual 12M: Pushing Web-Scale Image-Text Pre-Training To Recognize Long-Tail Visual Concepts | |
| VinVL | 88.3 | 12.1 | VinVL: Revisiting Visual Representations in Vision-Language Models | |