KRIS-Bench: Benchmarking Next-Level Intelligent Image Editing Models

Yongliang Wu, Zonghui Li, Xinting Hu, Xinyu Ye, Xianfang Zeng, Gang Yu, Wenbo Zhu, Bernt Schiele, Ming-Hsuan Yang, Xu Yang
Release Date: 5/25/2025
Abstract

Recent advances in multi-modal generative models have enabled significant progress in instruction-based image editing. However, while these models produce visually plausible outputs, their capacity for knowledge-based reasoning editing tasks remains under-explored. In this paper, we introduce KRIS-Bench (Knowledge-based Reasoning in Image-editing Systems Benchmark), a diagnostic benchmark designed to assess models through a cognitively informed lens. Drawing from educational theory, KRIS-Bench categorizes editing tasks across three foundational knowledge types: Factual, Conceptual, and Procedural. Based on this taxonomy, we design 22 representative tasks spanning 7 reasoning dimensions and release 1,267 high-quality annotated editing instances. To support fine-grained evaluation, we propose a comprehensive protocol that incorporates a novel Knowledge Plausibility metric, enhanced by knowledge hints and calibrated through human studies. Empirical results on 10 state-of-the-art models reveal significant gaps in reasoning performance, highlighting the need for knowledge-centric benchmarks to advance the development of intelligent image editing systems.