WebClick Web Page Understanding Benchmark Dataset
WebClick is a high-quality benchmark dataset for evaluating how well multimodal models and agents understand web interfaces, interpret user instructions, and take precise actions in digital environments.
The dataset contains 1,639 English-language webpage screenshots from more than 100 websites, each paired with a precisely annotated natural-language instruction and a pixel-level click target.
Dataset structure:
- agentbrowse (36%): Pages encountered by the SurferH agent while solving WebVoyager's web retrieval tasks
- humanbrowse (31.8%): Pages and elements that humans interact with when performing everyday tasks (online shopping, travel planning, personal organization)
- calendars (32.2%): A specialized subset focused on calendar interfaces, which are a known challenge for UI understanding models
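
To make the split concrete, the following minimal sketch converts the stated percentages into approximate screenshot counts. The variable and function names are illustrative only and are not part of the dataset or any official tooling. Because the published percentages are rounded, the resulting counts should be read as estimates rather than exact subset sizes.

```python
# Approximate per-subset screenshot counts, derived only from the figures
# stated above (1,639 screenshots; 36% / 31.8% / 32.2%).
# All names here are hypothetical and chosen for illustration.

TOTAL_SCREENSHOTS = 1639

SUBSET_SHARES = {
    "agentbrowse": 0.36,
    "humanbrowse": 0.318,
    "calendars": 0.322,
}


def approximate_subset_sizes(total, shares):
    """Return a dict mapping each subset name to its rounded estimated size."""
    return {name: round(total * share) for name, share in shares.items()}


sizes = approximate_subset_sizes(TOTAL_SCREENSHOTS, SUBSET_SHARES)
for name, count in sizes.items():
    print(f"{name}: about {count} screenshots")
```

With these inputs the three estimates add up to the full 1,639 screenshots, which suggests the published percentages are consistent with the stated total.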