Natalia Valderrama, Paola Ruiz Puentes, Isabela Hernández, Nicolás Ayobi, Mathilde Verlyk, Jessica Santander, Juan Caicedo, Nicolás Fernández, Pablo Arbeláez

Abstract
Most benchmarks for studying surgical interventions focus on a specific challenge instead of leveraging the intrinsic complementarity among different tasks. In this work, we present a new experimental framework towards holistic surgical scene understanding. First, we introduce the Phase, Step, Instrument, and Atomic Visual Action recognition (PSI-AVA) Dataset. PSI-AVA includes annotations for both long-term (Phase and Step recognition) and short-term reasoning (Instrument detection and novel Atomic Action recognition) in robot-assisted radical prostatectomy videos. Second, we present Transformers for Action, Phase, Instrument, and steps Recognition (TAPIR) as a strong baseline for surgical scene understanding. TAPIR leverages our dataset's multi-level annotations as it benefits from the learned representation on the instrument detection task to improve its classification capacity. Our experimental results in both PSI-AVA and other publicly available databases demonstrate the adequacy of our framework to spur future research on holistic surgical scene understanding.
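The abstract states that TAPIR reuses representations learned for instrument detection to improve its recognition heads. A minimal sketch of that idea is below; all names, feature shapes, and the fusion-by-concatenation choice are illustrative assumptions, not the paper's actual architecture:

```python
# Hypothetical sketch: pool per-frame instrument-detection features, fuse them
# with video features, and classify a long-term label (e.g. surgical phase).
# Shapes and the linear head are assumptions for illustration only.
import numpy as np

def fuse_and_classify(video_feats, det_feats, w):
    """video_feats: (T, D) per-frame video features.
    det_feats: (T, K, D) features for K detected instrument boxes per frame.
    w: (2*D, C) weights of a linear classification head.
    Returns class scores of shape (C,)."""
    pooled_det = det_feats.mean(axis=1)                       # pool boxes per frame -> (T, D)
    fused = np.concatenate([video_feats, pooled_det], axis=1) # (T, 2*D)
    clip = fused.mean(axis=0)                                 # temporal average pooling -> (2*D,)
    return clip @ w                                           # (C,) class scores

rng = np.random.default_rng(0)
T, K, D, C = 8, 4, 16, 3
scores = fuse_and_classify(
    rng.normal(size=(T, D)),
    rng.normal(size=(T, K, D)),
    rng.normal(size=(2 * D, C)),
)
print(scores.shape)
```

In the paper's framing, the same detection-derived features would also support the short-term tasks (instrument detection and atomic action recognition), which is what makes the multi-level annotations complementary.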
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| Surgical Phase Recognition on MISAW | TAPIR | mAP: 94.24 |