Social Scene Understanding: End-to-End Multi-Person Action Localization and Collective Activity Recognition

Timur Bagautdinov; Alexandre Alahi; François Fleuret; Pascal Fua; Silvio Savarese

Abstract

We present a unified framework for understanding human social behaviors in raw image sequences. Our model jointly detects multiple individuals, infers their social actions, and estimates the collective actions with a single feed-forward pass through a neural network. We propose a single architecture that does not rely on external detection algorithms but rather is trained end-to-end to generate dense proposal maps that are refined via a novel inference scheme. The temporal consistency is handled via a person-level matching Recurrent Neural Network. The complete model takes as input a sequence of frames and outputs detections along with the estimates of individual actions and collective activities. We demonstrate state-of-the-art performance of our algorithm on multiple publicly available benchmarks.

Benchmarks

Benchmark | Methodology | Metrics
action-recognition-in-videos-on-volleyball | GTT (VGG19) | Accuracy: 82.6
action-recognition-in-videos-on-volleyball | SSU (GT) | Accuracy: 81.8
