3 months ago

LLMs are Good Action Recognizers

Haoxuan Qu, Yujun Cai, Jun Liu

Abstract

Skeleton-based action recognition has attracted considerable research attention. To build accurate skeleton-based action recognizers, a variety of approaches have recently been proposed. Some use large model architectures as backbones to boost skeleton data representation capability, while others pre-train their recognizers on external data to enrich their knowledge. In this work, we observe that large language models, which have been used extensively in natural language processing tasks, generally offer both large model architectures and rich implicit knowledge. Motivated by this, we propose a novel LLM-AR framework, in which we investigate treating the Large Language Model as an Action Recognizer. In our framework, we propose a linguistic projection process that projects each input action signal (i.e., each skeleton sequence) into its "sentence format" (i.e., an "action sentence"). We also incorporate several designs into the framework to further facilitate this linguistic projection process. Extensive experiments demonstrate the efficacy of the proposed framework.
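The abstract does not say how the linguistic projection is carried out. As a rough illustration only, the sketch below assumes one plausible reading: each skeleton frame is mapped to the nearest entry of a small codebook, and the resulting token indices are written out as a string. The function name `project_to_action_sentence`, the codebook values, and the `<act_i>` token format are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence

# One frame = flattened joint coordinates for a single time step (assumed layout).
Frame = Sequence[float]


def squared_distance(a: Frame, b: Frame) -> float:
    """Squared Euclidean distance between two equally sized frames."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def project_to_action_sentence(frames: List[Frame], codebook: List[Frame]) -> str:
    """Map each frame to its nearest codebook entry and join the tokens.

    This is a hypothetical nearest-neighbor quantization, not the paper's method.
    """
    tokens = []
    for frame in frames:
        nearest = min(
            range(len(codebook)),
            key=lambda i: squared_distance(frame, codebook[i]),
        )
        tokens.append(f"<act_{nearest}>")  # hypothetical token format
    return " ".join(tokens)


if __name__ == "__main__":
    # Toy 2-D "skeleton" frames and a 3-entry codebook, purely illustrative.
    codebook = [[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]]
    frames = [[0.1, -0.1], [0.9, 1.2], [1.8, 0.1]]
    print(project_to_action_sentence(frames, codebook))
```

In a setup like this, the resulting token string could be passed to a language model as ordinary text, which is the sense in which a skeleton sequence becomes an "action sentence"; how LLM-AR actually learns and uses its projection is described in the paper itself.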

Benchmarks

Benchmark | Methodology | Metrics
skeleton-based-action-recognition-on-ntu-rgbd | Lit-llama | Accuracy (CS): 95; Accuracy (CV): 98.4
skeleton-based-action-recognition-on-ntu-rgbd-1 | Lit-llama | Accuracy (Cross-Setup): 91.5; Accuracy (Cross-Subject): 88.7
