8 months ago

Computer Vision

Convolutional Neural Network

Method/Architecture

Hui Li, Zidong Guo, Seon-Min Rhee, Seungju Han, Jae-Joon Han

Abstract

Accurate facial landmarks are essential prerequisites for many tasks related to human faces. In this paper, we propose an accurate facial landmark detector based on cascaded transformers. We formulate facial landmark detection as a coordinate regression task so that the model can be trained end-to-end. With self-attention in transformers, our model can inherently exploit the structured relationships between landmarks, which benefits landmark detection under challenging conditions such as large pose and occlusion. During cascaded refinement, our model extracts the most relevant image features around the target landmark for coordinate prediction, based on a deformable attention mechanism, yielding more accurate alignment. In addition, we propose a novel decoder that refines image features and landmark positions simultaneously. With only a small increase in parameters, detection performance improves further. Our model achieves new state-of-the-art performance on several standard facial landmark detection benchmarks and shows good generalization ability in cross-dataset evaluation.
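The cascaded coordinate-regression idea in the abstract can be illustrated with a toy sketch. This is not the authors' model. It assumes single-head self-attention with identity projections, random query features in place of sampled image features, and a bounded additive update per stage, and it omits the deformable attention step entirely. All names (`self_attention`, `refine`, `cascade`, `step`) are hypothetical and chosen only for illustration.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(queries):
    """Scaled dot-product self-attention across landmark queries.

    Assumption: identity Q/K/V projections, one head. Each landmark query
    is replaced by a weighted mix of all landmark queries, which is how
    relationships between landmarks can influence each prediction.
    """
    d = len(queries[0])
    keys_t = [list(col) for col in zip(*queries)]
    scores = matmul(queries, keys_t)  # N x N similarity matrix
    weights = [softmax([s / math.sqrt(d) for s in row]) for row in scores]
    return matmul(weights, queries), weights


def refine(coords, queries, w_offset, step=0.1):
    """One cascade stage: attend, regress an (x, y) offset, update coords.

    Assumption: offsets are squashed with tanh and scaled by `step`, and
    coordinates are normalized to [0, 1] and clipped to that range.
    """
    attended, _ = self_attention(queries)
    offsets = matmul(attended, w_offset)  # N x 2
    new_coords = [
        [min(1.0, max(0.0, c + step * math.tanh(o))) for c, o in zip(xy, off)]
        for xy, off in zip(coords, offsets)
    ]
    return new_coords, attended


def cascade(coords, queries, stage_weights):
    """Run every stage in order and return the coordinates after each one."""
    history = [coords]
    for w_offset in stage_weights:
        coords, queries = refine(coords, queries, w_offset)
        history.append(coords)
    return history


# Toy usage: 5 landmarks, 4-dimensional queries, 3 refinement stages.
rng = random.Random(0)
num_landmarks, dim, num_stages = 5, 4, 3
init_coords = [[0.5, 0.5] for _ in range(num_landmarks)]
init_queries = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(num_landmarks)]
stage_weights = [
    [[rng.uniform(-1, 1) for _ in range(2)] for _ in range(dim)]
    for _ in range(num_stages)
]
history = cascade(init_coords, init_queries, stage_weights)
```

In a trained model the offset weights would be learned with a regression loss against ground-truth landmarks, and the queries would be updated from image features sampled around the current coordinate estimates. Here the weights are random, so the sketch shows only the data flow of stage-by-stage refinement.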
