Dina Khattab, Bassel Safwat Chawky, Youssef Mohamed Mostafa, Mina Abd El-Massih Nashed, Mohamed Hussein Kamal, Mohamed Mostafa Soliman
Abstract
Automatic recognition of violence between individuals or crowds in videos is of broad interest. In this work, we propose an end-to-end deep neural network model for recognizing violence in videos. The model uses a VGG-16 network pre-trained on ImageNet as a spatial feature extractor, followed by a Long Short-Term Memory (LSTM) network as a temporal feature extractor and a sequence of fully connected layers for classification. The achieved accuracy is near the state of the art. We also introduce a new benchmark, Real-Life Violence Situations, which contains 2000 short videos: 1000 violence videos and 1000 non-violence videos. The new benchmark is used to fine-tune the proposed models, achieving a best accuracy of 88.2%.
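The pipeline described in the abstract (a per-frame CNN, then an LSTM, then fully connected layers) can be sketched in PyTorch roughly as follows. This is an illustrative sketch, not the authors' implementation: the class name `FrameSequenceClassifier`, the helper `make_vgg16_backbone`, the hidden sizes, and the use of global average pooling over the VGG-16 convolutional output are all assumptions, since the abstract does not specify them.

```python
import torch
import torch.nn as nn


class FrameSequenceClassifier(nn.Module):
    """Per-frame CNN features -> LSTM -> fully connected classifier."""

    def __init__(self, backbone: nn.Module, feat_dim: int,
                 hidden_dim: int = 256, num_classes: int = 2):
        super().__init__()
        # Spatial feature extractor, applied to every frame independently.
        self.backbone = backbone
        # Temporal feature extractor over the sequence of frame features.
        # hidden_dim is an assumed value, not taken from the paper.
        self.lstm = nn.LSTM(feat_dim, hidden_dim, batch_first=True)
        # Classification layers; the layer widths are assumptions.
        self.head = nn.Sequential(
            nn.Linear(hidden_dim, 64),
            nn.ReLU(),
            nn.Linear(64, num_classes),
        )

    def forward(self, clips: torch.Tensor) -> torch.Tensor:
        # clips: (batch, frames, channels, height, width)
        b, t, c, h, w = clips.shape
        feats = self.backbone(clips.reshape(b * t, c, h, w))  # (b*t, feat_dim)
        feats = feats.reshape(b, t, -1)                       # (b, t, feat_dim)
        _, (h_n, _) = self.lstm(feats)                        # h_n: (1, b, hidden_dim)
        return self.head(h_n[-1])                             # logits: (b, num_classes)


def make_vgg16_backbone(pretrained: bool = True) -> nn.Module:
    """VGG-16 convolutional stack plus global average pooling (512-d per frame).

    The pooling choice is an assumption made for this sketch.
    """
    from torchvision.models import vgg16, VGG16_Weights
    weights = VGG16_Weights.IMAGENET1K_V1 if pretrained else None
    vgg = vgg16(weights=weights)
    return nn.Sequential(vgg.features, nn.AdaptiveAvgPool2d(1), nn.Flatten())
```

Under these assumptions, a violence/non-violence model would be built with `FrameSequenceClassifier(make_vgg16_backbone(), feat_dim=512)` and fed clips shaped `(batch, frames, 3, 224, 224)`. Passing the backbone in as an argument keeps the temporal and classification stages independent of the choice of feature extractor.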
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| action-recognition-on-real-life-violence | CNN+LSTM | accuracy: 88.8% |