Video Activity Recognition
Introduction
Problem Definition
Human brain’s perception of an activity
Different types of sensors for activity recognition
- RGB
- RGB-D
- Neuromorphic hardware
- Accelerometer
Traditional Methods
Categorisation of activity recognition from literature
- input data
- Different methods for temporal context
- 2D CNNs
- LSTMs
- 3D CNNs
- Transformers
- Two stream approaches
- Image + motion
- Image + skeleton
- Image + motion + skeleton
- Image + segmentation masks
- Image + bounding boxes
- Types of interaction
- Human to human
- Human to object
- Single human
- Number of activities
- Single activity
- Multiple activities
- Crowd
- Understanding crowd behavior using activity recognition
Available Datasets
Video classification
- Important methods
- LRCN
- C3D
- Conv3D and attention
- Two stream
- TSN
- ActionVLAD
- Hidden two stream
- I3D
- T3D
- LSTM and RNN
- 2D CNN
- 3D CNN
- ResNets
- Inception 3D CNNs
- Fusion
- Late, early, and slow fusion
- Graph-convolution-based methods
- ST-GCN
- Temporal action localization
Datasets used for evaluation
Comparison table
Visualizing what networks learn for video activity recognition
- Explaining learning of models
- Network learning visualization
Challenges in activity recognition
Applications of activity recognition
Conclusion
Active research groups in Video Activity Recognition
References