Video Activity Recognition

Introduction

Problem Definition

Human brain’s perception of an activity

Different types of sensors for activity recognition

  • RGB
  • RGB-D
  • Neuromorphic hardware
  • Accelerometer

Traditional Methods

Categorisation of activity recognition in the literature

  • Input data
    • RGB
    • RGB-D
    • RGB + optical flow
  • Different methods for temporal context
    • 2D CNNs
    • LSTMs
    • 3D CNNs
    • Transformers
  • Two-stream approaches
    • Image + motion
    • Image + skeleton
    • Image + motion + skeleton
    • Image + segmentation masks
    • Image + bounding boxes
  • Types of interaction
    • Human to human
    • Human to object
    • Single human
  • Number of activities
    • Single activity
    • Multiple activities
  • Crowd
    • Understanding crowd behavior using activity recognition

Available Datasets

Video classification

  • Important methods
    • LRCN
    • C3D
    • Conv3D and attention
    • Two-stream
    • TSN
    • ActionVLAD
    • Hidden two-stream
    • I3D
    • T3D
  • LSTM and RNN
    • LRCN
  • 2D CNN
  • 3D CNN
    • ResNets
    • Inception 3D CNNs
    • Fusion
      • Early, late, and slow fusion
      • Graph convolution based methods
      • ST-GCN
    • Temporal action localization
      • Tubelets

Datasets used for evaluation

Comparison table

Visualizing what networks learn for video activity recognition

  • Explaining what models learn
  • Network learning visualization

Challenges in activity recognition

Applications of activity recognition

Conclusion

Active research groups in Video Activity Recognition

References