Interpreting Video Models
Python, PyTorch, TimeSFormer, Grad-CAM
Explored the interpretability of video action recognition models. Visualized the internal attention of TimeSFormer to show which spatio-temporal regions influence its predictions. Applied Grad-CAM to 3D CNN-based models to show which regions of a clip, across both space and time, drive the predictions of convolutional architectures.
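As a rough illustration of the Grad-CAM step for 3D CNNs, here is a minimal PyTorch sketch. It is not the project's actual code: `TinyC3D` is a hypothetical stand-in for a real action recognition backbone, and `grad_cam_3d`, the layer choice, and the clip size are all assumptions made for the example. The idea is the standard Grad-CAM recipe extended by one axis: average the class-score gradients over time, height, and width to get one weight per channel, take the weighted sum of the feature maps, apply ReLU, and upsample to the clip's resolution.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TinyC3D(nn.Module):
    """Hypothetical stand-in for a 3D CNN action classifier."""

    def __init__(self, num_classes: int = 4):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv3d(3, 8, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool3d(2),
            nn.Conv3d(8, 16, kernel_size=3, padding=1),
            nn.ReLU(),
        )
        self.head = nn.Linear(16, num_classes)

    def forward(self, x):
        x = self.features(x)           # (B, 16, T', H', W')
        x = x.mean(dim=(2, 3, 4))      # global average pool over time and space
        return self.head(x)


def grad_cam_3d(model, layer, clip, class_idx=None):
    """Grad-CAM heatmap of shape (T, H, W) in [0, 1] for a clip of shape (1, C, T, H, W)."""
    store = {}
    # Capture the chosen layer's output during the forward pass.
    handle = layer.register_forward_hook(lambda m, i, o: store.update(act=o))
    try:
        model.eval()
        logits = model(clip)
    finally:
        handle.remove()

    if class_idx is None:
        class_idx = int(logits.argmax(dim=1))

    acts = store["act"]                                    # (1, K, T', H', W')
    # Gradient of the class score with respect to the captured feature maps.
    grads = torch.autograd.grad(logits[0, class_idx], acts)[0]
    # One weight per channel: gradients averaged over time, height, and width.
    weights = grads.mean(dim=(2, 3, 4), keepdim=True)
    cam = F.relu((weights * acts).sum(dim=1, keepdim=True))
    # Upsample the coarse map back to the input clip's (T, H, W).
    cam = F.interpolate(cam, size=clip.shape[2:], mode="trilinear",
                        align_corners=False)
    cam = cam[0, 0].detach()
    cam = cam - cam.min()
    return cam / (cam.max() + 1e-8), class_idx


torch.manual_seed(0)
model = TinyC3D()
clip = torch.randn(1, 3, 8, 32, 32)    # random stand-in for a real video clip
heatmap, predicted = grad_cam_3d(model, model.features[-1], clip)
print(tuple(heatmap.shape), predicted)
```

Each slice `heatmap[t]` can be overlaid on frame `t` of the clip to see where, and when, the model found evidence for the predicted class. With a pretrained model, the last convolutional block is a common choice for `layer`, though earlier layers give finer but noisier maps.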