MTFL: Multi-Timescale Feature Learning for
Weakly-Supervised Anomaly Detection in Surveillance Videos

Yiling Zhang, Erkut Akdag, Egor Bondarev, Peter H. N. De With

AIMS Group, Department of Electrical Engineering, Eindhoven University of Technology

Overview

Anomaly detection for public safety requires modeling fine-grained motion and contextual information across multiple time scales. We propose Multi-Timescale Feature Learning (MTFL), which leverages short-, medium-, and long-term temporal tubelets within a Video Swin Transformer to enhance spatio-temporal representations. MTFL achieves 89.78% AUC on UCF-Crime, outperforming existing methods, and demonstrates complementary performance with 95.32% AUC on ShanghaiTech and 84.57% AP on XD-Violence. In addition, we introduce the Video Anomaly Detection Dataset (VADD), an extended version of UCF-Crime containing 2,591 videos across 18 anomaly classes.
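For readers unfamiliar with the headline metric, AUC is the area under the ROC curve. It equals the probability that a randomly chosen anomalous item (for example, a frame) receives a higher score than a randomly chosen normal one, with ties counted as half. The sketch below illustrates that definition on made-up scores and labels; the function name `roc_auc` and the toy data are illustrative assumptions, not the evaluation code behind the reported numbers.

```python
def roc_auc(scores, labels):
    """Pairwise AUC: fraction of (anomalous, normal) pairs ranked correctly."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # anomalous item ranked above normal item
            elif p == n:
                wins += 0.5      # ties count as half
    return wins / (len(pos) * len(neg))

# Toy data (hypothetical): 1 marks an anomalous item, 0 a normal one.
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_labels = [0, 0, 1, 1]
auc = roc_auc(toy_scores, toy_labels)  # 3 of 4 pairs are ordered correctly
```

This quadratic pairwise form is only meant to make the definition concrete; evaluations on full datasets normally use a sort-based implementation.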

Teaser figure

Workflow of the Multi-Timescale Feature Learning (MTFL) model. The input video is segmented into 𝑇 snippets. The Multi-Timescale Feature Generator (MTFG) produces three sets of 𝑇 features, each of dimension 𝐷: 𝐅_L, 𝐅_M, and 𝐅_S, extracted from long, medium, and short temporal tubelets, respectively. The Multi-Timescale Feature Fusion (MTFF) module then captures the correlations among the three feature sets and the dependencies among video snippets, fusing them into the output feature matrix 𝐗. A classifier maps 𝐗 to the final anomaly scores for the 𝑇 snippets. The MTFF and the classifier are trained with a loss that combines a feature magnitude loss and a classification loss.
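The data flow in this caption can be sketched at the level of array shapes. The following is a minimal illustration, not the authors' implementation: the function names (`make_features`, `fuse_features`, `score_snippets`), the random stand-in features, the averaging-based fusion, and the fixed linear classifier are all placeholders chosen for brevity. The actual MTFF is a learned module that models cross-feature correlations and inter-snippet dependencies, which plain averaging does not.

```python
import math
import random

def make_features(T, D, seed):
    """Stand-in for one MTFG output: T snippet features of D dimensions."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(D)] for _ in range(T)]

def fuse_features(F_L, F_M, F_S):
    """Placeholder fusion: element-wise mean of the three T x D matrices.

    This only preserves the shapes; it is NOT the learned MTFF module.
    """
    return [
        [(l + m + s) / 3.0 for l, m, s in zip(row_l, row_m, row_s)]
        for row_l, row_m, row_s in zip(F_L, F_M, F_S)
    ]

def score_snippets(X, weights, bias=0.0):
    """Placeholder classifier: one linear layer followed by a sigmoid."""
    scores = []
    for row in X:
        z = sum(w * x for w, x in zip(weights, row)) + bias
        scores.append(1.0 / (1.0 + math.exp(-z)))
    return scores

T, D = 32, 8  # hypothetical snippet count and feature dimension
F_L, F_M, F_S = (make_features(T, D, seed) for seed in (0, 1, 2))
X = fuse_features(F_L, F_M, F_S)        # fused feature matrix, T x D
scores = score_snippets(X, [0.1] * D)   # one anomaly score per snippet
```

The point of the sketch is the shape contract: three T x D inputs go in, one T x D matrix 𝐗 comes out of the fusion step, and the classifier reduces it to 𝑇 scores in (0, 1).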

Results

Qualitative anomaly score visualizations of the proposed MTFL model on representative samples from UCF-Crime and VADD. The red regions indicate the manually annotated temporal segments corresponding to anomalous events.
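A visualization of this kind places per-snippet scores on the same frame axis as the annotated segments. One common way to do that is to repeat each snippet score across the frames it covers. The sketch below assumes a fixed snippet length of 16 frames and annotations given as inclusive `(start, end)` frame indices; both conventions, the helper names, and the toy values are assumptions for illustration and are not taken from the paper.

```python
def expand_to_frames(snippet_scores, frames_per_snippet=16):
    """Repeat each snippet score for every frame in that snippet."""
    return [s for s in snippet_scores for _ in range(frames_per_snippet)]

def frame_labels(num_frames, segments):
    """Build 0/1 frame labels from inclusive (start, end) anomalous segments."""
    labels = [0] * num_frames
    for start, end in segments:
        # Clamp to the valid frame range before marking the segment.
        for f in range(max(start, 0), min(end, num_frames - 1) + 1):
            labels[f] = 1
    return labels

# Toy example (hypothetical): 4 snippets, anomaly annotated on frames 32..63.
snippet_scores = [0.05, 0.10, 0.90, 0.85]
frame_scores = expand_to_frames(snippet_scores)
labels = frame_labels(len(frame_scores), [(32, 63)])
```

Plotting `frame_scores` as a curve and shading the frames where `labels` is 1 would give a figure with the same layout as the one described in the caption.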

Citation

@article{mtfl2024,
  title   = {MTFL: Multi-Timescale Feature Learning for Weakly-Supervised Anomaly Detection in Surveillance Videos},
  author  = {Zhang, Yiling and Akdag, Erkut and Bondarev, Egor and De With, Peter H. N.},
  journal = {arXiv preprint arXiv:2410.05900},
  year    = {2024}
}