This is the official implementation of our CVPR 2022 paper, "End-to-End Semi-Supervised Learning for Video Action Detection".
The following commands train the model with variance maps and gradient maps, respectively:
python main.py --epochs 100 --bs 8 --loc_loss dice --lr 1e-4 \
--pkl_file_label train_annots_20_labeled.pkl \
--pkl_file_unlabel train_annots_80_unlabeled.pkl \
--wt_loc 1 --wt_cls 1 --wt_cons 0.1 \
--const_loss l2 \
--bv --n_frames 5 --thresh_epoch 11 \
--exp_id cyclic_variance_maps
python main.py --epochs 100 --bs 8 --loc_loss dice --lr 1e-4 \
--pkl_file_label train_annots_20_labeled.pkl \
--pkl_file_unlabel train_annots_80_unlabeled.pkl \
--wt_loc 1 --wt_cls 1 --wt_cons 0.1 \
--const_loss l2 \
--gv \
--exp_id gradient_maps
Parameter descriptions:
- bv - Temporal Variance Attentive Mask
- gv - Gradient Smoothness Attentive Mask
- wt_loc - Weight for localization loss
- wt_cls - Weight for classification loss
- wt_cons - Weight for consistency loss
- exp_id - Experiment id to set the folder name for saving checkpoints
- pkl_file_label - Labeled subset
- pkl_file_unlabel - Unlabeled subset
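As a rough illustration of how the three weights might interact, the sketch below combines a Dice localization loss, a classification loss, and an L2 consistency loss into one objective. The weighted-sum rule and the function names `dice_loss`, `l2_consistency`, and `total_loss` are assumptions made for this example and are not taken from main.py; the actual training code presumably operates on PyTorch tensors rather than plain Python lists.

```python
def dice_loss(pred, target, eps=1e-6):
    """Soft Dice loss between two equal-length sequences of values in [0, 1]."""
    intersection = sum(p * t for p, t in zip(pred, target))
    denom = sum(pred) + sum(target)
    return 1.0 - (2.0 * intersection + eps) / (denom + eps)


def l2_consistency(a, b):
    """Mean squared difference between two predictions for the same clip."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def total_loss(loc, cls, cons, wt_loc=1.0, wt_cls=1.0, wt_cons=0.1):
    """Weighted sum of the three loss terms (assumed combination rule)."""
    return wt_loc * loc + wt_cls * cls + wt_cons * cons
```

With the weights used in the commands above (1, 1, and 0.1), the consistency term would contribute one tenth as much per unit of loss as the localization and classification terms.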
To evaluate a trained model, pass the experiment folder that holds its checkpoints:
python evaluate.py --ckpt exp_id_folder
Link to download I3D pre-trained weights:
https://github.com/piergiaj/pytorch-i3d/tree/master/models
We have used rgb_charades.pt for our experiments.
UCF101-24 splits: Pickle files
JHMDB-21 splits: Text files
Set the data path for the UCF101 videos in ucf_dataloader.py, inside the datasets folder.
If you find this work useful, please consider citing the following paper:
@InProceedings{Kumar_2022_CVPR,
author = {Kumar, Akash and Rawat, Yogesh Singh},
title = {End-to-End Semi-Supervised Learning for Video Action Detection},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2022},
pages = {14700-14710}
}