Source code for "Bi-modal Transformer for Dense Video Captioning" (BMVC 2020)
End-to-End Dense Video Captioning with Parallel Decoding (ICCV 2021)
[ACL 2020] PyTorch code for MART: Memory-Augmented Recurrent Transformer for Coherent Video Paragraph Captioning
PyTorch implementation of Multi-modal Dense Video Captioning (CVPR 2020 Workshops)
A summary of Video-to-Text datasets. This repository is part of the review paper *Bridging Vision and Language from the Video-to-Text Perspective: A Comprehensive Review*
A curated list of "Temporally Language Grounding" and related areas
A PyTorch implementation of some state-of-the-art models for "Temporally Language Grounding in Untrimmed Videos"
Second-place solution to the dense video captioning task in the ActivityNet Challenge (CVPR 2020 workshop)
Dense video captioning in PyTorch
Python implementation for extracting several visual feature representations from videos