Visual Studio JavaScript Dynamic Pages Tutorial

Towards Generalisable Video Moment Retrieval: Visual-Dynamic Injection to Image-Text Pre-Training

Abstract: The correlation between vision and text is essential for video moment retrieval (VMR); however, existing methods rely heavily on separate pre-training feature extractors for visual and ...
