Abstract: Pre-trained video generation models hold great potential for generative video super-resolution (VSR). However, adapting them for full-size VSR, as most existing methods do, suffers from ...
Abstract: We present an architecture and a training recipe that adapts pretrained open-world image models to localization in videos. Understanding the open visual world (without being constrained by ...