Synchformer: Efficient Synchronization from Sparse Cues
The article introduces Synchformer, a new audio-visual synchronization model aimed at ‘in-the-wild’ videos, such as those found on YouTube, where synchronization cues can be sparse. The authors propose a…