Deniz Jafari will be presenting “VATT: Transformers for Multimodal Self-Supervised Learning from Raw Video, Audio and Text” by Akbari et al. You can find the paper here: https://arxiv.org/pdf/2104.11178.pdf.
Join Zoom Meeting
https://utoronto.zoom.us/j/82817308412
Meeting ID: 828 1730 8412
One tap mobile
+16465189805,,82817308412# US (New York)
+16465588656,,82817308412# US (New York)
Dial by your location
Find your local number: https://utoronto.zoom.us/u/kYqCIBAgM