Efficient Video-Text Learning with Iterative Co-tokenization · #Questions and Answers #Video #Machine Learning #Google #Blog · ai.googleblog.com · Aug 10, 2022
End-to-end Generative Pre-training for Multimodal Video Captioning · #Multimodal #Video #Transformers #Google #Blog · ai.googleblog.com · Jun 7, 2022
VDTTS: Visually-Driven Text-To-Speech · #Speech #Video #Google #Blog · ai.googleblog.com · Apr 8, 2022