Andrew Rouditchenko
PhD Student at MIT CSAIL
Verified email at mit.edu
Title
Cited by
Year
The sound of pixels
H Zhao, C Gan, A Rouditchenko, C Vondrick, J McDermott, A Torralba
Proceedings of the European conference on computer vision (ECCV), 570-586, 2018
484 2018
AVLnet: Learning audio-visual language representations from instructional videos
A Rouditchenko, A Boggust, D Harwath, B Chen, D Joshi, S Thomas, ...
Proc. Interspeech 2021, 1584-1588, 2021
116 2021
Self-supervised audio-visual co-segmentation
A Rouditchenko, H Zhao, C Gan, J McDermott, A Torralba
ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and …, 2019
110 2019
Everything at once-multi-modal fusion transformer for video retrieval
N Shvetsova, B Chen, A Rouditchenko, S Thomas, B Kingsbury, RS Feris, ...
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2022
73 2022
Multimodal clustering networks for self-supervised learning from unlabeled videos
B Chen, A Rouditchenko, K Duarte, H Kuehne, S Thomas, A Boggust, ...
Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2021
50 2021
Cross-modal discrete representation learning
AH Liu, SY Jin, CIJ Lai, A Rouditchenko, A Oliva, J Glass
arXiv preprint arXiv:2106.05438, 2021
26 2021
Contrastive audio-visual masked autoencoder
Y Gong, A Rouditchenko, AH Liu, D Harwath, L Karlinsky, H Kuehne, ...
The Eleventh International Conference on Learning Representations, 2022
21 2022
CMKD: CNN/Transformer-based cross-model knowledge distillation for audio classification
Y Gong, S Khurana, A Rouditchenko, J Glass
arXiv preprint arXiv:2203.06760, 2022
21 2022
Cascaded Multilingual Audio-Visual Learning from Videos
A Rouditchenko, A Boggust, D Harwath, S Thomas, H Kuehne, B Chen, ...
Proc. Interspeech 2021, 3006-3010, 2021
7 2021
Label-efficient audio classification through multitask learning and self-supervision
T Lee, T Gong, S Padhy, A Rouditchenko, A Ndirango
arXiv preprint arXiv:1910.12587, 2019
6 2019
UAVM: Towards unifying audio and visual models
Y Gong, AH Liu, A Rouditchenko, J Glass
IEEE Signal Processing Letters 29, 2437-2441, 2022
5 2022
C2KD: Cross-Lingual Cross-Modal Knowledge Distillation for Multilingual Text-Video Retrieval
A Rouditchenko, YS Chuang, N Shvetsova, S Thomas, R Feris, ...
ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and …, 2023
2 2023
Routing with self-attention for multimodal capsule networks
K Duarte, B Chen, N Shvetsova, A Rouditchenko, S Thomas, A Liu, ...
arXiv preprint arXiv:2112.00775, 2021
2 2021
Spoken ObjectNet: A Bias-Controlled Spoken Caption Dataset
I Palmer, A Rouditchenko, A Barbu, B Katz, J Glass
Proc. Interspeech 2021, 3650-3654, 2021
2 2021
Comparison of Multilingual Self-Supervised and Weakly-Supervised Speech Pre-Training for Adaptation to Unseen Languages
A Rouditchenko, S Khurana, S Thomas, R Feris, L Karlinsky, H Kuehne, ...
arXiv preprint arXiv:2305.12606, 2023
1 2023
What, when, and where?--Self-Supervised Spatio-Temporal Grounding in Untrimmed Multi-Action Videos from Narrated Instructions
B Chen, N Shvetsova, A Rouditchenko, D Kondermann, S Thomas, ...
arXiv preprint arXiv:2303.16990, 2023
2023
Learning Audio-Video Language Representations
A Rouditchenko
Massachusetts Institute of Technology, 2021
2021
Cascaded Multilingual Audio-Visual Learning from Videos: Extended Abstract
A Rouditchenko, A Boggust, D Harwath, S Thomas, H Kuehne, B Chen, ...
Everything at Once–Multi-modal Fusion Transformer for Video Retrieval Supplementary Material
N Shvetsova, B Chen, A Rouditchenko, S Thomas, B Kingsbury, R Feris, ...