Arsha Nagrani
Research Scientist, Google
Verified email at google.com
Title
Cited by
Year
VoxCeleb: a large-scale speaker identification dataset
A Nagrani, JS Chung, A Zisserman
arXiv preprint arXiv:1706.08612, 2017
Cited by: 2376, Year: 2017
VoxCeleb2: Deep speaker recognition
JS Chung, A Nagrani, A Zisserman
arXiv preprint arXiv:1806.05622, 2018
Cited by: 2216, Year: 2018
Frozen in time: A joint video and image encoder for end-to-end retrieval
M Bain, A Nagrani, G Varol, A Zisserman
Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2021
Cited by: 704, Year: 2021
VoxCeleb: Large-scale speaker verification in the wild
A Nagrani, JS Chung, W Xie, A Zisserman
Computer Speech & Language 60, 101027, 2020
Cited by: 620, Year: 2020
Attention bottlenecks for multimodal fusion
A Nagrani, S Yang, A Arnab, A Jansen, C Schmid, C Sun
Advances in neural information processing systems 34, 14200-14213, 2021
Cited by: 439, Year: 2021
Use what you have: Video retrieval using representations from collaborative experts
Y Liu, S Albanie, A Nagrani, A Zisserman
arXiv preprint arXiv:1907.13487, 2019
Cited by: 400, Year: 2019
Utterance-level aggregation for speaker recognition in the wild
W Xie, A Nagrani, JS Chung, A Zisserman
ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and …, 2019
Cited by: 386, Year: 2019
EPIC-Fusion: Audio-visual temporal binding for egocentric action recognition
E Kazakos, A Nagrani, A Zisserman, D Damen
Proceedings of the IEEE/CVF international conference on computer vision …, 2019
Cited by: 350, Year: 2019
Emotion recognition in speech using cross-modal transfer in the wild
S Albanie, A Nagrani, A Vedaldi, A Zisserman
Proceedings of the 26th ACM international conference on Multimedia, 292-301, 2018
Cited by: 297, Year: 2018
Seeing voices and hearing faces: Cross-modal biometric matching
A Nagrani, S Albanie, A Zisserman
Proceedings of the IEEE conference on computer vision and pattern …, 2018
Cited by: 227, Year: 2018
Chimpanzee face recognition from videos in the wild using deep learning
D Schofield, A Nagrani, A Zisserman, M Hayashi, T Matsuzawa, D Biro, ...
Science advances 5 (9), eaaw0736, 2019
Cited by: 184, Year: 2019
Localizing visual sounds the hard way
H Chen, W Xie, T Afouras, A Nagrani, A Vedaldi, A Zisserman
Proceedings of the IEEE/CVF conference on computer vision and pattern …, 2021
Cited by: 150, Year: 2021
Learnable pins: Cross-modal embeddings for person identity
A Nagrani, S Albanie, A Zisserman
Proceedings of the European Conference on Computer Vision (ECCV), 71-88, 2018
Cited by: 136, Year: 2018
Spot the conversation: speaker diarisation in the wild
JS Chung, J Huh, A Nagrani, T Afouras, A Zisserman
arXiv preprint arXiv:2007.01216, 2020
Cited by: 133, Year: 2020
End-to-end generative pretraining for multimodal video captioning
PH Seo, A Nagrani, A Arnab, C Schmid
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2022
Cited by: 131, Year: 2022
Cough against COVID: Evidence of COVID-19 signature in cough sounds
P Bagad, A Dalmia, J Doshi, A Nagrani, P Bhamare, A Mahale, S Rane, ...
arXiv preprint arXiv:2009.08790, 2020
Cited by: 128, Year: 2020
Disentangled speech embeddings using cross-modal self-supervision
A Nagrani, JS Chung, S Albanie, A Zisserman
ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and …, 2020
Cited by: 99, Year: 2020
Vid2Seq: Large-scale pretraining of a visual language model for dense video captioning
A Yang, A Nagrani, PH Seo, A Miech, J Pont-Tuset, I Laptev, J Sivic, ...
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2023
Cited by: 90, Year: 2023
PaLI-X: On scaling up a multilingual vision and language model
X Chen, J Djolonga, P Padlewski, B Mustafa, S Changpinyo, J Wu, ...
arXiv preprint arXiv:2305.18565, 2023
Cited by: 80, Year: 2023
VoxSRC 2020: The second VoxCeleb speaker recognition challenge
A Nagrani, JS Chung, J Huh, A Brown, E Coto, W Xie, M McLaren, ...
arXiv preprint arXiv:2012.06867, 2020
Cited by: 79, Year: 2020