Dara Bahri
Research Scientist, Google DeepMind
Verified email at google.com
Title · Cited by · Year
Efficient Transformers: A Survey
Y Tay, M Dehghani, D Bahri, D Metzler
ACM Computing Surveys 55, 2022
Cited by 1343* · 2022
Long range arena: A benchmark for efficient transformers
Y Tay, M Dehghani, S Abnar, Y Shen, D Bahri, P Pham, J Rao, L Yang, ...
arXiv preprint arXiv:2011.04006, 2020
Cited by 657 · 2020
Berkeley advanced reconstruction toolbox
M Uecker, F Ong, JI Tamir, D Bahri, P Virtue, JY Cheng, T Zhang, M Lustig
Proc. Intl. Soc. Mag. Reson. Med 23 (2486), 9, 2015
Cited by 462 · 2015
Synthesizer: Rethinking self-attention for transformer models
Y Tay, D Bahri, D Metzler, DC Juan, Z Zhao, C Zheng
International Conference on Machine Learning, 10183-10192, 2021
Cited by 376 · 2021
Sparse sinkhorn attention
Y Tay, D Bahri, L Yang, D Metzler, DC Juan
International Conference on Machine Learning, 9438-9447, 2020
Cited by 334 · 2020
UL2: Unifying language learning paradigms
Y Tay, M Dehghani, VQ Tran, X Garcia, J Wei, X Wang, HW Chung, ...
arXiv preprint arXiv:2205.05131, 2022
Cited by 278 · 2022
Transformer memory as a differentiable search index
Y Tay, V Tran, M Dehghani, J Ni, D Bahri, H Mehta, Z Qin, K Hui, Z Zhao, ...
Advances in Neural Information Processing Systems 35, 21831-21843, 2022
Cited by 222 · 2022
ExT5: Towards extreme multi-task scaling for transfer learning
V Aribandi, Y Tay, T Schuster, J Rao, HS Zheng, SV Mehta, H Zhuang, ...
arXiv preprint arXiv:2111.10952, 2021
Cited by 206 · 2021
Scarf: Self-supervised contrastive learning using random feature corruption
D Bahri, H Jiang, Y Tay, D Metzler
arXiv preprint arXiv:2106.15147, 2021
Cited by 173 · 2021
Confident adaptive language modeling
T Schuster, A Fisch, J Gupta, M Dehghani, D Bahri, V Tran, Y Tay, ...
Advances in Neural Information Processing Systems 35, 17456-17472, 2022
Cited by 170 · 2022
Rethinking search: making domain experts out of dilettantes
D Metzler, Y Tay, D Bahri, M Najork
ACM SIGIR Forum 55 (1), 1-27, 2021
Cited by 144 · 2021
Charformer: Fast character transformers via gradient-based subword tokenization
Y Tay, VQ Tran, S Ruder, J Gupta, HW Chung, D Bahri, Z Qin, ...
arXiv preprint arXiv:2106.12672, 2021
Cited by 140 · 2021
Unifying language learning paradigms
Y Tay, M Dehghani, VQ Tran, X Garcia, D Bahri, T Schuster, HS Zheng, ...
arXiv preprint arXiv:2205.05131 10, 2022
Cited by 139 · 2022
Sharpness-aware minimization improves language model generalization
D Bahri, H Mobahi, Y Tay
arXiv preprint arXiv:2110.08529, 2021
Cited by 90 · 2021
Deep k-NN for noisy labels
D Bahri, H Jiang, M Gupta
International Conference on Machine Learning, 540-550, 2020
Cited by 89 · 2020
Are pre-trained convolutions better than pre-trained transformers?
Y Tay, M Dehghani, J Gupta, D Bahri, V Aribandi, Z Qin, D Metzler
arXiv preprint arXiv:2105.03322, 2021
Cited by 86 · 2021
StructFormer: Joint unsupervised induction of dependency and constituency structure from masked language modeling
Y Shen, Y Tay, C Zheng, D Bahri, D Metzler, A Courville
arXiv preprint arXiv:2012.00857, 2020
Cited by 46 · 2020
Hypergrid transformers: Towards a single model for multiple tasks
Y Tay, Z Zhao, D Bahri, D Metzler, DC Juan
International Conference on Learning Representations, 2020
Cited by 44 · 2020
Omninet: Omnidirectional representations from transformers
Y Tay, M Dehghani, V Aribandi, J Gupta, PM Pham, Z Qin, D Bahri, ...
International Conference on Machine Learning, 10193-10202, 2021
Cited by 36 · 2021
Reverse engineering configurations of neural text generation models
Y Tay, D Bahri, C Zheng, C Brunk, D Metzler, A Tomkins
arXiv preprint arXiv:2004.06201, 2020
Cited by 30 · 2020
Articles 1–20