Dirk Groeneveld
Allen Institute for Artificial Intelligence
Verified email at allenai.org
Title
Cited by
Year
Construction of the literature graph in semantic scholar
W Ammar, D Groeneveld, C Bhagavatula, I Beltagy, M Crawford, ...
arXiv preprint arXiv:1805.02262, 2018
Cited by: 490; Year: 2018
Documenting large webtext corpora: A case study on the colossal clean crawled corpus
J Dodge, M Sap, A Marasović, W Agnew, G Ilharco, D Groeneveld, ...
arXiv preprint arXiv:2104.08758, 2021
Cited by: 381; Year: 2021
Generating search result summaries
D Groeneveld, D Meyerzon, D Mowatt
US Patent 8,285,699, 2012
Cited by: 119; Year: 2012
From ‘F’ to ‘A’ on the NY regents science exams: An overview of the aristo project
P Clark, O Etzioni, T Khot, D Khashabi, B Mishra, K Richardson, ...
AI Magazine 41 (4), 39-53, 2020
Cited by: 114; Year: 2020
Name search using a ranking function
DH Groeneveld, D Meyerzon, D Mowatt, JA Alspaugh
US Patent 8,645,417, 2014
Cited by: 60; Year: 2014
Generating search result summaries
D Groeneveld, D Meyerzon, D Mowatt
US Patent 7,853,587, 2010
Cited by: 51; Year: 2010
Dolma: An open corpus of three trillion tokens for language model pretraining research
L Soldaini, R Kinney, A Bhagia, D Schwenk, D Atkinson, R Authur, ...
arXiv preprint arXiv:2402.00159, 2024
Cited by: 50; Year: 2024
A simple yet strong pipeline for hotpotqa
D Groeneveld, T Khot, A Sabharwal
arXiv preprint arXiv:2004.06753, 2020
Cited by: 44; Year: 2020
Olmo: Accelerating the science of language models
D Groeneveld, I Beltagy, P Walsh, A Bhagia, R Kinney, O Tafjord, AH Jha, ...
arXiv preprint arXiv:2402.00838, 2024
Cited by: 43; Year: 2024
What's In My Big Data?
Y Elazar, A Bhagia, I Magnusson, A Ravichander, D Schwenk, A Suhr, ...
arXiv preprint arXiv:2310.20707, 2023
Cited by: 43; Year: 2023
IKE-an interactive tool for knowledge extraction
B Dalvi, S Bhakthavatsalam, C Clark, P Clark, O Etzioni, A Fader, ...
Proceedings of the 5th workshop on automated knowledge base construction, 12-17, 2016
Cited by: 33; Year: 2016
Olmo: Accelerating the science of language models
D Groeneveld, I Beltagy, P Walsh, A Bhagia, R Kinney, O Tafjord, AH Jha, H Ivison, I Magnusson, Y Wang, ...
arXiv preprint arXiv:2402.00838, 2024
Cited by: 30; Year: 2024
Ananya Harsh Jha
D Groeneveld, I Beltagy, P Walsh, A Bhagia, R Kinney, O Tafjord
Hamish Ivison, Ian Magnusson, Yizhong Wang, Shane Arora, David Atkinson …, 2024
Cited by: 22; Year: 2024
Generating search result summaries
D Groeneveld, D Meyerzon, D Mowatt
US Patent 8,032,519, 2011
Cited by: 16; Year: 2011
Name search using a ranking function
DH Groeneveld, D Meyerzon, D Mowatt, JA Alspaugh
US Patent 9,727,639, 2017
Cited by: 12; Year: 2017
Dolma: An open corpus of 3 trillion tokens for language model pretraining research
L Soldaini, R Kinney, A Bhagia, D Schwenk, D Atkinson, R Authur, ...
Allen Institute for AI, Tech. Rep, 5998-6008, 2023
Cited by: 9; Year: 2023
DataComp-LM: In search of the next generation of training sets for language models
J Li, A Fang, G Smyrnis, M Ivgi, M Jordan, S Gadre, H Bansal, E Guha, ...
arXiv preprint arXiv:2406.11794, 2024
Cited by: 6; Year: 2024
Large language model distillation doesn’t need a teacher
AH Jha, D Groeneveld, E Strubell, I Beltagy
arXiv preprint arXiv:2305.14864, 2023
Cited by: 4; Year: 2023
Construction of the literature graph in semantic scholar. NAACL
W Ammar, D Groeneveld, C Bhagavatula, I Beltagy, M Crawford, ...
URL: https://www.semanticscholar.org/paper …, 2018
Cited by: 4; Year: 2018
Paloma: A benchmark for evaluating language model fit
I Magnusson, A Bhagia, V Hofmann, L Soldaini, AH Jha, O Tafjord, ...
arXiv preprint arXiv:2312.10523, 2023
Cited by: 3; Year: 2023