DataPerf: Benchmarks for data-centric AI development M Mazumder, C Banbury, X Yao, B Karlaš, W Gaviria Rojas, S Diamos, ... Advances in Neural Information Processing Systems 36, 2024 | 81 | 2024 |
Findings of the BabyLM Challenge: Sample-efficient pretraining on developmentally plausible corpora A Warstadt, A Mueller, L Choshen, E Wilcox, C Zhuang, J Ciro, ... Proceedings of the BabyLM Challenge at the 27th Conference on Computational …, 2023 | 36 | 2023 |
Adversarial Nibbler: A data-centric challenge for improving the safety of text-to-image models A Parrish, HR Kirk, J Quaye, C Rastogi, M Bartolo, O Inel, J Ciro, ... arXiv preprint arXiv:2305.14384, 2023 | 3 | 2023 |
The PRISM Alignment Project: What Participatory, Representative and Individualised Human Feedback Reveals About the Subjective and Multicultural Alignment of Large Language Models HR Kirk, A Whitefield, P Röttger, A Bean, K Margatina, J Ciro, R Mosquera, ... arXiv preprint arXiv:2404.16019, 2024 | | 2024 |
Speech Wikimedia: A 77 Language Multilingual Speech Dataset RM Gómez, J Eusse, J Ciro, D Galvez, R Hileman, K Bollacker, D Kanter arXiv preprint arXiv:2308.15710, 2023 | | 2023 |