Tom B Brown

Cited by

	All	Since 2019
Citations	43539	42899
h-index	35	34
i10-index	43	36

Citations per year

Year	Citations
2018	296
2019	517
2020	1390
2021	4408
2022	7744
2023	19947
2024	8719

Public access (based on funding mandates)

Available	3 articles
Not available	0 articles

Co-authors

Jared Kaplan	Johns Hopkins University & Anthropic	Verified email at pha.jhu.edu
Dario Amodei	CEO and Co-Founder at Anthropic	Verified email at anthropic.com
Benjamin Mann	Member of Technical Staff, Anthropic	Verified email at anthropic.com
Sam McCandlish	Anthropic	Verified email at anthropic.com
Amanda Askell	Anthropic	Verified email at askell.io
Jeffrey Wu	OpenAI	Verified email at openai.com
Alec Radford	OpenAI	Verified email at openai.com
Paul Christiano	National Institute of Standards and Technology	Verified email at nist.gov
Catherine Olsson	Anthropic	Verified email at mit.edu
Ian Goodfellow	DeepMind	Verified email at deepmind.com
Jan Leike	OpenAI	Verified email at openai.com
Aurko Roy	Google DeepMind	Verified email at google.com
Justin Gilmer	Google	Verified email at google.com
Tom Henighan	Anthropic

Tom B Brown

Anthropic

Verified email at anthropic.com

Artificial Intelligence, Language Modeling, Reinforcement Learning


Title	Authors	Publication	Cited by	Year
Language models are few-shot learners	T Brown, B Mann, N Ryder, M Subbiah, JD Kaplan, P Dhariwal, ...	Advances in neural information processing systems 33, 1877-1901, 2020	30392*	2020
Deep reinforcement learning from human preferences	PF Christiano, J Leike, T Brown, M Martic, S Legg, D Amodei	Advances in neural information processing systems 30, 2017	2102	2017
Extracting training data from large language models	N Carlini, F Tramer, E Wallace, M Jagielski, A Herbert-Voss, K Lee, ...	30th USENIX Security Symposium (USENIX Security 21), 2633-2650, 2021	1209	2021
Scaling laws for neural language models	J Kaplan, S McCandlish, T Henighan, TB Brown, B Chess, R Child, ...	arXiv preprint arXiv:2001.08361, 2020	1105	2020
Adversarial patch	TB Brown, D Mané, A Roy, M Abadi, J Gilmer	arXiv preprint arXiv:1712.09665, 2017	926	2017
Fine-tuning language models from human preferences	DM Ziegler, N Stiennon, J Wu, TB Brown, A Radford, D Amodei, ...	arXiv preprint arXiv:1909.08593, 2019	859	2019
Scaling laws for neural language models	J Kaplan, S McCandlish, T Henighan, TB Brown, B Chess, R Child, S Gray, A Radford, J Wu, D Amodei	arXiv preprint arXiv:2001.08361 2, 1557-1566, 2020	749	2020
Training a helpful and harmless assistant with reinforcement learning from human feedback	Y Bai, A Jones, K Ndousse, A Askell, A Chen, N DasSarma, D Drain, ...	arXiv preprint arXiv:2204.05862, 2022	681	2022
Scaling laws for neural language models	J Kaplan, S McCandlish, T Henighan, TB Brown, B Chess, R Child, S Gray, A Radford, J Wu, D Amodei	arXiv preprint arXiv:2001.08361 1 (2), 4, 2020	628	2020
Constitutional AI: Harmlessness from AI feedback	Y Bai, S Kadavath, S Kundu, A Askell, J Kernion, A Jones, A Chen, ...	arXiv preprint arXiv:2212.08073, 2022	583	2022
Technical report on the cleverhans v2.1.0 adversarial examples library	N Papernot, F Faghri, N Carlini, I Goodfellow, R Feinman, A Kurakin, ...	arXiv preprint arXiv:1610.00768, 2016	531*	2016
Adversarial patch	TB Brown, D Mané, A Roy, M Abadi, J Gilmer	arXiv preprint arXiv:1712.09665 2 (3), 4, 2017	486	2017
cleverhans v2.0.0: an adversarial machine learning library	N Papernot, I Goodfellow, R Sheatsley, R Feinman, P McDaniel	arXiv preprint arXiv:1610.00768 10, 2016	319	2016
Scaling laws for autoregressive generative modeling	T Henighan, J Kaplan, M Katz, M Chen, C Hesse, J Jackson, H Jun, ...	arXiv preprint arXiv:2010.14701, 2020	235	2020
Language models (mostly) know what they know	S Kadavath, T Conerly, A Askell, T Henighan, D Drain, E Perez, ...	arXiv preprint arXiv:2207.05221, 2022	223	2022
Red teaming language models to reduce harms: Methods, scaling behaviors, and lessons learned	D Ganguli, L Lovitt, J Kernion, A Askell, Y Bai, S Kadavath, B Mann, ...	arXiv preprint arXiv:2209.07858, 2022	214	2022
A general language assistant as a laboratory for alignment	A Askell, Y Bai, A Chen, D Drain, D Ganguli, T Henighan, A Jones, ...	arXiv preprint arXiv:2112.00861, 2021	214	2021
In-context learning and induction heads	C Olsson, N Elhage, N Nanda, N Joseph, N DasSarma, T Henighan, ...	arXiv preprint arXiv:2209.11895, 2022	189	2022
Predictability and surprise in large generative models	D Ganguli, D Hernandez, L Lovitt, A Askell, Y Bai, A Chen, T Conerly, ...	Proceedings of the 2022 ACM Conference on Fairness, Accountability, and …, 2022	172	2022
A mathematical framework for transformer circuits	N Elhage, N Nanda, C Olsson, T Henighan, N Joseph, B Mann, A Askell, ...	Transformer Circuits Thread 1, 1, 2021	149	2021
