Proximal policy optimization algorithms J Schulman, F Wolski, P Dhariwal, A Radford, O Klimov arXiv preprint arXiv:1707.06347, 2017 | 17234 | 2017 |
Improved techniques for training GANs T Salimans, I Goodfellow, W Zaremba, V Cheung, A Radford, X Chen Advances in neural information processing systems 29, 2016 | 9994 | 2016 |
Language models are unsupervised multitask learners A Radford, J Wu, R Child, D Luan, D Amodei, I Sutskever OpenAI blog 1 (8), 9, 2019 | 9710 | 2019 |
Improving language understanding by generative pre-training A Radford, K Narasimhan, T Salimans, I Sutskever | 9115 | 2018 |
OpenAI Gym G Brockman, V Cheung, L Pettersson, J Schneider, J Schulman, J Tang, ... arXiv preprint arXiv:1606.01540, 2016 | 7210 | 2016 |
InfoGAN: Interpretable representation learning by information maximizing generative adversarial nets X Chen, Y Duan, R Houthooft, J Schulman, I Sutskever, P Abbeel Advances in neural information processing systems 29, 2016 | 5197 | 2016 |
Multi-agent actor-critic for mixed cooperative-competitive environments R Lowe, YI Wu, A Tamar, J Harb, P Abbeel, I Mordatch Advances in neural information processing systems 30, 2017 | 4617 | 2017 |
Domain randomization for transferring deep neural networks from simulation to the real world J Tobin, R Fong, A Ray, J Schneider, W Zaremba, P Abbeel 2017 IEEE/RSJ international conference on intelligent robots and systems …, 2017 | 3087 | 2017 |
Glow: Generative flow with invertible 1x1 convolutions DP Kingma, P Dhariwal Advances in neural information processing systems 31, 2018 | 3075 | 2018 |
Hindsight experience replay M Andrychowicz, F Wolski, A Ray, J Schneider, R Fong, P Welinder, ... Advances in neural information processing systems 30, 2017 | 2663 | 2017 |
Concrete problems in AI safety D Amodei, C Olah, J Steinhardt, P Christiano, J Schulman, D Mané arXiv preprint arXiv:1606.06565, 2016 | 2549 | 2016 |
Weight normalization: A simple reparameterization to accelerate training of deep neural networks T Salimans, DP Kingma Advances in neural information processing systems 29, 2016 | 2134 | 2016 |
Deep reinforcement learning from human preferences PF Christiano, J Leike, T Brown, M Martic, S Legg, D Amodei Advances in neural information processing systems 30, 2017 | 2109 | 2017 |
Improved variational inference with inverse autoregressive flow DP Kingma, T Salimans, R Jozefowicz, X Chen, I Sutskever, M Welling Advances in neural information processing systems 29, 2016 | 1977 | 2016 |
Dota 2 with large scale deep reinforcement learning C Berner, G Brockman, B Chan, V Cheung, P Dębiak, C Dennison, ... arXiv preprint arXiv:1912.06680, 2019 | 1691 | 2019 |
Evolution strategies as a scalable alternative to reinforcement learning T Salimans, J Ho, X Chen, S Sidor, I Sutskever arXiv preprint arXiv:1703.03864, 2017 | 1660 | 2017 |
Learning dexterous in-hand manipulation OpenAI, M Andrychowicz, B Baker, M Chociej, R Józefowicz, B McGrew, ... arXiv preprint arXiv:1808.00177, 2018 | 1606* | 2018 |
Generating long sequences with sparse transformers R Child, S Gray, A Radford, I Sutskever arXiv preprint arXiv:1904.10509, 2019 | 1555 | 2019 |
Sim-to-real transfer of robotic control with dynamics randomization XB Peng, M Andrychowicz, W Zaremba, P Abbeel 2018 IEEE international conference on robotics and automation (ICRA), 3803-3810, 2018 | 1419 | 2018 |
Adversarial training methods for semi-supervised text classification T Miyato, AM Dai, I Goodfellow arXiv preprint arXiv:1605.07725, 2016 | 1238 | 2016 |