David Lindner
Title
Cited by
Year
Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback
S Casper, X Davies, C Shi, TK Gilbert, J Scheurer, J Rando, R Freedman, ...
arXiv preprint arXiv:2307.15217, 2023
Cited by: 137; Year: 2023
Red-Teaming the Stable Diffusion Safety Filter
J Rando, D Paleka, D Lindner, L Heim, F Tramèr
NeurIPS ML Safety Workshop, 2022
Cited by: 65; Year: 2022
Tracr: Compiled Transformers as a Laboratory for Interpretability
D Lindner, J Kramár, M Rahtz, T McGrath, V Mikulik
Conference on Neural Information Processing Systems (NeurIPS), 2023
Cited by: 27; Year: 2023
Sensing Social Media Signals for Cryptocurrency News
J Beck, R Huang, D Lindner, T Guo, C Zhang, D Helbing, ...
Companion Proceedings of The 2019 World Wide Web Conference, 2019
Cited by: 16; Year: 2019
GoSafeOpt: Scalable Safe Exploration for Global Optimization of Dynamical Systems
B Sukhija, M Turchetta, D Lindner, A Krause, S Trimpe, D Baumann
Artificial Intelligence, 103922, 2023
Cited by: 15; Year: 2023
Information Directed Reward Learning for Reinforcement Learning
D Lindner, M Turchetta, S Tschiatschek, K Ciosek, A Krause
Conference on Neural Information Processing Systems (NeurIPS), 2021
Cited by: 14; Year: 2021
Active exploration for inverse reinforcement learning
D Lindner, A Krause, G Ramponi
Advances in Neural Information Processing Systems 35, 5843-5853, 2022
Cited by: 11; Year: 2022
Vision-Language Models are Zero-Shot Reward Models for Reinforcement Learning
J Rocamonde, V Montesinos, E Nava, E Perez, D Lindner
arXiv preprint arXiv:2310.12921, 2023
Cited by: 9; Year: 2023
Humans are not Boltzmann Distributions: Challenges and Opportunities for Modelling Human Feedback and Interaction in Reinforcement Learning
D Lindner, M El-Assady
Communication in Human-AI Interaction Workshop (CHAI) at IJCAI-ECAI, 2022
Cited by: 8; Year: 2022
Addressing the Long-term Impact of ML Decisions via Policy Regret
D Lindner, H Heidari, A Krause
International Joint Conferences on Artificial Intelligence (IJCAI), 2021
Cited by: 6; Year: 2021
Challenges for Using Impact Regularizers to Avoid Negative Side Effects
D Lindner, K Matoba, A Meulemans
SafeAI Workshop at AAAI 2021, 2021
Cited by: 6; Year: 2021
Interactively Learning Preference Constraints in Linear Bandits
D Lindner, S Tschiatschek, K Hofmann, A Krause
International Conference on Machine Learning (ICML), 2022
Cited by: 5; Year: 2022
Topological semimetals and insulators in three-dimensional honeycomb materials
D Wawrzik, D Lindner, M Hermanns, S Trebst
Physical Review B 98 (11), 115114, 2018
Cited by: 5; Year: 2018
Learning Safety Constraints from Demonstrations with Unknown Rewards
D Lindner, X Chen, S Tschiatschek, K Hofmann, A Krause
arXiv preprint arXiv:2305.16147, 2023
Cited by: 4; Year: 2023
RLHF-Blender: A Configurable Interactive Interface for Learning from Diverse Human Feedback
Y Metz, D Lindner, R Baur, D Keim, M El-Assady
Interactive Learning with Implicit Human Feedback Workshop at ICML, 2023
Cited by: 3; Year: 2023
Learning What To Do by Simulating the Past
D Lindner, R Shah, P Abbeel, A Dragan
International Conference on Learning Representations (ICLR), 2021
Cited by: 3; Year: 2021
Detecting Spiky Corruption in Markov Decision Processes
J Mancuso, T Kisielewski, D Lindner, A Singh
Workshop on Artificial Intelligence Safety at IJCAI 2019, 2019
Cited by: 2; Year: 2019
Evaluating Frontier Models for Dangerous Capabilities
M Phuong, M Aitchison, E Catt, S Cogan, A Kaskasoli, V Krakovna, ...
arXiv preprint arXiv:2403.13793, 2024
Year: 2024
Algorithmic Foundations for Safe and Efficient Reinforcement Learning from Human Feedback
D Lindner
ETH Zurich, 2023
Year: 2023