Harm van Seijen
Microsoft Research
Verified email at microsoft.com - Homepage
Title
Cited by
Year
A theoretical and empirical analysis of Expected Sarsa
H Van Seijen, H Van Hasselt, S Whiteson, M Wiering
2009 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement …, 2009
Cited by 128 · 2009
Hybrid reward architecture for reinforcement learning
H Van Seijen, M Fatemi, J Romoff, R Laroche, T Barnes, J Tsang
Advances in Neural Information Processing Systems, 5392-5402, 2017
Cited by 125 · 2017
True online TD(λ)
H van Seijen, RS Sutton
International Conference on Machine Learning, 692-700, 2014
Cited by 80 · 2014
True online temporal-difference learning
H Van Seijen, AR Mahmood, PM Pilarski, MC Machado, RS Sutton
The Journal of Machine Learning Research 17 (1), 5057-5096, 2016
Cited by 62 · 2016
A Deeper Look at Planning as Learning from Replay
H van Seijen, RS Sutton
International Conference on Machine Learning, 2015
Cited by 35 · 2015
Planning by prioritized sweeping with small backups
H Van Seijen, RS Sutton
arXiv preprint arXiv:1301.2343, 2013
Cited by 24* · 2013
Exploiting Best-Match Equations for Efficient Reinforcement Learning.
H van Seijen, S Whiteson, H van Hasselt, M Wiering
Journal of Machine Learning Research 12 (6), 2011
Cited by 22 · 2011
Multi-advisor reinforcement learning
R Laroche, M Fatemi, J Romoff, H van Seijen
arXiv preprint arXiv:1704.00756, 2017
Cited by 13 · 2017
Efficient abstraction selection in reinforcement learning
H Van Seijen, S Whiteson, L Kester
Computational Intelligence 30 (4), 657-699, 2014
Cited by 13 · 2014
On value function representation of long horizon problems
L Lehnert, R Laroche, H van Seijen
AAAI, 2018
Cited by 11 · 2018
Effective multi-step temporal-difference learning for non-linear function approximation
H van Seijen
arXiv preprint arXiv:1608.05151, 2016
Cited by 10 · 2016
Switching between representations in reinforcement learning
H Van Seijen, S Whiteson, L Kester
Interactive Collaborative Information Systems, 65-84, 2010
Cited by 10 · 2010
Separation of concerns in reinforcement learning
H van Seijen, M Fatemi, J Romoff, R Laroche
arXiv preprint arXiv:1612.05159, 2016
Cited by 8 · 2016
Forward actor-critic for nonlinear function approximation in reinforcement learning
V Veeriah, H van Seijen, RS Sutton
Proceedings of the 16th Conference on Autonomous Agents and MultiAgent …, 2017
Cited by 6 · 2017
Switching between different state representations in reinforcement learning
H van Seijen, B Bakker, L Kester
Proceedings of the 26th IASTED International Conference on Artificial …, 2008
Cited by 6 · 2008
Using a logarithmic mapping to enable lower discount factors in reinforcement learning
H Van Seijen, M Fatemi, A Tavakoli
Advances in Neural Information Processing Systems, 14134-14144, 2019
Cited by 4 · 2019
Postponed updates for temporal-difference reinforcement learning
H van Seijen, S Whiteson
2009 Ninth International Conference on Intelligent Systems Design and …, 2009
Cited by 4 · 2009
Reinforcement learning with multiple, qualitatively different state representations
HH van Seijen, B Bakker, L Kester
Cited by 4 · 2007
Improving scalability of reinforcement learning by separation of concerns
H van Seijen, M Fatemi, J Romoff, R Laroche
arXiv preprint arXiv:1612.05159, 2016
Cited by 3 · 2016
Articles 1–20