Huang Jiawei
Verified email at inf.ethz.ch - Homepage
Title · Cited by · Year
Minimax weight and q-function learning for off-policy evaluation
M Uehara, J Huang, N Jiang
International Conference on Machine Learning, 9659-9668, 2019
Cited by 177 · 2019
Weightnet: Revisiting the design space of weight networks
N Ma, X Zhang, J Huang, J Sun
European Conference on Computer Vision, 776-792, 2020
Cited by 97 · 2020
Minimax value interval for off-policy evaluation and policy optimization
N Jiang, J Huang
Advances in Neural Information Processing Systems 33, 2747-2758, 2020
Cited by 75 · 2020
A minimax learning approach to off-policy evaluation in confounded Partially Observable Markov Decision Processes
C Shi, M Uehara, J Huang, N Jiang
International Conference on Machine Learning, 2022
Cited by 31* · 2022
From Importance Sampling to Doubly Robust Policy Gradient
J Huang, N Jiang
International Conference on Machine Learning, 4434-4443, 2019
Cited by 26 · 2019
Towards Deployment-Efficient Reinforcement Learning: Lower Bound and Optimality
J Huang, J Chen, L Zhao, T Qin, N Jiang, TY Liu
International Conference on Learning Representations 2022, 2022
Cited by 24 · 2022
On the convergence rate of off-policy policy optimization methods with density-ratio correction
J Huang, N Jiang
International Conference on Artificial Intelligence and Statistics, 2658-2705, 2022
Cited by 10* · 2022
On the Statistical Efficiency of Mean Field Reinforcement Learning with General Function Approximation
J Huang, B Yardim, N He
arXiv preprint arXiv:2305.11283, 2023
Cited by 2 · 2023
Tiered Reinforcement Learning: Pessimism in the Face of Uncertainty and Constant Regret
J Huang, L Zhao, T Qin, W Chen, N Jiang, TY Liu
Advances in Neural Information Processing Systems 35, 2022
Cited by 2 · 2022
Robust Knowledge Transfer in Tiered Reinforcement Learning
J Huang, N He
Advances in Neural Information Processing Systems 36, 2024
2024
Model-Based RL for Mean-Field Games is not Statistically Harder than Single-Agent RL
J Huang, N He, A Krause
arXiv preprint arXiv:2402.05724, 2024
2024
Articles 1–11