Takahisa Imagawa
Unknown affiliation
Verified email at graco.c.u-tokyo.ac.jp
Title
Cited by
Year
Dropout Q-functions for doubly efficient reinforcement learning
T Hiraoka, T Imagawa, T Hashimoto, T Onishi, Y Tsuruoka
arXiv preprint arXiv:2110.02034, 2021
Cited by 54 · 2021
Learning robust options by conditional value at risk optimization
T Hiraoka, T Imagawa, T Mori, T Onishi, Y Tsuruoka
Advances in Neural Information Processing Systems 32, 2019
Cited by 30 · 2019
Meta-model-based meta-policy optimization
T Hiraoka, T Imagawa, V Tangkaratt, T Osa, T Onishi, Y Tsuruoka
Asian Conference on Machine Learning, 129-144, 2021
Cited by 11 · 2021
Enhancements in Monte Carlo tree search algorithms for biased game trees
T Imagawa, T Kaneko
2015 IEEE Conference on Computational Intelligence and Games (CIG), 43-50, 2015
Cited by 8 · 2015
Off-policy meta-reinforcement learning with belief-based task inference
T Imagawa, T Hiraoka, Y Tsuruoka
IEEE Access 10, 49494-49507, 2022
Cited by 7 · 2022
Estimating the maximum expected value through upper confidence bound of likelihood
T Imagawa, T Kaneko
2017 Conference on Technologies and Applications of Artificial Intelligence …, 2017
Cited by 6 · 2017
Optimistic proximal policy optimization
T Imagawa, T Hiraoka, Y Tsuruoka
arXiv preprint arXiv:1906.11075, 2019
Cited by 4 · 2019
Monte Carlo tree search with robust exploration
T Imagawa, T Kaneko
Computers and Games: 9th International Conference, CG 2016, Leiden, The …, 2016
Cited by 2 · 2016
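難しさが手番で異なる局面でのモンテカルロ木探索の性能の改善 [Improving the performance of Monte Carlo tree search in positions where difficulty differs by the player to move]
T Imagawa, T Kaneko
Proceedings of the Game Programming Workshop 2013, 162-169, 2013
Cited by 2 · 2013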
Off-policy meta-reinforcement learning based on feature embedding spaces
T Imagawa, T Hiraoka, Y Tsuruoka
arXiv preprint arXiv:2101.01883, 2021
Cited by 1 · 2021
Refining manually-designed symbol grounding and high-level planning by policy gradients
T Hiraoka, T Onishi, T Imagawa, Y Tsuruoka
arXiv preprint arXiv:1810.00177, 2018
Cited by 1 · 2018
モンテカルロ木探索の改善に関する研究 [A study on improving Monte Carlo tree search]
T Imagawa
(No Title), 2018
Cited by 1 · 2018
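モンテカルロ木探索における状態価値の推定方法の改善 [Improving state-value estimation methods in Monte Carlo tree search]
T Imagawa, T Kaneko
Proceedings of the Game Programming Workshop 2017, 34-41, 2017
Cited by 1 · 2017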
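モンテカルロ木探索における子孫の勝敗確定時のプレイアウト結果の修正 [Correcting playout results in Monte Carlo tree search when the outcome of a descendant node is proven]
T Imagawa, T Kaneko
Proceedings of the Game Programming Workshop 2016, 13-20, 2016
Cited by 1 · 2016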
Unsupervised Discovery of Continuous Skills on a Sphere
T Imagawa, T Hiraoka, Y Tsuruoka
arXiv preprint arXiv:2305.14377, 2023
2023
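難しさが手番で異なるゲームのモデル化とモンテカルロ木探索の性能の分析・改善 [Modeling games whose difficulty differs by the player to move, and analyzing and improving the performance of Monte Carlo tree search]
T Imagawa, T Kaneko
IPSJ Journal 55 (11), 2353-2361, 2014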
2014
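多腕バンディットアルゴリズムの MCTS への応用と性能の分析 [Applying multi-armed bandit algorithms to MCTS and analyzing their performance]
T Imagawa, T Kaneko
Proceedings of the Game Programming Workshop 2014, 145-150, 2014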
2014
Articles 1–17