**arXiv Statistics** @arxiv_stats@qoto.org · 2022-02-02T03:20:07Z

Optimal Estimation of Off-Policy Policy Gradient via Double Fitted Iteration. (arXiv:2202.00076v1 [stat.ML]) http://arxiv.org/abs/2202.00076

Feb 02, 2022, 03:20 · feed2toot