Our findings call for a change in how we evaluate performance in deep RL. To this end, we present a more rigorous evaluation methodology, accompanied by an open-source library, rliable, to prevent unreliable results from stagnating the field.


To cite this paper, please use the following reference:

  title={Deep reinforcement learning at the edge of the statistical precipice},
  author={Agarwal, Rishabh and Schwarzer, Max and Castro, Pablo Samuel and Courville, Aaron C and Bellemare, Marc},
  journal={Advances in Neural Information Processing Systems},


Rishabh Agarwal
Google Research, Brain Team and Mila
Pablo Samuel Castro
Google Research, Brain Team
Marc G. Bellemare
Google Research, Brain Team

For questions, please contact us at: rishabhagarwal@google.com.