Abstract
Trajectory optimizers for model-based reinforcement learning, such as the Cross-Entropy Method (CEM), can yield compelling results even in high-dimensional control tasks and sparse-reward environments. However, their sampling inefficiency prevents them from being used for real-time planning and control. We propose an improved version of the CEM algorithm for fast planning, with novel additions including temporally-correlated actions and memory, requiring 2.7-22× fewer samples and yielding a performance increase of 1.2-10× in high-dimensional control problems.
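To make the two additions named above concrete, here is a minimal sketch of a CEM-style planner that samples temporally-correlated (colored) noise and reuses elite action sequences across iterations. It is an illustrative approximation under stated assumptions, not the paper's implementation: the helper names (correlated_noise, icem_plan), the cost_fn callback (which would roll out a learned model), and all default parameters are hypothetical.

import numpy as np

def correlated_noise(num_samples, horizon, act_dim, beta=2.0):
    # Hypothetical helper: shape the spectrum of white noise as 1/f^(beta/2)
    # to obtain temporally-correlated action perturbations.
    freqs = np.fft.rfftfreq(horizon)
    freqs[0] = freqs[1] if horizon > 1 else 1.0   # avoid division by zero at f = 0
    spectrum = (np.random.randn(num_samples, act_dim, freqs.size)
                + 1j * np.random.randn(num_samples, act_dim, freqs.size))
    spectrum /= freqs ** (beta / 2)
    noise = np.fft.irfft(spectrum, n=horizon, axis=-1)        # (N, A, H)
    noise /= noise.std(axis=-1, keepdims=True) + 1e-8         # roughly unit variance
    return noise.transpose(0, 2, 1)                           # (N, H, A)

def icem_plan(cost_fn, horizon, act_dim, iterations=3,
              num_samples=64, elite_frac=0.1, beta=2.0, elites=None):
    # Sketch of CEM with colored-noise sampling and elite reuse ("memory").
    mean = np.zeros((horizon, act_dim))
    std = np.ones((horizon, act_dim))
    num_elites = max(1, int(elite_frac * num_samples))
    for _ in range(iterations):
        actions = mean + std * correlated_noise(num_samples, horizon, act_dim, beta)
        if elites is not None:                                # keep previous elites in the pool
            actions = np.concatenate([actions, elites], axis=0)
        costs = np.array([cost_fn(a) for a in actions])
        elites = actions[np.argsort(costs)[:num_elites]]
        mean, std = elites.mean(axis=0), elites.std(axis=0)
    return mean[0], elites                                    # first action + elite set for reuse

# Example usage with a toy quadratic cost in place of a model rollout:
first_action, elites = icem_plan(lambda a: float(np.sum(a**2)), horizon=20, act_dim=2)

The intuition behind the colored noise is that it concentrates exploration power at low frequencies, producing smoother, more directed candidate trajectories than independent per-step Gaussian noise; carrying elites over gives the optimizer a warm start and reduces the number of fresh samples needed per planning step.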
Media
CoRL 2020 Presentation
Experiment Videos
Humanoid Stand-Up
Relocate
Additional Information
Links
BibTeX
@inproceedings{PinneriEtAl2020:iCEM,
title = {Sample-efficient Cross-Entropy Method for Real-time Planning},
author = {Pinneri, Cristina and Sawant, Shambhuraj and Blaes, Sebastian and Achterhold, Jan and Stueckler, Joerg and Rolinek, Michal and Martius, Georg},
booktitle = {Conference on Robot Learning 2020},
year = {2020},
url = {https://corlconf.github.io/paper_217}
}