Neuro-algorithmic Policies Enable Fast Combinatorial Generalization

Marin Vlastelica, Michal Rolinek, Georg Martius

Abstract

Although model-based and model-free approaches to learning the control of systems have achieved impressive results on standard benchmarks, generalization to task variations is still lacking. Recent results suggest that generalization for standard architectures improves only after obtaining exhaustive amounts of data. We give evidence that generalization capabilities are in many cases bottlenecked by the inability to generalize on the combinatorial aspects of the problem. Furthermore, we show that for a certain subclass of the MDP framework, this can be alleviated by neuro-algorithmic architectures.
