https://arxiv.org/pdf/2601.20802
Jonas Hübotter, Frederike Lübeck, Lejs Behric, Anton Baumann, Marco Bagatella, Daniel Marta, Ido Hakimi, Idan Shenfeld, Thomas Kleine Buening, Carlos Guestrin, Andreas Krause
ETH Zurich, Max Planck Institute for Intelligent Systems, MIT, Stanford
🚀 Unlocking Reinforcement Learning: The Power of Self-Distillation!
In this post, we