Understanding and Improving Hyperbolic Deep Reinforcement Learning
The exponential volume growth of hyperbolic geometry makes it a natural fit for reinforcement learning (RL). However, hyperbolic deep RL faces severe optimization challenges, and formal analysis of why optimization fails is lacking. We identify key factors that determine the success and failure of training hyperbolic deep RL agents. By analyzing the gradients of core operations in the Poincaré Ball and Hyperboloid models of hyperbolic geometry, we show that large-norm embeddings destabilize gradient-based training, leading to trust-region violations in proximal policy optimization (PPO). Based on these insights, we introduce Hyper++, a new hyperbolic deep RL agent that consists of three components: (1) feature regularization guaranteeing bounded norms while avoiding the curse of dimensionality from clipping; (2) a categorical value loss for stable critic training; and (3) a more optimization-friendly formulation of hyperbolic network layers. On ProcGen, we show that Hyper++ guarantees stable learning, outperforms prior hyperbolic agents, and reduces wall-clock time by approximately 30%. On Atari-5 with Double DQN, Hyper++ strongly outperforms Euclidean and hyperbolic baselines. We release our code at https://github.com/Probabilistic-and-Interactive-ML/hyper-rl.
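The claim that large-norm embeddings destabilize gradient-based training can be illustrated with a small numerical sketch. This is not the paper's analysis or the Hyper++ code. It assumes a Poincaré ball with curvature -1 and uses the distance to the origin, d(x) = 2 artanh(||x||), as a simple stand-in for the "core operations" mentioned above. The function names are hypothetical. The Euclidean gradient of this distance has norm 2 / (1 - ||x||^2), which grows without bound as the embedding approaches the boundary of the ball.

```python
import numpy as np

def poincare_dist_to_origin(x):
    """Geodesic distance from the origin in the Poincare ball (curvature -1 assumed)."""
    r = np.linalg.norm(x)
    return 2.0 * np.arctanh(r)

def grad_norm_dist_to_origin(x):
    """Norm of the Euclidean gradient of the distance above: 2 / (1 - ||x||^2)."""
    r = np.linalg.norm(x)
    return 2.0 / (1.0 - r**2)

# Illustrative radii only; points move toward the boundary of the unit ball.
radii = [0.5, 0.9, 0.99, 0.999]
grad_norms = [grad_norm_dist_to_origin(np.array([r, 0.0])) for r in radii]

for r, g in zip(radii, grad_norms):
    d = poincare_dist_to_origin(np.array([r, 0.0]))
    print(f"||x|| = {r}: distance = {d:.3f}, gradient norm = {g:.1f}")
```

Under these assumptions, the gradient norm increases sharply as ||x|| approaches 1, which is one intuition for why bounding feature norms may help keep updates within a trust region.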
- Klein, Timo
- Lang, Thomas
- Shkabrii, Andrii
- Sturm, Alexander
- Sidak, Kevin
- Miklautz, Lukas
- Plant, Claudia
- Velaj, Yllka
- Tschiatschek, Sebastian
Category | Paper in Conference Proceedings or in Workshop Proceedings (Paper)
Event Title | The Fourteenth International Conference on Learning Representations 2026
Divisions | Data Mining and Machine Learning
Subjects | Artificial Intelligence
Event Location | Rio de Janeiro
Event Type | Conference
Event Dates | April 23-27, 2026
Date | 2026
