Understanding and Improving Hyperbolic Deep Reinforcement Learning

Abstract

The exponential volume growth of hyperbolic geometry makes it a natural fit for reinforcement learning (RL). However, hyperbolic deep RL faces severe optimization challenges, and formal analysis of why optimization fails is lacking. We identify key factors that determine the success and failure of training hyperbolic deep RL agents. By analyzing the gradients of core operations in the Poincaré Ball and Hyperboloid models of hyperbolic geometry, we show that large-norm embeddings destabilize gradient-based training, leading to trust-region violations in proximal policy optimization (PPO). Based on these insights, we introduce Hyper++, a new hyperbolic deep RL agent that consists of three components: (1) feature regularization guaranteeing bounded norms while avoiding the curse of dimensionality from clipping; (2) a categorical value loss for stable critic training; and (3) a more optimization-friendly formulation of hyperbolic network layers. On ProcGen, we show that Hyper++ guarantees stable learning, outperforms prior hyperbolic agents, and reduces wall-clock time by approximately 30%. On Atari-5 with Double DQN, Hyper++ strongly outperforms Euclidean and hyperbolic baselines. We release our code at https://github.com/Probabilistic-and-Interactive-ML/hyper-rl.
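
The link between large-norm embeddings and unstable gradients can be illustrated with two textbook Poincaré Ball formulas. The sketch below is not taken from the paper or its code release. It assumes curvature -1 and looks only at the radial direction: the exponential map at the origin sends a Euclidean feature of norm r to a point of norm tanh(r), and the distance from the origin to a point of norm rho is 2 artanh(rho). The function names are hypothetical and chosen for illustration.

```python
import math


def expmap0_radius(r: float) -> float:
    """Norm of exp_0(v) on the curvature -1 Poincare Ball when ||v|| = r."""
    return math.tanh(r)


def expmap0_radial_grad(r: float) -> float:
    """Derivative of tanh(r) with respect to r; it shrinks toward 0 as r grows."""
    return 1.0 - math.tanh(r) ** 2


def dist0_radial_grad(rho: float) -> float:
    """Derivative of d(0, x) = 2 * artanh(rho) with respect to rho = ||x|| < 1."""
    return 2.0 / (1.0 - rho * rho)


if __name__ == "__main__":
    # Assumed feature norms, chosen only to show the trend.
    for r in (0.5, 2.0, 5.0):
        rho = expmap0_radius(r)
        print(
            f"r={r:>4}  rho={rho:.6f}  "
            f"d(expmap)/dr={expmap0_radial_grad(r):.3e}  "
            f"d(dist)/drho={dist0_radial_grad(rho):.3e}"
        )
```

As the feature norm grows, the gradient through the exponential map vanishes while the gradient of the distance with respect to the embedded point grows without bound. In double precision tanh(r) rounds to exactly 1.0 for sufficiently large r, at which point the distance gradient is no longer defined. Keeping feature norms bounded, as the regularization component in the abstract aims to do, avoids both regimes.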

Authors

Klein, Timo
Lang, Thomas
Shkabrii, Andrii
Sturm, Alexander
Sidak, Kevin
Miklautz, Lukas
Plant, Claudia
Velaj, Yllka
Tschiatschek, Sebastian

Shortfacts

Category	Paper in Conference Proceedings or in Workshop Proceedings (Paper)
Event Title	The Fourteenth International Conference on Learning Representations 2026
Divisions	Data Mining and Machine Learning
Subjects	Artificial Intelligence
Event Location	Rio de Janeiro
Event Type	Conference
Event Dates	April 23-27, 2026
Date	2026