Understanding and Improving Hyperbolic Deep Reinforcement Learning

Abstract

The exponential volume growth of hyperbolic geometry makes it a natural fit for reinforcement learning (RL). However, hyperbolic deep RL faces severe optimization challenges, and formal analysis of why optimization fails is lacking. We identify key factors that determine the success and failure of training hyperbolic deep RL agents. By analyzing the gradients of core operations in the Poincaré Ball and Hyperboloid models of hyperbolic geometry, we show that large-norm embeddings destabilize gradient-based training, leading to trust-region violations in proximal policy optimization (PPO). Based on these insights, we introduce Hyper++, a new hyperbolic deep RL agent that consists of three components: (1) feature regularization guaranteeing bounded norms while avoiding the curse of dimensionality from clipping; (2) a categorical value loss for stable critic training; and (3) a more optimization-friendly formulation of hyperbolic network layers. On ProcGen, we show that Hyper++ guarantees stable learning, outperforms prior hyperbolic agents, and reduces wall-clock time by approximately 30%. On Atari-5 with Double DQN, Hyper++ strongly outperforms Euclidean and hyperbolic baselines. We release our code at https://github.com/Probabilistic-and-Interactive-ML/hyper-rl.
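The link between large-norm embeddings and instability near the boundary of the Poincaré ball can be illustrated with a small sketch. It uses the standard exponential map at the origin and the standard conformal factor of the Poincaré ball with curvature -c. The function names (`expmap0`, `conformal_factor`) and the choice c = 1 are assumptions made for illustration only. This is not the Hyper++ implementation, and it does not reproduce the paper's gradient analysis.

```python
import math

def expmap0(v, c=1.0):
    """Exponential map at the origin of the Poincaré ball (curvature -c).

    Maps a Euclidean feature vector v to a point strictly inside the ball
    of radius 1/sqrt(c).
    """
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        return list(v)
    scale = math.tanh(math.sqrt(c) * norm) / (math.sqrt(c) * norm)
    return [scale * x for x in v]

def conformal_factor(x, c=1.0):
    """Conformal factor lambda_x = 2 / (1 - c * ||x||^2).

    It equals 2 at the origin and grows without bound as x approaches
    the boundary of the ball.
    """
    sq_norm = sum(xi * xi for xi in x)
    return 2.0 / (1.0 - c * sq_norm)

# Euclidean features of growing norm land ever closer to the boundary,
# and the conformal factor grows rapidly with them.
for s in (0.5, 2.0, 5.0):
    point = expmap0([s, 0.0])
    print(s, math.hypot(*point), conformal_factor(point))
```

In this sketch, increasing the Euclidean feature norm from 0.5 to 5 moves the embedded point from the interior to within about 1e-4 of the boundary, and the conformal factor increases by several orders of magnitude. Keeping feature norms bounded, as the abstract's first component describes, keeps embeddings away from this regime.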

Authors
  • Klein, Timo
  • Lang, Thomas
  • Shkabrii, Andrii
  • Sturm, Alexander
  • Sidak, Kevin
  • Miklautz, Lukas
  • Plant, Claudia
  • Velaj, Yllka
  • Tschiatschek, Sebastian
Shortfacts
Category
Paper in Conference Proceedings or in Workshop Proceedings (Paper)
Event Title
The Fourteenth International Conference on Learning Representations 2026
Divisions
Data Mining and Machine Learning
Subjects
Artificial Intelligence
Event Location
Rio de Janeiro
Event Type
Conference
Event Dates
April 23-27, 2026
Date
2026