FOGO: A New Era in Optimizing AI Memory Retention

By Marcus Yip, June 10, 2026

FOGO, a novel optimizer, challenges conventional training norms by addressing gradient interference, boosting AI learning and retention.

Forgetting isn't just a problem in continual learning. It's a widespread issue in AI optimization. Dominant mini-batch gradients often overshadow less frequent yet valuable update directions, leading to short-term forgetting. Over time, this compounds into long-term forgetting, a classic failure in AI training.

Introducing FOGO

FOGO is an optimizer designed to identify and resolve gradient interference, both in standard training and in continual learning. By spectrally orthogonalizing momentum updates, it keeps dominant directions from hijacking optimization. It then encodes representative past directions into a compact memory using random projections, which approximately preserve pairwise distances even in the reduced-dimensional space.
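To make those two ingredients concrete, here is a minimal NumPy sketch. It is not FOGO's actual implementation: the function names, the use of an exact SVD for the orthogonalization, the Gaussian projection matrix, and all sizes are assumptions chosen for illustration.

```python
import numpy as np


def spectral_orthogonalize(momentum):
    """Set every singular value of `momentum` to 1, keeping its singular vectors.

    Hypothetical helper: an exact SVD is used here for clarity; a real
    optimizer may well use a cheaper approximation.
    """
    u, _, vt = np.linalg.svd(momentum, full_matrices=False)
    return u @ vt


def random_projection(dim_in, dim_out, rng):
    """Gaussian projection matrix, scaled so squared norms are kept in expectation."""
    return rng.standard_normal((dim_out, dim_in)) / np.sqrt(dim_out)


def pairwise_distances(x):
    """Euclidean distance between every pair of rows of `x`."""
    diff = x[:, None, :] - x[None, :, :]
    return np.linalg.norm(diff, axis=-1)


rng = np.random.default_rng(0)

# A momentum matrix dominated by a single rank-one direction.
dominant = 10.0 * np.outer(rng.standard_normal(8), rng.standard_normal(6))
momentum = dominant + 0.1 * rng.standard_normal((8, 6))
update = spectral_orthogonalize(momentum)

# After orthogonalization no direction dominates: all singular values are ~1.
singular_values = np.linalg.svd(update, compute_uv=False)

# Compress five flattened "past directions" from 4096 to 512 dimensions.
directions = rng.standard_normal((5, 4096))
projection = random_projection(4096, 512, rng)
compressed = directions @ projection.T

# Ratio of compressed to original distances for every distinct pair.
off_diagonal = ~np.eye(len(directions), dtype=bool)
distance_ratio = (
    pairwise_distances(compressed)[off_diagonal]
    / pairwise_distances(directions)[off_diagonal]
)
```

In this sketch the entries of `distance_ratio` stay close to 1, which is the sense in which random projections preserve pairwise distances: approximately, and with an error that shrinks as the compressed dimension grows.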

How Does It Work?

At each training step, FOGO resolves conflicts between the current update and the stored directions through orthogonal corrections. These corrections are applied through a proximal step, which keeps overhead low because no additional data needs to be stored.
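One common way to express an orthogonal correction is sketched below, under stated assumptions. An update "conflicts" with a stored direction when their inner product is negative, and the conflicting component is projected out. This follows the general gradient-projection idea and is not necessarily FOGO's exact rule. The name `resolve_conflicts` and the toy vectors are hypothetical, and the proximal step mentioned above is not modeled.

```python
import numpy as np


def resolve_conflicts(update, memory):
    """Remove from `update` any component that opposes a stored direction.

    Hypothetical helper: directions are handled one after another, so with
    non-orthogonal stored directions a later projection can reintroduce a
    small conflict with an earlier one.
    """
    corrected = update.astype(float)  # astype returns a copy
    for direction in memory:
        norm_sq = direction @ direction
        if norm_sq == 0.0:
            continue  # skip empty memory slots
        overlap = corrected @ direction
        if overlap < 0.0:  # the update points against this stored direction
            corrected -= (overlap / norm_sq) * direction
    return corrected


# Two stored directions (rows) and an update that opposes the first one.
memory = np.array([[1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0]])
update = np.array([-2.0, 3.0, 1.0])

corrected = resolve_conflicts(update, memory)
print(corrected)  # [0. 3. 1.]
```

Only the conflicting component is removed here. The part of the update that agrees with the second stored direction, and the part that neither direction covers, pass through unchanged.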

FOGO's practical applications span several domains. In class-imbalanced classification, continual visual learning under domain and class shifts, continual fine-tuning of LLaVA-7B, and GPT-2 pretraining, FOGO is reported to consistently outperform established optimizers such as Adam and Muon.

Why FOGO Matters

The reported results tell the story: FOGO improves both convergence and knowledge retention. In a world where AI's adaptiveness is essential, memory retention can't be compromised.

Picture AI systems forgetting valuable lessons because of outdated optimization methods. FOGO aims to ensure that models are not just learning but also retaining critical knowledge. Is it too bold to suggest that this method might become the new standard?

FOGO not only addresses a technical challenge but also paves the way for more resilient AI systems. If the reported results hold up, it could reshape how we approach AI training.
