Google Unveils Titans: A New AI Model for Efficient Long-Context Processing and Adaptive Learning
December 7, 2025
MIRAS, the accompanying framework, reframes sequence models as associative memory systems, spelling out how information is stored, retained, and updated, and it yields attention-free variants such as Moneta, Yaad, and Memora aimed at robustness in long-context workloads.
Beyond text, Titans has shown results in genomic modeling and time-series forecasting, maintaining efficient training and fast inference for long-context understanding across domains.
Titans combines the speed of a recurrent design with the accuracy of attention, employing a deep neural memory module to summarize and integrate information across millions of tokens.
Titans is claimed to scale to a context window beyond two million tokens, greatly extending long-sequence processing capabilities.
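To see why such context lengths are tractable, here is a minimal, illustrative sketch (not Google's code; every name and constant is an assumption) of the underlying pattern: process the stream in fixed windows and fold each window into a fixed-size memory state, so cost grows roughly linearly with length instead of quadratically.

```python
import torch

def process_long_stream(tokens: torch.Tensor, window: int = 4096) -> torch.Tensor:
    """Illustrative only: compress a long token stream into a fixed-size
    memory state one window at a time, so nothing quadratic in the full
    sequence length is ever materialized."""
    d_model = tokens.size(-1)
    memory = torch.zeros(d_model)                    # fixed-size long-term state
    for start in range(0, tokens.size(0), window):
        chunk = tokens[start:start + window]         # short-term, attention-sized context
        chunk_summary = chunk.mean(dim=0)            # stand-in for windowed attention + pooling
        memory = 0.9 * memory + 0.1 * chunk_summary  # stand-in for the learned memory update
    return memory

# The same loop would handle 2M+ tokens; a smaller stream keeps the demo cheap.
stream = torch.randn(100_000, 64)
print(process_long_stream(stream).shape)  # torch.Size([64])
```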
Titans blends the speed of recurrent networks with the precision of transformers, using a surprise metric to decide which information to commit to long-term memory while managing memory capacity through momentum and adaptive weight decay.
It achieves precise short-term memory via windowed attention and sustains a trainable long-term memory that updates during inference, addressing the limits of traditional Transformers on very long inputs.
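A rough sketch of what such an inference-time update can look like, assuming a plain linear memory and fixed hyperparameters (the actual model uses a deeper memory and input-dependent learning rate, momentum, and decay; nothing here is the released code): the surprise is the gradient of an associative-recall loss, momentum accumulates it over nearby tokens, and a decay term frees capacity.

```python
import torch

def memory_update(W, S, k, v, lr=0.1, momentum=0.9, decay=0.01):
    """One test-time memory step (illustrative):
    W    - memory parameters, here a single linear map from keys to values
    S    - momentum buffer accumulating recent 'surprise'
    k, v - key/value pair derived from the incoming token
    """
    err = W @ k - v                      # how badly the memory recalls this token
    surprise = torch.outer(err, k)       # gradient of 0.5 * ||W @ k - v||^2 w.r.t. W
    S = momentum * S - lr * surprise     # momentum carries surprise across nearby tokens
    W = (1.0 - decay) * W + S            # weight decay acts as forgetting, freeing capacity
    return W, S

d = 8
W, S = torch.zeros(d, d), torch.zeros(d, d)
for _ in range(1000):                    # the memory keeps learning during inference
    k, v = torch.randn(d), torch.randn(d)
    W, S = memory_update(W, S, k, v)
```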
Ablation studies show that deeper memory modules improve performance as sequence length grows, underscoring the importance of memory depth.
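Concretely, "memory depth" here means the number of layers in the neural memory itself; a hypothetical constructor (names and sizes are illustrative, not the paper's configuration) makes the knob explicit.

```python
import torch.nn as nn

def make_memory(d_model: int = 512, depth: int = 2) -> nn.Module:
    """Hypothetical deep memory module: an MLP that maps keys to values.
    The ablation finding is that larger `depth` pays off as sequences grow."""
    layers = []
    for _ in range(depth):
        layers += [nn.Linear(d_model, d_model), nn.SiLU()]
    layers.append(nn.Linear(d_model, d_model))
    return nn.Sequential(*layers)

shallow_memory = make_memory(depth=1)  # cheaper, but weaker on very long inputs
deep_memory = make_memory(depth=3)     # more capacity for long-range structure
```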
Google plans to release code soon and envisions broader applications beyond text, including DNA modeling and potentially video models, contingent on benchmark results translating into real-world performance.
Three Titans variants—Memory as Context (MAC), Memory as Gate (MAG), and Memory as Layer (MAL)—offer different approaches to long-term memory, with MAC excelling on very long sequences.
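In rough, hypothetical code (the real blocks involve persistent memory tokens, normalization, and gating details not shown here), the three variants differ mainly in where the memory's output enters the block.

```python
import torch

# MAC: retrieved memory is prepended as extra context before attention runs.
def memory_as_context(x, mem_tokens, attention):
    return attention(torch.cat([mem_tokens, x], dim=1))

# MAG: attention and memory branches run in parallel and are mixed by a learned gate.
def memory_as_gate(x, attention, memory, gate):
    g = torch.sigmoid(gate(x))
    return g * attention(x) + (1.0 - g) * memory(x)

# MAL: the memory acts as its own layer whose output feeds the attention block.
def memory_as_layer(x, attention, memory):
    return attention(memory(x))
```

Giving attention direct access to the retrieved history, as MAC does, may help explain the reported edge on very long sequences.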
MIRAS provides a theoretical foundation for combining new information with old memories, specifying four design choices (memory architecture, attentional bias, retention gate, and memory update rule) that lead to three attention-free models: Moneta, Yaad, and Memora.
The largest Titans model discussed contains about 760 million parameters and emphasizes long-context capabilities without an unduly large parameter count.
Titans reportedly outperforms several prior models across language modeling, zero-shot reasoning, genomics, and time-series tasks, achieving strong results on the BABILong long-context benchmark with fewer parameters while handling contexts beyond two million tokens.
On BABILong, Titans surpassed larger models such as GPT-4 and Llama 3 variants in long-context comprehension, and even beat a Llama 3 setup using retrieval-augmented generation in certain scenarios.
In testing, Titans outperforms traditional Transformers and many hybrid models on long-context tasks, handling contexts over two million tokens and achieving high accuracy on lengthy needle-in-a-haystack benchmarks.
MIRAS treats sequences as internal lookups linking inputs (keys) to outputs (values) and poses four design questions about the lookup structure and update rules, guiding new attention-free variants.
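A hedged sketch of that framing (illustrative names and objectives; the paper's actual losses and gates differ): one update step minimizes an "attentional bias" loss on the new key/value pair while a retention gate decides how much of the old memory survives, and swapping in different losses and gates yields the different attention-free variants.

```python
import torch

def miras_step(M, k, v, attentional_bias, retention, lr=0.1):
    """One associative-memory update in the MIRAS-style framing (illustrative):
    fit the new (key, value) pair under a chosen loss, while a retention gate
    pulls the memory back toward its previous state."""
    M = M.clone().requires_grad_(True)
    loss = attentional_bias(M @ k, v)        # e.g. an L2 loss gives attention-like recall;
                                             # other losses give Moneta/Yaad/Memora-style variants
    loss.backward()
    with torch.no_grad():
        M_new = retention(M) - lr * M.grad   # retention gate decides what old memory survives
    return M_new.detach()

# Example choices (assumptions for illustration, not the paper's exact objectives):
l2_bias = lambda pred, v: 0.5 * (pred - v).pow(2).sum()
soft_retention = lambda M: 0.99 * M          # simple decay standing in for the retention gate

d = 8
M = torch.zeros(d, d)
k, v = torch.randn(d), torch.randn(d)
M = miras_step(M, k, v, l2_bias, soft_retention)
```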
A core mechanism, the surprise metric, determines which inputs differ meaningfully from what the memory already holds and should therefore be written into long-term memory.
Google formalizes Titans and introduces MIRAS as a framework for continuous learning and long-term memory beyond static pretraining.
Google positions Titans and MIRAS as foundational for a new generation of AI capable of adaptive reasoning over large datasets, continuous learning, and efficient long-context processing with broad research and application implications.
Summary based on 4 sources
Sources

THE DECODER • Dec 5, 2025
Google outlines MIRAS and Titans, a possible path toward continuously learning AI
OfficeChai • Dec 7, 2025
Google Introduces Titans Architecture To Give LLMs Long-Term Memory
