Tenstorrent Unveils High-Speed Galaxy Blackhole AI Platform, Claims 10x Faster Video Generation Than Leading GPUs
May 2, 2026
Additional deployments are planned with Virtu Financial, Turiyam, Cirrascale, and ai& across on‑premises and cross‑border environments to showcase real‑world use cases.
Galaxy systems reportedly deliver up to 10x faster real‑time AI video generation than leading GPU systems, demonstrated with Prodia Labs: a 720p, 81-frame video was produced in about 2.4 seconds on a Galaxy supercluster.
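The headline numbers can be sanity-checked with simple arithmetic (an assumption on my part; the announcement reports only the frame count and wall-clock time, not a frame rate):

```python
# Back-of-the-envelope check of the video demo claim.
frames = 81        # frames in the demo clip
seconds = 2.4      # reported generation time

fps = frames / seconds           # effective generation rate
print(f"{fps:.2f} frames/s")     # about 33.75 fps
```

At roughly 33.75 frames per second, generation would outpace standard 24 or 30 fps playback, which is consistent with the "real-time" framing of the claim.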
A0 silicon is already shipping; remaining software bugs are being addressed, and an open‑source software stack for Galaxy Blackhole is in development.
Tenstorrent unveils Galaxy Blackhole servers, a fully networked AI platform combining compute, memory, and networking in a single system to scale AI workloads, and announces their general availability as a unified Networked AI architecture.
Benchmarks of the DeepSeek-R1-0528 671B model on Galaxy superclusters show decoding at over 350 tokens per second per user and sub-4-second time-to-first-token at 100K context.
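Those two figures imply some rough lower bounds (assumed arithmetic on my part; the announcement states only the headline numbers):

```python
# Rough implications of the DeepSeek-R1-0528 671B benchmark claims.
context_tokens = 100_000   # prompt length for the time-to-first-token claim
ttft_seconds = 4.0         # "sub-4 second" upper bound
decode_tps = 350           # tokens/s per user during decoding

prefill_tps = context_tokens / ttft_seconds   # implied minimum prefill rate
ms_per_token = 1000 / decode_tps              # per-token decode latency
print(f"prefill >= {prefill_tps:,.0f} tokens/s")   # >= 25,000 tokens/s
print(f"decode  ~  {ms_per_token:.1f} ms/token")   # ~2.9 ms per token
```

In other words, hitting sub-4-second time-to-first-token on a 100K-token prompt requires processing at least 25,000 prompt tokens per second, independent of the decode rate.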
Galaxy Blackhole targets leading general‑purpose AI performance across workloads, including LLM inference and video tasks, with notable prefill and decode capabilities.
Deployments include integrations with Equinix Distributed AI Hub and partnerships with BetterBrain and OrionVM to enable end‑to‑end AI infrastructure and orchestration for agentic workloads.
Blitz Mode is the feature behind those long‑context results, pairing ultra‑low latency with high throughput: 350+ tokens per second per user and sub-4-second time-to-first-token on the 671B model.
Pricing and scalability: air‑cooled Galaxy Blackhole servers start around $110,000 for a single 32‑chip unit delivering 23 PFLOPS FP8, with multi‑server superclusters (4–36 servers) starting at $440,000.
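The quoted list prices allow a simple price/performance sketch (assumed arithmetic; actual configurations, discounts, and scaling efficiency may differ):

```python
# Price/performance sketch from the quoted list prices.
server_price = 110_000   # USD, air-cooled 32-chip Galaxy Blackhole server
server_pflops = 23       # FP8 PFLOPS per server
cluster_price = 440_000  # USD, entry-level supercluster

usd_per_pflop = server_price / server_pflops  # cost per FP8 PFLOP
cluster_pflops = 4 * server_pflops            # 4 servers, assuming linear scaling
print(f"${usd_per_pflop:,.0f} per FP8 PFLOP")                       # ~$4,783
print(f"entry supercluster: {cluster_pflops} PFLOPS, ${cluster_price:,}")
```

The entry supercluster price ($440,000 for four servers) is exactly four times the single-server price, suggesting linear per-unit pricing at the low end of the 4–36 server range.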
Galaxy is a full‑stack AI platform that plugs into open‑source ecosystems via TT‑Forge and TT‑Lang, supporting roughly 90% of HuggingFace models on Tenstorrent hardware.
Public demonstrations occurred during the TT-Deploy livestream, with analysis and coverage by a tech outlet.
Summary based on 2 sources

