Inferact Secures $150M in Seed Funding to Revolutionize AI Inference Technology
January 22, 2026
Inferact, a startup formed from the open-source vLLM project, has raised a $150 million seed round that values the company at around $800 million, led by Andreessen Horowitz and Lightspeed Venture Partners.
Inferact aims to sustain and scale the vLLM open-source project while building a commercial, universal inference engine for next-generation inference across diverse hardware.
The broader AI app ecosystem, including Cursor, ChatGPT, Decagon, and Harvey, can run on current models, reducing the need for frequent new model releases and shifting the industry's focus toward serving those models efficiently.
Competition includes open-source spinoffs such as RadixArk as well as cloud providers; AWS is using its Inferentia and Trainium chips to cut inference cost and latency, challenging NVIDIA.
The seed round was co-led by a16z and Lightspeed, with additional participation from Sequoia Capital, Altimeter Capital, Redpoint Ventures, and ZhenFund.
vLLM uses PagedAttention and model quantization to reduce memory usage and boost throughput, improving both performance and cost efficiency for inference.
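As a rough illustration, the sketch below serves a quantized model with vLLM's offline Python API; the checkpoint name is illustrative and the exact options can vary between vLLM versions.

# Minimal sketch of serving an AWQ-quantized model with vLLM's offline API.
# The checkpoint name is illustrative; any AWQ-quantized model would work similarly.
from vllm import LLM, SamplingParams

# PagedAttention manages the KV cache in fixed-size blocks internally,
# so no explicit cache configuration is needed here.
llm = LLM(model="TheBloke/Llama-2-7B-Chat-AWQ", quantization="awq")

sampling = SamplingParams(temperature=0.7, max_tokens=128)
outputs = llm.generate(["Explain PagedAttention in one sentence."], sampling)

for out in outputs:
    print(out.outputs[0].text)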
Continuous batching and memory-efficient scheduling in vLLM enable higher throughput and steadier tail latency across diverse models, lowering cost per 1,000 tokens and improving reliability under burst traffic.
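To make the idea of continuous (iteration-level) batching concrete, the toy scheduler below admits waiting requests into the running batch at every decode step and retires finished ones immediately; it is a conceptual sketch with made-up names, not vLLM's actual scheduler.

# Toy illustration of continuous (iteration-level) batching, not vLLM's real scheduler.
# New requests join the running batch at every decode step instead of waiting for the
# whole batch to drain, which keeps the accelerator busy and smooths tail latency.
from collections import deque
from dataclasses import dataclass

@dataclass
class Request:
    rid: int
    max_new_tokens: int
    generated: int = 0

def decode_step(batch):
    # Stand-in for one model forward pass that emits one token per active request.
    for req in batch:
        req.generated += 1

def continuous_batching(incoming, max_batch_size=8):
    waiting = deque(incoming)
    running = []
    steps = 0
    while waiting or running:
        # Admit new requests whenever a slot frees up (iteration-level scheduling).
        while waiting and len(running) < max_batch_size:
            running.append(waiting.popleft())
        decode_step(running)
        # Retire finished requests immediately so their slots are reused next step.
        running = [r for r in running if r.generated < r.max_new_tokens]
        steps += 1
    return steps

if __name__ == "__main__":
    reqs = [Request(rid=i, max_new_tokens=16 + 8 * (i % 3)) for i in range(20)]
    print("decode steps needed:", continuous_batching(reqs))

With static batching, the same workload would stall on the slowest request in each batch; admitting and retiring requests per step is what reduces cost per token under bursty traffic.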
Implementation challenges include balancing ongoing open-source development with enterprise needs, as well as establishing robust SLAs, pricing, and partnerships with cloud providers and system integrators.
The company plans to capitalize on the shift toward AI inference and serving, targeting faster, cheaper, and more scalable deployment, with potential focus on managed services, adaptive batching, autoscaling, and multi-cloud or hybrid setups.
The move follows a trend of academic-to-commercial transitions, with UC Berkeley-origin projects like SGLang and RadixArk illustrating growing investment in inference runtimes and scheduling layers.
Inferact's CEO, Simon Mo, points to existing vLLM users such as AWS and a major shopping app as evidence of enterprise interest in faster, more affordable AI inference.
Inferact intends to convert current open-source users into enterprise customers while collaborating with large players rather than competing directly with them.
Summary based on 8 sources
Sources

TechCrunch • Jan 22, 2026
Inference startup Inferact lands $150M to commercialize vLLM
SiliconANGLE • Jan 23, 2026
Inferact launches with $150M in funding to commercialize vLLM
Andreessen Horowitz • Jan 22, 2026
Investing in Inferact