Perceptron Launches Groundbreaking Mk1 AI: Affordable, Multi-Modal Video Processing at Frontier-Level Efficiency
May 12, 2026
Perceptron was founded in late 2024 by Armen Aghajanyan and Akshat Shrivastava, former Meta researchers, to pursue “physical AI” capable of understanding real-world video and sensory streams for robotics, manufacturing, security, and related fields.
Mk1 includes a developer platform (Perceptron SDK) offering features such as Focus for region localization via prompts, Counting for dense scene item counting, and In-Context Learning for few-shot task adaptation.
The architecture supports temporal reasoning, allowing queries about specific moments in long streams and returning structured time codes to aid video clipping and event detection.
The launch spotlights Mk1, a multi-modal physical AI model with strong temporal continuity that can process native video at up to 2 frames per second over a 32K-token context, preserving object identity through occlusions in long streams.
Mk1 is priced at $0.15 per million input tokens and $1.50 per million output tokens, positioned as 80–90% cheaper than Claude Sonnet 4.5, GPT-5, and Gemini 3.1 Pro.
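To make the per-token pricing concrete, the following minimal sketch estimates the cost of a single request from the two quoted rates. The function name `request_cost` and the example token counts are illustrative assumptions, not part of Perceptron's SDK or any published billing formula.

```python
# Per-million-token prices quoted in the article (USD).
INPUT_PRICE_PER_M = 0.15
OUTPUT_PRICE_PER_M = 1.50


def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request at the quoted rates.

    Hypothetical helper: assumes simple linear per-token billing with
    no minimums, caching discounts, or tiered pricing.
    """
    return (
        input_tokens * INPUT_PRICE_PER_M
        + output_tokens * OUTPUT_PRICE_PER_M
    ) / 1_000_000


# Assumed example: 32,000 input tokens and 1,000 output tokens.
cost = request_cost(32_000, 1_000)
print(f"Estimated cost: ${cost:.4f}")
```

Under these assumptions the input side dominates the token count while the output side, priced ten times higher per token, still contributes a noticeable share of the total.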
A broad partner ecosystem backs the launch, with real-world uses including auto-clipping of sports highlights, teleoperation data labeling for robotics, real-time defect detection on manufacturing lines, and context-aware wearables such as smart glasses, signaling practical adoption.
Physical reasoning is a key differentiator, enabling pixel-precise analysis of object dynamics, reading gauges and clocks, and dating vintage footage from visual cues, as shown by a test on a 1906 New York skyscraper construction film.
Perceptron frames Mk1 as part of the Efficiency Frontier, delivering frontier-level reasoning at a blended cost of around $0.30 per million tokens, versus about $2.00 for GPT-5 and about $3.00 for Gemini 3.1 Pro.
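The article does not say how the blended figure is weighted. As a rough sketch, an assumed mix of 8 input tokens per output token reproduces roughly $0.30 from the quoted $0.15 and $1.50 rates, and the same arithmetic shows how the blended figures relate to the "80–90% cheaper" positioning. The 8:1 ratio and the helper names below are assumptions for illustration only.

```python
def blended_price(input_price: float, output_price: float,
                  input_parts: float, output_parts: float) -> float:
    """Weighted-average price per million tokens for a given token mix."""
    total_parts = input_parts + output_parts
    return (input_price * input_parts
            + output_price * output_parts) / total_parts


def savings(cheaper: float, baseline: float) -> float:
    """Fractional saving of `cheaper` relative to `baseline`."""
    return 1 - cheaper / baseline


# Assumed 8:1 input-to-output mix (not stated in the article).
mk1_blended = blended_price(0.15, 1.50, input_parts=8, output_parts=1)
print(f"Mk1 blended: ${mk1_blended:.2f} per million tokens")

# Blended figures for the other models as quoted in the article.
for name, price in [("GPT-5", 2.00), ("Gemini 3.1 Pro", 3.00)]:
    print(f"vs {name}: {savings(mk1_blended, price):.0%} cheaper")
```

With this assumed mix, the savings come out at about 85% against the $2.00 figure and about 90% against the $3.00 figure, which is in line with the range quoted for Mk1's positioning.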
Perceptron employs a dual licensing model: Mk1 is closed-source with API access for enterprise security and performance, while the Isaac series provides open-weights options for edge and low-latency deployments, with commercial licenses available for on-premise use.
Mk1 demonstrates strong performance on spatial and video benchmarks, including EmbSpatialBench, RefSpatialBench, EgoSchema Hard Subset, and VSI-Bench, surpassing several competing models.
