Superposition Blog

Coral NPU: When Embedded AI Stops Being a Concept and Becomes a Full Stack

For years, talking about embedded artificial intelligence almost always meant referring to a promising idea with no real foundation. Most devices ran simplified versions of “AI,” with low accuracy and total dependency on the cloud. The term edge computing was stuck in buzzword territory. There was no stack, no architecture, no clear direction.

With the release of the Coral NPU on October 15, 2025, Google changed this landscape for good. No incrementalism, no half-measures. What was once a set of scattered components (silicon, frameworks, drivers, models) turned into an integrated, open-source, production-ready platform. Running embedded AI finally became viable, accessible, and reproducible.

Embedded AI Stopped Being an Experiment and Became a Product

Coral NPU goes beyond a chip. It represents a complete and open stack:

  • A programmable C runtime with native SDKs for LiteRT (successor to TensorFlow Lite), PyTorch, and JAX via MLIR/IREE.
  • A 100% RISC-V architecture (RV32IM scalar ISA plus RVV 1.0 vector extensions via Zve32x).
  • All code published on GitHub under a permissive license, with a modular design and no vendor lock-in.

Important note:
The matrix execution unit is still under development and will be released on GitHub later in 2025. The initial launch includes the full and functional RISC-V scalar core and vector execution unit.

The reference design delivers 512 GOPS with typical power consumption of around 6 mW at 1 GHz. Those numbers make it possible to run transformers and compact LLMs directly on IoT devices, wearables, TWS earbuds, smart cameras, and AR glasses, classes of hardware that previously struggled to run even a simple neural network.
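A few lines of Python make the scale of those figures concrete. This is a back-of-the-envelope sketch using only the numbers quoted above (512 GOPS, ~6 mW); the coin-cell battery used at the end is an illustrative assumption, not a Coral NPU specification:

```python
# Efficiency implied by the figures quoted above:
# 512 GOPS at ~6 mW of typical power.
ops_per_second = 512e9          # 512 GOPS
power_watts = 6e-3              # ~6 mW typical

# Energy efficiency in TOPS per watt.
tops_per_watt = (ops_per_second / 1e12) / power_watts
print(f"{tops_per_watt:.1f} TOPS/W")   # ~85.3 TOPS/W

# Illustrative only: continuous inference runtime from a 225 mAh
# CR2032 coin cell at a nominal 3 V (assumed battery, not a spec).
battery_wh = 0.225 * 3.0        # ~0.675 Wh of stored energy
hours = battery_wh / power_watts
print(f"{hours:.1f} hours")     # ~112.5 hours, i.e. days of always-on AI
```

An efficiency in the tens of TOPS per watt is what separates "AI as a feature you schedule" from "AI that is simply always on", which is the design point the wearable and TWS use cases above depend on.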

This architecture is already running in commercial production in the Synaptics Astra SL2610 line (covering five pin-to-pin-compatible families: SL2611, SL2613, SL2615, SL2617, and SL2619), the world’s first SoCs with Coral NPU integration. The Torq T1 NPU subsystem combined with Coral NPU reaches up to 1 TOPS. Samples have been available since October 15, 2025, with general availability planned for Q2 2026.

Philosophy: Making the Cloud Optional

Coral NPU was created with a clear purpose: filter, interpret, and act at the edge.

By processing data locally, it reduces the volume of tokens sent to the cloud, eliminates network round-trip latency, cuts power consumption, and makes privacy a design default. Local intelligence becomes the primary decision layer. The cloud becomes the exception, not the rule.

This model decentralizes AI structurally, creating a distributed intelligence network that is faster, safer, and more sustainable by default. Decisions happen where data is born.

Full Verticalization: From Silicon to Framework

While other big tech companies scale horizontally with cloud clusters and infrastructure, Google is verticalizing:

  • Android as the edge operating system
  • LiteRT/TensorFlow as a unified framework
  • Coral NPU as the open-source local execution engine

This integration removes intermediaries and gives developers total control over every layer. The code remains public, the architecture extensible, and the ecosystem grows through strategic partnerships, such as VeriSilicon (announced November 13, 2025), which accelerates adoption in AR glasses, smart home devices, and always-on applications with edge-based LLMs.

The Edge Becomes the Protagonist

This paradigm shift is structural in nature.
Embedded AI has gained extreme efficiency, mature tooling, a common language, and a crystal-clear understanding of where it creates value: exactly where data is born. In the sensor. In the microphone. In the wearable. In the device.

By placing real computational power at the edge, the Coral NPU builds a new kind of infrastructure, one in which intelligence lives in the physical world, not just in data centers.

Critical Infrastructure Starts at the Edge

The launch of Coral NPU goes beyond performance or open-source positioning. It’s a strategic move.

Artificial intelligence is rapidly transitioning from an isolated feature to the invisible infrastructure of our everyday lives. To achieve this, it must be present everywhere, all the time, with zero latency, negligible power consumption, and total autonomy.

The stack Google delivers with Coral is the most pragmatic and complete answer to this future. Embedded AI has ceased to be an experiment and has become critical infrastructure.

With this move, the edge has gained absolute prominence and redefined what we can build.

References

Official Coral NPU website

Open-source repository on GitHub

Fabio Seixas
CEO