How and Why Microcontrollers Can Help Democratize Access to Edge AI
Contributed By DigiKey's North American Editors
2025-02-18
Over the past few years, edge AI has been increasing in popularity. The associated global market is expected to grow at a compounded annual rate of 27.8% until 2035, increasing to a net value of $356.84 billion.
A variety of factors are fueling this demand. Processing data at the edge addresses security concerns that companies might have about routing sensitive or proprietary information to the cloud. Edge processing also decreases latency, which can be critical in real-time applications where split-second decisions must be made. Industrial IoT (IIoT) devices deliver data-driven operations, which in turn expand the use cases for edge AI. Rapidly growing deployments—from portable medical devices to wearables and IIoT—are boosting the market for AI at the edge.
As the technology grows in popularity, there is an associated growing demand for components that can handle data processing needs in embedded systems.
The processing choice: microcontrollers or microprocessors
The vast majority of IoT devices deployed today in industrial and other embedded gear are low-power devices with very little memory. The processing power they have comes from small embedded microcontrollers (MCUs). These MCUs have low-power architectures that allow the embedded systems to be far more cost-effective than those with microprocessors.
Up until the advent of edge AI, MCUs served the processing needs of IoT devices well. But traditional MCUs usually cannot deliver the compute power needed for the more complex machine learning algorithms that are the hallmark of edge AI applications. Such algorithms typically run on graphics processing units (GPUs) and microprocessors, which have more computing power. Using these components, however, comes with its own drawbacks, chief among them power consumption: neither microprocessors nor GPUs are energy-efficient solutions. As a result, microprocessor-driven edge computing might not be the best fit for all edge AI applications, and vendors choose to rely on MCUs instead.
Standalone MCUs are less expensive than GPUs and microprocessors. To scale edge AI, there is a growing need to leverage the advantages of MCUs—low cost and low power consumption—while also increasing computational power.
Indeed, over the years, a few factors have converged to increase the capabilities of MCUs at the edge.
What is aiding the use of MCUs at the edge
While the general assumption has been that traditional MCUs are too lightweight for AI-related data processing, changes both in MCU design and in the wider technology ecosystem are spurring their adoption in edge AI use cases.
These factors include:
- The integration of AI accelerators in MCUs: When the MCU alone struggles to meet edge computing demands, integrating it with an AI/ML accelerator such as a neural processing unit (NPU) or digital signal processor (DSP) improves performance.
For example, the STM32N6 series MCUs (Figure 1) from STMicroelectronics are based on the Arm Cortex-M55 running at 800 MHz. Arm Helium vector processing technology brings DSP capabilities to a standard CPU. The STM32N6 is the first STM32 MCU to embed the ST Neural-ART Accelerator, an in-house developed NPU engineered for power-efficient edge AI applications.
Figure 1: The STM32N6 is the first STM32 MCU that embeds the ST Neural-ART Accelerator, an in-house developed neural processing unit (NPU) engineered for power-efficient edge AI applications. (Image Source: STMicroelectronics)
- Optimized AI models for the edge: Heavy-duty AI and machine learning algorithms cannot simply transfer to MCUs. They need to be optimized for constrained computing resources. Compact AI architectures like TinyML and MobileNet do just that together with optimization techniques, enabling even MCUs at the edge to execute AI algorithms. STMicroelectronics launched STM32Cube.AI, a software solution that converts a neural network into optimized C code for STM32 MCUs. Using the solution in conjunction with the STM32N6 helps ensure the performance needed for edge AI applications despite processing and memory constraints.
- The rise of AI ecosystems: Simply having a hardware component capable of AI-related processing at the edge is not sufficient. Executing AI algorithms at the edge requires developer-friendly ecosystems that help make AI deployments easier. Specific tools like TensorFlow Lite for Microcontrollers help deliver such solutions. Open-source communities like Hugging Face and other platforms offer pre-trained models and code libraries that developers can test and tailor for their specific use cases. Such AI ecosystems lower the barrier for adoption of the technology and democratize access even for resource-strapped businesses that might not be able to develop proprietary AI models from scratch.
STMicroelectronics has a specifically tailored hardware and software ecosystem, the ST Edge AI Suite, for optimized edge AI solutions. The suite consolidates many of ST’s AI libraries and tools to make it easier for developers to find models, data sources, tools, and compilers that can generate code for the microcontroller.
Pre-trained models in a model zoo provide a starting point for developers. These models use the Open Neural Network Exchange (ONNX) format, an open standard to represent machine learning models in areas such as computer vision (CV), natural language processing (NLP), generative AI (GenAI), and graph machine learning.
- Standards for interoperability: While AI ecosystems have helped companies test edge AI use cases, open and standardized model formats have enabled seamless integration across hardware systems. Compatibility across software tools and MCUs has helped decrease the roadblocks to edge AI implementations.
- Attention to security at the edge: Beyond eliminating, or at least reducing, the need to send data to the cloud for processing, MCUs provide additional layers of security. They usually include features like hardware encryption and secure boot, which protect both data and AI models from malicious actors.
Noteworthy features of the STM32N6 hardware
The STM32N6 series includes a high-performance MCU with an NPU, a camera module bundle, and a discovery kit. The series uses a typical Arm Cortex-M architecture and has several key features that make these devices suitable for AI at the edge. These include:
- Neural-ART Accelerator, which runs neural network models. It is optimized for intensive AI algorithms, clocked at 1 GHz, and provides 600 GOPS at an average energy efficiency of 3 TOPS/W.
- Support for "Helium" M-Profile Vector Extension (MVE) instructions, a set of Arm instructions that enable powerful neural network and DSP functions. These instructions operate on 16-bit and 32-bit floating-point values as well as 8-bit, 16-bit, and 32-bit integer vectors, letting them efficiently manipulate the low-precision numbers that dominate ML model processing.
- ST Edge AI Suite, a repository of free software tools, use cases, and documentation that helps developers of all experience levels create AI for the intelligent edge. The suite also includes tools like the ST Edge AI Developer Cloud, which features dedicated neural networks in the STM32 model zoo, a board farm for real-world benchmarking, and more.
- Nearly 300 configurable multiply-accumulate (MAC) units and two 64-bit AXI memory buses for a throughput of 600 GOPS (roughly 300 MACs × 2 operations per MAC × 1 GHz ≈ 600 GOPS).
- Built-in dedicated image signal processor (ISP), which can directly interface with multiple 5-megapixel cameras. When building systems that incorporate cameras, developers have to tune the ISP for a particular CMOS camera sensor and its lens, a task that typically requires specialized expertise or third-party help. ST provides developers with dedicated desktop software called iQTune for this purpose. Running on a Linux workstation, it communicates with embedded code on the STM32, analyzes color accuracy, image quality, and statistics, then configures the ISP registers appropriately.
- Support for MIPI CSI-2, the most popular camera interface in mobile applications; because the ISP is built in, no external ISP compatible with this camera serial interface is required.
- Many additional capabilities on a single device mean that developers can now run a neural network in conjunction with a GUI without having to use multiple MCUs.
- Robust security, targeting SESIP Level 3 and PSA Level 3 certifications.
Conclusion
Machine learning applications running at the edge used to need heavy-duty microprocessors in embedded systems to bear the burden of executing complex algorithms. Thanks to powerful MCUs like the STM32N6 series from STMicroelectronics, companies are now able to democratize AI at the edge. STMicroelectronics delivers an entire ecosystem for AI deployment at the edge, including the software and hardware components for inferencing.

Disclaimer: The opinions, beliefs, and viewpoints expressed by the various authors and/or forum participants on this website do not necessarily reflect the opinions, beliefs, and viewpoints of DigiKey or official policies of DigiKey.