Quickly Deploy Powerful and Efficient AI and Machine Learning Using Renesas RA8M1 MCUs

By Kenton Williston

Contributed By DigiKey's North American Editors

2024-03-20

The rise of artificial intelligence (AI), machine learning (ML), and other computationally intensive workloads at the network edge for the Internet of Things (IoT) is putting an extra processing load on microcontrollers (MCUs). Handling these new workloads increases power consumption, even as designers are asked to minimize power and accelerate time to market.

Designers need a computing option that retains the efficiency of an MCU, while adding high-performance features tailored specifically to low-power use cases. This option should also preserve the simple deployment models associated with traditional MCUs, while adding sufficient features to support the sophisticated applications enabled by AI and ML, such as voice control and predictive maintenance.

This article discusses the factors driving the demand for AI and ML and explains why new processor architectures are needed to deliver these capabilities efficiently. It then introduces the RA8M1 MCU family from Renesas and shows how it can be used to address these requirements.

The requirements of edge AI and ML

The demand for AI and ML is increasing in edge IoT applications ranging from building automation and industrial devices to home appliances. Even relatively small, low-power embedded systems are now tasked with workloads such as keyword spotting, voice command control, and audio/image processing. Target applications include sensor hubs, drone navigation and control, augmented reality (AR), virtual reality (VR), and communications equipment.

To minimize energy usage, overhead, and latency while ensuring privacy, processing data at the edge is often preferred to sending it to the cloud. This is challenging for designers as edge devices are frequently resource-constrained, particularly when battery-powered.

Enhanced MCUs for edge computing

AI and ML workloads typically involve performing the same mathematical operation repeatedly across a large data set. These workloads are amenable to acceleration using single instruction, multiple data (SIMD) processing. SIMD performs several mathematical operations in parallel, delivering considerably higher throughput and better power efficiency than conventional processing.
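As a minimal sketch of the kind of loop involved, consider a dot product over 8-bit quantized values, a common building block of neural network layers. The function name `dot_q7` is hypothetical and the code is plain, portable C rather than vendor code. A vectorizing compiler or hand-written SIMD routine can process several of these multiply-accumulate steps per instruction.

```c
#include <stddef.h>
#include <stdint.h>

/* Dot product of two int8 vectors with a 32-bit accumulator.
 * Every iteration applies the same multiply-accumulate to different data,
 * which is the pattern a SIMD unit can execute several lanes at a time.
 * dot_q7 is an illustrative name, not a library function. */
int32_t dot_q7(const int8_t *a, const int8_t *b, size_t n)
{
    int32_t acc = 0;
    for (size_t i = 0; i < n; ++i) {
        acc += (int32_t)a[i] * (int32_t)b[i];
    }
    return acc;
}
```

On a scalar core, this loop handles one pair of elements per pass. The iterations are independent apart from the accumulator, which is what makes the loop a good fit for SIMD.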

Because traditional MCUs lack SIMD functionality, they need help executing AI and ML workloads. One solution is to use a digital signal processor (DSP) or other SIMD accelerators alongside the MCU. However, this multi-processor approach complicates system design.

Another option is to switch to a higher-performance microprocessor unit (MPU) equipped with SIMD capabilities. This can deliver the necessary performance in a single-processor setup, but MPUs have trade-offs in power consumption and feature sets. For example, not all MPUs are designed to deliver the deterministic, low-latency computing required in MCU-oriented applications.

Enabling AI and ML in MCUs

Recognizing the need for MCUs optimized for AI and ML workloads, Renesas introduced the RA8M1 MCU series (Figure 1). The series is based on the Arm® Cortex®-M85 core with Helium and TrustZone. The devices can run at 480 megahertz (MHz) with a typical current consumption of 225 microamperes per megahertz (µA/MHz).

Figure 1: The Renesas RA8M1 MCU is based on an Arm Cortex-M85 and includes Helium technology to accelerate AI and ML processing. (Image source: Renesas)

Designed for efficient performance and low power consumption, the RA8M1 MCU offers deterministic execution, short interrupt latency, and state-of-the-art power management. The processor achieves a performance efficiency of 6.39 CoreMark per megahertz (CoreMark/MHz).
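To put the per-megahertz figures in context, the short sketch below scales them to the 480 MHz clock. It assumes the quoted typical values hold at full speed, which is a simplification. The function names are illustrative only.

```c
/* Estimated CoreMark score: efficiency (CoreMark/MHz) times clock (MHz). */
double coremark_score(double coremark_per_mhz, double clock_mhz)
{
    return coremark_per_mhz * clock_mhz;
}

/* Estimated active current in milliamperes from a uA/MHz figure.
 * Assumes current scales linearly with clock frequency. */
double active_current_ma(double ua_per_mhz, double clock_mhz)
{
    return ua_per_mhz * clock_mhz / 1000.0;
}
```

With the figures quoted above, `coremark_score(6.39, 480.0)` gives roughly 3,067 CoreMark and `active_current_ma(225.0, 480.0)` gives about 108 mA. Actual results depend on the workload, memory configuration, and operating conditions.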

Helium is Arm’s M-Profile Vector Extension (MVE), a SIMD extension that significantly accelerates signal processing and ML. It adds 150 scalar and vector instructions that operate on 128-bit vector registers (Figure 2). It is optimized for resource-constrained, lower-power microcontrollers. For example, Helium reuses the floating-point unit (FPU) registers rather than introducing new SIMD registers. This helps lower the processor’s power consumption and reduce design complexity.
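As a rough illustration of what 128-bit registers mean in practice, the sketch below counts how many elements of a given width fit in one vector and how many vector-wide steps a buffer needs. The helper names are hypothetical. Restricting the widths to 8, 16, and 32 bits is an assumption made for this example.

```c
#include <stddef.h>

#define HELIUM_VECTOR_BITS 128u /* register width stated in the text */

/* Elements (lanes) of a given width that fit in one 128-bit vector.
 * Returns 0 for widths this sketch does not cover. */
unsigned helium_lanes(unsigned element_bits)
{
    if (element_bits != 8u && element_bits != 16u && element_bits != 32u) {
        return 0u;
    }
    return HELIUM_VECTOR_BITS / element_bits;
}

/* Vector-wide steps needed to cover n elements, rounding up so that a
 * partially filled final vector still counts as one step. */
size_t vector_steps(size_t n, unsigned element_bits)
{
    size_t lanes = helium_lanes(element_bits);
    if (lanes == 0u) {
        return 0u;
    }
    return (n + lanes - 1u) / lanes;
}
```

A buffer of 100 8-bit values, for instance, needs 7 vector-wide steps instead of 100 scalar ones. This counts lanes only. The actual speedup depends on the core's implementation and on memory bandwidth.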

Figure 2: Helium reuses the FPU register bank for vector processing. (Image source: Arm)

As shown in Figure 3, the RA8M1’s Cortex-M85 includes Arm’s TrustZone technology. TrustZone provides hardware isolation for critical firmware, assets, and private information. The Cortex-M85 also adds new security and safety capabilities, such as the pointer authentication and branch target identification (PACBTI) extension. These security features are particularly valuable in an AI context where a device may interact with personal data.

Figure 3: The Cortex-M85’s TrustZone provides hardware isolation for critical firmware, assets, and private information. (Image source: Arm)

Hardware features to look for in an AI-capable MCU

An MCU should combine efficient performance with a robust feature set to support AI applications. The RA8M1 is well-equipped for motor control, programmable logic controllers (PLCs), metering, and other industrial and IoT applications.

For example, AI algorithms require a lot of memory. The RA8M1 system memory includes up to 2 megabytes (Mbytes) of flash and 1 Mbyte of SRAM. The SRAM includes 128 kilobytes (Kbytes) of tightly coupled memory (TCM), which enables fast memory access for high-performance calculations.

To ensure reliable operation, 384 Kbytes of the user SRAM and the entire 128 Kbytes of TCM are configured as error correction code (ECC) memory. The 32 Kbyte instruction and data caches are also ECC protected.
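The memory figures above suggest a simple placement policy for buffers, sketched below. It assumes the 1 Mbyte of SRAM includes the 128 Kbytes of TCM, which leaves 512 Kbytes of user SRAM without ECC. It also considers one buffer at a time and ignores what is already allocated. The names `region_t` and `choose_region` are hypothetical and not part of any Renesas software.

```c
#include <stdbool.h>
#include <stddef.h>

#define KBYTE            1024u
#define TCM_BYTES        (128u * KBYTE)  /* tightly coupled memory, ECC */
#define ECC_SRAM_BYTES   (384u * KBYTE)  /* ECC-protected user SRAM */
#define TOTAL_SRAM_BYTES (1024u * KBYTE) /* assumed to include the TCM */
#define PLAIN_SRAM_BYTES (TOTAL_SRAM_BYTES - TCM_BYTES - ECC_SRAM_BYTES)

typedef enum {
    REGION_NONE,       /* buffer does not fit under the given constraints */
    REGION_TCM,
    REGION_ECC_SRAM,
    REGION_PLAIN_SRAM
} region_t;

/* Pick a region for a single buffer of the given size. */
region_t choose_region(size_t bytes, bool latency_critical, bool needs_ecc)
{
    /* TCM is the fastest option and is ECC protected, so prefer it for
     * latency-critical data that fits. */
    if (latency_critical && bytes <= TCM_BYTES) {
        return REGION_TCM;
    }
    if (needs_ecc) {
        return (bytes <= ECC_SRAM_BYTES) ? REGION_ECC_SRAM : REGION_NONE;
    }
    return (bytes <= PLAIN_SRAM_BYTES) ? REGION_PLAIN_SRAM : REGION_NONE;
}
```

For example, a 64 Kbyte activation buffer marked latency-critical lands in TCM, while a 200 Kbyte buffer that needs ECC falls back to the ECC-protected user SRAM. In a real project, placement is normally handled through linker sections.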

The RA8M1 incorporates multiple security features in addition to those included in the Arm core. These include the Renesas Secure IP (RSIP) cryptographic engine for secure data processing, immutable storage for critical data protection, and tamper protection mechanisms.

For communication interfaces, the MCU is equipped with Ethernet for network connectivity, Controller Area Network Flexible Data Rate (CAN FD) for automotive and industrial applications, and USB High-Speed/Full-Speed for general connectivity. It also incorporates a camera interface and an octal Serial Peripheral Interface (SPI) with decryption on the fly for external memory.

Analog interfaces include 12-bit analog-to-digital converters (ADCs) and digital-to-analog converters (DACs), high-speed analog comparators, and three sample-and-hold circuits. For serial communication, the RA8M1 provides a Serial Communication Interface (SCI) with SPI, Universal Asynchronous Receiver/Transmitter (UART), and Inter-Integrated Circuit (I²C) modes. The MCU also offers the Improved Inter-Integrated Circuit (I3C) interface for enhanced data transfer rates and efficiency.
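As a small worked example of what 12-bit resolution means, the sketch below converts a raw ADC code to millivolts. The reference voltage is passed in as a parameter because it depends on the board. The function name is hypothetical, and mapping the full-scale code 4095 to the reference voltage is one common convention, assumed here.

```c
#include <stdint.h>

#define ADC_BITS       12u
#define ADC_FULL_SCALE ((1u << ADC_BITS) - 1u) /* 4095 for a 12-bit ADC */

/* Convert a raw 12-bit ADC code to millivolts, rounding to the nearest mV.
 * vref_mv is the reference voltage in millivolts (board dependent). */
uint32_t adc_code_to_mv(uint16_t code, uint32_t vref_mv)
{
    if (code > ADC_FULL_SCALE) {
        code = (uint16_t)ADC_FULL_SCALE; /* clamp out-of-range input */
    }
    return ((uint32_t)code * vref_mv + ADC_FULL_SCALE / 2u) / ADC_FULL_SCALE;
}
```

With an assumed 3,300 mV reference, each step is roughly 0.8 mV, and a mid-scale code of 2048 converts to 1,650 mV.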

Developers needing full access to these input/output (I/O) capabilities can use a ball grid array (BGA) package like the 224-pin R7FA8M1AHECBD#UC0. Those seeking a more streamlined printed circuit board (pc board) design and assembly process might consider using a low-profile quad flat package (LQFP) option such as the 144-pin R7FA8M1AHECFB#AA0.

Development environments for AI applications

Designers interested in experimenting with the RA8M1 series can start with the EK-RA8M1 R7FA8M eval board (Figure 4). This board includes an RJ45 RMII Ethernet interface, a USB High-Speed Host and Device interface, and a three-pin CAN FD header. For memory, it features 64 Mbytes of octal SPI flash.

Figure 4: The EK-RA8M1 eval board has robust I/O support to exercise the RA8M1 MCU. (Image source: Renesas)

The RA8M1 is supported by the Renesas Flexible Software Package (FSP), a comprehensive framework designed to provide a user-friendly, scalable, and high-quality software base for embedded system designs.

The package offers development tools, including the e² studio integrated development environment (IDE) based on the popular Eclipse IDE. It also contains two prominent, royalty-free, real-time operating systems: Azure RTOS and FreeRTOS.

The package includes lightweight, production-ready drivers that support common use cases in embedded systems. Combined with the eval board, these drivers give developers a quick path to experimenting with the RA8M1 I/O.

Conclusion

The RA8M1 gives developers a new option for implementing AI and ML workloads in edge IoT applications. By combining Helium SIMD acceleration with MCU-class determinism and integration, it saves power, enhances performance, reduces complexity, and shortens time to market.
