How to Select and Use an Audio Codec and Microcontroller for Embedded Audio Feedback Files
Contributed By DigiKey's North American Editors
2020-12-02
There is a growing need for embedded systems to provide high-fidelity audio instead of buzzers for user feedback, including alarms and alerts. While beeps and chirps have been effective in the past, users now expect richer sounds that can only be produced by playing audio from file formats such as MP3. The problem is that audio playback can appear intimidating and can add cost and complexity to a system. The first instinct is to find a microcontroller that can play MP3s, but this often adds several dollars to the bill of materials (BOM) and considerable complexity to the embedded software.
One solution that is particularly good at balancing the additional cost and software complexity is to use an audio codec. Audio codecs not only accept an audio data stream from a microcontroller, they often also have multiple features that allow the developer to carefully tune the audio playback system to improve the quality of the sound played by the system.
This article will discuss the role of audio codecs, the main characteristics that developers should consider when making a selection, and how to apply them effectively. Solutions from AKM Semiconductor, Texas Instruments, and Maxim Integrated will be introduced and used as examples here, though others are also available. It will conclude with tips and tricks on how to accelerate audio playback application development using a codec, while lowering system cost.
What are audio codecs?
An audio codec is a hardware component that is capable of encoding or decoding a digital data stream containing audio information [1]. An audio codec is useful because it allows the audio processing to be offloaded from the microcontroller. This can significantly decrease the software complexity, and also allow a less expensive and less capable microcontroller to be used for an application.
A typical audio codec will contain several functional blocks:
- An I2S interface to transmit or receive encoded digital audio data
- An I2C interface to configure and read the audio codec’s control registers
- A microphone input which is connected to an analog-to-digital converter (ADC)
- At least one audio output channel such as a speaker output, but most also include a line out and may include multiple speaker outputs for stereo playback
- A digital block that contains high-pass, low-pass, notch, and equalizer filters to tune audio playbacks and recordings
An example audio codec that is quite popular due to its low cost and audio capabilities is the AK4637EN 24-bit audio codec from AKM Semiconductor (Figure 1). The AK4637EN has all these features, in addition to a beep generator input that can be used to generate a beep using a pulse width modulation (PWM) signal at a desired frequency.
Figure 1: The AK4637EN is an audio codec with a mono speaker output that has audio playback and recording capabilities. It also contains an internal audio block that can be used to filter incoming and outgoing audio to improve audio fidelity. (Image source: AKM Semiconductor)
Developers will find that the main differentiators for an audio codec are whether it outputs mono or stereo audio and what its digital block can do. For example, the AK4637EN offers a high-pass filter, a low-pass filter, a four-band equalizer, automatic level control (ALC), and a single-band equalizer. The latter can be used as a notch filter. How a developer sets up these digital filters can dramatically affect how a system sounds.
The audio codec can sometimes intimidate a developer who is new to audio playback. For example, while the AK4637EN is a simple audio codec, a quick examination of the datasheet shows that it has 64 configurable registers. That might seem like a lot at first, but most of those registers are used to set the filter coefficients for the various digital filters that are available. Only a handful need to be written to get the system outputting audio properly, making the driver development for an audio codec far simpler than a newcomer might imagine.
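The "handful of registers" idea can be sketched as a small table of register writes that the driver walks at startup. This is a minimal sketch, not the AK4637EN driver: the device address, register addresses, values, and the `codec_i2c_write_fn` callback signature are all hypothetical placeholders, and the real ones must come from the codec datasheet and the microcontroller's I2C driver. A simulated register file stands in for the I2C bus so the code can run on a PC.

```c
#include <stddef.h>
#include <stdint.h>

/* One register write: register address and the value to store there. */
typedef struct {
    uint8_t reg;
    uint8_t value;
} codec_reg_write_t;

/* Platform-supplied bus write (hypothetical signature).
 * Returns 0 on success, nonzero on failure. */
typedef int (*codec_i2c_write_fn)(uint8_t dev_addr, uint8_t reg, uint8_t value);

/* HYPOTHETICAL device address, register addresses, and values.
 * They only illustrate the pattern; use the datasheet for real ones. */
#define CODEC_I2C_ADDR 0x12u

static const codec_reg_write_t codec_init_table[] = {
    { 0x00u, 0x01u },  /* power management (placeholder) */
    { 0x01u, 0x0Cu },  /* clock source selection (placeholder) */
    { 0x02u, 0x03u },  /* I2S format and word length (placeholder) */
    { 0x03u, 0x20u },  /* route DAC to speaker output (placeholder) */
    { 0x04u, 0x18u },  /* digital output volume (placeholder) */
};

#define CODEC_INIT_COUNT (sizeof codec_init_table / sizeof codec_init_table[0])

/* Write the table in order and stop at the first bus error.
 * Returns the number of registers written successfully. */
size_t codec_init(codec_i2c_write_fn write)
{
    size_t i;
    for (i = 0; i < CODEC_INIT_COUNT; i++) {
        if (write(CODEC_I2C_ADDR, codec_init_table[i].reg,
                  codec_init_table[i].value) != 0) {
            break;
        }
    }
    return i;
}

/* Simulated 64-entry register file standing in for the real I2C bus. */
#define SIM_REG_COUNT 64u
static uint8_t sim_regs[SIM_REG_COUNT];

int sim_i2c_write(uint8_t dev_addr, uint8_t reg, uint8_t value)
{
    if (dev_addr != CODEC_I2C_ADDR || reg >= SIM_REG_COUNT) {
        return -1;  /* wrong device or register out of range */
    }
    sim_regs[reg] = value;
    return 0;
}
```

Calling `codec_init(sim_i2c_write)` writes all five entries into the simulated register file. On a target, the same call would take the real bus write function instead. Keeping the sequence in a table makes it easy to reorder or extend while tuning.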
How to select an audio codec
One of the key drivers to selecting anything in product development is cost, and audio codecs are no different. Still, it is important to keep in mind that developers get what they pay for, so when it comes to audio, a team must carefully weigh the design requirements against the key solution parameters.
The first consideration is the required output from the audio codec. There are several different choices. For example, the AK4637EN has a line output and a mono speaker output. There are other codecs like the Texas Instruments TLV320AIC3110IRHBR stereo audio codec that can drive two speakers at 1.29 watts (Figure 2).
Figure 2: The TI TLV320AIC3110IRHBR is an audio codec with stereo output and amplification in addition to a microphone input. The codec can drive 1.29 watts from internal amplifiers and has programmable digital audio blocks. (Image source: Texas Instruments)
Other audio codecs like the Maxim Integrated MAX9867 are designed to only drive a pair of headphones (Figure 3). The MAX9867 has the typical I2S and I2C digital interfaces, but it also contains stereo microphone inputs and two line ins that can be digitally selected.
Figure 3: The Maxim Integrated MAX9867 audio codec can drive stereo headphones and select between digital, microphone and line inputs. (Image source: Maxim Integrated)
Choosing among these three solutions based on the required output type (and even the input type) is a critical early decision.
Developers also need to consider what they will be driving. Will the audio codec be directly driving headphones, one speaker or a pair of speakers, and what will the output rating be? If the system will be driving a 5 watt speaker, there are not many codecs for embedded systems that will do that. Instead, a developer may want to select the line out and use a separate Class-D amplifier to drive the speaker directly. This saves some cost while also providing design flexibility.
Two final considerations are the internal routing and digital filtering capabilities. This is where the real differentiation and cost differences between audio codecs arise. For example, the TLV320AIC3110IRHBR has de-pop and soft start capabilities to minimize speaker popping and allow for a smooth transition into audio playback. It also has an internal mixer for each output channel and digital volume control.
It is up to the developer to carefully balance their needs from the audio codec with the BOM and the amount of board space that will be consumed by the circuitry.
The audio playback system
When working with an audio codec, it is important to realize that there are several different blocks outside the audio codec that are necessary to achieve successful audio playback. The exact blocks will vary slightly based on application and the method decided on for playback, but a generalized diagram is shown in Figure 4.
Figure 4: A generalized connection block diagram for an audio playback system in a typical embedded application shows that there needs to be storage for audio files, which can be on the microcontroller or on external memory. (Image source: Beningo Embedded Group)
There are several points in this diagram that are worth discussing. First, there needs to be some method of storing the audio playback files. There are two options for this: store the files in the microcontroller's internal flash memory or store them in external flash memory. The choice will depend on how large the audio files are and how much internal flash memory the microcontroller has.
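The internal-versus-external storage decision comes down to simple arithmetic. The sketch below estimates clip size for uncompressed PCM and for constant-bit-rate compressed audio, then checks the result against the flash left over after the firmware image. The function names and the example figures are assumptions for illustration, and the compressed estimate ignores file headers and tags.

```c
#include <stdint.h>

/* Bytes needed for uncompressed PCM audio of the given duration. */
uint32_t pcm_size_bytes(uint32_t sample_rate_hz, uint32_t bits_per_sample,
                        uint32_t channels, uint32_t duration_ms)
{
    uint64_t bytes_per_second =
        (uint64_t)sample_rate_hz * (bits_per_sample / 8u) * channels;
    return (uint32_t)(bytes_per_second * duration_ms / 1000u);
}

/* Approximate bytes for constant-bit-rate compressed audio such as MP3.
 * bits/s * ms / 8000 = bytes; headers and tags are ignored. */
uint32_t cbr_size_bytes(uint32_t bit_rate_bps, uint32_t duration_ms)
{
    return (uint32_t)((uint64_t)bit_rate_bps * duration_ms / 8000u);
}

/* Returns 1 if the clip fits in the flash remaining after the firmware. */
int fits_in_flash(uint32_t clip_bytes, uint32_t flash_bytes,
                  uint32_t firmware_bytes)
{
    if (firmware_bytes > flash_bytes) {
        return 0;
    }
    return clip_bytes <= (flash_bytes - firmware_bytes);
}
```

For example, a 2 second alert stored as 16 kHz, 16-bit mono PCM needs 64,000 bytes, while the same alert at a 64 kbit/s constant bit rate needs about 16,000 bytes. On a hypothetical 512 KB part with a 480 KB firmware image, only the compressed version would fit internally.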
Developers also need to consider what the audio playback format will be. The most common is MP3. In this case, the selected microcontroller needs to have a software stack that supports MP3 decoding. This allows the MP3 file to be opened, decoded into audio samples, and then pushed out via the I2S interface using a direct memory access (DMA) controller. The I2S port itself can be configured for master or slave operation and several other modes, so its configuration needs to be carefully examined to ensure that the data is transferred to the codec at the correct rate.
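To check that data reaches the codec at the correct rate, it helps to work out the I2S bit clock and the time a DMA buffer lasts. The sketch below uses the standard relationship that the bit clock equals the sample rate times the bits per channel slot times the number of slots per frame (two for conventional stereo I2S framing). The function names and buffer size are assumptions for illustration.

```c
#include <stdint.h>

/* Bit clock (BCLK) in Hz: sample rate x bits per slot x slots per frame. */
uint32_t i2s_bclk_hz(uint32_t sample_rate_hz, uint32_t slot_bits,
                     uint32_t slots_per_frame)
{
    return sample_rate_hz * slot_bits * slots_per_frame;
}

/* Bytes per second that the DMA must deliver to keep the codec fed. */
uint32_t i2s_bytes_per_second(uint32_t sample_rate_hz, uint32_t slot_bits,
                              uint32_t slots_per_frame)
{
    return i2s_bclk_hz(sample_rate_hz, slot_bits, slots_per_frame) / 8u;
}

/* Microseconds of audio held in one DMA buffer (rounded down).
 * Returns 0 if the data rate is 0. */
uint32_t buffer_duration_us(uint32_t buffer_bytes, uint32_t bytes_per_second)
{
    if (bytes_per_second == 0u) {
        return 0u;
    }
    return (uint32_t)((uint64_t)buffer_bytes * 1000000u / bytes_per_second);
}
```

At 44.1 kHz with 16-bit stereo slots, the bit clock is 1,411,200 Hz and the stream consumes 176,400 bytes per second. A 4096 byte buffer therefore lasts roughly 23 ms, which is the deadline the decoder has to refill it.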
As mentioned earlier, an external audio amplifier may or may not be needed depending on the application. A typical codec outputs around 1 to 1.5 watts, which is enough to drive a small speaker. To drive a 3 watt or larger speaker, it will be necessary to use an external amplifier. Again, the most widely used are Class-D. The amplifier does not necessarily need to have variable gain either. The audio codec can adjust the volume digitally to provide a wide range of output power.
One area that is often overlooked is bulk capacitance. When audio is playing, it can pull heavily on the power rails. If there is not enough capacitance on the board, the output quality can be dramatically affected and can take on a twangy sound along with several other unwanted noises. This can be detected by carefully monitoring the power rails during testing. It is not a bad idea during PC board development to leave extra footprints on the board to allow different capacitance values to be tried in order to tune the output circuitry.
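A starting value for the bulk capacitance can be estimated before any tuning on the bench. The sketch below applies the first-order relationship C = I × Δt / ΔV, which assumes the capacitor alone supplies a load current step for a short time before the regulator responds, and ignores capacitor ESR. The function name and the example numbers are hypothetical; measured rail behavior should decide the final value.

```c
#include <stdint.h>

/* First-order bulk capacitance estimate, rounded up to the next microfarad.
 * Units: C[uF] = I[mA] * t[us] / dV[mV].
 * Returns 0 if the allowed droop is 0, which is not a valid input. */
uint32_t bulk_capacitance_uf(uint32_t load_step_ma, uint32_t hold_time_us,
                             uint32_t allowed_droop_mv)
{
    if (allowed_droop_mv == 0u) {
        return 0u;
    }
    uint64_t charge = (uint64_t)load_step_ma * hold_time_us;
    return (uint32_t)((charge + allowed_droop_mv - 1u) / allowed_droop_mv);
}
```

For instance, a 500 mA current peak that must be carried for 100 µs with no more than 100 mV of droop suggests about 500 µF. That number is only a first guess for populating the spare footprints.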
Tips and tricks for selecting and using an audio codec
Audio codecs can dramatically simplify the embedded software and provide an application with great-sounding audio. However, they can be tricky if a developer has not worked with them before. To successfully leverage an audio codec, there are several “tips and tricks” teams should keep in mind, such as:
- Use the direct memory access (DMA) controller within a microcontroller to feed the audio codec with minimal CPU intervention. This will help to ensure that the codec is not “starved” for data.
- When audio is not being played, use the codec’s mute feature to prevent low-level output noise from reaching the speaker.
- When disabling or enabling playback, use an audio codec’s soft mute feature to prevent speaker popping and other unwanted noise.
- Use a terminal application to output the codec registers after the codec has been initialized. This can be especially useful when attempting to debug issues or tune the speaker output circuitry and enclosure.
- Leverage the internal digital filter mechanisms included in a codec. The digital filters allow a developer to equalize the output, filter out unwanted high and low frequencies, and maximize the quality of the sound system.
- Do not forget that tuning the sound will only be a useful endeavor when the circuit board and speaker are installed in the enclosure, as the enclosure and mounting make a huge difference.
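The DMA tip above is commonly implemented with a circular "ping-pong" buffer: while the DMA plays one half, the CPU refills the other. The following is a minimal sketch of that pattern under stated assumptions. The structure, function names, and buffer size are hypothetical, the clip is assumed to be already decoded to 16-bit PCM, and the two callbacks would be called from whatever half-transfer and transfer-complete interrupts the chosen microcontroller provides.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HALF_SAMPLES 256u  /* samples per buffer half (hypothetical size) */

typedef struct {
    const int16_t *src;      /* decoded PCM clip */
    size_t         src_len;  /* total samples in the clip */
    size_t         src_pos;  /* next clip sample to copy */
    int16_t        dma_buf[2u * HALF_SAMPLES];  /* circular DMA buffer */
} audio_stream_t;

/* Refill one half (0 or 1) of the DMA buffer from the clip.
 * Pads with silence once the clip runs out so the codec is never starved.
 * Returns the number of clip samples copied. */
size_t audio_refill_half(audio_stream_t *s, unsigned half)
{
    int16_t *dst = &s->dma_buf[(half & 1u) * HALF_SAMPLES];
    size_t remaining = s->src_len - s->src_pos;
    size_t n = (remaining < HALF_SAMPLES) ? remaining : HALF_SAMPLES;

    if (n > 0u) {
        memcpy(dst, &s->src[s->src_pos], n * sizeof(int16_t));
    }
    memset(&dst[n], 0, (HALF_SAMPLES - n) * sizeof(int16_t));
    s->src_pos += n;
    return n;
}

/* Prime both halves before the DMA transfer is started. */
void audio_stream_start(audio_stream_t *s, const int16_t *clip, size_t len)
{
    s->src = clip;
    s->src_len = len;
    s->src_pos = 0u;
    audio_refill_half(s, 0u);
    audio_refill_half(s, 1u);
}

/* Half-transfer interrupt: the first half has been played, refill it. */
void audio_on_dma_half_complete(audio_stream_t *s)
{
    audio_refill_half(s, 0u);
}

/* Transfer-complete interrupt: the second half has been played, refill it. */
void audio_on_dma_complete(audio_stream_t *s)
{
    audio_refill_half(s, 1u);
}

/* Returns 1 once every clip sample has been queued for playback. */
int audio_stream_done(const audio_stream_t *s)
{
    return s->src_pos >= s->src_len;
}
```

Once `audio_stream_done()` reports that the clip is fully queued, the application can let the remaining buffered samples play out and then apply the codec's soft mute before stopping the DMA.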
To get started, developers can experiment with the MAX9867EVKIT+ evaluation kit for Maxim Integrated’s MAX9867 (Figure 5).
Figure 5: The MAX9867EVKIT+ eval kit for the MAX9867 connects to a PC over a USB cable and features RCA inputs, headphone outputs, and fiberoptic transmit and receive modules. (Image source: Maxim Integrated)
The kit comprises the board and associated software and comes configured to send and receive audio data using the Sony/Philips digital interface (S/PDIF), though it can also be set to use I2S. It has two RCA input jacks, two 3.5 millimeter (mm) analog output headphone jacks, and fiberoptic receive and transmit modules. The software is Windows compatible, and when connected to a PC over a USB cable it opens into a graphical user interface (GUI) through which the developer can experiment with the MAX9867’s settings (Figure 6).
Figure 6: Using the Windows-based GUI, users can experiment with a wide range of MAX9867 settings, starting with Clock and Digital Audio (selected tab), all the way to Registers 1 and Registers 2 (right). (Image source: Maxim Integrated)
Conclusion
Embedded system users have become accustomed to quality audio to the point that it is now expected instead of buzzers and beeps for alarms, alerts, and other user audio feedback. This puts the onus upon development teams to implement MP3 playback capabilities in their systems. This can at first appear to be a complex endeavor. However, by using the right audio codec alongside a microcontroller, and by following some design best practices, developers can balance the cost and complexity associated with audio applications.
References
