Avoid CAN Transceiver Data Transmission Problems Through System-Level Testing

By Steven Keeping

Contributed By DigiKey's North American Editors

2023-10-11

Controller Area Network (CAN) is a proven and robust communication standard used in the industrial automation and automotive sectors, among others. Two versions of the technology exist: CAN2.0, and the more recent version, CAN Flexible Data-Rate (CAN-FD). Legacy CAN2.0 systems can be enhanced by adding CAN-FD nodes, which offer higher payload throughput to support critical communication events.

The technologies are generally compatible, but as the complexity and bus length of systems rise, the higher speed of CAN-FD in mixed systems can cause synchronization problems, resulting in transmission failure.

A CAN system test involving a single short bus connecting two controller/transceiver pairs may be satisfactory for simple systems. However, this test often fails to detect problems that can manifest in more complex multi-node systems with a combination of CAN2.0 and CAN-FD components. Only extensive testing across all potential use cases with a twin of the production system can expose the problems that might occur in the field.

This article briefly introduces CAN2.0 and CAN-FD and explains the transmission challenges. It then describes testing techniques to ensure that systems using these networks exhibit minimal field failures. It introduces example components from Analog Devices that incorporate fault detection and reporting, and it shows how the use of such components can accelerate the testing phase and troubleshooting in deployed systems. An associated evaluation board is also highlighted.

What are CAN2.0 and CAN-FD?

CAN is a standard for distributed communications with built-in fault handling. The physical (PHY) and data link layers (DLL) are specified in the ISO 11898 standard.

Features of CAN include:

Allowance for multiple masters on a bus
Inherent priority levels for messages
Bus arbitration by message priority
Error detection and recovery at multiple levels
Synchronization of data timing across nodes with separate clock sources

CAN uses a differential voltage data transmission scheme featuring two bus voltage states: ‘recessive’ (driver outputs are high impedance) and ‘dominant’ with thresholds as shown in Table 1.

Logic	RS-485 levels	CAN state	CAN levels
1	A - B ≥ +200 mV	Recessive	CANH - CANL ≤ 0.5 V
0	A - B ≤ -200 mV	Dominant	CANH - CANL ≥ 0.9 V

Table 1: CAN recessive and dominant voltage levels compared to RS-485. Note that the dominant (higher) voltage corresponds to Logic ‘0’. (Image source: Analog Devices)
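The CAN thresholds in Table 1 can be expressed as a small classifier. This is a minimal sketch: the function name and the example line voltages are illustrative assumptions, and only the 0.5 V and 0.9 V limits come from the table.

```python
from typing import Optional

RECESSIVE_MAX_V = 0.5  # CANH - CANL at or below this is recessive (Table 1)
DOMINANT_MIN_V = 0.9   # CANH - CANL at or above this is dominant (Table 1)


def classify_bus_state(canh_v: float, canl_v: float) -> Optional[int]:
    """Return the logic level seen on the bus, or None if it is undefined.

    Hypothetical helper for illustration; a real transceiver does this
    comparison in hardware.
    """
    diff = canh_v - canl_v
    if diff >= DOMINANT_MIN_V:
        return 0  # dominant state carries Logic '0'
    if diff <= RECESSIVE_MAX_V:
        return 1  # recessive state carries Logic '1'
    return None   # between the thresholds: not guaranteed either way


# Example voltages are assumptions chosen only to exercise each branch.
print(classify_bus_state(3.5, 1.5))  # large differential
print(classify_bus_state(2.5, 2.5))  # no differential
print(classify_bus_state(2.9, 2.2))  # differential between the thresholds
```

A differential voltage that falls between the two thresholds is returned as None here because Table 1 does not assign it to either state.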

Nodes transmit the dominant state for Logic ‘0’ (in this state, one bus line (CANH) is high, and the other (CANL) is low) and the recessive state for Logic ‘1’. An idle CAN bus is distinguished from one in a recessive bit transmission mode through the detection of multiple recessive bits after the end of the standard frame or an error frame (Figure 1).

Figure 1: CAN transmission scheme. Idle mode is identified by multiple recessive bits. (Image source: Analog Devices)

CAN transceivers provide the differential PHY interface between the DLL, the CAN controller (which is often embedded inside another device such as a microcontroller), and the physical wiring of the CAN bus. The various elements required to implement a CAN application are shown in Figure 2, along with their relationship to Open Systems Interconnection (OSI) layers and the features implemented by each item.

Figure 2: The CAN transceiver forms the differential PHY interface between the CAN controller and the CAN bus. (Image source: Analog Devices)

CAN2.0 was introduced in 1991 and offers a nominal throughput of 500 kilobits per second (Kbits/s). Because this data rate sometimes proved insufficient for critical communication events, CAN-FD was launched in 2012. CAN-FD provides a nominal throughput of up to 2 megabits per second (Mbits/s) for normal operating conditions and up to 5 Mbits/s for diagnostics or programming. Note that the higher speed communication is only applicable to the message payload; other elements of the message, such as the 11-bit identifier, cyclic redundancy check (CRC), and acknowledgment (ACK) are sent at the CAN2.0 rate of 500 Kbits/s.

A further difference between CAN2.0 and CAN-FD is in the standard data frame payload, which is increased from 8 bytes (B) for CAN2.0 up to 64 B for CAN-FD. This increase in payload makes CAN-FD communication more efficient by improving the ratio of data to overhead. Moreover, messages that previously had to be split due to CAN2.0’s 8 B payload limit can now be combined into one message using CAN-FD. Additionally, the higher data rate and increased payload leave room to encrypt CAN-FD messages, which can enhance security.
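The effect of the larger payload on message splitting can be checked with simple arithmetic. The sketch below only counts frames; it ignores per-frame overhead bits and assumes the message can be packed into full frames. The function name and the example message sizes are illustrative assumptions.

```python
import math

CAN20_MAX_PAYLOAD_B = 8   # maximum payload per CAN2.0 frame
CANFD_MAX_PAYLOAD_B = 64  # maximum payload per CAN-FD frame


def frames_needed(message_bytes: int, max_payload_bytes: int) -> int:
    """Return how many frames are needed to carry a message of this size."""
    if message_bytes < 0 or max_payload_bytes <= 0:
        raise ValueError("sizes must be non-negative and payload positive")
    # Even an empty message occupies one frame on the bus.
    return max(1, math.ceil(message_bytes / max_payload_bytes))


for size in (8, 48, 100):  # example message sizes in bytes (assumed)
    can20 = frames_needed(size, CAN20_MAX_PAYLOAD_B)
    canfd = frames_needed(size, CANFD_MAX_PAYLOAD_B)
    print(f"{size} B message: {can20} CAN2.0 frame(s), {canfd} CAN-FD frame(s)")
```

For example, a 48 B message needs six CAN2.0 frames but fits in a single CAN-FD frame, so the identifier, CRC, and ACK overhead is paid once rather than six times.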

It is common to mix CAN2.0 and CAN-FD nodes in the same network because CAN-FD controllers support both CAN2.0 and CAN-FD protocols. Mixing nodes is popular because it allows legacy networks to migrate to the faster protocol over an extended duration. One downside of mixed systems is additional cost and complexity because transceivers must be able to support a CAN-FD filtering method on CAN2.0 nodes to ensure error frames are not created during CAN-FD communication.

CAN’s arbitration and error mechanisms

Any connected CAN node can transmit data onto the bus. To avoid communication clashes, nodes arbitrate for use of the bus so that messages are transmitted one after another according to their priority. CAN employs nondestructive and transparent arbitration; the node succeeding during arbitration continues transmitting its higher priority message without any other node interfering with or corrupting the information. Such arbitration is possible because the transmission of a dominant bit overwrites the recessive bus state.

The standard data frame includes a message identifier and several flag bits. This information is known as the “arbitration field.” It dictates arbitration and, as a result, message priority. Messages with a lower ID (more initial “0”s) have a higher priority (Figure 3).

Figure 3: The CAN standard data frame includes a message identifier plus RTR and IDE flag bits. This arbitration field dictates arbitration and message priority. (Image source: Analog Devices)
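The way a dominant bit overwrites a recessive bit during arbitration can be modeled in a few lines. This is a simplified sketch that considers only the 11-bit identifier, not the RTR and IDE flags, and assumes every node starts transmitting on the same bit boundary. The function name and the example identifiers are hypothetical.

```python
def arbitrate(identifiers, id_bits=11):
    """Return the identifier that wins bitwise arbitration.

    The bus behaves like a wired-AND: if any node sends a dominant bit (0),
    the bus reads 0. A node that sent a recessive bit (1) but reads back 0
    has lost arbitration and stops transmitting.
    """
    contenders = set(identifiers)
    if not contenders:
        raise ValueError("at least one identifier is required")
    for bit in range(id_bits - 1, -1, -1):  # most significant bit first
        sent = {ident: (ident >> bit) & 1 for ident in contenders}
        bus_level = min(sent.values())  # dominant 0 overrides recessive 1
        contenders = {i for i in contenders if sent[i] == bus_level}
    # Unique identifiers leave exactly one contender.
    return min(contenders)


ids = [0x65A, 0x123, 0x124]  # example identifiers (assumed)
winner = arbitrate(ids)
print(hex(winner))
```

The winner is always the numerically lowest identifier, and its bits reach the bus unmodified, which is why the arbitration is described as nondestructive.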

Even with the arbitration scheme, things can go wrong. To cope with problems, the CAN protocol has mechanisms to support error checking and handling. These mechanisms include:

Transmission bit verification
CRC check
Fixed-form bit field checks
Mandatory message ACK

The errors are handled using the following mechanisms:

Error frames
Error counters
Node error states

Any CAN controller can detect errors and react by triggering error frames and node error counters. An error frame is distinguished by its use of six consecutive dominant or recessive bits. Such a sequence is at odds with normal transmission rules, making it detectable by other nodes. Nodes transmitting error frames subsequently send recessive bits until the bus is detected as being in the recessive state. Following a further transmission of seven recessive bits, the node can attempt transmission of regular CAN frames (Figure 4).

Figure 4: In this example of a faulty transmission (due to extra bits [1] resulting in a CRC bit error), the six consecutive bit error frame is shown to the far right. (Image source: Analog Devices)
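Six identical bits in a row stand out because CAN's bit-stuffing rule inserts a complementary bit after every five identical bits in the stuffed part of a frame. A receiver can therefore flag an error flag with a simple run-length check. The sketch below is illustrative only: the function name and the sample bit sequences are assumptions, and it does not model the unstuffed end-of-frame field.

```python
def find_stuff_violation(bits, run_length=6):
    """Return the index at which a run of identical bits reaches run_length.

    Returns None if no such run exists in the sequence.
    """
    run = 0
    previous = None
    for index, bit in enumerate(bits):
        run = run + 1 if bit == previous else 1
        previous = bit
        if run >= run_length:
            return index
    return None


normal = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0]     # runs of five are legal
with_flag = [1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 1]  # six dominant bits in a row
print(find_stuff_violation(normal))
print(find_stuff_violation(with_flag))
```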

In addition to error frame transmission, every CAN node implements transmit and receive error counters. An error increases the count by one, while a successful transmission or receipt of a message decreases the counter by one. Based on the error counters, a node may be in an “error active,” “error passive,” or “bus off” state. In the error active state, the node can communicate on the bus and send active error flags when errors are detected. An error passive state occurs when a counter exceeds 127; in this state, the node can send only passive error flags. The node becomes error active again once the counters fall below 128. If the transmit counter exceeds 255, the node enters the bus off state and can’t communicate on the bus. The node’s counters can be reset to 0 after it receives 128 sequences of 11 consecutive recessive bits.
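The mapping from counter values to node state can be sketched as a small function. It assumes the thresholds defined in the CAN specification (error passive when either counter is above 127, bus off when the transmit counter is above 255). The function and parameter names are hypothetical.

```python
ERROR_PASSIVE_LIMIT = 127  # above this, the node is error passive
BUS_OFF_LIMIT = 255        # transmit counter above this means bus off


def node_error_state(transmit_errors: int, receive_errors: int) -> str:
    """Return the node's error state for the given counter values."""
    if transmit_errors > BUS_OFF_LIMIT:
        return "bus off"
    if transmit_errors > ERROR_PASSIVE_LIMIT or receive_errors > ERROR_PASSIVE_LIMIT:
        return "error passive"
    return "error active"


for tec, rec in ((0, 0), (127, 127), (128, 0), (0, 200), (256, 0)):
    print(tec, rec, node_error_state(tec, rec))
```

Monitoring these counters during a system test gives an early warning: a node that drifts toward 128 is accumulating errors even if it is still communicating normally.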

The importance of comprehensive testing

CAN’s arbitration and error mechanisms help keep systems running in the field when faults occur. However, higher efficiency operation is possible by designing systems to limit transmit and receive faults. Testing the proposed system under several operational scenarios is one way of identifying and fixing weaknesses prior to deployment.

A common technique is to exercise the selected CAN transceiver by transmitting typical operational standard data frames into the transceiver’s TxD pin using a function generator and checking if any errors occur. While this is a sensible test for a single node, it is not a good representation of how a multi-node system with a long bus is likely to perform in the field. For example, problems that can arise with complex systems include reflections and other artifacts produced by circuit stubs during high-frequency operation. These can introduce a phase shift between bits.

CAN’s arbitration mechanism only works if bits are synchronized. If the bit-to-bit phase shift exceeds one-half of a single-bit transmission time, then synchronization fails, and arbitration is impossible.

In CAN2.0 legacy systems running at 500 Kbits/s to 1 Mbit/s, the single-bit transmission time is of sufficient duration that induced phase shifts are rarely a problem. However, because of CAN-FD’s higher throughput speeds, bit transmission times are shortened, and phase shifts can quickly become significant.
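The shrinking margin is easy to quantify, because the bit time is simply the reciprocal of the bit rate and the tolerable phase shift is half of that. The following sketch uses the data rates mentioned in this article; the function names are illustrative.

```python
def bit_time_ns(bit_rate_bps: float) -> float:
    """Return the duration of one bit in nanoseconds."""
    if bit_rate_bps <= 0:
        raise ValueError("bit rate must be positive")
    return 1e9 / bit_rate_bps


def max_phase_shift_ns(bit_rate_bps: float) -> float:
    """Return the half-bit limit beyond which synchronization fails."""
    return bit_time_ns(bit_rate_bps) / 2


for rate in (500_000, 1_000_000, 2_000_000, 5_000_000):
    print(f"{rate / 1e6:g} Mbits/s: bit time {bit_time_ns(rate):g} ns, "
          f"half-bit limit {max_phase_shift_ns(rate):g} ns")
```

At 500 Kbits/s the bit time is 2,000 ns, so the half-bit limit is 1,000 ns. At 5 Mbits/s these shrink to 200 ns and 100 ns, leaving far less room for delays introduced by reflections and stubs.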

Such challenges need to be mitigated by moving away from simply testing single nodes to instead verifying a design by duplicating the complete end system and testing it under various operating conditions. While this is more time consuming and expensive than the basic test, it is far less expensive than dealing with field failures and disgruntled customers.

A practical example

To see how phase shift testing works in practice, consider a system designed with a CAN transceiver and CAN controller from a shortlisted supplier. The node is connected to a 20 meter (m) bus, which also supports many other nodes, including CAN2.0 and CAN-FD components. For test purposes, the node is transmitting at 13.3 Mbits/s, which corresponds to a bit width of 75 nanoseconds (ns). For synchronization and arbitration purposes, the controller samples at 80% of the TxD bit width, so it requires a minimum RxD bit width of 0.8 × 75 = 60 ns, including rise time, fall time, and loop delay. The tested component produced an RxD bit width of 48 ns, resulting in system failure.

The same test was performed on an alternative CAN transceiver, the MAX33012EASA+ from Analog Devices. In this test, the TxD bit width was measured at 75 ns, with the RxD bit width measured at 72 ns. The 72 ns bit width exceeds the 80% sample time requirement of 60 ns, so the system synchronization and arbitration operate satisfactorily. The throughput of 13.3 Mbits/s is faster than the system will use in the target applications, demonstrating it is robust enough to operate across all anticipated operational conditions (Figure 5).

Figure 5: Results of a test running a MAX33012EASA+ CAN transceiver at 13.3 Mbits/s (75 ns TxD bit width) on a 20 m bus. The RxD bit width is 72 ns, sufficient to ensure the controller’s 80% sample time (60 ns) is satisfied and synchronization is achieved. (Image source: Analog Devices)
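The pass/fail criterion used in this example reduces to one comparison, sketched below with the figures quoted in the text (75 ns TxD bit width, 80% sample point, and measured RxD widths of 48 ns and 72 ns). The function names are hypothetical, and a real test would take the RxD width from an oscilloscope measurement.

```python
def min_rxd_width_ns(txd_width_ns: float, sample_point_pct: float = 80) -> float:
    """Return the shortest RxD bit width the controller can still sample."""
    return txd_width_ns * sample_point_pct / 100


def rxd_width_ok(rxd_width_ns: float, txd_width_ns: float,
                 sample_point_pct: float = 80) -> bool:
    """Return True if the received bit is wide enough to reach the sample point."""
    return rxd_width_ns >= min_rxd_width_ns(txd_width_ns, sample_point_pct)


TXD_WIDTH_NS = 75  # bit width at 13.3 Mbits/s, as stated in the text
for label, rxd in (("first transceiver", 48), ("MAX33012EASA+", 72)):
    verdict = "pass" if rxd_width_ok(rxd, TXD_WIDTH_NS) else "fail"
    margin = rxd - min_rxd_width_ns(TXD_WIDTH_NS)
    print(f"{label}: RxD {rxd} ns, margin {margin:+g} ns, {verdict}")
```

Reporting the margin as well as the verdict is a deliberate choice: a part that passes by only a few nanoseconds on the bench may still fail once temperature, cable variation, and additional nodes are taken into account.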

Built-in troubleshooting

The testing cycle can be made easier and less expensive by using components that incorporate fault detection and reporting. Components such as the MAX33012EASA+ CAN transceiver not only quickly highlight issues with prototype and pre-production CAN circuits, but they are also useful for applications where rapid troubleshooting is important for live control-system deployments.

The MAX33012EASA+ is a +5 volt CAN transceiver that addresses common faults like overcurrent, overvoltage, and transmission failure. It is fault protected up to ±65 volts, making it suitable for applications where overvoltage protection is required. A common-mode voltage range of ±25 volts enables communication in noisy environments such as those with heavy machinery. The CANH and CANL outputs are short-circuit current-limited and protected against excessive power dissipation by thermal shutdown circuitry that places the driver outputs in a high impedance state.

The MAX33012EASA+ operates at up to 5 Mbits/s and features an option to slow the slew rate to 8 volts/microsecond (μs) to minimize electromagnetic interference (EMI) and allow the use of unshielded twisted or parallel cable (Figure 6).

Figure 6: Shown is the MAX33012EASA+ application circuit in a multi-node system. In this example, the microcontroller includes an embedded CAN controller. (Image source: Analog Devices)

The CAN transceiver’s fault detection is enabled on power-up by passing 100 low-to-high transitions through TxD (typically one or two standard data frames, depending on which protocol is used). After fault detection is enabled, if a fault is detected, then another 16 low-to-high transitions on TxD are required to transmit the fault code. Finally, 10 more pulses are needed to clear the fault.
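When preparing a test pattern, it can help to confirm that it contains enough low-to-high transitions for each step of this sequence. The sketch below counts rising edges in a list of TxD bit values and compares the count with the numbers quoted above. It is a host-side planning aid under those stated figures, not a model of the transceiver's internal logic, and all names are hypothetical.

```python
ENABLE_TRANSITIONS = 100  # rising edges on TxD to enable fault detection
REPORT_TRANSITIONS = 16   # further rising edges to transmit the fault code
CLEAR_TRANSITIONS = 10    # further rising edges to clear the fault


def count_rising_edges(txd_bits) -> int:
    """Count low-to-high (0 -> 1) transitions in a sequence of TxD levels."""
    bits = list(txd_bits)
    return sum(1 for prev, cur in zip(bits, bits[1:]) if prev == 0 and cur == 1)


def enables_fault_detection(txd_bits) -> bool:
    """Return True if the pattern has enough rising edges to enable detection."""
    return count_rising_edges(txd_bits) >= ENABLE_TRANSITIONS


pattern = [0, 1] * 100  # simple alternating test pattern (assumed)
print(count_rising_edges(pattern), enables_fault_detection(pattern))
print(ENABLE_TRANSITIONS + REPORT_TRANSITIONS + CLEAR_TRANSITIONS)
```

The last line gives the total number of rising edges needed to go from power-up through reporting and clearing one fault.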

Transmission failure detection is triggered when the signal on RxD does not match TxD for 10 consecutive cycles after fault detection is enabled. This can occur, for example, when both termination resistors are missing, or when there is a short from CANH to ground or from CANL to VDD, resulting in the differential signal not meeting the specification.
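The same rule can be applied to captured waveforms when analyzing test logs. The sketch below assumes one (TxD, RxD) sample pair has been recorded per cycle; how the device itself defines and samples a cycle is not described here, so this is a simplification, and the function name is hypothetical.

```python
MISMATCH_LIMIT = 10  # consecutive mismatched cycles that trigger the fault


def transmission_failure(samples, limit=MISMATCH_LIMIT) -> bool:
    """Return True if RxD differs from TxD for `limit` consecutive cycles.

    `samples` is an iterable of (txd, rxd) pairs, one pair per cycle.
    A single matching cycle resets the count.
    """
    streak = 0
    for txd, rxd in samples:
        streak = streak + 1 if txd != rxd else 0
        if streak >= limit:
            return True
    return False


healthy = [(0, 0), (1, 1)] * 20
glitchy = [(0, 1)] * 9 + [(0, 0)] + [(0, 1)] * 9  # never 10 in a row
stuck = [(0, 1)] * 10                             # RxD never follows TxD
print(transmission_failure(healthy),
      transmission_failure(glitchy),
      transmission_failure(stuck))
```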

Analog Devices supplies an evaluation board, the CANbus Interface Arduino Platform Evaluation Board MAX33012E, which can be used to demonstrate the functionality of the MAX33012E. While the board features an Arduino shield form factor, it can also be used as a standalone evaluation board.

Conclusion

To ensure reliable field operation of mixed, multi-node CAN2.0 and CAN-FD systems, it is important to thoroughly test the full design. Simple single-node tests are inadequate for detecting faults that could later cause field failures, such as synchronization problems that disrupt the technology’s arbitration mechanism. Both this initial testing and later field troubleshooting of mixed, multi-node CAN systems can be eased by selecting CAN transceivers that have built-in fault detection and reporting.
