Random number glitch compromises 35bn IoT devices

  • August 12, 2021
  • Steve Rogerson

More than 35 billion IoT devices worldwide could be affected by a bug discovered by researchers at Arizona-based Bishop Fox Labs.

The flaw affects every IoT device with a hardware random number generator (RNG). It can cause these devices to fail to generate random numbers properly, which undermines security for any upstream use.

Researchers Dan Petro and Allan Cecil presented their findings at this month’s Def Con 29 event.

To perform most security-relevant operations, computers need to generate secrets via an RNG. These secrets then form the basis of cryptography, access controls, authentication and more. The details of exactly how and why these secrets are generated vary for each use, but the canonical example is generating an encryption key. Random numbers are one of the bedrock foundations of computer security.

But these numbers aren’t always as random as they should be when it comes to IoT devices. In fact, in many cases, devices are choosing encryption keys of 0 or worse. This can lead to a catastrophic collapse of security for any upstream use.

As of 2021, most new IoT systems-on-a-chip (SoCs) have a dedicated hardware RNG peripheral that’s designed to solve exactly this problem. But unfortunately, it’s not that simple. How the peripheral is used is critically important, and the current state of the art in IoT can only be aptly described as “doing it wrong”, say the researchers.

One of the more glaring pitfalls happens when developers fail to check error code responses, which often results in numbers that are decidedly less random than required for a security-relevant use.
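The unchecked-error pitfall can be illustrated with a minimal C sketch. Everything named here is an assumption made for illustration: hal_rng_read_u32, the rng_status_t codes and the mock's behaviour of failing on every third call do not come from any vendor SDK or from the researchers' code, and the mock's output values are placeholders rather than random data.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical status codes; real HALs use their own names and values. */
typedef enum { RNG_OK = 0, RNG_ERR_NO_ENTROPY = 1 } rng_status_t;

/* Stand-in for a vendor HAL call (assumed name). This mock fails on every
 * third call and leaves *out untouched when it fails. */
static unsigned mock_calls = 0;

rng_status_t hal_rng_read_u32(uint32_t *out)
{
    assert(out != NULL);
    mock_calls++;
    if (mock_calls % 3 == 0) {
        return RNG_ERR_NO_ENTROPY;            /* *out is NOT written */
    }
    /* Placeholder value so the sketch runs; this is not random. */
    *out = 0xA5A5A5A5u ^ (mock_calls * 2654435761u);
    return RNG_OK;
}

/* The pitfall: the status code is discarded, so on failure the caller
 * keeps whatever was already in the variable (here, 0). */
uint32_t get_random_unchecked(void)
{
    uint32_t value = 0;
    (void)hal_rng_read_u32(&value);
    return value;
}

/* Checked version: report failure to the caller instead of handing back
 * a value that was never filled in. */
bool get_random_checked(uint32_t *out)
{
    assert(out != NULL);
    return hal_rng_read_u32(out) == RNG_OK;
}
```

With this mock, get_random_unchecked() silently returns 0 whenever the underlying call fails, while get_random_checked() returns false and leaves the decision to the caller.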

When an IoT device requires a random number, it makes a call to the dedicated hardware RNG either through the device’s SDK or increasingly through an IoT operating system. What the function call is named varies, but it takes place in the hardware abstraction layer (HAL). This is an API created by the device manufacturer and is designed to interface more easily with the hardware through C code so developers do not need to mess around with setting and checking specific registers unique to the device. 

The HAL function to the RNG peripheral can fail for a variety of reasons, but by far the most common and exploitable is that the device has run out of entropy. Hardware RNG peripherals pull entropy out of the universe through a variety of means such as analogue sensors or EMF readings but don’t have it in infinite supply. They’re only capable of producing so many random bits per second. If the device tries to get too many random numbers too quickly, the calls will begin to fail.

But when a device needs to generate a new 2048-bit private key, as an example, it will call the RNG HAL function over and over in a loop. This starts to tax the hardware’s ability to keep up and, in practice, it often can’t. The first few calls may succeed, but later calls will typically start to return errors quickly.
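A key-filling loop that copes with those failures might look roughly like the sketch below. It is an illustration under stated assumptions, not the researchers' code: sim_rng_try_read is a mock that succeeds only on every fourth call, MAX_RETRIES is an arbitrary bound, and the values produced are placeholders rather than random data.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define KEY_BITS    2048u
#define KEY_WORDS   (KEY_BITS / 32u)   /* 64 words of 32 bits each */
#define MAX_RETRIES 1000u              /* arbitrary bound for this sketch */

/* Mock peripheral (assumed name): succeeds only on every fourth call,
 * imitating a generator that cannot keep up with a tight loop. */
static unsigned sim_calls = 0;

bool sim_rng_try_read(uint32_t *out)
{
    assert(out != NULL);
    sim_calls++;
    if (sim_calls % 4 != 0) {
        return false;                  /* no entropy ready yet */
    }
    *out = sim_calls * 2246822519u + 1u;   /* placeholder, not random */
    return true;
}

/* Fill key[] one word at a time. A failed read is retried rather than
 * skipped, and the whole operation fails if the retry budget runs out. */
bool generate_key(uint32_t key[KEY_WORDS])
{
    assert(key != NULL);
    for (size_t i = 0; i < KEY_WORDS; i++) {
        unsigned attempts = 0;
        while (!sim_rng_try_read(&key[i])) {
            if (++attempts >= MAX_RETRIES) {
                return false;          /* caller must not use a partial key */
            }
            /* On real hardware: yield, sleep or wait for a data-ready flag. */
        }
    }
    return true;
}
```

The design point is that every word of the key comes from a call that reported success, and a false return tells the caller to discard the buffer rather than use a partially filled key.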

Things aren’t much better even when developers have time on their side. Some devices, such as the STM32, have sizable documentation and even vendor-provided proof of randomness whitepapers, but these are an exception. Few devices have even a basic description of how the hardware RNG is supposed to work, and fewer still have any kind of documentation about basic things such as expected operating speed, safe operating temperature ranges and statistical evidence of randomness.

“Anecdotally speaking, we attempted to follow the STM32 documentation carefully and still managed to create code that incorrectly handled error responses,” said the researchers. “It took multiple attempts and substantial code to block additional calls to the RNG and spin loop properly when there were error responses. And even then, we observed questionable results that made us doubt our code. It’s no wonder developers are doing IoT RNG, well, wrong.”

This affects the entire IoT industry. The core vulnerability here doesn’t lie in a single device’s SDK or in any particular SoC implementation.

Device owners need to keep an eye out for updates and make sure to apply them when they become available. This is an issue that can be solved with software, but it may take some time. In the meantime, users should be careful about trusting IoT gadgets too much. For home devices that require an internet connection, the researchers advise placing them in a dedicated network segment that can only reach out externally. This will help contain any breach from spreading to the rest of the network.

If possible, developers should select IoT devices that include a CSPRNG API seeded from a variety of entropy sources including hardware RNGs. If there’s no CSPRNG available and there is no other choice, they should carefully review both the libraries they rely on and their own code to ensure that none of it reads from uninitialised memory, ignores hardware RNG peripheral registers or error conditions, or fails to block when no more entropy is available.
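One small piece of that advice, refusing to produce output until enough entropy from more than one source has been collected, can be sketched as a simple gate. This is not a CSPRNG: the mixing and output functions are deliberately left out and should come from a vetted cryptographic library. The thresholds, source names and seed_gate_* functions are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MIN_SEED_BITS    256u  /* assumed threshold for this sketch */
#define MIN_SEED_SOURCES 2u    /* require more than one kind of source */

/* Hypothetical source identifiers, used as bit positions in a mask. */
enum { SRC_HW_RNG = 0, SRC_ADC_NOISE = 1, SRC_TIMER_JITTER = 2 };

typedef struct {
    uint32_t bits_credited;  /* conservative estimate of entropy gathered */
    uint32_t source_mask;    /* one bit per source that has contributed */
} seed_gate_t;

/* Record that a source contributed `bits` of estimated entropy. Mixing
 * the actual bytes into a generator state is out of scope here. */
void seed_gate_credit(seed_gate_t *g, unsigned source_id, uint32_t bits)
{
    assert(g != NULL && source_id < 32u);
    if (bits == 0u) {
        return;                        /* a failed read earns no credit */
    }
    g->source_mask |= (uint32_t)1u << source_id;
    if (g->bits_credited > UINT32_MAX - bits) {
        g->bits_credited = UINT32_MAX; /* saturate instead of wrapping */
    } else {
        g->bits_credited += bits;
    }
}

/* Count how many distinct sources have contributed. */
static unsigned count_sources(uint32_t mask)
{
    unsigned n = 0;
    while (mask != 0u) {
        n += mask & 1u;
        mask >>= 1;
    }
    return n;
}

/* Output should be refused (or the caller blocked) until this is true. */
bool seed_gate_ready(const seed_gate_t *g)
{
    assert(g != NULL);
    return g->bits_credited >= MIN_SEED_BITS
        && count_sources(g->source_mask) >= MIN_SEED_SOURCES;
}
```

A generator built around such a gate would block or return an error while seed_gate_ready() is false, instead of emitting values derived from an empty or single-source seed.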

Device manufacturers and IoT operating systems should deprecate and/or disable any direct use of the RNG HAL function in their SDK. Instead, they should include a CSPRNG API that is seeded using robust and diverse entropy sources with proper hardware RNG handling.