Dear Reader,
In my current project, I was unpleasantly reminded what is wrong with embedded Linux systems. An OEM paid me to implement secure boot with integrity checks for the u-boot, kernel and rootfs images. The target platform was an iMX8M Plus SoM from Variscite.
With a few modifications, I could get the integrity checks for the u-boot and kernel images working. These two checks are supported by Variscite’s secure-boot layer. The check for the rootfs is not. I found out that Toradex’s security layer supports all three checks. Unfortunately, this layer uses Toradex-specific code. Making the layer work on my Variscite SoM took me some time. It took me even more time to figure out that the kernel of the Variscite system was lacking an essential bug fix required by the third check. I still have to get the third check working, but I am optimistic that I will.
NXP, Variscite and most of the other SoM makers would tout this secure boot implementation as an end-to-end solution. There is nothing more they can do. They are wrong! Every OEM must implement some tricky functionality on their own.
The production build must sign the u-boot and kernel images with private keys retrieved from a secure storage (e.g., an encrypted USB drive or a hardware security module, HSM).
When the workers assemble the device in the factory, they must flash the u-boot and rootfs images into the eMMC storage.
The same workers must install the public secure-boot keys into fuses (one-time programmable memory) and blow another fuse to close the device. The workers had better get this step right, because the device will only boot from correctly signed images from then on.
If SoM stood for Solution on Module, the custom Linux system provided by the hardware companies would contain such functionality as a ready-made solution. This is the functionality I implemented over the last weeks.
Enjoy reading,
Burkhard
Solutions on Modules
Embedded Linux: A Mess of Epic Proportions
OEMs carefully select the SoC, SoM or terminal (a.k.a., panel PC, display computer, embedded HMI) - by price, availability, peripherals, security features, touch display, MPU, MCU, GPU or NPU. What most OEMs overlook is that they just bought into a custom Linux system of unknown provenance. Building custom embedded Linux systems with Yocto is a big black hole for time and money. The root problem is the lack of well-defined interfaces between the application and the operating system layer. But I’m getting ahead of myself.
Note: You can replace Yocto with Buildroot, Debian, NixOS, Docker or any other fashionable method. None of them solves the root problem. My focus is on Yocto, because I used it in almost every project over the last decade. I have built the odd system with Buildroot and Debian, and I use Docker for my Yocto builds. I think that something is very wrong if an OEM requires expertise in building Linux systems.
When building custom embedded Linux systems, OEMs face the following problems - and probably some more.
Lack of End-to-End Solutions
What does an OEM have to do to implement OTA updates? Far too much!
Let us assume that the OEM wants to do OTA updates using an A/B strategy. The OEM must select a client like SWUpdate, RAUC or Mender and a server like Memfault, QBee or Mender. That’s the easy part. The hard part is to get the client-server combination working on the device.
The OEM’s developers must partition the internal eMMC disk into an A, B and data partition. Some providers of update-clients calculate the parameters for the partitions in Yocto class functions. Others delegate the partitioning to SoM makers, who hopefully provide a suitable script.
The developers must configure and build the Yocto layers for the chosen client-server combination. This step never works out of the box.
The developers must create an interface that allows the HMI to perform an OTA update for the u-boot and rootfs images. The interface provides functions for checking the availability of an update, installing an update now, installing an update at a scheduled time or receiving the update progress. The implementation of these functions differs depending on the selected client-server combination.
The developers must implement the HMI for OTA updates in their main application. If there is no HMI, they make the main application perform OTA updates automatically - e.g., at scheduled maintenance times for machines.
The developers must figure out that OTA updates require root privileges, but that the main application must never run with root privileges - for security and safety reasons. Consequently, the main application and the OTA updater must run in different processes with different privileges.
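To make the partitioning step from the list above more concrete, here is a minimal sketch of the arithmetic behind an A/B layout, written in Python for illustration. The function ab_layout, the Partition type and all sizes are hypothetical. Real update clients and SoM makers compute these parameters in their own Yocto classes or scripts and have to respect alignment and bootloader constraints that this sketch ignores.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Partition:
    name: str
    start_mib: int
    size_mib: int


def ab_layout(disk_mib: int, reserved_mib: int, rootfs_mib: int) -> List[Partition]:
    """Split a disk into rootfs A, rootfs B and a data partition.

    reserved_mib is the space kept free at the start of the disk, for
    example for the bootloader. All values are illustrative assumptions.
    """
    data_start = reserved_mib + 2 * rootfs_mib
    data_mib = disk_mib - data_start
    if data_mib <= 0:
        raise ValueError("disk too small for two rootfs slots plus data")
    return [
        Partition("rootfs_a", reserved_mib, rootfs_mib),
        Partition("rootfs_b", reserved_mib + rootfs_mib, rootfs_mib),
        Partition("data", data_start, data_mib),
    ]
```

Both rootfs slots get the same size, so that an image built for slot A always fits into slot B. Whatever remains after the two slots goes to the data partition.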
Every OEM must reinvent the wheel and implement such an update solution. They must spend a good amount of time and money on something that is absolutely not their core business. Instead, OTA-update providers give OEMs some documentation and some excuses for why they can’t provide an end-to-end solution for the most common SoCs like NXP’s iMX6 or iMX8. Pathetic!
Copy-and-Paste Programming
Copy-and-paste programming has become the dominant method for Yocto layers. The existing layer doesn’t do exactly what developers think it should do. Hence, they fork it and modify the existing recipes, classes and configurations - until the build works. They just saddled themselves with the maintenance of the fork for the lifetime of their products - for 10, 15 or even more years.
When OEM developers do this, I might excuse it. They don’t know better. Linux builds take scarce time away from working on their core business. When the developers of terminal, SoM or SoC manufacturers do this, I won’t excuse copy-and-paste programming. Let me give you an example: setting up secure boot for iMX6, iMX7 and iMX8 SoCs.
Every SoM maker provides its own implementation for “secure boot” in its own Yocto layer, if they provide one at all. If you look at these different layers, you’ll recognise the same functionality and sometimes even the same code for signing u-boot and kernel images over and over again.
The sins don’t stop there. The secure-boot layer depends on the BSP layers of the SoM maker, which depend on the BSP layers of the SoC makers. If the secure-boot layer supports two or more SoCs, say, NXP and TI, it pulls in the BSP layers for each SoC and for each SoM-specific extension. This tight coupling makes it hard to pick some extra feature from one SoM maker and use it in the secure-boot layer of another SoM maker. This is nothing else but lock-in.
The situation is even worse. Secure boot is not SoM specific but SoC specific! So, NXP, TI or any other SoC maker could provide a solution for secure boot - for all OEMs to use. Any SoM-specific features - if there are any at all - go into the machine configuration.
The SoM makers should push their secure-boot “solutions” into the BSP layer of the SoC maker or even into a machine-independent layer like meta-security. SoC makers should ensure that there are not dozens of very similar solutions out there but only one solution. And this solution must be in the BSP layer provided by the SoC maker.
Out-of-Date Systems
In 2024, I came across embedded Linux systems built with Yocto 1.8 (EOL: 10/2015), Yocto 2.1 (EOL: 10/2016) and Yocto 2.7 (EOL: 10/2019). These outdated Linux systems are still operating vending machines, professional appliances and construction machines.
The case of the driver terminal for construction machines built with Yocto 2.7 is telling. The OEM sources the terminals from one of the world’s biggest terminal makers producing hundreds of thousands of terminals per year. The OEM hired a service company for building the HMI and the custom Linux system. The OEM started the project 2.5 years after the end of life (EOL) of Yocto 2.7. Neither the OEM nor the service company nor the terminal maker cared.
Well, nobody cared until the project ran into a problem with streaming video from a rear-view camera over Ethernet to a Qt6 video element. Getting Qt6 working on such an old system was already a challenge. Fixing the bug in the Qt6 video element turned out to be impossible. It would have required the team to find the right bug fixes in the GStreamer changes accumulated over nearly three years. The team downgraded the Qt6 application to Qt5, where video streaming worked.
Even now, the project is still stuck on Yocto 2.7 and Qt5. The main reason is that the terminal maker doesn’t have enough developers to move to Yocto 4.0 or 5.0 and doesn’t really care about their customers. Their terminals are the cheapest in the industry. If OEMs buy cheap, they get cheap.
Thanks to the EU CRA (Cyber Resilience Act), all these companies are forced to care. And from 2027, they will have to pay heavy fines if they don’t care. They can’t point at the others. They must work together to ensure compliance with the EU CRA. The major benefit for OEMs is up-to-date software with many bugs already fixed and with many new features. In short, OEMs can save a lot of time and money by keeping their systems up-to-date.
Lack of Good Developers
One reason for the lack of end-to-end solutions, copy-and-paste programming and out-of-date systems is the lack of good developers. OEMs, terminal makers, SoM makers, SoC makers and some more companies compete for developers that can build custom Linux systems.
Yocto development is not as shiny as creating an ECU, HMI or AI assistant. It is not considered “real” programming but something done quickly on the side. It also has a steep learning curve and is complex. You need to be slightly masochistic to get into Yocto development and keep doing it. Unsurprisingly, only a few people want to do it.
Solutions on Modules
In an earlier episode of this newsletter, I argued that SoM should stand for Solutions on Module and not for System on Module. With “system”, the focus is too much on hardware. Software is just an annoying afterthought. This attitude got us into this mess of epic proportions.
Toradex understand that their business is much more than hardware and that their main differentiator is software. Whether you buy an iMX8-based SoM from Toradex, Variscite, Seco, Kontron, Avnet or anyone else doesn’t matter. You lose time on building customised Linux systems and adding standard features like OTA updates, remote support over VNC, secure boot and inter-processor communication.
Toradex save you this time by providing ready-made solutions so that you can focus on your core business. They give SoM a new meaning. SoM doesn’t stand for system on module but for solution on module.
Burkhard Stubert, Better Built By Burkhard: Episode 39
When it comes to solutions on modules, Toradex is far ahead of the pack. But even Toradex has considerable work to do. They could turn their OTA-update solution into a microservice. They could adhere to the DRY principle and uncouple their meta-toradex-security layer - probably the best in the industry - from their SoMs.
The SoC, SoM and terminal makers should have one guiding question for their work: How can we help OEMs focus on their core business and save the most time and money? Here is my answer.
Microservices
In short, the microservice architectural style is an approach to developing a single application as a suite of small services, each running in its own process and communicating with lightweight mechanisms […]. These services are built around business capabilities and independently deployable by fully automated deployment machinery. There is a bare minimum of centralized management of these services, which may be written in different programming languages and use different data storage technologies.
James Lewis and Martin Fowler, Microservices, 2014 (emphasis mine)
Now, look again at the example from the section Lack of End-to-End Solutions. Microservices are the obvious fit for OTA updates, aren’t they? OTA updates are a business capability that every OEM needs but that never is the OEM’s core business.
A microservice has a well-defined interface that hides the technology used for OTA updates. For the OEM, it doesn’t matter whether OTA updates use SWUpdate/Memfault, RAUC/QBee or Mender. They want delta, bootloader and secure updates. They want fleet management. They don’t want to spend their own time. And they want all that at a reasonable price.
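Here is a minimal sketch of what such a technology-hiding interface could look like, written in Python for illustration. The names OtaUpdateService, UpdateInfo, FakeBackend and update_screen_text are hypothetical and not taken from any real update client. A real microservice would expose these functions over an IPC mechanism, and the sketch leaves out scheduled installation.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class UpdateInfo:
    """Metadata about an available update (hypothetical fields)."""
    version: str
    size_bytes: int


class OtaUpdateService(ABC):
    """Hypothetical interface the HMI talks to; no client or server names leak out."""

    @abstractmethod
    def check_for_update(self) -> Optional[UpdateInfo]:
        """Return the available update, or None if the device is up to date."""

    @abstractmethod
    def install_now(self) -> None:
        """Start installing the available update."""

    @abstractmethod
    def progress_percent(self) -> int:
        """Return the installation progress from 0 to 100."""


class FakeBackend(OtaUpdateService):
    """Stand-in for a real backend that would drive the chosen update client."""

    def __init__(self, available: Optional[UpdateInfo]) -> None:
        self._available = available
        self._progress = 0

    def check_for_update(self) -> Optional[UpdateInfo]:
        return self._available

    def install_now(self) -> None:
        if self._available is None:
            raise RuntimeError("no update available")
        self._progress = 100  # a real backend would report progress step by step

    def progress_percent(self) -> int:
        return self._progress


def update_screen_text(service: OtaUpdateService) -> str:
    """What an HMI screen might show; it depends only on the interface."""
    info = service.check_for_update()
    if info is None:
        return "System is up to date"
    return f"Update {info.version} available ({info.size_bytes // 1_000_000} MB)"
```

The HMI code in update_screen_text never learns which client-server combination sits behind the interface. Swapping the combination means writing another backend class, not touching the application.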
An OTA update must run in its own process, because it requires root privileges. The applications of an embedded device must not run as root for security reasons (another firm requirement of the EU CRA and similar legislation). This separation is not possible if an application links a library with OTA-update capabilities, because the library code runs inside the application’s process. As they run in separate processes, the applications and the OTA-update microservice are independently deployable and communicate with a lightweight mechanism like D-Bus, gRPC or Qt Remote Objects.
Here are some more business capabilities that make for good microservices.
The factory workers assembling the OEM’s devices must install the bootloader and rootfs images in the internal eMMC storage of the device. They run an installer application from an SD card or a ramdisk image. The part of the application doing the installation could be a microservice. OEMs may provide different HMIs, although all of them communicate with the same microservice.
The same factory workers must write the secure-boot keys into one-time-programmable e-fuses and close the device by blowing another e-fuse. Again, the HMI may differ from OEM to OEM but the microservice for writing the e-fuses would be the same.
If multiple applications, say, on a driver terminal read messages from the same CAN bus, only one application will see these messages. The solution could be a microservice that reads the messages from all CAN buses and publishes them to the subscribed applications. You could move the microservice to an otherwise idle microcontroller on your SoC. The microcontroller and the HMI application running on the microprocessor would exchange messages using RPMsg.
A microservice could handle the user authentication. It can use any technology for authentication like face ID, fingerprints, NFC, RFID, QR code or PIN. The application using the authentication service won’t know.
Whenever an application uses operating-system capabilities requiring root access like changing the display brightness, power management and network management, you may want to introduce a microservice.
Embedded Linux systems already have a lot of microservices. They are better known as systemd services.
Tasks in an RTOS with microkernel architecture are microservices.
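The CAN example from the list above boils down to a publish-subscribe broker. Here is a minimal in-memory sketch in Python, purely for illustration. CanFrame and CanBroker are hypothetical names. In a real microservice, a loop reading from the CAN buses would call publish, and the callbacks would forward the frames to the subscribed applications over IPC or RPMsg.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, DefaultDict, List


@dataclass(frozen=True)
class CanFrame:
    bus: str      # for example "can0"
    can_id: int
    data: bytes


Subscriber = Callable[[CanFrame], None]


class CanBroker:
    """Hypothetical broker: the single reader of each bus, fanning frames out."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Subscriber]] = defaultdict(list)

    def subscribe(self, bus: str, callback: Subscriber) -> None:
        """Register a callback for all frames arriving on the given bus."""
        self._subscribers[bus].append(callback)

    def publish(self, frame: CanFrame) -> int:
        """Deliver one frame to every subscriber of its bus; return the count."""
        callbacks = self._subscribers.get(frame.bus, [])
        for callback in callbacks:
            callback(frame)
        return len(callbacks)
```

Because the broker is the only reader of the bus, every subscribed application receives every frame, no matter how many applications there are.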
Microservices will help commoditise business capabilities that are outside the core business of OEMs. Imagine a world where an OEM must only add an image feature “ota-update” to the image recipe and set the update client and server in a configuration. The Yocto build produces a microservice for OTA updates. The application developers implement one screen in the HMI and hook up the screen with the microservice. They don’t have to bother with how to partition the eMMC storage, how to build the update client and server with Yocto, and how to implement the functionality provided by the microservice. Wouldn’t that be a nice world!
DRY = Don’t Repeat Yourself
[The DRY Principle:]
Every piece of knowledge must have a single, unambiguous, authoritative representation within a system.
David Thomas and Andrew Hunt, The Pragmatic Programmer (Extract)
The example from the section Copy-and-Paste Programming is a clear violation of the DRY Principle. Secure boot should be available from a single place: the BSP layer of the SoC maker. All OEMs, terminal and SoM makers would use the same functionality for signing the bootloader and Linux kernel. Even better, they wouldn’t even notice that their device used secure boot. Secure boot would be enabled by default.
All companies in the supply chain should keep their layers to an absolute minimum. The best layers are those that do not exist. If you copy and modify parts of other layers or if you regularly find yourself prepending and appending tasks in recipe extensions, you are doing something wrong. You should work with the authors of the other layers and move duplicated functionality into a single place.
All companies in the supply chain would benefit from such a collaboration. Instead of paying a team to build a custom Linux system, OEMs could spend the money on their core business.
Terminal and SoM makers could keep their BSPs up-to-date more easily. The few developers they have could focus on true differentiation of their products. They could, for example, provide microservices for OTA updates, factory image installers, factory HAB installers, user authentication, etc. - for all the things every OEM needs over and over again. Microservices provide an API between Linux and the OEM’s applications. That is again an application of the DRY principle.
SoC makers should play a much more active role in driving their ecosystem in the right direction: solutions on modules. Their focus should be much more on end-to-end solutions working out of the box than on hardware. Hardware is a commodity. Operating systems are a necessary evil. Ready-made solutions make OEMs happy!