Episode 14: Burkhard on Qt Embedded Systems
Welcome to Episode 14 of my newsletter on Qt Embedded Systems!
In January, I had the chance to test LoRaWAN and BLE Long Range (Coded PHY) for wireless communication indoors - with walls and other obstacles. The selection of the “right” technology is based on an interesting trade-off between bandwidth and range. Read more in The Hardware Corner and find out which dev kits I used.
When I mashed up the BLE Long Range examples for my tests, I wrote some bare-metal code in C for a Cortex-M4 microcontroller. The last time I did this was 13 years ago, when I implemented an Internet radio on a 16-bit microcontroller. It will be fun to implement an application for BLE Long Range communication in the next weeks 🎉
I think that many developers of Qt embedded systems will do some bare-metal programming soon. All the i.MX8 SoCs have a Cortex-M4 microcontroller on board. We can offload safety-critical or real-time applications to the M4. The BLE Long Range application mentioned before is such an example.
Enjoy reading and stay safe - Burkhard 💜
My Blog Posts and Talks
The Qt Project launched Qt 6.0 in December 2020 - the first major version in 8 years. I put Qt 6.0 through its paces and migrated the driver terminal of a sugar beet harvester from Qt 5.12 via Qt 5.15 to Qt 6.0. I describe the migration process step by step and how to fix the compilation and runtime errors and warnings.
Migrating the application with 50K lines of QML and C++ code took me 1.5 days. Writing the post took me longer 😉 The migration process is very straightforward and fast.
The big problem is that many important modules like SerialBus, Multimedia, WebEngine, RemoteObjects and Charts didn’t make it into Qt 6.0 and most of them won’t be available before Qt 6.2 planned for autumn 2021. As we can hardly build any Qt embedded system without these modules, we won’t be able to use Qt 6 in real-life products before the end of this year.
I had the pleasure of being a guest on the FOSS-North podcast. The FOSS-North crew - Johan Thelin and Henrik Sandklef - interviewed me about using LGPL on Qt embedded systems. We covered a wide range of topics.
How did I get into license-compliance checking?
What is the difference between consumer and business products?
Why is this difference important?
What are the typical problems of my customers with LGPL?
How do I provide the installation information for a Linux system?
How can companies use Qt Virtual Keyboard without violating the GPL and without having to open-source their application code?
How does an EU directive for ensuring the updatability of consumer products relate to LGPL?
It was my first appearance on a podcast. I hope you, dear reader, enjoy our conversation as much as we did.
Thank you very much, Johan and Henrik, for having me on your amazing podcast 🙏🙏🙏
This is the video of my first talk at QtDay Italy 2020 (see this link for all videos). When we develop a Qt application for an embedded system, we want to press the Run button in Qt Creator and let Qt Creator do all the rest: cross-compile a CMake-based project, deploy the application via SSH to the embedded device and run the application on the embedded device.
In the end, the process should feel as seamless as running a Qt application on a PC. Making Qt Creator and CMake work seamlessly together for cross-builds is a frustrating endeavour. I show you how to navigate the treacherous waters. If you prefer reading, you can find the corresponding post here.
This is the video of my second talk at QtDay Italy 2020. The driver terminal of a sugar beet harvester receives 1000-1500 CAN messages per second. If the GUI were exposed to such a flood, it would freeze. The CAN middleware must reduce the number of messages to a more palatable 100-150 messages per second.
I explain how to set up CAN interfaces on desktop and embedded Linux systems, how the CAN middleware is architected and how to avoid buffer overflows when writing 50 or 100 messages to the CAN bus without pause. You can find more posts about CAN here.
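One common way to reduce such a flood is to keep only the latest frame per CAN ID and to hand a snapshot to the GUI at a fixed interval. The following sketch shows this idea in plain C++. It is not the middleware from the talk: the names CanFrame, FrameCoalescer, onFrameReceived and flush are made up for illustration, and a real Qt application would probably work with QCanBusFrame and a QTimer instead.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical frame type for this sketch.
struct CanFrame {
    std::uint32_t id = 0;
    std::uint8_t length = 0;                 // number of valid data bytes (0-8)
    std::array<std::uint8_t, 8> data{};
};

// Keeps only the most recent frame per CAN ID between two flushes.
class FrameCoalescer {
public:
    // Called for every frame coming from the bus (e.g. 1000-1500 times per second).
    void onFrameReceived(const CanFrame &frame) {
        assert(frame.length <= 8);           // classic CAN carries at most 8 data bytes
        m_latest[frame.id] = frame;          // newer frames overwrite older ones
    }

    // Called by a periodic timer (e.g. every 100 ms). Returns one frame per ID
    // seen since the last flush, ordered by ID, and clears the buffer.
    std::vector<CanFrame> flush() {
        std::vector<CanFrame> out;
        out.reserve(m_latest.size());
        for (const auto &entry : m_latest) {
            out.push_back(entry.second);
        }
        m_latest.clear();
        return out;
    }

private:
    std::map<std::uint32_t, CanFrame> m_latest;
};
```

This only suits state-like signals such as speeds or temperatures, where the GUI needs the newest value. Event-like messages, where every single frame matters, would have to bypass the coalescer.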
The Hardware Corner
In December and January, I was on a hardware shopping spree for a new project. One big challenge is that the two devices A and B are in separate rooms. So, they must talk to each other over a wireless link through a wall. B receives data from several sensors wirelessly. There may be obstacles between B and the sensors.
A is a hand-held device to supervise the operation of B. B moves around during operation in a hazardous environment. The distance between A and B can be up to 50 metres. B must stop operation if it hasn’t heard from A for 1 second.
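The 1-second rule for B can be sketched as a small class. This is only an illustration of the requirement, not code from the project: the class name DeadMansSwitch and its member functions are hypothetical. The current time is passed in as an argument so that the logic can be tested without waiting.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical dead man's switch for device B.
class DeadMansSwitch {
public:
    using Clock = std::chrono::steady_clock;

    DeadMansSwitch(std::chrono::milliseconds timeout, Clock::time_point now)
        : m_timeout(timeout), m_lastHeard(now) {
        assert(timeout.count() > 0);
    }

    // Call whenever a keep-alive message from A arrives.
    void keepAliveReceived(Clock::time_point now) { m_lastHeard = now; }

    // Call periodically. Returns true if B hasn't heard from A for the
    // whole timeout and must stop its operation.
    bool mustStop(Clock::time_point now) const {
        return now - m_lastHeard >= m_timeout;
    }

private:
    std::chrono::milliseconds m_timeout;
    Clock::time_point m_lastHeard;
};
```

On the device, B would call mustStop(DeadMansSwitch::Clock::now()) from its main loop and keepAliveReceived() from the radio receive handler.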
A fairly obvious choice is LoRaWAN (Long Range Wide Area Network). LoRaWAN is used in building automation. Many sensors (e.g., electricity meters, temperature sensors, smoke detectors) send small amounts of data at regular intervals to a LoRaWAN gateway. The gateway is connected to the Internet and forwards the sensor data into the cloud. Indoors, sensors and gateways can be hundreds of metres apart; outdoors, even several kilometres.
My setup consists of two WisGate Developer D4 gateways from RAK Wireless, one for A and one for B. Each gateway is powered by a Raspberry Pi 4. The LoRa hat sits on top of the Pi 4. The hat is built around Semtech’s SX1257 chip. The gateways come in a solid black casing.
Semtech provides a LoRa library libloragw and some code examples in its lora_gateway repository. The example util_tx_test sends a message every second, the example util_pkt_logger receives messages. We build the examples on the Raspberry Pi 4, which comes with Raspbian OS and the compilers. We run util_tx_test -r 1257 -f 868 on one gateway and util_pkt_logger on the other gateway.
I morphed this sender-receiver example into a ping-pong example, where the two gateways increment a counter. This is more difficult than it sounds, because LoRa radios are half-duplex. They cannot receive and transmit at the same time. The ping-pong use case looks as follows.
A is in transmit mode and B is in receive mode.
A sends the counter to B.
A switches to receive mode.
B receives the counter and increments it.
B switches to transmit mode.
Continue with the first step with the roles of A and B reversed.
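The steps above can be sketched as a small simulation of two nodes that can either send or listen, but never both. This is not the code I ran on the gateways and it doesn't use libloragw: the types Node and Channel and the functions sendCounter, receiveCounter and pingPong are hypothetical, and the radio link is replaced by a variable holding at most one message.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

enum class Mode { Receive, Transmit };

// Hypothetical node: it is either in transmit mode or in receive mode.
struct Node {
    Mode mode = Mode::Receive;
    std::uint16_t counter = 0;
};

// Simulated radio link holding at most one message in flight.
struct Channel {
    bool hasMessage = false;
    std::uint16_t payload = 0;
};

// The transmitting node sends its counter and switches to receive mode.
void sendCounter(Node &sender, Channel &channel) {
    assert(sender.mode == Mode::Transmit);
    channel.payload = sender.counter;
    channel.hasMessage = true;
    sender.mode = Mode::Receive;
}

// The receiving node takes the counter, increments it and switches to
// transmit mode. Returns false if no message arrived (a lost message).
bool receiveCounter(Node &receiver, Channel &channel) {
    assert(receiver.mode == Mode::Receive);
    if (!channel.hasMessage) {
        return false;
    }
    receiver.counter = static_cast<std::uint16_t>(channel.payload + 1);
    channel.hasMessage = false;
    receiver.mode = Mode::Transmit;
    return true;
}

// Runs the given number of one-way exchanges, reversing the roles after
// each one, and returns the counter of the node that would send next.
std::uint16_t pingPong(Node &a, Node &b, int exchanges) {
    Channel channel;
    Node *tx = &a;
    Node *rx = &b;
    for (int i = 0; i < exchanges; ++i) {
        sendCounter(*tx, channel);
        if (!receiveCounter(*rx, channel)) {
            break;                       // message lost: stop the game
        }
        std::swap(tx, rx);               // roles of the two nodes are reversed
    }
    return tx->counter;
}
```

On real radios, each mode switch takes time and a message can get lost while the other side is not yet listening, which the simulation deliberately ignores.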
One ping-pong round trip must happen within 1 second to prevent B from stopping its operation. The example only got the counter up to 15 or 20 before it stopped because of a lost message. Sending 2 bytes of data back and forth within 1 second between two gateways 1.5 m apart didn’t work reliably.
The distance requirements are not a problem for LoRaWAN. The bandwidth and the half-duplex nature of LoRaWAN are. The bandwidth is in the range of 500-1000 bytes per second. If this is roughly the amount of data per second flowing from the sensors to B, from B to A and from A to B, the network will be overloaded. Having to stop the operation of B every couple of seconds because of a missing keep-alive message from A makes B unusable. So, LoRaWAN isn’t good enough.
Neither Bluetooth Classic nor normal Bluetooth Low Energy (BLE) would meet the distance requirements, especially with walls and other obstacles between A and B. Bluetooth 5 saw the introduction of BLE Long Range. BLE Long Range is an optional feature. So, the Bluetooth chips on many i.MX8, i.MX7 and i.MX6 modules won’t support it yet.
Mohammad Afaneh’s post How to Achieve Ranges of over 1 Km using Bluetooth Low Energy is an excellent introduction to BLE Long Range. BLE Long Range uses a new physical layer called Coded PHY. Coded PHY uses the same symbol rate as the standard 1 Mbps PHY, but encodes every data bit with 8 symbols. This redundancy allows the receiver to reconstruct the data correctly from corrupted messages. It also reduces the maximum theoretical data rate to 125 kbps, which is still a lot more than LoRa offers.
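The 125 kbps follow directly from the coding scheme: 1 million symbols per second divided by 8 symbols per data bit. Here is the arithmetic as a tiny function. The function name codedPhyBitRate is made up for this sketch. Coded PHY also defines a second coding scheme with 2 symbols per bit, which trades range for a higher data rate.

```cpp
#include <cassert>
#include <cstdint>

// Symbol rate shared by the 1 Mbps PHY and Coded PHY, in symbols per second.
const std::uint32_t kSymbolRate = 1000000;

// Raw data rate in bits per second when every data bit is spread over
// symbolsPerBit symbols. Coded PHY defines the schemes S=2 and S=8.
std::uint32_t codedPhyBitRate(std::uint32_t symbolsPerBit) {
    assert(symbolsPerBit == 2 || symbolsPerBit == 8);
    return kSymbolRate / symbolsPerBit;
}
```

codedPhyBitRate(8) yields 125000 bit/s and codedPhyBitRate(2) yields 500000 bit/s. These are raw rates on the air. The throughput an application sees is lower because of packet headers, acknowledgements and connection intervals.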
I bought four nRF52840 development kits and two nRF52840 dongles. The dev kits come with coin cells but not with micro-USB cables. Micro-USB cables were used for charging smartphones in the pre-USB-C era. I found six cables, each at least 10 years old, when rummaging through my numerous boxes and drawers with electronics.
The USB interface is used for powering, programming and debugging the nRF52840 devices from a computer. NordicSemi provides all development tools for Windows, Linux and Mac - a rare service! When we connect the BLE devices to a computer, they should show up as a USB drive J-LINK.
If the BLE device doesn’t show up as a USB drive, we enter the bootloader by holding down the IF Boot/Reset button while power-cycling the board. A USB drive called BOOTLOADER will show up. We drag and drop the latest J-LINK Interface MCU firmware on the USB drive. The firmware can be downloaded from here. After another power-cycle, the BLE device shows up as the USB drive J-LINK.
In order to develop our own Bluetooth applications, we must install the nRF5 SDK v17.0.2. The Getting Started explains how to build and run the Blinky application and how to install the BLE stack (the SoftDevice S140) on the nRF52840 microcontroller of the dev kit (board PCA10056). Yes, you read that correctly! Writing Bluetooth applications is bare-metal programming of a Cortex-M4 microcontroller in C.
The SDK comes with several instructive BLE examples, where BLE devices play the Peripheral role (e.g., sensors), the Central role or both roles (e.g., devices A and B). The ATT_MTU Throughput Example is a good example to check the range and bandwidth for BLE Long Range. This example was also the basis for NordicSemi’s maximum-range test. Nordic used a drone to have a clear line of sight. The drone went out of range long before the BLE devices did. The range of the BLE devices was 1.3 km.
My test results were also quite impressive. The sender and receiver were 15 m apart with two reinforced-concrete ceilings and a reinforced-concrete wall in between. The data rate was 5 kbps and the signal strength -90 dBm. The connection dropped out a couple of times but sender and receiver could reconnect.
BLE Long Range may not reach distances of 50 m. At shorter distances, it will, however, get enough data from A to B and back in 1 s so that B can operate smoothly without failing the dead man’s condition. All in all, BLE Long Range seems to be a better solution than LoRaWAN for the given use case.
Secure firmware updates with code signing by François Baldassari (Memfault)
“In 2020, it is reckless to implement firmware update for our systems without some form of authentication.” Before we install firmware or the Linux kernel on a device, we create a cryptographic signature for it. When the device powers up, the bootloader verifies the signature. A valid signature guarantees that nobody has tampered with the firmware. If the firmware is OK, the bootloader starts it.
François explains step by step how signing firmware and verifying a signature works. He provides source code for generating the signature and shows how to integrate signature generation into the build process. He even details how to change the private keys if they get compromised.
“Your firmware signing process is only as secure as your key storage mechanism.” Access to the keys must be restricted to as few people as possible. Developers must not have access. Hence, git repositories are definitely the wrong place to store the keys. Developers have special development keys.
CI systems perform the release builds and offer means to restrict the access to the keys (e.g., Restricted Contexts by CircleCI). Only persons with special authorisation can start release builds. The same authorisation can be used during production programming (see the next entry) to install network keys, support keys and other keys on the devices. Note that these keys differ from device to device.
If we use Qt under LGPLv3 on a consumer product, we must allow users to install a modified Qt version on the device. This Qt version is not signed at all or is signed with a special key. Hence, the device can warn the user about the modified Qt version or void the warranty. François only writes about signing firmware for microcontrollers, but we can apply his approach to Qt embedded systems as well. And we better do!
Production Programming for Linux by Toradex
When the hardware of an embedded system is assembled, we must ensure that the correct Linux image is installed on the system. Additionally, some system-specific data like an ID number and encryption keys must be stored on the system. This process is called production programming and should be largely automated. Programming hundreds or even thousands of systems manually would be error-prone and expensive.
The Toradex Easy Installer makes production programming a lot easier. It reads the image from a USB drive, an SD card or a local server and flashes it onto the system in a fully automated and unattended way. Post-installation scripts allow us to install the ID number and encryption keys on the system and to update the firmware of attached microcontrollers. Assembly workers must wire up the system but not run any installation scripts.
When we buy computer-on-modules (CoMs) from Toradex, we get production programming with the Easy Installer for free. Other CoM or SoM manufacturers don’t help with production programming. Hence, we must write the tools for production programming ourselves and train the assembly workers.
Defensive Programming - Friend or Foe? by Tyler Hoffman (Memfault)
We have all seen the prototypical example of defensive programming. Some function creates an object on the heap and stores the pointer to the object in a member variable m_obj. Every function using m_obj returns immediately if m_obj is null - without any error handling.
The program will gradually slide into an unintended state, exhibit strange behaviour or crash eventually. Finding the root cause in these situations is difficult. This is not defensive programming but an abuse of it.
Offensive programming would replace the bouncing if-statement by an assertion. If m_obj is null, the assertion fails. The program tells us loud and clear that something is wrong and where it went wrong. A debugger shows the trace leading to the assertion failure. We get early feedback and can fix problems early.
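The difference between the two styles fits into a few lines of C++. The classes Display and Terminal are hypothetical and only serve as an illustration; they are not taken from Tyler's post.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical object that gets created on the heap.
struct Display {
    std::string text;
};

class Terminal {
public:
    void init() { m_obj.reset(new Display); }

    // Abuse of defensive programming: silently does nothing if init() was
    // never called. The bug surfaces much later and somewhere else.
    void showDefensive(const std::string &msg) {
        if (!m_obj) {
            return;
        }
        m_obj->text = msg;
    }

    // Offensive programming: a null m_obj is a programming error. The
    // assertion stops the program right here with file and line number.
    void showOffensive(const std::string &msg) {
        assert(m_obj && "init() must be called before showOffensive()");
        m_obj->text = msg;
    }

    const Display *display() const { return m_obj.get(); }

private:
    std::unique_ptr<Display> m_obj;
};
```

Note that the standard assert macro is compiled out when NDEBUG is defined, which is typically the case in release builds. Keeping assertions active in production needs a custom macro.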
Tyler shows how to use offensive programming to find performance issues (e.g., GUI freezing, slow response time to button presses), memory issues (no free heap or stack memory, excessive memory fragmentation) and locking issues (deadlocked threads, program stalls). Assertions play an important role in finding these issues. Here are two slightly more elaborate examples.
The GUI writes some data to flash storage. If this operation takes too long, the GUI freezes. We start a timer before we call the write operation. If the timer fires, the write has taken too long and the program stops on a false assertion. Otherwise, we stop the timer when the write finishes.
Multiple threads use a mutex to share data. The mutex is locked using a timeout of, say, 1 second. If the timeout expires, locking fails and an assertion fires. We have found a deadlock.
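Both examples can be sketched with the C++ standard library. This is my own illustration under some assumptions, not Tyler's code: the names DeadlineGuard, lockWithDeadline and failLoudly are hypothetical, and the reaction to a violation is passed in as a callback so that we can choose between a failing assertion and something gentler.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>

// Possible handler for debug builds: stop where the violation is detected.
inline void failLoudly() { assert(false && "deadline exceeded"); }

// Hypothetical scope guard for the first example. A helper thread calls
// onExpired if the guarded scope is still running after the given limit.
class DeadlineGuard {
public:
    DeadlineGuard(std::chrono::milliseconds limit, std::function<void()> onExpired)
        : m_thread([this, limit, onExpired] {
              std::unique_lock<std::mutex> lock(m_mutex);
              // wait_for returns false if the limit elapsed before the
              // guarded scope finished and set m_done.
              if (!m_cv.wait_for(lock, limit, [this] { return m_done; })) {
                  onExpired();
              }
          }) {}

    // Leaving the scope in time stops the timer.
    ~DeadlineGuard() {
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            m_done = true;
        }
        m_cv.notify_one();
        m_thread.join();
    }

private:
    std::mutex m_mutex;
    std::condition_variable m_cv;
    bool m_done = false;
    std::thread m_thread;   // declared last so it starts after the other members exist
};

// Second example: lock with a timeout. If the lock cannot be taken in
// time, onTimeout is called and the function returns false.
bool lockWithDeadline(std::timed_mutex &mutex, std::chrono::milliseconds timeout,
                      const std::function<void()> &onTimeout) {
    if (mutex.try_lock_for(timeout)) {
        return true;
    }
    onTimeout();
    return false;
}
```

For the flash write, we would create a DeadlineGuard with failLoudly as the callback in the scope that calls the write operation. For the mutex, we would call lockWithDeadline with a timeout of 1 second and unlock the mutex only if the function returned true.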
Tyler argues in favour of playing offense in the production code as well. There, a violated assertion doesn’t stop the program but records the failure in a log file. We should use offensive programming in software parts over which we have full control. We should not use it, for example, for peripheral drivers, third-party code written by external developers or data from communication stacks.
All About Code Review by Michaela Greiler
When I think of code reviews, I think of Michaela Greiler. I read her newsletter, follow her on Twitter (@mgreiler) and check her blog regularly. Michaela has topped this: She provides a “curated list of articles, tools, checklists and other awesome resources about code reviews” on All About Code Review.
We can find out how Google or Microsoft do code reviews. Google only allows code reviews by developers who have been specially trained (“readability certificate”) and who work on the code actively. Code reviews are done frequently on small chunks of code. 90% of the code reviews happen on fewer than 24 lines of code. Google’s code review guidelines are online as well.
In the section Code Review Articles, Michaela lists a dozen articles with good practical tips for conducting code reviews. My favourites are How to Give Respectful and Constructive Code Review Feedback, Code Review Guidelines for Humans and How to Do Code Reviews Like a Human. The articles are chock-full of good advice, which will lead to more effective code reviews.
All About Code Review also provides a list of tools, checklists and even case studies about losses due to lacking code reviews.