Episode 41: Better Built By Burkhard

Qt for Memory-Constrained Devices

May 01, 2023

Qt for MCUs would be better named Qt for Memory-Constrained Devices, as Yoann Lopes, the Technical Program Manager for Qt for MCUs at The Qt Company, pointed out in a conversation in mid-April. The name Qt for Memory-Constrained Devices gives a clear hint where The Qt Company wants to position Qt for MCUs. You should expect Qt for MCUs to run on both microcontrollers and low-end microprocessors in the medium term.

Yoann highlighted that the main driver for companies to use Qt for MCUs is cost. Unsurprisingly, Qt for MCUs is used for the HMI of high-volume consumer devices like dishwashers, washing machines, ovens, refrigerators, coffee machines, printers, e-bikes, e-scooters, wearables (see EM Microelectronic case study), motorcycles and oximeters. An extra 50 cents for a more powerful SoC makes quite a difference when you sell one million devices per year.
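To put a number on that last sentence, here is a back-of-the-envelope sketch. The two figures come from the paragraph above. The function name is made up for this example, and the currency is left open because the text only says "cents".

```python
# Back-of-the-envelope cost of a more powerful SoC at volume.
# Both input figures come from the text; the function name is invented
# for this sketch.

def extra_annual_cost(extra_cost_per_device: float, devices_per_year: int) -> float:
    """Return the additional yearly hardware cost in whole currency units."""
    return extra_cost_per_device * devices_per_year


if __name__ == "__main__":
    cost = extra_annual_cost(0.50, 1_000_000)
    print(f"Extra cost per year: {cost:,.0f}")
```

Half a million per year, every year, is the kind of sum that decides which SoC ends up on the board.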

Device manufacturers face another problem. Yoann and I discussed home-appliance manufacturers as the prototypical example (see also the white paper UIUX Trends for Appliances & Electronics Product Lines). While high-end appliances run on microprocessors (MPUs) with embedded Linux, mid-range and low-end appliances run, for cost reasons, on microcontrollers (MCUs) bare-metal or with a simple RTOS. Manufacturers work with the normal Qt platform for MPU-based appliances and with Qt for MCUs for MCU-based appliances.

Currently, Qt for MPUs (a.k.a. standard Qt) and Qt for MCUs are technically two mostly distinct frameworks. Hence, home-appliance manufacturers must still build two mostly different systems: one for MPUs and one for MCUs. Instead, they would like to run scaled-down versions of the HMI application from the high-end appliances on their mid-range appliances (currently on MCUs) - without changing the Qt framework. This would allow them to benefit from economies of scale.

The manufacturers’ favourite way of scaling down would be to run the HMI application on a very low-cost MPU but still with embedded Linux. Low-cost implies very little main and flash memory. There is a catch though! It would take a prohibitively large effort to fit standard Qt into such a memory-constrained device.

Other manufacturers face the same problem as the home-appliance manufacturers but in the opposite direction. Their main HMI application runs on an MCU and they need to scale up to a more powerful MPU. Most HMI frameworks developed especially for MCUs don’t scale up well. They remind me of HMIs from the 90s.

If Qt for MCUs ran on very low-end and low-cost MPUs, the manufacturers could seamlessly scale up from MCUs over low-cost MPUs to higher-cost MPUs or scale down in the other direction. These manufacturers want to delay the switch from MPU to MCU or vice versa as long as possible. If they have to switch, they want to reuse as much software as possible.

This is exactly the business case for Qt for memory-constrained devices. And it is a pretty convincing business case, too.

Beginning of digression (my thoughts only).

Qt for MCUs is only available as part of the Qt for Device Creation offering. Similarly, Qt for memory-constrained devices would be a commercial-only offering that could be extended to all embedded, mobile and desktop devices in the long term. It could be the new premium Qt offering: the high-price low-volume offering.

The differentiation from the LGPL offering would be simple. If you want scalability from high to low end and vice versa or just the best Qt experience, you must buy Qt for memory-constrained devices. There is no competing LGPL option.

The current Qt for Device Creation license would become the low-price high-volume offering. Customers would get basic support, some paid Qt add-on modules like QtQuick 3D, QtVirtualKeyboard, QtMQTT or QtCoAP, and business-friendly licensing. The low annual fixed price would make it a no-brainer for businesses to buy this license.

That would be a much easier sale for The Qt Company. More importantly, it would be a more pleasant sales process for potential customers.

End of digression and back to my conversation with Yoann.

Such a convincing business case didn’t exist BC (Before Corona) because of a technical roadblock. Qt for MCUs was stuck at 6-8 MB RAM and 12-16 MB flash memory. The MCUs need basic graphics acceleration like blitting, rotating and scaling images, and text rendering. Such high-end MCUs are in the same price range as low-cost MPUs.

This was my state of knowledge and led me to this conclusion in my Embedded World Round-Up: “I don’t see Qt as an option for microcontrollers”. My knowledge was outdated and my conclusion was wrong. I struck through the paragraph in question and added a new paragraph:

Qt for MCUs is on par with Slint with respect to memory consumption. The runtime of Qt for MCUs consumes 200-250 KB of RAM. Consequently, Qt for MCUs can run on the same lower-end and less expensive microcontrollers as Slint.

So, what made me change my verdict? The Qt-for-MCUs stakeholders had understood that they couldn’t shrink standard Qt down to 200-500 KB RAM consumption. So, they set out to build a new Qt runtime optimised for MCUs and a fortiori for memory-constrained devices. The results speak for themselves.

Smart oven reference implementation: RAM - Qt runtime: 237 KB; RAM - frame buffer: 19 KB; flash - Qt app: 1.16 MB; SoC: ESP32-S3-BOX.
E-bike reference implementation: RAM - Qt runtime: 204 KB; RAM - frame buffer: 696 KB; flash - Qt app: 4.3 MB; SoC: STM32H750B.
A real-life motorcycle instrument cluster based on Qt for MCUs runs easily with the 1.5 MB RAM available on a Renesas RH850/D1M1V2 MCU.

Note: “RAM - Qt runtime” includes the RAM needed by Qt itself, by the application and by the display and GPU drivers. All components are statically linked into a single Qt runtime.
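To see how far this is from the old 6-8 MB, the figures from the list above can be added up in a few lines. This is only a sketch under two assumptions of mine: the Qt runtime and the frame buffer are the only RAM consumers, and 1 MB equals 1024 KB. The class and constant names are invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReferenceImplementation:
    name: str
    runtime_ram_kb: int    # "RAM - Qt runtime" from the list above
    frame_buffer_kb: int   # "RAM - frame buffer" from the list above

    @property
    def total_ram_kb(self) -> int:
        # Assumption: runtime and frame buffer are the only RAM consumers.
        return self.runtime_ram_kb + self.frame_buffer_kb


SMART_OVEN = ReferenceImplementation("Smart oven", 237, 19)
E_BIKE = ReferenceImplementation("E-bike", 204, 696)

# Lower bound of the 6-8 MB RAM range quoted for the old Qt for MCUs.
# Assumption: 1 MB = 1024 KB.
OLD_MIN_RAM_KB = 6 * 1024

for impl in (SMART_OVEN, E_BIKE):
    share = impl.total_ram_kb / OLD_MIN_RAM_KB
    print(f"{impl.name}: {impl.total_ram_kb} KB total RAM, {share:.0%} of 6 MB")
```

Under these assumptions, the smart oven needs 256 KB and the e-bike 900 KB of RAM in total. Note that the frame buffer, not the Qt runtime, dominates the e-bike's RAM budget.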

This minimisation of Qt was only possible by building a Qt runtime from scratch. The runtime is responsible for rendering QML items and text, for evaluating properties and functions, and for the C++ API with signals, slots and properties. And it should do all this in 200-250 KB RAM.

An engine with full support for QML and JavaScript is too big for an MCU. Just-in-time compilation, let alone interpretation, of QML and JavaScript is too slow for MCUs.

The major enabler for getting rid of the QML/JavaScript engine is the set of Qt Quick Ultralite code generation tools. These tools transpile QML into C++ code, which is compiled into machine code as usual. However, the code generation alone wouldn’t bring down the size of the Qt runtime and the Qt applications far enough.

Therefore, the Qt-for-MCUs team defined a very small QML subset, Qt Quick Ultralite, with heavily restricted expressions for properties and functions. Qt Quick Ultralite supports only 9 out of the 50+ QtQuick controls (including button, slider, check box, switch and progress bar).

With Qt 6, the developers turned QML into a strongly typed language. As the QML types are aligned with the C++ types, the C++ API can drop the overhead of passing QVariant objects between QML and C++. This measure reduces the size of the C++ API drastically and makes QML programs run faster.

Thanks to Yoann for an instructive conversation about Qt for memory-constrained devices 🙏

Note: The featured image “Small screens everywhere… and they run MCUs” is courtesy of The Qt Company, which holds the copyright.

My Content

Reference Image: Verdin iMX8M Plus, Yocto 4.0, Qt 6.5

The link provides my notes on how to build Yocto 4.0 (kirkstone) with meta-qt6 (Qt 6.5) for the Toradex Verdin iMX8M Plus. It is not a polished post. I hope it’s enough for you to reproduce your builds with meta-qt6. The pages Reference Images Based on Toradex Verdin iMX8M Plus and Reference Image: Verdin iMX8M Plus, Yocto 4.0, Qt 5.15 may be useful.

My example Internet radio application starts properly on power-up but doesn’t play the radio station. I’d guess that QtMultimedia with ffmpeg - its new default multimedia backend from Qt 6.5 - hasn’t been tested on all platforms yet. I haven’t had the time yet to fix the problem.

Around the Web

Valentina Cupać (Optivem Journal): Hexagonal Architecture - The Why

I am a happy reader of Valentina’s newsletter Optivem Journal. She writes and talks about Hexagonal Architecture (a.k.a. Ports-and-Adapters Architecture), TDD, Microservices Architecture and Event-Driven Architecture. Although Valentina comes from enterprise software, her advice is easily applicable to embedded software.

In the post and video Hexagonal Architecture - The Why, Valentina gives a step-by-step walk-through of how to transform a big ball of mud into a 3-layer architecture and then into a hexagonal architecture. The main difference between 3-layer and hexagonal architecture is that there are APIs (ports) between the adapters (presentation and persistence layers) and the business logic.

The APIs belong to the business logic and not to the adapters or layers - using dependency inversion for driven ports: the persistence port in the example. For testing, the business logic uses test doubles for the adapters. The adapters use test doubles for the business logic. Test doubles are alternative implementations of the ports (e.g., fakes or mocks).

Different teams can develop each adapter in isolation - without having the implementation of the business logic. Another team can develop the business logic - without having the adapter implementations. The teams only need properly defined interfaces between business logic and adapters.
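A minimal sketch may make the roles of port, adapter and test double more tangible. It is not taken from Valentina's post. All class and method names are invented for illustration, loosely inspired by my Internet radio example, and it shows only one driven port (persistence).

```python
from typing import Protocol


class StationRepository(Protocol):
    """Driven port: owned by the business logic, implemented by adapters."""

    def load_favourites(self) -> list[str]: ...

    def save_favourites(self, stations: list[str]) -> None: ...


class RadioService:
    """Business logic. It depends only on the port, never on a concrete adapter."""

    def __init__(self, repository: StationRepository) -> None:
        self._repository = repository

    def add_favourite(self, station: str) -> list[str]:
        favourites = self._repository.load_favourites()
        if station not in favourites:
            favourites.append(station)
            self._repository.save_favourites(favourites)
        return favourites


class InMemoryStationRepository:
    """Test double (a fake): an alternative implementation of the port."""

    def __init__(self) -> None:
        self._stations: list[str] = []

    def load_favourites(self) -> list[str]:
        return list(self._stations)

    def save_favourites(self, stations: list[str]) -> None:
        self._stations = list(stations)


# The business logic is exercised without any real persistence adapter.
service = RadioService(InMemoryStationRepository())
service.add_favourite("Radio A")
print(service.add_favourite("Radio B"))
```

A real adapter, say one that writes to a file or a database, would implement the same two methods. The team writing it needs only the port definition, not RadioService itself.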

Luca Ceresoli (Bootlin): Yocto: sharing the sstate cache and download directories

You can use the same download directory and shared state cache for different Yocto builds by setting the variables DL_DIR and SSTATE_DIR in local.conf. You can even point these variables to a network share so that developers can re-use the download and build artefacts from each other. This speeds up Yocto builds tremendously and saves a lot of space.

If you set DL_DIR and SSTATE_DIR outside the BitBake build environment (e.g., in ~/.bashrc), you will find out that these variables are overwritten by BitBake. You can avoid this by adding the variables to BB_ENV_PASSTHROUGH_ADDITIONS.

export BB_ENV_PASSTHROUGH_ADDITIONS="DL_DIR SSTATE_DIR"
export DL_DIR="${HOME}/data/bitbake.downloads"
export SSTATE_DIR="${HOME}/data/bitbake.sstate"

This is a useful method to make environment variables visible in the BitBake build environment. Kas - my favourite Yocto build tool - uses this method to pass DL_DIR, SSTATE_DIR and TMPDIR through to the BitBake environment.

Ivan Solovev (The Qt Company): Qt CAN Bus API extensions

I mentioned the QtCanBus extensions to Qt 6.5 in my January newsletter. With detailed examples, Ivan shows how to compose a QCanMessageDescription from several QCanSignalDescriptions, and how to encode and decode messages with QCanFrameProcessor.

The new class QCanDbcFileParser reads a CAN message specification file in DBC format and adds message descriptions to the CAN frame processor for decoding and encoding. By the way, I have written three code generators for J1939 messages for different customers over the last 10 years. The new QtCanBus functionality would have saved me a lot of time. And it still will, as I am sure there will be a fourth and fifth code generator.
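For readers who have never written such a generator, here is a rough sketch of what decoding and encoding a single signal boils down to. It deliberately does not use the QtCanBus API. The class and function names are hypothetical stand-ins, and it handles only unsigned little-endian signals. The example layout resembles how J1939 encodes engine speed, but treat it as an illustrative assumption, not as a quote from the standard or from Ivan's post.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SignalDescription:
    """Hypothetical stand-in for one signal of a CAN message (not the Qt class)."""
    name: str
    start_bit: int    # position of the least significant bit (little-endian)
    bit_length: int
    factor: float     # physical = raw * factor + offset
    offset: float


def decode_signal(payload: bytes, signal: SignalDescription) -> float:
    """Extract an unsigned little-endian signal and scale it to a physical value."""
    frame = int.from_bytes(payload, byteorder="little")
    mask = (1 << signal.bit_length) - 1
    raw = (frame >> signal.start_bit) & mask
    return raw * signal.factor + signal.offset


def encode_signal(payload: bytes, signal: SignalDescription, value: float) -> bytes:
    """Return a copy of the payload with the physical value written into it."""
    raw = round((value - signal.offset) / signal.factor)
    mask = (1 << signal.bit_length) - 1
    if not 0 <= raw <= mask:
        raise ValueError(f"{value} does not fit into {signal.bit_length} bits")
    frame = int.from_bytes(payload, byteorder="little")
    # Clear the signal's bits, then set them to the new raw value.
    frame = (frame & ~(mask << signal.start_bit)) | (raw << signal.start_bit)
    return frame.to_bytes(len(payload), byteorder="little")


# Assumed layout: a 16-bit engine speed in bytes 4-5 with 0.125 rpm per bit.
ENGINE_SPEED = SignalDescription("engine_speed", start_bit=24, bit_length=16,
                                 factor=0.125, offset=0.0)

frame = encode_signal(bytes([0xFF] * 8), ENGINE_SPEED, 1500.0)
print(frame.hex(" "), decode_signal(frame, ENGINE_SPEED))
```

Multiply this by dozens of messages, signed values, big-endian signals and multiplexing, and it becomes clear why generating the descriptions from a DBC file is attractive.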
