Episode 18: Burkhard on Qt Embedded Systems

Jun 07, 2021

Welcome to Episode 18 of my newsletter on Qt Embedded Systems!

My May newsletter is all about system design and architecture. The trigger was the inspiring Software Architecture Workshop held by Ruth Malan and Dana Bredemeyer [BM21]. I wrote two posts about the architecture of Qt embedded systems (more to come) and read the relevant parts of four books to understand the authors’ unique perspective on system architecture.

I’ll judge the different approaches by the following question: How does the approach help me increase the chance of creating a good, right and successful architecture?

Enjoy reading and stay safe - Burkhard 💜

My Blog Posts

Architecture of Qt Embedded Systems: Single vs. Multiple GUI Applications

When is a system with a single GUI app without a window manager good enough? When is a system with multiple apps with a window manager the better choice? What are the requirements on a window manager? Shall we use a Wayland compositor or X11? The answers for a concrete Qt embedded system have considerable influence on the system architecture.

Architecture of Qt Embedded Systems: Operating Conditions

Operating conditions like light, temperature, water, dust and vibration are a rich source of constraints and qualities for a system architecture. They affect the SoC, display and connector selection, the touch gesture recognition, the communication between system components, the cabling, the use of a window manager and more.

Reading on System Architecture

[BM21] Dana Bredemeyer and Ruth Malan, Software Architecture Workshop (training)

A system architecture should meet three important criteria. It should be

good - technically sound.
right - meeting stakeholder needs.
successful - delivering value.

Good. Well-known architecture patterns or styles like the Layers, Ports-and-Adapters, Service-Oriented, Microservices and Microkernel patterns very likely lead to good system architectures. One pattern alone will rarely be the best choice for your system. The secret lies in mixing them in the right way. Qt embedded systems often have the Ports-and-Adapters architecture at their core, complemented by a Service-Oriented and a Microkernel architecture. [RF20] covers the technical aspects of architectures in detail.
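To make the Ports-and-Adapters idea concrete, here is a minimal sketch in plain C++. It assumes a harvester terminal that watches the engine speed. All names (EngineSpeedPort, FakeEngineAdapter, EngineSpeedMonitor) and the rpm limits are invented for illustration and come neither from the workshop nor from Qt.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Port: the only thing the core knows about the outside world.
// (Hypothetical interface, invented for this sketch.)
class EngineSpeedPort {
public:
    virtual ~EngineSpeedPort() = default;
    virtual int rpm() const = 0;
};

// Adapter for tests and simulation. A production adapter would
// implement the same port on top of, say, a CAN bus.
class FakeEngineAdapter : public EngineSpeedPort {
public:
    explicit FakeEngineAdapter(int rpm) : m_rpm(rpm) {}
    int rpm() const override { return m_rpm; }
private:
    int m_rpm;
};

// Core logic: depends on the port, never on a concrete adapter.
class EngineSpeedMonitor {
public:
    EngineSpeedMonitor(std::unique_ptr<EngineSpeedPort> port, int minRpm, int maxRpm)
        : m_port(std::move(port)), m_minRpm(minRpm), m_maxRpm(maxRpm)
    {
        assert(m_port != nullptr);
        assert(minRpm <= maxRpm);
    }

    // True if the current engine speed lies outside [minRpm, maxRpm].
    bool isOutsideRange() const
    {
        const int rpm = m_port->rpm();
        return rpm < m_minRpm || rpm > m_maxRpm;
    }
private:
    std::unique_ptr<EngineSpeedPort> m_port;
    int m_minRpm;
    int m_maxRpm;
};
```

Because the core only sees the port, the same EngineSpeedMonitor can run against the fake adapter in unit tests and against a hardware adapter on the terminal.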

Right. The good-criterion of an architecture looks at the inside of the system. The right-criterion examines the influence of stakeholders on the system from the outside. The right architecture must meet the needs of many different stakeholders. Here are some examples from my experience.

The job of a harvester driver becomes considerably simpler when the harvester automatically adapts the cutting height depending on the ripeness of the maize plants. This has serious consequences for the architecture. The system must be able to perform on-board image recognition to infer the ripeness from the colour of the plants. Hence, it requires an appropriate SoC like the NVIDIA Jetson.

Your employers or customers have different needs. They want to release the product in 12 months on a small budget. They want to run the system for 10-15 years in the field with minimal support costs. They want to extend the system regularly with new features.

The developers’ needs enforce other constraints on the system. Here are some I encounter frequently on projects. Most of the developers have less than three years of professional experience. Half of them have no experience with the programming language and framework you want to use.

Systems have many more stakeholders. They must satisfy regulatory, safety and security provisions. Your system communicates with other systems like cloud servers, machines and sensors. It uses software and hardware components from third-party suppliers.

As architects, we must unearth the stakeholder needs or architecturally significant requirements (ASRs) as [Kee17] calls them (see p. 49). In their workshop [BM21], Ruth and Dana let the participants try out many different tools to look at the system from different perspectives: context map, stakeholder map and profile, empathy map, component model, CRC-R cards, essential use cases, risk storming, system properties, deployment diagram and more. [Kee17] offers a rich toolbox (see Part III) as well.

By iterating these tools until they don’t provide any new or important ASRs, you will converge towards the right architecture. Knowing a good target architecture and having profound domain knowledge speeds up the convergence to the right architecture.

Successful. This criterion widens the perspective from the immediate system context to the market in which the system succeeds or fails. You have competitors. They may adapt faster to market changes. They may get their product to market faster than you. Your product may become commoditised or obsolete.

The competitor may be inside your company. Nokia started developing the Linux-based MeeGo phones when their Symbian phones still accounted for more than 50% of all smartphones. The establishment torpedoed the newcomers wherever possible. The rest is history. Nokia imploded within four years and disappeared as a phone maker.

The system depends on hardware and software components from third-party suppliers. These suppliers may go out of business, produce bad-quality components, deliver the components late or increase their prices. If you don’t have a plan B, you must spend a lot of time to work around these problems.

Many of these influences cannot be foreseen. They remain unknown or unknowable. If you anticipate them, they’ll make for important ASRs. It’s often time that tells whether an architecture is successful.

A successful architecture is defined as delivering value. Scrum, XP and other agile methods are all about maximising the business value delivered to the customer. They are also all about getting feedback early, quickly and regularly.

By implementing the ASRs using, say, XP (especially TDD, refactoring and continuous integration), we get an executable architecture. We extend the implementation in parallel to the architecture. The implementation gives us early and profound feedback about the architecture that visual models can’t. For example, diagrams can’t tell us whether our architecture is able to handle 1000 messages per second from the machine.

You get the most successful architectures by finding the right balance between modelling and implementation. Dana told the story of how the three best architects he knows switch between modelling and coding to produce their architectures. Neither big up-front design nor rushing into coding produces successful architectures. The right mix of both does, as [Fow04] argues. [FPK17] turn this idea into evolutionary architectures.

Risk is the chance of something good or bad happening. Something good happening is perceived as value, whereas something bad happening is perceived as cost (negative value). So, it’s not surprising that risk storming is an effective way of discovering ASRs, especially constraints and qualities. [Fai10] uses risks to guide the creation of architectures.

[Kee17] Michael Keeling, Design It! From Programmer to Software Architect (book), The Pragmatic Programmers, 2017

According to [BM21], the right architecture must meet the stakeholder needs, which [Kee17] calls architecturally significant requirements (ASRs). [Kee17] defines four categories of ASRs (p. 49).

Constraints. Unchangeable design decisions, usually given, sometimes chosen.
Quality Attributes. Externally visible properties that characterise how the system operates in a specific context.
Influential Functional Requirements. Features and functions that require special attention in the architecture.
Other Influencers. Time, knowledge, experience, skills, office politics, your own geeky biases, and all the other stuff that sways your decision making.

Constraints and quality attributes (qualities, for short) are also known as non-functional requirements. Qualities are negotiable, whereas constraints typically are not. Here are some examples for harvester terminals (see my posts above).

Technical constraint. The terminal must operate continuously in temperatures from -15°C to +70°C.
Business constraint. The first release of the terminal must be available for the autumn harvest in 2022.
Quality. The contents on the display must be readable in sunlight, daylight and nightlight.
Quality. The terminal GUI must be able to display 300 messages per second on average and 1300 messages per second at peak times.

Influential functional requirements are difficult-to-implement, high-value or high-priority user stories. The essential user stories making up the minimum viable product (MVP) are good candidates. Here are some examples for the harvester terminal.

User Story. As the driver, I always want to see the engine speed so that I know when the engine becomes overloaded or under-utilised.
User Story. As the driver, I want to change the cutting height depending on the ripeness of the maize plants.
User Story. As the driver, I must record the harvested area, diesel consumption, working hours and more so that my agency can invoice the farmers.

In Agile, user stories are placeholders for future conversations. In the conversation, the team breaks down the user story into communication between stakeholders and components or into communication between components.

Influencers interact with the system as part of a larger ecosystem. Some examples:

Influencer. The terminal is built by an in-house team, whereas the ECUs are built by a third party being paid a fixed price for the ECUs.
Influencer. The company struggles to hire Qt/C++ embedded developers.
Influencer. The terminal SoC is in short supply.

All the ASRs are listed in the ASR workbook (see p. 60). The ASR workbook additionally contains the business context (stakeholders, business goals), user personas and a glossary. At the beginning of the project, the ASR workbook changes often. When the architecture matures and when parts of the architecture get implemented, changes become less and less frequent. The workbook remains a useful source of information for all stakeholders.

The author provides 38 activities (see Part III - The Architect’s Toolbox) that help you create an architecture. He divides the activities into four groups.

Activities to understand the problem. You apply these activities to find the ASRs.
Activities to explore potential solutions. These activities help you to find a good architecture. “Since all design is redesign, exploration starts by considering solutions we already know […]”. Profound knowledge of architecture patterns and of successful architectures from similar projects helps tremendously. The title of Chapter 6 says it well: “Choose an Architecture (Before It Chooses You)”.
Activities to make the design tangible. These activities provide you with early feedback about the architecture. They involve building a prototype. I would actually start building working software guided by the most important qualities and user stories - instead of building a throw-away prototype. Using TDD, refactoring and continuous integration guarantees that the software is easy to change (see also [Fow04] below).
Activities to evaluate design options. These activities help you decide how well the architecture meets the stakeholder needs, that is, whether you have found the right architecture - in [BM21] parlance. The goal is to identify architecture decisions that are hard and costly to change later on. This goal makes it clear how important easy-to-change software is.

I have one quibble with all these activities. They don’t tell you how good or bad their result is. They also don’t give you a method that yields better results. So, it’s down to the profound knowledge of your team to judge the quality of the results. This knowledge also guides you in the selection of the most useful activities and lets you focus on the important stuff.

[RF20] Mark Richards and Neal Ford, Fundamentals of Software Architecture (book), O’Reilly, 2020

The authors dedicate more than 40% of the book to architecture patterns, which they call architecture styles (Part II). This makes the book a good complement to [Kee17], which spends only 6% on the same topic.

Unfortunately, [RF20] lacks the most important pattern for Qt embedded systems: the hexagonal architecture, also known as the ports-and-adapters pattern. [Kee17] has a two-page description (p. 82-83). Alistair Cockburn created the pattern and describes it in the post Hexagonal Architecture. Five of the eight architecture styles in [RF20] are relevant for Qt embedded systems:

Chapter 10 - Layered Architecture Style. GUI applications traditionally have three layers: HMI, business logic and data.
Chapter 11 - Pipeline Architecture Style. GStreamer builds up a pipeline of sources, filters and sinks to process video and audio.
Chapter 12 - Microkernel Architecture Style. This style is also known as plugin architecture style. Qt uses it extensively. For example, QCanBus from the Qt Serial Bus module provides plugins for different CAN devices like SocketCAN, PeakCAN, VectorCAN and VirtualCAN. All plugins implement the same interface QCanBusDevice.
Chapter 13 - Service-Based Architecture Style. Infotainment systems in cars have GUI applications like Radio, Media and Phone that talk to their respective service over inter-process communication.
Chapter 14 - Event-Driven Architecture Style. An MQTT client on your system communicates with other MQTT clients via an MQTT broker in the cloud.
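The microkernel (plugin) style from the list above can be sketched in a few lines of plain C++. This is not how Qt implements its CAN plugins. CanDevice, VirtualCanDevice and CanPluginRegistry are hypothetical names that only illustrate the principle: the core knows plugins by name and by one common interface.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Common interface that every plugin implements.
// (Hypothetical stand-in, not QCanBusDevice.)
class CanDevice {
public:
    virtual ~CanDevice() = default;
    virtual std::string description() const = 0;
};

// One example plugin.
class VirtualCanDevice : public CanDevice {
public:
    std::string description() const override { return "virtual CAN bus"; }
};

// The "microkernel": creates devices without knowing their concrete types.
class CanPluginRegistry {
public:
    using Factory = std::function<std::unique_ptr<CanDevice>()>;

    void registerPlugin(const std::string &name, Factory factory)
    {
        assert(factory);
        m_factories[name] = std::move(factory);
    }

    // Returns nullptr if no plugin with this name was registered.
    std::unique_ptr<CanDevice> create(const std::string &name) const
    {
        const auto it = m_factories.find(name);
        if (it == m_factories.end()) {
            return nullptr;
        }
        return it->second();
    }
private:
    std::map<std::string, Factory> m_factories;
};
```

Adding support for a new CAN adapter then means registering one more factory. The code that uses CanDevice stays unchanged.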

In the section Architect Role (p. 101), the authors make a fairly disputable statement.

Generally the component is the lowest level of the software system an architect interacts directly with […] Components consist of classes or functions […], whose design falls under the responsibility of tech leads or developers.

The authors worry that architects would otherwise micromanage developers and tech leads and that the organisation couldn’t cultivate the next generation of architects. The authors’ advice smacks of waterfall. The architects hand over the architecture to the tech leads who break down the design of their components for the developers to implement.

An agile team doesn’t distinguish architects, tech leads and developers. It only has developers that cooperate to build a successful product. A good, right and successful architecture is one important piece in this cooperation, not more and not less. In the same way the team works together to create an architecture, it also works together to build the system.

A good architect has designed and developed several different products before. I haven’t seen a good architect yet who isn’t a good developer. Pairing a good architect with developers is the best way to spread architecture knowledge through a team. Forgoing the expertise of a good architect would hurt the team and reduce the chance for a successful product.

[Fai10] George Fairbanks, Just Enough Software Architecture: A Risk-Driven Approach (book), Marshall & Brainerd, 2010

The author aptly describes my problem with the processes for creating architectures (p. 89): “Broadly speaking, [the processes] lay out a large number of modelling and analysis techniques, many of which are expensive, and suggest that your project is risking its success if these techniques are not done.”

As time is always at a premium in projects, you don’t want to waste time on irrelevant or low-risk parts of the architecture. You can also overlook important or high-risk parts of the architecture and pay for it later with a re-implementation that is several times more expensive. You can fall into both traps, no matter whether you use planned design, no design or anything in-between.

Risk-driven architecture helps you minimise the chance that you fall into any of these traps (p. 36): “Its essential idea is that the effort you spend on designing your software architecture should be commensurate with the risks faced by your project.” The risk-driven approach consists of three steps (p. 37):

Identify and prioritize risks
Select and apply a set of techniques
Evaluate risk reduction

Let me give you an example from my experience. An operator controls a cleaning robot in another room from a handheld terminal. No person may enter the room during cleaning, because the robot uses UV rays or highly aggressive chemicals. The communication between terminal and robot is wireless. The robot must stop if the operator hits the emergency stop on the terminal or if it hasn’t received an alive message from the terminal for half a second.

The emergency stop was the requirement with the highest risk in the project. If it didn’t work, the project would be dead.
LoRaWAN and Bluetooth Long Range were the contenders for the wireless communication.
Some calculations showed that Bluetooth Long Range would provide enough bandwidth but LoRaWAN probably would not. Some practical experiments with a prototype in realistic room settings confirmed the calculations. The safety-critical communication code should run in an isolated application on the terminal or even run on the microcontroller. This would make the certification simpler.
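The stop condition from this example is small enough to sketch in plain C++. The class name AliveWatchdog and its functions are assumptions made for illustration, and the sketch is in no way a certified implementation. Time is passed in explicitly, so the logic can be tested without a radio link or a real clock.

```cpp
#include <cassert>
#include <chrono>

// Decides whether the robot must stop (hypothetical sketch).
class AliveWatchdog {
public:
    using Clock = std::chrono::steady_clock;
    using TimePoint = Clock::time_point;

    AliveWatchdog(std::chrono::milliseconds timeout, TimePoint start)
        : m_timeout(timeout), m_lastAlive(start)
    {
        assert(timeout.count() > 0);
    }

    // Called whenever an alive message from the terminal arrives.
    void onAliveMessage(TimePoint now) { m_lastAlive = now; }

    // Called when the operator hits the emergency stop. The stop is latched.
    void onEmergencyStop() { m_emergencyStop = true; }

    // True if the emergency stop was hit or the alive message is overdue.
    bool mustStop(TimePoint now) const
    {
        return m_emergencyStop || (now - m_lastAlive) >= m_timeout;
    }
private:
    std::chrono::milliseconds m_timeout;
    TimePoint m_lastAlive;
    bool m_emergencyStop = false;
};
```

On the robot, a periodic timer would call mustStop() with the current time of a monotonic clock. Keeping this logic free of GUI and radio code fits the idea of running the safety-critical part in an isolated application or on the microcontroller.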

Scrum and XP choose the user story with the highest business value as the next story to work on. The story with the highest value is not necessarily the story with the highest risk. If a risk materialises, it has a cost or negative value as in the above example. Hence, the author rightly suggests that we “change the feature backlog into a feature & risk backlog” (p. 57). Like all stories, risks should have testable acceptance criteria.

Quality attributes are typically the most important drivers for the architecture. And they are the inverse of risks: A risk is “the lack of a needed quality attribute” (p. 40). This is the reason why risk storming (activity 35 in [Kee17]) is so effective in discovering qualities. Once we have prioritised the risks/qualities, we know where to spend our efforts for creating the architecture.

The author introduces a presumptive architecture (p. 23) as “a family of architectures that is dominant in a particular domain. Rather than justifying their choice to use it, developers in that domain may have to justify a choice that differs from the presumptive architecture.” The presumptive architecture and de-facto standard for Qt embedded systems is the hexagonal architecture. Keep in mind: “Systems that follow presumptive architectures usually succeed” (p. 25).

The author acknowledges in several places that running code yields invaluable feedback for design. For example (p. 50): “With current design techniques, it is nearly impossible to perfect the design without feedback from running code.” The next post [Fow04] discusses the right balance between up-front design and running code.

[Fow04] Martin Fowler, Is Design Dead? (post), 2004

XP eschews up-front design. It evolves the architecture along with implementing the user stories. The right architecture will somehow appear when you strictly follow the XP practices. Simple design comes from adhering to the XP mantras: “Do the simplest thing that could possibly work” and “you ain’t gonna need it” (YAGNI). And if needed, you can still refactor the architecture into a simpler one.

XP design is called evolutionary design. The other extreme is planned design - achieved by big design up-front (BDUF). As [Fai10] points out, both planned and evolutionary design regularly miss the high-risk and important parts of the architecture - for different reasons.

Both design approaches prove their value when they can flatten the change curve. “The change curve says that as the project runs, it becomes exponentially more expensive to make changes.”

For planned design, it’s either hit or miss. If planned design overlooks a high-risk requirement (and it will!), the project suffers from the exponential change curve. Otherwise, it doesn’t. Evolutionary design still has a joker. You can refactor the wrong design into the right one.

The effort for refactoring becomes exponentially higher the further up the project is on the change curve. So, it’s best to refactor your design to the simplest possible design frequently and to nip exponential growth in the bud. Refactoring requires a high-quality test suite, best gained through TDD. Continuous integration (CI) ensures that every team member works on the same design. Only the cooperation of TDD, CI and refactoring enables the flattening of the change curve.

Evolutionary design converges a lot faster towards the right architecture, if you have a target architecture in mind. The target architecture could be the presumptive architecture from [Fai10] or a successful architecture from a previous project. You adapt this architecture to the special needs of the current project. So, there is space for little design up-front or minimal planned design (see [Fai10, p. 50]): Martin Fowler “does roughly 20% planned design and 80% evolutionary design”.

The “main source of complexity is the irreversibility of decisions”. Hence, evolutionary “designers need to think about how they can avoid irreversibility in their decisions. Rather than trying to get the right decision now, look for a way to either put off the decision until later (when you’ll have more information) or make the decision in such a way that you’ll be able to reverse it later on without too much difficulty.” Refactoring together with TDD and CI gives you the freedom to reverse your decisions.

Early and frequent feedback from working software - a hallmark of XP and evolutionary design - is invaluable. “If a mistake is made in requirements it can be spotted and fixed before the cost of fixing becomes prohibitive. This same rapid spotting is also important for design.”

Evolutionary design only works if somebody (typically the architect) ensures “that the design quality stays high” and the rest of the team goes along. This person tries to spot messy areas in the code base and makes sure that the messes get fixed quickly before they get out of control. [FPK17] gives tips on how to automate these tasks.

The skill list for an architect may look daunting, but being a good architect has never been easy.

A constant desire to keep code as clear and simple as possible.
Refactoring skills so you can confidently make improvements whenever you see the need.
A good knowledge of patterns: not just the solutions but also appreciating when to use them and how to evolve into them.
Designing with an eye to future changes, knowing that decisions taken now will have to be changed in the future.
Knowing how to communicate the design to the people who need to understand it, using code, diagrams and above all: conversation.

[FPK17] Neal Ford, Rebecca Parsons and Patrick Kua, Building Evolutionary Architectures: Support Constant Change, O’Reilly, 2017

With non-negligible effort, you have found the right architecture. How can you prevent the right architecture from degrading over time? The authors’ answer (p. 7): “Once architects have chosen important characteristics, they want to guide changes to the architecture to protect those characteristics.”

Architectural changes are guided by fitness functions. “Architects define a fitness function to explain what better is and to help measure when the goal is met. In software, fitness functions check that developers preserve important architectural characteristics” (p. 15).

Qualities often make good fitness functions. For example, the harvester terminal must be able to handle 300 messages per second on average and up to 1300 messages per second at peak times. The fitness function is a benchmark test. The test sends 300-1300 messages per second to the component handling the messages and checks that the component doesn’t forward more than 150 messages per second to the GUI.
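A benchmark-style fitness function for this quality could look roughly like the following C++ sketch. It uses simulated time instead of a real clock. MessageThrottle is a hypothetical stand-in for the message-handling component, and the fixed one-second windows are a simplifying assumption. A real benchmark would drive the actual component.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical component under test: forwards at most `limitPerSecond`
// messages to the GUI within each one-second window and drops the rest.
class MessageThrottle {
public:
    explicit MessageThrottle(int limitPerSecond) : m_limit(limitPerSecond)
    {
        assert(limitPerSecond > 0);
    }

    // Returns true if the message arriving at `timeMs` is forwarded.
    bool offer(std::int64_t timeMs)
    {
        const std::int64_t window = timeMs / 1000;
        if (window != m_window) {
            m_window = window;   // a new second starts: reset the counter
            m_forwarded = 0;
        }
        if (m_forwarded >= m_limit) {
            return false;
        }
        ++m_forwarded;
        return true;
    }
private:
    int m_limit;
    std::int64_t m_window = -1;
    int m_forwarded = 0;
};

// Fitness function: sends `ratePerSecond` evenly spaced messages for
// `seconds` simulated seconds and returns the highest number of
// messages forwarded to the GUI in any of these seconds.
int maxForwardedPerSecond(MessageThrottle &throttle, int ratePerSecond, int seconds)
{
    int worst = 0;
    for (int s = 0; s < seconds; ++s) {
        int forwarded = 0;
        for (int i = 0; i < ratePerSecond; ++i) {
            const std::int64_t timeMs = s * 1000LL + (i * 1000LL) / ratePerSecond;
            if (throttle.offer(timeMs)) {
                ++forwarded;
            }
        }
        if (forwarded > worst) {
            worst = forwarded;
        }
    }
    return worst;
}
```

The deployment pipeline would run the check at both ends of the range, 300 and 1300 messages per second, and fail the build if the returned value exceeds 150.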

Runtime monitors can be used as fitness functions. For example, two components run in different threads of the same process and communicate asynchronously with Qt’s signal and slot functions (using message queues behind the scenes). Nothing prevents one component from calling a function in the other component directly, circumventing Qt’s behind-the-scenes synchronisation. A monitor could raise an alarm if the thread IDs of the caller and callee differ.
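Such a monitor can be sketched with the C++ standard library alone. ThreadAffinityMonitor is a hypothetical name, and counting violations instead of raising an alarm is a simplification. In Qt code, the same comparison would presumably use QThread::currentThread() and QObject::thread().

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical runtime monitor: a component remembers the thread it
// was created in and counts every call arriving from another thread.
class ThreadAffinityMonitor {
public:
    ThreadAffinityMonitor() : m_owner(std::this_thread::get_id()) {}

    // Call at the top of every public function of the component.
    // The default argument is evaluated in the calling thread.
    // Returns false and records a violation on a cross-thread call.
    bool check(std::thread::id caller = std::this_thread::get_id())
    {
        if (caller == m_owner) {
            return true;
        }
        ++m_violations;
        return false;
    }

    int violations() const { return m_violations.load(); }
private:
    const std::thread::id m_owner;
    std::atomic<int> m_violations{0};
};
```

A test run or a debug build could fail as soon as violations() becomes greater than zero. Passing the caller ID explicitly is only there to make the check easy to unit test.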

Fitness functions can also check for dependency cycles on component or class level. They can warn you whenever you introduce a cycle. They can also guide you when you remove cycles from legacy software. Fitness functions can check whether functions stay below a certain cyclomatic complexity.
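A cycle check on component level boils down to a depth-first search over the dependency graph. The sketch below assumes that the dependencies have already been extracted into a map from component name to the names it depends on. How to extract them from a real code base is out of scope here, and the names DependencyGraph and hasDependencyCycle are invented.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Component name -> names of the components it depends on.
using DependencyGraph = std::map<std::string, std::vector<std::string>>;

namespace detail {
// Depth-first search. Returns true if a cycle is reachable from `node`.
bool visit(const DependencyGraph &graph, const std::string &node,
           std::set<std::string> &inProgress, std::set<std::string> &done)
{
    if (done.count(node) > 0) {
        return false;                      // already fully explored
    }
    if (!inProgress.insert(node).second) {
        return true;                       // node is on the current path: cycle
    }
    const auto it = graph.find(node);
    if (it != graph.end()) {
        for (const std::string &dependency : it->second) {
            if (visit(graph, dependency, inProgress, done)) {
                return true;
            }
        }
    }
    inProgress.erase(node);
    done.insert(node);
    return false;
}
} // namespace detail

// Fitness function: true if the component graph contains a cycle.
bool hasDependencyCycle(const DependencyGraph &graph)
{
    std::set<std::string> inProgress;
    std::set<std::string> done;
    for (const auto &entry : graph) {
        if (detail::visit(graph, entry.first, inProgress, done)) {
            return true;
        }
    }
    return false;
}
```

For legacy software, the same search could be extended to report the components on the cycle, which shows where to cut first.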

[Fow04] made it the architect’s job to ensure “that the design quality stays high”, that is, to prevent the right architecture from degrading. Thirteen years later, advances in DevOps and tooling enable architects to automate the tactical aspects of their jobs by adding fitness functions to the deployment pipeline.

This frees architects to focus on the strategic aspects of their jobs. [Fow04] says it well: “Instead of an architect who makes all the important decisions, you have a coach that teaches developers to make important decisions. As Ward Cunningham pointed out, by that [architects amplify their] skills, and [add] more to a project than any lone hero can.”

Better Built By Burkhard
