About Real-time Systems

Real-time Systems

On a very general level, it is not straightforward to delineate a real-time system from a “non-real-time” one. Even a batch process has real-time aspects as regards interactions with peripheral devices, and the same holds for PC operating systems such as Windows or macOS. Conversely, real-time systems can have non-real-time aspects as well, eg. in the area of user interactions.

Usually, real-time systems and control systems are discussed together, and even considered the same from a practical point of view, because control systems require more or less strict response times to external events. Control systems are based on control theory: they detect events and read data from the controlled system via sensors, and write their corresponding response to the controlled system via actuators.

That is, even if real-time systems do not need to be control systems, strictly speaking, they usually are, and we will focus on real-time control systems accordingly here. We will use the term real-time system in general.

Real-time systems are focused on running control processes, which are often ever-repeating. User interaction is not a primary focus; in fact, it can even be detrimental to timing accuracy. As discussed below, control processes can be implemented with sets of tasks, preserving state in module variables, or with threads that inherently hold state, for example using coroutines. We will use the term process for both approaches.

For this discussion, we’ll use this simple definition of a real-time system:

A real-time system is able to react to a defined set of external events within a defined time period.

The reaction includes the detection of the event, the acquisition of the related data, the determination of the appropriate response, and the application of the response to the controlled system.
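
The four steps of a reaction can be sketched as a single function. This is only an illustration in Python, not Oberon RTK code: all names and parameters (`react`, `event_pending`, `read_sensor`, `decide`, `write_actuator`, `clock_us`, `deadline_us`) are hypothetical. As a simplification, the elapsed time is measured from the moment the function starts checking for the event, not from the moment the event actually occurred in the controlled system.

```python
def react(event_pending, read_sensor, decide, write_actuator, clock_us, deadline_us):
    """Run one reaction and report whether it met its deadline.

    Returns None if no event was pending, otherwise True (deadline met)
    or False (deadline missed). All arguments are hypothetical callables
    or values supplied by the caller.
    """
    start_us = clock_us()
    if not event_pending():          # 1. detect the event
        return None
    value = read_sensor()            # 2. acquire the related data
    command = decide(value)          # 3. determine the appropriate response
    write_actuator(command)          # 4. apply it to the controlled system
    return clock_us() - start_us <= deadline_us
```

Passing the clock in as a parameter keeps the sketch independent of any particular timer hardware.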

Real-time Operating Systems

A real-time system does not require a real-time operating system. In fact, many real-time controllers are implemented simply as embedded programs. Embedded programs are cross-compiled and statically linked on a host computer, and transferred to the target system. There, the boot loader puts the address of the first instruction, as well as the starting address of the stack, into defined address locations, and the microprocessor starts executing the downloaded embedded program from there.

Of course, the embedded program can, and usually does, make use of libraries to define and implement the control processes, and their scheduling and interaction, as well as the access to the peripheral devices. The resulting program, however, is a single binary. Any change to a module requires recompiling, relinking, and downloading the whole program again.

A real-time operating system provides a functional substrate for a real-time controller program. Like a suitable set of libraries for single-binary embedded control programs, it defines the notion of control processes and their interactions, the access to peripheral devices, and so on. However, it “exists” on the target hardware without any control process running. Like any operating system, it can manage resources and provide a file system and a standard user interface. New or changed modules can be loaded dynamically from files or file-like data sources, eg. stored on an SD card or in PROM, allowing on-the-fly changes and extensions at run-time.

Real-time Kernels

A real-time operating system provides defined abstractions and APIs for all its services. However, most of what a real-time operating system can provide in this respect can also be achieved by a well-defined and well-implemented set of libraries that get linked into a single binary. In fact, so-called real-time kernels provide useful abstractions and APIs, just like a real-time operating system. From a programmer’s practical point of view, the boundaries can get pretty blurry if we set aside the basic difference between a single binary program and one that runs “on top” of an independent target operating system.

As the name implies, Oberon RTK is a kernel, not an operating system.

Processes & Scheduling

Control Process

A (control) process defines and implements an independent thread of control, or thread of execution, as part of a control program. Such a program can comprise one or more processes, each focusing on its defined domain of control, goals and duties. This means that processes should be as self-contained as possible, with interfaces to each other that are as narrow as feasible (high cohesion, low coupling).

Processes are the dynamic building blocks of a control program, much like Oberon modules are static building blocks regarding program structure. Oberon modules provide an ideal conceptual and implementation substrate to realise processes.

A process gets invoked by a scheduling mechanism, and it can, but is not required to, hold state between invocations.

A process can be implemented in different ways. The implementation approaches include a set of tasks which are scheduled and run systematically, eg. based on an explicit finite state machine, as well as coroutines.
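
To make the two approaches concrete, here is a minimal sketch of the same process written both ways. It is in Python for illustration only, not in Oberon, and the thermostat example, its thresholds and all names (`thermostat_task`, `thermostat_coroutine`, `_state`) are assumptions made up for this sketch. The first version keeps its state in a module variable and runs one finite state machine step per invocation; the second holds its state implicitly in the position where the coroutine is suspended.

```python
# Approach 1: a task invoked by the scheduler; state lives in a module variable.
_state = "HEATING"   # explicit finite state machine state

def thermostat_task(temperature):
    """One invocation: run one FSM step, return True if the heater should be on."""
    global _state
    if _state == "HEATING" and temperature >= 22.0:
        _state = "IDLE"
    elif _state == "IDLE" and temperature <= 20.0:
        _state = "HEATING"
    return _state == "HEATING"


# Approach 2: a coroutine; the state is the point where execution is suspended.
def thermostat_coroutine():
    """Receives temperatures via send(), yields True if the heater should be on."""
    temperature = yield True            # first next() primes the coroutine
    while True:
        while temperature < 22.0:       # heating phase
            temperature = yield True
        while temperature > 20.0:       # idle phase
            temperature = yield False
```

With the coroutine, a caller first primes it with `next(co)` and then passes each new reading with `co.send(reading)`. Both versions produce the same heater commands for the same sequence of readings; the difference is only where the state is kept.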

Scheduling

Initially, we will only consider cooperative scheduling, where processes are given full control of their processor, or processor core, until they “voluntarily” yield this control back to the scheduler. Interrupts and interrupt handlers can muddy this simple concept somewhat, and depending on their interaction with the “normal” processes, in particular as regards shared data, this must be dealt with accordingly.

The cooperative approach has substantial benefits: it is very simple and transparent, it avoids complex schemes to arbitrate access to shared data and devices, and it allows for simple context switches with little overhead.

However, cooperative scheduling puts the responsibility for fulfilling the timing requirements onto the programmer. The real-time kernel cannot guarantee timing constraints; the control program must do this. The programmer’s duty is to create a “fair process society” where all processes get their opportunity to do their job within their specific timing constraints.
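
A minimal sketch of cooperative scheduling, assuming processes are modelled as Python generators and scheduled round-robin. This is an illustration only, not the Oberon RTK scheduler; `process` and `run_round_robin` are hypothetical names. Each `yield` is the point where a process voluntarily hands control back.

```python
from collections import deque

def process(name, steps, log):
    """Hypothetical process: does `steps` short units of work, yielding after each."""
    for i in range(steps):
        log.append((name, i))   # one short unit of work
        yield                   # voluntarily hand control back to the scheduler

def run_round_robin(processes):
    """Give each process a turn, in order, until all of them have finished."""
    ready = deque(processes)
    while ready:
        proc = ready.popleft()
        try:
            next(proc)          # the process runs until its next yield
        except StopIteration:
            continue            # process finished: drop it from the roster
        ready.append(proc)      # still alive: queue it for another turn
```

Note that nothing in the scheduler limits how long a process runs between two yields. A process that does too much work in one turn, or never yields, delays all others, which is exactly the responsibility that falls on the programmer.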

Later versions of RTK may consider pre-emptive scheduling, and even pre-emptive scheduling with time slicing.

Interrupts

We can distinguish between two needs, or uses, of interrupts.

First, interrupts can react to genuine external events in the controlled system (environment), such as detection of a required measurement, or react to exceptional situations in the control system, such as failure states.
Second, interrupts can be used to implement purely internal regular and recurring mechanisms, for example to empty a buffer towards a peripheral device.

We’ll try to avoid the second type as much as possible, or use it closely coupled to processes. The use of the DMA facilities on the RP2040 will hopefully reduce the need for interrupts of the second kind.

Basic Considerations and Requirements

Let’s discuss a few basic requirements for a real-time operating system that I consider to be fundamental. The list is by no means comprehensive, and will probably be amended going forward. We’ll leave out process interaction and coordination in a first round.

Timing

It is in the name: timing is the crucial aspect of a real-time system. With cooperative scheduling, a roster of processes must usually run on a fixed schedule, with as little jitter as possible. The process schedule is the basis for fulfilling timing requirements and constraints.

A real-time kernel should provide facilities and services for the strict timing of processes.
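
One way to keep a roster on a fixed schedule is sketched below, using a simulated clock so that the example is reproducible. It is written in Python for illustration; `PeriodicProcess`, `simulate` and their parameters are hypothetical and do not describe any existing kernel API. The key design choice is that the next release time is computed from the previous release time, not from the current time.

```python
class PeriodicProcess:
    """Hypothetical bookkeeping for one process with a fixed period."""

    def __init__(self, name, period_us, start_us=0):
        self.name = name
        self.period_us = period_us
        self.next_release_us = start_us

    def due(self, now_us):
        return now_us >= self.next_release_us

    def advance(self):
        # Add the period to the previous release time, not to "now":
        # a late invocation then does not shift all later ones.
        self.next_release_us += self.period_us


def simulate(processes, tick_us, end_us):
    """Step a simulated clock and record (time, name) for every release."""
    trace = []
    for now_us in range(0, end_us, tick_us):
        for p in processes:
            if p.due(now_us):
                trace.append((now_us, p.name))
                p.advance()
    return trace
```

If the scheduler only looks at the clock every 3 ms, a process with a 5 ms period is released at 0, 6, 12 and 15 ms in this sketch. Individual releases are late (jitter), but the schedule catches up with its nominal grid instead of drifting further and further away from it.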

Busy Waiting

Polling by busy waiting can be detrimental to process timing, in particular when interacting with a peripheral device. Busy waiting inside a process should only be considered in exceptional circumstances, when the time spent waiting is very short compared to the overall process schedule. Polling can be delegated to the scheduler though, allowing other processes to run as well.

Let’s take the example of an RS232 device, running at 115,200 baud transmit speed. Assuming ten bits on the wire per character (eight data bits plus start and stop bits), the transmission of one ASCII character will take 87 microseconds, hence a short string of, say, six characters will take just above half a millisecond, during which the processor executes the busy waiting. A processor clocked at 50 MHz waits for over 26,000 cycles, which could be allocated to a different process. If we have a typical process schedule with a roster of processes to run every five, ten or 20 milliseconds, wasting half a millisecond for waiting is not ideal with respect to the overall timing.

With an SPI device running at 10 MHz transmission speed, driving an LC display, with each character consisting of 48 pixels with 16 bits each, each character takes 77 microseconds to transmit, which is the same order of magnitude as with the RS232 device. If the SPI device only reads, say, one 16-bit value from a sensor, however, busy waiting might be feasible within the overall process schedule.
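
The figures in the two examples above can be reproduced with a few lines of arithmetic. The helper names are hypothetical, and the ten bits per RS232 character are an assumption (8N1 framing: one start bit, eight data bits, one stop bit) that matches the 87 microseconds quoted.

```python
def uart_char_time_us(baud, bits_per_char=10):
    """Time to send one character; 10 bits assumes 8N1 framing."""
    return bits_per_char / baud * 1e6

def spi_transfer_time_us(clock_hz, bits):
    """Time to shift `bits` bits over SPI at the given clock rate."""
    return bits / clock_hz * 1e6

def cycles(duration_us, cpu_hz):
    """Number of processor cycles that pass during `duration_us`."""
    return duration_us * 1e-6 * cpu_hz
```

With these helpers, one RS232 character takes about 86.8 microseconds, six characters about 520.8 microseconds, or roughly 26,042 cycles at 50 MHz. One display character of 48 × 16 bits takes 76.8 microseconds over SPI, while a single 16-bit sensor value takes only 1.6 microseconds, or 80 cycles at 50 MHz.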

As a special case, it might be acceptable to use in-process busy waiting upon system startup, allowing a short time for the hardware (eg. oscillators, releasing sub-system resets) and control program to settle before it is expected to work within the timing constraints.

A real-time kernel should provide facilities and services to avoid in-process polling by busy waiting.
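
Delegating the polling to the scheduler could look like the following sketch. It is in Python with generators as processes, purely for illustration; `FakeUart`, `writer`, `worker` and `run` are hypothetical names, and the fake device simply reports "not ready" for a fixed number of polls after each write. A process yields a condition, and the scheduler resumes it only once that condition holds.

```python
class FakeUart:
    """Hypothetical device: stays busy for `busy_polls` polls after each write."""

    def __init__(self, busy_polls=3):
        self.busy_polls = busy_polls
        self._busy = 0
        self.sent = []

    def ready(self):
        # Simulation trick: every poll consumes one unit of busy time.
        if self._busy > 0:
            self._busy -= 1
            return False
        return True

    def write(self, ch):
        self.sent.append(ch)
        self._busy = self.busy_polls


def writer(uart, text):
    for ch in text:
        yield uart.ready     # ask the scheduler to resume us once the device is ready
        uart.write(ch)

def worker(counts):
    while True:
        counts[0] += 1       # some other useful work
        yield None           # no condition: runnable again on the next pass

def run(processes, passes):
    """Scheduler: polls the condition of each waiting process on its behalf."""
    waiting = {p: None for p in processes}    # process -> condition or None
    for _ in range(passes):
        for p in list(waiting):
            cond = waiting[p]
            if cond is not None and not cond():
                continue                      # still waiting, let the others run
            try:
                waiting[p] = next(p)
            except StopIteration:
                del waiting[p]                # process finished
```

While the writer waits for the device, the worker keeps getting its turn on every pass, so no processor time is spent spinning inside a process.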

Run-time Error Protection and Recovery

By their very nature, real-time control programs run unsupervised, or at least they usually do not require a human operator present at all times. Hence, run-time errors need to be autonomously detected and also dealt with.

We shall use the term run-time error to mean a condition that prevents one or more processes from correctly executing their code, because the explicit or implicit invariants have been violated, rendering the data and computational logic unreliable and unpredictable.

Usual run-time errors include:

computational errors, such as divide by zero, array index out of bounds, nil pointer dereference
resource errors, such as out of memory conditions, failing or disconnected peripheral devices
process synchronisation errors, eg. a deadlock
process lockout errors, eg. a run-away process that never yields control, or a process that is not being scheduled due to overly long processing times in other processes, or a process that has locked itself out due to its internal state conditions

Run-time errors in software can also be caused on the hardware level, eg. if an external device provides faulty data, maybe only spuriously so, in which case a so-called disturbance is encountered. One disturbance does not need to result in a run-time error if caught and handled appropriately by the process, but frequent disturbances should be handled as an error.
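
The distinction between an isolated disturbance and frequent disturbances can be sketched with a sliding window over recent readings. This is a Python illustration under stated assumptions: `DisturbanceMonitor`, `process_readings`, the range check and the choice of replacing a faulty reading by the last good one are all hypothetical, and what counts as "frequent" depends on the application.

```python
from collections import deque

class DisturbanceMonitor:
    """Reports an error once more than `limit` of the last `window` readings were faulty."""

    def __init__(self, limit, window):
        self.limit = limit
        self.recent = deque(maxlen=window)   # flags of the most recent readings

    def record(self, faulty):
        self.recent.append(bool(faulty))
        return sum(self.recent) > self.limit   # True = treat as run-time error


def process_readings(readings, lo, hi, monitor):
    """Tolerate isolated out-of-range readings, raise when they become frequent."""
    accepted = []
    last_good = None                         # stays None until a valid reading arrives
    for r in readings:
        faulty = not (lo <= r <= hi)
        if monitor.record(faulty):
            raise RuntimeError("too many disturbances")
        if not faulty:
            last_good = r
        accepted.append(last_good)           # a disturbance is replaced by the last good value
    return accepted
```

With a limit of one disturbance per four readings, a single spurious value is silently bridged, while a second one within the same window escalates to an error that the recovery mechanism then has to deal with.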

A real-time kernel should provide the means and services to detect run-time errors, and to autonomously recover from them, without human operator interaction. The compiler used for the control program, as well as the hardware, are an integral part of both detection and recovery.
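
As a rough sketch of detection and autonomous recovery for computational errors, the scheduler below catches a failing process, logs the fault, and re-creates the process from scratch. It is written in Python, where exceptions stand in for the traps that a compiler and the hardware would raise on a microcontroller; `run_with_recovery`, `divider` and the factory dictionary are hypothetical. Lockout errors, such as a process that never yields, cannot be caught this way and are not modelled here.

```python
def divider(values, results):
    """Hypothetical process: raises ZeroDivisionError when it reads a zero."""
    while True:
        results.append(100 // values.pop(0))
        yield

def run_with_recovery(factories, passes, log):
    """Round-robin scheduler; a process that fails is logged and restarted.

    `factories` maps a process name to a function that creates a fresh process.
    """
    procs = {name: make() for name, make in factories.items()}
    for _ in range(passes):
        for name, make in factories.items():
            try:
                next(procs[name])
            except Exception as err:
                # Includes a process that terminates unexpectedly (StopIteration).
                log.append((name, type(err).__name__))
                procs[name] = make()          # recover: restart with fresh state
```

The other processes in the roster keep running on their normal turns while the faulty one is restarted. Whether a plain restart is an adequate recovery depends on the application, since the condition that caused the error may still be present.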

System Startup

A control program can initially be started by a human operator. But consider the unsupervised operation thereafter, eg. in the case of a run-time error, or a power outage.

Therefore, a real-time kernel should provide the facilities and services to automatically (re-)start one or more control programs and their processes upon startup.