Overview
This document and the accompanying test program TrapHandlers explore and evaluate the use of synchronous trap exceptions as a means to grant exclusive mutating access to shared data, or other protected functionality or hardware.
The following topics are covered:
- basic exception behaviour, regarding exception priorities and exception numbers;
- using trap handlers to implement exclusive access to protected data and functions;
- passing parameters to trap handlers.
We’ll use module Alarms to trigger hardware-based asynchronous interrupts.
Program Description
Structure
The test program TrapHandlers implements
- three alarm interrupt handlers, which will be triggered with different timing relationships, interrupt priorities, and alarm numbers to evaluate the interrupt behaviour, first the basics, then when triggering trap handlers (TrapHandlers.ah0, TrapHandlers.ah1, and TrapHandlers.ah2);
- a trap exception handler in various implementation variants, which will synchronously be triggered by software, not hardware, both from alarm interrupt handlers and from thread mode code (kernel threads or otherwise), using one of the unwired interrupts on the NVIC (TrapHandlers.th0, TrapHandlers.th1, TrapHandlers.th2, TrapHandlers.th3);
- procedures to trigger the trap exception handlers from code in thread mode (TrapHandlers.tt0, TrapHandlers.tt1).
The trap mechanism is of specific interest with a view to kernel-v2, as well as other similar system modules. This program, however, does not make use of the kernel.
Timing and Pre-caching
The measured times are used to interpret the results, so we want to get them as precise and consistent as possible.
Hence, the test handlers and procedures are pre-cached to avoid any influence of the flash memory caching mechanism. Also, the handlers are coded in-line, without calls to library code, eg. to de-assert the alarm interrupts, to get consistent and comparable results without the need to also pre-cache any library code.
Test Cases
Program module TrapHandlers is furthermore structured into test cases that can be selected in TrapHandlers.run, and which do all the required set-up and selection of code paths in the different procedures.
As an aside, there’s a lot of duplicated code in these test cases, which could be factored out, but I have left it as unwieldy as it is, since there is the advantage that the parameters for each test case are clear and visible in one spot.
Data Recording and Output
The different handlers and procedures collect their run data, which will then be printed by yet another alarm handler, TrapHandlers.pR.
Terminology
I may at times use language such as “ah1 fired at 10250”, in lieu of “the alarm triggering ah1 fired at 10250”, which is not precise, but avoids lengthy wording.
Basic Test Cases
Test Case 0: Baseline
- ah1 fires after the run time of ah0.
- Nothing really interesting here, just to explain the output.
Build and run TrapHandlers, which prints to the serial terminal:
test case: 0
rec id int prio alarm begin p-th t-th end rtm p0 p1
0 ah0 1 2 10000 10000 -- -- 10112 112 - -
1 ah1 0 2 10250 10250 -- -- 10363 113 - -
- rec: run record, ie. the recorded data set in the sequence as collected by the test handlers and procedures;
- id: the id of the test handler or procedure, eg. ah0, ah2, th1, or tt1, referring to their procedure name;
- int: interrupt number, if applicable; note that the alarm number corresponds to the interrupt number;
- prio: interrupt priority, if applicable;
- alarm: alarm trigger time, if applicable;
- begin: time at the start of the handler;
- end: time at the end of the handler;
- rtm: run time, ie. end - begin.
The other output data/columns are not yet relevant for the first series of test cases. We’ll introduce them for the relevant test cases.
All times are in microseconds (us), read from the timer device with the usual caveats.
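To make the column layout concrete, here is a small host-side Python sketch, not part of TrapHandlers and not target code. The name parse_record and the dictionary layout are my own assumptions for illustration. It splits one output line into the twelve columns and lets us check that rtm equals end - begin.

```python
# Host-side helper, not target code: parse one line of the test output.
COLUMNS = ["rec", "id", "int", "prio", "alarm", "begin",
           "p-th", "t-th", "end", "rtm", "p0", "p1"]

def parse_record(line):
    """Split a record line into a dict; '-' and '--' become None."""
    fields = line.split()
    if len(fields) != len(COLUMNS):
        raise ValueError("expected %d columns" % len(COLUMNS))
    record = {}
    for name, text in zip(COLUMNS, fields):
        if text in ("-", "--"):
            record[name] = None       # column not applicable for this record
        elif name == "id":
            record[name] = text       # handler or procedure name
        else:
            record[name] = int(text)  # times in us, numbers, marker values
    return record

# First record of test case 0, copied from the output above.
rec = parse_record("0 ah0 1 2 10000 10000 -- -- 10112 112 - -")
print(rec["id"], rec["end"] - rec["begin"])
```

The same helper copes with the records of later test cases, where the p-th, t-th, p0, and p1 columns carry values.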
Observations:
- With the alarms 250 us apart (at 10000 and 10250) and run times of just above 100 us, we just see the two corresponding handlers ah0 and ah1 triggered and executed independently, and without interaction.
Test Case 1
- ah1 fires during the run time of ah0.
- ah0 and ah1 have the same priority.
test case: 1
rec id int prio alarm begin p-th t-th end rtm p0 p1
0 ah0 1 2 10000 10000 -- -- 10112 112 - -
1 ah1 0 2 10050 10113 -- -- 10227 114 - -
Observations:
- ah1 fires at 10050, but only starts executing at 10113, right after ah0 ends;
- this is consistent with the ARMv6 specs: an interrupt triggered during the run time of another will remain pending, and then be executed right after the running one, if the second interrupt has equal or lower priority (note: lower prio numbers mean higher prio in the hardware).
The RP2040 implements tail-chaining, hence ah1 executes without unstacking after ah0 and stacking again for ah1.
Test Case 2
- ah1 fires during the run time of ah0.
- ah0 and ah1 have different priority.
test case: 2
rec id int prio alarm begin p-th t-th end rtm p0 p1
0 ah0 1 2 10000 10000 -- -- 10228 228 - -
1 ah1 0 1 10050 10050 -- -- 10163 113 - -
Observations:
- ah1 fires at 10050, and immediately starts to execute, preempting ah0, which then finishes after ah1 has terminated;
- this is consistent with the ARMv6 specs: a higher prio interrupt preempts a running lower prio one;
- the state of ah0 is stacked and then unstacked upon entry and exit of ah1, respectively.
Test Case 3
- ah0 and ah1 fire at the same time.
- ah0 and ah1 have the same priority.
test case: 3
rec id int prio alarm begin p-th t-th end rtm p0 p1
0 ah1 0 2 10000 10000 -- -- 10114 114 - -
1 ah0 1 2 10000 10114 -- -- 10228 114 - -
Observations:
- ah1 gets executed before ah0, the latter is executed right after the former;
- this is consistent with the ARMv6 specs: if two interrupts with the same prio become pending at the same time, the one with the lower interrupt number gets executed first.
Due to tail-chaining, ah0 is executed without unstacking after ah1 and stacking again for ah0.
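The rules observed in test cases 1 to 3 can be summarised in a few lines of Python. This is only a simplified host-side model of the arbitration, under the assumption that priority number and interrupt number are all that matter. The function names preempts and next_to_run are made up for this sketch.

```python
# Simplified model of the arbitration rules seen in test cases 1 to 3.
# Lower priority number = higher priority, as in the hardware.

def preempts(new_prio, running_prio):
    """A newly pending exception preempts only with strictly higher priority."""
    return new_prio < running_prio

def next_to_run(pending):
    """Pick from a list of (int_no, prio): best prio first, then lowest int_no."""
    return min(pending, key=lambda e: (e[1], e[0]))

AH0 = (1, 2)  # (interrupt number, priority) as in test cases 1 and 3
AH1 = (0, 2)

print(preempts(new_prio=2, running_prio=2))  # test case 1: ah1 stays pending
print(preempts(new_prio=1, running_prio=2))  # test case 2: ah1 preempts ah0
print(next_to_run([AH0, AH1]))               # test case 3: ah1 goes first
```

The model ignores stacking, tail-chaining, and timing. It only captures who runs next.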
About Traps
Overview
The term trap is borrowed from operating systems concepts and design. It is used to denote access to, and execution of, system level functions, with two main aspects:
- raised execution privilege level to operate low level facilities, including the hardware, and
- exclusive access to mutate protected data, via defined corresponding (API) function calls.
In Oberon RTK, we want to use the trap concept for protecting the kernel data when accessing and executing kernel functions. Other comparable modules can possibly use the same approach.
A related trap concept is used by the Astrobe compiler: a failed run-time check will result in the immediate execution of the SVC exception, which in turn is handled by RuntimeError.errorHandler. As with the trap handlers described here, triggering the error trap is synchronous with the faulty program code.
Raised Privilege Level
Without any operating system support, on an embedded ARM MCU, we usually have two privilege levels, Unprivileged and Privileged (no kidding), as implemented in the hardware. With the Cortex-M0/M0+ (ARMv6), without a corresponding architecture extension [1], all code is always running in privileged mode, hence there’s no need to raise it for access to all register addresses and privileged registers and operations.
Other Cortex-M MCUs (ARMv7) offer unprivileged and privileged execution modes, however with Astrobe all thread mode code always runs privileged, too, unless the programmer changes that: the MCU starts from reset in privileged thread mode, so changing to running unprivileged means explicit measures – default is privileged.
Exception handlers execute in handler mode, all other code in thread mode. An exception handler always executes in privileged mode, as does thread mode code on the M0/M0+, as outlined above, but code running in thread mode on the M3 and above can be privileged or unprivileged. For clarity, thread mode here denotes a mode of the MCU; a kernel thread is a different concept, even though kernel threads usually do run in MCU thread mode [2].
As an aside for now, to add some more fun, there’s another, orthogonal concept: thread mode code can use the main stack (MSP) or the process stack (PSP), handler mode code always uses the main stack. Exception stacking can happen in the process stack or the main stack. We’ll encounter that further down when discussing parameter passing.
Exclusive Protected Data (or Functionality) Access
Access to protected (kernel) data requires synchronisation among kernel threads, as well as with exception handlers. If we access the protected data exclusively via exceptions – the trap handlers –, and ensure exclusivity via a defined exception priority scheme, where all code that needs to mutate the protected data cannot preempt a currently running trap handler, the MCU’s Nested Vectored Interrupt Controller (NVIC) will do the legwork for us in hardware.
Trap (Exception) Handlers
As we have observed with the above test cases, the NVIC implements and enforces clear rules of how different exception handlers interact or interfere with each other, which allows us to implement both aspects of traps, namely raising the privilege level as well as exclusive write access to data, using exceptions, without resorting to other synchronisation methods.
The basic concept is simple and straight-forward:
- Identify
- the protected data: to be protected from being mutated at the same time by different threads or exceptions handlers, as well as
- the requesting or triggering code: the potentially mutating sections of code, usually via specific procedures.
- Trap priority: define an exception priority level where no preemption of the trap handler by any requesting code will occur. If the requesting code runs in thread mode, any trap priority will do, but if it runs in handler mode, the trap priority must be higher than the requesting handler priority.
- Trap handler: implement and configure one or more exceptions handlers to mutate the data, running at the trap priority level. Only the trap handlers are allowed to mutate the protected data.
- Trigger the trap handlers from code in thread or handler mode. This is a synchronous operation with respect to the triggering code, unlike an asynchronous interrupt triggered by the hardware. An asynchronous interrupt can trigger a trap, but from the perspective of its handler this will be a synchronous operation.
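The trap priority rule can be expressed as a simple check. The following Python sketch is a host-side illustration only, and trap_priority_ok is a hypothetical helper name. It assumes, as the rule above states, that the trap priority must be strictly higher (numerically lower) than that of any requesting handler, so that a pended trap runs immediately.

```python
def trap_priority_ok(trap_prio, requester_prios):
    """True if no requesting handler can preempt or delay the trap handler.

    requester_prios holds the priorities of all requesting *handlers*;
    thread mode code needs no entry, since it can never preempt an exception.
    Lower number = higher priority.
    """
    return all(trap_prio < prio for prio in requester_prios)

print(trap_priority_ok(1, [2, 2]))  # trap above two equal-prio requesters
print(trap_priority_ok(1, [3, 2]))  # trap above two different-prio requesters
print(trap_priority_ok(2, [2]))     # same prio: the trap would only stay pending
```

With equal priorities the trap would remain pending until the requester has terminated, which breaks the synchronous behaviour we want.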
Trap Handler Exception
The SVC system exception would be the obvious candidate to use for traps as described here, but Astrobe uses it for run-time error trapping. While this could easily be resolved, ie. using the SVC exception for both trap handlers and run-time error trapping, it’s useful to have run-time error detection also in trap handlers, and SVC exceptions cannot be used within the SVC exception handler [3].
However, of the 32 interrupts of the RP2040 and its NVIC, only 26 are actually wired ({0..25}), so six ({26..31}) can be used for this purpose.
We’ll use interrupt 26 here for the test traps. The trap interrupt is triggered by setting it to pending state from software. Since, according to the concept rules, the trap handler has higher priority than the requesting code, it will execute immediately.
Note the disadvantage of using an interrupt this way: we need two CPU registers to set it to pending, see below.
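As a quick illustration of what gets written to the NVIC's set-pending register, here is a host-side Python sketch of the bit arithmetic. The helper name ispr_mask is an assumption for this example. On the target, one CPU register holds this mask and a second one holds the register address, hence the two registers mentioned above.

```python
TRAP_INT_NO = 26  # first unwired interrupt on the RP2040, as used in the text

def ispr_mask(int_no):
    """Bit mask to write to NVIC_ISPR to set interrupt int_no pending."""
    if not 0 <= int_no <= 31:
        raise ValueError("the RP2040 NVIC has interrupts 0..31")
    return 1 << int_no

# Interrupts 0..25 are wired, the remaining ones are free for traps.
unwired = [n for n in range(32) if n > 25]
print(hex(ispr_mask(TRAP_INT_NO)), unwired)
```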
Test Cases for the Trap Handler Triggered by Handler Mode Code
In the following test cases, both handlers ah0 and ah1 attempt to get access to the protected data, by triggering th0. Both ah0 and ah1 need to be of lower prio than th0, according to the concept rules.
Before triggering the trap interrupt, we store two (randomly chosen) parameter “marker” values in registers R3 and R12, respectively, which are read and registered by the trap handler th0. Together with the timing data, we can then assess if the lockout works, and if the trap handler executions are actually initiated by the corresponding requesting code ah0 or ah1, respectively.
We’ll look at passing parameters to trap handlers later.
Test Case 4
- ah1 is triggered while ah0 runs, but before ah0 triggers th0.
- ah0 and ah1 have the same priority.
test case: 4
rec id int prio alarm begin p-th t-th end rtm p0 p1
0 ah0 1 2 10000 10000 10001 10112 10226 226 13 -13
1 th0 26 1 -- 10113 -- -- 10225 112 13 -13
2 ah1 0 2 10050 10226 10227 10340 10455 229 42 -42
3 th0 26 1 -- 10342 -- -- 10455 113 42 -42
Additional relevant data columns:
- p-th: time of the start to prepare the trap handler (parameters);
- t-th: time just before the trap handler is triggered;
- p0: parameter 0, as passed to the trap handler by ah0 and ah1, and as read by the trap handler th0, respectively;
- p1: parameter 1, analogously.
Observations:
- ah0 fires and starts to run at 10000, triggers th0 at 10112, which starts to run at 10113, and ends at 10225;
- ah1 fires at 10050, but does not preempt ah0 (same prio) and neither th0 (higher prio), so starts to run at 10226, ie. right after th0 as triggered by ah0;
- importantly, each th0 run is uninterrupted, correctly as triggered by the requester ah0 or ah1 (compare the “marker” parameter data), so any data mutations (or other protected functions) by th0 are safe.
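The two claims in the observations, that the th0 runs do not overlap and that each run belongs to the right requester, can be checked mechanically against the recorded data. The Python sketch below does this on the host for the test case 4 output. The function names and the tuple layout are my own choices for illustration.

```python
# (id, begin, end, p0, p1) taken from the test case 4 output above
RECORDS = [
    ("ah0", 10000, 10226, 13, -13),
    ("th0", 10113, 10225, 13, -13),
    ("ah1", 10226, 10455, 42, -42),
    ("th0", 10342, 10455, 42, -42),
]

def trap_runs_disjoint(records, trap_id="th0"):
    """True if no two runs of the trap handler overlap in time."""
    runs = sorted((b, e) for (i, b, e, _, _) in records if i == trap_id)
    return all(prev_end <= begin
               for (_, prev_end), (begin, _) in zip(runs, runs[1:]))

def requester_of(records, trap_index, trap_id="th0"):
    """Find the requester whose run encloses the trap run and whose markers match."""
    _, begin, end, p0, p1 = records[trap_index]
    for rid, b, e, q0, q1 in records:
        if rid != trap_id and b <= begin and end <= e and (q0, q1) == (p0, p1):
            return rid
    return None

print(trap_runs_disjoint(RECORDS))
print(requester_of(RECORDS, 1), requester_of(RECORDS, 3))
```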
Test Case 5
- ah1 is triggered while th0 triggered by ah0 runs.
- ah0 and ah1 have the same priority.
test case: 5
rec id int prio alarm begin p-th t-th end rtm p0 p1
0 ah0 1 2 10000 10000 10001 10112 10226 226 13 -13
1 th0 26 1 -- 10113 -- -- 10225 112 13 -13
2 ah1 0 2 10150 10226 10227 10340 10454 228 42 -42
3 th0 26 1 -- 10341 -- -- 10454 113 42 -42
Observations:
- again, ah1 cannot preempt running th0, as the latter runs at higher prio than the former.
Test Case 6
- ah1 and ah0 are triggered at the same time.
- ah0 and ah1 have the same priority.
test case: 6
rec id int prio alarm begin p-th t-th end rtm p0 p1
0 ah1 0 2 10000 10000 10001 10113 10227 227 42 -42
1 th0 26 1 -- 10114 -- -- 10227 113 42 -42
2 ah0 1 2 10000 10227 10229 10341 10454 227 13 -13
3 th0 26 1 -- 10341 -- -- 10454 113 13 -13
Observations:
- since ah1 is assigned a lower interrupt number, it takes precedence over ah0 with equal priority when triggered at the same time;
- th0 runs accordingly.
Test Case 7
- ah1 is triggered while ah0 runs, but before ah0 triggers th0.
- ah0 and ah1 have different priority.
test case: 7
rec id int prio alarm begin p-th t-th end rtm p0 p1
0 ah0 1 3 10000 10000 10001 10343 10456 456 13 -13
1 ah1 0 2 10050 10050 10051 10164 10279 229 42 -42
2 th0 26 1 -- 10166 -- -- 10278 112 42 -42
3 th0 26 1 -- 10343 -- -- 10456 113 13 -13
Observations:
- ah1 preempts ah0, and triggers “its” trap handler run th0;
- ah0 will continue after th0 triggered by ah1 terminates;
- the state of ah0 is saved and restored by stacking/unstacking of ah1.
Test Case 8
- ah1 is triggered while th0 triggered by ah0 runs.
- ah0 and ah1 have different priority.
test case: 8
rec id int prio alarm begin p-th t-th end rtm p0 p1
0 ah0 1 3 10000 10000 10001 10113 10455 455 13 -13
1 th0 26 1 -- 10113 -- -- 10226 113 13 -13
2 ah1 0 2 10150 10226 10227 10340 10455 229 42 -42
3 th0 26 1 -- 10342 -- -- 10454 112 42 -42
Observations:
- ah1 cannot preempt th0 triggered by ah0.
Test Cases for the Trap Handler Triggered by Thread Mode Code
In the following test cases, both handler ah1 and thread mode code tt0 attempt to get access to the protected data, by triggering th0. Both ah1 and tt0 need to be of lower prio than th0, according to the concept rules. tt0 implicitly has the lowest priority, ie. lower than any exception handler.
This scenario reflects, for example, when we have a kernel thread enabling another (tt0), while the SysTick timer handler does its spiel (ah1), both wanting to mutate the corresponding kernel data.
Test Case 9
- ah1 is triggered while tt0 runs, but before tt0 synchronously triggers th0.
- tt0 is thread mode code and therefore has no priority.
test case: 9
rec id int prio alarm begin p-th t-th end rtm p0 p1
0 tt0 - - -- 10000 10001 10342 10455 455 13 -13
1 ah1 0 2 10050 10050 10051 10164 10279 229 42 -42
2 th0 26 1 -- 10166 -- -- 10278 112 42 -42
3 th0 26 1 -- 10342 -- -- 10455 113 13 -13
Observations:
- ah1 preempts tt0, since its priority is implicitly higher than the thread mode code;
- tt0 will continue after th0 triggered by ah1 terminates;
- the state of tt0 is saved and restored by stacking/unstacking of ah1.
Test Case 10
- ah1 is triggered while th0 triggered by tt0 runs.
- tt0 is thread mode code and therefore has no priority.
test case: 10
rec id int prio alarm begin p-th t-th end rtm p0 p1
0 tt0 - - -- 10000 10000 10111 10454 454 13 -13
1 th0 26 1 -- 10112 -- -- 10224 112 13 -13
2 ah1 0 2 10150 10224 10225 10338 10454 230 42 -42
3 th0 26 1 -- 10340 -- -- 10453 113 42 -42
Observations:
- the running th0 as triggered by tt0 cannot be preempted by ah1, which has to wait until th0 ends.
Test Case 11
- ah1 is triggered and tt0 starts at the same time.
test case: 11
rec id int prio alarm begin p-th t-th end rtm p0 p1
0 ah1 0 2 10000 10000 10001 10114 10229 229 42 -42
1 th0 26 1 -- 10116 -- -- 10228 112 42 -42
2 tt0 - - -- 10230 10230 10341 10454 224 13 -13
3 th0 26 1 -- 10342 -- -- 10454 112 13 -13
Observations:
- basically the same as test case 10, there’s nothing special with starting at the same time when considering a handler and thread mode code.
Passing Parameters to a Trap Handler
Overview
To use a trap handler for system functions, we want to be able to pass parameters. For example, in a generalised set-up, we only have one kernel (or OS) trap, which will execute the requested function based on a corresponding selection parameter. In the kernel, where we use a trap to put a thread on the run-queue, we need to pass along the id of the thread.
An exception handler is a parameterless procedure, whose address is selected from the vector table by the hardware exception mechanism, and “called” by putting that address into the program counter. There’s no concept of parameters. Hence we need to pass the parameters in a different fashion than to non-handler procedures.
Basically, we need to prepare the parameter data in a suitable storage location, before triggering the handler, and the handler will then pick the data up when it runs.
Module Variables vs. Registers and Stack
Parameter storage in module variables is out of the question, since that is not thread-safe. For example, if a thread were to start to prepare the parameter data in module variables, with the intention to then trigger the trap handler, but an exception handler is triggered during this preparation process, which also intends to trigger the same trap handler, and also stores its parameters in the same module variables, the parameters put there by the thread would be corrupted. Of course, we could disable all exceptions during the parameter preparation, but disabling all exceptions should be avoided if there are better alternatives, not least since the whole trap concept attempts to avoid just that. :)
The alternative is to use CPU registers and the stack. If the thread in the above example stores the parameters in registers, the incoming exception would push the values of registers r0 to r3, and r12, onto the stack upon exception entry (stacking), and restore them upon exception exit (unstacking). Hence, storing the parameters in registers protects against incoming exception handlers, without disabling them.
Parameter Pickup by the Trap Handler
The question is: where shall the trap handler pick up its parameters, from the registers directly, or from the stack?
In the above test cases, with the “marker” parameters stored in R3 and R12, the trap handler th0 picked them up directly from the registers. This has worked so far, since we don’t have any exception handlers other than the ones used in our test framework.
This would even work in general, if there weren’t a pesky edge case: if an accepted exception, say the trap handler, is in the process of entry stacking, and another exception becomes pending, and has higher priority than the original one, the higher prio exception gets actually executed first, and only thereafter the original one (late arrival optimisation).
Basically, the higher prio exception usurps the exception execution. Importantly, it runs with the same stacking as the original exception, ie. does not do its own stacking. After the higher prio exception terminates, the original exception runs, but the potential problem is that the higher prio exceptions may have changed the registers’ contents.
Consequently, it’s not safe to pick up the parameters from the registers directly, they need to be read from the stack.
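The edge case is easier to see in a toy model. The Python sketch below is a strongly simplified host-side illustration, not a simulation of the hardware. It assumes that the late-arriving handler overwrites r3 and, having done no stacking of its own, leaves the changed value behind. The name run_trap is made up.

```python
def run_trap(late_arrival):
    """Return (direct, stacked) reads of parameter register r3."""
    regs = {"r3": 13, "r12": -13}  # markers set by the requesting code
    frame = dict(regs)             # hardware stacking for the trap entry
    if late_arrival:
        # The higher prio handler usurps the entry and runs first,
        # on the same stacking, and clobbers the live register.
        regs["r3"] = -99
    direct = regs["r3"]            # trap handler reads the live register
    stacked = frame["r3"]          # trap handler reads the stack frame
    return direct, stacked

print(run_trap(late_arrival=False))
print(run_trap(late_arrival=True))
```

Only the value read from the stack frame is the same in both runs, which is the whole point.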
Test Cases with Parameters
The following test case 12, which shows how an incoming high prio exception can interfere with directly reading the parameters from the registers, is a bit shaky: the high prio exception ah2 has to fire exactly when the stacking for the trap handler th0, triggered by ah0, happens. That’s 15 clock cycles. The register stacking and fetching of the exception vector run purely in hardware, without reading any instructions, so with a 125 MHz clock we have a window of 15 * 8 ns = 120 ns, ie. about 1/8 of a microsecond. Hard to hit with a microseconds timer.
I have fiddled with the while loop in ah0 to get this working, but unfortunately it’s not a stable, repeatable test case, and on a different MCU specimen, it will likely not be the same. Then again, the beauty of falsification is that we only need one failed experiment [4] to topple a theory or concept. :)
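For the record, the window arithmetic from above as a tiny Python calculation. The function name entry_window_ns is just for this sketch, and the 15 cycles and 125 MHz are the figures quoted in the text.

```python
def entry_window_ns(cycles, clock_hz):
    """Duration of `cycles` clock cycles in nanoseconds."""
    return cycles * 1_000_000_000 / clock_hz

window = entry_window_ns(15, 125_000_000)
print(window, "ns =", window / 1000, "us")
```

With a timer resolution of one microsecond, the alarm time can only be placed in steps that are roughly eight times coarser than this window.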
Test Case 12: Read from Registers, ah2 Interfering
- ah0 prepares and triggers th0, while the “trouble maker” ah2 fires exactly when the exception entry for th0 happens.
- ah2 changes one of the parameter registers to -99.
- th0 reads the parameters directly from registers.
test case: 12
rec id int prio alarm begin p-th t-th end rtm p0 p1
0 ah0 1 2 10000 10000 10001 10112 10251 251 13 -13
1 ah2 2 0 10113 10113 -- -- 10138 25 - -
2 th0 26 1 -- 10138 -- -- 10250 112 -99 -13
Observations:
- th0 reads a wrong parameter directly from the register, as “injected” by ah2.
Test Case 13: Read from Stack, ah2 Not Interfering
- Same as test case 12, but:
- Now using th1: read the parameters from the stack.
test case: 13
rec id int prio alarm begin p-th t-th end rtm p0 p1
0 ah0 1 2 10000 10000 10001 10112 10256 256 13 -13
1 ah2 2 0 10113 10113 -- -- 10138 25 - -
2 th1 26 1 -- 10138 -- -- 10256 118 13 -13
Observations:
- th1 reads the correct value from the stack, where it was put by the exception entry stacking for th1.
Before or after the hardware exception entry time window, we just have a normal higher prio exception, with regular stacking and unstacking.
Test Case 14
- ah0 prepares and triggers th1, trouble maker ah2 fires and runs during this preparation.
- ah2 changes one of the parameter registers to -99.
- th1 reads the parameters from the stack.
test case: 14
rec id int prio alarm begin p-th t-th end rtm p0 p1
0 ah0 1 2 10000 10000 10001 10138 10257 257 13 -13
1 ah2 2 0 10050 10050 -- -- 10075 25 - -
2 th1 26 1 -- 10138 -- -- 10256 118 13 -13
Observations:
- Normal stacking and unstacking for ah2 while ah0 sets the parameters for th1, without impact on the interaction of ah0 and th1.
Test Case 15
- ah0 prepares and triggers th1, trouble maker ah2 fires and runs during the execution of th1.
- ah2 changes one of the parameter registers to -99.
- th1 reads the parameters from the stack.
test case: 15
rec id int prio alarm begin p-th t-th end rtm p0 p1
0 ah0 1 2 10000 10000 10001 10112 10257 257 13 -13
1 th1 26 1 -- 10113 -- -- 10256 143 13 -13
2 ah2 2 0 10150 10150 -- -- 10175 25 - -
Observations:
- Normal stacking and unstacking for ah2 while th1 runs, without impact on the interaction of ah0 and th1.
Getting the Parameters from the Stack
When reading the parameters from the stack, we need to pay attention if both the process and the main stack pointers are used, ie. PSP and MSP, respectively. If the PSP is used, the stacking in thread mode happens on the process stack, but the handler itself uses the main stack. If only the MSP is used, or when a handler preempts another, also the stacking happens there.
The value EXC_RETURN
in the link register tells us what’s what.
The code in th1
demonstrates this:
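PROCEDURE th1[0];
  CONST Rv0offset = 12; Rv1offset = 16; LR = 14; SP = 13; PSPflag = 2;
  VAR v0, v1, regAddr: INTEGER;
BEGIN
  IF PSPflag IN BITS(SYSTEM.REG(LR)) THEN
    (* stacking happened on the process stack: read PSP into r3 *)
    SYSTEM.EMIT(MCU.MRS_R03_PSP);
    regAddr := SYSTEM.REG(3)
  ELSE
    (* stacking happened on the main stack: skip locals and pushed lr *)
    regAddr := SYSTEM.REG(SP) + 16
  END;
  (* regAddr now points to the bottom of the stacked registers *)
  SYSTEM.GET(regAddr + Rv0offset, v0); (* stacked R3 *)
  SYSTEM.GET(regAddr + Rv1offset, v1); (* stacked R12 *)
  (* ... *)
END th1;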
- Remember, we use R3 and R12. Their offsets in the stack frame after stacking are Rv0offset = 12 and Rv1offset = 16, respectively.
- Bit 2 in the value EXC_RETURN in the link register tells us if the PSP is used, in which case we’re getting its value from the special register using a SYSTEM.EMIT instruction, which reads it into r3 (MRS instruction, privileged).
- If only the MSP is used, we simply correct the current SP value by 16, accounting for the procedure’s local variables, and for the link register, which has been pushed by the prologue:
. 728 02D8H 0B500H push { lr }
. 730 02DAH 0B083H sub sp,#12
- Now regAddr points to the bottom of the stacked registers for both cases, and we can get the stacked register values at their respective offsets.
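Where the offsets 12 and 16 come from can be shown with a short host-side Python sketch of the exception stack frame layout. The names FRAME, frame_offset, and stacked_on_psp are assumptions for this illustration. The two EXC_RETURN values used are the standard ones for a return to thread mode using the PSP (0FFFFFFFDH) and the MSP (0FFFFFFF9H).

```python
# Registers pushed by the hardware on exception entry, lowest address first.
FRAME = ["r0", "r1", "r2", "r3", "r12", "lr", "pc", "xpsr"]

def frame_offset(reg):
    """Byte offset of a stacked register from the bottom of the frame."""
    return 4 * FRAME.index(reg)

def stacked_on_psp(exc_return):
    """Bit 2 of EXC_RETURN set: the frame was stacked on the process stack."""
    return bool(exc_return & (1 << 2))

print(frame_offset("r3"), frame_offset("r12"), frame_offset("xpsr"))
print(stacked_on_psp(0xFFFFFFFD), stacked_on_psp(0xFFFFFFF9))
```

The whole frame is eight words, ie. 32 bytes, with the stacked xPSR as its last word.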
If we only consider all the requesting code, and if the exception scheme outlined above is adhered to, we could read the registers directly. It’s any other exception outside this framework that can disrupt it, if they are assigned priorities that are equal or higher than the trap priority. Hence, in libraries we had better read the parameters from the stack, as we never know how they will be used in a specific control program [5].
Which Registers Can Be Used?
Registers r0 to r3, and r12
First, let’s look at the registers that are stacked by the hardware during the exception entry sequence.
The test code for all the above test cases uses registers r3 and r12 for the two parameters. r0 to r2 are used by the test code logic, but in a real program, we could arrange this differently, so the question is which registers are available at most.
Since we need to set the trap handler exception to pending, we can start there to explore. Obviously this happens right after the parameters are set, so any register used for this operation is not free for parameters.
Typically:
VAR v: INTEGER
SYSTEM.PUT(MCU.NVIC_ISPR, {v})
. 1162 048AH 09800H ldr r0,[sp]
. 1164 048CH 02101H movs r1,#1
. 1166 048EH 04081H lsls r1,r0
. 1168 0490H 04801H ldr r0,[pc,#4] -> 1176
. 1170 0492H 06001H str r1,[r0]
r0 and r1 are used for setting the trap exception to pending in the NVIC.
Let’s look at setting the registers.
VAR v,w, x, y, z: INTEGER;
SYSTEM.LDREG(12, v);
. 1308 051CH 09800H ldr r0,[sp]
. 1310 051EH 04684H mov r12,r0
SYSTEM.LDREG(3, w);
. 1312 0520H 09801H ldr r0,[sp,#4]
. 1314 0522H 04603H mov r3,r0
SYSTEM.LDREG(2, x);
. 1316 0524H 09802H ldr r0,[sp,#8]
. 1318 0526H 04602H mov r2,r0
SYSTEM.LDREG(1, y);
. 1320 0528H 09803H ldr r0,[sp,#12]
. 1322 052AH 04601H mov r1,r0
SYSTEM.LDREG(0, z);
. 1324 052CH 09804H ldr r0,[sp,#16]
Setting the parameter registers in this order would allow us to use r0 to r3, plus r12 if we’re adventurous (r12 is declared as “reserved” by Astrobe).
But we’re limited by the use of r0 and r1 for pending the trap handler, so only r2, r3, and r12 remain of the ones stacked by the hardware upon exception entry.
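The bookkeeping is simple set arithmetic. The following Python lines spell it out on the host, considering only the general registers among the stacked ones. The variable names are made up for this sketch.

```python
# General registers stacked by the hardware on exception entry.
HW_STACKED = {"r0", "r1", "r2", "r3", "r12"}
# Registers the compiled code uses to set the interrupt pending (see listing).
USED_FOR_PENDING = {"r0", "r1"}

free_for_parameters = sorted(HW_STACKED - USED_FOR_PENDING,
                             key=lambda r: int(r[1:]))
print(free_for_parameters)
```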
Note: using the SVC exception, we could use all registers r0 to r3, since we would use SYSTEM.EMITH to issue the SVC instruction right after preparing the parameters.
Registers r4 to r7
Now let’s look at the registers not stacked by the hardware upon exception entry, namely r4 to r11, of which only r4 to r7 are actually used by the Astrobe for Cortex-M0 compiler. Astrobe for Cortex-M3 and up allocates all registers up to r11.
The Astrobe compiler adds all registers upward from r4 to the push operation in the prologue, and to the pop in the epilogue, in case they are being altered by the code of the exception handler.
Caveat: this does not include any registers set via SYSTEM.LDREG, or via code inserted by SYSTEM.EMIT or SYSTEM.EMITH that modifies any register.
If the handler calls any procedure, all registers r4 to r7 (for M0) are pushed, regardless of whether the code actually alters them.
Put the other way around, we can be sure that registers r4 to r7 will always “survive” any exception, either because they are not altered, or because they are saved and restored. As this happens in software, the above late arrival edge case is not an issue: if the higher prio interrupt gets pending during the stacking of the lower prio one, and gets precedence and takes over, it will save and restore r4 to r7 as needed, or leave them unaltered.
Consequently, we can use r4 to r7 to pass our parameters. In the handler we read them directly from the registers.
However, due to the above caveat regarding SYSTEM.LDREG and SYSTEM.EMIT, if we write an exception handler that will trigger a trap handler, and we put the parameters into r4 to r7, we need to save and restore these registers ourselves. The same is true if the trap handler uses SYSTEM.LDREG or SYSTEM.EMIT to mutate registers r4 to r7.
Registers r8 to r11
The Astrobe for Cortex-M0 compiler never allocates registers r8 to r11. Consequently, they will never be altered by an exception handler (unless we use SYSTEM.LDREG or SYSTEM.EMIT). An exception handler does not even have to save and restore them (M0), but we cannot know if another module gets “creative” about these registers, as we are here, so it’s always better to save and restore.
Therefore, these registers can be used for passing parameters, and the handler can read them directly from the registers.
The compiler allocates registers r8 to r11 for the M3, M4, and M7 MCUs. However, they will be saved and restored by corresponding push and pop operations in case they are altered. So with the same reasoning and conditions as for registers r4 to r7 for the M0 MCU, these registers can be used for passing parameters also for the M3, M4, and M7.
What About Using Only the Stack?
The stack is a memory space that is always preserved by any configuration and combination of exceptions. Could we not save ourselves some conceptual and implementation headaches by passing the parameters for a trap handler on the stack? Any number of parameters could be passed this way, in a unified way, without the need to consider the different register ranges, and without requiring to set, and save/restore any registers in any case.
Here’s a typical piece of code triggering a trap handler with parameters passed in registers r2 and r3.
PROCEDURE tt;
  CONST R2 = 2; R3 = 3; IntNo = 26;
  VAR v0, v1: INTEGER;
BEGIN
  (* determine v0 and v1 *)
  SYSTEM.LDREG(R2, v0);
  SYSTEM.LDREG(R3, v1);
  (* set the trap interrupt to pending *)
  SYSTEM.PUT(MCU.NVIC_ISPR, {IntNo})
END tt;
The handler would then retrieve the parameters from the stack using the code above, from the stack frame created by the trap handler entry stacking.
However, v0 and v1 are already on the stack of tt, at addresses SP + 0 and SP + 4, respectively. In most cases, the parameters for the trap handler are, or can be, set there. Hence, the trap handler can access these local variables on the stack.
As easily as the trap handler can determine the base address of the stacked registers, it can also determine the stack pointer value at the time it was triggered – the corresponding address is right above the stack frame created by the exception stacking, possibly corrected for the double-word (eight bytes) stack alignment.
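The address calculation, including the alignment correction, can be sketched on the host as follows. This is an illustration in Python, not target code. The function name requester_sp and the example addresses are made up, while the alignment flag is bit 9 of the stacked xPSR, which the hardware sets if it inserted a four-byte pad to align the frame.

```python
FRAME_SIZE = 32  # eight stacked words
ALIGN_BIT = 9    # set in the stacked xPSR if a 4-byte alignment pad was inserted

def requester_sp(frame_addr, stacked_xpsr):
    """Stack pointer of the requesting code at the time the trap was triggered."""
    sp = frame_addr + FRAME_SIZE
    if stacked_xpsr & (1 << ALIGN_BIT):
        sp += 4  # skip the pad word above the frame
    return sp

# Requester SP was already double-word aligned: no pad.
print(hex(requester_sp(0x20001000, 0x01000000)))
# Requester SP was only word aligned: pad inserted, bit 9 set.
print(hex(requester_sp(0x20001000, 0x01000200)))
```

The first local variable of the requesting procedure then sits at the returned address, the second one four bytes above it.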
We can even set a rule that the trap handler parameters always be the first variables in the requesting code’s VAR declaration, which is usually a specific procedure. Any position will do, though, we just need to use the right stack pointer offset in the trap handler. Code maintenance will be easier if we follow some rule as outlined above.
Here’s the code to access the parameters on the triggering code’s stack:
PROCEDURE th3[0];
  CONST
    LR = 14; SP = 13; PSPflag = 2; AlignFlag = 9;
    StackFrameSize = 32; PSRoffset = 28;
  VAR v0, v1, regAddr, parAddr: INTEGER;
BEGIN
  (* bit 2 of EXC_RETURN in LR: stacked registers are on the process stack *)
  IF PSPflag IN BITS(SYSTEM.REG(LR)) THEN
    SYSTEM.EMIT(MCU.MRS_R03_PSP);
    regAddr := SYSTEM.REG(3)
  ELSE
    regAddr := SYSTEM.REG(SP) + 16
  END;
  (* the triggering code's stack starts right above the stacked registers *)
  parAddr := regAddr + StackFrameSize;
  (* bit 9 of the stacked PSR: a word was inserted for eight-byte alignment *)
  IF SYSTEM.BIT(regAddr + PSRoffset, AlignFlag) THEN
    INC(parAddr, 4)
  END;
  SYSTEM.GET(parAddr, v0);
  SYSTEM.GET(parAddr + 4, v1);
  (* ... *)
END th3;
The corresponding triggering procedure now simply is:
PROCEDURE tt1;
  CONST R2 = 2; R3 = 3; IntNo = 26;
  VAR v0, v1: INTEGER;
BEGIN
  (* determine v0 and v1 *)
  SYSTEM.PUT(MCU.NVIC_ISPR, {IntNo})
END tt1;
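Passing parameters this way leaves the mechanism somewhat opaque: nothing in the triggering code shows that v0 and v1 are read by the trap handler. The triggering code therefore needs a clear comment.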
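A minimal sketch of what such a comment could look like, assuming the rule that the trap parameters are the first local variables of the triggering procedure. The comment wording is only a suggestion, and the values 13 and -13 are illustrative, chosen to match the test output.

```oberon
PROCEDURE tt1;
  CONST IntNo = 26;
  (* Trap parameters for th3, passed via this procedure's stack: *)
  (*   v0 at SP + 0, v1 at SP + 4 when the trap is triggered.    *)
  (* Keep v0 and v1 as the first two local variables, in this    *)
  (* order, or adjust the offsets used in th3.                   *)
  VAR v0, v1: INTEGER;
BEGIN
  v0 := 13; v1 := -13; (* illustrative values *)
  SYSTEM.PUT(MCU.NVIC_ISPR, {IntNo}) (* set the trap interrupt pending *)
END tt1;
```

The comment ties the variable order to the offsets used in the handler, which is the part a later maintainer is most likely to break unknowingly.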
The following test case uses that method.
Test Case 17
- Thread mode code tt1 sets the parameters as local variables on its stack. th3 reads the parameters from the stack. ah2 changes one of the parameter registers to -99.
test case: 17
rec  id   int  prio  alarm  begin  p-th   t-th   end    rtm  p0  p1
0    tt1  -    -     --     10001  10009  10124  10268  267  13  -13
1    th3  26   1     --     10129  --     --     10267  138  13  -13
2    ah2  2    0     10150  10150  --     --     10175  25   -   -
Observations:
- Normal stacking and unstacking for ah2 while th3 runs, without impact on the interaction of tt1 and th3.
Bottom Line
This document describes the concept and implementation of traps. A trap is an exception, with a corresponding trap handler, used as a means to grant exclusive mutating access to shared data, or other protected functionality or hardware. It also raises the privilege level, which isn’t relevant for the Cortex-M0/M0+, but can be for the M3 and up.
Priority scheme
The concept relies on defining an exception priority scheme, and then letting the NVIC arbitrate the mutual access to the protected data, without the need for other synchronising or lock-out mechanics:
- prio 0: SVC for run-time error handling
- prio 1: trap handlers: mutate protected data (or access other privileged functionality)
- prio 2 or 3: exceptions requesting mutating access to protected data via trap
- no prio: thread code requesting mutating access to protected data via trap
For clarity, exception handlers that do not request access to the protected data can be of any priority. If we want to have run-time error reporting in these exception handlers, they should avoid priority level 0.
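As an illustration of how such a scheme could be set up for the NVIC interrupts, here is a sketch of a priority-setting procedure for the Cortex-M0+. SetIntPrio is a hypothetical helper, not part of TrapHandlers, and the project's library may well offer its own procedure for this. The register address is the architectural address of NVIC_IPR0, written as a literal here because the corresponding constant name in module MCU is not shown in this document.

```oberon
PROCEDURE SetIntPrio(intNo, prio: INTEGER);
  (* hypothetical helper: intNo = 0 .. 31, prio = 0 .. 3 *)
  CONST NVIC_IPR0 = 0E000E400H; (* architectural address, assumption: not taken from MCU *)
  VAR addr, shift: INTEGER; val: SET;
BEGIN
  addr := NVIC_IPR0 + (intNo DIV 4) * 4; (* four interrupts per register, word access *)
  shift := (intNo MOD 4) * 8 + 6;        (* priority is the top two bits of each byte *)
  SYSTEM.GET(addr, val);
  val := val - {shift, shift + 1} + BITS(LSL(prio, shift));
  SYSTEM.PUT(addr, val)
END SetIntPrio;
```

With this helper, SetIntPrio(26, 1) would place the trap handler on interrupt 26 at priority 1. The SVC priority is a system handler priority and is not set through these registers.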
Parameters
We can pass parameters to the trap handler using different methods:
- via r2, r3, and (if adventurous) r12, and read the values from the stack, assuming we use an unwired interrupt, hence r0 and r1 are not available; using SVC, we could also use r0 and r1;
- via r4 to r7, and read the values from the registers
- via r8 to r11, and read the values from the registers
- via the stack only, and read the values from the stack
Or a combination thereof.
Results
- Not shown here, but results of the trap handler could be passed back to the triggering code using the same concepts as for passing parameters.
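A sketch of how a result could be passed back via the stack, reusing the address calculation of th3. The names th4, tt2, res, and ResOffset are hypothetical, and the sketch assumes that the trap handler has run to completion before the triggering code reads the result, which the code itself does not enforce.

```oberon
PROCEDURE th4[0];
  (* hypothetical variant of th3 that also returns a result *)
  CONST
    LR = 14; SP = 13; PSPflag = 2; AlignFlag = 9;
    StackFrameSize = 32; PSRoffset = 28;
    ResOffset = 8; (* third local variable of the triggering procedure *)
  VAR v0, v1, regAddr, parAddr: INTEGER;
BEGIN
  (* same address calculation as in th3 *)
  IF PSPflag IN BITS(SYSTEM.REG(LR)) THEN
    SYSTEM.EMIT(MCU.MRS_R03_PSP);
    regAddr := SYSTEM.REG(3)
  ELSE
    regAddr := SYSTEM.REG(SP) + 16
  END;
  parAddr := regAddr + StackFrameSize;
  IF SYSTEM.BIT(regAddr + PSRoffset, AlignFlag) THEN
    INC(parAddr, 4)
  END;
  SYSTEM.GET(parAddr, v0);
  SYSTEM.GET(parAddr + 4, v1);
  SYSTEM.PUT(parAddr + ResOffset, v0 - v1) (* write the result into the caller's res *)
END th4;

PROCEDURE tt2;
  CONST IntNo = 26;
  VAR v0, v1, res: INTEGER; (* expected at SP + 0, SP + 4, SP + 8 *)
BEGIN
  v0 := 13; v1 := -13; res := 0; (* illustrative values *)
  SYSTEM.PUT(MCU.NVIC_ISPR, {IntNo})
  (* assumption: res holds the result once th4 has run *)
END tt2;
```

The handler keeps the same four local variables as th3, so the offset used to find the stacked registers on the main stack stays unchanged.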
Output Terminal
See Set-up, one-terminal set-up.
Build and Run
Build module TrapHandlers with Astrobe, and create and upload the UF2 file using abin2uf2.
Set Astrobe’s memory options as listed, and the library search path as explained.
Repository
- Also the Cortex-M0/M0+ can offer both levels of privilege with the addition of a corresponding architecture extension, but the RP2040 does not. ↩︎
- We could design and implement a kernel where all threads are executed in handler mode. It’s an interesting concept, which I may one day attempt to implement. ↩︎
- Maybe to be reconsidered. If (or when) the kernel trap handlers become stable, maybe we could get away with the hard fault error messages resulting from using SVC inside an SVC handler. SVC is a nice concept for protected system level functions. ↩︎
- Or successful experiment, depending on the point of view. ↩︎
- Which means the prototype implementation of kernel-v2 at the time of this writing is wrong. One of the motivators for this test program and documenting the results here was to figure all this out. ↩︎