Fault Injection

Let us return to our example. To schedule our first experiment, triggering one false alarm for overload, we first have to count the accesses to state. Under normal circumstances, each loop run performs five read operations (lines 18, 21, 25, 30, 32). We select the third iteration and need to flip the least significant bit right before the 14th access. We prepare the run as shown by the first line of Fig. 1. The output also demonstrates the success of the fault injection, as we notice one message Elevator overloaded. In a golden run, for example, one can count the number of accesses to monitored variables to spot significant mismatches between the expected and the observed number of read accesses. Since the program runs indefinitely under normal conditions, we have to terminate it with Ctrl-C.
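The access count used to schedule the flip can be reproduced with a little arithmetic. The following Python sketch is only an illustration and not part of FITIn; the helper name access_index is hypothetical, and it assumes that the five reads of each loop run occur in the order of the listed source lines, so that the 14th access is the fourth read (line 30) of the third iteration.

```python
# Illustrative helper, not part of FITIn: compute the 1-based index of a
# read access to `state`, assuming every loop run performs the same
# number of reads in a fixed order.

READS_PER_ITERATION = 5  # lines 18, 21, 25, 30, 32 in the example program


def access_index(iteration: int, read_position: int,
                 reads_per_iteration: int = READS_PER_ITERATION) -> int:
    """Return the overall access number of the given read.

    iteration     -- 1-based loop iteration
    read_position -- 1-based position of the read within one iteration
    """
    if iteration < 1:
        raise ValueError("iteration must be >= 1")
    if not 1 <= read_position <= reads_per_iteration:
        raise ValueError("read_position out of range")
    # All reads of the completed iterations, plus the position in this one.
    return (iteration - 1) * reads_per_iteration + read_position


if __name__ == "__main__":
    # Third iteration, fourth read of that iteration.
    print(access_index(3, 4))  # 14
```

If the program performed a different number of reads in some iterations, this closed form would no longer hold, and counting the accesses in a golden run would be the safer way to find the index.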

At the end, FITIn prints additional statistics: the overall number of counted variable accesses resulting from load operations from memory, the number of accesses to user-annotated variables up to the point at which the bit flip took place, and the overall number of instructions executed, excluding those of Valgrind, FITIn, and all function calls outside of the IR.

In the next experiment, we attempt to put the program into an invalid live-lock state: we select the fourth iteration to be the last one working correctly and perform a bit flip before the read in line 21, which is the 22nd access to state; we flip the second bit. We expect the program to print four reports and then remain mute for the rest of its life. After adjusting the parameters, however, we notice that our elevator simply pauses for that second and then continues to operate normally! The reason is that state is apparently reloaded from memory each time. As we perform the bit flip on a temporary representation, the flip does not persist and vanishes in the next iteration.
To persist the flip, we need to call persist_flip(state, {2}) from inside flip_value in the control script. Now, the bit flip successfully induces a live lock: the output stops and we have to terminate the program.
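The difference between a transient and a persisted flip can be illustrated with a toy model. The Python sketch below is not FITIn code and does not use its API. It assumes a variable that is reloaded from memory on every read and is never written by the program itself, and it assumes zero-based bit indices, so the "second bit" is taken to be index 1. All names in it are hypothetical.

```python
# Toy model, not FITIn code: a flip applied to the loaded temporary
# disappears on the next reload, unless it is written back to memory.

READS_PER_ITERATION = 5  # as in the example program


def flip_bit(value: int, bit: int) -> int:
    """Flip the bit with the given zero-based index."""
    return value ^ (1 << bit)


def run(iterations: int, flip_at_access: int, bit: int, persist: bool) -> list:
    """Simulate the reads of `state` and return every value observed."""
    memory = {"state": 0}  # backing store of the monitored variable
    observed = []
    access = 0
    for _ in range(iterations):
        for _ in range(READS_PER_ITERATION):
            access += 1
            temp = memory["state"]  # every read reloads from memory
            if access == flip_at_access:
                temp = flip_bit(temp, bit)  # flip the temporary only
                if persist:
                    memory["state"] = temp  # write the flip back
            observed.append(temp)
    return observed


if __name__ == "__main__":
    transient = run(6, flip_at_access=22, bit=1, persist=False)
    persisted = run(6, flip_at_access=22, bit=1, persist=True)
    # Number of reads that saw a corrupted value in each case.
    print(sum(v != 0 for v in transient))  # 1
    print(sum(v != 0 for v in persisted))  # 9
```

In the transient case only the 22nd read sees the corrupted value, which matches the single hiccup observed above. In the persisted case every read from the 22nd on sees it, which is the precondition for the program getting stuck.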
Compared to the first experiment, the last line of Fig. 2 proves that there has been much more activity after the last output. When working under less deterministic circumstances, we can tell FITIn to stop the execution once a specified limit is exceeded.
In this experiment, we could terminate the elevator prematurely after 105000 instructions by adding the following callback code to the control script: