Among all compiler optimizations studied in the past, register allocation is considered the most important one. It intends to use a processor's registers most efficiently in order to minimize slow main memory accesses. Due to the increasing speed gap between processors and memories, accesses to physical processor registers (PHREGs) are orders of magnitudes faster than memory accesses. However, memory accesses cannot be totally avoided during register allocation, since the amount of temporary variables (aka. virtual registers, VREGs) at a certain place in a program can exceed the number of available PHREGs. In such a situation, spill code is inserted swapping a register out to memory and back.
Currently, register allocators usually decide heuristically where to insert spill code. Due to a lack of precise timing models, the compiler is unaware of the impact of generated spill code on a program's execution time. Thus, badly placed spill code can have a dramatic impact on a program's WCET.
Our WCC compiler is the first one to include techniques for WCET-aware register allocation. The main contributions of our register allocators are that they
- explicitly use WCET data during optimization and that they
- inherently cope with the inherent instability of the Worst-Case Execution Path (WCEP) defining the WCET.
In order to design a WCET-aware register allocator, the worst-case execution frequencies per node in the control flow graph need to be known. Unfortunately, static WCET analysis cannot be applied to the assembly program P serving as input for register allocation to obtain this data. This is because P is not an executable program since it uses VREGs instead of PHREGs. Static WCET analysis relies on executable and thus register-allocated code in order to correctly take the mutual influences of P and the processor hardware into account. Hence, there are cyclic dependencies between register allocation and WCET analysis which need to be broken in order to obtain a WCET-aware register allocator.
Conventional register allocators follow the strategy to keep as many VREGs in PHREGs as possible, and to move a VREG to memory only if this is really necessary. The traditional way of thinking thus assumes optimistically that all VREGs fit into the physical register file and that only exceptionally, a VREG is allocated to memory. This way of thinking is also reflected by the traditional graph coloring algorithm of Briggs, since it first tries to remove all colorable nodes from the interference graph, and only after that, a decision on one single potential spill is taken.
However, this traditional approach is not applicable for a WCET-aware register allocator. The intermediate code produced in the course of all the steps and iterations of traditional graph coloring is not executable and thus, no WCEP can be determined. The first stage where traditional graph coloring produces executable and thus WCET analyzable code is when register allocation is just finished. And at the end of the entire procedure, it does not make sense to reason about WCEPs, since all spill decisions are already taken.
For WCET-aware graph coloring, the opposite way of thinking is proposed here: it is pessimistically assumed that all VREGs are kept in memory. During register allocation, VREGs are thus moved from memory to PHREGs. This approach has the advantage that the intermediate code generated in the course of register allocation is always executable so that WCEPs can be determined.
Our WCET-aware graph coloring allocator bases on the following two key characteristics:
- Due to its instability, the WCEP has to be recomputed regularly. Since it is practically infeasible to recompute the WCEP after each individual spill decision, WCEPs are recomputed after deciding on the allocation and spilling of one single basic block.
- For a given WCEP, that basic block b leading to the highest execution of spill code in the worst case is chosen for allocation. All VREGs v of b are sorted by the number of occurrences of v in b. This precedence list of VREGs is passed to the standard graph coloring register allocator proposed by Briggs selecting that register with least occurrences in the list as potential spill, if necessary.
The above figure shows the impact of our WCET-aware register allocation algorithm on the WCET estimates of 46 different applications from the MRTC, MediaBench, UTDSP and DSPstone benchmark suites. These benchmarks are very different: some of them are quite small filter and sorting routines, others are large and complex audio / video codecs. However, all benchmarks have in common that register pressure within their assembly code is high so that spill code needs to be generated. The above diagram shows the WCET of all benchmarks resulting from our WCET-aware register allocator as a percentage of the WCET resulting from Briggs traditional graph coloring allocator spilling that node with highest degree in the interference graph.
As can be seen, our WCET-aware register allocator is able to reduce the WCETs considerably for all benchmarks. For qurt, the WCET after WCET-aware register allocation is 93.1% of the original WCET, i.e., a WCET reduction of 6.9% was achieved. For all other benchmarks, even higher gains were observed. The largest gain in terms of WCET was achieved for spectral where the WCET after our register allocation amounts to only 24.1% of the original WCET, leading to savings of 75.9%.
On average over all 46 considered benchmarks, a WCET of 68.8% of the original worst-case execution time estimates was obtained, corresponding to a total average WCET reduction of 31.2%.
- A compiler framework for the reduction of worst-case execution times
- WCET-aware Register Allocation based on Graph Coloring
- ILP-based Register Allocation for Worst-Case Execution Time Minimization (in German)
- ILP-based Register Allocation exploiting WCET Data (in German)
- WCET-aware Register Allocation based on Integer-Linear Programming