In the last edition, minor changes were made. In particular, the section on Gödel numbering has been rewritten and a glossary of terms has been added. In the third edition, 2013, the eight chapters on computability theory were complemented by a short introduction to computational complexity theory. The added chapter provides a brief presentation of the central open question in complexity theory, one of the Millennium Prize Problems in mathematics, which asks roughly whether each problem whose solution can be verified in polynomial time can also be solved in polynomial time. The chapter includes the well-known result of Stephen Cook and Leonid Levin that the satisfiability problem is NP-complete, together with its proof from scratch. In the fourth and fifth editions, some small amendments have been made.} } @book{Zimm19a, Author = {Karl-Heinz Zimmermann}, Title = {Curves, Cryptosystems and Quantum Computing.}, Year = {(2019).}, Month = {July}, Note = {khzimmermann, AEG}, Publisher = {Hamburg University of Technology:}, Series = {https://tore.tuhh.de/handle/11420/2890}, Edition = {1.}, Isbn = {10.15480/882.1714}, Howpublished = {19-65 Zimm19 TUBdok}, Abstract = {This book gives an introduction to the theory of elliptic curves and their cryptographic use. In the last part, the principles of quantum computing are introduced and the famous algorithms of Shor and Grover are presented.} } @inproceedings{MLF19, Author = {Kateryna Muts, Arno Luppold and Heiko Falk}, Title = {Compiler-Based Code Compression for Hard Real-Time Systems.}, Year = {(2019).}, Pages = {72-81}, Month = {May}, Note = {kmuts, aluppold, hfalk, multiopt, ESD, WCC}, Series = {201905-scopes-muts.pdf}, Address = {St. Goar / Germany}, Isbn = {10.1145/3323439.3323976}, Howpublished = {19-85 MLF19 SCOPES}, Booktitle = {

} } @inproceedings{MLF18b, Author = {Kateryna Muts, Arno Luppold and Heiko Falk}, Title = {Multi-Objective Optimization for the Compiler of Hard Real-Time Systems.}, Year = {(2018).}, Month = {July}, Note = {kmuts, aluppold, hfalk, multiopt, ESD, WCC}, Address = {Bordeaux / France}, Howpublished = {18-70 MLF18b ISMP}, Booktitle = {

} } @inproceedings{MSF:2017, Author = {Meß, Jan-Gerd and Schmidt, Robert and Fey, Goerschwin}, Title = {Adaptive Compression Schemes for Housekeeping Data.}, Year = {(2017).}, Note = {gfey, CE}, Howpublished = {17-999 MSF:2017 AEROCONF}, Booktitle = {

We tackle this issue by proposing a cache-aware SPM allocation algorithm based on integer-linear programming which accounts for changes in the worst-case cache miss behavior.} } @article{ZBBDMSTVW16, Author = {Daniel Ziener, Florian Bauer, Andreas Becher, Christopher Dennl, Klaus Meyer-Wegener, Ute Schürfeld, Jürgen Teich, Jörg-Stephan Vogt and Helmut Weber}, Title = {FPGA-Based Dynamically Reconfigurable SQL Query Processing.}, Journal = {

} } @inproceedings{OLK16, Author = {Dominic Oehlert, Arno Luppold and Heiko Falk}, Title = {Practical Challenges of ILP-based SPM Allocation Optimizations.}, Year = {(2016).}, Pages = {86-89}, Month = {May}, Note = {doehlert, aluppold, hfalk, ESD, emp2, tacle, WCC}, Series = {20160524-scopes-oehlert.pdf}, Address = {St. Goar / Germany}, Isbn = {10.1145/2906363.2906371}, Howpublished = {16-70 OLK16 SCOPES}, Booktitle = {

We collected open-source programs, adapted them to a common coding style, and provide the collection in open-source. The benchmark collection is called TACLeBench and is available from GitHub in version 1.9 at the publication date of this paper. One of the main features of TACLeBench is that all programs are self-contained without any dependencies on standard libraries or an operating system.} } @article{RHF+:2016, Author = {Heinz Riener and Finn Haedicke and Stefan Frehse and Mathias Soeken and Daniel Große and Rolf Drechsler and Goerschwin Fey}, Title = {metaSMT: Focus On Your Application And Not On Solver Integration.}, Journal = {

} } @inproceedings{TRF:2016, Author = {Niels Thole and Heinz Riener and Goerschwin Fey}, Title = {Equivalence Checking on ESL Utilizing A Priori Knowledge.}, Year = {(2016).}, Note = {gfey, CE}, Howpublished = {16-999 TRF:2016 FDL}, Booktitle = {

} } @article{DF:2015, Author = {Mehdi Dehbashi and Goerschwin Fey}, Title = {Transaction-based online debug for NoC-based multiprocessor SoCs.}, Journal = {

} } @inproceedings{DF:2014f, Author = {Mehdi Dehbashi and Goerschwin Fey}, Title = {Debug Automatisierung für logische Schaltungen unter Zeitvariation mittels Waveforms.}, Year = {(2014).}, Note = {gfey, CE}, Howpublished = {14-999 DF:2014f TUZ}, Booktitle = {

Alignment is the fundamental operation used to compare biological sequences and thereby to identify regions of similarity that may be consequences of structural, functional, or evolutionary relationships. Multiple sequence alignment is an important tool for the simultaneous alignment of three or more sequences. Efficient heuristics exist to cope with this problem.

In the thesis, progressive alignment methods and their parallel implementation on GPUs are studied. More specifically, the dynamic programming algorithms of profile-profile and profile-sequence alignment are mapped onto the GPU. Wavefront and matrix-matrix product techniques, which cope well with the data dependencies, are discussed. The performance of these methods is analyzed. Simulations show that a speed-up of one order of magnitude over the serial version can be achieved.

ClustalW is the most widely used progressive sequence alignment method which aligns more closely related sequences first and then gradually adds more divergent sequences. It consists of three stages: distance matrix calculation, guide tree compilation, and progressive alignment. In this work, the efficient mapping of the alignment stage onto GPU by using a combination of wavefront and matrix-matrix product techniques has been studied.

In the hidden Markov model, the Viterbi algorithm is used to find the most probable sequence of hidden states that has generated the observation. In the thesis, the parallelism exhibited by the compute-intensive tasks is studied and a parallel GPU solution based on the matrix-matrix product method is devised. Moreover, the opportunity to use the optimized BLAS library provided by CUDA is explored. Finally, performance is evaluated by fixing the number of states while varying the number of observations, and vice versa.

At the end, general principles and guidelines for GPU programming of matrix-matrix product algorithms are discussed.} } @article{DuZi14b, Author = {Natalia Dück and Karl-Heinz Zimmermann}, Title = {Gröbner bases for perfect binary codes.}, Journal = {

In this paper, we propose a generalized computational model and a comprehensive explicit path analysis that operates on arbitrary directed control flow graphs. We propose simple and yet effective techniques to deal with unstructured control flows and complex flow fact models. The analysis does not require a control flow graph to be mutable, is non-recursive, fast, and provides the means to compute all worst-case paths from arbitrary source nodes. It is well suited for solving local problems and the computation of partial solutions, which is highly relevant for problems related to scheduling and execution modes alike.} } @inproceedings{MFF:2013b, Author = {Jan Malburg and Alexander Finder and Goerschwin Fey}, Title = {Analyse dynamischer Abhängigkeitsgraphen zum Debugging von Hardwaredesigns.}, Year = {(2013).}, Pages = {59-66}, Note = {gfey, CE}, Howpublished = {13-999 MFF:2013b ZUE}, Booktitle = {

} } @proceedings{SF+:2013, Author = {Lukáš Sekanina and Görschwin Fey and Jaan Raik and Snorre Aunet and Richard Ruzicka (editors)}, Title = {16th IEEE International Symposium on Design and Diagnostics of Electronic Circuits & Systems, DDECS 2013, Karlovy Vary, Czech Republic, April 8-10, 2013.}, Year = {(2013).}, Note = {gfey, CE}, Howpublished = {13-999 SF+:2013 DDECS}, Booktitle = {

In this chapter, we describe the integration of a compiler and a WCET analyzer, yielding our WCET-aware compiler WCC. We are then reconsidering standard compiler optimizations for their potential to reduce the WCET, assuming that the WCET is now used as the cost function. Considered optimizations include function inlining, loop unrolling, loop unswitching, register allocation, scratchpad memory allocation, and cache partitioning. For a set of benchmarks, average WCET reductions of up to 40% were observed. The results indicate that this new area of research has the potential of achieving worthwhile execution time reductions for real-time code.} } @inproceedings{RF:2012, Author = {Heinz Riener and Goerschwin Fey}, Title = {Model-Based Diagnosis versus Error Explanation.}, Year = {(2012).}, Pages = {43-52}, Note = {gfey, CE}, Howpublished = {12-999 RF:2012 MEMOCODE}, Booktitle = {

Nowadays, cryptography plays an ever more important role in information security given the countless scenarios in which information exchange requires different levels of privacy, secrecy or reliability. To this end, cryptographic algorithms based on neural synchronization can be used, since mutual learning leads to synchronization much faster than learning by examples.

In this work, a key exchange protocol based on permutation parity machines has been studied. It has been proved that even though the weights used during each learning step are not strongly correlated, synchronization still occurs. Moreover, the lack of correlation among the weights during the synchronization process makes the key exchange protocol robust not only against common attacks, e.g. simple or geometric attacks, but also against attacks based on non-standard schemes, such as majority, genetic or probabilistic attacks.

Permutation parity machines make use of a more complex learning rule than the tree parity machines, especially due to the process of weight assignment. Nevertheless, the simplicity of the network compensates for the complexity of the learning rule in terms of hardware implementation. Additionally, the use of a permutation network based on a linear feedback shift register helps to reduce considerably the complexity in the assignment of the weights during the learning step.

The key exchange protocol based on permutation parity machines does not require lengthy mathematical calculations and so is suitable for implementation by embedded systems where hardware constraints are decisive. Various hardware implementation alternatives have been considered, including FPGA, RISC MCU, RFID tags and NFC devices.} } @techreport{AEF+12, Author = {Philip Axer, Rolf Ernst, Heiko Falk, Alain Girault, Daniel Grund, Nan Guan, Bengt Jonsson, Peter Marwedel, Jan Reineke, Christine Rochange, Maurice Sebastian, Reinhard von Hanxleden, Reinhard Wilhelm and Wang Yi}, Title = {Building Timing Predictable Embedded Systems.}, Year = {(2012).}, Number = {(#2012-013),}, Month = {July}, Note = {hfalk, ESD, tacle, WCC}, Series = {20120703-report-2012-013-axer.pdf}, Address = {Uppsala / Sweden}, Howpublished = {12-45 AEF+12 Uppsala}, Type = {Technical Report}, School = {Uppsala University}, Institution = {Department of Information Technology}, Abstract = {A large class of embedded systems is distinguished from general purpose computing systems by the need to satisfy strict requirements on timing, often under constraints on available resources. Predictable system design is concerned with the challenge of building systems for which timing requirements can be guaranteed a priori. Perhaps paradoxically, this problem has been made more difficult by the introduction of performance-enhancing architectural elements, such as caches, pipelines, and multithreading, which introduce a large degree of nondeterminism and make guarantees harder to provide. The intention of this paper is to summarize current state-of-the-art in research concerning how to build predictable yet performant systems. We suggest precise definitions for the concept of "predictability", and present predictability concerns at different abstraction levels in embedded software design. First, we consider timing predictability of processor instruction sets. 
Thereafter, we consider how programming languages can be equipped with predictable timing semantics, covering both a language-based approach based on the synchronous paradigm, as well as an environment that provides timing semantics for a mainstream programming language (in this case C). We present techniques for achieving timing predictability on multicores. Finally we discuss how to handle predictability at the level of networked embedded systems, where randomly occurring errors must be considered.} } @inproceedings{BDF+:2012, Author = {Roderick Bloem and Rolf Drechsler and Goerschwin Fey and Alexander Finder and Georg Hofferek and Robert Könighofer and Jaan Raik and Urmas Repinski and André Sülflow}, Title = {FoREnSiC - An Automatic Debugging Environment for C Programs.}, Year = {(2012).}, Pages = {260-265}, Note = {gfey, CE}, Howpublished = {12-999 BDF+:2012 HVC}, Booktitle = {

In this paper, we propose a WCET-aware optimization technique for static I-cache locking which improves a program's performance and predictability. To select the memory blocks to lock into the cache and avoid time consuming repetitive WCET analyses, we developed a new algorithm employing integer-linear programming (ILP). The ILP models the worst-case execution path (WCEP) of a program and takes the influence of locked cache contents into account. By modeling the effect of locked memory blocks on the runtime of basic blocks, the overall WCET of a program can be minimized. We show that our optimization is able to reduce the estimated WCET (abbr. WCETest) of real-life benchmarks by up to 40.8%. At the same time, our proposed approach is able to outperform a regular cache by up to 23.8% in terms of WCETest.} } @inproceedings{FFA+:2012, Author = {Stefan Frehse and Goerschwin Fey and Eli Arbel and Karen Yorav and Rolf Drechsler}, Title = {Complete and Effective Robustness Checking by Means of Interpolation.}, Year = {(2012).}, Pages = {82-90}, Note = {gfey, CE}, Howpublished = {12-999 FFA+:2012 FMCAD}, Booktitle = {

We present a novel cache-aware code positioning optimization driven by worst-case execution time (WCET) information. For this purpose, we introduce a formal cache model based on a conflict graph which is able to capture a broad class of cache architectures. This cache model is combined with a formal WCET timing model, resulting in a cache conflict graph weighted with WCET data. This conflict graph is then exploited by heuristics for code positioning of both basic blocks and entire functions.

Code positioning is able to decrease the accumulated cache misses for a total of 18 real-life benchmarks by 15.5% on average for an automotive processor featuring a 2-way set-associative cache. These cache miss reductions translate to average WCET reductions by 6.1%. For direct-mapped caches, even larger savings of 18.8% (cache misses) and 9.0% (WCET) were achieved.} } @inproceedings{FSS11, Author = {Heiko Falk, Norman Schmitz and Florian Schmoll}, Title = {WCET-aware Register Allocation based on Integer-Linear Programming.}, Year = {(2011).}, Pages = {13-22}, Month = {July}, Note = {hfalk, ESD, WCC}, Series = {20110706-ecrts-falk.pdf}, Address = {Porto / Portugal}, Isbn = {10.1109/ECRTS.2011.10}, Howpublished = {11-85 FSS11 ECRTS}, Booktitle = {

This paper presents an integer-linear programming (ILP) based register allocator that uses precise worst-case execution time (WCET) models. Using this WCET timing data, the compiler avoids spill code generation along the critical path defining a program's WCET. To the best of our knowledge, this paper is the first one to present a WCET-aware ILP-based register allocator. Our results underline the effectiveness of the proposed techniques. For a total of 55 realistic benchmarks, we reduced WCETs by 20.2% on average and ACETs by 14%, compared to a standard graph coloring allocator. Furthermore, our ILP-based register allocator outperforms a WCET-aware graph coloring allocator by more than a factor of two for the considered benchmarks, while requiring less runtime.} } @inproceedings{RBF:2011, Author = {Heinz Riener and Roderick Bloem and Goerschwin Fey}, Title = {Test Case Generation from Mutants using Model Checking Techniques.}, Year = {(2011).}, Pages = {388-397}, Note = {gfey, CE}, Howpublished = {11-999 RBF:2011 MUTATION}, Booktitle = {

In this paper, we propose how the current best techniques for CRPD analysis, which have only been proposed separately and for different aspects of the analysis can be brought together to construct an efficient CRPD analysis with unique properties. Moreover, along the construction, we propose several different enhancements to the methods employed. We also exploit that in a complete approach, analysis steps are synergetic and can be combined into a single analysis pass solving all formerly separate steps at once. In addition, we argue that it is often sufficient to carry out the combined analysis on basic block bounds, which further lowers the overall complexity. The result is a proposal for a fast CRPD analysis of very high accuracy.} } @book{Zimm11, Author = {Karl-Heinz Zimmermann}, Title = {Computability Theory.}, Year = {(2011).}, Month = {July}, Note = {khzimmermann, AEG}, Publisher = {Hamburg University of Technology:}, Series = {20110729-computability-theory-zimmermann.pdf}, Edition = {1.}, Isbn = {10.15480/882.1012}, Howpublished = {11-75 Zimm11 TUBdok}, Abstract = {Why do we need a formalization of the notion of algorithm or effective computation? In order to show that a specific problem is algorithmically solvable, it is sufficient to provide an algorithm that solves it in a sufficiently precise manner. However, in order to prove that a problem is in principle not solvable by an algorithm, a rigorous formalism is necessary that allows mathematical proofs. The need for such a formalism became apparent in the works of David Hilbert (1900) on the foundations of mathematics and Kurt Gödel (1931) on the incompleteness of elementary arithmetic. The first investigations in the field were conducted by the logicians Alonzo Church, Stephen Kleene, Emil Post, and Alan Turing in the early 1930s. They provided the foundation of computability theory as a branch of theoretical computer science. 
The fundamental results established Turing computability as the correct formalization of the informal idea of effective calculation. The results led to Church's thesis stating that "everything computable is computable by a Turing machine". The theory of computability has grown rapidly from its beginning. Its questions and methods are penetrating many other mathematical disciplines. Today, computability theory provides an important theoretical background for logicians and computer scientists. Many mathematical problems are known to be undecidable such as the word problem for groups and semigroups, the halting problem, and Hilbert's tenth problem.} } @inproceedings{SKF+:2011, Author = {Mathias Soeken and Ulrich Kühne and Martin Freibothe and Goerschwin Fey and Rolf Drechsler}, Title = {Towards Automatic Property Generation for the Formal Verification of Bus Bridges.}, Year = {(2011).}, Pages = {417-422}, Note = {gfey, CE}, Howpublished = {11-999 SKF+:2011 DDECS}, Booktitle = {

In this paper, we propose a WCET-driven branch prediction aware optimization which reorders basic blocks of a function in order to reduce the amount of jump instructions and mispredicted branches. We employed a genetic algorithm which rearranges basic blocks in order to decrease the WCET of a program. This enables a first estimation of the possible optimization potential at the cost of high optimization runtimes. To avoid time consuming repetitive WCET analyses, we developed a new algorithm employing integer-linear programming (ILP). The ILP models the worst-case execution path (WCEP) of a program and takes branch prediction effects into account. This algorithm enables short optimization runtimes at slightly decreased optimization results. In a case study, the genetic algorithm is able to reduce the benchmarks' WCET by up to 24.7% whereas our ILP-based approach is able to decrease the WCET by up to 20.0%.} } @inproceedings{FHD+:2011, Author = {Stefan Frehse and Finn Haedicke and Melanie Diepenbeck and Goerschwin Fey and Rolf Drechsler}, Title = {Hochoptimierter Ablauf zur Robustheitsprüfung.}, Year = {(2011).}, Pages = {35-42}, Note = {gfey, CE}, Howpublished = {11-999 FHD+:2011 ZUE}, Booktitle = {

This article presents concepts and infrastructures for WCET-aware code generation and optimization techniques for WCET reduction. Altogether, they help to obtain code explicitly optimized for its worst-case timing, to automate large parts of the real-time software design flow, and to reduce the costs of a real-time system by allowing the use of tailored hardware.} } @article{MBWZ10, Author = {Israel Marck Martinez-Perez, Wolfgang Brandt, Michael Wild and Karl-Heinz Zimmermann}, Title = {Bioinspired Parallel Algorithms for Maximum Clique Problem on FPGA Architectures.}, Journal = {

In this paper, we propose the first adaptive WCET-aware compiler framework for an automatic search of compiler optimization sequences which yield highly optimized code. Besides the objective functions ACET and code size, we consider the worst-case execution time (WCET) which is a crucial parameter for real-time systems. To find suitable trade-offs between these objectives, stochastic evolutionary multi-objective algorithms identifying Pareto optimal solutions are exploited. A comparison based on statistical performance assessments is performed which helps to determine the most suitable multi-objective optimizer. The effectiveness of our approach is demonstrated on real-life benchmarks showing that standard optimization levels can be significantly outperformed.} } @misc{MaFa10, Author = {Peter Marwedel and Heiko Falk}, Title = {Reconciling compilers and timing analysis.}, Year = {(2010).}, Month = {April}, Note = {hfalk, ESD, WCC}, Series = {20100412-cpsweek-industrialworkshop-marwedel-falk.pdf}, Address = {Stockholm / Sweden}, Howpublished = {10-65 MaFa10 CPSWEEK}, Type = {Invited Talk at the CPSWEEK Industrial Workshop,} } @unpublished{DF:2010, Author = {Rolf Drechsler and Goerschwin Fey}, Title = {Formal verification meets robustness checking - techniques and challenges (Tutorial).}, Year = {(2010).}, Note = {gfey, CE}, Howpublished = {10-999 DF:2010} } @inproceedings{FF:2010, Author = {Stefan Frehse and Goerschwin Fey}, Title = {Kompositionelle Formale Robustheitsprüfung.}, Year = {(2010).}, Pages = {73-74}, Note = {gfey, CE}, Howpublished = {10-999 FF:2010 ZUE}, Booktitle = {

This paper extends a graph coloring register allocator such that it uses precise worst-case execution time (WCET) models. Using this WCET timing data, the compiler tries to avoid spill code generation along the critical path defining a program’s WCET. To the best of our knowledge, this paper is the first one to present a WCET-aware register allocator. Our results underline the effectiveness of the proposed techniques. For a total of 46 realistic benchmarks, we reduced WCETs by 31.2% on average. Additionally, the runtimes of our WCET-aware register allocator still remain acceptable.} } @misc{Falk09a, Author = {Heiko Falk}, Title = {Compiler Techniques for hard Real-Time Systems (in German).}, Year = {(2009).}, Month = {April}, Note = {hfalk, ESD, WCC}, Address = {Schloss Dagstuhl / Germany}, Howpublished = {09-70 Falk09a GIBU}, Type = {Invited Talk at the Annual Meeting of University Professors of the German Society of Informatics (GIBU),} } @proceedings{Falk09b, Author = {Heiko Falk (Ed.)}, Title = {Proceedings of the 12th International Workshop on Software & Compilers for Embedded Systems (SCOPES).}, Year = {(2009).}, Month = {April}, Note = {hfalk, ESD}, Publisher = {ACM:}, Address = {Nice / France}, Howpublished = {09-65 Falk09b SCOPES}, Url = {http://www.scopesconf.org/scopes-09} } @inproceedings{FaKl09, Author = {Heiko Falk and Jan C. Kleinsorge}, Title = {Optimal Static WCET-aware Scratchpad Allocation of Program Code.}, Year = {(2009).}, Pages = {732-737}, Month = {July}, Note = {hfalk, ESD, WCC}, Series = {20090730-dac-falk-kleinsorge.pdf}, Address = {San Francisco / USA}, Isbn = {10.1145/1629911.1630101}, Howpublished = {09-50 FaKl09 DAC}, Booktitle = {

This paper presents an optimal static SPM allocation algorithm for program code. It minimizes WCETs by placing the most beneficial parts of a program's code in an SPM. Our results underline the effectiveness of the proposed techniques. For a total of 73 realistic benchmarks, we reduced WCETs by 7.4% on average and by up to 40%. Additionally, the run times of our ILP-based SPM allocator are negligible.} } @article{MaZi09, Author = {Israel Marck Martinez-Perez and Karl-Heinz Zimmermann}, Title = {Parallel bioinspired algorithms for NP-complete graph problems.}, Journal = {

In this paper, we propose a framework for a static loop analysis based on Abstract Interpretation, a theory of a sound approximation of program semantics. To accelerate the analysis, we preprocess the analyzed code using Program Slicing, a technique that removes statements irrelevant for the loop analysis. In addition, we introduce a novel polytope-based loop evaluation that further significantly reduces the analysis time. The efficiency of our loop analyzer is evaluated on a large number of benchmarks. Results show that 99% of the considered loops could be successfully analyzed in an acceptable amount of time. This study points out that our methodology is best suited for real-world problems.} } @incollection{WFG+:2009, Author = {Robert Wille and Goerschwin Fey and Daniel Große and Stephan Eggersglüß and Rolf Drechsler}, Title = {SWORD: A SAT like prover using word level information.}, Year = {(2009).}, Pages = {175-192}, Note = {gfey, CE}, Howpublished = {09-999 WFG+:2009 {VLSI-SOC: ADVANCED TOPICS ON SYSTEMS ON A CHIP}}, Booktitle = {

We present two novel positioning optimizations driven by worst-case execution time (WCET) information to effectively minimize the program’s worst-case behavior. WCET reductions by 10% on average are achieved. Moreover, a combination of positioning and the WCET-driven Procedure Cloning optimization proposed in [14] is presented improving the WCET analysis by 36% on average.} } @inproceedings{LFMT08, Author = {Paul Lokuciejewski, Heiko Falk, Peter Marwedel and Henrik Theiling}, Title = {WCET-Driven, Code-Size Critical Procedure Cloning.}, Year = {(2008).}, Pages = {21-30}, Month = {March}, Note = {hfalk, ESD, WCC}, Series = {20080314-scopes-lokuciejewski-falk.pdf}, Address = {Munich / Germany}, Howpublished = {08-95 LFMT08 SCOPES}, Booktitle = {

In this paper we extend the standard Procedure Cloning optimization by WCET-aware concepts with the objective of improving the tightness of the WCET estimation. Our novel approach is driven by WCET information that successively eliminates code structures leading to overestimated timing results, thus making the code more suitable for the analysis. In addition, the code size increase during the optimization is monitored and large increases are avoided.

The effectiveness of our optimization is shown by tests on real-world benchmarks. After performing our optimization, the estimated WCET is reduced by up to 64.2%, while the employed code transformations yield an additional code size increase of 22.6% on average. In contrast, the average-case performance, the original objective of Procedure Cloning, showed a slight decrease.} } @misc{MaFa08b, Author = {Peter Marwedel and Heiko Falk}, Title = {Memory-architecture aware compilation.}, Year = {(2008).}, Month = {September}, Note = {hfalk, ESD, WCC}, Series = {20080910-artist2-summerschool-marwedel-falk.pdf}, Address = {Autrans / France}, Howpublished = {08-60 MaFa08b ARTIST}, Url = {http://www.artist-embedded.org/docs/Events/2008/Autrans/Videos/Peter_Marwedel_and_Heiko_Falk}, Type = {Lecture at the ARTIST2 Summer School 2008 in Europe,} } @misc{MaFa08a, Author = {Peter Marwedel and Heiko Falk}, Title = {Embedded Systems - with Emphasis on the Exploitation of the Memory Hierarchy.}, Year = {(2008).}, Month = {August}, Note = {hfalk, ESD}, Address = {Seoul / South Korea}, Howpublished = {08-65 MaFa08a AIIT}, Type = {Full-week Tutorial at the Advanced Institute of Information Technology (AIIT),} } @inproceedings{WFM+:2008, Author = {Robert Wille and Goerschwin Fey and Marc Messing and Gerhard Angst and Lothar Linhard and Rolf Drechsler}, Title = {Identifying a Subset of System Verilog Assertions for Efficient Bounded Model Checking.}, Year = {(2008).}, Pages = {542-549}, Note = {gfey, CE}, Howpublished = {08-999 WFM+:2008 DSD_EUROMICRO}, Booktitle = {

DNA Computing Models begins with a comprehensive introduction to the field of DNA computing. This book emphasizes computational methods to tackle central problems of DNA computing, such as controlling living cells, building patterns, and generating nanomachines. DNA Computing Models presents laboratory-scale human-operated models of computation, including a description of the first experiment of DNA computation conducted by Adleman in 1994. It provides molecular-scale autonomous models of computation and addresses the design of computational devices working in living cells. It also addresses the important problem of proper word design for DNA computing.

DNA Computing Models is designed for researchers and advanced-level students in computer science, bioengineering and molecular biology as a reference or secondary textbook. This book is also suitable for practitioners in industry.} } @inproceedings{SFD:2007, Author = {André Sülflow and Goerschwin Fey and Rolf Drechsler}, Title = {Verbesserte SAT basierte Fehlerdiagnose durch Widerspruchanalyse.}, Year = {(2007).}, Note = {gfey, CE}, Howpublished = {07-999 SFD:2007 MBMV}, Booktitle = {

Modern caches can be controlled by software. The software can load parts of its code or of its data into the cache and lock the cache afterwards. Cache locking prevents the cache's contents from being flushed by deactivating the replacement policy. A locked cache is highly predictable and leads to very precise WCET estimates, because the uncertainty caused by the replacement strategy is eliminated completely.

This paper presents techniques exploring the lockdown of instruction caches at compile-time to minimize WCETs. In contrast to the current state of the art in the area of cache locking, our techniques explicitly take the worst-case execution path into account during each step of the optimization procedure. This way, we ensure that exactly those parts of the code that lead to the highest WCET reduction are locked in the I-cache. The results demonstrate that WCET reductions between 54% and 73% can be achieved with an acceptable amount of CPU seconds required for the optimization and WCET analyses themselves.} } @phdthesis{Martinez07, Author = {Israel Marck Martinez-Perez}, Title = {Biomolecular Computing Models for Graph Problems and Finite State Automata.}, Year = {(2007).}, Month = {December}, Note = {AEG}, Address = {Hamburg / Germany}, Howpublished = {07-30 Martinez07 PhD}, Type = {Ph.D. Thesis.}, School = {Hamburg University of Technology}, Institution = {School of Electrical Engineering, Computer Science and Mathematics} } @article{MIZ07, Author = {Israel Marck Martinez-Perez, Zoya Ignatova and Karl-Heinz Zimmermann}, Title = {Computational genes: a tool for molecular diagnosis and therapy of aberrant mutational phenotype.}, Journal = {

This paper presents the effect of procedure cloning applied at the source-code level on the worst-case execution time (WCET). The optimization generates specialized versions of functions being called with constant values as arguments. In the standard literature, it is used to enable further optimizations, such as constant propagation within functions, and to reduce calling overhead.

We show that procedure cloning for WCET minimization leads to significant improvements. Reductions of the WCET between 12% and 95% were measured for real-life benchmarks. These results demonstrate that procedure cloning improves analyzability and predictability of real-time applications dramatically. In contrast, average-case performance, the criterion for which procedure cloning was originally developed, is reduced by only 3% at most. Our results also show that these WCET reductions implied only a small overhead during WCET analysis.} } @misc{MFP+07, Author = {Peter Marwedel, Heiko Falk, Sascha Plazar, Robert Pyka and Lars Wehmeyer}, Title = {Automatic mapping to tightly-coupled memories and cache locking.}, Year = {(2007).}, Month = {November}, Note = {hfalk, ESD, WCC}, Series = {20071126-hipeac-industrialworkshop-marwedel-falk.pdf}, Address = {Cambridge / UK}, Howpublished = {07-35 MFP+07 HiPEAC}, Type = {Invited Talk at the 4th HiPEAC Industrial Workshop on Compilers and Architectures,} } @inproceedings{PFV+07, Author = {Robert Pyka, Christoph Faßbach, Manish Verma, Heiko Falk and Peter Marwedel}, Title = {Operating system integrated energy aware scratchpad allocation strategies for multiprocess applications.}, Year = {(2007).}, Pages = {41-50}, Month = {April}, Note = {hfalk, ESD}, Series = {20070420-scopes-pyka-fassbach.pdf}, Address = {Nice / France}, Isbn = {10.1145/1269843.1269850}, Howpublished = {07-85 PFV+07 SCOPES}, Booktitle = {

Especially loops and if-statements are an inherent source of unpredictability and loss of precision for WCET analysis. This is caused by the difficulty of obtaining safe and tight worst-case estimates of an application's high-level control flow. In addition, assembly-level control flow redirections reduce predictability even more due to complex processor pipelines and branch prediction units.

Loop nest splitting is based on precise mathematical models combined with genetic algorithms. On the one hand, these techniques achieve a significantly more homogeneous control flow structure. On the other hand, the precision of our analyses enables the generation of very accurate high-level flow facts for loops and if-statements. The application of our implemented algorithms to three real-life benchmarks leads to average speed-ups of 25.0% - 30.1%, while the WCET is reduced by 34.0% - 36.3%.} } @inproceedings{FaSc06a, Author = {Heiko Falk and Martin Schwarzer}, Title = {Loop Nest Splitting for WCET-Optimization and Predictability Improvement.}, Year = {(2006).}, Month = {July}, Note = {hfalk, ESD, WCC}, Series = {20060704-wcet-falk-schwarzer.pdf}, Address = {Dresden / Germany}, Isbn = {10.4230/OASIcs.WCET.2006.674}, Howpublished = {06-50 FaSc06a WCET}, Booktitle = {

Especially loops and if-statements of high-level languages are an inherent source of unpredictability and loss of precision for WCET analysis. This is caused by the fact that it is difficult to obtain safe and tight worst-case estimates of an application's flow of control through these high-level constructs. In addition, the corresponding control flow redirections expressed at the assembly level reduce predictability even more due to the complex pipeline and branch prediction behavior of modern embedded processors.

The analysis techniques for loop nest splitting are based on precise mathematical models combined with genetic algorithms. On the one hand, these techniques achieve a significantly more homogeneous structure of the control flow. On the other hand, the precision of our analyses leads to the generation of very accurate high-level flow facts for loops and if-statements. The application of our implemented algorithms to three real-life multimedia benchmarks leads to average speed-ups of 25.0% - 30.1%, while the WCET is reduced by between 34.0% and 36.3%.} } @inproceedings{FWS06, Author = {Heiko Falk, Jens Wagner and André Schaefer}, Title = {Use of a Bit-true Data Flow Analysis for Processor-Specific Source Code Optimization.}, Year = {(2006).}, Pages = {133-138}, Month = {October}, Note = {hfalk, ESD}, Series = {20061027-estimedia-falk-wagner.pdf}, Address = {Seoul / South Korea}, Isbn = {10.1109/ESTMED.2006.321286}, Howpublished = {06-15 FWS06 ESTIMedia}, Booktitle = {

This paper presents techniques for processor-specific code analysis and optimization at the source-level. It is shown how a bit-true data flow analysis is made applicable for source code analysis for the TI C6x DSPs for the very first time. Based on this bit-true analysis, fully automated optimizations superior to conventional pattern matching techniques are presented which optimize saturated arithmetic, reduce bitwidths of variables and exploit SIMD data processing within source codes. The application of our implemented algorithms to complex real-life codes leads to speed-ups between 33% and 48% for the optimization of saturated arithmetic, and of up to 16% after SIMD optimization.} } @inproceedings{FLT06b, Author = {Heiko Falk, Paul Lokuciejewski and Henrik Theiling}, Title = {Design of a WCET-Aware C Compiler.}, Year = {(2006).}, Pages = {121-126}, Month = {October}, Note = {hfalk, ESD, WCC}, Series = {20061027-estimedia-falk-lokuciejewski.pdf}, Address = {Seoul / South Korea}, Isbn = {10.1109/ESTMED.2006.321284}, Howpublished = {06-20 FLT06b ESTIMedia}, Booktitle = {

The concepts for extending the compiler structure are kept very general so that they are not limited to WCET information. Rather, it is possible to use our concepts also for multi-objective optimization of, e.g., best-case execution time (BCET) or energy dissipation.} } @inproceedings{FLT06a, Author = {Heiko Falk, Paul Lokuciejewski and Henrik Theiling}, Title = {Design of a WCET-Aware C Compiler.}, Year = {(2006).}, Month = {July}, Note = {hfalk, ESD, WCC}, Series = {20060704-wcet-falk-lokuciejewski.pdf}, Address = {Dresden / Germany}, Isbn = {10.4230/OASIcs.WCET.2006.673}, Howpublished = {06-45 FLT06a WCET}, Booktitle = {

The concepts for extending the compiler infrastructure are kept very general so that they are not limited to WCET information. Rather, it is possible to use our structures also for multi-objective optimization of e. g. best-case execution time (BCET) or energy dissipation.} } @techreport{MGIZ06, Author = {Israel Marck Martinez-Perez, Zhang Gong, Zoya Ignatova and Karl-Heinz Zimmermann}, Title = {Solving the Maximum Clique Problem via DNA Hairpin Formation.}, Year = {(2006).}, Number = {(06.3),}, Month = {December}, Note = {khzimmermann, AEG}, Series = {200612report-2006-06.3-martinez.pdf}, Address = {Hamburg / Germany}, Isbn = {10.15480/882.254}, Howpublished = {06-10 MGIZ06 Hamburg}, Type = {Technical Report}, School = {Hamburg University of Technology}, Institution = {Computer Engineering Department} } @techreport{MIZ06b, Author = {Israel Marck Martinez-Perez, Zoya Ignatova and Karl-Heinz Zimmermann}, Title = {An Autonomous DNA Model for Stochastic Finite State Automata.}, Year = {(2006).}, Number = {(06.2),}, Month = {May}, Note = {khzimmermann, AEG}, Series = {200605report-2006-06.2-martinez.pdf}, Address = {Hamburg / Germany}, Isbn = {10.15480/882.241}, Howpublished = {06-65 MIZ06b Hamburg}, Type = {Technical Report}, School = {Hamburg University of Technology}, Institution = {Computer Engineering Department} } @techreport{MIZ06a, Author = {Israel Marck Martinez-Perez, Zoya Ignatova and Karl-Heinz Zimmermann}, Title = {An Autonomous DNA Model for Finite State Automata.}, Year = {(2006).}, Number = {(06.1),}, Month = {May}, Note = {khzimmermann, AEG}, Series = {200605report-2006-06.1-martinez.pdf}, Address = {Hamburg / Germany}, Isbn = {10.15480/882.240}, Howpublished = {06-70 MIZ06a Hamburg}, Type = {Technical Report}, School = {Hamburg University of Technology}, Institution = {Computer Engineering Department} } @inproceedings{Tyss06, Author = {Karl Tyss}, Title = {Generatoren für echte Zufallszahlen auf FPGAs für eingebettete Systeme.}, Year = {(2006).}, 
Pages = {4}, Month = {September}, Note = {AEG}, Address = {Kassel / Germany}, Howpublished = {06-30 Tyss06 Krypto}, Booktitle = {

First, source code optimizations are inherently portable, since the optimized source code can be compiled by any compiler for the considered source language. Second, the correctness of source code transformations is often easier to validate, since faster servers can be used due to the portability, instead of slow evaluation hardware or instruction set simulators. Third, source code optimizations are easy to integrate into existing industrial tool chains. Fourth, source code transformations are easier to understand due to their high abstraction level as compared to machine code optimizations.

The examination of the source codes of embedded multimedia applications in this Ph.D. thesis revealed that such software only uses a small fraction of its execution time to compute audio or video data. Most of the execution time is used to evaluate and execute the complex control flow inside these multimedia algorithms. The source code optimizations proposed in this Ph.D. thesis thus focus on the analysis of the control flow of multimedia applications, and on its improvement. All proposed techniques are jointly able to reduce the execution times of realistic applications on average over nine different processor architectures by 69%. Furthermore, cumulative energy reductions by 83% have been observed.} } @inproceedings{FaVe04, Author = {Heiko Falk and Manish Verma}, Title = {Combined Data Partitioning and Loop Nest Splitting for Energy Consumption Minimization.}, Year = {(2004).}, Pages = {137-151}, Month = {September}, Note = {hfalk, ESD}, Series = {20040903-scopes-falk-verma.pdf}, Address = {Amsterdam / The Netherlands}, Isbn = {10.1007/978-3-540-30113-4_11}, Howpublished = {04-85 FaVe04 SCOPES}, Booktitle = {

Source Code Optimization Techniques for Data Flow Dominated Embedded Software is the first contribution focusing on the application of optimizations outside a compiler at the source code level. This book covers the following areas: - Several entirely new techniques are presented in combination with efficient algorithms for the most important ones. - Control flow analysis and optimization of data-dominated applications is one of the main contributions of this book, since this issue had remained open until now. - Using real-life applications, large improvements in terms of runtimes and energy dissipation were achieved by the techniques presented in this book. Detailed results for a broad range of processors including DSPs, VLIWs and embedded RISC cores are discussed.} } @inproceedings{WTSF:2004, Author = {Klaus Winkelmann and Hans-Joachim Trylus and Dominik Stoffel and Goerschwin Fey}, Title = {Cost-efficient Block Verification for a UMTS Up-link Chip-rate Coprocessor.}, Year = {(2004).}, Pages = {162-167}, Note = {gfey, CE}, Howpublished = {04-999 WTSF:2004 DATE}, Booktitle = {

Protein informatics requires substantial prerequisites in computer science, mathematics, and molecular biology. The approach chosen here allows a direct and rapid grasp of the subject, starting from basic knowledge of algorithm design, calculus, linear algebra, and probability theory.

An Introduction to Protein Informatics, a professional monograph, provides the reader with a comprehensive introduction to the field of protein informatics. The text emphasizes mathematical and computational methods to tackle the central problems of alignment, phylogenetic reconstruction, and prediction and sampling of protein structure.

An Introduction to Protein Informatics is designed for a professional audience, composed of researchers and practitioners within bioinformatics, molecular modeling, algorithm design, optimization, and pattern recognition. This book is also suitable as a graduate-level text for students in computer science, mathematics, and biomedicine.} } @inproceedings{WTSF:2003, Author = {Klaus Winkelmann and Hans-Joachim Trylus and Dominik Stoffel and Goerschwin Fey}, Title = {Cost-efficient Formal Block Verification for ASIC Design.}, Year = {(2003).}, Pages = {184-188}, Note = {gfey, CE}, Howpublished = {03-999 WTSF:2003 MBMV}, Booktitle = {

For a detailed evaluation of the benefits of loop nest splitting, the effects of our optimization with respect to instruction pipeline and cache behavior, runtimes, energy consumption and code sizes are shown. The application of our implemented tools for loop nest splitting to three real-life multimedia benchmarks leads to average reductions of pipeline stalls between 19.7% and 64.8% and an average decrease of instruction cache misses between 8.9% and 45.3%. Measurements on a variety of different programmable processors show average speed-ups of the benchmarks between 23.6% and 62.1%, while reductions of energy dissipation between 19.2% and 57.6% are observed.} } @inproceedings{RFM:2002, Author = {Jörg Ritter and Goerschwin Fey and Paul Molitor}, Title = {SPIHT implemented in a XC4000 device.}, Year = {(2002).}, Pages = {239-242}, Note = {gfey, CE}, Howpublished = {02-999 RFM:2002 MWSCAS}, Booktitle = {

In this paper, we propose a common methodology for both LSGP and LPGS based on polyhedral partitionings of large-size k-dimensional systolic arrays which are synthesized from n-dimensional uniform recurrences by linear mappings for allocation and timing. In particular, we address the optimization problem of finding optimal piecewise linear timing functions for small-size arrays. These are mappings composed of linear timing functions for the computations of the subarrays. We study a continuous approximation of this problem by passing from piecewise linear to piecewise quasi-linear timing functions. The resultant problem formulation is then a quadratic programming problem which can be solved by standard algorithms for nonlinear optimization problems.} } @article{ZiAc01, Author = {Karl-Heinz Zimmermann and Wolfgang Achtziger}, Title = {Optimal piecewise linear schedules for LSGP- and LPGS-decomposed array processors via quadratic programming.}, Journal = {

} } @article{BaZi01, Author = {N. Suresh Babu and Karl-Heinz Zimmermann}, Title = {Decoding of linear codes over Galois rings.}, Journal = {

The book is aimed at students and researchers in computer science, mathematics, and electrical engineering, as well as at practitioners in industry.} } @mastersthesis{Falk98, Author = {Heiko Falk}, Title = {Hardware Partitioning for Prototype Boards (in German).}, Year = {(1998).}, Month = {August}, Note = {hfalk, ESD}, Series = {19980812-mscthesis-falk.pdf}, Address = {Dortmund / Germany}, Howpublished = {98-80 Falk Diploma}, Type = {Diploma Thesis}, School = {University of Dortmund}, Institution = {Faculty of Computer Science}, Abstract = {The diploma thesis deals with the problem of partitioning complex hardware designs for several FPGAs during HW/SW co-design. Using genetic algorithms, a partitioning method was realized for Xilinx FPGAs that achieves a higher quality than most other published approaches.} } @article{ZiAc98, Author = {Karl-Heinz Zimmermann and Wolfgang Achtziger}, Title = {On Time Optimal Implementation of Uniform Recurrences onto Array Processors via Quadratic Programming.}, Journal = {