Sable Publications (Technical Reports)
|
AspectMatlab: An Aspect-Oriented Scientific Programming Language
|
(2009-3) |
back
|
Authors:
Toheed Aslam and Jesse Doherty and Anton Dubrau and Laurie Hendren
Date: October 2009
Abstract
This paper introduces a new aspect-oriented programming language,
AspectMatlab. Matlab is a dynamic scientific programming
language that is commonly used by scientists because of its
convenient and high-level syntax for arrays, the fact that type
declarations are not required, and the availability of a rich
set of application libraries. AspectMatlab introduces key
aspect-oriented features in a way that is both accessible to
scientists and where the aspect-oriented features concentrate
on array accesses and loops, the core computation elements in
scientific programs. Introducing aspects into a dynamic
language such as Matlab also provides some new challenges.
In particular, it is difficult to statically determine precisely
where patterns match, resulting in many dynamic checks in the
woven code. Our compiler includes flow analyses which are used
to eliminate many of those dynamic checks. This paper reports
on the language design of AspectMatlab, the amc compiler
implementation and related optimizations, and also provides an
overview of use cases that are specific to scientific
programming.
View
the report (.pdf)
BibTeX
entry (.bib)
|
Understanding Method Level Speculation
|
(2009-2) |
back
|
Authors:
Christopher J. F. Pickett and Clark Verbrugge and Allan Kielstra
Date: September 2009
Abstract
Method level speculation (MLS) is an optimistic technique for
parallelizing imperative programs, for which a variety of MLS
systems and optimizations have been proposed. However, runtime
performance strongly depends on the interaction between program
structure and MLS system design choices, making it difficult to
compare approaches or understand in a general way how programs
behave under MLS. Here we develop an abstract list-based model of
speculative execution that encompasses several MLS designs, and a
concrete stack-based model that is suitable for implementations.
Using our abstract model, we show equivalence and correctness for a
variety of MLS designs, unifying in-order and out-of-order execution
models. Using our concrete model, we directly explore the execution
behaviour of simple imperative programs, and show how specific
parallelization patterns are induced by combining common programming
idioms with precise speculation decisions. This basic groundwork
establishes a common basis for understanding MLS designs, and
suggests more formal directions for optimizing MLS behaviour and
application.
View
the report (.pdf)
BibTeX
entry (.bib)
|
Adaptive Software Return Value Prediction
|
(2009-1) |
back
|
Authors:
Christopher J. F. Pickett and Clark Verbrugge and Allan Kielstra
Date: June 2009
Abstract
Return value prediction (RVP) is a useful technique that enables a
number of program optimizations and analyses. Potentially high
overhead, however, as well as a dependence on novel hardware support
remain significant barriers to method level speculation and other
applications that depend on low cost, high accuracy RVP. Here we
investigate the feasibility of software-based RVP through a
comprehensive software study of RVP behaviour. We develop a
structured framework for RVP design and use it to experimentally
analyze existing and novel predictors in the context of non-trivial
Java programs. As well as measuring accuracy, time, and memory
overhead, we show that an object-oriented adaptive hybrid predictor
model can significantly improve performance while maintaining high
accuracy levels. We consider the impact on speculative
parallelism, and show further application of RVP to program
understanding. Our results suggest that software return value
prediction can play a practical role in further program optimization
and analysis.
View
the report (.pdf)
BibTeX
entry (.bib)
|
A Stochastic Approach to Instruction Cache Optimization
|
(2008-4) |
back
|
Authors:
Maxime Chevalier-Boisvert and Clark Verbrugge
Date: December 2008
Abstract
The memory alignment of executable code can have significant yet hard to predict effects on
the run-time performance of programs, even in the presence of existing aggressive optimization.
We present an investigation of two different approaches to instruction cache optimization—the
first is an implementation of a well-known and established compile-time heuristic, and the second
is our own stochastic, genetic search algorithm based on active profiling. Both algorithms were
tested on 7 real-world programs and speed gains of up to 10% were observed. Our results show
that, although more time consuming, our approach yields higher average performance gains than
established heuristic solutions. This suggests there is room for improvement, and moreover that
stochastic and learning-based optimization approaches like our own may be worth investigating
further as a means of better exploiting hardware features.
View
the report (.pdf)
BibTeX
entry (.bib)
|
Memory Abstractions for Speculative Multithreading
|
(2008-3) |
back
|
Authors:
Christopher J. F. Pickett and Clark Verbrugge and Allan Kielstra
Date: September 2008
Abstract
Speculative multithreading is a promising technique for automatic
parallelization. However, our experience with a software
implementation indicates that there is significant overhead involved
in managing the heap memory internal to the speculative system and
significant complexity in correctly managing the call stack memory
used by speculative tasks. We first describe a specialized heap
management system that allows the data structures necessary to support
speculation to be quickly recycled. We take advantage of
constraints imposed by our speculation model to reduce the complexity
of this otherwise general producer/consumer memory problem. We
next provide a call stack abstraction that allows the local variable
requirements of in-order and out-of-order nested method level
speculation to be expressed and hence implemented in a relatively
simple fashion. We believe that heap and stack memory management
issues are important to efficient speculative system design. We
also believe that abstractions such as ours help provide a framework
for understanding the requirements of different speculation
models.
View
the report (.pdf)
BibTeX
entry (.bib)
|
Enabling Static Analysis for Partial Java Programs
|
(2008-2) |
back
|
Authors:
Barthélémy Dagenais and Laurie Hendren
Date: April 2008
Abstract
Software engineering tools often deal with the source code of programs retrieved
from the web or source code repositories. Typically, these tools only have
access to a subset of the programs' source code (one file or a subset of files)
which makes it difficult to build a complete and typed intermediate
representation (IR). Indeed, for incomplete object-oriented programs, it is not
always possible to completely disambiguate the syntactic constructs and to
recover the declared type of certain expressions because the declaration of many
types and class members are not accessible.
We present a framework that performs partial type inference and uses heuristics
to recover the declared type of expressions and resolve ambiguities in partial
Java programs. Our framework produces a complete and typed IR suitable for
further static analysis. We have implemented this framework and used it in an
empirical study on four large open source systems which shows that our system
recovers most declared types with a low error rate, even when only one class is
accessible.
View the report (.pdf)
Download the report (.ps)
BibTeX entry (.bib)
|
Transforming Timeline specifications into automata for runtime monitoring (extended version)
|
(2008-1) |
back
|
Authors:
Eric Bodden and Hans Vangheluwe
Date: February 2008
Abstract
In runtime monitoring, a programmer specifies
code to execute whenever a sequence of events occurs during program execution.
Previous and related work has shown that runtime monitoring techniques can be
useful in order to validate or guarantee the safety and security of running
programs.
Those techniques have however not been incorporated in
everyday software development processes.
One problem that hinders industry adoption is that the required
specifications use a cumbersome, textual notation.
As a consequence, only verification experts, not programmers, can
understand what a given specification means and in particular,
whether it is correct.
In 2001, researchers at Bell Labs proposed the Timeline formalism.
This formalism was designed with ease of use in mind,
for the purpose of static verification (and not, as in our work,
for runtime monitoring).
In this article, we describe how software safety specifications
can be described visually in the Timeline formalism and subsequently
transformed into finite automata suitable for runtime monitoring,
using our meta-modelling and model transformation tool \At.
The synthesized automata are subsequently fed into an
existing monitoring back-end that
generates efficient runtime monitors for them.
Those monitors can then automatically be applied to Java programs.
Our work shows that the transformation of Timeline models to automata is
not only feasible in an efficient and sound way but also
helps programmers identify correspondences between the original
specification and the generated monitors.
We argue that visual specification of safety criteria and subsequent
automatic synthesis of runtime monitors will help users reason about the
correctness of their specifications on the one hand and effectively
deploy them in industrial settings on the other hand.
View the report (.pdf)
Download the report (.ps)
BibTeX entry (.bib)
|
Bytecode Testing Framework for SableVM Code-copying Engine
|
(2007-9) |
back
|
Authors:
Gregory B. Prokopski, Etienne M. Gagnon and Christian Arcand
Date: November 2007
Abstract
Code-copying is a very attractive, simple-JIT, highly portable fast bytecode execution technique with
implementation costs close to a classical interpreter. The idea is to reuse, at Virtual Machine (VM)
runtime, chunks of VM binary code, copy them, concatenate and execute together at a greater speed.
Unfortunatelly code-copying makes anwarranted assumptions about the code generated by the compiler
used to compile the VM. It is therefore necessary to find which VM code chunks can be safely used
after being copied and which can not. Previously used manual testing and assembly analysis were highly
ineffective and error prone. In this work we present a test suite, Bytecode Testing Framework (BTF) that
contains a comprehensive set of fine-grained Java assembly tests. These tests are then used in conjunction
with SableVM JVM testing mode to semi-automatically detect one-by-one which code chunks are safe
to use. With this techunique we ported SableVMâs code-copying engine to several architectures and had
external contributors with no VM knowledge port SableVM to new architectures within as little as one
hour. The presented Bytecode Testing Framework greatly improved reliability and ease of porting of
SableVM's code-copying engine, thus its practical usability.
View the report (.pdf)
Download the report (.ps)
BibTeX entry (.bib)
|
Instance keys: A technique for sharpening whole-program pointer analyses with intraprocedural information
|
(2007-8) |
back
|
Authors:
Eric Bodden, Patrick Lam and Laurie Hendren
Date: September 2007
Abstract
Pointer analyses enable many subsequent program analyses and
transformations, since they enable compilers to statically
disambiguate references to the heap. Extra precision enables pointer
analysis clients to draw stronger conclusions about
programs. Flow-sensitive pointer analyses are typically quite precise.
Unfortunately, flow-sensitive pointer analyses are also often too expensive
to run on whole programs. This paper therefore describes a technique
which sharpens results from a whole-program flow-insensitive points-to
analysis using two flow-sensitive intraprocedural analyses: a
must-not-alias analysis and a must-alias analysis. The main technical
idea is the notion of instance keys, which enable client
analyses to disambiguate object references. We distinguish two kinds of keys:
weak and strong. Strong instance keys guarantee that
different keys represent different runtime objects, allowing
instance keys to almost transparently substitute for runtime objects in static
analyses. Weak instance keys may represent multiple runtime objects, but
still enable disambiguation of compile-time values.
We assign instance keys based on the results of our analyses. We found that
the use of instance keys greatly simplified the design and implementation of
subsequent analysis phases.
View
the report (.pdf)
Download
the report (.ps)
BibTeX
entry (.bib)
|
Compiler-guaranteed safety in code-copying VMs
|
(2007-7) |
back
|
Authors:
Gregory B. Prokopski, Clark Verbrugge
Date: September 2007
Abstract
Virtual Machine authors face a difficult choice: to settle for low performance, cheap interpreter, or to
write a specialized and costly compiler. One of the methods to bridge the gap between these two distant
solutions is to use the existing code-copying technique that reuses chunks of VMâs binary code creating a
simple JIT. While simple in principle this technique is not reliable without a compiler that can guarantee
that copied chunks are functionally equivalent, which is often not the case due to aggressive optimizations.
We present a proof-of-concept, minimal-impact modification of a highly optimizing compiler,
GCC. It allows a VM programmer to mark specific chunks of VM source code as copyable. The chunks
of native code resulting from compilation of the marked source become addressable and self-contained.
Chunks can be safely copied at VM runtime, concatenated and executed together. With minimal impact
on compiler maintenance we guarantee the necessary safety and correctness properties of chunks. This
allows code-copying VMs to safely achieve performance improvement up to 200%, 67% average, over
direct interpretation. ensured thanks to chunks integrity verification. This maintanable enhancement
makes the code-copying technique reliable and thus practially usable.
View
the report (.pdf)
Download
the report (.ps)
BibTeX
entry (.bib)
|
Partial Program Analysis
|
(2007-6) |
back
|
Authors:
Barthélémy Dagenais
Date: September 2007
Abstract
Software engineers often need to perform static analysis on a subset of a program source
code. Unfortunately, static analysis frameworks usually fail to complete their analyses be-
cause the definitions of some types used within a partial program are unavailable and the
complete program type hierarchy cannot be reconstructed. We propose a technique, Partial
Program Analysis, to generate and infer type facts in incomplete Java programs to allow static
analysis frameworks to complete their analyses. We describe the various levels of inference
soundness our technique provides, and then, we cover the main algorithm and type inference
strategies we use. We conclude with a detailed case study showing how our technique can
provide more precise type facts than standard parsers and abstract syntax tree generators.
View
the report (.pdf)
Download
the report (.ps)
BibTeX
entry (.bib)
|
Arithmetic Coding revealed - A guided tour from theory to praxis
|
(2007-5) |
back
|
Authors:
Eric Bodden, Malte Clasen and Joachim Kneis
Date: May 2007
Abstract
This document is an updated and translated version of the German paper
Arithmetische Kodierung from 2002.
It tries to be a comprehensive guide to the art of arithmetic coding.
First we give an introduction to the mathematic principles involved. These build the
foundation for the next chapters, where we describe the encoding and decoding
algorithms for different numerical systems. Here we also mention various problems one can
come across as well as solutions for those.
This is followed by a proof of uniqueness and an estimation of the efficiency of the
algorithm. In the end we briefly mention different kinds of statistical models, which are
used to actually gain compression through the encoding. Throughout this paper we
occasionally make some comparisons to the related Huffman encoding algorithm. Though, some
rudimentary knowledge about Huffman encoding should suffice for the reader to follow the
line of reasoning.
This paper is mainly based on the works on Sayood and Bell. On the latter we
base our implementation which is included in the appendix as full C++ source code. We also
make use of parts of this code during some of our examples. The mathematical model we use,
however, is strongly based on Sayood and Fano. Also we employ the well-known
Shannon-Theorem, which proofs the entropy to be the bound of feasible
lossless compression.
View
the report (.pdf)
Download
the report (.ps)
BibTeX
entry (.bib)
|
Phase-based adaptive recompilation in a JVM
|
(2007-4) |
back
|
Authors:
Dayong Gu and Clark Verbrugge
Date: May 2007
Abstract
Modern JIT compilers often employ multi-level recompilation
strategies as a means of ensuring the most used code is also the most
highly optimized, balancing optimization costs and expected future
performance. Accurate selection of code to compile and
level of optimization to apply is thus important to performance. In this paper
we investigate the effect of an improved recompilation strategy for
a Java virtual machine. Our design makes use of a lightweight, low-level
profiling mechanism to detect high-level, variable length phases in
program execution. Phases are then used to guide adaptive
recompilation choices, improving performance. We develop both an offline implementation
based on trace data and a self-contained online version.
Our offline study shows an average speedup of 8.5% and up to 21%, and our online
system achieves an average speedup of 4.5%, up to 18%.
We subject our results to extensive analysis and show that our
design achieves good overall performance with high consistency
despite the existence of many complex and interacting factors in such
an environment.
View
the report (.pdf)
Download
the report (.ps)
BibTeX
entry (.bib)
|
Component-Based Lock Allocation
|
(2007-3) |
back
|
Authors:
Richard L. Halpert and Christopher J.F. Pickett and Clark Verbrugge
Date: May 2007
Abstract
The choice of lock objects in concurrent programs can affect both
performance and correctness, a burden of complexity for programmers.
Recently, various automated lock allocation and assignment techniques
have been proposed, each aiming primarily to minimize the number of
conflicts between critical sections. However, practical performance
depends on a number of important factors, including the nature of
concurrent interaction, the accuracy of the program analyses used to
support the lock allocation, and the underlying machine hardware. We
introduce component-based lock allocation, which starts by
analysing data dependences and automatically assigns lock objects with
tunable granularity to groups of interfering critical sections.
Our experimental results show that while a single global lock is
usually suboptimal, high accuracy in program analysis is not always
necessary to achieve generally good performance. Our work provides
empirical evidence as to sufficient strategies for efficient lock
allocation, and details the requirements of an adaptable
implementation.
View
the report (.pdf)
BibTeX
entry (.bib)
|
Using hardware data to detect repetitive program behavior
|
(2007-2) |
back
|
Authors:
Dayong Gu and Clark Verbrugge
Date: March 2007
Abstract
Detecting repetitive ``phases'' in program execution is helpful for
program understanding, runtime optimization, and for reducing
simulation/profiling workload. The nature of the phases that may be
found, however, depend on the kinds of programs, as well how programs
interact with the underlying hardware. We present a technique to
detect long term and variable length repetitive program behaviour by
monitoring microarchitecture-level hardware events. Our approach results
in high quality repetitive phase detection; we propose quantitative
methods of evaluation, and show that our design accurately calculates
phases with a 92% ``confidence'' level. We further validate our design
through an implementation and analysis of phase-driven, runtime
profiling, showing a reduction of about 50% of the profiling workload
while still preserving 94% accuracy in the profiling results. Our work
confirms that it is practical to detect high-level phases from
lightweight hardware monitoring, and to use such information to improve
runtime performance.
View
the report (.pdf)
Download
the report (.ps)
BibTeX
entry (.bib)
|
libspmt: A Library for Speculative Multithreading
|
(2007-1) |
back
|
Authors:
Christopher J.F. Pickett and Clark Verbrugge and Allan Kielstra
Date: March 2007
Abstract Speculative multithreading (SpMT), or thread level
speculation (TLS), is a dynamic parallelisation technique that uses
out-of-order execution and memory buffering to achieve speedup. The
complexity of implementing software SpMT results in long development
lead times, and the lack of reuse between different software SpMT
systems makes comparisons difficult. In order to address these
problems we have created libspmt, a new language independent library
for speculative multithreading. It provides a virtualization of
speculative execution components that are unavailable in commercial
multiprocessors, and enables method level speculation when fully
integrated. We have isolated a clean and minimal library interface,
and the corresponding implementation is highly modular. It can
accommodate hosts that implement garbage collection, exceptions, and
non-speculative multithreading. We created libspmt by refactoring a
previous implementation of software SpMT for Java, and by aiming for
modularity have exposed several new opportunities for optimization.
View
the report (.pdf)
BibTeX
entry (.bib)
|
Obfuscating Java: the most pain for the least gain
|
(2006-5) |
back
|
Authors: Micheal Batchelder and Laurie Hendren
Date: June 2006
Abstract
Software obfuscators are used to transform code so that it becomes more
difficult to understand and harder to reverse engineer. Obfuscation of
Java programs is particularly important since Java's binary form, Java
bytecode, is relatively high-level and susceptible to high-quality
decompilation. The objective of our work is to develop and study
obfuscation techniques that produce obfuscated bytecode that is very
hard to reverse engineer while at the same time not significantly
degrading performance.
We present three kinds of obfuscation techniques that: (1) obscure intent
at the operational level; (2) obfuscate program structure
such as complicating control flow and confusing object-oriented design;
and (3) exploit the semantic gap between what is allowed at the
Java source level and what is allowed at the bytecode level.
We have implemented all of our techniques as a tool called JBCO (Java
Byte Code Obfuscator), which is built on top of the Soot framework. We
developed a benchmark suite for evaluating our obfuscations and we
examine runtime performance, control flow graph complexity and the
negative impact on decompilers. These results show that most of the
obfuscations do not degrade performance significantly and many increase
complexity, making reverse engineering more difficult. The impact on
decompilers was two-fold. For those obfuscations that can be
decompiled, readability is greatly reduced. Otherwise, the decompilers
fail to produce legal source code or crash completely.
View the report(.pdf)
View the report(.ps.gz)
|
Metrics for Measuring the Effectiveness of Decompilers and
Obfuscators
|
(2006-4) |
back
|
Authors: Nomair A. Naeem, Micheal Batchelder and Laurie Hendren
Date: June 2006
Abstract
Java developers often use decompilers to aid reverse engineering and
obfuscators to prevent reverse engineering. Decompilers translate
low-level
class files to Java source and often produce ``good" output.
Obfuscators
rewrite class files into semantically-equivalent class files that are
either: (1) difficult to decompile, or (2) decompilable, but
produce ``hard-to-understand" Java source.
This paper presents a set of metrics we have developed to quantify
the
effectiveness of decompilers and obfuscators. The metrics include
some
selective size and counting metrics, a expression complexity metric
and
a metric for evaluating the complexity of identifier names.
We have applied these metrics to evaluate a collection of
decompilers. By
comparing metrics of the original Java source and the decompiled
source we can
show when the decompilers produced ``good" code (as compared to the
original
code). We have also applied the metrics on the code produced by
obfuscators
and show how the metrics indicate that the obfuscators produced
``hard-to-understand" code (as compared to the unobfuscated code).
Our work provides a first step in evaluating the effectiveness of
decompilers
and obfuscators and we plan to apply these techniques to evaluate new
decompiler and obfuscator tools and techniques.
View the report(.pdf)
View the report(.ps.gz)
|
Improving the Compiling Speed of the AspectBench Compiler
|
(2006-3) |
back
|
Authors: Jingwu Li and Laurie Hendren
Date: August 2006
Abstract
The AspectBench Compiler (abc) is an extensible AspectJ compiler and built based on Polyglot and Soot. It generates optimized Java bytecode which can run faster in many cases when compared with ajc. However, the compilation speed of abc can be substantially slower than ajc. Typically it is roughly 4 times slower than ajc. By using various profiling tools, we observed some bottlenecks during the process of compiling AspectJ source code. We modified some data structures related to forward and backward flow analysis and changed some algorithms such as the variable name generator and around inlining to remove the bottlenecks or relieve the severity of those bottlenecks. The experimental results on selected benchmarks shows that the combined modifications reduce overall by 8% compilation time as compared with original abc. The speed up is especially notable for the benchmarks with around advice applied many places. At the same time, we also reduce the class file size for the benchmarks with around advices applied many places greatly, around 28%.
View the report(.pdf)
View the report(.ps)
|
Programmer-friendly Decompiled Java
|
(2006-2) |
back
|
Authors: Nomair A. Naeem and Laurie Hendren
Date: March 2006
Abstract
Java decompilers convert Java class files to Java source. Java class files
may be created by a number of different tools including standard Java
compilers, compilers for other languages such as AspectJ, or other tools
such as optimizers or obfuscators. There are two kinds of Java decompilers,
javac-specific decompilers that assume that the class file was
created by a standard javac compiler and
tool-independent decompilers that can decompile arbitrary class files,
independent of the tool that created the class files. Typically
javac-specific decompilers produce more readable code, but they
fail to decompile many class files produced by other tools.
This paper tackles the problem of how to make a tool-independent
decompiler, Dava, produce Java source code that is programmer-friendly.
In past work it has been shown that Dava can decompile arbitrary
class files, but often the output, although correct, is very different
from what a programmer would write and is hard to understand.
Furthermore, tools like obfuscators
intentionally confuse the class files and this also leads
to confusing decompiled source files.
Given that Dava already produces correct Java abstract syntax trees (ASTs)
for arbitrary class files, we provide a new back-end for Dava. The
back-end rewrites the ASTs to semantically equivalent
ASTs that correspond to code that is easier
for programmers to understand. Our new back-end includes a new
AST traversal framework, a set of simple pattern-based transformations,
a structure-based data flow analysis framework and a collection of more
advanced AST transformations that use flow analysis information. We
include several illustrative examples including the use of advanced
transformations to clean up obfuscated code.
View the report(.pdf)
View the report(.ps)
BibTeX entry
|
A Survey of Phase Analysis: Techniques, Evaluation and Applications
|
(2006-1) |
back
|
Authors: Dayong Gu and Clark Verbrugge
Date: March 2006
Abstract
Most programs exhibit significant repetition of behaviour, and
detecting program phases is an increasingly
important aspect of dynamic, adaptive program optimizations.
Phase-based optimization can also reduce profiling overhead,
save simulation time, and help in program understanding.
Using a novel state space characterization, we give an an overview of
state-of-the-art phase detection and prediction techniques.
We use this systematic exposition to situate our own long-range
phase analysis, and evaluate our approach using a novel phase
prediction evaluation metric. Our results show that phases with
relatively long periods can be predicted with
very high accuracy. This new technique, as well as the
unexplored regions in our high level characterization suggest
that further improvements and variations on program phase analysis
are both possible and potentially quite useful.
View the report(.pdf)
View the report(.ps)
BibTeX entry
|
Assessing the Impact of Optimization in Java Virtual Machines
|
(2005-4) |
back
|
Authors: Dayong Gu, Clark Verbrugge and Etienne M. Gagnon
Date: November 2005
Abstract
Many new Java runtime optimizations report relatively small,
single-digit performance improvements. On modern virtual and actual
hardware, however, the performance impact of an optimization
can be influenced by a variety of factors in the underlying systems.
Using a
case study of a new garbage collection optimization in two different
Java virtual machines, we show the various issues that must be taken
into consideration when claiming an improvement. We examine the
specific and overall performance changes due to our optimization and
show how unintended side-effects can contribute to, and distort the
final assessment. Our experience shows that VM and hardware concerns
can generate variances of up to 9.5% in whole program execution time.
Consideration of these confounding effects is critical to a good
understanding of Java performance and optimization.
View the report(.pdf)
View the report(.ps)
BibTeX entry
|
Dynamic Shape and Data Structure Analysis in Java
|
(2005-3) |
back
|
Authors: Sokhom Pheng and Clark Verbrugge
Date: October 2005
Abstract
Analysis of dynamic data structure usage is useful for both program
understanding and for improving the accuracy of other program
analyses. Static shape analysis techniques, however, suffer from
reduced accuracy in complex situations. We have designed and
implemented a dynamic shape analysis system that allows one to examine
and analyze how Java programs build and modify data structures. Using
a complete execution trace from a profiled run of the program, we
build an internal representation that mirrors the evolving runtime
data structures. The resulting series of representations can then be
analyzed and visualized, and we show how to use our approach to help
understand how programs use data structures, the precise effect of
garbage collection, and to establish limits on static techniques for
shape analysis. A deep understanding of dynamic data structures is
particularly important for modern, object-oriented languages that make
extensive use of heap-based data structures.
View the report
(.pdf)
View the report
(.ps)
|
Context-sensitive points-to analysis: is it worth it?
|
(2005-2) |
back
|
Authors: Ondřej Lhoták and Laurie Hendren
Date: October 2005
Abstract
We present the results of an empirical study evaluating the precision
of subset-based points-to analysis with several variations of context
sensitivity on Java benchmarks of significant size. We compare the use
of call site strings as the context abstraction, object sensitivity,
and the BDD-based context-sensitive algorithm proposed by Zhu and
Calman, and by Whaley and Lam. Our study includes analyses that
context-sensitively specialize only pointer variables, as well as
ones that also specialize the heap abstraction. We measure both
characteristics of the points-to sets themselves, as well as effects
on the precision of client analyses. To guide development of efficient
analysis implementations, we measure the number of contexts, the number
of distinct contexts, and the number of distinct points-to sets that
arise with each context sensitivity variation. To evaluate precision, we
measure the size of the call graph in terms of methods and edges, the
number of devirtualizable call sites, and the number of casts statically
provable to be safe.
The results of our study indicate that object-sensitive analysis
implementations are likely to scale better and more predictably than the
other approaches; that object-sensitive analyses are more precise than
comparable variations of the other approaches; that specializing the
heap abstraction improves precision more than extending the length of
context strings; and that the profusion of cycles in Java call graphs
severely reduces precision of analyses that forsake context sensitivity
in cyclic regions.
View the report
(.pdf)
View the report
(.ps)
|
Speculative Multithreading in a Java Virtual Machine
|
(2005-1) |
back
|
Authors:
Christopher J.F. Pickett and Clark Verbrugge
Date: March 2005
Abstract
Speculative multithreading (SpMT) is a dynamic program
parallelisation technique that promises dramatic speedup of
irregular, pointer-based programs as well as numerical, loop-based
programs.
We present the design and implementation of software-only SpMT for
Java at the virtual machine level. We take the full Java language
into account and we are able to run and analyse real world
benchmarks in reasonable execution times on commodity multiprocessor
hardware.
We provide an experimental analysis of benchmark behaviour,
uncovered parallelism, the impact of return value prediction,
processor scalability, and a breakdown of overhead costs.
View
the report (.pdf)
View
the presentation slides (.pdf)
BibTeX
entry (.bib)
|
Adding trace matching to AspectJ
|
(abc-2005-1) |
back
|
Authors:
Chris Allan, Pavel Avgustinov, Aske Simon Christensen, Laurie Hendren,
Sascha Kuzins, Ondřej Lhoták, Oege de Moor, Damien Sereni,
Ganesh Sittampalam, and Julian Tibble
Date: March 2005
Abstract
An aspect observes the execution of a base program; when certain actions occur, the aspect runs some extra code of its own. In the AspectJ language, the observations that an aspect can make are confined to the current action: it is not possible to directly observe the history of a computation.
Recently there have been several interesting proposals for new history-based language features, most notably by Douence et al. and also by Walker and Viggers. In this paper we present a new history-based language feature called tracematches, where the programmer can trigger the execution of extra code by specifying a regular pattern of events in a computation trace. We have fully designed and implemented tracematches as a seamless extension to AspectJ.
A key innovation in our tracematch approach is the introduction of free variables in the matching patterns. This enhancement enables a whole new class of applications where events can be matched not only by the event kind, but also by the values associated with the free variables. We provide several examples of applications enabled by this feature.
After introducing and motivating the idea of tracematches via examples, we present a detailed semantics of our language design, and we derive an implementation from that semantics. The implementation has been realised as an extension of the abc compiler for AspectJ.
View the report
(.pdf)
View the report
(.ps)
|
Optimising AspectJ
|
(abc-2004-3) |
back
|
Authors:
Pavel Avgustinov,
Aske Simon Christensen,
Laurie Hendren,
Sascha Kuzins,
Jennifer Lhoták,
Ondřej Lhoták,
Oege de Moor,
Damien Sereni,
Ganesh Sittampalam, and
Julian Tibble
Date: November 2004
Abstract
AspectJ, an aspect-oriented extension of Java, is becoming increasingly
popular. However, not much work has been
directed at optimising compilers for AspectJ.
Optimising AOP languages provides many new and interesting
challenges for compiler writers, and this paper identifies and
addresses three such challenges.
First, compiling around advice efficiently is particularly challenging.
We provide a new code generation strategy for around
advice, which (unlike previous implementations) both avoids
the use of excessive inlining and the use of closures. We show it
leads to more compact code, and can also improve run-time
performance. Second, woven code sometimes includes run-time tests
to determine whether advice should execute. One important
case is the cflow pointcut which uses information about the
dynamic calling context. Previous techniques for cflow
were very costly in terms of both time and space. We
present new techniques to minimize or eliminate the overhead of
cflow using both intra- and inter-procedural analyses.
Third, we have addressed the general problem of how to structure an
optimising compiler so that traditional analyses can be
easily adapted to the AOP setting.
We have implemented all of the techniques in this paper
in abc, our AspectBench Compiler for AspectJ, and
we demonstrate significant speedups with empirical results.
Some of our techniques have already been integrated into the production
AspectJ compiler, ajc 1.2.1.
View the report
(.pdf)
View the report
(.ps)
|
Code Layout as a Source of Noise in JVM Performance
|
(2004-8) |
back
|
Authors:
Dayong Gu, Clark Verbrugge, and Etienne Gagnon
Date: October 2004
Abstract
We describe the effect of a particular form of "noise" in
benchmarking. We investigate the source of anomalous measurement data
in a series of optimization strategies that attempt to improve data
cache performance in the garbage collector of a Java virtual
machine. The results of our experiments can be explained in terms of
the difference in code positioning, and hence instruction and data
cache behaviour. We show that unintended changes in code positioning
due to code modifications as trivial as symbol renaming can contribute
up to 2.7% of measured machine cycle cost, 20% in data cache misses,
and 37% in instruction cache misses.
View the report
(.pdf)
BibTeX entry
(.bib)
|
An Algorithmic Approach for Precise Analysis of the pi-Calculus
|
(2004-7) |
back
|
Authors:
Sam B. Sanjabi and Clark Verbrugge
Date: October 2004
Abstract
We present an algorithmic approach to dataflow
analysis of the pi-calculus that unifies previous such analyses under
a single framework. The structure of the algorithms presented
lead to a dissociation of correctness proofs from the analysis itself,
and this has led to the development of an algorithm that improves the
accuracy of previous approaches by considering the operation of all
process operators including sequencing and replication. Our
analysis solution is also independent of any concurrent context in
which the analyzed process runs.
View the report
(.pdf)
View the report
(.ps)
BibTeX entry
(.bib)
|
Compiler Analyses for Improved Return Value Prediction
|
(2004-6) |
back
|
Authors:
Christopher J.F. Pickett and Clark Verbrugge
Date: October 2004
Abstract
Speculative method-level parallelism has been shown to benefit from
return value prediction. In this paper we propose and analyse two
compiler analyses designed to improve the cost and performance of a
hybrid return value predictor in a Java virtual machine setting. A
return value use analysis determines which return values are consumed,
and enables us to eliminate 2.6% of all non-void dynamic method calls
in the SPEC JVM98 benchmarks as prediction candidates. An extension
of this analysis detects return values used only inside boolean and
branch expressions, and allows us to safely substitute incorrect
predictions for 14.1% of return values at runtime, provided all future
use expressions are satisfied correctly. We find an average 3.2%
reduction in memory and up to 7% increase in accuracy. A second,
interprocedural parameter dependence analysis reduces the number of
inputs to a memoization-based sub-predictor for 10.2% of all method
calls, and the combination of both analyses reduces overall memory
requirements by 5.3%.
View
the report (.pdf)
View
the presentation slides (.pdf)
BibTeX
entry (.bib)
|
Building the abc AspectJ compiler with Polyglot and Soot
|
(abc-2004-2) |
back
|
Authors:
Pavel Avgustinov,
Aske Simon Christensen,
Laurie Hendren,
Sascha Kuzins,
Jennifer Lhoták,
Ondřej Lhoták,
Oege de Moor,
Damien Sereni,
Ganesh Sittampalam, and
Julian Tibble
Date: October 2004
Abstract
Aspect-oriented programming and the development of aspect-oriented
languages is rapidly gaining momentum, and the advent of this new kind
of programming language provides interesting challenges for compiler
developers. Aspect-oriented language features require new compilation
approaches, both in the frontend semantic analysis and in the backend
code generation. This paper is about the design and implementation of
the abc compiler for the aspect-oriented language AspectJ.
One important contribution of this paper is to show how we can leverage
existing compiler technology by combining Polyglot (an extensible
compiler framework for Java) in the frontend and Soot (a framework for
analysis and transformation of Java) in the backend. In particular,
the extra semantic checks needed to compile AspectJ are implemented
by extending and augmenting the compiler passes provided by Polyglot.
All AspectJ-specific language constructs are then extracted from
the program to leave a pure Java program that can be compiled using
the existing Java code generation facility of Soot. Finally, all
code generation and transformation is performed on Jimple, Soot's
intermediate representation, which allows for a clean and convenient
method of applying aspects to source code and class files alike.
A second important contribution of the paper is that we describe our
implementation strategies for the new challenges that are specific to
aspect-oriented language constructs. Although our compiler is targeted
towards AspectJ, many of these ideas apply to aspect-oriented languages
in general.
Our abc compiler implements the full AspectJ language as defined by ajc
1.2.0 and is freely available under the GNU LGPL.
View the report
(.pdf)
View the report
(.ps)
|
Using inter-procedural side-effect information in JIT optimizations
|
(2004-5) |
back
|
Authors: Anatole Le, Ondřej Lhoták, and Laurie Hendren
Date: October 2004
Abstract
Inter-procedural analyses such as side-effect analysis can provide
information useful for performing aggressive optimizations. We present
a study of whether side-effect information improves performance in
just-in-time (JIT) compilers, and if so, what level of analysis
precision is needed.
We used Spark, the inter-procedural analysis component of the Soot Java
analysis and optimization framework, to compute side-effect information
and encode it in class files. We modified Jikes RVM, a research JIT,
to make use of side-effect analysis in local common sub-expression
elimination, heap SSA, redundant load elimination and loop-invariant
code motion. On the SpecJVM98 benchmarks, we measured the static number
of memory operations removed, the dynamic counts of memory reads
eliminated, and the execution time.
Our results show that the use of side-effect analysis increases the
number of static opportunities for load elimination by up to 98%, and
reduces dynamic field read instructions by up to 27%. Side-effect
information enabled speedups in the range of 1.08x to 1.20x for some
benchmarks. Finally, among the different levels of precision of
side-effect information, a simple side-effect analysis is usually
sufficient to obtain most of these speedups.
View the report
(.pdf)
View the report
(.ps)
|
A Study of Type Analysis for Speculative Method Inlining in a JIT Environment
|
(2004-4) |
back
|
Authors: Feng Qian and Laurie Hendren
Date: October 2004
Abstract
Method inlining is one of the most important optimizations for JIT
compilers in Java virtual machines. In order to increase the number of
inlining opportunities, a type analysis can be used to identify
monomorphic virtual calls. In a JIT environment, the compiler and
type analysis must also handle dynamic class loading properly because
class loading can invalidate previous analysis results and invalidate
some speculative inlining decisions. To date, a very simple type
analysis, class hierarchy analysis (CHA), has been used successfully
in JIT compilers for speculative inlining with invalidation techniques
as backup.
This paper seeks to determine if more powerful dynamic type analyses
could further improve inlining opportunities in a JIT compiler. To
achieve this goal we developed a general dynamic type analysis
framework which we have used for designing and implementing dynamic
versions of several well-known static type analyses, including CHA,
RTA, XTA and VTA.
Surprisingly, the simple dynamic CHA is nearly as good as an ideal
type analysis for inlining virtual method calls. There is little
room for further improvement. On the other hand, only
a reachability-based interprocedural type analysis (VTA) is able to
capture the majority of monomorphic interface calls.
We also found that memory overhead is the biggest issue with dynamic
whole-program analyses. To address this problem, we outline the
design of a demand-driven interprocedural type analysis for inlining
hot interface calls.
View the report
(.pdf)
View the report
(.ps)
|
abc: An extensible AspectJ compiler
|
(abc-2004-1) |
back
|
Authors:
Pavel Avgustinov,
Aske Simon Christensen,
Laurie Hendren,
Sascha Kuzins,
Jennifer Lhoták,
Ondřej Lhoták,
Oege de Moor,
Damien Sereni,
Ganesh Sittampalam, and
Julian Tibble
Date: September 2004
Abstract
Research in the design of aspect-oriented programming languages
requires a workbench that facilitates easy experimentation with
new language features and implementation techniques. In particular,
new features for AspectJ have been proposed that require extensions
in many dimensions: syntax, type checking and code generation, as well as
data flow and control flow analyses.
The AspectBench Compiler (abc) is an implementation of such a workbench.
The base version of abc implements the full AspectJ language.
Its frontend is built, using the Polyglot framework, as a modular
extension of the Java language. The use of Polyglot
gives flexibility of syntax and type checking.
The backend is built using the Soot framework, to give modular code
generation and analyses.
In this paper, we outline the design of abc, focusing mostly on how
the design supports extensibility. We then provide a general overview of how
to use abc to implement an extension. Finally,
we illustrate the extension mechanisms of abc through a number of
small, but non-trivial, examples. abc is freely available under
the GNU LGPL.
View the report
(.pdf)
View the report
(.ps)
|
A Practical MHP Information Analysis for Concurrent Java Programs
|
(2004-3) |
back
|
Authors:
Lin Li and Clark Verbrugge
Date: June 2004
Abstract
In this paper we present an implementation of May Happen in
Parallel analysis for Java that attempts to address some of the
practical implementation concerns of the original work. We describe
a design that incorporates techniques for aiding a feasible
implementation and expanding the range of acceptable inputs.
We provide experimental results showing the utility and impact
of our approach and optimizations using a variety of concurrent
benchmarks.
View the report
(.pdf)
Download the
report (.ps.gz)
|
Measuring the Dynamic Behaviour of AspectJ Programs
|
(2004-2) |
back
|
Authors:
Bruno Dufour, Christopher Goard, Laurie Hendren and Clark Verbrugge (McGill University)
Oege de Moor and Ganesh Sittampalam (Oxford University)
Date: March 2004
Abstract
This paper proposes and implements a rigorous method for studying the
dynamic behaviour of AspectJ programs. As part of this methodology
several new metrics specific to AspectJ programs are proposed and
tools for collecting the relevant metrics are presented. The major
tools consist of: (1) a modified version of the AspectJ compiler that tags
bytecode instructions with an indication of the cause of their generation,
such as a particular feature of AspectJ; and (2) a modified version of the
*J dynamic metrics collection tool which is composed of a JVMPI-based
trace generator and an analyzer which propagates tags and
computes the proposed metrics. This dynamic propagation is essential,
and thus this paper contributes not only
new metrics, but also non-trivial ways of computing them.
We furthermore
present a set of benchmarks that exercise a wide range of AspectJ's
features, and the metrics that we measured on these
benchmarks. The results provide guidance to AspectJ users on how to avoid
efficiency pitfalls, to AspectJ implementors on promising
areas for future optimization, and to tool builders on ways to
understand runtime behaviour of AspectJ.
View the report
(.pdf)
Download the
report (.ps.gz)
|
Precise, Partially Compositional Analysis of the pi-Calculus
|
(2004-1) |
back
|
Authors:
Sam Sanjabi and Clark Verbrugge
Date: February 2004
Abstract We present a new algorithm for computing
control flow analysis on the pi-calculus. Our approach is strictly
more accurate then existing algorithms, while maintaining a polynomial
running time. Our algorithm is also partially compositional, in the
sense that the representational structure that results from analyzing a
process can be efficiently reused in an analysis of the same process in
another context.
View the report
(.pdf)
Download the
report (.ps.gz)
|
Jedd: A BDD-based relational extension of Java
|
(2003-7) |
back
|
Authors: Ondřej Lhoták and Laurie Hendren
Date: November 2003
Abstract
In this paper we present Jedd, a language extension to Java that supports
a convenient way of programming with Binary Decision Diagrams (BDDs).
The Jedd language abstracts BDDs as database-style relations and operations
on relations, and provides static type rules to ensure that relational
operations are used correctly.
The paper provides a description of the Jedd language and reports on the
design and implementation of the Jedd translator and associated runtime
system. Of particular interest is the approach to assigning attributes
from the high-level relations to physical domains in the underlying BDDs, which
is done by expressing the constraints as a SAT problem and using a modern
SAT solver to compute the solution. Further, a runtime system is
defined that handles memory management issues and supports a browsable
profiling tool for tuning the key BDD operations.
The motivation for designing Jedd was to support the development of whole
program analyses based on BDDs, and we have used Jedd to express five
key interrelated whole program analyses in our Soot compiler framework.
We provide some examples of this application and discuss our experiences
using Jedd.
View the report
(.pdf)
Download the
report (.ps.gz)
|
Problems in Objectively
Quantifying Benchmarks using Dynamic Metrics
|
(2003-6) |
back
|
Authors: Bruno Dufour, Laurie Hendren and Clark Verbrugge
Date: October 2003
Abstract
Basic problems encountered when trying to accurately and reasonably
measure dynamic properties of a program are discussed. These include
problems in determining and assessing specific, desirable metric
qualities that may be perturbed by subtle and unexpected program
behaviour, as well as technical limitations on data collection. Some
interpreter or Java-specific problems are examined, and the general
issue of how to present metrics is also discussed.
View the report
(.pdf)
Download the
report (.ps.gz)
|
Towards Dynamic Interprocedural Analysis in JVMs
|
(2003-5) |
back
|
Authors: Feng Qian and Laurie Hendren
Date: October 2003
A revised version appears in VM 2004, see the final paper here.
Abstract
This paper presents a new, inexpensive, mechanism for constructing
a complete call graph for Java programs at runtime, and provides an
example of using the mechanism for implementing a dynamic reachability-based
interprocedural analysis (IPA), namely dynamic XTA.
Reachability-based IPAs, such as points-to
analysis, type analysis, and escape analysis, require a
context-insensitive call graph of the analyzed program.
Computing a call graph at runtime presents several challenges. First,
the overhead must be low. Second, when implementing the mechanism for
languages such as Java, both polymorphism and lazy class loading must
be dealt with correctly and efficiently.
We propose a new mechanism with low costs for constructing runtime
call graphs in a JIT environment. The mechanism uses a profiling code
stub to capture the fist-time execution of a call edge, and adds at
most one more instruction to the repeated call edge invocations.
Polymorphism and lazy class loading are handled transparently. The
call graph is constructed incrementally, and it supports optimistic
analysis and speculative optimizations with invalidations.
We also developed a dynamic, reachability-based type analysis, dynamic XTA,
as an application of runtime call graphs. It also serves as
an example of handling lazy class loading in dynamic IPAs.
The dynamic call graph construction algorithm and dynamic
version of XTA have been
implemented in Jikes RVM, and we present empirical measurements of the
overhead of call graph construction and the characteristics of
dynamic XTA.
View the report (.pdf)
Download the report (.ps.gz)
|
Integrating the Soot compiler infrastructure into an IDE
|
(2003-4) |
back
|
Authors: Jennifer Lhoták, Ondřej Lhoták and Laurie Hendren
Date: October 2003
Abstract
This paper presents the integration of Soot, a byte-code analysis and
transformation framework, with an integrated development environment (IDE),
Eclipse. Such an integrated toolkit is useful for both the compiler
developer, to aid in understanding and debugging new analyses,
and also for the end-user of the IDE, to aid in program
understanding by exposing semantic information gathered by the advanced
compiler analyses. The paper discusses these advantages and provides
concrete examples of its usefulness.
There are several major challenges to overcome in developing the integrated
toolkit, and the paper discusses three major challenges and the solutions
to those challenges. An overview of Soot and the integrated toolkit is
given, followed by a more detailed discussion of the fundamental components.
The paper concludes with several illustrative examples of using the
integrated toolkit along with a discussion of future plans and research.
View the report (.pdf).
Download the report (.ps.gz)
|
Improving the Precision
and Correctness of Exception Analysis in Soot
|
(2003-3) |
back
|
Author: John Jorgensen
Date: September 2003
Abstract
To date, Soot has employed very conservative analyses of
exceptional control flow, producing control flow
graphs which include an edge to each exception handler from every
statement within the block of code protected by
that handler, including statements which cannot
throw any exceptions caught by the handler. This
report describes some of the difficulties and
subtleties of analyzing exceptional control flow,
and describes an attempt to refine Soot's
analysis, by determining the types of exceptions
each statement has the potential to throw, and drawing edges to
exception handlers only from protected statements
which may throw exceptions the handlers would catch.
View the report (.pdf).
Download the report (.ps.gz)
|
Points-to Inspired Static Analysis For The pi-Calculus
|
(2003-2) |
back
|
Author: Sam B. Sanjabi and Clark Verbrugge
Date: August 2003
Abstract
We present a flow and context insensitive control flow analysis of the
pi-calculus which computes a superset of the names that can possibly be
transmitted over each channel. The analysis is a simpler version of the
constraint-based analysis of Bodei et al based on techniques inspired
from points-to analyis, but requires no extension to the basic calculus.
The algorithm computes the same solution as the previous one, but
with better space efficiency. Our approach is closely related to
techniques used in compiler optimizers to do points-to analyses, and is a
simpler presentation, provides insight into the inner workings of the
process, and immediately suggests other possible analyses on
pi-calculus processes.
Download the report (.ps.gz)
|
Dynamic Metrics for Java
|
(2003-1) |
back |
Author: Bruno Dufour, Karel Driesen, Laurie Hendren and Clark Verbrugge
Date: March 2003
Abstract
In order to perform meaningful experiments in optimizing compilation and
run-time system design, researchers usually rely on a suite of benchmark
programs of interest to the optimization technique under consideration.
Programs are described as numeric, memory-intensive,
concurrent, or object-oriented, based on a qualitative
appraisal, in some cases with little justification. We believe it is
beneficial to quantify the behaviour of programs with a concise and precisely
defined set of metrics, in order to make these intuitive notions of program
behaviour more concrete and subject to experimental validation. We therefore
define and measure a set of unambiguous, dynamic, robust and
architecture-independent metrics that can be used to categorize programs
according to their dynamic behaviour in five areas: size, data structure,
memory use, concurrency, and polymorphism. A framework computing some of
these metrics for Java programs is presented along with specific results
demonstrating how to use metric data to understand a program's behaviour, and
both guide and evaluate compiler optimizations.
View the report (.pdf)
Download the report (.ps.gz)
Link to Dynamic Metrics project
|
EVolve: An Open Extensible Software Visualization Framework
|
(2002-12) |
back |
Author: Qin Wang, Wei Wang, Rhodes Brown, Karel Driesen, Bruno Dufour, Laurie Hendren and Clark Verbrugge
Date: December 2002
Abstract
Existing visualization tools typically do not allow easy extension by
new visualization techniques, and are often coupled with inflexible
data input mechanisms. This paper presents EVolve, a flexible and
extensible framework for visualizing program characteristics and
behaviour. The framework is flexible in the sense that it can
visualize many kinds of data, and it is extensible in the sense that
it is quite straightforward to add new kinds of visualizations.
The overall architecture of the framework consists of
the core EVolve platform that communicates with data sources via a
well defined data protocol and which communicates with visualization methods
via a visualization protocol.
Given a data source, an end-user can use EVolve as a stand-alone tool
by interactively creating, configuring and modifying
visualizations. A variety of visualizations are provided in the
current EVolve library, with features that facilitate the comparison
of multiple views on the same execution data.
We demonstrate EVolve in the context of visualizing execution behaviour
of Java programs.
View the report (.pdf)
Link to EVolve project
|
Dynamic Metrics for Compiler Developers
|
(2002-11) |
back |
Author: Bruno Dufour, Karel Driesen, Laurie Hendren and Clark Verbrugge
Date: November 2002
Abstract
In order to perform meaningful experiments in optimizing compilation
and run-time system design, researchers usually rely on a suite of
benchmark programs of interest to the optimization technique under
consideration. Programs are described as numeric, memory-intensive,
concurrent, or object-oriented, based on a qualitative appraisal, in
some cases with little justification. We believe it is beneficial to
quantify the behavior of programs with a concise and precisely defined
set of metrics, in order to make these intuitive notions of program
behavior more concrete and subject to experimental validation. We
therefore define a set of unambiguous, dynamic, robust and
architecture-independent metrics that can be used to categorize
programs according to their dynamic behavior in five areas: size, data
structure, memory use, concurrency, and polymorphism. A framework
computing some of these metrics for Java programs is presented along
with specific results.
View the report (.pdf)
Download the report (.ps.gz)
Link to Dynamic Metrics project
|
Points-to Analysis using BDDs
|
(2002-10) |
back |
Author: Marc Berndl, Ondřej Lhoták, Feng Qian, Laurie Hendren and Navindra Umanee
Date: November 2002
A revised version to appear in PLDI'03, San Diego
Abstract
This paper reports on a new approach to solving a subset-based
points-to analysis for Java using Binary Decision Diagrams (BDDs).
In the model checking community, BDDs have been shown very effective for
representing large sets and solving very large verification problems.
Our work shows that BDDs can also be very effective for developing a
points-to analysis that is simple to implement and that
scales well, in both space and time, to large programs.
The paper first introduces BDDs and operations on BDDs using some
simple points-to examples. Then, a complete subset-based points-to
algorithm is presented, expressed completely using BDDs and BDD
operations. This algorithm is then refined by finding appropriate
variable orderings and by making the algorithm incremental, in order
to arrive at a very efficient algorithm. Experimental results are
given to justify the choice of variable ordering, to demonstrate the
improvement due to incrementalization, and to compare the performance
of the BDD-based solver to an efficient hand-coded graph-based solver.
Finally, based on the results of the BDD-based solver, a variety of
BDD-based queries are presented, including the points-to query.
View the report (.pdf)
Download the report (.ps.gz)
Link to PTA-BDD project
|
Scaling Java Points-To Analysis using Spark
|
(2002-9) |
back |
Author: Ondřej Lhoták and Laurie Hendren
Date: October 2002
Abstract
Most points-to analysis research has been done on different systems by
different groups, making it difficult to compare results, and to understand
interactions between individual factors each group studied.
Furthermore, points-to analysis for Java has been studied much less
thoroughly than for C, and the tradeoffs appear very different.
We introduce Spark, a flexible framework for experimenting with
points-to analyses for Java. Spark supports equality- and subset-based
analyses, variations in field sensitivity, respect for declared types,
variations in call graph construction, off-line simplification, and
several solving algorithms. Spark is composed of building blocks on
which new analyses can be based.
We demonstrate Spark in a substantial study of factors affecting
precision and efficiency of subset-based points-to analyses, including
interactions between these factors. Our results show that Spark is
not only flexible and modular, but also offers superior time/space
performance when compared to other points-to analysis implementations.
View the report (.ps)
Download the report (.ps.gz)
|
Dynamic Profiling and Trace Cache Generation for a Java Virtual Machine
|
(2002-8) |
back |
Author: Marc Berndl and Laurie Hendren
Date: October 2002
Abstract
Dynamic program optimization is becoming increasingly important for
achieving good runtime performance. One of the key issues in such
systems is how it selects which code to optimize. One approach is
to dynamically detect traces, long sequences of instructions which
are likely to execute to completion. Such traces can be stored in
a trace cache and dispatched one trace at a time (rather than one
instruction or one basic block at a time). Furthermore, traces have
been shown to be a good unit for optimization.
This paper reports on a new approach for dynamically detecting, creating
and storing traces in a Java virtual machine.
We first elucidate four important design criteria for a successful trace
strategy: good instruction stream coverage, low dispatch rate, cache
stability and optimizability of traces. We then present our approach which is
based on branch correlation graphs. Branch correlation graphs
store information about the correlation between pairs of branches, as
well as additional state information.
We present the complete design for an efficient implementation of the system,
including a detailed discussion of the trace cache and profiling mechanisms.
We have implemented an experimental framework to measure the traces
generated by our approach in a direct-threaded-inlining Java VM
(SableVM) and we present experimental results to show that the traces we
generate meet the design criteria.
View the
report (.pdf)
View the report (.ps)
Download the report (.ps.gz)
|
STEP: A Framework for the Efficient Encoding of General Trace Data
|
(2002-7) |
back |
Author: Rhodes Brown, Karel Driesen, David Eng, Laurie Hendren,
John Jorgensen, Clark Verbrugge and Qin Wang
Date: June 2002
Abstract
Traditional tracing systems are often limited to recording a fixed set of basic
program events. These limitations can frustrate an application or compiler
developer who is trying to understand and characterize the complex behavior of
software systems such as a Java program running on a Java Virtual Machine. In
the past, many developers have resorted to specialized tracing systems that
target a particular type of program event. This approach often results in an
obscure and poorly documented encoding format which can limit the reuse and
sharing of potentially valuable information. To address this problem, we present
STEP, a system designed to provide profiler developers with a standard method
for encoding general program trace data in a flexible and compact format. The
system consists of a trace data definition language along with a compiler and
architecture that simplifies the client interface by encapsulating the details
of encoding and interpretation.
View the
report (.pdf)
View the report (.ps)
Download the report (.ps.gz)
|
EVolve: An Extensible Software Visualization Framework
|
(2002-6) |
back |
Author: Qin Wang, Rhodes Brown, Karel Driesen, Laurie Hendren and Clark Verbrugge
Date: July 2002
Abstract
Existing visualization tools typically do not allow easy extension by
new visualization techniques, and are often coupled with inflexible
data input mechanisms. This paper presents EVolve, a flexible and
extensible framework for visualizing program characteristics and
behaviour. The framework is flexible in the sense that it can
visualize many kinds of data, and it is extensible in the sense that
it is quite straightforward to add new kinds of visualizations.
The overall architecture of the framework consists of the core EVolve
platform that communicates with data sources via a well defined data
protocol and which communicates with visualization methods via a
visualization protocol.
Given a data source, an end-user can use EVolve as a stand-alone
tool by interactively creating, configuring and modifying one or more
visualizations. A variety of visualizations are provided in the
standard EVolve visualization library. EVolve can be used to build
custom visualizers by implementing new data sources and/or new kinds
of visualizations.
The paper presents an overview of the system, examples of its use,
and discusses our experiences using the tool both as an end-user and
as a developer building custom visualizers.
View the report (.ps)
Download the report (.ps.gz)
|
Fast Local List Scheduling
|
(2002-5) |
back |
Clark Verbrugge
Date: July 2002
Abstract
List scheduling is a well-known O(n^2) algorithm for performing instruction scheduling during
compiler code generation on modern, superscalar architecture. An asymptotically more efficient
O(n*log(n)) variation is presented; this algorithm achieves its lower time bounds by asserting
practical constraints on the structure of instruction dependencies.
View the report (.ps)
Download the report (.ps.gz)
|
Combining Static and Dynamic Data in Code Visualization
|
(2002-4) |
back |
Author: David Eng
Date: June 2002
Abstract
The task of developing, tuning, and debugging compiler
optimizations is a difficult one which can be facilitated by
software visualization. There are many characteristics of the
code which must be considered when studying the kinds of
optimizations which can be performed. Both static data collected
at compile-time and dynamic runtime data can reveal opportunities
for optimization and affect code transformations. Visualization
of such complex systems must include as much information as
possible, and accommodate the different sources from which this
information is acquired.
This paper presents a visualization framework designed to address
these issues. The framework is based on a new, extensible
language called JIL which provides a common format for
encapsulating intermediate representations and associating them
with compile-time and runtime data. Custom visualization
interfaces can then combine JIL data from separate tools, exposing
both static and dynamic characteristics of the underlying code.
View the report (.ps)
Download the report (.ps.gz)
|
JIL: an Extensible Intermediate Language
|
(2002-3) |
back |
Author: David Eng
Date: June 2002
Abstract
The Java Intermediate Language (JIL) is a subset of XML and SGML
described in this document. Its goal is to provide an
intermediate representation of Java source code suitable for
machine use. JIL benefits from the features of XML, such as
extensibility and portability, while providing a common ground for
software tools. The following document discusses the design
issues and overall framework for such a language, including a
description of all fundamental elements.
View the report (.ps)
Download the report (.ps.gz)
|
Run-time Evaluation of Opportunities for Object Inlining in Java
|
(2002-2) |
back |
Authors: Ondřej Lhoták and Laurie Hendren
Date: June 2002
Abstract
Object-oriented languages, like Java, encourage the use of many small
objects linked together by field references, instead of a few monolithic
structures. While this practice is beneficial from a program design
perspective, it can slow down program execution by incurring many
pointer indirections. One solution to this problem is object inlining:
when the compiler can safely do so, it fuses small objects together,
thus removing the reads/writes to the removed field, saving the memory
needed to store the field and object header, and reducing the number of
object allocations.
The objective of this paper is to measure the potential for object
inlining by studying the run-time behavior of a comprehensive set of
Java programs. We study the traces of program executions in order to
determine which fields behave like inlinable fields. Since we are using
dynamic information instead of a static analysis, our results give
an upper bound on what could be achieved via a static compiler-based
approach. Our experimental results measure the potential improvements
attainable with object inlining, including reductions in the numbers of
field reads and writes, and reduced memory usage.
Our study shows that some Java programs can benefit significantly from
object inlining, with close to a 10% speedup. Somewhat to our surprise,
our study found one case, the db benchmark, where the most important
inlinable field was the result of unusual program design, and fixing
this small flaw led to both better performance and clearer program
design. However, the opportunities for object inlining are highly
dependent on the individual program being considered, and are in many
cases very limited. Furthermore, fields that are inlinable also have
properties that make them potential candidates for other optimizations
such as removing redundant memory accesses. The memory savings possible
through object inlining are moderate.
View the report (.ps)
Download the report (.ps.gz)
|
An Adaptive, Region-Base Allocator for Java
|
(2002-1) |
back |
Authors: Feng Qian and Laurie Hendren
Date: February 2002
Abstract
This paper introduces an adaptive, region-based allocator for Java.
The basic idea is to allocate non-escaping objects in local regions,
which are allocated and freed in conjunction with their associated
stack frames. By releasing memory associated with these stack frames,
the burden on the garbage collector is reduced, possibly resulting in
fewer collections.
The novelty of our approach is that it does not require static escape
analysis, programmer annotations, or special type systems. The
approach is transparent to the Java programmer and relatively simple
to add to an existing JVM. The system starts by assuming that all
allocated objects are local to their stack region, and then catches
escaping objects via write barriers. When an object is caught
escaping, its associated allocation site is marked as a non-local
site, so that subsequent allocations will be put directly in the
global region. Thus, as execution proceeds, only those allocation
sites that are likely to produce non-escaping objects are allocated to
their local stack region.
The paper presents the overall idea, and then provides details of a
specific design and implementation. In particular, we present a paged
region allocator and the necessary modifications of the Jikes RVM
baseline JIT and a copying collector. Our experimental study
evaluates the idea using the SPEC JVM98 benchmarks, plus one other large
benchmark. We show that a region-based allocator is a reasonable
choice, that overheads can be kept low, and that the adaptive system
is successful at finding local regions that contain no escaping
objects.
Download the report (.ps.gz)
|
STOOP: The Sable Toolkit for Object-Oriented Profiling
|
(2001-2) |
back |
Authors: Rhodes Brown, Karel Driesen, David Eng, Laurie Hendren,
John Jorgensen, Clark Verbrugge, and Qin Wang
Date: November 2001
Abstract The performance and behaviour of large
object-oriented software systems can be difficult to
profile. Furthermore, existing profiling systems are often limited
by a fixed set of events that can be profiled and a fixed set of
visualizations. This paper presents the Sable Toolkit for
Object-Oriented Profiling (STOOP), which we developed to support a
wide variety of profiling needs. The toolkit makes it simple to
build new profilers which can be used for both standard and less
conventional profiling tasks.
At the core of the toolkit is its event pipe. A profiling agent
produces events for the event pipe and a visualizer consumes events
from the pipe. To provide a general purpose event pipe we have
defined a language for defining events and a compiler which
automatically produces the event pipe interfaces for the specified
events. Many different profiling agents can produce events for the
event pipe, including JVMPI agents, instrumented VMs and instrumented
bytecode. It is also simple to adapt a variety of visualizers to
read events from the event pipe. We have worked with two
visualizers, EVolve and JIMPLEX. We summarize each visualizer and
show how easy it was to adapt them to use events produced from STOOP.
Download the
report (.ps.gz)
|
Decompiling Java Bytecode: Problems, Traps and Pitfalls
|
(2001-1) |
back |
Authors: Jerome Miecznikowski and Laurie Hendren
Date: October 2001
Abstract
Java virtual machines execute Java bytecode instructions. Since this
bytecode is a higher level representation than traditional object code, it
is possible to decompile it back to Java source. Many such decompilers
have been developed and the conventional wisdom is that decompiling Java
bytecode is relatively simple. This may be true when decompiling bytecode
produced directly from a specific compiler, most often Sun's javac
compiler. In this case it is really a matter of inverting a known
compilation strategy. However, there are many problems, traps and
pitfalls when decompiling arbitrary verifiable Java bytecode. Such
bytecode could be produced by other Java compilers, Java bytecode
optimizers or Java bytecode obfuscators. Java bytecode can also be
produced by compilers for other languages, including Haskell, Eiffel, ML,
Ada and Fortran. These compilers often use very different code generation
strategies from javac.
This paper outlines the problems and solutions we have found in our
development of Dava, a decompiler for arbitrary Java bytecode. We first
outline the problems in assigning types to variables and literals, and the
problems due to expression evaluation on the Java stack. Then, we look at
finding structured control flow with a particular emphasis on how to deal
with Java exceptions and synchronized blocks. Throughout the paper we
provide small examples which are not properly decompiled by commonly used
decompilers.
Download the report (.ps.gz) (the same version as paper 2002-2)
|
A Comprehensive Approach to Array Bounds Check Elimination for Java
|
(2000-4) |
back |
Authors: Feng Qian, Laurie Hendren and Clark Verbrugge
Date: November 2000
Abstract
This paper reports on a new approach to eliminating array bounds checks in
Java. Our approach is based upon a flow-sensitive intraprocedural analysis
called variable constraint analysis (VCA). This analysis builds a
small constraint graph for each important point in a method, and then uses
the information encoded in the graph to infer the relationship between
array index expressions and the bounds of the array. Using VCA as the
base analysis, we also show how two further analyses can improve the
results of VCA. Array field analysis is applied
on each class and provides information about some arrays stored in fields,
while rectangular array analysis is an interprocedural analysis to approximate
the shape of arrays, and is useful for finding rectangular (non-ragged) arrays.
We have implemented all three analyses using the Soot bytecode
optimization/annotation framework and we transmit the results of the
analysis to virtual machines using class file attributes. We have
modified the Kaffe JIT, and IBM's High Performance
Compiler for Java (HPCJ) to make use of these attributes, and we
demonstrate significant speedups.
View the report (.ps)
Download the report (.ps.gz)
|
SableVM: A Research Framework for the Efficient Execution of Java Bytecode
|
(2000-3) |
back |
Authors: Etienne Gagnon and Laurie Hendren
Date: November 2000
Abstract
SableVM is an open-source virtual machine for Java, intended as a research framework for efficient
execution of Java bytecode.
The framework is essentially composed
of an extensible bytecode interpreter using state-of-the-art
and innovative techniques.
Written in the C programming language, and assuming
minimal system dependencies, the interpreter emphasizes high-level
techniques to support efficient execution.
In particular, we introduce new data
layouts for classes, virtual tables and object instances that
reduce the cost of interface method calls to that of normal virtual calls,
allow efficient garbage collection and light synchronization, and make
effective use of memory space.
View the report (.ps)
Download the report (.ps.gz)
|
A Framework for Optimizing Java Using Attributes
|
(2000-2) |
back |
Authors: Patrice Pominville, Feng Qian, Raja Vallée-Rai, Laurie Hendren and Clark Verbrugge
Date: July 2000 (updated: September 2000)
Abstract
This paper presents a framework for supporting the optimization of
Java programs using attributes in Java class files. We show how
class file attributes may be used to convey both optimization
opportunities and profile information to a variety of Java virtual
machines including ahead-of-time compilers and just-in-time compilers.
We present our work in the context of Soot, a framework that supports
the analysis and transformation of
Java bytecode (class files).
We demonstrate the framework with attributes for elimination of array bounds
and null pointer checks, and we provide experimental results for the
Kaffe just-in-time compiler, and IBM's High Performance Compiler for Java
ahead-of-time compiler.
View the report (.ps)
Download the report (.ps.gz)
|
Common Subexpression Removal in Soot
|
(2000-1) |
back |
Author: Patrick Lam
Date: May 2000
Abstract
Everyone who knows something about optimizing compilers knows about
common subexpression elimination. It is used to remove redundant
computations, which usually improves the execution time of a program.
The transformation is quite simple to describe, and so common
subexpression elimination must, of course, therefore be trivial to
implement. In this work, we discuss an implementation of common
subexpression elimination for Java, using the Soot framework. Our
implementation passes the Soot sentinel suite. Experimental
results are presented, showing that common subexpression elimination
slightly speeds up Java programs.
View the report (.ps)
Download the report (.ps.gz)
|
Practical Virtual Method Call Resolution for Java
|
(1999-4) |
back |
Authors: Vijay Sundaresan, Laurie Hendren, Chrislain Razafimahefa,
Raja Vallée-Rai, Patrick Lam, Etienne Gagnon and Charles Godin
Date: November 99
Abstract
This paper addresses the problem of resolving virtual method and
interface calls in Java bytecode.
The main focus is on a new practical technique that can
be used to analyze large applications while at the same time providing
more accurate results than class hierarchy analysis and rapid type
analysis.
We present two variations of our new technique, variable-type analysis
and a coarser-grain version called declared-type analysis.
Both of these analyses are inexpensive, easy to implement,
and our experimental results show that they scale linearly in
the size of the program.
We have implemented our new analyses
using the Soot framework, and we report on
empirical results for seven benchmarks.
We have used our techniques to build
accurate call graphs for complete applications (including libraries)
and we show that compared to a conservative call graph built
using class hierarchy analysis, our new variable-type analysis
can remove a significant number of nodes (methods) and call edges.
Further, our results show that we can improve upon the compression obtained
using rapid type analysis.
We also provide dynamic measurements of monomorphic call sites, focusing
on the benchmark code excluding libraries. We demonstrate that when
considering only the benchmark code,
both rapid type analysis and our new declared-type analysis do not add much
precision over class hierarchy analysis. However, our finer-grained
variable-type analysis does resolve significantly more
call sites, particularly for very object-oriented programs.
View the report (.ps)
Download the report (.ps.gz)
|
Optimizing Java Bytecode using the Soot Framework: Is it Feasible?
|
(1999-3) |
back |
Authors: Raja Vallée-Rai, Etienne Gagnon, Laurie Hendren, Patrick Lam, Patrice Pominville, and Vijay Sundaresan,
Date: November 99
Abstract
This paper presents Soot, a framework for optimizing Java(tm) bytecode.
The
framework is implemented in Java and supports three intermediate
representations
for representing Java bytecode: Baf, a streamlined representation of
Java's
stack-based bytecode;
Jimple, a typed three-address intermediate
representation suitable for optimization; and Grimp, an aggregated
version of
Jimple.
Our approach to class file optimization is to first convert the
stack-based
bytecode into Jimple, a three-address form more amenable to traditional
program
optimization, and then convert the optimized Jimple back to bytecode.
In order to demonstrate that our approach is feasible, we present
experimental results showing the effects of processing class files
through
our framework. In particular, we study the techniques necessary to
effectively
translate Jimple back to bytecode, without losing performance.
Finally, we
demonstrate that class file optimization can be quite effective by
showing the results of some basic optimizations using our framework.
Our experiments
were done on ten benchmarks, including seven SPECjvm98 benchmarks, and
were
executed on five different Java virtual machine implementations.
View the report (.ps)
Download the report (.ps.gz)
|
Intra-procedural Inference of Static Types for Java Bytecode |
(1999-1) |
back |
Authors: Etienne Gagnon and Laurie J. Hendren
Date: March 1999
Abstract
Although Java
bytecode has a significant amount of type information embedded in it,
there are no explicit types for local variables.
However, knowing types for local variables is very useful for both program
optimization and decompilation.
In this paper, we present a practical algorithm for
inferring static types for local variables
in a 3-address, stackless, representation of Java bytecode.
By decoupling the type inference problem from the
low level bytecode representation, and abstracting it into a
constraint system, we are able to construct a sound and efficient
algorithm for inferring the type of most bytecode found in practice.
Further, for bytecode that cannot be typed, we suggest
transformations that make it typeable.
Using this same constraint system, we show that this type inference
problem is NP-hard in general. However, this hard case does not seem
to arise in practice and our experimental results show
that all of the 17,000 methods used in our tests were successfully
and efficiently typed by our algorithm.
View the report (.ps)
Download the report (.ps.gz)
|
Of Graphs and Coffi Grounds: Decompiling Java |
(1998-6) |
back |
Author: Patrick Lam
Date: September 1998
Abstract
Java programmers write applications and applets in plain English-like
text, and then apply a java compiler to the text to obtain {\em class
files}. Class files, which are typically transmitted across the web,
are a low-level representation of the original text; they are not
human-readable. Consider a compiler as a function from text to class
files. My goal is to compute the inverse function: given the compiled
class file, I wish to find the best approximation to the original text
possible. This is called decompilation.
Given a class file, one can build an unstructured graph representation
of the program. The main goal of my work is to develop graph
algorithms to convert these unstructured graphs into structured graphs
corresponding to high-level program constructs, and I will discuss
these in detail. I shall also mention some results concerning
possibility and impossibility which show that decompilation is always
possible if the program may be lengthened.
View the report (.ps) Download the report (.ps.gz)
| Jimple: Simplifying Java Bytecode for Analyses and Transformations |
(1998-4) |
back |
Authors: Raja Vallee-Rai and Laurie J. Hendren
Date: July 1998
Abstract
In this paper we present Jimple, a 3-address intermediate representation that has been designed
to simplify analysis and transformation of Java bytecode. We motivate the need for a new
intermediate representation by illustrating several difficulties with optimizing the stack-based
Java bytecode directly. In general, these difficulties are due to the fact that bytecode
instructions affect an expression stack, and thus have implicit uses and definitions of stack
locations. We propose Jimple as an alternative representation, in which each statement refers
explicitly to the variables it uses. We provide both the definition of Jimple and a complete
procedure for translating from Java bytecode to Jimple. This definition and translation have
been implemented using Java, and finally we show how this implementation forms the heart of the
Sable research projects.
View the report (.ps) Download the report (.ps.gz)
| Measuring ILP in Java Programs |
(1998-3) |
back |
Author: Raja Vallee-Rai
Date: June 1998
Abstract
Java programs can greatly differ in terms of required runtime support. C-like Java programs which
are usually computation intensive tend to require little, whereas heavily object oriented programs such
as compilers, can spend most of their execution time in runtime handlers. This report begins to
explore the differences in instruction level parallelism between these two extreme types of Java
programs. No solid conclusions can be drawn, however, due to an oversimplified execution model and
the scarcity of appropriate benchmarks.
View the report (.ps) Download the report (.ps.gz)
| Profiling the Kaffe JIT Compiler |
(1998-2) |
back |
Author: Raja Vallee-Rai
Date: February 1998
Abstract
This report documents our attempts at instrumenting the Kaffe JIT compiler in
order to estimate the execution cost of each bytecode instruction in the Java virtual machine.
View the report (.ps) Download the report (.ps.gz)
| The Jimple Framework |
(1998-1) |
back |
Author: Raja Vallee-Rai
Date: February 1998
Note: This report is obsolete. It covers an extremely old version of
the Jimple framework which is now called Soot.
Abstract
The Java Virtual Machine is a stack based architecture. This makes compiler analyses
and transformations more complex because of the implicit participation of the operand stack in
instructions. To simplify these tasks, we developed an alternative representation for Java
bytecode based on 3-address code and a framework to manipulate Java classfiles in this
form. This report presents an overview of the Jimple framework and the algorithms used
in the jimplification process.
View the report (.ps) Download the report (.ps.gz)
Last updated Mon Aug 25 09:04:54 EDT 2003.
|