Sable Publications (Papers)
|
Dependent Advice: A General Approach to Optimizing History-based
Aspects
|
Authors: Eric Bodden, Feng Chen and Grigore Rosu
Date: March 2009
AOSD 2009, Charlottesville, VA
Abstract
Many aspects for runtime monitoring are history-based: they contain pieces
of advice that execute conditionally, based on the observed execution history.
History-based aspects are notorious for causing high runtime overhead. Compilers
can apply powerful optimizations to history-based aspects using domain knowledge.
Unfortunately, current aspect languages like AspectJ impede optimizations, as
they provide no means to express this domain knowledge.
In this paper we present dependent advice, a novel AspectJ language
extension. A dependent advice contains dependency annotations that preserve
crucial domain knowledge: a dependent advice needs to execute only when its
dependencies are fulfilled. Optimizations can exploit this knowledge: we present
a whole-program analysis that removes advice-dispatch code from program locations
at which an advice's dependencies cannot be fulfilled.
Programmers often opt to have history-based aspects generated automatically,
from formal specifications from model-driven development or runtime monitoring.
As we show using code-generation tools for two runtime-monitoring approaches,
tracematches and JavaMOP, such tools can use knowledge contained in the
specification to automatically generate dependency annotations as well.
Our extensive evaluation using the DaCapo benchmark suite shows that the use of
dependent advice can significantly lower, sometimes even completely eliminate,
the runtime overhead caused by history-based aspects, independently of the
specification formalism.
View the paper (.pdf)
Download the paper (.ps.gz)
BibTeX entry
|
Finding Programming Errors Earlier by Evaluating Runtime Monitors
Ahead-of-Time
|
Authors: Eric Bodden, Patrick Lam and Laurie Hendren
Date: November 2008
FSE 2008
Abstract
Runtime monitoring allows programmers to validate, for instance,
the proper use of application interfaces. Given a property specification,
a runtime monitor tracks appropriate runtime events to detect violations and
possibly execute recovery code. Although powerful, runtime monitoring
inspects only one program run at a time and so may require
many program runs to find errors. Therefore, in this paper, we present
ahead-of-time techniques that can (1) prove the absence of property violations on all
program runs, or (2) flag locations where violations are likely to occur.
Our work focuses on tracematches, an expressive runtime monitoring
notation for reasoning about groups of correlated objects.
We describe a novel flow-sensitive static analysis for
analyzing monitor states. Our abstraction captures both positive
information (a set of objects could be in a particular monitor
state) and negative information (the set is known not to be in a
state). The analysis resolves heap references by combining the
results of three points-to and alias analyses. We also propose a
machine learning phase to filter out likely false
positives.
We applied a set of 13 tracematches to the DaCapo benchmark suite and
SciMark2. Our static analysis rules out all potential points of failure
in 50% of the cases, and 75% of false positives on average. Our
machine learning algorithm correctly classifies the remaining potential
points of failure in all but three of 461 cases. The approach revealed
defects and suspicious code in three benchmark programs.
View the paper (.pdf)
BibTeX entry
|
Object representatives: a uniform abstraction for pointer information
|
Authors: Eric Bodden, Patrick Lam and Laurie Hendren
Date: October 2008
1st International Academic Conference of the British Computer Society (BCS)
Abstract
Pointer analyses enable many subsequent program analyses and
transformations by statically disambiguating references to the
heap. However, different client analyses may have different sets
of pointer analysis needs, and each must pick some pointer analysis
along the cost/precision spectrum to meet those needs. Some analysis
clients employ combinations of pointer analyses to obtain better
precision with reduced analysis times. Our goal is to ease the task of
developing client analyses by enabling composition and
substitutability for pointer analyses. We therefore
propose object representatives, which statically represent runtime objects.
A representative encapsulates the notion of object identity, as observed
through the representative's aliasing relations with other representatives.
Object representatives enable pointer analysis clients to disambiguate
references to the heap in a uniform yet flexible way.
Representatives can be generated from many
combinations of pointer analyses, and pointer analyses can be freely exchanged
and combined without changing client code.
We believe that the use of object representatives brings many software
engineering benefits to compiler implementations because, at compile time,
object representatives are Java objects. We discuss our motivating case for
object representatives, namely, the development of an abstract interpreter for
tracematches, a language feature for runtime monitoring. We explain one
particular algorithm for computing object representatives which combines
flow-sensitive intraprocedural must-alias and must-not-alias analyses with a
flow-insensitive, context-sensitive whole-program points-to analysis. In our
experience, client analysis implementations can almost directly substitute
object representatives for runtime objects, simplifying the design and
implementation of such analyses.
View the paper (.pdf)
BibTeX entry
|
Racer: Effective Race Detection Using AspectJ
|
Winner of an "ACM SIGSOFT Distinguished Paper Award".
Authors: Eric Bodden and Klaus Havelund
Date: July 2008
ISSTA 08, July 2008, Seattle, WA
Abstract
Programming errors occur frequently in large software systems, and even more so
if these systems are concurrent. In the past, researchers have developed
specialized programs to aid programmers in detecting concurrent programming errors
such as deadlocks, livelocks, starvation and data races.
In this work we propose a language extension to the aspect-oriented programming
language AspectJ, in the form of three new pointcuts, lock(),
unlock() and maybeShared(). These pointcuts allow programmers
to monitor program
events where locks are granted or handed back, and where values are accessed
that may be shared amongst multiple Java threads. We decide thread-locality
using a static thread-local objects analysis developed by others.
Using the three new primitive pointcuts, researchers can directly implement efficient
monitoring algorithms to detect concurrent programming errors online.
As an example, we present a new algorithm which we call Racer, an
adaptation of the well-known Eraser algorithm to the memory model of Java.
We implemented the new pointcuts as an extension to the AspectBench Compiler,
implemented the Racer algorithm using this language extension and then applied
the algorithm to the NASA K9 Rover Executive.
Our experiments proved our implementation very effective. In the Rover
Executive, Racer finds 70 data races. Only one of these races was previously known.
We further applied the algorithm to two other multi-threaded programs written by
Computer Science researchers, in which we found races as well.
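For readers unfamiliar with the Eraser algorithm that the abstract refers to, its core lockset refinement can be sketched roughly as follows. This is a minimal Python sketch of the classic lockset idea only. It is not the Racer algorithm from the paper, and it omits Eraser's state machine for initialization and read sharing, so it can over-report. The names `LocksetMonitor`, `on_lock`, `on_unlock` and `on_access` are illustrative assumptions that loosely mirror the events the three pointcuts expose.

```python
from collections import defaultdict


class LocksetMonitor:
    """Simplified Eraser-style lockset refinement (illustrative only)."""

    def __init__(self):
        # Locks currently held by each thread.
        self.held = defaultdict(set)
        # Candidate lockset per shared variable, created on first access.
        self.candidates = {}
        # Variables whose candidate lockset became empty.
        self.races = set()

    def on_lock(self, thread, lock):
        self.held[thread].add(lock)

    def on_unlock(self, thread, lock):
        self.held[thread].discard(lock)

    def on_access(self, thread, var):
        held_now = set(self.held[thread])
        if var not in self.candidates:
            # First access: every lock held now is a candidate.
            self.candidates[var] = held_now
        else:
            # Later accesses: keep only locks held on every access so far.
            self.candidates[var] &= held_now
        if not self.candidates[var]:
            # No single lock protects all accesses: potential data race.
            self.races.add(var)


monitor = LocksetMonitor()
# Thread t1 accesses x while holding lock a.
monitor.on_lock("t1", "a")
monitor.on_access("t1", "x")
monitor.on_unlock("t1", "a")
# Thread t2 accesses x while holding a different lock b.
monitor.on_lock("t2", "b")
monitor.on_access("t2", "x")
monitor.on_unlock("t2", "b")
# Both threads access y under the same lock a.
for t in ("t1", "t2"):
    monitor.on_lock(t, "a")
    monitor.on_access(t, "y")
    monitor.on_unlock(t, "a")
```

In this toy trace, `x` ends up with an empty candidate set and is flagged, while `y` stays consistently protected by lock `a`.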
View the paper (.pdf)
BibTeX entry
|
Relational Aspects as Tracematches
|
Authors: Eric Bodden, Reehan Shaikh and Laurie Hendren
Date: March 2008
AOSD 2008, March 2008, Brussels, Belgium
Abstract
The relationships between objects in an object-oriented program are an
essential property of the program's design and implementation. Two
previous approaches to implement relationships with aspects were
association aspects, an AspectJ-based language extension, and the
relationship aspects library. While those approaches greatly ease
software development, we believe that they are not general enough. For
instance, the library approach only works for binary relationships, while
the language extension does not allow for the association of primitive
values or values from non-weavable classes.
Hence, in this work we propose a generalized alternative implementation
via a direct reduction to tracematches, a language feature for executing
an advice after having matched a sequence of events.
This new implementation scheme yields multiple benefits. Firstly, our
implementation is more general than existing ones, avoiding most
previous limitations. It also yields a new language construct,
relational tracematches.
We provide an efficient implementation based on the AspectBench
Compiler, along with test cases and microbenchmarks. Our empirical
studies showed that our implementation, when compared to previous
approaches, uses a similar memory footprint with no leaking, but the
generality of our approach does lead to some runtime overhead. We
believe that our implementation can provide a solid foundation for
future research.
View the paper (.pdf)
BibTeX entry
|
Compiler-guaranteed Safety in Code-copying Virtual Machines
|
Authors: Gregory B. Prokopski and Clark Verbrugge
Date: March 2008
CC 2008, March 29 - April 6, 2008, Budapest, Hungary
Abstract
Virtual Machine authors face a difficult choice between cheap, low-performance interpreters and specialized, costly compilers. A method able to bridge this wide gap is the existing code-copying technique that reuses chunks of the VM's binary code to create a simple JIT. This technique is not reliable without a compiler guaranteeing that copied chunks are still functionally equivalent despite aggressive optimizations. We present a proof-of-concept, minimal-impact modification of a highly optimizing compiler, GCC. A VM programmer marks chunks of VM source code as copyable. The chunks of native code resulting from compilation of the marked source become addressable and self-contained. Chunks can be safely copied at VM runtime, concatenated and executed together. This allows code-copying VMs to safely achieve speedups of up to 3 times, 1.67 times on average, over direct interpretation. This maintainable enhancement makes the code-copying technique reliable and thus practically usable.
View the paper (.pdf)
BibTeX entry
View the slides (.pdf)
Springer version
|
Phase-Based Adaptive Recompilation in a JVM
|
Authors: Dayong Gu and Clark Verbrugge
Date: April 2008
CGO 2008, April 6 - 9, 2008, Boston, Massachusetts
Abstract
Modern JIT compilers often employ multi-level recompilation strategies as a means of ensuring the most used code is also the most highly optimized, balancing optimization costs and expected future performance. Accurate selection of code to compile and level of optimization to apply is thus important to performance. In this paper we investigate the effect of an improved recompilation strategy for a Java virtual machine. Our design makes use of a lightweight, low-level profiling mechanism to detect high-level, variable length phases in program execution. Phases are then used to guide adaptive recompilation choices, improving performance. We develop both an offline implementation based on trace data and a self-contained online version. Our offline study shows an average speedup of 8.7% and up to 21%, and our online system achieves an average speedup of 4.4%, up to 18%. We subject our results to extensive analysis and show that our design achieves good overall performance with high consistency despite the existence of many complex and interacting factors in such an environment.
View the paper (.pdf)
BibTeX entry
View the slides (.pdf)
ACM version
|
A staged static program analysis to improve the performance of runtime monitoring
|
Authors: Eric Bodden, Laurie Hendren and Ondřej Lhoták
Date: July 2007
21st European Conference on Object-Oriented Programming, July 30th - August 3rd 2007, Berlin, Germany
There exists an extended Technical Report version of this paper: abc-2007-2.
Abstract
In runtime monitoring, a programmer specifies a piece of code to execute when
a trace of events occurs during program execution.
Our work is based on tracematches, an extension to AspectJ,
which allows programmers to specify
traces via regular expressions with free variables.
In this paper we present a
staged static analysis which speeds up trace matching by
reducing the required runtime instrumentation.
The first stage is a simple analysis that
rules out entire tracematches, just based on
the names of symbols. In the second stage,
a points-to analysis is used, along with a flow-insensitive
analysis that eliminates instrumentation points with
inconsistent variable bindings. In the third stage the
points-to analysis is combined with a flow-sensitive
analysis that also takes into consideration the order in
which the symbols may execute.
To examine the effectiveness of each stage, we experimented
with a set of nine tracematches applied to the DaCapo benchmark suite.
We found that about 25% of the tracematch/benchmark combinations
had instrumentation overheads greater than 10%.
In these cases the first two stages work well for certain
classes of tracematches, often leading to significant performance
improvements. Somewhat surprisingly, we found the
third, flow-sensitive, stage did not add any improvements.
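To illustrate what matching a trace "with free variables" means at runtime, here is a minimal Python sketch of a hand-coded monitor for the two-symbol pattern `close write`, with one free variable `f`. It is only an assumption-laden illustration: the class name `CloseThenWriteMonitor`, the symbol names and the event API are hypothetical, and the sketch does not reflect how tracematches are actually compiled. The monitor keeps separate state for each object bound to `f` and matches only when `write` immediately follows `close` among that object's declared symbols.

```python
class CloseThenWriteMonitor:
    """Matches the trace `close write` separately for each binding of f."""

    def __init__(self):
        # Last declared symbol observed for each binding of f.
        self.last = {}
        # Bindings for which the full pattern matched, in order.
        self.matches = []

    def event(self, symbol, f):
        # The pattern matches when `write` directly follows `close`
        # in the sequence of events observed for the same binding.
        if symbol == "write" and self.last.get(f) == "close":
            self.matches.append(f)
        self.last[f] = symbol


m = CloseThenWriteMonitor()
m.event("close", "a")
m.event("write", "a")   # matches for binding f = a
m.event("write", "b")   # no preceding close for b
m.event("close", "b")
m.event("open", "b")    # an intervening symbol breaks the pattern
m.event("write", "b")
```

Every call to `event` here stands for an instrumentation point in the monitored program, which is the per-event cost that a static analysis of the kind described above aims to reduce.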
View the paper (.pdf)
BibTeX entry
|
Component-Based Lock Allocation
|
Authors: Richard L. Halpert and Christopher J. F. Pickett and Clark Verbrugge
Date: July 2007
PACT 2007, September 2007, Brasov, Romania
Abstract
The allocation of lock objects to critical sections in concurrent
programs affects both performance and correctness. Recent work
explores automatic lock allocation, aiming primarily to minimize
conflicts and maximize parallelism by allocating locks to individual
critical section interferences. We investigate component-based lock
allocation, which allocates locks to entire groups of interfering
critical sections. Our allocator depends on a thread-based side-effect
analysis, and benefits from precise points-to and may-happen-in-parallel
information. Thread-local object information has a small
impact, and dynamic locks do not improve significantly on static
locks. We experiment with a range of small and large Java benchmarks
on 2-way, 4-way, and 8-way machines, and find that a single static
lock is sufficient for mtrt, that performance degrades by 10% for
hsqldb, that jbb2000 becomes mostly serialized, and that for lusearch,
xalan, and jbb2005, component-based lock allocation recovers the
performance of the original program.
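The idea of allocating one lock to an entire group of interfering critical sections can be pictured as computing connected components of an interference graph. The following Python sketch assumes the interference pairs are already known (the abstract says the allocator depends on a thread-based side-effect analysis for this information); the function name `allocate_component_locks` and the section labels are hypothetical. As a simplification, a section that interferes with nothing also receives its own lock.

```python
def allocate_component_locks(sections, interferences):
    """Assign one lock id to each connected component of the interference graph."""
    parent = {s: s for s in sections}

    def find(s):
        # Union-find lookup with path halving.
        while parent[s] != s:
            parent[s] = parent[parent[s]]
            s = parent[s]
        return s

    # Merge the components of every pair of interfering sections.
    for a, b in interferences:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Number the components in order of first appearance.
    lock_of_root = {}
    allocation = {}
    for s in sections:
        root = find(s)
        if root not in lock_of_root:
            lock_of_root[root] = len(lock_of_root)
        allocation[s] = lock_of_root[root]
    return allocation


sections = ["cs1", "cs2", "cs3", "cs4", "cs5"]
interferences = [("cs1", "cs2"), ("cs2", "cs3"), ("cs4", "cs5")]
allocation = allocate_component_locks(sections, interferences)
```

Here `cs1`, `cs2` and `cs3` share one lock even though `cs1` and `cs3` do not interfere directly, which is the trade-off against allocating locks to individual interferences.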
View the paper (.pdf)
BibTeX entry
|
Dynamic Purity Analysis for Java Programs
|
Authors: Haiying Xu and Christopher J. F. Pickett and Clark Verbrugge
Date: April 2007
PASTE 2007, June 2007, San Diego, California, USA
Abstract
The pure methods in a program are those that exhibit functional
or side-effect-free behaviour, a useful property in many contexts.
However, existing purity investigations present primarily static
results. We perform a detailed examination of dynamic method purity
in Java programs using a JVM-based analysis. We evaluate multiple
purity definitions that range from strong to weak, consider purity
forms specific to dynamic execution, and accommodate constraints
imposed by an example consumer application, memoization. We show that
while dynamic method purity is actually fairly consistent between
programs, examining pure invocation counts and the percentage of the
bytecode instruction stream contained within some pure method reveals
great variation. We also show that while weakening purity definitions
exposes considerable dynamic purity, consumer requirements can limit
the actual utility of this information.
View the paper (.pdf)
BibTeX entry
|
Obfuscating Java: the most pain for the least gain
|
Authors: Michael Batchelder and Laurie Hendren
Date: March 2007
International Conference on Compiler Construction (CC 2007), Braga, Portugal.
Abstract
Bytecode, Java's binary form, is relatively high-level and therefore susceptible to decompilation attacks. An obfuscator transforms code such that it becomes more complex and therefore harder to reverse engineer. We develop bytecode obfuscations that are complex to reverse engineer but also do not significantly degrade performance. We present three kinds of techniques that: (1) obscure intent at the operational level; (2) complicate control flow and object-oriented design (i.e. program structure); and (3) exploit the semantic gap between what is legal in source code and what is legal in bytecode. Obfuscations are applied to a benchmark suite to examine their effect on runtime performance, control flow graph complexity, and decompilation. These results show that most of the obfuscations have only minor negative performance impacts and many increase complexity. In almost all cases, tested decompilers fail to produce legal source code or crash completely. Those obfuscations that are decompilable greatly reduce the readability of the output source code.
View the paper (.pdf)
Download the paper (.ps.gz)
BibTeX entry
|
Avoiding Infinite Recursion with Stratified Aspects
|
Authors: Eric Bodden, Florian Forster and Friedrich Steimann
Date: March 2006
Net.ObjectDays 2006 - published in: GI-Edition Lecture Notes in Informatics 'NODe 2006 GSEM 2006'
Abstract
Infinite recursion is a known problem of aspect-oriented programming with AspectJ: if no special precautions are taken, aspects which advise other aspects can easily and unintentionally advise themselves. We present a compiler for an extension of the AspectJ programming language that avoids self reference by associating aspects with levels, and by automatically restricting the scope of pointcuts used by an aspect to join points of lower levels. We report on a case study using our language extension and quantify the changes necessary for migrating existing applications to it. Our results suggest that we can make programming with AspectJ simpler and safer, without restricting its expressive power unduly.
View the paper (.pdf)
BibTeX entry
|
Programmer-Friendly Decompiled Java
|
Authors: Nomair A. Naeem and Laurie Hendren
Date: March 2006
International Conference on Program Comprehension (ICPC 2006), Athens, Greece.
Abstract
Java decompilers convert Java class files to Java source. Java class files may be created by a
number of different tools including standard Java compilers, compilers for other languages
such as AspectJ, or other tools such as optimizers or obfuscators. There are two kinds of Java
decompilers, javac-specific decompilers that assume that the class file was created by a
standard javac compiler and tool-independent decompilers that can decompile arbitrary class
files, independent of the tool that created the class files. Typically javac-specific
decompilers produce more readable code, but they fail to decompile many class files produced
by other tools.
This paper tackles the problem of how to make a tool-independent decompiler, Dava, produce Java
source code that is programmer-friendly. In past work it has been shown that Dava can
decompile arbitrary class files, but often the output, although correct, is very different
from what a programmer would write and is hard to understand. Furthermore, tools like
obfuscators intentionally confuse the class files and this also leads to confusing decompiled
source files.
Given that Dava already produces correct Java abstract syntax trees (ASTs) for arbitrary class
files, we provide a new back-end for Dava. The back-end rewrites the ASTs to semantically
equivalent ASTs that correspond to code that is easier for programmers to understand. Our new
back-end includes a new AST traversal framework, a set of simple pattern-based transformations,
a structure-based data flow analysis framework and a collection of more advanced AST
transformations that use flow analysis information. We include several illustrative examples
including the use of advanced transformations to clean up obfuscated code.
View the paper (.pdf)
BibTeX entry
|
Context-sensitive points-to analysis: is it worth it?
|
Authors: Ondřej Lhoták and Laurie Hendren
Date: March 2006
15th International Conference on Compiler Construction (CC 2006)
Abstract
We present the results of an empirical study evaluating the precision
of subset-based points-to analysis with several variations of context sensitivity on
Java benchmarks of significant size. We compare the use of call site strings as the
context abstraction, object sensitivity, and the BDD-based context-sensitive
algorithm proposed by Zhu and Calman, and by Whaley and Lam. Our study includes
analyses that context-sensitively specialize only pointer variables, as well as ones
that also specialize the heap abstraction. We measure both characteristics of the
points-to sets themselves, as well as effects on the precision of client analyses. To
guide development of efficient analysis implementations, we measure the number
of contexts, the number of distinct contexts, and the number of distinct points-to
sets that arise with each context sensitivity variation. To evaluate precision, we
measure the size of the call graph in terms of methods and edges, the number of
devirtualizable call sites, and the number of casts statically provable to be safe.
The results of our study indicate that object-sensitive analysis implementations are
likely to scale better and more predictably than the other approaches; that
object-sensitive analyses are more precise than comparable variations of the other
approaches; that specializing the heap abstraction improves precision more than
extending the length of context strings; and that the profusion of cycles in Java call
graphs severely reduces precision of analyses that forsake context sensitivity in
cyclic regions.
View the paper (.pdf)
Download the paper (.ps.gz)
BibTeX entry
|
Dynamic Data Structure Analysis for Java Programs
|
Authors: Sokhom Pheng and Clark Verbrugge
Date: June 2006
ICPC 2006, Athens, Greece
Abstract
Analysis of dynamic data structure usage is useful for both program
understanding and for improving the accuracy of other program analyses.
Static analysis techniques, however, suffer from reduced accuracy in
complex situations, and do not necessarily give a clear picture of
runtime heap activity. We have designed and implemented a dynamic heap
analysis system that allows one to examine and analyze how Java programs
build and modify data structures. Using a complete execution trace from
a profiled run of the program, we build an internal representation that
mirrors the evolving runtime data structures. The resulting series of
representations can then be analyzed and visualized, and we show how to
use our approach to help understand how programs use data structures,
the precise effect of garbage collection, and to establish limits on
static data structure analysis. A deep understanding of dynamic data
structures is particularly important for modern, object-oriented
languages that make extensive use of heap-based data structures.
View the paper (.pdf)
BibTeX entry
|
Relative Factors in Performance Analysis of Java Virtual Machines
|
Authors: Dayong Gu and Clark Verbrugge and Etienne M. Gagnon
Date: June 2006
VEE 2006, Ottawa, Canada
Abstract
Many new Java runtime optimizations report relatively small,
single-digit performance improvements. On modern virtual and actual
hardware, however, the performance impact of an optimization can be
influenced by a variety of factors in the underlying systems. Using a
case study of a new garbage collection optimization in two different
Java virtual machines, we show the relative effects of issues that must
be taken into consideration when claiming an improvement. We examine the
specific and overall performance changes due to our optimization and
show how unintended side-effects can contribute to, and distort, the
final assessment. Our experience shows that VM and hardware concerns can
generate variances of up to 9.5% in whole program execution time.
Consideration of these confounding effects is critical to a good,
objective understanding of Java performance and optimization.
View the paper (.pdf)
View the presentation slides (.pdf)
BibTeX entry
|
Software Thread Level Speculation for the Java Language and Virtual
Machine Environment
|
Authors: Christopher J.F. Pickett and Clark Verbrugge
Date: October 2005
LCPC 2005, October 2005, Hawthorne, NY, USA
Abstract
Thread level speculation (TLS) has shown great promise as a strategy
for fine to medium grain automatic parallelisation, and in a hardware
context techniques to ensure correct TLS behaviour are now well
established. Software and virtual machine TLS designs, however,
require adherence to high level language semantics, and this can
impose many additional constraints on TLS behaviour, as well as open
up new opportunities to exploit language-specific information.
We present a detailed design for a Java-specific, software TLS system
that operates at the bytecode level, and fully addresses the problems
and requirements imposed by the Java language and VM
environment. Using SableSpMT, our research TLS framework, we
provide experimental data on the corresponding costs and benefits; we
find that exceptions, GC, and dynamic class loading have only a small
impact, but that concurrency, native methods, and memory model
concerns do play an important role, as does an appropriate,
language-specific runtime TLS support system. Full consideration
of language and execution semantics is critical to correct and
efficient execution of high level TLS designs, and our work here
provides a baseline for future Java or Java virtual machine
implementations.
View the paper (.pdf)
View the presentation slides (.pdf)
BibTeX entry
|
SableSpMT: A Software Framework for Analysing Speculative
Multithreading in Java
|
Authors: Christopher J.F. Pickett and Clark Verbrugge
Date: August 2005
PASTE 2005, September 2005, Lisbon, Portugal
Abstract
Speculative multithreading (SpMT) is a promising optimisation
technique for achieving faster execution of sequential programs on
multiprocessor hardware. Analysis of and data acquisition from such
systems is however difficult and complex, and is typically limited to
a specific hardware design and simulation environment. We have
implemented a flexible, software-based speculative multithreading
architecture within the context of a full-featured Java virtual
machine. We consider the entire Java language and provide a complete
set of support features for speculative execution, including return
value prediction. Using our system we are able to generate extensive
dynamic analysis information, analyse the effects of runtime feedback,
and determine the impact of incorporating static, offline information.
Our approach allows for accurate analysis of Java SpMT on existing,
commodity multiprocessor hardware, and provides a vehicle for further
experimentation with speculative approaches and optimisations.
View the paper (.pdf)
View the presentation slides (.pdf)
BibTeX entry
|
(P)NFG: A Language and Runtime System for Structured Computer Narratives
|
Authors: Christopher J.F. Pickett and Clark Verbrugge and Félix Martineau
Date: August 2005
GameOn'NA 2005, August 2005, Montréal, Québec, Canada
Abstract
Complex computer game narratives can suffer from logical consistency
and playability problems if not carefully constructed, and current
state-of-the-art design tools do little to help analysis or ensure
good narrative properties. A formally-grounded system that
allows for relatively easy design and analysis is therefore
desirable. We present a language and an environment for
expressing game narratives based on a structured form of Petri Net,
the Narrative Flow Graph. Our "(P)NFG" system provides
a simple, high level view of narrative programming that maps onto a
low level representation suitable for expressing and analysing game
properties. The (P)NFG framework is demonstrated experimentally
by modelling narratives based on non-trivial interactive fiction
games, and integrates with the NuSMV model checker. Our system
provides a necessary component for systematic analysis of computer
game narratives, and lays the foundation for all-around improvements
to game quality.
View the paper (.pdf)
BibTeX entry
|
A Study of Type Analysis for Speculative Method Inlining in a JIT Environment
|
Authors: Feng Qian and Laurie Hendren
Date: April 2005
CC 2005
Abstract
Method inlining is one of the most important optimizations for achieving a
high-performance JIT compiler in Java virtual machines. A type
analysis allows the compiler to directly inline monomorphic calls. At
runtime, the compiler and type analysis have to handle dynamic class
loading properly because the analysis result is only correct at
compile time. Loading of new classes could invalidate previous
analysis results and optimizations. Class hierarchy analysis (CHA)
has been used successfully in JIT compilers for speculative inlining
with various invalidation techniques as backup.
In this paper, we present the results of a limit study of method
inlining using dynamic type analysis on a set of standard Java
benchmarks. We developed a general type analysis framework for measuring
the effectiveness of several well-known type analyses, including CHA,
RTA, XTA and VTA. Surprisingly, the simple dynamic CHA is nearly as
good as an ideal type analysis for inlining virtual method calls. It
leaves no room for other type analyses to improve. On the other hand,
only reachability-based interprocedural type analysis (VTA) is able to
capture the majority of monomorphic interface calls. We measured the
runtime overhead of interprocedural type analysis in the JIT
environment. To overcome the memory overhead of dynamic whole-program
analysis, we outlined the design of a demand-driven inter-procedural
type analysis for inlining hot interface calls.
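As a rough illustration of how class hierarchy analysis identifies monomorphic calls, and why dynamic class loading can invalidate the result, consider the following Python sketch over a toy hierarchy. The `Hierarchy` class and its methods are hypothetical and do not come from the paper; the sketch ignores interfaces and abstract classes, and treats every loaded class as a possible receiver type.

```python
class Hierarchy:
    """Toy class hierarchy for class hierarchy analysis (CHA)."""

    def __init__(self):
        self.superclass = {}   # class name -> superclass name (or None)
        self.declares = {}     # class name -> set of method names it defines

    def load_class(self, name, superclass=None, methods=()):
        self.superclass[name] = superclass
        self.declares[name] = set(methods)

    def is_subtype(self, cls, ancestor):
        # Walk up the superclass chain looking for the ancestor.
        while cls is not None:
            if cls == ancestor:
                return True
            cls = self.superclass[cls]
        return False

    def resolve(self, cls, method):
        """Return the class whose definition of `method` an instance of cls runs."""
        while cls is not None:
            if method in self.declares[cls]:
                return cls
            cls = self.superclass[cls]
        return None

    def targets(self, static_type, method):
        """All implementations a call on a receiver of static_type may reach."""
        return {
            self.resolve(cls, method)
            for cls in self.superclass
            if self.is_subtype(cls, static_type)
        } - {None}

    def is_monomorphic(self, static_type, method):
        return len(self.targets(static_type, method)) == 1


h = Hierarchy()
h.load_class("Shape", methods=["area"])
h.load_class("Circle", superclass="Shape")
before = h.is_monomorphic("Shape", "area")
# Loading a class that overrides area() invalidates the earlier result.
h.load_class("Square", superclass="Shape", methods=["area"])
after = h.is_monomorphic("Shape", "area")
```

A JIT that had inlined `Shape.area` based on the first answer would need some invalidation mechanism once `Square` is loaded, which is the situation the abstract describes.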
View the paper (.ps)
|
Using inter-procedural side-effect information in JIT optimizations
|
Authors: Anatole Le, Ondřej Lhoták and Laurie Hendren
Date: April 2005
CC 2005
Abstract
Inter-procedural analyses such as side-effect analysis can provide
information useful for performing aggressive optimizations. We present
a study of whether side-effect information improves performance in
just-in-time (JIT) compilers, and if so, what level of analysis
precision is needed.
We used Spark, the inter-procedural analysis component of the Soot Java
analysis and optimization framework, to compute side-effect information
and encode it in class files. We modified Jikes RVM, a research JIT,
to make use of side-effect analysis in local common sub-expression
elimination, heap SSA, redundant load elimination and loop-invariant
code motion. On the SpecJVM98 benchmarks, we measured the static number
of memory operations removed, the dynamic counts of memory reads eliminated,
and the execution time.
Our results show that the use of side-effect analysis increases the
number of static opportunities for load elimination by up to 98%,
and reduces dynamic field read instructions by up to 27%. Side-effect
information enabled speedups in the range of 1.08x to 1.20x for some
benchmarks. Finally, among the different levels of precision of
side-effect information, a simple side-effect analysis is usually
sufficient to obtain most of these speedups.
View the paper (.ps)
BibTeX entry
|
abc: An extensible AspectJ compiler
|
Authors: Pavel Avgustinov, Aske Simon Christensen, Laurie Hendren, Sascha Kuzins, Jennifer Lhoták, Ondřej Lhoták, Oege de Moor, Damien Sereni, Ganesh Sittampalam, and Julian Tibble
Date: March 2005
AOSD 2005
Abstract
Research in the design of aspect-oriented programming languages
requires a workbench that facilitates easy experimentation with
new language features and implementation techniques. In particular,
new features for AspectJ have been proposed that require extensions
in many dimensions: syntax, type checking and code generation, as well as
data flow and control flow analyses.
The AspectBench Compiler (abc) is an implementation of such a workbench.
The base version of abc implements the full AspectJ language.
Its frontend is built, using the Polyglot framework, as a modular
extension of the Java language. The use of Polyglot
gives flexibility of syntax and type checking.
The backend is built using the Soot framework, to give modular code
generation and analyses.
In this paper, we outline the design of abc, focusing mostly on how
the design supports extensibility. We then provide a general overview of how
to use abc to implement an extension. Finally,
we illustrate the extension mechanisms of abc through a number of
small, but non-trivial, examples. abc is freely available under
the GNU LGPL.
View the paper (.ps)
BibTeX entry
|
Code Layout as a Source of Noise in JVM Performance
|
Authors: Dayong Gu, Clark Verbrugge and Etienne Gagnon
Date: October 2004
CAMP04, October 2004, Vancouver, BC, Canada
Abstract
We describe the effect of a particular form of
"noise" in benchmarking. We investigate the source of anomalous
measurement data in a series of optimization strategies that attempt
to improve runtime performance in the garbage collector of a Java
virtual machine. The results of our experiments can be explained in
terms of the difference in code layout, and hence instruction and data
cache behaviour. We show that unintended changes in code layout due to
code modifications as trivial as symbol renaming can contribute up to
2.7% of measured machine cycle cost, 20% in data cache misses, and 37%
in instruction cache misses.
View the paper (.pdf)
View the presentation slides (.ppt)
BibTeX entry
|
Return Value Prediction in a Java Virtual Machine
|
Authors: Christopher J.F. Pickett and Clark Verbrugge
Date: September 2004
VPW2, October 2004, Boston, MA, USA
Abstract
We present the design and implementation of return value prediction in
SableVM, a Java Virtual Machine.
We give detailed results for the full
SPEC JVM98 benchmark suite, and compare our results with previous,
more limited data.
At the performance limit of existing last value, stride, 2-delta
stride, parameter stride, and context (FCM) sub-predictors in a
hybrid, we achieve an average accuracy of 72%.
We describe and characterize a new table-based memoization predictor
that complements these predictors nicely, yielding an
increased average hybrid accuracy of
81%.
VM level information about data widths provides a 35%
reduction in space, and
dynamic allocation and expansion of per-callsite hashtables allows for
highly accurate prediction with an average per-benchmark requirement
of 119 MB for the context predictor and
43 MB for the memoization
predictor.
As far as we know, this is the first implementation of non-trace-based
return value prediction within a JVM.
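As a rough illustration of two of the sub-predictors named above, the following Python sketch models a last-value predictor and a stride predictor keyed by call site. The class names, the `accuracy` helper and the restriction to integer return values are assumptions made for this example; it is not SableVM's implementation and it omits the hybrid, context and memoization predictors.

```python
class LastValuePredictor:
    """Predicts that a call site returns the same value as last time."""

    def __init__(self):
        self.last = {}  # call site -> most recent return value

    def predict(self, site):
        return self.last.get(site)  # None until the site has been seen

    def update(self, site, actual):
        self.last[site] = actual


class StridePredictor:
    """Predicts the last value plus the difference of the last two values."""

    def __init__(self):
        self.last = {}    # call site -> most recent return value
        self.stride = {}  # call site -> most recent difference

    def predict(self, site):
        if site not in self.last:
            return None
        return self.last[site] + self.stride.get(site, 0)

    def update(self, site, actual):
        if site in self.last:
            self.stride[site] = actual - self.last[site]
        self.last[site] = actual


def accuracy(predictor, site, returns):
    """Fraction of return values in `returns` that were predicted correctly."""
    hits = 0
    for value in returns:
        if predictor.predict(site) == value:
            hits += 1
        predictor.update(site, value)  # train on the observed value
    return hits / len(returns)


if __name__ == "__main__":
    counter = [10, 20, 30, 40, 50]  # hypothetical return values of one site
    print(accuracy(LastValuePredictor(), "site0", counter))
    print(accuracy(StridePredictor(), "site0", counter))
```

On a sequence that grows by a constant amount, the last-value predictor never hits, while the stride predictor hits once it has seen two values; on a constant sequence the two behave identically.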
View the paper (.pdf)
View the presentation slides (.pdf)
BibTeX entry
|
A Practical MHP Information Analysis for Concurrent Java Programs
|
Authors: Lin Li and Clark Verbrugge
Date: September 2004
LCPC 2004, September 2004, West Lafayette, IN, USA
Abstract
In this paper we present an implementation of May Happen in Parallel
analysis for Java that attempts to address some of the practical
implementation concerns of the original work. We describe a design
that incorporates techniques for aiding a feasible implementation and
expanding the range of acceptable inputs. We provide experimental
results showing the utility and impact of our approach and
optimizations using a variety of concurrent benchmarks.
View the paper (.pdf)
BibTeX entry
|
Jedd: A BDD-based Relational Extension of Java
|
Authors: Ondřej Lhoták and Laurie Hendren
Date: April 2004
PLDI 2004, June 2004, Washington, D.C., USA
Abstract
In this paper we present Jedd, a language extension to Java that supports
a convenient way of programming with Binary Decision Diagrams (BDDs).
The Jedd language abstracts BDDs as database-style relations and operations
on relations, and provides static type rules to ensure that relational
operations are used correctly.
The paper provides a description of the Jedd language and reports on the
design and implementation of the Jedd translator and associated runtime
system. Of particular interest is the approach to assigning attributes
from the high-level relations to physical domains in the underlying BDDs, which
is done by expressing the constraints as a SAT problem and using a modern
SAT solver to compute the solution. Further, a runtime system is
defined that handles memory management issues and supports a browsable
profiling tool for tuning the key BDD operations.
The motivation for designing Jedd was to support the development of whole
program analyses based on BDDs, and we have used Jedd to express five
key interrelated whole program analyses in our Soot compiler framework.
We provide some examples of this application and discuss our experiences
using Jedd.
View the paper (.pdf)
Download the paper (.ps.gz)
BibTeX entry
|
Towards Dynamic Interprocedural Analysis in JVMs
|
Authors: Feng Qian and Laurie Hendren
Date: May 2004
VM 2004, May 2004, San Jose, USA
Abstract
This paper presents a new, inexpensive, mechanism for constructing a
complete call graph for Java programs at runtime, and provides an
example of using the mechanism for implementing a dynamic
reachability-based interprocedural analysis (IPA), namely dynamic XTA.
Reachability-based IPAs, such as points-to analysis and escape
analysis, require a context-insensitive call graph of the analyzed
program. Computing a call graph at runtime presents several
challenges. First, the overhead must be low. Second, when
implementing the mechanism for languages such as Java, both
polymorphism and lazy class loading must be dealt with correctly and
efficiently. We propose a new, low-cost, mechanism for constructing
runtime call graphs in a JIT environment. The mechanism uses a
profiling code stub to capture the first execution of a call edge, and
adds at most one more instruction to repeated call edge invocations.
Polymorphism and lazy class loading are handled transparently. The
call graph is constructed incrementally, and it supports optimistic
analysis and speculative optimizations with invalidations.
We also developed a dynamic, reachability-based type analysis, dynamic
XTA, as an application of runtime call graphs. It also serves as an
example of handling lazy class loading in dynamic IPAs.
The dynamic call graph construction algorithm and dynamic version of
XTA have been implemented in Jikes RVM. We present empirical
measurements of the overhead of call graph profiling and compare the
characteristics of call graphs built using our profiling code stubs
with conservative ones constructed by using dynamic class hierarchy
analysis (CHA).
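The idea of a stub that records a call edge on its first execution and then gets out of the way can be modelled loosely in Python. This is only a sketch under assumptions: `CallGraphProfiler`, its slot table and the string names for callers and callees are hypothetical, and a real JIT would patch machine code or dispatch tables rather than a dictionary.

```python
class CallGraphProfiler:
    """Toy model: each call slot starts as a stub that records the edge once."""

    def __init__(self):
        self.edges = set()  # discovered (caller, callee) call edges
        self.slots = {}     # (caller, callee) -> callable currently installed

    def install(self, caller, callee, target):
        """Install a profiling stub in front of `target` for this call edge."""

        def stub(*args):
            # First execution of this edge: record it in the call graph.
            self.edges.add((caller, callee))
            # Patch the slot so later calls go straight to the target.
            self.slots[(caller, callee)] = target
            return target(*args)

        self.slots[(caller, callee)] = stub

    def call(self, caller, callee, *args):
        """Invoke whatever is currently installed for this call edge."""
        return self.slots[(caller, callee)](*args)


if __name__ == "__main__":
    profiler = CallGraphProfiler()
    profiler.install("main", "square", lambda x: x * x)
    profiler.call("main", "square", 3)
    profiler.call("main", "square", 4)
    print(profiler.edges)
```

Edges that never execute are never recorded, which is what makes a call graph built this way smaller than a conservative one.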
View the paper (.pdf)
Download the paper (.ps.gz)
Slides
|
Integrating the Soot compiler infrastructure into an IDE
|
Authors: Jennifer Lhoták, Ondřej Lhoták, and Laurie Hendren
Date: April 2004
CC 2004, April 2004, Barcelona, Spain
Abstract
This paper presents the integration of Soot, a byte-code analysis and
transformation framework, with an integrated development environment (IDE),
Eclipse. Such an integrated toolkit is useful for both the compiler
developer, to aid in understanding and debugging new analyses,
and also for the end-user of the IDE, to aid in program
understanding by exposing semantic information gathered by the advanced
compiler analyses. The paper discusses these advantages and provides
concrete examples of its usefulness.
There are several major challenges to overcome in developing the integrated
toolkit, and the paper discusses three major challenges and the solutions
to those challenges. An overview of Soot and the integrated toolkit is
given, followed by a more detailed discussion of the fundamental components.
The paper concludes with several illustrative examples of using the
integrated toolkit along with a discussion of future plans and research.
View the paper (.pdf)
Download the paper (.ps.gz)
BibTeX entry
|
Visualizing Program Analysis with the Soot-Eclipse Plugin
|
Authors: Jennifer Lhoták and Ondřej Lhoták
Date: April 2004
eTX (at ETAPS) 2004, March 2004, Barcelona, Spain
Abstract
Our integration of the Soot bytecode manipulation framework into the
Eclipse IDE forms a powerful tool for graphically visualizing both
the progress and output of program analyses. We demonstrate several
examples of the visualizations that we have developed, and explain how
they are useful for both compiler research and teaching.
View the paper (.pdf)
BibTeX entry
|
Dynamic Metrics for Java
|
Authors: Bruno Dufour, Karel Driesen, Laurie Hendren and Clark Verbrugge
Date: November 2003
OOPSLA 2003
Abstract
In order to perform meaningful experiments in optimizing compilation
and run-time system design, researchers usually rely on a suite of
benchmark programs of interest to the optimization
technique under consideration. Programs are described
as numeric, memory-intensive, concurrent,
or object-oriented, based on a qualitative appraisal,
in some cases with little justification. We believe it is beneficial
to quantify the behaviour of programs with a concise and precisely
defined set of metrics, in order to make these intuitive notions of program
behaviour more concrete and subject to experimental validation.
We therefore define and measure a set of unambiguous, dynamic, robust
and architecture-independent metrics that can be used to categorize
programs according to their dynamic behaviour in five areas:
size, data structure, memory use, concurrency, and polymorphism.
A framework computing some of these metrics for Java programs is
presented along with specific results demonstrating how to use metric
data to understand a program's behaviour, and both guide and evaluate
compiler optimizations.
View the paper (.pdf)
View the presentation slides
BibTeX entry
|
EVolve, an Open Extensible Software Visualization Framework
|
Authors: Qin Wang, Wei Wang, Rhodes Brown, Karel Driesen, Bruno Dufour, Laurie Hendren and Clark Verbrugge
Date: June 2003
ACM Symposium on Software Visualization 2003
Abstract
Existing visualization tools typically do not allow easy extension by new
visualization techniques, and are often coupled with inflexible data input
mechanisms. This paper presents EVolve, a flexible and extensible framework
for visualizing program characteristics and behaviour. The framework is
flexible in the sense that it can visualize many kinds of data, and it is
extensible in the sense that it is quite straightforward to add new kinds of
visualizations.
The overall architecture of the framework consists of the core EVolve platform
that communicates with data sources via a well-defined data protocol
and which communicates with visualization methods via a visualization protocol.
Given a data source, an end-user can use EVolve as a stand-alone tool by interactively
creating, configuring and modifying visualizations. A variety of visualizations are
provided in the current EVolve library, with features that facilitate the
comparison of multiple views on the same execution data. We demonstrate
EVolve in the context of visualizing execution behaviour of Java programs.
View the paper (.pdf)
|
Points-to Analysis using BDDs
|
Authors: Marc Berndl, Ondřej Lhoták, Feng Qian, Laurie Hendren and Navindra Umanee
Date: April 2003
PLDI 2003, June 2003, San Diego, USA
Abstract
This paper reports on a new approach to solving a subset-based
points-to analysis for Java using Binary Decision Diagrams (BDDs).
In the model checking community, BDDs have been shown to be very effective for
representing large sets and solving very large verification problems.
Our work shows that BDDs can also be very effective for developing a
points-to analysis that is simple to implement and that
scales well, in both space and time, to large programs.
The paper first introduces BDDs and operations on BDDs using some
simple points-to examples. Then, a complete subset-based points-to
algorithm is presented, expressed completely using BDDs and BDD
operations. This algorithm is then refined by finding appropriate
variable orderings and by making the algorithm propagate sets incrementally, in order to
arrive at a very efficient algorithm.
Experimental results are given to justify the
choice of variable ordering, to demonstrate the improvement due to
incrementalization, and to compare the performance of the BDD-based
solver to an efficient hand-coded graph-based solver. Finally,
based on the results of the BDD-based solver, a variety of BDD-based queries
are presented, including the points-to query.
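To make the notions of subset-based propagation and incrementalization concrete, here is a small Python sketch that uses ordinary sets in place of BDDs. It is a deliberately simplified stand-in, not the paper's algorithm: it handles only allocations and simple assignments (no field loads or stores), and the function name `points_to` and its input format are assumptions for this example.

```python
from collections import defaultdict


def points_to(allocs, assigns):
    """Subset-based points-to analysis over allocations and copies.

    allocs:  iterable of (var, heap_object) for statements `var = new ...`
    assigns: iterable of (dst, src) for statements `dst = src`
    """
    pts = defaultdict(set)   # var -> objects it may point to
    succ = defaultdict(set)  # src -> variables that src is assigned to
    for dst, src in assigns:
        succ[src].add(dst)

    # Incremental propagation: only objects not seen before are pushed on.
    delta = defaultdict(set)
    for var, obj in allocs:
        delta[var].add(obj)

    while delta:
        var, new = delta.popitem()
        new = new - pts[var]  # keep only the genuinely new objects
        if not new:
            continue
        pts[var] |= new
        for dst in succ[var]:
            delta[dst] |= new
    return dict(pts)


if __name__ == "__main__":
    allocs = [("a", "o1"), ("b", "o2")]
    assigns = [("c", "a"), ("c", "b"), ("d", "c")]
    print(points_to(allocs, assigns))
```

The `delta` map plays the role of incrementalization: each round forwards only the objects a variable has just gained, so the loop terminates even when assignments form a cycle.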
View the paper (.pdf)
Download the paper (.ps.gz)
Presentation slides (.pdf)
Presentation slides (.ps)
BibTeX entry
|
Dynamic Profiling and Trace Cache Generation
|
Authors: Marc Berndl and Laurie Hendren
Date: March 2003
CGO'03, March 2003, San Francisco, USA
Abstract
Dynamic program optimization is increasingly important for achieving
good runtime performance. A key issue is how to select which code to
optimize. One approach is to dynamically detect traces, long
sequences of instructions spanning multiple methods, which are likely
to execute to completion. Traces are easy to optimize and have been
shown to be a good unit for optimization.
This paper reports on a new approach for dynamically detecting,
creating and storing traces in a Java virtual machine. We first
describe four important criteria for a successful trace strategy: good
instruction stream coverage, low dispatch rate, cache stability, and
optimizability of traces. We then present our approach based on
branch correlation graphs. A branch correlation graph stores
information about the correlation between pairs of branches, as well
as additional state information.
We present the complete design for an efficient implementation of the
system, including a detailed discussion of the trace cache and
profiling mechanisms. We have implemented an experimental framework
to measure the traces generated by our approach in a direct-threaded
Java VM (SableVM), and we present experimental results to show that the
traces we generate meet the design criteria.
View the technical report (pdf)
|
Design, Implementation and Evaluation of Adaptive Recompilation with On-Stack Replacement
|
Authors: Stephen J. Fink (IBM T.J. Watson) and Feng Qian
Date: March 2003
CGO'03, March 23-26, San Francisco, USA
Abstract
Modern virtual machines often maintain multiple compiled versions of a
method. An on-stack replacement (OSR) mechanism enables a virtual
machine to transfer execution between compiled versions, even while a
method runs. Relying on this mechanism, the system can exploit
powerful techniques to reduce compile time and code space, dynamically
de-optimize code, and invalidate speculative optimizations.
This paper presents a new, simple, mostly compiler-independent
mechanism to transfer execution into compiled code. Additionally, we
present enhancements to an analytic model for recompilation to exploit
OSR for more aggressive optimization. We have implemented these
techniques in Jikes RVM and present a comprehensive evaluation,
including a study of fully automatic, online, profile-driven deferred
compilation.
Paper available upon request.
|
CC2003: Effective Inline-Threaded Interpretation of Java Bytecode Using Preparation Sequences
|
Authors: Etienne Gagnon and Laurie Hendren
Date: January 2003
CC 2003, April 2003, Warsaw, Poland
Abstract
Inline-threaded interpretation is a recent technique that improves
performance by eliminating dispatch overhead within basic blocks for
interpreters written in C. The dynamic class loading,
lazy class initialization, and multi-threading features of Java reduce
the effectiveness of a straightforward implementation of this
technique within Java interpreters. In this paper, we introduce
preparation sequences, a new technique that solves the particular
challenge of effectively inline-threading Java. We have implemented
our technique in the SableVM Java virtual machine, and our
experimental results show that using our technique, inline-threaded
interpretation of Java, on a set of benchmarks, achieves a speedup
ranging from 1.20 to 2.41 over switch-based interpretation, and a
speedup ranging from 1.15 to 2.14 over direct-threaded interpretation.
Download the paper (.ps.gz)
View the paper (.pdf)
|
CC2003: Scaling Java Points-To Analysis using Spark
|
Authors: Ondřej Lhoták and Laurie Hendren
Date: January 2003
CC 2003, April 2003, Warsaw, Poland
Abstract
Most points-to analysis research has been done on different systems by
different groups, making it difficult to compare results, and to understand
interactions between individual factors each group studied.
Furthermore, points-to analysis for Java has been studied much less
thoroughly than for C, and the tradeoffs appear very different.
We introduce Spark, a flexible framework for experimenting with
points-to analyses for Java. Spark supports equality- and subset-based
analyses, variations in field sensitivity, respect for declared types,
variations in call graph construction, off-line simplification, and
several solving algorithms. Spark is composed of building blocks on
which new analyses can be based.
We demonstrate Spark in a substantial study of factors affecting
precision and efficiency of subset-based points-to analyses, including
interactions between these factors. Our results show that Spark is
not only flexible and modular, but also offers superior time/space
performance when compared to other points-to analysis implementations.
|
PASTE02-2: STEP: A Framework for the Efficient Encoding of General Trace Data
|
Authors: Rhodes Brown, Karel Driesen, David Eng, Laurie Hendren, John Jorgensen, Clark Verbrugge and Qin Wang
Date: November 2002
PASTE 2002, Charleston, SC, USA
Abstract
Traditional tracing systems are often limited to recording a fixed set
of basic program events. This limitation can frustrate an application
or compiler developer who is trying to understand and characterize the
complex behavior of software systems such as a Java program running on
a Java Virtual Machine. In the past, many developers have resorted to
specialized tracing systems that target a particular type of program
event. This approach often results in an obscure and poorly documented
encoding format which can limit the reuse and sharing of potentially
valuable information. To address this problem, we present STEP, a
system designed to provide profiler developers with a standard method
for encoding general program trace data in a flexible and compact
format. The system consists of a trace data definition language along
with a compiler and an architecture that simplifies the client
interface by encapsulating the details of encoding and interpretation.
|
PASTE02-1: Combining Static and Dynamic Data in Code Visualization
|
Authors: David Eng
Date: November 2002
PASTE 2002, Charleston, SC, USA
Abstract
The task of developing, tuning, and debugging compiler optimizations is a
difficult one which can be facilitated by software visualization. There
are many characteristics of the code which must be considered when
studying the kinds of optimizations which can be performed. Both static
data collected at compile-time and dynamic runtime data can reveal
opportunities for optimization and affect code transformations. In order
to expose the behavior of such complex systems, visualizations should
include as much information as possible and accommodate the different
sources from which this information is acquired.
This paper presents a visualization framework designed to address these
issues. The framework is based on a new, extensible language called JIL
which provides a common format for encapsulating intermediate
representations and associating them with compile-time and runtime data.
We present new contributions which extend existing compiler and profiling
frameworks, allowing them to export the intermediate languages, analysis
results, and code metadata they collect as JIL documents. Visualization
interfaces can then combine the JIL data from separate tools, exposing
both static and dynamic characteristics of the underlying code. We
present such an interface in the form of a new web-based visualizer,
allowing JIL documents to be visualized online in a portable,
customizable interface.
|
JGI02: Run-time Evaluation of Opportunities for Object Inlining in Java
|
Authors: Ondřej Lhoták and Laurie Hendren
Date: September 2002
JGI'02, November 2002, Seattle, WA, USA
Abstract
Object-oriented languages, such as Java, encourage the use of many small
objects linked together by field references, instead of a few monolithic
structures. While this practice is beneficial from a program design
perspective, it can slow down program execution by incurring many
pointer indirections. One solution to this problem is object inlining:
when the compiler can safely do so, it fuses small objects together,
thus removing the reads/writes to the removed field, saving the memory
needed to store the field and object header, and reducing the number of
object allocations.
The objective of this paper is to measure the potential for object inlining
by studying the run-time behaviour of a comprehensive set of Java programs.
We study the traces of program executions in order to determine which
fields behave like inlinable fields. Since we are using dynamic information
instead of a static analysis, our results give an upper bound on what
could be achieved via a static compiler-based approach.
Our experimental results measure the potential improvements
attainable with object inlining, including reductions in the numbers of
field reads and writes, and reduced memory usage.
Our study shows that some Java programs can benefit significantly
from object inlining, with close to a 10% speedup. Somewhat to our
surprise, our study found one case, the db benchmark,
where the most important inlinable field was the result of unusual
program design, and fixing this small flaw led to both better performance
and clearer program design. However, the opportunities for
object inlining are highly dependent on the individual program being
considered, and are in many
cases very limited. Furthermore, fields that are inlinable also have
properties that make them potential candidates for other optimizations such
as removing redundant memory accesses.
The memory savings possible through object inlining are moderate.
|
ISMM2002: An Adaptive, Region-based Allocator for Java
|
Authors: Feng Qian and Laurie Hendren
Date: April 22, 2002
ISMM'02, June 2002, Berlin, Germany
Abstract
This paper introduces an adaptive, region-based allocator for Java.
The basic idea is to allocate non-escaping objects in local regions,
which are allocated and freed in conjunction with their associated
stack frames. By releasing memory associated with these stack frames,
the burden on the garbage collector is reduced, possibly resulting in
fewer collections.
The novelty of our approach is that it does not require static escape
analysis, programmer annotations, or special type systems. The
approach is transparent to the Java programmer and relatively simple
to add to an existing JVM. The system starts by assuming that all
allocated objects are local to their stack region, and then catches
escaping objects via write barriers. When an object is caught
escaping, its associated allocation site is marked as a non-local
site, so that subsequent allocations will be put directly in the
global region. Thus, as execution proceeds, only those allocation
sites that are likely to produce non-escaping objects are allocated to
their local stack region.
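The adaptive policy described above can be sketched as a small Python model. All names here (`Obj`, `AdaptiveRegionAllocator`, `enter_method`, `exit_method`, `write_barrier`) are hypothetical, and moving a caught object into the global region is an assumption of this sketch; the abstract only states that the allocation site is marked non-local.

```python
class Obj:
    """A heap object that remembers its allocation site and current region."""

    def __init__(self, site, region):
        self.site = site
        self.region = region


class AdaptiveRegionAllocator:
    def __init__(self):
        self.nonlocal_sites = set()  # sites caught producing escaping objects
        self.global_region = []      # managed by the garbage collector
        self.local_regions = []      # one region per active stack frame

    def enter_method(self):
        self.local_regions.append([])

    def exit_method(self):
        """Free the frame's local region; return how many objects it held."""
        return len(self.local_regions.pop())

    def allocate(self, site):
        # Optimistic default: every site is local until proven otherwise.
        if site in self.nonlocal_sites or not self.local_regions:
            region = self.global_region
        else:
            region = self.local_regions[-1]
        obj = Obj(site, region)
        region.append(obj)
        return obj

    def write_barrier(self, obj):
        """Called when a store would let `obj` outlive its stack frame."""
        if obj.region is not self.global_region:
            obj.region.remove(obj)  # assumption: rescue the object itself
            obj.region = self.global_region
            self.global_region.append(obj)
            self.nonlocal_sites.add(obj.site)  # adapt future allocations


if __name__ == "__main__":
    heap = AdaptiveRegionAllocator()
    heap.enter_method()
    kept = heap.allocate("siteA")
    leaked = heap.allocate("siteB")
    heap.write_barrier(leaked)      # siteB is now marked non-local
    later = heap.allocate("siteB")  # goes straight to the global region
    print(heap.exit_method(), len(heap.global_region))
```

After the first escape from a site, every later allocation from that site bypasses the local region, so the write barrier only pays its cost while the allocator is still learning.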
The paper presents the overall idea, and then provides details of a
specific design and implementation. In particular, we present a
region-based allocator and the necessary modifications of the Jikes RVM
baseline JIT and a copying collector. Our experimental study
evaluates the idea using the SPEC JVM98 benchmarks, plus one other large
benchmark. We show that a region-based allocator is a reasonable
choice, that overheads can be kept low, and that the adaptive system
is successful at finding local regions that contain no escaping
objects.
|
CC2002: Decompiling Java Bytecode: Problems, Traps and Pitfalls
|
Authors: Jerome Miecznikowski and Laurie Hendren
Date: February 2002
CC'02, April 2002, Grenoble, France
Abstract
Java virtual machines execute Java bytecode instructions. Since this
bytecode is a higher level representation than traditional object code, it
is possible to decompile it back to Java source. Many such decompilers
have been developed and the conventional wisdom is that decompiling Java
bytecode is relatively simple. This may be true when decompiling bytecode
produced directly from a specific compiler, most often Sun's javac
compiler. In this case it is really a matter of inverting a known
compilation strategy. However, there are many problems, traps and
pitfalls when decompiling arbitrary verifiable Java bytecode. Such
bytecode could be produced by other Java compilers, Java bytecode
optimizers or Java bytecode obfuscators. Java bytecode can also be
produced by compilers for other languages, including Haskell, Eiffel, ML,
Ada and Fortran. These compilers often use very different code generation
strategies from javac.
This paper outlines the problems and solutions we have found in our
development of Dava, a decompiler for arbitrary Java bytecode. We first
outline the problems in assigning types to variables and literals, and the
problems due to expression evaluation on the Java stack. Then, we look at
finding structured control flow with a particular emphasis on how to deal
with Java exceptions and synchronized blocks. Throughout the paper we
provide small examples which are not properly decompiled by commonly used
decompilers.
Authors: Feng Qian, Laurie Hendren and Clark Verbrugge
Date: February 2002
CC'02, April 2002, Grenoble, France
Abstract
This paper reports on a comprehensive approach to eliminating array
bounds checks in Java. Our approach is based upon three analyses. The
first analysis is a flow-sensitive
intraprocedural analysis called variable constraint analysis
(VCA). This analysis builds a small constraint graph for each
important point in a method, and then uses the information encoded in
the graph to infer the relationship between array index expressions
and the bounds of the array. Using VCA as the base analysis, we also
show how two further analyses can improve the results of VCA.
Array field analysis is applied on each class and provides
information about some arrays stored in fields, while rectangular
array analysis is an interprocedural analysis to approximate the
shape of arrays, and is useful for finding rectangular (non-ragged)
arrays.
We have implemented all three analyses using the Soot bytecode
optimization/annotation framework and we transmit the results of the
analysis to virtual machines using class file attributes. We have
modified the Kaffe JIT, and IBM's High Performance Compiler for Java
(HPCJ) to make use of these attributes, and we demonstrate significant
speedups.
Authors: Jerome Miecznikowski and Laurie Hendren
Date: October 2001
Abstract
This paper presents an approach to program structuring for use in
decompiling Java bytecode to Java source. The structuring approach uses
three intermediate representations: (1) a list of typed, aggregated
statements with an associated exception table, (2) a control flow graph,
and (3) a structure encapsulation tree.
The approach works in six distinct stages, with each stage focusing on a
specific family of Java constructs, and each stage contributing more
detail to the structure encapsulation tree. After completion of all
stages the structure encapsulation tree contains enough information to
allow a simple extraction of a structured Java program.
The approach targets general Java bytecode including bytecode that may be
the result of front-ends for languages other than Java, and also bytecode
that has been produced by a bytecode optimizer. Thus, the techniques have
been designed to work for bytecode that may not exhibit the typical
structured patterns of bytecode produced by a standard Java compiler.
The structuring techniques have been implemented as part of the Dava
decompiler which has been built using the Soot framework.
Authors: Patrice Pominville, Feng Qian, Raja Vallée-Rai, Laurie Hendren and Clark Verbrugge
Date: November 2000
Abstract
This paper presents a framework for supporting the optimization of Java programs using attributes in Java class files. We show how
class file attributes may be used to convey both optimization opportunities and profile information to a variety of Java virtual machines
including ahead-of-time compilers and just-in-time compilers.
We present our work in the context of Soot, a framework that supports the analysis and transformation of Java bytecode (class files).
We demonstrate the framework with attributes for elimination of array bounds and null pointer checks, and we provide experimental
results for the Kaffe just-in-time compiler, and IBM's High Performance Compiler for Java ahead-of-time compiler.
Winner of the "best paper that is primarily the work of a student" award.
Authors: Etienne Gagnon and Laurie Hendren
Date: April 2001
Conference: Java Virtual Machine Research and Technology Symposium (JVM '01)
Abstract
SableVM is an open-source virtual machine for Java
intended as a research framework for efficient
execution of Java bytecode.
The framework is essentially composed
of an extensible bytecode interpreter using state-of-the-art
and innovative techniques.
Written in the C programming language, and assuming
minimal system dependencies, the interpreter emphasizes high-level
techniques to support efficient execution.
In particular, we introduce a bidirectional layout for object
instances that groups reference fields sequentially to allow
efficient garbage collection. We also introduce
a sparse interface virtual table layout that reduces the cost
of interface method calls to that of normal virtual calls.
Finally, we present a technique to improve thin locks
by eliminating busy-wait in presence of contention.
Authors: Vijay Sundaresan, Laurie Hendren, Chrislain Razafimahefa, Raja Vallée-Rai, Patrick Lam, Etienne Gagnon, and Charles Godin
Date: October 2000
Abstract
This paper addresses the problem of resolving virtual method and
interface calls in Java bytecode.
The main focus is on a new practical technique that can
be used to analyze large applications.
Our fundamental design goal was to develop a technique that can be solved
with only one iteration, and thus scales linearly with the size of the
program,
while at the same time providing
more accurate results than two popular existing linear techniques,
class hierarchy analysis and rapid type analysis.
We present two variations of our new technique, variable-type analysis
and a coarser-grain version called declared-type analysis.
Both of these analyses are inexpensive, easy to implement,
and our experimental results show that they scale linearly in
the size of the program.
We have implemented our new analyses
using the Soot framework, and we report on
empirical results for seven benchmarks.
We have used our techniques to build
accurate call graphs for complete applications (including libraries)
and we show that compared to a conservative call graph built
using class hierarchy analysis, our new variable-type analysis
can remove a significant number of nodes (methods) and call edges.
Further, our results show that we can improve upon the compression obtained
using rapid type analysis.
We also provide dynamic measurements of monomorphic call sites, focusing
on the benchmark code excluding libraries. We demonstrate that when
considering only the benchmark code,
both rapid type analysis and our new declared-type analysis do not add much
precision over class hierarchy analysis. However, our finer-grained
variable-type analysis does resolve significantly more
call sites, particularly for programs with more complex uses of objects.
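The difference in precision between the analyses named above can be illustrated with a small sketch. The class hierarchy, the example program, and all function names are hypothetical. The propagation step uses a plain fixpoint loop for brevity rather than the single-iteration formulation the abstract describes, so it should be read as an illustration of the idea, not as the paper's algorithm.

```python
# Hypothetical hierarchy: Shape is the root, Circle and Square extend it.
SUBCLASSES = {"Shape": ["Circle", "Square"], "Circle": [], "Square": []}


def subtypes(cls):
    """Return cls and all of its transitive subclasses."""
    result = {cls}
    for sub in SUBCLASSES[cls]:
        result |= subtypes(sub)
    return result


def cha_targets(declared_type):
    """Class hierarchy analysis: any subtype of the declared type may be the receiver."""
    return subtypes(declared_type)


def rta_targets(declared_type, instantiated):
    """Rapid type analysis: keep only classes instantiated somewhere in the program."""
    return subtypes(declared_type) & instantiated


def propagate_types(allocations, assignments):
    """Propagate allocated types along assignment edges until nothing changes.

    allocations maps a variable to the classes allocated directly into it;
    assignments is a list of (source, destination) pairs for `destination = source`.
    """
    types = {var: set(classes) for var, classes in allocations.items()}
    changed = True
    while changed:
        changed = False
        for src, dst in assignments:
            before = len(types.setdefault(dst, set()))
            types[dst] |= types.get(src, set())
            changed |= len(types[dst]) != before
    return types


# Program: a = new Circle(); b = new Square(); c = a; c.draw();
allocations = {"a": {"Circle"}, "b": {"Square"}}
assignments = [("a", "c")]
var_types = propagate_types(allocations, assignments)

cha = cha_targets("Shape")
rta = rta_targets("Shape", {"Circle", "Square"})
vta = cha & var_types["c"]
```

For the call `c.draw()`, the hierarchy-based set contains all three classes, the instantiation-based set drops `Shape`, and the per-variable set keeps only `Circle`, because only a `Circle` ever flows into `c`.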
Authors: Patrice Pominville, Feng Qian, Raja Vallée-Rai, Laurie Hendren and Clark Verbrugge
Date: November 2000
Abstract
This paper presents a framework for supporting the optimization of Java programs using attributes in Java class files. We show how
class file attributes may be used to convey both optimization opportunities and profile information to a variety of Java virtual machines
including ahead-of-time compilers and just-in-time compilers.
We present our work in the context of Soot, a framework that supports the analysis and transformation of Java bytecode (class files).
We demonstrate the framework with attributes for elimination of array bounds and null pointer checks, and we provide experimental
results for the Kaffe just-in-time compiler, and IBM's High Performance Compiler for Java ahead-of-time compiler.
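As a rough illustration of how such information can travel inside a class file, the sketch below encodes one attribute using the generic class-file attribute structure: a two-byte name index, a four-byte length, and a payload, all big-endian. The payload format, the flag value, and the name index are invented for this example and are not the format used by the framework described here.

```python
import struct


def encode_attribute(name_index, info):
    """Encode one attribute_info structure: u2 name index, u4 length, then the payload."""
    return struct.pack(">HI", name_index, len(info)) + info


def encode_check_flags(entries):
    """Hypothetical payload: (bytecode offset, flag) pairs, u2 and u1 each."""
    return b"".join(struct.pack(">HB", offset, flag) for offset, flag in entries)


def decode_attribute(data):
    """Split an encoded attribute back into its name index and payload."""
    name_index, length = struct.unpack(">HI", data[:6])
    return name_index, data[6:6 + length]


# Flag value 1 is an invented marker meaning "bounds check proven unnecessary".
payload = encode_check_flags([(12, 1), (40, 1)])
attribute = encode_attribute(name_index=7, info=payload)
```

Because the length is stored explicitly, a virtual machine that does not recognize the attribute name can skip the payload, which is what makes attributes a convenient carrier for optional optimization hints.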
Authors: Etienne Gagnon, Laurie Hendren and Guillaume Marceau
Date: June-July 2000
Abstract
Even though Java bytecode has a significant amount of type
information embedded in it,
there are no explicit types for local variables.
However, knowing types for local variables is very useful
for both program optimization and decompilation.
In this paper, we present an efficient and practical
algorithm for inferring static types for local variables
in a 3-address, stackless representation of Java bytecode.
By decoupling the type inference problem from the
low level bytecode representation, and abstracting it into a
constraint system, we show that there exists verifiable
bytecode that cannot be statically typed. Further, we show that,
without transforming the program, the static typing problem
is NP-hard. In order to develop a practical approach we
have developed an algorithm that works efficiently for the
usual cases and then applies efficient program transformations to
simplify the hard cases.
Our solution is a multi-stage algorithm.
In the first stage, we
propose an efficient algorithm that infers static types for most
bytecode found in practice. In case this
stage fails, the second stage is applied. It consists of a simple
and efficient variable splitting operation that renders
most bytecode typeable using the algorithm of stage
one. Finally, for completeness of the algorithm, we present a
final stage that efficiently transforms and infers types for all
remaining bytecode (such bytecode is likely to be a contrived example,
and not code produced from a compiler).
We have implemented this algorithm in the Soot framework. Our
experimental results show that all of the 17,000 methods used
in our tests were successfully typed, 99.8% of those required only
the first stage, 0.2% required the second stage, and no methods
required the third stage.
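One familiar way a local variable can lack a single obvious static type is sketched below. The hierarchy and all names are hypothetical, and the code is not the constraint system of the paper; it only computes the most specific common supertypes of the types assigned to a variable. When two classes share two unrelated interfaces, more than one candidate remains.

```python
# Hypothetical hierarchy: each type maps to its direct supertypes.
SUPERTYPES = {
    "Object": [],
    "Readable": ["Object"],
    "Closeable": ["Object"],
    "Animal": ["Object"],
    "Dog": ["Animal"],
    "Cat": ["Animal"],
    "FileStream": ["Object", "Readable", "Closeable"],
    "Socket": ["Object", "Readable", "Closeable"],
}


def ancestors(t):
    """Return t together with all of its transitive supertypes."""
    result = {t}
    for parent in SUPERTYPES[t]:
        result |= ancestors(parent)
    return result


def candidate_types(assigned):
    """Most specific types that are supertypes of every assigned type."""
    common = set.intersection(*(ancestors(t) for t in assigned))
    # Drop any candidate that is a proper supertype of another candidate.
    return {c for c in common
            if not any(c != other and c in ancestors(other) for other in common)}


single = candidate_types(["Dog", "Cat"])
ambiguous = candidate_types(["FileStream", "Socket"])
```

Here `single` contains just `Animal`, while `ambiguous` contains both `Readable` and `Closeable`, so a later use of the variable decides which choice is acceptable, if any.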
Authors: Raja Vallée-Rai, Etienne Gagnon, Laurie Hendren, Patrick Lam, Patrice Pominville, and Vijay Sundaresan
Date: March-April 2000
Abstract
This paper presents Soot, a framework for optimizing Java(tm) bytecode. The
framework is implemented in Java and supports three intermediate representations
for representing Java bytecode: Baf, a streamlined representation of Java's
stack-based bytecode;
Jimple, a typed three-address intermediate
representation suitable for optimization; and Grimp, an aggregated version of
Jimple.
Our approach to class file optimization is to first convert the stack-based
bytecode into Jimple, a three-address form more amenable to traditional program
optimization, and then convert the optimized Jimple back to bytecode.
In order to demonstrate that our approach is feasible, we present
experimental results showing the effects of processing class files through
our framework. In particular, we study the techniques necessary to effectively
translate Jimple back to bytecode, without losing performance. Finally, we
demonstrate that class file optimization can be quite effective by
showing the results of some basic optimizations using our framework.
Our experiments
were done on ten benchmarks, including seven SPECjvm98 benchmarks, and were
executed on five different Java virtual machine implementations.
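The conversion from stack-based code to a three-address form described above can be sketched by simulating the operand stack symbolically. The instruction set, the temporary naming scheme, and the function name are hypothetical and far simpler than real bytecode or Jimple; control flow and typing are ignored.

```python
def to_three_address(stack_code):
    """Translate a tiny stack-based instruction list into three-address statements.

    Supported (hypothetical) instructions: ("load", var), ("push", constant),
    ("add",), ("mul",), ("store", var).
    """
    stack = []        # names or constants currently on the simulated operand stack
    statements = []
    temp_count = 0
    for instr in stack_code:
        op = instr[0]
        if op in ("load", "push"):
            stack.append(str(instr[1]))
        elif op in ("add", "mul"):
            right = stack.pop()
            left = stack.pop()
            temp = f"$t{temp_count}"
            temp_count += 1
            symbol = "+" if op == "add" else "*"
            statements.append(f"{temp} = {left} {symbol} {right}")
            stack.append(temp)
        elif op == "store":
            statements.append(f"{instr[1]} = {stack.pop()}")
        else:
            raise ValueError(f"unsupported instruction: {op}")
    return statements


# Stack code for: x = (a + b) * 2
code = [("load", "a"), ("load", "b"), ("add",), ("push", 2), ("mul",), ("store", "x")]
result = to_three_address(code)
```

Each stack operation becomes an explicit statement with named operands, which is the property that makes a three-address form easier to feed into traditional optimizations.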
Authors: Raja Vallée-Rai, Laurie Hendren, Vijay Sundaresan, Patrick Lam, Etienne Gagnon and Phong Co
Date: September 1999
Abstract
This paper presents Soot, a framework for optimizing Java(tm) bytecode. The
framework is implemented in Java and supports three intermediate representations
for representing Java bytecode: Baf, a streamlined representation of bytecode
which is simple to manipulate; Jimple, a typed 3-address intermediate
representation suitable for optimization; and Grimp, an aggregated version of
Jimple suitable for decompilation. We describe the motivation for each
representation, and the salient points in translating from one representation to
another.
In order to demonstrate the usefulness of the framework, we have implemented
intraprocedural and whole program optimizations. To show that whole program
bytecode optimization can give performance improvements, we provide experimental
results for 12 large benchmarks, including 8 SPECjvm98 benchmarks running on JDK
1.2 for GNU/Linux(tm). These results show up to 8% improvement when the
optimized bytecode is run using the interpreter and up to 21% when run using the
JIT compiler.
| TOOLS98: SableCC, an Object-Oriented Compiler Framework |
Authors: Etienne Gagnon and Laurie J. Hendren
Date: August 1998
Abstract
In this paper, we introduce SableCC, an object-oriented framework that generates
compilers (and interpreters) in the Java programming language. This framework is based on
two fundamental design decisions. Firstly, the framework uses object-oriented techniques
to automatically build a strictly-typed abstract syntax tree that matches the grammar of
the compiled language, which simplifies debugging. Secondly, the framework generates tree-walker
classes using an extended version of the visitor design pattern, which enables the
implementation of actions on the nodes of the abstract syntax tree using inheritance. These
two design decisions lead to a tool that supports a shorter development cycle for constructing
compilers.
To demonstrate the simplicity of the framework, we present all the steps of building an
interpreter for a mini-BASIC language. This example could easily be modified to provide
an embedded scripting language in an application. We also provide a brief description of
larger systems that have been implemented using the SableCC tool.
We conclude that the use of object-oriented techniques significantly reduces the length of
the programmer-written code, can shorten the development time, and makes the code
easier to read and maintain.
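The idea of attaching actions to syntax-tree nodes through inheritance can be sketched with a small visitor. The node classes, the `case_` method naming, and the walker classes are hypothetical stand-ins, not the classes SableCC generates.

```python
class Node:
    def apply(self, visitor):
        # Double dispatch: pick the visitor method named after the node class.
        return getattr(visitor, "case_" + type(self).__name__)(self)


class Num(Node):
    def __init__(self, value):
        self.value = value


class Add(Node):
    def __init__(self, left, right):
        self.left, self.right = left, right


class DepthFirstWalker:
    """Generic walker: visits children and does nothing else."""

    def case_Num(self, node):
        pass

    def case_Add(self, node):
        node.left.apply(self)
        node.right.apply(self)


class NumCounter(DepthFirstWalker):
    """An action added by inheritance: only the interesting case is overridden."""

    def __init__(self):
        self.count = 0

    def case_Num(self, node):
        self.count += 1


class Evaluator(DepthFirstWalker):
    def case_Num(self, node):
        return node.value

    def case_Add(self, node):
        return node.left.apply(self) + node.right.apply(self)


tree = Add(Num(1), Add(Num(2), Num(3)))
counter = NumCounter()
tree.apply(counter)
total = tree.apply(Evaluator())
```

`NumCounter` inherits the traversal and overrides a single method, which is the kind of saving in hand-written code that the abstract attributes to the tree-walker approach.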
Last updated Fri Apr 11 23:53:33 EDT 2003.
|