Feng Qian (fqian@sable.mcgill.ca)
January 22, 2012
This note explains how to use Soot annotation options to add array bounds check and null pointer check attributes to a class file and how to use these attributes in a JIT or ahead-of-time compiler.
Soot provides static analyses for detecting safe array and object accesses in a method. These analyses mark array and object reference bytecodes as either safe or unsafe. The results of these analyses are encoded into the class file as attributes, which can then be understood by an interpreter or JIT compiler. If a bytecode is marked as safe in its attribute, the associated comparison instructions can be eliminated. This can speed up the execution of Java applications. Our process of encoding class files with attributes is called annotation.
Soot can be used as a compiler framework to support any attributes you would like to define; they can then be encoded into the class file. The process of adding new analyses and attributes is documented in ``Adding attributes to class files via Soot''.
Soot has new command-line options -annot-nullpointer and -annot-arraybounds to enable the phases required to emit null pointer check and array bounds check annotations, respectively.
Soot has some phase options to configure the annotation process. These phase options only take effect when annotation is enabled. Note that the array bounds check analysis and null pointer check analysis constitute two different phases, but that the results are combined and stored in the same attribute in the class files.
The null pointer check analysis has the phase name ``jap.npc''. It has one phase option (aside from the default option enabled).
Soot also has phase options for the array bounds check analysis. These options affect three levels of analysis: intraprocedural, class-level, and whole-program. The array bounds check analysis has the phase name ``jap.abc''. If the whole-program analysis is needed, an extra phase ``wjap.ra'', which finds rectangular arrays, is required. This phase can also be enabled with phase options.
By default, our array bounds check analysis is intraprocedural, since it only examines local variables. This is fast, but conservative. Other options can improve the analysis result; however, they usually make the analysis take longer, and some options assume that the application is single-threaded.
Annotate the benchmark in class file mode with both analyses:
java soot.Main -annot-nullpointer -annot-arraybounds spec.benchmarks._222_mpegaudio.Main
The options for rectangular arrays should be used in application mode. For example:
java soot.Main --app -annot-arraybounds -p wjap.ra with-wholeapp -p jap.abc with-all spec.benchmarks._222_mpegaudio.Main
The following command annotates only the array reference bytecodes:
java soot.Main -annot-nullpointer -annot-arraybounds -p jap.npc only-array-ref spec.benchmarks._222_mpegaudio.Main
All array reference bytecodes, such as ?aload and ?astore, will be annotated with bounds check information. Bytecodes that need a null pointer check are listed below:
?aload ?astore getfield putfield invokevirtual invokespecial invokeinterface arraylength monitorenter monitorexit athrow
The attributes in the class file are organized as a table. If a method has been annotated, it will have an ArrayNullCheckAttribute attribute on its Code_attribute. The data structure is defined as:
array_null_check_attribute {
    u2 attribute_name_index;
    u4 attribute_length;
    u3 attribute[attribute_length/3];
}
The attribute data consist of 3-byte entries. The first two bytes of each entry give the PC of the bytecode it belongs to; the third byte holds the annotation information.
soot_attr_entry {
    u2 PC;
    u1 value;
}
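As a minimal sketch of how a consumer might split the attribute data into entries, the following Java code reads each 3-byte entry as an unsigned 16-bit PC followed by an unsigned 8-bit value. The class name AnnotationEntryParser, its nested Entry type, and the parse method are hypothetical and not part of Soot; the code assumes the caller has already extracted the attribute_length bytes of data from the class file and relies on the big-endian byte order used throughout class files.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; the class and method names are not part of Soot.
final class AnnotationEntryParser {

    /** One 3-byte entry: a bytecode offset and its annotation byte. */
    static final class Entry {
        final int pc;     // u2 PC, read as an unsigned 16-bit value
        final int value;  // u1 value, read as an unsigned 8-bit value

        Entry(int pc, int value) {
            this.pc = pc;
            this.value = value;
        }
    }

    /** Splits the raw attribute data (attribute_length bytes) into entries. */
    static List<Entry> parse(byte[] data) {
        if (data.length % 3 != 0) {
            throw new IllegalArgumentException("length must be a multiple of 3");
        }
        // ByteBuffer is big-endian by default, matching class-file byte order.
        ByteBuffer buf = ByteBuffer.wrap(data);
        List<Entry> entries = new ArrayList<>(data.length / 3);
        while (buf.hasRemaining()) {
            int pc = buf.getShort() & 0xFFFF;
            int value = buf.get() & 0xFF;
            entries.add(new Entry(pc, value));
        }
        return entries;
    }

    public static void main(String[] args) {
        // Two made-up entries: PC 5 with value 4, and PC 258 with value 0.
        byte[] data = {0x00, 0x05, 0x04, 0x01, 0x02, 0x00};
        for (Entry e : parse(data)) {
            System.out.println("pc=" + e.pc + " value=" + e.value);
        }
    }
}
```

The masks with 0xFFFF and 0xFF matter because Java's short and byte types are signed, while the PC and value fields are unsigned.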
Entries are sorted by PC in ascending order when written into the class file. The right-most two bits of the `value' byte represent upper and lower bounds information. The third bit from the right is used for the nullness annotation. The other bits are unused and set to zero. A bit value of `1' indicates that the check is needed, and `0' represents a known-to-be-safe access. In general, the bounds check instructions can be eliminated only when both the lower and upper bounds are safe. However, this sometimes depends on the VM implementation.
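Because the entries are sorted by PC, a compiler can find the annotation for a given bytecode with a binary search. The following Java sketch illustrates the idea; the class AnnotationLookup and its valueAt method are hypothetical names. It also assumes that a bytecode with no entry should be treated conservatively, as if every check were needed; that fallback is a choice made for this example, not something the attribute format prescribes.

```java
import java.util.Arrays;

// Hypothetical helper; not part of Soot or of any particular VM.
final class AnnotationLookup {

    // Assumption: when a PC has no entry, keep all three checks (N, U, L set).
    static final int ALL_CHECKS_NEEDED = 0x07;

    /**
     * Returns the annotation byte for the bytecode at the given PC.
     * pcs must be sorted in ascending order, and values[i] belongs to pcs[i].
     */
    static int valueAt(int[] pcs, int[] values, int pc) {
        int index = Arrays.binarySearch(pcs, pc);
        return index >= 0 ? values[index] : ALL_CHECKS_NEEDED;
    }

    public static void main(String[] args) {
        int[] pcs = {3, 10, 42};      // made-up bytecode offsets
        int[] values = {0, 4, 3};     // made-up annotation bytes
        System.out.println(valueAt(pcs, values, 10));
        System.out.println(valueAt(pcs, values, 11));
    }
}
```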
0 0 0 0 0 N U L

N : nullness check
U : upper bounds check
L : lower bounds check
For example, the attribute data should be interpreted as:
0 0 0 0 0 1 x x  // need null check
0 0 0 0 0 0 x x  // no null check
                 // x x represent the array bounds check bits
0 0 0 0 0 0 0 0  // do not need null check or array bounds check
0 0 0 0 0 1 0 0  // need null check, but not array bounds check
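The bit layout can be decoded with three masks. The Java sketch below is one possible way to do so; the class CheckBits and its method names are hypothetical. The canDropBoundsCheck method follows the general rule that the bounds check is removed only when both the lower and upper bounds are safe, which a particular VM may refine.

```java
// Hypothetical decoder for the annotation byte; not part of Soot.
final class CheckBits {

    static final int LOWER_BOUND = 0x01; // L: right-most bit
    static final int UPPER_BOUND = 0x02; // U: second bit from the right
    static final int NULLNESS    = 0x04; // N: third bit from the right

    /** A set bit means the check is still needed. */
    static boolean needsNullCheck(int value) {
        return (value & NULLNESS) != 0;
    }

    static boolean needsLowerBoundCheck(int value) {
        return (value & LOWER_BOUND) != 0;
    }

    static boolean needsUpperBoundCheck(int value) {
        return (value & UPPER_BOUND) != 0;
    }

    /** True only when both bounds are known to be safe. */
    static boolean canDropBoundsCheck(int value) {
        return (value & (LOWER_BOUND | UPPER_BOUND)) == 0;
    }

    public static void main(String[] args) {
        int value = 0x04; // 0 0 0 0 0 1 0 0
        System.out.println("null check needed: " + needsNullCheck(value));
        System.out.println("bounds check removable: " + canDropBoundsCheck(value));
    }
}
```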
The translation was initiated by Eric Bodden on 2012-01-22