[Soot-list] throw analysis
John Jorgensen
jorgnsn at lcd.uregina.ca
Wed Jun 22 01:01:35 EDT 2005
rugang> so maybe i have a incorrect notion of what's right
rugang> exception behavior. So I have:
rugang> try { . . . }
rugang> catch (Exception e) { . . .}
rugang> catch (Error e) {. . .}
rugang> finally{...}
rugang> so it seems that the correct cfg will not have an
rugang> edge from every statement in the try block to the
rugang> finally block because any of those exceptions /
rugang> errors thrown will be caught in the previous catch
rugang> blocks.
rugang> However, the unit throw analysis seems to add those
rugang> unecessary edges from the statements in the try block
rugang> to the finally block. Am i missing something here?
The reason that there are CFG edges from within your "try" to the
"finally" is not that the "try" statements might throw an
exception that gets handled directly by the "finally"; the reason
is that the "try" statements might throw an exception to one of
the "catch"s, only to have the first instruction in that handler
throw an exception to the "finally".
This will be clearer if we flesh out your example:
int method(int k) {
int i = 0; //A
int j = 0;
try {
j = -1; //B
i = j/k; //C
} catch (Exception e) {
i = 5; //D
} catch (Error e) {
i = 6; //E
} finally {
return j/i; //F
}
}
and imagine that we want to write a compiler analysis that
performs constant propagation and tracks whether variables may or
may not have zero values.
>From your question, I imagine you expect this method to have the
following CFG edges (using the instruction labels that I included
in the inline comments above):
A->B (regular control flow)
B->C (regular control flow)
B->E (exceptional control flow, Errors are always possible)
C->D (exceptional control flow, ArithmeticException)
C->E (exceptional control flow)
D->F (exceptional control flow)
E->F (exceptional control flow)
Bear in mind, though, that the primary use of Soot CFGs is for
performing dataflow analyses, not human visualization of control
flow. And a dataflow analysis using those edges could produce
incorrect answers. The edges say that the only routes from A to
F are
A->B->C->D->F
A->B->C->E->F
A->B->E->F
from which our dataflow analysis could deduce that at statement F
neither i nor j could possibly have the value 0, since all the
paths include B, which assigns j -1, and either C, D or E, which
assign i non-zero values. So your dataflow analysis could deduce
that F cannot cause a division by zero.
But if k is zero, the exception will be thrown to D before any
assignment to i. So an exception thrown from instruction C to
instruction D needs to create a CFG edge from B to D, not from C
to D. This allows dataflow analyses to correctly deduce that
when you reach D, the computation at B must have been completed
(because B cannot through an Exception, only an Error), but the
computation at C must not have been completed (if it had
completed, there would not have been an exception).
Now let us say that k is zero, and that at the instant that
control is transfered to D to catch the ArithmeticException, the
VM throws ThreadDeath or InternalError. Then control will be
transferred to F in the finally clause, potentially before 5 is
assigned to i. It is possible that i still has the value 0 at F,
and our CFG has to reflect that.
Moreover, if there is an error during the execution of
B, control could be transferred to E before the assignment of -1
to j, so j could also have a value of 0 at F.
Section 2.4 of
http://www.sable.mcgill.ca/publications/techreports/#report2003-3
goes further into just what a CFG edge should mean. Essentially...
- A CFG edge from instruction A to instruction B in Soot's CFGs
means that after some or all of A's computation is performed, B
might be the next instruction to execute (B _will_ be the next
instruction to execute if there is only one CFG edge leaving
A).
- if statement A is interrupted by an exception which is thrown
to a handler, B, in most cases _none_ of A's computation is
performed before the transfer of control to the exception
handler (if A has side effects--which essentially means if it
involves a method call--then some of those side effects might
be performed before the exception is thrown). So in the
absence of side effects, the CFG edges corresponding to
throwing the exception are _not_ (in the absence of side
effects) from A to B, but from each of A's predecessors to B.
If A does have side effects, then there is an edge from A to B,
as well as edges from each of A's predecessors to B.
(Actually, soot will include an edge from A to B even if A has no
side effects, unless you specify --omit-excepting-unit-edges,
possibly indirectly through --trim-cfgs.
--omit-excepting-unit-edges corresponds to the
omitExceptingUnitEdges parameter to ExceptionalUnitGraph's
constructor. As explained in the usage documentation, this
conservative inclusion of an edge from A to B even if A has no
side effects is a safeguard against producing unverifiale code,
since the specification of verification speaks of all
instructions in the scope of a handler possibly throwing
exceptions to it.)
For human visualization of CFGs, you might be better off with the
combination of regular control flow edges provided by
ExceptionalGraph.getUnexceptionalSuccsOF(), the black edges in
the dot files), and what the "exception destinations" provided by
ExceptionalUnitGraph.getExceptionDests() (the gray edges in the
dot files, provided you specify --show-exception-dests).
That combination would be closer to what you seem to be
expecting.
More information about the Soot-list
mailing list