[Soot-list] throw analysis

Wed Jun 22 01:01:35 EDT 2005

    rugang> so maybe i have a incorrect notion of what's right
    rugang> exception behavior.  So I have:

    rugang> try { . . . }
    rugang> catch (Exception e) { . . .}
    rugang> catch (Error e) {. . .}
    rugang> finally{...}

    rugang> so it seems that the correct cfg will not have an
    rugang> edge from every statement in the try block to the
    rugang> finally block because any of those exceptions /
    rugang> errors thrown will be caught in the previous catch
    rugang> blocks.

    rugang> However, the unit throw analysis seems to add those
    rugang> unecessary edges from the statements in the try block
    rugang> to the finally block. Am i missing something here?

The reason that there are CFG edges from within your "try" to the
"finally" is not that the "try" statements might throw an
exception that gets handled directly by the "finally"; the reason
is that the "try" statements might throw an exception to one of
the "catch"s, only to have the first instruction in that handler
throw an exception to the "finally".

This will be clearer if we flesh out your example:

int method(int k) {
  int i = 0;        //A
  int j = 0;
  try {
    j = -1;         //B
    i = j/k;        //C
  } catch (Exception e) {
    i = 5;          //D
  } catch (Error e) {
    i = 6;          //E
  } finally {
    return j/i;     //F
  } 
}

and imagine that we want to write a compiler analysis that
performs constant propagation and tracks whether variables may or
may not have zero values.

From your question, I imagine you expect this method to have the
following CFG edges (using the instruction labels that I included
in the inline comments above):

  A->B (regular control flow)
  B->C (regular control flow)
  B->E (exceptional control flow, Errors are always possible)
  C->D (exceptional control flow, ArithmeticException)
  C->E (exceptional control flow)
  D->F (exceptional control flow)
  E->F (exceptional control flow)

Bear in mind, though, that the primary use of Soot CFGs is for
performing dataflow analyses, not human visualization of control
flow.  And a dataflow analysis using those edges could produce
incorrect answers.  The edges say that the only routes from A to
F are

  A->B->C->D->F
  A->B->C->E->F
  A->B->E->F

from which our dataflow analysis could deduce that at statement F
neither i nor j could possibly have the value 0, since all the
paths include B, which assigns j -1, and either C, D or E, which
assign i non-zero values.  So your dataflow analysis could deduce
that F cannot cause a division by zero.
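
As a quick check on that list of routes, here is a minimal Java
sketch (not part of Soot; the class and method names are made up
for illustration) that enumerates every path from A to F over the
seven edges listed above:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not a Soot class.
class NaivePaths {
    // The seven "expected" edges, keyed by source label.
    static Map<String, List<String>> naiveEdges() {
        Map<String, List<String>> g = new LinkedHashMap<>();
        g.put("A", Arrays.asList("B"));
        g.put("B", Arrays.asList("C", "E"));
        g.put("C", Arrays.asList("D", "E"));
        g.put("D", Arrays.asList("F"));
        g.put("E", Arrays.asList("F"));
        g.put("F", new ArrayList<String>());
        return g;
    }

    // Depth-first enumeration of all paths; the graph is acyclic.
    static List<String> paths(Map<String, List<String>> g,
                              String from, String to) {
        List<String> out = new ArrayList<>();
        walk(g, from, to, from, out);
        return out;
    }

    private static void walk(Map<String, List<String>> g, String node,
                             String to, String soFar, List<String> out) {
        if (node.equals(to)) {
            out.add(soFar);
            return;
        }
        for (String next : g.get(node)) {
            walk(g, next, to, soFar + "->" + next, out);
        }
    }

    public static void main(String[] args) {
        for (String p : paths(naiveEdges(), "A", "F")) {
            System.out.println(p);
        }
    }
}
```

Running it prints exactly the three routes above, every one of
which passes through B and then through D or E before reaching F.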

But if k is zero, the exception will be thrown to D before any
assignment to i.  So an exception thrown from instruction C to
instruction D needs to create a CFG edge from B to D, not from C
to D.  This allows dataflow analyses to correctly deduce that
when you reach D, the computation at B must have been completed
(because B cannot throw an Exception, only an Error), but the
computation at C must not have been completed (if it had
completed, there would not have been an exception).

Now let us say that k is zero, and that at the instant that
control is transferred to D to catch the ArithmeticException, the
VM throws ThreadDeath or InternalError.  Then control will be
transferred to F in the finally clause, potentially before 5 is
assigned to i.  It is possible that i still has the value 0 at F,
and our CFG has to reflect that.

Moreover, if there is an error during the execution of
B, control could be transferred to E before the assignment of -1
to j, so j could also have a value of 0 at F.
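
To make the difference concrete, here is a small Java sketch of
a "may be zero" dataflow analysis over the labels A..F.  It is
not Soot code: the class, the transfer function, and in
particular the choice of extra edges (B->D, B->F, A->E) are my
own reading of the two scenarios just described, and a real Soot
graph for this method would contain more edges than these.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical toy analysis, not a Soot class.
class MayBeZero {
    static final String[] NODES = {"A", "B", "C", "D", "E", "F"};

    // The seven edges the question seems to expect.
    static final String[][] NAIVE = {
        {"A", "B"}, {"B", "C"}, {"B", "E"}, {"C", "D"},
        {"C", "E"}, {"D", "F"}, {"E", "F"}
    };

    // Assumed extra edges: exceptions leave from predecessors.
    static final String[][] EXTRA = {
        {"B", "D"},  // C throws to D before assigning i
        {"B", "F"},  // D is interrupted before assigning i
        {"A", "E"}   // B throws an Error before assigning j
    };

    static String[][] refined() {
        String[][] all = Arrays.copyOf(NAIVE, NAIVE.length + EXTRA.length);
        System.arraycopy(EXTRA, 0, all, NAIVE.length, EXTRA.length);
        return all;
    }

    // Variables that may be zero after node n, given those before it.
    static Set<String> transfer(String n, Set<String> in) {
        Set<String> out = new TreeSet<>(in);
        switch (n) {
            case "A": out.add("i"); out.add("j"); break; // i = 0; j = 0
            case "B": out.remove("j"); break;            // j = -1
            case "C": out.add("i"); break;  // i = j/k, kept conservative
            case "D":                                    // i = 5
            case "E": out.remove("i"); break;            // i = 6
            default: break;                              // F only reads
        }
        return out;
    }

    // Iterate to a fixed point; return the set on entry to F.
    static Set<String> mayBeZeroAtF(String[][] edges) {
        Map<String, Set<String>> out = new HashMap<>();
        for (String n : NODES) out.put(n, new TreeSet<String>());
        Set<String> inF = new TreeSet<>();
        boolean changed = true;
        while (changed) {
            changed = false;
            for (String n : NODES) {
                Set<String> in = new TreeSet<>();
                for (String[] e : edges) {
                    if (e[1].equals(n)) in.addAll(out.get(e[0]));
                }
                if (n.equals("F")) inF = in;
                Set<String> o = transfer(n, in);
                if (!o.equals(out.get(n))) {
                    out.put(n, o);
                    changed = true;
                }
            }
        }
        return inF;
    }

    public static void main(String[] args) {
        System.out.println("naive:   " + mayBeZeroAtF(NAIVE));
        System.out.println("refined: " + mayBeZeroAtF(refined()));
    }
}
```

With only the naive edges the set at F is empty, so the analysis
would wrongly rule out a division by zero; with the three extra
edges both i and j show up as possibly zero at F.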

Section 2.4 of
http://www.sable.mcgill.ca/publications/techreports/#report2003-3
goes further into just what a CFG edge should mean.  Essentially...

- A CFG edge from instruction A to instruction B in Soot's CFGs
  means that after some or all of A's computation is performed, B
  might be the next instruction to execute (B _will_ be the next
  instruction to execute if there is only one CFG edge leaving
  A).

- If statement A is interrupted by an exception which is thrown
  to a handler, B, in most cases _none_ of A's computation is
  performed before the transfer of control to the exception
  handler (if A has side effects--which essentially means if it
  involves a method call--then some of those side effects might
  be performed before the exception is thrown).  So in the
  absence of side effects, the CFG edges corresponding to
  throwing the exception are _not_ from A to B, but from each of
  A's predecessors to B.

  If A does have side effects, then there is an edge from A to B,
  as well as edges from each of A's predecessors to B.

(Actually, Soot will include an edge from A to B even if A has no
side effects, unless you specify --omit-excepting-unit-edges,
possibly indirectly through --trim-cfgs.
--omit-excepting-unit-edges corresponds to the
omitExceptingUnitEdges parameter to ExceptionalUnitGraph's
constructor. As explained in the usage documentation, this
conservative inclusion of an edge from A to B even if A has no
side effects is a safeguard against producing unverifiable code,
since the specification of verification speaks of all
instructions in the scope of a handler possibly throwing
exceptions to it.)
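
The edge rule above, including the default just described, can
be modelled in a few lines of Java.  This is only a sketch of the
rule as stated in this message, not Soot's implementation; the
class and method are hypothetical, and the boolean is merely
named after the omitExceptingUnitEdges constructor parameter.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of the rule, not a Soot class.
class ThrowEdges {
    // Edges created when statement a may throw to handler b.
    static List<String> edgesToHandler(String a, List<String> predsOfA,
                                       String b, boolean hasSideEffects,
                                       boolean omitExceptingUnitEdges) {
        List<String> edges = new ArrayList<>();
        // None of a's computation is assumed done: leave from its preds.
        for (String p : predsOfA) {
            edges.add(p + "->" + b);
        }
        // The edge from a itself is kept if a has side effects, or if
        // the conservative default has not been switched off.
        if (hasSideEffects || !omitExceptingUnitEdges) {
            edges.add(a + "->" + b);
        }
        return edges;
    }

    public static void main(String[] args) {
        List<String> preds = Arrays.asList("B");
        // C throws to D; C has no side effects.
        System.out.println(edgesToHandler("C", preds, "D", false, false));
        System.out.println(edgesToHandler("C", preds, "D", false, true));
        // Same, but pretending C contained a method call.
        System.out.println(edgesToHandler("C", preds, "D", true, true));
    }
}
```

For the example method this prints [B->D, C->D] by default,
[B->D] once the excepting-unit edge is omitted, and
[B->D, C->D] again if C is treated as having side effects.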

For human visualization of CFGs, you might be better off with the
combination of the regular control flow edges provided by
ExceptionalGraph.getUnexceptionalSuccsOf() (the black edges in
the dot files) and the "exception destinations" provided by
ExceptionalUnitGraph.getExceptionDests() (the gray edges in the
dot files, provided you specify --show-exception-dests).
That combination would be closer to what you seem to be
expecting.