[Soot-list] Creating an abstracted call graph

Thu Feb 16 08:53:35 EST 2006

On Wed, Feb 15, 2006 at 11:06:37PM -0500, Eric Bodden wrote:
> This could certainly be solved by first building the usual call graph
> and then using Filter and TransitiveTargets. However, this would
> probably be quite slow.

The only thing slow about building the full call graph is that you have
to read all the methods you visit off the disk and jimplify them. You
still have to do this in your proposed algorithm. Compared to that,
building the call graph is peanuts.

Yes, doing lots of TransitiveTargets could be slow, because it would go
and visit things past what you're looking for. That is, it would do much
more work than you actually need. So, if you're worried about
efficiency, you would probably have to implement your graph compression
algorithm yourself.

> So I was wondering if it would be a better idea to modify/subclass
> OnFlyCallGraphBuilder in such a way that it creates this abstracttion
> right away. Basically, it would have modified in such a way that it only
> adds edges between nodes that satisfy the predicate. If one was about to
> add an edge ([M1],M2), then one would not add this edge (since M2 is not
> marked) but rather store [M1] (hence deleting M2) and at the next time
> one encounters a marked method (here [M3]) would add the edge
> ([M1},[M3]). At this time one would also have to adjust all incoming
> edges which went to M2 to now point to M3.

It seems to me that this proposed algorithm would only work correctly if
the call graph is a tree. In reality, it's usually not even a DAG.

Consider the following graph (in dot notation):
A -> B -> C
D -> B -> E
(where only the edges out of B satisfy your predicate)

A DFS would visit the nodes in, for example, this order:
A B C E D

This would give you the graph
A -> C
A -> E

But presumably you also want
D -> C
D -> E

> This of course all only works if the call graph builder builds the graph
> in a depth first order. Would it be safe to make such an assumption? Or
> does anyone have a better idea of how to achive this?

If you use CHA, the current implementation does a breadth-first
traversal. (I don't guarantee it's exactly breadth-first; it was never
a requirement, and there may be details I'm forgetting that make it not
exactly breadth-first.)

Using CHA, it's theoretically possible to implement a depth-first
traversal (again, unless I'm forgetting some nasty language detail).

If you use Spark for resolving call sites, it's not possible to do a
depth-first or a breadth-first search, because both of those visit every
node only once, and using points-to information to resolve call sites
requires, in general, visiting some call sites multiple times. That
is, I can construct an example program (and I've seen such examples in
practice) where both a breadth-first and a depth-first traversal would
give the wrong result.

I would suggest to just get Soot to build the graph, and then do your
transformation on the graph.

Note that your transformation itself may, in theory, significantly
increase the size of the call graph. Take the example I showed above,
but make it bigger by adding more nodes like A and D and more nodes like
C and E. Then your transformation squares the number of edges. One could
hope that this won't happen in practice, but it doesn't seem all that
unlikely. I guess the only way to tell is to try it.

Ondrej

> 
> Thanks a lot,
> 
> Eric
> 
> --
> Eric Bodden
> Sable Research Group, McGill University
> Montreal, Canada 
> _______________________________________________
> Soot-list mailing list
> Soot-list at sable.mcgill.ca
> http://mailman.cs.mcgill.ca/mailman/listinfo/soot-list
>