[Soot-list] Missing call graph edges

Steven Arzt Steven.Arzt at cased.de
Tue Feb 24 04:06:55 EST 2015


Hi Peter,

 

Do not add any transformers to the “cg” pack, that is too early. Only add your SceneTransformers to the “wjtp” pack which runs after the callgraph has been constructed and all necessary classes have already been loaded.

 

Also make sure that you have indeed enabled whole-program mode.

 

Best regards,

  Steven

 

From: Peter Kim [mailto:chpkim at gmail.com]
Sent: Monday, 23 February 2015 22:59
To: Steven Arzt
Cc: soot-list at cs.mcgill.ca
Subject: Re: [Soot-list] Missing call graph edges

 

Hi Steven,

 

I pointed to the fully implemented JAR files, but I'm still having the same problem. To see if the method implementations are actually being picked up by Soot, I tried to see if I can retrieve the body of java.util.ArrayList.get() in a SceneTransformer added to the call graph pack. I'm getting the following error:

 

This operation requires resolving level BODIES but java.util.ArrayList is at resolving level SIGNATURES
If you are extending Soot, try to add the following call before calling soot.Main.main(..):
Scene.v().addBasicClass(java.util.ArrayList,BODIES);
Otherwise, try whole-program mode (-w)

 

I tried addBasicClass(), whole program mode, and forceResolve() as well against "java.util.ArrayList", but I still cannot retrieve the body. Could you please let me know how I can get Soot to pick up the bodies of library implementations?

 

Thanks.

 

On Mon, Feb 23, 2015 at 1:56 PM, Steven Arzt <Steven.Arzt at cased.de> wrote:

Hi Peter,

 

There’s not much to modify. Just point FlowDroid to fully implemented JAR files instead of specifying the platforms directory of your Android SDK, which contains only the stub versions.

 

Best regards,

  Steven 

 

From: Peter Kim [mailto:chpkim at gmail.com]
Sent: Thursday, 19 February 2015 19:11
To: Steven Arzt
Cc: soot-list at cs.mcgill.ca
Subject: Re: [Soot-list] Missing call graph edges

 

Hi Steven,

 

Could you please tell me how/where to change Infoflow.java or a related file so that full, non-stub, library jars, i.e. android.jar, rt.jar, and jce.jar, are picked up? 

 

Thanks

 

On Tue, Feb 10, 2015 at 9:14 PM, Steven Arzt <Steven.Arzt at cased.de> wrote:

Hi Peter,

 

What a suitable filler for the gap should look like depends on your analysis problem. If you are ok with treating the library methods as black boxes, but want the incoming and outgoing call edges (i.e., those edges that cross the boundary of the library’s interface), there has been some work in the area. I am not aware of any readily available implementation on top of Soot, but for WALA there is Averroes:

 

                http://link.springer.com/chapter/10.1007/978-3-642-39038-8_16

 

The basic idea behind Averroes is to take a full implementation of a library and throw away everything that is not required for callgraph construction. This gives you a very lightweight stub that you can use when analyzing client programs while still maintaining a full callgraph.

 

Best regards,

  Steven

 

From: soot-list-bounces at CS.McGill.CA [mailto:soot-list-bounces at CS.McGill.CA] On Behalf Of Peter Kim
Sent: Tuesday, 10 February 2015 21:13
To: Steven Arzt
Cc: soot-list at cs.mcgill.ca
Subject: Re: [Soot-list] Missing call graph edges

 

Hi Steven,

 

Thanks for your clarification. What I was really trying to ask was whether there is an easy way to manually fill in the gaps that library stubs leave when constructing a call graph for Android apps with Spark. Basically, a model of the Android framework, like taint wrappers, but for precise Android call graph construction.

 

 

 

On Tue, Feb 10, 2015 at 7:45 PM, Steven Arzt <Steven.Arzt at cased.de> wrote:

Hi Peter,

 

There still seems to be some misunderstanding about the concept of taint wrappers. You write that you do not want to perform taint tracking. In that case, the taint wrappers provided by FlowDroid will not be of much help, so it doesn’t even matter what you put into EasyTaintWrapperSource.txt. Again: Taint wrappers have nothing to do with callgraph construction. They are a means for the taint analysis to get along with an incomplete callgraph and “fill the gaps” with respect to the semantics of taint tracking.

 

The conceptual problem I described in my last e-mail has nothing to do with how you obtain the callgraph in Soot either. Your callgraph will be incomplete. That’s what happens because of how the SPARK callgraph algorithm works.

 

The methods I explained in my last e-mail are ways to deal with the problem. You can either analyze your apps together with full OS / library implementations (with all the downsides this has), or you can extend your client analysis to work with an incomplete callgraph which is what I recommend. Besides that, if you are willing to greatly sacrifice callgraph precision, you might also give CHA a try. CHA does not depend on allocation sites, so the CHA callgraph will at least be somewhat sound (given the usual small print). For most analyses, CHA is however not an option due to its really heavy over-approximation.
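To make the CHA trade-off concrete, here is a minimal, self-contained sketch of class-hierarchy-based call resolution. It is a toy model, not Soot's implementation: the class ChaSketch, its maps, and the subclasses FadeTweet and MoveTweet are hypothetical names invented for illustration. Only BaseTweet and free() come from the thread.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class ChaSketch {
    // child -> parent. FadeTweet and MoveTweet are hypothetical subclasses.
    static final Map<String, String> PARENT = new LinkedHashMap<>();
    // class -> method subsignatures that the class declares itself
    static final Map<String, List<String>> DECLARES = new LinkedHashMap<>();

    static {
        PARENT.put("BaseTweet", "Object");
        PARENT.put("FadeTweet", "BaseTweet");
        PARENT.put("MoveTweet", "BaseTweet");
        DECLARES.put("Object", List.of());
        DECLARES.put("BaseTweet", List.of("void free()"));
        DECLARES.put("FadeTweet", List.of("void free()"));
        DECLARES.put("MoveTweet", List.of());
    }

    /** True if cls equals ancestor or inherits from it. */
    static boolean isSubtypeOf(String cls, String ancestor) {
        for (String c = cls; c != null; c = PARENT.get(c)) {
            if (c.equals(ancestor)) return true;
        }
        return false;
    }

    /** Walks up from cls to the first class declaring the method (virtual dispatch). */
    static String dispatch(String cls, String subsig) {
        for (String c = cls; c != null; c = PARENT.get(c)) {
            if (DECLARES.getOrDefault(c, List.of()).contains(subsig)) return c;
        }
        return null;
    }

    /** CHA: every subtype of the declared receiver type counts as a possible runtime type. */
    static Set<String> resolve(String declaredType, String subsig) {
        Set<String> targets = new TreeSet<>();
        for (String cls : DECLARES.keySet()) {
            if (!isSubtypeOf(cls, declaredType)) continue;
            String owner = dispatch(cls, subsig);
            if (owner != null) targets.add("<" + owner + ": " + subsig + ">");
        }
        return targets;
    }

    public static void main(String[] args) {
        // No allocation sites are consulted, only the declared type.
        System.out.println(resolve("BaseTweet", "void free()"));
    }
}
```

Because the result depends only on the declared type, the edges exist even when no object of that type is ever allocated. The price is that a receiver declared with a very general type pulls in every override in the hierarchy, which is the over-approximation mentioned above.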

 

If you choose to include the library code in your analysis, this has nothing to do with where you get your apps from. By “library”, I mean the Android platform implementation, the stuff that has been installed on your phone all along. By the way, “just an APK” is never sufficient. You always need some kind of library model; usually, you just use a very minimalistic one provided through the stub JAR files from the Android SDK.

 

Best regards,

  Steven

 

From: soot-list-bounces at CS.McGill.CA [mailto:soot-list-bounces at CS.McGill.CA] On Behalf Of Peter Kim
Sent: Tuesday, 10 February 2015 19:13
To: Steven Arzt
Cc: soot-list at cs.mcgill.ca
Subject: Re: [Soot-list] Missing call graph edges

 

Hi Steven,

 

Thanks for your response. I have some follow up questions:

 

- I changed "EasyTaintWrapperSource.txt" to include "<java.util.ArrayList: java.lang.Object get(int)>" but it's still not working. Is this the right way to add a taint wrapper?

 

- I am not interested in taint analysis. I only want a call graph for an Android app. Is what I am doing, i.e. extending Infoflow and getting call graph through Scene, the best way to get the call graph or is there a better way? Note that I'm getting the call graph *before* FlowDroid starts looking at sources/sinks.

 

- Suppose that I'm analyzing APKs found on the web. Will these ever come with full library implementations or will they have just stubs? So if I wanted to transform/analyze library code, it seems that it would not be possible to do so with just an APK.

 

 

On Tue, Feb 10, 2015 at 9:56 AM, Steven Arzt <Steven.Arzt at cased.de> wrote:

Hi Peter,

 

The callgraph edges have nothing to do with basic blocks. The difference here is that “remove” is called on the base object “objects” and “free” is called on “obj”. What you observe is that you have outgoing call edges for all calls on “objects”, but none for calls on “obj”. This usually means that the SPARK callgraph algorithm is unable to propagate allocation sites to the base object “obj”. Therefore, it cannot decide where the calls should go.

 

The question now is why the allocation site propagation fails for “obj”. I guess you are running FlowDroid with the Android JAR files from the official SDK? These JAR files are only stubs, so there is no real implementation of “java.util.ArrayList”. All methods inside this class (and all other system classes) only throw an exception (a RuntimeException with the message “Stub!”). Therefore, SPARK cannot know what the runtime type of the object returned by “get()” would be; it could be anything. In such a case, SPARK does not perform an over-approximation using CHA, but simply leaves out the respective edges.
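The effect described above can be sketched with a toy model. This is not SPARK's algorithm or any Soot API: the class AllocationSiteSketch and the two "library summary" maps are hypothetical and heavily simplified. The sketch only shows that a receiver with an empty points-to set yields no call edges.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class AllocationSiteSketch {
    static final String GET = "<java.util.ArrayList: java.lang.Object get(int)>";

    // Toy library summaries: the types of objects that can flow out of a method body.
    // A stub body only throws, so nothing ever flows out of it.
    static final Map<String, Set<String>> STUB_LIBRARY = Map.of(GET, Set.of());
    // Assumption: with a real body, the stored BaseTweet objects flow back out of get().
    static final Map<String, Set<String>> FULL_LIBRARY =
            Map.of(GET, Set.of("com.example.toyandroid.BaseTweet"));

    /** Points-to set of "obj" in the statement: obj = objects.get(i). */
    static Set<String> pointsToOfObj(Map<String, Set<String>> library) {
        return library.getOrDefault(GET, Set.of());
    }

    /** One call edge per allocated type reaching the receiver; none if the set is empty. */
    static Set<String> edgesFor(Set<String> receiverTypes, String subsig) {
        Set<String> edges = new TreeSet<>();
        for (String type : receiverTypes) {
            edges.add("<" + type + ": " + subsig + ">");
        }
        return edges;
    }

    public static void main(String[] args) {
        System.out.println("stub: " + edgesFor(pointsToOfObj(STUB_LIBRARY), "void free()"));
        System.out.println("full: " + edgesFor(pointsToOfObj(FULL_LIBRARY), "void free()"));
    }
}
```

With the stub summary the edge set for obj.free() is empty, while the call to get() itself still resolves because “objects” has a visible allocation site in the app code. That matches the pattern of edges reported in this thread.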

 

How to get around the problem? Firstly, you could use a full implementation of the Android system classes and analyze them along with your program. The disadvantage of this approach is that you are analyzing tens of megabytes of system code together with an app of a few kilobytes. Your memory consumption will likely blow up to tens of gigabytes due to all the allocation site propagation inside the system libraries and you will have to wait a long time for your callgraph.

 

Another idea would be to just live with the incomplete callgraph. That’s what FlowDroid does. We know that we don’t have call edges for some call sites. If we encounter such a situation during the taint propagation, we query a store of explicit models for library methods on how to continue with the taint propagation. You might want to read up on Taint Wrappers in the FlowDroid paper. This approach is fast and scalable, with the downside of having to provide these models by hand. In FlowDroid, we currently have a rule set that works pretty well for most of the Android API.
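The idea of querying explicit library models at call sites without edges can be illustrated as follows. This is a hedged sketch only: LibraryModelSketch, the Rule enum, and the rule names are hypothetical and do not reflect FlowDroid's actual taint wrapper format or API. Treating unmodeled methods as not propagating taint is likewise an assumption made to keep the example short.

```java
import java.util.Map;
import java.util.Optional;

final class LibraryModelSketch {
    /** Hypothetical rule kinds describing what a library call does with taint. */
    enum Rule { BASE_TAINTS_RETURN, ARG_TAINTS_BASE }

    // Hand-written models keyed by Soot-style method signatures.
    static final Map<String, Rule> MODELS = Map.of(
            "<java.util.ArrayList: java.lang.Object get(int)>", Rule.BASE_TAINTS_RETURN,
            "<java.util.ArrayList: boolean add(java.lang.Object)>", Rule.ARG_TAINTS_BASE);

    /** Looks up the model for a callee that has no callgraph edge. */
    static Optional<Rule> lookup(String signature) {
        return Optional.ofNullable(MODELS.get(signature));
    }

    /** Is the return value tainted after the call, given the taint state of the base object? */
    static boolean returnTainted(String signature, boolean baseTainted) {
        return lookup(signature)
                .map(rule -> rule == Rule.BASE_TAINTS_RETURN && baseTainted)
                .orElse(false); // assumption: unmodeled methods do not propagate taint
    }

    public static void main(String[] args) {
        String get = "<java.util.ArrayList: java.lang.Object get(int)>";
        System.out.println(returnTainted(get, true));
        System.out.println(returnTainted(get, false));
    }
}
```

The client analysis never needs the body of get(); it only needs a rule saying how taint moves across the call. This is why the approach scales, and also why every relevant library method needs a hand-written entry.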

 

This explanation is, by the way, also consistent with your observation that you get a call edge if you change “free” to a static method: In that case, you get a StaticInvokeExpr instead of a VirtualInvokeExpr and call resolution becomes trivial.

 

Best regards,

  Steven

 

 

M.Sc. M.Sc. Steven Arzt
Secure Software Engineering Group (SSE)
European Center for Security and Privacy by Design (EC SPRIDE)
Rheinstraße 75
D-64293 Darmstadt
Phone: +49 61 51 869-336
Fax: +49 61 51 16-72118
eMail: steven.arzt at ec-spride.de
Web: http://sse.ec-spride.de

 

 

 

 

From: Peter Kim [mailto:chpkim at gmail.com]
Sent: Monday, 9 February 2015 22:38
To: Sam Blackshear
Cc: Steven Arzt; soot-list at cs.mcgill.ca
Subject: Re: [Soot-list] Missing call graph edges

 

Hi Sam,

 

Yes, I changed the code, re-ran Soot, and Soot still doesn't report the edges. 

 

 

Hi Steve,

 

Note that when I change "free()" to a static method, the edge is reported, but when it is an instance method, it is not. In light of the discussion with Sam, I want to make it absolutely clear that the code runs fine even when it is an instance method. So in my view this is either a bug, or Infoflow is constructing a call graph that differs from the traditional one. Since you told me that you changed it back to return a traditional call graph, I think it deserves an investigation.

 

Thanks.

 

On Mon, Feb 9, 2015 at 9:31 PM, Sam Blackshear <samuel.blackshear at colorado.edu> wrote:

Call graph construction is typically flow-insensitive, so it is not precise enough to do the kind of reasoning you are doing in your head (i.e., for "objects.remove()" to be included in the call graph, "obj" cannot be null). If you are not familiar with the flow-insensitive call graph construction algorithms used by tools like Soot, this chapter is a good place to start: http://manu.sridharan.net/files/aliasAnalysisChapter.pdf

 

Now, if you changed the code in the way you described (adding one or more objects to the objects list), re-ran Soot, and Soot still does not report the edges, that is a problem for Soot (and thus goes beyond my expertise :)). But for the example program you posted, Soot's result is as expected for a flow-insensitive call graph construction algorithm.

 

- Sam

 

On Mon, Feb 9, 2015 at 2:22 PM, Peter Kim <chpkim at gmail.com> wrote:

Hi Sam,

 

The code snippet is the following:

 

BaseTweet<?> obj = objects.get(i);
if (obj.isFinished() && obj.isAutoRemoveEnabled) {
  objects.remove(i);
  obj.free();
}

 

For "objects.remove()" to be included in the call graph, "obj" cannot be null. So even without any object in the list, if "objects.remove()" is included, then "obj.free()" should be included as well.

 

Just to be absolutely sure though, I just ran the app with one object in the list and made sure that the true branch is executed. Soot still returns only "remove()" in the call graph. I also made sure that "free()" prints output (meaning it shouldn't be excluded from the call graph).

 

 

On Mon, Feb 9, 2015 at 9:08 PM, Sam Blackshear <samuel.blackshear at colorado.edu> wrote:

Peter,

  What you observe is consistent with what I explained. The objects.remove() method is included because objects is initialized to a non-null ArrayList. However, obj.free() is not included because the analysis is smart enough to determine that there is no possible concrete execution in which the method obj.free() will be called (the statement obj.free() will throw an exception *if* it ever executes, because obj will always be null).

 

- Sam

 

On Mon, Feb 9, 2015 at 1:54 PM, Peter Kim <chpkim at gmail.com> wrote:

Hi Sam,

 

The loop iteration is executed only if there is an item in the list, so it shouldn't matter if no object has been added to the list. The code runs without exception. The problem is that of two call graph edges that are part of the same basic block (remove() and free()), only one call graph edge (remove()) is being returned, which is strange.

 

On Mon, Feb 9, 2015 at 8:19 PM, Sam Blackshear <samuel.blackshear at colorado.edu> wrote:

Hi Peter,

  My suspicion is that the callgraph is correct here. You never add anything to the objects ArrayList, so whenever you try to read a BaseTweet object out of the list, the analysis (correctly) concludes that only null could be returned. If you call a method on null (like isFinished), the call graph (correctly) concludes that this would result in an NPE and thus does not add the edge. If you want to see these edges in the callgraph, extend your code to add something to the objects ArrayList:

 

objects.add(new BaseTweet());

 

When debugging your static analysis results, it's often helpful to concretely execute your target program and be sure that it behaves as you expect!

 

- Sam

 

 

On Mon, Feb 9, 2015 at 12:40 PM, Peter Kim <chpkim at gmail.com> wrote:

Hi Steven,

 

Here is a complete minimal example as an Eclipse project (just import into your workspace): https://drive.google.com/file/d/0B9KLXcAovVUHa0FuN3gzRGJETmc/view

 

I retrieve the call graph of this app in Infoflow.runAnalysis(final ISourceSinkManager sourcesSinks, final Set<String> additionalSeeds), calling "CallGraph cg = Scene.v().getCallGraph();" right before "iCfg = icfgFactory.buildBiDirICFG(callgraphAlgorithm);". I use cg, not iCfg.

 

The edges out of com.example.toyandroid.ChpkimMainActivity.chpkimUpdate() I get are:

 

<java.util.ArrayList: int size()>
<java.util.ArrayList: java.lang.Object get(int)>
<java.util.ArrayList: java.lang.Object remove(int)>

 

But they should be:

 

<java.util.ArrayList: int size()>
<java.util.ArrayList: java.lang.Object get(int)>
<java.util.ArrayList: java.lang.Object remove(int)>
<com.example.toyandroid.BaseTweet: boolean isFinished()>
<com.example.toyandroid.BaseTweet: void free()>
<com.example.toyandroid.BaseTweet: void update(float)>

 

Thanks for your help.

 

 

 

 

 

 

On Mon, Feb 9, 2015 at 8:40 AM, Steven Arzt <Steven.Arzt at cased.de> wrote:

Hi Peter,

 

Can you please send me a more complete minimal example with which I can reproduce the issue?

 

Best regards,

  Steven

 

From: soot-list-bounces at CS.McGill.CA [mailto:soot-list-bounces at CS.McGill.CA] On Behalf Of Peter Kim
Sent: Sunday, 8 February 2015 19:05
To: Steven Arzt
Cc: soot-list at cs.mcgill.ca
Subject: Re: [Soot-list] Missing call graph edges

 

eliminateDeadCode() is *not* being called and I'm still running into the problem. Thanks in advance for your help.

 

On Sun, Feb 8, 2015 at 5:37 PM, Peter Kim <chpkim at gmail.com> wrote:

Hi Steven,

 

I'm still running into the same problem after pulling from Github.

 

 

On Fri, Feb 6, 2015 at 9:24 AM, Steven Arzt <Steven.Arzt at cased.de> wrote:

Hi Peter,

 

That might have to do with an optimization I added recently. In short, FlowDroid removes those callgraph edges for which it can easily decide that having them does not influence the outcome of the taint analysis. I can however fully understand that this might lead to surprising results if you are using the FlowDroid components for other analyses, so I decided to make this optimization optional and turn it off by default.

 

The new code is on Github and a new nightly build will be available tomorrow.

 

Best regards,

  Steven

 

 


 

 

 

From: soot-list-bounces at CS.McGill.CA [mailto:soot-list-bounces at CS.McGill.CA] On Behalf Of Peter Kim
Sent: Friday, 6 February 2015 00:05
To: soot-list at cs.mcgill.ca
Subject: [Soot-list] Missing call graph edges

 

Hi,

 

I'm extending FlowDroid to construct an Android app's call graph. More specifically, I get the call graph by modifying Infoflow.runAnalysis(final ISourceSinkManager sourcesSinks, final Set<String> additionalSeeds) to call Scene.v().getCallGraph(). The call graph is missing edges in an odd way: for a given function, the graph has some outgoing edges but is missing others that should be there. Namely, given the following function (shown in Java rather than Jimple for readability), the called methods should be "get()", "isFinished()", "remove()", "free()", "size()", "update()", but I'm only getting "get()", "size()", and "remove()". I don't understand why "remove()" is included but "free()" is not, since they are in the same basic block. I'm using soot.jimple.toolkits.callgraph.TransitiveTargets to analyze the call graph.

 

public void update(float x) {
  for (...size()..) {
    get();
    if (isFinished()) {
      remove();
      free();
    }
  }

  if (y) {
    if (x) {
      for (... size()...)  get().update(x);
    } else {
      for (...size()...)  get().update(x);
    }
  }
}

 

Thank you for your help.

 

 

 

 

 
