[Soot-list] [Android][FlowDroid][SPARK] A question about the precision of taint analysis in Flowdroid (and possible false negatives in spark)

Arzt, Steven steven.arzt at sit.fraunhofer.de
Mon Apr 15 09:38:38 EDT 2019


Hi Sumaya,

 

I’m not sure that your approach is likely to scale to realistic apps. For every source, FlowDroid needs to track a taint abstraction through the program. With thousands of sources, I don’t think the analysis will terminate in any realistic time frame. You normally have a few dozen or maybe a few hundred sources that apply to a single application, but not thousands.

 

I’d suggest that you only specify those methods as sources that you are actually interested in. It might very well be the case that data from some method foo() is passed to native code, but if the return value of foo() is not of interest, it doesn’t really matter what the native code does with it.
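
As a rough sketch of that suggestion, the snippet below filters a candidate list down to the sources of interest and formats them in the `<class: return-type method(params)> -> _SOURCE_` line format used by FlowDroid's SourcesAndSinks.txt. The candidate signatures, the INTERESTING set, and the helper name source_lines are made up for illustration; they are not taken from the app in question.

```python
# Hypothetical candidate methods (Soot-style signatures); not from any real app.
CANDIDATES = [
    "<android.telephony.TelephonyManager: java.lang.String getDeviceId()>",
    "<android.location.Location: double getLatitude()>",
    "<com.example.app.Util: int foo()>",
]

# Assumption: only these return values matter to the native-side analysis.
INTERESTING = {
    "<android.telephony.TelephonyManager: java.lang.String getDeviceId()>",
    "<android.location.Location: double getLatitude()>",
}


def source_lines(candidates, interesting):
    """Return one '<signature> -> _SOURCE_' line per candidate of interest."""
    return [f"{sig} -> _SOURCE_" for sig in candidates if sig in interesting]


if __name__ == "__main__":
    print("\n".join(source_lines(CANDIDATES, INTERESTING)))
```

Keeping the list of interest explicit makes it easy to see how many taint abstractions the analysis will have to track.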

 

Secondly, concerning the call graph: SPARK’s call graph is incomplete, because it needs to propagate type information from allocation sites to call sites. Therefore, if there is no visible allocation site (e.g., because the allocation is hidden inside a factory method in the OS), the calls on the respective base object are missing from the call graph (CG). FlowDroid handles these cases through StubDroid summaries and does not rely on the SPARK CG alone.
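
To illustrate why a missing allocation leads to missing edges, here is a toy model of type propagation. It is not SPARK's actual algorithm, only a minimal sketch under the assumption that a virtual call is resolved from the types that flow from visible allocations into the receiver variable. All class, variable, and function names are hypothetical. A class-hierarchy lookup, which needs no allocation site, is included for contrast.

```python
# Toy model, NOT SPARK's actual algorithm: resolve a virtual call from the
# set of types that reach its receiver variable from visible allocations.

# Hypothetical class hierarchy: declared type -> concrete subtypes.
SUBTYPES = {"Handler": {"MyHandler", "OtherHandler"}}


def cha_targets(declared_type, method):
    """Class-hierarchy resolution: every subtype is a possible target."""
    return {f"{t}.{method}" for t in SUBTYPES[declared_type]}


def propagated_targets(receiver, method, allocations, assignments):
    """Resolve targets from allocated types that flow into `receiver`.

    allocations: variable -> type allocated there (visible `new` sites only)
    assignments: (src, dst) pairs, meaning `dst = src`
    """
    points_to = {var: {typ} for var, typ in allocations.items()}
    changed = True
    while changed:  # propagate types along assignments until a fixed point
        changed = False
        for src, dst in assignments:
            before = len(points_to.setdefault(dst, set()))
            points_to[dst] |= points_to.get(src, set())
            changed |= len(points_to[dst]) != before
    return {f"{t}.{method}" for t in points_to.get(receiver, set())}


# Case 1: the analysis sees `a = new MyHandler(); h = a; h.handle()`.
visible = propagated_targets("h", "handle", {"a": "MyHandler"}, [("a", "h")])

# Case 2: `h` comes from a factory in code the analysis does not see,
# so no allocation reaches `h` and the call gets no outgoing edge.
hidden = propagated_targets("h", "handle", {}, [("factory_result", "h")])

print(visible)  # {'MyHandler.handle'}
print(hidden)   # set()
print(sorted(cha_targets("Handler", "handle")))
```

In the second case the call on `h` has no targets at all, which is the kind of gap described above.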

 

Best regards,

  Steven

 

From: Soot-list <soot-list-bounces at cs.mcgill.ca> On Behalf Of Sumaya Abdullah A Almanee
Sent: Thursday, April 11, 2019 12:35 AM
To: soot-list at cs.mcgill.ca
Subject: [Soot-list] [Android][FlowDroid][SPARK] A question about the precision of taint analysis in Flowdroid (and possible false negatives in spark)

 

 

I'm currently using FlowDroid to track taint propagation between certain sources and sinks. Since I'm performing a separate analysis on some native libraries of Android APKs, I've decided to leverage FlowDroid to track any taints passed or leaked from the Dalvik side to the native side.

I configured the Source_Sink files by first examining the reachable functions in the call graph generated by FlowDroid (using SPARK) and then marking these reachable functions as follows: any native function is marked as _SINK_ and everything else as _SOURCE_.
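
The marking scheme described above can be sketched roughly as follows. This is an illustration only: the Method record, the classify helper, and the two example signatures are hypothetical, and in practice the reachable methods and their native flag would come from the call graph rather than a hard-coded list.

```python
from typing import NamedTuple


class Method(NamedTuple):
    signature: str   # Soot-style method signature
    is_native: bool  # True if the method is declared native


def classify(methods):
    """Mark native methods as sinks and every other reachable method as a source."""
    return [
        f"{m.signature} -> {'_SINK_' if m.is_native else '_SOURCE_'}"
        for m in methods
    ]


# Hypothetical reachable methods; real ones would be read from the call graph.
reachable = [
    Method("<com.example.app.MainActivity: java.lang.String readInput()>", False),
    Method("<com.example.app.NativeBridge: void send(java.lang.String)>", True),
]

for line in classify(reachable):
    print(line)
```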

 

I obtained some initial results. A small snippet of these results is shown below (the results highlighted in yellow are the ones that I'm mainly interested in):

[Image: screenshot of the reported results, scrubbed from the archive (image002.png)]

Based on the way I constructed the sources and sinks config file, I was expecting more leaks to be reported. If I understand correctly, these results might contain false positives, for example in the case of arrays or collections (due to over-approximation). However, FlowDroid is unlikely to miss any leaks (low false-negative rate). Is this correct? What I'm trying to figure out here is:

1) An estimate of false positives or false negatives in FlowDroid's reported leaks. 

2) Possible reasons why some leaks might be missing (false negatives)?

3) Since FlowDroid relies on the call graph (in this case SPARK's) for reporting taints, and since the absence of a node in the graph might also result in missing leaks, I was wondering if there's also an estimate of false negatives in SPARK?

 

I really appreciate your time and help with this!

 

Best,

Sumaya

-------------- next part --------------
A non-text attachment was scrubbed...
Name: image002.png
Type: image/png
Size: 105629 bytes
Desc: not available
URL: <https://mailman.CS.McGill.CA/pipermail/soot-list/attachments/20190415/e1b8641c/attachment-0001.png>