[Soot-list] [Android][FlowDroid][SPARK] A question about the precision of taint analysis in Flowdroid (and possible false negatives in spark)

Tue Apr 16 14:15:41 EDT 2019

Thank you so much for your response, Steven!

I've actually changed my approach since I sent the previous email. I now
mark native methods as sinks and use the sources produced by SuSi, which
brings me to my next question:
I ran SuSi on API 28 using the same ground truth listed in SuSi's GitHub
repo. The recall and precision reported in the output (the result of the
ten-fold cross validation) were high (89% for both recall and precision),
which is not what I expected given that I didn't hand-annotate a ground
truth for Android API 28. I did notice that the number of methods per
bucket was relatively small for API 28 compared to API 17, and that some
entities were purged before the 10-fold cross validation ran. I suspect
that there are more false negatives than the reported recall suggests.
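
To make that suspicion concrete, here is a minimal Java sketch of how
precision and recall behave when they are computed only against a
hand-labeled set. It is not SuSi's code or data; the class, the helper
names, and the method names in the sets are all hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Toy precision/recall over a hand-labeled ground truth (hypothetical names, not SuSi code). */
final class LabeledMetrics {

    /** Number of elements that appear in both sets. */
    private static int overlap(Set<String> a, Set<String> b) {
        Set<String> copy = new HashSet<>(a);
        copy.retainAll(b);
        return copy.size();
    }

    /** Precision over labeled methods only: predictions without a label cannot be scored. */
    static double precision(Set<String> predicted, Set<String> labeledPositive, Set<String> labeledAll) {
        Set<String> scored = new HashSet<>(predicted);
        scored.retainAll(labeledAll);
        if (scored.isEmpty()) {
            return 0.0;
        }
        return (double) overlap(scored, labeledPositive) / scored.size();
    }

    /** Recall over labeled positives only: unlabeled true sources never count as misses. */
    static double recall(Set<String> predicted, Set<String> labeledPositive) {
        if (labeledPositive.isEmpty()) {
            return 0.0;
        }
        return (double) overlap(predicted, labeledPositive) / labeledPositive.size();
    }

    public static void main(String[] args) {
        Set<String> labeledAll = new HashSet<>(
                Arrays.asList("getDeviceId", "getLine1Number", "toString", "hashCode"));
        Set<String> labeledPositive = new HashSet<>(
                Arrays.asList("getDeviceId", "getLine1Number"));
        // Imagine a real source that exists only in the newer API: it has no label
        // and is not predicted, so it appears in none of these sets and cannot
        // lower the recall computed below.
        Set<String> predicted = new HashSet<>(
                Arrays.asList("getDeviceId", "getLine1Number", "toString"));

        System.out.println("precision = " + precision(predicted, labeledPositive, labeledAll));
        System.out.println("recall    = " + recall(predicted, labeledPositive));
    }
}
```

In this toy setup, a source that is missing from the labeled set has no
effect on the recall figure, which is the effect I suspect in the API 28
run.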

Regarding SPARK's call graph, is the lack of completeness in SPARK due to
implementation reasons, or is it an inherent issue with the algorithm? I
was under the impression that SPARK is sound and more precise than the
other call graph algorithms, which is why I opted for it when running
FlowDroid. Since reducing false negatives matters more for my analysis,
I'll consider using VTA instead.
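
To check my understanding of the incompleteness, here is a toy Java model
of the difference between resolving a call from the class hierarchy alone
(CHA-style) and resolving it from the types whose allocation sites reach
the receiver (points-to-style, which is how I understand SPARK to work).
This is not Soot's or FlowDroid's implementation; ToyCallGraph, its
methods, and the type names are hypothetical.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Toy call-target resolution; a simplified model, not Soot's algorithm. */
final class ToyCallGraph {

    // Declared receiver type -> concrete subtypes that implement the called method.
    private final Map<String, Set<String>> subtypes = new HashMap<>();

    void addSubtype(String declaredType, String concreteType) {
        subtypes.computeIfAbsent(declaredType, k -> new TreeSet<>()).add(concreteType);
    }

    /** Hierarchy-based: every known concrete subtype of the declared type is a target. */
    Set<String> resolveByHierarchy(String declaredType) {
        Set<String> known = subtypes.getOrDefault(declaredType, Collections.<String>emptySet());
        return new TreeSet<>(known);
    }

    /** Points-to-based: only types whose allocation sites were seen reaching the receiver. */
    Set<String> resolveByPointsTo(String declaredType, Set<String> allocatedTypesReachingReceiver) {
        Set<String> targets = resolveByHierarchy(declaredType);
        targets.retainAll(allocatedTypesReachingReceiver);
        return targets;
    }

    public static void main(String[] args) {
        ToyCallGraph cg = new ToyCallGraph();
        cg.addSubtype("Connection", "HttpConnectionImpl");
        cg.addSubtype("Connection", "FtpConnectionImpl");

        // The receiver comes from a factory whose body the analysis cannot see,
        // so no allocation site reaches the call site.
        Set<String> seenAllocations = Collections.emptySet();

        System.out.println("hierarchy: " + cg.resolveByHierarchy("Connection"));
        System.out.println("points-to: " + cg.resolveByPointsTo("Connection", seenAllocations));
    }
}
```

In this model the hierarchy-based resolution keeps both targets, while the
points-to-based one drops them as soon as the allocation is invisible to
the analysis.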

Thanks again for your valuable feedback, Steven.

On Mon, Apr 15, 2019 at 6:40 AM Arzt, Steven <steven.arzt at sit.fraunhofer.de>
wrote:

> Hi Sumaya,
>
>
>
> I’m not sure that your approach is likely to scale to realistic apps. For
> every source, FlowDroid needs to track a taint abstraction through the
> program. With thousands of sources, I don’t think the analysis will
> terminate in any realistic time frame. You normally have a few dozen or
> maybe a few hundred sources that apply to a single application, but not
> thousands.
>
>
>
> I’d suggest that you only specify those methods as sources that you are
> actually interested in. It might very well be the case that data from some
> method foo() is passed to native code, but if the return value of foo() is
> not of interest, it doesn’t really matter what the native code does with it.
>
>
>
> Secondly, concerning the callgraph: SPARK’s callgraph is incomplete,
> because it needs to propagate type information from allocation sites to
> call sites. Therefore, if there is no allocation site (e.g., because the
> allocation site is hidden inside a factory method in the OS), the calls on
> the respective base object are missing from the CG. FlowDroid handles these
> cases through StubDroid summaries, and does not rely on the SPARK CG alone.
>
>
>
> Best regards,
>
>   Steven
>
>
>
> *From:* Soot-list <soot-list-bounces at cs.mcgill.ca> *On Behalf Of *Sumaya
> Abdullah A Almanee
> *Sent:* Thursday, April 11, 2019 12:35 AM
> *To:* soot-list at cs.mcgill.ca
> *Subject:* [Soot-list] [Android][FlowDroid][SPARK] A question about the
> precision of taint analysis in Flowdroid (and possible false negatives in
> spark)
>
>
>
>
>
> I'm currently using FlowDroid to track taint propagation between
> certain sources and sinks. Since I'm performing a separate analysis on some
> native libraries of Android APKs, I've decided to leverage FlowDroid to
> track any taints passed/leaked from the *dalvik* side to the *native*
> side.
>
> The way I configured the Source_Sink files is by first examining the
> reachable functions in the call graph generated by FlowDroid (using SPARK)
> and then marking these reachable functions as follows: any native function
> is marked as _SINK_ and everything else as _SOURCE_.
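
For reference, the marking rule described above can be sketched in Java as
follows. This is only an illustration of the rule, not the script I
actually used: SourceSinkFileSketch, ReachableMethod, and the example
signatures are hypothetical, and the output lines assume the usual
"<signature> -> _SOURCE_" / "<signature> -> _SINK_" layout of FlowDroid's
source/sink text file.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch of the rule "native => _SINK_, everything else => _SOURCE_" (hypothetical names). */
final class SourceSinkFileSketch {

    /** A reachable method: its Soot-style signature plus whether it is declared native. */
    static final class ReachableMethod {
        final String sootSignature;
        final boolean isNative;

        ReachableMethod(String sootSignature, boolean isNative) {
            this.sootSignature = sootSignature;
            this.isNative = isNative;
        }
    }

    /** Formats one definition line for a single method. */
    static String toDefinitionLine(ReachableMethod method) {
        String label = method.isNative ? "_SINK_" : "_SOURCE_";
        return method.sootSignature + " -> " + label;
    }

    /** Formats one definition line per reachable method, preserving input order. */
    static List<String> toDefinitionLines(List<ReachableMethod> methods) {
        List<String> lines = new ArrayList<>();
        for (ReachableMethod method : methods) {
            lines.add(toDefinitionLine(method));
        }
        return lines;
    }

    public static void main(String[] args) {
        List<ReachableMethod> reachable = new ArrayList<>();
        reachable.add(new ReachableMethod(
                "<com.example.NativeBridge: void send(java.lang.String)>", true));
        reachable.add(new ReachableMethod(
                "<com.example.Settings: java.lang.String getToken()>", false));

        for (String line : toDefinitionLines(reachable)) {
            System.out.println(line);
        }
    }
}
```

Applied to every reachable method, this rule produces one source entry per
non-native method, so the number of sources grows with the size of the
call graph.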
>
>
>
> I obtained some initial results. A small snippet of these results is shown
> below (the results highlighted in yellow are the ones that I'm mainly
> interested in):
>
>
>
> [image: Screen Shot 2019-04-10 at 3.04.27 PM.png]
>
>
>
> Based on the way I constructed the sources and sinks config file, I was
> expecting more leaks to be reported. If I understand correctly, these
> results might contain *false positives*, for example in the case of arrays
> or collections (due to over-approximation). However, FlowDroid is unlikely
> to miss any leaks (low *false negative* rate). Is this correct? What I'm
> trying to figure out here is:
>
> 1) An estimate of false positives or false negatives in FlowDroid's
> reported leaks.
>
> 2) Possible reasons why some leaks might be missing (false negatives)?
>
> 3) Since FlowDroid relies on the call graph for reporting taints (in
> this case SPARK's), and since the absence of a node in the graph might also
> result in missing leaks, I was wondering whether there is also an estimate
> of false negatives in SPARK?
>
>
>
> I really appreciate your time and help with this!
>
>
>
> Best,
>
> Sumaya
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image002.png
Type: image/png
Size: 105629 bytes
Desc: not available
URL: <https://mailman.CS.McGill.CA/pipermail/soot-list/attachments/20190416/2a74981b/attachment-0001.png>