[Soot-list] Fwd: Re: [McLab] Question about McSAF merge

Tue May 7 16:21:48 EDT 2013

Hi Sooters,

The e-mail thread below questions the API for merge in McSAF.   I think 
McSAF inherited the API from Soot, so I would like to know the Soot 
folks take on this.       I know we won't want to change Soot now,  but 
McLab is new enough that we can still make changes.

Cheers,  Laurie

-------- Original Message --------
Subject: 	Re: [McLab] Question about McSAF merge
Date: 	Tue, 7 May 2013 15:06:37 -0400
From: 	Matthieu Dubet <maattdd at gmail.com>
To: 	Ismail Badawi <isbadawi at gmail.com>
CC: 	mclab <mclab at sable.mcgill.ca>

As you said the only reason I can think of to do that is to avoid the 
allocation in the merge function 
(http://stackoverflow.com/questions/10403766/passing-a-parameter-versus-returning-it-from-function). 

It is clearly negligeable in 99.9 .. % of the case (especially 
considering the size of flowsets), but it creates a confusion for the 
user ("what is happening to the "out" parameter if it's not an empty set 
??? " ..etc..) .Â

It is a bad design choice imho.

However, the original author might have been influenced by a wrong C++ 
idiom where returning collections as value was considered bad for 
performances so people were passing the output value as input 
&parameter. But it's not true anyway because of :Â
* C++11 move semantic
* automatic return-value-optimization from most compilers before C++11
* anyway not applicable to Java because of the reference only semantic.

On Tue, May 7, 2013 at 2:00 PM, Ismail Badawi <isbadawi at gmail.com 
<mailto:isbadawi at gmail.com>> wrote:

    Hi,

    I was wondering if someone knows why (or can try and justify why)
    McSAF analyses have to implement

    void merge(A in1, A in2, A out)

    instead of

    A merge(A in1, A in2)

    The first form is error-prone because you don't know what out is --
    e.g. if A = Set<String> and you want to do union, it might not be
    the empty set so you should clear it first. But it seems merge is
    often called with out pointing to the same thing as in1 or in2, so
    you end up having to consider a bunch of cases (all three sets the
    same, all three sets different, in1 same as out, in2 same as out,
    etc.) otherwise you risk writing over the inputs.

    For example, if you wanted to merge sets using union, naively you
    would write

    public void merge(Set<String> in1, Set<String> in2, Set<String> out) {
    Â  out.clear();
    Â out.addAll(in1);
    Â out.addAll(in2);
    }

    but actually you should write

    public void merge(Set<String> in1, Set<String> in2, Set<String> out) {
    Â  if (in1 != out && in2 != out) {
    Â  Â out.clear();
    Â  }
    Â  if (in1 != out) {
    Â  Â out.addAll(in1);
    Â  }
    Â  if (in2 != out) {
    Â  Â out.addAll(in2);
    Â  }
    }

    which seems needlessly convoluted. I guess I could fix the framework
    to always pass in a newInitialFlow() or something but why not leave
    it up the implementing class? You could write

    public Set<String> merge(Set<String> in1, Set<String> in2) {
    Â Set<String> out = new HashSet<String>();
    Â out.addAll(in1);
    Â out.addAll(in2);
    Â  return out;
    }

    which is more obviously correct.

    I guess maybe it's to avoid a memory allocation? But is it worth the
    trouble?

    Ismail

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mailman.cs.mcgill.ca/pipermail/soot-list/attachments/20130507/d50d1611/attachment.html 
-------------- next part --------------
_______________________________________________
McLab mailing list
McLab at sable.mcgill.ca
http://mailman.cs.mcgill.ca/mailman/listinfo/mclab