[Soot-list] Why Spark in soot 2.2.4 slows down

Thu Jul 5 10:43:17 EDT 2007

Hi there,

I just wanted to chime in and say that the performance of my 
demand-driven points-to analysis implementation *may* also depend on 
equals() being object identity for Spark points-to sets; I can't 
immediately remember.  I also think the cleaner fix would be in those 
Spark clients depending on different equals() behavior, but the 
IdentityHashSet solution may be much easier.

-Manu

Ondrej Lhotak wrote:
> On Wed, Jul 04, 2007 at 12:00:45PM -0400, Eric Bodden wrote:
>   
>> Ondrej, I actually think that actually it would be the cleanest
>> solution if Spark would use an IdentityHashSet at those places where
>> comparison on identity is required. Would that be possible?
>>     
>
> A clean solution would avoid imposing behaviour required by some
> external application on Spark-internal classes; using a wrapper
> in your application would be cleaner. But you're right that using an
> IdentityHashSet inside Spark is the easiest solution, if not cleanest.
> It's fine with me, as long as you find all the places where Spark
> could be calling hashCode and equals on those classes. The worklist
> may be the only such place, but it may not; it was a running assumption
> that equals would be object identity, and I just don't know all the
> places in the code that rely on that assumption.
>
>   
>> Cheers,
>> Eric
>>
>> On 04/07/07, Eric Bodden <eric.bodden at mail.mcgill.ca> wrote:
>>     
>>> In addition, I saw that hashCode() had an implementation bug:
>>>
>>>                P2SetVisitorInt visitor = new P2SetVisitorInt(1) {
>>>
>>>                        final int PRIME = 31;
>>>
>>>                        public void visit(Node n) {
>>>                                intValue = PRIME * intValue + n.hashCode();
>>>                        }
>>>
>>>                };
>>>                this.forall(visitor); <<<<<<<<< WAS MISSING
>>>                return visitor.intValue;
>>>
>>> Due to the missing line, the hashcode returned would always be 0. :-(
>>> No surprise that leads to slow results...
>>>
>>> Eric
>>>
>>>
>>> On 04/07/07, Eric Bodden <eric.bodden at mail.mcgill.ca> wrote:
>>>       
>>>> Oh, I see. Sorry about that.
>>>>
>>>> Could the same be a problem in PointsToSetReadOnly in paddle and
>>>> AbstractFlowSet in Soot? I added similar methods there...
>>>>
>>>> I *do* require "equal" points-to sets to be equal in terms of
>>>> equals/hashCode in the case of points-to sets of Spark, i.e.
>>>> PointsToSetInternal. However, I could maybe come up with a wrapper
>>>> class or something like that.
>>>>
>>>> Ondrej, feel free to commit a fix to this at your will.
>>>>
>>>> Eric
>>>>
>>>> On 04/07/07, Ondrej Lhotak <olhotak at uwaterloo.ca> wrote:
>>>>         
>>>>> I've tracked down the cause of the slowdown. It is a bug that could 
>>>>>           
>>> also
>>>       
>>>>> cause Spark to give an incorrect (unsound) answer, though I haven't
>>>>> observed an input program for which this would be the case.
>>>>>
>>>>> The problem is introduced in the change:
>>>>>
>>>>>           
>>> ------------------------------------------------------------------------
>>>       
>>>>> r2465 | ebodde | 2006-08-15 20:15:52 -0400 (Tue, 15 Aug 2006) | 1 line
>>>>>
>>>>> implementation of equals/hashCode for PointsToSets in Spark
>>>>>
>>>>>           
>>> ------------------------------------------------------------------------
>>>       
>>>>> This change adds expensive equals and hashCode methods to the class
>>>>> PointsToSetInternal, Spark's internal representation of a points-to
>>>>> set. These new methods override the object identity methods on Object,
>>>>> on which the Spark propagation algorithm depends. Inside Spark, a
>>>>> PointsToSetInternal is a mutable container that holds ever-changing 
>>>>>           
>>> sets
>>>       
>>>>> of abstract objects during the analysis. Thus, from the point of view 
>>>>>           
>>> of
>>>       
>>>>> Spark, the appropriate notion of equality for these containers is 
>>>>>           
>>> object
>>>       
>>>>> identity, not equality of the contents. Specifically, Spark makes a
>>>>> worklist (actually a set) of PointsToSetInternal's whose contents need
>>>>> to be further propagated. The overridden methods cause adding to the
>>>>> worklist to be very slow, and may prevent PointsToSetInternal's whose
>>>>> contents still need to be propagated from being added, if their 
>>>>>           
>>> contents
>>>       
>>>>> happen to be the same as another PointsToSetInternal.
>>>>>
>>>>> Removing these methods from PointsToSetInternal fixes the problem. I
>>>>> have not committed this change yet.
>>>>>
>>>>> Ondrej
>>>>>
>>>>> On Mon, Jun 25, 2007 at 04:58:45PM -0400, Eric Bodden wrote:
>>>>>           
>>>>>> Ondrej could you do the same run with hprof enabled? I think with 
>>>>>>             
>>> such
>>>       
>>>>>> a significant difference, this should easily show up in a profile.
>>>>>>
>>>>>> Eric
>>>>>>
>>>>>> On 22/06/07, Ondrej Lhotak <olhotak at uwaterloo.ca> wrote:
>>>>>>             
>>>>>>> Strange, I also notice a big slowdown.
>>>>>>>
>>>>>>> Also, Soot 2.2.4 gives significantly different answers than 2.2.3.
>>>>>>>
>>>>>>> If I use the pre-jimplify switch, which separates Spark time from
>>>>>>> Jimplification time, I get the following Spark times on soot-c:
>>>>>>> 2.2.3:  15.3s
>>>>>>> 2.2.4: 113.9s
>>>>>>>
>>>>>>> Something *very* weird is going on.
>>>>>>>
>>>>>>> I'm not aware of any recent changes to Spark. Has anyone been 
>>>>>>>               
>>> working on
>>>       
>>>>>>> Spark?
>>>>>>>
>>>>>>> I would like to do a binary search through the Subversion revisions
>>>>>>> between releases 2.2.3 and 2.2.4 to determine the cause of this.
>>>>>>> Unfortunately, the internet is dead here at the moment, so I can't 
>>>>>>>               
>>> do
>>>       
>>>>>>> that right now (and who knows when this e-mail will even go out). 
>>>>>>>               
>>> I'll
>>>       
>>>>>>> be travelling all next week, so I can't investigate the problem 
>>>>>>>               
>>> until
>>>       
>>>>>>> later. If someone manages to track it down, I'd be interested to 
>>>>>>>               
>>> hear
>>>       
>>>>>>> what the cause is.
>>>>>>>
>>>>>>> On Thu, Jun 21, 2007 at 08:39:25AM -0400, Eric Bodden wrote:
>>>>>>>               
>>>>>>>> Nope. No such observation here. 50s sounds pretty fast to me. Are 
>>>>>>>>                 
>>> you
>>>       
>>>>>>>> sure it's doing the right thing? If so, maybe could you email us 
>>>>>>>>                 
>>> you
>>>       
>>>>>>>> test case and command line?
>>>>>>>>
>>>>>>>> Eric
>>>>>>>>
>>>>>>>> On 21/06/07, Joe Qian <owqian at gmail.com> wrote:
>>>>>>>>                 
>>>>>>>>> Hi everyone,
>>>>>>>>>
>>>>>>>>>       Recently I moved from soot 2.2.3 to soot 2.2.4. But when 
>>>>>>>>>                   
>>> testing
>>>       
>>>>>>> old
>>>>>>>               
>>>>>>>>> programs (also with JDK1.4.1), i found the Spark points-to 
>>>>>>>>>                   
>>> analysis run
>>>       
>>>>>>>>> much slower than before.  For my small test case, the old 
>>>>>>>>>                   
>>> version of
>>>       
>>>>>>> spark
>>>>>>>               
>>>>>>>>> run about 50s, while the new spark version have to run about 
>>>>>>>>>                   
>>> 200s.
>>>       
>>>>>>>>> I have no idea about this situation. Does any one else have 
>>>>>>>>>                   
>>> similar
>>>       
>>>>>>>>> experience?
>>>>>>>>>
>>>>>>>>>       Thanks
>>>>>>>>>
>>>>>>>>> Joe
>>>>>>>>> _______________________________________________
>>>>>>>>> Soot-list mailing list
>>>>>>>>> Soot-list at sable.mcgill.ca
>>>>>>>>> http://mailman.cs.mcgill.ca/mailman/listinfo/soot-list
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>                   
>>>>>>>> --
>>>>>>>> Eric Bodden
>>>>>>>> Sable Research Group
>>>>>>>> McGill University, Montréal, Canada
>>>>>>>> _______________________________________________
>>>>>>>> Soot-list mailing list
>>>>>>>> Soot-list at sable.mcgill.ca
>>>>>>>> http://mailman.cs.mcgill.ca/mailman/listinfo/soot-list
>>>>>>>>
>>>>>>>>                 
>>>>>> --
>>>>>> Eric Bodden
>>>>>> Sable Research Group
>>>>>> McGill University, Montréal, Canada
>>>>>>
>>>>>>             
>>>> --
>>>> Eric Bodden
>>>> Sable Research Group
>>>> McGill University, Montréal, Canada
>>>>
>>>>         
>>> --
>>> Eric Bodden
>>> Sable Research Group
>>> McGill University, Montréal, Canada
>>>
>>>       
>> -- 
>> Eric Bodden
>> Sable Research Group
>> McGill University, Montréal, Canada
>>
>>     
> _______________________________________________
> Soot-list mailing list
> Soot-list at sable.mcgill.ca
> http://mailman.cs.mcgill.ca/mailman/listinfo/soot-list
>