[Soot-list] Why Spark in soot 2.2.4 slows down

Thu Jul 5 10:05:26 EDT 2007

On Wed, Jul 04, 2007 at 12:00:45PM -0400, Eric Bodden wrote:
> Ondrej, I actually think that actually it would be the cleanest
> solution if Spark would use an IdentityHashSet at those places where
> comparison on identity is required. Would that be possible?

A clean solution would avoid imposing behaviour required by some
external application on Spark-internal classes; using a wrapper
in your application would be cleaner. But you're right that using an
IdentityHashSet inside Spark is the easiest solution, if not cleanest.
It's fine with me, as long as you find all the places where Spark
could be calling hashCode and equals on those classes. The worklist
may be the only such place, but it may not; it was a running assumption
that equals would be object identity, and I just don't know all the
places in the code that rely on that assumption.

> Cheers,
> Eric
> 
> On 04/07/07, Eric Bodden <eric.bodden at mail.mcgill.ca> wrote:
> >In addition, I saw that hashCode() had an implementation bug:
> >
> >                P2SetVisitorInt visitor = new P2SetVisitorInt(1) {
> >
> >                        final int PRIME = 31;
> >
> >                        public void visit(Node n) {
> >                                intValue = PRIME * intValue + n.hashCode();
> >                        }
> >
> >                };
> >                this.forall(visitor); <<<<<<<<< WAS MISSING
> >                return visitor.intValue;
> >
> >Due to the missing line, the hashcode returned would always be 0. :-(
> >No surprise that leads to slow results...
> >
> >Eric
> >
> >
> >On 04/07/07, Eric Bodden <eric.bodden at mail.mcgill.ca> wrote:
> >> Oh, I see. Sorry about that.
> >>
> >> Could the same be a problem in PointsToSetReadOnly in paddle and
> >> AbstractFlowSet in Soot? I added similar methods there...
> >>
> >> I *do* require "equal" points-to sets to be equal in terms of
> >> equals/hashCode in the case of points-to sets of Spark, i.e.
> >> PointsToSetInternal. However, I could maybe come up with a wrapper
> >> class or something like that.
> >>
> >> Ondrej, feel free to commit a fix to this at your will.
> >>
> >> Eric
> >>
> >> On 04/07/07, Ondrej Lhotak <olhotak at uwaterloo.ca> wrote:
> >> > I've tracked down the cause of the slowdown. It is a bug that could 
> >also
> >> > cause Spark to give an incorrect (unsound) answer, though I haven't
> >> > observed an input program for which this would be the case.
> >> >
> >> > The problem is introduced in the change:
> >> > 
> >------------------------------------------------------------------------
> >> > r2465 | ebodde | 2006-08-15 20:15:52 -0400 (Tue, 15 Aug 2006) | 1 line
> >> >
> >> > implementation of equals/hashCode for PointsToSets in Spark
> >> > 
> >------------------------------------------------------------------------
> >> >
> >> > This change adds expensive equals and hashCode methods to the class
> >> > PointsToSetInternal, Spark's internal representation of a points-to
> >> > set. These new methods override the object identity methods on Object,
> >> > on which the Spark propagation algorithm depends. Inside Spark, a
> >> > PointsToSetInternal is a mutable container that holds ever-changing 
> >sets
> >> > of abstract objects during the analysis. Thus, from the point of view 
> >of
> >> > Spark, the appropriate notion of equality for these containers is 
> >object
> >> > identity, not equality of the contents. Specifically, Spark makes a
> >> > worklist (actually a set) of PointsToSetInternal's whose contents need
> >> > to be further propagated. The overridden methods cause adding to the
> >> > worklist to be very slow, and may prevent PointsToSetInternal's whose
> >> > contents still need to be propagated from being added, if their 
> >contents
> >> > happen to be the same as another PointsToSetInternal.
> >> >
> >> > Removing these methods from PointsToSetInternal fixes the problem. I
> >> > have not committed this change yet.
> >> >
> >> > Ondrej
> >> >
> >> > On Mon, Jun 25, 2007 at 04:58:45PM -0400, Eric Bodden wrote:
> >> > > Ondrej could you do the same run with hprof enabled? I think with 
> >such
> >> > > a significant difference, this should easily show up in a profile.
> >> > >
> >> > > Eric
> >> > >
> >> > > On 22/06/07, Ondrej Lhotak <olhotak at uwaterloo.ca> wrote:
> >> > > >Strange, I also notice a big slowdown.
> >> > > >
> >> > > >Also, Soot 2.2.4 gives significantly different answers than 2.2.3.
> >> > > >
> >> > > >If I use the pre-jimplify switch, which separates Spark time from
> >> > > >Jimplification time, I get the following Spark times on soot-c:
> >> > > >2.2.3:  15.3s
> >> > > >2.2.4: 113.9s
> >> > > >
> >> > > >Something *very* weird is going on.
> >> > > >
> >> > > >I'm not aware of any recent changes to Spark. Has anyone been 
> >working on
> >> > > >Spark?
> >> > > >
> >> > > >I would like to do a binary search through the Subversion revisions
> >> > > >between releases 2.2.3 and 2.2.4 to determine the cause of this.
> >> > > >Unfortunately, the internet is dead here at the moment, so I can't 
> >do
> >> > > >that right now (and who knows when this e-mail will even go out). 
> >I'll
> >> > > >be travelling all next week, so I can't investigate the problem 
> >until
> >> > > >later. If someone manages to track it down, I'd be interested to 
> >hear
> >> > > >what the cause is.
> >> > > >
> >> > > >On Thu, Jun 21, 2007 at 08:39:25AM -0400, Eric Bodden wrote:
> >> > > >> Nope. No such observation here. 50s sounds pretty fast to me. Are 
> >you
> >> > > >> sure it's doing the right thing? If so, maybe could you email us 
> >you
> >> > > >> test case and command line?
> >> > > >>
> >> > > >> Eric
> >> > > >>
> >> > > >> On 21/06/07, Joe Qian <owqian at gmail.com> wrote:
> >> > > >> >Hi everyone,
> >> > > >> >
> >> > > >> >       Recently I moved from soot 2.2.3 to soot 2.2.4. But when 
> >testing
> >> > > >old
> >> > > >> >programs (also with JDK1.4.1), i found the Spark points-to 
> >analysis run
> >> > > >> >much slower than before.  For my small test case, the old 
> >version of
> >> > > >spark
> >> > > >> >run about 50s, while the new spark version have to run about 
> >200s.
> >> > > >> >I have no idea about this situation. Does any one else have 
> >similar
> >> > > >> >experience?
> >> > > >> >
> >> > > >> >       Thanks
> >> > > >> >
> >> > > >> >Joe
> >> > > >> >_______________________________________________
> >> > > >> >Soot-list mailing list
> >> > > >> >Soot-list at sable.mcgill.ca
> >> > > >> >http://mailman.cs.mcgill.ca/mailman/listinfo/soot-list
> >> > > >> >
> >> > > >> >
> >> > > >>
> >> > > >>
> >> > > >> --
> >> > > >> Eric Bodden
> >> > > >> Sable Research Group
> >> > > >> McGill University, Montréal, Canada
> >> > > >> _______________________________________________
> >> > > >> Soot-list mailing list
> >> > > >> Soot-list at sable.mcgill.ca
> >> > > >> http://mailman.cs.mcgill.ca/mailman/listinfo/soot-list
> >> > > >>
> >> > > >
> >> > >
> >> > >
> >> > > --
> >> > > Eric Bodden
> >> > > Sable Research Group
> >> > > McGill University, Montréal, Canada
> >> > >
> >> >
> >>
> >>
> >> --
> >> Eric Bodden
> >> Sable Research Group
> >> McGill University, Montréal, Canada
> >>
> >
> >
> >--
> >Eric Bodden
> >Sable Research Group
> >McGill University, Montréal, Canada
> >
> 
> 
> -- 
> Eric Bodden
> Sable Research Group
> McGill University, Montréal, Canada
>