abc-private (Dec 2004): Re: [abc] compile times

From: Prof. Laurie HENDREN <hendren@sable.mcgill.ca>
Date: Wed Dec 15 2004 - 11:52:53 GMT

I really think that we are focusing too much on exact compile times.
The point of the paper is to show the architecture of abc, and to
say why such an architecture is good ... and how it allows us to use
existing components that allow for extensibility and optimization (and
perhaps we should add easy experimentation).

Nobody in their right mind would pick Polyglot and Soot as their
components if their main objective was compile-time speed, and I didn't
think that was our objective either. In both cases there has been a
clear tradeoff between functionality/ease of extension/use of OO
techniques and raw speed.

We can say something about this issue: clearly there is a
compile-time penalty, and we can say what the tradeoff is (from 5 to
15 times slower on most programs).

Laurie

+-------------------------------------------------------------+
| Laurie Hendren, Professor, School of Computer Science |
| McGill University |
| 318 McConnell Engineering Building tel: (514) 398-7391 |
| 3480 University Street fax: (514) 398-3883 |
| Montreal, Quebec H3A 2A7 hendren@cs.mcgill.ca |
| CANADA http://www.sable.mcgill.ca/~hendren |
+-------------------------------------------------------------+

On Wed, 15 Dec 2004, Oege de Moor wrote:

> Thanks, Ondrej.
>
> I guess these numbers are good enough to put into the paper,
> with an explanation of why we think abc is slow compared to ajc.
>
> We have to compare abc against ajc because it is the only
> other available compiler. It is reasonable of the reviewers
> to demand that, is it not?
>
> -O
>
>
> On Tue, 14 Dec 2004, Ondrej LHOTAK wrote:
>
> > On Mon, Dec 13, 2004 at 10:12:28PM +0000, Oege de Moor wrote:
> > > However, I also tried compiling abc itself:
> > >
> > > javac      5.0 secs
> > > ajc        5.4 secs
> > > abc      374.9 secs
> > > abc -O0  312.3 secs
> >
> > I've been excluding the generated directory containing the parsers and
> > running on various McGill machines, and none of the machines gets
> > anywhere near this bad:
> >
> > magic (Quad AMD Opteron):
> > ajc: 8.2s
> > abc-O0: 100.8s (12.3x)
> > abc: 115.7s (14.1x)
> >
> > lima (Intel P4 1.8GHz):
> > ajc: 20.8s
> > abc-O0: 165.3s (7.9x)
> > abc: 189.3s (9.1x)
> >
> > Still not good, but not quite 70 times.
> >
> > Using -time, I get roughly the same breakdown by phase as Oege.
> >
> > I tested out a couple of hypotheses about potential causes of this
> > difference.
> >
> > First, I thought that maybe the heavier-weight data structures in
> > Soot/Polyglot were causing it to get memory-starved, and that lots of time
> > was being spent in the garbage collector. However, giving it more memory
> > didn't really change the times, so that's probably not it.
> >
> > Second, I tried profiling it with hprof. Nothing obvious came up.
> > However, as with many Java programs, a big chunk of the time is being
> > spent in various places in the Java collections classes. From past
> > experience, I know that these are very slow compared to low-level array
> > operations. That got me wondering whether the Eclipse compiler also uses
> > these slow standard collections, so I checked the CVS. It turns out that
> > it doesn't use them at all, and it even uses OO features in general
> > sparingly. Everything is done with arrays, and often, rather than use
> > objects and virtual dispatch, it quite happily uses int constants and a
> > switch statement, just like a typical C program. s/String/char[]/. As
> > for BCEL, it does use the Java library a little bit, but it still uses
> > arrays for the bulk of the stuff, and int constants rather than objects.
> >
> > It's true that hprof is not always reliable. However, based on past
> > experience, it is very much possible for the Java collections to be
> > responsible for the roughly 10-fold difference that we're seeing.
> > It's therefore very possible that the difference has nothing to do
> > with abc's architecture, but with the fact that it's written in a Java
> > style, rather than in a C style.
> >
> > More generally, I find it quite odd that we're evaluating abc's
> > *architecture* by comparing it to ajc, a system with a different
> > architecture, but also very different components. Since it is very
> > much possible for the components themselves to have vastly different
> > performance, if we only want to compare the architectures, shouldn't
> > we keep the components constant? For example, rather than compare abc
> > to ajc, we should be comparing abc to Polyglot/JavaToJimple/Soot. Or
> > do we consider the choice of components a part of the architecture?
> > If so, then perhaps we should have decided that we care so much about
> > compile-time performance before we started writing abc, benchmarked the
> > components, and used Eclipse/BCEL rather than Polyglot/Soot.
> >
> > Ondrej
> >
> >
>
Received on Wed Dec 15 11:53:00 2004
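
The contrast Ondrej draws between code written "in a Java style" and
"in a C style" can be illustrated with a small sketch. This is not code
from abc, Polyglot, Soot, the Eclipse compiler, or BCEL; the class
StyleSketch, the Node/Leaf/Branch types, the LEAF/BRANCH constants, and
the weights are all hypothetical and chosen only to show the two shapes
side by side. It makes no claim about how large the speed difference is
on any particular JVM.

```java
import java.util.ArrayList;
import java.util.List;

public class StyleSketch {
    // "Java style" (hypothetical example): one class per node kind,
    // behaviour selected by virtual dispatch, nodes kept in a
    // java.util collection.
    public interface Node { int weight(); }
    public static final class Leaf implements Node {
        public int weight() { return 1; }
    }
    public static final class Branch implements Node {
        public int weight() { return 3; }
    }

    public static int totalWeightObjects(List<Node> nodes) {
        int total = 0;
        for (Node n : nodes) {
            total += n.weight();   // virtual call on each element
        }
        return total;
    }

    // "C style" (hypothetical example): node kinds are int constants,
    // behaviour selected by a switch, nodes kept in a plain array.
    public static final int LEAF = 0;
    public static final int BRANCH = 1;

    public static int totalWeightInts(int[] kinds) {
        int total = 0;
        for (int i = 0; i < kinds.length; i++) {
            switch (kinds[i]) {
                case LEAF:   total += 1; break;
                case BRANCH: total += 3; break;
                default:
                    throw new IllegalArgumentException("unknown kind " + kinds[i]);
            }
        }
        return total;
    }

    public static void main(String[] args) {
        List<Node> nodes = new ArrayList<>();
        nodes.add(new Leaf());
        nodes.add(new Branch());
        nodes.add(new Leaf());
        int[] kinds = { LEAF, BRANCH, LEAF };

        // Both styles compute the same result for the same input.
        System.out.println(totalWeightObjects(nodes)); // prints 5
        System.out.println(totalWeightInts(kinds));    // prints 5
    }
}
```

The first version is easier to extend (a new node kind is a new class),
while the second avoids per-node objects and collection overhead. That
is the same functionality-versus-raw-speed tradeoff discussed in the
thread.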