Re: [abc] OOPSLA paper

From: Ondrej Lhotak <olhotak@sable.mcgill.ca>
Date: Tue Aug 02 2005 - 15:29:25 BST

On Tue, Aug 02, 2005 at 12:54:51PM +0100, Oege de Moor wrote:
> > Section 6
> > ---------
>
> Laurie, I've brought this up a number of times. This section is the
> main McGill contribution to the paper - it was written by Ondrej.
> If you're willing to rewrite it, as I've asked in a previous email,
> that would be great... if there's no time to do that, I don't
> think it does major harm.

I'm rather confused by this discussion; perhaps I missed something you
talked about. You (Laurie and Oege) seem to think there is something
wrong with the section, but I don't understand what. I'm not saying
there's nothing wrong with it, only that I don't know what specifically
you object to. If I rewrite it without such an understanding, the
rewritten version is likely to end up very similar to the current one.

> In the story of the whole paper, I think it is important to have
> the section (with some level of detail), to assure the reader
> that one day tracematches could be even more efficient than the
> pure AspectJ versions...

I thought the reasons that you (Laurie and Oege) asked me to write this
section were the following:
1) to show that it is possible to optimize tracematches, so that their
current slowness may eventually be overcome
2) to explain in detail what kind of analysis information would be
required, without going into details about the analyses that would be
done to compute it
3) to say that our design of tracematches lends itself well to analysis
because it has clear semantics and limits what the programmer can do
4) to point out that the analyses required are by no means simple,
which is why we leave them to future work

I wrote the section with the above goals in mind, and I believe the
current version achieves them quite well. From the current comments,
it sounds like Oege pretty much agrees with this (correct me if I'm
wrong), but Laurie doesn't. Am I misunderstanding the goals for this
section, or do you disagree that the section achieves the goals?

> > I'm not certain about the discussion of analyses needed to further
> > optimize tracematches using static analysis. In some sense, this seems
> > to be hitting exactly the same problems as those faced by people who
> > build static checkers for properties ... and the problems seem just
> > as hard.

The analysis that would be required is hard, and I thought this was one
of the points we were trying to make.

> > So, I'm not sure what this discussion is supposed to convey. Is it
> > to say that further optimizing tracematches will require a hard analysis,
> > yet to be defined?
> >
> > In the discussion of alias information, it is stated that one needs
> > to know whether a variable points to the same location at two different
> > points in the program. For store-based analyses, i.e. those that give
> > names to locations, this can be done by using the names. For example,
> > if s points to "new_1" at program point A and t points to "new_1" at
> > program point B, then there is a possible dependence. We already
> > do that for side-effect and dependence analysis in Soot using the alloc
> > sites - although these are "may" relationships.

Right. However, for optimizing tracematches, I think we will need both
may and must relationships. I would even say that the must relationships
will be more important than the may relationships, because in most
tracematches, you're trying to trace a single object through the
execution of the program, to make sure that it's the same object in each
event. Moreover, we will need to know not only that things must be
aliased at a given point, but also that the object bound at one event
must be the same object as at another event.
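
To make this concrete, take a typical tracematch along these lines
(I'm writing it from memory and omitting the surrounding aspect, so
the details may be slightly off):

    tracematch(Iterator i) {
        sym next    before : call(* java.util.Iterator.next())    && target(i);
        sym hasNext before : call(* java.util.Iterator.hasNext()) && target(i);

        // matches two next() calls on the same iterator i with no
        // intervening hasNext() on i
        next next
        {
            System.err.println("second next() without checking hasNext()");
        }
    }

An allocation-site-based may analysis can tell us that the receivers
of the two next() calls may both come from the same "new_1". But to
remove the runtime bookkeeping for i, we have to prove that they must
be the same object, and that must relationship spans two different
events, not a single program point.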

> > There are also "storeless" pointer analyses, such as those developed in
> > my Ph.D. thesis, and then later by my student Rakesh Ghiya. In these
> > cases you don't have names, like "new_1", but only relationships between
> > variables at a program point. In his work (see POPL98 and his thesis) he
> > showed how to inject enough new variable copies so that the relationships
> > between variables at different program points could be compared, and he
> > then used that info for optimizing C pointer programs.
> >
> > I mention these things not because I think we need to address them in the
> > paper, but to point out that when we start talking about static analyses
> > like these there is a lot of related work ... and perhaps we just want to
> > say something a little more high-level and not get into details...

Yes, there is a lot of related work on these kinds of analyses, and I
think you're right that we don't want to get into the details of how
the analysis would be done at this point (we will in the future, when
we actually implement one). What we do want to spell out is what kind
of information we need from such an analysis. Do you feel we give too
little detail about the information we need? Or do we assume too much
about how the analysis would compute it?
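
To be concrete about "the information we need": what I have in mind is
that the optimizer can ask, for any pair of events binding the same
tracematch variable, questions of roughly the following shape. The
interface below is purely hypothetical (the names are made up for this
email) and is meant to illustrate the shape of the queries, not how
the answers would be computed:

    // Hypothetical sketch only -- names made up for illustration.
    // A Shadow stands for a join point shadow at which one of the
    // tracematch symbols can match.
    interface Shadow {}

    interface TracematchAliasInfo {
        // May the object bound at shadow a be the same as the object
        // bound at shadow b?  Needed for soundness of any optimization.
        boolean mayBeSameObject(Shadow a, Shadow b);

        // Must they be the same object?  Needed to discharge the
        // binding checks statically and drop partial-match state.
        boolean mustBeSameObject(Shadow a, Shadow b);
    }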

Ondrej