abc-private (Aug 2005): Re: [abc] OOPSLA paper

From: Prof. Laurie HENDREN <hendren@cs.mcgill.ca>
Date: Wed Aug 03 2005 - 13:46:46 BST

Hi Ondrej,

I have been thinking about theOptimisation section.
I like the first part which deals with the situation where there are no
variables, but something bothers me about the second column, where the
iterator example is discussed.

In considering what bothers me, here is what I can come up with:

I think it is because the discussion here seems to me to be very specific to
this particular example, and it's not clear to me what the general problem
is. In the first part you state the problem generally. Of course,
perhaps part of the problem is that it is hard to state a general problem,
and so we are just giving an illustrative example ... but then I think that
has to be very clear.

Secondly, I wonder why you are using the term "alias analysis" instead of
"points-to" analysis. It seems clear that one would want to start with
some sort of points-to analysis, since one wants to reason about the
pointed-to objects. As our previous round of e-mail says, the problem
is really that since tracematches are relative to object instances.

With existing may points-to analyses, it is possible to say when two variables
don't point to the same instance, but for tracematches we will need to
concentrate on analysis that determine when two variables definitely point
to the same instance. Have you seen anything in recent literature that
solves this problem? Rakesh's work did do this using his connection/shape
analysis for C, and Christopher Lapkowski's work is C also addressed this
by using SSA numbering (i.e. when a pointer variable had the same primary
and secondary SSA number, it must point to the same instance).

Of course it would be easier to hash this out on a white board, but hopefully
my main points come across in e-mail????

Cheers, Laurie

+-------------------------------------------------------------+
| Laurie Hendren, Professor, School of Computer Science |
| McGill University |
| 318 McConnell Engineering Building tel: (514) 398-7391 |
| 3480 University Street fax: (514) 398-3883 |
| Montreal, Quebec H3A 2A7 hendren@cs.mcgill.ca |
| CANADA http://www.sable.mcgill.ca/~hendren |
| http://wwww.sable.mcgill.ca http://aspectbench.org |
+-------------------------------------------------------------+

On Tue, 2 Aug 2005, Ondrej Lhotak wrote:

> On Tue, Aug 02, 2005 at 12:54:51PM +0100, Oege de Moor wrote:
> > > Section 6
> > > ---------
> >
> > Laurie, I've brought this up a number of times. This section is the
> > main McGill contribution to the paper - it was written by Ondrej.
> > If you're willing to rewrite it - as I've asked in a previous email,
> > that would be great... if there's no time to do that, I don't
> > think it does major harm.
>
> I'm rather confused by this discussion. Perhaps I missed something
> that you talked about. You (Laurie and Oege) seem to think there is
> something wrong with it, but I don't understand what you feel is wrong
> with it. I'm not saying there's nothing wrong with it; just that I don't
> understand what exactly you feel is wrong with it. If I rewrite it
> without such an understanding, the rewritten version is likely to end up
> very similar to the current version.
>
> > In the story of the whole paper, I think it is important to have
> > the section (with some level of detail), to assure the reader
> > that one day tracematches could be even more efficient than the
> > pure AspectJ versions...
>
> I thought the reasons that you (Laurie and Oege) asked me to write this
> section were the following:
> 1) to show that it is possible to optimize tracematches, so that their
> current slowness may eventually be overcome
> 2) to explain in detail what kind of analysis information would be
> required, without going into details about the analyses that would be
> done to compute it
> 3) to say that our design of tracematches lends itself well to analysis
> because it has clear semantics and limits what the programmer can do
> 4) that the analyses required are by no means simple, which is why we
> leave them to future work
>
> I wrote the section with the above goals in mind, and I believe the
> current version achieves them quite well. From the current comments,
> it sounds like Oege pretty much agrees with this (correct me if I'm
> wrong), but Laurie doesn't. Am I misunderstanding the goals for this
> section, or do you disagree that the section achieves the goals?
>
> > > I'm not certain about the discussion of analyses needed to further
> > > optimize tracematches using static analysis. In some sense, this seems
> > > to be hitting exactly the same problems as those faced by people who
> > > build static checkers for properties ... and the problems seem just
> > > as hard.
>
> The analysis that would be required is hard, and I thought this was one
> of the points we were trying to make.
>
> > > So, I'm not sure what this discussion is supposed to convey. Is it
> > > to say that further optimizing tracematches will require a hard analysis,
> > > yet to be defined?
> > >
> > > In the discussion of alias information, it is stated that one needs
> > > to know whether a variable points to the same location at two different
> > > points in the program. For store-based analyses, i.e. those that give
> > > names to locations, this can be done by using the names. For example,
> > > if s points to "new_1" at program point A and t points to "new_1" at
> > > program point B, then there is a possible dependence. We already
> > > do that for side-effect and dependence analysis in Soot using the alloc
> > > sites - although these are "may" relationships.
>
> Right. However, for optimizing tracematches, I think we will need both
> may and must relationships. I would even say that the must relationships
> will be more important than the may relationships, because in most
> tracematches, you're trying to trace a single object through the
> execution of the program, to make sure that it's the same object in each
> event. Moreover, not only will we need to know that things must be
> aliased at a given point, but that the object at one event must be the
> same object as at another event.
>
> > > There are also "storeless" pointer analyses, such as those developed in
> > > my Ph.D. thesis, and then later by my student Rakesh Ghiya. In these
> > > cases you don't have names, like "new_1", but only relationships between
> > > variables at a program point. In his work (see POPL98 and his thesis) he
> > > showed how to inject enough new variable copies so that the relationships
> > > between variables at different program points could be compared, and he
> > > then used that info for optimizing C pointer programs.
> > >
> > > I mention these things not because I think we need to address them in the
> > > paper, but to point out that when we start talking about static analyses
> > > like these there is a lot of related work ... and perhaps we just want to
> > > say something a little more high-level and not get into details...
>
> Yes, there is a lot of related work on doing these kinds of analyses,
> and I think you're right that we don't want to get into details on how
> the analysis is done at this point (we will in the future, when we
> actually implement an analysis). What we do want to get into details
> on is what kinds of information we want from such an analysis. Do you
> feel we go into too little detail on the information we want from the
> analysis? Do we assume too much about how the analysis should compute
> the information that we want?
>
> Ondrej
>
>
>
>
Received on Wed Aug 3 13:46:53 2005

This archive was generated by hypermail 2.1.8 : Wed Aug 03 2005 - 16:00:06 BST