[Soot-list] Java-Jimple mapping using SourceLnPosTag

Patrick Lam plam at sable.mcgill.ca
Fri Mar 9 17:22:07 EST 2012


Hi Marcus,

Thanks for the pointer, Josh. Let me add a couple of points.

Raja's thesis 
(http://www.sable.mcgill.ca/publications/thesis/masters-kor/sable-thesis-2000-masters-kor.ps.gz) 
is probably still the definitive reference on Jimple, although you can 
also refer to our recent CETUS paper 
(http://www.sable.mcgill.ca/publications/papers/2011-6/11.cetus.soot.pdf) for 
more recent information. The CC 2000 paper may also be helpful.

The usual way that Soot creates Jimple is by reading Java bytecode, and 
you can think of Jimple as a nicer form of bytecode, which one can 
actually understand. Nowadays, one can also create Jimple directly from 
Java source, but that's still really more like compiling it to bytecode, 
and then going back to Jimple. In any case, you need to understand the 
Jimple with respect to the bytecode. You can also look at the Baf code 
that gets generated for a better understanding of what's going on.

Java bytecode doesn't contain anything like x++, so neither does Jimple. 
Unless you process the Java source code directly, you're not going to 
get a direct mapping to things in the Java source code.
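
To make that concrete, here is a minimal sketch in plain Java (no Soot
involved). The class and method names (IncrementLowering, javaFoo,
jimpleLikeFoo, value) are made up for illustration. The second method
is a hand-written expansion of x++ into load, add, and store steps,
mirroring the shape of the Jimple in Marcus's question quoted below; it
is not actual Soot output.

```java
// Hypothetical illustration class; not part of Soot or of Marcus's code.
final class IncrementLowering {
    private int x;

    // Source-level form: one expression statement.
    public void javaFoo() {
        x++;
    }

    // Hand-expanded form with one operation per statement.
    public void jimpleLikeFoo() {
        int temp1 = this.x;     // like: temp$1 = this.<some.Clazz: int x>
        int temp2 = temp1 + 1;  // like: temp$2 = temp$1 + 1
        this.x = temp2;         // like: this.<some.Clazz: int x> = temp$2
    }

    public int value() {
        return x;
    }

    public static void main(String[] args) {
        IncrementLowering a = new IncrementLowering();
        IncrementLowering b = new IncrementLowering();
        a.javaFoo();
        b.jimpleLikeFoo();
        // Both forms leave the field in the same state.
        System.out.println(a.value() == b.value());
    }
}
```

If you compile this and run javap -c on the class file, javaFoo should
show a getfield / iadd / putfield sequence rather than a single
increment instruction, which is why the two source forms end up looking
so similar once they reach Jimple.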

pat

On 03/09/2012 09:07 PM, Josh Branchaud wrote:
> Hi Marcus,
>
> I am sure someone can try to explain to you why the Jimple code looks as
> it does. I am not going to try to do that.
>
> My assumption from your question is that you aren't familiar with JVM
> bytecode. Let me know if this assumption is incorrect. I think if you
> were to learn a little about bytecode and how it is related to Java
> code, this would not only give you a much stronger understanding of how
> Java works under the hood, but I think it would also give you some
> insight on how Jimple relates to Java source code. This isn't a quick
> fix. If you are interested in getting a quick answer in a couple
> paragraphs, then this isn't the way to go. However, if you are looking for a
> more long-term, deeper understanding of these concepts, then I think it
> would be well worth your time to learn a little about the JVM and bytecode.
>
> Here is a link to the JVM specification
> <http://docs.oracle.com/javase/specs/jvms/se5.0/html/VMSpecTOC.doc.html>. Perhaps
> others on the mailing list have suggestions for resources to check out.
>
> On Fri, Mar 9, 2012 at 12:24 PM, Marcus Mews <m.mews at tu-berlin.de
> <mailto:m.mews at tu-berlin.de>> wrote:
>
>
>     Hello everybody,
>
>         I am Marcus Mews (TU-Berlin) and about to use Soot in an academic
>     prototype to perform semantics-preserving refactorings. Therefore, I need
>     to know certain details about Soot's mapping between Java Expressions
>     and Jimple Units. For various reasons, Soot splits up Java Expressions,
>     for example from
>
>     void JavaFoo() {
>          x++;
>     }
>
>     to
>
>     void JimpleFoo() {
>          temp$1 = this.<some.Clazz: int x>;  //sline=134, eline=134, spos=1, epos=1
>          temp$2 = temp$1 + 1;                //sline=134, eline=134, spos=1, epos=1
>          this.<some.Clazz: int x> = temp$2;  //sline=134, eline=134, spos=1, epos=1
>     }
>
>     Another - slightly different - case is the following example: From
>
>     void JavaBar() {
>          x=x+1;
>     }
>
>     to
>
>     void JimpleBar() {
>          temp$3 = this.<some.Clazz: int x>;    //sline=135, eline=135, spos=3, epos=3
>          temp$4 = temp$3 + 1;                  //sline=135, eline=135, spos=3, epos=3
>          this.<some.Clazz: int x> = temp$4;    //sline=135, eline=135, spos=1, epos=1
>     }
>
>
>         In the first example above, the expression "x++" is substituted by
>     three other expressions in the Jimple code, since there is no equivalent
>     in the Jimple language, I assume. In the second example, the expression
>     "x=x+1" is just split up into more trivial expressions.
>         I'd like to know where each Jimple unit comes from.
>
>         I considered using the SourceLnPosTag, but I'm not sure whether
>     this is the right thing to do; in the examples above, I also stated the
>     SourceLnPosTags: In the "x++"-example, all the SourceLnPosTags point to
>     the same Java code text snippet, which is as I expected (although it
>     seems odd that epos=1 instead of epos=3). But in the second example I
>     don't understand the position numbers, since some Jimple units have
>     equal SourceLnPosTags. Maybe the tags are incorrect, or maybe I'm off
>     track here and there is a better way to retrieve each unit's Java
>     origin?
>
>         Anyway, I appreciate any advice or links to manuals, etc.
>     Thanks and regards,
>         Marcus
>
>
>     _______________________________________________
>     Soot-list mailing list
>     Soot-list at sable.mcgill.ca <mailto:Soot-list at sable.mcgill.ca>
>     http://mailman.cs.mcgill.ca/mailman/listinfo/soot-list
>
>
>
>
> --
> Josh Branchaud
> Graduate Research Assistant, University of Nebraska-Lincoln
> jbranchaud at gmail.com <mailto:jbranchaud at gmail.com> - (402) 660-1656
> Website: cse.unl.edu/~jbrancha <http://cse.unl.edu/~jbrancha>
>
>
>


