[Soot-list] Inconsistency in handling unicode class names

Eric Bodden bodden at st.informatik.tu-darmstadt.de
Fri Mar 11 12:37:41 EST 2011


Just as a note to the list:

Christophe found the problem (in Jasmin's Scanner) and a patch has now been
committed (see below).

Cheers,
Eric

Index: Scanner.java
===================================================================
--- Scanner.java        (revision 3555)
+++ Scanner.java        (working copy)
@@ -23,9 +23,10 @@
 import java_cup.runtime.*;
 import java.util.*;
 import java.io.InputStream;
+import java.io.InputStreamReader;

 class Scanner implements java_cup.runtime.Scanner {
-    InputStream inp;
+    InputStreamReader inp;

    // single lookahead character
    int next_char;
@@ -94,7 +95,7 @@
    final static int BIGNUM=65000;
    public Scanner(InputStream i) throws java.io.IOException
    {
-       inp = i;
+       inp = new InputStreamReader(i);
        line_num = 1;
        char_num = 0;
        line = new StringBuffer();

===================================================================
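
A note on why the patch helps, as a minimal sketch rather than the actual
Jasmin code: reading an InputStream directly yields one value per byte, so a
multi-byte UTF-8 character such as Ǥ is split into several bogus characters,
whereas an InputStreamReader decodes bytes into characters first. The class and
method names below are hypothetical. The sketch also passes UTF-8 explicitly,
which is an assumption on my side: the committed patch uses the one-argument
constructor, which falls back to the platform default charset.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of Jasmin or Soot.
class ScannerCharsetSketch {

    // Pre-patch style: each byte from InputStream.read() is treated as one character.
    static String readBytesAsChars(InputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        int b;
        while ((b = in.read()) != -1) {
            sb.append((char) b);
        }
        return sb.toString();
    }

    // Post-patch style: a Reader decodes the byte stream into characters.
    // The explicit UTF-8 charset is an assumption of this sketch.
    static String readDecodedChars(InputStream in) throws IOException {
        Reader reader = new InputStreamReader(in, StandardCharsets.UTF_8);
        StringBuilder sb = new StringBuilder();
        int c;
        while ((c = reader.read()) != -1) {
            sb.append((char) c);
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // The class name from the test case, written as a Unicode escape (U+01E4).
        byte[] utf8 = ".class public \u01E4".getBytes(StandardCharsets.UTF_8);

        String raw = readBytesAsChars(new ByteArrayInputStream(utf8));
        String decoded = readDecodedChars(new ByteArrayInputStream(utf8));

        // The raw variant is one character longer: the two UTF-8 bytes of the
        // class name were each turned into a separate character.
        System.out.println("raw length:     " + raw.length());
        System.out.println("decoded length: " + decoded.length());
    }
}
```

If the .jasmin files are always written as UTF-8, pinning the charset like this
avoids depending on the locale of the machine that runs Jasmin.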

On 7 March 2011 17:13, Eric Bodden <bodden at st.informatik.tu-darmstadt.de> wrote:
> Hi Christophe.
>
> I have just tried to reproduce the problem but I am having trouble
> creating a source/class file that contains such characters. (I have
> never used unicode in class names.) Could you tell us how to produce a
> test case or even better send us an appropriate test file?
>
> Cheers,
> Eric
>
> On 6 March 2011 23:20, Christophe Foket <christophe.foket at elis.ugent.be> wrote:
>> Hi Richard,
>>
>> You are probably right. I've tried the same thing with the version of jasmin
>> I got from http://sourceforge.net/projects/jasmin/files/jasmin/jasmin-2.4/
>> and everything works fine.
>> When I run the following jasmin file (correctly generated by Soot)
>>
>> cfoket@degenerate:~/Desktop/test/jasmin-original$ cat Ǥ.jasmin
>> .source D.java
>> .class public Ǥ
>> .super c
>>
>> .implements J
>> .method public <init>()V
>> .limit stack 1
>> .limit locals 1
>> aload_0
>> invokespecial c/<init>()V
>> return
>> .end method
>>
>> .method public z()V
>> .limit stack 0
>> .limit locals 1
>> return
>> .end method
>>
>> through jasmin I end up with the correct class file "Ǥ.class":
>>
>> cfoket@degenerate:~/Desktop/test/jasmin-original$ javap Ǥ
>> Compiled from "D.java"
>> public class Ǥ extends c implements J{
>> public Ǥ();
>> public void z();
>> }
>>
>> Indeed, as you pointed out, the problem lies somewhere in the version of
>> jasmin that comes with Soot.
>>
>> -Christophe
>>
>> Richard L. Halpert wrote:
>>>
>>> Actually, it sounds to me like jasmin is the culprit, since its output is
>>> clearly in ASCII instead of UTF-8, but soot's immediate output (the .jasmin
>>> file) seems to be correct. Jasmin must be converting strings (which in Java
>>> are held in memory as UTF-16) or files containing your character to ASCII
>>> before writing the final version. This could occur by converting to a byte
>>> array, or by reading or saving a file, without correctly specifying the
>>> charset.
>>>
>>> -Richard
>>>
>>> On Mar 6, 2011 7:34 AM, "Eric Bodden"
>>> <bodden at st.informatik.tu-darmstadt.de> wrote:
>>
>>
>
>
>
> --
> Dr. Eric Bodden, http://bodden.de/
> Principal Investigator in Secure Services at CASED
> Coordinator of the CASED Advisory Board of Study Affairs
> PostDoc at Software Technology Group, Technische Universität Darmstadt
> Tel: +49 6151 16-5478    Fax: +49 6151 16-5410
> Mailing Address: S2|02 A209, Hochschulstraße 10, 64289 Darmstadt
>




