Java Supercompiler Version 0.1.x README

Introduction

Java Supercompiler (JScp) version 0.1.x is a technology preview (alpha version) of a supercompiler for the Java programming language. It is a global optimizer based on the supercompilation method conceived by Valentin Turchin. JScp performs source-to-source transformation of (part of) a Java program.

The current JScp is capable of replacing some methods in a source Java program with their optimized versions. What JScp does now may be characterized as inlining method bodies to some depth and specializing the code thus obtained. This is a limited form of supercompilation (driving is implemented to a large extent, while configuration analysis is rather simplified). Future JScp versions will gradually cover more and more advanced supercompilation techniques.

System Requirements

JScp is intended for use on the Windows NT 4.0 or Windows 2000 operating systems running on Intel hardware. (Windows 95, 98, and Me are not supported.) On request, versions running on Unix operating systems may be supplied.

A Pentium 200 MHz or faster processor and hundreds of megabytes of RAM are required to run the supercompiler. The more the better. Supercompilation can consume considerable time and memory, depending on the Java program. JScp does not use disk space for temporary data and gradually requests more and more virtual memory while working. A lack of physical RAM causes disk swapping, which has a severe effect on performance.

A JScp installation occupies about 1 MB of disk space.

An installation of Sun JDK 1.2.x or later is a prerequisite.

List of Files

  • bin subdirectory with JScp executables
    • camlrt.dll* the MosML run-time library used by jscp.exe
    • camlrunm.exe* the MosML run-time used by jscp.exe
    • jarithm.bat a .bat file to start the "arithmetic server" used by jscp.exe
    • jarithm.jar the classes of the "arithmetic server"
    • javap-patch.jar a .jar file with patched classes of javap.exe used by jscp-javap.bat
    • jscp.exe the JScp executable
    • jscp-javac.bat jscp.exe calls javac.exe via this .bat file
    • jscp-javap.bat jscp.exe calls the patched version of javap.exe via this .bat file
    • libmregex.so* a run-time library used by jscp.exe
    • libmsocket.so* a run-time library used by jscp.exe
  • sample subdirectory with a test to check JScp installation
    • Hello.java a sample Java program subject to supercompilation
    • jscp.bat a .bat file that checks environment variables settings and calls jscp.exe
    • run.bat a .bat file to compile and run Hello.java (without supercompilation)
    • supercompile.bat a .bat file to supercompile Hello.java
    • supercompile-and-run-result.bat a .bat file to supercompile Hello.java and run the resulting program
  • README.htm this file

Note: the 4 files marked with * constitute the MosML run-time used by jscp.exe. These files have been taken from http://www.dina.kvl.dk/~sestoft/mosml.html.

Mode of Operation

The executable jscp.exe takes several .java files, whose names are given as command-line arguments or which are loaded on demand, and outputs transformed .java files to a specified directory. The process is controlled by a supercompilation task defined by command-line options or by a task file in XML format. The task tells JScp which methods to supercompile and how. E.g.,

jscp.exe Hello.java -method main -destdir res -invoke10

Here Hello.java is the file name of a source Java program. Option -method main says that method main is to be replaced by its supercompiled version. Option -destdir res specifies the destination directory where the resulting file Hello.java is put. Option -invoke10 says that during supercompilation methods are invoked (inlined) recursively no more than 10 times.

When jscp.exe needs to evaluate arithmetic and other operations on known data, it calls a JVM running the so-called arithmetic server, which must be started in advance. (The rationale behind using a separate JVM is that JScp is written in SML rather than in Java.) The server is launched once by calling jarithm.bat and then stays resident in memory. If you use the supercompiler often, it is convenient to put a link to jarithm.bat in Windows Startup and forget about the server.

During its execution, jscp.exe calls the Java compiler javac.exe and a patched version of the Java class file disassembler javap.exe to check that the Java program under supercompilation is correct and to gather information about the .class files it uses. These programs are invoked by jscp.exe via the .bat files jscp-javac.bat and jscp-javap.bat, which lie in the same directory as jscp.exe. That directory must be in the system path when jscp.exe is invoked.

Installation

  1. Install Sun Java 2 SDK 1.2 or later after having downloaded it from http://java.sun.com/j2se. It is recommended to use version 1.4.x. Set the following environment variable:

    set JAVA_HOME=path-to-Java-installation

  2. Unpack the downloaded .zip file with the JScp executable and the other files listed above to any directory. Set the environment variable JSCP_HOME to this directory:

    set JSCP_HOME=path-to-JScp-installation

  3. Include the JScp installation directory in the system path or copy file %JSCP_HOME%\sample\jscp.bat to a directory which is in your system path already.
     

  4. This step is optional. Perform it if you don't want to think about starting the arithmetic server each time before using JScp.

    Put a link to jarithm.bat in your Windows Startup directory:

    • Open the directory with the JScp installation.
    • Right-click jarithm.bat in this directory.
    • Choose Create Shortcut from the menu. File jarithm.bat.lnk will be created in this directory.
    • Open the Windows Startup directory: right-click the Start button and select Open in the menu, and the Start Menu will be opened; then open the Programs folder and the Startup folder in it.
    • Move jarithm.bat.lnk from the JScp installation directory to the Startup folder.

    Now each time you log in to Windows, the arithmetic server starts without your intervention.

  5. The .bat file jarithm.bat also checks that the environment variables JAVA_HOME and JSCP_HOME are set properly. You may want to additionally check the installation by calling the following files in the %JSCP_HOME%\bin directory without parameters:
    • jscp.exe
    • jscp-javac.bat
    • jscp-javap.bat

    Each program will return usage information about itself.

Running an example

  1. Go to the directory with the JScp installation and start the arithmetic server by executing jarithm.bat, if it is not started yet. (Don't worry if you call it a second time: you will receive the message "Server can't start. Perhaps the one is working already" and nothing else happens.)

  2. Go to subdirectory sample that contains Hello.java along with .bat files to run and supercompile the sample:

    public class Hello {
        void test() {
            System.out.println("Hello!");
        }
        public static void main(String[] args) {
            new Hello().test();
        }
    }

    Execute run.bat to check the Java installation. Hello.java will be compiled and run. You will see "Hello!" in a black console window before "Press any key to continue . . .".

  3. Execute supercompile-and-run-result.bat. It contains the following lines:

    call jscp -invoke -m main Hello -d res %*
    cd res
    call "%JAVA_HOME%\bin\javac" Hello.java
    call "%JAVA_HOME%\bin\java" Hello
    @pause

    In a black window you will see a printout of the JScp options and the result of supercompiling method main, enclosed in comment lines with timings. These lines are output when (1) supercompilation is started, (2) supercompilation proper has finished and post-processing is started, and (3) the whole supercompilation is done:

    //--------------------------------------   0 sec - method Hello.main(java.lang.String[])
    //--------------------------------------   0 sec - method Hello.main(java.lang.String[]) postprocessing...
        public static void main (final java.lang.String[] args_1)
        {
          java.lang.System.out.println("Hello!") /*virtual*/;
          return;
        }
    }
    //--------------------------------------   0 sec - JScp version 0.1.99  ---

    Here the method invocation test() has been inlined and the instance creation expression new Hello() has been discarded, as if garbage collection had been performed at supercompilation time.

  4. You may want to look at the resulting file Hello.java in subdirectory res.
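
To give a feel for what inlining plus specialization means on a slightly larger program than Hello.java, here is a hand-written sketch. Class PowerSketch and its methods are hypothetical names introduced for illustration only. Method cubeSpecialized shows the kind of residual code one might hope for when the exponent is known in advance; it is not actual JScp output, and whether JScp 0.1.x produces exactly this result is not claimed here.

```java
// Hand-written illustration; NOT output produced by JScp.
class PowerSketch {
    // General method: the exponent n is an ordinary run-time argument.
    static int power(int x, int n) {
        int result = 1;
        for (int i = 0; i < n; i++) {
            result = result * x;
        }
        return result;
    }

    // Original caller: the exponent 3 is known before the program runs.
    static int cube(int x) {
        return power(x, 3);
    }

    // What inlining power() and unrolling the loop for n == 3 could leave
    // behind: no call, no loop, no counter.
    static int cubeSpecialized(int x) {
        return x * x * x;
    }

    public static void main(String[] args) {
        // Both versions must agree on every input.
        System.out.println(cube(5) + " " + cubeSpecialized(5)); // prints "125 125"
    }
}
```

The point of the comparison is the same as in step 4 of the simplest scenario: the transformed method must be equivalent to the original one on every input.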

Getting Started

Simplest case

The simplest scenario to use the Java supercompiler is as follows:

  1. A Java program subject to supercompilation must compile successfully with a Java compiler. (The use of Sun's javac is strongly recommended.) To check this, call Sun's javac, having set the classpath environment variable if needed:

    javac A.java B.java C.java

    You may want to check that the original program runs successfully. E.g., in the case where the method main is located in class A and the Java program requires no arguments, execute the following command:

    java A

  2. Then call JScp with the same arguments and additional options for JScp:

    jscp A.java B.java C.java -d res -m test -cons

    Here -m, -d and -cons are shorthands for options -method, -destdir and -conservative:

    • -destdir res puts the resulting Java file, with the same name A.java, in directory res, which is a subdirectory of the current working directory.
    • -method test supercompiles method(s) test from the first compilation unit A.java. If there are several methods with this identifier, all of them are supercompiled.
    • -conservative uses the conservative mode of supercompilation (to shorten supercompilation time for the first try).

  3. Go to directory res and compile the resulting file. Indicate the directory where the other .class files lie in the classpath environment variable or in the -classpath option, e.g.,

    cd res
    javac -classpath .;.. A.java

    Syntax errors may be reported by the Java compiler. Some of these may be the result of our underdevelopment concerning Java 1.1, which will be fixed soon. Other messages may complain that access modifiers (private, protected or default) do not allow access to members of other classes from within supercompiled code. This is a known bug. For now you should edit the source Java files manually and change the access of the required members to public.

  4. Run the supercompiled version of the Java program, e.g., in the case where method main is located in class A:

    java -classpath .;.. A

    Compare the results for equivalence.
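
The access problem mentioned in step 3 can be pictured with the following hand-written sketch. Classes Counter and CounterClient are hypothetical, and method inlined imitates by hand what inlining does to the code; it is not JScp output. The sketch compiles only because the field has already been made public, which is the manual fix described above.

```java
// Hypothetical classes; the "inlined" method is hand-written, not JScp output.
class Counter {
    // Originally this could have been 'private int count;'. After inlining,
    // code outside Counter touches the field directly, so the modifier has
    // been changed to public by hand.
    public int count;

    int next() {
        count = count + 1;
        return count;
    }
}

class CounterClient {
    // Original code: access goes through next(), so count could stay private.
    static int original(Counter c) {
        return c.next();
    }

    // After inlining next(): the moved body refers to c.count from another
    // class. With 'private int count' javac rejects these two lines.
    static int inlined(Counter c) {
        c.count = c.count + 1;
        return c.count;
    }

    public static void main(String[] args) {
        System.out.println(original(new Counter()) + " " + inlined(new Counter())); // prints "1 1"
    }
}
```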

Further experiments

Repeat supercompilation with other options that control supercompilation strategies. We recommend experimenting with the following options for a start:

  • Delete option -conservative. Supercompilation may take more time but the result may be better.

  • Use option -all instead of -method identifier to supercompile all methods in the given compilation units. Such supercompilation may take rather long; for the first time set the most conservative mode with options -conservative and -invoke0:

    jscp A.java B.java C.java -d res -all -cons -i0

    Here -i0 is the shorthand for -invoke0. Option -invoken, where n is a number, specifies the inlining depth, that is, the number of recursive method invocations performed at supercompilation time. -i0 means no method invocations and hence no inlining.

    Experiment with different values of n and with no limit, which is set by option -invoke without parameter. Default is -invoke1.

    Exclude "bad" methods (e.g., those that take too long to supercompile) from supercompilation with the -except... options described below.

  • As an alternative to listing the names of all .java files that should be known to the supercompiler, use dynamic loading. Specify option -dynamicLoading or -dl; then, when a class C belonging to a package p1.p2.p3 is needed, file p1\p2\p3\C.java is sought and loaded if found. Otherwise, the methods of the class are considered unknown and supercompilation continues as it would without dynamic loading. By default, the file path is relative to the current working directory. Use option -sourcepath d1;d2;d3 to specify other directories in which to look for .java files.
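
The lookup rule for dynamic loading can be modelled by the short sketch below. Class SourceLookupSketch and its methods are hypothetical and are not part of JScp. The sketch assumes that the -sourcepath directories are tried in the order in which they are listed, which the text above does not state explicitly.

```java
// Hypothetical model of the lookup rule described above; not part of JScp.
class SourceLookupSketch {
    // Maps a qualified class name such as "p1.p2.p3.C" to the relative
    // file name "p1\p2\p3\C.java" (for separator '\\').
    static String relativePath(String qualifiedClassName, char separator) {
        return qualifiedClassName.replace('.', separator) + ".java";
    }

    // Lists the candidate files for a -sourcepath value such as "d1;d2;d3".
    // ASSUMPTION: directories are tried in the order they are listed.
    static String[] candidates(String qualifiedClassName, String sourcepath, char separator) {
        String[] dirs = sourcepath.split(";");
        String[] result = new String[dirs.length];
        for (int i = 0; i < dirs.length; i++) {
            result[i] = dirs[i] + separator + relativePath(qualifiedClassName, separator);
        }
        return result;
    }

    public static void main(String[] args) {
        for (String file : candidates("p1.p2.p3.C", "d1;d2;d3", '\\')) {
            System.out.println(file);
        }
    }
}
```

With the arguments used in main, the candidates are d1\p1\p2\p3\C.java, d2\p1\p2\p3\C.java and d3\p1\p2\p3\C.java.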

Main options to control supercompilation

  • The depth of inlining is controlled by options -invoken and -recurn. While -invoken sets the limit on the total number of recursive invocations, option -recurn sets the limit on recursive invocations of the same method: inlining is stopped at the (n+1)th invocation of any method. No limit is set by option -recur without a parameter. The default is -recur1.

    Options -invoken and -recurn affect supercompilation jointly: inlining is not performed when the limit set by either of the options is reached. -invoke0 and -recur0 are equivalent (no inlining). Options -invoke -recur set no inlining limit. Inlining may be stopped for other reasons as well, e.g., if the .java file containing a required method is not given to JScp.

  • The set of methods to be supercompiled may be specified in more detail by the following options:

    • -all, -allmethods supercompile all methods in all top-level classes of all compilation units, except those excluded by options -exceptMethod, -exceptClass, -exceptUnit.
    • -un, -unitn supercompile the nth compilation unit. Units are counted as listed on the command line. The default is -unit1.
    • -xun, -exceptUnitn do not supercompile the nth compilation unit.
    • -c classIdentifier, -class classIdentifier supercompile the methods of the top-level class(es) with the given identifier.
    • -xc classIdentifier, -exceptClass classIdentifier do not supercompile classes with the given identifier.
    • -m methodIdentifier, -method methodIdentifier supercompile method(s) with the given identifier.
    • -xm methodIdentifier, -exceptMethod methodIdentifier do not supercompile methods with the given identifier.
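
The joint effect of -invoken and -recurn described above can be written down as a small decision rule. The sketch below is only a model of that description: class InlineLimitSketch, the constant NO_LIMIT and the parameter names are hypothetical, and the way JScp counts invocations internally is an assumption here (depth is taken to be the number of invocations already inlined on the current chain).

```java
// Hypothetical model of the -invoke / -recur rule; not JScp's implementation.
class InlineLimitSketch {
    // Stands for -invoke or -recur given without a number (no limit).
    static final int NO_LIMIT = -1;

    // depth:           invocations already inlined on the current chain (ASSUMED meaning)
    // sameMethodCount: how many of those are invocations of the method now considered
    // invokeLimit:     n of -invoken, or NO_LIMIT
    // recurLimit:      n of -recurn, or NO_LIMIT
    static boolean mayInline(int depth, int sameMethodCount, int invokeLimit, int recurLimit) {
        boolean withinInvoke = invokeLimit == NO_LIMIT || depth < invokeLimit;
        boolean withinRecur = recurLimit == NO_LIMIT || sameMethodCount < recurLimit;
        // Inlining stops as soon as either limit is reached.
        return withinInvoke && withinRecur;
    }

    public static void main(String[] args) {
        System.out.println(mayInline(0, 0, 1, 1));  // defaults -invoke1 -recur1, first call: true
        System.out.println(mayInline(1, 0, 1, 1));  // nested call under the defaults: false
        System.out.println(mayInline(0, 0, 0, 5));  // -invoke0: false
        System.out.println(mayInline(0, 0, 5, 0));  // -recur0: false
    }
}
```

In this model -invoke0 and -recur0 both forbid any inlining, and the (n+1)th invocation of the same method is refused under -recurn, in line with the description above.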

Run jscp.exe without arguments to see the list of all command-line options. Most of them set modes and strategies of supercompilation. The command-line options set default values for the whole supercompilation. More precise options at the level of each class and method may be set in a separate JScp task file in XML format. This will be described elsewhere.

Known bugs and underdevelopments

The main underdevelopment of versions 0.1.x is the simplified configuration analysis. This means that no new methods are generated by JScp. Driving has been implemented to a large extent. Our current goal is to complete the development, testing and debugging of the ground level of supercompilation, which relies mainly on driving, and to apply it to optimizing and specializing those Java programs for which this collection of program transformation techniques is sufficient. Then we will continue with the development of the next levels of configuration analysis.

The most unpleasant known bugs in the current version of the Java Supercompiler are as follows:

  • JScp composes the resulting Java program by merging supercompiled methods with fragments of the original Java files that are synthesized from the internal parsed representation. Some constructs of Java 1.1 are output incorrectly now. This results in syntax errors when compiling the resulting Java program. The Java subset corresponding to Java 1.0 is output correctly.
  • Java code is moved from one class to another by inlining. In the new class, access to the old members from the moved code may not be allowed by the modifiers private, protected or default, and hence the resulting program contains errors reported by the Java compiler. For now the user has to change the modifiers to public manually. In the future JScp will itself output Java files with modified access.
  • The try statement is implemented "too approximately" now. More special cases should be considered. This means that rather little information is now propagated from its body to the statements after try. This noticeably limits the depth of program specialization when the try statement is used often, which is the case in a lot of real programs.
  • Moreover, even the current approximation of the try statement is not always correct, being "not general enough". The main underdevelopment of try that may result in an incorrect program transformation is the absence of a check whether the values of local variables may change inside try in such a way that they have different values on exit from try via an exception than via normal control flow.
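
The last point can be illustrated by a small hand-written program. Class TrySketch and method stateAfterTry are hypothetical and serve only to show the kind of code in which a local variable leaves a try statement with different values on the normal and the exceptional paths. It is a sketch of the situation, not a confirmed test case for JScp.

```java
// Hypothetical example of a local variable whose value after 'try'
// depends on whether an exception was thrown inside it.
class TrySketch {
    static int stateAfterTry(String text) {
        int state = 0;
        try {
            state = 1;
            Integer.parseInt(text);   // may throw NumberFormatException
            state = 2;                // reached only on normal control flow
        } catch (NumberFormatException e) {
            // Reached with state == 1, not 2.
        }
        // A transformation that assumes state == 2 here would be incorrect:
        // the value differs between the two ways of leaving 'try'.
        return state;
    }

    public static void main(String[] args) {
        System.out.println(stateAfterTry("42") + " " + stateAfterTry("abc")); // prints "2 1"
    }
}
```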

Bug Reports and Feedback

Send questions, comments and bug reports to info@supercompilers.com. Thank you!


Copyright (C) 1999-2008 Supercompilers, LLC   Last modified 3 May 2008 by Andrei.Klimov@supercompilers.com