Introduction
This is a list of things to hack on. It’s mostly meant for people who are new to Jato and want to find something interesting to hack on.
Priority
This section contains projects that are considered to be most important to make Jato a top notch alternative JVM for contemporary applications that are dominated by alternative languages such as Scala, Clojure, and JRuby.
LLVM backend
Implemeting a production quality JIT compiler for a new architecture is time-consuming and hard. The goal of this project is to simplify Jato porting by implementing a code generator using LLVM C bindings. The work has already been started and it is visible in the llvm/core branch.
- Required skills
-
C, LLVM, JVM
- Difficulty
-
Medium to Hard
Darwin Port
Darwin (a.k.a. Mac OS X) is a popular platform among software developers. The goal of this project is to get rid of Linuxism such as:
-
Thread-local storage (TLS)
-
Signal handling
and be able to run Jato under Darwin.
- Required skills
-
C, Darwin
- Difficulty
-
Medium to Hard
Interpreter
Jato supports JIT-only execution which is slow for some applications. For example, startup time for short-lived applications can be reduced with a simple interpreter. The primary goal of this project is to improve vm/interp.c to cover the whole bytecode instruction set. Secondary goal for this project is to implement mixed-mode execution where we start out by interpreting bytecode and only JIT compile methods that are invoked a lot of times.
- Required skills
-
C, JVM
- Difficulty
-
Medium
JSR 292: invokedynamic
The goal of this project is to add support for the new invokedynamic bytecode instruction specified in JSR 292 that’s designed to improve execution performance of dynamic languages such as JRuby and Redline Smalltalk. Please note that you also need to do some work in GNU Classpath to be able to run full invokedynamic applications.
- Required skills
-
C, Java
- Difficulty
-
Medium to Hard
Support for OpenJDK
Jato uses GNU Classpath to provide essential Java APIs. Unfortunately the development of GNU Classpath seems to have slowed after OpenJDK was released. The purpose of this project is to OpenJDK as an alternative for providing essential APIs for better Java compatibility. One possible way to do that is to use The OpenJDK Common Virtual Machine Interface CVMI as a starting point.
- Required skills
-
C, Java
- Difficulty
-
Medium
Performance
SSA form
The JIT compiler support SSA form but it’s currently disabled by default because it breaks some DaCapo benchmarks. The goal of this project is to fix all the remaining issues in SSA form conversion so that it can be enabled by default.
- Required skills
-
C, x86
- Difficulty
-
Hard
Method Inlining
Method inlining is an optimization where a method invocation is replaced with an inline copy of the invoked method body. As shown by [Suganuma02], static inlining decisions only make sense for tiny methods where the body of the method is smaller than the invocation site footprint. The purpose of this project is to implement method inlining for tiny static methods and optionally, if inline cache support is added to the VM, inlining of tiny virtual and interface methods to the inline cache.
- Required skills
-
C, x86
- Difficulty
-
Medium
Array Bounds Check Elimination
Out-of-bounds array stores and loads are required to throw the IndexOutOfBoundsException. Currently the JIT compiler generates a bounds check for every array access which is slow as seen in the SciMark2 benchmark, for example. Fortunately, much of the array bounds checks can be eliminated which speeds up SciMark by 40% and SPECjvm98 2% on average [Wuerthinger07].
- Required skills
-
C
- Difficulty
-
Medium
Register Allocator Improvements
Linear scan register allocator functions show up high in performance profiles for many applications. The goal of this project is to improve both the speed of the register allocator and the generated code. One such improvement would be to use the SSA form in the register allocator [Moe02].
- Required skills
-
C
- Difficulty
-
Medium
Startup Time
The purpose of this project is to improve VM startup time. There are two different classes of applications to optimize: command line applications such as Ant and Maven and graphical user interface applications such as Eclipse. Both classes might require different kinds of hacks to speed up the start up time. You might need to optimize GNU Classpath and external libraries Jato uses during VM startup.
- Required skills
-
C, Java
- Difficulty
-
Medium
Tail Call Optimizations
Functional languages such as Scala and Clojure make heavy use of tail-calls. The purpose of this project is to implement tail call optimizations to speed up functional languages on JVM. For background material, check out HotSpot Request for Enhancement on tail-call optimizations and a related blog post by John Rose ("Tail calls in the VM").
- Required skills
-
C, x86
- Difficulty
-
Hard
Virtual Machine
Ahead-of-time Compilation
The goal of this project is to implement support for ahead-of-time compilation.
- Required skills
-
C, JVM
- Difficulty
-
Hard
Garbage Collector
Exact GC
Jato uses Boehm GC as its garbage collector. While Boehm GC is stable and reliable, it’s unnecessarily slow because it’s based on a conservative GC algorithm. The first step for a fully integrated and fast garbage collector is to implement an exact GC that uses safepoints (also known as GC points) to stop-the-world and GC maps to enumerate the GC root set. There’s a broken implementation of safepoints in vm/gc.c that serves as a starting point for the project.
- Required skills
-
C, x86
- Difficulty
-
Medium
Compacting GC
This project is for implementing a compacting GC on top of the core GC to reduce memory fragmentation and speed up object allocation in the GC. Please note that this requires support from the VM and the JIT compiler because you need to be able to move objects during compacting phase.
- Required skills
-
C, x86
- Difficulty
-
Hard
Low Latency GC
The stop-the-world part of a GC can cause latencies in the order of many milliseconds in Java applications. While there are pauseless GCs for special purpose hardware, the purpose of this project is to implement GC latency monitoring tools for the VM and investigate and implement practical solutions for reducing GC latency on commodity hardware.
- Required skills
-
C, x86
- Difficulty
-
Hard
x86
Legacy floating point support (x87)
Add instruction selection and code emission for floating point arithmetic using the x87 FPU instruction such as fadd and fdiv. The main difference to the project idea above is that the x87 instructions do not operate on regular registers but use a stack-based approach which makes instruction selection harder. As the SSE2 approach is more performant, the x87 support is only for compatilibity with older x86 CPUs that do not support SSE.
- Required skills
-
C, x86
- Difficulty
-
Medium
ARM
ARMv5 support
The JIT compiler has preliminary support for ARMv5 but there’s still more work to be done to support the full JVM instruction set.
- Required skills
-
C, ARM
- Difficulty
-
Hard
Thumb2 support
The Thumb2 instruction set on ARM is more compact than the 32-bit ARM instruction set which is important for embedded devices.
- Required skills
-
C, ARM
- Difficulty
-
Hard
Portability
Machine Architectures
To port the VM to a new architecture, you need to introduce a new arch/<target> directory that implements the following machine architecture specific parts:
-
Instruction encoding (see arch/x86/emit.c and arch/x86/encoding.c)
-
Instruction selection (see arch/x86/insn-selector.brg)
-
Exception handling (see arch/x86/unwind_{32,64}.S)
-
Signal bottom halves (see arch/x86/signal.c and arch/x86/signal-bh.sh)
-
sun/misc/Unsafe locking primitives (see arch/x86/include/arch/cmpxchg*.h)
You also need to make sure the core code in vm, jit, and runtime directories works on your architecture. Special attention needs to be paid when porting to big endian CPUs because of our x86 centric heritage.
- Required skills
-
C, target machine architecture
- Difficulty
-
Hard
Windows Port
The goal of this project is to port Jato to Windows. Please note that the codebase currently is very GCC and Linux specific which are likely to bite you when porting to native Windows toolchain:
-
ABI issues (e.g. Microsoft x64 calling conventions)
-
Thread-local storage (TLS)
-
Signal handling
-
POSIXism
- Required skills
-
C, Windows
- Difficulty
-
Medium to Hard
Clang
Clang is an interesting new compiler built on top of LLVM that claims faster compilation times and better code generation. The purpose of this project is to fix blatant GCC-ism in Jato to make it compile with Clang. Please note that you might need to dive into LLVM and Clang sources if you run into Clang crashes while compiling the VM.
- Required skills
-
C, possibly LLVM
- Difficulty
-
Medium to Hard
JVM
Java Native Interface (JNI) support
The VM has partial support for Java Native Interface (JNI) API. The goal of this project is to finish the API.
- Required skills
-
C, Java
- Difficulty
-
Medium
JVM Tool Interface (JVM TI)
The JVM Tool Inteface is a native programming interface that’s used by development tool such as commercial YourKit Java Profiler.
- Required skills
-
C, Java
- Difficulty
-
Medium
Tools Support
Valgrind
You can run Jato under Valgrind to detect bugs in the VM. However, to accomplish that, Jato generates slightly different code because using signals for lazy class initialization interracts badly with Valgrind. Furthermore, throwing a NullPointerException through SIGSEGV also upsets Valgrind.
The goal of this project is to fix up Jato and Valgrind interractions. This might require changes possibly in both projects.
- Required skills
-
C, x86
- Difficulty
-
Medium
GDB
Jato already hooks into GDB’s JIT interface to provide basic debugging functionality such as backtraces and breakpoints. This could use a cleanup and various improvements, like the ability to display the arguments passed to a method. We could implement a better mangling scheme, but it’s probably better to use the new custom debug info support in GDB 7.4. Check the GDB documentation for details.
Another possible improvement could be minimizing the need to disable optimizations.
This task should be rather straightforward, but it might require digging through GDB to figure out possibly missing details from the documentation. It can be scaled according to the features one proposes to implement.
- Required skills
-
C, GDB, x86(-64)
- Difficulty
-
Medium, depends on the scope of the proposal
Other
fork() support
The JRuby VM is unable to support fork() because OpenJDK does not support it. The goal of this project is to make Java programs running under Jato be able to call fork() and still have a working system.
- Required skills
-
C, POSIX
- Difficulty
-
Medium
POSIX API support
Projects like JRuby and Cassandra rely heavily on OS specific APIs for correctness and performance reasons. The goal of this project is to support POSIX APIs in Jato efficiently. Ideally, we should be able to inline the actual API calls directly to the JIT’d machine code so that there’s zero overhead.
As a starting point, you should look into the JNR POSIX project to see what Jato can do to make it work and perform well out-of-the-box.
- Required skills
-
C, POSIX
- Difficulty
-
Medium
JIT API
VMs that run on the JVM need to transform their internal representation into bytecode for the JIT to kick in. This is wasteful because JVMs will immediately transform the bytecode to their own intermediate representation.
The goal of this project is to implement a Java API that allows Java programs to generate JIT’d code directly and bypass the bytecode conversion layer.
- Required skills
-
C, POSIX
- Difficulty
-
Medium
JIT preloading
The goal of this project is to implement support for saving off JITted code and preload it to speed up startup time.
- Required skills
-
C
- Difficulty
-
Medium
Dalvik
The Dalvik VM is very similar to the JVM semantically. The goal of this project is to implement support for running Dalvik applications under Jato.
- Required skills
-
C, POSIX
- Difficulty
-
Hard
References
-
[Cooper01] Keith Cooper et al. A Simple, Fast Dominance Algorithm. 2001. URL
-
[Moe02] Hanspeter Mössenböck and Michael Pfeiffer. Linear Scan Register Allocation in the Context of SSA Form and Register Constraints. 2002. URL
-
[Suganuma02] Toshio Suganuma et al. An Empirical Study of Method Inlining for a Java Just-In-Time Compiler. URL
-
[Wuerthinger07] Thomas Würthinger and Christian Wimmer. Array Bounds Check Elimination for the Java HotSpot™ Client Compiler. 2007. URL