zafena development

March 10, 2010

When making a programming tool or a virtual machine getting the tool running perfectly stable without any crash bugs are always on a higher priority than gaining more speed. A crashing tool are a broken tool so I will share some tricks that I have practised to find and fix Shark LLVM JIT CodeGen crash bugs. The main trick are to be able generate reproducable testcases that can be reported to the LLVM developers bugzilla bugtracker by using what you can extract from the Shark LLVM JIT CodeGen crashes. Here is how I do it, enjoy!

How to provoke hard to find Shark LLVM JIT bugs
Some Shark LLVM JIT bugs are hard to find because they only occour after the Shark JIT enabled JVM have been running for a long time, this are because the Shark Hotspot JVM takes advantage of the fact that a given running application spends about 90% of its time running only 10% of the applications code. Hotspot profiles the running code and only JITs the most frequently used methods of the program. Hotspot uses a threshold to determine which methods to JIT. When a method have been used more than 100000 times then it are scheduled to be optimized by the JIT. JIT bugs can stay undetected if they are located in unfrequently executed methods, those methods that makes up the 90%, of the unfrequently executed application code.

A easy trick to provoke unfrequently executed JIT bugs are to lower the JIT threshold in Hotspot so that Hotspot JITs everything. The JIT threshold can be controlled by using the -XX:CompileThreshold=1 option and -Xbatch option. -Xbatch prevents the hotspot from running the JIT in background and will make hotspot reproduce JIT bugs more determistic.

Using a low JIT threshold will of course make the program startup magnitudes slower but it will also eventually find and hit all JIT bugs for a given application. Try pass -XX:+PrintCompilation to Hotspot as well so that you can observe all the java methods that Hotspot are JITting and find out which method that failed to JIT if Hotspot hits a JIT crash bug.
java -XX:CompileThreshold=1 -Xbatch -XX:+PrintCompilation JavaApplication
1 b java.lang.Thread:: (49 bytes)
10 b java.lang.String::getChars (66 bytes)
/home/xerxes/llvm/include/llvm/CodeGen/MachineFrameInfo.h:289: int64_t
llvm::MachineFrameInfo::getObjectOffset(int) const: Assertion
`!isDeadObjectIndex(ObjectIdx) && "Getting frame offset for a dead object?"'

Huh.. no logfile??
Most Shark LLVM JIT CodeGen crash bugs makes the JVM instantainiously exit without producting a hs_err_pid*.log file. Whats usefull are that the JVM output will contain a Assertion, Unreachable or Unimplemented keyword and a LLVM code line numer.

So what do we do now?
Thanks by using -XX:+PrintCompilation makes us aware that the last method JITed was the java.lang.String::getChars method and that caused the Assertion in the LLVM CodeGen when running the Shark JIT so the next step are to dump the LLVM IR that Shark have generated for the method.

Extract the LLVM IR for the java method that makes the Shark JIT crash.
Ok so we got a crash and we know that it was JITing of java.lang.String::getChars that caused it.

a) shark debug build -XX:SharkPrintBitcodeOf= method:
If you have built a debuggable "Mixtech" Shark build then Shark will contain some extra usefull debug runtimeoptions where one of the more usefull are
use it and Shark will dump the LLVM IR bitcode to stdout just before jitting it.

b) gdb call F->dump() method:
I personally prefer dumping LLVM IR from inside the gnu gdb debugger since this method can be used using release Shark build in combination with release llvm builds so lets jump into the gdb debugger!

Start gdb and attach it to the java application with all the options that triggered the JIT CodeGen bug!
$ gdb -args java -XX:CompileThreshold=1 -Xbatch -XX:+PrintCompilation JavaApplication
(gdb) run
Segmentation fault

Ick gdb crashed why? This are because the JVM launcher "java" first sets up the system environment and then forks off in a new process using execve(). gdb gets killed by the linux kernel when it are trying to read memory across process boundarys so we must stop java from forking!

The easiest way to prevent java from forking are to setup the system environments before launching the application. And all this can be done from inside gdb so lets try again!
$ gdb -args java -XX:CompileThreshold=1 -Xbatch -XX:+PrintCompilation JavaApplication
(gdb) break execve
Breakpoint 1 at 0x93b8
(gdb) run
(gdb) call puts(getenv("LD_LIBRARY_PATH"))
$1 = 220

Ok now we know what the LD_LIBRARY_PATH should look like and if we set it before running the java launcher will prevent java from forking using execve, this LD_LIBRARY_PATH and execve madness are thankfully gone in JDK7!
(gdb) set env LD_LIBRARY_PATH=/media/disk/
I will do one more thing namely set a gdb breakpoint inside java_md.c:652 right after the hotspot library have been loaded by the java launcher.
(gdb) break java_md.c:652
(gdb) run
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Breakpoint 2, LoadJavaVM ... java_md.c:652
652 if (libjvm == NULL) {

This are a good spot to setup new gdb breakpoints inside the loaded that contains the Shark JIT. Finally we are able to place a breakpoint on the line where the Shark JIT failed inside LLVM.
(gdb) break MachineFrameInfo.h:289
(gdb) continue
10 b java.lang.String::getChars (66 bytes)
[Switching to Thread 0x67ed96a490 (LWP 21127)]

Breakpoint 3, ... at ... MachineFrameInfo.h:289

Get a backtrace and try to locate the frame where Shark calls getPointerToFunction
(gdb) bt
#9 0x40d4ee68 in llvm::JIT::getPointerToFunction (this=0x9e138, F=0xda6f0)

Switch to the getPointerToFunction stack frame
(gdb) frame 9
and finnaly dump the LLVM IR for the function by calling the functions own method dump() !
(gdb) call F->dump()
define internal void @"java.lang.String::getChars"([84 x i8]* %method, i32 %base_pc, [788 x i8]* %thread) {
%1 = getelementptr inbounds [788 x i8]* %thread, i32 0, i32 756 ; [#uses=1]
%zero_stack = bitcast i8* %1 to [12 x i8]* ; <[12 x i8]*> [#uses=1]
%2 = getelementptr inbounds [12 x i8]* %zero_stack, i32 0, i32 8 ; [#uses=1]
%stack_pointer_addr = bitcast i8* %2 to i32* ; [#uses=1]
%3 = load i32* %stack_pointer_addr ; [#uses=1]
%142 = getelementptr inbounds [17 x i32]* %frame, i32 0, i32 12 ; [#uses=1]
store i32 %31, i32* %142
call void inttoptr (i32 13839116 to void ([788 x i8]*, i32)*)([788 x i8]* %thread, i32 7)
ret void

Horray! we have successfully dumped the Shark generated LLVM IR for the problematic method-call. Now simply copy the dump output from the terminal into a file named bug.ll and continue reading.

Check for LLVM CodeGen bugs by testing if the dumped LLVM IR bug.ll file can reproduce the bug using llc
After you have extracted LLVM IR for the problematic method check if you can reproduce the bug using llc..
$ llvm-as < bug.ll | llc
.syntax unified
.eabi_attribute 20, 1
.eabi_attribute 21, 1
.eabi_attribute 23, 3
.eabi_attribute 24, 1
.eabi_attribute 25, 1
.file ""
int64_t llvm::MachineFrameInfo::getObjectOffset(int) const: Assertion
`!isDeadObjectIndex(ObjectIdx) && "Getting frame offset for a dead object?"'
0 llc 0x01368414
1 llc 0x01368ccc
2 0x4021cc10 __default_sa_restorer_v2 + 0
Stack dump:
0. Program arguments: /wd/r96575/Debug/bin/llc -march=arm
1. Running pass 'Prolog/Epilog Insertion & Frame Finalization' on function

If it crashes using llc then cheer up because you now got a reproducable CodeGen bug and thats great! These kind of crash bugs are on LLVM developers top wanted list because they can fire on any tool that uses LLVM code generation. The best way to report this kind of bugs are to first generate a compact testcase for LLVM that triggers the bug that can be used by the LLVM developers to fix it. It can also be run by the LLVM developers daily regression testing to make sure this bug never hits again.

If it fails to crash with an Aborted like above then you are probably observing a JIT CodeEmitter runtime bug, stay tuned and look forward to my next blog post on "How to fix Shark LLVM JIT CodeEmitter bugs"!

How to generate a bugpoint-reduced-simplified.bc from the bug.ll using bugpoint for CodeGen crash bugs
LLVM ships with a clever tool called bugpoint that are designed to convert dumped blocks of LLVM IR into a compact bugpoint-reduced-simplified.bc LLVM bitcode testcase file that only contains the instructions needed to reproduce the bug.

$ bugpoint -run-llc bug.ll --tool-args -march=arm

Bugpoint work by using deductive logic to break down and remove parts of the bug.ll file and automatically narrow down the LLVM IR lines needed to reproduce the bug. It can take some minutes so be patient but bugpoint will eventually stop and give you a bugpoint-reduced-simplified.bc and print some information on how to reproduce the bug.

File a LLVM bugreport containing the bugpoint-reduced-simplified.bc file
An example of a Shark JIT LLVM bug that have been fixed after submitting a bugpoint-reduced-simplified.bc produced from a dumped Shark metod are :
LLVM PR6478 ARM CodeGen Running pass 'Prolog/Epilog Insertion & Frame Finalization' on function '@"java.lang.String::getChars"'

I hope this post have given you some inspiration on how to get Shark LLVM JIT CodeGen crash bugs fixed!
If you want to know more about how bugpoint works and how to officially prepare LLVM bugreports then take a peek at the LLVM documentation: its great.


March 9, 2010

I have from time to time been working on a automatic CPU feature tuning code for Shark to make the LLVM JIT generate better code for Cortex-A8 class ARM CPUs. Using it i was able to gain some substantial speed-improvements, all in all it made Shark generate 30% faster code on ARM and the Shark JIT are now able to beat the 2000 point CaffeineMark 3.0 score! I could not resist using CaffeineMark 3.0 for some benchmarking again :) .

While this looks all great with rainbows and unicorns the patch are sadly in limbo. I have had some trouble merging the code into LLVM 2.7 trunk before the next LLVM release since I got hit by LLVM problem report 6544 that will force me to redesign the implementation on top of LLVM before it can be commited.

And a similar optimisation like what i did for ARM Linux can easily be done for Shark on PPC Linux as well by adopting the ARM CPU features detection code from ARM to PPC!
For those who are interested the optimising code are submitted and can be fetched from LLVM PR5389.

Cheers and have a great day!

Powered by WordPress