zafena development

May 2, 2011

My first Android window running on top of an unmodified OpenJDK using IcedRobot.

An Android window running on top of an unmodified OpenJDK 6b22 using IcedRobot.

While I was reviewing Guillaume's graphics stack patches for IcedRobot this window popped up on my desktop. I'm amazed: IcedRobot is able to run Android Activities on top of a GNU/Linux desktop using an unmodified OpenJDK 6b22. Android app developers, prepare to see your apps migrate onto GNU/Linux desktops!

If you are new to IcedRobot then check out Mario's nice statement on the current overall progress of the project.

March 24, 2011

For a long time I never looked at using JavaScript in any of my projects because of its untyped nature. But now everything has changed.
Thanks to the Processing.js project it is now possible to write complex JavaScript code using the familiar typed "Java" syntax of the http://processing.org language.

You can now take existing Processing "Java" sketches and convert them into plain JavaScript code without changing a line of the original Processing .pde file!
WOW! This is the coolest thing I have seen in a long time, since it enables me to start creating all kinds of nice HTML5 doodles using the same syntax used in regular "Java" programs and then run them on any HTML5-capable device without having to install a "Java" browser plugin!
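To give a flavour of the syntax, here is a minimal bouncing-ball sketch (my own illustration, not taken from any of the archives mentioned here); the identical .pde source runs as Java in the Processing IDE and as JavaScript in the browser through Processing.js:

// bounce.pde - the same typed "Java" source runs in the Processing IDE
// and, via Processing.js, as JavaScript on any HTML5 canvas.
float x = 0;      // current ball position
float speed = 2;  // pixels moved per frame

void setup() {
  size(300, 100);
}

void draw() {
  background(0);
  x += speed;
  if (x < 0 || x > width) speed = -speed;  // bounce at the edges
  ellipse(x, height / 2, 20, 20);
}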

Have fun and jump instantly into some coding action at http://sketchpad.cc - an online Processing.js IDE!

If you want to get inspired on how to create Processing sketches, check out the awesome OpenProcessing sketch archive at http://openprocessing.org.


Shown here is reach3 running, from the http://processingjs.org/learning/topic archive.

Thanks a lot to John Resig for making this work. (Get the source! MIT licensed)

March 18, 2011

Thanks to the IcedRobot project, it is now possible to run Android .dex files on top of any GNU/Linux system using OpenJDK and JamVM!
xranby@babbage:/wd/daneel$ java -jamvm -showversion -cp target/daneel-0.0.1-SNAPSHOT-jar-with-dependencies.jar \
-Djava.system.class.loader=org.icedrobot.daneel.loader.DaneelClassLoader -Ddaneel.class.path=src/test/java/resources/HelloDroid.dex \
org.icedrobot.test.HelloDroid

java version "1.6.0_18"
OpenJDK Runtime Environment (IcedTea6 1.8.3) (6b18-1.8.3-2+squeeze1)
JamVM (build 1.6.0-devel, inline-threaded interpreter with stack-caching)

Trying to find class 'org.icedrobot.test.HelloDroid' ...
Hello Android!

xranby@babbage:/wd/daneel$ uname -a
Linux babbage 2.6.28-11-imx51 #42 Tue Jun 23 11:27:23 BST 2009 armv7l GNU/Linux
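The test class itself is nothing exotic. I have not copied the Daneel sources here, but judging from the output a plain-Java equivalent would look roughly like this, compiled with javac and then converted to HelloDroid.dex with the Android dx tool:

// Hypothetical reconstruction of the HelloDroid testcase: ordinary Java,
// turned into a .dex file that Daneel's class loader translates back to
// JVM bytecode at class-load time.
package org.icedrobot.test;

public class HelloDroid {
    public static void main(String[] args) {
        System.out.println("Hello Android!");
    }
}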


February 3, 2011

http://builder.classpath.org/icedtea/buildbot/waterfall

Thank you, Mark, for setting up this nice BuildBot master for the IcedTea project!

March 10, 2010

When making a programming tool or a virtual machine, getting the tool to run perfectly stable without any crash bugs always has higher priority than gaining more speed. A crashing tool is a broken tool, so I will share some tricks that I have practised to find and fix Shark LLVM JIT CodeGen crash bugs. The main trick is to generate reproducible testcases from what you can extract out of the Shark LLVM JIT CodeGen crashes, which can then be reported to the LLVM developers' Bugzilla bugtracker. Here is how I do it, enjoy!

How to provoke hard-to-find Shark LLVM JIT bugs
Some Shark LLVM JIT bugs are hard to find because they only occur after a JVM with the Shark JIT enabled has been running for a long time. This is because the Shark HotSpot JVM takes advantage of the fact that a given running application spends about 90% of its time running only 10% of its code. HotSpot profiles the running code and only JITs the most frequently used methods of the program, using a threshold to determine which methods to compile: when a method has been used more than 100000 times it is scheduled for optimization by the JIT. JIT bugs can therefore stay undetected if they are located in the infrequently executed methods, the ones that make up the remaining 90% of the application code.

An easy trick to provoke these infrequently executed JIT bugs is to lower the JIT threshold in HotSpot so that HotSpot JITs everything. The JIT threshold can be controlled with the -XX:CompileThreshold=1 option together with the -Xbatch option; -Xbatch prevents HotSpot from running the JIT in the background and makes HotSpot reproduce JIT bugs more deterministically.

Using a low JIT threshold will of course make program startup magnitudes slower, but it will also eventually find and hit all JIT bugs for a given application. Try passing -XX:+PrintCompilation to HotSpot as well, so that you can observe all the Java methods that HotSpot is JITting and find out which method failed to JIT if HotSpot hits a JIT crash bug.
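Any program will do in place of the JavaApplication placeholder used below; as an illustration (a toy of my own, not a real testcase), the following class hammers java.lang.String::getChars hard enough that HotSpot will queue it for JIT compilation:

// Toy stand-in for the "JavaApplication" placeholder used below.
// The loop calls String.getChars() repeatedly so that the method
// crosses the compile threshold and is handed to the Shark JIT.
public class JavaApplication {
    public static void main(String[] args) {
        char[] buf = new char[5];
        for (int i = 0; i < 1000000; i++) {
            "Hello".getChars(0, 5, buf, 0);
        }
        System.out.println(buf);
    }
}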
java -XX:CompileThreshold=1 -Xbatch -XX:+PrintCompilation JavaApplication
1 b java.lang.Thread::<init> (49 bytes)
...
10 b java.lang.String::getChars (66 bytes)
*crash*
/home/xerxes/llvm/include/llvm/CodeGen/MachineFrameInfo.h:289: int64_t
llvm::MachineFrameInfo::getObjectOffset(int) const: Assertion
`!isDeadObjectIndex(ObjectIdx) && "Getting frame offset for a dead object?"'
failed.

Huh.. no logfile??
Most Shark LLVM JIT CodeGen crash bugs make the JVM exit instantaneously without producing an hs_err_pid*.log file. What is useful is that the JVM output will contain an Assertion, Unreachable or Unimplemented keyword and an LLVM source line number.

So what do we do now?
Using -XX:+PrintCompilation made us aware that the last method JITted was java.lang.String::getChars and that it caused the assertion in the LLVM CodeGen when running the Shark JIT, so the next step is to dump the LLVM IR that Shark has generated for the method.

Extract the LLVM IR for the Java method that makes the Shark JIT crash
OK, so we have a crash and we know that the JITting of java.lang.String::getChars caused it.

a) Shark debug build, the -XX:SharkPrintBitcodeOf= method:
If you have built a debuggable "Mixtech" Shark build then Shark will contain some extra useful debug runtime options, one of the more useful being
-XX:SharkPrintBitcodeOf=java.package.name::MethodName
Use it and Shark will dump the method's LLVM IR bitcode to stdout just before JITting it.

b) The gdb call F->dump() method:
I personally prefer dumping LLVM IR from inside the GNU gdb debugger, since this method can be used with a release Shark build in combination with release LLVM builds. So let's jump into the gdb debugger!

Start gdb and launch the java application with all the options that triggered the JIT CodeGen bug!
$ gdb -args java -XX:CompileThreshold=1 -Xbatch -XX:+PrintCompilation JavaApplication
(gdb) run
...
Segmentation fault
$

Ick, gdb crashed. Why? This is because the JVM launcher "java" first sets up the system environment and then forks off into a new process using execve(). gdb gets killed by the Linux kernel when it tries to read memory across process boundaries, so we must stop java from forking!

The easiest way to prevent java from forking is to set up the environment variables before launching the application. All of this can be done from inside gdb, so let's try again!
$ gdb -args java -XX:CompileThreshold=1 -Xbatch -XX:+PrintCompilation JavaApplication
(gdb) break execve
Breakpoint 1 at 0x93b8
(gdb) run
(gdb) call puts(getenv("LD_LIBRARY_PATH"))
/media/disk/4mar-shark-1.8pre-b18-llvm-2.7svn.so-npplugin/jre/lib/arm/server:/media/disk/4mar-shark-1.8pre-b18-llvm-2.7svn.so-npplugin/jre/lib/arm:/media/disk/4mar-shark-1.8pre-b18-llvm-2.7svn.so-npplugin/jre/../lib/arm
$1 = 220

OK, now we know what the LD_LIBRARY_PATH should look like, and setting it before running the java launcher will prevent java from forking using execve. This LD_LIBRARY_PATH and execve madness is thankfully gone in JDK7!
(gdb) set env LD_LIBRARY_PATH=/media/disk/4mar-shark-1.8pre-b18-llvm-2.7svn.so-npplugin/jre/lib/arm/server:/media/disk/4mar-shark-1.8pre-b18-llvm-2.7svn.so-npplugin/jre/lib/arm:/media/disk/4mar-shark-1.8pre-b18-llvm-2.7svn.so-npplugin/jre/../lib/arm
I will do one more thing, namely set a gdb breakpoint at java_md.c:652, right after the HotSpot library libjvm.so has been loaded by the java launcher.
(gdb) break java_md.c:652
(gdb) run
The program being debugged has been started already.
Start it from the beginning? (y or n) y
...
Breakpoint 2, LoadJavaVM ... java_md.c:652
652 if (libjvm == NULL) {

This is a good spot to set up new gdb breakpoints inside the loaded libjvm.so that contains the Shark JIT. Finally we are able to place a breakpoint on the line where the Shark JIT failed inside LLVM.
(gdb) break MachineFrameInfo.h:289
(gdb) continue
Continuing.
...
10 b java.lang.String::getChars (66 bytes)
[Switching to Thread 0x67ed96a490 (LWP 21127)]

Breakpoint 3, ... at ... MachineFrameInfo.h:289

Get a backtrace and try to locate the frame where Shark calls getPointerToFunction
(gdb) bt
...
#9 0x40d4ee68 in llvm::JIT::getPointerToFunction (this=0x9e138, F=0xda6f0)
...

Switch to the getPointerToFunction stack frame
(gdb) frame 9
and finally dump the LLVM IR for the function by calling the function's own dump() method!
(gdb) call F->dump()
define internal void @"java.lang.String::getChars"([84 x i8]* %method, i32 %base_pc, [788 x i8]* %thread) {
%1 = getelementptr inbounds [788 x i8]* %thread, i32 0, i32 756 ; [#uses=1]
%zero_stack = bitcast i8* %1 to [12 x i8]* ; <[12 x i8]*> [#uses=1]
%2 = getelementptr inbounds [12 x i8]* %zero_stack, i32 0, i32 8 ; [#uses=1]
%stack_pointer_addr = bitcast i8* %2 to i32* ; [#uses=1]
%3 = load i32* %stack_pointer_addr ; [#uses=1]
...
%142 = getelementptr inbounds [17 x i32]* %frame, i32 0, i32 12 ; [#uses=1]
store i32 %31, i32* %142
call void inttoptr (i32 13839116 to void ([788 x i8]*, i32)*)([788 x i8]* %thread, i32 7)
ret void
}

Hooray! We have successfully dumped the Shark-generated LLVM IR for the problematic method. Now simply copy the dump output from the terminal into a file named bug.ll and continue reading.

Check for LLVM CodeGen bugs by testing whether the dumped LLVM IR bug.ll file can reproduce the bug using llc
After you have extracted the LLVM IR for the problematic method, check if you can reproduce the bug using llc:
$ llvm-as < bug.ll | llc
.syntax unified
.eabi_attribute 20, 1
.eabi_attribute 21, 1
.eabi_attribute 23, 3
.eabi_attribute 24, 1
.eabi_attribute 25, 1
.file ""
llc:
/wd/buildbot/llvm-arm-linux/llvm/include/llvm/CodeGen/MachineFrameInfo.h:289:
int64_t llvm::MachineFrameInfo::getObjectOffset(int) const: Assertion
`!isDeadObjectIndex(ObjectIdx) && "Getting frame offset for a dead object?"'
failed.
0 llc 0x01368414
1 llc 0x01368ccc
2 libc.so.6 0x4021cc10 __default_sa_restorer_v2 + 0
Stack dump:
0. Program arguments: /wd/r96575/Debug/bin/llc -march=arm
1. Running pass 'Prolog/Epilog Insertion & Frame Finalization' on function
'@"java.lang.String::getChars"'
Aborted

If it crashes using llc then cheer up, because you now have a reproducible CodeGen bug, and that's great! These kinds of crash bugs are on the LLVM developers' most-wanted list because they can fire on any tool that uses LLVM code generation. The best way to report such bugs is to first generate a compact testcase that triggers the bug; the LLVM developers can use it to fix the bug and can run it in their daily regression testing to make sure it never hits again.

If it fails to crash with an Aborted like the one above, then you are probably observing a JIT CodeEmitter runtime bug; stay tuned and look forward to my next blog post on "How to fix Shark LLVM JIT CodeEmitter bugs"!

How to generate a bugpoint-reduced-simplified.bc from the bug.ll using bugpoint, for CodeGen crash bugs
LLVM ships with a clever tool called bugpoint that is designed to convert dumped blocks of LLVM IR into a compact bugpoint-reduced-simplified.bc LLVM bitcode testcase file containing only the instructions needed to reproduce the bug.

$ bugpoint -run-llc bug.ll --tool-args -march=arm

Bugpoint works by using deductive logic to break down and remove parts of the bug.ll file, automatically narrowing down the LLVM IR lines needed to reproduce the bug. It can take some minutes, so be patient, but bugpoint will eventually stop, give you a bugpoint-reduced-simplified.bc and print some information on how to reproduce the bug.

File an LLVM bug report containing the bugpoint-reduced-simplified.bc file
An example of a Shark JIT LLVM bug that was fixed after submitting a bugpoint-reduced-simplified.bc produced from a dumped Shark method is:
LLVM PR6478 ARM CodeGen Running pass 'Prolog/Epilog Insertion & Frame Finalization' on function '@"java.lang.String::getChars"'

I hope this post has given you some inspiration on how to get Shark LLVM JIT CodeGen crash bugs fixed!
If you want to know more about how bugpoint works and how to properly prepare LLVM bug reports, take a peek at the LLVM documentation at http://llvm.org/docs/HowToSubmitABug.html; it's great.

Xerxes

March 9, 2010


I have from time to time been working on automatic CPU feature tuning code for Shark, to make the LLVM JIT generate better code for Cortex-A8 class ARM CPUs. Using it I was able to gain some substantial speed improvements: all in all it made Shark generate 30% faster code on ARM, and the Shark JIT is now able to beat the 2000-point CaffeineMark 3.0 score! I could not resist using CaffeineMark 3.0 for some benchmarking again :).

While this all looks great, with rainbows and unicorns, the patch is sadly in limbo. I have had some trouble merging the code into LLVM 2.7 trunk before the next LLVM release, since I got hit by LLVM problem report 6544, which will force me to redesign the implementation on top of LLVM before it can be committed.

A similar optimisation to what I did for ARM Linux can easily be done for Shark on PPC Linux as well, by adapting the ARM CPU feature detection code to PPC!
For those who are interested, the optimising code has been submitted and can be fetched from LLVM PR5389.
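The detection idea itself is simple; sketched here in Java purely for illustration (the real patch in LLVM PR5389 is C++ inside LLVM, and the flag-to-attribute mapping below is my rough guess), it boils down to reading the Features line from /proc/cpuinfo and turning each flag into an LLVM subtarget attribute:

import java.io.BufferedReader;
import java.io.FileReader;

// Illustration only: read the ARM "Features" line from /proc/cpuinfo and
// print plausible LLVM subtarget attributes for the flags we recognise.
public class CpuFeatures {
    public static void main(String[] args) throws Exception {
        BufferedReader r = new BufferedReader(new FileReader("/proc/cpuinfo"));
        String line;
        while ((line = r.readLine()) != null) {
            if (line.startsWith("Features")) {
                String flags = line.substring(line.indexOf(':') + 1).trim();
                for (String flag : flags.split("\\s+")) {
                    if (flag.equals("vfp"))   System.out.println("+vfp2");
                    if (flag.equals("vfpv3")) System.out.println("+vfp3");
                    if (flag.equals("neon"))  System.out.println("+neon");
                }
            }
        }
        r.close();
    }
}

On the Babbage board shown in the October post (Features: swp half thumb fastmult vfp edsp) this would select just +vfp2.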

Cheers and have a great day!
Xerxes

February 25, 2010

During the past months I have seen some really cool stuff done using small power-efficient ARM computers and OpenJDK.

SimpleSimon connects

Simple Simon PT connected to a hospital laboratory system using a power-efficient plug computer and a DisplayLink USB screen, all powered by OpenJDK.

Simple Simon PT connects:

This project hooks up a battery-powered laboratory coagulation device, a Simple Simon PT reader, to a standard hospital laboratory system using ASTM-1394-1397 / LIS2-A2 connectivity over Ethernet. A small ARM-based plug computer does all the data message processing and communication. User interaction is performed through the Simon reader and a USB barcode reader used to enter laboratory identification. Optionally, a USB touch screen can be connected for improved user feedback, displaying charts with JFreeChart to give a better understanding of the coagulation process.
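As a rough illustration of what the plug computer's communication side involves (my own sketch, not the project's code; the class name, port and simplifications are all hypothetical), an ASTM E1381-style link layer over a TCP socket boils down to ACKing the instrument's ENQ, ACKing each STX...CRLF frame, and treating EOT as the end of the message:

import java.io.InputStream;
import java.io.OutputStream;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical sketch of an ASTM E1381 / LIS2-A2 link-layer receiver.
// Frame checksum verification and NAK handling are omitted for brevity.
public class AstmReceiver {
    static final int ENQ = 0x05, ACK = 0x06, EOT = 0x04, STX = 0x02;

    public static void main(String[] args) throws Exception {
        ServerSocket server = new ServerSocket(3000); // port is made up
        Socket s = server.accept();
        InputStream in = s.getInputStream();
        OutputStream out = s.getOutputStream();
        StringBuilder message = new StringBuilder();
        int b;
        while ((b = in.read()) != -1) {
            if (b == ENQ) {              // instrument requests the line
                out.write(ACK);
                out.flush();
            } else if (b == STX) {       // one frame follows, up to <LF>
                int c;
                while ((c = in.read()) != -1 && c != '\n') message.append((char) c);
                out.write(ACK);          // accept the frame
                out.flush();
            } else if (b == EOT) {       // end of transmission
                System.out.println("ASTM message: " + message);
                message.setLength(0);
            }
        }
        s.close();
    }
}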

Power consumption tops out at 15 W with the USB screen attached and 6 W without, all running silently without any moving parts!

Shark linked against the shared libLLVM-2.7svn.so

Shark linked against dynamic LLVM .so library

Earlier today I got Shark linked against a shared libLLVM-2.7svn.so generated using LLVM 2.7svn trunk. It works by simply building LLVM with configure --enable-shared --enable-optimized --disable-assertions and then tweaking the IcedTea6 main Makefile to use the shared library during linking:
Replace the line
LLVM_LIBS = -lLLVMX86Disassembler -lLLVMX86AsmParser -lLLVMMCParser -lLLVMX86AsmPrinter -lLLVMX86CodeGen -lLLVMSelectionDAG -lLLVMAsmPrinter -lLLVMX86Info -lLLVMJIT -lLLVMExecutionEngine -lLLVMCodeGen -lLLVMScalarOpts -lLLVMInstCombine -lLLVMTransformUtils -lLLVMipa -lLLVMAnalysis -lLLVMTarget -lLLVMMC -lLLVMCore -lLLVMSupport -lLLVMSystem
with
LLVM_LIBS = -lLLVM-2.7svn
in the main icedtea6/Makefile and then build IcedTea6 normally; Shark currently builds and works right out of the box when using an LLVM release build!

A cool thing about building Shark against the shared library is that you can switch the LLVM JIT that Shark uses between running with or without assertions, debug code and various extra optimizations, simply by replacing the /usr/local/lib/libLLVM-2.7svn.so file with the build you want. Linking time during Shark builds and the Shark footprint are impressively smaller as well. I'm really happy to see this functionality in LLVM 2.7!

The LLVM 2.7 code freeze before the 2.7 release happens about 1.5 weeks from now, and I will stay busy for some days observing and polishing the current LLVM svn trunk to be usable with openjdk-6-shark.

Edward Nevill created an ARM Jazelle RCT Thumb2 JIT reference implementation

While I have been busy taming Sharks, a new kind of Thumb2 JIT has emerged, built by Edward Nevill of Cambridge Software Labs! The new Thumb2 JIT has been committed to the IcedTea6 trunk and is a working implementation of Jazelle RCT for ARM Cortex-A8+ class CPUs. It is wonderful that this has been released as free software. Wow!

Suddenly we have three different JITs to use on ARM with OpenJDK: Cacao, Shark and T2. An opportunity emerged to tier them, and so I did. Here comes the raw "truth" produced by CaffeineMark 3! This will probably be the last time I show off any CaffeineMark 3 benchmark, since it really does not do justice to real-world client applications, where responsiveness is more crucial than top runtime speed; nevertheless, benchmarking with CM30 has always felt fun, so here we go. All benchmarks were run on a Sharp PC-Z1 Cortex-A8 mobile internet tool.

Tier between Edward's Thumb2 JIT, the Shark LLVM JIT and the Cacao JIT, all running on an ARM Sharp PC-Z1 mobile internet tool smartbook using OpenJDK 6 compiled with IcedTea6.

This new T2 JIT's main strength is reduced JITting time: it basically cuts all JITting time to almost zero, and client applications on ARM finally run from tick one. This Thumb2 JIT makes for a really nice Java applet browser experience, with about 15 seconds first-applet startup time on an ARM smartbook and everything usable instantly after being loaded.
A small 1 min 12 s .3gp movie shows some Java applets running on the Sharp PC-Z1 featuring the new Thumb2 JIT from IcedTea6.

Cheers and have a great day!
Xerxes

October 9, 2009

The results displayed were generated using the debug build of Shark, with assertions, that I made on the 6th of October, compared against release builds of the pure Zero C++ interpreter and the optimised Zero assembler interpreter, both from IcedTea6 1.6.1.

Did we gain anything from having a jumping shark? Taking a quick peek at the graph you can quite quickly spot some 15x+ speed improvements, so yes! The Shark JIT has indeed got some sharp teeth in its jaws! I am quite delighted to see that some parts of the benchmark got a 25x+ speed boost!
There are still some rough spots that of course need some polishing, so let me share some ideas on how to make the Shark JIT on ARM really shine.

As can be seen in the chart, Shark uses the Zero C++ interpreter before the methods are JITted, and the extra overhead of running the JIT causes the Zero interpreter to run slower during program launch on a single-core ARM CPU. This penalty goes away once the initial warm-up has completed (somewhere around 300 to 500 compiled methods). New multi-core ARM Cortex-A9 CPUs do not have this penalty, since the compiler runs in a separate thread and can be scheduled on a CPU of its own.

Some quick ways to fix the warm-up issue:
0. First of all I want to state that these results were generated using a debug build of Shark. I have a build machine working on creating release builds as I type, so hopefully I will be able to present some improved benchmark scores in the near future, especially for the warm-up penalty.
1. A quick way to reduce the warm-up penalty would be to make Shark able to use the new assembler-optimized interpreter instead of the pure C++ interpreter found in IcedTea6 1.6.1, and this could become a reality quite soon since they both share the same in-memory structures. Using the new assembler optimizations would also make the Shark JIT more usable as a client JVM, where initial GUI performance is crucial; in this GUI area the assembler interpreter really shines.
2. I have also identified some parts of the LLVM JIT that could quickly be improved to make the LLVM JIT compile faster. Basically, I want to make the LLVM TableGen generate better lookup tables to speed up instruction lowering; currently Shark spends quite a large deal of time here, running LLVM's ExecutionEngine::getPointerToFunction(). I think generating improved formatter code for the LLVM TableGen backend could quite quickly improve the autogenerated .inc files used for target instruction lowering.
3. Examine the possibility of implementing a bytecode cache in Shark to jump-start the JIT even further. Making the JIT able to load precalculated LLVM IR or in-memory representations of the methods would reduce some of the JIT overhead at program launch.
4. Add a PassManager framework to Shark to simplify the LLVM IR before it reaches the JIT. The tricky part is selecting which passes to use and in what order. If done correctly, this might both lower JITting time and improve the quality of the generated machine code.

October 6, 2009

picture of the day!

The picture that made my day!

Ok.. so what happened?

xerxes@babbage-karmic:/wd/icedtea6/openjdk/build/linux-arm/bin$ ./java -version
java version "1.6.0_0"
OpenJDK Runtime Environment (IcedTea6 1.7pre-r2a3725ce72d4) (build 1.6.0_0-b16)
OpenJDK Shark VM (build 14.0-b16-product, mixed mode)

xerxes@babbage-karmic:/wd/icedtea6/openjdk/build/linux-arm/bin$ cat /proc/cpuinfo
Processor    : ARMv7 Processor rev 1 (v7l)
BogoMIPS    : 799.53
Features    : swp half thumb fastmult vfp edsp
CPU implementer    : 0x41
CPU architecture: 7
CPU variant    : 0x2
CPU part    : 0xc08
CPU revision    : 1
Hardware    : Freescale MX51 Babbage Board
Revision    : 51011
Serial        : 0000000000000000

xerxes@babbage-karmic:/wd/llvm$ svn info
URL: http://llvm.org/svn/llvm-project/llvm/trunk
Repository Root: http://llvm.org/svn/llvm-project
Repository UUID: 91177308-0d34-0410-b5e6-96231b3b80d8
Revision: 82896
Node Kind: directory
Schedule: normal
Last Changed Author: edwin
Last Changed Rev: 82896
Last Changed Date: 2009-09-27 11:08:03 +0000 (Sun, 27 Sep 2009)

xerxes@babbage-karmic:/wd/llvm$ quilt diff
Index: llvm/lib/Target/ARM/ARMInstrInfo.td
===================================================================
--- llvm.orig/lib/Target/ARM/ARMInstrInfo.td    2009-10-06 12:35:26.000000000 +0000
+++ llvm/lib/Target/ARM/ARMInstrInfo.td    2009-10-06 12:36:03.000000000 +0000
@@ -645,7 +645,7 @@
 IIC_Br, "mov lr, pc\n\tbx $func",
 [(ARMcall_nolink GPR:$func)]>,
 Requires<[IsARM, IsNotDarwin]> {
-    let Inst{7-4}   = 0b0001;
+    let Inst{7-4}   = 0b0011;
 let Inst{19-8}  = 0b111111111111;
 let Inst{27-20} = 0b00010010;
 }

The last patch on LLVM is currently a hack: basically it makes LLVM emit ARM BLX instructions instead of BX instructions for ARM::CALL_NOLINK. So why did this little hack make it work?

In order to understand that, one has to find out what made Shark on ARM crash before...

Let's rewind time to some days ago...

Hi, I have been enjoying myself inside gdb for some days, and I have now at least found the reason why the CPU ends up in garbage memory when running Shark on ARM.

The problem can be illustrated like this:

The frame manager invokes the JITted code:
entry_zero.hpp:57 invokes JIT code at 0x67c9e990

The JITted code runs:
0x67c9e990:    push    {r4, r5, r6, r7, r8, r9, r10, r11, lr}
0x67c9e994:    sub    sp, sp, #12    ; 0xc
0x67c9e998:    ldr    r12, [r3, #756]
0x67c9e99c:    ldr    lr, [r3, #764]
0x67c9e9a0:    sub    r4, lr, #56    ; 0x38
0x67c9e9a4:    cmp    r4, r12
0x67c9e9a8:    bcc    0x67c9ebd0
0x67c9e9ac:    mov    r5, r3
0x67c9e9b0:    str    r2, [sp, #4]
0x67c9e9b4:    mov    r6, r0
0x67c9e9b8:    str    r4, [r5, #764]
0x67c9e9bc:    str    r4, [r4, #20]
0x67c9e9c0:    ldr    r0, [pc, #640]    ; 0x67c9ec48
0x67c9e9c4:    str    r0, [r4, #28]
0x67c9e9c8:    ldr    r0, [r5, #768]
0x67c9e9cc:    str    r0, [r4, #32]
0x67c9e9d0:    add    r0, r4, #32    ; 0x20
0x67c9e9d4:    str    r0, [r5, #768]
0x67c9e9d8:    str    r6, [r4, #16]
0x67c9e9dc:    ldr    r7, [r1]
0x67c9e9e0:    ldr    r0, [r1, #4]
0x67c9e9e4:    str    r0, [sp]
0x67c9e9e8:    ldr    r8, [r1, #8]
0x67c9e9ec:    ldr    r9, [r1, #12]
0x67c9e9f0:    ldr    r0, [r1, #16]
0x67c9e9f4:    str    r0, [sp, #8]
0x67c9e9f8:    ldr    r10, [r1, #20]
0x67c9e9fc:    ldr    r2, [pc, #584]    ; 0x67c9ec4c   <------ jit code calls a jvm function stored in this address
0x67c9ea00:    mov    r0, r1
0x67c9ea04:    bx    r2 <---------------------------   problem!  should have been blx!

(gdb) x 0x67c9ec4c
0x67c9ec4c:    0x40836d9c
(gdb) x 0x40836d9c
0x40836d9c <_ZN13SharedRuntime17OSR_migration_endEPi>:    0xe92d41f0
(gdb)

so lets check out _ZN13SharedRuntime17OSR_migration_endEPi

0x40836d9c <_ZN13SharedRuntime17OSR_migration_endEPi+0>:    push    {r4, r5, r6, r7, r8, lr}    <------  lr are backed up..  but bx did not update lr..
0x40836da0 <_ZN13SharedRuntime17OSR_migration_endEPi+4>:    ldr    r4, [pc, #284]    ; 0x40836ec4 <_ZN13SharedRuntime17OSR_migration_endEPi+296>
0x40836da4 <_ZN13SharedRuntime17OSR_migration_endEPi+8>:    ldr    r7, [pc, #284]    ; 0x40836ec8 <_ZN13SharedRuntime17OSR_migration_endEPi+300>
0x40836da8 <_ZN13SharedRuntime17OSR_migration_endEPi+12>:    ldr    r6, [pc, #284]    ; 0x40836ecc <_ZN13SharedRuntime17OSR_migration_endEPi+304>
0x40836dac <_ZN13SharedRuntime17OSR_migration_endEPi+16>:    add    r4, pc, r4
0x40836db0 <_ZN13SharedRuntime17OSR_migration_endEPi+20>:    ldr    r12, [r4, r7]
0x40836db4 <_ZN13SharedRuntime17OSR_migration_endEPi+24>:    ldr    r1, [r4, r6]
0x40836db8 <_ZN13SharedRuntime17OSR_migration_endEPi+28>:    ldr    r5, [r12]
0x40836dbc <_ZN13SharedRuntime17OSR_migration_endEPi+32>:    ldrb    r2, [r1]
0x40836dc0 <_ZN13SharedRuntime17OSR_migration_endEPi+36>:    add    r3, r5, #1    ; 0x1
0x40836dc4 <_ZN13SharedRuntime17OSR_migration_endEPi+40>:    cmp    r2, #0    ; 0x0
0x40836dc8 <_ZN13SharedRuntime17OSR_migration_endEPi+44>:    sub    sp, sp, #24    ; 0x18
0x40836dcc <_ZN13SharedRuntime17OSR_migration_endEPi+48>:    str    r3, [r12]
0x40836dd0 <_ZN13SharedRuntime17OSR_migration_endEPi+52>:    mov    r7, r0
0x40836dd4 <_ZN13SharedRuntime17OSR_migration_endEPi+56>:    bne 0x40836e74 <_ZN13SharedRuntime17OSR_migration_endEPi+216>
0x40836dd8 <_ZN13SharedRuntime17OSR_migration_endEPi+60>:    ldr    r2, [pc, #240]    ; 0x40836ed0 <_ZN13SharedRuntime17OSR_migration_endEPi+308>
0x40836ddc <_ZN13SharedRuntime17OSR_migration_endEPi+64>:    ldr    r12, [r4, r2]
0x40836de0 <_ZN13SharedRuntime17OSR_migration_endEPi+68>:    ldrb    r3, [r12]
0x40836de4 <_ZN13SharedRuntime17OSR_migration_endEPi+72>:    cmp    r3, #0    ; 0x0
0x40836de8 <_ZN13SharedRuntime17OSR_migration_endEPi+76>:    beq 0x40836e20 <_ZN13SharedRuntime17OSR_migration_endEPi+132>
0x40836dec <_ZN13SharedRuntime17OSR_migration_endEPi+80>:    ldr    r6, [pc, #224]    ; 0x40836ed4 <_ZN13SharedRuntime17OSR_migration_endEPi+312>
0x40836df0 <_ZN13SharedRuntime17OSR_migration_endEPi+84>:    ldr    r5, [r4, r6]
0x40836df4 <_ZN13SharedRuntime17OSR_migration_endEPi+88>:    add    r0, r4, r6
0x40836df8 <_ZN13SharedRuntime17OSR_migration_endEPi+92>:    tst    r5, #1    ; 0x1
0x40836dfc <_ZN13SharedRuntime17OSR_migration_endEPi+96>:    beq 0x40836e8c <_ZN13SharedRuntime17OSR_migration_endEPi+240>
0x40836e00 <_ZN13SharedRuntime17OSR_migration_endEPi+100>:    ldr    r5, [pc, #208]    ; 0x40836ed8 <_ZN13SharedRuntime17OSR_migration_endEPi+316>
0x40836e04 <_ZN13SharedRuntime17OSR_migration_endEPi+104>:    ldr    r3, [r4, r5]
0x40836e08 <_ZN13SharedRuntime17OSR_migration_endEPi+108>:    cmp    r3, #0    ; 0x0
0x40836e0c <_ZN13SharedRuntime17OSR_migration_endEPi+112>:    movne r0, r3
0x40836e10 <_ZN13SharedRuntime17OSR_migration_endEPi+116>:    ldrne r6, [r3]
0x40836e14 <_ZN13SharedRuntime17OSR_migration_endEPi+120>:    ldrne r12, [r6, #16]
0x40836e18 <_ZN13SharedRuntime17OSR_migration_endEPi+124>:    movne lr, pc
0x40836e1c <_ZN13SharedRuntime17OSR_migration_endEPi+128>:    bxne    r12
0x40836e20 <_ZN13SharedRuntime17OSR_migration_endEPi+132>:    add    r6, sp, #20    ; 0x14
0x40836e24 <_ZN13SharedRuntime17OSR_migration_endEPi+136>:    mov    r0, r6
0x40836e28 <_ZN13SharedRuntime17OSR_migration_endEPi+140>:    bl 0x40596c84 <NoHandleMark>
0x40836e2c <_ZN13SharedRuntime17OSR_migration_endEPi+144>:    mov    r0, sp
0x40836e30 <_ZN13SharedRuntime17OSR_migration_endEPi+148>:    bl 0x4057909c <JRT_Leaf_Verifier>
0x40836e34 <_ZN13SharedRuntime17OSR_migration_endEPi+152>:    ldr    r3, [pc, #160]    ; 0x40836edc <_ZN13SharedRuntime17OSR_migration_endEPi+320>
0x40836e38 <_ZN13SharedRuntime17OSR_migration_endEPi+156>:    mov    r5, sp
0x40836e3c <_ZN13SharedRuntime17OSR_migration_endEPi+160>:    ldr r12, [r4, r3]
0x40836e40 <_ZN13SharedRuntime17OSR_migration_endEPi+164>:    ldrb r0, [r12]
0x40836e44 <_ZN13SharedRuntime17OSR_migration_endEPi+168>:    cmp    r0, #0    ; 0x0
0x40836e48 <_ZN13SharedRuntime17OSR_migration_endEPi+172>:    movne r0, r7
0x40836e4c <_ZN13SharedRuntime17OSR_migration_endEPi+176>:    blne 0x4039b20c <_Z15trace_heap_freePv>
0x40836e50 <_ZN13SharedRuntime17OSR_migration_endEPi+180>:    mov    r0, r7
0x40836e54 <_ZN13SharedRuntime17OSR_migration_endEPi+184>:    bl 0x407b6a94 <_ZN2os4freeEPv>
0x40836e58 <_ZN13SharedRuntime17OSR_migration_endEPi+188>:    mov    r0, sp
0x40836e5c <_ZN13SharedRuntime17OSR_migration_endEPi+192>:    bl 0x40578c5c <~JRT_Leaf_Verifier>
0x40836e60 <_ZN13SharedRuntime17OSR_migration_endEPi+196>:    mov    r0, r6
0x40836e64 <_ZN13SharedRuntime17OSR_migration_endEPi+200>:    bl 0x40596b04 <~NoHandleMark>
0x40836e68 <_ZN13SharedRuntime17OSR_migration_endEPi+204>:    add    sp, sp, #24    ; 0x18
0x40836e6c <_ZN13SharedRuntime17OSR_migration_endEPi+208>:    pop {r4, r5, r6, r7, r8, lr}
0x40836e70 <_ZN13SharedRuntime17OSR_migration_endEPi+212>:    bx    lr <------  and woho. lets enjoy a trip to garbage memory!

So when the function that the JIT calls returns, we find ourselves eating garbage memory.

So the small hack fixed this issue quite well, but it broke ARMv4T compatibility for the moment.

My next task would be to fix this properly in LLVM.

September 24, 2009

Four running ARM development boards are hidden in this picture; can you find them?

The ARM development gear that I have access to has been rapidly upgraded during the past months, and I no longer experience any memory or storage bottlenecks. The boards in the picture are running native compiles, checks and debugging sessions 24/7 to help deliver the next generation of OpenJDK ARM builds using Zero and Shark, and to make sure the software that will power our future is as stable and fast as possible. I believe these small, cool and silent boards will help us save the environment as well, since this new energy-efficient technology enables the transition to an information society where the whole IT infrastructure can become self-supplying from simple solar panels! Thanks to free software like Linux and GNU running on these small power-efficient computers, this green computing future is becoming possible!

