zafena development

February 25, 2010

During the past months I have seen some really cool stuff done using small, power-efficient ARM computers and OpenJDK.

Simple Simon PT connected to a hospital laboratory system using a power-efficient plug computer and a DisplayLink USB screen, all powered by OpenJDK.

Simple Simon PT connects:

This project hooks up a battery-powered laboratory coagulation device, a Simple Simon PT reader, to a standard hospital laboratory system using ASTM-1394-1397 / LIS2-A2 connectivity over Ethernet. A small ARM-based plug computer does all data message processing and communication. User interaction is performed using the Simon reader and a USB barcode reader to enter the laboratory identification. Optionally, a USB touch screen can be connected for improved user feedback; it displays charts drawn with JFreeChart to give a better understanding of the coagulation process.

Power consumption tops out at 15 W with the USB screen attached and 6 W without. All running silently, without any moving parts!

Shark linked against the shared libLLVM-2.7svn.so

Shark linked against dynamic LLVM .so library

Earlier today I got Shark linked against a shared libLLVM-2.7svn.so generated from the LLVM 2.7svn trunk. It works by simply building LLVM using configure --enable-shared --enable-optimized --disable-assertions and then tweaking the main Icedtea6 Makefile to use the shared library during linking:
Replace the line
LLVM_LIBS = -lLLVMX86Disassembler -lLLVMX86AsmParser -lLLVMMCParser -lLLVMX86AsmPrinter -lLLVMX86CodeGen -lLLVMSelectionDAG -lLLVMAsmPrinter -lLLVMX86Info -lLLVMJIT -lLLVMExecutionEngine -lLLVMCodeGen -lLLVMScalarOpts -lLLVMInstCombine -lLLVMTransformUtils -lLLVMipa -lLLVMAnalysis -lLLVMTarget -lLLVMMC -lLLVMCore -lLLVMSupport -lLLVMSystem
with
LLVM_LIBS = -lLLVM-2.7svn
in the main icedtea6/Makefile and then build Icedtea6 normally. Shark currently builds and works right out of the box when using an LLVM release build!
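
The Makefile tweak above can also be scripted. The following is only a minimal sketch, not part of Icedtea6: the function name use_shared_llvm is made up for illustration, and it assumes the LLVM_LIBS assignment sits on a single line without backslash continuations, as in the line quoted above.

```python
import re


def use_shared_llvm(makefile_text, shared_lib="LLVM-2.7svn"):
    """Return makefile_text with LLVM_LIBS pointing at one shared library.

    Hypothetical helper: assumes the LLVM_LIBS assignment is a single
    line (no backslash continuation).
    """
    # Match the whole "LLVM_LIBS = ..." line, wherever it is in the file.
    pattern = re.compile(r"^LLVM_LIBS\s*=.*$", re.MULTILINE)
    new_text, count = pattern.subn("LLVM_LIBS = -l" + shared_lib, makefile_text)
    if count == 0:
        raise ValueError("no LLVM_LIBS assignment found")
    return new_text


if __name__ == "__main__":
    sample = "FOO = 1\nLLVM_LIBS = -lLLVMJIT -lLLVMCore -lLLVMSupport\nBAR = 2\n"
    print(use_shared_llvm(sample))
```

Working on the text rather than on the file directly keeps the sketch easy to try out; reading and writing icedtea6/Makefile around it is left to the caller.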

A cool thing about building Shark against the shared library is that you can switch the LLVM JIT that Shark uses between running with or without assertions, debug code, and various extra optimizations by simply replacing the /usr/local/lib/libLLVM-2.7svn.so file with the build you want. Link time during Shark builds and Shark's footprint are impressively smaller as well. I'm really happy to see this functionality in LLVM 2.7!

The LLVM 2.7 code freeze before the 2.7 release happens in about 1.5 weeks from now, and I will stay busy for some days observing and polishing the current LLVM svn trunk so that it is usable with openjdk-6-shark.

Edward Nevill created an ARM Jazelle RTC Thumb2 JIT reference implementation

While I have been busy taming Sharks, a new kind of Thumb2 JIT has emerged, built by Edward Nevill of Cambridge Software Labs! The new Thumb2 JIT has been committed to the Icedtea6 trunk, and it is a working implementation of Jazelle RTC for use on ARM Cortex-A8+ class CPUs. It is wonderful that this has been released as free software. Wow!

Suddenly we have three different JITs to use on ARM with OpenJDK: Cacao, Shark and T2. An opportunity has emerged to rank them against each other, and so I did. Here comes the raw “truth” produced by Caffeine Mark 3! This will probably be the last time I show off any Caffeine Mark 3 benchmark, since it really doesn't do justice to real-world client applications, where responsiveness is more crucial than top runtime speed. Nevertheless, benchmarking with CM30 has always felt fun, so here we go. All benchmarks were run on a Sharp PC-Z1 Cortex-A8 mobile internet tool.

Ranking of Edward's Thumb2 JIT, the Shark LLVM JIT and the Cacao JIT, all running on an ARM Sharp PC-Z1 mobile internet tool smartbook using OpenJDK 6 compiled with Icedtea6.

The new T2 JIT's main strength is reduced jitting time: it basically cuts all jitting time to almost zero, and client applications on ARM finally run from tick one. This Thumb2 JIT makes for a really nice Java applet browser experience, with about 15 seconds of startup time for the first applet on an ARM smartbook and everything usable instantly after being loaded.
A small 1 min 12 s .3gp movie showing some Java applets running on the Sharp PC-Z1 with the new Thumb2 JIT from Icedtea6

Cheers and have a great day!
Xerxes

October 6, 2009

picture of the day!

The picture that made my day!

Ok.. so what happened?

xerxes@babbage-karmic:/wd/icedtea6/openjdk/build/linux-arm/bin$ ./java -version
java version "1.6.0_0"
OpenJDK Runtime Environment (IcedTea6 1.7pre-r2a3725ce72d4) (build 1.6.0_0-b16)
OpenJDK Shark VM (build 14.0-b16-product, mixed mode)

xerxes@babbage-karmic:/wd/icedtea6/openjdk/build/linux-arm/bin$ cat /proc/cpuinfo
Processor    : ARMv7 Processor rev 1 (v7l)
BogoMIPS    : 799.53
Features    : swp half thumb fastmult vfp edsp
CPU implementer    : 0x41
CPU architecture: 7
CPU variant    : 0x2
CPU part    : 0xc08
CPU revision    : 1
Hardware    : Freescale MX51 Babbage Board
Revision    : 51011
Serial        : 0000000000000000

xerxes@babbage-karmic:/wd/llvm$ svn info
URL: http://llvm.org/svn/llvm-project/llvm/trunk
Repository Root: http://llvm.org/svn/llvm-project
Repository UUID: 91177308-0d34-0410-b5e6-96231b3b80d8
Revision: 82896
Node Kind: directory
Schedule: normal
Last Changed Author: edwin
Last Changed Rev: 82896
Last Changed Date: 2009-09-27 11:08:03 +0000 (Sun, 27 Sep 2009)

xerxes@babbage-karmic:/wd/llvm$ quilt diff
Index: llvm/lib/Target/ARM/ARMInstrInfo.td
===================================================================
--- llvm.orig/lib/Target/ARM/ARMInstrInfo.td    2009-10-06 12:35:26.000000000 +0000
+++ llvm/lib/Target/ARM/ARMInstrInfo.td    2009-10-06 12:36:03.000000000 +0000
@@ -645,7 +645,7 @@
 IIC_Br, "mov lr, pc\n\tbx $func",
 [(ARMcall_nolink GPR:$func)]>,
 Requires<[IsARM, IsNotDarwin]> {
-    let Inst{7-4}   = 0b0001;
+    let Inst{7-4}   = 0b0011;
 let Inst{19-8}  = 0b111111111111;
 let Inst{27-20} = 0b00010010;
 }

The last patch to LLVM is currently a hack. Basically, it makes LLVM emit ARM BLX instructions instead of BX instructions for ARM::CALL_NOLINK. So why did this little hack make it work?
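
To make the one-line patch above concrete, here is a small sketch that assembles the 32-bit instruction word from the same bit fields the .td file sets: Inst{27-20}, Inst{19-8} and Inst{7-4}. The function name encode_branch_exchange is made up for illustration; the condition field 0b1110 ("always") and the target register in bits 3-0 come from the standard ARM register-branch encoding and are not shown in the patch itself.

```python
BX_BITS = 0b0001   # Inst{7-4} before the patch: BX, branch and exchange
BLX_BITS = 0b0011  # Inst{7-4} after the patch: BLX, branch with link and exchange


def encode_branch_exchange(rm, link, cond=0b1110):
    """Encode 'bx rm' (link=False) or 'blx rm' (link=True) as a 32-bit ARM word.

    Hypothetical helper for illustration; rm is the register number 0-15.
    """
    if not 0 <= rm <= 15:
        raise ValueError("rm must be a register number between 0 and 15")
    word = cond << 28                             # condition field, bits 31-28
    word |= 0b00010010 << 20                      # Inst{27-20}
    word |= 0b111111111111 << 8                   # Inst{19-8}
    word |= (BLX_BITS if link else BX_BITS) << 4  # Inst{7-4}, the patched field
    word |= rm                                    # target register, bits 3-0
    return word


if __name__ == "__main__":
    print(hex(encode_branch_exchange(2, link=False)))  # bx r2
    print(hex(encode_branch_exchange(2, link=True)))   # blx r2
```

The two words differ in a single bit, but only BLX writes the return address into lr before branching.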

In order to understand that, one has to find out what made Shark on ARM crash before…

Let's rewind time to some days ago...

Hi, I have been enjoying myself inside gdb for some days, and I have now at least found the reason why the CPU ends up in garbage memory when running Shark on ARM.

The problem can be illustrated like this:

frame manager invokes jited code
entry_zero.hpp:57 invokes jit code at 0x67c9e990

jited code runs
0x67c9e990:    push    {r4, r5, r6, r7, r8, r9, r10, r11, lr}
0x67c9e994:    sub    sp, sp, #12    ; 0xc
0x67c9e998:    ldr    r12, [r3, #756]
0x67c9e99c:    ldr    lr, [r3, #764]
0x67c9e9a0:    sub    r4, lr, #56    ; 0x38
0x67c9e9a4:    cmp    r4, r12
0x67c9e9a8:    bcc    0x67c9ebd0
0x67c9e9ac:    mov    r5, r3
0x67c9e9b0:    str    r2, [sp, #4]
0x67c9e9b4:    mov    r6, r0
0x67c9e9b8:    str    r4, [r5, #764]
0x67c9e9bc:    str    r4, [r4, #20]
0x67c9e9c0:    ldr    r0, [pc, #640]    ; 0x67c9ec48
0x67c9e9c4:    str    r0, [r4, #28]
0x67c9e9c8:    ldr    r0, [r5, #768]
0x67c9e9cc:    str    r0, [r4, #32]
0x67c9e9d0:    add    r0, r4, #32    ; 0x20
0x67c9e9d4:    str    r0, [r5, #768]
0x67c9e9d8:    str    r6, [r4, #16]
0x67c9e9dc:    ldr    r7, [r1]
0x67c9e9e0:    ldr    r0, [r1, #4]
0x67c9e9e4:    str    r0, [sp]
0x67c9e9e8:    ldr    r8, [r1, #8]
0x67c9e9ec:    ldr    r9, [r1, #12]
0x67c9e9f0:    ldr    r0, [r1, #16]
0x67c9e9f4:    str    r0, [sp, #8]
0x67c9e9f8:    ldr    r10, [r1, #20]
0x67c9e9fc:    ldr    r2, [pc, #584]    ; 0x67c9ec4c   <------ jit code calls a jvm function stored in this address
0x67c9ea00:    mov    r0, r1
0x67c9ea04:    bx    r2 <---------------------------   problem!  should have been blx!

(gdb) x 0x67c9ec4c
0x67c9ec4c:    0x40836d9c
(gdb) x 0x40836d9c
0x40836d9c <_ZN13SharedRuntime17OSR_migration_endEPi>:    0xe92d41f0
(gdb)

So let's check out _ZN13SharedRuntime17OSR_migration_endEPi:

0x40836d9c <_ZN13SharedRuntime17OSR_migration_endEPi+0>:    push    {r4, r5, r6, r7, r8, lr}    <------  lr is backed up, but bx did not update lr
0x40836da0 <_ZN13SharedRuntime17OSR_migration_endEPi+4>:    ldr    r4, [pc, #284]    ; 0x40836ec4 <_ZN13SharedRuntime17OSR_migration_endEPi+296>
0x40836da4 <_ZN13SharedRuntime17OSR_migration_endEPi+8>:    ldr    r7, [pc, #284]    ; 0x40836ec8 <_ZN13SharedRuntime17OSR_migration_endEPi+300>
0x40836da8 <_ZN13SharedRuntime17OSR_migration_endEPi+12>:    ldr    r6, [pc, #284]    ; 0x40836ecc <_ZN13SharedRuntime17OSR_migration_endEPi+304>
0x40836dac <_ZN13SharedRuntime17OSR_migration_endEPi+16>:    add    r4, pc, r4
0x40836db0 <_ZN13SharedRuntime17OSR_migration_endEPi+20>:    ldr    r12, [r4, r7]
0x40836db4 <_ZN13SharedRuntime17OSR_migration_endEPi+24>:    ldr    r1, [r4, r6]
0x40836db8 <_ZN13SharedRuntime17OSR_migration_endEPi+28>:    ldr    r5, [r12]
0x40836dbc <_ZN13SharedRuntime17OSR_migration_endEPi+32>:    ldrb    r2, [r1]
0x40836dc0 <_ZN13SharedRuntime17OSR_migration_endEPi+36>:    add    r3, r5, #1    ; 0x1
0x40836dc4 <_ZN13SharedRuntime17OSR_migration_endEPi+40>:    cmp    r2, #0    ; 0x0
0x40836dc8 <_ZN13SharedRuntime17OSR_migration_endEPi+44>:    sub    sp, sp, #24    ; 0x18
0x40836dcc <_ZN13SharedRuntime17OSR_migration_endEPi+48>:    str    r3, [r12]
0x40836dd0 <_ZN13SharedRuntime17OSR_migration_endEPi+52>:    mov    r7, r0
0x40836dd4 <_ZN13SharedRuntime17OSR_migration_endEPi+56>:    bne 0x40836e74 <_ZN13SharedRuntime17OSR_migration_endEPi+216>
0x40836dd8 <_ZN13SharedRuntime17OSR_migration_endEPi+60>:    ldr    r2, [pc, #240]    ; 0x40836ed0 <_ZN13SharedRuntime17OSR_migration_endEPi+308>
0x40836ddc <_ZN13SharedRuntime17OSR_migration_endEPi+64>:    ldr    r12, [r4, r2]
0x40836de0 <_ZN13SharedRuntime17OSR_migration_endEPi+68>:    ldrb    r3, [r12]
0x40836de4 <_ZN13SharedRuntime17OSR_migration_endEPi+72>:    cmp    r3, #0    ; 0x0
0x40836de8 <_ZN13SharedRuntime17OSR_migration_endEPi+76>:    beq 0x40836e20 <_ZN13SharedRuntime17OSR_migration_endEPi+132>
0x40836dec <_ZN13SharedRuntime17OSR_migration_endEPi+80>:    ldr    r6, [pc, #224]    ; 0x40836ed4 <_ZN13SharedRuntime17OSR_migration_endEPi+312>
0x40836df0 <_ZN13SharedRuntime17OSR_migration_endEPi+84>:    ldr    r5, [r4, r6]
0x40836df4 <_ZN13SharedRuntime17OSR_migration_endEPi+88>:    add    r0, r4, r6
0x40836df8 <_ZN13SharedRuntime17OSR_migration_endEPi+92>:    tst    r5, #1    ; 0x1
0x40836dfc <_ZN13SharedRuntime17OSR_migration_endEPi+96>:    beq 0x40836e8c <_ZN13SharedRuntime17OSR_migration_endEPi+240>
0x40836e00 <_ZN13SharedRuntime17OSR_migration_endEPi+100>:    ldr    r5, [pc, #208]    ; 0x40836ed8 <_ZN13SharedRuntime17OSR_migration_endEPi+316>
0x40836e04 <_ZN13SharedRuntime17OSR_migration_endEPi+104>:    ldr    r3, [r4, r5]
0x40836e08 <_ZN13SharedRuntime17OSR_migration_endEPi+108>:    cmp    r3, #0    ; 0x0
0x40836e0c <_ZN13SharedRuntime17OSR_migration_endEPi+112>:    movne r0, r3
0x40836e10 <_ZN13SharedRuntime17OSR_migration_endEPi+116>:    ldrne r6, [r3]
0x40836e14 <_ZN13SharedRuntime17OSR_migration_endEPi+120>:    ldrne r12, [r6, #16]
0x40836e18 <_ZN13SharedRuntime17OSR_migration_endEPi+124>:    movne lr, pc
0x40836e1c <_ZN13SharedRuntime17OSR_migration_endEPi+128>:    bxne    r12
0x40836e20 <_ZN13SharedRuntime17OSR_migration_endEPi+132>:    add    r6, sp, #20    ; 0x14
0x40836e24 <_ZN13SharedRuntime17OSR_migration_endEPi+136>:    mov    r0, r6
0x40836e28 <_ZN13SharedRuntime17OSR_migration_endEPi+140>:    bl 0x40596c84 <NoHandleMark>
0x40836e2c <_ZN13SharedRuntime17OSR_migration_endEPi+144>:    mov    r0, sp
0x40836e30 <_ZN13SharedRuntime17OSR_migration_endEPi+148>:    bl 0x4057909c <JRT_Leaf_Verifier>
0x40836e34 <_ZN13SharedRuntime17OSR_migration_endEPi+152>:    ldr    r3, [pc, #160]    ; 0x40836edc <_ZN13SharedRuntime17OSR_migration_endEPi+320>
0x40836e38 <_ZN13SharedRuntime17OSR_migration_endEPi+156>:    mov    r5, sp
0x40836e3c <_ZN13SharedRuntime17OSR_migration_endEPi+160>:    ldr r12, [r4, r3]
0x40836e40 <_ZN13SharedRuntime17OSR_migration_endEPi+164>:    ldrb r0, [r12]
0x40836e44 <_ZN13SharedRuntime17OSR_migration_endEPi+168>:    cmp    r0, #0    ; 0x0
0x40836e48 <_ZN13SharedRuntime17OSR_migration_endEPi+172>:    movne r0, r7
0x40836e4c <_ZN13SharedRuntime17OSR_migration_endEPi+176>:    blne 0x4039b20c <_Z15trace_heap_freePv>
0x40836e50 <_ZN13SharedRuntime17OSR_migration_endEPi+180>:    mov    r0, r7
0x40836e54 <_ZN13SharedRuntime17OSR_migration_endEPi+184>:    bl 0x407b6a94 <_ZN2os4freeEPv>
0x40836e58 <_ZN13SharedRuntime17OSR_migration_endEPi+188>:    mov    r0, sp
0x40836e5c <_ZN13SharedRuntime17OSR_migration_endEPi+192>:    bl 0x40578c5c <~JRT_Leaf_Verifier>
0x40836e60 <_ZN13SharedRuntime17OSR_migration_endEPi+196>:    mov    r0, r6
0x40836e64 <_ZN13SharedRuntime17OSR_migration_endEPi+200>:    bl 0x40596b04 <~NoHandleMark>
0x40836e68 <_ZN13SharedRuntime17OSR_migration_endEPi+204>:    add    sp, sp, #24    ; 0x18
0x40836e6c <_ZN13SharedRuntime17OSR_migration_endEPi+208>:    pop {r4, r5, r6, r7, r8, lr}
0x40836e70 <_ZN13SharedRuntime17OSR_migration_endEPi+212>:    bx    lr <------  and woho. lets enjoy a trip to garbage memory!

So when the function that the JIT code calls returns, we find ourselves eating garbage memory.
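
The failure can be replayed with a toy model of the link register. This is only a sketch under simplifying assumptions: the function name call_and_return and the placeholder value 0xdeadbeef are made up, and the callee is reduced to its "push {..., lr}" prologue and "pop {..., lr}; bx lr" epilogue. In the listing above, lr was last written by "ldr lr, [r3, #764]", so at the time of the call it holds data, not a return address.

```python
def call_and_return(use_blx, call_site=0x67c9ea04, stale_lr=0xdeadbeef):
    """Return the address where execution continues after the callee returns.

    Hypothetical model: stale_lr stands for whatever non-return-address
    value lr held before the call instruction at call_site.
    """
    lr = stale_lr
    if use_blx:
        # blx stores the address of the next instruction in lr (ARM mode: +4).
        lr = call_site + 4
    # With plain bx, lr is left untouched.

    stack = []
    stack.append(lr)   # callee prologue: push {..., lr}
    lr = stack.pop()   # callee epilogue: pop {..., lr}
    return lr          # bx lr jumps here


if __name__ == "__main__":
    print(hex(call_and_return(use_blx=False)))  # continues at the stale value
    print(hex(call_and_return(use_blx=True)))   # continues right after the call
```

With bx the callee faithfully saves and restores a value that was never a return address, which is why the crash only shows up at the final bx lr.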

So the small hack fixed this issue quite well, but it broke ARMv4T compatibility for the moment.

My next task is to fix this properly in LLVM.

September 13, 2009

During the past month I have been running a public llvm-arm-linux buildbot in order to iron out the remaining bugs in the LLVM Execution Engine JIT for ARM.
My goal was to stabilise the LLVM JIT so that it can be used to speed up cool projects like OpenJDK on ARM, by fixing all the prerequisites for running Gary Benson’s Shark JIT compiler on top of Zero!

I have been following the LLVM project for about a year, and seeing the following reports from the buildbot makes me jump for joy! It marks a new era, in which all cool, silent and energy-efficient computing on ARM can be JIT accelerated!

  • (Sep 12 21:57) rev=[81669] success #153: build successful
  • (Sep 12 19:17) rev=[81660] success #152: build successful
  • (Sep 12 16:31) rev=[81655] failure #151: failed test-llvm
  • (Sep 12 13:03) rev=[81626] failure #148: failed test-llvm

The next LLVM release, 2.6, comes out in about a week (21 September 2009), and I feel I have done my part in the LLVM stabilisation process for ARM. It is now up to the LLVM 2.6 release managers to merge the patches from the 2.7 svn trunk into the release branch in order to make the LLVM 2.6 release stable on ARM as well.

Life is cool!

May 10, 2009

I have just pushed an update to the Jalimo project that makes the new OpenJDK 6 b16 source bundle cross-compilable for embedded devices, using Jalimo as a cross-compile layer for Icedtea6.

Using Jalimo you can now cross-compile OpenJDK b16 and have hotspot + zero, hotspot + shark or cacao as the vm built out of the box, simply awesome!

Since Shark uses the pre-2.6 LLVM sources for its JIT, I have also prepared “.bb” build recipes for OpenEmbedded that enable quick cross-compilation of LLVM based on the LLVM svn trunk, so that Jalimo can make use of them when building Shark.

The Shark VM is built with assertions enabled in order to produce better debug output for all Jalimo users.

Robert Schuster has been an excellent tutor, helping me understand all the quirks of OE recipes, which in turn helped me create all these nice new cross-compile recipes for OE and Jalimo. Thank you, Robert, and thank you for pushing the LLVM recipes into the main OE dev git tree!

Andrew Haley and Gary Benson have helped me enormously to understand the lock-free code using memory barriers that is part of the Zero and Shark HotSpot implementations. I will keep working on these parts in order to make Zero and Shark rock solid on ARM before multi-core ARM Cortex-A9 CPUs are in the pocket of every person who loves cool and silent computing.

April 15, 2009

Why go to hell in the first place?
During the past year I have involved myself in the effort to port OpenJDK to various embedded systems using the Icedtea build system. Unfortunately, the embedded development boards that I had access to during the last year were equipped with inadequate amounts of RAM, making it practically impossible to build Icedtea directly on native hardware. I learnt how to work around this by using QEMU to emulate hardware with more RAM. Now the compilation of Icedtea was doable, yet it still took a week to compile Icedtea6 using QEMU. Urgh…

I spent some time in my personally created emulator hell, watching time pass by and always feeling behind, not working on the current code base like all the other free Java hackers. I finally decided something had to be done about this and wanted to break free.

My liberation was made possible by first scouting for a build environment suitable for rapid cross-compilation and porting of OpenJDK to any hardware architecture imaginable, including your toaster. Luckily for me, I got to know about the Jalimo project and even got the chance to meet one of its core developers, Robert Schuster, during FOSDEM 09. Robert demonstrated how easy embedded Java development could be using the Jalimo infrastructure, and it provided me with all the tools I needed to speed up my compile, run and test cycle. I was no longer in need of an emulator; instead I could build binaries swiftly using the full power of an affordable IA32 quad-core CPU (with 12 MB of cache) and deploy my work on real hardware for testing and debugging within hours, not days. Simply bliss.

It took some time for me to understand how the four required pieces (openembedded, bitbake, jalimo and my own goal) could be merged. It turned out they were designed to fit!
First, openembedded, bitbake and jalimo were all downloaded from their respective svn or git trees.
The only tricky part was that I (the user) need to provide configuration files containing the basic information about what kind of target I want to cross-compile for, and specify which bundles of recipes I want to use to accomplish this. Basically, I had to express my mind in a way that bitbake understood.

Once this was set up, I could stand in any directory and start the build by simply typing:
bitbake openjdk-6
… or any other software package, as long as I knew the name of the recipe to use.
Even bitbake openjdk-6-shark builds out of the box!

Within a day of playing with bitbake, my build computer had downloaded 2 GB of source code and eaten several gigabytes of hard drive space. Rather cool… It had built all the cross-compiler tools I needed and compiled all dependent libraries from scratch, all optimised for the target hardware that I wanted to run and debug the binaries on.
The final result was then obtained in the temp directory of my choice.

I have documented my cross-compilation experiences on the Icedtea wiki:
http://icedtea.classpath.org/wiki/CrossCompileFaq

Cheers and have a great day!
Xerxes

February 28, 2009

Cheers!

Cheers to you from Xerxes and JeNI on this flashy picture!

For your amusement, I have prepared some photo galleries of the pictures I took at FOSDEM 09.
# Free Java - FOSDEM 09 photography - Arrival and first day
# Free Java - FOSDEM 09 photography - First night and dinner
# Free Java - FOSDEM 09 photography - Second day
# Free Java - FOSDEM 09 photography - The day after - a great experience

It was thrilling to meet you all during the event. Many thanks to Sun for sponsoring the Free Java devroom dinner!
I will fill in more photos for the second day, so stay tuned. If you are pictured in any of these photos and would prefer not to be, please let me know ASAP: xerxes at zafena dot se.

August 29, 2008

Processing, a lightweight Java IDE targeted at the creation of interactive computer art, can now be run on embedded ARM hardware thanks to the Icedtea, Cacao, Classpath and OpenJDK projects!

Check out the full Processing IDE running below on an embedded Fedora 8 ARM Linux system! It is running on OpenJDK6 with the CACAO VM, compiled using the Classpath-bootstrapped Icedtea build system. The ARM hardware is an ATMEL AT91SAM9263-EK devkit with a 200 MHz ARMv5tejl CPU and 64 MB of RAM.

Good news for embedded ARM users: the Debian armel port for ARM is now shipping prebuilt OpenJDK packages that can be installed by simply running:

apt-get install openjdk-6-jdk

OpenJDK using the CACAO JIT can also be obtained from the experimental (sid) repositories and installed by running:

apt-get install cacao-oj6-jdk

I look forward to seeing all the possibilities for interactive art created with embedded ARM hardware, Linux, OpenJDK and Processing!

PS: the ARM CPU is the same kind of CPU found in most cellphones, including the iPhone!

Cheers!

Xerxes Rånby

August 21, 2008

OpenJDK6 was successfully compiled with the CACAO JIT JVM for ARMv5tejl EABI softfloat using the Icedtea6 build system! This build was made from Mercurial sources: Icedtea6 changeset 1013:a469b20018d9 and CACAO changeset 8656:140bc48ab360.

Icedtea is served!

java version "1.6.0_0"
OpenJDK Runtime Environment (build 1.6.0_0-b11)
CACAO (build 1.1.0pre, JIT mode)

The footprint of the compiled CACAO/OpenJDK6 j2re-image is 84 MB.
I look forward to the next CACAO release, “1.0.0”, and expect it to be a smasher for embedded ARM Java development! The release fixes some rather tricky bugs, PR84 and PR99, found in CACAO 0.99.3 that could trigger sporadic crashes during the OpenJDK class compilation and when running Java programs on ARM systems.

Some output from the running jvm.
For some reason red and blue get swapped when running on my ARM display's framebuffer; red and blue look fine if I run applications remotely using “ssh -X”.

The best of enterprise business applications OneSlime on arm!


Voxel speedtest.
