zafena development

February 6, 2013

I have just returned from FOSDEM where we Julen Gouesse, Sven Gothel and Xerxes Rånby presented a JogAmp freejava love talk with some live demonstrations running hardware accelerated on x86 laptop, android/meego phone/tables and GNU/LInux systems such as the AC100 and RaspberryPi.
Slides, Video and Teaser from the JogAmp FOSDEM freejava love talk are now online:

If you want to design a game then we recommend you to use JogAmp indirect through a game engine library such as libgdx or JMonkeyEngine 3. We have a forum thread inside the JogAmp community where we work on improving engine support for embedded devices such as the Raspberry Pi using said engines. By using a game engine will get you up to speed developing professional games that can be run across all devices and formfactors.
The video and teaser recordings also includes footage of the JMonkeyEngine 3 JOGL 2 backend initialized by Julien Gouesse that we for time reason never managed to show during the strict 40min talk and live demo at FOSDEM.

Inside the FOSDEM talk we ran the open-source libgdx game pax-britannica using the new libgdx JogAmp JOGL 2 port initialized by Julien Gouesse in combination with the new hardfloat fixed CACAO ARM JVM found in the new IcedTea 6 1.12 release on the Raspberry Pi.
we also ran the JogAmp Jake2 port, a port done by Sven Gothel, using the armhf JamVM from IcedTea 7 on the ac100.
Both opensource games of course rocked running using freejava!
The point that we wanted to show is that if you start to use the dedicated love using the media accelerator found in all new devices your java applications rendering will run *equally* fast regardless of the JVM implementation, since the rendering is then performed by the GPU instead of the CPU.

For demonstation purposes: I had to extend the libgdx backend with a custom mouse pointer in order for otherwise touchscreen oriented games such as pax-britannica to work on the Raspberry Pi from console, the reason why this is needed is because there is no highlevel compositing windowmanager running on the RaspPi adding the overlay mousepointer for you like what you are custom to see when running desktop applications using X11. This libgdx RaspberryPi mouse pointer branch allowed me to test all touch oriented libgdx games and demos from console using a mouse input device.

While we know that compilation of custom JVM can be tricky I have prepared a RaspberryPi armhf CACAO that you can use as an drop in replacement into any OpenJDK 6 installation (/usr/lib/jvm/java-6-openjdk-armhf/jre/lib/arm/server/ This was built using the IcedTea 6 1.12 release from inside a Raspbian chroot.
For JamVM simply install the openjdk-7-jdk package and run it using java -jamvm its already built and packaged by the Raspbian team and work great! The new CACAO armhf is found here:

The Raspberry Pi Rasbian armhf distribution have now packaged IcedTea 6 1.12.1 and included it in the distribution, this means that you can test the new armhf CACAO JVM and JamVM JVM by simply installing the openjdk-6-jdk package.

sudo apt-get update
sudo apt-get install openjdk-6-jdk
java -jamvm -version
java -cacao -version

Also a great KUDOS for Qun, our dear camera-woman!

Cheers and enjoy the love!
On behalf of the JogAmp community - Xerxes

February 15, 2012

Jim Connors at Oracle posed a interesting valentines gift, a compare of the latest open-source OpenJDK ARM JVM inside IcedTea6, 1.12pre, HEAD against their closed source Hotspot c1 and c2 implementations.

The Oracle blog antispam system in use...

I would have liked to comment directly on your blog but your spam system kept me at bay so i posed my reply to you here instead ;)

The OpenJDK Zero *mixed-mode* JVM used in Jims compare includes the now re-maintained ARM Thumb2 JIT and assembler interpreter port that got re-introduced in the IcedTea6-1.11 release.
Many of the OpenJDK JVM like CACAO and JamVM are by design tuned for embedded and client use and thus show strength in both low memory overhead and fast startup time.

When testing JVM performance on ARM its important to remember that the default optimization settings used by the compilers to build the JVM do matter.

The Debian 6.0.4 squeeze "armel" distribution use ARMv4t optimization by default. This low optimization level enable the Debian built packages run on as many kind of different ARM broads and CPU's as possible. The trade-off are that you basically disable all VFP, floating point, optimizations and make synchronization code slower by forcing the JVM to call the Linux kernel helper instead of using faster ARMv7 atomic instructions directly.

To give OpenJDK JVM a better match i would suggest re-running the benchmark using OpenJDK built on top of Debian wheezy "armhf", Ubuntu Precise "armhf" or Fedora F15 that by default optimize for the ARMv7 thumb2 instruction-set and make use of the VFP unit inside the CPU, also the "armhf" ABI allows better argument passing between library functions inside the CPU VFP registers. Two OpenJDK JVM, JamVM and Zero,  are already updated to support the new "armhf" hardfloat ABI.

You could also choose to run this benchmark using OpenJDK JVMs built using the Ubuntu Precise "armel" tool-chains that still use the legacy soft-float ABI while still adding ARMv7 Thumb2 and VFP optimizations. All OpenJDK JVM tested in this compare would run better by simply using a higher optimization level during the build.

All in all thank you Jim to give an introduction to the ARM OpenJDK porting effort, I look forward to the follow up article where all the JVM makers have picked their favourite GCC/Clang/Foo compiler options and suitable matching compile flags. One idea are to create a OpenJDK binary release with a custom OpenJDK installer that would ease testing of tuned OpenJDK JVM implementations.

Cheers, Xerxes

December 2, 2011

I have been following the CACAO JVM development on ARM since 2008, back then CACAO was one of the first alternative JVM, to be used instead of Hotspot in combination with the OpenJDK 6 class libraries.

CACAO history dates back to 1997-1998 when CACAO was one of the first JIT compiler to be used instead of SUN's Java JVM interpreter.

Today CACAO are being used in combination with OpenJDK 6 on architectures like ARM, MIPS and PPC where Oracle have not yet released code for a GPL licensed Hotspot JIT. CACAO are popular, see the Debian OpenJDK-6 popularity contest chart where up to 80% of all the Debian OpenJDK 6 JVM users have picked CACAO to be installed. This trend kept on since the beginning of 2009 up to the summer of 2011.

Carpe diem CACAO JVM!

During the summer of 2011 Oracle released OpenJDK 7 and CACAO users started to abandon the JVM in favour for JamVM, the reason "why?" are that CACAO depends on the HPI API that have been removed from the OpenJDK 7 code base. This means that CACAO currently only work in combination with the "classic" OpenJDK 6. The second black cloud for CACAO JVM on ARM was that all major ARM Linux distributions started to move from "armel" towards the new "armhf" ABI something CACAO do not support. JamVM here provided the ARM Linux distributions and users with a stable and future proof alternative.

If we for a moment forget about the future and focus on today CACAO are in great shape when built from CACAO hg HEAD.

  • CACAO are FAST,
  • CACAO are stable, thanks to Stefan Ring who have been diligent on fixing bugs found in the CACAO JIT codegen.
  • CACAO are fresh, the current CACAO hg HEAD contains the rewritten, "still unreleased" C++ version of CACAO its a totally different JVM compared to the last C based release of CACAO 0.99.4.

If you want to experience the CACAO JVM in its finest the do run the latest development version of CACAO in combination with OpenJDK 6, built using the current IcedTea6 head. Run it on ARM "armel", PPC or MIPS and experience a fast responsive JVM burning brighter than ever before!

February 25, 2010

During the past months i have seen some really cool stuff done using small powerefficient ARM computers and OpenJDK.

SimpleSimon connects

Simple Simon PT connected to a hospital laboratory system by using a powerefficient plugcomputer and displaylink usb screen. All powered by OpenJDK

Simple Simon PT connects:

This project hooks up a battery powered laboratory coagulation device, a Simple Simon PT reader to standard hospital laboratory system using a ASTM-1394-1397 / LIS2-A2 connectivity over ethernet. A small ARM based plugcomputer does all data message processing and communication. User interraction are performed by using the Simon reader and a usb-barcode reader to enter laboratory identification. Optionally can a usb-touch-screen be connected for improved user feedback, by displaying charts using JFreeChart, to show and give a better understanding of the coagulation process.

Powerconsumption tops at 15W with the USB screen attached and 6W without. All running silent without any moving parts!

Shark linked against the shared

Shark linked against the shared

Shark linked against dynamic LLVM .so library

Earlier today I got Shark linked against a shared generated by using LLVM 2.7svn trunk. It work by simply building LLVM using configure --enable-shared --enable-optimized --disable-assertions and then tweak the Icedtea6 main Makefiles to use the shared library during liking:
Replace the line
LLVM_LIBS = -lLLVMX86Disassembler -lLLVMX86AsmParser -lLLVMMCParser -lLLVMX86AsmPrinter -lLLVMX86CodeGen -lLLVMSelectionDAG -lLLVMAsmPrinter -lLLVMX86Info -lLLVMJIT -lLLVMExecutionEngine -lLLVMCodeGen -lLLVMScalarOpts -lLLVMInstCombine -lLLVMTransformUtils -lLLVMipa -lLLVMAnalysis -lLLVMTarget -lLLVMMC -lLLVMCore -lLLVMSupport -lLLVMSystem
LLVM_LIBS = -lLLVM-2.7svn
in the main icedtea6/Makefile and then build Icedtea6 normally, Shark currently builds and works right out of the box when using a LLVM release build!

A cool thing by building shark against the shared library are that you can switch the LLVM JIT that Shark uses from running with or without assertions, debug code, and various extra optimizations by simply replacing the /usr/local/lib/ file with what you want. Linking time during shark builds and shark footprint are impressively smaller as well. Im really happy to see this functionallity in LLVM 2.7!

The LLVM 2.7 code freeze before the 2.7 release happens in about 1.5 weeks from now and i will stay busy for some days to observe and polish the current LLVM svn trunk to be usable with openjdk-6-shark.

Edward Nevill created a ARM Jazelle RTC Thumb2 JIT reference implementation

Meanwhile I have been busy taming Sharks a new kind of Thumb2 JIT have emerged built by Edward Nevill of Cambridge Software Labs! The new Tumb2JIT have been committed into the Icedtea6 trunk and it are a working implementation of Jazelle RTC to be used by ARM Cortex-A8+ class CPUs. It wonderfull that this have been released as free software, Wow!,

Suddenly we got three different JITs to use on ARM with OpenJDK: Cacao, Shark and T2. An opurtunity has emerged to tier them and so I did. Here comes the raw "truth" produced by Caffeine Mark 3! This will probably be the last time i will show off any Caffeine Mark 3 benchmark since it really dont give justice on real world client applications where responsiveness are more crucial than top runtime speed, nevertheless benchmarking using CM30 have always felt fun so here we go: All benchmarks running using a Sharp PC-Z1 Cortex-A8 Mobile internet tool.

Tier between Edwards Thumb 2 JIT , Shark LLVM JIT and Cacao JIT: All running on OpenJDK 6 ARM

Tier between Edwards Thumb 2 JIT , Shark LLVM JIT and Cacao JIT: All running on a ARM Sharp PC-Z1 Mobile internet tool smartbook using OpenJDK 6 compiled with Icedtea6.

This new T2 JIT's main strenght are reduced jitting time, it basically cuts all jtting time to almost zero and client applications on ARM finnaly runs from tick one. This thumb2 jit makes a really nice java applet browser experience with about 15 seconds first applet startuptime on a ARM smartbook and and all usable instantly after being loaded.
A small 1min 12seconds .3gp movie displaying some java applets running on the Sharp PC-Z1 featuring the new thumb2jit from Icedtea6

Cheers and have a great day!

October 9, 2009

The results displayed are generated using the debug build of shark, with assertions, that I made on the 6th of October compared against release builds of the pure zero cpp interpreter and the optimised zero assembler interpreter, both from Icedtea6-1.6.1.

Did we gain anything from having a jumping shark? By taking a quick peek at the graph you can quite quickly spot some 15X+ speed improvements so yes! yeah!, the shark JIT indeed got some sharp toots in its yaws! I am quite delighted to see that some parts of the benchmark got a 25x+ speed boost!
There are still some rough spots that can be identified that of course needs some polishing, so let me share some ideas on how to make the Shark JIT on ARM really shine.

As can be seen in the chart shark uses the zero cpp interpreter before the methods are jited and the extra overhead on running the JIT causes the zero interpreter to run slower during program launch on a single core ARM cpu, this penalty are removed once the initial warm-up have complete (somewhere around 300 to 500 compiled methods). New multi-core ARM Cortex-A9 CPU do not have this penalty since the compiler process are run in a separate thread and can be scheduled on a CPU of its own.

Some quick ways to fix the warm-up issue:
0. First of all I want to state that these results where generated using a debug build of shark, I have a build machine working on creating a release builds as I type so hopefully I will be able to generate some improved benchmark scores in the near future, especially to deal with the warm-up penalty.
1. A quick way to reduce the warmup penalty would be to make shark able to use the new assembler optimized interpreter instead of the pure cpp interpreter found in Icedtea6-1.6.1 and this could become a reality quite soon since they both share the same in memory structures. Also by using the new assembler optimizations would make Shark JIT more usable as a client JVM where initial GUI performance are crucial, and in this GUI area the assembler interpreter really shine.
2. I have also identified some parts in the LLVM JIT that could be quickly improved to make the LLVM JIT jitting faster. Basically I want to make the LLVM tablegen generate better lookuptables to speed up the instruction lowering, currently shark spends quite a large deal of time here running the LLVM ExecutionEngine::getPointerToFunction(). I think by generating some improved formatter code for the LLVM tablegen backend could quite quickly improve the autogenerated .inc files used for the target instruction lowering.
3. Examine the posibility to implement a bytecode cache in Shark to jumpstart the JIT even further. By making the JIT able to load precalculated LLVM IR or in memory representations of the methods would reduce some of the JIT overhead on program launch.
4. Add a PassManager framework to Shark to simplify the LLVM IR before it reaches the JIT. The tricky part are to select what passes to use and in what order to use them. If done correctly then this might both lower jitting time and improve the generated machine code quality.

August 29, 2008

Processing; a lightweight Java IDE targeted for creation of interactive computer art can now be run on embedded ARM hardware thanks to the Icedtea, Cacao, Classpath and OpenJDK projects!

Check out the full Processing IDE is running below on a embedded Fedora 8 ARM Linux system ! It is running using OpenJDK6 with CACAO vm, compiled using the classpath bootstrapped Icedtea buildsystem. The ARM hardware is a ATMEL AT91SAM9263-EK devkit with a 200mhz ARMv5tejl cpu and 64Mb ram.

Good news for embedded ARM users; The Debian armel port for ARM are now shipping prebuilt openjdk packages that can be installed by simply running:

apt-get install openjdk-6-jdk

OpenJDK using the CACAO JIT can also be obtained from the experimental (sid) repositorys and be installed by running:

apt-get install cacao-oj6-jdk

I look forward to see all the possibilities with interactive art created by using embedded ARM hardware, Linux, OpenJDK and Processing!

Ps. the ARM cpu is the same kind of cpu found in most cellphones including the iPhone!


Xerxes Rånby

August 21, 2008

OpenJDK6 got sucessfully compiled using CACAO JIT jvm for ARMv5tejl EABI softfloat using the Icedtea6 buildsystem! This compile was made using mercurial sources, Icedtea6 changeset: 1013:a469b20018d9 and CACAO changeset: 8656:140bc48ab360.

Icedtea is served!

java version "1.6.0_0"
OpenJDK Runtime Environment (build 1.6.0_0-b11)
CACAO (build 1.1.0pre, JIT mode)

The footprint of the compiled CACAO/OpenJDK6 j2re-image is 84mb.
I look forward to the next CACAO release "1.0.0" and expect it to be a smasher for embedded ARM java developement! The release fixes some rather tricky bugs PR84 and PR99 found in CACAO 0.99.3 that could trigger sporadic crashes during the OpenJDK class compilation and when running java programs on ARM systems.

Some output from the running jvm.
For some reason red and blue gets swapped when running on my ARM displays framebuffer, red and blue looks fine if i run applications remote using "ssh -X".

The best of enterprise business applications OneSlime on arm!

Voxel speedtest.

Powered by WordPress