This how-to provides some tips for running Vert.x applications with OpenJ9, an alternative Java Virtual Machine built on top of OpenJDK that is gentle on memory usage.
Vert.x is a resource-efficient toolkit for building all kinds of modern distributed applications, and OpenJ9 is a resource-efficient runtime that is well-suited for virtualized and containerized deployments.
- You will build a simple microservice that computes the sum of two numbers through an HTTP JSON endpoint.
- We will look at the options for improving startup time with OpenJ9.
- We will measure the resident set size memory footprint on OpenJ9 under a workload.
- You will build a Docker image for the microservice and OpenJ9.
- We will discuss how to improve the startup time of Docker containers and how to tune OpenJ9 in that environment.
- A text editor or IDE
- Java 21
- OpenJ9
- Maven or Gradle
- Docker
- Locust to generate some workload
Note
|
Eclipse Foundation projects are not permitted to distribute, market or promote JDK binaries unless they have passed a Java SE Technology Compatibility Kit licensed from Oracle, to which the Eclipse OpenJ9 project does not currently have access. You can either build your own Eclipse OpenJ9 binary, or download an IBM Semeru runtime. |
The service exposes an HTTP server and fits within a single Java class:
link:src/main/java/io/vertx/howtos/openj9/Main.java[role=include]
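The linked class is not reproduced here, but a minimal sketch of what such a service can look like with the Vert.x core API follows. The class name, port, and JSON field names match the text; the exact handler structure is an assumption:

```java
import io.vertx.core.Vertx;
import io.vertx.core.json.JsonObject;

public class Main {

  public static void main(String[] args) {
    final long start = System.currentTimeMillis();
    Vertx vertx = Vertx.vertx();
    vertx.createHttpServer()
      .requestHandler(request -> request.body().onSuccess(buffer -> {
        // Compute the sum of the "a" and "b" fields of the JSON body
        JsonObject json = buffer.toJsonObject();
        long sum = json.getLong("a") + json.getLong("b");
        request.response()
          .putHeader("Content-Type", "application/json")
          .end(new JsonObject().put("sum", sum).encode());
      }))
      .listen(8080)
      .onSuccess(server ->
        // Report the startup time once the HTTP server has started
        System.out.println("Startup time: " + (System.currentTimeMillis() - start) + "ms"));
  }
}
```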
We can run the service:
$ ./gradlew run
and then test it with HTTPie:
$ http :8080/sum a:=1 b:=2
HTTP/1.1 200 OK
Content-Type: application/json
content-length: 9

{
    "sum": 3
}
We can also build a JAR archive with all dependencies bundled, then execute it:
$ ./gradlew shadowJar
$ java -jar build/libs/openj9-howto-all.jar
The microservice reports the startup time by measuring the time between the main method entry and the callback notification when the HTTP server has started.
We can do a few runs of java -jar build/libs/openj9-howto-all.jar and pick the best time.
On my machine the best I got was 311ms.
OpenJ9 offers both an ahead-of-time compiler and a shared class data cache for improving startup time as well as reducing memory consumption. The first run is typically costly, but all subsequent runs benefit from the caches, which are also updated regularly.
The relevant OpenJ9 flags are the following:

- -Xshareclasses: enable class sharing
- -Xshareclasses:name=NAME: a name for the cache, typically one per application
- -Xshareclasses:cacheDir=DIR: a folder for storing the cache files
Let us do a few runs of:
$ java -Xshareclasses -Xshareclasses:name=sum -Xshareclasses:cacheDir=_cache -jar build/libs/openj9-howto-all.jar
On my machine the first run takes 457ms, which is "much" more than 311ms! However, the next runs all land near 130ms, with a best of 112ms, which is very good for a JVM application start time.
Let us now measure the memory usage of the microservice with OpenJ9 and compare with OpenJDK.
Warning
|
This is not a rigorous benchmark. You have been warned 😉 |
We are using Locust to generate some workload.
The locustfile.py
file contains the code to simulate users that perform sums of random numbers:
link:locustfile.py[role=include]
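The file is linked above rather than reproduced. A minimal sketch of what it can contain, assuming the Locust HttpUser API (the wait time and value ranges are assumptions; the a and b fields match the service):

```python
import random

from locust import HttpUser, between, task


class SumUser(HttpUser):
    # Simulated users pause between consecutive requests
    wait_time = between(0.5, 1.0)

    @task
    def compute_sum(self):
        # POST a JSON body with two random numbers to the /sum endpoint
        payload = {"a": random.randint(0, 1000), "b": random.randint(0, 1000)}
        self.client.post("/sum", json=payload)
```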
We can then run locust, and connect to http://localhost:8089 to start a test.
Let us simulate 100 users with a hatch rate of 10 new users per second.
This gives us about 130 requests per second.
The Quarkus team has a good guide on measuring RSS.
On Linux you can use either ps or pmap to measure RSS, while on macOS ps will do.
I am using macOS, so once I have the process id of a running application I can get its RSS as follows:
$ ps x -o pid,rss,command -p 66820
  PID    RSS COMMAND
66820 124032 java -jar build/libs/openj9-howto-all.jar
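If you want to script the measurement, the RSS column can be extracted on its own. Here is a sketch that uses the current shell's PID ($$) as a stand-in; substitute the Java process id in practice:

```shell
# Print only the RSS (in kilobytes) of a given process id
rss_kb=$(ps -o rss= -p $$)
echo "RSS: ${rss_kb} kB"
```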
For all measures we start Locust and let it warm up the microservice.
After a minute we reset the stats and restart a test, then look into RSS and 99% latency.
We will try to run the application with no tuning, and then by limiting the maximum heap size (see the -Xmx flag).
With OpenJDK 21 and no tuning:

- RSS: ~143 MB
- 99% latency: 1ms

With Semeru 21 and no tuning:

- RSS: ~90 MB
- 99% latency: 1ms
OpenJ9 is clearly very efficient with respect to memory consumption, without compromising the latency.
Tip
|
As usual take these numbers with a grain of salt and perform your own measures on your own services with a workload that is appropriate to your usages. |
We have seen how gentle OpenJ9 is on memory, even without tuning. Let us now package the microservice as a Docker image.
Here is the Dockerfile
you can use:
link:Dockerfile[role=include]
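The Dockerfile itself is linked above. A sketch of what it can look like, assuming the IBM Semeru runtimes base image (the image tag and file names are assumptions):

```dockerfile
FROM ibm-semeru-runtimes:open-21-jre
WORKDIR /app
COPY build/libs/openj9-howto-all.jar app.jar
# The shared classes cache lives in a volume so containers can reuse it
VOLUME /app/_cache
EXPOSE 8080
CMD ["java", "-Xvirtualized", "-Xshareclasses:name=sum,cacheDir=/app/_cache", "-jar", "app.jar"]
```

Note that -Xshareclasses sub-options can be combined with commas, as in -Xshareclasses:name=sum,cacheDir=/app/_cache.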
You can note:

- -Xvirtualized is a flag for virtualized / container environments, so OpenJ9 reduces CPU consumption when idle
- /app/_cache is a volume that will have to be mounted for containers to share the OpenJ9 shared classes cache.
The image can be built as follows:
$ docker build . -t openj9-app
We can then create containers from the image:
$ docker run -it --rm -v /tmp/_cache:/app/_cache -p 8080:8080 openj9-app
Again the first container is slower to start, while the next ones benefit from the cache.
Tip
|
On some platforms
|
- We wrote a microservice with Vert.x.
- We ran this microservice on OpenJ9.
- We improved startup time using class data sharing.
- We put the microservice under some workload, then checked that the memory footprint remained low with OpenJ9 compared to OpenJDK with HotSpot.
- We built a Docker image with OpenJ9, class data sharing for fast container boot time, and diminished CPU usage when idle.