Now we will rename all references in code, scripts, YAML files and packages from movies-base-service to movies-spring-web.
Then we will modify our pom.xml to use Java 11:
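A minimal sketch of this change, assuming the project inherits from the Spring Boot parent and uses the standard java.version property:

```xml
<!-- pom.xml: target Java 11 instead of Java 8 -->
<properties>
    <java.version>11</java.version>
</properties>
```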
Finally we will change our Dockerfile to use the Java 11 base image:
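For example, a sketch assuming the official OpenJDK images and the renamed jar (the exact jar name is an assumption):

```dockerfile
# switch from the Java 8 base image to the Java 11 JRE image
FROM openjdk:11-jre-slim
COPY target/movies-spring-web-0.0.1-SNAPSHOT.jar /app.jar
ENTRYPOINT ["java", "-jar", "/app.jar"]
```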
With these changes we can build and deploy our new service:
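The exact commands depend on the scripts used in the previous chapters; a typical Maven, Docker and kubectl sequence would look something like this:

```bash
# package the application, build the image and roll it out to the cluster
./mvnw clean package
docker build . -t movies-spring-web:0.0.1
kubectl apply -f deployment.yml
```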
We can now check that it is deployed in our local cluster:
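For instance, assuming the deployment uses an app=movies-spring-web label and a service with the same name:

```bash
# verify that the pod and the service are up
kubectl get pods -l app=movies-spring-web
kubectl get service movies-spring-web
```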
We can check that our service is running correctly with HTTPie:
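Assuming the service is reachable on localhost:8080 (for example through a port-forward) and exposes the movies endpoint used in the previous chapters:

```bash
# quick smoke test of the API
http GET http://localhost:8080/movies
```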
With this in place we can run our load test following exactly the procedure that we described in our previous example, but this time only for 10 concurrent users.
Doing this, we get these results from the JMeter report and Grafana:
| Step | C. Users | Pods | ART | TPS | Max CPU | Max MEM |
|------|----------|------|-----|-----|---------|---------|
| 01 base | 10 | 1 | 570.47 | 16.73 | 916 | 487 |
| 02 Java 11 | 10 | 1 | 517.01 | 18.47 | 875 | 918 |
Java 11
With this we can see that we get a better ART, TPS and Max CPU, but we have increased the memory usage.
Creating an optimal JRE
One thing that we haven’t checked so far is the size of our image. That may not be important from a performance perspective, but it is for how fast our cluster can install the image, so let’s check it out with:
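For example, using the image name we built above:

```bash
# show the size of the image we just built
docker images movies-spring-web
```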
In Java 11 we can use jlink to generate an optimized JRE distribution that contains only what our application needs. For this we will create a script named jlink.sh:
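The following is only a sketch of such a script, following the description below; the jar location, the exploded directory and the exact jdeps flags are assumptions that depend on the project layout:

```bash
#!/bin/sh
set -e

JAR=$1   # path to the application jar (relative path assumed)
OUT=$2   # output directory for the custom JRE

# explode the Spring Boot jar so jdeps can analyse the application classes
rm -rf exploded "$OUT"
mkdir -p exploded
(cd exploded && jar -xf "../$JAR")

# ask jdeps which JDK modules the application actually uses
MODULES=$(jdeps --print-module-deps --ignore-missing-deps --recursive \
  --multi-release 11 \
  --class-path 'exploded/BOOT-INF/lib/*' \
  exploded/BOOT-INF/classes)

# jdk.crypto.ec is added explicitly since we connect to PostgreSQL over SSL
MODULES="$MODULES,jdk.crypto.ec"

# produce a minimal, compressed JRE containing only those modules
jlink --add-modules "$MODULES" \
  --strip-debug --no-man-pages --no-header-files --compress=2 \
  --output "$OUT"
```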
In this script we uncompress our jar and use jdeps to find which modules are in use; we additionally add jdk.crypto.ec, since we need to connect via SSL to our PostgreSQL cluster, and then produce an optimized JRE image using jlink.
Now we will modify our Dockerfile to perform a multi-stage build:
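A sketch of what this multi-stage Dockerfile could look like, as described below; the jar name, the script arguments and the JRE location inside the slim image are assumptions:

```dockerfile
# builder stage: extract the application and create the custom JRE with jlink.sh
FROM openjdk:11 AS builder
WORKDIR /build
COPY target/movies-spring-web-0.0.1-SNAPSHOT.jar app.jar
COPY jlink.sh .
RUN sh jlink.sh app.jar /opt/custom-jre

# final stage: start from the slim JRE image and replace its JRE with the custom one
FROM openjdk:11-jre-slim
COPY --from=builder /opt/custom-jre /usr/local/openjdk-11
COPY --from=builder /build/app.jar /app.jar
ENTRYPOINT ["java", "-jar", "/app.jar"]
```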
In this Dockerfile we create an intermediate container named builder, based on the OpenJDK 11 image; we use this container to extract our application jar and invoke our jlink.sh script.
Then our Dockerfile creates an image based on the JRE 11 slim image, adds our application and replaces the provided JRE with the one that we created with jlink in the builder image. We use the base JRE image since our custom JRE requires libraries and configuration that are already present in that image.
We can build our container as before and then check the image size:
Now we will run our performance test as before and get these results:
| Step | C. Users | Pods | ART | TPS | Max CPU | Max MEM |
|------|----------|------|-----|-----|---------|---------|
| 01 base | 10 | 1 | 570.47 | 16.73 | 916 | 487 |
| 02 Java 11 | 10 | 1 | 517.01 | 18.47 | 875 | 918 |
| 03 jlink | 10 | 1 | 518.97 | 18.41 | 875 | 787 |
jlink
We can see that there are no major changes when using a JRE built with jlink; our Docker image is just smaller. That helps when we need to install the image into our cluster, but it does not improve the overall response time or even the memory that we use.
Checking the Memory Usage
One thing that sounds a bit odd, and that we will try to explain next, is why we are using so much memory. We need a tool to understand this further, so we will use VisualVM. But first we need to modify our deployment to be able to pass parameters to the JVM when we start our service, and to enable the JMX port that we will use.
First we will modify our Dockerfile:
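A sketch of the final stage of the Dockerfile; the shell-form entrypoint is what allows the JAVA_OPTS variable to be expanded when the container starts:

```dockerfile
FROM openjdk:11-jre-slim
COPY --from=builder /opt/custom-jre /usr/local/openjdk-11
COPY --from=builder /build/app.jar /app.jar
# JAVA_OPTS lets us pass extra JVM flags from the deployment without rebuilding the image
ENV JAVA_OPTS=""
ENTRYPOINT ["sh", "-c", "java $JAVA_OPTS -jar /app.jar"]
```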
We now have an environment variable named JAVA_OPTS that we can use to set additional parameters when running our Java service.
Now we will modify our deployment.yml:
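A sketch of the relevant fragment; the JMX port 12345 comes from the text, while the container name, image and the rest of the deployment are assumptions:

```yaml
# deployment.yml (fragment): expose the JMX port and pass the JMX flags through JAVA_OPTS
spec:
  template:
    spec:
      containers:
        - name: movies-spring-web
          image: movies-spring-web:0.0.1
          ports:
            - containerPort: 8080
            - containerPort: 12345
          env:
            - name: JAVA_OPTS
              value: >-
                -Dcom.sun.management.jmxremote
                -Dcom.sun.management.jmxremote.port=12345
                -Dcom.sun.management.jmxremote.rmi.port=12345
                -Dcom.sun.management.jmxremote.local.only=false
                -Dcom.sun.management.jmxremote.authenticate=false
                -Dcom.sun.management.jmxremote.ssl=false
                -Djava.rmi.server.hostname=127.0.0.1
```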
But in order to use JMX we need to add the jdk.management.agent module to our jlink.sh:
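In the sketch above this is just one more entry in the module list:

```bash
# jdk.management.agent provides the remote JMX agent that VisualVM connects to
MODULES="$MODULES,jdk.crypto.ec,jdk.management.agent"
```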
Now we will build and deploy our application.
Now we will forward port 12345 from the pod to our localhost:
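For example, targeting the deployment so we do not need to look up the pod name:

```bash
# forward the JMX port of the service pod to localhost
kubectl port-forward deployment/movies-spring-web 12345:12345
```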
Now we should open VisualVM and add a JMX connection to our service on localhost:12345:
Now we can click on our application and then select the Monitor tab to see some graphs; I chose to display only CPU and threads:
As we can see, our heap is about 1 GB, with a maximum of 16 GB, however we are only using about 80 MB at peak. We can also see in the CPU graph that our garbage collector is doing almost nothing until it enters a cleaning cycle; then there is a small spike of CPU usage and our heap usage drops.
At this moment we do not have any traffic on the service, so let's run our load test for a couple of minutes.
I'll wait a couple of minutes after the test for the garbage collector to start again, and then grab more data from VisualVM:
As we can see, we now use around 435 MB of heap, and the GC is busy during our test; then we can see how the usage drops afterwards.
Let's make a bit of sense of what we have seen so far.
First, we do not specify any memory limit in the deployment of our containers, so that is the reason our maximum heap is set to 16 GB; even without using much of the initial heap, the JVM is prepared to use a full GB if needed, and up to 16 GB if it needs more.
Our garbage collector is trying its best to keep up and clean memory when it can.
To pick a proper limit we could run more tests with different loads, but I think that a maximum of 450 MB should be OK, starting with 250 MB, so I am going to change the JVM options in our deployment:
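Following the text, this is just a change to the JAVA_OPTS value in the deployment (the JMX flags from before are omitted from this fragment but stay in place for now):

```yaml
# deployment.yml (fragment): start with 250 MB of heap and cap it at 450 MB
env:
  - name: JAVA_OPTS
    value: "-Xms250m -Xmx450m"
```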
We do not need to build our application again since we are not changing our Docker image, so we just deploy it:
Now we will forward the port again, then connect with VisualVM and repeat the test to get a new graph:
Now we can see that our memory is better utilized. Let's now restart our cluster and run our performance test again to check how it performs in comparison with the previous tests. First we modify our deployment to remove the JMX port but keep the heap configuration:
We will also remove the jdk.management.agent module from our jlink.sh:
Then we build and deploy our service:
And now we follow the procedure, including scaling and restarting, to perform a clean test:
| Step | C. Users | Pods | ART | TPS | Max CPU | Max MEM |
|------|----------|------|-----|-----|---------|---------|
| 01 base | 10 | 1 | 570.47 | 16.73 | 916 | 487 |
| 02 Java 11 | 10 | 1 | 517.01 | 18.47 | 875 | 918 |
| 03 jlink | 10 | 1 | 518.97 | 18.41 | 875 | 787 |
| 04 heap | 10 | 1 | 451.73 | 21.13 | 816 | 633 |
heap changes
Changing the garbage collector
All the tests that we have done so far use the G1 garbage collector, which has been the default since Java 9. However, we can switch to a different collector by changing the environment variables in our deployment to tell the JVM which one to use.
To enable the parallel garbage collector we will edit our deployment.yml:
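Assuming the same JAVA_OPTS mechanism as before, this is a single extra flag on top of the heap settings:

```yaml
# deployment.yml (fragment): keep the heap settings and switch to the parallel collector
env:
  - name: JAVA_OPTS
    value: "-Xms250m -Xmx450m -XX:+UseParallelGC"
```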
If we run our load test again, these are the numbers that we get:
| Step | C. Users | Pods | ART | TPS | Max CPU | Max MEM |
|------|----------|------|-----|-----|---------|---------|
| 01 base | 10 | 1 | 570.47 | 16.73 | 916 | 487 |
| 02 Java 11 | 10 | 1 | 517.01 | 18.47 | 875 | 918 |
| 03 jlink | 10 | 1 | 518.97 | 18.41 | 875 | 787 |
| 04 heap | 10 | 1 | 451.73 | 21.13 | 816 | 633 |
| 05 parallel GC | 10 | 1 | 450.85 | 21.17 | 799 | 626 |
parallel gc
We can see that the parallel collector performs even better than G1 overall, but we need to understand why.
G1 is really good at providing predictable pauses when doing garbage collection, and it is continuously doing part of the work without producing pauses. The parallel collector, on the other hand, just waits for certain thresholds before it actually starts cleaning and then pauses unpredictably; however, its overall throughput is better. Finally, G1 is really meant for bigger heaps, and that is not the case in our service.
These numbers will vary a lot depending on what your service does; for example, if you have tons of static data, G1 may be a better fit.
Connection pool
So far we have not taken into account how our service connects to the database, so let's connect to the database and find out.
First we will forward the master of our database to a local port with:
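The exact target depends on how the PostgreSQL cluster was created in the previous chapters; the general shape is a port-forward of the master service (the service name below is an assumption):

```bash
# forward the PostgreSQL master to localhost:5432
kubectl port-forward service/movies-db-master 5432:5432
```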
Now we will get the password for our moviesdba user:
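Assuming the credentials live in a Kubernetes secret (the secret name and key below are placeholders that depend on how the database was provisioned):

```bash
# read and decode the moviesdba password from its secret
kubectl get secret moviesdba-credentials -o jsonpath='{.data.password}' | base64 --decode
```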
Finally we will use psql to connect to our database:
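For example, assuming the database is named movies:

```bash
# connect through the forwarded port as the admin user
psql -h localhost -p 5432 -U moviesdba movies
```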
To find out how many active connections our moviesuser has, we can run:
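pg_stat_activity has one row per server connection, so counting the rows for that user is enough:

```sql
-- count the connections currently opened by moviesuser
SELECT count(*) FROM pg_stat_activity WHERE usename = 'moviesuser';
```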
If we run our load test and execute the same command while the test is running, we will see something like:
The connections are not being reused, so let's add a connection pool. First we will modify our application.yml to define our pool:
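The fragment below is only a sketch; the actual prefix, property names and placeholders depend on how the datasource was configured in the base service:

```yaml
# application.yml (fragment): pool sizing added to the existing datasource settings
movies-data-source:
  connection-string: jdbc:postgresql://${DATABASE_HOST}:${DATABASE_PORT}/movies
  user: ${DATABASE_USER}
  password: ${DATABASE_PASSWORD}
  initial-pool-size: 5
  max-pool-size: 10
```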
Now we will modify our DataSourceProperties class:
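A sketch of how the class could bind those settings with Spring Boot's @ConfigurationProperties; the package, prefix and field names are assumptions:

```java
package com.example.movies; // package name is an assumption

import org.springframework.boot.context.properties.ConfigurationProperties;
import org.springframework.context.annotation.Configuration;

// binds the datasource and pool settings defined in application.yml
@Configuration
@ConfigurationProperties(prefix = "movies-data-source")
public class DataSourceProperties {

    private String connectionString;
    private String user;
    private String password;
    private int initialPoolSize;
    private int maxPoolSize;

    public String getConnectionString() { return connectionString; }
    public void setConnectionString(final String connectionString) { this.connectionString = connectionString; }
    public String getUser() { return user; }
    public void setUser(final String user) { this.user = user; }
    public String getPassword() { return password; }
    public void setPassword(final String password) { this.password = password; }
    public int getInitialPoolSize() { return initialPoolSize; }
    public void setInitialPoolSize(final int initialPoolSize) { this.initialPoolSize = initialPoolSize; }
    public int getMaxPoolSize() { return maxPoolSize; }
    public void setMaxPoolSize(final int maxPoolSize) { this.maxPoolSize = maxPoolSize; }
}
```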
And finally our MoviesDataSource class:
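A sketch using HikariCP, which is the pool that ships with Spring Boot by default; whether the original class extends HikariDataSource or builds the pool differently is an assumption:

```java
package com.example.movies; // package name is an assumption

import com.zaxxer.hikari.HikariConfig;
import com.zaxxer.hikari.HikariDataSource;

// a pooled datasource built from our properties, so connections are reused between requests
public class MoviesDataSource extends HikariDataSource {

    public MoviesDataSource(final DataSourceProperties properties) {
        super(hikariConfig(properties));
    }

    private static HikariConfig hikariConfig(final DataSourceProperties properties) {
        final HikariConfig config = new HikariConfig();
        config.setJdbcUrl(properties.getConnectionString());
        config.setUsername(properties.getUser());
        config.setPassword(properties.getPassword());
        config.setMinimumIdle(properties.getInitialPoolSize());
        config.setMaximumPoolSize(properties.getMaxPoolSize());
        return config;
    }
}
```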
Now we can build and deploy our service again and repeat our test:
| Step | C. Users | Pods | ART | TPS | Max CPU | Max MEM |
|------|----------|------|-----|-----|---------|---------|
| 01 base | 10 | 1 | 570.47 | 16.73 | 916 | 487 |
| 02 Java 11 | 10 | 1 | 517.01 | 18.47 | 875 | 918 |
| 03 jlink | 10 | 1 | 518.97 | 18.41 | 875 | 787 |
| 04 heap | 10 | 1 | 451.73 | 21.13 | 816 | 633 |
| 05 parallel GC | 10 | 1 | 450.85 | 21.17 | 799 | 626 |
| 06 connection pool | 10 | 1 | 129.0 | 85.49 | 625 | 363 |
connection pool
We have improved our service further since we now reuse our connections.
Final Touches
Now, to finalize our optimization, we will change a few more settings. First we will modify our POM to disable Logback and enable Log4j2, which has better performance.
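Excluding the default logging starter and adding the Log4j2 starter is the standard Spring Boot way of doing this:

```xml
<!-- pom.xml: replace Logback (spring-boot-starter-logging) with Log4j2 -->
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-web</artifactId>
    <exclusions>
        <exclusion>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-logging</artifactId>
        </exclusion>
    </exclusions>
</dependency>
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-log4j2</artifactId>
</dependency>
```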
We will set the log level to error, disable the Spring banner and disable JMX by adding the following to our application.yaml:
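These are standard Spring Boot properties:

```yaml
# application.yaml (fragment): minimal logging, no banner, no Spring JMX
spring:
  main:
    banner-mode: "off"
  jmx:
    enabled: false
logging:
  level:
    root: error
```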
Now let's do a new run with our final changes:
| Step | C. Users | Pods | ART | TPS | Max CPU | Max MEM |
|------|----------|------|-----|-----|---------|---------|
| 01 base | 10 | 1 | 570.47 | 16.73 | 916 | 487 |
| 02 Java 11 | 10 | 1 | 517.01 | 18.47 | 875 | 918 |
| 03 jlink | 10 | 1 | 518.97 | 18.41 | 875 | 787 |
| 04 heap | 10 | 1 | 451.73 | 21.13 | 816 | 633 |
| 05 parallel GC | 10 | 1 | 450.85 | 21.17 | 799 | 626 |
| 06 connection pool | 10 | 1 | 129.0 | 85.49 | 625 | 363 |
| 07 final | 10 | 1 | 111.61 | 85.53 | 626 | 350 |
final
Running the complete set
With this we are ready to run our final test with the full set of loads, so let's do it:
| C. Users | Pods | ART | TPS | Max CPU | Max MEM | Service |
|----------|------|-----|-----|---------|---------|---------|
| 1 | 1 | 39.16 | 25.49 | 403 | 474 | 01 Base |
| 1 | 1 | 34.61 | 28.85 | 209 | 252 | 02 Spring Web |
| 10 | 1 | 570.47 | 16.73 | 916 | 487 | 01 Base |
| 10 | 1 | 111.61 | 85.53 | 626 | 350 | 02 Spring Web |
| 25 | 3 | 1615.82 | 14.72 | 1049 | 1454 | 01 Base |
| 25 | 1 | 266.03 | 89.44 | 686 | 369 | 02 Spring Web |
| 50 | 5 | 3190.98 | 14.89 | 1351 | 2499 | 01 Base |
| 50 | 3 | 1982.03 | 23.95 | 774 | 1060 | 02 Spring Web |
01 Base = Java 8 - Spring Boot 2.2.2 - Spring Web - Spring Data JDBC, default settings, unoptimized
02 Spring Web = Java 11 - Spring Boot 2.2.2 - Spring Web - Spring Data JDBC, optimized
Conclusions
With these optimizations we have drastically improved our service: it uses less CPU and memory, and we need fewer pods running to support the same number of concurrent users with an improved response time.
In the next chapters of this series we will start to use other frameworks and measure them against these results.
Note: The full code of this service is available at this repository.