Much has been said about the importance of optimizing services that run in the cloud, and there are plenty of resources on the topic, covering start-up times, memory consumption, resource usage and dozens of other data points about our applications.
But even with all this information it gets exponentially complex to understand why we are doing this, and what it actually means. So I decided that it would be simpler to just learn by example: this is the first part of a series of articles that will help with this matter.
We are going to create a series of services that we will deploy in our Kubernetes (k8s) cluster, and we will learn how to measure them. Then we will optimize each service to perform better, so we can measure whether what we have done actually improves things, using data that helps us understand why.
In this first part of the series we will create a base service that will serve as a baseline for comparing the improvements we will be making down the line. For this, our service will be developed with default values, including configuration, deliberately forgetting the things we usually do to optimize an application, in order to understand what those optimizations actually do.
We are going to use Spring Initializr https://start.spring.io/ to quickly bootstrap our application.
We will create a Maven project using Java 8 and Spring Boot version 2.2.2; we will set the Group to org.learning.by.example.movies and the Artifact to base-service, and the rest of the values should be automatically populated.
For dependencies we will search for and add:
Spring Web
Spring Boot Actuator
Spring Data JDBC
PostgreSQL Driver
Now we will click on Generate to download our zip file, base-service.zip; we will uncompress it and leave it ready to be opened in our favorite IDE.
Designing the Service
Before starting to code our application we will design what it should do, so first let's look at the movies table loaded in our database. For this we will do as we have done before, using just the psql client to explore the data.
Let's explore the data using psql. We will log in to our server with the moviesuser user, so we need to get its password, which we can obtain with:
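The exact command depends on how the database credentials were created in the earlier parts of this series; assuming they live in a Kubernetes secret (the secret name and key here are placeholders), it would look something like:

```shell
kubectl get secret movies-db-credentials -o jsonpath='{.data.password}' | base64 --decode
```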
We will forward the PostgreSQL port 5432 on our master to port 6432 on our localhost; this will run until we press Ctrl+C:
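Assuming the database is exposed through a k8s service (the service name here is a guess), the forwarding could be done with:

```shell
kubectl port-forward service/movies-db-cluster 6432:5432
```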
Finally, we can connect to our database with the provided user and password using psql in another shell:
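With the port forwarded, the connection would look something like this (the database name movies is an assumption):

```shell
psql -h localhost -p 6432 -U moviesuser movies
```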
As we can see, our movies table has an id column as the key of each movie, a title column that sometimes contains a year, and a list of genres separated with the | character.
We will build a REST API that returns all the movies for a given genre, but since the genres are stored as a single column we need to prepare a special query for it.
We will use the PostgreSQL function string_to_array and the ANY operator to get all the movies for a given genre, converted to lowercase using the string function lower.
This could return thousands of rows; for example, for sci-fi it will be around 3.5k, but that is what we want our service to do.
Let's check it with just 10 rows in psql:
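A query along these lines matches the description above; the genre is hard-coded here just for the check, while in the service it will be a parameter:

```sql
SELECT id, title, genres
  FROM movies
 WHERE 'sci-fi' = ANY (string_to_array(lower(genres), '|'))
 LIMIT 10;
```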
The query seems to work, so now let's think about our API. We will create an endpoint at the URL /movies/{genre} that returns all movies in that category as an array of JSON objects, but this cannot be the raw rows from our database; it should be a bit better, so we will design a movie JSON like this:
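A movie entry could look like the example below; the exact field names are my assumption, following the design of splitting the year out of the title and turning the genres into an array:

```json
{
  "id": 1,
  "title": "Toy Story",
  "year": 1995,
  "genres": ["adventure", "animation", "children", "comedy", "fantasy"]
}
```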
Implementing our Service
Back in the service that we generated before, we will open it in our favorite IDE and start adding the code required for our service to work.
First we will create a Plain Old Java Object (POJO) class that represents our Movie JSON object; we will place it in src/main/java/org/learning/by/example/movies/baseservice/model/Movie.java:
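A minimal sketch of such a class, assuming the fields from the JSON design (the real class in the repository may differ, for example with extra annotations):

```java
import java.util.List;

// Simple immutable POJO matching the designed movie JSON.
class Movie {
    private final int id;
    private final String title;
    private final int year;
    private final List<String> genres;

    Movie(final int id, final String title, final int year, final List<String> genres) {
        this.id = id;
        this.title = title;
        this.year = year;
        this.genres = genres;
    }

    public int getId() { return id; }
    public String getTitle() { return title; }
    public int getYear() { return year; }
    public List<String> getGenres() { return genres; }
}
```

Spring Web serializes objects like this to JSON through their public getters, so no extra code is needed for the response.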
When serialized, this class will match our designed JSON. Now we will create the repository that queries the database, in src/main/java/org/learning/by/example/movies/baseservice/repositories/MoviesRepository.java:
In this simple repository we extend CrudRepository to return a list of Movies for a given genre, using the query we designed before.
But the output of this query does not match our Movie class, so we need to create a mapper that converts a row from our database; we will place it in src/main/java/org/learning/by/example/movies/baseservice/mapper/MovieMapper.java:
This simple RowMapper first uses a regular expression to get our movie title and year, if available, then parses the genres and produces a Movie object. However, it is not enough to have a mapper; we need to tell Spring Data JDBC to use it when querying for our Movies, so we will set this up in a new configuration class in src/main/java/org/learning/by/example/movies/baseservice/repositories/MappingConfiguration.java:
In this MappingConfiguration we register our MovieMapper for mapping Movie objects; note that we use constructor-based dependency injection to obtain the bean for our MovieMapper.
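The parsing work done by the mapper can be sketched outside of Spring like this; the class and method names here are mine, for illustration, not the repository's exact code:

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Extracts the title, the optional year and the genres from the raw database columns.
class MovieRowParser {
    // Group 1 captures the title, group 2 a four-digit year in parentheses at the end.
    private static final Pattern TITLE_YEAR = Pattern.compile("(.*?)\\s*\\((\\d{4})\\)\\s*$");

    static String title(final String rawTitle) {
        final Matcher matcher = TITLE_YEAR.matcher(rawTitle);
        return matcher.matches() ? matcher.group(1) : rawTitle;
    }

    static int year(final String rawTitle) {
        final Matcher matcher = TITLE_YEAR.matcher(rawTitle);
        return matcher.matches() ? Integer.parseInt(matcher.group(2)) : 0; // 0 when no year is present
    }

    static List<String> genres(final String rawGenres) {
        // The genres column uses '|' as separator, which must be escaped in a regex.
        return Arrays.asList(rawGenres.split("\\|"));
    }
}
```

For example, "Toy Story (1995)" splits into the title "Toy Story" and the year 1995, while a title without a year is returned unchanged.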
We will now create a service that provides movies to whoever requests them, so that we are able to change the implementation of our repository if needed; we will place it in src/main/java/org/learning/by/example/movies/baseservice/service/MovieService.java:
The MovieService class will also convert the genre to lowercase, since our repository performs the query on lowercase genres.
Now we will create our API using a RestController; we will place it in src/main/java/org/learning/by/example/movies/baseservice/controller/MoviesController.java:
Our MoviesController will answer HTTP GET requests on /movies/{genre} and return the movies using the MovieService, which in turn uses our MoviesRepository. However, we still need to establish the connection to our database, but first let's think about how this will run on our k8s cluster.
When the service is running we need to connect to our database. We can use the environment variables provided by Kubernetes to find the IP address and port, as we did in the last example; these are MOVIES_DB_CLUSTER_SERVICE_HOST and MOVIES_DB_CLUSTER_SERVICE_PORT_POSTGRESQL. We also need the credentials, which we can inject in our Kubernetes deployment, as we did before, as a directory containing a username and a password. Finally, we need to tell Spring which database driver to use and the JDBC connection string; for this, let's add some entries to src/main/resources/application.yml:
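The entries could look like this; the property names under movies.datasource are my choice and just need to match the DataSourceProperties class that follows, and the database name movies is an assumption:

```yaml
movies:
  datasource:
    driver: org.postgresql.Driver
    connection-string: jdbc:postgresql://${MOVIES_DB_CLUSTER_SERVICE_HOST}:${MOVIES_DB_CLUSTER_SERVICE_PORT_POSTGRESQL}/movies
    credentials: /etc/movies-db
```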
Let's read those values using a ConfigurationProperties class that will be in src/main/java/org/learning/by/example/movies/baseservice/datasource/DataSourceProperties.java:
Now that we have our DataSourceProperties we can create a DataSource for our connection, in src/main/java/org/learning/by/example/movies/baseservice/datasource/MoviesDataSource.java:
We are creating a DriverManagerDataSource with the settings from our application.yml, reading the username and password from our credentials directory.
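The credential-reading part can be sketched as a small helper like this one (my own helper for illustration, not the article's exact code): each file in the mounted directory contains a single value.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Reads one credential value (e.g. "username" or "password")
// from a file in the mounted credentials directory.
class CredentialsReader {
    static String read(final String directory, final String name) {
        try {
            return new String(Files.readAllBytes(Paths.get(directory, name)),
                    StandardCharsets.UTF_8).trim();
        } catch (final IOException cause) {
            throw new IllegalStateException("cannot read credential: " + name, cause);
        }
    }
}
```

Trimming matters here: files created with editors or shell redirection usually end with a newline that must not become part of the password.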
Finally, we will modify our application in src/main/java/org/learning/by/example/movies/baseservice/BaseServiceApplication.java:
We just enable JDBC repositories in our application.
Note: To keep this article brief we haven't included the unit and integration tests created for this code, but they are available in the repository for this article.
Running our Service in Local
Before starting to work on deploying the service to our k8s cluster, we will run it locally; but since our configuration uses a couple of environment variables and some credential files, we need to set these up first.
First, let's get our user password with:
Now we will save it to a file named /etc/movies-db/password, and we will also create a file named /etc/movies-db/username that just contains our user: moviesuser.
We will build our microservice with:
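Projects generated by Spring Initializr include the Maven wrapper, so the build can be run without a local Maven installation:

```shell
./mvnw clean package
```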
We will forward the PostgreSQL port 5432 on our master to port 6432 on our localhost; this will run until we press Ctrl+C:
In a new shell window we can run our service; this will run until we press Ctrl+C:
Now we can test our service using any HTTP client, such as wget or curl; I am going to use HTTPie instead:
This will output thousands of records; the ones shown were just some of them.
But since we also included Spring Boot Actuator in our dependencies, we have two more URLs in our service: /actuator/health and /actuator/info, the endpoints that the actuator exposes by default.
Building our Docker Image
Now that we have tested that our application runs correctly, let's create a Docker image and push it to the local registry that we have in our cluster. First we will create a Dockerfile:
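A minimal Dockerfile could be along these lines; the base image and the jar file name are assumptions:

```dockerfile
FROM openjdk:8-jre-alpine
COPY target/base-service-0.0.1-SNAPSHOT.jar /app.jar
ENTRYPOINT ["java", "-jar", "/app.jar"]
```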
Now we will create a small script that builds our Docker image and publishes it to the registry; we will name it build.sh:
Running this script will build our Docker image and push it to our local Docker registry, but to deploy the image into our cluster we will create a deployment descriptor named deployment.yml:
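A sketch of what such a descriptor could contain; the image registry (microk8s's localhost:32000), the secret name, the port and the probe paths are assumptions:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: base-service
spec:
  replicas: 1
  selector:
    matchLabels:
      app: base-service
  template:
    metadata:
      labels:
        app: base-service
    spec:
      containers:
        - name: base-service
          image: localhost:32000/base-service:latest
          ports:
            - containerPort: 8080
          livenessProbe:
            httpGet:
              path: /actuator/health
              port: 8080
          readinessProbe:
            httpGet:
              path: /actuator/health
              port: 8080
          volumeMounts:
            - name: movies-db-credentials
              mountPath: /etc/movies-db
              readOnly: true
            - name: tmp
              mountPath: /tmp
      volumes:
        - name: movies-db-credentials
          secret:
            secretName: movies-db-credentials
        - name: tmp
          emptyDir: {}
---
apiVersion: v1
kind: Service
metadata:
  name: base-service
spec:
  type: LoadBalancer
  selector:
    app: base-service
  ports:
    - port: 8080
      targetPort: 8080
```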
In this descriptor we first deploy our service: we declare liveness and readiness probes that use our actuator endpoints, we inject our credentials, and we mount a temporary directory for our service to use. Then we create a k8s Service with a load balancer, to be able to call our service and balance between the pods that are available.
Finally, we will create another script to deploy our service, deploy.sh:
This script is a bit different: we have written it to use either the standard kubectl or microk8s.kubectl if it is available; it will delete the deployment and then deploy our service.
We can now build and deploy our service with:
Now we can check what we have in our cluster for our service with:
Now let's unzip the JMeter download and place it in a path that we can use:
With this, we now have jmeter available to use. However, if you are using a high-DPI screen, such as a 4k or Retina display, the JMeter UI may be too small to read; you can edit the file /opt/jmeter/default/bin/user.properties and add this:
Now we can run jmeter; it will open with an empty test plan, and we will just set its Name to: k8s service load test
We will add a new Thread Group: right-click on k8s service load test in the tree panel and select Add > Threads (Users) > Thread Group. Then we will set Number of Threads (Users) = ${__P(NUM_USERS)}, Ramp-up period (seconds) = ${__P(RAMP_UP)}, Loop Count = Infinite, Specify Thread lifetime = checked, Duration (Seconds) = ${__P(DURATION)}.
Now we need to add an HTTP Request: right-click on Thread Group in the tree panel and select Add > Sampler > HTTP Request. Then we will set Name = ${__P(TEST_URL)}, Server Name or IP = ${__P(SERVICE_CLUSTER_IP)}, Port Number = ${__P(SERVICE_PORT)}, Method = GET, Path = ${__P(TEST_URL)}.
Now we will save our plan as k8s-sv-load-test/load_test.jmx.
To launch our load test we will create a new script, k8s-sv-load-test/k8s-sv-load-test.sh:
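A sketch of such a script, using JMeter's non-GUI mode and passing the plan's properties with -J; resolving the service's cluster IP with kubectl's jsonpath output is my assumption about how the original script works:

```shell
#!/bin/sh
# usage: ./k8s-sv-load-test.sh <service> <port> <path> <users> <ramp-up-s> <duration-s>
SERVICE_NAME=$1
SERVICE_PORT=$2
TEST_URL=$3
NUM_USERS=$4
RAMP_UP=$5
DURATION=$6

# resolve the cluster IP of the k8s service under test
SERVICE_CLUSTER_IP=$(kubectl get service "$SERVICE_NAME" -o jsonpath='{.spec.clusterIP}')

# run the saved plan headless and generate the HTML report in ./report
rm -rf report results.csv
jmeter -n -t load_test.jmx \
    -JSERVICE_CLUSTER_IP="$SERVICE_CLUSTER_IP" \
    -JSERVICE_PORT="$SERVICE_PORT" \
    -JTEST_URL="$TEST_URL" \
    -JNUM_USERS="$NUM_USERS" \
    -JRAMP_UP="$RAMP_UP" \
    -JDURATION="$DURATION" \
    -l results.csv -e -o report
```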
This is a very generic script that allows us to test any k8s service that we have in our cluster; for example, if we want to launch a load test against the actuator/info endpoint of our movies service for 30s, ramping up in 1s:
When the test is complete we get a nice report in HTML; in this example it was at file:///home/jamedina/Sources/movies-base-service/k8s-sv-load-test/report/index.html
But this is a generic script, so let's create a more specific one to test our movies/{genre} endpoint, in load-test.sh:
Now that we have our script ready, we can launch it with:
And this is the report that we get:
Getting data from Grafana
Now that we have run a simple test, we can go to Grafana and look at the graphs for it.
We can see in this test that initially there is a small increase in CPU and then a peak; this happened quite rapidly because our ramp-up period was just one second. Then we see a drop after our test finishes.
In terms of memory, we see a slight increase that then remains unchanged during the test.
Scaling
For the next tests we will need to increase and decrease the replicas of our service; we will use a small script for this matter, named scale.sh:
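A sketch of such a script; the deployment name base-service and the use of kubectl's jsonpath output are my assumptions:

```shell
#!/bin/sh
# usage: ./scale.sh <replicas>
REPLICAS=${1:?usage: scale.sh <replicas>}

kubectl scale deployment base-service --replicas="$REPLICAS"

# wait until the number of ready replicas matches what we asked for
READY=""
until [ "${READY:-0}" = "$REPLICAS" ]; do
    sleep 2
    READY=$(kubectl get deployment base-service -o jsonpath='{.status.readyReplicas}')
done
```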
This script tells k8s to scale our deployment to a given number of replicas and waits until we have the number of replicas that we asked for.
First, let's scale our application to 0 replicas:
Now let's scale our service to 5 replicas:
And now we will repeat our test, but with 10 concurrent users:
And this is the report that we got:
We would now like to get the data from Grafana as well, so we will create a copy of the Pods dashboard, name it "movies", and then edit the template to get all our movies pods:
Finally, we will get from Grafana the graphs for our 5 pods / 10 users test:
Getting our Measurements
Now that we have these tools ready, we will define how we are going to measure this base service, and hopefully further services in the next parts of the series. We know that we can get dozens of different metrics from our tools, but we are going with something very simple, with just a few data points, and we are not focusing on a particular runtime, so there will be no JVM metrics, because we will not have a JVM in all the parts of this series.
We will follow this procedure with 1, 10, 25 and 50 concurrent users, starting with 1 replica:

- Scale the replicas to 0.
- Stop our cluster.
- Start our cluster.
- Scale to the desired number of replicas.
- Run the test for the desired number of concurrent users for 10 minutes, with a ramp-up of 1 minute.
- If all requests are OK (HTTP response code is 200), get from the JMeter report:
  - Average response time
  - Transactions per second
  and from Grafana:
  - Max CPU usage
  - Max memory usage
- If some requests are not OK (HTTP response code is not 200), increase the desired number of replicas by one and run again.
We will repeat this procedure until we have the results for 50 concurrent users.
We scale down our replicas and restart our cluster between runs to guarantee that each test is a fresh run, including the database load and connections.
The Results
These are the results for this first example in the series; if you repeat the steps you may get different numbers, because they depend on the cluster and on the computer that is launching the tests.
C. Users | Pods | ART (ms) |   TPS | Max CPU | Max MEM
---------|------|----------|-------|---------|--------
       1 |    1 |    39.16 | 25.49 |     403 |     474
      10 |    1 |   570.47 | 16.73 |     916 |     487
      25 |    3 |  1615.82 | 14.72 |    1049 |    1454
      50 |    5 |  3190.98 | 14.89 |    1351 |    2499

(C. Users = concurrent users, ART = average response time in milliseconds, TPS = transactions per second)
Java 8 - Spring Boot 2.2 - Spring Web - Spring Data JDBC, default settings, unoptimized
With this we will close this first article; in the next one we will try to optimize this service and compare the results with this baseline.
Note: The full code of this service is available at this repository.