Performance Archives - Piotr's TechBlog
https://piotrminkowski.com/category/performance/
Java, Spring, Kotlin, microservices, Kubernetes, containers

Create Apps with Claude Code on Ollama
Tue, 17 Feb 2026
https://piotrminkowski.com/2026/02/17/create-apps-with-claude-code-on-ollama/

This article explains how to run Claude Code on Ollama and use local or cloud models served by Ollama to create Java apps. Read this article if you are experimenting with AI code generation and currently paying for API access. Ollama has recently added built-in integration with developer tools such as Codex and Claude Code, which is a really useful feature. Using the example of the Claude Code integration with several different models, running both locally and in the cloud, you will see how it works.

You can find other articles about AI and Java on my blog. For example, if you are interested in how to use Ollama to serve models for Spring AI applications, you can read the following article.

Source Code

Feel free to use my source code if you’d like to try it out yourself. To do that, clone my sample GitHub repository and follow the instructions below. The repository contains several branches, each with an application generated from the same prompt using a different model. Currently, the branch that received the fewest comments in code review has been merged into master; this is the version of the code generated with the glm-5 model. However, this may change in the future and the master branch may be modified, so it is best to refer to the individual branches or pull requests shown below.

Below is the current list of branches. The dev branch contains the initial version of the repository with the CLAUDE.md file, which specifies the basic requirements for the generated code.

$ git branch
    dev
    glm-5
    gpt-oss
  * master
    minimax
    qwen3-coder
ShellSession

Here are the instructions for the AI from the CLAUDE.md file. They describe the technologies I plan to use in my application and a few practices I intend to apply. For example, I don’t want to use Lombok, a popular Java library that automates the generation of boilerplate such as getters, setters, and constructors. In the age of AI-generated code this approach no longer seems necessary, but for some reason AI models really like this library 🙂 Also, each time I make a code change, I want the model to increment the version number, update the README.md file, and so on.

# Project Instructions

- Always use the latest versions of dependencies.
- Always write Java code as the Spring Boot application.
- Always use Maven for dependency management.
- Always create test cases for the generated code both positive and negative.
- Always generate the CircleCI pipeline in the .circleci directory to verify the code.
- Minimize the amount of code generated.
- The Maven artifact name must be the same as the parent directory name.
- Use semantic versioning for the Maven project. Each time you generate a new version, bump the PATCH section of the version number.
- Use `pl.piomin.services` as the group ID for the Maven project and base Java package.
- Do not use the Lombok library.
- Generate the Docker Compose file to run all components used by the application.
- Update README.md each time you generate a new version.
Markdown
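One of the rules above asks the model to bump the PATCH segment of the semantic version on every change. As a quick illustration, the bump itself can be sketched in shell (the version string is hypothetical; in the generated project it lives in pom.xml):

```shell
# Hypothetical current version; the real one is stored in pom.xml.
VERSION="1.0.3"
# Split MAJOR.MINOR.PATCH on dots and increment only the PATCH segment.
NEXT=$(echo "$VERSION" | awk -F. '{printf "%d.%d.%d", $1, $2, $3 + 1}')
echo "$NEXT"   # prints 1.0.4
```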

Run Claude on Ollama

First, install Ollama on your computer. You can download the installer for your OS here. If you have used Ollama before, please update to the latest version.

$ ollama --version
  ollama version is 0.16.1
ShellSession

Next, install Claude Code.

curl -fsSL https://claude.ai/install.sh | bash
ShellSession

Before you start, it is worth increasing the maximum context window allowed by Ollama. By default, it is set to 4k tokens, and the Ollama website recommends 64k for Claude Code. I set it to the maximum of 256k to test different models.
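One way to do this is through an environment variable read by the Ollama server at startup (this assumes your Ollama version supports the `OLLAMA_CONTEXT_LENGTH` variable; restart the server afterwards for it to take effect):

```shell
# Assumption: your Ollama build reads OLLAMA_CONTEXT_LENGTH at server startup.
export OLLAMA_CONTEXT_LENGTH=262144  # 256k tokens
```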

For example, the gpt-oss model supports a 128k context window size.

Let’s pull and run the gpt-oss model with Ollama:

ollama run gpt-oss
ShellSession

After the model downloads and starts, you can verify its parameters with the ollama ps command. You should see 100% GPU utilization and a context window size of ~131k.

Ensure you are in the root repository directory, then run Claude Code with the command ollama launch claude. Next, choose the gpt-oss model visible in the list under “More”.

ollama-claude-code-gpt-oss

That’s it! Finally, we can start playing with AI.

ollama-claude-code-run

Generate a Java App with Claude Code

My application will be very simple. I just need something to quickly test the solution. Of course, all guidelines defined in the CLAUDE.md file should be followed. So, here is my prompt. Nothing more, nothing less 🙂

Generate an application that exposes REST API and connects to a PostgreSQL database.
The application should have a Person entity with id, and typical fields related to each person.
All REST endpoints should be protected with JWT and OAuth2.
The codebase should use Skaffold to deploy on Kubernetes.
Plaintext

After a few minutes, I have the entire code generated. Below is a summary from the AI of what has been done. If you want to check it out for yourself, take a look at this branch in my repository.

ollama-claude-code-generated

For the sake of formality, let’s take a look at the generated code. There is nothing spectacular here: it is just a regular Spring Boot application that exposes a few REST endpoints for CRUD operations. However, it doesn’t look bad. Here’s the Spring Boot @Service implementation responsible for using PersonRepository to interact with the database.

@Service
public class PersonService {
    private final PersonRepository repository;

    public PersonService(PersonRepository repository) {
        this.repository = repository;
    }

    public List<Person> findAll() {
        return repository.findAll();
    }

    public Optional<Person> findById(Long id) {
        return repository.findById(id);
    }

    @Transactional
    public Person create(Person person) {
        return repository.save(person);
    }

    @Transactional
    public Optional<Person> update(Long id, Person person) {
        return repository.findById(id).map(existing -> {
            existing.setFirstName(person.getFirstName());
            existing.setLastName(person.getLastName());
            existing.setEmail(person.getEmail());
            existing.setAge(person.getAge());
            return repository.save(existing);
        });
    }

    @Transactional
    public void delete(Long id) {
        repository.deleteById(id);
    }
}
Java
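The update method’s pattern, returning an empty Optional when the entity does not exist so the caller can map it to a 404 response, can be shown in isolation with this self-contained sketch (the Map and the record below are stand-ins I introduced for illustration; the generated code delegates to PersonRepository instead):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Standalone sketch of the Optional-based update pattern from PersonService,
// with a plain Map standing in for the Spring Data repository.
public class UpdatePatternDemo {

    public record Person(Long id, String firstName) {}

    private static final Map<Long, Person> STORE =
            new HashMap<>(Map.of(1L, new Person(1L, "Ada")));

    public static Optional<Person> update(Long id, String newFirstName) {
        // Same shape as PersonService.update: map over the existing entity,
        // apply the changes, save, and fall through to Optional.empty() when absent.
        return Optional.ofNullable(STORE.get(id)).map(existing -> {
            Person updated = new Person(existing.id(), newFirstName);
            STORE.put(id, updated);
            return updated;
        });
    }

    public static void main(String[] args) {
        System.out.println(update(1L, "Grace").isPresent()); // prints true
        System.out.println(update(99L, "X").isPresent());    // prints false
    }
}
```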

Here’s the generated @RestController with the REST endpoint implementations:

@RestController
@RequestMapping("/api/people")
public class PersonController {
    private final PersonService service;

    public PersonController(PersonService service) {
        this.service = service;
    }

    @GetMapping
    public List<Person> getAll() {
        return service.findAll();
    }

    @GetMapping("/{id}")
    public ResponseEntity<Person> getById(@PathVariable Long id) {
        Optional<Person> person = service.findById(id);
        return person.map(ResponseEntity::ok).orElseGet(() -> ResponseEntity.notFound().build());
    }

    @PostMapping
    public ResponseEntity<Person> create(@RequestBody Person person) {
        Person saved = service.create(person);
        return ResponseEntity.status(201).body(saved);
    }

    @PutMapping("/{id}")
    public ResponseEntity<Person> update(@PathVariable Long id, @RequestBody Person person) {
        Optional<Person> updated = service.update(id, person);
        return updated.map(ResponseEntity::ok).orElseGet(() -> ResponseEntity.notFound().build());
    }

    @DeleteMapping("/{id}")
    public ResponseEntity<Void> delete(@PathVariable Long id) {
        service.delete(id);
        return ResponseEntity.noContent().build();
    }
}
Java

Below is a summary in a pull request with the generated code.

ollama-claude-code-pr

Using Ollama Cloud Models

Recently, Ollama has made it possible to run models not only locally but also in the cloud. All models tagged with cloud are run this way. Cloud models are automatically offloaded to Ollama’s cloud service while offering the same capabilities as local models. This is most useful for larger models that wouldn’t fit on a personal computer. For example, you can try to run the qwen3-coder model locally; unfortunately, it didn’t perform very well on my laptop.

Then, I can run the same or even a larger model in the cloud and automatically connect Claude Code to it using the following command:

ollama launch claude --model qwen3-coder:480b-cloud
ShellSession

Now you can repeat exactly the same exercise as before or take a look at my branch containing the code generated using this model.

You can also try some other cloud models like minimax-m2.5 or glm-5.

Conclusion

If you’re developing locally and don’t want to burn money on APIs, use Claude Code with Ollama and, e.g., the gpt-oss or glm-5 models. It’s a pretty powerful and free option. If you have a powerful personal computer, a locally launched model should be able to generate the code efficiently. Otherwise, you can launch the model in Ollama’s cloud, free of charge up to a certain usage limit (it is difficult to say exactly what that limit is). The gpt-oss model worked really well on my laptop (a MacBook Pro M3) and took about 7-8 minutes to generate the application. You can also look for a model that suits you better.

Startup CPU Boost in Kubernetes with In-Place Pod Resize
Mon, 22 Dec 2025
https://piotrminkowski.com/2025/12/22/startup-cpu-boost-in-kubernetes-with-in-place-pod-resize/

This article explains how to use the In-Place Pod Resize feature in Kubernetes, combined with Kube Startup CPU Boost, to speed up Java application startup. The In-Place Update of Pod Resources feature was initially introduced in Kubernetes 1.27 as an alpha release; with version 1.35, it has reached GA stability. One potential use case for this feature is to set a high CPU limit only during application startup, when Java needs it to launch quickly. I have already described such a scenario in my previous article. The example implemented in that article used the Kyverno tool. However, it was based on the alpha version of the in-place pod resize feature, so it requires a minor tweak to the Kyverno policy to align with the GA release.

The other potential solution in this context is the Vertical Pod Autoscaler (VPA), which supports in-place pod resize in its latest version. VPA automatically adjusts CPU and memory requests/limits for pods based on their actual usage, ensuring containers receive appropriate resources. Unlike the Horizontal Pod Autoscaler (HPA), which scales the number of replicas, VPA scales resources and may restart pods to apply changes. For now, VPA does not support the startup-boost use case, but once that feature is implemented, the situation will change.

On the other hand, Kube Startup CPU Boost is a dedicated feature for scenarios with high CPU requirements during app startup. It is a controller that increases CPU resource requests and limits during Kubernetes workload startup. Once the workload is up and running, the resources are set back to their original values. Let’s see how this solution works in practice!

Source Code

Feel free to use my source code if you’d like to try it out yourself. To do that, you must clone my sample GitHub repository. The sample application is based on Spring Boot and exposes several REST endpoints. However, in this exercise, we will use a ready-made image published in my Quay.io registry: quay.io/pminkows/sample-kotlin-spring:1.5.1.1.

Install Kube Startup CPU Boost

The Kubernetes cluster you are using must enable the In-Place Pod Resize feature. The activation method may vary by Kubernetes distribution. For the Minikube I am using in today’s example, it looks like this:

minikube start --memory='8gb' --cpus='6' --feature-gates=InPlacePodVerticalScaling=true
ShellSession

After that, we can proceed with installing the Kube Startup CPU Boost controller. There are several ways to do it; the easiest is with a Helm chart. Let’s add the following Helm repository:

helm repo add kube-startup-cpu-boost https://google.github.io/kube-startup-cpu-boost
ShellSession

Then, we can install the kube-startup-cpu-boost chart in the dedicated kube-startup-cpu-boost-system namespace using the following command:

helm install -n kube-startup-cpu-boost-system kube-startup-cpu-boost \
  kube-startup-cpu-boost/kube-startup-cpu-boost --create-namespace
ShellSession

If the installation was successful, you should see the following pod running in the kube-startup-cpu-boost-system namespace:

$ kubectl get pod -n kube-startup-cpu-boost-system
NAME                                                         READY   STATUS    RESTARTS   AGE
kube-startup-cpu-boost-controller-manager-75f95d5fb6-692s6   1/1     Running   0          36s
ShellSession

Install Monitoring Stack (optional)

Then, we can install Prometheus monitoring. This is an optional step that lets us verify pod resource usage in graphical form. First, let’s add the following Helm repository:

helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
ShellSession

After that, we can install the latest version of the kube-prometheus-stack chart in the monitoring namespace.

helm install my-kube-prometheus-stack prometheus-community/kube-prometheus-stack \
  -n monitoring --create-namespace
ShellSession

Let’s verify the installation succeeded by listing the pods running in the monitoring namespace.

$ kubectl get pods -n monitoring
NAME                                                         READY   STATUS    RESTARTS   AGE
alertmanager-my-kube-prometheus-stack-alertmanager-0         2/2     Running   0          38s
my-kube-prometheus-stack-grafana-f8bb6b8b8-mzt4l             3/3     Running   0          48s
my-kube-prometheus-stack-kube-state-metrics-99f4574c-bf5ln   1/1     Running   0          48s
my-kube-prometheus-stack-operator-6d58dd9d6c-6srtg           1/1     Running   0          48s
my-kube-prometheus-stack-prometheus-node-exporter-tdwmr      1/1     Running   0          48s
prometheus-my-kube-prometheus-stack-prometheus-0             2/2     Running   0          38s
ShellSession

Finally, we can expose the Prometheus console over localhost using the port forwarding feature:

kubectl port-forward svc/my-kube-prometheus-stack-prometheus 9090:9090 -n monitoring
ShellSession

Configure Kube Startup CPU Boost

The Kube Startup CPU Boost configuration is pretty intuitive. We need to create a StartupCPUBoost resource. It can manage multiple applications based on a given selector. In our case, it is a single sample-kotlin-spring Deployment determined by the app.kubernetes.io/name label (1). The next step is to define the resource management policy (2). The Kube Startup CPU Boost increases both request and limit by 50%. Resources should only be increased for the duration of the startup (3). Therefore, once the readiness probe succeeds, the resource level will return to its initial state. Of course, everything happens in-place without restarting the container.

apiVersion: autoscaling.x-k8s.io/v1alpha1
kind: StartupCPUBoost
metadata:
  name: sample-kotlin-spring
  namespace: demo
selector:
  matchExpressions: # (1)
  - key: app.kubernetes.io/name
    operator: In
    values: ["sample-kotlin-spring"]
spec:
  resourcePolicy: # (2)
    containerPolicies:
    - containerName: sample-kotlin-spring
      percentageIncrease:
        value: 50
  durationPolicy: # (3)
    podCondition:
      type: Ready
      status: "True"
YAML

Next, we will deploy our sample application. Here’s the Deployment manifest of our Spring Boot app. The name of the app container is sample-kotlin-spring, which matches the containerName defined inside the StartupCPUBoost object (1). Then, we set the CPU limit to 500 millicores (2). There’s also a new field, resizePolicy, which tells Kubernetes whether a change to CPU or memory can be applied in place or requires a pod restart (3). The NotRequired value means that changing the resource limit or request will not trigger a pod restart. The Deployment object also contains a readiness probe that calls the GET /actuator/health/readiness endpoint exposed by the Spring Boot Actuator (4).

apiVersion: apps/v1
kind: Deployment
metadata:
  name: sample-kotlin-spring
  namespace: demo
  labels:
    app: sample-kotlin-spring
    app.kubernetes.io/name: sample-kotlin-spring
spec:
  replicas: 1
  selector:
    matchLabels:
      app: sample-kotlin-spring
  template:
    metadata:
      labels:
        app: sample-kotlin-spring
        app.kubernetes.io/name: sample-kotlin-spring
    spec:
      containers:
      - name: sample-kotlin-spring # (1)
        image: quay.io/pminkows/sample-kotlin-spring:1.5.1.1
        ports:
        - containerPort: 8080
        resources:
          limits:
            cpu: 500m # (2)
            memory: "1Gi"
          requests:
            cpu: 200m
            memory: "256Mi"
        resizePolicy: # (3)
        - resourceName: "cpu"
          restartPolicy: "NotRequired"
        readinessProbe: # (4)
          httpGet:
            path: /actuator/health/readiness
            port: 8080
            scheme: HTTP
          initialDelaySeconds: 15
          periodSeconds: 5
          successThreshold: 1
          failureThreshold: 3
YAML

Here are the pod requests and limits configured by Kube Startup CPU Boost. As you can see, the request is set to 300m (a 50% increase over the declared 200m), while the limit is removed entirely.

in-place-pod-resize-boost

Once the application startup process completes, Kube Startup CPU Boost restores the initial request and limit.

Now we can switch to the Prometheus console to see the history of CPU request values for our pod. As you can see, the request was temporarily increased during the pod startup.
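That history can be plotted from a kube-state-metrics series; a query along the lines below shows the CPU request over time (the namespace label value is an assumption matching this demo):

```promql
kube_pod_container_resource_requests{namespace="demo", resource="cpu", unit="core"}
```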

The chart below illustrates CPU usage when the application is launched and then during normal operation.

in-place-pod-resize-metric

We can also define fixed resources for a target container. The CPU requests and limits of the selected container will then be set to the given values (1). If you do not want the operator to remove the CPU limit during boost time, set the REMOVE_LIMITS environment variable to false in the kube-startup-cpu-boost-controller-manager Deployment.

apiVersion: autoscaling.x-k8s.io/v1alpha1
kind: StartupCPUBoost
metadata:
  name: sample-kotlin-spring
  namespace: demo
selector:
  matchExpressions:
  - key: app.kubernetes.io/name
    operator: In
    values: ["sample-kotlin-spring"]
spec:
  resourcePolicy:
    containerPolicies: # (1)
    - containerName: sample-kotlin-spring
      fixedResources:
        requests: "500m"
        limits: "2"
  durationPolicy:
    podCondition:
      type: Ready
      status: "True"
YAML

Conclusion

There are many ways to address application CPU demand during startup. First, you don’t have to set a CPU limit in the Deployment at all; many people argue that setting a CPU limit doesn’t make sense anyway, though for different reasons. In that case, the question of the CPU request remains, but given the short startup timeframe, usage significantly exceeding the declared request isn’t a material problem.

Other solutions are strictly related to Java features. If we compile the application natively with GraalVM or use the CRaC feature, we will significantly speed up startup and reduce CPU requirements.

Finally, several solutions rely on in-place resizing. If you use Kyverno, consider its mutate policy, which can modify resources in response to an application startup event. The Kube Startup CPU Boost tool described in this article operates similarly but is designed exclusively for this use case. In the near future, Vertical Pod Autoscaler will also offer a CPU boost via in-place resize.

Running Tekton Pipelines on Kubernetes at Scale
Wed, 27 Mar 2024
https://piotrminkowski.com/2024/03/27/running-tekton-pipelines-on-kubernetes-at-scale/

In this article, you will learn how to configure and run CI pipelines on Kubernetes at scale with Tekton. Tekton is a Kubernetes-native solution for building CI/CD pipelines. It provides a set of Kubernetes Custom Resource Definitions (CRDs) that allow us to define building blocks and reuse them in our pipelines. You can find several articles about Tekton on my blog. If you don’t have previous experience with the tool, you can read my introduction to CI/CD with Tekton and Argo CD to understand the basic concepts.

Today, we will consider performance issues related to running Tekton pipelines at scale. We will run several different pipelines at the same time, or the same pipeline several times simultaneously, which results in a long history of previous runs. To handle this successfully, Tekton provides a dedicated module configured with the TektonResults CRD. It can also clean up selected resources using a Kubernetes CronJob.

Source Code

This time we won’t work much with source code. However, if you would like to try it yourself, you can always take a look at my source code. To do that, clone my GitHub repository and then follow the instructions below.

Install Tekton on Kubernetes

We can easily install Tekton on Kubernetes using the operator. We need to apply the following YAML manifest:

$ kubectl apply -f https://storage.googleapis.com/tekton-releases/operator/latest/release.yaml
ShellSession

After that, we can choose between several installation profiles: lite, basic, and all. Let’s choose the all profile:

$ kubectl apply -f https://raw.githubusercontent.com/tektoncd/operator/main/config/crs/kubernetes/config/all/operator_v1alpha1_config_cr.yaml
ShellSession

On OpenShift, we can do it using the web UI. The OpenShift Console provides the Operator Hub section, where we can find the “Red Hat OpenShift Pipelines” operator, which installs Tekton and integrates it with OpenShift. Once you install it, you can, for example, create, manage, and run pipelines in the OpenShift Console.

tekton-kubernetes-operator

OpenShift Console offers a dedicated section in the menu for Tekton pipelines as shown below.

We can also install the tkn CLI on the local machine to interact with Tekton Pipelines running on the Kubernetes cluster. For example, on macOS, we can do it using Homebrew:

$ brew install tektoncd-cli
ShellSession

How It Works

Create a Tekton Pipeline

Firstly, let’s discuss some basic concepts around Tekton. We can run the same pipeline several times simultaneously. We can trigger that process by creating the PipelineRun object directly, or indirectly, e.g. via the tkn CLI or the graphical dashboard. Either way, a PipelineRun object must be created each time. A Tekton pipeline consists of one or more tasks, and each task is executed by a separate pod. To share data between those pods, we need to use a persistent volume. An example of such data is the app source code cloned from the git repository. We need to attach such a PVC (Persistent Volume Claim) as the pipeline workspace in the PipelineRun definition. The following diagram illustrates that scenario.

tekton-kubernetes-pipeline
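For orientation, the smallest building block, a standalone Task, can be sketched as follows (the name, image, and script here are illustrative and not part of the sample pipeline, which reuses ready-made tasks from Tekton Hub instead):

```yaml
apiVersion: tekton.dev/v1
kind: Task
metadata:
  name: hello
spec:
  steps:
    - name: echo
      image: alpine
      script: |
        echo "Hello from a Tekton Task"
```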

Let’s switch to the code. Here’s the YAML manifest with our sample pipeline. It consists of three tasks, referring to tasks from Tekton Hub: git-clone, s2i-java, and openshift-client. With these three simple tasks, we clone the git repository with the app source code, build the image using the source-to-image approach, and deploy it on the OpenShift cluster. As you see, the pipeline defines a workspace named source-dir, which both git-clone and s2i-java share. The image is tagged with the branch name, which is set as a pipeline input parameter.

apiVersion: tekton.dev/v1
kind: Pipeline
metadata:
  name: sample-pipeline
spec:
  params:
    - description: Git branch name
      name: branch
      type: string
    - description: Target namespace
      name: namespace
      type: string
  tasks:
    - name: git-clone
      params:
        - name: url
          value: 'https://github.com/piomin/sample-spring-kotlin-microservice.git'
        - name: revision
          value: $(params.branch)
      taskRef:
        kind: ClusterTask
        name: git-clone
      workspaces:
        - name: output
          workspace: source-dir
    - name: s2i-java
      params:
        - name: IMAGE
          value: image-registry.openshift-image-registry.svc:5000/$(params.namespace)/sample-spring-kotlin-microservice:$(params.branch)
      runAfter:
        - git-clone
      taskRef:
        kind: ClusterTask
        name: s2i-java
      workspaces:
        - name: source
          workspace: source-dir
    - name: openshift-client
      params:
        - name: SCRIPT
          value: oc process -f openshift/app.yaml -p namespace=$(params.namespace) -p version=$(params.branch) | oc apply -f -
      runAfter:
        - s2i-java
      taskRef:
        kind: ClusterTask
        name: openshift-client
      workspaces:
        - name: manifest-dir
          workspace: source-dir
  workspaces:
    - name: source-dir
YAML

Run a Pipeline Several Times Simultaneously

Now, let’s consider the scenario where we run the pipeline several times with code from different Git branches. Here’s the updated diagram illustrating it. As you see, we need to attach a dedicated volume to each pipeline run, where we store the code of the corresponding source branch.

tekton-kubernetes-pipeline-runs

In order to start the pipeline, we can apply the PipelineRun object. The PipelineRun definition must satisfy the previous requirement for a dedicated volume per run. Therefore, we need to define a volumeClaimTemplate, which automatically creates the volume and binds it to the pods within the pipeline. Here’s a sample PipelineRun object for the master branch:

apiVersion: tekton.dev/v1
kind: PipelineRun
metadata:
  generateName: sample-pipeline-
  labels:
    tekton.dev/pipeline: sample-pipeline
spec:
  params:
    - name: branch
      value: master
    - name: namespace
      value: app-master
  pipelineRef:
    name: sample-pipeline
  taskRunTemplate:
    serviceAccountName: pipeline
  workspaces:
    - name: source-dir
      volumeClaimTemplate:
        spec:
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 1Gi
          volumeMode: Filesystem
YAML

With this bash script, we can run our pipeline for every branch in the source repository whose name is prefixed with the feature word. It uses the tkn CLI to interact with Tekton.

#! /bin/bash

for OUTPUT in $(git branch -r)
do
  branch=$(echo $OUTPUT | sed -e "s/^origin\///")
  if [[ $branch == feature* ]]
  then
    echo "Running the pipeline: branch="$branch
    tkn pipeline start sample-pipeline -p branch=$branch -p namespace=app-$branch -w name=source-dir,volumeClaimTemplateFile=pvc.yaml
  fi
done
ShellScript

The script is available in the sample GitHub repository under the openshift directory. If you want to reproduce my steps, clone the repository and then execute the run.sh script against your OpenShift cluster.

$ git clone https://github.com/piomin/sample-spring-kotlin-microservice.git
$ cd sample-spring-kotlin-microservice/openshift
$ ./run.sh
ShellSession

The PipelineRun object is responsible for more than just starting a pipeline. We can also use it to see the history of runs, with detailed logs generated by each task. However, there is another side of the coin: the more times we run the pipeline, the more objects we store on the Kubernetes cluster.

tekton-kubernetes-openshift-pipelines

Tekton creates a dedicated PVC per PipelineRun. Such a PVC exists on Kubernetes until we delete the parent PipelineRun.

Pruning Old Pipeline Runs

I just ran the sample-pipeline six times using different feature-* branches. However, you can imagine that there could be many more previous runs, resulting in many PipelineRun and PersistentVolumeClaim objects on the Kubernetes cluster. Fortunately, Tekton provides an automatic mechanism for removing objects from previous runs. It installs a global CronJob responsible for pruning PipelineRun objects. We can override the default CronJob configuration in the TektonConfig CRD. I’ll change the CronJob execution frequency from one day to 10 minutes for testing purposes.

apiVersion: operator.tekton.dev/v1alpha1
kind: TektonConfig
metadata:
  name: config
spec:
  # other properties ...
  pruner:
    disabled: false
    keep: 100
    resources:
      - pipelinerun
    schedule: '*/10 * * * *'
YAML

We can customize the behavior of the Tekton pruner per namespace. Thanks to that, it is possible to set different configurations, e.g., for the “production” and “development” pipelines. To do that, we need to annotate the namespace with some Tekton parameters. For example, instead of keeping a specific number of previous pipeline runs, we can use a time criterion. The operator.tekton.dev/prune.keep-since annotation allows us to retain resources based on their age. Let’s set it to 1 hour; the annotation requires the time in minutes, so the value is 60. We will also override the default pruning strategy to keep-since, which enables removal by age.

kind: Namespace
apiVersion: v1
metadata:
  name: tekton-demo
  annotations:
    operator.tekton.dev/prune.keep-since: "60"
    operator.tekton.dev/prune.strategy: "keep-since"
spec: {}
YAML

The CronJob exists in the Tekton operator installation namespace.

$ kubectl get cj -n openshift-pipelines
NAME                           SCHEDULE       SUSPEND   ACTIVE   LAST SCHEDULE   AGE
tekton-resource-pruner-ksdkj   */10 * * * *   False     0        9m44s           24m
ShellSession

As you can see, the job runs every ten minutes.

$ kubectl get job -n openshift-pipelines
NAME                                    COMPLETIONS   DURATION   AGE
tekton-resource-pruner-ksdkj-28524850   1/1           5s         11m
tekton-resource-pruner-ksdkj-28524860   1/1           5s         75s
ShellSession

There are no PipelineRun objects older than 1 hour in the tekton-demo namespace.

$ kubectl get pipelinerun -n tekton-demo
NAME                        SUCCEEDED   REASON      STARTTIME   COMPLETIONTIME
sample-pipeline-run-2m4rq   True        Succeeded   55m         51m
sample-pipeline-run-4gjqw   True        Succeeded   55m         53m
sample-pipeline-run-5sxcf   True        Succeeded   55m         51m
sample-pipeline-run-667mb   True        Succeeded   34m         30m
sample-pipeline-run-6jqvl   True        Succeeded   34m         32m
sample-pipeline-run-8slfx   True        Succeeded   34m         31m
sample-pipeline-run-bvjq6   True        Succeeded   34m         30m
sample-pipeline-run-d87kn   True        Succeeded   55m         51m
sample-pipeline-run-lrvm2   True        Succeeded   34m         30m
sample-pipeline-run-tx4hl   True        Succeeded   55m         51m
sample-pipeline-run-w5cq8   True        Succeeded   55m         52m
sample-pipeline-run-wn2xx   True        Succeeded   34m         30m
ShellSession

This approach works fine. It minimizes the number of Kubernetes objects stored on the cluster. However, after removing the old objects, we can no longer access the full history of pipeline runs, and in some cases that history is useful. Can we do better? Yes! We can enable Tekton Results.

Using Tekton Results

Install and Configure Tekton Results

Tekton Results is a feature that allows us to archive the complete information for every pipeline run and task run. After pruning the old PipelineRun or TaskRun objects, we can still access the full history through the Tekton Results API. It archives all the required information as results and records stored in a database. Before we enable it, we need to prepare several things. In the first step, we need to generate a certificate for exposing the Tekton Results REST API over HTTPS. Let's generate a public/private key pair with the following openssl command:

$ openssl req -x509 \
    -newkey rsa:4096 \
    -keyout key.pem \
    -out cert.pem \
    -days 365 \
    -nodes \
    -subj "/CN=tekton-results-api-service.openshift-pipelines.svc.cluster.local" \
    -addext "subjectAltName = DNS:tekton-results-api-service.openshift-pipelines.svc.cluster.local"
ShellSession
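A certificate whose CN and SAN don't match the in-cluster DNS name of the Results API Service is a common cause of TLS verification failures later on. As a quick sanity check, you can inspect the SAN extension of the generated certificate. The sketch below is self-contained (it regenerates the key pair with the same parameters as above), so adjust the file names if you want to keep existing ones:

```shell
# Generate the key pair and verify that the certificate carries the
# expected Subject Alternative Name for the Results API Service.
HOST=tekton-results-api-service.openshift-pipelines.svc.cluster.local
openssl req -x509 -newkey rsa:4096 -keyout key.pem -out cert.pem \
    -days 365 -nodes -subj "/CN=${HOST}" \
    -addext "subjectAltName = DNS:${HOST}"
openssl x509 -in cert.pem -noout -text | grep "DNS:${HOST}"
```

If the final grep prints nothing, the SAN is missing and the Results API clients will reject the certificate.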

Then, we can use the key.pem and cert.pem files to create the Kubernetes TLS Secret in the Tekton operator namespace.

$ kubectl create secret tls tekton-results-tls \
    -n openshift-pipelines \
    --cert=cert.pem \
    --key=key.pem
ShellSession

We also need to generate credentials for the Postgres database in Kubernetes Secret form. By default, Tekton Results uses a PostgreSQL database to store data. We can choose between the external instance of that database or the instance managed by the Tekton operator. We will use the internal Postgres installed on our cluster.

$ kubectl create secret generic tekton-results-postgres \
    -n openshift-pipelines \
    --from-literal=POSTGRES_USER=result \
    --from-literal=POSTGRES_PASSWORD=$(openssl rand -base64 20)
ShellSession

Tekton Results requires a persistence volume for storing the logs from pipeline runs.

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: tekton-logs
  namespace: openshift-pipelines 
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
YAML

Finally, we can proceed to the main step. We need to create the TektonResult object. I won't get into the details of that object. You can just create it "as is" on your cluster.

apiVersion: operator.tekton.dev/v1alpha1
kind: TektonResult
metadata:
  name: result
spec:
  targetNamespace: openshift-pipelines
  logs_api: true
  log_level: debug
  db_port: 5432
  db_host: tekton-results-postgres-service.openshift-pipelines.svc.cluster.local
  logs_path: /logs
  logs_type: File
  logs_buffer_size: 32768
  auth_disable: true
  tls_hostname_override: tekton-results-api-service.openshift-pipelines.svc.cluster.local
  db_enable_auto_migration: true
  server_port: 8080
  prometheus_port: 9090
  logging_pvc_name: tekton-logs
YAML

Archive Pipeline Runs with Tekton Results

After applying the TektonResult object to the cluster, Tekton runs three additional pods in the openshift-pipelines namespace: a pod with a Postgres database, a pod with the Tekton Results API, and a watcher responsible for monitoring and archiving existing PipelineRun objects.

If you run Tekton on OpenShift, you will also see an additional "Overview" menu in the "Pipelines" section. It displays a summary of pipeline runs for the selected namespace.

tekton-kubernetes-overview

However, the best thing about this mechanism is that we can still access old pipeline runs with Tekton Results even though the PipelineRun objects have been deleted. Tekton Results integrates smoothly with the OpenShift Console. An archived pipeline run is marked with a special icon, as shown below. We can still access the logs and the results of every single task in that pipeline.

If we list the PipelineRun objects with kubectl, only the currently running one is returned. That's because all the previous runs were older than one hour, and thus they were removed by the pruner.

$ kubectl get pipelinerun
NAME                     SUCCEEDED   REASON      STARTTIME   COMPLETIONTIME
sample-pipeline-yiuqhf   Unknown     Running     30s
ShellSession

Consequently, there is also a single PersistentVolumeClaim object.

$ kubectl get pvc
NAME             STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS                           AGE
pvc-0f16a64031   Bound    pvc-ba6ea9ef-4281-4a39-983b-0379419076b0   1Gi        RWO            ocs-external-storagecluster-ceph-rbd   41s
ShellSession

Of course, we can still access the details and logs of archived pipeline runs via the OpenShift Console.

Final Thoughts

Tekton is a Kubernetes-native tool for CI/CD pipelines. This approach brings many advantages, but may also lead to some challenges. One of them is running pipelines at scale. In this article, I focused on showing you new Tekton features that address some concerns around intensive usage of pipelines. Features like pipeline run pruning and Tekton Results archiving work fine and integrate smoothly with e.g. the OpenShift Console. Tekton gradually adds new useful features, and it is becoming a really interesting alternative to more popular CI/CD tools like Jenkins, GitLab CI, or CircleCI.

The post Running Tekton Pipelines on Kubernetes at Scale appeared first on Piotr's TechBlog.

Java Flight Recorder on Kubernetes https://piotrminkowski.com/2024/02/13/java-flight-recorder-on-kubernetes/ https://piotrminkowski.com/2024/02/13/java-flight-recorder-on-kubernetes/#respond Tue, 13 Feb 2024 07:44:13 +0000 https://piotrminkowski.com/?p=14957 In this article, you will learn how to continuously monitor apps on Kubernetes with Java Flight Recorder and Cryostat. Java Flight Recorder (JFR) is a tool for collecting diagnostic and profiling data generated by the Java app. It is designed for use even in heavily loaded production environments since it causes almost no performance overhead. […]

The post Java Flight Recorder on Kubernetes appeared first on Piotr's TechBlog.

In this article, you will learn how to continuously monitor apps on Kubernetes with Java Flight Recorder and Cryostat. Java Flight Recorder (JFR) is a tool for collecting diagnostic and profiling data generated by the Java app. It is designed for use even in heavily loaded production environments since it causes almost no performance overhead. We can say that Java Flight Recorder acts similarly to an airplane’s black box. Even if the JVM crashes, we can analyze the diagnostic data collected just before the failure. This fact makes JFR especially usable in an environment with many running apps – like Kubernetes.

Assuming that we are running many Java apps on Kubernetes, we should be interested in a tool that helps automatically gather the data generated by Java Flight Recorder. Here comes Cryostat. It allows us to securely manage JFR recordings for containerized Java workloads. With its built-in discovery mechanism, it can detect all the apps that expose JFR data. Depending on the use case, we can store and analyze recordings directly on the Kubernetes cluster using the Cryostat Dashboard, or export the recorded data to perform a more in-depth analysis.

If you are interested in more topics related to Java apps on Kubernetes, you can take a look at some other posts on my blog. The following article describes a list of best practices for running Java apps on Kubernetes. You can also read e.g. how to resize the CPU limit to speed up Java startup on Kubernetes here.

Source Code

If you would like to try it yourself, you can always take a look at my source code. In order to do that, you need to clone my GitHub repository. Then you need to go to the callme-service directory. After that, just follow my instructions. Let's begin.

Install Cryostat on Kubernetes

In the first step, we install Cryostat on Kubernetes using its operator. In order to use and manage operators on Kubernetes, we should have the Operator Lifecycle Manager (OLM) installed on the cluster. The operator-sdk binary provides a command to easily install and uninstall OLM:

$ operator-sdk olm install

Alternatively, you can use Helm chart for Cryostat installation on Kubernetes. Firstly, let’s add the following repository:
$ helm repo add openshift https://charts.openshift.io/

Then, install the chart with the following command:
$ helm install my-cryostat openshift/cryostat --version 0.4.0

Once OLM is running on our cluster, we can proceed to the Cryostat installation. We can find the required YAML manifest with the Subscription declaration in the Operator Hub. Let's just apply the manifest to the cluster with the following command:

$ kubectl create -f https://operatorhub.io/install/cryostat-operator.yaml

By default, this operator will be installed in the operators namespace and will be usable from all namespaces in the cluster. After installation, we can verify if the operator works fine by executing the following command:

$ kubectl get csv -n operators

In order to simplify the Cryostat installation process, we can use OpenShift. With OpenShift we don’t need to install OLM, since it is already there. We just need to find the “Red Hat build of Cryostat” operator in the Operator Hub and install it using OpenShift Console. By default, the operator is available in the openshift-operators namespace.

Then, let’s create a namespace dedicated to running Cryostat and our sample app. The name of the namespace is demo-jfr.

$ kubectl create ns demo-jfr

Cryostat recommends using cert-manager for traffic encryption. In our exercise, we disable that integration for simplicity. However, in a production environment, you should install cert-manager unless you use another solution for encrypting traffic. In order to run Cryostat in the selected namespace, we need to create the Cryostat object. The parameter spec.enableCertManager should be set to false.

apiVersion: operator.cryostat.io/v1beta1
kind: Cryostat
metadata:
  name: cryostat-sample
  namespace: demo-jfr
spec:
  enableCertManager: false
  eventTemplates: []
  minimal: false
  reportOptions:
    replicas: 0
  storageOptions:
    pvc:
      annotations: {}
      labels: {}
      spec: {}
  trustedCertSecrets: []

If everything goes fine, you should see the following pod in the demo-jfr namespace:

$ kubectl get po -n demo-jfr
NAME                               READY   STATUS    RESTARTS   AGE
cryostat-sample-5c57c9b8b8-smzx9   3/3     Running   0          60s

Here’s a list of Kubernetes Services. The Cryostat Dashboard is exposed by the cryostat-sample Service under the 8181 port.

$ kubectl get svc -n demo-jfr
NAME                      TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)             AGE
cryostat-sample           ClusterIP   172.31.56.83    <none>        8181/TCP,9091/TCP   70m
cryostat-sample-grafana   ClusterIP   172.31.155.26   <none>        3000/TCP            70m

We can access the Cryostat dashboard using the Kubernetes Ingress or OpenShift Route. Currently, there are no apps to monitor.

Create Sample Java App

We build a sample Java app using the Spring Boot framework. Our app exposes a single REST endpoint. As you can see, the endpoint implementation is very simple. The pingWithRandomDelay() method adds a random delay between 0 and 3 seconds and returns a string. However, there is one interesting thing inside that method. We create the ProcessingEvent object (1). Then, we call its begin method just before putting the thread to sleep (2). After the method resumes, we call the commit method on the ProcessingEvent object (3). In this inconspicuous way, we generate our first custom JFR event. This event aims to monitor the processing time of our method.

@RestController
@RequestMapping("/callme")
public class CallmeController {

   private static final Logger LOGGER = LoggerFactory.getLogger(CallmeController.class);

   private final Random random = new Random();
   private final AtomicInteger index = new AtomicInteger();

   // Build metadata injected by Spring Boot; empty if build info is not generated
   @Autowired
   private Optional<BuildProperties> buildProperties;

   @Value("${VERSION}")
   private String version;

   @GetMapping("/ping-with-random-delay")
   public String pingWithRandomDelay() throws InterruptedException {
      int r = random.nextInt(3000);
      int i = index.incrementAndGet();
      ProcessingEvent event = new ProcessingEvent(i); // (1)
      event.begin(); // (2)
      LOGGER.info("Ping with random delay: id={}, name={}, version={}, delay={}", i,
             buildProperties.isPresent() ? buildProperties.get().getName() : "callme-service", version, r);
      Thread.sleep(r);
      event.commit(); // (3)
      return "I'm callme-service " + version;
   }

}

Let’s switch to the ProcessingEvent implementation. Our custom event needs to extend the jdk.jfr.Event abstract class. It contains a single parameter id. We can use some additional labels to improve the event presentation in the JFR graphical tools. The event will be visible under the name set in the @Name annotation and under the category set in the @Category annotation. We also need to annotate the parameter @Label to make it visible as part of the event.

@Name("ProcessingEvent")
@Category("Custom Events")
@Label("Processing Time")
public class ProcessingEvent extends Event {
    @Label("Event ID")
    private Integer id;

    public ProcessingEvent(Integer id) {
        this.id = id;
    }

    public Integer getId() {
        return id;
    }

    public void setId(Integer id) {
        this.id = id;
    }
}

Of course, our app will generate a lot of standard JFR events useful for profiling and monitoring. But we could also monitor our custom event.

Build App Image and Deploy on Kubernetes

Once we finish the implementation, we can build a container image of our Spring Boot app. Spring Boot comes with a feature for building container images based on Cloud Native Buildpacks. In the Maven pom.xml, you will find a dedicated profile with the build-image id. Once you activate that profile, it builds the image using the Paketo builder-jammy-base image.

<profile>
  <id>build-image</id>
  <build>
    <plugins>
      <plugin>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-maven-plugin</artifactId>
        <configuration>
          <image>
            <builder>paketobuildpacks/builder-jammy-base:latest</builder>
            <name>piomin/${project.artifactId}:${project.version}</name>
          </image>
        </configuration>
        <executions>
          <execution>
            <goals>
              <goal>build-image</goal>
            </goals>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>
</profile>

Before running the build we should start Docker on the local machine. After that, we should execute the following Maven command:

$ mvn clean package -Pbuild-image -DskipTests

With the build-image profile activated, Spring Boot Maven Plugin builds the image of our app. You should have a similar result as shown below. In my case, the image tag is piomin/callme-service:1.2.1.

By default, Paketo Java Buildpacks use the BellSoft Liberica JDK. With the Paketo BellSoft Liberica Buildpack, we can easily enable Java Flight Recorder for the container using the BPL_JFR_ENABLED environment variable. In order to expose data for Cryostat, we also need to enable the JMX port. In theory, we could use the BPL_JMX_ENABLED and BPL_JMX_PORT environment variables for that. However, that option adds some extra configuration to the java command parameters that breaks Cryostat discovery. This issue has already been described here. Therefore, we will use the JAVA_TOOL_OPTIONS environment variable to set the required JVM parameters directly on the running command.

Instead of exposing the JMX port for discovery, we can include the Cryostat agent in the app dependencies. In that case, we should set the address of the Cryostat API in the Kubernetes Deployment manifest. However, I prefer an approach that doesn’t require any changes on the app side.
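For reference, an agent-based setup might look roughly like the fragment below. This is only a sketch: the CRYOSTAT_AGENT_* environment variable names and the port values are assumptions based on the Cryostat agent documentation, so verify them against the documentation for your Cryostat version before use.

```yaml
# Hypothetical env fragment of a Deployment container using the Cryostat
# agent (added as an app dependency) instead of exposing JMX.
env:
  - name: CRYOSTAT_AGENT_BASEURI        # address of the Cryostat API
    value: "https://cryostat-sample.demo-jfr.svc:8181"
  - name: CRYOSTAT_AGENT_CALLBACK       # URL on which Cryostat calls the agent back
    value: "http://callme-service.demo-jfr.svc:9977"
```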

Now, let’s back to the Cryostat app discovery. Cryostat is able to automatically detect pods with a JMX port exposed. It requires the concrete configuration of the Kubernetes Service. We need to set the name of the port to jfr-jmx. In theory, we can expose JMX on any port we want, but for me anything other than 9091 caused discovery problems on Cryostat. In the Deployment definition, we have to set the BPL_JFR_ENABLED env to true, and the JAVA_TOOL_OPTIONS to -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.port=9091.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: callme-service
spec:
  replicas: 1
  selector:
    matchLabels:
      app: callme-service
  template:
    metadata:
      labels:
        app: callme-service
    spec:
      containers:
        - name: callme-service
          image: piomin/callme-service:1.2.1
          imagePullPolicy: IfNotPresent
          ports:
            - containerPort: 8080
            - containerPort: 9091
          env:
            - name: VERSION
              value: "v1"
            - name: BPL_JFR_ENABLED
              value: "true"
            - name: JAVA_TOOL_OPTIONS
              value: "-Dcom.sun.management.jmxremote.port=9091 -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false"
---
apiVersion: v1
kind: Service
metadata:
  name: callme-service
  labels:
    app: callme-service
spec:
  type: ClusterIP
  ports:
  - port: 8080
    name: http
  - port: 9091
    name: jfr-jmx
  selector:
    app: callme-service

Let’s apply our deployment manifest to the demo-jfr namespace:

$ kubectl apply -f k8s/deployment-jfr.yaml -n demo-jfr

Here’s a list of pods of our callme-service app:

$ kubectl get po -n demo-jfr -l app=callme-service -o wide
NAME                              READY   STATUS    RESTARTS   AGE   IP            NODE
callme-service-6bc5745885-kvqfr   1/1     Running   0          31m   10.134.0.29   worker-cluster-lvsqq-1

Using Cryostat with JFR

View Default Dashboards

Cryostat automatically detects all the pods behind Kubernetes Services that expose the JMX port. Once we switch to the Cryostat Dashboard, we will see the name of our pod in the "Target" dropdown. The default dashboard shows diagrams illustrating CPU load, heap memory usage, and the number of running Java threads.

java-flight-recorder-kubernetes-dashboard

Then, we can go to the "Recordings" section. It shows a list of active recordings made by Java Flight Recorder for our app running on Kubernetes. By default, Cryostat creates and starts a single recording for each detected target.

We can expand the selected recording to see a detailed view. It provides a summary panel divided into several categories like heap, memory leak, or exceptions. It highlights warnings in yellow and problems in red.

java-flight-recorder-kubernetes-panel

We can display a detailed description of each case. We just need to click on the selected field with a problem name. The detailed description will appear in the context menu.

java-flight-recorder-kubernetes-description

Create and Use a Custom Event Template

We can create a custom recording strategy by defining a new event template. Firstly, we need to go to the "Events" section, and then to the "Event Templates" tab. There are three built-in templates. We can use any of them as a base for our custom template. After deciding which one to choose, we can download it to our laptop. The default file extension is *.jfc.

java-flight-recorder-kubernetes-event-templates

In order to edit *.jfc files, we need a special tool called JDK Mission Control. Each vendor provides such a tool for their distribution of the JDK. In our case, it is BellSoft Liberica. Once we download and install Liberica Mission Control on the laptop, we should go to Window -> Flight Recording Template Manager.

java-flight-recorder-kubernetes-mission-control

With the Flight Recording Template Manager, we can import and edit an exported event template. I chose higher monitoring levels for "Garbage Collection", "Allocation Profiling", "Compiler", and "Thread Dump".

java-flight-recorder-kubernetes-template-manager

Once a new template is ready, we should save it under the selected name. For me, it is the “Continuous Detailed” name. After that, we need to export the template to the file.

Then, we need to switch to the Cryostat Dashboard. We have to import the newly created template exported to the *.jfc file.

Once you import the template, you should see a new strategy in the “Event Templates” section.

We can create a recording based on our custom "Continuous_Detailed" template. After some time, Cryostat should gather data generated by the Java Flight Recorder for the app running on Kubernetes. However, this time we want to perform a more advanced analysis using Liberica Mission Control rather than just the Cryostat Dashboard. Therefore, we will export the recording to a *.jfr file. Such a file may then be imported into the JDK Mission Control tool.

Use the JDK Mission Control Tool

Let’s open the exported *.jfr file with Liberica Mission Control. Once we do it, we can analyze all the important aspects related to the performance of our Java app. We can display a table with memory allocation per the object type.

We can display a list of running Java threads.

Finally, we go to the “Event Browser” section. In the “Custom Events” category we should find our custom event under the name determined by the @Label annotation on the ProcessingEvent class. We can see the history of all generated JFR events together with the duration, start time, and the name of the processing thread.

Final Thoughts

Cryostat helps you manage Java Flight Recorder on Kubernetes at scale. It provides a graphical dashboard that allows monitoring of all the Java workloads that expose JFR data over JMX. The important thing is that even after an app crash, we can export the archived monitoring report and analyze it using advanced tools like JDK Mission Control.

The post Java Flight Recorder on Kubernetes appeared first on Piotr's TechBlog.

Testing Java Apps on Kubernetes with Testkube https://piotrminkowski.com/2023/11/27/testing-java-apps-on-kubernetes-with-testkube/ https://piotrminkowski.com/2023/11/27/testing-java-apps-on-kubernetes-with-testkube/#respond Mon, 27 Nov 2023 09:32:12 +0000 https://piotrminkowski.com/?p=14684 In this article, you will learn how to test Java apps on Kubernetes with Testkube automatically. We will build the tests for the typical Spring REST-based app. In the first scenario, Testkube runs the JUnit tests using its Maven support. After that, we will run the load tests against the running instance of our app […]

The post Testing Java Apps on Kubernetes with Testkube appeared first on Piotr's TechBlog.

In this article, you will learn how to automatically test Java apps on Kubernetes with Testkube. We will build the tests for a typical Spring REST-based app. In the first scenario, Testkube runs the JUnit tests using its Maven support. After that, we will run load tests against the running instance of our app using the Grafana k6 tool. Once again, Testkube provides a standard mechanism for that, no matter which tool we use for testing.

If you are interested in testing on Kubernetes you can also read my article about integration tests with JUnit. There is also a post about contract testing on Kubernetes with Microcks available here.

Introduction

Testkube is a Kubernetes-native test orchestration and execution framework. It allows us to run automated tests inside the Kubernetes cluster. It supports several popular testing and build tools like JMeter, Grafana k6, and Maven. We can easily integrate it with CI/CD pipelines or GitOps workflows. We can manage Testkube by using the CRD objects directly, with the CLI, or through the UI dashboard. Let's check how it works.

Source Code

If you would like to try it by yourself, you may always take a look at my source code. In order to do that you need to clone my GitHub repository. It contains only a single app. Once you clone it you can go to the src/test directory. You will find there both the JUnit tests written in Java and the k6 tests written in JavaScript. After that, you should just follow my instructions. Let’s begin.

Run Testkube on Kubernetes

In the first step, we are going to install Testkube on Kubernetes using its Helm chart. Let's add the kubeshop Helm repository and fetch the latest chart info:

$ helm repo add kubeshop https://kubeshop.github.io/helm-charts
$ helm repo update

Then, we can install Testkube in the testkube namespace by executing the following helm command:

$ helm install testkube kubeshop/testkube \
    --create-namespace --namespace testkube

This will add custom resource definitions (CRD), RBAC roles, and role bindings to the Kubernetes cluster. This installation requires having cluster administrative rights.

Once the installation is finished, we can verify the list of pods running in the testkube namespace. The testkube-api-server and testkube-dashboard are the most important components. However, there are also some additional tools installed, like a Mongo database or Minio.

$ oc get po -n testkube
NAME                                                    READY   STATUS    RESTARTS        AGE
testkube-api-server-d4d7f9f8b-xpxc9                     1/1     Running   1 (6h17m ago)   6h18m
testkube-dashboard-64578877c7-xghsz                     1/1     Running   0               6h18m
testkube-minio-testkube-586877d8dd-8pmmj                1/1     Running   0               6h18m
testkube-mongodb-dfd8c7878-wzkbp                        1/1     Running   0               6h18m
testkube-nats-0                                         3/3     Running   0               6h18m
testkube-nats-box-567d94459d-6gc4d                      1/1     Running   0               6h18m
testkube-operator-controller-manager-679b998f58-2sv2x   2/2     Running   0               6h18m

We can also install the testkube CLI on our laptop. It is not required, but we will use it during the exercise to try the full spectrum of options. You can find the CLI installation instructions here. I'm installing it on macOS:

$ brew install testkube

Once the installation is finished, you can run the testkube version command to see that warm “Hello” screen 🙂

testkube-kubernetes-cli

Run Maven Tests with Testkube

Firstly, let’s take a look at the JUnit tests inside our sample Spring Boot app. We are using the TestRestTemplate bean to call all the exposed REST endpoints exposed. There are three JUnit tests for testing adding, getting, and removing the Person objects.

@SpringBootTest(webEnvironment = SpringBootTest.WebEnvironment.RANDOM_PORT)
@TestMethodOrder(MethodOrderer.OrderAnnotation::class)
class PersonControllerTests {

   @Autowired
   lateinit var template: TestRestTemplate

   @Test
   @Order(1)
   fun shouldAddPerson() {
      var person = Instancio.of(Person::class.java)
         .ignore(Select.field("id"))
         .create()
      person = template
         .postForObject("/persons", person, Person::class.java)
      Assertions.assertNotNull(person)
      Assertions.assertNotNull(person.id)
      Assertions.assertEquals(1001, person.id)
   }

   @Test
   @Order(2)
   fun shouldUpdatePerson() {
      var person = Instancio.of(Person::class.java)
         .set(Select.field("id"), 1)
         .create()
      template.put("/persons", person)
      var personRemote = template
         .getForObject("/persons/{id}", Person::class.java, 1)
      Assertions.assertNotNull(personRemote)
      Assertions.assertEquals(person.age, personRemote.age)
   }

   @Test
   @Order(3)
   fun shouldDeletePerson() {
      template.delete("/persons/{id}", 1)
      val person = template
         .getForObject("/persons/{id}", Person::class.java, 1)
      Assertions.assertNull(person)
   }

}

We are using Maven as a build tool. The current version of Spring Boot is 3.2.0. The version of JDK used for the compilation is 17. Here’s the fragment of our pom.xml in the repository root directory:

<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <parent>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-parent</artifactId>
        <version>3.2.0</version>
  </parent>
  <groupId>pl.piomin.services</groupId>
  <artifactId>sample-spring-kotlin-microservice</artifactId>
  <version>1.5.3</version>

  <properties>
    <java.version>17</java.version>
    <kotlin.version>1.9.21</kotlin.version>
  </properties>

  <dependencies>
    ...   
    <dependency>
      <groupId>org.springframework.boot</groupId>
      <artifactId>spring-boot-starter-test</artifactId>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>org.instancio</groupId>
      <artifactId>instancio-junit</artifactId>
      <version>3.6.0</version>
      <scope>test</scope>
    </dependency>
  </dependencies>
</project>

Testkube provides the Executor CRD for defining a way of running each test. There are several default executors for each type of supported build or test tool. We can display a list of provided executors by running the testkube get executor command. You will see the list of all tools supported by Testkube. Of course, the most interesting executors for us are k6-executor and maven-executor.

$ testkube get executor

Context:  (1.16.8)   Namespace: testkube
----------------------------------------

  NAME                 | URI | LABELS
-----------------------+-----+-----------------------------------
  artillery-executor   |     |
  curl-executor        |     |
  cypress-executor     |     |
  ginkgo-executor      |     |
  gradle-executor      |     |
  jmeter-executor      |     |
  jmeterd-executor     |     |
  k6-executor          |     |
  kubepug-executor     |     |
  maven-executor       |     |
  playwright-executor  |     |
  postman-executor     |     |
  soapui-executor      |     |
  tracetest-executor   |     |
  zap-executor         |     |

By default, maven-executor uses JDK 11 for running Maven tests. Moreover, Testkube still doesn’t provide images for running tests against JDK 19+. For me, this is quite a big drawback, since the latest LTS version of Java is 21. The maven-executor-jdk17 Executor defined below contains the name of the image to run (1) and a list of supported test types (2).

apiVersion: executor.testkube.io/v1
kind: Executor
metadata:
  name: maven-executor-jdk17
  namespace: testkube
spec:
  args:
    - '--settings'
    - <settingsFile>
    - <goalName>
    - '-Duser.home'
    - <mavenHome>
  command:
    - mvn
  content_types:
    - git-dir
    - git
  executor_type: job
  features:
    - artifacts
  # (1)
  image: kubeshop/testkube-maven-executor:jdk17 
  meta:
    docsURI: https://kubeshop.github.io/testkube/test-types/executor-maven
    iconURI: maven
  # (2)
  types:
    - maven:jdk17/project
    - maven:jdk17/test
    - maven:jdk17/integration-test

Finally, we just need to define the Test object that references maven-executor-jdk17 via the type parameter. Of course, we also need to set the address of the Git repository and the name of the branch.

apiVersion: tests.testkube.io/v3
kind: Test
metadata:
  name: sample-spring-kotlin
  namespace: testkube
spec:
  content:
    repository:
      branch: master
      type: git
      uri: https://github.com/piomin/sample-spring-kotlin-microservice.git
    type: git
  type: maven:jdk17/test

Finally, we can run the sample-spring-kotlin test using the following command:

$ testkube run test sample-spring-kotlin

Using UI Dashboard

First of all, let’s expose the Testkube UI dashboard on a local port. The dashboard also requires a connection to the testkube-api-server from the web browser. After exposing the dashboard with the following port-forward commands, we can access it at the http://localhost:8080 address:

$ kubectl port-forward svc/testkube-dashboard 8080 -n testkube
$ kubectl port-forward svc/testkube-api-server 8088 -n testkube

Once we access the Testkube dashboard we will see a list of all defined tests:

testkube-kubernetes-ui

Then, we can click the selected test tile to see the details. You will be redirected to the history of previous executions available in the “Recent executions” tab. There are six previous executions of our sample-spring-kotlin test. Two of them finished successfully, while the other four failed.

Let’s take a look at the logs of the last execution. As you can see, all three JUnit tests were successful.

testkube-kubernetes-test-logs

Run Load Tests with Testkube and Grafana k6

In this section, we will create tests for the instance of our sample app running on Kubernetes. So, in the first step, we need to deploy the app. Here’s the Deployment manifest. We can apply it to the default namespace. The manifest uses the latest image of the sample app available in the registry at quay.io/pminkows/sample-kotlin-spring:1.5.3.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: sample-kotlin-spring
  labels:
    app: sample-kotlin-spring
spec:
  replicas: 1
  selector:
    matchLabels:
      app: sample-kotlin-spring
  template:
    metadata:
      labels:
        app: sample-kotlin-spring
    spec:
      containers:
      - name: sample-kotlin-spring
        image: quay.io/pminkows/sample-kotlin-spring:1.5.3
        ports:
        - containerPort: 8080

Let’s also create the Kubernetes Service that exposes app pods internally:

apiVersion: v1
kind: Service
metadata:
  name: sample-kotlin-spring
spec:
  selector:
    app: sample-kotlin-spring
  ports:
    - protocol: TCP
      port: 8080
      targetPort: 8080

After that, we can proceed to the Test manifest. This time, we don’t have to override the default executor, since the k6 version doesn’t matter here. The test source is located inside the sample Git repository in the src/test/resources/k6/load-tests-get.js file (1) in the master branch. In that case, the repository type is git (2). The k6 test should run for 10 seconds using 5 concurrent virtual users (3). We also need to set the address of the target service in the PERSONS_URI environment variable (4). Of course, we are testing through the Kubernetes Service visible internally at the sample-kotlin-spring.default.svc host and port 8080. The type of the test is k6/script (5).

apiVersion: tests.testkube.io/v3
kind: Test
metadata:
  labels:
    executor: k6-executor
    test-type: k6-script
  name: load-tests-gets
  namespace: testkube
spec:
  content:
    repository:
      branch: master
      # (1)
      path: src/test/resources/k6/load-tests-get.js
      # (2) 
      type: git
      uri: https://github.com/piomin/sample-spring-kotlin-microservice.git
    type: git
  executionRequest:
    # (3)
    args:
      - '-u'
      - '5'
      - '-d'
      - 10s
    # (4)
    variables:
      PERSONS_URI:
        name: PERSONS_URI
        type: basic
        value: http://sample-kotlin-spring.default.svc:8080
        valueFrom: {}
  # (5)
  type: k6/script

Let’s take a look at the k6 test file written in JavaScript. As I mentioned before, you can find it in the src/test/resources/k6/load-tests-get.js file. The test calls the GET /persons/{id} endpoint. It sets a random number between 1 and 1000 as the id path parameter and reads the target service URL from the PERSONS_URI environment variable.

import http from 'k6/http';
import { check } from 'k6';
import { randomIntBetween } from 'https://jslib.k6.io/k6-utils/1.2.0/index.js';

export default function () {
  const id = randomIntBetween(1, 1000);
  const res = http.get(`${__ENV.PERSONS_URI}/persons/${id}`);
  check(res, {
    'is status 200': (res) => res.status === 200,
    'body size is > 0': (r) => r.body.length > 0,
  });
}
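If you’d like to sanity-check the same endpoint without k6, the logic can be mirrored in plain Java. This is just an illustrative sketch (the class and helper names are mine, not part of the sample repository); the commented-out HttpClient call shows how the actual request would be made:

```java
import java.net.URI;
import java.util.concurrent.ThreadLocalRandom;

public class PersonsSmokeTest {

    // Mirrors the k6 script: a random id between 1 and 1000 appended to the base URI.
    static URI personUri(String baseUri, int id) {
        return URI.create(baseUri + "/persons/" + id);
    }

    static int randomId() {
        // nextInt upper bound is exclusive, so 1001 yields 1..1000 inclusive
        return ThreadLocalRandom.current().nextInt(1, 1001);
    }

    public static void main(String[] args) {
        // Same environment variable as the k6 script reads via __ENV.PERSONS_URI
        String base = System.getenv().getOrDefault("PERSONS_URI", "http://localhost:8080");
        URI uri = personUri(base, randomId());
        System.out.println("GET " + uri);
        // To actually call the endpoint, use java.net.http.HttpClient, e.g.:
        // HttpResponse<String> res = HttpClient.newHttpClient()
        //     .send(HttpRequest.newBuilder(uri).GET().build(), HttpResponse.BodyHandlers.ofString());
        // and check res.statusCode() == 200 and !res.body().isEmpty(),
        // just like the two k6 checks above.
    }
}
```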

Finally, we can run the load-tests-gets test with the following command:

$ testkube run test load-tests-gets

The same as for the Maven test, we can verify the execution history in the Testkube dashboard:

We can also display all the logs from the test:

Final Thoughts

Testkube provides a unified way to run tests on Kubernetes for the several most popular testing tools. It may be a part of your CI/CD pipeline or a GitOps process. Honestly, I’m still not fully convinced that I need a dedicated Kubernetes-native solution for automated tests instead of, for example, a stage in my pipeline that runs test commands. However, you can also use Testkube to execute load or integration tests against an app running on Kubernetes, and it is possible to schedule them periodically. Thanks to that, you can verify your apps continuously using a single, central tool.

The post Testing Java Apps on Kubernetes with Testkube appeared first on Piotr's TechBlog.

]]>
https://piotrminkowski.com/2023/11/27/testing-java-apps-on-kubernetes-with-testkube/feed/ 0 14684
Speed Up Java Startup on Kubernetes with CRaC https://piotrminkowski.com/2023/09/05/speed-up-java-startup-on-kubernetes-with-crac/ https://piotrminkowski.com/2023/09/05/speed-up-java-startup-on-kubernetes-with-crac/#comments Tue, 05 Sep 2023 07:33:20 +0000 https://piotrminkowski.com/?p=14496 In this article, you will learn how to leverage CRaC to reduce Java startup time and configure it for the app running on Kubernetes. The OpenJDK Coordinated Restore at Checkpoint (CRaC) project was introduced by Azul in 2020. As you probably know, Azul is an organization famous for the OpenJDK distribution called Azul Zulu. Azul […]

The post Speed Up Java Startup on Kubernetes with CRaC appeared first on Piotr's TechBlog.

]]>
In this article, you will learn how to leverage CRaC to reduce Java startup time and configure it for an app running on Kubernetes. The OpenJDK Coordinated Restore at Checkpoint (CRaC) project was introduced by Azul in 2020. As you probably know, Azul is an organization famous for the OpenJDK distribution called Azul Zulu. Azul shipped an OpenJDK 17 distribution with built-in support for CRaC. Its aim is to drastically reduce the startup time and time to peak performance of Java apps. The Micronaut and Quarkus frameworks already support CRaC, while Spring Framework has announced support coming in November 2023.

What’s the idea behind CRaC? In fact, it is a pretty simple concept. CRaC takes a memory snapshot of the app at runtime and then restores it in later executions. It is based on the Linux feature called Checkpoint/Restore In Userspace (CRIU). Unfortunately, there is no CRIU equivalent for Windows or macOS, so currently you can use CRaC only on Linux. In our case, this is not a problem, since we are going to build a container from the Azul Zulu OpenJDK image and then run it on Kubernetes. However, before we do that, let’s analyze the steps required to achieve the checkpoint/restore mechanism with CRaC.

Some time ago I published the article Which JDK to Choose on Kubernetes, in which I compared the most popular JDK implementations. There were no significant differences between them in my tests on Kubernetes. So, features like CRaC can make a real difference for Java on Kubernetes.

How It Works

For this part of our exercise, let’s assume we have already installed Azul Zulu OpenJDK, we are running Linux, and we have an app supporting CRaC (for me the second point doesn’t work since I have macOS :)). The first step is to run our app with the -XX:CRaCCheckpointTo parameter. It enables CRaC and indicates the location of the snapshot:

$ java -XX:CRaCCheckpointTo=/crac-files -jar target/sample-app.jar

Once our app is running, we can run the following command in another terminal:

$ jcmd target/sample-app.jar JDK.checkpoint

The jcmd command triggers app checkpoint creation. After a while, our snapshot is ready. We can go to the /crac-files directory and see the list of files. The directory structure won’t tell us much, but there is a file called dump4.log containing the logs from the operation. If the command finished successfully, we can go to the next step. In order to restore our image and run the app from its saved state, we need to run the following command:

$ java -XX:CRaCRestoreFrom=/crac-files

Your app should start much faster than before. The difference is significant: instead of seconds, startup may take just several milliseconds.
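Before a checkpoint can succeed, the app has to release state that cannot be stored in the snapshot (open sockets, file descriptors) and reacquire it after restore. The CRaC API exposes this through Resource callbacks registered via Core.getGlobalContext().register(...) in the org.crac package. The sketch below declares a minimal stand-in interface so it compiles without that dependency; the callback names mirror the real API, while everything else is illustrative.

```java
// Minimal stand-in for org.crac.Resource so this sketch is self-contained.
// With the real library you would instead do:
//   import org.crac.*;
//   Core.getGlobalContext().register(resource);
// and the callbacks would receive a Context<? extends Resource> argument.
interface CheckpointResource {
    void beforeCheckpoint() throws Exception; // close sockets/files here
    void afterRestore() throws Exception;     // reopen them here
}

public class CracResourceSketch implements CheckpointResource {

    // Stands in for e.g. a pooled DB connection or a server socket
    private boolean connectionOpen = true;

    @Override
    public void beforeCheckpoint() {
        // Release resources that cannot be stored in the snapshot
        connectionOpen = false;
        System.out.println("beforeCheckpoint: resources released");
    }

    @Override
    public void afterRestore() {
        // Reacquire resources after the process is restored from the snapshot
        connectionOpen = true;
        System.out.println("afterRestore: resources reopened");
    }

    public boolean isConnectionOpen() {
        return connectionOpen;
    }

    public static void main(String[] args) {
        CracResourceSketch r = new CracResourceSketch();
        r.beforeCheckpoint();
        r.afterRestore();
    }
}
```

Frameworks with CRaC support (and the CRaC-enabled Tomcat build used later in this article) implement exactly these callbacks for you, which is why a plain Spring Boot app fails at checkpoint time while a patched one succeeds.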

Source Code

If you would like to try it yourself, you can always take a look at my source code. In order to do that, you need to clone my GitHub repository. The sample Spring Boot app for the current exercise is available inside the callme-service directory. You can go to that directory and then just follow my instructions 🙂

Enable CRaC for Spring Boot

As I mentioned before, Spring Boot doesn’t currently support CRaC. That will probably change in November, but for now, let’s see what it means in practice. If we run a standard Spring Boot app and then execute the jcmd command for creating a checkpoint, we will see something similar to the following result:

jdk.crac.impl.CheckpointOpenSocketException: tcp6 localAddr :: localPort 8080 remoteAddr :: remotePort 0
        at java.base/jdk.crac.Core.translateJVMExceptions(Core.java:80)
        at java.base/jdk.crac.Core.checkpointRestore1(Core.java:137)
        at java.base/jdk.crac.Core.checkpointRestore(Core.java:177)
        at java.base/jdk.crac.Core.lambda$checkpointRestoreInternal$0(Core.java:194)
        at java.base/java.lang.Thread.run(Thread.java:832)

Fortunately, we can bypass this problem. In the Maven Central repository, there is a Tomcat Embed build that supports CRaC. We can include that dependency and replace the default tomcat-embed-core module used by the Spring Web project. Here’s the solution:

<dependency>
  <groupId>io.github.crac.org.apache.tomcat.embed</groupId>
  <artifactId>tomcat-embed-core</artifactId>
  <version>10.1.7</version>
</dependency>
<dependency>
  <groupId>org.springframework.boot</groupId>
  <artifactId>spring-boot-starter-web</artifactId>
  <exclusions>
    <exclusion>
      <groupId>org.apache.tomcat.embed</groupId>
      <artifactId>tomcat-embed-core</artifactId>
    </exclusion>
  </exclusions>
</dependency>

We have a pretty simple Spring Boot app. It exposes some REST endpoints, including the following one that returns the value of the VERSION environment variable.

@RestController
@RequestMapping("/callme")
public class CallmeController {

    private static final Logger LOGGER = LoggerFactory.getLogger(CallmeController.class);


    @Autowired
    Optional<BuildProperties> buildProperties;
    @Value("${VERSION}")
    private String version;

    @GetMapping("/ping")
    public String ping() {
        LOGGER.info("Ping: name={}, version={}", buildProperties.isPresent() ? buildProperties.get().getName() : "callme-service", version);
        return "I'm callme-service " + version;
    }

}

Once we replace the tomcat-embed-core dependency, we should rebuild the app. There is a custom Maven profile in my sample app code that activates the replacement of the tomcat-embed-core dependency. So remember to enable the crac profile during the build:

$ mvn clean package -Pcrac

Java with CRaC as Container on Kubernetes

In the first step, we need to prepare the image of our Java app. In order to do that, we create a Dockerfile in the app’s root directory. We will use the latest version of Azul Zulu Java 17 with CRaC support as the base image. Our image will contain the app uber JAR file and a single script for making the checkpoint.

FROM azul/zulu-openjdk:17-jdk-crac-latest
COPY target/callme-service-1.1.0.jar /app/callme-service-1.1.0.jar
COPY src/scripts/entrypoint.sh /app/entrypoint.sh
RUN chmod 755 /app/entrypoint.sh

Here’s the content of the entrypoint.sh script, which is copied to the target image in our Dockerfile. As you see, we run the jcmd command after starting the Java app. There is one important thing about CRaC that we need to mention here. Here’s a fragment from the CRaC documentation: “CRaC implementation creates the checkpoint only if the whole Java instance state can be stored in the image. Resources like open files or sockets cannot, so it is required to release them when the checkpoint is made.” As a result, the jcmd command will stop our Java process, so we should not kill the container/pod ourselves after that. If we run the script in this way after starting the container, it will first create a snapshot and then stop the pod after 10 seconds.

#!/bin/bash

java -XX:CRaCCheckpointTo=/crac -jar /app/callme-service-1.1.0.jar&
sleep 10
jcmd /app/callme-service-1.1.0.jar JDK.checkpoint
sleep 10
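The fixed `sleep 10` before `jcmd` is a simple heuristic: if the app needs longer to start, the checkpoint is taken too early. A more robust (hypothetical) variant polls the readiness endpoint before triggering the checkpoint. The `wait_for_ready` helper below is not part of the original script, and the URL is an assumption based on the Actuator probe configured later in this article:

```shell
# Hypothetical helper: poll a URL until it answers successfully or retries run out.
wait_for_ready() {
  local url="$1" retries="${2:-30}"
  local i=0
  until curl -sf "$url" > /dev/null; do
    i=$((i + 1))
    [ "$i" -ge "$retries" ] && return 1
    sleep 1
  done
  return 0
}

# Usage inside entrypoint.sh (commented out here, since it needs the running app):
# java -XX:CRaCCheckpointTo=/crac -jar /app/callme-service-1.1.0.jar &
# wait_for_ready http://localhost:8080/actuator/health/readiness && \
#   jcmd /app/callme-service-1.1.0.jar JDK.checkpoint
```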

Let’s build the image using the following command:

$ docker build -t callme-service:1.1.0 .

Now, let’s consider our scenario in the context of Kubernetes. First of all, we need to create the snapshot and save its state on disk. It is a one-time activity. Or, to be more precise, a one-time activity per release of the app. Therefore, we should perform it even before creating (or updating) the Deployment. Of course, we need to provide storage and assign it to the pod that creates the snapshot, as well as to all the pods that restore the app from the snapshot using CRaC. Let’s begin with the PersistentVolumeClaim definition:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: crac-store
  namespace: crac
spec:
  accessModes:
    - ReadWriteOnce
  volumeMode: Filesystem
  resources:
    requests:
      storage: 10Gi

In the next step, we will create a Kubernetes Job that performs the checkpoint operation. It will run our already built image (1) and then execute the entrypoint.sh script responsible for making the checkpoint (2). The CRaC checkpoint operation requires higher privileges, so we need to allow it in the securityContext section (3). We will also mount the crac-store PVC to the job under the /crac path (4).

apiVersion: batch/v1
kind: Job
metadata:
  name: callme-service-snapshot-job
  namespace: crac
spec:
  template:
    spec:
      containers:
        - name: callme-service
          image: callme-service:1.1.0 # (1)
          env:
            - name: VERSION
              value: "v1"
          command: ["/bin/sh","-c", "/app/entrypoint.sh"] # (2)
          volumeMounts:
            - mountPath: /crac
              name: crac
          securityContext:
            privileged: true # (3)
      volumes:
        - persistentVolumeClaim:
            claimName: crac-store # (4)
          name: crac
      restartPolicy: Never
  backoffLimit: 3

Let’s apply the job to the Kubernetes cluster:

$ kubectl apply -f job.yaml 

Kubernetes starts a single pod related to the Job. Once it changes its status to Completed, the checkpoint operation is finished.

$ kubectl get po -n crac
NAME                                READY   STATUS      RESTARTS   AGE
callme-service-snapshot-job-j7wkz   0/1     Completed   0          43s

Now, we can proceed with our app deployment. We will run three pods (1) of the app. We will use exactly the same image as before (2), but this time we run the java -XX:CRaCRestoreFrom=/crac command (3) instead of the entrypoint.sh script. In order to measure how much time the pod needs to become ready, we add a readinessProbe with the lowest possible periodSeconds (4). Thanks to that, we will be able to compare the startup time of the app with and without the CRaC mechanism enabled.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: callme-service
spec:
  replicas: 3 # (1)
  selector:
    matchLabels:
      app: callme-service
  template:
    metadata:
      labels:
        app: callme-service
    spec:
      containers:
        - name: callme-service
          image: callme-service:1.1.0 # (2)
          imagePullPolicy: IfNotPresent
          ports:
            - containerPort: 8080
          env:
            - name: VERSION
              value: "v1"
          command: ["java"] # (3)
          args: ["-XX:CRaCRestoreFrom=/crac"]
          volumeMounts:
            - mountPath: /crac
              name: crac
          readinessProbe: # (4)
            initialDelaySeconds: 0
            periodSeconds: 1
            httpGet:
              path: /actuator/health/readiness
              port: 8080
          securityContext:
            privileged: true
          resources:
            limits: 
              cpu: '1'
      volumes:
        - name: crac
          persistentVolumeClaim:
            claimName: crac-store

Let’s apply the Deployment to the Kubernetes cluster:

$ kubectl apply -f deployment-crac.yaml

Just to clarify – here’s the visualization of our scenario:

kubernetes-java-crac-arch

Finally, we can display a list of running callme-service pods.

$ kubectl get po -n crac
NAME                                READY   STATUS      RESTARTS   AGE
callme-service-6fb68cbd5b-5wz6x     1/1     Running     0          2m38s
callme-service-6fb68cbd5b-pds8c     1/1     Running     0          3m3s
callme-service-6fb68cbd5b-zbf6h     1/1     Running     0          2m18s

Compare the Startup Time of Pods

In order to compare the startup time of our app with and without CRaC we just need to replace the following single line in the Deployment manifest.

One thing is worth mentioning here. If I try to measure the startup of the Spring Boot app restored using CRaC, e.g. with the application.started.time metric, it always prints the value measured while the snapshot was being made. Here’s a fragment of the logs from that operation performed by the callme-service-snapshot-job Job.

So now, if I restore the app from the CRaC snapshot, the value returned by the GET /actuator/metrics/application.started.time endpoint is exactly the same. This is obviously not valid, but quite logical. Therefore, we will base our measurements on the times reported by Kubernetes. Since there is no direct statistic that shows the pod startup time, we need to calculate it as the difference between the time when the pod was scheduled and the time when it was reported as ready. Of course, such a calculation includes not only the app startup time, but also the time required for pod initialization and the readiness probe period (1s).
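The calculation described above can be scripted, for example in plain Java. The timestamps below are hypothetical values of the kind you would read from `kubectl get pod -o json` (the lastTransitionTime of the PodScheduled and Ready conditions):

```java
import java.time.Duration;
import java.time.Instant;

public class PodStartupTime {

    // Startup time approximated as: Ready transition time minus PodScheduled transition time.
    static Duration startupTime(String scheduledAt, String readyAt) {
        return Duration.between(Instant.parse(scheduledAt), Instant.parse(readyAt));
    }

    public static void main(String[] args) {
        // Hypothetical condition timestamps from the pod status
        String scheduled = "2023-09-05T07:33:20Z";
        String ready = "2023-09-05T07:33:23Z";
        System.out.println("startup: " + startupTime(scheduled, ready).getSeconds() + "s");
    }
}
```

As noted above, this number still includes pod initialization and up to one readiness probe period, so it is an upper bound on the pure Java startup time.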

kubernetes-java-crac-pod-status

What are the results? For a pod with a CPU limit equal to 1 core, the pod with the standard app starts in 14s (around 11-12s just for the Java process), while the pod restored with CRaC starts in 3s (~1s or less for the Java app).

Final Thoughts

CRaC can be treated as another way to achieve fast Java startup and warmup, besides the native compilation provided by GraalVM. GraalVM additionally solves the problem of a large memory footprint. However, it comes at a price, because with GraalVM there are more constraints and a potentially more painful troubleshooting process. On the other hand, with CRaC we need to create a snapshot image and store it on a persistent volume, so each time we need to mount a volume to the pod running on Kubernetes. Anyway, it is better to have one more option available.

The main goal of this article is to familiarize you with the CRaC approach and show how to adapt it to Java apps running on Kubernetes. If you are also interested in native compilation with GraalVM, you can read my post about Spring Boot native microservices with Knative. There is also an article about GraalVM and virtual threads on Kubernetes available here.

The post Speed Up Java Startup on Kubernetes with CRaC appeared first on Piotr's TechBlog.

]]>
https://piotrminkowski.com/2023/09/05/speed-up-java-startup-on-kubernetes-with-crac/feed/ 5 14496
Resize CPU Limit To Speed Up Java Startup on Kubernetes https://piotrminkowski.com/2023/08/22/resize-cpu-limit-to-speed-up-java-startup-on-kubernetes/ https://piotrminkowski.com/2023/08/22/resize-cpu-limit-to-speed-up-java-startup-on-kubernetes/#comments Tue, 22 Aug 2023 08:04:24 +0000 https://piotrminkowski.com/?p=14406 In this article, you will learn how to solve problems with the slow startup of Java apps on Kubernetes related to the CPU limit. We will use a new Kubernetes feature called “In-place Pod Vertical Scaling”. It allows resizing resources (CPU or memory) assigned to the containers without pod restart. We can use it since […]

The post Resize CPU Limit To Speed Up Java Startup on Kubernetes appeared first on Piotr's TechBlog.

]]>
In this article, you will learn how to solve problems with the slow startup of Java apps on Kubernetes related to the CPU limit. We will use a new Kubernetes feature called “In-place Pod Vertical Scaling”. It allows resizing the resources (CPU or memory) assigned to containers without a pod restart. We can use it since Kubernetes 1.27. However, it is still an alpha feature that has to be explicitly enabled. In order to test it, we will run a simple Spring Boot Java app on Kubernetes.

Motivation

If you are running Java apps on Kubernetes, you have probably already encountered the problem of slow startup caused by setting a too low CPU limit. It occurs because Java apps usually need significantly more CPU during initialization than during standard work. If such applications specify requests and limits suited for regular operation, they may suffer from very long startup times. On the other hand, specifying a high CPU limit just to start fast may not be the optimal approach to managing resource limits on Kubernetes. You can find some considerations in this area in my article about best practices for Java apps on Kubernetes.
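A quick way to see why the limit matters: the JVM derives the number of CPUs it may use from the cgroup CPU limit (container support is enabled by default since JDK 10), and it sizes JIT compiler threads, GC threads, and the common ForkJoinPool from that value. A minimal probe you could run inside the container:

```java
import java.util.concurrent.ForkJoinPool;

public class CpuVisibility {

    public static void main(String[] args) {
        // Inside a container, availableProcessors() reflects the cgroup CPU limit,
        // so a low limit directly shrinks the parallelism available for class
        // loading and JIT compilation during startup.
        int cpus = Runtime.getRuntime().availableProcessors();
        System.out.println("availableProcessors = " + cpus);
        System.out.println("commonPoolParallelism = " + ForkJoinPool.getCommonPoolParallelism());
    }
}
```

Running this in a pod limited to 500 millicores versus 2 cores shows the difference the JVM actually sees at startup.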

Thanks to the new feature, such pods can request more CPU at the time of pod creation and can be resized down to normal running needs once the application has finished initializing. We will also consider how to automatically apply such changes on the cluster once the pod is ready. In order to do that, we will use Kyverno. Kyverno policies can mutate Kubernetes resources in reaction to admission callbacks, which perfectly matches our needs in this exercise.

You can somehow associate “In-place Pod Vertical Scaling” with the Vertical Pod Autoscaler tool. The Kubernetes Vertical Pod Autoscaler (VPA) automatically adjusts the CPU and memory reservations of pods to “right-size” your applications. However, these are two different things. Currently, VPA is working on out-of-the-box support for in-place pod vertical scaling. Even if you don’t use VPA, this article still provides a valuable solution to your problems with CPU limits and Java startup.

I think our goal is pretty clear. Let’s begin!

Enable In-place Pod Vertical Scaling

Since the “in-place pod vertical scaling” feature is still in the alpha state, we need to enable it explicitly on Kubernetes. I’m testing that feature on Minikube. Here’s my minikube start command (you can try with lower memory if you wish):

$ minikube start --memory='8g' \
  --feature-gates=InPlacePodVerticalScaling=true

Install Kyverno on Kubernetes

Before we deploy the app, we need to install Kyverno and create its policy. However, our scenario is not very typical for Kyverno. Let’s take some time to analyze it. When creating a new Kubernetes Deployment, we should set a CPU limit high enough to allow fast startup of our Java app. Once our app has started and is ready to work, we will resize the limit to match the standard app requirements. We cannot do it while the app startup procedure is still in progress. In other words, we are not waiting for the pod Running status…

… but for app container readiness inside the pod.

kubernetes-cpu-java-pod

Here’s a picture that illustrates our scenario. We will set the CPU limit to 2 cores during startup. Once our app is started, we decrease it to 500 millicores.

kubernetes-cpu-java-limits

Now, let’s go back to Kyverno. We will install it on Kubernetes using the official Helm chart. In the first step we need to add the following Helm repository:

$ helm repo add kyverno https://kyverno.github.io/kyverno/

During the installation, we need to customize a single property. By default, Kyverno filters out updates made on Kubernetes by the members of the system:nodes group. One of those members is kubelet, which is responsible for updating the state of containers running on the node. So, if we want to catch the container-ready event from kubelet, we need to override that behavior. That’s why we set the config.excludeGroups property to an empty array. Here’s our values.yaml file:

config:
  excludeGroups: []

Finally, we can install Kyverno on Kubernetes using the following Helm command:

$ helm install kyverno kyverno/kyverno -n kyverno \
  --create-namespace -f values.yaml

Kyverno has been installed in the kyverno namespace. Just to verify that everything works fine, we can display the list of running pods:

$ kubectl get po -n kyverno
NAME                                             READY   STATUS    RESTARTS   AGE
kyverno-admission-controller-79dcbc777c-8pbg2    1/1     Running   0          55s
kyverno-background-controller-67f4b647d7-kp5zr   1/1     Running   0          55s
kyverno-cleanup-controller-566f7bc8c-w5q72       1/1     Running   0          55s
kyverno-reports-controller-6f96648477-k6dcj      1/1     Running   0          55s

Create a Policy for Resizing the CPU Limit

We want to trigger our Kyverno policy on pod creation and pod status updates (1). We will apply the change to the resource only if the current readiness state is true (2). It is possible to select the target container using a special element called an “anchor” (3). Finally, we can define a new CPU limit for the container inside the target pod within the patchStrategicMerge section (4).

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: resize-pod-policy
spec:
  mutateExistingOnPolicyUpdate: false
  rules:
    - name: resize-pod-policy
      match:
        any:
          - resources: # (1)
              kinds:
                - Pod/status
                - Pod
      preconditions: 
        all: # (2)
          - key: "{{request.object.status.containerStatuses[0].ready}}"
            operator: Equals
            value: true
      mutate:
        targets:
          - apiVersion: v1
            kind: Pod
            name: "{{request.object.metadata.name}}"
        patchStrategicMerge:
          spec:
            containers:
              - (name): sample-kotlin-spring # (3)
                resources:
                  limits:
                    cpu: 0.5 # (4)

Let’s apply the policy. The first attempt fails, because we need to add some additional privileges that allow the Kyverno background controller to update pods. We don’t need to create a ClusterRoleBinding, but just a ClusterRole with the right aggregation labels in order for those permissions to take effect.

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: kyverno:update-pods
  labels:
    app.kubernetes.io/component: background-controller
    app.kubernetes.io/instance: kyverno
    app.kubernetes.io/part-of: kyverno
rules:
  - verbs:
      - patch
      - update
    apiGroups:
      - ''
    resources:
      - pods

After that, we can try to create the policy once again. As you can see, this time there are no more problems.

Deploy the Java App and Resize CPU Limit After Startup

Let’s take a look at the Deployment manifest of our Java app. The name of the app container is sample-kotlin-spring, which matches the conditional “anchor” in the Kyverno policy (1). As you see, I’m setting the CPU limit to 2 cores (2). There’s also a new field used here: resizePolicy (3). I wouldn’t have to set it, since the default value is NotRequired, which means that changing the resource limit or request will not result in a pod restart. The Deployment object also contains a readiness probe that calls the GET /actuator/health/readiness endpoint exposed by the Spring Boot Actuator (4).

apiVersion: apps/v1
kind: Deployment
metadata:
  name: sample-kotlin-spring
  namespace: demo
  labels:
    app: sample-kotlin-spring
spec:
  replicas: 1
  selector:
    matchLabels:
      app: sample-kotlin-spring
  template:
    metadata:
      labels:
        app: sample-kotlin-spring
    spec:
      containers:
      - name: sample-kotlin-spring # (1)
        image: quay.io/pminkows/sample-kotlin-spring:1.5.1.1
        ports:
        - containerPort: 8080
        resources:
          limits:
            cpu: 2 # (2)
            memory: "1Gi"
          requests:
            cpu: 0.1
            memory: "256Mi"
        resizePolicy: # (3)
        - resourceName: "cpu"
          restartPolicy: "NotRequired"
        readinessProbe: # (4)
          httpGet:
            path: /actuator/health/readiness
            port: 8080
            scheme: HTTP
          initialDelaySeconds: 15
          periodSeconds: 5
          successThreshold: 1
          failureThreshold: 3

Once we deploy the app, a new pod starts. We can verify its current resource limits. As you can see, it is still 2 CPUs.

Our app starts in around 10-15 seconds. Therefore, the readiness check also waits 15 seconds before it begins to call the Actuator endpoint (the initialDelaySeconds parameter). After that, it finishes with success and our container switches to the ready state.

Then, Kyverno detects the container status change and triggers the policy. The policy precondition is met since the container is ready. Now, we can verify the current CPU limit on the same pod. It is 500 millicores. You can also take a look at the Annotations field, which indicates that the pod was mutated by the Kyverno policy.

kubernetes-cpu-java-limit-changed

That’s exactly what we wanted to achieve. Now, we can scale up the number of running instances of our app just to continue testing. Then, you can verify for yourself that a new pod will also have its CPU limit modified by Kyverno to 0.5 core after startup.

$ kubectl scale --replicas=2 deployment sample-kotlin-spring -n demo

And one last thing: how long would it take to start our app if we set 500 millicores as the CPU limit from the beginning? For my app and such a CPU limit, it is around 40 seconds. So the difference is significant.

Final Thoughts

Finally, there is a solution for Java apps on Kubernetes to dynamically resize the CPU limit after startup. In this article, you can find my proposal for managing it automatically with a Kyverno policy. In our example, the pods end up with a different CPU limit than the one declared in the Deployment object. However, I can also imagine a policy that consists of two rules and raises the limit only for the duration of startup.

The post Resize CPU Limit To Speed Up Java Startup on Kubernetes appeared first on Piotr's TechBlog.

Logging in Kubernetes with Loki https://piotrminkowski.com/2023/07/20/logging-in-kubernetes-with-loki/ https://piotrminkowski.com/2023/07/20/logging-in-kubernetes-with-loki/#respond Thu, 20 Jul 2023 22:50:37 +0000 https://piotrminkowski.com/?p=14339 In this article, you will learn how to install, configure and use Loki to collect logs from apps running on Kubernetes. Together with Loki, we will use the Promtail agent to ship the logs and the Grafana dashboard to display them in graphical form. We will also create a simple app written in Quarkus that […]

The post Logging in Kubernetes with Loki appeared first on Piotr's TechBlog.

In this article, you will learn how to install, configure, and use Loki to collect logs from apps running on Kubernetes. Together with Loki, we will use the Promtail agent to ship the logs and the Grafana dashboard to display them in graphical form. We will also create a simple app written in Quarkus that prints logs in JSON format. Of course, Loki will collect the logs from the whole cluster. If you are interested in other approaches to integrating your apps with Loki, you can read my article. It shows how to send Spring Boot app logs to Loki using the Loki4j Logback appender. You can also find the article about the Grafana Agent used to send logs from a Spring Boot app to Loki on Grafana Cloud here.

Source Code

If you would like to try it by yourself, you may always take a look at my source code. In order to do that, you need to clone my GitHub repository. Then you should just follow my instructions.

Install Loki Stack on Kubernetes

In the first step, we will install Loki Stack on Kubernetes. The most convenient way to do it is through the Helm chart. Fortunately, there is a single Helm chart that installs and configures all the tools required in our exercise: Loki, Promtail, and Grafana. Let’s add the following Helm repository:

$ helm repo add grafana https://grafana.github.io/helm-charts

Then, we can install the loki-stack chart. By default, it does not install Grafana. In order to enable Grafana we need to set the grafana.enabled parameter to true. Our Loki Stack is installed in the loki-stack namespace:

$ helm install loki grafana/loki-stack \
  -n loki-stack \
  --set grafana.enabled=true \
  --create-namespace

Here’s a list of running pods in the loki-stack namespace:

$ kubectl get po -n loki-stack
NAME                           READY   STATUS    RESTARTS   AGE
loki-0                         1/1     Running   0          78s
loki-grafana-bf598db67-czcds   2/2     Running   0          93s
loki-promtail-vt25p            1/1     Running   0          30s

Let’s enable port forwarding to access the Grafana dashboard on the local port:

$ kubectl port-forward svc/loki-grafana 3000:80 -n loki-stack

The Helm chart automatically generates a password for the admin user. We can obtain it with the following command:

$ kubectl get secret -n loki-stack loki-grafana \
    -o jsonpath="{.data.admin-password}" | \
    base64 --decode ; echo

Once we log in to the dashboard, we will see the auto-configured Loki datasource. We can use it to get the latest logs from the Kubernetes cluster:

It seems that the `loki-stack` Helm chart is no longer maintained. As a replacement, we can use three separate Helm charts for Loki, Promtail, and Grafana. This is described in the last section of this article. Although `loki-stack` simplifies installation, in the current situation it is not a suitable method for production. Instead, we should use the `loki-distributed` chart.

Create and Deploy Quarkus App on Kubernetes

In the next step, we will install our sample Quarkus app on Kubernetes. It connects to the Postgres database. Therefore, we will also install Postgres with the Bitnami Helm chart:

$ helm install person-db bitnami/postgresql -n sample-quarkus \
  --set auth.username=quarkus  \
  --set auth.database=quarkus  \
  --set fullnameOverride=person-db \
  --create-namespace

With Quarkus we can easily change the log format to JSON. We just need to include the following Maven dependency:

<dependency>
  <groupId>io.quarkus</groupId>
  <artifactId>quarkus-logging-json</artifactId>
</dependency>

And also enable JSON logging in the application properties:

quarkus.log.console.json = true

Besides the static logging fields, we will include a single dynamic field. We will use the MDC mechanism for that (1) (2). That field indicates the id of the person for whom we make the GET or POST request. Here’s the code of the REST controller:

@Path("/persons")
public class PersonResource {

    private PersonRepository repository;
    private Logger logger;

    public PersonResource(PersonRepository repository, Logger logger) {
        this.repository = repository;
        this.logger = logger;
    }

    @POST
    @Transactional
    public Person add(Person person) {
        repository.persist(person);
        MDC.put("personId", person.id); // (1)
        logger.infof("IN -> add(%s)", person);
        return person;
    }

    @GET
    @APIResponseSchema(Person.class)
    public List<Person> findAll() {
        logger.info("IN -> findAll");
        return repository.findAll()
                .list();
    }

    @GET
    @Path("/{id}")
    public Person findById(@PathParam("id") Long id) {
        MDC.put("personId", id); // (2)
        logger.infof("IN -> findById(%d)", id);
        return repository.findById(id);
    }
}

Here’s the sample log for the GET endpoint. Now, our goal is to parse and index it properly in Loki with Promtail.
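A record produced by this setup looks roughly like the sketch below. Treat the field values and the exact field set as illustrative — they depend on the quarkus-logging-json defaults — but the sequence, level, message, and mdc fields are the ones we care about:

```json
{
  "timestamp": "2023-07-20T22:50:37.123+02:00",
  "sequence": 261,
  "loggerName": "PersonResource",
  "level": "INFO",
  "message": "IN -> findById(1)",
  "mdc": {
    "personId": 1
  }
}
```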

Now, we need to deploy our sample app on Kubernetes. Fortunately, with Quarkus we can build and deploy the app using a single Maven command. We just need to activate the following custom profile, which includes the quarkus-kubernetes dependency and enables deployment with the quarkus.kubernetes.deploy property. It also activates the image build using the Jib Maven plugin.

<profile>
  <id>kubernetes</id>
  <activation>
    <property>
      <name>kubernetes</name>
    </property>
  </activation>
  <dependencies>
    <dependency>
      <groupId>io.quarkus</groupId>
      <artifactId>quarkus-container-image-jib</artifactId>
    </dependency>
    <dependency>
      <groupId>io.quarkus</groupId>
      <artifactId>quarkus-kubernetes</artifactId>
    </dependency>
  </dependencies>
  <properties>
    <quarkus.kubernetes.deploy>true</quarkus.kubernetes.deploy>
  </properties>
</profile>

Let’s build and deploy the app:

$ mvn clean package -DskipTests -Pkubernetes

Here’s the list of running pods (database and app):

$ kubectl get po -n sample-quarkus
NAME                             READY   STATUS    RESTARTS   AGE
person-db-0                      1/1     Running   0          48s
person-service-9f67b6d57-gvbs6   1/1     Running   0          18s

Configure Promtail to Parse JSON Logs

Let’s take a look at the Promtail configuration. We can find it inside the loki-promtail Secret. As you can see, it uses only the cri pipeline stage.

server:
  log_level: info
  http_listen_port: 3101


clients:
  - url: http://loki:3100/loki/api/v1/push

positions:
  filename: /run/promtail/positions.yaml

scrape_configs:
  - job_name: kubernetes-pods
    pipeline_stages:
      - cri: {}
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      - source_labels:
          - __meta_kubernetes_pod_controller_name
        regex: ([0-9a-z-.]+?)(-[0-9a-f]{8,10})?
        action: replace
        target_label: __tmp_controller_name
      - source_labels:
          - __meta_kubernetes_pod_label_app_kubernetes_io_name
          - __meta_kubernetes_pod_label_app
          - __tmp_controller_name
          - __meta_kubernetes_pod_name
        regex: ^;*([^;]+)(;.*)?$
        action: replace
        target_label: app
      - source_labels:
          - __meta_kubernetes_pod_label_app_kubernetes_io_instance
          - __meta_kubernetes_pod_label_release
        regex: ^;*([^;]+)(;.*)?$
        action: replace
        target_label: instance
      - source_labels:
          - __meta_kubernetes_pod_label_app_kubernetes_io_component
          - __meta_kubernetes_pod_label_component
        regex: ^;*([^;]+)(;.*)?$
        action: replace
        target_label: component
      - action: replace
        source_labels:
          - __meta_kubernetes_pod_node_name
        target_label: node_name
      - action: replace
        source_labels:
          - __meta_kubernetes_namespace
        target_label: namespace
      - action: replace
        replacement: $1
        separator: /
        source_labels:
          - namespace
          - app
        target_label: job
      - action: replace
        source_labels:
          - __meta_kubernetes_pod_name
        target_label: pod
      - action: replace
        source_labels:
          - __meta_kubernetes_pod_container_name
        target_label: container
      - action: replace
        replacement: /var/log/pods/*$1/*.log
        separator: /
        source_labels:
          - __meta_kubernetes_pod_uid
          - __meta_kubernetes_pod_container_name
        target_label: __path__
      - action: replace
        regex: true/(.*)
        replacement: /var/log/pods/*$1/*.log
        separator: /
        source_labels:
          - __meta_kubernetes_pod_annotationpresent_kubernetes_io_config_hash
          - __meta_kubernetes_pod_annotation_kubernetes_io_config_hash
          - __meta_kubernetes_pod_container_name
        target_label: __path__

The result for our app is not satisfactory. Loki stores the full Kubernetes log lines and doesn’t recognize any of our logging fields.

In order to change that behavior, we will parse the data using the json stage. This action is limited to our sample application only (1). We will label the log records with the level, sequence, and personId MDC fields (2) after extracting them from the Kubernetes log line. The mdc field is itself a nested JSON object, so we need to perform an additional JSON parsing step (3) to extract the personId field. As the output, Promtail should return the log message field (4). Here’s the required transformation in the configuration file:

- job_name: kubernetes-pods
  pipeline_stages:
    - cri: {}
    - match:
        selector: '{app="person-service"}' # (1)
        stages:
          - json:
              expressions:
                log:
          - json: # (2)
              expressions:
                sequence: sequence
                message: message
                level: level
                mdc:
              source: log
          - json: # (3)
              expressions:
                personId: personId
              source: mdc
          - labels:
              sequence:
              level:
              personId:
          - output: # (4)
              source: message
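To illustrate what these stages do, here is a rough plain-Java equivalent of the two nested extractions. This is only a sketch for a hypothetical log line — Promtail obviously uses a real JSON parser, while we cut corners with string operations:

```java
public class PromtailStagesDemo {

    // Rough equivalent of the json stages above: first locate the nested
    // mdc object inside the app's JSON log line, then pull personId out of it.
    static String extractPersonId(String appLogJson) {
        int mdcStart = appLogJson.indexOf("\"mdc\":");
        if (mdcStart < 0) {
            return null;
        }
        String mdc = appLogJson.substring(mdcStart + 6, appLogJson.lastIndexOf('}'));
        java.util.regex.Matcher m = java.util.regex.Pattern
                .compile("\"personId\"\\s*:\\s*(\\d+)")
                .matcher(mdc);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Hypothetical record shaped like the JSON logs our Quarkus app emits
        String log = "{\"message\":\"IN -> findById(1)\",\"level\":\"INFO\","
                + "\"mdc\":{\"personId\":1}}";
        System.out.println(extractPersonId(log));
    }
}
```

In the real pipeline, Promtail performs exactly this kind of two-step extraction and then attaches the result as the personId label.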

After setting the new value of the loki-promtail Secret, we should restart the Promtail pod. Let’s also restart our app and perform some test calls to the REST API:

$ curl http://localhost:8080/persons/1

$ curl http://localhost:8080/persons/6

$ curl -X 'POST' http://localhost:8080/persons \
  -H 'Content-Type: application/json' \
  -d '{
  "name": "John Wick",
  "age": 18,
  "gender": "MALE",
  "externalId": 100,
  "address": {
    "street": "Test Street",
    "city": "Warsaw",
    "flatNo": 18,
    "buildingNo": 100
  }
}'

Let’s see how it looks in Grafana:

[Screenshot: the list of log records in Grafana]

As you can see, the log record for the GET request is labeled with the level, sequence, and personId MDC fields. That’s exactly what we wanted to achieve!

[Screenshot: the labels attached to a single log line]

Now, we are able to filter results using the fields from our JSON log line:

[Screenshot: filtering logs by the extracted labels]

Distributed Installation of Loki Stack

In the previously described installation method, we run a single instance of Loki. In order to use a more cloud-native and scalable approach, we should switch to the loki-distributed Helm chart. It divides a single Loki instance into several independent components. That division also separates the read and write paths. Let’s install it in the loki-distributed namespace with the following command:

$ helm install loki grafana/loki-distributed \
  -n loki-distributed --create-namespace

When installing Promtail, we should modify the default address of the write endpoint. We use the Loki gateway component for that. In our case, the name of the gateway Service is loki-loki-distributed-gateway. That component listens on port 80.

config:
  clients:
  - url: http://loki-loki-distributed-gateway/loki/api/v1/push

Let’s install Promtail using the following command:

$ helm install promtail grafana/promtail -n loki-distributed \
  -f values.yml

Finally, we should install Grafana. As before, we will use a dedicated Helm chart:

$ helm install grafana grafana/grafana -n loki-distributed

Here’s a list of running pods:

$ kubectl get pod -n loki-distributed
NAME                                                    READY   STATUS    RESTARTS   AGE
grafana-6cd56666b9-6hvqg                                1/1     Running   0          42m
loki-loki-distributed-distributor-59767b5445-n59bq      1/1     Running   0          48m
loki-loki-distributed-gateway-7867bc8ddb-kgdfk          1/1     Running   0          48m
loki-loki-distributed-ingester-0                        1/1     Running   0          48m
loki-loki-distributed-querier-0                         1/1     Running   0          48m
loki-loki-distributed-query-frontend-86c944647c-vl2bz   1/1     Running   0          48m
promtail-c6dxj                                          1/1     Running   0          37m

After logging in to Grafana, we should add the Loki data source (we could also do it during the installation with Helm values). This time we have to connect to the query-frontend component, available at loki-loki-distributed-query-frontend:3100.
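For reference, provisioning the data source during installation could look like the values file sketched below. The structure follows the grafana chart’s datasources convention; the file name is my own choice, and you would pass it with -f when running helm install:

```yaml
datasources:
  datasources.yaml:
    apiVersion: 1
    datasources:
      - name: Loki
        type: loki
        access: proxy
        url: http://loki-loki-distributed-query-frontend:3100
        isDefault: true
```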

Final Thoughts

Loki Stack is an interesting alternative to Elastic Stack for collecting and aggregating logs on Kubernetes. Loki has been designed to be very cost-effective and easy to operate. Since it does not index the contents of the logs, the usage of resources such as disk space or RAM is lower than for Elasticsearch. In this article, I showed you how to install Loki Stack on Kubernetes and how to configure it to analyze app logs in practice.

Which JDK to Choose on Kubernetes https://piotrminkowski.com/2023/02/17/which-jdk-to-choose-on-kubernetes/ https://piotrminkowski.com/2023/02/17/which-jdk-to-choose-on-kubernetes/#comments Fri, 17 Feb 2023 13:53:19 +0000 https://piotrminkowski.com/?p=14015 In this article, we will make a performance comparison between several most popular JDK implementations for the app running on Kubernetes. This post also answers some questions and concerns about my Twitter publication you see below. I compared Oracle JDK with Eclipse Temurin. The result was quite surprising for me, so I decided to tweet […]

The post Which JDK to Choose on Kubernetes appeared first on Piotr's TechBlog.

In this article, we will make a performance comparison between several of the most popular JDK implementations for an app running on Kubernetes. This post also answers some questions and concerns about my Twitter publication that you can see below. I compared Oracle JDK with Eclipse Temurin. The result was quite surprising to me, so I decided to tweet it to get some opinions and feedback.

[Embedded tweet comparing Oracle JDK and Eclipse Temurin]

Unfortunately, those results were wrong. Or maybe I should say they were not averaged well enough. After this publication, I also received interesting material presented at the London Java Community. It compares the performance of the Payara application server running on various JDKs. Here’s the link to that presentation (~1h). The results shown there seem to confirm my results. Or at least they confirm the general rule – there are some performance differences between OpenJDK implementations. Let’s check it out.

This time I’ll do a very accurate comparison with several repeats to get reproducible results. I’ll test the following JVM implementations:

  • Adoptium Eclipse Temurin
  • Alibaba Dragonwell
  • Amazon Corretto
  • Azul Zulu
  • BellSoft Liberica
  • IBM Semeru OpenJ9
  • Oracle JDK
  • Microsoft OpenJDK

For all the tests I’ll use the Paketo Java buildpack, which lets us easily switch between several JVM implementations. I’ll test a simple Spring Boot 3 app that uses Spring Data to interact with a Mongo database. Let’s proceed to the details!

If you have already built images with a Dockerfile, it is possible that you were using the official OpenJDK base image from Docker Hub. However, the announcement on the image site currently says that it is officially deprecated and all users should find suitable replacements. In this article, we will compare the most popular replacements, so I hope it may help you make a good choice 🙂

Testing Environment

Before we run tests it is important to have a provisioned environment. I’ll run all the tests locally. In order to build images, I’m going to use Paketo Buildpacks. Here are some details of my environment:

  1. Machine: MacBook Pro 32G RAM Intel 
  2. OS: macOS Ventura 13.1
  3. Kubernetes (v1.25.2) on Docker Desktop: 14G RAM + 4vCPU

We will use Java 17 for app compilation. In order to run load tests, I’m going to leverage the k6 tool. Our app is written in Spring Boot. It connects to a Mongo database running on the same Kubernetes instance. Each time I test a new JVM provider, I remove the previous version of the app and the database. Then I deploy the new, full configuration once again. We will measure the following parameters:

  1. App startup time (the best result and the average) – we will read it directly from the Spring Boot logs
  2. Throughput – with k6 we will simulate 5 and 10 virtual users and measure the number of processed requests
  3. The size of the image
  4. The RAM memory consumed by the pod during the load tests. Basically, we will execute the kubectl top pod command

We will also set the memory limit for the container to 1G. In our load tests, the app will insert data into the Mongo database through a REST endpoint invoked during the tests. To measure startup time as accurately as possible, I’ll restart the app several times.
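Since we read the startup time directly from the Spring Boot logs, the extraction can be automated. The sketch below assumes the standard "Started <App> in N seconds" line that Spring Boot prints on startup:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class StartupTimeParser {

    // Spring Boot prints a line like "Started MyApplication in 7.2 seconds ..."
    // on every startup; this pulls the number out so runs can be compared.
    private static final Pattern STARTED =
            Pattern.compile("Started \\S+ in (\\d+(?:\\.\\d+)?) seconds");

    static double startupSeconds(String logLine) {
        Matcher m = STARTED.matcher(logLine);
        if (!m.find()) {
            throw new IllegalArgumentException("not a startup line: " + logLine);
        }
        return Double.parseDouble(m.group(1));
    }

    public static void main(String[] args) {
        String line = "Started SampleApplication in 5.8 seconds (process running for 6.1)";
        System.out.println(startupSeconds(line));
    }
}
```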

Let’s take a look at the Deployment YAML manifest. It injects the credentials to the Mongo database and sets the memory limit to 1G (as I already mentioned):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: sample-spring-boot-on-kubernetes-deployment
spec:
  selector:
    matchLabels:
      app: sample-spring-boot-on-kubernetes
  template:
    metadata:
      labels:
        app: sample-spring-boot-on-kubernetes
    spec:
      containers:
      - name: sample-spring-boot-on-kubernetes
        image: piomin/sample-spring-boot-on-kubernetes
        ports:
        - containerPort: 8080
        env:
          - name: MONGO_DATABASE
            valueFrom:
              configMapKeyRef:
                name: mongodb
                key: database-name
          - name: MONGO_USERNAME
            valueFrom:
              secretKeyRef:
                name: mongodb
                key: database-user
          - name: MONGO_PASSWORD
            valueFrom:
              secretKeyRef:
                name: mongodb
                key: database-password
          - name: MONGO_URL
            value: mongodb
        readinessProbe:
          httpGet:
            port: 8080
            path: /readiness
            scheme: HTTP
          timeoutSeconds: 1
          periodSeconds: 10
          successThreshold: 1
          failureThreshold: 3
        resources:
          limits:
            memory: 1024Mi

Source Code and Images

If you would like to try it by yourself, you may always take a look at my source code. In order to do that you need to clone my GitHub repository. You will also find all the images in my Docker Hub repository piomin/sample-spring-boot-on-kubernetes. Every single image is tagged with the vendor’s name.

Our Spring Boot app exposes several endpoints, but I’ll test the POST /persons endpoint for inserting data into Mongo. In the integration with Mongo, I’m using the Spring Data MongoDB project and its CRUD repository pattern.

// controller

@RestController
@RequestMapping("/persons")
public class PersonController {

   private PersonRepository repository;

   PersonController(PersonRepository repository) {
      this.repository = repository;
   }

   @PostMapping
   public Person add(@RequestBody Person person) {
      return repository.save(person);
   }

   // other endpoints implementation
}


// repository

public interface PersonRepository extends CrudRepository<Person, String> {

   Set<Person> findByFirstNameAndLastName(String firstName, 
                                          String lastName);
   Set<Person> findByAge(int age);
   Set<Person> findByAgeGreaterThan(int age);

}

The Size of the Image

The size of the image is the simplest parameter to measure. If you would like to check what exactly is inside an image, you can use the dive tool. The difference in size between vendors results from the number of Java tools and binaries included inside. From my perspective, the smaller the size the better. I’d rather not have anything inside the image except, of course, all the stuff required to run my app successfully. But you may have a different case. Anyway, here’s the content of the image for the Oracle JDK after executing the dive piomin/sample-spring-boot-on-kubernetes:oracle command. As you can see, the JDK takes up most of the space.

[Screenshot: dive output for the Oracle JDK image]

On the other hand, we can analyze the smallest image. I think it explains the differences in image size, since Zulu contains a JRE, not the whole JDK.

Here are the results, ordered from the smallest image to the biggest.

  • Azul Zulu: 271MB
  • IBM Semeru OpenJ9: 275MB
  • Eclipse Temurin: 286MB
  • BellSoft Liberica: 286MB
  • Oracle OpenJDK: 446MB
  • Alibaba Dragonwell: 459MB
  • Microsoft OpenJDK: 461MB
  • Amazon Corretto: 463MB

Let’s visualize our first results. I think it clearly shows which images contain a JDK and which a JRE.

[Chart: image size per JDK vendor]

Startup Time

Honestly, it is not very easy to measure startup time, since the difference between the vendors is not large. Also, subsequent results for the same provider may differ a lot. For example, on the first try the app starts in 5.8s, and after a pod restart in 8.4s. My methodology was pretty simple. I restarted the app several times for each JDK provider to measure the average startup time and the fastest startup in the series. Then I repeated the same exercise to verify that the results are repeatable. The proportions between the first and the second series of startup times for the corresponding vendors were similar. In fact, the difference between the fastest and the slowest average startup time is not large. I got the best result for Eclipse Temurin (7.2s) and the worst for IBM Semeru OpenJ9 (9.05s).
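The "best and average" methodology above boils down to a trivial calculation over the series of measured samples. A minimal sketch (the sample values are hypothetical):

```java
import java.util.Arrays;

public class StartupStats {

    // Best (minimum) startup time in a series of measurements, in seconds
    static double best(double[] samples) {
        return Arrays.stream(samples).min().orElseThrow();
    }

    // Average startup time over the whole series
    static double average(double[] samples) {
        return Arrays.stream(samples).average().orElseThrow();
    }

    public static void main(String[] args) {
        // Hypothetical series of restarts for a single JDK vendor
        double[] samples = {5.8, 8.4, 7.1, 7.5};
        System.out.printf("best=%.2fs avg=%.2fs%n", best(samples), average(samples));
    }
}
```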

Let’s see the full list of results. It shows the average startup time of the application, starting from the fastest one.

  • Eclipse Temurin: 7.20s
  • Oracle OpenJDK: 7.22s
  • Amazon Corretto: 7.27s
  • BellSoft Liberica: 7.44s
  • Azul Zulu: 7.77s
  • Alibaba Dragonwell: 8.03s
  • Microsoft OpenJDK: 8.18s
  • IBM Semeru OpenJ9: 9.05s

Once again, here’s a graphical representation of our results. The differences between vendors are sometimes rather cosmetic. Maybe, if I ran the same exercise once again from the beginning, the results would be quite different.

[Chart: average startup time per JDK vendor]

As I mentioned before, I also measured the fastest attempt in each series. This time the top 3 are Eclipse Temurin, Amazon Corretto, and BellSoft Liberica.

  • Eclipse Temurin: 5.6s
  • Amazon Corretto: 5.95s
  • BellSoft Liberica: 6.05s
  • Oracle OpenJDK: 6.1s
  • Azul Zulu: 6.2s
  • Alibaba Dragonwell: 6.45s
  • Microsoft OpenJDK: 6.9s
  • IBM Semeru OpenJ9: 7.85s

Memory

I’m measuring the memory usage of the app under heavy load, with a test simulating 10 users continuously sending requests. It gives a really high throughput at the level of the app – around 500 requests per second. The results are in line with expectations. Almost all the vendors have very similar memory usage, except IBM Semeru, which uses the OpenJ9 JVM. In theory, OpenJ9 should also give us a better startup time. However, in my case, the significant difference is just in the memory footprint. For IBM Semeru the memory usage is around 135MB, while for the other vendors it varies in the range of 210-230MB.

  • IBM Semeru OpenJ9: 135M
  • Oracle OpenJDK: 211M
  • Azul Zulu: 215M
  • Alibaba Dragonwell: 216M
  • BellSoft Liberica: 219M
  • Microsoft OpenJDK: 219M
  • Amazon Corretto: 220M
  • Eclipse Temurin: 230M

Here’s the graphical visualization of our results:

Throughput

In order to generate high incoming traffic to the app, I used the k6 tool. It allows us to create tests in JavaScript. Here’s the implementation of our test. It calls the HTTP POST /persons endpoint with input data in JSON. Then it verifies that the request has been successfully processed on the server side.

import http from 'k6/http';
import { check } from 'k6';

export default function () {

  const payload = JSON.stringify({
      firstName: 'aaa',
      lastName: 'bbb',
      age: 50,
      gender: 'MALE'
  });

  const params = {
    headers: {
      'Content-Type': 'application/json',
    },
  };

  const res = http.post(`http://localhost:8080/persons`, payload, params);

  check(res, {
    'is status 200': (res) => res.status === 200,
    'body size is > 0': (r) => r.body.length > 0,
  });
}

Here’s the k6 command for running our test. It is possible to define the duration and number of simultaneous virtual users. In the first step, I’m simulating 5 virtual users:

$ k6 run -d 90s -u 5 load-tests.js

Then, I’m running the tests for 10 virtual users twice per vendor.

$ k6 run -d 90s -u 10 load-tests.js

Here are the sample results printed after executing the k6 test:

I repeated the exercise for each JDK vendor. Here are the throughput results for 5 virtual users:

  • BellSoft Liberica: 451req/s
  • Amazon Corretto: 433req/s
  • IBM Semeru OpenJ9: 432req/s
  • Oracle OpenJDK: 420req/s
  • Microsoft OpenJDK: 418req/s
  • Azul Zulu: 414req/s
  • Eclipse Temurin: 407req/s
  • Alibaba Dragonwell: 405req/s

Here are the throughput results for 10 virtual users:

  • Eclipse Temurin: 580req/s
  • Azul Zulu: 567req/s
  • Microsoft OpenJDK: 561req/s
  • Oracle OpenJDK: 561req/s
  • IBM Semeru OpenJ9: 552req/s
  • Amazon Corretto: 552req/s
  • Alibaba Dragonwell: 551req/s
  • BellSoft Liberica: 540req/s

Final Thoughts

After repeating the load tests several times, I have to admit that there are no significant performance differences between the JDK vendors. We used the same JVM settings for all the tests (set by the Paketo buildpack). Probably, the more tests I ran, the more similar the results between different vendors would become. So, in summary, the results from my tweet have not been confirmed. Ok, so let’s go back to the question – which JDK to choose on Kubernetes?

It probably depends somewhat on where you are running your cluster. If, for example, it’s EKS on AWS, it’s worth using Amazon Corretto. However, if you are looking for the smallest image size, you should choose between Azul Zulu, IBM Semeru, BellSoft Liberica, and Adoptium Eclipse Temurin. Additionally, IBM Semeru will consume significantly less memory than the other distributions, since it is built on top of OpenJ9.

Don’t forget about best practices when deploying Java apps on Kubernetes. Here’s my article about it.

Native Java with GraalVM and Virtual Threads on Kubernetes https://piotrminkowski.com/2023/01/04/native-java-with-graalvm-and-virtual-threads-on-kubernetes/ https://piotrminkowski.com/2023/01/04/native-java-with-graalvm-and-virtual-threads-on-kubernetes/#comments Wed, 04 Jan 2023 12:23:21 +0000 https://piotrminkowski.com/?p=13847 In this article, you will learn how to use virtual threads, build a native image with GraalVM and run such the Java app on Kubernetes. Currently, the native compilation (GraalVM) and virtual threads (Project Loom) are probably the hottest topics in the Java world. They improve the general performance of your app including memory usage […]

The post Native Java with GraalVM and Virtual Threads on Kubernetes appeared first on Piotr's TechBlog.

In this article, you will learn how to use virtual threads, build a native image with GraalVM, and run such a Java app on Kubernetes. Currently, native compilation (GraalVM) and virtual threads (Project Loom) are probably the hottest topics in the Java world. They improve the general performance of your app, including memory usage and startup time. Since startup time and memory usage have always been a problem for Java, the expectations for native images and virtual threads are really big.

Of course, we usually consider such performance issues within the context of microservices or serverless apps. They should not consume many OS resources and should be easily auto-scalable. We can easily control resource usage on Kubernetes. If you are interested in Java virtual threads you can read my previous article about using them to create an HTTP server available here. For more details about Knative as serverless on Kubernetes, you can refer to the following article.

Introduction

Let’s start with the plan for our exercise today. In the first step, we will create a simple Java web app that uses virtual threads for processing incoming HTTP requests. Before we run the sample app we will install Knative on Kubernetes to quickly test autoscaling based on HTTP traffic. We will also install Prometheus on Kubernetes. This monitoring stack allows us to compare the performance of the app without/with GraalVM and virtual threads on Kubernetes. Then, we can proceed with the deployment. In order to easily build and run our native app on Kubernetes we will use Cloud Native Buildpacks. Finally, we will perform some load tests and compare metrics.

Source Code

If you would like to try it by yourself, you may always take a look at my source code. In order to do that you need to clone my GitHub repository. After that, you should follow my instructions.

Create Java App with Virtual Threads

In the first step, we will create a simple Java app that acts as an HTTP server and handles incoming requests. In order to do that, we can use the HttpServer object from the core Java API. Once we create the server, we can override the default thread executor with the setExecutor method. In the end, we will try to compare the app using standard threads with the same app using virtual threads. Therefore, we allow overriding the type of executor using an environment variable named THREAD_TYPE. If you want to enable virtual threads, you need to set its value to virtual. Here’s the main method of our app.

public class MainApp {

   public static void main(String[] args) throws IOException {
      HttpServer httpServer = HttpServer
         .create(new InetSocketAddress(8080), 0);

      httpServer.createContext("/example", 
         new SimpleCPUConsumeHandler());

      // null-safe: falls back to the fixed pool if THREAD_TYPE is unset
      if ("virtual".equals(System.getenv("THREAD_TYPE"))) {
         httpServer.setExecutor(
            Executors.newVirtualThreadPerTaskExecutor());
      } else {
         httpServer.setExecutor(Executors.newFixedThreadPool(200));
      }
      httpServer.start();
   }

}

In order to process incoming requests, the HTTP server uses a handler that implements the HttpHandler interface. In our case, the handler is implemented in the SimpleCPUConsumeHandler class shown below. It consumes a lot of CPU, since it creates an instance of BigInteger using a constructor that performs heavy computation under the hood (it searches for a probable prime of the given bit length). This also takes a noticeable amount of time, so it simulates processing time in the same step. As a response, we just return the next number in the sequence with the Hello_ prefix.

public class SimpleCPUConsumeHandler implements HttpHandler {

   Logger LOG = Logger.getLogger("handler");
   AtomicLong i = new AtomicLong();
   final Integer cpus = Runtime.getRuntime().availableProcessors();

   @Override
   public void handle(HttpExchange exchange) throws IOException {
      new BigInteger(1000, 3, new Random());
      String response = "Hello_" + i.incrementAndGet();
      LOG.log(Level.INFO, "(CPU->{0}) {1}", 
         new Object[] {cpus, response});
      exchange.sendResponseHeaders(200, response.length());
      OutputStream os = exchange.getResponseBody();
      os.write(response.getBytes());
      os.close();
   }
}
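To get a feel for why that single BigInteger line burns CPU, you can time the probable-prime constructor in isolation. The snippet below is just an illustrative sketch (the class name and timing output are mine, not part of the sample app), and the elapsed time will of course vary by machine:

```java
import java.math.BigInteger;
import java.util.Random;

public class PrimeCost {

   public static void main(String[] args) {
      long start = System.nanoTime();
      // Same call as in the handler: generates a random 1000-bit
      // probable prime (certainty 3), which is CPU-intensive.
      BigInteger prime = new BigInteger(1000, 3, new Random());
      long elapsedMs = (System.nanoTime() - start) / 1_000_000;

      System.out.println("bitLength = " + prime.bitLength());
      System.out.println("took " + elapsedMs + " ms");
   }
}
```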

In order to use virtual threads in Java 19, we need to enable preview mode during compilation. With Maven, we can enable preview features using the maven-compiler-plugin as shown below.

<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-compiler-plugin</artifactId>
  <version>3.10.1</version>
  <configuration>
    <release>19</release>
    <compilerArgs>
      <arg>--enable-preview</arg>
    </compilerArgs>
  </configuration>
</plugin>
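As a side note, here is a minimal, self-contained sketch of the virtual-thread-per-task executor used by our app. It assumes Java 21+, where virtual threads are final and no preview flags are needed (on Java 19 the same code requires --enable-preview at compile and run time); the class name and task count are just for illustration:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

public class VirtualThreadsDemo {

   public static void main(String[] args) {
      AtomicInteger completed = new AtomicInteger();
      // Each submitted task runs on its own cheap virtual thread.
      try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
         for (int i = 0; i < 10_000; i++) {
            executor.submit(() -> {
               try {
                  // Blocking parks the virtual thread without
                  // pinning an OS carrier thread.
                  Thread.sleep(10);
               } catch (InterruptedException e) {
                  Thread.currentThread().interrupt();
               }
               completed.incrementAndGet();
            });
         }
      } // close() waits for all submitted tasks to finish
      System.out.println("completed = " + completed.get());
   }
}
```

Spawning 10,000 platform threads this way would be very expensive; with virtual threads it completes almost instantly, which is exactly what makes them attractive for per-request executors like ours.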

Install Knative on Kubernetes

This and the next step are not required to run the native application on Kubernetes. We will use Knative to easily autoscale the app in reaction to the volume of incoming traffic. In the next section, I’ll describe how to run a monitoring stack on Kubernetes.

The simplest way to install Knative on Kubernetes is with the kubectl command. We just need the Knative Serving component without any additional features. The Knative CLI (kn) is not required. We will deploy the application from the YAML manifest using Skaffold.

First, let’s install the required custom resources with the following command:

$ kubectl apply -f https://github.com/knative/serving/releases/download/knative-v1.8.3/serving-crds.yaml

Then, we can install the core components of Knative Serving by running the command:

$ kubectl apply -f https://github.com/knative/serving/releases/download/knative-v1.8.3/serving-core.yaml

In order to access Knative services outside of the Kubernetes cluster we also need to install a networking layer. By default, Knative uses Kourier as an ingress. We can install the Kourier controller by running the following command.

$ kubectl apply -f https://github.com/knative/net-kourier/releases/download/knative-v1.8.1/kourier.yaml

Finally, let’s configure Knative Serving to use Kourier with the following command:

$ kubectl patch configmap/config-network \
  --namespace knative-serving \
  --type merge \
  --patch '{"data":{"ingress-class":"kourier.ingress.networking.knative.dev"}}'

If you don’t have an external domain configured, or you are running Knative on a local cluster, you need to configure DNS. Otherwise, you would have to run curl commands with a Host header. Knative provides a Kubernetes Job that sets sslip.io as the default DNS suffix.

$ kubectl apply -f https://github.com/knative/serving/releases/download/knative-v1.8.3/serving-default-domain.yaml

The generated URL contains the name of the service, the namespace, and the address of your Kubernetes cluster. Since I’m running my service on the local Kubernetes cluster in the demo-sless namespace, my service is available under the following address:

http://sample-java-concurrency.demo-sless.127.0.0.1.sslip.io

But before we deploy the sample app on Knative, let’s do some other things.

Install Prometheus Stack on Kubernetes

As I mentioned before, we can also install a monitoring stack on Kubernetes.

The simplest way to install it is with the kube-prometheus-stack Helm chart. The package contains Prometheus and Grafana. It also includes all required rules and dashboards to visualize the basic metrics of your Kubernetes cluster. Firstly, let’s add the Helm repository containing our chart:

$ helm repo add prometheus-community https://prometheus-community.github.io/helm-charts

Then we can install the kube-prometheus-stack Helm chart in the prometheus namespace with the following command:

$ helm install prometheus-stack prometheus-community/kube-prometheus-stack  \
    -n prometheus \
    --create-namespace

If everything goes fine, you should see a similar list of Kubernetes services:

$ kubectl get svc -n prometheus
NAME                                        TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)                      AGE
alertmanager-operated                       ClusterIP   None             <none>        9093/TCP,9094/TCP,9094/UDP   11s
prometheus-operated                         ClusterIP   None             <none>        9090/TCP                     10s
prometheus-stack-grafana                    ClusterIP   10.96.218.142    <none>        80/TCP                       23s
prometheus-stack-kube-prom-alertmanager     ClusterIP   10.105.10.183    <none>        9093/TCP                     23s
prometheus-stack-kube-prom-operator         ClusterIP   10.98.190.230    <none>        443/TCP                      23s
prometheus-stack-kube-prom-prometheus       ClusterIP   10.111.158.146   <none>        9090/TCP                     23s
prometheus-stack-kube-state-metrics         ClusterIP   10.100.111.196   <none>        8080/TCP                     23s
prometheus-stack-prometheus-node-exporter   ClusterIP   10.102.39.238    <none>        9100/TCP                     23s

We will analyze Grafana dashboards with memory and CPU statistics. We can enable port forwarding to access Grafana locally on a chosen port, for example 9080:

$ kubectl port-forward svc/prometheus-stack-grafana 9080:80 -n prometheus

The default Grafana username is admin, and the password is prom-operator.

We will create two panels in a custom Grafana dashboard. The first of them shows the memory usage per pod in the demo-sless namespace.

sum(container_memory_working_set_bytes{namespace="demo-sless"} / (1024 * 1024)) by (pod)

The second shows the average CPU usage per pod in the demo-sless namespace. You can import both of these panels directly into Grafana from the k8s/grafana-dasboards.json file in the GitHub repo.

rate(container_cpu_usage_seconds_total{namespace="demo-sless"}[3m])

Build and Deploy a native Java Application

We have already created the sample app and then configured the Kubernetes environment. Now, we may proceed to the deployment phase. Our goal here is to simplify the process of building a native image and running it on Kubernetes as much as possible. Therefore, we will use Cloud Native Buildpacks and Skaffold. With Buildpacks we don’t need to have anything installed on our laptop besides Docker. Skaffold can be easily integrated with Buildpacks to automate the whole process of building and running the app on Kubernetes. You just need to install the skaffold CLI on your machine.

For building a native image of a Java application we may use Paketo Buildpacks. It provides a dedicated buildpack for GraalVM called Paketo GraalVM Buildpack. We should include it in the configuration using the paketo-buildpacks/graalvm name. Since Skaffold supports Buildpacks, we should set all the properties inside the skaffold.yaml file. We need to override some default settings with environment variables. First of all, we have to set the version of Java to 19 and enable preview features (virtual threads). The Kubernetes deployment manifest is available under the k8s/deployment.yaml path.

apiVersion: skaffold/v2beta29
kind: Config
metadata:
  name: sample-java-concurrency
build:
  artifacts:
  - image: piomin/sample-java-concurrency
    buildpacks:
      builder: paketobuildpacks/builder:base
      buildpacks:
        - paketo-buildpacks/graalvm
        - paketo-buildpacks/java-native-image
      env:
        - BP_NATIVE_IMAGE=true
        - BP_JVM_VERSION=19
        - BP_NATIVE_IMAGE_BUILD_ARGUMENTS=--enable-preview
  local:
    push: true
deploy:
  kubectl:
    manifests:
    - k8s/deployment.yaml

Knative simplifies not only autoscaling but also Kubernetes manifests. Here’s the manifest for our sample app, available in the k8s/deployment.yaml file. We need to define a single Service object containing the details of the application container. We will change the autoscaling target from the default 200 concurrent requests to 80. This means that if a single instance of the app processes more than 80 requests simultaneously, Knative will create a new instance of the app (or a pod, to be more precise). In order to enable virtual threads for our app, we also need to set the THREAD_TYPE environment variable to virtual.

apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: sample-java-concurrency
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/target: "80"
    spec:
      containers:
        - name: sample-java-concurrency
          image: piomin/sample-java-concurrency
          ports:
            - containerPort: 8080
          env:
            - name: THREAD_TYPE
              value: virtual
            - name: JAVA_TOOL_OPTIONS
              value: --enable-preview
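To relate the target to the replica counts observed later during load testing, the Knative autoscaler can be roughly approximated as sizing the revision at ceil(observed concurrency / target). The snippet below is my simplified sketch of that math (the real algorithm averages concurrency over stable and panic windows), not actual Knative code:

```java
public class KnativeScalingSketch {

   // Simplified concurrency-based scaling:
   // desired replicas = ceil(observed concurrent requests / target),
   // with at least one replica while traffic is flowing.
   static int desiredReplicas(int concurrentRequests, int target) {
      return Math.max(1, (int) Math.ceil((double) concurrentRequests / target));
   }

   public static void main(String[] args) {
      int target = 80; // autoscaling.knative.dev/target from the manifest
      System.out.println("50 users  -> " + desiredReplicas(50, target));
      System.out.println("100 users -> " + desiredReplicas(100, target));
      System.out.println("200 users -> " + desiredReplicas(200, target));
   }
}
```

With a target of 80, this predicts 1 pod for 50 users, 2 for 100, and 3 for 200, which matches the "theoretical" instance counts discussed in the test sections below.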

Assuming you already installed Skaffold, the only thing you need to do is to run the following command:

$ skaffold run -n demo-sless

Alternatively, you can deploy a prebuilt image from my registry on Docker Hub. However, in that case, you need to change the image tag in the deployment.yaml manifest to virtual-native.

Once you deploy the app, you can verify the list of Knative Services. The name of our target service is sample-java-concurrency. The address of the service is returned in the URL field.

$ kn service list -n demo-sless

Load Testing

We will run three testing scenarios today. In the first, we will test standard (JIT) compilation with a fixed thread pool of 200 threads. In the second, we will test standard compilation with virtual threads. The final test will check native compilation in conjunction with virtual threads. In all these scenarios, we will set the same autoscaling target: 80 concurrent requests. I’m using the k6 tool for load tests. Each test scenario consists of four steps of 2 minutes each. In the first step, we simulate 50 users.

$ k6 run -u 50 -d 120s k6-test.js

Then, we are simulating 100 users.

$ k6 run -u 100 -d 120s k6-test.js

Finally, we run the test for 200 users twice. So, in total, there are four tests with 50, 100, 200, and 200 users, which takes 8 minutes.

$ k6 run -u 200 -d 120s k6-test.js

Let’s verify the results. By the way, here is our k6 test script written in JavaScript.

import http from 'k6/http';
import { check } from 'k6';

export default function () {
  const res = http.get(`http://sample-java-concurrency.demo-sless.127.0.0.1.sslip.io/example`);
  check(res, {
    'is status 200': (res) => res.status === 200,
    'body size is > 0': (r) => r.body.length > 0,
  });
}

Test for Standard Compilation and Threads

The diagram below shows memory usage at each phase of the test scenario. After simulating 200 users, Knative scales up the number of instances. Theoretically, it should have done that already during the 100-user test, but Knative measures incoming traffic at the level of the sidecar container inside the pod. The memory usage for the first instance is around ~900MB (this also includes the sidecar container’s usage).

graalvm-virtual-threads-kubernetes-memory

Here’s a similar view, but for CPU usage. The highest consumption, ~1.2 cores, occurred before autoscaling kicked in. Then, depending on the number of instances, it ranged from ~0.4 to ~0.7 cores. As I mentioned before, we are using the time-consuming BigInteger constructor to simulate CPU usage under heavy load.

graalvm-virtual-threads-kubernetes-cpu

Here are the test results for 50 users. The application was able to process ~105k requests in 2 minutes. The highest processing time value was ~3 seconds.

graalvm-virtual-threads-kubernetes-load-test

Here are the test results for 100 users. The application was able to process ~130k requests in 2 minutes with an average response time of ~90ms.

graalvm-virtual-threads-kubernetes-heavy-load

Finally, we have the results for the 200-user test. The application was able to process ~135k requests in 2 minutes with an average response time of ~175ms. The failure rate was 0.02%.

Test for Standard Compilation and Virtual Threads

As in the previous section, here’s the diagram showing memory usage at each phase of the test scenario. After simulating 100 users, Knative scales up the number of instances. Theoretically, it should run a third instance of the app for 200 users. The memory usage for the first instance is around ~850MB (this also includes the sidecar container’s usage).

graalvm-virtual-threads-kubernetes-memory-2

Here’s a similar view, but for CPU usage. The highest consumption, ~1.1 cores, occurred before autoscaling kicked in. Then, depending on the number of instances, it ranged from ~0.3 to ~0.7 cores.

Here are the test results for 50 users. The application was able to process ~105k requests in 2 minutes. The highest processing time value was ~2.2 seconds.

Here are the test results for 100 users. The application was able to process ~115k requests in 2 minutes with an average response time of ~100ms.

Finally, we have the results for the 200-user test. The application was able to process ~135k requests in 2 minutes with an average response time of ~180ms. The failure rate was 0.02%.

Test for Native Compilation and Virtual Threads

As in the previous section, here’s the diagram showing memory usage at each phase of the test scenario. After simulating 100 users, Knative scales up the number of instances. Theoretically, it should run a third instance of the app for 200 users (the third pod visible on the diagram was, in fact, in the Terminating phase for some time). The memory usage for the first instance is around ~50MB.

graalvm-virtual-threads-kubernetes-native-memory

Here’s a similar view, but for CPU usage. The highest consumption, ~1.3 cores, occurred before autoscaling kicked in. Then, depending on the number of instances, it ranged from ~0.3 to ~0.9 cores.

Here are the test results for 50 users. The application was able to process ~75k requests in 2 minutes. The highest processing time value was ~2 seconds.

Here are the test results for 100 users. The application was able to process ~85k requests in 2 minutes with an average response time of ~140ms.

Finally, we have the results for the 200-user test. The application was able to process ~100k requests in 2 minutes with an average response time of ~240ms. Moreover, there were no failures during the second 200-user run.

Summary

In this article, I compared the behavior of a Java app compiled natively with GraalVM and using virtual threads on Kubernetes against the standard approach. There are several conclusions from all the described tests:

  • There are no significant differences between standard and virtual threads when it comes to resource usage or request processing time. Resource usage is slightly lower for virtual threads, while processing time is slightly lower for standard threads. However, if our handler method took more time, this proportion would change in favor of virtual threads.
  • Autoscaling works noticeably better for virtual threads, although I’m not sure why 🙂 In any case, the number of instances was scaled up at 100 users with a target of 80 for virtual threads, while for standard threads it was not. Virtual threads also give us more flexibility when setting an autoscaling target: for standard threads, we have to choose a value lower than the thread pool size, while for virtual threads we can set any reasonable value.
  • Native compilation significantly reduces app memory usage. For the native app, it was ~50MB instead of ~900MB. On the other hand, CPU consumption was slightly higher for the native app.
  • The native app processes requests more slowly than the standard app. In all the tests, it handled roughly 30% fewer requests than the standard app.

The post Native Java with GraalVM and Virtual Threads on Kubernetes appeared first on Piotr's TechBlog.
