micrometer Archives - Piotr's TechBlog

Kafka Tracing with Spring Boot and Open Telemetry

piotr.minkowski — Wed, 15 Nov 2023 11:26:33 +0000

In this article, you will learn how to configure tracing for Kafka producer and consumer with Spring Boot and Open Telemetry. We will use the Micrometer library for sending traces and Jaeger for storing and visualizing them. Spring Kafka comes with built-in integration with Micrometer for the KafkaTemplate and listener containers. You will also see how to configure the Spring Kafka observability to add our custom tags to traces.

If you are interested in Kafka and Spring Boot, you may find several articles on my blog about it. To read about concurrency with Kafka and Spring Boot read the following post. For example, there is also an interesting article about Kafka transactions here.

Source Code

If you would like to try it by yourself, you may always take a look at my source code. In order to do that you need to clone my GitHub repository. Then you should go to the kafka directory. After that, you should just follow my instructions. Let’s begin.

Dependencies

Let’s take a look at the list of required Maven dependencies. It is the same for both of our sample Spring Boot apps. Of course, we need to add the Spring Boot starter and the Spring Kafka for sending or receiving messages. In order to automatically generate traces related to each message, we are including the Spring Boot Actuator and the Micrometer Tracing Open Telemetry bridge. Finally, we need to include the opentelemetry-exporter-otlp library to export traces outside the app.


  
    org.springframework.boot
    spring-boot-starter
  
  
    org.springframework.boot
    spring-boot-starter-actuator
  
  
    org.springframework.kafka
    spring-kafka
  
  
    com.fasterxml.jackson.core
    jackson-databind
  
  
    io.micrometer
    micrometer-tracing-bridge-otel
  
  
    io.opentelemetry
    opentelemetry-exporter-otlp

Spring Boot Kafka Tracing for Producer

Our apps don’t do anything complicated. They are just sending and receiving messages. Here’s the class representing the message exchanged between both apps.

public class Info {

    private Long id;
    private String source;
    private String space;
    private String cluster;
    private String message;

    public Info(Long id, String source, String space, String cluster, 
                String message) {
       this.id = id;
       this.source = source;
       this.space = space;
       this.cluster = cluster;
       this.message = message;
    }

   // GETTERS AND SETTERS
}

Let’s begin with the producer app. It generates and sends one message per second. Here’s the implementation of a @Service bean responsible for producing messages. It injects and uses the KafkaTemplate bean for that.

@Service
public class SenderService {

   private static final Logger LOG = LoggerFactory
      .getLogger(SenderService.class);

   AtomicLong id = new AtomicLong();
   @Autowired
   KafkaTemplate template;

   @Value("${POD:kafka-producer}")
   private String pod;
   @Value("${NAMESPACE:empty}")
   private String namespace;
   @Value("${CLUSTER:localhost}")
   private String cluster;
   @Value("${TOPIC:info}")
   private String topic;

   @Scheduled(fixedRate = 1000)
   public void send() {
      Info info = new Info(id.incrementAndGet(), 
                           pod, 
                           namespace, 
                           cluster, 
                           "HELLO");
      CompletableFuture> result = template
         .send(topic, info.getId(), info);
      result.whenComplete((sr, ex) ->
                LOG.info("Sent({}): {}", sr.getProducerRecord().key(), 
                         sr.getProducerRecord().value()));
    }

}

Spring Boot provides an auto-configured instance of KafkaTemplate. However, to enable Kafka tracing with Spring Boot we need to customize that instance. Here’s the implementation of the KafkaTemplate bean inside the producer app’s main class. In order to enable tracing, we need to invoke the setObservationEnabled method. By default, the Micrometer module generates some generic tags. We want to add at least the name of the target topic and the Kafka message key. Therefore we are creating our custom implementation of the KafkaTemplateObservationConvention interface. It uses the KafkaRecordSenderContext to retrieve the topic name and the message key from the ProducerRecord object.

@SpringBootApplication
@EnableScheduling
public class KafkaProducer {

   private static final Logger LOG = LoggerFactory
      .getLogger(KafkaProducer.class);

   public static void main(String[] args) {
      SpringApplication.run(KafkaProducer.class, args);
   }

   @Bean
   public NewTopic infoTopic() {
      return TopicBuilder.name("info")
             .partitions(1)
             .replicas(1)
             .build();
   }

   @Bean
   public KafkaTemplate kafkaTemplate(ProducerFactory producerFactory) {
      KafkaTemplate t = new KafkaTemplate<>(producerFactory);
      t.setObservationEnabled(true);
      t.setObservationConvention(new KafkaTemplateObservationConvention() {
         @Override
         public KeyValues getLowCardinalityKeyValues(KafkaRecordSenderContext context) {
            return KeyValues.of("topic", context.getDestination(),
                    "id", String.valueOf(context.getRecord().key()));
         }
      });
      return t;
   }

}

We also need to set the address of the Jaeger instance and decide which percentage of spans will be exported. Here’s the application.yml file with the required properties:

spring:
  application.name: kafka-producer
  kafka:
    bootstrap-servers: ${KAFKA_URL:localhost}:9092
    producer:
      key-serializer: org.apache.kafka.common.serialization.LongSerializer
      value-serializer: org.springframework.kafka.support.serializer.JsonSerializer

management:
  tracing:
    enabled: true
    sampling:
      probability: 1.0
  otlp:
    tracing:
      endpoint: http://jaeger:4318/v1/traces

Spring Boot Kafka Tracing for Consumer

Let’s switch to the consumer app. It just receives and prints messages coming to the Kafka topic. Here’s the implementation of the listener @Service. Besides the whole message content, it also prints the message key and a topic partition number.

@Service
public class ListenerService {

   private static final Logger LOG = LoggerFactory
      .getLogger(ListenerService.class);

   @KafkaListener(id = "info", topics = "${app.in.topic}")
   public void onMessage(@Payload Info info,
                         @Header(name = KafkaHeaders.RECEIVED_KEY, required = false) Long key,
                         @Header(KafkaHeaders.RECEIVED_PARTITION) int partition) {
      LOG.info("Received(key={}, partition={}): {}", key, partition, info);
   }

}

In order to generate and export traces on the consumer side we need to override the ConcurrentKafkaListenerContainerFactory bean. For the container listener factory, we should obtain the ContainerProperties instance and then invoke the setObservationEnabled method. The same as before we can create a custom implementation of the KafkaTemplateObservationConvention interface to include the additional tags (optionally).

@SpringBootApplication
@EnableKafka
public class KafkaConsumer {

   private static final Logger LOG = LoggerFactory
      .getLogger(KafkaConsumer.class);

    public static void main(String[] args) {
        SpringApplication.run(KafkaConsumer.class, args);
    }

    @Value("${app.in.topic}")
    private String topic;

    @Bean
    public ConcurrentKafkaListenerContainerFactory listenerFactory(ConsumerFactory consumerFactory) {
        ConcurrentKafkaListenerContainerFactory factory =
                new ConcurrentKafkaListenerContainerFactory<>();
        factory.getContainerProperties().setObservationEnabled(true);
        factory.setConsumerFactory(consumerFactory);
        return factory;
    }

    @Bean
    public NewTopic infoTopic() {
        return TopicBuilder.name(topic)
                .partitions(10)
                .replicas(3)
                .build();
    }

}

Of course, we also need to set a Jaeger address in the application.yml file:

spring:
  application.name: kafka-consumer
  kafka:
    bootstrap-servers: ${KAFKA_URL:localhost}:9092
    consumer:
      key-deserializer: org.apache.kafka.common.serialization.LongDeserializer
      value-deserializer: org.springframework.kafka.support.serializer.JsonDeserializer
      properties:
        spring.json.trusted.packages: "*"

app.in.topic: ${TOPIC:info}

management:
  tracing:
    enabled: true
    sampling:
      probability: 1.0
  otlp:
    tracing:
      endpoint: http://jaeger:4318/v1/traces

Trying on Docker

Once we finish the implementation we can try out our solution. We will run both Kafka and Jaeger as Docker containers. Firstly, let’s build the project and container images for the producer and consumer apps. Spring Boot provides built-in tools for that. Therefore, we just need to execute the following command:

$ mvn clean package spring-boot:build-image

After that, we can define the docker-compose.yml file with a list of containers. It is possible to dynamically override Spring Boot properties using a style based on environment variables. Thanks to that, we can easily change the Kafka and Jaeger addresses for the containers. Here’s our docker-compose.yml:

version: "3.8"
services:
  broker:
    image: moeenz/docker-kafka-kraft:latest
    restart: always
    ports:
      - "9092:9092"
    environment:
      - KRAFT_CONTAINER_HOST_NAME=broker
  jaeger:
    image: jaegertracing/all-in-one:latest
    ports:
      - "16686:16686"
      - "4317:4317"
      - "4318:4318"
  producer:
    image: library/producer:1.0-SNAPSHOT
    links:
      - broker
      - jaeger
    environment:
      MANAGEMENT_OTLP_TRACING_ENDPOINT: http://jaeger:4318/v1/traces
      SPRING_KAFKA_BOOTSTRAP_SERVERS: broker:9092
  consumer:
    image: library/consumer:1.0-SNAPSHOT
    links:
      - broker
      - jaeger
    environment:
      MANAGEMENT_OTLP_TRACING_ENDPOINT: http://jaeger:4318/v1/traces
      SPRING_KAFKA_BOOTSTRAP_SERVERS: broker:9092

Let’s run all the defined containers with the following command:

$ docker compose up

Our apps are running and exchanging messages:

The Jaeger dashboard is available under the 16686 port. As you see, there are several traces with the kafka-producer and kafka-consumer spans.

We can go into the details of each entry. The trace generated by the producer app is always correlated to the trace generated by the consumer app for every single message. There are also our two custom tags (id and topic) with values added by the KafkaTemplate bean.

Running on Kubernetes

Our sample apps are prepared for being deployed on Kubernetes. You can easily do it with the Skaffold CLI. Before that, we need to install Kafka and Jaeger on Kubernetes. I will not get into details about Kafka installation. You can find a detailed description of how to run Kafka on Kubernetes with the Strimzi operator in my article available here. After that, we can proceed to the Jaeger installation. In the first step, we need to add the following Helm repository:

$ helm repo add jaegertracing https://jaegertracing.github.io/helm-charts

By default, the Jaeger Helm chart doesn’t expose OTLP endpoints. In order to enable them, we need to override some default settings. Here’s our values YAML manifest:

collector:
  service:
    otlp:
      grpc:
        name: otlp-grpc
        port: 4317
      http:
        name: otlp-http
        port: 4318

Let’s install Jaeger in the jaeger namespace with the parameters from jaeger-values.yaml:

$ helm install jaeger jaegertracing/jaeger -n jaeger \
    --create-namespace \
    -f jaeger-values.yaml

Once we install Jaeger we can verify a list of Kubernetes Services. We will use the jaeger-collector service to send traces for the apps and the jaeger-query service to access the UI dashboard.

$ kubectl get svc -n jaeger
NAME               TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                                           AGE
jaeger-agent       ClusterIP   10.96.147.104           5775/UDP,6831/UDP,6832/UDP,5778/TCP,14271/TCP     14m
jaeger-cassandra   ClusterIP   None                    7000/TCP,7001/TCP,7199/TCP,9042/TCP,9160/TCP      14m
jaeger-collector   ClusterIP   10.96.111.236           14250/TCP,14268/TCP,4317/TCP,4318/TCP,14269/TCP   14m
jaeger-query       ClusterIP   10.96.88.64             80/TCP,16685/TCP,16687/TCP                        14m

Finally, we can run our sample Spring Boot apps that connect to Kafka and Jaeger. Here’s the Deployment object for the producer app. It overrides the default Kafka and Jaeger addresses by defining the KAFKA_URL and MANAGEMENT_OTLP_TRACING_ENDPOINT environment variables.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: producer
spec:
  selector:
    matchLabels:
      app: producer
  template:
    metadata:
      labels:
        app: producer
    spec:
      containers:
      - name: producer
        image: piomin/producer
        resources:
          requests:
            memory: 200Mi
            cpu: 100m
        ports:
        - containerPort: 8080
        env:
          - name: MANAGEMENT_OTLP_TRACING_ENDPOINT
            value: http://jaeger-collector.jaeger:4318/v1/traces
          - name: KAFKA_URL
            value: my-cluster-kafka-bootstrap
          - name: CLUSTER
            value: c1
          - name: TOPIC
            value: test-1
          - name: POD
            valueFrom:
              fieldRef:
                fieldPath: metadata.name
          - name: NAMESPACE
            valueFrom:
              fieldRef:
                fieldPath: metadata.namespace

Here’s a similar Deployment object for the consumer app:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: consumer-1
spec:
  selector:
    matchLabels:
      app: consumer-1
  template:
    metadata:
      labels:
        app: consumer-1
    spec:
      containers:
      - name: consumer
        image: piomin/consumer
        resources:
          requests:
            memory: 200Mi
            cpu: 100m
        ports:
        - containerPort: 8080
        env:
          - name: SPRING_APPLICATION_NAME
            value: kafka-consumer-1
          - name: TOPIC
            value: test-1
          - name: KAFKA_URL
            value: my-cluster-kafka-bootstrap
          - name: MANAGEMENT_OTLP_TRACING_ENDPOINT
            value: http://jaeger-collector.jaeger:4318/v1/traces

Assuming that you are inside the kafka directory in the Git repository, you just need to run the following command to deploy both apps. By the way, I’ll create two deployments of the consumer app (consumer-1 and consumer-2) just for Jaeger visualization purposes.

$ skaffold run -n strimzi --tail

Once you run the apps, you can go to the Jaeger dashboard and verify the list of traces. In order to access the dashboard, we can enable port forwarding for the jaeger-query Service.

$ kubectl port-forward svc/jaeger-query 80:80

Final Thoughts

Integration between Spring Kafka and Micrometer Tracing is a relatively new feature available since the 3.0 version. It is possible, that it will be improved soon with some new features. Anyway, currently it gives a simple way to generate and send traces from Kafka producers and consumers.

The post Kafka Tracing with Spring Boot and Open Telemetry appeared first on Piotr's TechBlog.

Spring Boot 3 Observability with Grafana

piotr.minkowski — Thu, 03 Nov 2022 09:34:50 +0000

This article will teach you how to configure observability for your Spring Boot applications. We assume that observability is understood as the interconnection between metrics, logging, and distributed tracing. In the end, it should allow you to monitor the state of your system to detect errors and latency.

There are some significant changes in the approach to observability between Spring Boot 2 and 3. Tracing will no longer be part of Spring Cloud through the Spring Cloud Sleuth project. The core of that project has been moved to Micrometer Tracing. You can read more about the reasons and future plans in this post on the Spring blog.

The main goal of that article is to give you a simple receipt for how to enable observability for your microservices written in Spring Boot using a new approach. In order to simplify our exercise, we will use a fully managed Grafana instance in their cloud. We will build a very basic architecture with two microservices running locally. Let’s take a moment to discuss our architecture in great detail.

Source Code

If you would like to try it by yourself, you may always take a look at my source code. In order to do that you need to clone my GitHub repository. It contains several tutorials. You need to go to the inter-communication directory. After that, you should just follow my instructions.

Spring Boot Observability Architecture

There are two applications: inter-callme-service and inter-caller-service. The inter-caller-service app calls the HTTP endpoint exposed by the inter-callme-service app. We run two instances of inter-callme-service. We will also configure a static load balancing between those two instances using Spring Cloud Load Balancer. All the apps will expose Prometheus metrics using Spring Boot Actuator and the Micrometer project. For tracing, we are going to use Open Telemetry with Micrometer Tracing and OpenZipkin. In order to send all the data including logs, metrics, and traces from our local Spring Boot instances to the cloud, we have to use Grafana Agent.

On the other hand, there is a stack responsible for collecting and visualizing all the data. As I mentioned before we will leverage Grafana Cloud for that. It is a very comfortable way since we don’t have to install and configure all the required tools. First of all, Grafana Cloud offers a ready instance of Prometheus responsible for collecting metrics. We also need a log aggregation tool for storing and querying logs from our apps. Grafana Cloud offers a preconfigured instance of Loki for that. Finally, we have a distributed tracing backend through the Grafana Tempo. Here’s the visualization of our whole architecture.

Enable Metrics and Tracing with Micrometer

In order to export metrics in Prometheus format, we need to include the micrometer-registry-prometheus dependency. For tracing context propagation we should add the micrometer-tracing-bridge-otel module. We should also export tracing spans using one of the formats supported by Grafana Tempo. It will be OpenZipkin through the opentelemetry-exporter-zipkin dependency.


  org.springframework.boot
  spring-boot-starter-web


  org.springframework.boot
  spring-boot-starter-actuator


  io.micrometer
  micrometer-registry-prometheus


  io.micrometer
  micrometer-tracing-bridge-otel


  io.opentelemetry
  opentelemetry-exporter-zipkin

We need to use the latest available version of Spring Boot 3. Currently, it is 3.0.0-RC1. As a release candidate that version is available in the Spring Milestone repository.


  org.springframework.boot
  spring-boot-starter-parent
  3.0.0-RC1

One of the more interesting new features in Spring Boot 3 is the support for Prometheus exemplars. Exemplars are references to data outside of the metrics published by an application. They allow linking metrics data to distributed traces. In that case, the published metrics contain a reference to the traceId. In order to enable exemplars for the particular metrics, we need to expose percentiles histograms. We will do that for http.server.requests metric (1). We will also send all the traces to Grafana Cloud by setting the sampling probability to 1.0 (2). Finally, just to verify it works properly we print traceId and spanId in the log line (3).

spring:
  application:
    name: inter-callme-service
  output.ansi.enabled: ALWAYS

management:
  endpoints.web.exposure.include: '*'
  metrics.distribution.percentiles-histogram.http.server.requests: true # (1)
  tracing.sampling.probability: 1.0 # (2)

logging.pattern.console: "%clr(%d{HH:mm:ss.SSS}){blue} %clr(%5p [${spring.application.name:},%X{traceId:-},%X{spanId:-}]){yellow} %clr(:){red} %clr(%m){faint}%n" # (3)

The inter-callme-service exposes the POST endpoint that just returns the message in the reversed order. We don’t need to add here anything, just standard Spring Web annotations.

@RestController
@RequestMapping("/callme")
class CallmeController(private val repository: ConversationRepository) {

   private val logger: Logger = LoggerFactory
       .getLogger(CallmeController::class.java)

   @PostMapping("/call")
   fun call(@RequestBody request: CallmeRequest) : CallmeResponse {
      logger.info("In: {}", request)
      val response = CallmeResponse(request.id, request.message.reversed())
      repository.add(Conversation(request = request, response = response))
      return response
   }

}

Load Balancing with Spring Cloud

In the endpoint exposed by the inter-caller-service we just call the endpoint from inter-callme-service. We use Spring RestTemplate for that. You can also declare Spring Cloud OpenFeign client, but it seems it does not currently support Micrometer Tracing out-of-the-box.

@RestController
@RequestMapping("/caller")
class CallerController(private val template: RestTemplate) {

   private val logger: Logger = LoggerFactory
      .getLogger(CallerController::class.java)

   private var id: Int = 0

   @PostMapping("/send/{message}")
   fun send(@PathVariable message: String): CallmeResponse? {
      logger.info("In: {}", message)
      val request = CallmeRequest(++id, message)
      return template.postForObject("http://inter-callme-service/callme/call",
         request, CallmeResponse::class.java)
   }

}

In this exercise, we will use a static client-side load balancer that distributes traffic across two instances of inter-callme-service. Normally, you would integrate Spring Cloud Load Balancer with service discovery based e.g. on Eureka. However, I don’t want to complicate our demo with external components in the architecture. Assuming we are running inter-callme-service on 55800 and 55900 here’s the load balancer configuration in the application.yml file:

spring:
  application:
    name: inter-caller-service
  cloud:
    loadbalancer:
      cache:
        ttl: 1s
      instances:
        - name: inter-callme-service
          servers: localhost:55800, localhost:55900

Since there is no built-in static load balancer implementation we need to add some code. Firstly, we have to inject configuration properties into the Spring bean.

@Configuration
@ConfigurationProperties("spring.cloud.loadbalancer")
class LoadBalancerConfigurationProperties {

   val instances: MutableList = mutableListOf()

   class ServiceConfig {
      var name: String = ""
      var servers: String = ""
   }

}

Then we need to create a bean that implements the ServiceInstanceListSupplier interface. It just returns a list of ServiceInstance objects that represents all defined static addresses.

class StaticServiceInstanceListSupplier(private val properties: LoadBalancerConfigurationProperties,
                                        private val environment: Environment): 
   ServiceInstanceListSupplier {

   override fun getServiceId(): String = environment
      .getProperty(LoadBalancerClientFactory.PROPERTY_NAME)!!

   override fun get(): Flux> {
      val serviceConfig: LoadBalancerConfigurationProperties.ServiceConfig? =
                properties.instances.find { it.name == serviceId }
      val list: MutableList =
                serviceConfig!!.servers.split(",", ignoreCase = false, limit = 0)
                        .map { StaticServiceInstance(serviceId, it) }.toMutableList()
      return Flux.just(list)
   }

}

Finally, we need to enable Spring Cloud Load Balancer for the app and annotate RestTemplate with @LoadBalanced.

@SpringBootApplication
@LoadBalancerClient(value = "inter-callme-service", configuration = [CustomCallmeClientLoadBalancerConfiguration::class])
class InterCallerServiceApplication {

    @Bean
    @LoadBalanced
    fun template(builder: RestTemplateBuilder): RestTemplate = 
       builder.build()

}

Here’s the client-side load balancer configuration. We are providing our custom StaticServiceInstanceListSupplier implementation as a default ServiceInstanceListSupplier. Then we set RandomLoadBalancer as a default implementation of a load-balancing algorithm.

class CustomCallmeClientLoadBalancerConfiguration {

   @Autowired
   lateinit var properties: LoadBalancerConfigurationProperties

   @Bean
   fun clientServiceInstanceListSupplier(
      environment: Environment, context: ApplicationContext):     
      ServiceInstanceListSupplier {
        val delegate = StaticServiceInstanceListSupplier(properties, environment)
        val cacheManagerProvider = context
           .getBeanProvider(LoadBalancerCacheManager::class.java)
        return if (cacheManagerProvider.ifAvailable != null) {
            CachingServiceInstanceListSupplier(delegate, cacheManagerProvider.ifAvailable)
        } else delegate
    }

   @Bean
   fun loadBalancer(environment: Environment, 
      loadBalancerClientFactory: LoadBalancerClientFactory):
            ReactorLoadBalancer {
        val name: String? = environment.getProperty(LoadBalancerClientFactory.PROPERTY_NAME)
        return RandomLoadBalancer(
           loadBalancerClientFactory
              .getLazyProvider(name, ServiceInstanceListSupplier::class.java),
            name!!
        )
    }
}

Testing Observability with Spring Boot

Let’s see how it works. In the first step, we are going to run two instances of inter-callme-service. Since we set a static value of the listen port we need to override the property server.port for each instance. We can do it with the env variable SERVER_PORT. Go to the inter-communication/inter-callme-service directory and run the following commands:

$ SERVER_PORT=55800 mvn spring-boot:run
$ SERVER_PORT=55900 mvn spring-boot:run

Then, go to the inter-communication/inter-caller-service directory and run a single instance on the default port 8080:

$ mvn spring-boot:run

Then, let’s call our endpoint POST /caller/send/{message} several times with parameters, for example:

$ curl -X POST http://localhost:8080/caller/call/hello1

Here are the logs from inter-caller-service with the highlighted value of the traceId parameter:

Let’s take a look at the logs from inter-callme-service. As you see the traceId parameter is the same as the traceId for that request on the inter-caller-service side.

Here are the logs from the second instance of inter-callme-service:

ou could also try the same exercise with Spring Cloud OpenFeign. It is configured and ready to use. However, for me, it didn’t propagate the traceId parameter properly. Maybe, it is the case with the current non-GA versions of Spring Boot and Spring Cloud.

Let’s verify another feature – Prometheus exemplars. In order to do that we need to call the /actuator/prometheus endpoint with the Accept header that is asking for the OpenMetrics format. This is the same header Prometheus will use to scrape the metrics.

$ curl -H 'Accept: application/openmetrics-text; version=1.0.0' http://localhost:8080/actuator/prometheus

As you see several metrics for the result contain traceId and spanId parameters. These are our exemplars.

Install and Configure Grafana Agent

Our sample apps are ready. Now, the main goal is to send all the collected observables to our account on Grafana Cloud. There are various ways of sending metrics, logs, and traces to the Grafana Stack. In this article, I will show you how to use the Grafana Agent for that. Firstly, we need to install it. You can find detailed installation instructions depending on your OS here. Since I’m using macOS I can do it with Homebrew:

$ brew install grafana-agent

Before we start the agent, we need to prepare a configuration file. The location of that file also depends on your OS. For me it is $(brew --prefix)/etc/grafana-agent/config.yml. The configuration YAML manifests contain information on how we want to collect and send metrics, traces, and logs. Let’s begin with the metrics. Inside the scrape_configs section, we need to set a list of endpoints for scraping (1) and a default path (2). Inside the remote_write section, we have to pass our Grafana Cloud instance auth credentials (3) and URL (4). By default, Grafana Agent does not send exemplars. Therefore we need to enable it with the send_exemplars property (5).

metrics:
  configs:
    - name: integrations
      scrape_configs:
        - job_name: agent
          static_configs:
            - targets: ['127.0.0.1:55800','127.0.0.1:55900','127.0.0.1:8080'] # (1)
          metrics_path: /actuator/prometheus # (2)
      remote_write:
        - basic_auth: # (3)
            password: 
            username: 
          url: https://prometheus-prod-01-eu-west-0.grafana.net/api/prom/push # (4)
          send_exemplars: true # (5)
  global:
    scrape_interval: 60s

You can find all information about your instance of Prometheus in the Grafana Cloud dashboard.

In the next step, we prepare a configuration for collecting and sending logs to Grafana Loki. The same as before we need to set auth credentials (1) and the URL (2) of our Loki instance. The most important thing here is to pass the location of log files (3).

logs:
  configs:
    - clients:
        - basic_auth: # (1)
            password: 
            username: 
          # (2)
          url: https://logs-prod-eu-west-0.grafana.net/loki/api/v1/push
      name: springboot
      positions:
        filename: /tmp/positions.yaml
      scrape_configs:
        - job_name: springboot-json
          static_configs:
            - labels:
                __path__: ${HOME}/inter-*/spring.log # (3)
                job: springboot-json
              targets:
                - localhost
      target_config:
        sync_period: 10s

By default, Spring Boot logs only to the console and does not write log files. In our case, the Grafana Agent reads log lines from output files. In order to write log files, we need to set a logging.file.name or logging.file.path property. Since there are two instances of inter-callme-service we need to distinguish somehow their log files. We will use the server.port property for that. The logs inside files are stored in JSON format.

logging:
  pattern:
    console: "%clr(%d{HH:mm:ss.SSS}){blue} %clr(%5p [${spring.application.name:},%X{traceId:-},%X{spanId:-}]){yellow} %clr(:){red} %clr(%m){faint}%n"
    file: "{\"timestamp\":\"%d{HH:mm:ss.SSS}\",\"level\":\"%p\",\"traceId\":\"%X{traceId:-}\",\"spanId\":\"%X{spanId:-}\",\"appName\":\"${spring.application.name}\",\"message\":\"%m\"}%n"
  file:
    path: ${HOME}/inter-callme-service-${server.port}

Finally, we will configure trace collecting. Besides auth credentials and URL of the Grafana Tempo instance, we need to enable OpenZipkin receiver (1).

traces:
  configs:
    - name: default
      remote_write:
        - endpoint: tempo-eu-west-0.grafana.net:443
          basic_auth:
            username: 
            password: 
      # (1)
      receivers:
        zipkin:

Then, we can start the agent with the following command:

$ brew services restart grafana-agent

Grafana agent contains a Zipkin collector that listens on the default port 9411. There is also an HTTP API exposed outside the agent on the port 12345 for verifying agent status.

For example, we can use Grafana Agent HTTP API to verify how many log files it is monitoring. To do that just call the endpoint GET agent/api/v1/logs/targets. As you see, for me, it is monitoring three files. So that’s what we exactly wanted to achieve since there are two running instances of inter-callme-service and a single instance of inter-caller-service.

Visualize Spring Boot Observability with Grafana Stack

One of the most important advantages of Grafana Cloud in our exercise is that we have all the required things configured. After you navigate to the Grafana dashboard you can display a list of available data sources. As you see, there are Loki, Prometheus, and Tempo already configured.

By default, Grafana Cloud enables exemplars in the Prometheus data source. If you are running Grafana by yourself, don’t forget to enable it on your Prometheus data source.

Let’s start with the logs. We will analyze exactly the same logs as in the section “Testing Observability with Spring Boot”. We will get all the logs sent by the Grafana Agent. As you probably remember, we formatted all the logs as JSON. Therefore, we will parse them using the Json parser on the server side. Thanks to that, we would be able to filter by all the log fields. For example, we can use the traceId label a filter expression: {job="springboot-json"} | json | traceId = 1bb1d7d78a5ac47e8ebc2da961955f87.

Here’s a full list of logs without any filtering. The highlighted lines contain logs of two analyzed traces.

In the next step, we will configure Prometheus metrics visualization. Since we enabled percentile histograms for the http.server.requests metrics, we have multiple buckets represented by the http_server_requests_seconds_bucket values. A set of multiple buckets _bucket with a label le which contains a count of all samples whose values are less than or equal to the numeric value contained in the le label. We need to count histograms for 90% and 60% percent of requests. Here are our Prometheus queries:

Here’s our histogram. Exemplars are shown as green diamonds.

When you hover over the selected exemplar, you will see more details. It includes, for example, the traceId value.

Finally, the last part of our exercise. We would like to analyze the particular traces with Grafana Tempo. The only thing you need to do is to choose the grafanacloud-*-traces data source and set the value of the searched traceId. Here’s a sample result.

Final Thoughts

The first GA release of Spring Boot 3 is just around the corner. Probably one of the most important things you will have to handle during migration from Spring Boot 2 is observability. In this article, you can find a detailed description of the current Spring Boot approach. If you are interested in Spring Boot, it’s worth reading about its best practices for building microservices in this article.

The post Spring Boot 3 Observability with Grafana appeared first on Piotr's TechBlog.

Spring Boot Autoscaling on Kubernetes

piotr.minkowski — Thu, 05 Nov 2020 09:39:05 +0000

Autoscaling based on the custom metrics is one of the features that may convince you to run your Spring Boot application on Kubernetes. By default, horizontal pod autoscaling can scale the deployment based on the CPU and memory. However, such an approach is not enough for more advanced scenarios. In that case, you can use Prometheus to collect metrics from your applications. Then, you may integrate Prometheus with the Kubernetes custom metrics mechanism.

Preface

In this article, you will learn how to run Prometheus on Kubernetes using the Helm package manager. You will use the chart that causes Prometheus to scrape a variety of Kubernetes resource types. Thanks to that you won’t have to configure it. In the next step, you will install the Prometheus Adapter. You can also do that using the Helm package manager. The adapter acts as a bridge between the Prometheus instance and the custom metrics API. Our Spring Boot application exposes metrics through the HTTP endpoint. You will learn how to configure autoscaling on Kubernetes based on the number of incoming requests.

Source code

If you would like to try it by yourself, you may always take a look at my source code. In order to do that, you need to clone my repository sample-spring-boot-on-kubernetes. Then you should go to the k8s directory. You can find there all the Kubernetes manifests and configuration files required for that exercise.

Our Spring Boot application is ready to be deployed on Kubernetes with Skaffold. You can find the skaffold.yaml file in the project root directory. Skaffold uses Jib Maven Plugin for building a Docker image. It deploys not only the Spring Boot application but also the Mongo database.

apiVersion: skaffold/v2beta5
kind: Config
metadata:
  name: sample-spring-boot-on-kubernetes
build:
  artifacts:
    - image: piomin/sample-spring-boot-on-kubernetes
      jib:
        args:
          - -Pjib
  tagPolicy:
    gitCommit: {}
deploy:
  kubectl:
    manifests:
      - k8s/mongodb-deployment.yaml
      - k8s/deployment.yaml

The only thing you need to do to build and deploy the application is to execute the following command. It also allows you to access HTTP API through the local port.

$ skaffold dev --port-forward

For more information about Skaffold, Jib and a local development of Java applications, you may refer to the article Local Java Development on Kubernetes

Kubernetes Autoscaling with Spring Boot – Architecture

The picture visible below shows the architecture of our sample system. The horizontal pod autoscaler (HPA) automatically scales the number of pods based on CPU, memory, or other custom metrics. It obtains the value of the metric by pulling the Custom Metrics API. In the beginning, we are running a single instance of our Spring Boot application on Kubernetes. Prometheus gathers and stores metrics from the application by calling HTTP endpoint /actuator/promentheus. Consequently, the HPA scales up the number of pods if the value of the metric exceeds the assumed value.

Run Prometheus on Kubernetes

Let’s start by running the Prometheus instance on Kubernetes. In order to do that, you should use the official Prometheus Helm chart. Firstly, you need to add the required repository.

$ helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
$ helm repo add stable https://charts.helm.sh/stable
$ helm repo update

Then, you should execute the Helm install command and provide the name of your installation.

$ helm install prometheus prometheus-community/prometheus

In a moment, the Prometheus instance is ready to use. You can access it through the Kubernetes Service prometheus-server. It is available on port 443. By default, the type of service is ClusterIP. Therefore, you should execute the kubectl port-forward command to access it on the local port.

Deploy Spring Boot on Kubernetes

In order to enable Prometheus support in Spring Boot, you need to include Spring Boot Actuator and the Micrometer Prometheus library. The full list of required dependencies also contains the Spring Web and Spring Data MongoDB modules.


   org.springframework.boot
   spring-boot-starter-web


   org.springframework.boot
   spring-boot-starter-actuator


   io.micrometer
   micrometer-registry-prometheus


   org.springframework.boot
   spring-boot-devtools


   org.springframework.boot
   spring-boot-starter-data-mongodb

After enabling Prometheus support, the application exposes metrics through the /actuator/prometheus endpoint.

Our example application is very simple. It exposes the REST API for CRUD operations and connects to the Mongo database. Just to clarify, here’s the REST controller implementation.

@RestController
@RequestMapping("/persons")
public class PersonController {

   private PersonRepository repository;
   private PersonService service;

   PersonController(PersonRepository repository, PersonService service) {
      this.repository = repository;
      this.service = service;
   }

   @PostMapping
   public Person add(@RequestBody Person person) {
      return repository.save(person);
   }

   @PutMapping
   public Person update(@RequestBody Person person) {
      return repository.save(person);
   }

   @DeleteMapping("/{id}")
   public void delete(@PathVariable("id") String id) {
      repository.deleteById(id);
   }

   @GetMapping
   public Iterable findAll() {
      return repository.findAll();
   }

   @GetMapping("/{id}")
   public Optional findById(@PathVariable("id") String id) {
      return repository.findById(id);
   }

}

You may add a new person, modify and delete it through the HTTP API. You can also find it by id or just find all available persons. The Spring Boot Actuator generates HTTP traffic statistics per each endpoint and exposes it in the form readable by Prometheus. The number of incoming requests is available in the http_server_requests_seconds_count metric. Consequently, we will use this metric for Spring Boot autoscaling on Kubernetes.

Prometheus collects a pretty large set of metrics from the whole cluster. However, by default, it is not gathering the logs from applications. In order to force Prometheus to scrape the particular pods, you must add annotations to the Deployment as shown below. The annotation prometheus.io/path indicates the context path with the metrics endpoint. Of course, you have to enable scraping for the application using the annotation prometheus.io/scrape. Finally, you need to set the number of HTTP port with prometheus.io/port.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: sample-spring-boot-on-kubernetes-deployment
spec:
  selector:
    matchLabels:
      app: sample-spring-boot-on-kubernetes
  template:
    metadata:
      annotations:
        prometheus.io/path: /actuator/prometheus
        prometheus.io/scrape: "true"
        prometheus.io/port: "8080"
      labels:
        app: sample-spring-boot-on-kubernetes
    spec:
      containers:
      - name: sample-spring-boot-on-kubernetes
        image: piomin/sample-spring-boot-on-kubernetes
        ports:
        - containerPort: 8080
        env:
          - name: MONGO_DATABASE
            valueFrom:
              configMapKeyRef:
                name: mongodb
                key: database-name
          - name: MONGO_USERNAME
            valueFrom:
              secretKeyRef:
                name: mongodb
                key: database-user
          - name: MONGO_PASSWORD
            valueFrom:
              secretKeyRef:
                name: mongodb
                key: database-password

Install Prometheus Adapter on Kubernetes

The Prometheus adapter pulls metrics from the Prometheus instance and exposes them as the Custom Metrics API. In this step, you will have to provide configuration for pulling a custom metric exposed by the Spring Boot Actuator. The http_server_requests_seconds_count metric contains a number of requests received by the particular HTTP endpoint. To clarify, let’s take a look at the list of http_server_requests_seconds_count metrics for the multiple /persons endpoints.

You need to override some configuration settings for the Prometheus adapter. Firstly, you should change the default address of the Prometheus instance. Since the name of the Prometheus Service is prometheus-server, you should change it to prometheus-server.default.svc. The number of HTTP port is 80. Then, you have to define a custom rule for pulling the required metric from Prometheus. It is important to override the name of the Kubernetes pod and namespace used as a metric tag by Prometheus. There are multiple entries for http_server_requests_seconds_count, so you must calculate the sum. The name of the custom Kubernetes metric is http_server_requests_seconds_count_sum.

prometheus:
  url: http://prometheus-server.default.svc
  port: 80
  path: ""

rules:
  default: true
  custom:
  - seriesQuery: '{__name__=~"^http_server_requests_seconds_.*"}'
    resources:
      overrides:
        kubernetes_namespace:
          resource: namespace
        kubernetes_pod_name:
          resource: pod
    name:
      matches: "^http_server_requests_seconds_count(.*)"
      as: "http_server_requests_seconds_count_sum"
    metricsQuery: sum(<<.Series>>{<<.LabelMatchers>>,uri=~"/persons.*"}) by (<<.GroupBy>>)

Now, you just need to execute the Helm install command with the location of the YAML configuration file.

$ helm install -f k8s\helm-config.yaml prometheus-adapter prometheus-community/prometheus-adapter

Finally, you can verify if metrics have been successfully pulled by executing the following command.

Create Kubernetes Horizontal Pod Autoscaler

In the last step of this tutorial, you will create a Kubernetes HorizontalPodAutoscaler. HorizontalPodAutoscaler automatically scales up the number of pods if the average value of the http_server_requests_seconds_count_sum metric exceeds 100. In other words, if your instance of application receives more than 100 requests, HPA automatically runs a new instance. Then, after sending another 100 requests, an average value of metric exceeds 100 once again. So, HPA runs a third instance of the application. Consequently, after sending 1k requests you should have 10 pods. The definition of our HorizontalPodAutoscaler is visible below.

apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: sample-hpa
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: sample-spring-boot-on-kubernetes-deployment
  minReplicas: 1
  maxReplicas: 10
  metrics:
    - type: Pods
      pods:
        metric:
          name: http_server_requests_seconds_count_sum
        target:
          type: AverageValue
          averageValue: 100

Testing Kubernetes autoscaling with Spring Boot

After deploying all the required components you may verify a list of running pods. As you see, there is a single instance of our Spring Boot application sample-spring-boot-on-kubernetes.

In order to check the current value of the http_server_requests_seconds_count_sum metric you can display the details about HorizontalPodAutoscaler. As you see I have already sent 15 requests to the different HTTP endpoints.

Here’s the sequence of requests you may send to the application to test autoscaling behavior.

$ curl http://localhost:8080/persons -d "{\"firstName\":\"Test\",\"lastName\":\"Test\",\"age\":20,\"gender\":\"MALE\"}" -H "Content-Type: application/
json"
{"id":"5fa334d149685f24841605a9","firstName":"Test","lastName":"Test","age":20,"gender":"MALE"}
$ curl http://localhost:8080/persons/5fa334d149685f24841605a9
{"id":"5fa334d149685f24841605a9","firstName":"Test","lastName":"Test","age":20,"gender":"MALE"}
$ curl http://localhost:8080/persons
[{"id":"5fa334d149685f24841605a9","firstName":"Test","lastName":"Test","age":20,"gender":"MALE"}]
$ curl -X DELETE http://localhost:8080/persons/5fa334d149685f24841605a9

After sending many HTTP requests to our application, you may verify the number of running pods. In that case, we have 5 instances.

You can also display a list of running pods.

Conclusion

In this article, I showed you a simple scenario of Kubernetes autoscaling with Spring Boot based on the number of incoming requests. You may easily create more advanced scenarios just by modifying a metric query in the Prometheus Adapter configuration file. I run all my tests on Google Cloud with GKE. For more information about running JVM applications on GKE please refer to Running Kotlin Microservice on Goggle Kubernetes Engine.

The post Spring Boot Autoscaling on Kubernetes appeared first on Piotr's TechBlog.