Data Grids Archives - Piotr's TechBlog
Java, Spring, Kotlin, microservices, Kubernetes, containers

Hazelcast with Spring Boot on Kubernetes
https://piotrminkowski.com/2020/01/31/hazelcast-with-spring-boot-on-kubernetes/ (Fri, 31 Jan 2020)

Hazelcast is the leading in-memory data grid (IMDG) solution. The main idea behind an IMDG is to distribute data across many nodes inside a cluster, which makes it an ideal fit for a cloud platform like Kubernetes, where you can easily scale the number of running instances up or down. Since Hazelcast is written in Java, you can easily integrate it with your Java application using standard libraries. Spring Boot can also simplify getting started with Hazelcast. You may additionally use Spring Data Hazelcast – an unofficial library implementing the Spring Data repositories pattern for Hazelcast.
The main goal of this article is to demonstrate how to embed Hazelcast in a Spring Boot application and run it on Kubernetes as a multi-instance cluster. Thanks to Spring Data Hazelcast we won't have to get into the details of Hazelcast data types. Although Spring Data Hazelcast does not provide many advanced features, it is a very good starting point.

Architecture

We are running multiple instances of a single Spring Boot application on Kubernetes. Each application exposes port 8080 for HTTP API access and port 5701 for Hazelcast cluster member discovery. Hazelcast instances are embedded in the Spring Boot applications. We are creating two services on Kubernetes: the first is dedicated to HTTP API access, while the second enables discovery between Hazelcast instances. The HTTP API will be used to make some test requests that add data to the cluster and find it there. Let's proceed to the implementation.

[Figure: architecture of the sample system – Spring Boot applications with embedded Hazelcast running on Kubernetes]

Example

The source code of the sample application is, as usual, available on GitHub: https://github.com/piomin/sample-hazelcast-spring-datagrid.git. You should access the module employee-kubernetes-service.
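To fetch it locally:

$ git clone https://github.com/piomin/sample-hazelcast-spring-datagrid.git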

Dependencies

The integration between Spring and Hazelcast is provided by the hazelcast-spring library. The version of the Hazelcast libraries is tied to Spring Boot via dependency management, so we just need to set the Spring Boot version to the newest stable 2.2.4.RELEASE. The Hazelcast version tied to this version of Spring Boot is 3.12.5. In order to enable Hazelcast member discovery on Kubernetes, we also need to include the hazelcast-kubernetes dependency. It is versioned independently of the core libraries. The newest version, 2.0, is dedicated to Hazelcast 4. Since we are still using Hazelcast 3, we declare version 1.5.2 of hazelcast-kubernetes. We also include Spring Data Hazelcast and, optionally, Lombok for simplification.

<parent>
   <groupId>org.springframework.boot</groupId>
   <artifactId>spring-boot-starter-parent</artifactId>
   <version>2.2.4.RELEASE</version>
</parent>
<dependencies>
   <dependency>
      <groupId>org.springframework.boot</groupId>
      <artifactId>spring-boot-starter-web</artifactId>
   </dependency>
   <dependency>
      <groupId>com.hazelcast</groupId>
      <artifactId>spring-data-hazelcast</artifactId>
      <version>2.2.2</version>
   </dependency>
   <dependency>
      <groupId>com.hazelcast</groupId>
      <artifactId>hazelcast-spring</artifactId>
   </dependency>
   <dependency>
      <groupId>com.hazelcast</groupId>
      <artifactId>hazelcast-client</artifactId>
   </dependency>
   <dependency>
      <groupId>com.hazelcast</groupId>
      <artifactId>hazelcast-kubernetes</artifactId>
      <version>1.5.2</version>
   </dependency>
   <dependency>
      <groupId>org.projectlombok</groupId>
      <artifactId>lombok</artifactId>
   </dependency>
</dependencies>

Enabling Kubernetes Discovery for Hazelcast

After including the required dependencies, Hazelcast is enabled for our application. The only thing we need to do is enable discovery through Kubernetes. The HazelcastInstance bean is already available in the context, so we may change its configuration by defining a com.hazelcast.config.Config bean. We need to disable multicast discovery, which is enabled by default, and enable Kubernetes discovery in the network config as shown below. The Kubernetes config requires setting the target namespace of the Hazelcast deployment and its service name.

@Bean
Config config() {
   Config config = new Config();
   config.getNetworkConfig().getJoin().getTcpIpConfig().setEnabled(false);
   config.getNetworkConfig().getJoin().getMulticastConfig().setEnabled(false);
   config.getNetworkConfig().getJoin().getKubernetesConfig().setEnabled(true)
         .setProperty("namespace", "default")
         .setProperty("service-name", "hazelcast-service");
   return config;
}

We also have to define the Kubernetes Service hazelcast-service on port 5701. It selects pods from the employee-service deployment.

apiVersion: v1
kind: Service
metadata:
  name: hazelcast-service
spec:
  selector:
    app: employee-service
  ports:
    - name: hazelcast
      port: 5701
  type: LoadBalancer

Here's the Kubernetes Deployment and Service definition for our sample application. We are setting three replicas for the deployment, and exposing two container ports.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: employee-service
  labels:
    app: employee-service
spec:
  replicas: 3
  selector:
    matchLabels:
      app: employee-service
  template:
    metadata:
      labels:
        app: employee-service
    spec:
      containers:
        - name: employee-service
          image: piomin/employee-service
          ports:
            - name: http
              containerPort: 8080
            - name: multicast
              containerPort: 5701
---
apiVersion: v1
kind: Service
metadata:
  name: employee-service
  labels:
    app: employee-service
spec:
  ports:
    - port: 8080
      protocol: TCP
  selector:
    app: employee-service
  type: NodePort

In fact, that's all that needs to be done to successfully run a Hazelcast cluster on Kubernetes. Before proceeding to the deployment, let's take a look at the application implementation details.

Implementation

Our application is very simple. It defines a single model object, which is stored in the Hazelcast cluster. Such a class needs to have an id – a field annotated with Spring Data's @Id – and should implement the Serializable interface.

@Getter
@Setter
@EqualsAndHashCode
@ToString
public class Employee implements Serializable {

   @Id
   private Long id;
   @EqualsAndHashCode.Exclude
   private Integer personId;
   @EqualsAndHashCode.Exclude
   private String company;
   @EqualsAndHashCode.Exclude
   private String position;
   @EqualsAndHashCode.Exclude
   private int salary;

}

With Spring Data Hazelcast we may define repositories without writing any queries or using the Hazelcast-specific query API. We are using the well-known method naming pattern defined by Spring Data to build find methods, as shown below. Our repository interface should extend HazelcastRepository.

public interface EmployeeRepository extends HazelcastRepository<Employee, Long> {

   Employee findByPersonId(Integer personId);
   List<Employee> findByCompany(String company);
   List<Employee> findByCompanyAndPosition(String company, String position);
   List<Employee> findBySalaryGreaterThan(int salary);

}

To enable Spring Data Hazelcast Repositories we should annotate the main class or the configuration class with @EnableHazelcastRepositories.

@SpringBootApplication
@EnableHazelcastRepositories
public class EmployeeApplication {

   public static void main(String[] args) {
      SpringApplication.run(EmployeeApplication.class, args);
   }
   
}

Finally, here's the Spring controller implementation. It allows us to invoke all the find methods defined in the repository, add a new Employee object to Hazelcast, and remove an existing one.

@RestController
@RequestMapping("/employees")
public class EmployeeController {

   private static final Logger logger = LoggerFactory.getLogger(EmployeeController.class);

   private EmployeeRepository repository;

   EmployeeController(EmployeeRepository repository) {
      this.repository = repository;
   }

   @GetMapping("/person/{id}")
   public Employee findByPersonId(@PathVariable("id") Integer personId) {
      logger.info("findByPersonId({})", personId);
      return repository.findByPersonId(personId);
   }
   
   @GetMapping("/company/{company}")
   public List<Employee> findByCompany(@PathVariable("company") String company) {
      logger.info(String.format("findByCompany({})", company));
      return repository.findByCompany(company);
   }

   @GetMapping("/company/{company}/position/{position}")
   public List<Employee> findByCompanyAndPosition(@PathVariable("company") String company, @PathVariable("position") String position) {
      logger.info(String.format("findByCompany({}, {})", company, position));
      return repository.findByCompanyAndPosition(company, position);
   }
   
   @GetMapping("/{id}")
   public Employee findById(@PathVariable("id") Long id) {
      logger.info("findById({})", id);
      return repository.findById(id).get();
   }

   @GetMapping("/salary/{salary}")
   public List<Employee> findBySalaryGreaterThan(@PathVariable("salary") int salary) {
      logger.info(String.format("findBySalaryGreaterThan({})", salary));
      return repository.findBySalaryGreaterThan(salary);
   }
   
   @PostMapping
   public Employee add(@RequestBody Employee emp) {
      logger.info("add({})", emp);
      return repository.save(emp);
   }

   @DeleteMapping("/{id}")
   public void delete(@PathVariable("id") Long id) {
      logger.info("delete({})", id);
      repository.deleteById(id);
   }

}

Running Hazelcast on Kubernetes via Minikube

We will test our sample application on Minikube.

$ minikube start --vm-driver=virtualbox

The application is configured to run with Skaffold and the Jib Maven Plugin. I have already described both of these tools in one of my previous articles. They simplify the build and deployment process on Minikube. Assuming we are in the root directory of our application, we just need to run the following command. Skaffold automatically builds the application using Maven, creates a Docker image based on the Maven settings, applies the deployment file from the k8s directory, and finally runs the application on Kubernetes.
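For reference, a minimal sketch of such a Skaffold configuration is shown below – this is an illustration only, the actual skaffold.yaml in the repository may differ, and the schema apiVersion depends on your Skaffold release:

apiVersion: skaffold/v2beta1
kind: Config
build:
  artifacts:
    # build the image with the Jib Maven Plugin instead of a Dockerfile
    - image: piomin/employee-service
      jib: {}
deploy:
  kubectl:
    manifests:
      # apply all Kubernetes manifests from the k8s directory
      - k8s/*.yaml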

$ skaffold dev

Since we have declared three instances of our application in deployment.yaml, three pods are started. If Hazelcast discovery finishes successfully, you should see the following fragment of the pod logs printed out by Skaffold.

[Screenshot: pod logs listing the discovered Hazelcast cluster members]

Let’s take a look at the running pods.
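You can list them with:

$ kubectl get pods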

[Screenshot: the list of running pods]

And here's the list of services. The HTTP API is available outside Minikube under port 32090.
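The services can be listed with:

$ kubectl get svc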

[Screenshot: the list of Kubernetes services]

Now we may send some test requests. We start by calling the POST /employees method to add some Employee objects to the Hazelcast cluster. Then we perform some find operations using GET /employees/{id}. Since all the calls finish successfully, we can take a look at the logs, which clearly show the Hazelcast cluster at work.

$ curl -X POST http://192.168.99.100:32090/employees -d '{"id":1,"personId":1,"company":"Test1","position":"Developer","salary":2000}' -H "Content-Type: application/json"
{"id":1,"personId":1,"company":"Test1","position":"Developer","salary":2000}
$ curl -X POST http://192.168.99.100:32090/employees -d '{"id":2,"personId":2,"company":"Test2","position":"Developer","salary":5000}' -H "Content-Type: application/json"
{"id":2,"personId":2,"company":"Test2","position":"Developer","salary":5000}
$ curl -X POST http://192.168.99.100:32090/employees -d '{"id":3,"personId":3,"company":"Test2","position":"Developer","salary":5000}' -H "Content-Type: application/json"
{"id":3,"personId":3,"company":"Test2","position":"Developer","salary":5000}
$ curl -X POST http://192.168.99.100:32090/employees -d '{"id":4,"personId":4,"company":"Test3","position":"Developer","salary":9000}' -H "Content-Type: application/json"
{"id":4,"personId":4,"company":"Test3","position":"Developer","salary":9000}
$ curl http://192.168.99.100:32090/employees/1
{"id":1,"personId":1,"company":"Test1","position":"Developer","salary":2000}
$ curl http://192.168.99.100:32090/employees/2
{"id":2,"personId":2,"company":"Test2","position":"Developer","salary":5000}
$ curl http://192.168.99.100:32090/employees/3
{"id":3,"personId":3,"company":"Test2","position":"Developer","salary":5000}

Here's the screen with the pod logs printed out by Skaffold. Skaffold prints the pod id for every single log line. Let's take a closer look at the logs. The request adding the Employee with id=1 is processed by the application running on pod 5b758cc977-s6ptd. When we call the find method with id=1, it is processed by the application on pod 5b758cc977-2fj2h. This proves that the Hazelcast cluster works properly. The same behaviour may be observed for the other test requests.

[Screenshot: pod logs printed by Skaffold for the test requests]

We may also call some other find methods.

$ curl http://192.168.99.100:32090/employees/company/Test2/position/Developer
[{"id":2,"personId":2,"company":"Test2","position":"Developer","salary":5000},{"id":3,"personId":3,"company":"Test2","position":"Developer","salary":5000}]
$ curl http://192.168.99.100:32090/employees/salary/3000
[{"id":2,"personId":2,"company":"Test2","position":"Developer","salary":5000},{"id":4,"personId":4,"company":"Test3","position":"Developer","salary":9000},{"id":3,"personId":3,"company":"Test2","position":"Developer","salary":5000}]

Let’s test another scenario. We will remove one pod from the cluster as shown below.
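For example, using the pod name visible in the earlier logs (the exact pod name will differ in your environment):

$ kubectl delete pod employee-service-5b758cc977-s6ptd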

[Screenshot: deleting one of the pods]

Then we send some test requests to GET /employees/{id}. No matter which instance of the application processes the request, the object is returned.

[Screenshot: Skaffold logs after a new pod replaces the removed one]

Reactive Elasticsearch With Spring Boot
https://piotrminkowski.com/2019/10/25/reactive-elasticsearch-with-spring-boot/ (Fri, 25 Oct 2019)

One of the more notable features introduced in the latest release of Spring Data is reactive support for Elasticsearch. Since the Spring Data Moore release we can take advantage of reactive templates and repositories. They are built on top of a fully reactive Elasticsearch REST client based on Spring WebClient. It is also worth mentioning the support for reactive Querydsl, which can be included in your application through ReactiveQuerydslPredicateExecutor.
I have already shown you how to use Spring Data repositories for synchronous integration with the Elasticsearch API in one of my previous articles, Elasticsearch with Spring Boot. There is no big difference between using standard and reactive Spring Data repositories, so I'll focus on showing you those differences using the sample application from that article. It is therefore worth reading it before this one. Let's proceed to building the Spring Boot reactive Elasticsearch example.
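As a side note, a repository could combine reactive CRUD operations with Querydsl predicates. Here's a hypothetical sketch, not part of the sample application:

// Hypothetical sketch: ReactiveQuerydslPredicateExecutor comes from Spring Data Commons
// (org.springframework.data.querydsl) and adds reactive findBy(Predicate) style methods.
public interface EmployeeQuerydslRepository extends ReactiveCrudRepository<Employee, Long>,
      ReactiveQuerydslPredicateExecutor<Employee> {
}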

1. Dependencies

I’m using the latest stable version of Spring Boot with JDK 11.

<parent>
   <groupId>org.springframework.boot</groupId>
   <artifactId>spring-boot-starter-parent</artifactId>
   <version>2.2.0.RELEASE</version>
   <relativePath/>
</parent>
<properties>
   <java.version>11</java.version>
</properties>

We need to include the Spring WebFlux and Spring Data Elasticsearch starters. We will also use the Spring Boot Actuator for exposing health checks, and some libraries for automated testing like Spring Test and Testcontainers.

<dependencies>
   <dependency>
      <groupId>org.springframework.boot</groupId>
      <artifactId>spring-boot-starter-webflux</artifactId>
   </dependency>
   <dependency>
      <groupId>org.springframework.boot</groupId>
      <artifactId>spring-boot-starter-actuator</artifactId>
   </dependency>
   <dependency>
      <groupId>org.springframework.boot</groupId>
      <artifactId>spring-boot-starter-data-elasticsearch</artifactId>
   </dependency>
   <dependency>
      <groupId>org.springframework.boot</groupId>
      <artifactId>spring-boot-starter-test</artifactId>
      <scope>test</scope>
   </dependency>
   <dependency>
      <groupId>org.testcontainers</groupId>
      <artifactId>elasticsearch</artifactId>
      <version>1.12.2</version>
      <scope>test</scope>
   </dependency>
</dependencies>

2. Enabling Reactive Repositories

Before we start working with reactive Spring Data repositories, we should enable them by annotating the main or configuration class with @EnableReactiveElasticsearchRepositories.

@SpringBootApplication
@EnableReactiveElasticsearchRepositories
public class SampleApplication {

    public static void main(String[] args) {
        SpringApplication.run(SampleApplication.class, args);
    }

    @Bean
    @ConditionalOnProperty("initial-import.enabled")
    public SampleDataSet dataSet() {
        return new SampleDataSet();
    }

}

3. Building reactive Elasticsearch repositories

Spring Data Elasticsearch comes with three interfaces that support reactive operations: ReactiveRepository, ReactiveCrudRepository, which adds save/update operations, and ReactiveSortingRepository, which offers some methods with sorting. The usage is the same as before – we just need to create our own repository that extends one of the interfaces listed above. We can also add custom find methods following the Spring Data query naming convention. Like all other Spring reactive projects, Spring Data Elasticsearch repository support is built on top of Project Reactor.

@Repository
public interface EmployeeRepository extends ReactiveCrudRepository<Employee, Long> {

    Flux<Employee> findByOrganizationName(String name);
    Flux<Employee> findByName(String name);

}

Here’s our model class:

@Document(indexName = "sample", type = "employee")
public class Employee {

    @Id
    private Long id;
    @Field(type = FieldType.Object)
    private Organization organization;
    @Field(type = FieldType.Object)
    private Department department;
    private String name;
    private int age;
    private String position;
   
}

4. Building a Controller

We will expose some reactive CRUD methods outside the application using Spring WebFlux. Here's our controller class implementation:

@RestController
@RequestMapping("/employees")
public class EmployeeController {

   @Autowired
   EmployeeRepository repository;

   @PostMapping
   public Mono<Employee> add(@RequestBody Employee employee) {
      return repository.save(employee);
   }

   @GetMapping("/{name}")
   public Flux<Employee> findByName(@PathVariable("name") String name) {
      return repository.findByName(name);
   }

   @GetMapping
   public Flux<Employee> findAll() {
      return repository.findAll();
   }

   @GetMapping("/organization/{organizationName}")
   public Flux<Employee> findByOrganizationName(@PathVariable("organizationName") String organizationName) {
      return repository.findByOrganizationName(organizationName);
   }

}

5. Running the Spring Boot application

For testing purposes we need a single-node Elasticsearch instance running in development mode. As usual, we will use a Docker container. Here's the command that starts the container and exposes it on port 9200.

$ docker run -d --name elasticsearch -p 9200:9200 -e "discovery.type=single-node" elasticsearch:6.6.2

My Docker Machine is available on the virtual address 192.168.99.100, so I had to override the Elasticsearch address in the Spring Boot configuration file. Because Elasticsearch reactive repositories use ReactiveElasticsearchClient, we have to set the property spring.data.elasticsearch.client.reactive.endpoints to 192.168.99.100:9200. The Actuator still uses a synchronous REST client for detecting Elasticsearch status in its health check, so we also need to override the default address in the spring.elasticsearch.rest.uris property.

spring:
  application:
    name: sample-spring-elasticsearch
  data:
    elasticsearch:
      client:
        reactive:
          endpoints: 192.168.99.100:9200
  elasticsearch:
    rest:
      uris: http://192.168.99.100:9200

6. Testing Spring Boot reactive Elasticsearch support

As with the synchronous repositories, we use Testcontainers for JUnit tests. The only difference is that we need to block on a repository method's result when verifying the outcome of a test.

@RunWith(SpringRunner.class)
@SpringBootTest
@FixMethodOrder(MethodSorters.NAME_ASCENDING)
public class EmployeeRepositoryTest {

    @ClassRule
    public static ElasticsearchContainer container = new ElasticsearchContainer();
    @Autowired
    EmployeeRepository repository;

    @BeforeClass
    public static void before() {
        System.setProperty("spring.data.elasticsearch.client.reactive.endpoints", container.getContainerIpAddress() + ":" + container.getMappedPort(9200));
    }

    @Test
    public void testAdd() {
        Employee employee = new Employee();
        employee.setId(1L);
        employee.setName("John Smith");
        employee.setAge(33);
        employee.setPosition("Developer");
        employee.setDepartment(new Department(1L, "TestD"));
        employee.setOrganization(new Organization(1L, "TestO", "Test Street No. 1"));
        Mono<Employee> employeeSaved = repository.save(employee);
        Assert.assertNotNull(employeeSaved.block());
    }

    @Test
    public void testFindAll() {
        Flux<Employee> employees = repository.findAll();
        Assert.assertTrue(employees.count().block() > 0);
    }

    @Test
    public void testFindByOrganization() {
        Flux<Employee> employees = repository.findByOrganizationName("TestO");
        Assert.assertTrue(employees.count().block() > 0);
    }

    @Test
    public void testFindByName() {
        Flux<Employee> employees = repository.findByName("John Smith");
        Assert.assertTrue(employees.count().block() > 0);
    }

}

Source Code

For the current sample I'm using the same repository as for the synchronous version; I created a new branch called reactive for it. Here's the GitHub repository address: https://github.com/piomin/sample-spring-elasticsearch/tree/reactive. It illustrates how to build a Spring Boot reactive Elasticsearch application.
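To check out that branch directly:

$ git clone -b reactive https://github.com/piomin/sample-spring-elasticsearch.git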

Apache Ignite Cluster together with Spring Boot
https://piotrminkowski.com/2018/04/04/apache-ignite-cluster-together-with-spring-boot/ (Wed, 04 Apr 2018)

I have already introduced Apache Ignite in one of my previous articles, In-memory data grid with Apache Ignite. Apache Ignite can be easily launched locally together with a Spring Boot application. The only thing we have to do is include the artifact org.apache.ignite:ignite-spring-data in the project dependencies and then declare an Ignite instance @Bean. A sample @Bean declaration is visible below.

@Bean
public Ignite igniteInstance() {
   IgniteConfiguration cfg = new IgniteConfiguration();
   cfg.setIgniteInstanceName("ignite-cluster-node");
   CacheConfiguration ccfg1 = new CacheConfiguration("PersonCache");
   ccfg1.setIndexedTypes(Long.class, Person.class);
   CacheConfiguration ccfg2 = new CacheConfiguration("ContactCache");
   ccfg2.setIndexedTypes(Long.class, Contact.class);
   cfg.setCacheConfiguration(ccfg1, ccfg2);
   IgniteLogger log = new Slf4jLogger();
   cfg.setGridLogger(log);
   return Ignition.start(cfg);
}

In this article I would like to show you a slightly more advanced sample, where we will start multiple Ignite nodes inside a cluster, Ignite's web console for monitoring the cluster, and Ignite's web agent for providing communication between the nodes and the web console. Let's begin by looking at the picture with the architecture of our sample solution.

[Figure: architecture of the sample solution – Ignite cluster nodes, web console, and web agent]

We have three nodes that are part of the cluster. If you take a careful look at the picture illustrating the architecture, you have probably noticed that there are two nodes called Server Node and one called Client Node. By default, all Ignite nodes are started as server nodes; client mode needs to be enabled explicitly. Server nodes participate in caching, compute execution, and stream processing, while client nodes provide the ability to connect to the servers remotely. Client nodes still allow using the whole set of Ignite APIs, including near caching, transactions, compute, and streaming.

Here’s Ignite’s client instance @Bean declaration.

@Bean
public Ignite igniteInstance() {
   IgniteConfiguration cfg = new IgniteConfiguration();
   cfg.setIgniteInstanceName("ignite-cluster-node");
   cfg.setClientMode(true);
   CacheConfiguration ccfg1 = new CacheConfiguration("PersonCache");
   ccfg1.setIndexedTypes(Long.class, Person.class);
   CacheConfiguration ccfg2 = new CacheConfiguration("ContactCache");
   ccfg2.setIndexedTypes(Long.class, Contact.class);
   cfg.setCacheConfiguration(ccfg1, ccfg2);
   return Ignition.start(cfg);
}

The fact is that we don't have to do anything more to make our nodes work together within the cluster. Every new node is automatically detected by all other cluster nodes using multicast communication. When starting our sample application, we only have to guarantee that each instance's server listens on a different port, by overriding the server.port Spring Boot property. Here's the command that starts the sample application, which is available on GitHub (https://github.com/piomin/sample-ignite-jpa.git) under the branch cluster (https://github.com/piomin/sample-ignite-jpa/tree/cluster). Each node exposes the same REST API, which may be easily tested using Swagger2 just by opening its dashboard available under the address http://localhost:port/swagger-ui.html.

$ java -jar -Dserver.port=8901 -Xms512m -Xmx1024m -XX:+UseG1GC -XX:+DisableExplicitGC -XX:MaxDirectMemorySize=256m target/ignite-rest-service-1.0-SNAPSHOT.jar
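A second server node can be started the same way on another port – 8902, matching the ports used by the JUnit test shown later:

$ java -jar -Dserver.port=8902 -Xms512m -Xmx1024m -XX:+UseG1GC -XX:+DisableExplicitGC -XX:MaxDirectMemorySize=256m target/ignite-rest-service-1.0-SNAPSHOT.jar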

If you have successfully started a new node, you should see similar information in your application logs.

>>> +----------------------------------------------------------------------+
>>> Ignite ver. 2.4.0#20180305-sha1:aa342270b13cc1f4713382a8eb23b2eb7edaa3a5
>>> +----------------------------------------------------------------------+
>>> OS name: Windows 10 10.0 amd64
>>> CPU(s): 4
>>> Heap: 1.0GB
>>> VM name: 14132@piomin
>>> Ignite instance name: ignite-cluster-node
>>> Local node [ID=9DB1296A-7EEC-4564-BAAD-14E5D4A3A08D, order=2, clientMode=false]
>>> Local node addresses: [piomin/0:0:0:0:0:0:0:1, piomin/127.0.0.1, piomin/192.168.1.102, piomin/192.168.116.1, /192.168.226.1, /192.168.99.1]
>>> Local ports: TCP:8082 TCP:10801 TCP:11212 TCP:47101 UDP:47400 TCP:47501

Let's move back for a moment to the source code of our sample application. I assume you have already cloned the repository from GitHub. There are two Maven modules available. The module ignite-rest-service is responsible for starting an Ignite cluster node in server mode, while ignite-client-service starts a node in client mode. Because we run only a single instance of the client node, we do not override its default port set inside the application.yml file. You can build the project using the mvn clean install command and then start it with java -jar, or just run the main class IgniteClientApplication from your IDE.
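For example (the client module's jar name below is my assumption based on the default Maven artifact naming in the project):

$ mvn clean install
$ java -jar ignite-client-service/target/ignite-client-service-1.0-SNAPSHOT.jar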

There is also a JUnit test class inside the module ignite-client-service, which defines one test responsible for calling the HTTP endpoints (POST /person, POST /contact) that put data into Ignite's cache. This test performs two operations: it puts some data into Ignite's in-memory cluster by calling the endpoints exposed by the client node, and then checks whether that data has been propagated through the cluster by calling the GET /person/{id}/withContacts endpoint exposed by one of the server nodes.

public class TestCluster {

   TestRestTemplate template = new TestRestTemplate();
   Random r = new Random();
   int[] clusterPorts = new int[] {8901, 8902};

   @Test
   public void testCluster() throws InterruptedException {
      for (int i=0; i<1000; i++) {
         Person p = template.postForObject("http://localhost:8090/person", createPerson(), Person.class);
         Assert.notNull(p, "Create person failed");
         Contact c1 = template
            .postForObject("http://localhost:8090/contact", createContact(p.getId(), 0), Contact.class);
         Assert.notNull(c1, "Create contact failed");
         Contact c2 = template
            .postForObject("http://localhost:8090/contact", createContact(p.getId(), 1), Contact.class);
         Assert.notNull(c2, "Create contact failed");
         Thread.sleep(10);
         Person result = template.getForObject("http://localhost:{port}/person/{id}/withContacts", 
            Person.class, clusterPorts[r.nextInt(2)], p.getId());
         Assert.notNull(result, "Person not found");
         Assert.notEmpty(result.getContacts(), "Contacts not found");
      }
   }

   private Contact createContact(Long personId, int index) {
      ...
   }

   private Person createPerson() {
      ...
   }

}

Before running any tests, we should launch two additional elements of our architecture: Ignite's web console and web agent. The most suitable way to run Ignite's web console on the local machine is through its Docker image apacheignite/web-console-standalone. Here's a Docker command that starts Ignite's web console and exposes it on port 80. Because I run Docker on Windows, it is available under the default VM address http://192.168.99.100/.

$ docker run -d -p 80:80 -p 3001:3001 -v /var/data:/var/lib/mongodb --name ignite-web-console apacheignite/web-console-standalone

In order to access it you should first register a user. Although no mail server is available on the Docker container, you will be logged in after registration. You can configure your cluster using Ignite's web console, and also run some SQL queries on it. Of course, we still need to connect our cluster consisting of three nodes with the web console instance started in the Docker container. To achieve this you have to download the web agent. It is probably not very intuitive, but you have to click the Start Demo button, which is located in the right corner of Ignite's web console. Then you are redirected to the download page, where you can accept the download of the ignite-web-agent-2.4.0.zip file, which contains all the libraries and configuration needed to start the web agent locally.

[Screenshot: downloading the web agent from Ignite Web Console]

After downloading and unpacking the web agent, go to its main directory and change the property server-uri to http://192.168.99.100 inside the default.properties file. Then you may run the script ignite-web-agent.bat (or .sh if you are testing on Linux), which starts the web agent. Unfortunately, that's not all that has to be done. Every server node's application should include the artifact ignite-rest-http in order to be able to communicate with the agent. It is responsible for exposing the HTTP endpoint that is accessed by the web agent. It is based on the Jetty server, which causes some problems in conjunction with Spring Boot. Spring Boot sets default versions of the Jetty libraries used inside the project. The problem is that ignite-rest-http requires older versions of those libraries, so we also have to override some default managed versions in the pom.xml file according to the sample visible below.

<dependencyManagement>
   <dependencies>
      <dependency>
         <groupId>org.eclipse.jetty</groupId>
         <artifactId>jetty-http</artifactId>
         <version>9.2.11.v20150529</version>
      </dependency>
      <dependency>
         <groupId>org.eclipse.jetty</groupId>
         <artifactId>jetty-server</artifactId>
         <version>9.2.11.v20150529</version>
      </dependency>
      <dependency>
         <groupId>org.eclipse.jetty</groupId>
         <artifactId>jetty-io</artifactId>
         <version>9.2.11.v20150529</version>
      </dependency>
      <dependency>
         <groupId>org.eclipse.jetty</groupId>
         <artifactId>jetty-continuation</artifactId>
         <version>9.2.11.v20150529</version>
      </dependency>
      <dependency>
         <groupId>org.eclipse.jetty</groupId>
         <artifactId>jetty-util</artifactId>
         <version>9.2.11.v20150529</version>
      </dependency>
      <dependency>
         <groupId>org.eclipse.jetty</groupId>
         <artifactId>jetty-xml</artifactId>
         <version>9.2.11.v20150529</version>
      </dependency>
   </dependencies>
</dependencyManagement>

After implementing the changes described above, we may finally run all the elements of our sample system. If you start the Ignite web agent locally, it should automatically detect all running cluster nodes. Here's the screen with the logs displayed by the agent after startup.

[Screenshot: web agent logs after startup]

At the same time, you should see that a new cluster has been detected by Ignite Web Console.

[Screenshot: the new cluster detected by Ignite Web Console]

You can configure a new or currently existing cluster using the web console, or just run a test query on the selected managed cluster. You have to include the name of the cache as a prefix to the table name when defining a query.

[Screenshot: running a test query in the web console]

Similar queries have been declared inside the repository interface. Here are additional methods used for finding entities stored in PersonCache. If you would like to include results stored in another cache, you have to explicitly declare its name together with the table name.

@RepositoryConfig(cacheName = "PersonCache")
public interface PersonRepository extends IgniteRepository<Person, Long> {

   List<Person> findByFirstNameAndLastName(String firstName, String lastName);

   @Query("SELECT p.id, p.firstName, p.lastName, c.id, c.type, c.location FROM Person p JOIN \"ContactCache\".Contact c ON p.id=c.personId WHERE p.id=?")
   List<List<?>> findByIdWithContacts(Long id);

   @Query("SELECT c.* FROM Person p JOIN \"ContactCache\".Contact c ON p.id=c.personId WHERE p.firstName=? and p.lastName=?")
   List<Contact> selectContacts(String firstName, String lastName);

   @Query("SELECT p.id, p.firstName, p.lastName, c.id, c.type, c.location FROM Person p JOIN \"ContactCache\".Contact c ON p.id=c.personId WHERE p.firstName=? and p.lastName=?")
   List<List<?>> selectContacts2(String firstName, String lastName);
}

We are nearing the end. Now let's run our JUnit test TestCluster in order to generate some test data and put it into the clustered cache. You can monitor the size of a cache using the web console. All you have to do is run a SELECT COUNT(*) query and set graph mode as the default mode for the result display. The chart visible below illustrates the number of entities stored inside the Ignite cluster, sampled at 5-second intervals.
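The count query takes the cache-prefixed form described earlier, for example:

SELECT COUNT(*) FROM "PersonCache".Person;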

[Chart: the number of entities stored in the cluster, sampled at 5-second intervals]

In-memory data grid with Apache Ignite
https://piotrminkowski.com/2017/11/13/in-memory-data-grid-with-apache-ignite/ (Mon, 13 Nov 2017)

Apache Ignite is a relatively new solution, but one quickly increasing in popularity. It is hard to assign it to a single category of database engine, because it has characteristics typical of several of them. The primary purpose of this solution is an in-memory data grid and key-value store. It also has some common RDBMS features like support for SQL queries and ACID transactions, but that's not to say it is a full SQL and transactional database: it does not support foreign key constraints, and transactions are available only at the key-value level. Despite that, Apache Ignite seems to be a very interesting solution.

Apache Ignite may be easily started as a node embedded in a Spring Boot application. The simplest way to achieve that is by using the Spring Data Ignite library. Apache Ignite implements the Spring Data CrudRepository interface, which supports basic CRUD operations, and also provides access to the Apache Ignite SQL Grid using the unified Spring Data interfaces. Although Ignite has support for distributed, ACID, SQL-compliant disk store persistence, we will design a solution that stores the in-memory cache objects in a MySQL database. The architecture of the presented solution is visible in the figure below, and as you can see it is very simple. The application puts data into the in-memory cache on Apache Ignite. Apache Ignite automatically synchronizes these changes with the database in an asynchronous, background task. The way the application reads data also should not surprise you: if an entity is not cached, it is read from the database and put into the cache for future use.

[Figure: architecture – the application, Apache Ignite in-memory cache, and MySQL database]

I'm going to guide you through the development of the sample application. The result is available on GitHub. I found a few examples on the web, but they covered only the basics. I'll show you how to configure Apache Ignite to write objects from the cache to the database, and how to create some more complex cross-cache join queries. Let's begin by running the database.

1. Setup MySQL database

The best way to start a MySQL database locally is, of course, a Docker container. With Docker on Windows, the MySQL database is available on 192.168.99.100:33306 after running the command below.

$ docker run -d --name mysql -e MYSQL_DATABASE=ignite -e MYSQL_USER=ignite -e MYSQL_PASSWORD=ignite123 -e MYSQL_ALLOW_EMPTY_PASSWORD=yes -p 33306:3306 mysql

The next step is to create the tables used by the application entities to store the data: PERSON and CONTACT. Those two tables are in a 1…N relation, where the table CONTACT holds a foreign key referencing the PERSON id.

CREATE TABLE `person` (
`id` int(11) NOT NULL,
`first_name` varchar(45) DEFAULT NULL,
`last_name` varchar(45) DEFAULT NULL,
`gender` varchar(10) DEFAULT NULL,
`country` varchar(10) DEFAULT NULL,
`city` varchar(20) DEFAULT NULL,
`address` varchar(45) DEFAULT NULL,
`birth_date` date DEFAULT NULL,
PRIMARY KEY (`id`)
);

CREATE TABLE `contact` (
`id` int(11) NOT NULL,
`location` varchar(45) DEFAULT NULL,
`contact_type` varchar(10) DEFAULT NULL,
`person_id` int(11) NOT NULL,
PRIMARY KEY (`id`)
);

ALTER TABLE `ignite`.`contact` ADD INDEX `person_fk_idx` (`person_id` ASC);
ALTER TABLE `ignite`.`contact`
ADD CONSTRAINT `person_fk` FOREIGN KEY (`person_id`) REFERENCES `ignite`.`person` (`id`) ON DELETE CASCADE ON UPDATE CASCADE;

2. Maven configuration

The easiest way to start working with Apache Ignite's Spring Data repositories is to add the following Maven dependency to the application's pom.xml file. All the other Ignite dependencies will be included automatically. We also need the MySQL JDBC driver and Spring JDBC dependencies to configure the connection to the database. They are required because we are embedding Apache Ignite in the application, and it has to establish a connection with MySQL in order to be able to synchronize the cache with the database tables.

<dependency>
   <groupId>org.springframework.boot</groupId>
   <artifactId>spring-boot-starter-web</artifactId>
</dependency>
<dependency>
   <groupId>org.springframework.boot</groupId>
   <artifactId>spring-boot-starter-jdbc</artifactId>
</dependency>
<dependency>
   <groupId>mysql</groupId>
   <artifactId>mysql-connector-java</artifactId>
   <scope>runtime</scope>
</dependency>
<dependency>
   <groupId>org.apache.ignite</groupId>
   <artifactId>ignite-spring-data</artifactId>
   <version>${ignite.version}</version>
</dependency>

3. Configure Ignite node

Using the IgniteConfiguration class we are able to configure all available settings of an Ignite node. The most important thing here is the cache configuration (1). We should add the primary key and entity classes as indexed types (2). Then we have to enable exporting cache updates to the database (3) and reading data not found in the cache from the database (4). The interaction between the Ignite node and MySQL may be configured using the CacheJdbcPojoStoreFactory class (5). We should pass it the DataSource @Bean (6), the dialect (7), and the mapping between object fields and table columns (8).

@Bean
public Ignite igniteInstance() {
   IgniteConfiguration cfg = new IgniteConfiguration();
   cfg.setIgniteInstanceName("ignite-1");
   cfg.setPeerClassLoadingEnabled(true);

   CacheConfiguration<Long, Contact> ccfg2 = new CacheConfiguration<>("ContactCache"); // (1)
   ccfg2.setIndexedTypes(Long.class, Contact.class); // (2)
   ccfg2.setWriteBehindEnabled(true);
   ccfg2.setWriteThrough(true); // (3)
   ccfg2.setReadThrough(true); // (4)
   CacheJdbcPojoStoreFactory<Long, Contact> f2 = new CacheJdbcPojoStoreFactory<>(); // (5)
   f2.setDataSource(datasource); // (6)
   f2.setDialect(new MySQLDialect()); // (7)
   JdbcType jdbcContactType = new JdbcType(); // (8)
   jdbcContactType.setCacheName("ContactCache");
   jdbcContactType.setKeyType(Long.class);
   jdbcContactType.setValueType(Contact.class);
   jdbcContactType.setDatabaseTable("contact");
   jdbcContactType.setDatabaseSchema("ignite");
   jdbcContactType.setKeyFields(new JdbcTypeField(Types.INTEGER, "id", Long.class, "id"));
   jdbcContactType.setValueFields(new JdbcTypeField(Types.VARCHAR, "contact_type", ContactType.class, "type"), new JdbcTypeField(Types.VARCHAR, "location", String.class, "location"), new JdbcTypeField(Types.INTEGER, "person_id", Long.class, "personId"));
   f2.setTypes(jdbcContactType);
   ccfg2.setCacheStoreFactory(f2);

   CacheConfiguration<Long, Person> ccfg = new CacheConfiguration<>("PersonCache");
   ccfg.setIndexedTypes(Long.class, Person.class);
   ccfg.setWriteBehindEnabled(true);
   ccfg.setReadThrough(true);
   ccfg.setWriteThrough(true);
   CacheJdbcPojoStoreFactory<Long, Person> f = new CacheJdbcPojoStoreFactory<>();
   f.setDataSource(datasource);
   f.setDialect(new MySQLDialect());
   JdbcType jdbcType = new JdbcType();
   jdbcType.setCacheName("PersonCache");
   jdbcType.setKeyType(Long.class);
   jdbcType.setValueType(Person.class);
   jdbcType.setDatabaseTable("person");
   jdbcType.setDatabaseSchema("ignite");
   jdbcType.setKeyFields(new JdbcTypeField(Types.INTEGER, "id", Long.class, "id"));
   jdbcType.setValueFields(new JdbcTypeField(Types.VARCHAR, "first_name", String.class, "firstName"), new JdbcTypeField(Types.VARCHAR, "last_name", String.class, "lastName"), new JdbcTypeField(Types.VARCHAR, "gender", Gender.class, "gender"), new JdbcTypeField(Types.VARCHAR, "country", String.class, "country"), new JdbcTypeField(Types.VARCHAR, "city", String.class, "city"), new JdbcTypeField(Types.VARCHAR, "address", String.class, "address"), new JdbcTypeField(Types.DATE, "birth_date", Date.class, "birthDate"));
   f.setTypes(jdbcType);
   ccfg.setCacheStoreFactory(f);

   cfg.setCacheConfiguration(ccfg, ccfg2);
   return Ignition.start(cfg);
}

Here's the Spring data source configuration for MySQL running as a Docker container.

spring:
  datasource:
    name: mysqlds
    url: jdbc:mysql://192.168.99.100:33306/ignite?useSSL=false
    username: ignite
    password: ignite123

On that occasion, it should be mentioned that Apache Ignite still has some deficiencies. For example, it maps an Enum to an integer using its ordinal value, even though VARCHAR is configured as the JDBC type. When reading such a row from the database, it is not mapped properly to the Enum in the object – you get null in that field of the response.

new JdbcTypeField(Types.VARCHAR, "contact_type", ContactType.class, "type")
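A possible workaround – my own sketch, not part of the original sample – is to map the VARCHAR column to a plain String field and convert it to the enum in application code:

// Hypothetical workaround: store the raw VARCHAR value and convert it on access,
// mapping the column with: new JdbcTypeField(Types.VARCHAR, "contact_type", String.class, "type")
private String type;

public ContactType getTypeAsEnum() {
   // converts lazily; returns null if the column was empty
   return type == null ? null : ContactType.valueOf(type);
}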

4. Model objects

As I mentioned before, we have two tables in the database schema. There are also two model classes and two cache configurations, one per model class. Here's the model class implementation. One of the few interesting things here is ID generation with the AtomicLong class, one of the basic Ignite components, acting as a sequence generator. We can also see the specific annotation @QuerySqlField, which marks a field as available for use as a query parameter in SQL.

@QueryGroupIndex.List(
@QueryGroupIndex(name="idx1")
)
public class Person implements Serializable {

   private static final long serialVersionUID = -1271194616130404625L;
   private static final AtomicLong ID_GEN = new AtomicLong();

   @QuerySqlField(index = true)
   private Long id;
   @QuerySqlField(index = true)
   @QuerySqlField.Group(name = "idx1", order = 0)
   private String firstName;
   @QuerySqlField(index = true)
   @QuerySqlField.Group(name = "idx1", order = 1)
   private String lastName;
   private Gender gender;
   private Date birthDate;
   private String country;
   private String city;
   private String address;
   private List<Contact> contacts = new ArrayList<>();

   public void init() {
      this.id = ID_GEN.incrementAndGet();
   }

   public Long getId() {
      return id;
   }

   public void setId(Long id) {
      this.id = id;
   }

   public String getFirstName() {
      return firstName;
   }

   public void setFirstName(String firstName) {
      this.firstName = firstName;
   }

   public String getLastName() {
      return lastName;
   }

   public void setLastName(String lastName) {
      this.lastName = lastName;
   }

   public Gender getGender() {
      return gender;
   }

   public void setGender(Gender gender) {
      this.gender = gender;
   }

   public Date getBirthDate() {
      return birthDate;
   }

   public void setBirthDate(Date birthDate) {
      this.birthDate = birthDate;
   }

   public String getCountry() {
      return country;
   }

   public void setCountry(String country) {
      this.country = country;
   }

   public String getCity() {
      return city;
   }

   public void setCity(String city) {
      this.city = city;
   }

   public String getAddress() {
      return address;
   }

   public void setAddress(String address) {
      this.address = address;
   }

   public List<Contact> getContacts() {
      return contacts;
   }

   public void setContacts(List<Contact> contacts) {
      this.contacts = contacts;
   }

}

5. Ignite repositories

I assume that you are familiar with the Spring Data JPA concept of creating repositories. Repository handling should be enabled on the main or @Configuration class.


@SpringBootApplication
@EnableIgniteRepositories
public class IgniteRestApplication {

   @Autowired
   DataSource datasource;

   public static void main(String[] args) {
      SpringApplication.run(IgniteRestApplication.class, args);
   }

   // ...
}

Then we have to create our repository interface extending the base IgniteRepository interface (which in turn is based on Spring Data's CrudRepository). Out of the box it supports only the inherited methods operating on the id parameter. In the PersonRepository fragment visible below I defined some find methods using the Spring Data naming convention, as well as Ignite queries. In those samples you can see that we can return a full object or only selected fields as a query result – according to our needs.

@RepositoryConfig(cacheName = "PersonCache")
public interface PersonRepository extends IgniteRepository<Person, Long> {

   List<Person> findByFirstNameAndLastName(String firstName, String lastName);

   @Query("SELECT c.* FROM Person p JOIN \"ContactCache\".Contact c ON p.id=c.personId WHERE p.firstName=? and p.lastName=?")
   List<Contact> selectContacts(String firstName, String lastName);

   @Query("SELECT p.id, p.firstName, p.lastName, c.id, c.type, c.location FROM Person p JOIN \"ContactCache\".Contact c ON p.id=c.personId WHERE p.firstName=? and p.lastName=?")
   List<List<?>> selectContacts2(String firstName, String lastName);
}

6. API and testing

Finally, we can inject the repository beans into the REST controller classes. The API exposes methods for adding a new object to the cache, updating or removing existing objects, and some for searching using the primary key or other, more complex indices.

@RestController
@RequestMapping("/person")
public class PersonController {

   private static final Logger LOGGER = LoggerFactory.getLogger(PersonController.class);

   @Autowired
   PersonRepository repository;

   @PostMapping
   public Person add(@RequestBody Person person) {
      person.init();
      return repository.save(person.getId(), person);
   }

   @PutMapping
   public Person update(@RequestBody Person person) {
      return repository.save(person.getId(), person);
   }

   @DeleteMapping("/{id}")
   public void delete(Long id) {
      repository.delete(id);
   }

   @GetMapping("/{id}")
   public Person findById(@PathVariable("id") Long id) {
      return repository.findOne(id);
   }

   @GetMapping("/{firstName}/{lastName}")
   public List<Person> findByName(@PathVariable("firstName") String firstName, @PathVariable("lastName") String lastName) {
      return repository.findByFirstNameAndLastName(firstName, lastName);
   }

   @GetMapping("/contacts/{firstName}/{lastName}")
   public List<Person> findByNameWithContacts(@PathVariable("firstName") String firstName, @PathVariable("lastName") String lastName) {
      List<Person> persons = repository.findByFirstNameAndLastName(firstName, lastName);
      List<Contact> contacts = repository.selectContacts(firstName, lastName);
      persons.stream().forEach(it -> it.setContacts(contacts.stream().filter(c -> c.getPersonId().equals(it.getId())).collect(Collectors.toList())));
      LOGGER.info("PersonController.findByIdWithContacts: {}", contacts);
      return persons;
   }

   @GetMapping("/contacts2/{firstName}/{lastName}")
   public List<Person> findByNameWithContacts2(@PathVariable("firstName") String firstName, @PathVariable("lastName") String lastName) {
      List<List<?>> result = repository.selectContacts2(firstName, lastName);
      List<Person> persons = new ArrayList<>();
      for (List<?> l : result) {
         persons.add(mapPerson(l));
      }
      LOGGER.info("PersonController.findByIdWithContacts: {}", result);
      return persons;
   }

   private Person mapPerson(List<?> l) {
      Person p = new Person();
      Contact c = new Contact();
      p.setId((Long) l.get(0));
      p.setFirstName((String) l.get(1));
      p.setLastName((String) l.get(2));
      c.setId((Long) l.get(3));
      c.setType((ContactType) l.get(4));
      c.setLocation((String) l.get(5)); // location is the sixth selected column
      p.addContact(c);
      return p;
   }

}

It is certainly important to test the performance of the implemented solution, especially when it is related to an in-memory data grid and a database. For that purpose I created some JUnit tests which put a large number of objects into the cache and then invoke some find methods using random input data to test query performance. Here's the method which generates many Person and Contact objects and puts them into the cache using the API endpoints.

@Test
public void testAddPerson() throws InterruptedException {
   ExecutorService es = Executors.newCachedThreadPool();
   for (int j = 0; j < 10; j++) {
      es.execute(() -> {
      TestRestTemplate restTemplateLocal = new TestRestTemplate();
      Random r = new Random();
      for (int i = 0; i < 1000000; i++) {
         Person p = restTemplateLocal.postForObject("http://localhost:8090/person", createTestPerson(), Person.class);
         int x = r.nextInt(6);
         for (int k = 0; k < x; k++) {
            restTemplateLocal.postForObject("http://localhost:8090/contact", createTestContact(p.getId()), Contact.class);
         }
      }
      });
   }
   es.shutdown();
   es.awaitTermination(60, TimeUnit.MINUTES);
}

Spring Boot provides methods for capturing basic metrics of API response times. To enable that feature we have to include the Spring Boot Actuator in the dependencies. The metrics endpoint is available under the address http://localhost:8090/metrics. In addition to the processing time of each API method, it also prints statistics such as the number of running threads or the amount of free memory.
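Here's the required starter dependency:

<dependency>
   <groupId>org.springframework.boot</groupId>
   <artifactId>spring-boot-starter-actuator</artifactId>
</dependency>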

7. Running application

Let's run our sample application with the embedded Apache Ignite node. Following some performance suggestions available in the Ignite docs, I defined the JVM configuration visible below.

$ java -jar -Xms512m -Xmx1024m -XX:MaxDirectMemorySize=256m -XX:+DisableExplicitGC -XX:+UseG1GC target/ignite-rest-service-1.0-SNAPSHOT.jar

Now we can run the JUnit test class IgniteRestControllerTest. It puts some data into the cache and then calls the find methods. The metrics for the tests, with 1M Person objects and 2.5M Contact objects in the cache, are visible below. All find methods took about 1 ms on average.

{
"mem": 624886,
"mem.free": 389701,
"processors": 4,
"instance.uptime": 2446038,
"uptime": 2466661,
"systemload.average": -1,
"heap.committed": 524288,
"heap.init": 524288,
"heap.used": 133756,
"heap": 1048576,
"threads.peak": 107,
"threads.daemon": 25,
"threads.totalStarted": 565,
"threads": 80,
...
"gauge.response.person.contacts.firstName.lastName": 1,
"gauge.response.contact": 1,
"gauge.response.person.firstName.lastName": 1,
"gauge.response.contact.location.location": 1,
"gauge.response.person.id": 1,
"gauge.response.person": 0,
"counter.status.200.person.id": 1000,
"counter.status.200.person.contacts.firstName.lastName": 1000,
"counter.status.200.person.firstName.lastName": 1000,
"counter.status.200.contact": 2500806,
"counter.status.200.person": 1000000,
"counter.status.200.contact.location.location": 1000
}

Hazelcast Hot Cache with Striim
https://piotrminkowski.com/2017/08/09/hazelcast-hot-cache-with-striim/ (Wed, 09 Aug 2017)

I have previously published some articles about Hazelcast, an open-source in-memory data grid solution. In the first of them, JPA caching with Hazelcast, Hibernate and Spring Boot, I described how to set up a 2nd-level JPA cache with Hazelcast and MySQL. In the second, In memory data grid with Hazelcast, I showed a more advanced sample of how to use Hazelcast distributed queries to enable faster data access for a Spring Boot application. Using Hazelcast as a cache level between your application and a relational database is generally a very good solution under one condition – all changes go through your application. If the data source is modified by another application that does not use your caching solution, your application ends up working with outdated data. Have you encountered this problem in your organization? In mine, we still use relational databases in almost all our systems. Sometimes this causes performance problems; even optimized queries are too slow for real-time applications. A relational database is still required, so solutions like Hazelcast can help us.

Let’s return to the topic of the outdated cache. That’s where we need Striim, a real-time data integration and streaming analytics software platform. The architecture of the presented solution is shown in the figure below. We have two applications. The first one, employee-service, uses Hazelcast as a cache; the second one, employee-app, performs changes directly on the database. Without a solution like Striim, data changed by employee-app is not visible to employee-service. Striim enables real-time data integration without modifying or slowing down the data source. It uses CDC (Change Data Capture) mechanisms for detecting changes performed on the data source by analyzing binary logs. It supports the most popular transactional databases like Oracle, Microsoft SQL Server and MySQL. Striim has many interesting features, but also one serious drawback – it is not open source. An alternative to the presented solution, especially when using an Oracle database, can be Oracle In-Memory Data Grid with GoldenGate Hot Cache.

striim-figure-1

I prepared a sample application for this article, which is as usual available on GitHub under the striim branch. The employee-service application is based on Spring Boot and has an embedded Hazelcast client that connects to the cluster and to the Hazelcast Management Center. If data is not available in the cache, the application connects to the MySQL database.

1. Starting MySQL and enabling binary log

Let’s start the MySQL database using Docker.

[code]
docker run -d --name mysql -e MYSQL_DATABASE=hz -e MYSQL_USER=hz -e MYSQL_PASSWORD=hz123 -e MYSQL_ALLOW_EMPTY_PASSWORD=yes -p 33306:3306 mysql
[/code]

The binary log is disabled by default. We have to enable it by adding the following lines to mysqld.cnf. Inside the Docker container the configuration file is available under /etc/mysql/mysql.conf.d/mysqld.cnf.

[code]
log_bin = /var/log/mysql/binary.log
expire-logs-days = 14
max-binlog-size = 500M
server-id = 1
[/code]

If you are running MySQL on Docker, restart your container using docker restart mysql.
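
The whole change can be applied from the host with docker cp – a minimal sketch:

[code]
docker cp mysql:/etc/mysql/mysql.conf.d/mysqld.cnf .
# edit mysqld.cnf locally, then copy it back and restart
docker cp mysqld.cnf mysql:/etc/mysql/mysql.conf.d/mysqld.cnf
docker restart mysql
[/code]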

2. Starting Hazelcast Dashboard and Striim

Same as for MySQL, I also used Docker.

[code]
docker run -d --name striim -p 39080:9080 striim/evalversion
docker run -d --name hazelcast-mgmt -p 38080:8080 hazelcast/management-center:3.7.7
[/code]

I selected version 3.7.7 of the Hazelcast Management Center, because this is the version included by default in the Spring Boot release used in the sample application. Now you should be able to log in to the Hazelcast dashboard available at http://192.168.99.100:38080/mancenter/ and to the Striim dashboard available at http://192.168.99.100:39080/ (admin/admin).

3. Starting the sample application

Build the sample application with mvn clean install and start it using java -jar employee-service-1.0-SNAPSHOT.jar. You can test it by calling one of the endpoints:
/employees/person/{id}
/employees/company/{company}
/employees/{id}

Before testing, create the employee table in MySQL and insert some test data (you can run my test class pl.piomin.services.datagrid.employee.data.AddEmployeeRepositoryTest).
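
If you prefer plain SQL to the test class, here's a minimal DDL sketch inferred from the ORM mapping shown in step 4 (the column size is an assumption):

[code]
CREATE TABLE employee (
  id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  person_id INT NOT NULL,
  company VARCHAR(100)
);
[/code]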

4. Configuring entity mapping in Striim

Before creating our first application in Striim we have to provide the mapping configuration. The first step is to copy your entity ORM mapping file into the Docker container filesystem. You can do this using the Striim dashboard or with the docker cp command. Here's my orm.xml file – it is used by the Striim HazelcastWriter while putting data into the cache.

[code language="xml"]
<entity-mappings xmlns="http://www.eclipse.org/eclipselink/xsds/persistence/orm" version="2.4">
<entity name="employee" class="pl.piomin.services.datagrid.employee.model.Employee">
<table name="hz.employee" />
<attributes>
<id name="id" attribute-type="Integer">
<column nullable="false" name="id" />
<generated-value strategy="AUTO" />
</id>
<basic name="personId" attribute-type="Integer">
<column nullable="false" name="person_id" />
</basic>
<basic name="company" attribute-type="String">
<column name="company" />
</basic>
</attributes>
</entity>
</entity-mappings>
[/code]

We also have to provide a jar containing the entity class. It should be placed under the /opt/Striim/lib directory in the Striim Docker container. Importantly, the fields are public – do not make them private with setters, because that won't work with the HazelcastWriter. After all changes, restart your container and proceed to the next steps. For the sample application, just build the employee-model module and upload it to Striim.

[code language="java"]
public class Employee implements Serializable {

private static final long serialVersionUID = 3214253910554454648L;
public Integer id;
public Integer personId;
public String company;

public Employee() {

}

@Override
public String toString() {
return "Employee [id=" + id + ", personId=" + personId + ", company=" + company + "]";
}

}
[/code]
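
Uploading the jar can be done with docker cp as well – a sketch, assuming the module builds to employee-model-1.0-SNAPSHOT.jar:

[code]
docker cp employee-model/target/employee-model-1.0-SNAPSHOT.jar striim:/opt/Striim/lib/
docker restart striim
[/code]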

5. Configuring MySQL CDC connection on Striim

If all the previous steps are completed, we can finally begin to create our first application in Striim. When creating a new app, select Start with Template and then MySQL CDC to Hazelcast. Enter your MySQL connection data and security credentials, then proceed. In addition to validating the connection, Striim also checks whether the binary log is enabled.

Then select the tables to be synchronized with the cache.

striim-3

6. Configuring Hazelcast on Striim

After starting the employee-service application you should see the following fragment in the logs.

[code]
Members [1] {
Member [192.168.8.205]:5701 - d568a38a-7531-453a-b7f8-db2be4715132 this
}
[/code]

This address should be provided as the Hazelcast cluster URL. We should also set the ORM mapping file location and the cluster credentials (by default these are dev/dev-pass).

striim-5

On the next screen you will see the ORM mapping visualization and the input selection. The input is the MySQL server you defined in the fifth step.

striim-7

7. Deploying the application on Striim

After finishing the previous steps you will see the flow diagram. I suggest you create a log file where all input events will be stored as JSON. My diagram is visible in the figure below. When your configuration is finished, deploy and start the application. At this stage I had some problems. For example, whenever I deployed the application after a Striim restart, I always had to change something and save it first, otherwise an exception occurred during deployment. However, after a long struggle with Striim, I finally succeeded in running the application! So we can start testing.

striim-8

8. Checking out

I created a JUnit test to illustrate the cache refresh performed by Striim. Inside this test I invoke the employees/company/{company} REST API method and collect the returned entities. Then I modify the entities with EmployeeRepository, which commits changes directly to the database, bypassing the Hazelcast cache. I invoke the REST API again and compare the results with the entities collected in the previous call. The personId field should no longer be equal to the previously fetched value. You can also test it manually by calling the REST API endpoint and changing something in the database using a client like MySQL Workbench.

[code language="java"]
@SpringBootTest(webEnvironment = WebEnvironment.DEFINED_PORT)
@RunWith(SpringRunner.class)
public class CacheRefreshEmployeeTest {

   protected Logger logger = Logger.getLogger(CacheRefreshEmployeeTest.class.getName());

   @Autowired
   EmployeeRepository repository;

   TestRestTemplate template = new TestRestTemplate();

   @Test
   public void testModifyAndRefresh() {
      // read entities through the REST API (and therefore through the Hazelcast cache)
      Employee[] e = template.getForObject("http://localhost:3333/employees/company/{company}", Employee[].class, "Test001");
      // modify them directly in the database, bypassing the cache
      for (int i = 0; i < e.length; i++) {
         Employee eMod = repository.findOne(e[i].getId());
         eMod.setPersonId(eMod.getPersonId() + 1);
         repository.save(eMod);
      }

      // read again - Striim should have refreshed the cached entries in the meantime
      Employee[] e2 = template.getForObject("http://localhost:3333/employees/company/{company}", Employee[].class, "Test001");
      for (int i = 0; i < e2.length; i++) {
         Assert.assertNotEquals(e[i].getPersonId(), e2[i].getPersonId());
      }
   }
}
[/code]

Here's a picture of the Striim dashboard monitor. We can check how many events have been processed, the current memory and CPU usage, etc.

striim-1

Final Thoughts

I have no definite opinion about Striim. On the one hand, it is an interesting solution with many integration possibilities and a nice dashboard for configuration and monitoring. On the other hand, it is not free from errors and bugs. My application crashed when an exception was thrown because of a missing matching serializer for the entity in the Hazelcast cache, which stopped the processing of any further events. This may be deliberate, but in my opinion subsequent events should still be processed, as they may affect other tables. Application management through the web dashboard is not very comfortable either. Every time I restarted the container I had to change something in the configuration, because otherwise the application threw an unintuitive exception on startup. From this type of application I would expect reliability above all, since it is responsible for keeping the data in Hazelcast up to date. However, despite some drawbacks, Striim is worth a closer look.

The post Hazelcast Hot Cache with Striim appeared first on Piotr's TechBlog.

In memory data grid with Hazelcast https://piotrminkowski.com/2017/05/10/in-memory-data-grid-with-hazelcast/ Wed, 10 May 2017 07:13:47 +0000

In this article we are going to run the Hazelcast in-memory data grid with a Spring Boot application. I have already described how to use the Hibernate second-level cache with Hazelcast in the article JPA caching with Hazelcast, Hibernate and Spring Boot. The big disadvantage of that solution was that entities could be cached only by their primary key. You can additionally enable caching of JPA queries on other indices, but it does not solve the problem completely – such a query can't use the cached entities even if they match its criteria. We will solve that problem by using Hazelcast distributed queries.

hz1

Spring Boot has built-in auto-configuration for Hazelcast. It is enabled if the library is available on the application classpath and a Config @Bean is declared.

@Bean
Config config() {
   Config c = new Config();
   c.setInstanceName("cache-1");
   c.getGroupConfig().setName("dev").setPassword("dev-pass");
   ManagementCenterConfig mcc = new ManagementCenterConfig().setUrl("http://192.168.99.100:38080/mancenter").setEnabled(true);
   c.setManagementCenterConfig(mcc);
   SerializerConfig sc = new SerializerConfig().setTypeClass(Employee.class).setClass(EmployeeSerializer.class);
   c.getSerializationConfig().addSerializerConfig(sc);
   return c;
}

In the code fragment above we declared the cluster name and password credentials, the connection parameters for the Hazelcast Management Center, and the entity serialization configuration. The entity is pretty simple – it has an @Id and two fields used for searching: personId and company.

@Entity
public class Employee implements Serializable {

   private static final long serialVersionUID = 3214253910554454648L;

   @Id
   @GeneratedValue
   private Integer id;
   private Integer personId;
   private String company;

   public Integer getId() {
      return id;
   }

   public void setId(Integer id) {
      this.id = id;
   }

   public Integer getPersonId() {
      return personId;
   }

   public void setPersonId(Integer personId) {
      this.personId = personId;
   }

   public String getCompany() {
      return company;
   }

   public void setCompany(String company) {
      this.company = company;
   }

}

Every entity needs a declared serializer if it is to be inserted into and selected from the cache. There are some default serializers available inside the Hazelcast library, but I implemented a custom one for our sample. It is based on StreamSerializer and ObjectDataInput.

public class EmployeeSerializer implements StreamSerializer<Employee> {

   @Override
   public int getTypeId() {
      return 1;
   }

   @Override
   public void write(ObjectDataOutput out, Employee employee) throws IOException {
      out.writeInt(employee.getId());
      out.writeInt(employee.getPersonId());
      out.writeUTF(employee.getCompany());
   }

   @Override
   public Employee read(ObjectDataInput in) throws IOException {
      Employee e = new Employee();
      e.setId(in.readInt());
      e.setPersonId(in.readInt());
      e.setCompany(in.readUTF());
      return e;
   }

   @Override
   public void destroy() {
   }

}

There is also a DAO interface for interacting with the database. It has two search methods and extends Spring Data CrudRepository.


public interface EmployeeRepository extends CrudRepository<Employee, Integer> {

   public Employee findByPersonId(Integer personId);
   public List<Employee> findByCompany(String company);

}

The Hazelcast instance is embedded in the application. When starting the Spring Boot application we have to provide the VM argument -DPORT, which is used for exposing the service's REST API. Hazelcast automatically detects other running member instances, and its own port is incremented out of the box. Here's the REST @Controller class with the exposed API.

@RestController
public class EmployeeController {

   private Logger logger = Logger.getLogger(EmployeeController.class.getName());

   @Autowired
   EmployeeService service;

   @GetMapping("/employees/person/{id}")
   public Employee findByPersonId(@PathVariable("id") Integer personId) {
      logger.info(String.format("findByPersonId(%d)", personId));
      return service.findByPersonId(personId);
   }

   @GetMapping("/employees/company/{company}")
   public List<Employee> findByCompany(@PathVariable("company") String company) {
      logger.info(String.format("findByCompany(%s)", company));
      return service.findByCompany(company);
   }

   @GetMapping("/employees/{id}")
   public Employee findById(@PathVariable("id") Integer id) {
      logger.info(String.format("findById(%d)", id));
      return service.findById(id);
   }

   @PostMapping("/employees")
   public Employee add(@RequestBody Employee emp) {
      logger.info(String.format("add(%s)", emp));
      return service.add(emp);
   }

}
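
For example, a new employee can be added and then fetched with simple HTTP calls – a sketch, assuming an instance running on port 2222 and arbitrary sample values:

$ curl -X POST -H "Content-Type: application/json" -d '{"personId":1,"company":"Test001"}' http://localhost:2222/employees
$ curl http://localhost:2222/employees/company/Test001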

The @Service is injected into the EmployeeController. Inside EmployeeService there is a simple implementation of switching between the Hazelcast cache instance and the Spring Data @Repository DAO. In every find method we first try to find the data in the cache, and if it is not there we search the database and then put the found entity into the cache.

@Service
public class EmployeeService {

   private Logger logger = Logger.getLogger(EmployeeService.class.getName());

   @Autowired
   EmployeeRepository repository;
   @Autowired
   HazelcastInstance instance;

   IMap<Integer, Employee> map;

   @PostConstruct
   public void init() {
      map = instance.getMap("employee");
      map.addIndex("company", true);
      logger.info("Employees cache: " + map.size());
   }

   @SuppressWarnings("rawtypes")
   public Employee findByPersonId(Integer personId) {
      Predicate predicate = Predicates.equal("personId", personId);
      logger.info("Employee cache find");
      Collection<Employee> ps = map.values(predicate);
      logger.info("Employee cached: " + ps);
      Optional<Employee> e = ps.stream().findFirst();
      if (e.isPresent())
         return e.get();
      logger.info("Employee database find");
      Employee emp = repository.findByPersonId(personId);
      logger.info("Employee: " + emp);
      map.put(emp.getId(), emp);
      return emp;
   }

   @SuppressWarnings("rawtypes")
   public List<Employee> findByCompany(String company) {
      Predicate predicate = Predicates.equal("company", company);
      logger.info("Employees cache find");
      Collection<Employee> ps = map.values(predicate);
      logger.info("Employees cache size: " + ps.size());
      if (ps.size() > 0) {
         return ps.stream().collect(Collectors.toList());
      }
      logger.info("Employees find");
      List<Employee> e = repository.findByCompany(company);
      logger.info("Employees size: " + e.size());
      e.parallelStream().forEach(it -> {
         map.putIfAbsent(it.getId(), it);
      });
      return e;
   }

   public Employee findById(Integer id) {
      Employee e = map.get(id);
      if (e != null)
         return e;
      e = repository.findOne(id);
      map.put(id, e);
      return e;
   }

   public Employee add(Employee e) {
      e = repository.save(e);
      map.put(e.getId(), e);
      return e;
   }

}

If you are interested in running the sample application you can clone my repository on GitHub. The person-service module contains the example for my previous article about the Hibernate 2nd level cache with Hazelcast, while the employee module contains the example for this article.

Testing

Let's start three instances of the employee service on different ports using the VM argument -DPORT. In the figure at the beginning of the article these ports are 2222, 3333 and 4444.
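
The start commands look like this – a sketch, assuming the jar name follows the same convention as the other modules in the sample repository:

$ java -jar -DPORT=2222 target/employee-service-1.0-SNAPSHOT.jar
$ java -jar -DPORT=3333 target/employee-service-1.0-SNAPSHOT.jar
$ java -jar -DPORT=4444 target/employee-service-1.0-SNAPSHOT.jar

When starting the third and last service instance you should see the fragment visible below in the application logs. It means that a Hazelcast cluster of three members has been set up.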

2017-05-09 23:01:48.127  INFO 16432 --- [ration.thread-0] c.h.internal.cluster.ClusterService      : [192.168.1.101]:5703 [dev] [3.7.7]

Members [3] {
   Member [192.168.1.101]:5701 - 7a8dbf3d-a488-4813-a312-569f0b9dc2ca
   Member [192.168.1.101]:5702 - 494fd1ac-341b-451c-b585-1ad58a280fac
   Member [192.168.1.101]:5703 - 9750bd3c-9cf8-48b8-a01f-b14c915937c3 this
}

Here is a picture from the Hazelcast Management Center for two running members (only two members are visible in the freeware version of the Hazelcast Management Center).

hz-1.png

Then run the Docker containers with MySQL and the Hazelcast Management Center.

$ docker run -d --name mysql -p 33306:3306 mysql
$ docker run -d --name hazelcast-mgmt -p 38080:8080 hazelcast/management-center:latest

Now you can try calling the endpoint http://localhost:<port>/employees/company/{company} on each of your service instances. You should see that the data is cached in the cluster: even if you call the endpoint on a different instance, it finds entities put into the cache by another one. After several attempts my service instances put about 100k entities into the cache, with a 50/50 distribution between the two Hazelcast members.
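
Such a test can be performed with plain curl calls against different instances – for example (the company value is just sample data):

$ curl http://localhost:2222/employees/company/Test001
$ curl http://localhost:3333/employees/company/Test001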

hz-2

Final Words

We could probably implement a smarter solution to the problem described in this article, but I just wanted to show you the idea. I tried to use Spring Data Hazelcast for that, but I had problems running it in a Spring Boot application. It has a HazelcastRepository interface, which is something similar to Spring Data CrudRepository, but based on cached entities in the Hazelcast grid, and it also uses the Spring Data KeyValue module. The project is not well documented and, like I said before, it didn't work with Spring Boot, so I decided to implement my own simple solution 🙂

In my local environment, queries on the cache were about 10 times faster than similar queries on the database (I inserted 2M records into the employee table). A Hazelcast data grid can be not only a 2nd level cache but even a middleware layer between your application and the database. If your priority is the performance of queries on large amounts of data, and you can reserve a lot of RAM for the in-memory data grid, Hazelcast is the right solution for you 🙂

The post In memory data grid with Hazelcast appeared first on Piotr's TechBlog.

JPA caching with Hazelcast, Hibernate and Spring Boot https://piotrminkowski.com/2017/05/08/jpa-caching-with-hazelcast-hibernate-and-spring-boot/ Mon, 08 May 2017 07:55:25 +0000

Preface

An In-Memory Data Grid is an in-memory distributed key-value store that enables caching data across distributed clusters. Do not confuse this solution with in-memory or NoSQL databases. In most cases it is used for performance reasons – all data is stored in RAM, not on disk like in traditional databases. I first came across an in-memory data grid when we considered moving to Oracle Coherence in one of the organizations I had worked for. The solution really made me curious. Oracle Coherence is obviously a paid solution, but there are also some open-source alternatives, among which the most interesting seem to be Apache Ignite and Hazelcast. Today I'm going to show you how to use Hazelcast for caching data stored in a MySQL database and accessed by Spring Data DAO objects. Here's the figure illustrating the architecture of the presented solution.

hazelcast-1

Implementation

1. Starting Docker containers

We use three Docker containers: the first with a MySQL database, the second with a Hazelcast instance, and the third with the Hazelcast Management Center – a UI dashboard for monitoring Hazelcast cluster instances.

$ docker run -d --name mysql -p 33306:3306 mysql
$ docker run -d --name hazelcast -p 5701:5701 hazelcast/hazelcast
$ docker run -d --name hazelcast-mgmt -p 38080:8080 hazelcast/management-center:latest
 

If we would like to connect the Hazelcast instance with the Hazelcast Management Center, we need to place a custom hazelcast.xml in the /opt/hazelcast directory inside the Docker container. This can be done in two ways: by extending the Hazelcast base image, or just by copying the file into the existing hazelcast container and restarting it.

$ docker run -d --name hazelcast -p 5701:5701 hazelcast/hazelcast
$ docker cp hazelcast.xml hazelcast:/opt/hazelcast/hazelcast.xml
$ docker stop hazelcast
$ docker start hazelcast
 

Here's the most important fragment of the Hazelcast configuration file.

<hazelcast xmlns="http://www.hazelcast.com/schema/config" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.hazelcast.com/schema/config http://www.hazelcast.com/schema/config/hazelcast-config-3.8.xsd">
   <group>
      <name>dev</name>
      <password>dev-pass</password>
   </group>
   <management-center enabled="true" update-interval="3">http://192.168.99.100:38080/mancenter</management-center>
...
</hazelcast>
 

The Hazelcast dashboard is available at http://192.168.99.100:38080/mancenter. There we can monitor all running cluster members, maps, and some other parameters.

hazelcast-mgmt-1

2. Maven configuration

The project is based on Spring Boot 1.5.3.RELEASE. We also need to add the Spring Web and MySQL Java connector dependencies. Here's the root project pom.xml.

<parent>
   <groupId>org.springframework.boot</groupId>
   <artifactId>spring-boot-starter-parent</artifactId>
   <version>1.5.3.RELEASE</version>
</parent>
...
<dependencies>
   <dependency>
      <groupId>org.springframework.boot</groupId>
      <artifactId>spring-boot-starter-web</artifactId>
   </dependency>
   <dependency>
      <groupId>mysql</groupId>
      <artifactId>mysql-connector-java</artifactId>
      <scope>runtime</scope>
   </dependency>
   ...
</dependencies>
 

Inside the person-service module we declare some additional dependencies on Hazelcast artifacts and Spring Data JPA. I had to override the hibernate-core version managed by Spring Boot 1.5.3.RELEASE, because Hazelcast didn't work properly with 5.0.12.Final – it needs hibernate-core in version 5.0.9.Final. Otherwise, an exception occurs when starting the application.

<dependencies>
   <dependency>
      <groupId>org.springframework.boot</groupId>
      <artifactId>spring-boot-starter-data-jpa</artifactId>
   </dependency>
   <dependency>
      <groupId>com.hazelcast</groupId>
      <artifactId>hazelcast</artifactId>
   </dependency>
   <dependency>
      <groupId>com.hazelcast</groupId>
      <artifactId>hazelcast-client</artifactId>
   </dependency>
   <dependency>
      <groupId>com.hazelcast</groupId>
      <artifactId>hazelcast-hibernate5</artifactId>
   </dependency>
   <dependency>
      <groupId>org.hibernate</groupId>
      <artifactId>hibernate-core</artifactId>
      <version>5.0.9.Final</version>
   </dependency>
</dependencies>
 

3. Hibernate Cache configuration

You can probably configure it in several different ways, but for me the most suitable place was application.yml. Here's the YAML configuration file fragment. I enabled the L2 Hibernate cache and set the Hazelcast native client address, the credentials, and the cache factory class HazelcastCacheRegionFactory. We could also set HazelcastLocalCacheRegionFactory. The difference between them lies in performance – the local factory is faster, since its operations are handled locally rather than as distributed calls. On the other hand, if you use HazelcastCacheRegionFactory, you can see your maps in the Management Center.

spring:
  application:
    name: person-service
  datasource:
    url: jdbc:mysql://192.168.99.100:33306/datagrid?useSSL=false
    username: datagrid
    password: datagrid
  jpa:
    properties:
      hibernate:
        show_sql: true
    cache:
      use_query_cache: true
      use_second_level_cache: true
      hazelcast:
        use_native_client: true
        native_client_address: 192.168.99.100:5701
        native_client_group: dev
        native_client_password: dev-pass
      region:
        factory_class: com.hazelcast.hibernate.HazelcastCacheRegionFactory
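
Switching to the local factory requires changing only the factory class – a sketch of the modified fragment:

      region:
        factory_class: com.hazelcast.hibernate.HazelcastLocalCacheRegionFactory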
 

4. Application code

First, we need to enable caching for the Person @Entity.

@Cache(usage = CacheConcurrencyStrategy.READ_WRITE)
@Entity
public class Person implements Serializable {

   private static final long serialVersionUID = 3214253910554454648L;

   @Id
   @GeneratedValue
   private Integer id;
   private String firstName;
   private String lastName;
   private String pesel;
   private int age;

   public Integer getId() {
      return id;
   }

   public void setId(Integer id) {
      this.id = id;
   }

   public String getFirstName() {
      return firstName;
   }

   public void setFirstName(String firstName) {
      this.firstName = firstName;
   }

   public String getLastName() {
      return lastName;
   }

   public void setLastName(String lastName) {
      this.lastName = lastName;
   }

   public String getPesel() {
      return pesel;
   }

   public void setPesel(String pesel) {
      this.pesel = pesel;
   }

   public int getAge() {
      return age;
   }

   public void setAge(int age) {
      this.age = age;
   }

   @Override
   public String toString() {
      return "Person [id=" + id + ", firstName=" + firstName + ", lastName=" + lastName + ", pesel=" + pesel + "]";
   }

}
 

The DAO is implemented using Spring Data CrudRepository. The sample application source code is available on GitHub.

public interface PersonRepository extends CrudRepository<Person, Integer> {
   public List<Person> findByPesel(String pesel);
}
 

Testing

Let's insert a little more data into the table. You can use my AddPersonRepositoryTest for that – it inserts 1M rows into the person table. Finally, we can call the endpoint http://localhost:2222/persons/{id} twice with the same id. For me it looks like below: 22 ms for the first call, 3 ms for the next one, which is read from the L2 cache. An entity can be cached only by its primary key. If you call http://localhost:2222/persons/pesel/{pesel}, the entity will always be looked up in the database, bypassing the L2 cache.

2017-05-05 17:07:27.360 DEBUG 9164 --- [nio-2222-exec-9] org.hibernate.SQL                        : select person0_.id as id1_0_0_, person0_.age as age2_0_0_, person0_.first_name as first_na3_0_0_, person0_.last_name as last_nam4_0_0_, person0_.pesel as pesel5_0_0_ from person person0_ where person0_.id=?
Hibernate: select person0_.id as id1_0_0_, person0_.age as age2_0_0_, person0_.first_name as first_na3_0_0_, person0_.last_name as last_nam4_0_0_, person0_.pesel as pesel5_0_0_ from person person0_ where person0_.id=?
2017-05-05 17:07:27.362 DEBUG 9164 --- [nio-2222-exec-9] o.h.l.p.e.p.i.ResultSetProcessorImpl     : Starting ResultSet row #0
2017-05-05 17:07:27.362 DEBUG 9164 --- [nio-2222-exec-9] l.p.e.p.i.EntityReferenceInitializerImpl : On call to EntityIdentifierReaderImpl#resolve, EntityKey was already known; should only happen on root returns with an optional identifier specified
2017-05-05 17:07:27.363 DEBUG 9164 --- [nio-2222-exec-9] o.h.engine.internal.TwoPhaseLoad         : Resolving associations for [pl.piomin.services.datagrid.person.model.Person#444]
2017-05-05 17:07:27.364 DEBUG 9164 --- [nio-2222-exec-9] o.h.engine.internal.TwoPhaseLoad         : Adding entity to second-level cache: [pl.piomin.services.datagrid.person.model.Person#444]
2017-05-05 17:07:27.373 DEBUG 9164 --- [nio-2222-exec-9] o.h.engine.internal.TwoPhaseLoad         : Done materializing entity [pl.piomin.services.datagrid.person.model.Person#444]
2017-05-05 17:07:27.373 DEBUG 9164 --- [nio-2222-exec-9] o.h.r.j.i.ResourceRegistryStandardImpl   : HHH000387: ResultSet's statement was not registered
2017-05-05 17:07:27.374 DEBUG 9164 --- [nio-2222-exec-9] .l.e.p.AbstractLoadPlanBasedEntityLoader : Done entity load : pl.piomin.services.datagrid.person.model.Person#444
2017-05-05 17:07:27.374 DEBUG 9164 --- [nio-2222-exec-9] o.h.e.t.internal.TransactionImpl         : committing
2017-05-05 17:07:30.168 DEBUG 9164 --- [nio-2222-exec-6] o.h.e.t.internal.TransactionImpl         : begin
2017-05-05 17:07:30.171 DEBUG 9164 --- [nio-2222-exec-6] o.h.e.t.internal.TransactionImpl         : committing
 

Query Cache

We can enable JPA query caching by marking repository methods with the @Cacheable annotation and adding @EnableCaching to the main class definition.

public interface PersonRepository extends CrudRepository<Person, Integer> {

   @Cacheable("findByPesel")
   public List<Person> findByPesel(String pesel);

}
 

In addition to the @EnableCaching annotation we should declare HazelcastInstance and CacheManager beans. As the cache manager, HazelcastCacheManager from the hazelcast-spring library is used.

@SpringBootApplication
@EnableCaching
public class PersonApplication {

   public static void main(String[] args) {
      SpringApplication.run(PersonApplication.class, args);
   }

   @Bean
   HazelcastInstance hazelcastInstance() {
      ClientConfig config = new ClientConfig();
      config.getGroupConfig().setName("dev").setPassword("dev-pass");
      config.getNetworkConfig().addAddress("192.168.99.100");
      config.setInstanceName("cache-1");
      HazelcastInstance instance = HazelcastClient.newHazelcastClient(config);
      return instance;
   }

   @Bean
   CacheManager cacheManager() {
      return new HazelcastCacheManager(hazelcastInstance());
   }

}
 

Now we can try to find a person by PESEL number by calling the endpoint http://localhost:2222/persons/pesel/{pesel}. The cached query is stored as a map, as you can see in the picture below.

hazelcast-3

Clustering

Before the final words, let me say a little about clustering, which is the key functionality of the Hazelcast in-memory data grid. In the previous chapters we relied on a single Hazelcast instance. Let's begin by running a second container with Hazelcast exposed on a different port.

$ docker run -d --name hazelcast2 -p 5702:5701 hazelcast/hazelcast
 

Now we should make one change in the hazelcast.xml configuration file. Because the data grid runs inside Docker containers, the public address has to be set. For the first container it is 192.168.99.100:5701, and for the second 192.168.99.100:5702, because it is exposed on port 5702.

<network>
...
   <public-address>192.168.99.100:5701</public-address>
...
</network>
 

When starting the person-service application you should see a log fragment similar to the one below – a connection with two cluster members.


Members [2] {
   Member [192.168.99.100]:5702 - 04f790bc-6c2d-4c21-ba8f-7761a4a7422c
   Member [192.168.99.100]:5701 - 2ca6e30d-a8a7-46f7-b1fa-37921aaa0e6b
}
 

All running Hazelcast instances are visible in the Management Center.

hazelcast-2

Conclusion

Caching and clustering with Hazelcast are simple and fast. We can cache JPA entities and queries, and monitoring is handled via the Hazelcast Management Center dashboard. One problem for me is that entities can be cached only by primary key. If I want to find an entity by another index, like the PESEL number, I have to cache the findByPesel query. Even if the entity was cached before by id, the query will not find it in the cache but will perform SQL on the database – only subsequent calls of that query are served from the cache. I'll show you a smarter solution to that problem in my next article on this subject, In memory data grid with Hazelcast.

The post JPA caching with Hazelcast, Hibernate and Spring Boot appeared first on Piotr's TechBlog.
