AI Archives - Piotr's TechBlog https://piotrminkowski.com/tag/ai/ Java, Spring, Kotlin, microservices, Kubernetes, containers Tue, 17 Feb 2026 16:25:21 +0000 en-US hourly 1 https://wordpress.org/?v=6.9.1 https://i0.wp.com/piotrminkowski.com/wp-content/uploads/2020/08/cropped-me-2-tr-x-1.png?fit=32%2C32&ssl=1 AI Archives - Piotr's TechBlog https://piotrminkowski.com/tag/ai/ 32 32 181738725 Create Apps with Claude Code on Ollama https://piotrminkowski.com/2026/02/17/create-apps-with-claude-code-on-ollama/ https://piotrminkowski.com/2026/02/17/create-apps-with-claude-code-on-ollama/#comments Tue, 17 Feb 2026 16:25:18 +0000 https://piotrminkowski.com/?p=15992 This article explains how to run Claude Code on Ollama and use local or cloud models served by Ollama to create Java apps. Read this article if you are experimenting with AI code generation and using paid APIs for this purpose. Relatively recently, Ollama has made a built-in integration with developer tools such as Codex […]

The post Create Apps with Claude Code on Ollama appeared first on Piotr's TechBlog.

]]>
This article explains how to run Claude Code on Ollama and use local or cloud models served by Ollama to create Java apps. Read this article if you are experimenting with AI code generation and using paid APIs for this purpose. Relatively recently, Ollama has made a built-in integration with developer tools such as Codex and Claude Code available. This is a really useful feature. Using the example of integration with Claude Coda and several different models running both locally and in the cloud, you will see how it works.

You can find other articles about AI and Java on my blog. For example, if you are interested in how to use Ollama to serve models for Spring AI applications, you can read the following article.

Source Code

Feel free to use my source code if you’d like to try it out yourself. To do that, you must clone my sample GitHub repository. Then you should only follow my instructions. This repository contains several branches, each with an application generated from the same prompt using different models. Currently, the branch with the fewest comments in the code review has been merged into master. This is the version of the code generated using the glm-5 model. However, this may change in the future, and the master branch may be modified. Therefore, it is best to simply refer to the individual branches or pull requests shown below.

Below is the current list of branches. The dev branch contains the initial version of the repository with the CLAUDE.md file, which specifies the basic requirements for the generated code.

$ git branch
    dev
    glm-5
    gpt-oss
  * master
    minimax
    qwen3-coder
ShellSession

Here are instructions for AI from the CLAUDE.md file. They include a description of the technologies I plan to use in my application and a few practices I intend to apply. For example, I don’t want to use Lombok, a popular Java library that automates the generation of code parts such as getters, setters, and constructors. It seems that in the age of AI, this approach doesn’t make sense, but for some reason, AI models really like this library 🙂 Also, each time I make a code change, I want the LLM model to increment the version number and update the README.md file, etc.

# Project Instructions

- Always use the latest versions of dependencies.
- Always write Java code as the Spring Boot application.
- Always use Maven for dependency management.
- Always create test cases for the generated code both positive and negative.
- Always generate the CircleCI pipeline in the .circleci directory to verify the code.
- Minimize the amount of code generated.
- The Maven artifact name must be the same as the parent directory name.
- Use semantic versioning for the Maven project. Each time you generate a new version, bump the PATCH section of the version number.
- Use `pl.piomin.services` as the group ID for the Maven project and base Java package.
- Do not use the Lombok library.
- Generate the Docker Compose file to run all components used by the application.
- Update README.md each time you generate a new version.
Markdown

Run Claude on Ollama

First, install Ollama on your computer. You can download the installer for your OS here. If you have used Ollama before, please update to the latest version.

$ ollama --version
  ollama version is 0.16.1
ShellSession

Next, install Claude Code.

curl -fsSL https://claude.ai/install.sh | bash
ShellSession

Before you start, it is worth increasing the maximum context window value allowed by Ollama. By default, it is set to 4k, and on the Ollama website itself, you will find a recommendation of 64k for Claude Code. I set the maximum value to 256k for testing different models.

For example, the gpt-oss model supports a 128k context window size.

Let’s pull and run the gpt-oss model with Ollama:

ollama run gpt-oss
ShellSession

After downloading and launching, you can verify the model parameters with the ollama ps command. If you have 100% GPU and a context window size of ~131k, that’s exactly what I meant.

Ensure you are in the root repository directory, then run Claude Code with the command ollama launch claude. Next, choose the gpt-oss model visible in the list under “More”.

ollama-claude-code-gpt-oss

That’s it! Finally, we can start playing with AI.

ollama-claude-code-run

Generate a Java App with Claude Code

My application will be very simple. I just need something to quickly test the solution. Of course, all guidelines defined in the CLAUDE.md file should be followed. So, here is my prompt. Nothing more, nothing less 🙂

Generate an application that exposes REST API and connects to a PostgreSQL database.
The application should have a Person entity with id, and typical fields related to each person.
All REST endpoints should be protected with JWT and OAuth2.
The codebase should use Skaffold to deploy on Kubernetes.
Plaintext

After a few minutes, I have the entire code generated. Below is a summary from the AI of what has been done. If you want to check it out for yourself, take a look at this branch in my repository.

ollama-claude-code-generated

For the sake of formality, let’s take a look at the generated code. There is nothing spectacular here, because it is just a regular Spring Boot application that exposes a few REST endpoints for CRUD operations. However, it doen’t look bad. Here’s the Spring Boot @Service implementation responsible for using PersonRepository to interact with database.

@Service
public class PersonService {
    private final PersonRepository repository;

    public PersonService(PersonRepository repository) {
        this.repository = repository;
    }

    public List<Person> findAll() {
        return repository.findAll();
    }

    public Optional<Person> findById(Long id) {
        return repository.findById(id);
    }

    @Transactional
    public Person create(Person person) {
        return repository.save(person);
    }

    @Transactional
    public Optional<Person> update(Long id, Person person) {
        return repository.findById(id).map(existing -> {
            existing.setFirstName(person.getFirstName());
            existing.setLastName(person.getLastName());
            existing.setEmail(person.getEmail());
            existing.setAge(person.getAge());
            return repository.save(existing);
        });
    }

    @Transactional
    public void delete(Long id) {
        repository.deleteById(id);
    }
}
Java

Here’s the generated @RestController witn REST endpoints implementation:

@RestController
@RequestMapping("/api/people")
public class PersonController {
    private final PersonService service;

    public PersonController(PersonService service) {
        this.service = service;
    }

    @GetMapping
    public List<Person> getAll() {
        return service.findAll();
    }

    @GetMapping("/{id}")
    public ResponseEntity<Person> getById(@PathVariable Long id) {
        Optional<Person> person = service.findById(id);
        return person.map(ResponseEntity::ok).orElseGet(() -> ResponseEntity.notFound().build());
    }

    @PostMapping
    public ResponseEntity<Person> create(@RequestBody Person person) {
        Person saved = service.create(person);
        return ResponseEntity.status(201).body(saved);
    }

    @PutMapping("/{id}")
    public ResponseEntity<Person> update(@PathVariable Long id, @RequestBody Person person) {
        Optional<Person> updated = service.update(id, person);
        return updated.map(ResponseEntity::ok).orElseGet(() -> ResponseEntity.notFound().build());
    }

    @DeleteMapping("/{id}")
    public ResponseEntity<Void> delete(@PathVariable Long id) {
        service.delete(id);
        return ResponseEntity.noContent().build();
    }
}
Java

Below is a summary in a pull request with the generated code.

ollama-claude-code-pr

Using Ollama Cloud Models

Recently, Ollama has made it possible to run models not only locally, but also in the cloud. By default, all models tagged with cloud are run this way. Cloud models are automatically offloaded to Ollama’s cloud service while offering the same capabilities as local models. This is the most useful for larger models that wouldn’t fit on a personal computer. You can for example try to experiment with the qwen3-coder model locally. Unfortunately, it didn’t look very good on my laptop.

Then, I can run a same or event a larger model in cloud and automatically connect Claude Code with that model using the following command:

ollama launch claude --model qwen3-coder:480b-cloud
Java

Now you can repeat exactly the same exercise as before or take a look at my branch containing the code generated using this model.

You can also try some other cloud models like minimax-m2.5 or glm-5.

Conclusion

If you’re developing locally and don’t want to burn money on APIs, use Claude Code with Ollama, and e.g., the gpt-oss or glm-5 models. It’s a pretty powerful and free option. If you have a powerful personal computer, the locally launched model should be able to generate the code efficiently. Otherwise, you can use the option of launching the model in the cloud offered by Ollama free of charge up to a certain usage limit (it is difficult to say exactly what that limit is). The gpt-oss model worked really well on my laptop (MacBook Pro M3), and it took about 7-8 minutes to generate the application. You can also look for a model that suits you better.

The post Create Apps with Claude Code on Ollama appeared first on Piotr's TechBlog.

]]>
https://piotrminkowski.com/2026/02/17/create-apps-with-claude-code-on-ollama/feed/ 5 15992
MCP with Quarkus LangChain4j https://piotrminkowski.com/2025/11/24/mcp-with-quarkus-langchain4j/ https://piotrminkowski.com/2025/11/24/mcp-with-quarkus-langchain4j/#respond Mon, 24 Nov 2025 07:45:15 +0000 https://piotrminkowski.com/?p=15845 This article shows how to use Quarkus LangChain4j support for MCP (Model Context Protocol) on both the server and client sides. You will learn how to serve tools and prompts on the server side and discover them in the Quarkus MCP client-side application. The Model Context Protocol is a standard for managing contextual interactions with […]

The post MCP with Quarkus LangChain4j appeared first on Piotr's TechBlog.

]]>
This article shows how to use Quarkus LangChain4j support for MCP (Model Context Protocol) on both the server and client sides. You will learn how to serve tools and prompts on the server side and discover them in the Quarkus MCP client-side application. The Model Context Protocol is a standard for managing contextual interactions with AI models. It provides a standardized way to connect AI models to external data sources and tools. It can help with building complex workflows on top of LLMs.

This article is the second part of a series describing some of the Quarkus AI project’s most notable features.  Before reading this article, I recommend checking out two previous parts of the tutorial:

You can also compare Quarkus’ support for MCP with similar support on the Spring AI side. You can find the article I mentioned on my blog here.

Source Code

Feel free to use my source code if you’d like to try it out yourself. To do that, you must clone my sample GitHub repository. Then you should only follow my instructions.

Architecture for the Quarkus MCP Scenario

Let’s start with a diagram of our application architecture. Two Quarkus applications act as MCP servers. They connect to the in-memory database and use Quarkus LangChain4j MCP Server support to expose @Tool methods to the MCP client-side app. The client-side app communicates with the OpenAI model. It includes the tools exposed by the server-side apps in the user query to the AI model. The person-mcp-server app provides @Tool methods for searching persons in the database table. The account-mcp-server is doing the same for the persons’ accounts.

quarkus-mcp-arch

Build an MCP Server with Quarkus

Both MCP server applications are similar. They connect to the H2 database via the Quarkus Panache ORM extension. Both provide MCP API via Server-Sent Events (SSE) transport. Here’s a list of required Maven dependencies:

<dependencies>
  <dependency>
    <groupId>io.quarkiverse.mcp</groupId>
    <artifactId>quarkus-mcp-server-sse</artifactId>
  </dependency>
  <dependency>
    <groupId>io.quarkus</groupId>
    <artifactId>quarkus-hibernate-orm-panache</artifactId>
  </dependency>
  <dependency>
    <groupId>io.quarkus</groupId>
    <artifactId>quarkus-jdbc-h2</artifactId>
  </dependency>
  <dependency>
    <groupId>io.quarkus</groupId>
    <artifactId>quarkus-junit5</artifactId>
    <scope>test</scope>
  </dependency>
</dependencies>
XML

Let’s start with the person-mcp-server application. Here’s the @Entity class for interacting with the person table. It uses Panache support to avoid the need for getter and setter declarations.

@Entity
public class Person extends PanacheEntityBase {

    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    public Long id;
    public String firstName;
    public String lastName;
    public int age;
    public String nationality;
    @Enumerated(EnumType.STRING)
    public Gender gender;

}
Java

The PersonRepository class contains a single method for searching persons by their nationality:

@ApplicationScoped
public class PersonRepository implements PanacheRepository<Person> {

    public List<Person> findByNationality(String nationality) {
        return find("nationality", nationality).list();
    }

}
Java

Next, prepare the “tools service” that searches for a single person by ID or a list of people of a given nationality in the database. Each method must be annotated with @Tool and include a description in the description field. Quarkus LangChain4j does not allow a Java List to be returned, so we need to wrap it using a dedicated Persons object.

@ApplicationScoped
public class PersonTools {

    PersonRepository personRepository;

    public PersonTools(PersonRepository personRepository) {
        this.personRepository = personRepository;
    }

    @Tool(description = "Find person by ID")
    public Person getPersonById(
            @ToolArg(description = "Person ID") Long id) {
        return personRepository.findById(id);
    }

    @Tool(description = "Find all persons by nationality")
    public Persons getPersonsByNationality(
            @ToolArg(description = "Nationality") String nationality) {
        return new Persons(personRepository.findByNationality(nationality));
    }
}
Java

Here’s our List<Person> wrapper:

public class Persons {

    private List<Person> persons;

    public Persons(List<Person> persons) {
        this.persons = persons;
    }

    public List<Person> getPersons() {
        return persons;
    }

    public void setPersons(List<Person> persons) {
        this.persons = persons;
    }
}
Java

The implementation of the account-mcp-server application is essentially very similar. Here’s the @Entity class for interacting with the account table. It uses Panache support to avoid the need for getter and setter declarations.

@Entity
public class Account extends PanacheEntityBase {

    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    public Long id;
    public String number;
    public int balance;
    public Long personId;

}
Java

The AccountRepository class contains a single method for searching accounts by person ID:

@ApplicationScoped
public class AccountRepository implements PanacheRepository<Account> {

    public List<Account> findByPersonId(Long personId) {
        return find("personId", personId).list();
    }

}
Java

Once again, the list inside the “tools service” must be wrapped by the dedicated object. The single method annotated with @Tool returns a list of accounts assigned to a given person.

@ApplicationScoped
public class AccountTools {

    AccountRepository accountRepository;

    public AccountTools(AccountRepository accountRepository) {
        this.accountRepository = accountRepository;
    }

    @Tool(description = "Find all accounts by person ID")
    public Accounts getAccountsByPersonId(
            @ToolArg(description = "Person ID") Long personId) {
        return new Accounts(accountRepository.findByPersonId(personId));
    }

}
Java

The person-mcp-server starts on port 8082, while the account-mcp-server listens on port 8081. To change the default HTTP, use the quarkus.http.port property in your application.properties file.

Build an MCP Client with Quarkus

Our application interacts with the OpenAI chat model, so we must include the Quarkus LangChain4j OpenAI extension. In turn, to integrate the client-side application with MCP-compliant servers, we need to include the quarkus-langchain4j-mcp extension.

<dependencies>
  <dependency>
    <groupId>io.quarkus</groupId>
    <artifactId>quarkus-rest-jackson</artifactId>
  </dependency>
  <dependency>
    <groupId>io.quarkiverse.langchain4j</groupId>
    <artifactId>quarkus-langchain4j-openai</artifactId>
    <version>${quarkus.langchain4j.version}</version>
  </dependency>
  <dependency>
    <groupId>io.quarkiverse.langchain4j</groupId>
    <artifactId>quarkus-langchain4j-mcp</artifactId>
    <version>${quarkus.langchain4j.version}</version>
  </dependency>
</dependencies>
XML

The sample-client Quarkus app interacts with both the person-mcp-server app and the account-mcp-server app. Therefore, it defines two AI services. As with a standard AI application in Quarkus, those services must be annotated with @RegisterAiService. Then we define methods and prompt templates, also with annotations @UserMessage or @SystemMessage. If a given method is to use one of the MCP servers, it must be annotated with @McpToolBox. The name inside the annotation corresponds to the MCP server name set in the configuration properties. The PersonService AI service visible below uses the person-service MCP server.

@ApplicationScoped
@RegisterAiService
public interface PersonService {

    @SystemMessage("""
        You are a helpful assistant that generates realistic person data.
        Always respond with valid JSON format.
        """)
    @UserMessage("""
        Find persons with {nationality} nationality.
        Output **only valid JSON**, no explanations, no markdown, no ```json blocks.
        """)
    @McpToolBox("person-service")
    Persons findByNationality(String nationality);

    @SystemMessage("""
        You are a helpful assistant that generates realistic person data.
        Always respond with valid JSON format.
        """)
    @UserMessage("How many persons come from {nationality} ?")
    @McpToolBox("person-service")
    int countByNationality(String nationality);

}
Java

The AI service shown above corresponds to this configuration. You need to specify the MCP server’s name and address. As you remember, person-mcp-server listens on port 8082. The client application uses the name person-service here, and the standard endpoint to MCP SSE for Quarkus is /mcp/sse. To explore the solution itself, it is also worth enabling logging of MCP requests and responses.

quarkus.langchain4j.mcp.person-service.transport-type = http
quarkus.langchain4j.mcp.person-service.url = http://localhost:8082/mcp/sse
quarkus.langchain4j.mcp.person-service.log-requests = true
quarkus.langchain4j.mcp.person-service.log-responses = true
Plaintext

Here is a similar implementation for the AccountService AI service. It interacts with the MCP server configured under the account-service name.

@ApplicationScoped
@RegisterAiService
public interface AccountService {

    @SystemMessage("""
        You are a helpful assistant that generates realistic data.
        Return a single number.
        """)
    @UserMessage("How many accounts has person with {personId} ID ?")
    @McpToolBox("account-service")
    int countByPersonId(int personId);

    @UserMessage("""
        How many accounts has person with {personId} ID ?
        Return person name, nationality and a total balance on his/her accounts.
        """)
    @McpToolBox("account-service")
    String balanceByPersonId(int personId);

}
Java

Here’s the corresponding configuration for that service. No surprises.

quarkus.langchain4j.mcp.account-service.transport-type = http
quarkus.langchain4j.mcp.account-service.url = http://localhost:8081/mcp/sse
quarkus.langchain4j.mcp.account-service.log-requests = true
quarkus.langchain4j.mcp.account-service.log-responses = true
Plaintext

Finally, we must provide some configuration to integrate our Quarkus application with Open AI chat model. It assumes that Open AI token is available as the OPEN_AI_TOKEN environment variable.

quarkus.langchain4j.chat-model.provider = openai
quarkus.langchain4j.log-requests = true
quarkus.langchain4j.log-responses = true
quarkus.langchain4j.openai.api-key = ${OPEN_AI_TOKEN}
quarkus.langchain4j.openai.timeout = 20s
Plaintext

We can test individual AI services by calling endpoints provided by the client-side application. There are two endpoints GET /count-by-person-id/{personId} and GET /balance-by-person-id/{personId} that use LLM prompts to calculate number of persons and a total balances amount of all accounts belonging to a given person.

@Path("/accounts")
public class AccountResource {

    private final AccountService accountService;

    public AccountResource(AccountService accountService) {
        this.accountService = accountService;
    }

    @POST
    @Path("/count-by-person-id/{personId}")
    public int countByPersonId(int personId) {
        return accountService.countByPersonId(personId);
    }

    @POST
    @Path("/balance-by-person-id/{personId}")
    public String balanceByPersonId(int personId) {
        return accountService.balanceByPersonId(personId);
    }

}
Java

MCP for Promps

MCP servers can also provide other functionalities beyond just tools. Let’s go back to the person-mcp-server app for a moment. To share a prompt message, you can create a class that defines methods returning the PromptMessage object. Then, we must annotate such methods with @Prompt, and their arguments with @PromptArg.

@ApplicationScoped
public class PersonPrompts {

    final String findByNationalityPrompt = """
        Find persons with {nationality} nationality.
        Output **only valid JSON**, no explanations, no markdown, no ```json blocks.
        """;

    @Prompt(description = "Find by nationality.")
    PromptMessage findByNationalityPrompt(@PromptArg(description = "The nationality") String nationality) {
        return PromptMessage.withUserRole(new TextContent(findByNationalityPrompt));
    }

}
Java

Once we start the application, we can use Quarkus Dev UI to verify a list of provided tools and prompts.

Client-side integration with MCP prompts is a bit more complex than with tools. We must inject the McpClient to a resource controller to load a given prompt programmatically using its name.

@Path("/persons")
public class PersonResource {

    @McpClientName("person-service")
    McpClient mcpClient;
    
    // OTHET METHODS...
    
    @POST
    @Path("/nationality-with-prompt/{nationality}")
    public List<Person> findByNationalityWithPrompt(String nationality) {
        Persons p = personService.findByNationalityWithPrompt(loadPrompt(nationality), nationality);
        return p.getPersons();
    }
    
    private String loadPrompt(String nationality) {
        McpGetPromptResult prompt = mcpClient.getPrompt("findByNationalityPrompt", Map.of("nationality", nationality));
        return ((TextContent) prompt.messages().getFirst().content().toContent()).text();
    }
}
Java

In this case, the Quarkus AI service should not define the @UserMessage on the entire method, but just as the method argument. Then a prompt message is loaded from the MCP server and filled with the nationality parameter value before sending to the AI model.

@ApplicationScoped
@RegisterAiService
public interface PersonService {

    // OTHER METHODS...
    
    @SystemMessage("""
        You are a helpful assistant that generates realistic person data.
        Always respond with valid JSON format.
        """)
    @McpToolBox("person-service")
    Persons findByNationalityWithPrompt(@UserMessage String userMessage, String nationality);

}
Java

Testing MCP Tools with Quarkus

Quarkus provides a dedicated module for testing MCP tools. We can use after including the following dependency in the Maven pom.xml:

<dependency>
  <groupId>io.quarkiverse.mcp</groupId>
  <artifactId>quarkus-mcp-server-test</artifactId>
  <scope>test</scope>
</dependency>
XML

The following test verifies the MCP tool methods provided by the person-mcp-server application. The McpAssured class allows us to use SSE, streamable, and WebSocket test clients. To create a new client for SSE, invoke the newConnectedSseClient() static method. After that, we can use one of several available variants of the toolsCall(...) method to verify the response returned by a given @Tool.

@QuarkusTest
public class PersonToolsTest {

    ObjectMapper mapper = new ObjectMapper();

    @Test
    public void testGetPersonsByNationality() {
        McpAssured.McpSseTestClient client = McpAssured.newConnectedSseClient();
        client.when()
                .toolsCall("getPersonsByNationality", Map.of("nationality", "Denmark"),
                        r -> {
                            try {
                                Persons p = mapper.readValue(r.content().getFirst().asText().text(), Persons.class);
                                assertFalse(p.getPersons().isEmpty());
                            } catch (JsonProcessingException e) {
                                throw new RuntimeException(e);
                            }
                        })
                .thenAssertResults();
    }

    @Test
    public void testGetPersonById() {
        McpAssured.McpSseTestClient client = McpAssured.newConnectedSseClient();
        client.when()
                .toolsCall("getPersonById", Map.of("id", 10),
                        r -> {
                            try {
                                Person p = mapper.readValue(r.content().getFirst().asText().text(), Person.class);
                                assertNotNull(p);
                                assertNotNull(p.id);
                            } catch (JsonProcessingException e) {
                                throw new RuntimeException(e);
                            }
                        })
                .thenAssertResults();
    }
}
Java

Running Quarkus Applications

Finally, let’s run all our quarkus applications. Go to the mcp/account-mcp-server directory and run the application in development mode:

$ cd mcp/account-mcp-server
$ mvn quarkus:dev
ShellSession

Then do the same for the person-mcp-server application.

$ cd mcp/person-mcp-server
$ mvn quarkus:dev
ShellSession

Before running the last sample-client application, export the OpenAI API token as the OPEN_AI_TOKEN environment variable.

$ cd mcp/sample-client
$ export OPEN_AI_TOKEN=<YOUR_OPENAI_TOKEN>
$ mvn quarkus:dev
ShellSession

We can verify a list of tools or prompts exposed by each MCP server application by visiting its Quarkus Dev UI console. It provides a dedicated “MCP Server tile.”

quarkus-mcp-devui

Here’s a list of tools provided by the person-mcp-server app via Quarkus Dev UI.

Then, we can switch to the sample-client Dev UI console. We can verify and test all interactions with the MCP servers from our client-side app.

quarkus-mcp-client-ui

Once all the sample applications are running, we can test the MCP communication by calling the HTTP endpoints exposed by the sample-client app. Both person-mcp-server and account-mcp-server load some test data on startup using the import.sql file. Here are the test API calls for all the REST endpoints.

$ curl -X POST http://localhost:8080/persons/nationality/Denmark
$ curl -X POST http://localhost:8080/persons/count-by-nationality/Denmark
$ curl -X POST http://localhost:8080/persons/nationality-with-prompt/Denmark
$ curl -X POST http://localhost:8080/accounts/count-by-person-id/2
$ curl -X POST http://localhost:8080/accounts/balance-by-person-id/2
ShellSession

Conclusion

With Quarkus, creating applications that use MCP is not difficult. If you understand the idea of tool calling in AI, understanding the MCP-based approach is not difficult for you. This article shows you how to connect your application to several MCP servers, implement tests to verify the elements shared by a given application using MCP, and support on the Quarkus Dev UI side.

The post MCP with Quarkus LangChain4j appeared first on Piotr's TechBlog.

]]>
https://piotrminkowski.com/2025/11/24/mcp-with-quarkus-langchain4j/feed/ 0 15845
Getting Started with Quarkus LangChain4j and Chat Model https://piotrminkowski.com/2025/06/18/getting-started-with-quarkus-langchain4j-and-chat-model/ https://piotrminkowski.com/2025/06/18/getting-started-with-quarkus-langchain4j-and-chat-model/#respond Wed, 18 Jun 2025 16:36:08 +0000 https://piotrminkowski.com/?p=15736 This article will teach you how to use the Quarkus LangChain4j project to build applications based on different chat models. The Quarkus AI Chat Model offers a portable and straightforward interface, enabling seamless interaction with these models. Our sample Quarkus application will switch between three popular chat models provided by OpenAI, Mistral AI, and Ollama. […]

The post Getting Started with Quarkus LangChain4j and Chat Model appeared first on Piotr's TechBlog.

]]>
This article will teach you how to use the Quarkus LangChain4j project to build applications based on different chat models. The Quarkus AI Chat Model offers a portable and straightforward interface, enabling seamless interaction with these models. Our sample Quarkus application will switch between three popular chat models provided by OpenAI, Mistral AI, and Ollama. This article is the first in a series explaining AI concepts with Quarkus LangChain4j. Look for more on my blog in this area soon. The idea of this tutorial is very similar to the series on Spring AI. Therefore, you will be able to easily compare the two approaches, as the sample application will do the same thing as an analogous Spring Boot application.

If you like Quarkus, then you can find quite a few articles about it on my blog. Just go to the Quarkus category and find the topic you are interested in.

SourceCode

Feel free to use my source code if you’d like to try it out yourself. To do that, you must clone my sample GitHub repository. Then you should only follow my instructions.

Motivation

Whenever I create a new article or example related to AI, I like to define the problem I’m trying to solve. The problem this example solves is very trivial. I publish numerous small demo apps to explain complex technology concepts. These apps typically require data to display a demo output. Usually, I add demo data by myself or use a library like Datafaker to do it for me. This time, we can leverage the AI Chat Models API for that. Let’s begin!

The Quarkus-related topic I’m describing today, I also explained earlier for Spring Boot. For a comparison of the features offered by both frameworks for simple interaction with the AI chat model, you can read this article on Spring AI.

Dependencies

The sample application uses the current latest version of the Quarkus framework.

<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>io.quarkus.platform</groupId>
      <artifactId>quarkus-bom</artifactId>
      <version>${quarkus.platform.version}</version>
      <type>pom</type>
      <scope>import</scope>
    </dependency>
  </dependencies>
</dependencyManagement>
XML

You can easily switch between multiple AI model implementations by activating a dedicated Maven profile. By default, the open-ai profile is active. It includes the quarkus-langchain4j-openai module in the Maven dependencies. You can also activate the mistral-ai and ollama profile. In that case, the quarkus-langchain4j-mistral-ai or quarkus-langchain4j-ollama module will be included instead of the LangChain4j OpenAI extension.

<profiles>
  <profile>
    <id>open-ai</id>
    <activation>
      <activeByDefault>true</activeByDefault>
    </activation>
    <dependencies>
      <dependency>
        <groupId>io.quarkiverse.langchain4j</groupId>
        <artifactId>quarkus-langchain4j-openai</artifactId>
        <version>${quarkus-langchain4j.version}</version>
      </dependency>
    </dependencies>
  </profile>
  <profile>
    <id>mistral-ai</id>
    <dependencies>
      <dependency>
        <groupId>io.quarkiverse.langchain4j</groupId>
        <artifactId>quarkus-langchain4j-mistral-ai</artifactId>
        <version>${quarkus-langchain4j.version}</version>
      </dependency>
    </dependencies>
  </profile>
  <profile>
    <id>ollama</id>
    <dependencies>
      <dependency>
        <groupId>io.quarkiverse.langchain4j</groupId>
        <artifactId>quarkus-langchain4j-ollama</artifactId>
        <version>${quarkus-langchain4j.version}</version>
      </dependency>
    </dependencies>
  </profile>
</profiles>
XML

The sample Quarkus application is simple. It exposes some REST endpoints and communicates with a selected AI model to return an AI-generated response via each endpoint. So, you need to include only core Quarkus modules like quarkus-rest-jackson or quarkus-arc. To implement JUnit tests with REST API, it also includes the quarkus-junit5 and rest-assured modules in the test scope.

<dependencies>
  <!-- Core Quarkus dependencies -->
  <dependency>
    <groupId>io.quarkus</groupId>
    <artifactId>quarkus-rest-jackson</artifactId>
  </dependency>
  <dependency>
    <groupId>io.quarkus</groupId>
    <artifactId>quarkus-arc</artifactId>
  </dependency>

  <!-- Test dependencies -->
  <dependency>
    <groupId>io.quarkus</groupId>
    <artifactId>quarkus-junit5</artifactId>
    <scope>test</scope>
  </dependency>
  <dependency>
    <groupId>io.rest-assured</groupId>
    <artifactId>rest-assured</artifactId>
    <scope>test</scope>
  </dependency>
</dependencies>
XML

Quarkus LangChain4j Chat Models Integration

Quarkus provides an innovative approach to interacting with AI chat models. First, you need to annotate your interface by defining AI-oriented methods with the @RegisterAiService annotation. Then you must add a proper description and input prompt inside the @SystemMessage and @UserMessage annotations. Here is the sample PersonAiService interaction, which defines two methods. The generatePersonList method aims to ask the AI model to generate a list of 10 unique persons in a form consistent with the input object structure. The getPersonById method must read the previously generated list from chat memory and return a person’s data with a specified id field.

@RegisterAiService
@ApplicationScoped
public interface PersonAiService {

    @SystemMessage("""
        You are a helpful assistant that generates realistic person data.
        Always respond with valid JSON format.
        """)
    @UserMessage("""
        Generate exactly 10 unique persons

        Requirements:
        - Each person must have a unique integer ID (like 1, 2, 3, etc.)
        - Use realistic first and last names per each nationality
        - Ages should be between 18 and 80
        - Return ONLY the JSON array, no additional text
        """)
    PersonResponse generatePersonList(@MemoryId int userId);

    @SystemMessage("""
        You are a helpful assistant that can recall generated person data from chat memory.
        """)
    @UserMessage("""
        In the previously generated list of persons for user {userId}, find and return the person with id {id}.
        
        Return ONLY the JSON object, no additional text.
        """)
    Person getPersonById(@MemoryId int userId, int id);

}
Java

There are a few more things to add regarding the code snippet above. The beans created by @RegisterAiService are @RequestScoped by default. The Quarkus LangChain4j documentation states that this is possible, allowing objects to be deleted from the chat memory. In the case seen above, the list of people is generated per user ID, which acts as the key by which we search the chat memory. To guarantee that the getPersonById method finds a list of persons generated per @MemoryId the PersonAiService interface must be annotated with @ApplicationScoped. The InMemoryChatMemoryStore implementation is enabled by default, so you don’t need to declare any additional beans to use it.

Quarkus LangChain4j can automatically map the LLM’s JSON response to the output POJO. However, until now, it has not been possible to map it directly to the output collection. Therefore, you must wrap the output list with the additional class, as shown below.

public class PersonResponse {

    private List<Person> persons;

    public List<Person> getPersons() {
        return persons;
    }

    public void setPersons(List<Person> persons) {
        this.persons = persons;
    }
}
Java

Here’s the Person class:

public class Person {

    private Integer id;
    private String firstName;
    private String lastName;
    private int age;
    private String nationality;
    private Gender gender;
    
    // GETTERS and SETTERS

}
Java

Finally, the last part of our implementation is REST endpoints. Here’s the REST controller that injects and uses PersonAiService to interact with the AI chat model. It exposes two endpoints: GET /api/{userId}/persons and GET /api/{userId}/persons/{id}. You can generate several lists of persons by specifying the userId path parameter.

@Path("/api")
@Produces(MediaType.APPLICATION_JSON)
@Consumes(MediaType.APPLICATION_JSON)
public class PersonController {

    private static final Logger LOG = Logger.getLogger(PersonController.class);

    PersonAiService personAiService;

    public PersonController(PersonAiService personAiService) {
        this.personAiService = personAiService;
    }

    @GET
    @Path("/{userId}/persons")
    public PersonResponse generatePersons(@PathParam("userId") int userId) {
        return personAiService.generatePersonList(userId);
    }

    @GET
    @Path("/{userId}/persons/{id}")
    public Person getPersonById(@PathParam("userId") int userId, @PathParam("id") int id) {
        return personAiService.getPersonById(userId, id);
    }

}
Java

Use Different AI Models with Quarkus LangChain4j

Configuration Properties

Here is a configuration defined within the application.properties file. Before proceeding, you must generate the OpenAI and Mistral AI API tokens and export them as environment variables. Additionally, you can enable logging of requests and responses in AI model communication. It is also worth increasing the default timeout for a single request from 10 seconds to a higher value, such as 20 seconds.

quarkus.langchain4j.chat-model.provider = ${AI_MODEL_PROVIDER:openai}
quarkus.langchain4j.log-requests = true
quarkus.langchain4j.log-responses = true

# OpenAI Configuration
quarkus.langchain4j.openai.api-key = ${OPEN_AI_TOKEN}
quarkus.langchain4j.openai.timeout = 20s

# Mistral AI Configuration
quarkus.langchain4j.mistralai.api-key = ${MISTRAL_AI_TOKEN}
quarkus.langchain4j.mistralai.timeout = 20s

# Ollama Configuration
quarkus.langchain4j.ollama.base-url = ${OLLAMA_BASE_URL:http://localhost:11434}
Plaintext

To run a sample Quarkus application and connect it with OpenAI, you must set the OPEN_AI_TOKEN environment variable. Since the open-ai Maven profile is activated by default, you don’t need to set anything else while running an app.

$ export OPEN_AI_TOKEN=<your_openai_token>
$ mvn quarkus:dev
ShellSession

Then, you can call the GET /api/{userId}/persons endpoint with different userId path variable values. Here are sample API requests and responses.

quarkus-langchain4j-calls

After that, you can call the GET /api/{userId}/persons/{id} endpoint to return a specified person found in the chat memory.

Switch Between AI Models

Then, you can repeat the same exercise with the Mistral AI model. You must set the AI_MODEL_PROVIDER to mistral, export its API token as the MISTRAL_AI_TOKEN environment variable, and enable the mistral-ai profile while running the app.

$ export AI_MODEL_PROVIDER=mistralai
$ export MISTRAL_AI_TOKEN=<your_mistralai_token>
$ mvn quarkus:dev -Pmistral-ai
ShellSession

The app should start successfully.

quarkus-langchain4j-logs

Once it happens, you can repeat the same sequence of requests as before for OpenAI.

$ curl http://localhost:8080/api/1/persons
$ curl http://localhost:8080/api/2/persons
$ curl http://localhost:8080/api/1/persons/1
$ curl http://localhost:8080/api/2/persons/1
ShellSession

You can check the request sent to the AI model in the application logs.

Here’s a log showing an AI chat model response:

Finally, you can run a test with ollama. By default, the LangChain4j extension for Ollama uses the llama3.2 model. You can change it by setting the quarkus.langchain4j.ollama.chat-model.model-id property in the application.properties file. Assuming that you use the llama3.3 model, here’s your configuration:

quarkus.langchain4j.ollama.base-url = ${OLLAMA_BASE_URL:http://localhost:11434}
quarkus.langchain4j.ollama.chat-model.model-id = llama3.3
quarkus.langchain4j.ollama.timeout = 60s
Plaintext

Before proceeding, you must run the llama3.3 model on your laptop. Of course, you can choose another, smaller model, because llama3.3 is 42 GB.

ollama run llama3.3
ShellSession

It can take a lot of time. However, a model is finally ready to use.

Once a model is running, you can set the AI_MODEL_PROVIDER environment variable to ollama and activate the ollama profile for the app:

$ export AI_MODEL_PROVIDER=ollama
$ mvn quarkus:dev -Pollama
ShellSession

This time, our application is connected to the llama3.3 model started with ollama:

quarkus-langchain4j-ollama

With the Quarkus LangChain4j Ollama extension, you can take advantage of dev services support. It means that you don’t need to install and run Ollama on your laptop or run a model with ollama CLI. Quarkus will run Ollama as a Docker container and automatically run a selected AI model on it. In that case, you don’t need to set the quarkus.langchain4j.ollama.base-url property. Before switching to that option, let’s use a smaller AI model by setting the quarkus.langchain4j.ollama.chat-model.model-id = mistral property. Then start the app in the same way as before.

Final Thoughts

I must admit that the Quarkus LangChain4j extension is enjoyable to use. With a few simple annotations, you can configure your application to talk to the AI model of your choice correctly. In this article, I presented a straightforward example of integrating Quarkus with an AI chat model. However, we quickly reviewed features such as prompts, structured output, and chat memory. You can expect more articles in the Quarkus series with AI soon.

The post Getting Started with Quarkus LangChain4j and Chat Model appeared first on Piotr's TechBlog.

]]>
https://piotrminkowski.com/2025/06/18/getting-started-with-quarkus-langchain4j-and-chat-model/feed/ 0 15736
Getting Started with Spring AI Function Calling https://piotrminkowski.com/2025/01/30/getting-started-with-spring-ai-function-calling/ https://piotrminkowski.com/2025/01/30/getting-started-with-spring-ai-function-calling/#comments Thu, 30 Jan 2025 16:40:30 +0000 https://piotrminkowski.com/?p=15522 This article will show you how to use Spring AI support for Java function calling with the OpenAI chat model. The Spring AI function calling feature lets us connect the LLM capabilities with external APIs or systems. OpenAI’s models are trained to know when to call a function. We will work on implementing a Java […]

The post Getting Started with Spring AI Function Calling appeared first on Piotr's TechBlog.

]]>
This article will show you how to use Spring AI support for Java function calling with the OpenAI chat model. The Spring AI function calling feature lets us connect the LLM capabilities with external APIs or systems. OpenAI’s models are trained to know when to call a function. We will work on implementing a Java function that takes the call arguments from the AI model and sends the result back. Our main goal is to connect to the third-party APIs to provide these results. Then the AI model uses the provided results to complete the conversation.

This article is the second part of a series describing some of the AI project’s most notable features. Before reading on, I recommend checking out my introduction to Spring AI, which is available here. The first part describes such features as prompts, structured output, chat memory, and built-in advisors. Additionally, it demonstrates the capability to switch between the most popular AI chat model API providers.

Source Code

If you would like to try it by yourself, you may always take a look at my source code. To do that, you must clone my sample GitHub repository. Then you should only follow my instructions.

Problem

Whenever I create a new article or example related to AI, I like to define the problem I’m trying to solve. The problem we will solve in this exercise is visible in the following prompt template. I’m asking the AI model about the value of my stock wallet. However, the model doesn’t know how many shares I have, and can’t get the latest stock prices. Since the OpenAI model is trained on a static dataset it does not have direct access to the online services or APIs.

spring-ai-function-calling-prompt

So, in this case, we should provide private data with our wallet structure and “connect” our model with a public API that returns live stock market data. Let’s see how we tackle this challenge with Spring AI function calling.

Create Spring Functions

WalletService Supplier

We will begin with a source code. Then we will visualize the whole process on the diagram. Spring AI supports different ways of registering a function to call. You can read more about it in the Spring AI docs here. We will choose the way based on plain Java functions defined as beans in the Spring application context. This approach allows us to use interfaces from the java.util.function package such as Function, Supplier, or Consumer. Our first function takes no input, so it implements the Supplier interface. It just returns a list of shares that we currently have. It obtains such information from the database through the Spring Data WalletRepository bean.

public class WalletService implements Supplier<WalletResponse> {

    private WalletRepository walletRepository;

    public WalletService(WalletRepository walletRepository) {
        this.walletRepository = walletRepository;
    }

    @Override
    public WalletResponse get() {
        return new WalletResponse((List<Share>) walletRepository.findAll());
    }
}
Java

Information about the number of owned shares is stored in the share table. Each row contains a company name and the quantity of that company shares.

@Entity
public class Share {

    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;
    private String company;
    private int quantity;
    
    // ... GETTERS/SETTERS
}
Java

The Spring Boot application launches an embedded, in-memory database and inserts test data into the stock table. Our wallet contains the most popular companies on the U.S. stock market, including Amazon, Meta, and Microsoft.

insert into share(id, company, quantity) values (1, 'AAPL', 100);
insert into share(id, company, quantity) values (2, 'AMZN', 300);
insert into share(id, company, quantity) values (3, 'META', 300);
insert into share(id, company, quantity) values (4, 'MSFT', 400);
insert into share(id, company, quantity) values (5, 'NVDA', 200);
SQL

StockService Function

Our second function takes an input argument and returns an output. Therefore, it implements the Function interface. It must interact with live stock market API to get the current price of a given company share. We use the api.twelvedata.com service to access stock exchange quotes. The function returns a current price wrapped by the StockResponse object.

public class StockService implements Function<StockRequest, StockResponse> {

    private static final Logger LOG = LoggerFactory.getLogger(StockService.class);

    @Autowired
    RestTemplate restTemplate;
    @Value("${STOCK_API_KEY}")
    String apiKey;

    @Override
    public StockResponse apply(StockRequest stockRequest) {
        StockData data = restTemplate.getForObject("https://api.twelvedata.com/time_series?symbol={0}&interval=1min&outputsize=1&apikey={1}",
                StockData.class,
                stockRequest.company(),
                apiKey);
        DailyStockData latestData = data.getValues().get(0);
        LOG.info("Get stock prices: {} -> {}", stockRequest.company(), latestData.getClose());
        return new StockResponse(Float.parseFloat(latestData.getClose()));
    }
}
Java

Here are the Java records for request and response objects.

public record StockRequest(String company) { }

public record StockResponse(Float price) { }
Java

To summarize, the first function accesses the database to get owned shares quantity, while the second function communicates public API to get the current price of a company share.

Spring AI Function Calling Flow

Architecture

Here’s the diagram that visualizes the flow of our application. The Spring AI Prompt object must contain references to our function beans. This allows the OpenAI model to recognize when a function should be called. However, the model does not call the function directly but only generates JSON used to call the function on the application side. Each function must provide a name, description, and signature (as JSON schema) to let the model know what arguments it expects. We have two functions. The StockService function returns a list of owned company shares, while the second function takes a single company name as the argument. This is where the magic happens. The chat model should call the StockService function for each object in the list returned by the WalletService function. The final response combines results received from both our functions.

spring-ai-function-calling-arch

Implementation

To implement the flow visualized above we must register our functions as Spring beans. The method name determines the name of the bean in the Spring context. Each bean declaration should also contain a description, which helps the model to understand when to call the function. The WalletResponse function is registered under the numberOfShares name, while the StockService function under the latestStockPrices name. The WalletService doesn’t take any input arguments, but injects the WalletRepository bean to interact with the database.

@Bean
@Description("Number of shares for each company in my portfolio")
public Supplier<WalletResponse> numberOfShares(WalletRepository walletRepository) {
    return new WalletService(walletRepository);
}

@Bean
@Description("Latest stock prices")
public Function<StockRequest, StockResponse> latestStockPrices() {
    return new StockService();
}
Java

Finally, let’s take a look at the REST controller implementation. It exposes the GET /wallet endpoint that communicates with the OpenAI chat model. When creating a prompt we should register both our functions using the OpenAiChatOptions class and its function method. The reference contains only the function @Bean name.

@RestController
@RequestMapping("/wallet")
public class WalletController {

    private final ChatClient chatClient;

    public WalletController(ChatClient.Builder chatClientBuilder) {
        this.chatClient = chatClientBuilder
                .defaultAdvisors(new SimpleLoggerAdvisor())
                .build();
    }

    @GetMapping
    String calculateWalletValue() {
        PromptTemplate pt = new PromptTemplate("""
        What’s the current value in dollars of my wallet based on the latest stock daily prices ?
        """);

        return this.chatClient.prompt(pt.create(
                OpenAiChatOptions.builder()
                        .function("numberOfShares")
                        .function("latestStockPrices")
                        .build()))
                .call()
                .content();
    }
}
Java

Run and Test the Spring AI Application

Before running the app, we must export OpenAI and Twelvedata API keys as environment variables.

export STOCK_API_KEY=<YOUR_TWELVEDATA_API_KEY>
export OPEN_AI_TOKEN=<YOUR_OPENAI_TOKEN>
ShellSession

We must create an account on the Twelvedata platform to obtain its API key. The Twelvedata platform provides API to get the latest stock prices.

Of course, we must have an API key on the OpenAI platform. Once you create an account there you should go to that page. Then choose the name for your token and copy it after creation.

Then, we run our Spring AI app using the following Maven command:

mvn spring-boot:run
ShellSession

After running the app, we can call the /wallet endpoint to calculate our stock portfolio.

curl http://localhost:8080/wallet
ShellSession

Here’s the response returned by OpenAI for the provided test data and the current stock market prices.

Then, let’s switch to the application logs. We can see that the StockService function was called five times – once for every company in the wallet. After we added the SimpleLoggerAdvisor advisor to the and set the property logging.level.org.springframework.ai to DEBUG, we can observe detailed logs with requests and responses from the OpenAI chat model.

spring-ai-function-calling-logs

Final Thoughts

In this article, we analyzed the Spring AI integration with function support in AI models. OpenAI’s function calling is a powerful feature that enhances how AI models interact with external tools, APIs, and structured data. It makes AI more interactive and practical for real-world applications. Spring AI provides a flexible way to register and invoke such functions. However, it still requires attention from developers, who need to define clear function schemas and handle edge cases.

The post Getting Started with Spring AI Function Calling appeared first on Piotr's TechBlog.

]]>
https://piotrminkowski.com/2025/01/30/getting-started-with-spring-ai-function-calling/feed/ 6 15522
Getting Started with Spring AI and Chat Model https://piotrminkowski.com/2025/01/28/getting-started-with-spring-ai-and-chat-model/ https://piotrminkowski.com/2025/01/28/getting-started-with-spring-ai-and-chat-model/#comments Tue, 28 Jan 2025 10:02:24 +0000 https://piotrminkowski.com/?p=15494 This article will teach you how to use the Spring AI project to build applications based on different chat models. The Spring AI Chat Model is a simple and portable interface that allows us to interact with these models. Our sample Spring Boot application will switch between three popular chat models provided by OpenAI, Mistral […]

The post Getting Started with Spring AI and Chat Model appeared first on Piotr's TechBlog.

]]>
This article will teach you how to use the Spring AI project to build applications based on different chat models. The Spring AI Chat Model is a simple and portable interface that allows us to interact with these models. Our sample Spring Boot application will switch between three popular chat models provided by OpenAI, Mistral AI, and Ollama. This article is the first in a series explaining AI concepts with Spring Boot. Look for more on my blog in this area soon.

If you are interested in Spring Boot, read my article about tips, tricks, and techniques for this framework here.

Source Code

If you would like to try it by yourself, you may always take a look at my source code. To do that, you must clone my sample GitHub repository. Then you should only follow my instructions.

Problem

Whenever I create a new article or example related to AI, I like to define the problem I’m trying to solve. The problem this example solves is very trivial. I publish a lot of small demo apps to explain technology concepts. These apps usually need data to show a demo output. Usually, I add demo data by myself or use a library like Datafaker to do it for me. This time, we can leverage AI Chat Models API for that. Let’s begin!

Dependencies

The Spring AI project is still under active development. Currently, we are waiting for the 1.0 GA release. Until then, we will switch to the milestone releases of the project. The current milestone is 1.0.0-M5. So let’s add the Spring Milestones repository to our Maven pom.xml file.

    <repositories>
        <repository>
            <id>central</id>
            <name>Central</name>
            <url>https://repo1.maven.org/maven2/</url>
        </repository>
        <repository>
            <id>spring-milestones</id>
            <name>Spring Milestones</name>
            <url>https://repo.spring.io/milestone</url>
            <snapshots>
                <enabled>false</enabled>
            </snapshots>
        </repository>
    </repositories>
XML

Then we should include the Maven BOM with a specified version of the Spring AI project.

    <properties>
        <java.version>21</java.version>
        <spring-ai.version>1.0.0-M5</spring-ai.version>
    </properties>
    
    <dependencyManagement>
        <dependencies>
            <dependency>
                <groupId>org.springframework.ai</groupId>
                <artifactId>spring-ai-bom</artifactId>
                <version>${spring-ai.version}</version>
                <type>pom</type>
                <scope>import</scope>
            </dependency>
        </dependencies>
    </dependencyManagement>
XML

Since our sample application exposes some REST endpoints, we should include the Spring Boot Web Starter. We can include the Spring Boot Test Starter to create some JUnit tests. The Spring AI modules are included in the Maven profiles section. There are three different profiles for each chat model provider. By default, our application uses Open AI, and thus it activates the open-ai profile, which includes the spring-ai-openai-spring-boot-starter library. We should activate the mistral-ai profile to switch to Mistral AI. The third option is the ollama-ai profile including the spring-ai-ollama-spring-boot-starter dependency. Here’s a full list of dependencies. That’ll make it a breeze to switch between different chat model AI providers — we’ll only need to set the profile parameter in the Maven running command.

    <dependencies>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-web</artifactId>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-test</artifactId>
            <scope>test</scope>
        </dependency>
    </dependencies>

    <profiles>
        <profile>
            <id>open-ai</id>
            <activation>
                <activeByDefault>true</activeByDefault>
            </activation>
            <dependencies>
                <dependency>
                    <groupId>org.springframework.ai</groupId>
                    <artifactId>spring-ai-openai-spring-boot-starter</artifactId>
                </dependency>
            </dependencies>
        </profile>
        <profile>
            <id>mistral-ai</id>
            <dependencies>
                <dependency>
                    <groupId>org.springframework.ai</groupId>
                    <artifactId>spring-ai-mistral-ai-spring-boot-starter</artifactId>
                </dependency>
            </dependencies>
        </profile>
        <profile>
            <id>ollama-ai</id>
            <dependencies>
                <dependency>
                    <groupId>org.springframework.ai</groupId>
                    <artifactId>spring-ai-ollama-spring-boot-starter</artifactId>
                </dependency>
            </dependencies>
        </profile>
    </profiles>
XML

Connect to AI Chat Model Providers

Configure OpenAI

Before we proceed with a source code, we should prepare chat model AI tools. Let’s begin with OpenAI. We must have an account on the OpenAI Platform portal. After signing in we should access the API Keys page to generate an API token. Once we set its name, we can click the “Create secret key” button. Don’t forget to copy the key after creation.

The value of the generated token should be saved as an environment variable. Our sample Spring Boot application read its value from the OPEN_AI_TOKEN variable.

export OPEN_AI_TOKEN=<YOUR_TOKEN_VALUE>
ShellSession

Configure Mistral AI

Then, we should repeat a very similar action for Mistral AI. We must have an account on the Mistral AI Platform portal. After signing in we should access the API Keys page to generate an API token. Both the name and expiration date fields are optional. Once we generate a token by clicking the “Create key” button, we should copy it.

spring-ai-mistral-ai

The value of the generated token should be saved as an environment variable. Our sample Spring Boot application read its value for Mistral AI from the MISTRAL_AI_TOKEN variable.

export MISTRAL_AI_TOKEN=<YOUR_TOKEN_VALUE>
ShellSession

Run and Configure Ollama

Opposite to OpenAI or Mistral AI, Ollama is built to allow to run large language models (LLMs) directly on our workstations. This means we don’t have any connection to the remote API to access it. First, we must download the Ollama binary dedicated to our OS from the following page. After installation, we can interact with it using the ollama CLI. First, we should choose the model to run. The full list of available models can be found here. By default, Spring AI expects the mistral model for the Ollama. Let’s choose llama3.2.

ollama run llama3.2
ShellSession

After running Ollama locally we can interact with it using the CLI terminal.

spring-ai-ollama

Configure Spring Boot Properties

Ollama exposes port over localhost and does not require an API token. Fortunately, all necessary URLs for our APIs come with the Spring AI auto-configuration. After choosing the llama3.2 model, we should provide the change in Spring Boot application properties respectively. We can also set the gpt-4o-mini model for OpenAI to decrease API costs.

spring.ai.openai.api-key = ${OPEN_AI_TOKEN}
spring.ai.openai.chat.options.model = gpt-4o-mini
spring.ai.mistralai.api-key = ${MISTRAL_AI_TOKEN}
spring.ai.ollama.chat.options.model = llama3.2
Plaintext

Spring AI Chat Model API

Prompting and Structured Output

Here is our model class. It contains the id field and several other fields that best describe each person.

public class Person {

    private Integer id;
    private String firstName;
    private String lastName;
    private int age;
    private Gender gender;
    private String nationality;
    
    //... GETTERS/SETTERS
}

public enum Gender {
    MALE, FEMALE;
}
Java

The @RestController class injects auto-configured ChatClient.Builder to create an instance of ChatClient. PersonController implements a method for returning a list of persons from the GET /persons endpoint. The main goal is to generate a list of 10 objects with the fields defined in the Person class. The id field should be auto-incremented. The PromptTemplate object defines a message, that will be sent to the chat model AI API. It doesn’t have to specify the exact fields that should be returned. This part is handled automatically by the Spring AI library after we invoke the entity() method on the ChatClient instance. The ParameterizedTypeReference object inside the entity method tells Spring AI to generate a list of objects.

@RestController
@RequestMapping("/persons")
public class PersonController {

    private final ChatClient chatClient;

    public PersonController(ChatClient.Builder chatClientBuilder) {
        this.chatClient = chatClientBuilder.build();
    }

    @GetMapping
    List<Person> findAll() {
        PromptTemplate pt = new PromptTemplate("""
                Return a current list of 10 persons if exists or generate a new list with random values.
                Each object should contain an auto-incremented id field.
                Do not include any explanations or additional text.
                """);

        return this.chatClient.prompt(pt.create())
                .call()
                .entity(new ParameterizedTypeReference<>() {});
    }

}    
Java

Assuming you exported the OpenAI token to the OPEN_AI_TOKEN environment variable, you can run the application using the following command:

mvn spring-boot:run
ShellSession

Then, let’s call the http://localhost:8080/persons endpoint. It returns a list of 10 people with different nationalities. It

Now, we can change the PromptTemplate content and add the word “famous” before persons. Just for fun.

The results are not surprising at all – “Elon Musk” enters the list 🙂 However, the list will be slightly different the second time you call the same endpoint. According to our prompt, a chat client should “return a current list of 10 persons”. So, I expected to get the same list as before. In this case, the problem is that the chat client doesn’t remember a previous conversation.

spring-ai-requests

Advisors and Chat Memory

Let’s try to change it. First, we should define the implementation of the ChatMemory interface. InMemoryChatMemory is good enough for our tests.

@SpringBootApplication
public class SpringAIShowcase {

    public static void main(String[] args) {
        SpringApplication.run(SpringAIShowcase.class, args);
    }

    @Bean
    InMemoryChatMemory chatMemory() {
        return new InMemoryChatMemory();
    }
}
Java

To enable conversation history for a chat client we should define an advisor. The Spring AI Advisors API lets us intercept, modify, and enhance AI-driven interactions handled by Spring applications. Spring AI offers API to create custom advisors, but we can also leverage several built-in advisors. It can be e.g. PromptChatMemoryAdvisor that enables chat memory and adds it to the prompt’s system text or SimpleLoggerAdvisor which enables request/response logging. Let’s take a look at the latest implementation of the PersonController class. I highlighted the added lines of code. Besides advisors, it contains a new GET /persons/{id} endpoint implementation. This endpoint takes a previously returned list of persons and seeks the object with a specified id. The PromptTemplate object specifies the id parameter filled with the value read from the context path.

@RestController
@RequestMapping("/persons")
public class PersonController {

    private final ChatClient chatClient;

    public PersonController(ChatClient.Builder chatClientBuilder, 
                            ChatMemory chatMemory) {
        this.chatClient = chatClientBuilder
                .defaultAdvisors(
                        new PromptChatMemoryAdvisor(chatMemory),
                        new SimpleLoggerAdvisor())
                .build();
    }

    @GetMapping
    List<Person> findAll() {
        PromptTemplate pt = new PromptTemplate("""
                Return a current list of 10 persons if exists or generate a new list with random values.
                Each object should contain an auto-incremented id field.
                Do not include any explanations or additional text.
                """);

        return this.chatClient.prompt(pt.create())
                .call()
                .entity(new ParameterizedTypeReference<>() {});
    }

    @GetMapping("/{id}")
    Person findById(@PathVariable String id) {
        PromptTemplate pt = new PromptTemplate("""
                Find and return the object with id {id} in a current list of persons.
                """);
        Prompt p = pt.create(Map.of("id", id));
        return this.chatClient.prompt(p)
                .call()
                .entity(Person.class);
    }
}
Java

Now, let’s make a final test. After the application restarts, we can call the endpoint that generates a list of persons. Then, we will call the GET /persons/{id} endpoint to display only a single person by ID. Spring application reads the value from the list of persons stored in the chat memory. Finally, we can repeat the call to the GET /persons endpoint to verify if it returns the same list.

Different Chat AI Models

Assuming you exported the Mistral AI token to the MISTRAL_AI_TOKEN environment variable, you can run the application using the following command. It activates the mistral-ai Maven profile and includes the starter with the Mistral AI support.

mvn spring-boot:run -Pmistral-ai
ShellSession

It returns responses similar to OpenAI’s, but some small differences exist. It always returns 0 in the age field and a 3-letter shortcut as a country name.

spring-ai-id-call

Let’s tweak our template to get Mistrai AI to generate an accurate age number. Here’s the fixed prompt template:

Now, it looks quite better. Even so, the names don’t match up with the countries they’re from, so there’s room for improvement.

The last test is for Ollama. Let’s run our application once again. This time we should activate the ollama-ai Maven profile.

mvn spring-boot:run -Pollama-ai
ShellSession

Then, we can repeat the same requests to check out the responses from Ollama AI. You can check out the responses by yourself.

$ curl http://localhost:8080/persons
$ curl http://localhost:8080/persons/2
ShellSession

Final Thoughts

This example doesn’t do anything unusual but only shows some basic features offered by Spring AI Chat Models API. We quickly reviewed features like prompts, structured output, chat memory, and built-in advisors. We also switched between some popular AI Chat Models API providers. You can expect more articles in this area soon. If you want to continue with the next part of the AI series on my blog, go here.

The post Getting Started with Spring AI and Chat Model appeared first on Piotr's TechBlog.

]]>
https://piotrminkowski.com/2025/01/28/getting-started-with-spring-ai-and-chat-model/feed/ 4 15494