open-ai Archives - Piotr's TechBlog https://piotrminkowski.com/tag/open-ai/ Java, Spring, Kotlin, microservices, Kubernetes, containers Fri, 06 Feb 2026 10:00:55 +0000 en-US hourly 1 https://wordpress.org/?v=6.9.1 https://i0.wp.com/piotrminkowski.com/wp-content/uploads/2020/08/cropped-me-2-tr-x-1.png?fit=32%2C32&ssl=1 open-ai Archives - Piotr's TechBlog https://piotrminkowski.com/tag/open-ai/ 32 32 181738725 Spring AI with External MCP Servers https://piotrminkowski.com/2026/02/06/spring-ai-with-external-mcp-servers/ https://piotrminkowski.com/2026/02/06/spring-ai-with-external-mcp-servers/#respond Fri, 06 Feb 2026 10:00:53 +0000 https://piotrminkowski.com/?p=15974 This article explains how to integrate Spring AI with external MCP servers that provide APIs for popular tools such as GitHub and SonarQube. Spring AI provides built-in support for MCP clients and servers. In this article, we will use only the Spring MCP client. If you are interested in more details on building MCP servers, […]

The post Spring AI with External MCP Servers appeared first on Piotr's TechBlog.

This article explains how to integrate Spring AI with external MCP servers that provide APIs for popular tools such as GitHub and SonarQube. Spring AI provides built-in support for MCP clients and servers. In this article, we will use only the Spring MCP client. If you are interested in more details on building MCP servers, please refer to the following post on my blog. MCP has recently become very popular, and you can easily find an MCP server implementation for almost any existing technology.

You can actually run MCP servers in many different ways. Ultimately, they are just ordinary applications whose task is to make a given tool available via an API compatible with the MCP protocol. The most popular AI coding tools, such as Claude Code, Codex, and Cursor, make it easy to run any MCP server. I will take a slightly unusual approach and use the support provided with Docker Desktop, namely the MCP Toolkit.

My idea for today is to build a simple Spring AI application that communicates with MCP servers for GitHub, SonarQube, and CircleCI to retrieve information about my repositories and projects hosted on those platforms. The Docker MCP Toolkit provides a single gateway that distributes incoming requests among running MCP servers. Let’s see how it works in practice!

Source Code

Feel free to use my source code if you’d like to try it out yourself. To do that, you must clone my sample GitHub repository. Then, simply follow my instructions. This repository contains several sample applications. The application for this article is in the spring-ai-mcp/external-mcp-sample-client directory.

Getting Started with Docker MCP Toolkit

First, run Docker Desktop. You can find more than 300 popular MCP servers in the “Catalog” tab. Next, search for the SonarQube, CircleCI, and GitHub Official servers (note that there are additional GitHub servers). To be honest, I encountered unexpected issues running the CircleCI server, so for now, I based the application on MCP communication with GitHub and SonarCloud.

spring-ai-mcp-docker-toolkit

Each MCP server usually requires configuration, such as your authorization token or service address. Therefore, before adding a server to Docker Toolkit, you must first configure it as described below. Only then should you click the “Add MCP server” button.

spring-ai-mcp-sonarqube-server

For the GitHub MCP server, in addition to entering the token itself, you must also authorize it via OAuth. Here, too, the MCP Toolkit provides graphical support. After entering the token, go to the “OAuth” tab to complete the process.

This is what your final result should look like before moving on to implementing the Spring Boot application. You have added two MCP servers, which together offer 65 tools.

To make both MCP servers available outside of Docker, you need to run the Docker MCP gateway. In the default stdio mode, the API is not exposed outside Docker. Therefore, you need to change the mode to streaming using the transport parameter, as shown below. The gateway is exposed on port 8811.

docker mcp gateway run --port 8811 --transport streaming
ShellSession

This is what it looks like after launch. Additionally, the Docker MCP gateway is secured by an API token. This will require appropriate settings on the MCP client side in the Spring AI application.

spring-ai-mcp-docker-gateway-start

Integrate Spring AI with External MCP Servers

Prepare the MCP Client with Spring AI

Let’s move on to implementing our sample application. We need to include the Spring AI MCP client and the library that communicates with the LLM model. For me, it’s OpenAI, but you can use many other options available through Spring AI’s integration with popular chat models.

<dependencies>
  <dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-web</artifactId>
  </dependency>
  <dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-starter-mcp-client-webflux</artifactId>
  </dependency>
  <dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-starter-model-openai</artifactId>
  </dependency>
</dependencies>
XML

Our MCP client must authenticate itself to the Docker MCP gateway using an API token. Therefore, we need to modify the Spring WebClient used by Spring AI to communicate with MCP servers. It is best to use the ExchangeFilterFunction interface to create an HTTP filter that adds the appropriate Authorization header with the bearer token to the outgoing request. The token will be injected from the application properties.

@Component
public class McpSyncClientExchangeFilterFunction implements ExchangeFilterFunction {

    @Value("${mcp.token}")
    private String token;

    @Override
    public Mono<ClientResponse> filter(ClientRequest request, 
                                       ExchangeFunction next) {

            var requestWithToken = ClientRequest.from(request)
                    .headers(headers -> headers.setBearerAuth(token))
                    .build();
            return next.exchange(requestWithToken);

    }

}
Java

Then, let’s set the previously implemented filter for the default WebClient builder.

@SpringBootApplication
public class ExternalMcpSampleClient {

    public static void main(String[] args) {
        SpringApplication.run(ExternalMcpSampleClient.class, args);
    }

    @Bean
    WebClient.Builder webClientBuilder(McpSyncClientExchangeFilterFunction filterFunction) {
        return WebClient.builder()
                .filter(filterFunction);
    }
}
Java

After that, we must configure the MCP gateway address and token in the application properties. To achieve that, we must use the spring.ai.mcp.client.streamable-http.connections property. The MCP gateway listens on port 8811. The token value will be read from the MCP_TOKEN environment variable.

spring.ai.mcp.client.streamable-http.connections:
  docker-mcp-gateway:
    url: http://localhost:8811

mcp.token: ${MCP_TOKEN}
YAML

Implement Application Logic with Spring AI and OpenAI Support

The concept behind the sample application is quite simple. It involves creating a separate @RestController for each MCP server. For each one, I will create a simple prompt that asks for the number of repositories or projects in my account on the given platform. Let’s start with SonarCloud. Each implementation uses the Spring AI ToolCallbackProvider bean to expose the tools offered by the MCP servers to the LLM model.

@RestController
@RequestMapping("/sonarcloud")
public class SonarCloudController {

    private final static Logger LOG = LoggerFactory
        .getLogger(SonarCloudController.class);
    private final ChatClient chatClient;

    public SonarCloudController(ChatClient.Builder chatClientBuilder,
                                ToolCallbackProvider tools) {
        this.chatClient = chatClientBuilder
                .defaultToolCallbacks(tools)
                .build();
    }

    @GetMapping("/count")
    String countRepositories() {
        PromptTemplate pt = new PromptTemplate("""
                How many projects in Sonarcloud do I have ?
                """);
        Prompt p = pt.create();
        return this.chatClient.prompt(p)
                .call()
                .content();
    }

}
Java

Below is a very similar implementation for GitHub MCP. This controller is exposed under the /github context path.

@RestController
@RequestMapping("/github")
public class GitHubController {

    private final static Logger LOG = LoggerFactory
        .getLogger(GitHubController.class);
    private final ChatClient chatClient;

    public GitHubController(ChatClient.Builder chatClientBuilder,
                            ToolCallbackProvider tools) {
        this.chatClient = chatClientBuilder
                .defaultToolCallbacks(tools)
                .build();
    }

    @GetMapping("/count")
    String countRepositories() {
        PromptTemplate pt = new PromptTemplate("""
                How many repositories in GitHub do I have ?
                """);
        Prompt p = pt.create();
        return this.chatClient.prompt(p)
                .call()
                .content();
    }

}
Java

Finally, there is the controller implementation for CircleCI MCP. It is available externally under the /circleci context path.

@RestController
@RequestMapping("/circleci")
public class CircleCIController {

    private final static Logger LOG = LoggerFactory
        .getLogger(CircleCIController.class);
    private final ChatClient chatClient;

    public CircleCIController(ChatClient.Builder chatClientBuilder,
                              ToolCallbackProvider tools) {
        this.chatClient = chatClientBuilder
                .defaultToolCallbacks(tools)
                .build();
    }

    @GetMapping("/count")
    String countRepositories() {
        PromptTemplate pt = new PromptTemplate("""
                How many projects in CircleCI do I have ?
                """);
        Prompt p = pt.create();
        return this.chatClient.prompt(p)
                .call()
                .content();
    }

}
Java

The last controller implementation is a bit more complex. First, I need to tell the LLM model how to construct project keys in SonarCloud and specify my GitHub username. This will not be part of the main prompt. Rather, it will be the system role, which guides the AI’s behavior and response style. Therefore, I’ll create the SystemPromptTemplate first. The user role prompt accepts an input parameter specifying the name of my GitHub repository. The response should combine data on the last commit in a given repository with the status of the most recent SonarCloud analysis. In this case, the LLM will need to communicate with two MCP servers running behind the Docker MCP gateway simultaneously.

@RestController
@RequestMapping("/global")
public class GlobalController {

    private final static Logger LOG = LoggerFactory
        .getLogger(GlobalController.class);
    private final ChatClient chatClient;

    public GlobalController(ChatClient.Builder chatClientBuilder,
                            ToolCallbackProvider tools) {
        this.chatClient = chatClientBuilder
                .defaultToolCallbacks(tools)
                .build();
    }

    @GetMapping("/status/{repo}")
    String repoStatus(@PathVariable String repo) {
        SystemPromptTemplate st = new SystemPromptTemplate("""
                My username in GitHub is piomin.
                Each my project key in SonarCloud contains the prefix with my organization name and _ char.
                """);
        var stMsg = st.createMessage();

        PromptTemplate pt = new PromptTemplate("""
                When was the last commit made in my GitHub repository {repo} ?
                What is the latest analyze status in SonarCloud for that repo ?
                """);
        var usMsg = pt.createMessage(Map.of("repo", repo));

        Prompt prompt = new Prompt(List.of(usMsg, stMsg));
        return this.chatClient.prompt(prompt)
                .call()
                .content();
    }
}
Java

Before running the app, we must set two required environment variables that contain the OpenAI and Docker MCP gateway tokens.

export MCP_TOKEN=by1culxc6sctmycxtyl9xh7499mb8pctbsdb3brha1hvmm4d8l
export SPRING_AI_OPENAI_API_KEY=<YOUR_OPEN_AI_TOKEN>
Plaintext

Finally, we can run our Spring Boot app with the following command.

mvn spring-boot:run
ShellSession

First, I’m going to ask about the number of my GitHub repositories.

curl http://localhost:8080/github/count
ShellSession

Then, I can check the number of projects in my SonarCloud account.

curl http://localhost:8080/sonarcloud/count
ShellSession

Finally, I can choose a specific repository and verify the last commit and the current analysis status in SonarCloud.

curl http://localhost:8080/global/status/sample-spring-boot-kafka
ShellSession

Here’s the LLM answer for my sample-spring-boot-kafka repository. You can perform the same exercise for your repositories and projects.

Conclusion

Spring AI, combined with the MCP client, opens a powerful path toward building truly tool-aware AI applications. By using the Docker MCP Gateway, we can easily host and manage MCP servers such as GitHub or SonarQube consistently and reproducibly, without tightly coupling them to our application runtime. Docker provides a user-friendly interface for managing MCP servers, giving users access to everything through a single MCP gateway. This approach appears to have advantages, particularly during application development.


MCP with Quarkus LangChain4j https://piotrminkowski.com/2025/11/24/mcp-with-quarkus-langchain4j/ https://piotrminkowski.com/2025/11/24/mcp-with-quarkus-langchain4j/#respond Mon, 24 Nov 2025 07:45:15 +0000 https://piotrminkowski.com/?p=15845 This article shows how to use Quarkus LangChain4j support for MCP (Model Context Protocol) on both the server and client sides. You will learn how to serve tools and prompts on the server side and discover them in the Quarkus MCP client-side application. The Model Context Protocol is a standard for managing contextual interactions with […]

The post MCP with Quarkus LangChain4j appeared first on Piotr's TechBlog.

This article shows how to use Quarkus LangChain4j support for MCP (Model Context Protocol) on both the server and client sides. You will learn how to serve tools and prompts on the server side and discover them in the Quarkus MCP client-side application. The Model Context Protocol is a standard for managing contextual interactions with AI models. It provides a standardized way to connect AI models to external data sources and tools. It can help with building complex workflows on top of LLMs.

This article is the second part of a series describing some of the Quarkus AI project’s most notable features.  Before reading this article, I recommend checking out two previous parts of the tutorial:

You can also compare Quarkus’ support for MCP with similar support on the Spring AI side. You can find the article I mentioned on my blog here.

Source Code

Feel free to use my source code if you’d like to try it out yourself. To do that, you must clone my sample GitHub repository. Then, simply follow my instructions.

Architecture for the Quarkus MCP Scenario

Let’s start with a diagram of our application architecture. Two Quarkus applications act as MCP servers. They connect to the in-memory database and use Quarkus LangChain4j MCP Server support to expose @Tool methods to the MCP client-side app. The client-side app communicates with the OpenAI model. It includes the tools exposed by the server-side apps in the user query to the AI model. The person-mcp-server app provides @Tool methods for searching persons in the database table. The account-mcp-server is doing the same for the persons’ accounts.

quarkus-mcp-arch

Build an MCP Server with Quarkus

Both MCP server applications are similar. They connect to the H2 database via the Quarkus Panache ORM extension. Both provide MCP API via Server-Sent Events (SSE) transport. Here’s a list of required Maven dependencies:

<dependencies>
  <dependency>
    <groupId>io.quarkiverse.mcp</groupId>
    <artifactId>quarkus-mcp-server-sse</artifactId>
  </dependency>
  <dependency>
    <groupId>io.quarkus</groupId>
    <artifactId>quarkus-hibernate-orm-panache</artifactId>
  </dependency>
  <dependency>
    <groupId>io.quarkus</groupId>
    <artifactId>quarkus-jdbc-h2</artifactId>
  </dependency>
  <dependency>
    <groupId>io.quarkus</groupId>
    <artifactId>quarkus-junit5</artifactId>
    <scope>test</scope>
  </dependency>
</dependencies>
XML

Let’s start with the person-mcp-server application. Here’s the @Entity class for interacting with the person table. It uses Panache support to avoid the need for getter and setter declarations.

@Entity
public class Person extends PanacheEntityBase {

    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    public Long id;
    public String firstName;
    public String lastName;
    public int age;
    public String nationality;
    @Enumerated(EnumType.STRING)
    public Gender gender;

}
Java

The PersonRepository class contains a single method for searching persons by their nationality:

@ApplicationScoped
public class PersonRepository implements PanacheRepository<Person> {

    public List<Person> findByNationality(String nationality) {
        return find("nationality", nationality).list();
    }

}
Java

Next, prepare the “tools service” that searches for a single person by ID or a list of people of a given nationality in the database. Each method must be annotated with @Tool and include a description in the description field. Quarkus LangChain4j does not allow a Java List to be returned, so we need to wrap it using a dedicated Persons object.

@ApplicationScoped
public class PersonTools {

    PersonRepository personRepository;

    public PersonTools(PersonRepository personRepository) {
        this.personRepository = personRepository;
    }

    @Tool(description = "Find person by ID")
    public Person getPersonById(
            @ToolArg(description = "Person ID") Long id) {
        return personRepository.findById(id);
    }

    @Tool(description = "Find all persons by nationality")
    public Persons getPersonsByNationality(
            @ToolArg(description = "Nationality") String nationality) {
        return new Persons(personRepository.findByNationality(nationality));
    }
}
Java

Here’s our List<Person> wrapper:

public class Persons {

    private List<Person> persons;

    public Persons(List<Person> persons) {
        this.persons = persons;
    }

    public List<Person> getPersons() {
        return persons;
    }

    public void setPersons(List<Person> persons) {
        this.persons = persons;
    }
}
Java
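As a side note (this is not part of the original repository), in recent Java versions such a wrapper can also be expressed as a record, which removes the getter/setter boilerplate entirely. Whether a record works in your setup depends on how Quarkus serializes tool results, so treat this as a sketch; the Person stand-in below is hypothetical and only illustrates the shape:

```java
import java.util.List;

public class PersonsRecordDemo {

    // Hypothetical stand-in for the Person entity, just for this sketch
    public record Person(String firstName, String lastName) {}

    // A record gives the wrapper a persons() accessor without getters/setters
    public record Persons(List<Person> persons) {}

    static int count(Persons p) {
        return p.persons().size();
    }

    public static void main(String[] args) {
        Persons wrapped = new Persons(List.of(new Person("Jan", "Kowalski")));
        System.out.println(count(wrapped)); // prints 1
    }
}
```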

The implementation of the account-mcp-server application is essentially very similar. Here’s the @Entity class for interacting with the account table. It uses Panache support to avoid the need for getter and setter declarations.

@Entity
public class Account extends PanacheEntityBase {

    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    public Long id;
    public String number;
    public int balance;
    public Long personId;

}
Java

The AccountRepository class contains a single method for searching accounts by person ID:

@ApplicationScoped
public class AccountRepository implements PanacheRepository<Account> {

    public List<Account> findByPersonId(Long personId) {
        return find("personId", personId).list();
    }

}
Java

Once again, the list inside the “tools service” must be wrapped by the dedicated object. The single method annotated with @Tool returns a list of accounts assigned to a given person.

@ApplicationScoped
public class AccountTools {

    AccountRepository accountRepository;

    public AccountTools(AccountRepository accountRepository) {
        this.accountRepository = accountRepository;
    }

    @Tool(description = "Find all accounts by person ID")
    public Accounts getAccountsByPersonId(
            @ToolArg(description = "Person ID") Long personId) {
        return new Accounts(accountRepository.findByPersonId(personId));
    }

}
Java

The person-mcp-server starts on port 8082, while the account-mcp-server listens on port 8081. To change the default HTTP port, use the quarkus.http.port property in each application’s application.properties file.
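For reference, the port configuration might look like this (a sketch; the file locations follow the standard Maven layout, and the exact entries in the repository may differ):

```properties
# person-mcp-server/src/main/resources/application.properties
quarkus.http.port=8082

# account-mcp-server/src/main/resources/application.properties
quarkus.http.port=8081
```
Plaintext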

Build an MCP Client with Quarkus

Our application interacts with the OpenAI chat model, so we must include the Quarkus LangChain4j OpenAI extension. In turn, to integrate the client-side application with MCP-compliant servers, we need to include the quarkus-langchain4j-mcp extension.

<dependencies>
  <dependency>
    <groupId>io.quarkus</groupId>
    <artifactId>quarkus-rest-jackson</artifactId>
  </dependency>
  <dependency>
    <groupId>io.quarkiverse.langchain4j</groupId>
    <artifactId>quarkus-langchain4j-openai</artifactId>
    <version>${quarkus.langchain4j.version}</version>
  </dependency>
  <dependency>
    <groupId>io.quarkiverse.langchain4j</groupId>
    <artifactId>quarkus-langchain4j-mcp</artifactId>
    <version>${quarkus.langchain4j.version}</version>
  </dependency>
</dependencies>
XML

The sample-client Quarkus app interacts with both the person-mcp-server app and the account-mcp-server app. Therefore, it defines two AI services. As with a standard AI application in Quarkus, those services must be annotated with @RegisterAiService. Then we define methods and prompt templates using the @UserMessage and @SystemMessage annotations. If a given method is to use one of the MCP servers, it must be annotated with @McpToolBox. The name inside the annotation corresponds to the MCP server name set in the configuration properties. The PersonService AI service shown below uses the person-service MCP server.

@ApplicationScoped
@RegisterAiService
public interface PersonService {

    @SystemMessage("""
        You are a helpful assistant that generates realistic person data.
        Always respond with valid JSON format.
        """)
    @UserMessage("""
        Find persons with {nationality} nationality.
        Output **only valid JSON**, no explanations, no markdown, no ```json blocks.
        """)
    @McpToolBox("person-service")
    Persons findByNationality(String nationality);

    @SystemMessage("""
        You are a helpful assistant that generates realistic person data.
        Always respond with valid JSON format.
        """)
    @UserMessage("How many persons come from {nationality} ?")
    @McpToolBox("person-service")
    int countByNationality(String nationality);

}
Java

The AI service shown above corresponds to this configuration. You need to specify the MCP server’s name and address. As you remember, person-mcp-server listens on port 8082. The client application uses the name person-service here, and the standard endpoint to MCP SSE for Quarkus is /mcp/sse. To explore the solution itself, it is also worth enabling logging of MCP requests and responses.

quarkus.langchain4j.mcp.person-service.transport-type = http
quarkus.langchain4j.mcp.person-service.url = http://localhost:8082/mcp/sse
quarkus.langchain4j.mcp.person-service.log-requests = true
quarkus.langchain4j.mcp.person-service.log-responses = true
Plaintext

Here is a similar implementation for the AccountService AI service. It interacts with the MCP server configured under the account-service name.

@ApplicationScoped
@RegisterAiService
public interface AccountService {

    @SystemMessage("""
        You are a helpful assistant that generates realistic data.
        Return a single number.
        """)
    @UserMessage("How many accounts has person with {personId} ID ?")
    @McpToolBox("account-service")
    int countByPersonId(int personId);

    @UserMessage("""
        How many accounts has person with {personId} ID ?
        Return person name, nationality and a total balance on his/her accounts.
        """)
    @McpToolBox("account-service")
    String balanceByPersonId(int personId);

}
Java

Here’s the corresponding configuration for that service. No surprises.

quarkus.langchain4j.mcp.account-service.transport-type = http
quarkus.langchain4j.mcp.account-service.url = http://localhost:8081/mcp/sse
quarkus.langchain4j.mcp.account-service.log-requests = true
quarkus.langchain4j.mcp.account-service.log-responses = true
Plaintext

Finally, we must provide some configuration to integrate our Quarkus application with the OpenAI chat model. It assumes that the OpenAI token is available as the OPEN_AI_TOKEN environment variable.

quarkus.langchain4j.chat-model.provider = openai
quarkus.langchain4j.log-requests = true
quarkus.langchain4j.log-responses = true
quarkus.langchain4j.openai.api-key = ${OPEN_AI_TOKEN}
quarkus.langchain4j.openai.timeout = 20s
Plaintext

We can test individual AI services by calling endpoints provided by the client-side application. There are two endpoints, POST /count-by-person-id/{personId} and POST /balance-by-person-id/{personId}, which use LLM prompts to calculate the number of accounts and the total balance of all accounts belonging to a given person.

@Path("/accounts")
public class AccountResource {

    private final AccountService accountService;

    public AccountResource(AccountService accountService) {
        this.accountService = accountService;
    }

    @POST
    @Path("/count-by-person-id/{personId}")
    public int countByPersonId(int personId) {
        return accountService.countByPersonId(personId);
    }

    @POST
    @Path("/balance-by-person-id/{personId}")
    public String balanceByPersonId(int personId) {
        return accountService.balanceByPersonId(personId);
    }

}
Java

MCP for Prompts

MCP servers can also provide other functionalities beyond just tools. Let’s go back to the person-mcp-server app for a moment. To share a prompt message, you can create a class that defines methods returning the PromptMessage object. Then, we must annotate such methods with @Prompt, and their arguments with @PromptArg.

@ApplicationScoped
public class PersonPrompts {

    final String findByNationalityPrompt = """
        Find persons with {nationality} nationality.
        Output **only valid JSON**, no explanations, no markdown, no ```json blocks.
        """;

    @Prompt(description = "Find by nationality.")
    PromptMessage findByNationalityPrompt(@PromptArg(description = "The nationality") String nationality) {
        return PromptMessage.withUserRole(new TextContent(findByNationalityPrompt));
    }

}
Java

Once we start the application, we can use Quarkus Dev UI to verify a list of provided tools and prompts.

Client-side integration with MCP prompts is a bit more complex than with tools. We must inject the McpClient into a resource controller to load a given prompt programmatically using its name.

@Path("/persons")
public class PersonResource {

    @McpClientName("person-service")
    McpClient mcpClient;
    
    // OTHER METHODS...
    
    @POST
    @Path("/nationality-with-prompt/{nationality}")
    public List<Person> findByNationalityWithPrompt(String nationality) {
        Persons p = personService.findByNationalityWithPrompt(loadPrompt(nationality), nationality);
        return p.getPersons();
    }
    
    private String loadPrompt(String nationality) {
        McpGetPromptResult prompt = mcpClient.getPrompt("findByNationalityPrompt", Map.of("nationality", nationality));
        return ((TextContent) prompt.messages().getFirst().content().toContent()).text();
    }
}
Java

In this case, the Quarkus AI service should not define @UserMessage on the entire method, but only on a method argument. The prompt message is then loaded from the MCP server and filled with the nationality parameter value before being sent to the AI model.

@ApplicationScoped
@RegisterAiService
public interface PersonService {

    // OTHER METHODS...
    
    @SystemMessage("""
        You are a helpful assistant that generates realistic person data.
        Always respond with valid JSON format.
        """)
    @McpToolBox("person-service")
    Persons findByNationalityWithPrompt(@UserMessage String userMessage, String nationality);

}
Java

Testing MCP Tools with Quarkus

Quarkus provides a dedicated module for testing MCP tools. We can use it after including the following dependency in the Maven pom.xml:

<dependency>
  <groupId>io.quarkiverse.mcp</groupId>
  <artifactId>quarkus-mcp-server-test</artifactId>
  <scope>test</scope>
</dependency>
XML

The following test verifies the MCP tool methods provided by the person-mcp-server application. The McpAssured class allows us to use SSE, streamable, and WebSocket test clients. To create a new client for SSE, invoke the newConnectedSseClient() static method. After that, we can use one of several available variants of the toolsCall(...) method to verify the response returned by a given @Tool.

@QuarkusTest
public class PersonToolsTest {

    ObjectMapper mapper = new ObjectMapper();

    @Test
    public void testGetPersonsByNationality() {
        McpAssured.McpSseTestClient client = McpAssured.newConnectedSseClient();
        client.when()
                .toolsCall("getPersonsByNationality", Map.of("nationality", "Denmark"),
                        r -> {
                            try {
                                Persons p = mapper.readValue(r.content().getFirst().asText().text(), Persons.class);
                                assertFalse(p.getPersons().isEmpty());
                            } catch (JsonProcessingException e) {
                                throw new RuntimeException(e);
                            }
                        })
                .thenAssertResults();
    }

    @Test
    public void testGetPersonById() {
        McpAssured.McpSseTestClient client = McpAssured.newConnectedSseClient();
        client.when()
                .toolsCall("getPersonById", Map.of("id", 10),
                        r -> {
                            try {
                                Person p = mapper.readValue(r.content().getFirst().asText().text(), Person.class);
                                assertNotNull(p);
                                assertNotNull(p.id);
                            } catch (JsonProcessingException e) {
                                throw new RuntimeException(e);
                            }
                        })
                .thenAssertResults();
    }
}
Java

Running Quarkus Applications

Finally, let’s run all our Quarkus applications. Go to the mcp/account-mcp-server directory and run the application in development mode:

$ cd mcp/account-mcp-server
$ mvn quarkus:dev
ShellSession

Then do the same for the person-mcp-server application.

$ cd mcp/person-mcp-server
$ mvn quarkus:dev
ShellSession

Before running the last sample-client application, export the OpenAI API token as the OPEN_AI_TOKEN environment variable.

$ cd mcp/sample-client
$ export OPEN_AI_TOKEN=<YOUR_OPENAI_TOKEN>
$ mvn quarkus:dev
ShellSession

We can verify a list of tools or prompts exposed by each MCP server application by visiting its Quarkus Dev UI console. It provides a dedicated “MCP Server tile.”

quarkus-mcp-devui

Here’s a list of tools provided by the person-mcp-server app via Quarkus Dev UI.

Then, we can switch to the sample-client Dev UI console. We can verify and test all interactions with the MCP servers from our client-side app.

quarkus-mcp-client-ui

Once all the sample applications are running, we can test the MCP communication by calling the HTTP endpoints exposed by the sample-client app. Both person-mcp-server and account-mcp-server load some test data on startup using the import.sql file. Here are the test API calls for all the REST endpoints.
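The test data is loaded through Hibernate’s standard import.sql mechanism in src/main/resources of each server. A minimal sketch of what such a file could contain — the actual columns and values in the repository may differ:

```sql
-- person-mcp-server: sample rows for the person table
INSERT INTO person(id, firstName, lastName, age, nationality, gender)
VALUES (1, 'Anna', 'Larsen', 34, 'Denmark', 'FEMALE');

-- account-mcp-server: sample rows for the account table (in its own import.sql)
INSERT INTO account(id, number, balance, personId)
VALUES (1, 'DK12345678', 2500, 1);
```
Plaintext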

$ curl -X POST http://localhost:8080/persons/nationality/Denmark
$ curl -X POST http://localhost:8080/persons/count-by-nationality/Denmark
$ curl -X POST http://localhost:8080/persons/nationality-with-prompt/Denmark
$ curl -X POST http://localhost:8080/accounts/count-by-person-id/2
$ curl -X POST http://localhost:8080/accounts/balance-by-person-id/2
ShellSession

Conclusion

With Quarkus, creating applications that use MCP is not difficult. If you already understand the idea of tool calling in AI, the MCP-based approach will feel familiar. This article showed how to connect your application to several MCP servers, how to implement tests that verify the elements an application exposes over MCP, and what support the Quarkus Dev UI provides.

The post MCP with Quarkus LangChain4j appeared first on Piotr's TechBlog.

Getting Started with Quarkus LangChain4j and Chat Model https://piotrminkowski.com/2025/06/18/getting-started-with-quarkus-langchain4j-and-chat-model/ https://piotrminkowski.com/2025/06/18/getting-started-with-quarkus-langchain4j-and-chat-model/#respond Wed, 18 Jun 2025 16:36:08 +0000 https://piotrminkowski.com/?p=15736 This article will teach you how to use the Quarkus LangChain4j project to build applications based on different chat models. The Quarkus AI Chat Model offers a portable and straightforward interface, enabling seamless interaction with these models. Our sample Quarkus application will switch between three popular chat models provided by OpenAI, Mistral AI, and Ollama. […]

The post Getting Started with Quarkus LangChain4j and Chat Model appeared first on Piotr's TechBlog.

This article will teach you how to use the Quarkus LangChain4j project to build applications based on different chat models. Quarkus LangChain4j offers a portable and straightforward chat model interface, enabling seamless interaction with these models. Our sample Quarkus application will switch between three popular chat models provided by OpenAI, Mistral AI, and Ollama. This article is the first in a series explaining AI concepts with Quarkus LangChain4j. Look for more on my blog in this area soon. The idea of this tutorial is very similar to the series on Spring AI. Therefore, you will be able to easily compare the two approaches, as the sample application does the same thing as the analogous Spring Boot application.

If you like Quarkus, then you can find quite a few articles about it on my blog. Just go to the Quarkus category and find the topic you are interested in.

SourceCode

Feel free to use my source code if you’d like to try it out yourself. To do that, you must clone my sample GitHub repository. Then, simply follow my instructions.

Motivation

Whenever I create a new article or example related to AI, I like to define the problem I’m trying to solve. The problem this example solves is very trivial. I publish numerous small demo apps to explain complex technology concepts. These apps typically require data to display a demo output. Usually, I add demo data by myself or use a library like Datafaker to do it for me. This time, we can leverage the AI Chat Models API for that. Let’s begin!
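For comparison, here is roughly what a hand-rolled test-data generator looks like when you don’t use an AI model or a library like Datafaker. This is purely illustrative plain Java, not code from the sample repository, and all names and values are made up:

```java
import java.util.List;
import java.util.Random;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class DemoDataGenerator {

    // Hypothetical person shape, roughly matching the Person class used later in the article.
    record Person(int id, String firstName, String lastName, int age) {}

    private static final List<String> FIRST = List.of("Anna", "Jan", "Maria", "Piotr");
    private static final List<String> LAST = List.of("Kowalski", "Nowak", "Schmidt", "Jensen");

    // Generate n pseudo-random persons with sequential unique IDs.
    static List<Person> generate(int n, long seed) {
        Random rnd = new Random(seed);
        return IntStream.rangeClosed(1, n)
                .mapToObj(i -> new Person(i,
                        FIRST.get(rnd.nextInt(FIRST.size())),
                        LAST.get(rnd.nextInt(LAST.size())),
                        18 + rnd.nextInt(63))) // ages 18..80
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Person> persons = generate(10, 42L);
        System.out.println(persons.size());
        persons.forEach(p ->
                System.out.println(p.id() + " " + p.firstName() + " " + p.lastName() + " " + p.age()));
    }
}
```

This works, but the data is repetitive and you maintain the name lists yourself; delegating generation to a chat model removes that boilerplate entirely.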

I have also covered today’s Quarkus-related topic earlier for Spring Boot. For a comparison of the features offered by both frameworks for simple interaction with an AI chat model, you can read this article on Spring AI.

Dependencies

The sample application uses the latest version of the Quarkus framework.

<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>io.quarkus.platform</groupId>
      <artifactId>quarkus-bom</artifactId>
      <version>${quarkus.platform.version}</version>
      <type>pom</type>
      <scope>import</scope>
    </dependency>
  </dependencies>
</dependencyManagement>
XML

You can easily switch between multiple AI model implementations by activating a dedicated Maven profile. By default, the open-ai profile is active. It includes the quarkus-langchain4j-openai module in the Maven dependencies. You can also activate the mistral-ai or ollama profile. In that case, the quarkus-langchain4j-mistral-ai or quarkus-langchain4j-ollama module will be included instead of the LangChain4j OpenAI extension.

<profiles>
  <profile>
    <id>open-ai</id>
    <activation>
      <activeByDefault>true</activeByDefault>
    </activation>
    <dependencies>
      <dependency>
        <groupId>io.quarkiverse.langchain4j</groupId>
        <artifactId>quarkus-langchain4j-openai</artifactId>
        <version>${quarkus-langchain4j.version}</version>
      </dependency>
    </dependencies>
  </profile>
  <profile>
    <id>mistral-ai</id>
    <dependencies>
      <dependency>
        <groupId>io.quarkiverse.langchain4j</groupId>
        <artifactId>quarkus-langchain4j-mistral-ai</artifactId>
        <version>${quarkus-langchain4j.version}</version>
      </dependency>
    </dependencies>
  </profile>
  <profile>
    <id>ollama</id>
    <dependencies>
      <dependency>
        <groupId>io.quarkiverse.langchain4j</groupId>
        <artifactId>quarkus-langchain4j-ollama</artifactId>
        <version>${quarkus-langchain4j.version}</version>
      </dependency>
    </dependencies>
  </profile>
</profiles>
XML

The sample Quarkus application is simple. It exposes some REST endpoints and communicates with a selected AI model to return an AI-generated response via each endpoint. So, you need to include only core Quarkus modules like quarkus-rest-jackson or quarkus-arc. To implement JUnit tests against the REST API, it also includes the quarkus-junit5 and rest-assured modules in the test scope.

<dependencies>
  <!-- Core Quarkus dependencies -->
  <dependency>
    <groupId>io.quarkus</groupId>
    <artifactId>quarkus-rest-jackson</artifactId>
  </dependency>
  <dependency>
    <groupId>io.quarkus</groupId>
    <artifactId>quarkus-arc</artifactId>
  </dependency>

  <!-- Test dependencies -->
  <dependency>
    <groupId>io.quarkus</groupId>
    <artifactId>quarkus-junit5</artifactId>
    <scope>test</scope>
  </dependency>
  <dependency>
    <groupId>io.rest-assured</groupId>
    <artifactId>rest-assured</artifactId>
    <scope>test</scope>
  </dependency>
</dependencies>
XML

Quarkus LangChain4j Chat Models Integration

Quarkus provides an innovative approach to interacting with AI chat models. First, you need to annotate an interface that defines AI-oriented methods with the @RegisterAiService annotation. Then you add a proper description and input prompt inside the @SystemMessage and @UserMessage annotations. Here is the sample PersonAiService interface, which defines two methods. The generatePersonList method asks the AI model to generate a list of 10 unique persons in a form consistent with the output object structure. The getPersonById method must read the previously generated list from chat memory and return the person’s data with a specified id field.

@RegisterAiService
@ApplicationScoped
public interface PersonAiService {

    @SystemMessage("""
        You are a helpful assistant that generates realistic person data.
        Always respond with valid JSON format.
        """)
    @UserMessage("""
        Generate exactly 10 unique persons

        Requirements:
        - Each person must have a unique integer ID (like 1, 2, 3, etc.)
        - Use realistic first and last names per each nationality
        - Ages should be between 18 and 80
        - Return ONLY the JSON array, no additional text
        """)
    PersonResponse generatePersonList(@MemoryId int userId);

    @SystemMessage("""
        You are a helpful assistant that can recall generated person data from chat memory.
        """)
    @UserMessage("""
        In the previously generated list of persons for user {userId}, find and return the person with id {id}.
        
        Return ONLY the JSON object, no additional text.
        """)
    Person getPersonById(@MemoryId int userId, int id);

}
Java

There are a few more things to note about the code snippet above. The beans created by @RegisterAiService are @RequestScoped by default. According to the Quarkus LangChain4j documentation, this allows the chat memory to be cleaned up once the request ends. In the case above, the list of persons is generated per user ID, which acts as the key used to look up the chat memory. To guarantee that the getPersonById method finds the list of persons generated per @MemoryId, the PersonAiService interface must be annotated with @ApplicationScoped. The InMemoryChatMemoryStore implementation is enabled by default, so you don’t need to declare any additional beans to use it.
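Conceptually, the chat memory behaves like a map keyed by the memory ID. The following plain-Java sketch (not Quarkus code, just an illustration of the idea behind an in-memory chat memory store) shows why an application-scoped store lets a later call recall messages produced by an earlier one for the same user:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Simplified illustration: one conversation history per memory ID.
public class InMemoryChatMemorySketch {

    private final Map<Integer, List<String>> store = new ConcurrentHashMap<>();

    void add(int memoryId, String message) {
        store.computeIfAbsent(memoryId, id -> new ArrayList<>()).add(message);
    }

    List<String> history(int memoryId) {
        return store.getOrDefault(memoryId, List.of());
    }

    public static void main(String[] args) {
        InMemoryChatMemorySketch memory = new InMemoryChatMemorySketch();
        // First call: generatePersonList(1) stores the generated list for user 1.
        memory.add(1, "persons for user 1: [...]");
        // Second call: getPersonById(1, ...) can recall it, because the store outlives the request.
        System.out.println(memory.history(1).size());
        System.out.println(memory.history(2).size()); // a different user has a separate, empty history
    }
}
```

If the store itself were request-scoped, the second call would see an empty history, which is exactly why the AI service is made @ApplicationScoped here.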

Quarkus LangChain4j can automatically map the LLM’s JSON response to an output POJO. However, it is not currently possible to map it directly to an output collection. Therefore, you must wrap the output list in an additional class, as shown below.

public class PersonResponse {

    private List<Person> persons;

    public List<Person> getPersons() {
        return persons;
    }

    public void setPersons(List<Person> persons) {
        this.persons = persons;
    }
}
Java

Here’s the Person class:

public class Person {

    private Integer id;
    private String firstName;
    private String lastName;
    private int age;
    private String nationality;
    private Gender gender;
    
    // GETTERS and SETTERS

}
Java

Finally, the last part of our implementation is REST endpoints. Here’s the REST controller that injects and uses PersonAiService to interact with the AI chat model. It exposes two endpoints: GET /api/{userId}/persons and GET /api/{userId}/persons/{id}. You can generate several lists of persons by specifying the userId path parameter.

@Path("/api")
@Produces(MediaType.APPLICATION_JSON)
@Consumes(MediaType.APPLICATION_JSON)
public class PersonController {

    private static final Logger LOG = Logger.getLogger(PersonController.class);

    private final PersonAiService personAiService;

    public PersonController(PersonAiService personAiService) {
        this.personAiService = personAiService;
    }

    @GET
    @Path("/{userId}/persons")
    public PersonResponse generatePersons(@PathParam("userId") int userId) {
        return personAiService.generatePersonList(userId);
    }

    @GET
    @Path("/{userId}/persons/{id}")
    public Person getPersonById(@PathParam("userId") int userId, @PathParam("id") int id) {
        return personAiService.getPersonById(userId, id);
    }

}
Java

Use Different AI Models with Quarkus LangChain4j

Configuration Properties

Here is a configuration defined within the application.properties file. Before proceeding, you must generate the OpenAI and Mistral AI API tokens and export them as environment variables. Additionally, you can enable logging of requests and responses in AI model communication. It is also worth increasing the default timeout for a single request from 10 seconds to a higher value, such as 20 seconds.

quarkus.langchain4j.chat-model.provider = ${AI_MODEL_PROVIDER:openai}
quarkus.langchain4j.log-requests = true
quarkus.langchain4j.log-responses = true

# OpenAI Configuration
quarkus.langchain4j.openai.api-key = ${OPEN_AI_TOKEN}
quarkus.langchain4j.openai.timeout = 20s

# Mistral AI Configuration
quarkus.langchain4j.mistralai.api-key = ${MISTRAL_AI_TOKEN}
quarkus.langchain4j.mistralai.timeout = 20s

# Ollama Configuration
quarkus.langchain4j.ollama.base-url = ${OLLAMA_BASE_URL:http://localhost:11434}
Plaintext

To run the sample Quarkus application and connect it to OpenAI, you must set the OPEN_AI_TOKEN environment variable. Since the open-ai Maven profile is active by default, you don’t need to set anything else when running the app.

$ export OPEN_AI_TOKEN=<your_openai_token>
$ mvn quarkus:dev
ShellSession

Then, you can call the GET /api/{userId}/persons endpoint with different userId path variable values. Here are sample API requests and responses.

quarkus-langchain4j-calls

After that, you can call the GET /api/{userId}/persons/{id} endpoint to return a specified person found in the chat memory.

Switch Between AI Models

Then, you can repeat the same exercise with the Mistral AI model. You must set the AI_MODEL_PROVIDER variable to mistralai, export the API token as the MISTRAL_AI_TOKEN environment variable, and enable the mistral-ai profile while running the app.

$ export AI_MODEL_PROVIDER=mistralai
$ export MISTRAL_AI_TOKEN=<your_mistralai_token>
$ mvn quarkus:dev -Pmistral-ai
ShellSession

The app should start successfully.

quarkus-langchain4j-logs

Once it happens, you can repeat the same sequence of requests as before for OpenAI.

$ curl http://localhost:8080/api/1/persons
$ curl http://localhost:8080/api/2/persons
$ curl http://localhost:8080/api/1/persons/1
$ curl http://localhost:8080/api/2/persons/1
ShellSession

You can check the request sent to the AI model in the application logs.

Here’s a log showing an AI chat model response:

Finally, you can run a test with Ollama. By default, the LangChain4j extension for Ollama uses the llama3.2 model. You can change it by setting the quarkus.langchain4j.ollama.chat-model.model-id property in the application.properties file. Assuming that you use the llama3.3 model, here’s your configuration:

quarkus.langchain4j.ollama.base-url = ${OLLAMA_BASE_URL:http://localhost:11434}
quarkus.langchain4j.ollama.chat-model.model-id = llama3.3
quarkus.langchain4j.ollama.timeout = 60s
Plaintext

Before proceeding, you must run the llama3.3 model on your laptop. Of course, you can choose another, smaller model, because llama3.3 is 42 GB.

$ ollama run llama3.3
ShellSession

Pulling the model can take a long time, but once the download completes, the model is ready to use.

Once a model is running, you can set the AI_MODEL_PROVIDER environment variable to ollama and activate the ollama profile for the app:

$ export AI_MODEL_PROVIDER=ollama
$ mvn quarkus:dev -Pollama
ShellSession

This time, our application is connected to the llama3.3 model started with Ollama:

quarkus-langchain4j-ollama

With the Quarkus LangChain4j Ollama extension, you can take advantage of Dev Services support. This means that you don’t need to install and run Ollama on your laptop or start a model with the ollama CLI. Quarkus will run Ollama as a Docker container and automatically pull the selected AI model into it. In that case, you don’t need to set the quarkus.langchain4j.ollama.base-url property. Before switching to that option, let’s use a smaller AI model by setting the quarkus.langchain4j.ollama.chat-model.model-id = mistral property. Then start the app in the same way as before.
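Under that assumption, the Ollama section of application.properties could shrink to something like the following (property names as used earlier in this article; the base-url entry is omitted because Dev Services starts the container for you):

```properties
# Dev Services: no base-url needed, Quarkus starts Ollama in a container
quarkus.langchain4j.ollama.chat-model.model-id = mistral
quarkus.langchain4j.ollama.timeout = 60s
```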

Final Thoughts

I must admit that the Quarkus LangChain4j extension is enjoyable to use. With a few simple annotations, you can configure your application to talk to the AI model of your choice. In this article, I presented a straightforward example of integrating Quarkus with an AI chat model. Along the way, we also briefly reviewed features such as prompts, structured output, and chat memory. You can expect more AI articles in the Quarkus series soon.

The post Getting Started with Quarkus LangChain4j and Chat Model appeared first on Piotr's TechBlog.

Using Model Context Protocol (MCP) with Spring AI https://piotrminkowski.com/2025/03/17/using-model-context-protocol-mcp-with-spring-ai/ https://piotrminkowski.com/2025/03/17/using-model-context-protocol-mcp-with-spring-ai/#comments Mon, 17 Mar 2025 16:17:32 +0000 https://piotrminkowski.com/?p=15608 This article will show how to use Spring AI support for MCP (Model Context Protocol) in Spring Boot server-side and client-side applications. You will learn how to serve tools and prompts on the server side and discover them on the client-side Spring AI application. The Model Context Protocol is a standard for managing contextual interactions […]

The post Using Model Context Protocol (MCP) with Spring AI appeared first on Piotr's TechBlog.

This article will show how to use Spring AI support for MCP (Model Context Protocol) in Spring Boot server-side and client-side applications. You will learn how to serve tools and prompts on the server side and discover them on the client-side Spring AI application. The Model Context Protocol is a standard for managing contextual interactions with AI models. It provides a standardized way to connect AI models to external data sources and tools. It can help with building complex workflows on top of LLMs. Spring AI MCP extends the MCP Java SDK and provides client and server Spring Boot starters. The MCP Client is responsible for establishing and managing connections with MCP servers.

This is the seventh part of my series of articles about Spring Boot and AI. It is worth reading the following posts before proceeding with the current one. Please pay special attention to the last article from the list about the tool calling feature since we will implement it in our sample client and server apps using MCP.

  1. https://piotrminkowski.com/2025/01/28/getting-started-with-spring-ai-and-chat-model: The first tutorial introduces the Spring AI project and its support for building applications based on chat models like OpenAI or Mistral AI.
  2. https://piotrminkowski.com/2025/01/30/getting-started-with-spring-ai-function-calling: The second tutorial shows Spring AI support for Java function calling with the OpenAI chat model.
  3. https://piotrminkowski.com/2025/02/24/using-rag-and-vector-store-with-spring-ai: The third tutorial shows Spring AI support for RAG (Retrieval Augmented Generation) and vector store.
  4. https://piotrminkowski.com/2025/03/04/spring-ai-with-multimodality-and-images: The fourth tutorial shows Spring AI support for the multimodality feature and image generation.
  5. https://piotrminkowski.com/2025/03/10/using-ollama-with-spring-ai: The fifth tutorial shows Spring AI support for interactions with AI models run with Ollama.
  6. https://piotrminkowski.com/2025/03/13/tool-calling-with-spring-ai: The sixth tutorial shows Spring AI support for the Tool Calling feature.

Source Code

Feel free to use my source code if you’d like to try it out yourself. To do that, you must clone my sample GitHub repository. Then, simply follow my instructions.

Motivation for MCP with Spring AI

MCP introduces an interesting concept for applications interacting with AI models. With MCP, an application can provide specific tools/functions to several other services that need to use the data it exposes. Additionally, it can expose prompt templates and resources. Thanks to that, we don’t need to implement AI tools/functions inside every client service; instead, we integrate clients with the application that exposes tools over MCP.

The best way to analyze the MCP concept is through an example. Let’s consider an application that connects to a database and exposes data through REST endpoints. If we want to use that data in our AI application, we would have to implement and register AI tools that retrieve it by calling those REST endpoints. So, each client-side application that needs data from the source service would have to implement its own set of AI tools locally. This is where the MCP concept comes in. The source service defines and exposes AI tools/functions in a standardized form. All other apps that need to provide data to AI models can load and use the predefined set of tools.
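The idea of centralized tool discovery can be sketched in plain Java. This is not the real MCP protocol or SDK, just a simplified illustration of the pattern: the server owns the tool definitions, and clients only discover and invoke them by name instead of implementing them locally. All class and tool names below are made up:

```java
import java.util.List;
import java.util.Map;
import java.util.function.Function;

public class ToolDiscoverySketch {

    record ToolDescriptor(String name, String description) {}

    // Plays the role of an MCP server: it defines the tools once, near the data.
    static class ToolServer {
        private final Map<String, Function<String, String>> tools = Map.of(
                "getPersonsByNationality", nat -> "persons with nationality " + nat);

        // What a "list tools" request conceptually returns to any client.
        List<ToolDescriptor> listTools() {
            return tools.keySet().stream()
                    .map(n -> new ToolDescriptor(n, "exposed by the source service"))
                    .toList();
        }

        // What a "call tool" request conceptually does.
        String callTool(String name, String arg) {
            return tools.get(name).apply(arg);
        }
    }

    public static void main(String[] args) {
        ToolServer server = new ToolServer();
        // A client discovers the available tools instead of implementing them locally...
        server.listTools().forEach(t -> System.out.println(t.name()));
        // ...and invokes them remotely by name, passing the AI model's arguments.
        System.out.println(server.callTool("getPersonsByNationality", "Germany"));
    }
}
```

In real MCP, the listing and invocation happen over a standardized transport (STDIO or SSE), and the AI model decides which discovered tool to call.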

The following diagram illustrates our scenario. Two Spring Boot applications act as MCP servers. They connect to the database and use Spring AI MCP Server support to expose @Tool methods to the MCP client-side app. The client-side app communicates with the OpenAI model. It includes the tools exposed by the server-side apps in the user query to the AI model. The person-mcp-service app provides @Tool methods for searching persons in the database table. The account-mcp-service is doing the same for the persons’ accounts.

spring-ai-mcp-arch

Build MCP Server App with Spring AI

Let’s begin with the implementation of applications that act as MCP servers. They both run and use an in-memory H2 database. To interact with the database, we include the Spring Data JPA module. Spring AI allows us to switch between three transport types: STDIO, Spring MVC, and Spring WebFlux. The MCP Server with Spring WebFlux supports Server-Sent Events (SSE) and an optional STDIO transport. Here’s a list of required Maven dependencies:

<dependencies>
  <dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-mcp-server-webflux-spring-boot-starter</artifactId>
  </dependency>
  <dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-data-jpa</artifactId>
  </dependency>
  <dependency>
    <groupId>com.h2database</groupId>
    <artifactId>h2</artifactId>
    <scope>runtime</scope>
  </dependency>
</dependencies>
XML

Create the Person MCP Server

Here’s an @Entity class for interacting with the person table:

@Entity
public class Person {

    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;
    private String firstName;
    private String lastName;
    private int age;
    private String nationality;
    @Enumerated(EnumType.STRING)
    private Gender gender;
    
    // ... getters and setters
    
}
Java

The Spring Data Repository interface contains a single method for searching persons by their nationality:

public interface PersonRepository extends CrudRepository<Person, Long> {
    List<Person> findByNationality(String nationality);
}
Java

The PersonTools @Service bean contains two Spring AI @Tool methods. It injects the PersonRepository bean to interact with the H2 database. The getPersonById method returns a single person with a specific ID field, while the getPersonsByNationality returns a list of all persons with a given nationality.

@Service
public class PersonTools {

    private PersonRepository personRepository;

    public PersonTools(PersonRepository personRepository) {
        this.personRepository = personRepository;
    }

    @Tool(description = "Find person by ID")
    public Person getPersonById(
            @ToolParam(description = "Person ID") Long id) {
        return personRepository.findById(id).orElse(null);
    }

    @Tool(description = "Find all persons by nationality")
    public List<Person> getPersonsByNationality(
            @ToolParam(description = "Nationality") String nationality) {
        return personRepository.findByNationality(nationality);
    }
    
}
Java

Once we define @Tool methods, we must register them within the Spring AI MCP server. We can use the ToolCallbackProvider bean for that. More specifically, the MethodToolCallbackProvider class provides a builder that creates an instance of the ToolCallbackProvider class with a list of references to objects with @Tool methods.

@SpringBootApplication
public class PersonMCPServer {

    public static void main(String[] args) {
        SpringApplication.run(PersonMCPServer.class, args);
    }

    @Bean
    public ToolCallbackProvider tools(PersonTools personTools) {
        return MethodToolCallbackProvider.builder()
                .toolObjects(personTools)
                .build();
    }

}
Java

Finally, we must provide configuration properties. The person-mcp-server app will listen on the 8060 port. We should also set the name and version of the MCP server embedded in our application.

spring:
  ai:
    mcp:
      server:
        name: person-mcp-server
        version: 1.0.0
  jpa:
    database-platform: H2
    generate-ddl: true
    hibernate:
      ddl-auto: create-drop

logging.level.org.springframework.ai: DEBUG

server.port: 8060
YAML

That’s all. We can start the application.

$ cd spring-ai-mcp/person-mcp-service
$ mvn spring-boot:run
ShellSession

Create the Account MCP Server

Then, we will do very similar things in the second application that acts as an MCP server. Here’s the @Entity class for interacting with the account table:

@Entity
public class Account {

    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;
    private String number;
    private int balance;
    private Long personId;
    
    // ... getters and setters
    
}
Java

The Spring Data Repository interface contains a single method for searching accounts belonging to a given person:

public interface AccountRepository extends CrudRepository<Account, Long> {
    List<Account> findByPersonId(Long personId);
}
Java

The AccountTools @Service bean contains a single Spring AI @Tool method. It injects the AccountRepository bean to interact with the H2 database. The getAccountsByPersonId method returns a list of accounts owned by the person with a specified ID field value.

@Service
public class AccountTools {

    private AccountRepository accountRepository;

    public AccountTools(AccountRepository accountRepository) {
        this.accountRepository = accountRepository;
    }

    @Tool(description = "Find all accounts by person ID")
    public List<Account> getAccountsByPersonId(
            @ToolParam(description = "Person ID") Long personId) {
        return accountRepository.findByPersonId(personId);
    }
}
Java

Of course, the account-mcp-server application will use ToolCallbackProvider to register @Tool methods defined inside the AccountTools class.

@SpringBootApplication
public class AccountMCPService {

    public static void main(String[] args) {
        SpringApplication.run(AccountMCPService.class, args);
    }

    @Bean
    public ToolCallbackProvider tools(AccountTools accountTools) {
        return MethodToolCallbackProvider.builder()
                .toolObjects(accountTools)
                .build();
    }
    
}
Java

Here are the application configuration properties. The account-mcp-server app will listen on the 8040 port.

spring:
  ai:
    mcp:
      server:
        name: account-mcp-server
        version: 1.0.0
  jpa:
    database-platform: H2
    generate-ddl: true
    hibernate:
      ddl-auto: create-drop

logging.level.org.springframework.ai: DEBUG

server.port: 8040
YAML

Let’s run the second server-side app:

$ cd spring-ai-mcp/account-mcp-service
$ mvn spring-boot:run
ShellSession

Once we start the application, we should see the log indicating how many tools were registered in the MCP server.

spring-ai-mcp-app

Build MCP Client App with Spring AI

Implementation

We will create a single client-side application. However, we can imagine an architecture where many applications consume tools exposed by one MCP server. Our application interacts with the OpenAI chat model, so we must include the Spring AI OpenAI starter. For the MCP Client starter, we can choose between two dependencies: the standard MCP client and the Spring WebFlux client. The Spring team recommends using the WebFlux-based SSE connection via the spring-ai-mcp-client-webflux-spring-boot-starter. Finally, we include the Spring Web starter to expose REST endpoints. However, you could use the Spring WebFlux starter to expose them reactively.

<dependencies>
  <dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-web</artifactId>
  </dependency>
  <dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-mcp-client-webflux-spring-boot-starter</artifactId>
  </dependency>
  <dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-openai-spring-boot-starter</artifactId>
  </dependency>
</dependencies>
XML

Our MCP client connects with two MCP servers. We must provide the following connection settings in the application.yml file.

spring.ai.mcp.client.sse.connections:
  person-mcp-server:
    url: http://localhost:8060
  account-mcp-server:
    url: http://localhost:8040
YAML

Our sample Spring Boot application contains two @RestController classes, which expose HTTP endpoints. The PersonController class defines two endpoints for searching and counting persons by nationality. The MCP Client Boot Starter automatically configures tool callbacks that integrate with Spring AI’s tool execution framework. Thanks to that, we can use the ToolCallbackProvider instance to provide default tools to the ChatClient bean. Then, we can perform the standard steps to interact with the AI model through the Spring AI ChatClient. Under the hood, the client will use the tools exposed by both sample MCP servers.

@RestController
@RequestMapping("/persons")
public class PersonController {

    private final static Logger LOG = LoggerFactory
        .getLogger(PersonController.class);
    private final ChatClient chatClient;

    public PersonController(ChatClient.Builder chatClientBuilder,
                            ToolCallbackProvider tools) {
        this.chatClient = chatClientBuilder
                .defaultTools(tools)
                .build();
    }

    @GetMapping("/nationality/{nationality}")
    String findByNationality(@PathVariable String nationality) {

        PromptTemplate pt = new PromptTemplate("""
                Find persons with {nationality} nationality.
                """);
        Prompt p = pt.create(Map.of("nationality", nationality));
        return this.chatClient.prompt(p)
                .call()
                .content();
    }

    @GetMapping("/count-by-nationality/{nationality}")
    String countByNationality(@PathVariable String nationality) {
        PromptTemplate pt = new PromptTemplate("""
                How many persons come from {nationality} ?
                """);
        Prompt p = pt.create(Map.of("nationality", nationality));
        return this.chatClient.prompt(p)
                .call()
                .content();
    }
}
Java

Let’s switch to the second @RestController. The AccountController class defines two endpoints for searching accounts by person ID. The GET /accounts/count-by-person-id/{personId} returns the number of accounts belonging to a given person. The GET /accounts/balance-by-person-id/{personId} is slightly more complex. It counts the total balance in all person’s accounts. However, it must also return the person’s name and nationality, which means that it must call the getPersonById tool method exposed by the person-mcp-server app after calling the tool for searching accounts by person ID.

@RestController
@RequestMapping("/accounts")
public class AccountController {

    private final static Logger LOG = LoggerFactory.getLogger(PersonController.class);
    private final ChatClient chatClient;

    public AccountController(ChatClient.Builder chatClientBuilder,
                            ToolCallbackProvider tools) {
        this.chatClient = chatClientBuilder
                .defaultTools(tools)
                .build();
    }

    @GetMapping("/count-by-person-id/{personId}")
    String countByPersonId(@PathVariable String personId) {
        PromptTemplate pt = new PromptTemplate("""
                How many accounts has person with {personId} ID ?
                """);
        Prompt p = pt.create(Map.of("personId", personId));
        return this.chatClient.prompt(p)
                .call()
                .content();
    }

    @GetMapping("/balance-by-person-id/{personId}")
    String balanceByPersonId(@PathVariable String personId) {
        PromptTemplate pt = new PromptTemplate("""
                How many accounts has person with {personId} ID ?
                Return person name, nationality and a total balance on his/her accounts.
                """);
        Prompt p = pt.create(Map.of("personId", personId));
        return this.chatClient.prompt(p)
                .call()
                .content();
    }

}
Java

Running the Application

Before starting the client-side app, we must export the OpenAI token as the SPRING_AI_OPENAI_API_KEY environment variable.

$ export SPRING_AI_OPENAI_API_KEY=<YOUR_OPENAI_API_KEY>
ShellSession

Then go to the sample-client directory and run the app with the following command:

$ cd spring-ai-mcp/sample-client
$ mvn spring-boot:run
ShellSession

Once the application starts, we can examine the logs. As you can see, the sample-client app receives the lists of tools from both the person-mcp-server and account-mcp-server apps.

Testing MCP with Spring Boot

Both server-side applications load data from import.sql scripts on startup; Spring Data JPA automatically imports data from such scripts. Our MCP client application listens on port 8080. Let’s call the first endpoint to get a list of persons from Germany:

curl http://localhost:8080/persons/nationality/Germany
ShellSession

Here’s the response from the OpenAI model:

spring-ai-mcp-result

We can also call the endpoint that counts the number of persons with a given nationality.

curl http://localhost:8080/persons/count-by-nationality/Germany
ShellSession

As the final test, we can call the GET /accounts/balance-by-person-id/{personId} endpoint, which interacts with tools exposed by both MCP server-side apps. It requires the AI model to combine data from the person and account sources.
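The chaining the model performs for this endpoint can be sketched in plain Java. The types below are hypothetical simplifications, not the actual classes from the sample repository, and the Function stands in for the getPersonById tool call; in reality the model decides the call order at runtime.

```java
import java.util.List;
import java.util.function.Function;

public class BalanceAggregationSketch {

    // Hypothetical, simplified stand-ins for the MCP tool results.
    record Person(String id, String name, String nationality) {}
    record Account(String personId, double balance) {}
    record BalanceSummary(String name, String nationality, double total) {}

    // Roughly what the model orchestrates: sum the balances returned by the
    // account tool, then call the person tool for the name and nationality.
    static BalanceSummary balanceByPersonId(String personId,
                                            List<Account> accounts,
                                            Function<String, Person> getPersonById) {
        double total = accounts.stream()
                .filter(a -> a.personId().equals(personId))
                .mapToDouble(Account::balance)
                .sum();
        Person p = getPersonById.apply(personId);
        return new BalanceSummary(p.name(), p.nationality(), total);
    }

    public static void main(String[] args) {
        var accounts = List.of(
                new Account("1", 100.0), new Account("1", 250.0), new Account("2", 50.0));
        // Person "1" has two accounts: 100.0 + 250.0
        System.out.println(balanceByPersonId("1", accounts,
                id -> new Person(id, "John Smith", "Germany")));
    }
}
```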

Exposing Prompts with MCP

We can also expose prompts and resources with the Spring AI MCP server support. To register and expose prompts, we need to define a list of SyncPromptRegistration objects. Each registration contains the prompt name, a list of input arguments, and the text content.

@SpringBootApplication
public class PersonMCPServer {

    public static void main(String[] args) {
        SpringApplication.run(PersonMCPServer.class, args);
    }

    @Bean
    public ToolCallbackProvider tools(PersonTools personTools) {
        return MethodToolCallbackProvider.builder()
                .toolObjects(personTools)
                .build();
    }

    @Bean
    public List<McpServerFeatures.SyncPromptRegistration> prompts() {
        var prompt = new McpSchema.Prompt("persons-by-nationality", "Get persons by nationality",
                List.of(new McpSchema.PromptArgument("nationality", "Person nationality", true)));

        var promptRegistration = new McpServerFeatures.SyncPromptRegistration(prompt, getPromptRequest -> {
            String argument = (String) getPromptRequest.arguments().get("nationality");
            var userMessage = new McpSchema.PromptMessage(McpSchema.Role.USER,
                    new McpSchema.TextContent("How many persons come from " + argument + " ?"));
            return new McpSchema.GetPromptResult("Count persons by nationality", List.of(userMessage));
        });

        return List.of(promptRegistration);
    }
}
Java

After startup, the application prints information about a list of registered prompts in the logs.

There is no built-in Spring AI support for loading prompts with the MCP client yet. However, Spring AI MCP support is under active development, so we may expect new features soon. For now, Spring AI provides an auto-configured McpSyncClient instance. We can use it to look up a prompt in the list received from the server. Then, we can prepare a PromptTemplate instance from the registered content and create the Prompt by filling the template with the input parameters.

@RestController
@RequestMapping("/persons")
public class PersonController {

    private final static Logger LOG = LoggerFactory
        .getLogger(PersonController.class);
    private final ChatClient chatClient;
    private final List<McpSyncClient> mcpSyncClients;

    public PersonController(ChatClient.Builder chatClientBuilder,
                            ToolCallbackProvider tools,
                            List<McpSyncClient> mcpSyncClients) {
        this.chatClient = chatClientBuilder
                .defaultTools(tools)
                .build();
        this.mcpSyncClients = mcpSyncClients;
    }

    // ... other endpoints
    
    @GetMapping("/count-by-nationality-from-client/{nationality}")
    String countByNationalityFromClient(@PathVariable String nationality) {
        return this.chatClient
                .prompt(loadPromptByName("persons-by-nationality", nationality))
                .call()
                .content();
    }

    Prompt loadPromptByName(String name, String nationality) {
        McpSchema.GetPromptRequest r = new McpSchema
            .GetPromptRequest(name, Map.of("nationality", nationality));
        var client = mcpSyncClients.stream()
                .filter(c -> c.getServerInfo().name().equals("person-mcp-server"))
                .findFirst();
        if (client.isPresent()) {
            var content = (McpSchema.TextContent) client.get() 
                .getPrompt(r)
                .messages()
                .getFirst()
                .content();
            PromptTemplate pt = new PromptTemplate(content.text());
            Prompt p = pt.create(Map.of("nationality", nationality));
            LOG.info("Prompt: {}", p);
            return p;
        } else return null;
    }
}
Java

Final Thoughts

Model Context Protocol is an important initiative in the AI world. It allows us to avoid reinventing the wheel for each new data source. A unified protocol streamlines integration, minimizing development time and complexity. As businesses expand their AI toolsets, MCP enables seamless connectivity across multiple systems without the burden of excessive custom code. Spring AI introduced the initial version of MCP support recently. It seems promising. With Spring AI Client and Server starters, we may implement a distributed architecture, where several different apps use the AI tools exposed by a single service.

The post Using Model Context Protocol (MCP) with Spring AI appeared first on Piotr's TechBlog.

]]>
https://piotrminkowski.com/2025/03/17/using-model-context-protocol-mcp-with-spring-ai/feed/ 18 15608
Tool Calling with Spring AI https://piotrminkowski.com/2025/03/13/tool-calling-with-spring-ai/ https://piotrminkowski.com/2025/03/13/tool-calling-with-spring-ai/#comments Thu, 13 Mar 2025 15:55:40 +0000 https://piotrminkowski.com/?p=15596 This article will show you how to use Spring AI support with the most popular AI models for the tool calling feature. Tool calling (or function calling), is a common pattern in AI applications that enables a model to interact with APIs or tools, extending its capabilities. The most popular AI models are trained to […]

The post Tool Calling with Spring AI appeared first on Piotr's TechBlog.

]]>
This article will show you how to use Spring AI support for the tool calling feature with the most popular AI models. Tool calling (or function calling) is a common pattern in AI applications that enables a model to interact with APIs or tools, extending its capabilities. The most popular AI models are trained to know when to call a function. Spring AI formerly supported this through the Function Calling API, which has been deprecated and marked for removal in the next release. My previous article described that feature based on interactions with an internal database and an external stock market API. Today, we will consider the same use case. This time, however, we will replace the deprecated Function Calling API with the new tool calling feature.

This is the sixth part of my series of articles about Spring Boot and AI. It is worth reading the following posts before proceeding with the current one. Please pay special attention to the second article. I will refer to it often in this article.

  1. https://piotrminkowski.com/2025/01/28/getting-started-with-spring-ai-and-chat-model: The first tutorial introduces the Spring AI project and its support for building applications based on chat models like OpenAI or Mistral AI.
  2. https://piotrminkowski.com/2025/01/30/getting-started-with-spring-ai-function-calling: The second tutorial shows Spring AI support for Java function calling with the OpenAI chat model.
  3. https://piotrminkowski.com/2025/02/24/using-rag-and-vector-store-with-spring-ai: The third tutorial shows Spring AI support for RAG (Retrieval Augmented Generation) and vector store.
  4. https://piotrminkowski.com/2025/03/04/spring-ai-with-multimodality-and-images: The fourth tutorial shows Spring AI support for the multimodality feature and image generation.
  5. https://piotrminkowski.com/2025/03/10/using-ollama-with-spring-ai: The fifth tutorial shows Spring AI support for interacting with AI models run with Ollama.

Source Code

Feel free to use my source code if you’d like to try it out yourself. To do that, you must clone my sample GitHub repository. Then you should only follow my instructions.

Motivation for Tool Calling in Spring AI

The tool calling feature helps us solve a common AI model challenge related to internal or live data sources. If we want to augment a model with such data, our applications must allow it to interact with a set of APIs or tools. In our case, the internal database (H2) contains information about the structure of our stock wallet. The sample Spring Boot application asks an AI model about the total value of the wallet based on daily stock prices, or about the day with the highest value over the last few days. The model must retrieve the structure of our stock wallet and the latest stock prices. We will do the same exercise as for the function calling feature, enhanced with additional scenarios I’ll describe later.

Use the Calling Tools Feature in Spring AI

Create WalletTools

Let’s begin with the WalletTools implementation, which is responsible for interaction with a database. We can compare it to the previous implementation based on Spring functions, available in the pl.piomin.services.functions.stock.WalletService class. It defines a single method annotated with @Tool. The important element is a precise description that tells the model what the method does. The method returns the number of shares for each company in our portfolio, retrieved from the database through a Spring Data @Repository.

public class WalletTools {

    private WalletRepository walletRepository;

    public WalletTools(WalletRepository walletRepository) {
        this.walletRepository = walletRepository;
    }

    @Tool(description = "Number of shares for each company in my wallet")
    public List<Share> getNumberOfShares() {
        return (List<Share>) walletRepository.findAll();
    }
}
Java

We can register the WalletTools class as a Spring @Bean in the application main class.

@Bean
public WalletTools walletTools(WalletRepository walletRepository) {
   return new WalletTools(walletRepository);
}
Java

The Spring Boot application launches an embedded, in-memory database and inserts test data into the stock table. Our wallet contains the most popular companies on the U.S. stock market, including Amazon, Meta, and Microsoft.

insert into share(id, company, quantity) values (1, 'AAPL', 100);
insert into share(id, company, quantity) values (2, 'AMZN', 300);
insert into share(id, company, quantity) values (3, 'META', 300);
insert into share(id, company, quantity) values (4, 'MSFT', 400);
insert into share(id, company, quantity) values (5, 'NVDA', 200);
SQL

Create StockTools

The StockTools class is responsible for interaction with the Twelve Data stock API. It defines two methods. The getLatestStockPrices method returns only the latest close price for a specified company. It is the tool calling version of the method provided by the pl.piomin.services.functions.stock.StockService function. The second method is more complicated. It must return historical daily close prices for a defined number of days. Each price must be correlated with a quotation date.

public class StockTools {

    private static final Logger LOG = LoggerFactory.getLogger(StockTools.class);

    private RestTemplate restTemplate;
    @Value("${STOCK_API_KEY:none}")
    String apiKey;

    public StockTools(RestTemplate restTemplate) {
        this.restTemplate = restTemplate;
    }

    @Tool(description = "Latest stock prices")
    public StockResponse getLatestStockPrices(@ToolParam(description = "Name of company") String company) {
        StockData data = restTemplate.getForObject("https://api.twelvedata.com/time_series?symbol={0}&interval=1min&outputsize=1&apikey={1}",
                StockData.class,
                company,
                apiKey);
        DailyStockData latestData = data.getValues().get(0);
        LOG.info("Get stock prices: {} -> {}", company, latestData.getClose());
        return new StockResponse(Float.parseFloat(latestData.getClose()));
    }

    @Tool(description = "Historical daily stock prices")
    public List<DailyShareQuote> getHistoricalStockPrices(@ToolParam(description = "Search period in days") int days,
                                                          @ToolParam(description = "Name of company") String company) {
        StockData data = restTemplate.getForObject("https://api.twelvedata.com/time_series?symbol={0}&interval=1day&outputsize={1}&apikey={2}",
                StockData.class,
                company,
                days,
                apiKey);
        return data.getValues().stream()
                .map(d -> new DailyShareQuote(company, Float.parseFloat(d.getClose()), d.getDatetime()))
                .toList();
    }
}
Java

Here’s the DailyShareQuote Java record returned in the response list.

public record DailyShareQuote(String company, float price, String datetime) {
}
Java
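The StockResponse record returned by getLatestStockPrices is not shown in this excerpt. A minimal sketch, assuming it simply wraps the latest close price (the actual record lives in the sample repository), could look like this:

```java
// Hypothetical reconstruction of the StockResponse record used by getLatestStockPrices.
public record StockResponse(float price) {
}
```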

Then, let’s register the StockTools class as a Spring @Bean.

@Bean
public StockTools stockTools() {
   return new StockTools(restTemplate());
}
Java

Spring AI Tool Calling Flow

Here’s a fragment of the WalletController code, which is responsible for defining interactions with LLM and HTTP endpoints implementation. It injects both StockTools and WalletTools beans.

@RestController
@RequestMapping("/wallet")
public class WalletController {

    private final ChatClient chatClient;
    private final StockTools stockTools;
    private final WalletTools walletTools;

    public WalletController(ChatClient.Builder chatClientBuilder,
                            StockTools stockTools,
                            WalletTools walletTools) {
        this.chatClient = chatClientBuilder
                .defaultAdvisors(new SimpleLoggerAdvisor())
                .build();
        this.stockTools = stockTools;
        this.walletTools = walletTools;
    }
    
    // HTTP endpoints implementation
}
Java

The GET /wallet/with-tools endpoint calculates the value of our stock wallet in dollars. It uses the latest daily stock prices for each company’s shares in the wallet. There are a few ways to register tools for a chat model call. We use the tools method provided by the ChatClient interface, which allows us to pass tool object references directly to the chat client. In this case, we register both the StockTools and WalletTools beans. The AI model must choose the right @Tool method to call based on its description and input arguments. Here it should call the getLatestStockPrices method.

@GetMapping("/with-tools")
String calculateWalletValueWithTools() {
   PromptTemplate pt = new PromptTemplate("""
   What’s the current value in dollars of my wallet based on the latest stock daily prices ?
   """);

   return this.chatClient.prompt(pt.create())
           .tools(stockTools, walletTools)
           .call()
           .content();
}
Java

The GET /wallet/highest-day/{days} endpoint calculates the value of our stock wallet in dollars for each day in the period determined by the days variable. Then it must return the day with the highest stock wallet value. As before, we use the tools method from ChatClient to register our tool calling methods. This time the model should call the getHistoricalStockPrices method.

@GetMapping("/highest-day/{days}")
String calculateHighestWalletValue(@PathVariable int days) {
   PromptTemplate pt = new PromptTemplate("""
   On which day during last {days} days my wallet had the highest value in dollars based on the historical daily stock prices ?
   """);

   return this.chatClient.prompt(pt.create(Map.of("days", days)))
            .tools(stockTools, walletTools)
            .call()
            .content();
}
Java

The following diagram illustrates the flow for the second use case, which returns the day with the highest stock wallet value. First, the model must connect to the database and retrieve the stock wallet structure containing the number of shares for each company. Then, it must call the stock API for every company found in the wallet. Consequently, the getHistoricalStockPrices tool should be called five times, with different values of the company @ToolParam and the days value determined by the HTTP endpoint path variable. Once all the data is collected, the AI model calculates the highest wallet value and returns it together with the quotation date.
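Once the tool results are collected, the model effectively performs the aggregation below. This plain-Java sketch uses hypothetical in-memory data in place of the real database and stock API calls:

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class HighestDaySketch {

    // Simplified stand-ins for the wallet entity and the tool result record.
    record Share(String company, int quantity) {}
    record DailyShareQuote(String company, float price, String datetime) {}

    // For each day, sum quantity * close price over all companies,
    // then pick the day with the highest total wallet value.
    static Map.Entry<String, Double> highestDay(List<Share> wallet,
                                                List<DailyShareQuote> quotes) {
        Map<String, Integer> quantities = wallet.stream()
                .collect(Collectors.toMap(Share::company, Share::quantity));
        return quotes.stream()
                .collect(Collectors.groupingBy(DailyShareQuote::datetime,
                        Collectors.summingDouble(q -> q.price() * quantities.get(q.company()))))
                .entrySet().stream()
                .max(Map.Entry.comparingByValue())
                .orElseThrow();
    }

    public static void main(String[] args) {
        var wallet = List.of(new Share("AAPL", 100), new Share("MSFT", 400));
        var quotes = List.of(
                new DailyShareQuote("AAPL", 230.0f, "2025-02-25"),
                new DailyShareQuote("MSFT", 400.0f, "2025-02-25"),
                new DailyShareQuote("AAPL", 240.0f, "2025-02-26"),
                new DailyShareQuote("MSFT", 410.0f, "2025-02-26"));
        System.out.println(highestDay(wallet, quotes)); // day with the highest total value
    }
}
```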

spring-ai-tool-calling-arch

Run Application and Verify Tool Calling

Before starting the application, we must set environment variables with the AI model and stock API tokens.

export OPEN_AI_TOKEN=<YOUR_OPEN_AI_TOKEN>
export STOCK_API_KEY=<YOUR_STOCK_API_KEY>
ShellSession

Then run the following Maven command:

mvn spring-boot:run
ShellSession

Once the application is started, we can call the first endpoint. The GET /wallet/with-tools endpoint calculates the total value of the stock wallet whose structure is stored in the database.

curl http://localhost:8080/wallet/with-tools
ShellSession

Here’s a fragment of the logs generated by the Spring AI @Tool methods. The model behaves as expected. First, it calls the getNumberOfShares tool to retrieve the wallet structure. Then it calls the getLatestStockPrices tool for each share to obtain its current price.

spring-ai-tool-calling-logs

Here’s the final response with the wallet value and a detailed explanation.

Then we can call the GET /wallet/highest-day/{days} endpoint to return the day with the highest wallet value. Let’s calculate it for the last 20 days.

curl http://localhost:8080/wallet/highest-day/20
ShellSession

The response is very detailed. Here’s the final part of the content returned by the OpenAI chat model. It returns 26.02.2025 as the day with the highest wallet value. Frankly, sometimes it returns different answers…

spring-ai-tool-calling-chat-response

However, the AI flow works fine. First, it calls the getNumberOfShares tool to retrieve the wallet structure. Then it calls the getHistoricalStockPrices tool for each share to obtain its prices for the last 20 days.

We can switch to another AI model to compare responses. For example, you can connect the sample Spring Boot application with Mistral AI by activating the mistral-ai Maven profile. Before running the app, we must export the Mistral API token.

export MISTRAL_AI_TOKEN=<YOUR_MISTRAL_AI_TOKEN>
ShellSession

Then run the following Maven command:

mvn spring-boot:run -Pmistral-ai
ShellSession

To get the best results, I changed the Mistral model to mistral-large-latest.

spring.ai.mistralai.chat.options.model = mistral-large-latest
Properties

The response from Mistral AI was pretty quick and short:

Final Thoughts

In this article, we analyzed Spring AI’s support for tool calling, which replaces the Function Calling API. Tool calling is a powerful feature that enhances how AI models interact with external tools, APIs, and structured data. It makes AI more interactive and practical for real-world applications. Spring AI provides a flexible way to register and invoke such tools. However, it still requires attention from developers, who need to define clear function schemas and handle edge cases.

The post Tool Calling with Spring AI appeared first on Piotr's TechBlog.

]]>
https://piotrminkowski.com/2025/03/13/tool-calling-with-spring-ai/feed/ 2 15596
Spring AI with Multimodality and Images https://piotrminkowski.com/2025/03/04/spring-ai-with-multimodality-and-images/ https://piotrminkowski.com/2025/03/04/spring-ai-with-multimodality-and-images/#respond Tue, 04 Mar 2025 08:56:24 +0000 https://piotrminkowski.com/?p=15557 This article will teach you how to create a Spring Boot application that handles images and text using the Spring AI multimodality feature. Multimodality is the ability to understand and process information from different sources simultaneously. It covers text, images, audio, and other data formats. We will perform simple experiments with multimodality and images. This […]

The post Spring AI with Multimodality and Images appeared first on Piotr's TechBlog.

]]>
This article will teach you how to create a Spring Boot application that handles images and text using the Spring AI multimodality feature. Multimodality is the ability to understand and process information from different sources simultaneously. It covers text, images, audio, and other data formats. We will perform simple experiments with multimodality and images. This is the fourth part of my series of articles about Spring Boot and AI. It is worth reading the following posts before proceeding with the current one:

  1. https://piotrminkowski.com/2025/01/28/getting-started-with-spring-ai-and-chat-model: The first tutorial introduces the Spring AI project and its support for building applications based on chat models like OpenAI or Mistral AI.
  2. https://piotrminkowski.com/2025/01/30/getting-started-with-spring-ai-function-calling: The second tutorial shows Spring AI support for Java function calling with the OpenAI chat model.
  3. https://piotrminkowski.com/2025/02/24/using-rag-and-vector-store-with-spring-ai: The third tutorial shows Spring AI support for RAG (Retrieval Augmented Generation) and vector store.

Source Code

Feel free to use my source code if you’d like to try it out yourself. To do that, you must clone my sample GitHub repository. Then you should only follow my instructions.

Motivation for Multimodality with Spring AI

Multimodal large language model (LLM) capabilities allow a model to process and generate text alongside other modalities, including images, audio, and video. This feature covers use cases where we want the LLM to detect something specific inside an image or describe its content. Let’s assume we have a list of input images. We want to find the image in that list that matches our description. For example, the description can ask the model to find the image that contains a specified item. The Spring AI Message API provides all the necessary elements to support multimodal LLMs. Here’s a diagram that illustrates our scenario.

Use Multimodality with Spring AI

We don’t need to include any specific library other than the Spring AI starter for a particular AI model. The default option is spring-ai-openai-spring-boot-starter. Our application uses images stored in the src/main/resources/images directory. Spring AI multimodality support requires the image to be passed inside the Media object. We load all the pictures from the classpath inside the constructor.

Recognize Items in the Image

The GET /images/find/{object} endpoint tries to find the image containing the item determined by the object path variable. The AI model must return the image's position in the input list. To achieve that, we create a UserMessage object that contains the user query and the list of Media objects. Once the model returns the position, the endpoint reads the image from the list and returns its content in the image/png format.

@RestController
@RequestMapping("/images")
public class ImageController {

    private final static Logger LOG = LoggerFactory
        .getLogger(ImageController.class);

    private final ChatClient chatClient;
    private List<Media> images;
    private List<Media> dynamicImages = new ArrayList<>();

    public ImageController(ChatClient.Builder chatClientBuilder) {
        this.chatClient = chatClientBuilder
                .defaultAdvisors(new SimpleLoggerAdvisor())
                .build();
        this.images = List.of(
                Media.builder().id("fruits").mimeType(MimeTypeUtils.IMAGE_PNG).data(new ClassPathResource("images/fruits.png")).build(),
                Media.builder().id("fruits-2").mimeType(MimeTypeUtils.IMAGE_PNG).data(new ClassPathResource("images/fruits-2.png")).build(),
                Media.builder().id("fruits-3").mimeType(MimeTypeUtils.IMAGE_PNG).data(new ClassPathResource("images/fruits-3.png")).build(),
                Media.builder().id("fruits-4").mimeType(MimeTypeUtils.IMAGE_PNG).data(new ClassPathResource("images/fruits-4.png")).build(),
                Media.builder().id("fruits-5").mimeType(MimeTypeUtils.IMAGE_PNG).data(new ClassPathResource("images/fruits-5.png")).build(),
                Media.builder().id("animals").mimeType(MimeTypeUtils.IMAGE_PNG).data(new ClassPathResource("images/animals.png")).build(),
                Media.builder().id("animals-2").mimeType(MimeTypeUtils.IMAGE_PNG).data(new ClassPathResource("images/animals-2.png")).build(),
                Media.builder().id("animals-3").mimeType(MimeTypeUtils.IMAGE_PNG).data(new ClassPathResource("images/animals-3.png")).build(),
                Media.builder().id("animals-4").mimeType(MimeTypeUtils.IMAGE_PNG).data(new ClassPathResource("images/animals-4.png")).build(),
                Media.builder().id("animals-5").mimeType(MimeTypeUtils.IMAGE_PNG).data(new ClassPathResource("images/animals-5.png")).build()
        );
    }

    @GetMapping(value = "/find/{object}", produces = MediaType.IMAGE_PNG_VALUE)
    @ResponseBody byte[] analyze(@PathVariable String object) {
        String msg = """
        Which picture contains %s.
        Return only a single picture.
        Return only the number that indicates its position in the media list.
        """.formatted(object);
        LOG.info(msg);

        UserMessage um = new UserMessage(msg, images);

        String content = this.chatClient.prompt(new Prompt(um))
                .call()
                .content();

        assert content != null;
        return images.get(Integer.parseInt(content)-1).getDataAsByteArray();
    }

}
Java

Let’s make a test call. We will look for the picture containing a banana. Here’s the AI model response after calling http://localhost:8080/images/find/banana. You can make other test calls to find an image containing, e.g., an orange or a tomato.
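Note that the endpoint assumes the model replies with a bare number, so Integer.parseInt fails if the model adds any surrounding text. A small defensive helper (hypothetical, not part of the original code) could extract the first integer from the reply instead:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ModelReplyParser {

    // Returns the first integer found in the model's reply, or -1 if none is present.
    static int firstInt(String content) {
        Matcher m = Pattern.compile("\\d+").matcher(content);
        return m.find() ? Integer.parseInt(m.group()) : -1;
    }

    public static void main(String[] args) {
        System.out.println(firstInt("3"));                        // 3
        System.out.println(firstInt("The picture number is 3.")); // 3
        System.out.println(firstInt("no match"));                 // -1
    }
}
```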

spring-ai-multimodality-find-object

Describe Image Contents

On the other hand, we can ask the AI model to generate a short description of all the images included as Media content. The GET /images/describe endpoint merges the two lists of images.

@GetMapping("/describe")
String[] describe() {
   UserMessage um = new UserMessage("Explain what do you see on each image.",
            List.copyOf(Stream.concat(images.stream(), dynamicImages.stream()).toList()));
      return this.chatClient.prompt(new Prompt(um))
              .call()
              .entity(String[].class);
}
Java

Once we call the http://localhost:8080/images/describe URL, we will receive a compact description of all input images. The two highlighted descriptions were generated for images from the dynamicImages list. These images were produced by the AI image model, which we will discuss in the next section.

spring-ai-multimodality-image-desc

Generate Images with AI Model

To generate an image using the AI API, we must inject the ImageModel bean. It provides a single call method that allows us to communicate with AI models dedicated to image generation. This method takes an ImagePrompt object as an argument. Typically, we use the ImagePrompt constructor that takes the instructions for image generation and options customizing the height, width, and number of images. We will generate a single (N=1) image 1024 pixels in height and width. The AI model returns the image URL (responseFormat). Once the image is generated, we create a UrlResource object, build the Media object, and put it into the dynamicImages list. The GET /images/generate/{object} endpoint returns a byte array representation of the generated image.

@RestController
@RequestMapping("/images")
public class ImageController {

    private final ChatClient chatClient;
    private final ImageModel imageModel;
    private List<Media> images;
    private List<Media> dynamicImages = new ArrayList<>();
    
    public ImageController(ChatClient.Builder chatClientBuilder,
                           ImageModel imageModel) {
        this.chatClient = chatClientBuilder
                .defaultAdvisors(new SimpleLoggerAdvisor())
                .build();
        this.imageModel = imageModel;
        // other initializations
    }
    
    @GetMapping(value = "/generate/{object}", produces = MediaType.IMAGE_PNG_VALUE)
    byte[] generate(@PathVariable String object) throws IOException {
        ImageResponse ir = imageModel.call(new ImagePrompt("Generate an image with " + object, ImageOptionsBuilder.builder()
                .height(1024)
                .width(1024)
                .N(1)
                .responseFormat("url")
                .build()));
        UrlResource url = new UrlResource(ir.getResult().getOutput().getUrl());
        LOG.info("Generated URL: {}", ir.getResult().getOutput().getUrl());
        dynamicImages.add(Media.builder()
                .id(UUID.randomUUID().toString())
                .mimeType(MimeTypeUtils.IMAGE_PNG)
                .data(url)
                .build());
        return url.getContentAsByteArray();
    }
    
}
Java

Do you remember the description of that image returned by the GET /images/describe endpoint? Here’s our image with strawberry generated by the AI model after calling the http://localhost:8080/images/generate/strawberry URL.

Here’s a similar test for the banana input parameter.

Use Vector Store with Spring AI Multimodality

Let’s consider how we can leverage a vector store in our scenario. We cannot insert image representations directly into a vector store, since most popular vendors, like OpenAI or Mistral AI, do not provide image embedding models. We could integrate directly with a model like clip-vit-base-patch32 to generate image embeddings, but this article won’t cover such a scenario. Instead, the vector store may contain an image description and its location (or name). The GET /images/load endpoint provides a method for loading image descriptions into a vector store. It uses Spring AI multimodality support to generate a compact description of each image in the input list and then puts it into the store.

    @GetMapping("/load")
    void load() throws JsonProcessingException {
        String msg = """
        Explain what do you see on the image.
        Generate a compact description that explains only what is visible.
        """;
        for (Media image : images) {
            UserMessage um = new UserMessage(msg, image);
            String content = this.chatClient.prompt(new Prompt(um))
                    .call()
                    .content();

            var doc = Document.builder()
                    .id(image.getId())
                    .text(mapper.writeValueAsString(new ImageDescription(image.getId(), content)))
                    .build();
            store.add(List.of(doc));
            LOG.info("Document added: {}", image.getId());
        }
    }
Java
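The ImageDescription object serialized into the document text is not shown in this excerpt. A minimal sketch, assuming it simply pairs the image id with the generated description (the actual class lives in the sample repository), could look like this:

```java
// Hypothetical reconstruction of the record serialized with Jackson in the load() method.
public record ImageDescription(String id, String description) {
}
```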

Finally, we can implement another endpoint that generates a new image and asks the AI model to generate an image description. Then, it performs a similarity search in a vector store to find the most similar image based on its text description.

    @GetMapping("/generate-and-match/{object}")
    List<Document> generateAndMatch(@PathVariable String object) throws IOException {
        ImageResponse ir = imageModel.call(new ImagePrompt("Generate an image with " + object, ImageOptionsBuilder.builder()
                .height(1024)
                .width(1024)
                .N(1)
                .responseFormat("url")
                .build()));
        UrlResource url = new UrlResource(ir.getResult().getOutput().getUrl());
        LOG.info("URL: {}", ir.getResult().getOutput().getUrl());

        String msg = """
        Explain what do you see on the image.
        Generate a compact description that explains only what is visible.
        """;

        UserMessage um = new UserMessage(msg, new Media(MimeTypeUtils.IMAGE_PNG, url));
        String content = this.chatClient.prompt(new Prompt(um))
                .call()
                .content();

        SearchRequest searchRequest = SearchRequest.builder()
                .query("Find the most similar description to this: " + content)
                .topK(2)
                .build();

        return store.similaritySearch(searchRequest);
    }
Java

Let’s test the GET /images/generate-and-match/{object} endpoint using the pineapple parameter. It returns the description of the fruits.png image from the classpath.

spring-ai-multimodality-vector-store

By the way, here’s the fruits.png image located in the /src/main/resources/images directory.

Final Thoughts

Spring AI provides multimodality and image generation support. All the features presented in this article work fine with OpenAI. It supports both the image model and multimodality. To read more about the support offered by other models, refer to the Spring AI chat and image model docs.

This article shows how we can use Spring AI and AI models to interact with images in various ways.

The post Spring AI with Multimodality and Images appeared first on Piotr's TechBlog.

Using RAG and Vector Store with Spring AI https://piotrminkowski.com/2025/02/24/using-rag-and-vector-store-with-spring-ai/ https://piotrminkowski.com/2025/02/24/using-rag-and-vector-store-with-spring-ai/#comments Mon, 24 Feb 2025 14:29:03 +0000 https://piotrminkowski.com/?p=15538 This article will teach you how to create a Spring Boot application that uses RAG (Retrieval Augmented Generation) and vector store with Spring AI. We will continue experiments with stock data, which were initiated in my previous article about Spring AI. This is the third part of my series of articles about Spring Boot and […]

The post Using RAG and Vector Store with Spring AI appeared first on Piotr's TechBlog.

This article will teach you how to create a Spring Boot application that uses RAG (Retrieval Augmented Generation) and vector store with Spring AI. We will continue experiments with stock data, which were initiated in my previous article about Spring AI. This is the third part of my series of articles about Spring Boot and AI. It is worth reading the following posts before proceeding with the current one:

  1. https://piotrminkowski.com/2025/01/28/getting-started-with-spring-ai-and-chat-model: The first tutorial introduces the Spring AI project and its support for building applications based on chat models like OpenAI or Mistral AI.
  2. https://piotrminkowski.com/2025/01/30/getting-started-with-spring-ai-function-calling: The second tutorial shows Spring AI support for Java function calling with the OpenAI chat model.

This article will show how to include one of the vector stores supported by Spring AI, together with the advisors dedicated to RAG support, in the sample application codebase used in the two previous articles. It will connect to the OpenAI API, but you can easily switch to other models using Mistral AI or Ollama support in Spring AI. For more details, please refer to my first article.

Source Code

If you would like to try it by yourself, you may always take a look at my source code. To do that, you must clone my sample GitHub repository. Then you should only follow my instructions.

Motivation for RAG with Spring AI

The problem to solve is similar to the one described in my previous article about the Spring AI function calling feature. Since the OpenAI model is trained on a static dataset, it does not have direct access to online services or APIs. We want it to analyze stock growth trends for the biggest companies in the US stock market. Therefore, we must obtain share prices from a public API that returns live stock market data. Then, we can store this data in our local database and integrate it with the sample Spring Boot AI application. Instead of a typical relational database, we will use a vector store. In vector databases, queries work differently from those in traditional relational databases: instead of looking for exact matches, they conduct similarity searches, retrieving the vectors most similar to a given input vector.
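To make the idea of a similarity search concrete, here is a toy illustration (not the Pinecone implementation): a vector store ranks documents by how close their embedding vectors are to the query vector, commonly using cosine similarity.

```java
// Toy cosine-similarity ranking: the document whose vector points in a
// direction closer to the query vector scores higher.
class CosineSimilarity {

    static double cosine(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        double[] query = {0.9, 0.1, 0.0};
        double[] doc1 = {0.8, 0.2, 0.0};  // similar direction to the query
        double[] doc2 = {0.0, 0.9, 0.4};  // mostly orthogonal to the query
        // doc1 scores higher, so a similarity search would return it first
        System.out.println(cosine(query, doc1) > cosine(query, doc2)); // true
    }
}
```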

After loading all required data into a vector database, we must integrate it with the AI model. Spring AI provides a comfortable mechanism for that based on the Advisors API. We have already used some built-in advisors in the previous examples, e.g. to print detailed AI communication logs or enable chat memory. This time they will allow us to implement a Retrieval Augmented Generation (RAG) technique for our app. Thanks to that, the Spring Boot app will retrieve similar documents that best match a user query before sending a request to the AI model. These documents provide context for the query and are sent to the AI model alongside the user’s question.

Here’s a simplified visualization of our process.

spring-ai-rag-arch

Vector Store with Spring AI

Set up Pinecone Database

In this section, we will prepare a vector store, integrate it with our Spring Boot application, and load some data there. Spring AI supports various vector databases. It provides the VectorStore interface to directly interact with a vector store from our Spring Boot app. The full list of supported databases can be found in the Spring AI docs here.

We will proceed with the Pinecone database. It is a popular cloud-based vector database that allows us to store and search vectors efficiently. Instead of a cloud-based database, we could set up a local instance of another popular vector store, ChromaDB. In that case, you can use the docker-compose.yml file in the repository root directory to run that database with the docker compose up command. With Pinecone, we need to sign up for an account on their portal. Then we should create an index. There are several customizations available, but the most important thing is to choose the right embedding model. Since text-embedding-ada-002 is the default embedding model for OpenAI, we should choose that option. The name of our index is spring-ai. We can read the environment and project names from the generated host URL.

spring-ai-rag-pinecone

After creating an index, we should generate an API key.

Then, we will copy the generated key and export it as the PINECONE_TOKEN environment variable.

export PINECONE_TOKEN=<YOUR_PINECONE_APIKEY>
ShellSession

Integrate Spring Boot app with Pinecone using Spring AI

Our Spring Boot application must include the spring-ai-pinecone-store-spring-boot-starter dependency to smoothly integrate with the Pinecone vector store.

<dependency>
  <groupId>org.springframework.ai</groupId>
  <artifactId>spring-ai-pinecone-store-spring-boot-starter</artifactId>
</dependency>
XML

Then, we must provide connection settings and credentials for the Pinecone database in the Spring Boot application.properties file. We need at least the Pinecone API key, environment name, project name, and index name.

spring.ai.vectorstore.pinecone.apiKey = ${PINECONE_TOKEN}
spring.ai.vectorstore.pinecone.environment = aped-4627-b74a
spring.ai.vectorstore.pinecone.projectId = fsbak04
spring.ai.vectorstore.pinecone.index-name = spring-ai
Properties

After providing all the required configuration settings, we can inject and use the VectorStore bean, e.g. in our application REST controller. In the following code fragment, we load input data into a vector store and perform a simple similarity search to find the strongest growth trend. We query the Twelvedata API individually for each company on the list and deserialize the response into a StockData object. Then we create a Spring AI Document object, which contains the name of a company and its share closing prices for the last 10 days. The data is written in JSON format.

@RestController
@RequestMapping("/stocks")
public class StockController {

    private final ObjectMapper mapper = new ObjectMapper();
    private final static Logger LOG = LoggerFactory.getLogger(StockController.class);
    private final RestTemplate restTemplate;
    private final VectorStore store;

    @Value("${STOCK_API_KEY}")
    private String apiKey;

    public StockController(VectorStore store,
                           RestTemplate restTemplate) {
        this.store = store;
        this.restTemplate = restTemplate;
    }

    @PostMapping("/load-data")
    void load() throws JsonProcessingException {
        final List<String> companies = List.of("AAPL", "MSFT", "GOOG", "AMZN", "META", "NVDA");
        for (String company : companies) {
            StockData data = restTemplate.getForObject("https://api.twelvedata.com/time_series?symbol={0}&interval=1day&outputsize=10&apikey={1}",
                    StockData.class,
                    company,
                    apiKey);
            if (data != null && data.getValues() != null) {
                var list = data.getValues().stream().map(DailyStockData::getClose).toList();
                var doc = Document.builder()
                        .id(company)
                        .text(mapper.writeValueAsString(new Stock(company, list)))
                        .build();
                store.add(List.of(doc));
                LOG.info("Document added: {}", company);
            }
        }
    }
    
    @GetMapping("/docs")
    List<Document> query() {
        SearchRequest searchRequest = SearchRequest.builder()
                .query("Find the most growth trends")
                .topK(2)
                .build();
        List<Document> docs = store.similaritySearch(searchRequest);
        return docs;
    }
    
}
Java
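The StockData, DailyStockData, and Stock classes used in the controller are not listed in the article. Minimal shapes consistent with the calls above (getValues(), getClose(), and the Stock(company, prices) constructor) might look like the following sketch; the real classes in the sample repository likely map more fields of the Twelvedata response.

```java
import java.util.List;

// Hypothetical minimal mapping of the Twelvedata time-series response;
// the actual classes in the sample repository may contain more fields.
class DailyStockData {
    private String close;                      // closing price for a single day
    public String getClose() { return close; }
    public void setClose(String close) { this.close = close; }
}

class StockData {
    private List<DailyStockData> values;       // one entry per day
    public List<DailyStockData> getValues() { return values; }
    public void setValues(List<DailyStockData> values) { this.values = values; }
}

// Document payload: company symbol plus its list of closing prices.
record Stock(String company, List<String> prices) { }
```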

Once we start our application and call the POST /stocks/load-data endpoint, we should see 6 records loaded into the target store. You can verify the content of the database in the Pinecone index browser.

spring-ai-rag-vector-store-data

Then we can interact directly with a vector store by calling the GET /stocks/docs endpoint.

curl http://localhost:8080/stocks/docs
ShellSession

Implement RAG with Spring AI

Use QuestionAnswerAdvisor

Previously, we loaded data into the target vector store and performed a simple search to find the strongest growth trend. Our main goal in this section is to incorporate relevant data into the AI model prompt. We can implement RAG with Spring AI in two ways, using different advisors. Let’s begin with QuestionAnswerAdvisor. To perform RAG, we must provide an instance of QuestionAnswerAdvisor to the ChatClient bean. The QuestionAnswerAdvisor constructor takes the VectorStore instance as an input argument.

@RequestMapping("/v1/most-growth-trend")
String getBestTrend() {
   PromptTemplate pt = new PromptTemplate("""
            {query}.
            Which {target} is the most % growth?
            The 0 element in the prices table is the latest price, while the last element is the oldest price.
            """);

   Prompt p = pt.create(
            Map.of("query", "Find the most growth trends",
                   "target", "share")
   );

   return this.chatClient.prompt(p)
            .advisors(new QuestionAnswerAdvisor(store))
            .call()
            .content();
}
Java

Then, we can call the endpoint GET /stocks/v1/most-growth-trend to see the AI model response. By the way, the result is not accurate.

Let’s improve our previous code a little. We will publish a new version of the AI prompt under the GET /stocks/v1-1/most-growth-trend endpoint. We build a SearchRequest object that returns the top 3 records with a 0.7 similarity threshold. The newly created SearchRequest object must be passed as an argument to the QuestionAnswerAdvisor constructor.

@RequestMapping("/v1-1/most-growth-trend")
String getBestTrendV11() {
   PromptTemplate pt = new PromptTemplate("""
            Which share is the most % growth?
            The 0 element in the prices table is the latest price, while the last element is the oldest price.
            Return the full name of the company instead of the stock ticker.
            """);

   SearchRequest searchRequest = SearchRequest.builder()
            .query("""
            Find the most growth trends.
            The 0 element in the prices table is the latest price, while the last element is the oldest price.
            """)
            .topK(3)
            .similarityThreshold(0.7)
            .build();

   return this.chatClient.prompt(pt.create())
            .advisors(new QuestionAnswerAdvisor(store, searchRequest))
            .call()
            .content();
}
Java

Now, the results are more accurate. The model also returns the full names of the companies instead of their ticker symbols.

spring-ai-rag-model-response

Use RetrievalAugmentationAdvisor

Instead of the QuestionAnswerAdvisor class, we can also use the experimental RetrievalAugmentationAdvisor. It provides an out-of-the-box implementation of the most common RAG flows, based on a modular architecture. There are several built-in modules we can use with RetrievalAugmentationAdvisor. We will include the RewriteQueryTransformer module, which uses an LLM to rewrite the user query to provide better results when querying the target vector database. It requires the query and target placeholders to be present in the prompt template. Thanks to that transformer, we can retrieve the optimal set of records for a percentage growth calculation.

@RestController
@RequestMapping("/stocks")
public class StockController {

    private final ObjectMapper mapper = new ObjectMapper();
    private final static Logger LOG = LoggerFactory.getLogger(StockController.class);
    private final ChatClient chatClient;
    private final RewriteQueryTransformer.Builder rqtBuilder;
    private final RestTemplate restTemplate;
    private final VectorStore store;

    @Value("${STOCK_API_KEY}")
    private String apiKey;

    public StockController(ChatClient.Builder chatClientBuilder,
                           VectorStore store,
                           RestTemplate restTemplate) {
        this.chatClient = chatClientBuilder
                .defaultAdvisors(new SimpleLoggerAdvisor())
                .build();
        this.rqtBuilder = RewriteQueryTransformer.builder()
                .chatClientBuilder(chatClientBuilder);
        this.store = store;
        this.restTemplate = restTemplate;
    }
    
    // other methods ...
    
    @RequestMapping("/v2/most-growth-trend")
    String getBestTrendV2() {
        PromptTemplate pt = new PromptTemplate("""
                {query}.
                Which {target} is the most % growth?
                The 0 element in the prices table is the latest price, while the last element is the oldest price.
                """);

        Prompt p = pt.create(Map.of("query", "Find the most growth trends", "target", "share"));

        Advisor retrievalAugmentationAdvisor = RetrievalAugmentationAdvisor.builder()
                .documentRetriever(VectorStoreDocumentRetriever.builder()
                        .similarityThreshold(0.7)
                        .topK(3)
                        .vectorStore(store)
                        .build())
                .queryTransformers(rqtBuilder.promptTemplate(pt).build())
                .build();

        return this.chatClient.prompt(p)
                .advisors(retrievalAugmentationAdvisor)
                .call()
                .content();
    }

}
Java

Once again, we can verify the AI model response by calling the GET /stocks/v2/most-growth-trend endpoint. The response is similar to those generated by the GET /stocks/v1-1/most-growth-trend endpoint.

Run the Application

Just a reminder: before running the application, we should provide the OpenAI and Twelvedata API tokens.

$ export OPEN_AI_TOKEN=<YOUR_OPEN_AI_TOKEN>
$ export STOCK_API_KEY=<YOUR_STOCK_API_KEY>
$ mvn spring-boot:run
ShellSession

Final Thoughts

In this article, you learned how to use an important AI technique called Retrieval Augmented Generation (RAG) with Spring AI. Spring AI simplifies RAG by providing built-in support for vector stores and easy incorporation of data into the chat model through the Advisors API. However, since RetrievalAugmentationAdvisor is an experimental feature, we cannot rule out some changes in future releases.

Getting Started with Spring AI Function Calling https://piotrminkowski.com/2025/01/30/getting-started-with-spring-ai-function-calling/ https://piotrminkowski.com/2025/01/30/getting-started-with-spring-ai-function-calling/#comments Thu, 30 Jan 2025 16:40:30 +0000 https://piotrminkowski.com/?p=15522 This article will show you how to use Spring AI support for Java function calling with the OpenAI chat model. The Spring AI function calling feature lets us connect the LLM capabilities with external APIs or systems. OpenAI’s models are trained to know when to call a function. We will work on implementing a Java […]

The post Getting Started with Spring AI Function Calling appeared first on Piotr's TechBlog.

This article will show you how to use Spring AI support for Java function calling with the OpenAI chat model. The Spring AI function calling feature lets us connect the LLM capabilities with external APIs or systems. OpenAI’s models are trained to know when to call a function. We will work on implementing a Java function that takes the call arguments from the AI model and sends the result back. Our main goal is to connect to the third-party APIs to provide these results. Then the AI model uses the provided results to complete the conversation.

This article is the second part of a series describing some of the AI project’s most notable features. Before reading on, I recommend checking out my introduction to Spring AI, which is available here. The first part describes such features as prompts, structured output, chat memory, and built-in advisors. Additionally, it demonstrates the capability to switch between the most popular AI chat model API providers.

Source Code

If you would like to try it by yourself, you may always take a look at my source code. To do that, you must clone my sample GitHub repository. Then you should only follow my instructions.

Problem

Whenever I create a new article or example related to AI, I like to define the problem I’m trying to solve. The problem we will solve in this exercise is visible in the following prompt template. I’m asking the AI model about the value of my stock wallet. However, the model doesn’t know how many shares I have and can’t get the latest stock prices. Since the OpenAI model is trained on a static dataset, it does not have direct access to online services or APIs.

spring-ai-function-calling-prompt

So, in this case, we should provide private data with our wallet structure and “connect” our model with a public API that returns live stock market data. Let’s see how we tackle this challenge with Spring AI function calling.

Create Spring Functions

WalletService Supplier

We will begin with the source code and then visualize the whole process on a diagram. Spring AI supports different ways of registering a function to call. You can read more about it in the Spring AI docs here. We will choose the approach based on plain Java functions defined as beans in the Spring application context. This approach allows us to use interfaces from the java.util.function package, such as Function, Supplier, or Consumer. Our first function takes no input, so it implements the Supplier interface. It simply returns the list of shares that we currently own, obtaining that information from the database through the Spring Data WalletRepository bean.

public class WalletService implements Supplier<WalletResponse> {

    private WalletRepository walletRepository;

    public WalletService(WalletRepository walletRepository) {
        this.walletRepository = walletRepository;
    }

    @Override
    public WalletResponse get() {
        return new WalletResponse((List<Share>) walletRepository.findAll());
    }
}
Java

Information about the number of owned shares is stored in the share table. Each row contains a company name and the quantity of that company’s shares.

@Entity
public class Share {

    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;
    private String company;
    private int quantity;
    
    // ... GETTERS/SETTERS
}
Java

The Spring Boot application launches an embedded, in-memory database and inserts test data into the share table. Our wallet contains the most popular companies on the U.S. stock market, including Amazon, Meta, and Microsoft.

insert into share(id, company, quantity) values (1, 'AAPL', 100);
insert into share(id, company, quantity) values (2, 'AMZN', 300);
insert into share(id, company, quantity) values (3, 'META', 300);
insert into share(id, company, quantity) values (4, 'MSFT', 400);
insert into share(id, company, quantity) values (5, 'NVDA', 200);
SQL

StockService Function

Our second function takes an input argument and returns an output; therefore, it implements the Function interface. It must interact with a live stock market API to get the current price of a given company’s share. We use the api.twelvedata.com service to access stock exchange quotes. The function returns the current price wrapped in a StockResponse object.

public class StockService implements Function<StockRequest, StockResponse> {

    private static final Logger LOG = LoggerFactory.getLogger(StockService.class);

    @Autowired
    RestTemplate restTemplate;
    @Value("${STOCK_API_KEY}")
    String apiKey;

    @Override
    public StockResponse apply(StockRequest stockRequest) {
        StockData data = restTemplate.getForObject("https://api.twelvedata.com/time_series?symbol={0}&interval=1min&outputsize=1&apikey={1}",
                StockData.class,
                stockRequest.company(),
                apiKey);
        DailyStockData latestData = data.getValues().get(0);
        LOG.info("Get stock prices: {} -> {}", stockRequest.company(), latestData.getClose());
        return new StockResponse(Float.parseFloat(latestData.getClose()));
    }
}
Java

Here are the Java records for request and response objects.

public record StockRequest(String company) { }

public record StockResponse(Float price) { }
Java

To summarize, the first function accesses the database to get the quantity of owned shares, while the second function communicates with a public API to get the current price of a company share.
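The fan-out between these two functions can be sketched without any AI involved. The following toy code (stubbed shares and prices, not real database or Twelvedata data) mirrors the orchestration the chat model drives: read the wallet, look up one price per company, and combine the results.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.function.Supplier;

// AI-free sketch of the function-calling flow; values are stubbed.
class FunctionCallingFlow {

    record Share(String company, int quantity) { }

    // Stand-in for the wallet supplier backed by the database.
    static final Supplier<List<Share>> numberOfShares = () -> List.of(
            new Share("AAPL", 100), new Share("MSFT", 400));

    // Stand-in for the stock-price function backed by the public API.
    static final Function<String, Float> latestStockPrices =
            company -> Map.of("AAPL", 200.0f, "MSFT", 400.0f).get(company);

    static float walletValue() {
        float total = 0f;
        for (Share share : numberOfShares.get()) {
            // One price lookup per company, mirroring the per-company function calls.
            total += share.quantity() * latestStockPrices.apply(share.company());
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(walletValue()); // 100 * 200 + 400 * 400 = 180000.0
    }
}
```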

Spring AI Function Calling Flow

Architecture

Here’s the diagram that visualizes the flow of our application. The Spring AI Prompt object must contain references to our function beans. This allows the OpenAI model to recognize when a function should be called. However, the model does not call the function directly; it only generates the JSON used to call the function on the application side. Each function must provide a name, description, and signature (as a JSON schema) to let the model know what arguments it expects. We have two functions. The WalletService function returns a list of owned company shares, while the StockService function takes a single company name as the argument. This is where the magic happens: the chat model should call the StockService function for each object in the list returned by the WalletService function. The final response combines the results received from both functions.

spring-ai-function-calling-arch

Implementation

To implement the flow visualized above, we must register our functions as Spring beans. The method name determines the name of the bean in the Spring context. Each bean declaration should also contain a description, which helps the model understand when to call the function. The WalletService function is registered under the numberOfShares name, while the StockService function is registered under the latestStockPrices name. The WalletService doesn’t take any input arguments, but injects the WalletRepository bean to interact with the database.

@Bean
@Description("Number of shares for each company in my portfolio")
public Supplier<WalletResponse> numberOfShares(WalletRepository walletRepository) {
    return new WalletService(walletRepository);
}

@Bean
@Description("Latest stock prices")
public Function<StockRequest, StockResponse> latestStockPrices() {
    return new StockService();
}
Java

Finally, let’s take a look at the REST controller implementation. It exposes the GET /wallet endpoint that communicates with the OpenAI chat model. When creating a prompt we should register both our functions using the OpenAiChatOptions class and its function method. The reference contains only the function @Bean name.

@RestController
@RequestMapping("/wallet")
public class WalletController {

    private final ChatClient chatClient;

    public WalletController(ChatClient.Builder chatClientBuilder) {
        this.chatClient = chatClientBuilder
                .defaultAdvisors(new SimpleLoggerAdvisor())
                .build();
    }

    @GetMapping
    String calculateWalletValue() {
        PromptTemplate pt = new PromptTemplate("""
        What’s the current value in dollars of my wallet based on the latest daily stock prices?
        """);

        return this.chatClient.prompt(pt.create(
                OpenAiChatOptions.builder()
                        .function("numberOfShares")
                        .function("latestStockPrices")
                        .build()))
                .call()
                .content();
    }
}
Java

Run and Test the Spring AI Application

Before running the app, we must export OpenAI and Twelvedata API keys as environment variables.

export STOCK_API_KEY=<YOUR_TWELVEDATA_API_KEY>
export OPEN_AI_TOKEN=<YOUR_OPENAI_TOKEN>
ShellSession

We must create an account on the Twelvedata platform to obtain its API key. The Twelvedata platform provides an API for getting the latest stock prices.

Of course, we must also have an API key for the OpenAI platform. Once you create an account there, go to the API Keys page. Then choose a name for your token and copy it after creation.

Then, we run our Spring AI app using the following Maven command:

mvn spring-boot:run
ShellSession

After running the app, we can call the /wallet endpoint to calculate our stock portfolio.

curl http://localhost:8080/wallet
ShellSession

Here’s the response returned by OpenAI for the provided test data and the current stock market prices.

Then, let’s switch to the application logs. We can see that the StockService function was called five times, once for every company in the wallet. After adding the SimpleLoggerAdvisor advisor to the ChatClient and setting the logging.level.org.springframework.ai property to DEBUG, we can observe detailed logs with requests and responses from the OpenAI chat model.

spring-ai-function-calling-logs
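For reference, the logging setup mentioned above corresponds to the following application.properties entry (the advisor itself is registered in the controller constructor shown earlier):

```properties
logging.level.org.springframework.ai=DEBUG
```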

Final Thoughts

In this article, we analyzed the Spring AI integration with function support in AI models. OpenAI’s function calling is a powerful feature that enhances how AI models interact with external tools, APIs, and structured data. It makes AI more interactive and practical for real-world applications. Spring AI provides a flexible way to register and invoke such functions. However, it still requires attention from developers, who need to define clear function schemas and handle edge cases.

Getting Started with Spring AI and Chat Model https://piotrminkowski.com/2025/01/28/getting-started-with-spring-ai-and-chat-model/ https://piotrminkowski.com/2025/01/28/getting-started-with-spring-ai-and-chat-model/#comments Tue, 28 Jan 2025 10:02:24 +0000 https://piotrminkowski.com/?p=15494 This article will teach you how to use the Spring AI project to build applications based on different chat models. The Spring AI Chat Model is a simple and portable interface that allows us to interact with these models. Our sample Spring Boot application will switch between three popular chat models provided by OpenAI, Mistral […]

The post Getting Started with Spring AI and Chat Model appeared first on Piotr's TechBlog.

This article will teach you how to use the Spring AI project to build applications based on different chat models. The Spring AI Chat Model is a simple and portable interface that allows us to interact with these models. Our sample Spring Boot application will switch between three popular chat models provided by OpenAI, Mistral AI, and Ollama. This article is the first in a series explaining AI concepts with Spring Boot. Look for more on my blog in this area soon.

If you are interested in Spring Boot, read my article about tips, tricks, and techniques for this framework here.

Source Code

If you would like to try it by yourself, you may always take a look at my source code. To do that, you must clone my sample GitHub repository. Then you should only follow my instructions.

Problem

Whenever I create a new article or example related to AI, I like to define the problem I’m trying to solve. The problem this example solves is very trivial. I publish a lot of small demo apps to explain technology concepts. These apps usually need data to show a demo output. Usually, I add demo data by myself or use a library like Datafaker to do it for me. This time, we can leverage AI Chat Models API for that. Let’s begin!

Dependencies

The Spring AI project is still under active development. Currently, we are waiting for the 1.0 GA release. Until then, we will use the milestone releases of the project. The current milestone is 1.0.0-M5. So let’s add the Spring Milestones repository to our Maven pom.xml file.

    <repositories>
        <repository>
            <id>central</id>
            <name>Central</name>
            <url>https://repo1.maven.org/maven2/</url>
        </repository>
        <repository>
            <id>spring-milestones</id>
            <name>Spring Milestones</name>
            <url>https://repo.spring.io/milestone</url>
            <snapshots>
                <enabled>false</enabled>
            </snapshots>
        </repository>
    </repositories>
XML

Then we should include the Maven BOM with a specified version of the Spring AI project.

    <properties>
        <java.version>21</java.version>
        <spring-ai.version>1.0.0-M5</spring-ai.version>
    </properties>
    
    <dependencyManagement>
        <dependencies>
            <dependency>
                <groupId>org.springframework.ai</groupId>
                <artifactId>spring-ai-bom</artifactId>
                <version>${spring-ai.version}</version>
                <type>pom</type>
                <scope>import</scope>
            </dependency>
        </dependencies>
    </dependencyManagement>
XML

Since our sample application exposes some REST endpoints, we should include the Spring Boot Web Starter. We can also include the Spring Boot Test Starter to create some JUnit tests. The Spring AI modules are included in the Maven profiles section. There are three different profiles, one for each chat model provider. By default, our application uses OpenAI, and thus activates the open-ai profile, which includes the spring-ai-openai-spring-boot-starter library. We should activate the mistral-ai profile to switch to Mistral AI. The third option is the ollama-ai profile, which includes the spring-ai-ollama-spring-boot-starter dependency. Here’s the full list of dependencies. This makes it a breeze to switch between chat model AI providers: we only need to set the profile parameter in the Maven run command.

    <dependencies>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-web</artifactId>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-test</artifactId>
            <scope>test</scope>
        </dependency>
    </dependencies>

    <profiles>
        <profile>
            <id>open-ai</id>
            <activation>
                <activeByDefault>true</activeByDefault>
            </activation>
            <dependencies>
                <dependency>
                    <groupId>org.springframework.ai</groupId>
                    <artifactId>spring-ai-openai-spring-boot-starter</artifactId>
                </dependency>
            </dependencies>
        </profile>
        <profile>
            <id>mistral-ai</id>
            <dependencies>
                <dependency>
                    <groupId>org.springframework.ai</groupId>
                    <artifactId>spring-ai-mistral-ai-spring-boot-starter</artifactId>
                </dependency>
            </dependencies>
        </profile>
        <profile>
            <id>ollama-ai</id>
            <dependencies>
                <dependency>
                    <groupId>org.springframework.ai</groupId>
                    <artifactId>spring-ai-ollama-spring-boot-starter</artifactId>
                </dependency>
            </dependencies>
        </profile>
    </profiles>
XML

Connect to AI Chat Model Providers

Configure OpenAI

Before we proceed with the source code, we should prepare the chat model providers. Let's begin with OpenAI. We must have an account on the OpenAI Platform portal. After signing in, we should access the API Keys page to generate an API token. Once we set its name, we can click the "Create secret key" button. Don't forget to copy the key after creation.

The value of the generated token should be saved as an environment variable. Our sample Spring Boot application reads its value from the OPEN_AI_TOKEN variable.

export OPEN_AI_TOKEN=<YOUR_TOKEN_VALUE>
ShellSession

Configure Mistral AI

Then, we should repeat a very similar procedure for Mistral AI. We must have an account on the Mistral AI Platform portal. After signing in, we should access the API Keys page to generate an API token. Both the name and expiration date fields are optional. Once we generate a token by clicking the "Create key" button, we should copy it.


The value of the generated token should be saved as an environment variable. Our sample Spring Boot application reads the Mistral AI token from the MISTRAL_AI_TOKEN variable.

export MISTRAL_AI_TOKEN=<YOUR_TOKEN_VALUE>
ShellSession

Run and Configure Ollama

Unlike OpenAI or Mistral AI, Ollama is built to run large language models (LLMs) directly on our workstation, so there is no remote API to connect to. First, we must download the Ollama binary dedicated to our OS from the following page. After installation, we can interact with it using the ollama CLI. First, we should choose the model to run. The full list of available models can be found here. By default, Spring AI expects the mistral model for Ollama. Let's choose llama3.2 instead.

ollama run llama3.2
ShellSession

After running Ollama locally we can interact with it using the CLI terminal.


Configure Spring Boot Properties

Ollama exposes its API on localhost and does not require an API token. Fortunately, the default URLs for all providers come with the Spring AI auto-configuration. Since we chose the llama3.2 model, we need to set it in the Spring Boot application properties. We can also set the gpt-4o-mini model for OpenAI to decrease API costs.

spring.ai.openai.api-key = ${OPEN_AI_TOKEN}
spring.ai.openai.chat.options.model = gpt-4o-mini
spring.ai.mistralai.api-key = ${MISTRAL_AI_TOKEN}
spring.ai.ollama.chat.options.model = llama3.2
Plaintext
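Equivalently, these settings can be expressed in application.yml; the fragment below carries exactly the same values as the properties above:

```yaml
spring:
  ai:
    openai:
      api-key: ${OPEN_AI_TOKEN}
      chat:
        options:
          model: gpt-4o-mini
    mistralai:
      api-key: ${MISTRAL_AI_TOKEN}
    ollama:
      chat:
        options:
          model: llama3.2
```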

Spring AI Chat Model API

Prompting and Structured Output

Here is our model class. It contains the id field and several other fields that best describe each person.

public class Person {

    private Integer id;
    private String firstName;
    private String lastName;
    private int age;
    private Gender gender;
    private String nationality;
    
    //... GETTERS/SETTERS
}

public enum Gender {
    MALE, FEMALE;
}
Java
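Since Person is a plain data carrier, it could alternatively be modeled as a Java record. This is a hypothetical variant, not the code from the article's repository; Jackson (which Spring AI's structured output uses under the hood) can bind records without explicit getters and setters:

```java
// Demo class first so the file is directly runnable.
public class PersonRecordDemo {
    public static void main(String[] args) {
        Person p = new Person(1, "Jane", "Doe", 30, Gender.FEMALE, "Polish");
        System.out.println(p.firstName() + " " + p.lastName()); // prints "Jane Doe"
    }
}

// Hypothetical record-based variant of the Person model (the article's
// repository uses a regular class with getters and setters).
record Person(Integer id, String firstName, String lastName,
              int age, Gender gender, String nationality) { }

enum Gender { MALE, FEMALE }
```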

The @RestController class injects the auto-configured ChatClient.Builder to create an instance of ChatClient. PersonController implements a method that returns a list of persons from the GET /persons endpoint. The main goal is to generate a list of 10 objects with the fields defined in the Person class. The id field should be auto-incremented. The PromptTemplate object defines the message that will be sent to the chat model API. It doesn't have to specify the exact fields that should be returned; that part is handled automatically by Spring AI once we invoke the entity() method on the ChatClient instance. The ParameterizedTypeReference object passed to the entity method tells Spring AI to generate a list of objects.

@RestController
@RequestMapping("/persons")
public class PersonController {

    private final ChatClient chatClient;

    public PersonController(ChatClient.Builder chatClientBuilder) {
        this.chatClient = chatClientBuilder.build();
    }

    @GetMapping
    List<Person> findAll() {
        PromptTemplate pt = new PromptTemplate("""
                Return a current list of 10 persons if exists or generate a new list with random values.
                Each object should contain an auto-incremented id field.
                Do not include any explanations or additional text.
                """);

        return this.chatClient.prompt(pt.create())
                .call()
                .entity(new ParameterizedTypeReference<>() {});
    }

}    
Java
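The trick behind ParameterizedTypeReference is the "super type token" pattern: creating an anonymous subclass records the generic type argument in the class file, so it survives erasure. A minimal, dependency-free sketch of the idea (not Spring's actual class):

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.List;

public class TypeTokenDemo {
    public static void main(String[] args) {
        // The anonymous subclass {} is what preserves List<String>.
        Type t = new TypeToken<List<String>>() {}.getType();
        System.out.println(t); // prints "java.util.List<java.lang.String>"
    }
}

// Minimal illustration of the super type token pattern that
// ParameterizedTypeReference is based on (not Spring's implementation).
abstract class TypeToken<T> {
    private final Type type;

    protected TypeToken() {
        // The anonymous subclass stores its generic supertype in the
        // class file, so the type argument is readable via reflection.
        ParameterizedType superType =
                (ParameterizedType) getClass().getGenericSuperclass();
        this.type = superType.getActualTypeArguments()[0];
    }

    Type getType() {
        return type;
    }
}
```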

Assuming you exported the OpenAI token to the OPEN_AI_TOKEN environment variable, you can run the application using the following command:

mvn spring-boot:run
ShellSession

Then, let’s call the http://localhost:8080/persons endpoint. It returns a list of 10 people with different nationalities.

Now, we can change the PromptTemplate content and add the word “famous” before persons. Just for fun.

The results are not surprising at all: “Elon Musk” enters the list 🙂 However, the list will be slightly different the second time you call the same endpoint. According to our prompt, the chat client should “return a current list of 10 persons”, so I expected to get the same list as before. The problem is that the chat client doesn’t remember the previous conversation.


Advisors and Chat Memory

Let’s try to change that. First, we should define a bean implementing the ChatMemory interface. InMemoryChatMemory is good enough for our tests.

@SpringBootApplication
public class SpringAIShowcase {

    public static void main(String[] args) {
        SpringApplication.run(SpringAIShowcase.class, args);
    }

    @Bean
    InMemoryChatMemory chatMemory() {
        return new InMemoryChatMemory();
    }
}
Java
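Conceptually, an in-memory chat memory is little more than a map from conversation id to a list of messages. Here is a minimal, dependency-free sketch of that idea (not Spring AI's actual InMemoryChatMemory, whose API is richer):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class ChatMemoryDemo {
    public static void main(String[] args) {
        SimpleChatMemory memory = new SimpleChatMemory();
        memory.add("conv-1", "user: hi");
        memory.add("conv-1", "assistant: hello");
        // Fetch the last message of the conversation.
        System.out.println(memory.get("conv-1", 1)); // prints "[assistant: hello]"
    }
}

// Conceptual sketch of an in-memory chat memory store.
class SimpleChatMemory {
    private final Map<String, List<String>> conversations = new ConcurrentHashMap<>();

    void add(String conversationId, String message) {
        conversations.computeIfAbsent(conversationId, id -> new ArrayList<>())
                .add(message);
    }

    // Returns up to the last N messages of the given conversation.
    List<String> get(String conversationId, int lastN) {
        List<String> all = conversations.getOrDefault(conversationId, List.of());
        return all.subList(Math.max(0, all.size() - lastN), all.size());
    }
}
```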

To enable conversation history for a chat client, we should define an advisor. The Spring AI Advisors API lets us intercept, modify, and enhance AI-driven interactions in Spring applications. Spring AI offers an API for creating custom advisors, but we can also leverage several built-in ones, e.g. PromptChatMemoryAdvisor, which retrieves chat memory and adds it to the prompt’s system text, or SimpleLoggerAdvisor, which enables request/response logging. Let’s take a look at the updated implementation of the PersonController class. Besides the advisors, it contains a new GET /persons/{id} endpoint. This endpoint takes the previously returned list of persons and seeks the object with the specified id. The PromptTemplate object declares an id parameter filled with the value read from the request path.

@RestController
@RequestMapping("/persons")
public class PersonController {

    private final ChatClient chatClient;

    public PersonController(ChatClient.Builder chatClientBuilder, 
                            ChatMemory chatMemory) {
        this.chatClient = chatClientBuilder
                .defaultAdvisors(
                        new PromptChatMemoryAdvisor(chatMemory),
                        new SimpleLoggerAdvisor())
                .build();
    }

    @GetMapping
    List<Person> findAll() {
        PromptTemplate pt = new PromptTemplate("""
                Return a current list of 10 persons if exists or generate a new list with random values.
                Each object should contain an auto-incremented id field.
                Do not include any explanations or additional text.
                """);

        return this.chatClient.prompt(pt.create())
                .call()
                .entity(new ParameterizedTypeReference<>() {});
    }

    @GetMapping("/{id}")
    Person findById(@PathVariable String id) {
        PromptTemplate pt = new PromptTemplate("""
                Find and return the object with id {id} in a current list of persons.
                """);
        Prompt p = pt.create(Map.of("id", id));
        return this.chatClient.prompt(p)
                .call()
                .entity(Person.class);
    }
}
Java
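The {id} placeholder is filled in by simple template substitution (Spring AI's PromptTemplate is backed by the StringTemplate engine). The sketch below is a dependency-free illustration of the general idea, not Spring's implementation:

```java
import java.util.Map;

public class TemplateDemo {
    // Minimal stand-in for PromptTemplate's placeholder substitution
    // (illustrative only; Spring AI delegates to StringTemplate).
    static String render(String template, Map<String, Object> vars) {
        String result = template;
        for (Map.Entry<String, Object> e : vars.entrySet()) {
            result = result.replace("{" + e.getKey() + "}",
                    String.valueOf(e.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        String rendered = render(
                "Find and return the object with id {id} in a current list of persons.",
                Map.of("id", 2));
        // prints "Find and return the object with id 2 in a current list of persons."
        System.out.println(rendered);
    }
}
```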

Now, let’s run a final test. After the application restarts, we call the endpoint that generates a list of persons. Then, we call the GET /persons/{id} endpoint to display a single person by ID. The application reads that person from the list stored in the chat memory. Finally, we repeat the call to the GET /persons endpoint to verify that it returns the same list as before.

Different Chat AI Models

Assuming you exported the Mistral AI token to the MISTRAL_AI_TOKEN environment variable, you can run the application using the following command. It activates the mistral-ai Maven profile and includes the starter with the Mistral AI support.

mvn spring-boot:run -Pmistral-ai
ShellSession

It returns responses similar to OpenAI’s, but there are some small differences: it always returns 0 in the age field and a 3-letter abbreviation as the country name.


Let’s tweak the prompt template so that Mistral AI generates a realistic age value.

Now the results look much better. Even so, the names don’t always match the nationalities, so there’s still room for improvement.

The last test is for Ollama. Let’s run our application once again. This time we should activate the ollama-ai Maven profile.

mvn spring-boot:run -Pollama-ai
ShellSession

Then, we can repeat the same requests and compare the responses from Ollama yourself.

$ curl http://localhost:8080/persons
$ curl http://localhost:8080/persons/2
ShellSession

Final Thoughts

This example doesn’t do anything unusual; it just shows some basic features offered by the Spring AI Chat Model API. We quickly reviewed prompts, structured output, chat memory, and built-in advisors. We also switched between several popular chat model providers. You can expect more articles in this area soon. If you want to continue with the next part of the AI series on my blog, go here.

The post Getting Started with Spring AI and Chat Model appeared first on Piotr's TechBlog.
