LLM Archives - Piotr's TechBlog
https://piotrminkowski.com/tag/llm/
Java, Spring, Kotlin, microservices, Kubernetes, containers

AI Tool Calling with Quarkus LangChain4j
https://piotrminkowski.com/2025/06/23/ai-tool-calling-with-quarkus-langchain4j/
Mon, 23 Jun 2025

This article will show you how to use Quarkus LangChain4j AI support with the most popular chat models for the “tool calling” feature. Tool calling (sometimes referred to as function calling) is a typical pattern in AI applications that enables a model to interact with APIs or tools, extending its capabilities. The most popular AI models are trained to know when to call a function. The Quarkus LangChain4j extension offers built-in support for tool calling. In this article, you will learn how to define tool methods to get data from third-party APIs and an internal database.

This article is the second part of a series describing some of the Quarkus AI project’s most notable features. Before reading on, I recommend checking out my introduction to Quarkus LangChain4j, which is available here. The first part describes such features as prompts, structured output, and chat memory. There is also a similar tutorial series about Spring AI. You can compare Quarkus support for tool calling described here with a similar Spring AI support described in the following post.

Source Code

Feel free to use my source code if you’d like to try it out yourself. To do that, you must clone my sample GitHub repository. Then, simply follow my instructions.

Tool Calling Motivation

For ease of comparison, this article will implement an identical scenario to an analogous application written in Spring AI. You can find a GitHub sample repository with the Spring AI app here. As you know, the “tool calling” feature helps us solve a common AI model challenge related to internal or live data sources. If we want to augment a model with such data, our applications must allow it to interact with a set of APIs or tools. In our case, the internal database (H2) contains information about the structure of our stock wallet. The sample Quarkus application asks an AI model about the total value of the wallet based on daily stock prices or the highest value for the last few days. The model must retrieve the structure of our stock wallet and the latest stock prices.

Use Tool Calling with Quarkus LangChain4j

Create ShareTools

Let’s begin with the ShareTools implementation, which is responsible for getting a list of the wallet’s shares from a database. It defines a single method annotated with @Tool. The most crucial element here is to provide a clear description of the method within the @Tool annotation. It allows the AI model to understand the function’s responsibilities. The method returns the number of shares for each company in our portfolio. It is retrieved from the database through the Quarkus Panache ORM repository.

@ApplicationScoped
public class ShareTools {

    private ShareRepository shareRepository;

    public ShareTools(ShareRepository shareRepository) {
        this.shareRepository = shareRepository;
    }

    @Tool("Return number of shares for each company in my wallet")
    public List<Share> getNumberOfShares() {
        return shareRepository.findAll().list();
    }
}
Java

The sample application launches an embedded, in-memory database and inserts test data into the share table. Our wallet contains the most popular companies on the U.S. stock market, including Amazon, Meta, and Microsoft. Here’s the dataset inserted on application startup.

insert into share(id, company, quantity) values (1, 'AAPL', 100);
insert into share(id, company, quantity) values (2, 'AMZN', 300);
insert into share(id, company, quantity) values (3, 'META', 300);
insert into share(id, company, quantity) values (4, 'MSFT', 400);
SQL

Create StockTools

The StockTools class is responsible for interacting with the TwelveData stock API. It defines two methods. The getLatestStockPrices method returns only the latest close price for a specified company. It is a tool-calling version of the method provided within the pl.piomin.services.functions.stock.StockService function. The second method is more complicated: it must return historical daily close prices for a defined number of days, with each price correlated with its quotation date.
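For context, a response from the TwelveData /time_series endpoint has roughly the following shape (abbreviated and illustrative; the field names match the DailyStockData setters used in the test class later in this article):

```json
{
  "meta": {
    "symbol": "AAPL",
    "interval": "1min"
  },
  "values": [
    {
      "datetime": "2025-06-20 15:59:00",
      "open": "150.10",
      "high": "150.30",
      "low": "150.00",
      "close": "150.25",
      "volume": "1000"
    }
  ],
  "status": "ok"
}
```

The StockData class only needs to deserialize the values array; the remaining fields can be ignored.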

@ApplicationScoped
public class StockTools {

    private Logger log;
    private StockDataClient stockDataClient;

    public StockTools(@RestClient StockDataClient stockDataClient, Logger log) {
        this.stockDataClient = stockDataClient;
        this.log = log;
    }
    
    @ConfigProperty(name = "STOCK_API_KEY", defaultValue = "none")
    String apiKey;

    @Tool("Return latest stock prices for a given company")
    public StockResponse getLatestStockPrices(String company) {
        log.infof("Get stock prices for: %s", company);
        StockData data = stockDataClient.getStockData(company, apiKey, "1min", 1);
        DailyStockData latestData = data.getValues().get(0);
        log.infof("Get stock prices (%s) -> %s", company, latestData.getClose());
        return new StockResponse(Float.parseFloat(latestData.getClose()));
    }

    @Tool("Return historical daily stock prices for a given company")
    public List<DailyShareQuote> getHistoricalStockPrices(String company, int days) {
        log.infof("Get historical stock prices: %s for %d days", company, days);
        StockData data = stockDataClient.getStockData(company, apiKey, "1min", days);
        return data.getValues().stream()
                .map(d -> new DailyShareQuote(company, Float.parseFloat(d.getClose()), d.getDatetime()))
                .toList();
    }

}
Java

Here’s the DailyShareQuote Java record returned in the response list.

public record DailyShareQuote(String company, float price, String datetime) {
}
Java

Here’s a @RestClient responsible for calling the TwelveData stock API.

@RegisterRestClient(configKey = "stock-api")
public interface StockDataClient {

    @GET
    @Path("/time_series")
    StockData getStockData(@RestQuery String symbol,
                           @RestQuery String apikey,
                           @RestQuery String interval,
                           @RestQuery int outputsize);
}
Java

For the demo, you can easily enable complete logging of both communication with the AI model through LangChain4j and with the stock API via @RestClient.

quarkus.langchain4j.log-requests = true
quarkus.langchain4j.log-responses = true
quarkus.rest-client.stock-api.url = https://api.twelvedata.com
quarkus.rest-client.logging.scope = request-response
quarkus.rest-client.stock-api.scope = all
%dev.quarkus.log.category."org.jboss.resteasy.reactive.client.logging".level = DEBUG
Plaintext

Quarkus LangChain4j Tool Calling Flow

You can easily register tool classes on your Quarkus AI service with the tools argument of the @RegisterAiService annotation. The calculateWalletValueWithTools() method calculates the value of our stock wallet in dollars, using the latest daily stock price for each company’s shares in the wallet. Since this method directly returns the response received from the AI model, it is essential to validate the received content. For this purpose, a so-called guardrail should be implemented and put in place, which we can easily achieve with the @OutputGuardrails annotation. The calculateHighestWalletValue method calculates the value of our stock wallet in dollars for each day in the period determined by the days variable, and then returns the day with the highest wallet value.

@RegisterAiService(tools = {StockTools.class, ShareTools.class})
public interface WalletAiService {

    @UserMessage("""
    What’s the current value in dollars of my wallet based on the latest stock daily prices?
    
    Return subtotal value in dollars for each company in my wallet.
    In the end, return the total value in dollars wrapped by ***.
    """)
    @OutputGuardrails(WalletGuardrail.class)
    String calculateWalletValueWithTools();

    @UserMessage("""
    On which day during the last {days} days did my wallet have the highest value in dollars, based on the historical daily stock prices?
    """)
    String calculateHighestWalletValue(int days);
}
Java

Here’s the implementation of the guardrail that validates the response returned by the calculateWalletValueWithTools method. It verifies if the total value in dollars is wrapped by *** and starts with the $ sign.

@ApplicationScoped
public class WalletGuardrail implements OutputGuardrail {

    Pattern pattern = Pattern.compile("\\*\\*\\*(.*?)\\*\\*\\*");

    private Logger log;
    
    public WalletGuardrail(Logger log) {
        this.log = log;
    }

    @Override
    public OutputGuardrailResult validate(AiMessage responseFromLLM) {
        try {
            Matcher matcher = pattern.matcher(responseFromLLM.text());
            if (matcher.find()) {
                String amount = matcher.group(1);
                log.infof("Extracted amount: %s", amount);
                if (amount.startsWith("$")) {
                    return success();
                }
            }
        } catch (Exception e) {
            return reprompt("Invalid text format", e, "Make sure you return a valid requested text");
        }
        return failure("Total amount not found");
    }
}
Java
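To see exactly what this guardrail accepts, the same regex logic can be exercised in isolation, outside Quarkus (the class name GuardrailCheck is made up for this sketch):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GuardrailCheck {

    // Same pattern as in WalletGuardrail: capture the text wrapped by *** markers
    static final Pattern TOTAL = Pattern.compile("\\*\\*\\*(.*?)\\*\\*\\*");

    static boolean isValid(String llmResponse) {
        Matcher matcher = TOTAL.matcher(llmResponse);
        // Valid only if a wrapped amount exists and it starts with the $ sign
        return matcher.find() && matcher.group(1).startsWith("$");
    }

    public static void main(String[] args) {
        System.out.println(isValid("Total: ***$246,400.00***")); // passes
        System.out.println(isValid("Total: ***246400 USD***"));  // fails: no $ prefix
        System.out.println(isValid("Total: $246,400.00"));       // fails: not wrapped in ***
    }
}
```

A failed check triggers either a reprompt or an HTTP 500, as described later in the article.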

Here’s the REST endpoints implementation. It uses the WalletAiService bean to interact with the AI model. It exposes two endpoints: GET /wallet/with-tools and GET /wallet/highest-day/{days}.

@Path("/wallet")
@Produces(MediaType.TEXT_PLAIN)
public class WalletController {

    private final WalletAiService walletAiService;

    public WalletController(WalletAiService walletAiService) {
        this.walletAiService = walletAiService;
    }

    @GET
    @Path("/with-tools")
    public String calculateWalletValueWithTools() {
        return walletAiService.calculateWalletValueWithTools();
    }

    @GET
    @Path("/highest-day/{days}")
    public String calculateHighestWalletValue(int days) {
        return walletAiService.calculateHighestWalletValue(days);
    }

}
Java

The following diagram illustrates the flow for the second use case, which returns the day with the highest stock wallet value. First, the model must connect to the database and retrieve the stock wallet structure, which contains the number of shares for each company. Then, it must call the stock API for every company found in the wallet. Consequently, the getHistoricalStockPrices tool is called four times with a different company name each time, and with the days value taken from the HTTP endpoint path variable. Once all the data is collected, the AI model calculates the highest wallet value and returns it together with the quotation date.

quarkus-tool-calling-arch

Automated Testing

Most of my repositories are automatically updated to the latest versions of libraries. After updating the library version, automated tests are run to verify that everything works as expected. To verify the correctness of today’s scenario, we will mock stock API calls while integrating with the actual OpenAI service. To mock API calls, you can use the quarkus-junit5-mockito extension.

<dependency>
  <groupId>io.quarkus</groupId>
  <artifactId>quarkus-junit5</artifactId>
  <scope>test</scope>
</dependency>
<dependency>
  <groupId>io.quarkus</groupId>
  <artifactId>quarkus-junit5-mockito</artifactId>
  <scope>test</scope>
</dependency>
<dependency>
  <groupId>io.rest-assured</groupId>
  <artifactId>rest-assured</artifactId>
  <scope>test</scope>
</dependency>
XML

The following JUnit test verifies two endpoints exposed by WalletController. As you may remember, there is also an output guardrail set on the AI service called by the GET /wallet/with-tools endpoint.

@QuarkusTest
@TestMethodOrder(MethodOrderer.OrderAnnotation.class)
class WalletControllerTest {

    @InjectMock
    @RestClient
    StockDataClient stockDataClient;

    @BeforeEach
    void setUp() {
        // Mock the stock data responses
        StockData aaplStockData = createMockStockData("AAPL", "150.25");
        StockData amznStockData = createMockStockData("AMZN", "120.50");
        StockData metaStockData = createMockStockData("META", "250.75");
        StockData msftStockData = createMockStockData("MSFT", "300.00");

        // Mock the stock data client responses
        when(stockDataClient.getStockData(eq("AAPL"), anyString(), anyString(), anyInt()))
            .thenReturn(aaplStockData);
        when(stockDataClient.getStockData(eq("AMZN"), anyString(), anyString(), anyInt()))
            .thenReturn(amznStockData);
        when(stockDataClient.getStockData(eq("META"), anyString(), anyString(), anyInt()))
            .thenReturn(metaStockData);
        when(stockDataClient.getStockData(eq("MSFT"), anyString(), anyString(), anyInt()))
            .thenReturn(msftStockData);
    }

    private StockData createMockStockData(String symbol, String price) {
        DailyStockData dailyData = new DailyStockData();
        dailyData.setDatetime("2023-01-01");
        dailyData.setOpen(price);
        dailyData.setHigh(price);
        dailyData.setLow(price);
        dailyData.setClose(price);
        dailyData.setVolume("1000");

        StockData stockData = new StockData();
        stockData.setValues(List.of(dailyData));
        return stockData;
    }

    @Test
    @Order(1)
    void testCalculateWalletValueWithTools() {
        given()
          .when().get("/wallet/with-tools")
          .then().statusCode(200)
                 .contentType(ContentType.TEXT)
                 .body(notNullValue())
                 .body(not(emptyString()));
    }

    @Test
    @Order(2)
    void testCalculateHighestWalletValue() {
        given()
          .pathParam("days", 7)
          .when().get("/wallet/highest-day/{days}")
          .then().statusCode(200)
                 .contentType(ContentType.TEXT)
                 .body(notNullValue())
                 .body(not(emptyString()));
    }
}
Java

Tests can be automatically run, for example, by the CircleCI pipeline on each dependency update via the pull request.

Run the Application to Verify Tool Calling

Before starting the application, we must set environment variables with the AI model and stock API tokens.

$ export OPEN_AI_TOKEN=<YOUR_OPEN_AI_TOKEN>
$ export STOCK_API_KEY=<YOUR_STOCK_API_KEY>
ShellSession

Then, run the application in development mode with the following command:

mvn quarkus:dev
ShellSession

Once the application is started, you can call the first endpoint. The GET /wallet/with-tools endpoint calculates the total value of the stock wallet stored in the database, based on the latest stock prices.

curl http://localhost:8080/wallet/with-tools
ShellSession
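The arithmetic the model is expected to perform here can be reproduced deterministically. In the sketch below, the share quantities come from the seed SQL shown earlier, while the prices are the mocked values used in the automated tests (a live run would use current TwelveData prices instead):

```java
import java.util.Map;

public class WalletMath {

    // total = sum over companies of (number of shares * latest close price)
    static double totalValue(Map<String, Integer> shares, Map<String, Double> prices) {
        return shares.entrySet().stream()
                .mapToDouble(e -> e.getValue() * prices.get(e.getKey()))
                .sum();
    }

    static double demoTotal() {
        // Quantities from the seed SQL, prices from the mocked stock client
        Map<String, Integer> shares = Map.of("AAPL", 100, "AMZN", 300, "META", 300, "MSFT", 400);
        Map<String, Double> prices = Map.of("AAPL", 150.25, "AMZN", 120.50, "META", 250.75, "MSFT", 300.00);
        return totalValue(shares, prices);
    }

    public static void main(String[] args) {
        System.out.printf("Total wallet value: $%,.2f%n", demoTotal()); // 246,400.00 with these inputs
    }
}
```

Comparing such a hand-computed total against the model's answer is a quick sanity check that every tool was actually called.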

You can see either the response from the chat AI model or the exception thrown after an unsuccessful validation using a guardrail. If LLM response validation fails, the REST endpoint returns the HTTP 500 code.

quarkus-tool-calling-guardrail

Here’s the successfully validated LLM response.

quarkus-tool-calling-success

The sample Quarkus application logs the whole communication with the AI model. Here, you can see a first request containing a list of registered functions (tools) along with their descriptions.

quarkus-tool-calling-logs

Then we can call the GET /wallet/highest-day/{days} endpoint to return the day with the highest wallet value. Let’s calculate it for the last 7 days.

curl http://localhost:8080/wallet/highest-day/7
ShellSession

Here’s the response.

Finally, you can perform a similar test as before, but with the Mistral AI model. Before running the application, set your API token for Mistral AI and switch the default model provider to mistralai.

$ export MISTRAL_AI_TOKEN=<YOUR_MISTRAL_AI_TOKEN>
$ export AI_MODEL_PROVIDER=mistralai
ShellSession

Then, run the sample Quarkus application with the following command and repeat the same “tool calling” tests as before.

mvn quarkus:dev -Pmistral-ai
ShellSession

Final Thoughts

Quarkus LangChain4j provides a seamless way to run tools in AI-powered conversations. You can register a tool by adding it as part of the @RegisterAiService annotation. You can also easily add a guardrail to a selected AI service method. Tools are a vital part of agentic AI and MCP concepts, so it is essential to understand them properly. You can expect more articles on Quarkus LangChain4j soon, including on MCP.

Getting Started with Quarkus LangChain4j and Chat Model
https://piotrminkowski.com/2025/06/18/getting-started-with-quarkus-langchain4j-and-chat-model/
Wed, 18 Jun 2025

This article will teach you how to use the Quarkus LangChain4j project to build applications based on different chat models. The Quarkus AI Chat Model offers a portable and straightforward interface, enabling seamless interaction with these models. Our sample Quarkus application will switch between three popular chat models provided by OpenAI, Mistral AI, and Ollama. This article is the first in a series explaining AI concepts with Quarkus LangChain4j. Look for more on my blog in this area soon. The idea of this tutorial is very similar to the series on Spring AI. Therefore, you will be able to easily compare the two approaches, as the sample application will do the same thing as an analogous Spring Boot application.

If you like Quarkus, then you can find quite a few articles about it on my blog. Just go to the Quarkus category and find the topic you are interested in.

Source Code

Feel free to use my source code if you’d like to try it out yourself. To do that, you must clone my sample GitHub repository. Then, simply follow my instructions.

Motivation

Whenever I create a new article or example related to AI, I like to define the problem I’m trying to solve. The problem this example solves is very trivial. I publish numerous small demo apps to explain complex technology concepts. These apps typically require data to display a demo output. Usually, I add demo data by myself or use a library like Datafaker to do it for me. This time, we can leverage the AI Chat Models API for that. Let’s begin!

I have also covered today’s Quarkus-related topic earlier for Spring Boot. For a comparison of the features both frameworks offer for simple interaction with an AI chat model, you can read this article on Spring AI.

Dependencies

The sample application uses the current latest version of the Quarkus framework.

<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>io.quarkus.platform</groupId>
      <artifactId>quarkus-bom</artifactId>
      <version>${quarkus.platform.version}</version>
      <type>pom</type>
      <scope>import</scope>
    </dependency>
  </dependencies>
</dependencyManagement>
XML

You can easily switch between multiple AI model implementations by activating a dedicated Maven profile. By default, the open-ai profile is active. It includes the quarkus-langchain4j-openai module in the Maven dependencies. Alternatively, you can activate the mistral-ai or ollama profile. In that case, the quarkus-langchain4j-mistral-ai or quarkus-langchain4j-ollama module will be included instead of the LangChain4j OpenAI extension.

<profiles>
  <profile>
    <id>open-ai</id>
    <activation>
      <activeByDefault>true</activeByDefault>
    </activation>
    <dependencies>
      <dependency>
        <groupId>io.quarkiverse.langchain4j</groupId>
        <artifactId>quarkus-langchain4j-openai</artifactId>
        <version>${quarkus-langchain4j.version}</version>
      </dependency>
    </dependencies>
  </profile>
  <profile>
    <id>mistral-ai</id>
    <dependencies>
      <dependency>
        <groupId>io.quarkiverse.langchain4j</groupId>
        <artifactId>quarkus-langchain4j-mistral-ai</artifactId>
        <version>${quarkus-langchain4j.version}</version>
      </dependency>
    </dependencies>
  </profile>
  <profile>
    <id>ollama</id>
    <dependencies>
      <dependency>
        <groupId>io.quarkiverse.langchain4j</groupId>
        <artifactId>quarkus-langchain4j-ollama</artifactId>
        <version>${quarkus-langchain4j.version}</version>
      </dependency>
    </dependencies>
  </profile>
</profiles>
XML

The sample Quarkus application is simple. It exposes some REST endpoints and communicates with a selected AI model to return an AI-generated response via each endpoint. So, you need to include only core Quarkus modules like quarkus-rest-jackson or quarkus-arc. To implement JUnit tests with REST API, it also includes the quarkus-junit5 and rest-assured modules in the test scope.

<dependencies>
  <!-- Core Quarkus dependencies -->
  <dependency>
    <groupId>io.quarkus</groupId>
    <artifactId>quarkus-rest-jackson</artifactId>
  </dependency>
  <dependency>
    <groupId>io.quarkus</groupId>
    <artifactId>quarkus-arc</artifactId>
  </dependency>

  <!-- Test dependencies -->
  <dependency>
    <groupId>io.quarkus</groupId>
    <artifactId>quarkus-junit5</artifactId>
    <scope>test</scope>
  </dependency>
  <dependency>
    <groupId>io.rest-assured</groupId>
    <artifactId>rest-assured</artifactId>
    <scope>test</scope>
  </dependency>
</dependencies>
XML

Quarkus LangChain4j Chat Models Integration

Quarkus provides an innovative approach to interacting with AI chat models. First, you need to annotate the interface defining your AI-oriented methods with the @RegisterAiService annotation. Then you must add a proper description and input prompt inside the @SystemMessage and @UserMessage annotations. Here is the sample PersonAiService interface, which defines two methods. The generatePersonList method asks the AI model to generate a list of 10 unique persons in a form consistent with the expected object structure. The getPersonById method must read the previously generated list from chat memory and return the person’s data with the specified id field.

@RegisterAiService
@ApplicationScoped
public interface PersonAiService {

    @SystemMessage("""
        You are a helpful assistant that generates realistic person data.
        Always respond with valid JSON format.
        """)
    @UserMessage("""
        Generate exactly 10 unique persons

        Requirements:
        - Each person must have a unique integer ID (like 1, 2, 3, etc.)
        - Use realistic first and last names per each nationality
        - Ages should be between 18 and 80
        - Return ONLY the JSON array, no additional text
        """)
    PersonResponse generatePersonList(@MemoryId int userId);

    @SystemMessage("""
        You are a helpful assistant that can recall generated person data from chat memory.
        """)
    @UserMessage("""
        In the previously generated list of persons for user {userId}, find and return the person with id {id}.
        
        Return ONLY the JSON object, no additional text.
        """)
    Person getPersonById(@MemoryId int userId, int id);

}
Java

There are a few more things to add regarding the code snippet above. The beans created by @RegisterAiService are @RequestScoped by default. According to the Quarkus LangChain4j documentation, this makes it possible to remove objects from the chat memory once the request ends. In the case above, the list of people is generated per user ID, which acts as the key used to look up the chat memory. To guarantee that the getPersonById method finds the list of persons generated per @MemoryId, the PersonAiService interface must be annotated with @ApplicationScoped. The InMemoryChatMemoryStore implementation is enabled by default, so you don’t need to declare any additional beans to use it.
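If you want to bound how much history is kept per @MemoryId, the extension also exposes chat-memory configuration properties. The property names below are taken from the Quarkus LangChain4j configuration reference; double-check them against the extension version you use:

```properties
# Keep only the most recent messages per memory id (message-window strategy)
quarkus.langchain4j.chat-memory.type = message-window
quarkus.langchain4j.chat-memory.memory-window.max-messages = 10
```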

Quarkus LangChain4j can automatically map the LLM’s JSON response to an output POJO. However, it is currently not possible to map it directly to an output collection. Therefore, you must wrap the output list in an additional class, as shown below.

public class PersonResponse {

    private List<Person> persons;

    public List<Person> getPersons() {
        return persons;
    }

    public void setPersons(List<Person> persons) {
        this.persons = persons;
    }
}
Java

Here’s the Person class:

public class Person {

    private Integer id;
    private String firstName;
    private String lastName;
    private int age;
    private String nationality;
    private Gender gender;
    
    // GETTERS and SETTERS

}
Java
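For illustration, a single person object in the model's JSON response would look roughly like this (all values are made up; the gender value assumes an enum constant such as FEMALE):

```json
{
  "id": 1,
  "firstName": "Anna",
  "lastName": "Kowalska",
  "age": 34,
  "nationality": "Polish",
  "gender": "FEMALE"
}
```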

Finally, the last part of our implementation is REST endpoints. Here’s the REST controller that injects and uses PersonAiService to interact with the AI chat model. It exposes two endpoints: GET /api/{userId}/persons and GET /api/{userId}/persons/{id}. You can generate several lists of persons by specifying the userId path parameter.

@Path("/api")
@Produces(MediaType.APPLICATION_JSON)
@Consumes(MediaType.APPLICATION_JSON)
public class PersonController {

    private static final Logger LOG = Logger.getLogger(PersonController.class);

    PersonAiService personAiService;

    public PersonController(PersonAiService personAiService) {
        this.personAiService = personAiService;
    }

    @GET
    @Path("/{userId}/persons")
    public PersonResponse generatePersons(@PathParam("userId") int userId) {
        return personAiService.generatePersonList(userId);
    }

    @GET
    @Path("/{userId}/persons/{id}")
    public Person getPersonById(@PathParam("userId") int userId, @PathParam("id") int id) {
        return personAiService.getPersonById(userId, id);
    }

}
Java

Use Different AI Models with Quarkus LangChain4j

Configuration Properties

Here is a configuration defined within the application.properties file. Before proceeding, you must generate the OpenAI and Mistral AI API tokens and export them as environment variables. Additionally, you can enable logging of requests and responses in AI model communication. It is also worth increasing the default timeout for a single request from 10 seconds to a higher value, such as 20 seconds.

quarkus.langchain4j.chat-model.provider = ${AI_MODEL_PROVIDER:openai}
quarkus.langchain4j.log-requests = true
quarkus.langchain4j.log-responses = true

# OpenAI Configuration
quarkus.langchain4j.openai.api-key = ${OPEN_AI_TOKEN}
quarkus.langchain4j.openai.timeout = 20s

# Mistral AI Configuration
quarkus.langchain4j.mistralai.api-key = ${MISTRAL_AI_TOKEN}
quarkus.langchain4j.mistralai.timeout = 20s

# Ollama Configuration
quarkus.langchain4j.ollama.base-url = ${OLLAMA_BASE_URL:http://localhost:11434}
Plaintext

To run a sample Quarkus application and connect it with OpenAI, you must set the OPEN_AI_TOKEN environment variable. Since the open-ai Maven profile is activated by default, you don’t need to set anything else while running an app.

$ export OPEN_AI_TOKEN=<your_openai_token>
$ mvn quarkus:dev
ShellSession

Then, you can call the GET /api/{userId}/persons endpoint with different userId path variable values. Here are sample API requests and responses.

quarkus-langchain4j-calls

After that, you can call the GET /api/{userId}/persons/{id} endpoint to return a specified person found in the chat memory.

Switch Between AI Models

Then, you can repeat the same exercise with the Mistral AI model. You must set the AI_MODEL_PROVIDER environment variable to mistralai, export its API token as the MISTRAL_AI_TOKEN environment variable, and enable the mistral-ai Maven profile while running the app.

$ export AI_MODEL_PROVIDER=mistralai
$ export MISTRAL_AI_TOKEN=<your_mistralai_token>
$ mvn quarkus:dev -Pmistral-ai
ShellSession

The app should start successfully.

quarkus-langchain4j-logs

Once it happens, you can repeat the same sequence of requests as before for OpenAI.

$ curl http://localhost:8080/api/1/persons
$ curl http://localhost:8080/api/2/persons
$ curl http://localhost:8080/api/1/persons/1
$ curl http://localhost:8080/api/2/persons/1
ShellSession

You can check the request sent to the AI model in the application logs.

Here’s a log showing an AI chat model response:

Finally, you can run a test with Ollama. By default, the LangChain4j extension for Ollama uses the llama3.2 model. You can change it by setting the quarkus.langchain4j.ollama.chat-model.model-id property in the application.properties file. Assuming you use the llama3.3 model, here’s your configuration:

quarkus.langchain4j.ollama.base-url = ${OLLAMA_BASE_URL:http://localhost:11434}
quarkus.langchain4j.ollama.chat-model.model-id = llama3.3
quarkus.langchain4j.ollama.timeout = 60s
Plaintext

Before proceeding, you must run the llama3.3 model on your laptop. Of course, you can choose another, smaller model, because llama3.3 is 42 GB.

ollama run llama3.3
ShellSession

Downloading the model can take a while. Once it finishes, the model is ready to use.

Once a model is running, you can set the AI_MODEL_PROVIDER environment variable to ollama and activate the ollama profile for the app:

$ export AI_MODEL_PROVIDER=ollama
$ mvn quarkus:dev -Pollama
ShellSession

This time, our application is connected to the llama3.3 model started with ollama:

quarkus-langchain4j-ollama

With the Quarkus LangChain4j Ollama extension, you can take advantage of Dev Services support. This means you don’t need to install and run Ollama on your laptop or start a model with the ollama CLI: Quarkus will run Ollama as a Docker container and automatically serve the selected AI model on it. In that case, you don’t need to set the quarkus.langchain4j.ollama.base-url property. Before switching to that option, let’s use a smaller AI model by setting the quarkus.langchain4j.ollama.chat-model.model-id = mistral property. Then start the app in the same way as before.
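Summarizing the Dev Services variant described above, the Ollama-related configuration shrinks to a single line (the base-url property is intentionally omitted so that Quarkus starts its own Ollama container):

```properties
# Dev Services pulls and serves this model in a container automatically
quarkus.langchain4j.ollama.chat-model.model-id = mistral
```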

Final Thoughts

I must admit that the Quarkus LangChain4j extension is enjoyable to use. With a few simple annotations, you can configure your application to talk to the AI model of your choice. In this article, I presented a straightforward example of integrating Quarkus with an AI chat model. Along the way, we also reviewed features such as prompts, structured output, and chat memory. You can expect more articles in the Quarkus AI series soon.

Using Model Context Protocol (MCP) with Spring AI
https://piotrminkowski.com/2025/03/17/using-model-context-protocol-mcp-with-spring-ai/
Mon, 17 Mar 2025

The post Using Model Context Protocol (MCP) with Spring AI appeared first on Piotr's TechBlog.

]]>
This article will show how to use Spring AI support for MCP (Model Context Protocol) in Spring Boot server-side and client-side applications. You will learn how to serve tools and prompts on the server side and discover them on the client-side Spring AI application. The Model Context Protocol is a standard for managing contextual interactions with AI models. It provides a standardized way to connect AI models to external data sources and tools. It can help with building complex workflows on top of LLMs. Spring AI MCP extends the MCP Java SDK and provides client and server Spring Boot starters. The MCP Client is responsible for establishing and managing connections with MCP servers.

This is the seventh part of my series of articles about Spring Boot and AI. It is worth reading the following posts before proceeding with the current one. Please pay special attention to the last article from the list about the tool calling feature since we will implement it in our sample client and server apps using MCP.

  1. https://piotrminkowski.com/2025/01/28/getting-started-with-spring-ai-and-chat-model: The first tutorial introduces the Spring AI project and its support for building applications based on chat models like OpenAI or Mistral AI.
  2. https://piotrminkowski.com/2025/01/30/getting-started-with-spring-ai-function-calling: The second tutorial shows Spring AI support for Java function calling with the OpenAI chat model.
  3. https://piotrminkowski.com/2025/02/24/using-rag-and-vector-store-with-spring-ai: The third tutorial shows Spring AI support for RAG (Retrieval Augmented Generation) and vector store.
  4. https://piotrminkowski.com/2025/03/04/spring-ai-with-multimodality-and-images: The fourth tutorial shows Spring AI support for a multimodality feature and image generation.
  5. https://piotrminkowski.com/2025/03/10/using-ollama-with-spring-ai: The fifth tutorial shows Spring AI support for interactions with AI models run with Ollama.
  6. https://piotrminkowski.com/2025/03/13/tool-calling-with-spring-ai: The sixth tutorial shows Spring AI support for the Tool Calling feature.

Source Code

Feel free to use my source code if you’d like to try it out yourself. To do that, you must clone my sample GitHub repository. Then you should only follow my instructions.

Motivation for MCP with Spring AI

MCP introduces an interesting concept for applications interacting with AI models. With MCP, an application can provide specific tools/functions to several other services that need the data it exposes. Additionally, it can expose prompt templates and resources. Thanks to that, we don’t need to implement AI tools/functions inside every client service; instead, the clients integrate with the application that exposes tools over MCP.

The best way to analyze the MCP concept is through an example. Let’s consider an application that connects to a database and exposes data through REST endpoints. If we want to use that data in our AI application, we should implement and register AI tools that retrieve data by calling those REST endpoints. So, each client-side application that needs data from the source service would have to implement its own set of AI tools locally. This is where MCP comes in. The source service defines and exposes AI tools/functions in a standardized form. All other apps that need to provide data to AI models can load and use the predefined set of tools.

The following diagram illustrates our scenario. Two Spring Boot applications act as MCP servers. They connect to the database and use Spring AI MCP Server support to expose @Tool methods to the MCP client-side app. The client-side app communicates with the OpenAI model. It includes the tools exposed by the server-side apps in the user query to the AI model. The person-mcp-service app provides @Tool methods for searching persons in the database table. The account-mcp-service is doing the same for the persons’ accounts.

spring-ai-mcp-arch

Build MCP Server App with Spring AI

Let’s begin with the implementation of the applications that act as MCP servers. Both use an in-memory H2 database. To interact with the database, we include the Spring Data JPA module. Spring AI allows us to choose between three transport types: STDIO, Spring MVC, and Spring WebFlux. The MCP Server with Spring WebFlux supports Server-Sent Events (SSE) and an optional STDIO transport. Here’s a list of required Maven dependencies:

<dependencies>
  <dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-mcp-server-webflux-spring-boot-starter</artifactId>
  </dependency>
  <dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-data-jpa</artifactId>
  </dependency>
  <dependency>
    <groupId>com.h2database</groupId>
    <artifactId>h2</artifactId>
    <scope>runtime</scope>
  </dependency>
</dependencies>
XML

Create the Person MCP Server

Here’s an @Entity class for interacting with the person table:

@Entity
public class Person {

    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;
    private String firstName;
    private String lastName;
    private int age;
    private String nationality;
    @Enumerated(EnumType.STRING)
    private Gender gender;
    
    // ... getters and setters
    
}
Java

The Spring Data Repository interface contains a single method for searching persons by their nationality:

public interface PersonRepository extends CrudRepository<Person, Long> {
    List<Person> findByNationality(String nationality);
}
Java

The PersonTools @Service bean contains two Spring AI @Tool methods. It injects the PersonRepository bean to interact with the H2 database. The getPersonById method returns a single person with a specific ID field, while the getPersonsByNationality returns a list of all persons with a given nationality.

@Service
public class PersonTools {

    private PersonRepository personRepository;

    public PersonTools(PersonRepository personRepository) {
        this.personRepository = personRepository;
    }

    @Tool(description = "Find person by ID")
    public Person getPersonById(
            @ToolParam(description = "Person ID") Long id) {
        return personRepository.findById(id).orElse(null);
    }

    @Tool(description = "Find all persons by nationality")
    public List<Person> getPersonsByNationality(
            @ToolParam(description = "Nationality") String nationality) {
        return personRepository.findByNationality(nationality);
    }
    
}
Java

Once we define @Tool methods, we must register them within the Spring AI MCP server. We can use the ToolCallbackProvider bean for that. More specifically, the MethodToolCallbackProvider class provides a builder that creates an instance of the ToolCallbackProvider class with a list of references to objects with @Tool methods.

@SpringBootApplication
public class PersonMCPServer {

    public static void main(String[] args) {
        SpringApplication.run(PersonMCPServer.class, args);
    }

    @Bean
    public ToolCallbackProvider tools(PersonTools personTools) {
        return MethodToolCallbackProvider.builder()
                .toolObjects(personTools)
                .build();
    }

}
Java

Finally, we must provide configuration properties. The person-mcp-server app listens on port 8060. We should also set the name and version of the MCP server embedded in our application.

spring:
  ai:
    mcp:
      server:
        name: person-mcp-server
        version: 1.0.0
  jpa:
    database-platform: H2
    generate-ddl: true
    hibernate:
      ddl-auto: create-drop

logging.level.org.springframework.ai: DEBUG

server.port: 8060
YAML

That’s all. We can start the application.

$ cd spring-ai-mcp/person-mcp-service
$ mvn spring-boot:run
ShellSession

Create the Account MCP Server

Then, we will do very similar things in the second application that acts as an MCP server. Here’s the @Entity class for interacting with the account table:

@Entity
public class Account {

    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;
    private String number;
    private int balance;
    private Long personId;
    
    // ... getters and setters
    
}
Java

The Spring Data Repository interface contains a single method for searching accounts belonging to a given person:

public interface AccountRepository extends CrudRepository<Account, Long> {
    List<Account> findByPersonId(Long personId);
}
Java

The AccountTools @Service bean contains a single Spring AI @Tool method. It injects the AccountRepository bean to interact with the H2 database. The getAccountsByPersonId method returns a list of accounts owned by the person with a specified ID field value.

@Service
public class AccountTools {

    private AccountRepository accountRepository;

    public AccountTools(AccountRepository accountRepository) {
        this.accountRepository = accountRepository;
    }

    @Tool(description = "Find all accounts by person ID")
    public List<Account> getAccountsByPersonId(
            @ToolParam(description = "Person ID") Long personId) {
        return accountRepository.findByPersonId(personId);
    }
}
Java

Of course, the account-mcp-server application will use ToolCallbackProvider to register @Tool methods defined inside the AccountTools class.

@SpringBootApplication
public class AccountMCPService {

    public static void main(String[] args) {
        SpringApplication.run(AccountMCPService.class, args);
    }

    @Bean
    public ToolCallbackProvider tools(AccountTools accountTools) {
        return MethodToolCallbackProvider.builder()
                .toolObjects(accountTools)
                .build();
    }
    
}
Java

Here are the application configuration properties. The account-mcp-server app listens on port 8040.

spring:
  ai:
    mcp:
      server:
        name: account-mcp-server
        version: 1.0.0
  jpa:
    database-platform: H2
    generate-ddl: true
    hibernate:
      ddl-auto: create-drop

logging.level.org.springframework.ai: DEBUG

server.port: 8040
YAML

Let’s run the second server-side app:

$ cd spring-ai-mcp/account-mcp-service
$ mvn spring-boot:run
ShellSession

Once we start the application, we should see the log indicating how many tools were registered in the MCP server.

spring-ai-mcp-app

Build MCP Client App with Spring AI

Implementation

We will create a single client-side application. However, we can imagine an architecture where many applications consume tools exposed by one MCP server. Our application interacts with the OpenAI chat model, so we must include the Spring AI OpenAI starter. For the MCP client starter, we can choose between two dependencies: the standard MCP client and the Spring WebFlux client. The Spring team recommends using the WebFlux-based SSE connection with the spring-ai-mcp-client-webflux-spring-boot-starter. Finally, we include the Spring Web starter to expose REST endpoints. However, you can use the Spring WebFlux starter to expose them reactively.

<dependencies>
  <dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-web</artifactId>
  </dependency>
  <dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-mcp-client-webflux-spring-boot-starter</artifactId>
  </dependency>
  <dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-openai-spring-boot-starter</artifactId>
  </dependency>
</dependencies>
XML

Our MCP client connects with two MCP servers. We must provide the following connection settings in the application.yml file.

spring.ai.mcp.client.sse.connections:
  person-mcp-server:
    url: http://localhost:8060
  account-mcp-server:
    url: http://localhost:8040
YAML

Our sample Spring Boot application contains two @RestController classes, which expose HTTP endpoints. The PersonController class defines two endpoints for searching and counting persons by nationality. The MCP Client Boot Starter automatically configures tool callbacks that integrate with Spring AI’s tool execution framework. Thanks to that, we can use the ToolCallbackProvider instance to provide default tools to the ChatClient bean. Then, we can perform the standard steps to interact with the AI model through the Spring AI ChatClient. However, the client will use tools exposed by both sample MCP servers.

@RestController
@RequestMapping("/persons")
public class PersonController {

    private final static Logger LOG = LoggerFactory
        .getLogger(PersonController.class);
    private final ChatClient chatClient;

    public PersonController(ChatClient.Builder chatClientBuilder,
                            ToolCallbackProvider tools) {
        this.chatClient = chatClientBuilder
                .defaultTools(tools)
                .build();
    }

    @GetMapping("/nationality/{nationality}")
    String findByNationality(@PathVariable String nationality) {

        PromptTemplate pt = new PromptTemplate("""
                Find persons with {nationality} nationality.
                """);
        Prompt p = pt.create(Map.of("nationality", nationality));
        return this.chatClient.prompt(p)
                .call()
                .content();
    }

    @GetMapping("/count-by-nationality/{nationality}")
    String countByNationality(@PathVariable String nationality) {
        PromptTemplate pt = new PromptTemplate("""
                How many persons come from {nationality} ?
                """);
        Prompt p = pt.create(Map.of("nationality", nationality));
        return this.chatClient.prompt(p)
                .call()
                .content();
    }
}
Java

Let’s switch to the second @RestController. The AccountController class defines two endpoints for searching accounts by person ID. The GET /accounts/count-by-person-id/{personId} returns the number of accounts belonging to a given person. The GET /accounts/balance-by-person-id/{personId} is slightly more complex. It counts the total balance in all person’s accounts. However, it must also return the person’s name and nationality, which means that it must call the getPersonById tool method exposed by the person-mcp-server app after calling the tool for searching accounts by person ID.

@RestController
@RequestMapping("/accounts")
public class AccountController {

    private final static Logger LOG = LoggerFactory.getLogger(PersonController.class);
    private final ChatClient chatClient;

    public AccountController(ChatClient.Builder chatClientBuilder,
                            ToolCallbackProvider tools) {
        this.chatClient = chatClientBuilder
                .defaultTools(tools)
                .build();
    }

    @GetMapping("/count-by-person-id/{personId}")
    String countByPersonId(@PathVariable String personId) {
        PromptTemplate pt = new PromptTemplate("""
                How many accounts has person with {personId} ID ?
                """);
        Prompt p = pt.create(Map.of("personId", personId));
        return this.chatClient.prompt(p)
                .call()
                .content();
    }

    @GetMapping("/balance-by-person-id/{personId}")
    String balanceByPersonId(@PathVariable String personId) {
        PromptTemplate pt = new PromptTemplate("""
                How many accounts has person with {personId} ID ?
                Return person name, nationality and a total balance on his/her accounts.
                """);
        Prompt p = pt.create(Map.of("personId", personId));
        return this.chatClient.prompt(p)
                .call()
                .content();
    }

}
Java

Running the Application

Before starting the client-side app, we must export the OpenAI token as the SPRING_AI_OPENAI_API_KEY environment variable.

export SPRING_AI_OPENAI_API_KEY=<YOUR_OPENAI_API_KEY>
ShellSession

Then go to the sample-client directory and run the app with the following command:

$ cd spring-ai-mcp/sample-client
$ mvn spring-boot:run
ShellSession

Once we start the application, we can switch to the logs. As you can see, the sample-client app receives the lists of tools exposed by both the person-mcp-server and account-mcp-server apps.

Testing MCP with Spring Boot

Both server-side applications load data from import.sql scripts on startup; Spring Data JPA imports data from such scripts automatically. Our MCP client application listens on port 8080. Let’s call the first endpoint to get a list of persons from Germany:

curl http://localhost:8080/persons/nationality/Germany
ShellSession

Here’s the response from the OpenAI model:

spring-ai-mcp-result

We can also call the endpoint that counts the number of persons with a given nationality.

curl http://localhost:8080/persons/count-by-nationality/Germany
ShellSession

As the final test, we can call the GET /accounts/balance-by-person-id/{personId} endpoint that interacts with tools exposed by both MCP server-side apps. It requires an AI model to combine data from person and account sources.
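Both server-side apps seed their H2 databases from import.sql scripts, as mentioned earlier. A hypothetical pair of seed scripts consistent with the Person and Account entities above might look like this (the actual data in the repository may differ):

```sql
-- person-mcp-service: src/main/resources/import.sql (hypothetical data)
insert into person(id, first_name, last_name, age, nationality, gender) values (1, 'Hans', 'Schmidt', 34, 'Germany', 'MALE');
insert into person(id, first_name, last_name, age, nationality, gender) values (2, 'Anna', 'Weber', 29, 'Germany', 'FEMALE');

-- account-mcp-service: src/main/resources/import.sql (hypothetical data)
insert into account(id, number, balance, person_id) values (1, 'ACC-001', 1000, 1);
insert into account(id, number, balance, person_id) values (2, 'ACC-002', 2500, 1);
```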

Exposing Prompts with MCP

We can also expose prompts and resources with the Spring AI MCP server support. To register and expose prompts, we need to define a list of SyncPromptRegistration objects. Each registration contains the prompt name, a list of input arguments, and the text content.

@SpringBootApplication
public class PersonMCPServer {

    public static void main(String[] args) {
        SpringApplication.run(PersonMCPServer.class, args);
    }

    @Bean
    public ToolCallbackProvider tools(PersonTools personTools) {
        return MethodToolCallbackProvider.builder()
                .toolObjects(personTools)
                .build();
    }

    @Bean
    public List<McpServerFeatures.SyncPromptRegistration> prompts() {
        var prompt = new McpSchema.Prompt("persons-by-nationality", "Get persons by nationality",
                List.of(new McpSchema.PromptArgument("nationality", "Person nationality", true)));

        var promptRegistration = new McpServerFeatures.SyncPromptRegistration(prompt, getPromptRequest -> {
            String argument = (String) getPromptRequest.arguments().get("nationality");
            var userMessage = new McpSchema.PromptMessage(McpSchema.Role.USER,
                    new McpSchema.TextContent("How many persons come from " + argument + " ?"));
            return new McpSchema.GetPromptResult("Count persons by nationality", List.of(userMessage));
        });

        return List.of(promptRegistration);
    }
}
Java

After startup, the application prints information about a list of registered prompts in the logs.

There is no built-in Spring AI support for loading prompts through the MCP client. However, Spring AI MCP support is under active development, so we may expect new features soon. For now, Spring AI provides an auto-configured instance of McpSyncClient. We can use it to find the prompt in the list of prompts received from the server. Then, we can prepare a PromptTemplate instance from the registered content and create the Prompt by filling the template with the input parameters.

@RestController
@RequestMapping("/persons")
public class PersonController {

    private final static Logger LOG = LoggerFactory
        .getLogger(PersonController.class);
    private final ChatClient chatClient;
    private final List<McpSyncClient> mcpSyncClients;

    public PersonController(ChatClient.Builder chatClientBuilder,
                            ToolCallbackProvider tools,
                            List<McpSyncClient> mcpSyncClients) {
        this.chatClient = chatClientBuilder
                .defaultTools(tools)
                .build();
        this.mcpSyncClients = mcpSyncClients;
    }

    // ... other endpoints
    
    @GetMapping("/count-by-nationality-from-client/{nationality}")
    String countByNationalityFromClient(@PathVariable String nationality) {
        return this.chatClient
                .prompt(loadPromptByName("persons-by-nationality", nationality))
                .call()
                .content();
    }

    Prompt loadPromptByName(String name, String nationality) {
        McpSchema.GetPromptRequest r = new McpSchema
            .GetPromptRequest(name, Map.of("nationality", nationality));
        var client = mcpSyncClients.stream()
                .filter(c -> c.getServerInfo().name().equals("person-mcp-server"))
                .findFirst();
        if (client.isPresent()) {
            var content = (McpSchema.TextContent) client.get() 
                .getPrompt(r)
                .messages()
                .getFirst()
                .content();
            PromptTemplate pt = new PromptTemplate(content.text());
            Prompt p = pt.create(Map.of("nationality", nationality));
            LOG.info("Prompt: {}", p);
            return p;
        } else return null;
    }
}
Java

Final Thoughts

Model Context Protocol is an important initiative in the AI world. It allows us to avoid reinventing the wheel for each new data source. A unified protocol streamlines integration, minimizing development time and complexity. As businesses expand their AI toolsets, MCP enables seamless connectivity across multiple systems without the burden of excessive custom code. Spring AI introduced the initial version of MCP support recently. It seems promising. With Spring AI Client and Server starters, we may implement a distributed architecture, where several different apps use the AI tools exposed by a single service.

The post Using Model Context Protocol (MCP) with Spring AI appeared first on Piotr's TechBlog.

]]>
https://piotrminkowski.com/2025/03/17/using-model-context-protocol-mcp-with-spring-ai/feed/ 18 15608
Tool Calling with Spring AI https://piotrminkowski.com/2025/03/13/tool-calling-with-spring-ai/ https://piotrminkowski.com/2025/03/13/tool-calling-with-spring-ai/#comments Thu, 13 Mar 2025 15:55:40 +0000 https://piotrminkowski.com/?p=15596 This article will show you how to use Spring AI support with the most popular AI models for the tool calling feature. Tool calling (or function calling), is a common pattern in AI applications that enables a model to interact with APIs or tools, extending its capabilities. The most popular AI models are trained to […]

The post Tool Calling with Spring AI appeared first on Piotr's TechBlog.

]]>
This article will show you how to use Spring AI support with the most popular AI models for the tool calling feature. Tool calling (or function calling) is a common pattern in AI applications that enables a model to interact with APIs or tools, extending its capabilities. The most popular AI models are trained to know when to call a function. Spring AI formerly supported it through the Function Calling API, which has been deprecated and marked for removal in the next release. My previous article described that feature based on interactions with an internal database and an external market stock API. Today, we will consider the same use case. This time, however, we will replace the deprecated Function Calling API with the new Tool Calling feature.

This is the sixth part of my series of articles about Spring Boot and AI. It is worth reading the following posts before proceeding with the current one. Please pay special attention to the second article. I will refer to it often in this article.

  1. https://piotrminkowski.com/2025/01/28/getting-started-with-spring-ai-and-chat-model: The first tutorial introduces the Spring AI project and its support for building applications based on chat models like OpenAI or Mistral AI.
  2. https://piotrminkowski.com/2025/01/30/getting-started-with-spring-ai-function-calling: The second tutorial shows Spring AI support for Java function calling with the OpenAI chat model.
  3. https://piotrminkowski.com/2025/02/24/using-rag-and-vector-store-with-spring-ai: The third tutorial shows Spring AI support for RAG (Retrieval Augmented Generation) and vector store.
  4. https://piotrminkowski.com/2025/03/04/spring-ai-with-multimodality-and-images: The fourth tutorial shows Spring AI support for a multimodality feature and image generation
  5. https://piotrminkowski.com/2025/03/10/using-ollama-with-spring-ai: The fifth tutorial shows Spring AI support for interactions with AI models run with Ollama.

Source Code

Feel free to use my source code if you’d like to try it out yourself. To do that, you must clone my sample GitHub repository. Then you should only follow my instructions.

Motivation for Tool Calling in Spring AI

The tool calling feature helps us solve a common AI model challenge related to internal or live data sources. If we want to augment a model with such data, our applications must allow it to interact with a set of APIs or tools. In our case, the internal database (H2) contains information about the structure of our stock wallet. The sample Spring Boot application asks an AI model about the total value of the wallet based on daily stock prices, or about the highest wallet value over the last few days. The model must retrieve the structure of our stock wallet and the latest stock prices. We will do the same exercise as for the function calling feature, enhanced with additional scenarios I’ll describe later.

Use the Calling Tools Feature in Spring AI

Create WalletTools

Let’s begin with the WalletTools implementation, which is responsible for interaction with the database. We can compare it to the previous implementation based on Spring functions, available in the pl.piomin.services.functions.stock.WalletService class. It defines a single method annotated with @Tool. The key element is the description, which must tell the model what the method does. The method returns the number of shares for each company in our portfolio, retrieved from the database through the Spring Data @Repository.

public class WalletTools {

    private WalletRepository walletRepository;

    public WalletTools(WalletRepository walletRepository) {
        this.walletRepository = walletRepository;
    }

    @Tool(description = "Number of shares for each company in my wallet")
    public List<Share> getNumberOfShares() {
        return (List<Share>) walletRepository.findAll();
    }
}
Java

We can register the WalletTools class as a Spring @Bean in the application main class.

@Bean
public WalletTools walletTools(WalletRepository walletRepository) {
   return new WalletTools(walletRepository);
}
Java

The Spring Boot application launches an embedded, in-memory database and inserts test data into the stock table. Our wallet contains the most popular companies on the U.S. stock market, including Amazon, Meta, and Microsoft.

insert into share(id, company, quantity) values (1, 'AAPL', 100);
insert into share(id, company, quantity) values (2, 'AMZN', 300);
insert into share(id, company, quantity) values (3, 'META', 300);
insert into share(id, company, quantity) values (4, 'MSFT', 400);
insert into share(id, company, quantity) values (5, 'NVDA', 200);
SQL

Create StockTools

The StockTools class is responsible for interaction with the TwelveData stock API. It defines two methods. The getLatestStockPrices method returns only the latest close price for a specified company. It is the tool calling version of the method provided within the pl.piomin.services.functions.stock.StockService function. The second method is more complicated. It must return historical daily close prices for a defined number of days. Each price must be correlated with a quotation date.

public class StockTools {

    private static final Logger LOG = LoggerFactory.getLogger(StockTools.class);

    private RestTemplate restTemplate;
    @Value("${STOCK_API_KEY:none}")
    String apiKey;

    public StockTools(RestTemplate restTemplate) {
        this.restTemplate = restTemplate;
    }

    @Tool(description = "Latest stock prices")
    public StockResponse getLatestStockPrices(@ToolParam(description = "Name of company") String company) {
        StockData data = restTemplate.getForObject("https://api.twelvedata.com/time_series?symbol={0}&interval=1min&outputsize=1&apikey={1}",
                StockData.class,
                company,
                apiKey);
        DailyStockData latestData = data.getValues().get(0);
        LOG.info("Get stock prices: {} -> {}", company, latestData.getClose());
        return new StockResponse(Float.parseFloat(latestData.getClose()));
    }

    @Tool(description = "Historical daily stock prices")
    public List<DailyShareQuote> getHistoricalStockPrices(@ToolParam(description = "Search period in days") int days,
                                                          @ToolParam(description = "Name of company") String company) {
        StockData data = restTemplate.getForObject("https://api.twelvedata.com/time_series?symbol={0}&interval=1day&outputsize={1}&apikey={2}",
                StockData.class,
                company,
                days,
                apiKey);
        return data.getValues().stream()
                .map(d -> new DailyShareQuote(company, Float.parseFloat(d.getClose()), d.getDatetime()))
                .toList();
    }
}
Java

Here’s the DailyShareQuote Java record returned in the response list.

public record DailyShareQuote(String company, float price, String datetime) {
}
Java
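The StockData and DailyStockData classes referenced by StockTools are not shown in the article. Here is a minimal sketch of what they might look like, assuming the field names match the TwelveData JSON keys values, datetime, and close (an assumption, not the repository’s actual code):

```java
import java.util.List;

// Hypothetical sketch of the DTOs that StockTools deserializes the TwelveData
// time_series response into. Field names are assumptions matching the JSON
// keys "values", "datetime", and "close".
class DailyStockData {
    private String datetime;
    private String close;

    public String getDatetime() { return datetime; }
    public void setDatetime(String datetime) { this.datetime = datetime; }
    public String getClose() { return close; }
    public void setClose(String close) { this.close = close; }
}

class StockData {
    private List<DailyStockData> values;

    public List<DailyStockData> getValues() { return values; }
    public void setValues(List<DailyStockData> values) { this.values = values; }
}
```

With such DTOs, RestTemplate can bind the JSON response automatically, which is why StockTools can call data.getValues().get(0).getClose() directly.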

Then, let’s register the StockTools class as a Spring @Bean.

@Bean
public StockTools stockTools() {
   return new StockTools(restTemplate());
}
Java

Spring AI Tool Calling Flow

Here’s a fragment of the WalletController code, which is responsible for defining interactions with LLM and HTTP endpoints implementation. It injects both StockTools and WalletTools beans.

@RestController
@RequestMapping("/wallet")
public class WalletController {

    private final ChatClient chatClient;
    private final StockTools stockTools;
    private final WalletTools walletTools;

    public WalletController(ChatClient.Builder chatClientBuilder,
                            StockTools stockTools,
                            WalletTools walletTools) {
        this.chatClient = chatClientBuilder
                .defaultAdvisors(new SimpleLoggerAdvisor())
                .build();
        this.stockTools = stockTools;
        this.walletTools = walletTools;
    }
    
    // HTTP endpoints implementation
}
Java

The GET /wallet/with-tools endpoint calculates the value of our stock wallet in dollars. It uses the latest daily stock prices for each company’s shares from the wallet. There are a few ways to register tools for a chat model call. We use the tools method provided by the ChatClient interface. It allows us to pass the tool object references directly to the chat client. In this case, we are registering the StockTools bean which contains two @Tool methods. The AI model must choose the right method to call in StockTools based on the description and input argument. It should call the getLatestStockPrices method.

@GetMapping("/with-tools")
String calculateWalletValueWithTools() {
   PromptTemplate pt = new PromptTemplate("""
   What’s the current value in dollars of my wallet based on the latest stock daily prices ?
   """);

   return this.chatClient.prompt(pt.create())
           .tools(stockTools, walletTools)
           .call()
           .content();
}
Java

The GET /wallet/highest-day/{days} endpoint calculates the value of our stock wallet in dollars for each day in the specified period determined by the days variable. Then it must return the day with the highest stock wallet value. Same as before we use the tools method from ChatClient to register our tool calling methods. It should call the getHistoricalStockPrices method.

@GetMapping("/highest-day/{days}")
String calculateHighestWalletValue(@PathVariable int days) {
   PromptTemplate pt = new PromptTemplate("""
   On which day during last {days} days my wallet had the highest value in dollars based on the historical daily stock prices ?
   """);

   return this.chatClient.prompt(pt.create(Map.of("days", days)))
            .tools(stockTools, walletTools)
            .call()
            .content();
}
Java

The following diagram illustrates the flow for the second use case, which returns the day with the highest stock wallet value. First, the model must connect with the database and retrieve the stock wallet structure containing the number of shares held in each company. Then, it must call the stock API for every company found in the wallet. So the getHistoricalStockPrices tool method should be called five times, once per company, with different values of the company @ToolParam and the days value taken from the HTTP endpoint path variable. Once all the data is collected, the AI model calculates the highest wallet value and returns it together with the quotation date.

spring-ai-tool-calling-arch
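The arithmetic the model performs at the end of this flow is straightforward. Here is a plain-Java sketch of that final aggregation, with made-up sample data (illustrative only; in the article the LLM itself performs this computation from the tool results):

```java
import java.util.Map;

class HighestWalletDay {

    // wallet: company -> number of shares
    // prices: company -> (day -> closing price in dollars)
    static String highestValueDay(Map<String, Integer> wallet,
                                  Map<String, Map<String, Double>> prices) {
        String bestDay = null;
        double bestValue = Double.NEGATIVE_INFINITY;
        // Assumes every company is quoted on the same set of days
        for (String day : prices.values().iterator().next().keySet()) {
            double total = 0;
            for (var entry : wallet.entrySet()) {
                total += entry.getValue() * prices.get(entry.getKey()).get(day);
            }
            if (total > bestValue) {
                bestValue = total;
                bestDay = day;
            }
        }
        return bestDay;
    }

    public static void main(String[] args) {
        // Hypothetical sample data, not real quotations
        Map<String, Integer> wallet = Map.of("AAPL", 10, "MSFT", 5);
        Map<String, Map<String, Double>> prices = Map.of(
                "AAPL", Map.of("2025-02-25", 240.0, "2025-02-26", 245.0),
                "MSFT", Map.of("2025-02-25", 400.0, "2025-02-26", 410.0));
        System.out.println(highestValueDay(wallet, prices)); // prints 2025-02-26
    }
}
```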

Run Application and Verify Tool Calling

Before starting the application we must set environment variables with the AI model and stock API tokens.

export OPEN_AI_TOKEN=<YOUR_OPEN_AI_TOKEN>
export STOCK_API_KEY=<YOUR_STOCK_API_KEY>
ShellSession

Then run the following Maven command:

mvn spring-boot:run
ShellSession

Once the application is started, we can call the first endpoint. The GET /wallet/with-tools endpoint calculates the current total value of the stock wallet stored in the database.

curl http://localhost:8080/wallet/with-tools
ShellSession

Here’s the fragment of logs generated by the Spring AI @Tool methods. The model behaves as expected. First, it calls the getNumberOfShares tool to retrieve a wallet structure. Then it calls the getLatestStockPrices tool per share to obtain its current price.

spring-ai-tool-calling-logs

Here’s the final response with the wallet value and a detailed explanation.

Then we can call the GET /wallet/highest-day/{days} endpoint to return the day with the highest wallet value. Let’s calculate it for the last 20 days.

curl http://localhost:8080/wallet/highest-day/20
ShellSession

The response is very detailed. Here’s the final part of the content returned by the OpenAI chat model. It returns 26.02.2025 as the day with the highest wallet value. Frankly, sometimes it returns different answers…

spring-ai-tool-calling-chat-response

However, the AI flow works fine. First, it calls the getNumberOfShares tool to retrieve a wallet structure. Then it calls the getHistoricalStockPrices tool per share to obtain its prices for the last 20 days.

We can switch to another AI model to compare their responses. You can connect my sample Spring Boot application e.g. with Mistral AI by activating the mistral-ai Maven profile.

mvn spring-boot:run -Pmistral-ai
ShellSession

Before running the app we must export the Mistral API token.

export MISTRAL_AI_TOKEN=<YOUR_MISTRAL_AI_TOKEN>
ShellSession

To get the best results I changed the Mistral model to mistral-large-latest.

spring.ai.mistralai.chat.options.model = mistral-large-latest
Plaintext

The response from Mistral AI was pretty quick and short:

Final Thoughts

In this article, we analyzed Spring AI’s support for tool calling, which replaces the Function Calling API. Tool calling is a powerful feature that enhances how AI models interact with external tools, APIs, and structured data. It makes AI more interactive and practical for real-world applications. Spring AI provides a flexible way to register and invoke such tools. However, it still requires attention from developers, who need to define clear function schemas and handle edge cases.

The post Tool Calling with Spring AI appeared first on Piotr's TechBlog.

]]>
https://piotrminkowski.com/2025/03/13/tool-calling-with-spring-ai/feed/ 2 15596
Using Ollama with Spring AI https://piotrminkowski.com/2025/03/10/using-ollama-with-spring-ai/ https://piotrminkowski.com/2025/03/10/using-ollama-with-spring-ai/#respond Mon, 10 Mar 2025 09:46:35 +0000 https://piotrminkowski.com/?p=15575 This article will teach you how to create a Spring Boot application that implements several AI scenarios using Spring AI and the Ollama tool. Ollama is an open-source tool that aims to run open LLMs on our local machine. It acts like a bridge between LLM and a workstation, providing an API layer on top […]

The post Using Ollama with Spring AI appeared first on Piotr's TechBlog.

]]>
This article will teach you how to create a Spring Boot application that implements several AI scenarios using Spring AI and the Ollama tool. Ollama is an open-source tool that runs open LLMs on our local machine. It acts as a bridge between an LLM and a workstation, providing an API layer on top of the models for other applications or services. With Ollama, we can run almost any model we want simply by pulling it from a huge library.

This is the fifth part of my series of articles about Spring Boot and AI. I mentioned Ollama in the first part of the series to show how to switch between different AI models with Spring AI. However, it was only a brief introduction. Today, we try to run all AI use cases described in the previous tutorials with the Ollama tool. Those tutorials integrated mostly with OpenAI. In this article, we will test them against different AI models.

  1. https://piotrminkowski.com/2025/01/28/getting-started-with-spring-ai-and-chat-model: The first tutorial introduces the Spring AI project and its support for building applications based on chat models like OpenAI or Mistral AI.
  2. https://piotrminkowski.com/2025/01/30/getting-started-with-spring-ai-function-calling: The second tutorial shows Spring AI support for Java function calling with the OpenAI chat model.
  3. https://piotrminkowski.com/2025/02/24/using-rag-and-vector-store-with-spring-ai: The third tutorial shows Spring AI support for RAG (Retrieval Augmented Generation) and vector store.
  4. https://piotrminkowski.com/2025/03/04/spring-ai-with-multimodality-and-images: The fourth tutorial shows Spring AI support for a multimodality feature and image generation

Fortunately, our application can easily switch between different AI tools or models. To achieve this, we must activate the right Maven profile.

Source Code

Feel free to use my source code if you’d like to try it out yourself. To do that, you must clone my sample GitHub repository. Then you should only follow my instructions.

Prepare a Local Environment for Ollama

A few options exist for accessing Ollama on the local machine with Spring AI. I downloaded Ollama from the following link and installed it on my laptop. Alternatively, we can run it e.g. with Docker Compose or Testcontainers.

Once we install Ollama on our workstation, we can run an AI model from its library with the ollama run command. The full list of available models can be found here. To begin, we will choose the Llava model. It is one of the most popular models, supporting both a vision encoder and language understanding.

ollama run llava
ShellSession

Ollama must pull the model manifest and image. Here’s the ollama run command output. Once we see that, we can interact with the model.

spring-ai-ollama-run-llava-model

The sample application source code already defines the ollama-ai Maven profile with the spring-ai-ollama-spring-boot-starter Spring Boot starter.

<profile>
  <id>ollama-ai</id>
  <dependencies>
    <dependency>
      <groupId>org.springframework.ai</groupId>
      <artifactId>spring-ai-ollama-spring-boot-starter</artifactId>
    </dependency>
  </dependencies>
</profile>
XML

The profile is disabled by default. We might enable it during development as shown below (for IntelliJ IDEA). However, the application doesn’t use any vendor-specific components but only generic Spring AI classes and interfaces.

We must activate the ollama-ai profile when running the same application. Assuming we are in the project root directory, we need to run the following Maven command:

mvn spring-boot:run -Pollama-ai
ShellSession

Portability across AI Models

We should avoid using specific model library components to make our application portable between different models. For example, when registering functions in the chat model client we should use FunctionCallingOptions instead of model-specific components like OpenAIChatOptions or OllamaOptions.

@GetMapping
String calculateWalletValue() {
   PromptTemplate pt = new PromptTemplate("""
   What’s the current value in dollars of my wallet based on the latest stock daily prices ?
   """);

   return this.chatClient.prompt(pt.create(
        FunctionCallingOptions.builder()
                    .function("numberOfShares")
                    .function("latestStockPrices")
                    .build()))
            .call()
            .content();
}
Java

Not all models support all the AI capabilities used in our sample application. For providers like Ollama or Mistral AI, Spring AI doesn’t provide an image generation implementation, since those tools don’t support it right now. Therefore, we should inject the ImageModel bean optionally, in case the model-specific library does not provide it.

@RestController
@RequestMapping("/images")
public class ImageController {

    private final static Logger LOG = LoggerFactory.getLogger(ImageController.class);
    private final ObjectMapper mapper = new ObjectMapper();

    private final ChatClient chatClient;
    private ImageModel imageModel;

    public ImageController(ChatClient.Builder chatClientBuilder,
                           Optional<ImageModel> imageModel,
                           VectorStore store) {
        this.chatClient = chatClientBuilder
                .defaultAdvisors(new SimpleLoggerAdvisor())
                .build();
        imageModel.ifPresent(model -> this.imageModel = model);
        
        // other initializations 
    }
}
Java

Then, if a method requires the ImageModel bean, we can throw an exception informing that image generation is not supported by the AI model (1). On the other hand, Spring AI does not provide a dedicated interface for multimodality, which enables AI models to process information from multiple sources. We can use the UserMessage class and the Media class to combine e.g. text with image(s) in the user prompt. The GET /images/describe/{image} endpoint lists items detected in the source image from the classpath (2).

@GetMapping(value = "/generate/{object}", produces = MediaType.IMAGE_PNG_VALUE)
byte[] generate(@PathVariable String object) throws IOException, NotSupportedException {
   if (imageModel == null)
      throw new NotSupportedException("Image model is not supported by the AI model"); // (1)
   ImageResponse ir = imageModel.call(new ImagePrompt("Generate an image with " + object, ImageOptionsBuilder.builder()
           .height(1024)
           .width(1024)
           .N(1)
           .responseFormat("url")
           .build()));
   String url = ir.getResult().getOutput().getUrl();
   UrlResource resource = new UrlResource(url);
   LOG.info("Generated URL: {}", url);
   dynamicImages.add(Media.builder()
           .id(UUID.randomUUID().toString())
           .mimeType(MimeTypeUtils.IMAGE_PNG)
           .data(url)
           .build());
   return resource.getContentAsByteArray();
}
    
@GetMapping("/describe/{image}") // (2)
List<Item> describeImage(@PathVariable String image) {
   Media media = Media.builder()
           .id(image)
           .mimeType(MimeTypeUtils.IMAGE_PNG)
           .data(new ClassPathResource("images/" + image + ".png"))
           .build();
   UserMessage um = new UserMessage("""
   List all items you see on the image and define their category. 
   Return items inside the JSON array in RFC8259 compliant JSON format.
   """, media);
   return this.chatClient.prompt(new Prompt(um))
           .call()
           .entity(new ParameterizedTypeReference<>() {});
}
Java

Let’s try to avoid declarations like the one below, described in the Spring AI docs. Although perfectly correct, they will cause problems when switching between Spring Boot starters for different AI vendors.

ChatResponse response = chatModel.call(
    new Prompt(
        "Generate the names of 5 famous pirates.",
        OllamaOptions.builder()
            .model(OllamaModel.LLAMA3_1)
            .temperature(0.4)
            .build()
    ));
Java

Instead, we can set a global property in the application.properties file that defines the default model used with Ollama.

spring.ai.ollama.chat.options.model = llava
Plaintext

Testing Multiple Models with Spring AI and Ollama

By default, Ollama doesn’t require any API token to establish communication with AI models. The Ollama Spring Boot starter provides auto-configuration that connects the chat client to the Ollama API server running on the localhost:11434 address. So, before running our sample application we must export tokens used to authorize against stock market API and a vector store.

export STOCK_API_KEY=<YOUR_STOCK_API_KEY>
export PINECONE_TOKEN=<YOUR_PINECONE_TOKEN>
ShellSession

Llava on Ollama

Let’s begin with the Llava model. We can call the first endpoint that asks the model to generate a list of persons (GET /persons) and then search for a person with a particular id in the list stored in the chat memory (GET /persons/{id}).

spring-ai-ollama-get-persons

Then we can call the endpoint that displays all the items visible on a particular image from the classpath (GET /images/describe/{image}).

spring-ai-ollama-describe-image

By the way, here is the analyzed image stored in the src/main/resources/images/fruits-3.png file.

The endpoint for describing all the input images from the classpath doesn’t work well. I tried to tweak it by adding the RFC8259 JSON format sentence or changing the query. However, the AI model always returned a description of a single image instead of the whole Media list. The OpenAI model could print descriptions for all images in the String[] format.

@GetMapping("/describe")
String[] describe() {
   UserMessage um = new UserMessage("""
            Explain what do you see on each image from the input list.
            Return data in RFC8259 compliant JSON format.
            """, List.copyOf(Stream.concat(images.stream(), dynamicImages.stream()).toList()));
   return this.chatClient.prompt(new Prompt(um))
            .call()
            .entity(String[].class);
}
Java

Here’s the response. Of course, we can fine-tune the model to achieve better results or try to prepare a better prompt.

spring-ai-ollama-describe-all-images

After calling the GET /wallet endpoint exposed by the WalletController, I received the [400] Bad Request - {"error":"registry.ollama.ai/library/llava:latest does not support tools"} response. It seems Llava doesn’t support the function/tool calling feature. We will also always receive the NotSupportedException for the GET /images/generate/{object} endpoint, since the Spring AI Ollama library doesn’t provide an ImageModel bean. You can perform other tests, e.g. for the RAG and vector store features implemented in the StockController @RestController.

Granite on Ollama

Let’s switch to another interesting model – Granite. In particular, we will test the granite3.2-vision model, dedicated to automated content extraction from tables, charts, infographics, plots, and diagrams. First, we set the current model name in the Ollama Spring AI configuration properties.

spring.ai.ollama.chat.options.model = granite3.2-vision
Plaintext

Let’s stop the Llava model and then run granite3.2-vision on Ollama:

ollama run granite3.2-vision
ShellSession

After the application restarts, we can perform some test calls. The endpoint for describing a single image returns a more detailed response than the Llava model. The response for the query with multiple images still looks the same as before.

The Granite Vision model supports a “function calling” feature, but it couldn’t call functions properly using my prompt. Please refer to my article for more details about the Spring AI function calling with OpenAI.

Deepseek on Ollama

The last model we will run within this exercise is Deepseek. DeepSeek-R1 achieves performance comparable to OpenAI-o1 on reasoning tasks. First, we must set the current model name in the Ollama Spring AI configuration properties.

spring.ai.ollama.chat.options.model = deepseek-r1
Plaintext

Then let’s stop the Granite model and then run deepseek-r1 on Ollama:

ollama run deepseek-r1
ShellSession

We need to restart the app:

mvn spring-boot:run -Pollama-ai
ShellSession

As usual, we can call the first endpoint that asks the model to generate a list of persons (GET /persons) and then search for a person with a particular id in the list stored in the chat memory (GET /persons/{id}). The response was pretty large, but not in the required JSON format. Here’s a fragment of the response:

The deepseek-r1 model doesn’t support a tool/function calling feature. Also, it didn’t analyze my input image properly and it didn’t return a JSON response according to the Spring AI structured output feature.

Final Thoughts

This article shows how to easily switch between multiple AI models with Spring AI and Ollama. We tested several AI use cases implemented in the sample Spring Boot application across models such as Llava, Granite, and Deepseek. The app provides several endpoints showing such features as multimodality, chat memory, RAG, vector store, and function calling. It aims not to compare the AI models, but to give a simple recipe for integrating with different AI models and to allow playing with them using Spring AI.

The post Using Ollama with Spring AI appeared first on Piotr's TechBlog.

]]>
https://piotrminkowski.com/2025/03/10/using-ollama-with-spring-ai/feed/ 0 15575
Spring AI with Multimodality and Images https://piotrminkowski.com/2025/03/04/spring-ai-with-multimodality-and-images/ https://piotrminkowski.com/2025/03/04/spring-ai-with-multimodality-and-images/#respond Tue, 04 Mar 2025 08:56:24 +0000 https://piotrminkowski.com/?p=15557 This article will teach you how to create a Spring Boot application that handles images and text using the Spring AI multimodality feature. Multimodality is the ability to understand and process information from different sources simultaneously. It covers text, images, audio, and other data formats. We will perform simple experiments with multimodality and images. This […]

The post Spring AI with Multimodality and Images appeared first on Piotr's TechBlog.

]]>
This article will teach you how to create a Spring Boot application that handles images and text using the Spring AI multimodality feature. Multimodality is the ability to understand and process information from different sources simultaneously. It covers text, images, audio, and other data formats. We will perform simple experiments with multimodality and images. This is the fourth part of my series of articles about Spring Boot and AI. It is worth reading the following posts before proceeding with the current one:

  1. https://piotrminkowski.com/2025/01/28/getting-started-with-spring-ai-and-chat-model: The first tutorial introduces the Spring AI project and its support for building applications based on chat models like OpenAI or Mistral AI.
  2. https://piotrminkowski.com/2025/01/30/getting-started-with-spring-ai-function-calling: The second tutorial shows Spring AI support for Java function calling with the OpenAI chat model.
  3. https://piotrminkowski.com/2025/02/24/using-rag-and-vector-store-with-spring-ai: The third tutorial shows Spring AI support for RAG (Retrieval Augmented Generation) and vector store.

Source Code

Feel free to use my source code if you’d like to try it out yourself. To do that, you must clone my sample GitHub repository. Then you should only follow my instructions.

Motivation for Multimodality with Spring AI

Multimodal large language models (LLMs) can process and generate text alongside other modalities, including images, audio, and video. This capability covers a use case where we want the LLM to detect something specific inside an image or describe its content. Let’s assume we have a list of input images. We want to find the image in that list that matches our description. For example, the description can ask the model to find the image that contains a specified item. The Spring AI Message API provides all the necessary elements to support multimodal LLMs. Here’s a diagram that illustrates our scenario.

Use Multimodality with Spring AI

We don’t need to include any specific library other than the Spring AI starter for a particular AI model. The default option is spring-ai-openai-spring-boot-starter. Our application uses images stored in the src/main/resources/images directory. Spring AI multimodality support requires the image to be passed inside the Media object. We load all the pictures from the classpath inside the constructor.

Recognize Items in the Image

The GET /images/find/{object} endpoint tries to find the image that contains the item determined by the object path variable. The AI model must return the position of the matching image in the input list. To achieve that, we create a UserMessage object that contains the user query and a list of Media objects. Once the model returns the position, the endpoint reads the image from the list and returns its content in the image/png format.

@RestController
@RequestMapping("/images")
public class ImageController {

    private final static Logger LOG = LoggerFactory
        .getLogger(ImageController.class);

    private final ChatClient chatClient;
    private List<Media> images;
    private List<Media> dynamicImages = new ArrayList<>();

    public ImageController(ChatClient.Builder chatClientBuilder) {
        this.chatClient = chatClientBuilder
                .defaultAdvisors(new SimpleLoggerAdvisor())
                .build();
        this.images = List.of(
                Media.builder().id("fruits").mimeType(MimeTypeUtils.IMAGE_PNG).data(new ClassPathResource("images/fruits.png")).build(),
                Media.builder().id("fruits-2").mimeType(MimeTypeUtils.IMAGE_PNG).data(new ClassPathResource("images/fruits-2.png")).build(),
                Media.builder().id("fruits-3").mimeType(MimeTypeUtils.IMAGE_PNG).data(new ClassPathResource("images/fruits-3.png")).build(),
                Media.builder().id("fruits-4").mimeType(MimeTypeUtils.IMAGE_PNG).data(new ClassPathResource("images/fruits-4.png")).build(),
                Media.builder().id("fruits-5").mimeType(MimeTypeUtils.IMAGE_PNG).data(new ClassPathResource("images/fruits-5.png")).build(),
                Media.builder().id("animals").mimeType(MimeTypeUtils.IMAGE_PNG).data(new ClassPathResource("images/animals.png")).build(),
                Media.builder().id("animals-2").mimeType(MimeTypeUtils.IMAGE_PNG).data(new ClassPathResource("images/animals-2.png")).build(),
                Media.builder().id("animals-3").mimeType(MimeTypeUtils.IMAGE_PNG).data(new ClassPathResource("images/animals-3.png")).build(),
                Media.builder().id("animals-4").mimeType(MimeTypeUtils.IMAGE_PNG).data(new ClassPathResource("images/animals-4.png")).build(),
                Media.builder().id("animals-5").mimeType(MimeTypeUtils.IMAGE_PNG).data(new ClassPathResource("images/animals-5.png")).build()
        );
    }

    @GetMapping(value = "/find/{object}", produces = MediaType.IMAGE_PNG_VALUE)
    @ResponseBody byte[] analyze(@PathVariable String object) {
        String msg = """
        Which picture contains %s.
        Return only a single picture.
        Return only the number that indicates its position in the media list.
        """.formatted(object);
        LOG.info(msg);

        UserMessage um = new UserMessage(msg, images);

        String content = this.chatClient.prompt(new Prompt(um))
                .call()
                .content();

        assert content != null;
        return images.get(Integer.parseInt(content)-1).getDataAsByteArray();
    }

}
Java
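The Integer.parseInt call above assumes the model answers with a bare number, which chat models don’t always do (a reply like “The answer is 3.” would throw a NumberFormatException). A small defensive helper (hypothetical, not part of the article’s code) can extract the first integer instead:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ModelOutputParser {

    private static final Pattern FIRST_INT = Pattern.compile("\\d+");

    // Extracts the first integer found in the model's reply,
    // so "3", "3." and "The answer is 3." all yield 3.
    static int parseIndex(String content) {
        Matcher m = FIRST_INT.matcher(content);
        if (!m.find()) {
            throw new IllegalArgumentException("No number in model reply: " + content);
        }
        return Integer.parseInt(m.group());
    }

    public static void main(String[] args) {
        System.out.println(parseIndex("The picture you are looking for is number 3.")); // prints 3
    }
}
```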

Let’s make a test call. We will look for the picture containing a banana. Here’s the AI model response after calling the http://localhost:8080/images/find/banana. You can try to make other test calls and find an image with e.g. an orange or a tomato.

spring-ai-multimodality-find-object

Describe Image Contents

On the other hand, we can ask the AI model to generate a short description of all images included as the Media content. The GET /images/describe endpoint merges two lists of images.

@GetMapping("/describe")
String[] describe() {
   UserMessage um = new UserMessage("Explain what do you see on each image.",
            List.copyOf(Stream.concat(images.stream(), dynamicImages.stream()).toList()));
      return this.chatClient.prompt(new Prompt(um))
              .call()
              .entity(String[].class);
}
Java

Once we call the http://localhost:8080/images/describe URL we will receive a compact description of all input images. The two highlighted descriptions have been generated for images from the dynamicImages List. These images were generated by the AI image model. We will discuss this in the next section.

spring-ai-multimodality-image-desc

Generate Images with AI Model

To generate an image using AI API we must inject the ImageModel bean. It provides a single call method that allows us to communicate with AI Models dedicated to image generation. This method takes the ImagePrompt object as an argument. Typically, we use the ImagePrompt constructor that takes instructions for image generation and options that customize the height, width, and number of images. We will generate a single (N=1) image with 1024 pixels in height and width. The AI model returns the image URL (responseFormat). Once the image is generated, we create an UrlResource object, create the Media object, and put it into the dynamicImages List. The GET /images/generate/{object} endpoint returns a byte array representation of the image object.

@RestController
@RequestMapping("/images")
public class ImageController {

    private final ChatClient chatClient;
    private final ImageModel imageModel;
    private List<Media> images;
    private List<Media> dynamicImages = new ArrayList<>();
    
    public ImageController(ChatClient.Builder chatClientBuilder,
                           ImageModel imageModel) {
        this.chatClient = chatClientBuilder
                .defaultAdvisors(new SimpleLoggerAdvisor())
                .build();
        this.imageModel = imageModel;
        // other initializations
    }
    
    @GetMapping(value = "/generate/{object}", produces = MediaType.IMAGE_PNG_VALUE)
    byte[] generate(@PathVariable String object) throws IOException {
        ImageResponse ir = imageModel.call(new ImagePrompt("Generate an image with " + object, ImageOptionsBuilder.builder()
                .height(1024)
                .width(1024)
                .N(1)
                .responseFormat("url")
                .build()));
        UrlResource url = new UrlResource(ir.getResult().getOutput().getUrl());
        LOG.info("Generated URL: {}", ir.getResult().getOutput().getUrl());
        dynamicImages.add(Media.builder()
                .id(UUID.randomUUID().toString())
                .mimeType(MimeTypeUtils.IMAGE_PNG)
                .data(url)
                .build());
        return url.getContentAsByteArray();
    }
    
}
Java

Do you remember the description of that image returned by the GET /images/describe endpoint? Here’s our image with strawberry generated by the AI model after calling the http://localhost:8080/images/generate/strawberry URL.

Here’s a similar test for the banana input parameter.

Use Vector Store with Spring AI Multimodality

Let’s consider how we can leverage a vector store in our scenario. We cannot insert image representations directly into a vector store since most popular vendors like OpenAI or Mistral AI do not provide image embedding models. We could integrate directly with a model like clip-vit-base-patch32 to generate image embeddings, but this article won’t cover such a scenario. Instead, the vector store may contain an image description and its location (or name). The GET /images/load endpoint provides a method for loading image descriptions into a vector store. It uses Spring AI multimodality support to generate a compact description of each image in the input list and then puts it into the store.

    @GetMapping("/load")
    void load() throws JsonProcessingException {
        String msg = """
        Explain what do you see on the image.
        Generate a compact description that explains only what is visible.
        """;
        for (Media image : images) {
            UserMessage um = new UserMessage(msg, image);
            String content = this.chatClient.prompt(new Prompt(um))
                    .call()
                    .content();

            var doc = Document.builder()
                    .id(image.getId())
                    .text(mapper.writeValueAsString(new ImageDescription(image.getId(), content)))
                    .build();
            store.add(List.of(doc));
            LOG.info("Document added: {}", image.getId());
        }
    }
Java
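The ImageDescription type serialized in the load() method isn’t shown in this excerpt. A minimal stand-in could be a Java record; the article serializes it with Jackson’s ObjectMapper, so treat the hand-rolled JSON below as an illustrative assumption rather than the actual implementation:

```java
// Hypothetical stand-in for the ImageDescription type used in load().
// Emits the equivalent JSON by hand to stay dependency-free; only the
// description field is quote-escaped, which is enough for this sketch.
record ImageDescription(String id, String description) {

    String toJson() {
        return "{\"id\":\"%s\",\"description\":\"%s\"}"
                .formatted(id, description.replace("\"", "\\\""));
    }

    public static void main(String[] args) {
        var doc = new ImageDescription("fruits", "A bowl with apples and bananas.");
        System.out.println(doc.toJson());
    }
}
```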

Finally, we can implement another endpoint that generates a new image and asks the AI model to generate an image description. Then, it performs a similarity search in a vector store to find the most similar image based on its text description.

    @GetMapping("/generate-and-match/{object}")
    List<Document> generateAndMatch(@PathVariable String object) throws IOException {
        ImageResponse ir = imageModel.call(new ImagePrompt("Generate an image with " + object, ImageOptionsBuilder.builder()
                .height(1024)
                .width(1024)
                .N(1)
                .responseFormat("url")
                .build()));
        UrlResource url = new UrlResource(ir.getResult().getOutput().getUrl());
        LOG.info("URL: {}", ir.getResult().getOutput().getUrl());

        String msg = """
        Explain what do you see on the image.
        Generate a compact description that explains only what is visible.
        """;

        UserMessage um = new UserMessage(msg, new Media(MimeTypeUtils.IMAGE_PNG, url));
        String content = this.chatClient.prompt(new Prompt(um))
                .call()
                .content();

        SearchRequest searchRequest = SearchRequest.builder()
                .query("Find the most similar description to this: " + content)
                .topK(2)
                .build();

        return store.similaritySearch(searchRequest);
    }
Java

Let’s test the GET /images/generate-and-match/{object} endpoint using the pineapple parameter. It returns the description of the fruits.png image from the classpath.

spring-ai-multimodality-vector-store

By the way, here’s the fruits.png image located in the /src/main/resources/images directory.

Final Thoughts

Spring AI provides multimodality and image generation support. All the features presented in this article work fine with OpenAI. It supports both the image model and multimodality. To read more about the support offered by other models, refer to the Spring AI chat and image model docs.

This article shows how we can use Spring AI and AI models to interact with images in various ways.

The post Spring AI with Multimodality and Images appeared first on Piotr's TechBlog.

]]>
https://piotrminkowski.com/2025/03/04/spring-ai-with-multimodality-and-images/feed/ 0 15557
Using RAG and Vector Store with Spring AI https://piotrminkowski.com/2025/02/24/using-rag-and-vector-store-with-spring-ai/ https://piotrminkowski.com/2025/02/24/using-rag-and-vector-store-with-spring-ai/#comments Mon, 24 Feb 2025 14:29:03 +0000 https://piotrminkowski.com/?p=15538 This article will teach you how to create a Spring Boot application that uses RAG (Retrieval Augmented Generation) and vector store with Spring AI. We will continue experiments with stock data, which were initiated in my previous article about Spring AI. This is the third part of my series of articles about Spring Boot and […]

The post Using RAG and Vector Store with Spring AI appeared first on Piotr's TechBlog.

This article will teach you how to create a Spring Boot application that uses RAG (Retrieval Augmented Generation) and vector store with Spring AI. We will continue experiments with stock data, which were initiated in my previous article about Spring AI. This is the third part of my series of articles about Spring Boot and AI. It is worth reading the following posts before proceeding with the current one:

  1. https://piotrminkowski.com/2025/01/28/getting-started-with-spring-ai-and-chat-model: The first tutorial introduces the Spring AI project and its support for building applications based on chat models like OpenAI or Mistral AI.
  2. https://piotrminkowski.com/2025/01/30/getting-started-with-spring-ai-function-calling: The second tutorial shows Spring AI support for Java function calling with the OpenAI chat model.

This article will show how to include one of the vector stores supported by Spring AI, together with the advisors dedicated to RAG support, in the sample application codebase used by the two previous articles. It will connect to the OpenAI API, but you can easily switch to other models using Mistral AI or Ollama support in Spring AI. For more details, please refer to my first article.

Source Code

If you would like to try it by yourself, you may always take a look at my source code. To do that, you must clone my sample GitHub repository. Then you should only follow my instructions.

Motivation for RAG with Spring AI

The problem to solve is similar to the one described in my previous article about the Spring AI function calling feature. Since the OpenAI model is trained on a static dataset, it does not have direct access to online services or APIs. We want it to analyze stock growth trends for the biggest companies in the US stock market. Therefore, we must obtain share prices from a public API that returns live stock market data. Then, we can store this data in our local database and integrate it with the sample Spring Boot AI application. Instead of a typical relational database, we will use a vector store. Queries in vector databases work differently from those in traditional relational databases: instead of looking for exact matches, they conduct similarity searches, retrieving the vectors most similar to a given input vector.
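To build an intuition for how such a similarity search ranks results, here is a minimal, framework-free sketch (the class and method names are mine, not part of Spring AI) of comparing embedding vectors by cosine similarity, the metric vector stores such as Pinecone commonly use:

```java
// Minimal illustration of how a vector store ranks documents:
// embeddings that point in a similar direction get a cosine score close to 1.
public class CosineDemo {

    static double cosine(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        double[] query = {0.9, 0.1, 0.0};
        double[] docA  = {0.8, 0.2, 0.0};   // similar direction -> high score
        double[] docB  = {0.0, 0.1, 0.9};   // different direction -> low score
        System.out.printf("docA=%.3f docB=%.3f%n", cosine(query, docA), cosine(query, docB));
    }
}
```

In a real store the vectors have hundreds of dimensions and are produced by an embedding model such as text-embedding-ada-002, but the ranking principle is the same.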

After loading all required data into a vector database, we must integrate it with the AI model. Spring AI provides a comfortable mechanism for that based on the Advisors API. We have already used some built-in advisors in the previous examples, e.g. to print detailed AI communication logs or enable chat memory. This time they will allow us to implement a Retrieval Augmented Generation (RAG) technique for our app. Thanks to that, the Spring Boot app will retrieve similar documents that best match a user query before sending a request to the AI model. These documents provide context for the query and are sent to the AI model alongside the user’s question.
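Conceptually, the RAG step performed by the advisor boils down to prepending the retrieved documents to the user question before calling the model. A simplified, framework-free sketch of that augmentation step (the method name and prompt layout are my own illustration, not the advisor's exact format):

```java
import java.util.List;

// The essence of the RAG step: stuff the retrieved documents into the prompt
// as context, then send the combined text to the chat model.
public class RagPromptDemo {

    static String augment(String question, List<String> retrievedDocs) {
        StringBuilder sb = new StringBuilder("Context:\n");
        for (String doc : retrievedDocs) {
            sb.append("- ").append(doc).append('\n');
        }
        return sb.append("\nQuestion: ").append(question).toString();
    }

    public static void main(String[] args) {
        String prompt = augment("Which share grew the most?",
                List.of("{\"name\":\"MSFT\",\"prices\":[430.1,427.9]}",
                        "{\"name\":\"NVDA\",\"prices\":[118.2,110.5]}"));
        System.out.println(prompt);
    }
}
```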

Here’s a simplified visualization of our process.

spring-ai-rag-arch

Vector Store with Spring AI

Set up Pinecone Database

In this section, we will prepare a vector store, integrate it with our Spring Boot application, and load some data into it. Spring AI supports various vector databases. It provides the VectorStore interface to interact directly with a vector store from our Spring Boot app. The full list of supported databases can be found in the Spring AI docs here.

We will proceed with the Pinecone database, a popular cloud-based vector database that allows us to store and search vectors efficiently. Alternatively, we can set up a local instance of another popular vector store, ChromaDB. In that case, you can use the docker-compose.yml file in the repository root directory to run the database with the docker compose up command. With Pinecone, we need to sign up for an account on their portal and then create an index. Several customizations are available, but the most important one is choosing the right embedding model. Since text-embedding-ada-002 is the default embedding model for OpenAI, we should choose that option. The name of our index is spring-ai. We can read the environment and project names from the generated host URL.

spring-ai-rag-pinecone

After creating an index, we should generate an API key.

Then, we will copy the generated key and export it as the PINECONE_TOKEN environment variable.

export PINECONE_TOKEN=<YOUR_PINECONE_APIKEY>
ShellSession

Integrate Spring Boot App with Pinecone Using Spring AI

Our Spring Boot application must include the spring-ai-pinecone-store-spring-boot-starter dependency to smoothly integrate with the Pinecone vector store.

<dependency>
  <groupId>org.springframework.ai</groupId>
  <artifactId>spring-ai-pinecone-store-spring-boot-starter</artifactId>
</dependency>
XML

Then, we must provide the connection settings and credentials for the Pinecone database in the Spring Boot application.properties file. At a minimum, these include the Pinecone API key, environment name, project name, and index name.

spring.ai.vectorstore.pinecone.apiKey = ${PINECONE_TOKEN}
spring.ai.vectorstore.pinecone.environment = aped-4627-b74a
spring.ai.vectorstore.pinecone.projectId = fsbak04
spring.ai.vectorstore.pinecone.index-name = spring-ai
Properties

After providing all the required configuration settings, we can inject and use the VectorStore bean, e.g. in our application REST controller. In the following code fragment, we load input data into the vector store and perform a simple similarity search to find the stock with the strongest growth trend. We query the Twelvedata API individually for each company on the list and deserialize the response into a StockData object. Then we create a Spring AI Document object, which contains the name of a company and its closing share prices for the last 10 days. The data is written in JSON format.

@RestController
@RequestMapping("/stocks")
public class StockController {

    private final ObjectMapper mapper = new ObjectMapper();
    private final static Logger LOG = LoggerFactory.getLogger(StockController.class);
    private final RestTemplate restTemplate;
    private final VectorStore store;

    @Value("${STOCK_API_KEY}")
    private String apiKey;

    public StockController(VectorStore store,
                           RestTemplate restTemplate) {
        this.store = store;
        this.restTemplate = restTemplate;
    }

    @PostMapping("/load-data")
    void load() throws JsonProcessingException {
        final List<String> companies = List.of("AAPL", "MSFT", "GOOG", "AMZN", "META", "NVDA");
        for (String company : companies) {
            StockData data = restTemplate.getForObject("https://api.twelvedata.com/time_series?symbol={0}&interval=1day&outputsize=10&apikey={1}",
                    StockData.class,
                    company,
                    apiKey);
            if (data != null && data.getValues() != null) {
                var list = data.getValues().stream().map(DailyStockData::getClose).toList();
                var doc = Document.builder()
                        .id(company)
                        .text(mapper.writeValueAsString(new Stock(company, list)))
                        .build();
                store.add(List.of(doc));
                LOG.info("Document added: {}", company);
            }
        }
    }
    
    @GetMapping("/docs")
    List<Document> query() {
        SearchRequest searchRequest = SearchRequest.builder()
                .query("Find the most growth trends")
                .topK(2)
                .build();
        List<Document> docs = store.similaritySearch(searchRequest);
        return docs;
    }
    
}
Java

Once we start the application and call the POST /stocks/load-data endpoint, we should see 6 records loaded into the target store. You can verify the content of the database in the Pinecone index browser.

spring-ai-rag-vector-store-data

Then we can interact directly with a vector store by calling the GET /stocks/docs endpoint.

curl http://localhost:8080/stocks/docs
ShellSession

Implement RAG with Spring AI

Use QuestionAnswerAdvisor

Previously, we loaded data into the target vector store and performed a simple search for the strongest growth trend. Our main goal in this section is to incorporate the relevant data into the AI model prompt. We can implement RAG with Spring AI in two ways, using different advisors. Let’s begin with QuestionAnswerAdvisor. To perform RAG, we must provide an instance of QuestionAnswerAdvisor to the ChatClient bean. The QuestionAnswerAdvisor constructor takes the VectorStore instance as an input argument.

@RequestMapping("/v1/most-growth-trend")
String getBestTrend() {
   PromptTemplate pt = new PromptTemplate("""
            {query}.
            Which {target} is the most % growth?
            The 0 element in the prices table is the latest price, while the last element is the oldest price.
            """);

   Prompt p = pt.create(
            Map.of("query", "Find the most growth trends",
                   "target", "share")
   );

   return this.chatClient.prompt(p)
            .advisors(new QuestionAnswerAdvisor(store))
            .call()
            .content();
}
Java
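For reference, the percentage-growth comparison the prompt asks the model to perform can be expressed in plain Java as follows (the symbols and prices below are made up for illustration):

```java
import java.util.List;
import java.util.Map;

// Percentage growth between the oldest price (last element) and the
// latest price (element 0) - the comparison the prompt asks the model to do.
public class GrowthDemo {

    static double growthPercent(List<Double> prices) {
        double latest = prices.get(0);
        double oldest = prices.get(prices.size() - 1);
        return (latest - oldest) / oldest * 100.0;
    }

    static String mostGrowth(Map<String, List<Double>> closesBySymbol) {
        return closesBySymbol.entrySet().stream()
                .max((a, b) -> Double.compare(growthPercent(a.getValue()), growthPercent(b.getValue())))
                .orElseThrow()
                .getKey();
    }

    public static void main(String[] args) {
        Map<String, List<Double>> closes = Map.of(
                "AAPL", List.of(105.0, 102.0, 100.0),   // +5%
                "NVDA", List.of(120.0, 110.0, 100.0));  // +20%
        System.out.println(mostGrowth(closes));
    }
}
```

Spelling out the element order in the prompt matters precisely because the model has to reproduce this arithmetic on the retrieved documents.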

Then, we can call the GET /stocks/v1/most-growth-trend endpoint to see the AI model’s response. As it turns out, the result is not very accurate.

Let’s improve the previous code a little. We will publish a new version of the AI prompt under the GET /stocks/v1-1/most-growth-trend endpoint. This time, we build a SearchRequest object that returns the top 3 records with a 0.7 similarity threshold. The newly created SearchRequest object must be passed as an argument to the QuestionAnswerAdvisor constructor.

@RequestMapping("/v1-1/most-growth-trend")
String getBestTrendV11() {
   PromptTemplate pt = new PromptTemplate("""
            Which share is the most % growth?
            The 0 element in the prices table is the latest price, while the last element is the oldest price.
            Return a full name of company instead of a market shortcut. 
            """);

   SearchRequest searchRequest = SearchRequest.builder()
            .query("""
            Find the most growth trends.
            The 0 element in the prices table is the latest price, while the last element is the oldest price.
            """)
            .topK(3)
            .similarityThreshold(0.7)
            .build();

   return this.chatClient.prompt(pt.create())
            .advisors(new QuestionAnswerAdvisor(store, searchRequest))
            .call()
            .content();
}
Java
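Under the hood, topK and similarityThreshold simply narrow down the candidate list: hits below the threshold are discarded, and only the K highest-scoring matches remain. A plain-Java sketch of that selection logic (the record and the scores are hypothetical):

```java
import java.util.Comparator;
import java.util.List;

// How topK + similarityThreshold narrow down search results:
// discard low-similarity hits, then keep only the K highest-scoring ones.
public class TopKDemo {

    record Scored(String id, double score) { }

    static List<Scored> select(List<Scored> hits, double threshold, int topK) {
        return hits.stream()
                .filter(h -> h.score() >= threshold)
                .sorted(Comparator.comparingDouble(Scored::score).reversed())
                .limit(topK)
                .toList();
    }

    public static void main(String[] args) {
        List<Scored> hits = List.of(
                new Scored("MSFT", 0.91), new Scored("AAPL", 0.85),
                new Scored("META", 0.74), new Scored("AMZN", 0.55));
        System.out.println(select(hits, 0.7, 3)); // AMZN falls below the threshold
    }
}
```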

Now, the results are more accurate. The model also returns the full names of the companies instead of their market ticker symbols.

spring-ai-rag-model-response

Use RetrievalAugmentationAdvisor

Instead of the QuestionAnswerAdvisor class, we can also use the experimental RetrievalAugmentationAdvisor. It provides an out-of-the-box implementation of the most common RAG flows, based on a modular architecture. There are several built-in modules we can use with RetrievalAugmentationAdvisor. We will include the RewriteQueryTransformer module, which uses an LLM to rewrite the user query to produce better results when querying the target vector database. It requires the query and target placeholders to be present in the prompt template. Thanks to that transformer, we can retrieve the optimal set of records for a percentage growth calculation.
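To give a rough intuition of where a query transformer sits in the pipeline, the toy sketch below rewrites a query with a hard-coded rule. Keep in mind that the real RewriteQueryTransformer delegates the rewriting to the LLM itself; this is only an illustration of the query-to-query function shape:

```java
import java.util.function.UnaryOperator;

// A query transformer is conceptually just a function from query to query.
// The real RewriteQueryTransformer asks the LLM to do the rewriting; this toy
// version expands a vague phrase to show where such a step sits in the flow.
public class QueryRewriteDemo {

    static final UnaryOperator<String> REWRITER = q -> q.replace(
            "most growth trends",
            "largest percentage price increase over the last 10 days");

    public static void main(String[] args) {
        System.out.println(REWRITER.apply("Find the most growth trends"));
    }
}
```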

@RestController
@RequestMapping("/stocks")
public class StockController {

    private final ObjectMapper mapper = new ObjectMapper();
    private final static Logger LOG = LoggerFactory.getLogger(StockController.class);
    private final ChatClient chatClient;
    private final RewriteQueryTransformer.Builder rqtBuilder;
    private final RestTemplate restTemplate;
    private final VectorStore store;

    @Value("${STOCK_API_KEY}")
    private String apiKey;

    public StockController(ChatClient.Builder chatClientBuilder,
                           VectorStore store,
                           RestTemplate restTemplate) {
        this.chatClient = chatClientBuilder
                .defaultAdvisors(new SimpleLoggerAdvisor())
                .build();
        this.rqtBuilder = RewriteQueryTransformer.builder()
                .chatClientBuilder(chatClientBuilder);
        this.store = store;
        this.restTemplate = restTemplate;
    }
    
    // other methods ...
    
    @RequestMapping("/v2/most-growth-trend")
    String getBestTrendV2() {
        PromptTemplate pt = new PromptTemplate("""
                {query}.
                Which {target} is the most % growth?
                The 0 element in the prices table is the latest price, while the last element is the oldest price.
                """);

        Prompt p = pt.create(Map.of("query", "Find the most growth trends", "target", "share"));

        Advisor retrievalAugmentationAdvisor = RetrievalAugmentationAdvisor.builder()
                .documentRetriever(VectorStoreDocumentRetriever.builder()
                        .similarityThreshold(0.7)
                        .topK(3)
                        .vectorStore(store)
                        .build())
                .queryTransformers(rqtBuilder.promptTemplate(pt).build())
                .build();

        return this.chatClient.prompt(p)
                .advisors(retrievalAugmentationAdvisor)
                .call()
                .content();
    }

}
Java

Once again, we can verify the AI model response by calling the GET /stocks/v2/most-growth-trend endpoint. The response is similar to those generated by the GET /stocks/v1-1/most-growth-trend endpoint.

Run the Application

Just a reminder: before running the application, we should provide the OpenAI and Twelvedata API tokens.

$ export OPEN_AI_TOKEN=<YOUR_OPEN_AI_TOKEN>
$ export STOCK_API_KEY=<YOUR_STOCK_API_KEY>
$ mvn spring-boot:run
ShellSession

Final Thoughts

In this article, you learned how to use an important AI technique called Retrieval Augmented Generation (RAG) with Spring AI. Spring AI simplifies RAG by providing built-in support for vector stores and easy incorporation of data into the chat model through the Advisor API. However, since RetrievalAugmentationAdvisor is an experimental feature, we cannot rule out changes in future releases.

Getting Started with Spring AI Function Calling https://piotrminkowski.com/2025/01/30/getting-started-with-spring-ai-function-calling/ https://piotrminkowski.com/2025/01/30/getting-started-with-spring-ai-function-calling/#comments Thu, 30 Jan 2025 16:40:30 +0000 https://piotrminkowski.com/?p=15522 This article will show you how to use Spring AI support for Java function calling with the OpenAI chat model. The Spring AI function calling feature lets us connect the LLM capabilities with external APIs or systems. OpenAI’s models are trained to know when to call a function. We will work on implementing a Java […]

The post Getting Started with Spring AI Function Calling appeared first on Piotr's TechBlog.

This article will show you how to use Spring AI support for Java function calling with the OpenAI chat model. The Spring AI function calling feature lets us connect the LLM capabilities with external APIs or systems. OpenAI’s models are trained to know when to call a function. We will work on implementing a Java function that takes the call arguments from the AI model and sends the result back. Our main goal is to connect to the third-party APIs to provide these results. Then the AI model uses the provided results to complete the conversation.

This article is the second part of a series describing some of the Spring AI project’s most notable features. Before reading on, I recommend checking out my introduction to Spring AI, which is available here. The first part describes features such as prompts, structured output, chat memory, and built-in advisors. Additionally, it demonstrates the capability to switch between the most popular AI chat model API providers.

Source Code

If you would like to try it by yourself, you may always take a look at my source code. To do that, you must clone my sample GitHub repository. Then you should only follow my instructions.

Problem

Whenever I create a new article or example related to AI, I like to define the problem I’m trying to solve. The problem we will solve in this exercise is visible in the following prompt template. I’m asking the AI model about the value of my stock wallet. However, the model doesn’t know how many shares I have and can’t get the latest stock prices. Since the OpenAI model is trained on a static dataset, it does not have direct access to online services or APIs.

spring-ai-function-calling-prompt

So, in this case, we should provide private data with our wallet structure and “connect” our model with a public API that returns live stock market data. Let’s see how we tackle this challenge with Spring AI function calling.

Create Spring Functions

WalletService Supplier

We will begin with the source code. Then we will visualize the whole process in a diagram. Spring AI supports different ways of registering a function to call. You can read more about it in the Spring AI docs here. We will choose the approach based on plain Java functions defined as beans in the Spring application context. This approach allows us to use interfaces from the java.util.function package, such as Function, Supplier, or Consumer. Our first function takes no input, so it implements the Supplier interface. It simply returns a list of the shares we currently own. It obtains this information from the database through the Spring Data WalletRepository bean.
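As a quick, Spring-free refresher, here is how those functional interfaces differ (the stock-flavored lambdas below are placeholders, not the real services):

```java
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.function.Supplier;

// The functional interfaces Spring AI can expose as callable tools:
// Supplier takes nothing and returns a value, Function maps input to output,
// Consumer takes a value and returns nothing.
public class FunctionalInterfacesDemo {

    static final Supplier<String> WALLET = () -> "AAPL:100";            // no input -> value
    static final Function<String, Integer> QUANTITY =
            s -> Integer.parseInt(s.split(":")[1]);                     // input -> output
    static final Consumer<String> AUDIT = System.out::println;          // input -> side effect

    public static void main(String[] args) {
        String shares = WALLET.get();
        int qty = QUANTITY.apply(shares);
        AUDIT.accept("Owned quantity: " + qty);
    }
}
```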

public class WalletService implements Supplier<WalletResponse> {

    private WalletRepository walletRepository;

    public WalletService(WalletRepository walletRepository) {
        this.walletRepository = walletRepository;
    }

    @Override
    public WalletResponse get() {
        return new WalletResponse((List<Share>) walletRepository.findAll());
    }
}
Java

Information about the number of owned shares is stored in the share table. Each row contains a company name and the quantity of that company’s shares.

@Entity
public class Share {

    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;
    private String company;
    private int quantity;
    
    // ... GETTERS/SETTERS
}
Java

The Spring Boot application launches an embedded, in-memory database and inserts test data into the share table. Our wallet contains some of the most popular companies on the U.S. stock market, including Amazon, Meta, and Microsoft.

insert into share(id, company, quantity) values (1, 'AAPL', 100);
insert into share(id, company, quantity) values (2, 'AMZN', 300);
insert into share(id, company, quantity) values (3, 'META', 300);
insert into share(id, company, quantity) values (4, 'MSFT', 400);
insert into share(id, company, quantity) values (5, 'NVDA', 200);
SQL

StockService Function

Our second function takes an input argument and returns an output. Therefore, it implements the Function interface. It must interact with a live stock market API to get the current price of a given company’s shares. We use the api.twelvedata.com service to access stock exchange quotes. The function returns the current price wrapped in a StockResponse object.

public class StockService implements Function<StockRequest, StockResponse> {

    private static final Logger LOG = LoggerFactory.getLogger(StockService.class);

    @Autowired
    RestTemplate restTemplate;
    @Value("${STOCK_API_KEY}")
    String apiKey;

    @Override
    public StockResponse apply(StockRequest stockRequest) {
        StockData data = restTemplate.getForObject("https://api.twelvedata.com/time_series?symbol={0}&interval=1min&outputsize=1&apikey={1}",
                StockData.class,
                stockRequest.company(),
                apiKey);
        DailyStockData latestData = data.getValues().get(0);
        LOG.info("Get stock prices: {} -> {}", stockRequest.company(), latestData.getClose());
        return new StockResponse(Float.parseFloat(latestData.getClose()));
    }
}
Java

Here are the Java records for request and response objects.

public record StockRequest(String company) { }

public record StockResponse(Float price) { }
Java

To summarize, the first function accesses the database to get the quantity of owned shares, while the second function communicates with a public API to get the current price of a company’s shares.

Spring AI Function Calling Flow

Architecture

Here’s the diagram that visualizes the flow of our application. The Spring AI Prompt object must contain references to our function beans. This allows the OpenAI model to recognize when a function should be called. However, the model does not call the function directly; it only generates the JSON used to call the function on the application side. Each function must provide a name, description, and signature (as a JSON schema) to let the model know what arguments it expects. We have two functions. The WalletService function returns a list of owned company shares, while the StockService function takes a single company name as an argument. This is where the magic happens: the chat model should call the StockService function for each object in the list returned by the WalletService function. The final response combines the results received from both functions.

spring-ai-function-calling-arch
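The application-side half of this flow can be pictured as a lookup in a registry of named functions. The sketch below simulates it with stubbed prices; the dispatched name and argument stand in for the JSON tool call the real model generates:

```java
import java.util.Map;
import java.util.function.Function;

// Simulates the application-side half of tool calling: the model emits the
// name of a registered function plus its arguments; the app looks the function
// up, invokes it, and sends the result back in the next model request.
public class ToolDispatchDemo {

    static final Map<String, Function<String, String>> TOOLS = Map.of(
            "latestStockPrices", company -> switch (company) {  // stubbed prices
                case "AAPL" -> "232.80";
                case "MSFT" -> "447.67";
                default -> "unknown";
            });

    static String dispatch(String functionName, String argument) {
        return TOOLS.get(functionName).apply(argument);
    }

    public static void main(String[] args) {
        // Stand-in for the JSON the model would generate:
        // {"name": "latestStockPrices", "arguments": {"company": "AAPL"}}
        System.out.println(dispatch("latestStockPrices", "AAPL"));
    }
}
```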

Implementation

To implement the flow visualized above, we must register our functions as Spring beans. The method name determines the name of the bean in the Spring context. Each bean declaration should also contain a description, which helps the model understand when to call the function. The WalletService function is registered under the numberOfShares name, while the StockService function is registered under the latestStockPrices name. The WalletService doesn’t take any input arguments, but it injects the WalletRepository bean to interact with the database.

@Bean
@Description("Number of shares for each company in my portfolio")
public Supplier<WalletResponse> numberOfShares(WalletRepository walletRepository) {
    return new WalletService(walletRepository);
}

@Bean
@Description("Latest stock prices")
public Function<StockRequest, StockResponse> latestStockPrices() {
    return new StockService();
}
Java

Finally, let’s take a look at the REST controller implementation. It exposes the GET /wallet endpoint that communicates with the OpenAI chat model. When creating the prompt, we should register both of our functions using the OpenAiChatOptions class and its function method. The reference contains only the function’s @Bean name.

@RestController
@RequestMapping("/wallet")
public class WalletController {

    private final ChatClient chatClient;

    public WalletController(ChatClient.Builder chatClientBuilder) {
        this.chatClient = chatClientBuilder
                .defaultAdvisors(new SimpleLoggerAdvisor())
                .build();
    }

    @GetMapping
    String calculateWalletValue() {
        PromptTemplate pt = new PromptTemplate("""
        What’s the current value in dollars of my wallet based on the latest stock daily prices ?
        """);

        return this.chatClient.prompt(pt.create(
                OpenAiChatOptions.builder()
                        .function("numberOfShares")
                        .function("latestStockPrices")
                        .build()))
                .call()
                .content();
    }
}
Java

Run and Test the Spring AI Application

Before running the app, we must export OpenAI and Twelvedata API keys as environment variables.

export STOCK_API_KEY=<YOUR_TWELVEDATA_API_KEY>
export OPEN_AI_TOKEN=<YOUR_OPENAI_TOKEN>
ShellSession

We must create an account on the Twelvedata platform to obtain its API key. The Twelvedata platform provides an API to get the latest stock prices.

Of course, we must have an API key for the OpenAI platform. Once you create an account there, go to the API keys page. Then choose a name for your token and copy it after creation.

Then, we run our Spring AI app using the following Maven command:

mvn spring-boot:run
ShellSession

After running the app, we can call the /wallet endpoint to calculate our stock portfolio.

curl http://localhost:8080/wallet
ShellSession

Here’s the response returned by OpenAI for the provided test data and the current stock market prices.

Then, let’s switch to the application logs. We can see that the StockService function was called five times – once for every company in the wallet. Since we added the SimpleLoggerAdvisor to the ChatClient and set the logging.level.org.springframework.ai property to DEBUG, we can observe detailed logs with the requests and responses from the OpenAI chat model.

spring-ai-function-calling-logs

Final Thoughts

In this article, we analyzed the Spring AI integration with function support in AI models. OpenAI’s function calling is a powerful feature that enhances how AI models interact with external tools, APIs, and structured data. It makes AI more interactive and practical for real-world applications. Spring AI provides a flexible way to register and invoke such functions. However, it still requires attention from developers, who need to define clear function schemas and handle edge cases.
