mistral-ai Archives - Piotr's TechBlog

Getting Started with Quarkus LangChain4j and Chat Model

piotr.minkowski — Wed, 18 Jun 2025 16:36:08 +0000

This article will teach you how to use the Quarkus LangChain4j project to build applications based on different chat models. The Quarkus AI Chat Model offers a portable and straightforward interface, enabling seamless interaction with these models. Our sample Quarkus application will switch between three popular chat models provided by OpenAI, Mistral AI, and Ollama. This article is the first in a series explaining AI concepts with Quarkus LangChain4j. Look for more on my blog in this area soon. The idea of this tutorial is very similar to the series on Spring AI. Therefore, you will be able to easily compare the two approaches, as the sample application will do the same thing as an analogous Spring Boot application.

If you like Quarkus, then you can find quite a few articles about it on my blog. Just go to the Quarkus category and find the topic you are interested in.

Source Code

Feel free to use my source code if you’d like to try it out yourself. To do that, you must clone my sample GitHub repository. Then you should only follow my instructions.

Motivation

Whenever I create a new article or example related to AI, I like to define the problem I’m trying to solve. The problem this example solves is very trivial. I publish numerous small demo apps to explain complex technology concepts. These apps typically require data to display a demo output. Usually, I add demo data by myself or use a library like Datafaker to do it for me. This time, we can leverage the AI Chat Models API for that. Let’s begin!

I covered the same topic for Spring Boot earlier. For a comparison of the features offered by both frameworks for simple interaction with the AI chat model, you can read this article on Spring AI.

Dependencies

The sample application uses the latest version of the Quarkus framework.


  
    
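<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>io.quarkus.platform</groupId>
      <artifactId>quarkus-bom</artifactId>
      <version>${quarkus.platform.version}</version>
      <type>pom</type>
      <scope>import</scope>
    </dependency>
  </dependencies>
</dependencyManagement>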

XML

You can easily switch between multiple AI model implementations by activating a dedicated Maven profile. By default, the open-ai profile is active. It includes the quarkus-langchain4j-openai module in the Maven dependencies. You can also activate the mistral-ai or ollama profile. In that case, the quarkus-langchain4j-mistral-ai or quarkus-langchain4j-ollama module is included instead of the LangChain4j OpenAI extension.


  
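<profiles>
  <profile>
    <id>open-ai</id>
    <activation>
      <activeByDefault>true</activeByDefault>
    </activation>
    <dependencies>
      <dependency>
        <groupId>io.quarkiverse.langchain4j</groupId>
        <artifactId>quarkus-langchain4j-openai</artifactId>
        <version>${quarkus-langchain4j.version}</version>
      </dependency>
    </dependencies>
  </profile>
  <profile>
    <id>mistral-ai</id>
    <dependencies>
      <dependency>
        <groupId>io.quarkiverse.langchain4j</groupId>
        <artifactId>quarkus-langchain4j-mistral-ai</artifactId>
        <version>${quarkus-langchain4j.version}</version>
      </dependency>
    </dependencies>
  </profile>
  <profile>
    <id>ollama</id>
    <dependencies>
      <dependency>
        <groupId>io.quarkiverse.langchain4j</groupId>
        <artifactId>quarkus-langchain4j-ollama</artifactId>
        <version>${quarkus-langchain4j.version}</version>
      </dependency>
    </dependencies>
  </profile>
</profiles>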

XML

The sample Quarkus application is simple. It exposes some REST endpoints and communicates with a selected AI model to return an AI-generated response via each endpoint. So, you need to include only core Quarkus modules like quarkus-rest-jackson or quarkus-arc. To implement JUnit tests with REST API, it also includes the quarkus-junit5 and rest-assured modules in the test scope.


  
  
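<dependencies>
  <dependency>
    <groupId>io.quarkus</groupId>
    <artifactId>quarkus-rest-jackson</artifactId>
  </dependency>
  <dependency>
    <groupId>io.quarkus</groupId>
    <artifactId>quarkus-arc</artifactId>
  </dependency>

  <dependency>
    <groupId>io.quarkus</groupId>
    <artifactId>quarkus-junit5</artifactId>
    <scope>test</scope>
  </dependency>
  <dependency>
    <groupId>io.rest-assured</groupId>
    <artifactId>rest-assured</artifactId>
    <scope>test</scope>
  </dependency>
</dependencies>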

XML

Quarkus LangChain4j Chat Models Integration

Quarkus LangChain4j takes a declarative approach to interacting with AI chat models. First, you annotate an interface that declares AI-oriented methods with @RegisterAiService. Then you put the model instructions and the input prompt inside the @SystemMessage and @UserMessage annotations on each method. Here is the sample PersonAiService interface, which defines two methods. The generatePersonList method asks the AI model to generate a list of 10 unique persons in a form consistent with the structure of the returned object. The getPersonById method must read the previously generated list from chat memory and return the data of the person with the specified id field.

@RegisterAiService
@ApplicationScoped
public interface PersonAiService {

    @SystemMessage("""
        You are a helpful assistant that generates realistic person data.
        Always respond with valid JSON format.
        """)
    @UserMessage("""
        Generate exactly 10 unique persons

        Requirements:
        - Each person must have a unique integer ID (like 1, 2, 3, etc.)
        - Use realistic first and last names per each nationality
        - Ages should be between 18 and 80
        - Return ONLY the JSON array, no additional text
        """)
    PersonResponse generatePersonList(@MemoryId int userId);

    @SystemMessage("""
        You are a helpful assistant that can recall generated person data from chat memory.
        """)
    @UserMessage("""
        In the previously generated list of persons for user {userId}, find and return the person with id {id}.
        
        Return ONLY the JSON object, no additional text.
        """)
    Person getPersonById(@MemoryId int userId, int id);

}

Java

There are a few more things to add regarding the code snippet above. The beans created by @RegisterAiService are @RequestScoped by default. According to the Quarkus LangChain4j documentation, this default makes it possible to remove objects from the chat memory. In the case seen above, the list of people is generated per user ID, which acts as the key by which we search the chat memory. To guarantee that the getPersonById method finds the list of persons generated for a given @MemoryId, the PersonAiService interface must be annotated with @ApplicationScoped. The InMemoryChatMemoryStore implementation is enabled by default, so you don’t need to declare any additional beans to use it.
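To make the role of the memory ID more concrete, here is a minimal sketch of a chat memory keyed by that ID. It is only an illustration in plain Java: the ChatMemorySketch class and its append and history methods are hypothetical names invented for this example, not part of the Quarkus LangChain4j API, and the sketch assumes single-threaded use.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified stand-in for a chat memory store keyed by memory id.
// It is NOT the Quarkus LangChain4j implementation, only an illustration.
final class ChatMemorySketch {

    // One message history per memory id (the userId in the article's example).
    // A plain HashMap is used, so this sketch is not thread-safe.
    private final Map<Object, List<String>> histories = new HashMap<>();

    // Appends a message to the conversation identified by memoryId.
    void append(Object memoryId, String message) {
        histories.computeIfAbsent(memoryId, id -> new ArrayList<>()).add(message);
    }

    // Returns an immutable snapshot of the conversation, or an empty list if none exists.
    List<String> history(Object memoryId) {
        return List.copyOf(histories.getOrDefault(memoryId, List.of()));
    }

    public static void main(String[] args) {
        ChatMemorySketch memory = new ChatMemorySketch();
        memory.append(1, "Generate exactly 10 unique persons");
        memory.append(1, "[ ...persons generated for user 1... ]");
        memory.append(2, "Generate exactly 10 unique persons");

        // Each memory id sees only its own conversation.
        System.out.println("user 1: " + memory.history(1));
        System.out.println("user 2: " + memory.history(2));
        System.out.println("user 3: " + memory.history(3));
    }
}
```

Because each ID has its own history, a list generated for one userId is not visible when the lookup is made with a different userId.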

Quarkus LangChain4j can automatically map the LLM’s JSON response to the output POJO. However, at the time of writing, it is not possible to map it directly to an output collection. Therefore, you must wrap the output list in an additional class, as shown below.

public class PersonResponse {

    private List<Person> persons;

    public List<Person> getPersons() {
        return persons;
    }

    public void setPersons(List<Person> persons) {
        this.persons = persons;
    }
}

Java

Here’s the Person class:

public class Person {

    private Integer id;
    private String firstName;
    private String lastName;
    private int age;
    private String nationality;
    private Gender gender;
    
    // GETTERS and SETTERS

}

Java

Finally, the last part of our implementation is REST endpoints. Here’s the REST controller that injects and uses PersonAiService to interact with the AI chat model. It exposes two endpoints: GET /api/{userId}/persons and GET /api/{userId}/persons/{id}. You can generate several lists of persons by specifying the userId path parameter.

@Path("/api")
@Produces(MediaType.APPLICATION_JSON)
@Consumes(MediaType.APPLICATION_JSON)
public class PersonController {

    private static final Logger LOG = Logger.getLogger(PersonController.class);

    PersonAiService personAiService;

    public PersonController(PersonAiService personAiService) {
        this.personAiService = personAiService;
    }

    @GET
    @Path("/{userId}/persons")
    public PersonResponse generatePersons(@PathParam("userId") int userId) {
        return personAiService.generatePersonList(userId);
    }

    @GET
    @Path("/{userId}/persons/{id}")
    public Person getPersonById(@PathParam("userId") int userId, @PathParam("id") int id) {
        return personAiService.getPersonById(userId, id);
    }

}

Java

Use Different AI Models with Quarkus LangChain4j

Configuration Properties

Here is a configuration defined within the application.properties file. Before proceeding, you must generate the OpenAI and Mistral AI API tokens and export them as environment variables. Additionally, you can enable logging of requests and responses in AI model communication. It is also worth increasing the default timeout for a single request from 10 seconds to a higher value, such as 20 seconds.

quarkus.langchain4j.chat-model.provider = ${AI_MODEL_PROVIDER:openai}
quarkus.langchain4j.log-requests = true
quarkus.langchain4j.log-responses = true

# OpenAI Configuration
quarkus.langchain4j.openai.api-key = ${OPEN_AI_TOKEN}
quarkus.langchain4j.openai.timeout = 20s

# Mistral AI Configuration
quarkus.langchain4j.mistralai.api-key = ${MISTRAL_AI_TOKEN}
quarkus.langchain4j.mistralai.timeout = 20s

# Ollama Configuration
quarkus.langchain4j.ollama.base-url = ${OLLAMA_BASE_URL:http://localhost:11434}

Plaintext
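The ${NAME:default} expressions in this file fall back to the value after the colon when the variable is not set. The sketch below imitates that lookup in plain Java to show the idea. The PlaceholderSketch class and its resolve method are hypothetical; the real resolution is done by the Quarkus configuration system, and its behavior for missing values may differ from the exception thrown here.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration of ${NAME} and ${NAME:default} resolution.
// This is not the Quarkus implementation, only a sketch of the idea.
final class PlaceholderSketch {

    // Group 1 is the variable name; optional group 2 is the default,
    // which may itself contain colons (for example a URL).
    private static final Pattern EXPR =
            Pattern.compile("\\$\\{([^:}]+)(?::([^}]*))?\\}");

    // Replaces every expression in value using env, falling back to the default.
    static String resolve(String value, Map<String, String> env) {
        Matcher m = EXPR.matcher(value);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            String fromEnv = env.get(m.group(1));
            String fallback = m.group(2);
            if (fromEnv == null && fallback == null) {
                // Assumption of this sketch: a missing value without a default is an error.
                throw new IllegalStateException("No value for " + m.group(1));
            }
            String resolved = fromEnv != null ? fromEnv : fallback;
            m.appendReplacement(out, Matcher.quoteReplacement(resolved));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> env = Map.of("AI_MODEL_PROVIDER", "mistralai");
        System.out.println(resolve("${AI_MODEL_PROVIDER:openai}", env));
        System.out.println(resolve("${OLLAMA_BASE_URL:http://localhost:11434}", env));
    }
}
```

With this logic, the OpenAI provider is selected unless AI_MODEL_PROVIDER is exported, while the API token expressions have no default and therefore need the environment variable to be present.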

To run a sample Quarkus application and connect it with OpenAI, you must set the OPEN_AI_TOKEN environment variable. Since the open-ai Maven profile is activated by default, you don’t need to set anything else while running an app.

$ export OPEN_AI_TOKEN=<your_openai_token>
$ mvn quarkus:dev

ShellSession

Then, you can call the GET /api/{userId}/persons endpoint with different userId path variable values. Here are sample API requests and responses.

After that, you can call the GET /api/{userId}/persons/{id} endpoint to return a specified person found in the chat memory.

Switch Between AI Models

Then, you can repeat the same exercise with the Mistral AI model. You must set AI_MODEL_PROVIDER to mistralai, export the API token as the MISTRAL_AI_TOKEN environment variable, and enable the mistral-ai profile while running the app.

$ export AI_MODEL_PROVIDER=mistralai
$ export MISTRAL_AI_TOKEN=<your_mistralai_token>
$ mvn quarkus:dev -Pmistral-ai

ShellSession

The app should start successfully.

Once it does, you can repeat the same sequence of requests as before for OpenAI.

$ curl http://localhost:8080/api/1/persons
$ curl http://localhost:8080/api/2/persons
$ curl http://localhost:8080/api/1/persons/1
$ curl http://localhost:8080/api/2/persons/1

ShellSession

You can check the request sent to the AI model in the application logs.

Here’s a log showing an AI chat model response:

Finally, you can run a test with Ollama. By default, the LangChain4j extension for Ollama uses the llama3.2 model. You can change it by setting the quarkus.langchain4j.ollama.chat-model.model-id property in the application.properties file. Assuming that you use the llama3.3 model, here’s your configuration:

quarkus.langchain4j.ollama.base-url = ${OLLAMA_BASE_URL:http://localhost:11434}
quarkus.langchain4j.ollama.chat-model.model-id = llama3.3
quarkus.langchain4j.ollama.timeout = 60s

Plaintext

Before proceeding, you must run the llama3.3 model on your laptop. Of course, you can choose another, smaller model, because llama3.3 is 42 GB.

ollama run llama3.3

ShellSession

Downloading the model can take a long time. Once it finishes, the model is ready to use.

Once a model is running, you can set the AI_MODEL_PROVIDER environment variable to ollama and activate the ollama profile for the app:

$ export AI_MODEL_PROVIDER=ollama
$ mvn quarkus:dev -Pollama

ShellSession

This time, our application is connected to the llama3.3 model started with ollama:

With the Quarkus LangChain4j Ollama extension, you can take advantage of dev services support. It means that you don’t need to install and run Ollama on your laptop or run a model with ollama CLI. Quarkus will run Ollama as a Docker container and automatically run a selected AI model on it. In that case, you don’t need to set the quarkus.langchain4j.ollama.base-url property. Before switching to that option, let’s use a smaller AI model by setting the quarkus.langchain4j.ollama.chat-model.model-id = mistral property. Then start the app in the same way as before.

Final Thoughts

I must admit that the Quarkus LangChain4j extension is enjoyable to use. With a few simple annotations, you can configure your application to talk to the AI model of your choice. In this article, I presented a straightforward example of integrating Quarkus with an AI chat model. Along the way, we briefly covered features such as prompts, structured output, and chat memory. You can expect more articles in the Quarkus series with AI soon.

The post Getting Started with Quarkus LangChain4j and Chat Model appeared first on Piotr's TechBlog.

Spring AI with Multimodality and Images

piotr.minkowski — Tue, 04 Mar 2025 08:56:24 +0000

This article will teach you how to create a Spring Boot application that handles images and text using the Spring AI multimodality feature. Multimodality is the ability to understand and process information from different sources simultaneously. It covers text, images, audio, and other data formats. We will perform simple experiments with multimodality and images. This is the fourth part of my series of articles about Spring Boot and AI. It is worth reading the following posts before proceeding with the current one:

https://piotrminkowski.com/2025/01/28/getting-started-with-spring-ai-and-chat-model: The first tutorial introduces the Spring AI project and its support for building applications based on chat models like OpenAI or Mistral AI.
https://piotrminkowski.com/2025/01/30/getting-started-with-spring-ai-function-calling: The second tutorial shows Spring AI support for Java function calling with the OpenAI chat model.
https://piotrminkowski.com/2025/02/24/using-rag-and-vector-store-with-spring-ai: The third tutorial shows Spring AI support for RAG (Retrieval Augmented Generation) and vector store.

Source Code

Feel free to use my source code if you’d like to try it out yourself. To do that, you must clone my sample GitHub repository. Then you should only follow my instructions.

Motivation for Multimodality with Spring AI

Multimodal large language models (LLMs) can process and generate text alongside other modalities, including images, audio, and video. This feature covers the use case where we want the LLM to detect something specific inside an image or describe its content. Let’s assume we have a list of input images. We want to find the image in that list that matches our description. For example, this description can ask a model to find the image that contains a specified item. The Spring AI Message API provides all the necessary elements to support multimodal LLMs. Here’s a diagram that illustrates our scenario.

Use Multimodality with Spring AI

We don’t need to include any specific library other than the Spring AI starter for a particular AI model. The default option is spring-ai-openai-spring-boot-starter. Our application uses images stored in the src/main/resources/images directory. Spring AI multimodality support requires the image to be passed inside the Media object. We load all the pictures from the classpath inside the constructor.

Recognize Items in the Image

The GET /images/find/{object} endpoint tries to find the image that contains the item given by the object path variable. The AI model must return the position of that image in the input list. To achieve that, we create a UserMessage object that contains a user query and a list of the Media objects. Once the model returns the position, the endpoint reads the image from the list and returns its content in the image/png format.


@RestController
@RequestMapping("/images")
public class ImageController {

    private final static Logger LOG = LoggerFactory
        .getLogger(ImageController.class);

    private final ChatClient chatClient;
    private List<Media> images;
    private List<Media> dynamicImages = new ArrayList<>();

    public ImageController(ChatClient.Builder chatClientBuilder) {
        this.chatClient = chatClientBuilder
                .defaultAdvisors(new SimpleLoggerAdvisor())
                .build();
        this.images = List.of(
                Media.builder().id("fruits").mimeType(MimeTypeUtils.IMAGE_PNG).data(new ClassPathResource("images/fruits.png")).build(),
                Media.builder().id("fruits-2").mimeType(MimeTypeUtils.IMAGE_PNG).data(new ClassPathResource("images/fruits-2.png")).build(),
                Media.builder().id("fruits-3").mimeType(MimeTypeUtils.IMAGE_PNG).data(new ClassPathResource("images/fruits-3.png")).build(),
                Media.builder().id("fruits-4").mimeType(MimeTypeUtils.IMAGE_PNG).data(new ClassPathResource("images/fruits-4.png")).build(),
                Media.builder().id("fruits-5").mimeType(MimeTypeUtils.IMAGE_PNG).data(new ClassPathResource("images/fruits-5.png")).build(),
                Media.builder().id("animals").mimeType(MimeTypeUtils.IMAGE_PNG).data(new ClassPathResource("images/animals.png")).build(),
                Media.builder().id("animals-2").mimeType(MimeTypeUtils.IMAGE_PNG).data(new ClassPathResource("images/animals-2.png")).build(),
                Media.builder().id("animals-3").mimeType(MimeTypeUtils.IMAGE_PNG).data(new ClassPathResource("images/animals-3.png")).build(),
                Media.builder().id("animals-4").mimeType(MimeTypeUtils.IMAGE_PNG).data(new ClassPathResource("images/animals-4.png")).build(),
                Media.builder().id("animals-5").mimeType(MimeTypeUtils.IMAGE_PNG).data(new ClassPathResource("images/animals-5.png")).build()
        );
    }

    @GetMapping(value = "/find/{object}", produces = MediaType.IMAGE_PNG_VALUE)
    @ResponseBody byte[] analyze(@PathVariable String object) {
        String msg = """
        Which picture contains %s.
        Return only a single picture.
        Return only the number that indicates its position in the media list.
        """.formatted(object);
        LOG.info(msg);

        UserMessage um = new UserMessage(msg, images);

        String content = this.chatClient.prompt(new Prompt(um))
                .call()
                .content();

        assert content != null;
        // The model may pad its answer with whitespace or a newline, so trim before parsing.
        return images.get(Integer.parseInt(content.trim()) - 1).getDataAsByteArray();
    }

}

Java

Let’s make a test call. We will look for the picture containing a banana. Here’s the AI model response after calling the http://localhost:8080/images/find/banana. You can try to make other test calls and find an image with e.g. an orange or a tomato.
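The endpoint above trusts the model to reply with a bare number. As a defensive variant, the sketch below extracts the first integer from the reply and checks that it is a valid 1-based position before converting it to a list index. The PositionParserSketch class and its toIndex method are hypothetical helpers, not part of Spring AI or the sample repository.

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: turns a model reply such as "3" or "Picture 3."
// into a 0-based list index, or an empty result if the reply is unusable.
final class PositionParserSketch {

    private static final Pattern FIRST_INT = Pattern.compile("\\d+");

    static OptionalInt toIndex(String reply, int listSize) {
        if (reply == null) {
            return OptionalInt.empty();
        }
        Matcher m = FIRST_INT.matcher(reply);
        if (!m.find()) {
            return OptionalInt.empty();
        }
        int position;
        try {
            position = Integer.parseInt(m.group());
        } catch (NumberFormatException e) {
            // A digit run too long to fit into an int.
            return OptionalInt.empty();
        }
        // The prompt asks for a 1-based position in the media list.
        if (position < 1 || position > listSize) {
            return OptionalInt.empty();
        }
        return OptionalInt.of(position - 1);
    }

    public static void main(String[] args) {
        System.out.println(toIndex("3\n", 10));
        System.out.println(toIndex("Picture 2.", 10));
        System.out.println(toIndex("none of them", 10));
    }
}
```

An empty result could then be mapped to a 404 response instead of letting a NumberFormatException or IndexOutOfBoundsException escape from the controller.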

Describe Image Contents

On the other hand, we can ask the AI model to generate a short description of all images included as the Media content. The GET /images/describe endpoint merges two lists of images.

@GetMapping("/describe")
String[] describe() {
    // Merge the classpath images with the generated ones; toList() already returns an unmodifiable list.
    UserMessage um = new UserMessage("Explain what do you see on each image.",
            Stream.concat(images.stream(), dynamicImages.stream()).toList());
    return this.chatClient.prompt(new Prompt(um))
            .call()
            .entity(String[].class);
}

Java

Once we call the http://localhost:8080/images/describe URL we will receive a compact description of all input images. The two highlighted descriptions have been generated for images from the dynamicImages List. These images were generated by the AI image model. We will discuss this in the next section.

Generate Images with AI Model

To generate an image using the AI API, we must inject the ImageModel bean. It provides a single call method that allows us to communicate with AI models dedicated to image generation. This method takes the ImagePrompt object as an argument. Typically, we use the ImagePrompt constructor that takes instructions for image generation and options that customize the height, width, and number of images. We will generate a single (N=1) image with 1024 pixels in height and width. The AI model returns the image URL (responseFormat). Once the image is generated, we create a UrlResource object, wrap it in a Media object, and put it into the dynamicImages list. The GET /images/generate/{object} endpoint returns a byte array representation of the image object.


@RestController
@RequestMapping("/images")
public class ImageController {

    private final ChatClient chatClient;
    private final ImageModel imageModel;
    private List<Media> images;
    private List<Media> dynamicImages = new ArrayList<>();
    
    public ImageController(ChatClient.Builder chatClientBuilder,
                           ImageModel imageModel) {
        this.chatClient = chatClientBuilder
                .defaultAdvisors(new SimpleLoggerAdvisor())
                .build();
        this.imageModel = imageModel;
        // other initializations
    }
    
    @GetMapping(value = "/generate/{object}", produces = MediaType.IMAGE_PNG_VALUE)
    byte[] generate(@PathVariable String object) throws IOException {
        ImageResponse ir = imageModel.call(new ImagePrompt("Generate an image with " + object, ImageOptionsBuilder.builder()
                .height(1024)
                .width(1024)
                .N(1)
                .responseFormat("url")
                .build()));
        UrlResource url = new UrlResource(ir.getResult().getOutput().getUrl());
        LOG.info("Generated URL: {}", ir.getResult().getOutput().getUrl());
        dynamicImages.add(Media.builder()
                .id(UUID.randomUUID().toString())
                .mimeType(MimeTypeUtils.IMAGE_PNG)
                .data(url)
                .build());
        return url.getContentAsByteArray();
    }
    
}

Java

Do you remember the description of that image returned by the GET /images/describe endpoint? Here’s our image with strawberry generated by the AI model after calling the http://localhost:8080/images/generate/strawberry URL.

Here’s a similar test for the banana input parameter.

Use Vector Store with Spring AI Multimodality

Let’s consider how we can leverage a vector store in our scenario. We cannot insert an image representation directly into a vector store, since most popular vendors like OpenAI or Mistral AI do not provide image embedding models. We could integrate directly with a model like clip-vit-base-patch32 to generate image embeddings, but this article won’t cover such a scenario. Instead, a vector store may contain an image description and its location (or name). The GET /images/load endpoint provides a method for loading image descriptions into a vector store. It uses Spring AI multimodality support to generate a compact description of each image in the input list and then puts it into the store.

    @GetMapping("/load")
    void load() throws JsonProcessingException {
        String msg = """
        Explain what do you see on the image.
        Generate a compact description that explains only what is visible.
        """;
        for (Media image : images) {
            UserMessage um = new UserMessage(msg, image);
            String content = this.chatClient.prompt(new Prompt(um))
                    .call()
                    .content();

            var doc = Document.builder()
                    .id(image.getId())
                    .text(mapper.writeValueAsString(new ImageDescription(image.getId(), content)))
                    .build();
            store.add(List.of(doc));
            LOG.info("Document added: {}", image.getId());
        }
    }

Java

Finally, we can implement another endpoint that generates a new image and asks the AI model to generate an image description. Then, it performs a similarity search in a vector store to find the most similar image based on its text description.


    @GetMapping("/generate-and-match/{object}")
    List<Document> generateAndMatch(@PathVariable String object) throws IOException {
        ImageResponse ir = imageModel.call(new ImagePrompt("Generate an image with " + object, ImageOptionsBuilder.builder()
                .height(1024)
                .width(1024)
                .N(1)
                .responseFormat("url")
                .build()));
        UrlResource url = new UrlResource(ir.getResult().getOutput().getUrl());
        LOG.info("URL: {}", ir.getResult().getOutput().getUrl());

        String msg = """
        Explain what do you see on the image.
        Generate a compact description that explains only what is visible.
        """;

        UserMessage um = new UserMessage(msg, new Media(MimeTypeUtils.IMAGE_PNG, url));
        String content = this.chatClient.prompt(new Prompt(um))
                .call()
                .content();

        SearchRequest searchRequest = SearchRequest.builder()
                .query("Find the most similar description to this: " + content)
                .topK(2)
                .build();

        return store.similaritySearch(searchRequest);
    }

Java

Let’s test the GET /images/generate-and-match/{object} endpoint using the pineapple parameter. It returns the description of the fruits.png image from the classpath.
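To illustrate what the similarity search does conceptually, here is a toy sketch in plain Java. It assumes word counts as a stand-in for real embeddings, which a vector store would obtain from an embedding model, and ranks stored descriptions by cosine similarity to the query. SimilaritySketch, Doc, embed, cosine, and topK are hypothetical names, not Spring AI APIs.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy illustration of a top-K similarity search over text descriptions.
// Real vector stores use dense embeddings; word counts are an assumption of this sketch.
final class SimilaritySketch {

    // A stored document: image id plus its text description.
    record Doc(String id, String text) {}

    // Toy "embedding": lower-cased word counts.
    static Map<String, Integer> embed(String text) {
        Map<String, Integer> counts = new HashMap<>();
        for (String word : text.toLowerCase().split("\\W+")) {
            if (!word.isEmpty()) {
                counts.merge(word, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Cosine similarity between two sparse word-count vectors.
    static double cosine(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0;
        for (Map.Entry<String, Integer> e : a.entrySet()) {
            dot += e.getValue() * b.getOrDefault(e.getKey(), 0);
        }
        double normA = norm(a);
        double normB = norm(b);
        return (normA == 0 || normB == 0) ? 0 : dot / (normA * normB);
    }

    private static double norm(Map<String, Integer> v) {
        double sum = 0;
        for (int count : v.values()) {
            sum += (double) count * count;
        }
        return Math.sqrt(sum);
    }

    // Returns the k documents whose descriptions are most similar to the query.
    static List<Doc> topK(List<Doc> docs, String query, int k) {
        Map<String, Integer> q = embed(query);
        return docs.stream()
                .sorted(Comparator.comparingDouble((Doc d) -> -cosine(embed(d.text()), q)))
                .limit(k)
                .toList();
    }

    public static void main(String[] args) {
        List<Doc> store = List.of(
                new Doc("fruits", "pineapple bananas and oranges in a bowl"),
                new Doc("animals", "dog and cat sitting on grass"));
        System.out.println(topK(store, "ripe pineapple in a bowl", 1));
    }
}
```

The endpoint in the article follows the same shape: the generated image is first turned into a text description, and that text is what gets compared against the stored descriptions.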

By the way, here’s the fruits.png image located in the /src/main/resources/images directory.

Final Thoughts

Spring AI provides multimodality and image generation support. All the features presented in this article work fine with OpenAI. It supports both the image model and multimodality. To read more about the support offered by other models, refer to the Spring AI chat and image model docs.

This article shows how we can use Spring AI and AI models to interact with images in various ways.

The post Spring AI with Multimodality and Images appeared first on Piotr's TechBlog.

Getting Started with Spring AI and Chat Model

piotr.minkowski — Tue, 28 Jan 2025 10:02:24 +0000

This article will teach you how to use the Spring AI project to build applications based on different chat models. The Spring AI Chat Model is a simple and portable interface that allows us to interact with these models. Our sample Spring Boot application will switch between three popular chat models provided by OpenAI, Mistral AI, and Ollama. This article is the first in a series explaining AI concepts with Spring Boot. Look for more on my blog in this area soon.

If you are interested in Spring Boot, read my article about tips, tricks, and techniques for this framework here.

Source Code

If you would like to try it by yourself, you may always take a look at my source code. To do that, you must clone my sample GitHub repository. Then you should only follow my instructions.

Problem

Whenever I create a new article or example related to AI, I like to define the problem I’m trying to solve. The problem this example solves is very trivial. I publish a lot of small demo apps to explain technology concepts. These apps usually need data to show a demo output. Usually, I add demo data by myself or use a library like Datafaker to do it for me. This time, we can leverage AI Chat Models API for that. Let’s begin!

Dependencies

The Spring AI project is still under active development. Currently, we are waiting for the 1.0 GA release. Until then, we will switch to the milestone releases of the project. The current milestone is 1.0.0-M5. So let’s add the Spring Milestones repository to our Maven pom.xml file.

    
        
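    <repositories>
        <repository>
            <id>central</id>
            <name>Central</name>
            <url>https://repo1.maven.org/maven2/</url>
        </repository>
        <repository>
            <id>spring-milestones</id>
            <name>Spring Milestones</name>
            <url>https://repo.spring.io/milestone</url>
            <snapshots>
                <enabled>false</enabled>
            </snapshots>
        </repository>
    </repositories>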

XML

Then we should include the Maven BOM with a specified version of the Spring AI project.

    
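    <properties>
        <java.version>21</java.version>
        <spring-ai.version>1.0.0-M5</spring-ai.version>
    </properties>

    <dependencyManagement>
        <dependencies>
            <dependency>
                <groupId>org.springframework.ai</groupId>
                <artifactId>spring-ai-bom</artifactId>
                <version>${spring-ai.version}</version>
                <type>pom</type>
                <scope>import</scope>
            </dependency>
        </dependencies>
    </dependencyManagement>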

XML

Since our sample application exposes some REST endpoints, we should include the Spring Boot Web Starter. We can include the Spring Boot Test Starter to create some JUnit tests. The Spring AI modules are included in the Maven profiles section. There are three profiles, one for each chat model provider. By default, our application uses OpenAI, and thus it activates the open-ai profile, which includes the spring-ai-openai-spring-boot-starter library. We should activate the mistral-ai profile to switch to Mistral AI. The third option is the ollama-ai profile, which includes the spring-ai-ollama-spring-boot-starter dependency. This makes it easy to switch between chat model providers, since we only need to set the profile parameter in the Maven run command. Here’s the full list of dependencies.

    
        
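    <dependencies>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-web</artifactId>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-test</artifactId>
            <scope>test</scope>
        </dependency>
    </dependencies>

    <profiles>
        <profile>
            <id>open-ai</id>
            <activation>
                <activeByDefault>true</activeByDefault>
            </activation>
            <dependencies>
                <dependency>
                    <groupId>org.springframework.ai</groupId>
                    <artifactId>spring-ai-openai-spring-boot-starter</artifactId>
                </dependency>
            </dependencies>
        </profile>
        <profile>
            <id>mistral-ai</id>
            <dependencies>
                <dependency>
                    <groupId>org.springframework.ai</groupId>
                    <artifactId>spring-ai-mistral-ai-spring-boot-starter</artifactId>
                </dependency>
            </dependencies>
        </profile>
        <profile>
            <id>ollama-ai</id>
            <dependencies>
                <dependency>
                    <groupId>org.springframework.ai</groupId>
                    <artifactId>spring-ai-ollama-spring-boot-starter</artifactId>
                </dependency>
            </dependencies>
        </profile>
    </profiles>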

XML

Connect to AI Chat Model Providers

Configure OpenAI

Before we proceed with the source code, we should prepare the chat model providers. Let’s begin with OpenAI. We must have an account on the OpenAI Platform portal. After signing in, we should access the API Keys page to generate an API token. Once we set its name, we can click the “Create secret key” button. Don’t forget to copy the key after creation.

The value of the generated token should be saved as an environment variable. Our sample Spring Boot application reads its value from the OPEN_AI_TOKEN variable.

export OPEN_AI_TOKEN=

ShellSession

Configure Mistral AI

Then, we should repeat a very similar procedure for Mistral AI. We must have an account on the Mistral AI Platform portal. After signing in, we should access the API Keys page to generate an API token. Both the name and expiration date fields are optional. Once we generate a token by clicking the “Create key” button, we should copy it.

The value of the generated token should be saved as an environment variable. Our sample Spring Boot application reads the Mistral AI token from the MISTRAL_AI_TOKEN variable.

export MISTRAL_AI_TOKEN=

ShellSession

Run and Configure Ollama

Unlike OpenAI or Mistral AI, Ollama is built to run large language models (LLMs) directly on our workstations. This means we don’t need a connection to a remote API to access it. First, we must download the Ollama binary for our OS from the following page. After installation, we can interact with it using the ollama CLI. Next, we should choose the model to run. The full list of available models can be found here. By default, Spring AI expects the mistral model for Ollama. Let’s choose llama3.2 instead.

ollama run llama3.2

ShellSession

After running Ollama locally we can interact with it using the CLI terminal.
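
Besides the CLI, Ollama also serves a local HTTP API, which is what client libraries talk to. The sketch below builds (and tries to send) a request using only the JDK's java.net.http client. It assumes Ollama's default address http://localhost:11434 and its /api/generate endpoint; the OllamaRequestSketch class and the prompt text are made up for illustration and are not part of the sample repository.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Illustrative only: class name and prompt are assumptions, not from the sample repository.
final class OllamaRequestSketch {

    // Assumes Ollama listens on its default local port.
    static final String BASE_URL = "http://localhost:11434";

    // Builds a non-streaming generate request for the given model and prompt.
    static HttpRequest buildRequest(String model, String prompt) {
        String json = "{\"model\":\"" + escape(model) + "\","
                + "\"prompt\":\"" + escape(prompt) + "\","
                + "\"stream\":false}";
        return HttpRequest.newBuilder(URI.create(BASE_URL + "/api/generate"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(json))
                .build();
    }

    // Minimal JSON string escaping, enough for simple prompts.
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"").replace("\n", "\\n");
    }

    public static void main(String[] args) throws Exception {
        HttpRequest request = buildRequest("llama3.2", "Say hello in one sentence.");
        try {
            HttpResponse<String> response = HttpClient.newHttpClient()
                    .send(request, HttpResponse.BodyHandlers.ofString());
            System.out.println(response.body());
        } catch (java.io.IOException e) {
            // Happens when no Ollama server is running locally.
            System.out.println("Ollama is not reachable: " + e.getMessage());
        }
    }
}
```

In the sample application we never write this by hand, since the Spring AI Ollama starter takes care of the HTTP calls.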

Configure Spring Boot Properties

Ollama exposes its API on a localhost port and does not require an API token. Fortunately, all the necessary API URLs come with the Spring AI auto-configuration. Since we chose the llama3.2 model, we should set it in the Spring Boot application properties accordingly. We can also set the gpt-4o-mini model for OpenAI to decrease API costs.

spring.ai.openai.api-key = ${OPEN_AI_TOKEN}
spring.ai.openai.chat.options.model = gpt-4o-mini
spring.ai.mistralai.api-key = ${MISTRAL_AI_TOKEN}
spring.ai.ollama.chat.options.model = llama3.2

Plaintext
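
The ${OPEN_AI_TOKEN} syntax is a placeholder that Spring resolves from the environment at startup. As a rough mental model only (Spring's real resolver handles more cases, such as default values), the substitution can be sketched in plain Java. PlaceholderSketch and resolve are hypothetical names, and sk-demo is a dummy value.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Simplified model of ${NAME} substitution; not Spring's actual implementation.
final class PlaceholderSketch {

    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{([A-Za-z0-9_.]+)\\}");

    // Replaces each ${NAME} with its value from env; fails on unknown names.
    static String resolve(String value, Map<String, String> env) {
        Matcher m = PLACEHOLDER.matcher(value);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            String name = m.group(1);
            String replacement = env.get(name);
            if (replacement == null) {
                throw new IllegalArgumentException("Could not resolve placeholder '" + name + "'");
            }
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> env = Map.of("OPEN_AI_TOKEN", "sk-demo");
        System.out.println(resolve("${OPEN_AI_TOKEN}", env));
    }
}
```

The practical consequence is the same as in the sketch: if the variable is not exported before startup, the placeholder cannot be resolved and the application fails to start.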

Spring AI Chat Model API

Prompting and Structured Output

Here is our model class. It contains the id field and several other fields that best describe each person.

public class Person {

    private Integer id;
    private String firstName;
    private String lastName;
    private int age;
    private Gender gender;
    private String nationality;
    
    //... GETTERS/SETTERS
}

public enum Gender {
    MALE, FEMALE;
}

Java

The @RestController class injects the auto-configured ChatClient.Builder to create an instance of ChatClient. PersonController implements a method that returns a list of persons from the GET /persons endpoint. The main goal is to generate a list of 10 objects with the fields defined in the Person class. The id field should be auto-incremented. The PromptTemplate object defines the message that will be sent to the chat model API. It doesn’t have to specify the exact fields that should be returned. This part is handled automatically by the Spring AI library after we invoke the entity() method on the ChatClient instance. The ParameterizedTypeReference object passed to the entity() method tells Spring AI to generate a list of objects.

@RestController
@RequestMapping("/persons")
public class PersonController {

    private final ChatClient chatClient;

    public PersonController(ChatClient.Builder chatClientBuilder) {
        this.chatClient = chatClientBuilder.build();
    }

    @GetMapping
    List<Person> findAll() {
        PromptTemplate pt = new PromptTemplate("""
                Return a current list of 10 persons if exists or generate a new list with random values.
                Each object should contain an auto-incremented id field.
                Do not include any explanations or additional text.
                """);

        return this.chatClient.prompt(pt.create())
                .call()
                .entity(new ParameterizedTypeReference<>() {});
    }

}

Java
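
To get a feel for why the prompt does not need to list the fields, it helps to picture what a structured-output step does: it derives a description of the target type and appends it to the prompt as format instructions. The plain-Java sketch below illustrates only that idea. It is not Spring AI's implementation (which generates a full JSON schema and also parses the reply), and FormatInstructionSketch, describeFields, and withFormat are hypothetical names.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

// Conceptual sketch only; Spring AI's own converter produces a full JSON schema.
final class FormatInstructionSketch {

    // Stand-in for the article's Person class (gender kept as String for brevity).
    static class Person {
        Integer id;
        String firstName;
        String lastName;
        int age;
        String gender;
        String nationality;
    }

    // Lists "name: type" for each non-static field of the target class.
    static List<String> describeFields(Class<?> type) {
        List<String> fields = new ArrayList<>();
        for (Field f : type.getDeclaredFields()) {
            if (!Modifier.isStatic(f.getModifiers()) && !f.isSynthetic()) {
                fields.add(f.getName() + ": " + f.getType().getSimpleName());
            }
        }
        return fields;
    }

    // Appends a format hint to the user's prompt text.
    static String withFormat(String prompt, Class<?> type) {
        return prompt + "\nRespond only with a JSON array of objects with these fields: "
                + String.join(", ", describeFields(type));
    }

    public static void main(String[] args) {
        System.out.println(withFormat("Generate a list of 10 persons.", Person.class));
    }
}
```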

Assuming you exported the OpenAI token to the OPEN_AI_TOKEN environment variable, you can run the application using the following command:

mvn spring-boot:run

ShellSession

Then, let’s call the http://localhost:8080/persons endpoint. It returns a list of 10 people with different nationalities.

Now, we can change the PromptTemplate content and add the word “famous” before persons. Just for fun.

The results are not surprising at all: “Elon Musk” enters the list. However, the list will be slightly different the second time you call the same endpoint. According to our prompt, the chat client should “return a current list of 10 persons”, so I expected to get the same list as before. The problem is that the chat client doesn’t remember the previous conversation.

Advisors and Chat Memory

Let’s try to change it. First, we should define the implementation of the ChatMemory interface. InMemoryChatMemory is good enough for our tests.

@SpringBootApplication
public class SpringAIShowcase {

    public static void main(String[] args) {
        SpringApplication.run(SpringAIShowcase.class, args);
    }

    @Bean
    InMemoryChatMemory chatMemory() {
        return new InMemoryChatMemory();
    }
}

Java
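
Conceptually, an in-memory chat memory is little more than a map from a conversation id to the messages exchanged so far. The sketch below shows that idea in plain Java. It is not Spring AI's InMemoryChatMemory; the ChatMemorySketch class, its Message record, and the sample messages are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Conceptual model of an in-memory chat memory; not Spring AI's InMemoryChatMemory.
final class ChatMemorySketch {

    // One role/text pair of the conversation.
    record Message(String role, String text) {}

    // Not thread-safe; good enough for a single-threaded illustration.
    private final Map<String, List<Message>> conversations = new HashMap<>();

    // Appends a message to the history of the given conversation.
    void add(String conversationId, Message message) {
        conversations.computeIfAbsent(conversationId, id -> new ArrayList<>()).add(message);
    }

    // Returns up to lastN most recent messages, oldest first.
    List<Message> get(String conversationId, int lastN) {
        List<Message> all = conversations.getOrDefault(conversationId, List.of());
        return List.copyOf(all.subList(Math.max(0, all.size() - lastN), all.size()));
    }

    public static void main(String[] args) {
        ChatMemorySketch memory = new ChatMemorySketch();
        memory.add("default", new Message("user", "Return a list of 10 persons."));
        memory.add("default", new Message("assistant", "[{\"id\":1,\"firstName\":\"Anna\"}]"));
        memory.get("default", 10).forEach(m -> System.out.println(m.role() + ": " + m.text()));
    }
}
```

Because everything lives in the JVM heap, such a memory is lost on restart, which is fine for tests but not for production use.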

To enable conversation history for a chat client, we should define an advisor. The Spring AI Advisors API lets us intercept, modify, and enhance AI-driven interactions handled by Spring applications. Spring AI offers an API to create custom advisors, but we can also leverage several built-in ones. Examples include PromptChatMemoryAdvisor, which enables chat memory and adds it to the prompt’s system text, and SimpleLoggerAdvisor, which enables request/response logging. Let’s take a look at the latest implementation of the PersonController class. Compared to the previous version, it adds the ChatMemory constructor argument and the defaultAdvisors() call. Besides advisors, it contains a new GET /persons/{id} endpoint implementation. This endpoint takes the previously returned list of persons and looks up the object with the specified id. The PromptTemplate object declares the id parameter, which is filled with the value read from the request path.

@RestController
@RequestMapping("/persons")
public class PersonController {

    private final ChatClient chatClient;

    public PersonController(ChatClient.Builder chatClientBuilder, 
                            ChatMemory chatMemory) {
        this.chatClient = chatClientBuilder
                .defaultAdvisors(
                        new PromptChatMemoryAdvisor(chatMemory),
                        new SimpleLoggerAdvisor())
                .build();
    }

    @GetMapping
    List<Person> findAll() {
        PromptTemplate pt = new PromptTemplate("""
                Return a current list of 10 persons if exists or generate a new list with random values.
                Each object should contain an auto-incremented id field.
                Do not include any explanations or additional text.
                """);

        return this.chatClient.prompt(pt.create())
                .call()
                .entity(new ParameterizedTypeReference<>() {});
    }

    @GetMapping("/{id}")
    Person findById(@PathVariable String id) {
        PromptTemplate pt = new PromptTemplate("""
                Find and return the object with id {id} in a current list of persons.
                """);
        Prompt p = pt.create(Map.of("id", id));
        return this.chatClient.prompt(p)
                .call()
                .entity(Person.class);
    }
}

Java
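
The advisor idea itself is simple: each advisor gets the outgoing request and may change it or just observe it before it reaches the model. The following plain-Java sketch mimics a memory advisor and a logging advisor under that simplified view. None of these types belong to Spring AI; AdvisorSketch, Request, memoryAdvisor, loggingAdvisor, and applyAll are hypothetical names chosen for the illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

// Conceptual sketch of the advisor idea; these types are not Spring AI's API.
final class AdvisorSketch {

    // Simplified request: a system text plus the user's message.
    record Request(String systemText, String userText) {}

    // Adds the stored conversation history to the system text.
    static UnaryOperator<Request> memoryAdvisor(List<String> history) {
        return req -> new Request(
                req.systemText() + "\nConversation so far:\n" + String.join("\n", history),
                req.userText());
    }

    // Records the outgoing user text without changing the request.
    static UnaryOperator<Request> loggingAdvisor(List<String> log) {
        return req -> {
            log.add("request: " + req.userText());
            return req;
        };
    }

    // Runs the request through each advisor in order.
    static Request applyAll(Request request, List<UnaryOperator<Request>> advisors) {
        Request current = request;
        for (UnaryOperator<Request> advisor : advisors) {
            current = advisor.apply(current);
        }
        return current;
    }

    public static void main(String[] args) {
        List<String> history = List.of("user: Return a list of 10 persons.", "assistant: [ ...list... ]");
        List<String> log = new ArrayList<>();
        Request out = applyAll(
                new Request("You are a helpful assistant.", "Find the object with id 2."),
                List.of(memoryAdvisor(history), loggingAdvisor(log)));
        System.out.println(out.systemText());
        System.out.println(log);
    }
}
```

Seen this way, the GET /persons/{id} call can only work because the earlier list travels with the new request as part of the system text.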

Now, let’s make a final test. After the application restarts, we can call the endpoint that generates a list of persons. Then, we will call the GET /persons/{id} endpoint to display only a single person by ID. The Spring application reads the value from the list of persons stored in the chat memory. Finally, we can repeat the call to the GET /persons endpoint to verify that it returns the same list.

Different Chat AI Models

Assuming you exported the Mistral AI token to the MISTRAL_AI_TOKEN environment variable, you can run the application using the following command. It activates the mistral-ai Maven profile and includes the starter with the Mistral AI support.

mvn spring-boot:run -Pmistral-ai

ShellSession

It returns responses similar to OpenAI’s, but some small differences exist. It always returns 0 in the age field and a 3-letter code as the country name.

Let’s tweak our template to get Mistral AI to generate an accurate age number. Here’s the fixed prompt template:

Now, it looks much better. Even so, the names don’t match up with the countries they’re from, so there’s room for improvement.

The last test is for Ollama. Let’s run our application once again. This time we should activate the ollama-ai Maven profile.

mvn spring-boot:run -Pollama-ai

ShellSession

Then, we can repeat the same requests to check out the responses from Ollama. You can try it yourself.

$ curl http://localhost:8080/persons
$ curl http://localhost:8080/persons/2

ShellSession

Final Thoughts

This example doesn’t do anything unusual; it only shows some basic features offered by the Spring AI Chat Models API. We quickly reviewed features like prompts, structured output, chat memory, and built-in advisors. We also switched between some popular AI chat model providers. You can expect more articles in this area soon. If you want to continue with the next part of the AI series on my blog, go here.

The post Getting Started with Spring AI and Chat Model appeared first on Piotr's TechBlog.