Create Apps with Claude Code on Ollama

This article explains how to run Claude Code on Ollama and use local or cloud models served by Ollama to create Java apps. Read it if you are experimenting with AI code generation and currently paying for API access to do so. Relatively recently, Ollama made a built-in integration with developer tools such as Codex and Claude Code available. It is a really useful feature. Using the example of integration with Claude Code and several different models running both locally and in the cloud, you will see how it works.

You can find other articles about AI and Java on my blog. For example, if you are interested in how to use Ollama to serve models for Spring AI applications, you can read the following article.

Source Code

Feel free to use my source code if you’d like to try it out yourself. To do that, clone my sample GitHub repository and follow my instructions. This repository contains several branches, each with an application generated from the same prompt using a different model. Currently, the branch that received the fewest comments in code review has been merged into master. This is the version of the code generated using the glm-5 model. However, this may change in the future, and the master branch may be modified. Therefore, it is best to refer to the individual branches or pull requests shown below.

Below is the current list of branches. The dev branch contains the initial version of the repository with the CLAUDE.md file, which specifies the basic requirements for the generated code.

$ git branch
    dev
    glm-5
    gpt-oss
  * master
    minimax
    qwen3-coder
ShellSession

Here are the instructions for the AI from the CLAUDE.md file. They include a description of the technologies I plan to use in my application and a few practices I intend to apply. For example, I don’t want to use Lombok, a popular Java library that automates the generation of boilerplate such as getters, setters, and constructors. In the age of AI, this approach no longer seems to make sense, but for some reason, AI models really like this library 🙂 Also, each time I make a code change, I want the LLM to increment the version number, update the README.md file, etc.

# Project Instructions

- Always use the latest versions of dependencies.
- Always write Java code as the Spring Boot application.
- Always use Maven for dependency management.
- Always create test cases for the generated code, both positive and negative.
- Always generate the CircleCI pipeline in the .circleci directory to verify the code.
- Minimize the amount of code generated.
- The Maven artifact name must be the same as the parent directory name.
- Use semantic versioning for the Maven project. Each time you generate a new version, bump the PATCH section of the version number.
- Use `pl.piomin.services` as the group ID for the Maven project and base Java package.
- Do not use the Lombok library.
- Generate the Docker Compose file to run all components used by the application.
- Update README.md each time you generate a new version.
Markdown

Run Claude Code on Ollama

First, install Ollama on your computer. You can download the installer for your OS here. If you have used Ollama before, please update to the latest version.

$ ollama --version
  ollama version is 0.16.1
ShellSession

Next, install Claude Code.

curl -fsSL https://claude.ai/install.sh | bash
ShellSession

Before you start, it is worth increasing the maximum context window value allowed by Ollama. By default, it is set to 4k, and on the Ollama website itself, you will find a recommendation of 64k for Claude Code. I set the maximum value to 256k for testing different models.
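
For example, assuming a recent Ollama release that respects the OLLAMA_CONTEXT_LENGTH environment variable (the exact mechanism may differ between versions, so it is worth checking the current Ollama docs), you can raise the limit before the Ollama server starts:

$ export OLLAMA_CONTEXT_LENGTH=262144   # 256k; assumes a recent Ollama release
ShellSession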

For example, the gpt-oss model supports a 128k context window size.

Let’s pull and run the gpt-oss model with Ollama:

ollama run gpt-oss
ShellSession

After downloading and launching, you can verify the model parameters with the ollama ps command. If it reports 100% GPU and a context window size of ~131k, that’s exactly what I meant.

Ensure you are in the root repository directory, then run Claude Code with the command ollama launch claude. Next, choose the gpt-oss model visible in the list under “More”.
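
The command itself, run from the repository root:

$ ollama launch claude
ShellSession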

ollama-claude-code-gpt-oss

That’s it! Finally, we can start playing with AI.

ollama-claude-code-run

Generate a Java App with Claude Code

My application will be very simple. I just need something to quickly test the solution. Of course, all guidelines defined in the CLAUDE.md file should be followed. So, here is my prompt. Nothing more, nothing less 🙂

Generate an application that exposes REST API and connects to a PostgreSQL database.
The application should have a Person entity with id, and typical fields related to each person.
All REST endpoints should be protected with JWT and OAuth2.
The codebase should use Skaffold to deploy on Kubernetes.
Plaintext

After a few minutes, I have the entire code generated. Below is a summary from the AI of what has been done. If you want to check it out for yourself, take a look at this branch in my repository.

ollama-claude-code-generated

For the sake of formality, let’s take a look at the generated code. There is nothing spectacular here, because it is just a regular Spring Boot application that exposes a few REST endpoints for CRUD operations. However, it doesn’t look bad. Here’s the Spring Boot @Service implementation responsible for using PersonRepository to interact with the database.

@Service
public class PersonService {
    private final PersonRepository repository;

    public PersonService(PersonRepository repository) {
        this.repository = repository;
    }

    public List<Person> findAll() {
        return repository.findAll();
    }

    public Optional<Person> findById(Long id) {
        return repository.findById(id);
    }

    @Transactional
    public Person create(Person person) {
        return repository.save(person);
    }

    @Transactional
    public Optional<Person> update(Long id, Person person) {
        return repository.findById(id).map(existing -> {
            existing.setFirstName(person.getFirstName());
            existing.setLastName(person.getLastName());
            existing.setEmail(person.getEmail());
            existing.setAge(person.getAge());
            return repository.save(existing);
        });
    }

    @Transactional
    public void delete(Long id) {
        repository.deleteById(id);
    }
}
Java

Here’s the generated @RestController with the REST endpoint implementations:

@RestController
@RequestMapping("/api/people")
public class PersonController {
    private final PersonService service;

    public PersonController(PersonService service) {
        this.service = service;
    }

    @GetMapping
    public List<Person> getAll() {
        return service.findAll();
    }

    @GetMapping("/{id}")
    public ResponseEntity<Person> getById(@PathVariable Long id) {
        Optional<Person> person = service.findById(id);
        return person.map(ResponseEntity::ok).orElseGet(() -> ResponseEntity.notFound().build());
    }

    @PostMapping
    public ResponseEntity<Person> create(@RequestBody Person person) {
        Person saved = service.create(person);
        return ResponseEntity.status(201).body(saved);
    }

    @PutMapping("/{id}")
    public ResponseEntity<Person> update(@PathVariable Long id, @RequestBody Person person) {
        Optional<Person> updated = service.update(id, person);
        return updated.map(ResponseEntity::ok).orElseGet(() -> ResponseEntity.notFound().build());
    }

    @DeleteMapping("/{id}")
    public ResponseEntity<Void> delete(@PathVariable Long id) {
        service.delete(id);
        return ResponseEntity.noContent().build();
    }
}
Java

Below is a summary in a pull request with the generated code.

ollama-claude-code-pr

Using Ollama Cloud Models

Recently, Ollama has made it possible to run models not only locally, but also in the cloud. By default, all models tagged with cloud are run this way. Cloud models are automatically offloaded to Ollama’s cloud service while offering the same capabilities as local models. This is most useful for larger models that wouldn’t fit on a personal computer. For example, you can try to experiment with the qwen3-coder model locally. Unfortunately, it didn’t perform very well on my laptop.
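
A local attempt uses the same command pattern as before; here I assume the model’s default tag points at a variant small enough to run locally:

$ ollama run qwen3-coder
ShellSession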

Then, I can run the same or even a larger model in the cloud and automatically connect Claude Code to that model using the following command:

ollama launch claude --model qwen3-coder:480b-cloud
ShellSession

Now you can repeat exactly the same exercise as before or take a look at my branch containing the code generated using this model.

You can also try some other cloud models like minimax-m2.5 or glm-5.
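
Assuming these models follow the same cloud tag convention as qwen3-coder (the exact tag names below are my guesses and worth verifying in the Ollama model library), the command would look like this:

$ ollama launch claude --model glm-5:cloud   # tag name is an assumption
ShellSession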

Conclusion

If you’re developing locally and don’t want to burn money on APIs, use Claude Code with Ollama and, e.g., the gpt-oss or glm-5 models. It’s a pretty powerful and free option. If you have a powerful personal computer, a locally launched model should be able to generate the code efficiently. Otherwise, you can launch the model in Ollama’s cloud, which is free of charge up to a certain usage limit (it is difficult to say exactly what that limit is). The gpt-oss model worked really well on my laptop (MacBook Pro M3), and it took about 7-8 minutes to generate the application. You can also look for a model that suits you better.

Getting Started with Quarkus LangChain4j and Chat Model

This article will teach you how to use the Quarkus LangChain4j project to build applications based on different chat models. The Quarkus AI Chat Model offers a portable and straightforward interface, enabling seamless interaction with these models. Our sample Quarkus application will switch between three popular chat models provided by OpenAI, Mistral AI, and Ollama. This article is the first in a series explaining AI concepts with Quarkus LangChain4j. Look for more on my blog in this area soon. The idea of this tutorial is very similar to my series on Spring AI, so you will be able to easily compare the two approaches: the sample application does the same thing as the analogous Spring Boot application.

If you like Quarkus, then you can find quite a few articles about it on my blog. Just go to the Quarkus category and find the topic you are interested in.

Source Code

Feel free to use my source code if you’d like to try it out yourself. To do that, clone my sample GitHub repository and follow my instructions.

Motivation

Whenever I create a new article or example related to AI, I like to define the problem I’m trying to solve. The problem this example solves is very trivial. I publish numerous small demo apps to explain complex technology concepts. These apps typically require data to display a demo output. Usually, I add demo data myself or use a library like Datafaker to do it for me. This time, we can leverage the AI Chat Models API for that. Let’s begin!

I also covered today’s Quarkus-related topic earlier for Spring Boot. For a comparison of the features offered by both frameworks for simple interaction with an AI chat model, you can read this article on Spring AI.

Dependencies

The sample application uses the current latest version of the Quarkus framework.

<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>io.quarkus.platform</groupId>
      <artifactId>quarkus-bom</artifactId>
      <version>${quarkus.platform.version}</version>
      <type>pom</type>
      <scope>import</scope>
    </dependency>
  </dependencies>
</dependencyManagement>
XML

You can easily switch between multiple AI model implementations by activating a dedicated Maven profile. By default, the open-ai profile is active. It includes the quarkus-langchain4j-openai module in the Maven dependencies. You can also activate the mistral-ai or ollama profile. In that case, the quarkus-langchain4j-mistral-ai or quarkus-langchain4j-ollama module will be included instead of the LangChain4j OpenAI extension.

<profiles>
  <profile>
    <id>open-ai</id>
    <activation>
      <activeByDefault>true</activeByDefault>
    </activation>
    <dependencies>
      <dependency>
        <groupId>io.quarkiverse.langchain4j</groupId>
        <artifactId>quarkus-langchain4j-openai</artifactId>
        <version>${quarkus-langchain4j.version}</version>
      </dependency>
    </dependencies>
  </profile>
  <profile>
    <id>mistral-ai</id>
    <dependencies>
      <dependency>
        <groupId>io.quarkiverse.langchain4j</groupId>
        <artifactId>quarkus-langchain4j-mistral-ai</artifactId>
        <version>${quarkus-langchain4j.version}</version>
      </dependency>
    </dependencies>
  </profile>
  <profile>
    <id>ollama</id>
    <dependencies>
      <dependency>
        <groupId>io.quarkiverse.langchain4j</groupId>
        <artifactId>quarkus-langchain4j-ollama</artifactId>
        <version>${quarkus-langchain4j.version}</version>
      </dependency>
    </dependencies>
  </profile>
</profiles>
XML

The sample Quarkus application is simple. It exposes some REST endpoints and communicates with a selected AI model to return an AI-generated response via each endpoint. So, you need to include only core Quarkus modules like quarkus-rest-jackson or quarkus-arc. To implement JUnit tests with REST API, it also includes the quarkus-junit5 and rest-assured modules in the test scope.

<dependencies>
  <!-- Core Quarkus dependencies -->
  <dependency>
    <groupId>io.quarkus</groupId>
    <artifactId>quarkus-rest-jackson</artifactId>
  </dependency>
  <dependency>
    <groupId>io.quarkus</groupId>
    <artifactId>quarkus-arc</artifactId>
  </dependency>

  <!-- Test dependencies -->
  <dependency>
    <groupId>io.quarkus</groupId>
    <artifactId>quarkus-junit5</artifactId>
    <scope>test</scope>
  </dependency>
  <dependency>
    <groupId>io.rest-assured</groupId>
    <artifactId>rest-assured</artifactId>
    <scope>test</scope>
  </dependency>
</dependencies>
XML

Quarkus LangChain4j Chat Models Integration

Quarkus provides an innovative approach to interacting with AI chat models. First, you need to annotate an interface that defines AI-oriented methods with the @RegisterAiService annotation. Then you must add a proper description and input prompt inside the @SystemMessage and @UserMessage annotations. Here is the sample PersonAiService interface, which defines two methods. The generatePersonList method asks the AI model to generate a list of 10 unique persons in a form consistent with the declared output structure. The getPersonById method must read the previously generated list from chat memory and return the person with the specified id field.

@RegisterAiService
@ApplicationScoped
public interface PersonAiService {

    @SystemMessage("""
        You are a helpful assistant that generates realistic person data.
        Always respond with valid JSON format.
        """)
    @UserMessage("""
        Generate exactly 10 unique persons

        Requirements:
        - Each person must have a unique integer ID (like 1, 2, 3, etc.)
        - Use realistic first and last names per each nationality
        - Ages should be between 18 and 80
        - Return ONLY the JSON array, no additional text
        """)
    PersonResponse generatePersonList(@MemoryId int userId);

    @SystemMessage("""
        You are a helpful assistant that can recall generated person data from chat memory.
        """)
    @UserMessage("""
        In the previously generated list of persons for user {userId}, find and return the person with id {id}.
        
        Return ONLY the JSON object, no additional text.
        """)
    Person getPersonById(@MemoryId int userId, int id);

}
Java

There are a few more things to add regarding the code snippet above. The beans created by @RegisterAiService are @RequestScoped by default. According to the Quarkus LangChain4j documentation, this makes it possible to remove objects from the chat memory once a request ends. In the case above, however, the list of people is generated per user ID, which acts as the key by which we search the chat memory. To guarantee that the getPersonById method finds the list of persons generated per @MemoryId, the PersonAiService interface must be annotated with @ApplicationScoped. The InMemoryChatMemoryStore implementation is enabled by default, so you don’t need to declare any additional beans to use it.

Quarkus LangChain4j can automatically map the LLM’s JSON response to the output POJO. However, until now, it has not been possible to map it directly to the output collection. Therefore, you must wrap the output list with the additional class, as shown below.

public class PersonResponse {

    private List<Person> persons;

    public List<Person> getPersons() {
        return persons;
    }

    public void setPersons(List<Person> persons) {
        this.persons = persons;
    }
}
Java

Here’s the Person class:

public class Person {

    private Integer id;
    private String firstName;
    private String lastName;
    private int age;
    private String nationality;
    private Gender gender;
    
    // GETTERS and SETTERS

}
Java

The last part of our implementation is the REST endpoints. Here’s the REST controller that injects and uses PersonAiService to interact with the AI chat model. It exposes two endpoints: GET /api/{userId}/persons and GET /api/{userId}/persons/{id}. You can generate several lists of persons by specifying the userId path parameter.

@Path("/api")
@Produces(MediaType.APPLICATION_JSON)
@Consumes(MediaType.APPLICATION_JSON)
public class PersonController {

    private static final Logger LOG = Logger.getLogger(PersonController.class);

    PersonAiService personAiService;

    public PersonController(PersonAiService personAiService) {
        this.personAiService = personAiService;
    }

    @GET
    @Path("/{userId}/persons")
    public PersonResponse generatePersons(@PathParam("userId") int userId) {
        return personAiService.generatePersonList(userId);
    }

    @GET
    @Path("/{userId}/persons/{id}")
    public Person getPersonById(@PathParam("userId") int userId, @PathParam("id") int id) {
        return personAiService.getPersonById(userId, id);
    }

}
Java

Use Different AI Models with Quarkus LangChain4j

Configuration Properties

Here is a configuration defined within the application.properties file. Before proceeding, you must generate the OpenAI and Mistral AI API tokens and export them as environment variables. Additionally, you can enable logging of requests and responses in AI model communication. It is also worth increasing the default timeout for a single request from 10 seconds to a higher value, such as 20 seconds.

quarkus.langchain4j.chat-model.provider = ${AI_MODEL_PROVIDER:openai}
quarkus.langchain4j.log-requests = true
quarkus.langchain4j.log-responses = true

# OpenAI Configuration
quarkus.langchain4j.openai.api-key = ${OPEN_AI_TOKEN}
quarkus.langchain4j.openai.timeout = 20s

# Mistral AI Configuration
quarkus.langchain4j.mistralai.api-key = ${MISTRAL_AI_TOKEN}
quarkus.langchain4j.mistralai.timeout = 20s

# Ollama Configuration
quarkus.langchain4j.ollama.base-url = ${OLLAMA_BASE_URL:http://localhost:11434}
Plaintext

To run a sample Quarkus application and connect it with OpenAI, you must set the OPEN_AI_TOKEN environment variable. Since the open-ai Maven profile is activated by default, you don’t need to set anything else while running an app.

$ export OPEN_AI_TOKEN=<your_openai_token>
$ mvn quarkus:dev
ShellSession

Then, you can call the GET /api/{userId}/persons endpoint with different userId path variable values. Here are sample API requests and responses.

quarkus-langchain4j-calls

After that, you can call the GET /api/{userId}/persons/{id} endpoint to return a specified person found in the chat memory.

Switch Between AI Models

Then, you can repeat the same exercise with the Mistral AI model. You must set AI_MODEL_PROVIDER to mistralai, export its API token as the MISTRAL_AI_TOKEN environment variable, and enable the mistral-ai profile while running the app.

$ export AI_MODEL_PROVIDER=mistralai
$ export MISTRAL_AI_TOKEN=<your_mistralai_token>
$ mvn quarkus:dev -Pmistral-ai
ShellSession

The app should start successfully.

quarkus-langchain4j-logs

Once it happens, you can repeat the same sequence of requests as before for OpenAI.

$ curl http://localhost:8080/api/1/persons
$ curl http://localhost:8080/api/2/persons
$ curl http://localhost:8080/api/1/persons/1
$ curl http://localhost:8080/api/2/persons/1
ShellSession

You can check the requests sent to the AI model, as well as its responses, in the application logs.

Finally, you can run a test with ollama. By default, the LangChain4j extension for Ollama uses the llama3.2 model. You can change it by setting the quarkus.langchain4j.ollama.chat-model.model-id property in the application.properties file. Assuming that you use the llama3.3 model, here’s your configuration:

quarkus.langchain4j.ollama.base-url = ${OLLAMA_BASE_URL:http://localhost:11434}
quarkus.langchain4j.ollama.chat-model.model-id = llama3.3
quarkus.langchain4j.ollama.timeout = 60s
Plaintext

Before proceeding, you must run the llama3.3 model on your laptop. Of course, you can choose another, smaller model, because llama3.3 is 42 GB.

ollama run llama3.3
ShellSession

This can take a lot of time, but the model is eventually ready to use.

Once a model is running, you can set the AI_MODEL_PROVIDER environment variable to ollama and activate the ollama profile for the app:

$ export AI_MODEL_PROVIDER=ollama
$ mvn quarkus:dev -Pollama
ShellSession

This time, our application is connected to the llama3.3 model started with ollama:

quarkus-langchain4j-ollama

With the Quarkus LangChain4j Ollama extension, you can take advantage of dev services support. This means that you don’t need to install and run Ollama on your laptop or run a model with the ollama CLI. Quarkus will run Ollama as a Docker container and automatically run a selected AI model on it. In that case, you don’t need to set the quarkus.langchain4j.ollama.base-url property. Before switching to that option, let’s use a smaller AI model by setting the quarkus.langchain4j.ollama.chat-model.model-id = mistral property, as shown in the sketch below. Then start the app in the same way as before.
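
A minimal sketch of that configuration in application.properties; the base-url property is simply omitted so that dev services take over:

# dev services: no base-url needed; Quarkus starts Ollama in a container
quarkus.langchain4j.ollama.chat-model.model-id = mistral
quarkus.langchain4j.ollama.timeout = 60s
Plaintext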

Final Thoughts

I must admit that the Quarkus LangChain4j extension is enjoyable to use. With a few simple annotations, you can configure your application to talk to the AI model of your choice. In this article, I presented a straightforward example of integrating Quarkus with an AI chat model. Along the way, we also reviewed features such as prompts, structured output, and chat memory. You can expect more articles in the Quarkus AI series soon.

Using Ollama with Spring AI

This article will teach you how to create a Spring Boot application that implements several AI scenarios using Spring AI and the Ollama tool. Ollama is an open-source tool for running open LLMs on a local machine. It acts as a bridge between an LLM and a workstation, providing an API layer on top of the models for other applications or services. With Ollama, we can run almost any model we want simply by pulling it from a huge library.

This is the fifth part of my series of articles about Spring Boot and AI. I mentioned Ollama in the first part of the series to show how to switch between different AI models with Spring AI. However, it was only a brief introduction. Today, we will run all the AI use cases described in the previous tutorials with the Ollama tool. Those tutorials integrated mostly with OpenAI. In this article, we will test them against different AI models.

  1. https://piotrminkowski.com/2025/01/28/getting-started-with-spring-ai-and-chat-model: The first tutorial introduces the Spring AI project and its support for building applications based on chat models like OpenAI or Mistral AI.
  2. https://piotrminkowski.com/2025/01/30/getting-started-with-spring-ai-function-calling: The second tutorial shows Spring AI support for Java function calling with the OpenAI chat model.
  3. https://piotrminkowski.com/2025/02/24/using-rag-and-vector-store-with-spring-ai: The third tutorial shows Spring AI support for RAG (Retrieval Augmented Generation) and vector store.
  4. https://piotrminkowski.com/2025/03/04/spring-ai-with-multimodality-and-images: The fourth tutorial shows Spring AI support for the multimodality feature and image generation.

Fortunately, our application can easily switch between different AI tools or models. To achieve this, we must activate the right Maven profile.

Source Code

Feel free to use my source code if you’d like to try it out yourself. To do that, clone my sample GitHub repository and follow my instructions.

Prepare a Local Environment for Ollama

A few options exist for accessing Ollama on the local machine with Spring AI. I downloaded Ollama from the following link and installed it on my laptop. Alternatively, we can run it, e.g., with Docker Compose or Testcontainers.
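
For the Docker Compose option, a minimal sketch might look like the snippet below (assuming the official ollama/ollama image and the default 11434 port):

services:
  ollama:
    # assumption: the official ollama/ollama image from Docker Hub
    image: ollama/ollama
    ports:
      # Ollama's default API port
      - "11434:11434"
YAML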

Once we install Ollama on our workstation, we can run an AI model from its library with the ollama run command. The full list of available models can be found here. To begin, we will choose the Llava model. It is one of the most popular models that combine a vision encoder with language understanding.

ollama run llava
ShellSession

Ollama must pull the model manifest and image. Here’s the ollama run command output. Once we see that, we can interact with the model.

spring-ai-ollama-run-llava-model

The sample application source code already defines the ollama-ai Maven profile with the spring-ai-ollama-spring-boot-starter Spring Boot starter.

<profile>
  <id>ollama-ai</id>
  <dependencies>
    <dependency>
      <groupId>org.springframework.ai</groupId>
      <artifactId>spring-ai-ollama-spring-boot-starter</artifactId>
    </dependency>
  </dependencies>
</profile>
XML

The profile is disabled by default. We might enable it during development in the IDE (e.g., IntelliJ IDEA). However, the application doesn’t use any vendor-specific components, only generic Spring AI classes and interfaces.

We must activate the ollama-ai profile when running the same application. Assuming we are in the project root directory, we need to run the following Maven command:

mvn spring-boot:run -Pollama-ai
ShellSession

Portability across AI Models

We should avoid using model-specific library components to keep our application portable between different models. For example, when registering functions in the chat model client, we should use FunctionCallingOptions instead of model-specific components like OpenAiChatOptions or OllamaOptions.

@GetMapping
String calculateWalletValue() {
   PromptTemplate pt = new PromptTemplate("""
   What’s the current value in dollars of my wallet based on the latest stock daily prices ?
   """);

   return this.chatClient.prompt(pt.create(
        FunctionCallingOptions.builder()
                    .function("numberOfShares")
                    .function("latestStockPrices")
                    .build()))
            .call()
            .content();
}
Java

Not all models support all the AI capabilities used in our sample application. For models like Ollama or Mistral AI, Spring AI doesn’t provide an image generation implementation, since those tools don’t support it right now. Therefore, we should inject the ImageModel bean optionally, in case it is not provided by the model-specific library.

@RestController
@RequestMapping("/images")
public class ImageController {

    private final static Logger LOG = LoggerFactory.getLogger(ImageController.class);
    private final ObjectMapper mapper = new ObjectMapper();

    private final ChatClient chatClient;
    private ImageModel imageModel;

    public ImageController(ChatClient.Builder chatClientBuilder,
                           Optional<ImageModel> imageModel,
                           VectorStore store) {
        this.chatClient = chatClientBuilder
                .defaultAdvisors(new SimpleLoggerAdvisor())
                .build();
        imageModel.ifPresent(model -> this.imageModel = model);
        
        // other initializations 
    }
}
Java

Then, if a method requires the ImageModel bean, we can throw an exception informing the caller that image generation is not supported by the AI model (1). On the other hand, Spring AI does not provide a dedicated interface for multimodality, which enables AI models to process information from multiple sources. We can use the UserMessage class and the Media class to combine, e.g., text with image(s) in the user prompt. The GET /images/describe/{image} endpoint lists items detected in a source image from the classpath (2).

@GetMapping(value = "/generate/{object}", produces = MediaType.IMAGE_PNG_VALUE)
byte[] generate(@PathVariable String object) throws IOException, NotSupportedException {
   if (imageModel == null)
      throw new NotSupportedException("Image model is not supported by the AI model"); // (1)
   ImageResponse ir = imageModel.call(new ImagePrompt("Generate an image with " + object, ImageOptionsBuilder.builder()
           .height(1024)
           .width(1024)
           .N(1)
           .responseFormat("url")
           .build()));
   String url = ir.getResult().getOutput().getUrl();
   UrlResource resource = new UrlResource(url);
   LOG.info("Generated URL: {}", url);
   dynamicImages.add(Media.builder()
           .id(UUID.randomUUID().toString())
           .mimeType(MimeTypeUtils.IMAGE_PNG)
           .data(url)
           .build());
   return resource.getContentAsByteArray();
}
    
@GetMapping("/describe/{image}") // (2)
List<Item> describeImage(@PathVariable String image) {
   Media media = Media.builder()
           .id(image)
           .mimeType(MimeTypeUtils.IMAGE_PNG)
           .data(new ClassPathResource("images/" + image + ".png"))
           .build();
   UserMessage um = new UserMessage("""
   List all items you see on the image and define their category. 
   Return items inside the JSON array in RFC8259 compliant JSON format.
   """, media);
   return this.chatClient.prompt(new Prompt(um))
           .call()
           .entity(new ParameterizedTypeReference<>() {});
}
Java

Let’s try to avoid declarations like the one below, described in the Spring AI docs. Although they are perfectly correct, they will cause problems when switching between Spring Boot starters for different AI vendors.

ChatResponse response = chatModel.call(
    new Prompt(
        "Generate the names of 5 famous pirates.",
        OllamaOptions.builder()
            .model(OllamaModel.LLAMA3_1)
            .temperature(0.4)
            .build()
    ));
Java

In this case, we can set the global property in the application.properties file which sets the default model used in the scenario with Ollama.

spring.ai.ollama.chat.options.model = llava
Plaintext

Testing Multiple Models with Spring AI and Ollama

By default, Ollama doesn’t require any API token to establish communication with AI models. The Ollama Spring Boot starter provides auto-configuration that connects the chat client to the Ollama API server running at the localhost:11434 address. So, before running our sample application, we only need to export the tokens used to authorize against the stock market API and the vector store.

export STOCK_API_KEY=<YOUR_STOCK_API_KEY>
export PINECONE_TOKEN=<YOUR_PINECONE_TOKEN>
ShellSession

Llava on Ollama

Let’s begin with the Llava model. We can call the first endpoint that asks the model to generate a list of persons (GET /persons) and then search for the person with a particular id in the list stored in the chat memory (GET /persons/{id}).
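
For example, assuming the app runs on the default port 8080:

$ curl http://localhost:8080/persons
$ curl http://localhost:8080/persons/1
ShellSession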

spring-ai-ollama-get-persons

Then we can call the endpoint that displays all the items visible in a particular image from the classpath (GET /images/describe/{image}).

spring-ai-ollama-describe-image

By the way, here is the analyzed image stored in the src/main/resources/images/fruits-3.png file.

The endpoint for describing all the input images from the classpath doesn’t work well. I tried to tweak it by adding the RFC8259 JSON format sentence or changing the query. However, the AI model always returned a description of a single image instead of the whole Media list. The OpenAI model, by contrast, could print descriptions for all images in the String[] format.

@GetMapping("/describe")
String[] describe() {
   UserMessage um = new UserMessage("""
            Explain what do you see on each image from the input list.
            Return data in RFC8259 compliant JSON format.
            """, List.copyOf(Stream.concat(images.stream(), dynamicImages.stream()).toList()));
   return this.chatClient.prompt(new Prompt(um))
            .call()
            .entity(String[].class);
}
Java

Here’s the response. Of course, we can train a model to receive better results or try to prepare a better prompt.

spring-ai-ollama-describe-all-images

After calling the GET /wallet endpoint exposed by the WalletController, I received the [400] Bad Request - {"error":"registry.ollama.ai/library/llava:latest does not support tools"} response. It seems Llava doesn’t support the function/tool calling feature. We will also always receive the NotSupportedException for the GET /images/generate/{object} endpoint, since the Spring AI Ollama library doesn’t provide an ImageModel bean. You can perform other tests, e.g., for the RAG and vector store features implemented in the StockController @RestController.

Granite on Ollama

Let’s switch to another interesting model: Granite. In particular, we will test the granite3.2-vision model, dedicated to automated content extraction from tables, charts, infographics, plots, and diagrams. First, we set the current model name in the Ollama Spring AI configuration properties.

spring.ai.ollama.chat.options.model = granite3.2-vision
Plaintext

Let’s stop the Llava model and then run granite3.2-vision on Ollama:

ollama run granite3.2-vision
ShellSession

After the application restarts, we can perform some test calls. The endpoint for describing a single image returns a more detailed response than the Llava model. The response for the query with multiple images still looks the same as before.

The Granite Vision model supports a “function calling” feature, but it couldn’t call functions properly using my prompt. Please refer to my article for more details about the Spring AI function calling with OpenAI.

Deepseek on Ollama

The last model we will run within this exercise is Deepseek. DeepSeek-R1 achieves performance comparable to OpenAI-o1 on reasoning tasks. First, we must set the current model name in the Ollama Spring AI configuration properties.

spring.ai.ollama.chat.options.model = deepseek-r1
Plaintext

Let’s stop the Granite model and run deepseek-r1 on Ollama:

ollama run deepseek-r1
ShellSession

We need to restart the app:

mvn spring-boot:run -Pollama-ai
ShellSession

As usual, we can call the first endpoint that asks the model to generate a list of persons (GET /persons) and then search for the person with a particular id in the list stored in the chat memory (GET /persons/{id}). The response was pretty large, but not in the required JSON format.

The deepseek-r1 model doesn’t support the tool/function calling feature. Also, it didn’t analyze my input image properly, and it didn’t return a JSON response matching the Spring AI structured output feature.

Final Thoughts

This article shows how to easily switch between multiple AI models with Spring AI and Ollama. We tested several AI use cases implemented in the sample Spring Boot application across models such as Llava, Granite, and Deepseek. The app provides several endpoints showing features such as multimodality, chat memory, RAG, vector store, and function calling. The goal is not to compare the AI models, but to give a simple recipe for integrating with different AI models and to allow playing with them using Spring AI.

Spring AI with Multimodality and Images

This article will teach you how to create a Spring Boot application that handles images and text using the Spring AI multimodality feature. Multimodality is the ability to understand and process information from different sources simultaneously. It covers text, images, audio, and other data formats. We will perform simple experiments with multimodality and images. This is the fourth part of my series of articles about Spring Boot and AI. It is worth reading the following posts before proceeding with the current one:

  1. https://piotrminkowski.com/2025/01/28/getting-started-with-spring-ai-and-chat-model: The first tutorial introduces the Spring AI project and its support for building applications based on chat models like OpenAI or Mistral AI.
  2. https://piotrminkowski.com/2025/01/30/getting-started-with-spring-ai-function-calling: The second tutorial shows Spring AI support for Java function calling with the OpenAI chat model.
  3. https://piotrminkowski.com/2025/02/24/using-rag-and-vector-store-with-spring-ai: The third tutorial shows Spring AI support for RAG (Retrieval Augmented Generation) and vector store.

Source Code

Feel free to use my source code if you’d like to try it out yourself. To do that, clone my sample GitHub repository and follow my instructions.

Motivation for Multimodality with Spring AI

Multimodal large language models (LLMs) can process and generate text alongside other modalities, including images, audio, and video. This feature covers the use case where we want the LLM to detect something specific inside an image or describe its content. Let’s assume we have a list of input images. We want to find the image in that list that matches our description. For example, the description can ask the model to find the image that contains a specified item. The Spring AI Message API provides all the necessary elements to support multimodal LLMs. Here’s a diagram that illustrates our scenario.

Use Multimodality with Spring AI

We don’t need to include any specific library other than the Spring AI starter for a particular AI model. The default option is spring-ai-openai-spring-boot-starter. Our application uses images stored in the src/main/resources/images directory. Spring AI multimodality support requires the image to be passed inside the Media object. We load all the pictures from the classpath inside the constructor.

Recognize Items in the Image

The GET /images/find/{object} endpoint tries to find the image that contains the item specified by the object path variable. The AI model must return the image’s position in the input list. To achieve that, we create a UserMessage object that contains a user query and the list of Media objects. Once the model returns the position, the endpoint reads the image from the list and returns its content in the image/png format.

@RestController
@RequestMapping("/images")
public class ImageController {

    private final static Logger LOG = LoggerFactory
        .getLogger(ImageController.class);

    private final ChatClient chatClient;
    private List<Media> images;
    private List<Media> dynamicImages = new ArrayList<>();

    public ImageController(ChatClient.Builder chatClientBuilder) {
        this.chatClient = chatClientBuilder
                .defaultAdvisors(new SimpleLoggerAdvisor())
                .build();
        this.images = List.of(
                Media.builder().id("fruits").mimeType(MimeTypeUtils.IMAGE_PNG).data(new ClassPathResource("images/fruits.png")).build(),
                Media.builder().id("fruits-2").mimeType(MimeTypeUtils.IMAGE_PNG).data(new ClassPathResource("images/fruits-2.png")).build(),
                Media.builder().id("fruits-3").mimeType(MimeTypeUtils.IMAGE_PNG).data(new ClassPathResource("images/fruits-3.png")).build(),
                Media.builder().id("fruits-4").mimeType(MimeTypeUtils.IMAGE_PNG).data(new ClassPathResource("images/fruits-4.png")).build(),
                Media.builder().id("fruits-5").mimeType(MimeTypeUtils.IMAGE_PNG).data(new ClassPathResource("images/fruits-5.png")).build(),
                Media.builder().id("animals").mimeType(MimeTypeUtils.IMAGE_PNG).data(new ClassPathResource("images/animals.png")).build(),
                Media.builder().id("animals-2").mimeType(MimeTypeUtils.IMAGE_PNG).data(new ClassPathResource("images/animals-2.png")).build(),
                Media.builder().id("animals-3").mimeType(MimeTypeUtils.IMAGE_PNG).data(new ClassPathResource("images/animals-3.png")).build(),
                Media.builder().id("animals-4").mimeType(MimeTypeUtils.IMAGE_PNG).data(new ClassPathResource("images/animals-4.png")).build(),
                Media.builder().id("animals-5").mimeType(MimeTypeUtils.IMAGE_PNG).data(new ClassPathResource("images/animals-5.png")).build()
        );
    }

    @GetMapping(value = "/find/{object}", produces = MediaType.IMAGE_PNG_VALUE)
    @ResponseBody byte[] analyze(@PathVariable String object) {
        String msg = """
        Which picture contains %s.
        Return only a single picture.
        Return only the number that indicates its position in the media list.
        """.formatted(object);
        LOG.info(msg);

        UserMessage um = new UserMessage(msg, images);

        String content = this.chatClient.prompt(new Prompt(um))
                .call()
                .content();

        assert content != null;
        return images.get(Integer.parseInt(content)-1).getDataAsByteArray();
    }

}
Java

Let’s make a test call. We will look for the picture containing a banana. Here’s the AI model response after calling http://localhost:8080/images/find/banana. You can also try other test calls and find an image with, e.g., an orange or a tomato.
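
Since the endpoint returns raw PNG bytes, it is easiest to save the result to a file, for example:

$ curl http://localhost:8080/images/find/banana --output banana.png
ShellSession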

spring-ai-multimodality-find-object

Describe Image Contents

On the other hand, we can ask the AI model to generate a short description of all images included as the Media content. The GET /images/describe endpoint merges two lists of images.

@GetMapping("/describe")
String[] describe() {
   UserMessage um = new UserMessage("Explain what do you see on each image.",
            List.copyOf(Stream.concat(images.stream(), dynamicImages.stream()).toList()));
      return this.chatClient.prompt(new Prompt(um))
              .call()
              .entity(String[].class);
}
Java

Once we call the http://localhost:8080/images/describe URL we will receive a compact description of all input images. The two highlighted descriptions have been generated for images from the dynamicImages List. These images were generated by the AI image model. We will discuss this in the next section.

spring-ai-multimodality-image-desc

Generate Images with AI Model

To generate an image using the AI API, we must inject the ImageModel bean. It provides a single call method that allows us to communicate with AI models dedicated to image generation. This method takes an ImagePrompt object as an argument. Typically, we use the ImagePrompt constructor that takes instructions for image generation and options that customize the height, width, and number of images. We will generate a single (N=1) image with 1024 pixels in height and width. The AI model returns the image URL (responseFormat). Once the image is generated, we create an UrlResource object, build the Media object, and put it into the dynamicImages list. The GET /images/generate/{object} endpoint returns a byte array representation of the image.

@RestController
@RequestMapping("/images")
public class ImageController {

    private final ChatClient chatClient;
    private final ImageModel imageModel;
    private List<Media> images;
    private List<Media> dynamicImages = new ArrayList<>();
    
    public ImageController(ChatClient.Builder chatClientBuilder,
                           ImageModel imageModel) {
        this.chatClient = chatClientBuilder
                .defaultAdvisors(new SimpleLoggerAdvisor())
                .build();
        this.imageModel = imageModel;
        // other initializations
    }
    
    @GetMapping(value = "/generate/{object}", produces = MediaType.IMAGE_PNG_VALUE)
    byte[] generate(@PathVariable String object) throws IOException {
        ImageResponse ir = imageModel.call(new ImagePrompt("Generate an image with " + object, ImageOptionsBuilder.builder()
                .height(1024)
                .width(1024)
                .N(1)
                .responseFormat("url")
                .build()));
        UrlResource url = new UrlResource(ir.getResult().getOutput().getUrl());
        LOG.info("Generated URL: {}", ir.getResult().getOutput().getUrl());
        dynamicImages.add(Media.builder()
                .id(UUID.randomUUID().toString())
                .mimeType(MimeTypeUtils.IMAGE_PNG)
                .data(url)
                .build());
        return url.getContentAsByteArray();
    }
    
}
Java

Do you remember the description of that image returned by the GET /images/describe endpoint? Here’s our image with a strawberry, generated by the AI model after calling the http://localhost:8080/images/generate/strawberry URL.

Here’s a similar test for the banana input parameter.

Use Vector Store with Spring AI Multimodality

Let’s consider how we can leverage a vector store in our scenario. We cannot insert an image representation directly into a vector store, since the most popular vendors, like OpenAI or Mistral AI, do not provide image embedding models. We could integrate directly with a model like clip-vit-base-patch32 to generate image embeddings, but this article won’t cover such a scenario. Instead, the vector store may contain an image description and its location (or name). The GET /images/load endpoint provides a method for loading image descriptions into the vector store. It uses Spring AI multimodality support to generate a compact description of each image in the input list and then puts it into the store.

    @GetMapping("/load")
    void load() throws JsonProcessingException {
        String msg = """
        Explain what do you see on the image.
        Generate a compact description that explains only what is visible.
        """;
        for (Media image : images) {
            UserMessage um = new UserMessage(msg, image);
            String content = this.chatClient.prompt(new Prompt(um))
                    .call()
                    .content();

            var doc = Document.builder()
                    .id(image.getId())
                    .text(mapper.writeValueAsString(new ImageDescription(image.getId(), content)))
                    .build();
            store.add(List.of(doc));
            LOG.info("Document added: {}", image.getId());
        }
    }
Java

Finally, we can implement another endpoint that generates a new image and asks the AI model to generate an image description. Then, it performs a similarity search in a vector store to find the most similar image based on its text description.

    @GetMapping("/generate-and-match/{object}")
    List<Document> generateAndMatch(@PathVariable String object) throws IOException {
        ImageResponse ir = imageModel.call(new ImagePrompt("Generate an image with " + object, ImageOptionsBuilder.builder()
                .height(1024)
                .width(1024)
                .N(1)
                .responseFormat("url")
                .build()));
        UrlResource url = new UrlResource(ir.getResult().getOutput().getUrl());
        LOG.info("URL: {}", ir.getResult().getOutput().getUrl());

        String msg = """
        Explain what do you see on the image.
        Generate a compact description that explains only what is visible.
        """;

        UserMessage um = new UserMessage(msg, new Media(MimeTypeUtils.IMAGE_PNG, url));
        String content = this.chatClient.prompt(new Prompt(um))
                .call()
                .content();

        SearchRequest searchRequest = SearchRequest.builder()
                .query("Find the most similar description to this: " + content)
                .topK(2)
                .build();

        return store.similaritySearch(searchRequest);
    }
Java

Let’s test the GET /images/generate-and-match/{object} endpoint using the pineapple parameter. It returns the description of the fruits.png image from the classpath.

spring-ai-multimodality-vector-store

By the way, here’s the fruits.png image located in the /src/main/resources/images directory.

Final Thoughts

Spring AI provides multimodality and image generation support. All the features presented in this article work fine with OpenAI, which supports both the image model and multimodality. To read more about the support offered by other models, refer to the Spring AI chat and image model docs.

This article shows how we can use Spring AI and AI models to interact with images in various ways.

Getting Started with Spring AI and Chat Model

This article will teach you how to use the Spring AI project to build applications based on different chat models. The Spring AI Chat Model is a simple and portable interface that allows us to interact with these models. Our sample Spring Boot application will switch between three popular chat models provided by OpenAI, Mistral AI, and Ollama. This article is the first in a series explaining AI concepts with Spring Boot. Look for more on my blog in this area soon.

If you are interested in Spring Boot, read my article about tips, tricks, and techniques for this framework here.

Source Code

If you would like to try it yourself, you can always take a look at my source code. To do that, clone my sample GitHub repository and follow my instructions.

Problem

Whenever I create a new article or example related to AI, I like to define the problem I’m trying to solve. The problem this example solves is very trivial. I publish a lot of small demo apps to explain technology concepts. These apps usually need data to show a demo output. Usually, I add demo data myself or use a library like Datafaker to do it for me. This time, we can leverage the AI Chat Models API for that. Let’s begin!

Dependencies

The Spring AI project is still under active development. Currently, we are waiting for the 1.0 GA release. Until then, we will use the milestone releases of the project. The current milestone is 1.0.0-M5. So let’s add the Spring Milestones repository to our Maven pom.xml file.

    <repositories>
        <repository>
            <id>central</id>
            <name>Central</name>
            <url>https://repo1.maven.org/maven2/</url>
        </repository>
        <repository>
            <id>spring-milestones</id>
            <name>Spring Milestones</name>
            <url>https://repo.spring.io/milestone</url>
            <snapshots>
                <enabled>false</enabled>
            </snapshots>
        </repository>
    </repositories>
XML

Then we should include the Maven BOM with a specified version of the Spring AI project.

    <properties>
        <java.version>21</java.version>
        <spring-ai.version>1.0.0-M5</spring-ai.version>
    </properties>
    
    <dependencyManagement>
        <dependencies>
            <dependency>
                <groupId>org.springframework.ai</groupId>
                <artifactId>spring-ai-bom</artifactId>
                <version>${spring-ai.version}</version>
                <type>pom</type>
                <scope>import</scope>
            </dependency>
        </dependencies>
    </dependencyManagement>
XML

Since our sample application exposes some REST endpoints, we should include the Spring Boot Web Starter. We can include the Spring Boot Test Starter to create some JUnit tests. The Spring AI modules are included in the Maven profiles section. There are three different profiles, one for each chat model provider. By default, our application uses OpenAI, and thus it activates the open-ai profile, which includes the spring-ai-openai-spring-boot-starter library. We should activate the mistral-ai profile to switch to Mistral AI. The third option is the ollama-ai profile, which includes the spring-ai-ollama-spring-boot-starter dependency. Here’s the full list of dependencies. This setup makes it a breeze to switch between chat model AI providers: we only need to set the profile parameter in the Maven run command, as shown after the profile definitions below.

    <dependencies>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-web</artifactId>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-test</artifactId>
            <scope>test</scope>
        </dependency>
    </dependencies>

    <profiles>
        <profile>
            <id>open-ai</id>
            <activation>
                <activeByDefault>true</activeByDefault>
            </activation>
            <dependencies>
                <dependency>
                    <groupId>org.springframework.ai</groupId>
                    <artifactId>spring-ai-openai-spring-boot-starter</artifactId>
                </dependency>
            </dependencies>
        </profile>
        <profile>
            <id>mistral-ai</id>
            <dependencies>
                <dependency>
                    <groupId>org.springframework.ai</groupId>
                    <artifactId>spring-ai-mistral-ai-spring-boot-starter</artifactId>
                </dependency>
            </dependencies>
        </profile>
        <profile>
            <id>ollama-ai</id>
            <dependencies>
                <dependency>
                    <groupId>org.springframework.ai</groupId>
                    <artifactId>spring-ai-ollama-spring-boot-starter</artifactId>
                </dependency>
            </dependencies>
        </profile>
    </profiles>
XML

Connect to AI Chat Model Providers

Configure OpenAI

Before we proceed to the source code, we should prepare access to the chat model providers. Let’s begin with OpenAI. We must have an account on the OpenAI Platform portal. After signing in, we should access the API Keys page to generate an API token. Once we set its name, we can click the “Create secret key” button. Don’t forget to copy the key after creation.

The value of the generated token should be saved as an environment variable. Our sample Spring Boot application reads its value from the OPEN_AI_TOKEN variable.

export OPEN_AI_TOKEN=<YOUR_TOKEN_VALUE>
ShellSession

Configure Mistral AI

Then, we should repeat a very similar procedure for Mistral AI. We must have an account on the Mistral AI Platform portal. After signing in, we should access the API Keys page to generate an API token. Both the name and expiration date fields are optional. Once we generate a token by clicking the “Create key” button, we should copy it.


The value of the generated token should be saved as an environment variable. Our sample Spring Boot application reads its value for Mistral AI from the MISTRAL_AI_TOKEN variable.

export MISTRAL_AI_TOKEN=<YOUR_TOKEN_VALUE>
ShellSession

Run and Configure Ollama

Unlike OpenAI or Mistral AI, Ollama is built to run large language models (LLMs) directly on our workstations. This means we don’t need to connect to any remote API to access them. First, we must download the Ollama binary dedicated to our OS from the following page. After installation, we can interact with it using the ollama CLI. Then, we should choose the model to run. The full list of available models can be found here. By default, Spring AI expects the mistral model for Ollama. Let’s choose llama3.2 instead.

ollama run llama3.2
ShellSession

After running Ollama locally, we can interact with the model directly in the CLI terminal.
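
For example, an interactive session may look as follows (the model’s answer will obviously vary):

$ ollama run llama3.2
>>> Who are you?
I'm an AI assistant based on the Llama model. How can I help you today?
>>> /bye
ShellSession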


Configure Spring Boot Properties

Ollama exposes its API over localhost and does not require an API token. Fortunately, all the necessary API URLs come with the Spring AI auto-configuration. Since we chose the llama3.2 model, we should set it in the Spring Boot application properties. We can also set the gpt-4o-mini model for OpenAI to decrease API costs.

spring.ai.openai.api-key = ${OPEN_AI_TOKEN}
spring.ai.openai.chat.options.model = gpt-4o-mini
spring.ai.mistralai.api-key = ${MISTRAL_AI_TOKEN}
spring.ai.ollama.chat.options.model = llama3.2
Plaintext

Spring AI Chat Model API

Prompting and Structured Output

Here is our model class. It contains the id field and several other fields that best describe each person.

public class Person {

    private Integer id;
    private String firstName;
    private String lastName;
    private int age;
    private Gender gender;
    private String nationality;
    
    //... GETTERS/SETTERS
}

public enum Gender {
    MALE, FEMALE;
}
Java

The @RestController class injects the auto-configured ChatClient.Builder to create an instance of ChatClient. PersonController implements a method for returning a list of persons from the GET /persons endpoint. The main goal is to generate a list of 10 objects with the fields defined in the Person class. The id field should be auto-incremented. The PromptTemplate object defines the message that will be sent to the chat model API. It doesn’t have to specify the exact fields that should be returned. This part is handled automatically by the Spring AI library after we invoke the entity() method on the ChatClient instance. The ParameterizedTypeReference object inside the entity() method tells Spring AI to generate a list of objects.

@RestController
@RequestMapping("/persons")
public class PersonController {

    private final ChatClient chatClient;

    public PersonController(ChatClient.Builder chatClientBuilder) {
        this.chatClient = chatClientBuilder.build();
    }

    @GetMapping
    List<Person> findAll() {
        PromptTemplate pt = new PromptTemplate("""
                Return a current list of 10 persons if exists or generate a new list with random values.
                Each object should contain an auto-incremented id field.
                Do not include any explanations or additional text.
                """);

        return this.chatClient.prompt(pt.create())
                .call()
                .entity(new ParameterizedTypeReference<>() {});
    }

}
Java

Assuming you exported the OpenAI token to the OPEN_AI_TOKEN environment variable, you can run the application using the following command:

mvn spring-boot:run
ShellSession

Then, let’s call the http://localhost:8080/persons endpoint. It returns a list of 10 people with different nationalities.
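
For example (the response below is just an illustration; the generated values will differ on every run):

$ curl http://localhost:8080/persons
[{"id":1,"firstName":"Anna","lastName":"Kowalska","age":34,"gender":"FEMALE","nationality":"Polish"},
 {"id":2,"firstName":"James","lastName":"Smith","age":41,"gender":"MALE","nationality":"British"},
 ...]
ShellSession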

Now, we can change the PromptTemplate content and add the word “famous” before persons, just for fun.
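
The modified template differs by just one word:

PromptTemplate pt = new PromptTemplate("""
        Return a current list of 10 famous persons if exists or generate a new list with random values.
        Each object should contain an auto-incremented id field.
        Do not include any explanations or additional text.
        """);
Java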

The results are not surprising at all – “Elon Musk” enters the list 🙂 However, the list will be slightly different the second time you call the same endpoint. According to our prompt, the chat client should “return a current list of 10 persons”, so I expected to get the same list as before. The problem is that the chat client doesn’t remember the previous conversation.


Advisors and Chat Memory

Let’s try to change that. First, we should register an implementation of the ChatMemory interface as a bean. InMemoryChatMemory is good enough for our tests.

@SpringBootApplication
public class SpringAIShowcase {

    public static void main(String[] args) {
        SpringApplication.run(SpringAIShowcase.class, args);
    }

    @Bean
    InMemoryChatMemory chatMemory() {
        return new InMemoryChatMemory();
    }
}
Java

To enable conversation history for a chat client, we should define an advisor. The Spring AI Advisors API lets us intercept, modify, and enhance AI-driven interactions handled by Spring applications. Spring AI offers an API to create custom advisors, but we can also leverage several built-in ones. For example, PromptChatMemoryAdvisor retrieves the chat memory and adds it to the prompt’s system text, while SimpleLoggerAdvisor enables request/response logging. Let’s take a look at the latest implementation of the PersonController class. Besides the advisors, it contains a new GET /persons/{id} endpoint implementation. This endpoint takes a previously returned list of persons and seeks the object with the specified id. The PromptTemplate object specifies the id parameter filled with the value read from the path variable.

@RestController
@RequestMapping("/persons")
public class PersonController {

    private final ChatClient chatClient;

    public PersonController(ChatClient.Builder chatClientBuilder, 
                            ChatMemory chatMemory) {
        this.chatClient = chatClientBuilder
                .defaultAdvisors(
                        new PromptChatMemoryAdvisor(chatMemory),
                        new SimpleLoggerAdvisor())
                .build();
    }

    @GetMapping
    List<Person> findAll() {
        PromptTemplate pt = new PromptTemplate("""
                Return a current list of 10 persons if exists or generate a new list with random values.
                Each object should contain an auto-incremented id field.
                Do not include any explanations or additional text.
                """);

        return this.chatClient.prompt(pt.create())
                .call()
                .entity(new ParameterizedTypeReference<>() {});
    }

    @GetMapping("/{id}")
    Person findById(@PathVariable String id) {
        PromptTemplate pt = new PromptTemplate("""
                Find and return the object with id {id} in a current list of persons.
                """);
        Prompt p = pt.create(Map.of("id", id));
        return this.chatClient.prompt(p)
                .call()
                .entity(Person.class);
    }
}
Java
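
To actually see the output of SimpleLoggerAdvisor, we also have to raise the log level for the advisor package. The logger name below follows the Spring AI documentation:

logging.level.org.springframework.ai.chat.client.advisor = DEBUG
Plaintext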

Now, let’s make a final test. After the application restarts, we can call the endpoint that generates a list of persons. Then, we will call the GET /persons/{id} endpoint to display only a single person by ID. The Spring application reads the person from the list stored in the chat memory. Finally, we can repeat the call to the GET /persons endpoint to verify that it returns the same list. The whole sequence may look like this:
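
$ curl http://localhost:8080/persons
$ curl http://localhost:8080/persons/5
$ curl http://localhost:8080/persons
ShellSession

The id value 5 is just an example; any id from the generated list will do.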

Different Chat AI Models

Assuming you exported the Mistral AI token to the MISTRAL_AI_TOKEN environment variable, you can run the application using the following command. It activates the mistral-ai Maven profile and includes the starter with the Mistral AI support.

mvn spring-boot:run -Pmistral-ai
ShellSession

It returns responses similar to OpenAI’s, but there are some small differences. It always returns 0 in the age field and a 3-letter code instead of a full nationality name.


Let’s tweak our prompt template to get Mistral AI to generate an accurate age number.
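
A minimal sketch of such a tweak, assuming we simply spell the constraint out in the prompt (the added line is an example, not the original template wording):

// Sketch: the age constraint line is an assumption, not the original wording
PromptTemplate pt = new PromptTemplate("""
        Return a current list of 10 persons if exists or generate a new list with random values.
        Each object should contain an auto-incremented id field.
        Generate a realistic age between 18 and 80 for each person.
        Do not include any explanations or additional text.
        """);
Java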

Now, it looks much better. Even so, the names don’t always match up with the nationalities, so there’s still room for improvement.

The last test is for Ollama. Let’s run our application once again. This time we should activate the ollama-ai Maven profile.

mvn spring-boot:run -Pollama-ai
ShellSession

Then, we can repeat the same requests to compare the responses from Ollama. You can verify the results by yourself.

$ curl http://localhost:8080/persons
$ curl http://localhost:8080/persons/2
ShellSession

Final Thoughts

This example doesn’t do anything unusual; it just shows some basic features offered by the Spring AI Chat Model API. We quickly reviewed features like prompts, structured output, chat memory, and built-in advisors. We also switched between some popular AI chat model providers. You can expect more articles in this area soon. If you want to continue with the next part of the AI series on my blog, go here.

The post Getting Started with Spring AI and Chat Model appeared first on Piotr's TechBlog.

]]>
https://piotrminkowski.com/2025/01/28/getting-started-with-spring-ai-and-chat-model/feed/ 4 15494