Add example for Ollama observability
ThomasVitale committed Aug 16, 2024
1 parent 37778c9 commit 9832521
Showing 14 changed files with 427 additions and 3 deletions.
85 changes: 85 additions & 0 deletions 10-observability/observability-models-ollama/README.md
@@ -0,0 +1,85 @@
# LLM Observability: Ollama

LLM observability for Ollama: a Spring AI application that exports metrics and traces to a Grafana LGTM stack via OpenTelemetry.

## Running the application

The application relies on Ollama for providing LLMs. It also relies on Testcontainers to automatically provision
a Grafana LGTM observability stack.

### Ollama as a native application

First, make sure you have [Ollama](https://ollama.ai) installed on your laptop.
Then, use Ollama to pull the _mistral_ and _nomic-embed-text_ models. Those are the ones we'll use in this example.

```shell
ollama pull mistral
ollama pull nomic-embed-text
```

Finally, run the Spring Boot application.

```shell
./gradlew bootTestRun
```

## Observability Platform

Grafana listens on port 3000 inside its container. Check your container runtime to find the host port it is mapped to,
then access Grafana at http://localhost:<port>. The credentials are `admin`/`admin`.

The application is automatically configured to export metrics and traces to the Grafana LGTM stack via OpenTelemetry.
In Grafana, you can query the traces from the "Explore" page, selecting the "Tempo" data source. You can also visualize metrics in "Explore > Metrics".

## Calling the application

You can now call the application to perform generative AI operations.
This example uses [httpie](https://httpie.io) to send HTTP requests.

### Chat

```shell
http :8080/chat
```

Try passing your own prompt and check the result.

```shell
http :8080/chat message=="What is the capital of Italy?"
```
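
In httpie, the `==` separator sends `message` as a URL query parameter. For readers who prefer plain Java, the same call can be sketched with the JDK's built-in HTTP client. This is only an illustration: the class name `ChatRequestSketch` is hypothetical, and it assumes the application is reachable at `localhost:8080`, as in the commands above.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the example application.
class ChatRequestSketch {

    // Builds the URI targeted by `http :8080/chat message=="..."`:
    // the prompt travels as the URL-encoded `message` query parameter.
    static URI chatUri(String message) {
        String encoded = URLEncoder.encode(message, StandardCharsets.UTF_8);
        return URI.create("http://localhost:8080/chat?message=" + encoded);
    }

    public static void main(String[] args) throws InterruptedException {
        HttpRequest request = HttpRequest.newBuilder(chatUri("What is the capital of Italy?"))
                .GET()
                .build();
        try {
            HttpResponse<String> response = HttpClient.newHttpClient()
                    .send(request, HttpResponse.BodyHandlers.ofString());
            System.out.println(response.body());
        } catch (IOException e) {
            // Reached when the application is not running locally.
            System.out.println("Could not reach the application: " + e.getMessage());
        }
    }
}
```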

The next request is configured with a custom temperature value to obtain a more creative, yet less precise answer.

```shell
http :8080/chat/generic-options message=="Why is a raven like a writing desk? Give a short answer."
```
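
To get an intuition for why a higher temperature trades precision for creativity, here is a minimal sketch of temperature-scaled softmax, the general technique used when sampling tokens. It is not Ollama's actual implementation; the class name `TemperatureSketch` and the scores are made up for illustration.

```java
import java.util.Arrays;

// Hypothetical illustration, not part of the example application.
class TemperatureSketch {

    // Turns raw scores (logits) into probabilities after dividing them by the temperature.
    // Subtracting the maximum first keeps Math.exp numerically stable.
    static double[] softmax(double[] logits, double temperature) {
        double max = Arrays.stream(logits).max().orElseThrow();
        double[] exps = Arrays.stream(logits)
                .map(logit -> Math.exp((logit - max) / temperature))
                .toArray();
        double sum = Arrays.stream(exps).sum();
        return Arrays.stream(exps).map(e -> e / sum).toArray();
    }

    public static void main(String[] args) {
        double[] logits = {2.0, 1.0, 0.5}; // made-up scores for three candidate tokens
        System.out.println("T=0.7: " + Arrays.toString(softmax(logits, 0.7)));
        System.out.println("T=1.3: " + Arrays.toString(softmax(logits, 1.3)));
    }
}
```

With the higher temperature the distribution is flatter, so less likely tokens are picked more often.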

The next request is configured with Ollama-specific customizations.

```shell
http :8080/chat/ollama-options message=="What can you see beyond what you can see? Give a short answer."
```

Finally, try a request which uses function calling.

```shell
http :8080/chat/functions author=="Philip Pullman"
```
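
The functions exposed to the model in this example are plain `java.util.function.Function` beans. The sketch below mirrors the shape of the `booksByAuthor` function in a self-contained form, without Spring. The class name `FunctionCallingSketch` and the shortened book list are assumptions made for illustration; in the application, the model decides when to request the function and with which author.

```java
import java.util.List;
import java.util.function.Function;

// Hypothetical, Spring-free illustration of the function shape used in the example.
class FunctionCallingSketch {

    record Book(String title, String author) {}
    record Author(String name) {}

    // Shortened, illustrative catalogue.
    static final List<Book> BOOKS = List.of(
            new Book("His Dark Materials", "Philip Pullman"),
            new Book("The Hobbit", "J.R.R. Tolkien"),
            new Book("The Silmarillion", "J.R.R. Tolkien"));

    // Same signature as the booksByAuthor bean: takes an Author, returns the matching books.
    static final Function<Author, List<Book>> BOOKS_BY_AUTHOR = author -> BOOKS.stream()
            .filter(book -> author.name().equals(book.author()))
            .toList();

    public static void main(String[] args) {
        // Here the function is invoked directly instead of on behalf of a model.
        System.out.println(BOOKS_BY_AUTHOR.apply(new Author("Philip Pullman")));
    }
}
```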

### Embedding

```shell
http :8080/embed
```

Try passing your own text and check the result.

```shell
http :8080/embed message=="The capital of Italy is Rome"
```

The next request is configured with Ollama-specific customizations.

```shell
http :8080/embed/ollama-options message=="The capital of Italy is Rome"
```
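
The endpoints above only report the size of the embedding vector. A common next step is to compare two embeddings with cosine similarity. The sketch below shows that computation on `float[]` vectors; the class name `CosineSimilaritySketch` and the tiny sample vectors are hypothetical, and real embeddings would come from the embedding model.

```java
// Hypothetical illustration, not part of the example application.
class CosineSimilaritySketch {

    // Cosine similarity: dot product divided by the product of the vector lengths.
    static double cosineSimilarity(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vectors must have the same size");
        }
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        float[] first = {0.1f, 0.8f, 0.3f};  // made-up vectors standing in for embeddings
        float[] second = {0.2f, 0.7f, 0.4f};
        System.out.println("Similarity: " + cosineSimilarity(first, second));
    }
}
```

Values close to 1 indicate vectors pointing in nearly the same direction, which for embeddings usually means similar meaning.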
45 changes: 45 additions & 0 deletions 10-observability/observability-models-ollama/build.gradle
@@ -0,0 +1,45 @@
plugins {
    id 'java'
    id 'org.springframework.boot'
    id 'io.spring.dependency-management'
}

group = 'com.thomasvitale'
version = '0.0.1-SNAPSHOT'

java {
    toolchain {
        languageVersion = JavaLanguageVersion.of(22)
    }
}

repositories {
    mavenLocal()
    mavenCentral()
    maven { url 'https://repo.spring.io/milestone' }
    maven { url 'https://repo.spring.io/snapshot' }
}

dependencies {
    implementation platform("org.springframework.ai:spring-ai-bom:${springAiVersion}")

    implementation 'org.springframework.boot:spring-boot-starter-actuator'
    implementation 'org.springframework.boot:spring-boot-starter-web'
    implementation 'org.springframework.ai:spring-ai-ollama-spring-boot-starter'

    implementation 'io.micrometer:micrometer-tracing-bridge-otel'
    implementation 'io.opentelemetry:opentelemetry-exporter-otlp'
    implementation 'io.micrometer:micrometer-registry-otlp'
    implementation 'net.ttddyy.observation:datasource-micrometer-spring-boot:1.0.5'

    testAndDevelopmentOnly 'org.springframework.boot:spring-boot-devtools'

    testImplementation 'org.springframework.boot:spring-boot-starter-test'
    testImplementation 'org.springframework.boot:spring-boot-testcontainers'
    testImplementation 'org.testcontainers:junit-jupiter'
    testRuntimeOnly 'org.junit.platform:junit-platform-launcher'
}

tasks.named('test') {
    useJUnitPlatform()
}
@@ -0,0 +1,40 @@
package com.thomasvitale.ai.spring;

import org.springframework.stereotype.Service;

import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

@Service
public class BookService {

    private static final Map<Integer,Book> books = new ConcurrentHashMap<>();

    static {
        books.put(1, new Book("His Dark Materials", "Philip Pullman"));
        books.put(2, new Book("Narnia", "C.S. Lewis"));
        books.put(3, new Book("The Hobbit", "J.R.R. Tolkien"));
        books.put(4, new Book("The Lord of The Rings", "J.R.R. Tolkien"));
        books.put(5, new Book("The Silmarillion", "J.R.R. Tolkien"));
    }

    List<Book> getBooksByAuthor(Author author) {
        return books.values().stream()
                .filter(book -> author.name().equals(book.author()))
                .toList();
    }

    Book getBestsellerByAuthor(Author author) {
        return switch (author.name()) {
            case "J.R.R. Tolkien" -> books.get(4);
            case "C.S. Lewis" -> books.get(2);
            case "Philip Pullman" -> books.get(1);
            default -> null;
        };
    }

    public record Book(String title, String author) {}
    public record Author(String name) {}

}
@@ -0,0 +1,65 @@
package com.thomasvitale.ai.spring;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.ai.chat.model.ChatModel;
import org.springframework.ai.chat.prompt.ChatOptionsBuilder;
import org.springframework.ai.chat.prompt.Prompt;
import org.springframework.ai.ollama.api.OllamaOptions;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

import java.util.List;
import java.util.Set;

@RestController
class ChatController {

    private final Logger logger = LoggerFactory.getLogger(ChatController.class);

    private final ChatModel chatModel;

    ChatController(ChatModel chatModel) {
        this.chatModel = chatModel;
    }

    @GetMapping("/chat")
    String chat(@RequestParam(defaultValue = "What did Gandalf say to the Balrog?") String message) {
        logger.info(message);
        return chatModel.call(message);
    }

    @GetMapping("/chat/generic-options")
    String chatWithGenericOptions(@RequestParam(defaultValue = "What did Gandalf say to the Balrog?") String message) {
        return chatModel.call(new Prompt(message, ChatOptionsBuilder.builder()
                        .withTemperature(1.3f)
                        .build()))
                .getResult().getOutput().getContent();
    }

    @GetMapping("/chat/ollama-options")
    String chatWithOllamaOptions(@RequestParam(defaultValue = "What did Gandalf say to the Balrog?") String message) {
        return chatModel.call(new Prompt(message, OllamaOptions.builder()
                        .withFrequencyPenalty(1.3f)
                        .withNumPredict(1500)
                        .withPresencePenalty(1.0f)
                        .withStop(List.of("this-is-the-end", "addio"))
                        .withTemperature(0.7f)
                        .withTopK(1)
                        .withTopP(0f)
                        .build()))
                .getResult().getOutput().getContent();
    }

    @GetMapping("/chat/functions")
    String chatWithFunctions(@RequestParam(defaultValue = "Philip Pullman") String author) {
        return chatModel.call(new Prompt("What books written by %s are available to read and what is their bestseller?".formatted(author),
                        OllamaOptions.builder()
                                .withTemperature(0.3f)
                                .withFunctions(Set.of("booksByAuthor", "bestsellerBookByAuthor"))
                                .build()))
                .getResult().getOutput().getContent();
    }

}
@@ -0,0 +1,35 @@
package com.thomasvitale.ai.spring;

import org.springframework.ai.embedding.EmbeddingModel;
import org.springframework.ai.embedding.EmbeddingRequest;
import org.springframework.ai.ollama.api.OllamaOptions;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

import java.util.List;

@RestController
class EmbeddingController {

    private final EmbeddingModel embeddingModel;

    EmbeddingController(EmbeddingModel embeddingModel) {
        this.embeddingModel = embeddingModel;
    }

    @GetMapping("/embed")
    String embed(@RequestParam(defaultValue = "And Gandalf yelled: 'You shall not pass!'") String message) {
        var embeddings = embeddingModel.embed(message);
        return "Size of the embedding vector: " + embeddings.length;
    }

    @GetMapping("/embed/ollama-options")
    String embedWithOllamaOptions(@RequestParam(defaultValue = "And Gandalf yelled: 'You shall not pass!'") String message) {
        var embeddings = embeddingModel.call(new EmbeddingRequest(List.of(message), OllamaOptions.builder()
                        .build()))
                .getResult().getOutput();
        return "Size of the embedding vector: " + embeddings.length;
    }

}
@@ -0,0 +1,25 @@
package com.thomasvitale.ai.spring;

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.context.annotation.Description;

import java.util.List;
import java.util.function.Function;

@Configuration(proxyBeanMethods = false)
public class Functions {

    @Bean
    @Description("Get the list of available books written by the given author")
    public Function<BookService.Author, List<BookService.Book>> booksByAuthor(BookService bookService) {
        return bookService::getBooksByAuthor;
    }

    @Bean
    @Description("Get the bestseller book written by the given author")
    public Function<BookService.Author, BookService.Book> bestsellerBookByAuthor(BookService bookService) {
        return bookService::getBestsellerByAuthor;
    }

}
@@ -0,0 +1,27 @@
package com.thomasvitale.ai.spring;

import org.springframework.boot.web.client.ClientHttpRequestFactories;
import org.springframework.boot.web.client.ClientHttpRequestFactorySettings;
import org.springframework.boot.web.client.RestClientCustomizer;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.http.client.BufferingClientHttpRequestFactory;

import java.time.Duration;

@Configuration(proxyBeanMethods = false)
public class HttpClientConfig {

    @Bean
    RestClientCustomizer restClientCustomizer() {
        return restClientBuilder -> {
            restClientBuilder
                    .requestFactory(new BufferingClientHttpRequestFactory(
                            ClientHttpRequestFactories.get(ClientHttpRequestFactorySettings.DEFAULTS
                                    .withConnectTimeout(Duration.ofSeconds(60))
                                    .withReadTimeout(Duration.ofSeconds(60))
                            )));
        };
    }

}
@@ -0,0 +1,13 @@
package com.thomasvitale.ai.spring;

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;

@SpringBootApplication
public class ObservabilityModelsOllamaApplication {

    public static void main(String[] args) {
        SpringApplication.run(ObservabilityModelsOllamaApplication.class, args);
    }

}
@@ -0,0 +1,34 @@
spring:
  application:
    name: observability-models-ollama
  ai:
    chat:
      observations:
        include-completion: true
        include-prompt: true
    image:
      observations:
        include-prompt: true
    ollama:
      chat:
        options:
          model: mistral
          temperature: 0.7
      embedding:
        options:
          model: nomic-embed-text

management:
  endpoints:
    web:
      exposure:
        include: "*"
  metrics:
    tags:
      service.name: ${spring.application.name}
  tracing:
    sampling:
      probability: 1.0
  otlp:
    tracing:
      endpoint: http://localhost:4318/v1/traces
@@ -0,0 +1,13 @@
package com.thomasvitale.ai.spring;

import org.junit.jupiter.api.Test;
import org.springframework.boot.test.context.SpringBootTest;

@SpringBootTest
class ObservabilityModelsOllamaApplicationTests {

    @Test
    void contextLoads() {
    }

}