LLM clients
LLM clients are designed for direct interaction with LLM providers.
Each client implements the LLMClient interface, which provides methods for executing prompts and streaming responses.
You can use an LLM client when you work with a single LLM provider and don't need advanced lifecycle management. If you need to manage multiple LLM providers, use a prompt executor.
The table below shows the available LLM clients and their capabilities.
The * symbol indicates that additional details are provided in the Notes column.
| LLM provider | LLMClient | Tool calling | Streaming | Multiple choices | Embeddings | Moderation | Model listing | Notes |
|---|---|---|---|---|---|---|---|---|
| OpenAI | OpenAILLMClient | ✓ | ✓ | ✓ | ✓ | ✓* | ✓ | Supports moderation via the OpenAI Moderation API. |
| Anthropic | AnthropicLLMClient | ✓ | ✓ | - | - | - | - | - |
| Google | GoogleLLMClient | ✓ | ✓ | ✓ | ✓ | - | ✓ | - |
| DeepSeek | DeepSeekLLMClient | ✓ | ✓ | ✓ | - | - | ✓ | OpenAI-compatible chat client. |
| OpenRouter | OpenRouterLLMClient | ✓ | ✓ | ✓ | - | - | ✓ | OpenAI-compatible router client. |
| Amazon Bedrock | BedrockLLMClient | ✓ | ✓ | - | ✓ | ✓* | - | JVM-only AWS SDK client that supports multiple model families. Moderation requires Guardrails configuration. |
| Mistral | MistralAILLMClient | ✓ | ✓ | ✓ | ✓ | ✓* | ✓ | OpenAI-compatible client that supports moderation via the Mistral v1/moderations endpoint. |
| Alibaba | DashScopeLLMClient | ✓ | ✓ | ✓ | - | - | ✓ | OpenAI-compatible client that exposes provider-specific parameters (enableSearch, parallelToolCalls, enableThinking). |
| Ollama | OllamaClient | ✓ | ✓ | - | ✓ | ✓ | - | Local server client with model management APIs. |
Running a prompt
To run a prompt using an LLM client, perform the following:
- Create an LLM client that handles the connection between your application and LLM providers.
- Call the execute() method with the prompt and LLM as arguments.
Here is an example that uses OpenAILLMClient to run prompts:
fun main() = runBlocking {
    // Create an OpenAI client
    val token = System.getenv("OPENAI_API_KEY")
    val client = OpenAILLMClient(token)

    // Create a prompt
    val prompt = prompt("prompt_name", LLMParams()) {
        // Add a system message to set the context
        system("You are a helpful assistant.")

        // Add a user message
        user("Tell me about Kotlin")

        // You can also add assistant messages for few-shot examples
        assistant("Kotlin is a modern programming language...")

        // Add another user message
        user("What are its key features?")
    }

    // Run the prompt
    val response = client.execute(prompt, OpenAIModels.Chat.GPT4o)

    // Print the response
    println(response)
}
Streaming responses
Note
Available for all LLM clients.
When you need to process responses as they are generated,
you can use the executeStreaming() method to stream the model output:
// Set up the OpenAI client with your API key
val token = System.getenv("OPENAI_API_KEY")
val client = OpenAILLMClient(token)

val response = client.executeStreaming(
    prompt = prompt("stream_demo") { user("Stream this response in short chunks.") },
    model = OpenAIModels.Chat.GPT4_1
)

response.collect { event ->
    when (event) {
        is StreamFrame.Append -> println(event.text)
        is StreamFrame.ToolCall -> println("\nTool call: ${event.name}")
        is StreamFrame.End -> println("\n[done] Reason: ${event.finishReason}")
    }
}
Multiple choices
Note
Available for all LLM clients except GoogleLLMClient, BedrockLLMClient, and OllamaClient.
You can request multiple alternative responses from the model in a single call by using the executeMultipleChoices() method.
This requires setting the numberOfChoices LLM parameter in the prompt being executed.
fun main() = runBlocking {
    val apiKey = System.getenv("OPENAI_API_KEY")
    val client = OpenAILLMClient(apiKey)

    val choices = client.executeMultipleChoices(
        prompt = prompt("n_best", params = LLMParams(numberOfChoices = 3)) {
            system("You are a creative assistant.")
            user("Give me three different opening lines for a story.")
        },
        model = OpenAIModels.Chat.GPT4o
    )

    choices.forEachIndexed { i, choice ->
        val text = choice.joinToString(" ") { it.content }
        println("Line #${i + 1}: $text")
    }
}
Tip
You can set the numberOfChoices LLM parameter directly when building the prompt, as shown in the example above.
Listing available models
Note
Available for all LLM clients except GoogleLLMClient, BedrockLLMClient, and OllamaClient.
To get a list of available model IDs supported by the LLM client, use the models() method:
fun main() = runBlocking {
    val apiKey = System.getenv("OPENAI_API_KEY")
    val client = OpenAILLMClient(apiKey)

    val ids: List<String> = client.models()
    ids.forEach { println(it) }
}
Embeddings
Note
Available for OpenAILLMClient, GoogleLLMClient, BedrockLLMClient, MistralAILLMClient, and OllamaClient.
You can convert text into embedding vectors using the embed() method.
Choose an embedding model and pass your text to this method:
fun main() = runBlocking {
    val apiKey = System.getenv("OPENAI_API_KEY")
    val client = OpenAILLMClient(apiKey)

    val embedding = client.embed(
        text = "This is a sample text for embedding",
        model = OpenAIModels.Embeddings.TextEmbedding3Large
    )
    println("Embedding size: ${embedding.size}")
}
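Embedding vectors are usually compared with cosine similarity. The sketch below is a minimal illustration that assumes embed() returns a list of numeric values (as suggested by the size call above); the similarity function itself is plain Kotlin and does not rely on any Koog API.

// Cosine similarity between two embedding vectors (plain Kotlin helper, not a Koog API)
fun cosineSimilarity(a: List<Double>, b: List<Double>): Double {
    require(a.size == b.size) { "Embeddings must have the same dimensionality" }
    val dot = a.indices.sumOf { a[it] * b[it] }
    val normA = kotlin.math.sqrt(a.sumOf { it * it })
    val normB = kotlin.math.sqrt(b.sumOf { it * it })
    return dot / (normA * normB)
}

fun main() = runBlocking {
    val apiKey = System.getenv("OPENAI_API_KEY")
    val client = OpenAILLMClient(apiKey)

    // Embed two related texts with the same model (assumes embed() returns a List<Double>)
    val first = client.embed(
        text = "Kotlin is a modern programming language",
        model = OpenAIModels.Embeddings.TextEmbedding3Large
    )
    val second = client.embed(
        text = "Kotlin is developed by JetBrains",
        model = OpenAIModels.Embeddings.TextEmbedding3Large
    )

    println("Cosine similarity: ${cosineSimilarity(first, second)}")
}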
Moderation
Note
Available for the following LLM clients: OpenAILLMClient, BedrockLLMClient, MistralAILLMClient, OllamaClient.
You can use the moderate() method with a moderation model to check whether a prompt contains inappropriate content:
fun main() = runBlocking {
    val apiKey = System.getenv("OPENAI_API_KEY")
    val client = OpenAILLMClient(apiKey)

    val result = client.moderate(
        prompt = prompt("moderation") {
            user("This is a test message that may contain offensive content.")
        },
        model = OpenAIModels.Moderation.Omni
    )
    println(result)
}
Integration with prompt executors
Prompt executors wrap LLM clients and provide additional functionality, such as routing, fallbacks, and unified usage across providers. They are recommended for production use, as they offer flexibility when working with multiple providers.
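For example, an existing client can be wrapped in an executor instead of being called directly. The sketch below assumes the SingleLLMPromptExecutor class described in the prompt executor documentation; treat the exact constructor and method signatures as illustrative rather than definitive.

fun main() = runBlocking {
    // Reuse a client configured as shown earlier on this page
    val client = OpenAILLMClient(System.getenv("OPENAI_API_KEY"))

    // Wrap the client in a single-provider executor (assumed API: SingleLLMPromptExecutor)
    val executor = SingleLLMPromptExecutor(client)

    // Run a prompt through the executor instead of the client; the prompt and model stay the same
    val response = executor.execute(
        prompt("executor_demo") { user("Tell me about Kotlin") },
        OpenAIModels.Chat.GPT4o
    )
    println(response)
}

Because the executor exposes the same prompt-and-model entry point, switching from a direct client call to an executor typically requires no changes to how prompts are built.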