# Prompt executors
Prompt executors provide a higher-level abstraction that lets you manage the lifecycle of one or more LLM clients. You can work with multiple LLM providers through a unified interface that abstracts away provider-specific details, with dynamic switching between providers and fallbacks.
## Executor types
Koog provides three main types of prompt executors that implement the PromptExecutor interface:
| Type | Class | Description |
|---|---|---|
| Single-provider | SingleLLMPromptExecutor | Wraps a single LLM client for one provider. Use this executor if your agent only requires switching between models within a single LLM provider. |
| Multi-provider | MultiLLMPromptExecutor | Wraps multiple LLM clients and routes calls based on the LLM provider. It can optionally use a configured fallback provider and LLM when the requested client is unavailable. Use this executor if your agent needs to switch between LLMs from different providers. |
| Routing | RoutingLLMPromptExecutor | Distributes requests to a given LLM model across multiple client instances using routing strategies. Use this executor to avoid rate limits, improve throughput, and implement failover strategies with load balancing. |
## Creating a single-provider executor
To create a prompt executor for a specific LLM provider, do the following:

- Configure an LLM client for a specific provider with the corresponding API key.
- Create a prompt executor using SingleLLMPromptExecutor.
Here is an example:
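The snippet below is a minimal sketch, assuming SingleLLMPromptExecutor accepts the configured client as a constructor argument (the API key value is a placeholder):

```kotlin
// Configure an LLM client for a specific provider
val openAIClient = OpenAILLMClient(apiKey = "OPENAI_API_KEY")

// Wrap the client in a single-provider executor
val promptExecutor = SingleLLMPromptExecutor(openAIClient)
```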
## Creating a multi-provider executor
To create a prompt executor that works with multiple LLM providers, do the following:

- Configure clients for the required LLM providers with the corresponding API keys.
- Pass the configured clients to the MultiLLMPromptExecutor class constructor to create a prompt executor with multiple LLM providers.
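For example, the following sketch configures OpenAI and Anthropic clients and passes them to the constructor as provider-to-client pairs (the API key values are placeholders):

```kotlin
// Configure a client for each required provider
val openAIClient = OpenAILLMClient("OPENAI_API_KEY")
val anthropicClient = AnthropicLLMClient("ANTHROPIC_API_KEY")

// Map each LLM provider to its client
val promptExecutor = MultiLLMPromptExecutor(
    LLMProvider.OpenAI to openAIClient,
    LLMProvider.Anthropic to anthropicClient
)
```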
## Creating a routing executor
Experimental API
Routing capabilities are experimental and may change in future releases.
To use them, opt in with @OptIn(ExperimentalRoutingApi::class).
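For example, the opt-in annotation can be applied to the declaration that uses the routing API (the function name below is illustrative):

```kotlin
@OptIn(ExperimentalRoutingApi::class)
fun createRoutingExecutor(): RoutingLLMPromptExecutor {
    val router = RoundRobinRouter(
        OpenAILLMClient(apiKey = "openai-key-1"),
        OpenAILLMClient(apiKey = "openai-key-2")
    )
    return RoutingLLMPromptExecutor(router)
}
```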
To create a prompt executor that distributes requests across multiple LLM client instances using routing strategies, do the following:

- Configure multiple client instances (they can be for the same or different LLM providers) with the corresponding API keys.
- Create a router using a routing strategy, such as RoundRobinRouter.
- Pass the router to the RoutingLLMPromptExecutor class constructor.
This is useful for avoiding rate limits, improving throughput, and implementing failover strategies.
```kotlin
// Create multiple client instances
val openAI1 = OpenAILLMClient(apiKey = "openai-key-1")
val openAI2 = OpenAILLMClient(apiKey = "openai-key-2")
val anthropic = AnthropicLLMClient(apiKey = "anthropic-key")

// Create a router with the round-robin strategy
val router = RoundRobinRouter(openAI1, openAI2, anthropic)

// Create a routing executor
val routingExecutor = RoutingLLMPromptExecutor(router)
```
```java
// Create multiple client instances
OpenAILLMClient openAI1 = new OpenAILLMClient("openai-key-1");
OpenAILLMClient openAI2 = new OpenAILLMClient("openai-key-2");
AnthropicLLMClient anthropic = new AnthropicLLMClient("anthropic-key");

// Create a router with the round-robin strategy
RoundRobinRouter router = new RoundRobinRouter(openAI1, openAI2, anthropic);

// Create a routing executor
RoutingLLMPromptExecutor routingExecutor = new RoutingLLMPromptExecutor(router);
```
When you execute prompts with this executor, requests to OpenAI models will alternate between openAI1 and openAI2 using the round-robin strategy.
Requests to Anthropic models always go to the single anthropic client, as round-robin maintains an independent counter per provider.
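Assuming the routing executor created above and a prompt built with the prompt DSL, the per-provider rotation can be sketched as:

```kotlin
val p = prompt("demo") { user("Summarize this.") }

// OpenAI requests alternate between the two OpenAI clients
routingExecutor.execute(p, OpenAIModels.Chat.GPT4o) // served by openAI1
routingExecutor.execute(p, OpenAIModels.Chat.GPT4o) // served by openAI2
routingExecutor.execute(p, OpenAIModels.Chat.GPT4o) // served by openAI1 again

// Anthropic requests always go to the single Anthropic client
routingExecutor.execute(p, AnthropicModels.Sonnet_4_5) // served by anthropic
```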
You can also implement custom routing strategies by creating a class that implements the LLMClientRouter interface.
## Pre-defined prompt executors
For faster setup, Koog provides ready-to-use executor implementations for common providers in both Kotlin and Java.
The following table includes the pre-defined single-provider executors
that return SingleLLMPromptExecutor configured with a specific LLM client.
| LLM provider | Prompt executor | Description |
|---|---|---|
| OpenAI | simpleOpenAIExecutor | Wraps OpenAILLMClient that runs prompts with OpenAI models. |
| OpenAI | simpleAzureOpenAIExecutor | Wraps OpenAILLMClient configured for using Azure OpenAI Service. |
| Anthropic | simpleAnthropicExecutor | Wraps AnthropicLLMClient that runs prompts with Anthropic models. |
| Google | simpleGoogleAIExecutor | Wraps GoogleLLMClient that runs prompts with Google models. |
| OpenRouter | simpleOpenRouterExecutor | Wraps OpenRouterLLMClient that runs prompts with OpenRouter. |
| Amazon Bedrock | simpleBedrockExecutor | Wraps BedrockLLMClient that runs prompts with AWS Bedrock. |
| Amazon Bedrock | simpleBedrockExecutorWithBearerToken | Wraps BedrockLLMClient and uses the provided Bedrock API key to send requests. |
| Mistral | simpleMistralAIExecutor | Wraps MistralAILLMClient that runs prompts with Mistral models. |
| Ollama | simpleOllamaAIExecutor | Wraps OllamaClient that runs prompts with Ollama. |
Here is an example of creating a pre-defined executor:
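For instance, simpleOpenAIExecutor returns a ready-to-use executor from just an API key (the value below is a placeholder):

```kotlin
// Create a SingleLLMPromptExecutor wrapping an OpenAILLMClient
val promptExecutor = simpleOpenAIExecutor("OPENAI_API_KEY")
```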
## Running a prompt
To run a prompt using a prompt executor, do the following:

- Create a prompt executor.
- Run the prompt with the specific LLM using the execute() method.
Here is an example:
```java
// Create an OpenAI executor
PromptExecutor promptExecutor = simpleOpenAIExecutor("OPENAI_API_KEY");

// Create a prompt
Prompt prompt = Prompt.builder("demo")
    .user("Summarize this.")
    .build();

// Run the prompt
List<Message.Response> response = promptExecutor.execute(prompt, OpenAIModels.Chat.GPT4o);
```
This will run the prompt with the GPT4o model and return the response.
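In Kotlin, the same flow can be sketched using the prompt DSL shown elsewhere in this document:

```kotlin
// Create an OpenAI executor
val promptExecutor = simpleOpenAIExecutor("OPENAI_API_KEY")

// Create a prompt
val prompt = prompt("demo") { user("Summarize this.") }

// Run the prompt
val response = promptExecutor.execute(prompt, OpenAIModels.Chat.GPT4o)
```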
Note
The prompt executors provide methods to run prompts using various capabilities, such as streaming, multiple choice generation, and content moderation. Since prompt executors wrap LLM clients, each executor supports the capabilities of the corresponding client. For details, refer to LLM clients.
## Switching between providers
When you work with multiple LLM providers using MultiLLMPromptExecutor, you can switch between them.
The process is as follows:

- Create an LLM client instance for each provider you want to use.
- Create a MultiLLMPromptExecutor that maps LLM providers to LLM clients.
- Run a prompt, passing a model from one of the configured providers as an argument to the execute() method. The prompt executor selects the corresponding client based on the model's provider and uses it to run the prompt.
Here is an example of switching between providers:
```kotlin
// Create LLM clients for the OpenAI, Anthropic, and Google providers
val openAIClient = OpenAILLMClient("OPENAI_API_KEY")
val anthropicClient = AnthropicLLMClient("ANTHROPIC_API_KEY")
val googleClient = GoogleLLMClient("GOOGLE_API_KEY")

// Create a MultiLLMPromptExecutor that maps LLM providers to LLM clients
val executor = MultiLLMPromptExecutor(
    LLMProvider.OpenAI to openAIClient,
    LLMProvider.Anthropic to anthropicClient,
    LLMProvider.Google to googleClient
)

// Create a prompt
val p = prompt("demo") { user("Summarize this.") }

// Run the prompt with an OpenAI model; the prompt executor automatically switches to the OpenAI client
val openAIResult = executor.execute(p, OpenAIModels.Chat.GPT4o)

// Run the prompt with an Anthropic model; the prompt executor automatically switches to the Anthropic client
val anthropicResult = executor.execute(p, AnthropicModels.Sonnet_4_5)
```
```java
// Create LLM clients for the OpenAI, Anthropic, and Google providers
OpenAILLMClient openAIClient = new OpenAILLMClient("OPENAI_API_KEY");
AnthropicLLMClient anthropicClient = new AnthropicLLMClient("ANTHROPIC_API_KEY");
GoogleLLMClient googleClient = new GoogleLLMClient("GOOGLE_API_KEY");

// Create a MultiLLMPromptExecutor that maps LLM providers to LLM clients
MultiLLMPromptExecutor promptExecutor = new MultiLLMPromptExecutor(
    Map.of(
        LLMProvider.OpenAI, openAIClient,
        LLMProvider.Anthropic, anthropicClient,
        LLMProvider.Google, googleClient
    )
);

// Create a prompt
Prompt prompt = Prompt.builder("demo")
    .user("Summarize this.")
    .build();

// Run the prompt with an OpenAI model; the prompt executor automatically switches to the OpenAI client
List<Message.Response> openAIResult = promptExecutor.execute(prompt, OpenAIModels.Chat.GPT4o);

// Run the prompt with an Anthropic model; the prompt executor automatically switches to the Anthropic client
List<Message.Response> anthropicResult = promptExecutor.execute(prompt, AnthropicModels.Sonnet_4_5);
```
You can optionally configure a fallback LLM provider and model to use when the requested client is unavailable. For details, refer to Configuring fallbacks.
## Configuring fallbacks
Multi-provider and routing prompt executors can be configured to use a fallback LLM provider and model when the requested LLM client is unavailable.
To configure the fallback mechanism, pass fallback settings when creating a MultiLLMPromptExecutor or RoutingLLMPromptExecutor:
```kotlin
val openAIClient = OpenAILLMClient(System.getenv("OPENAI_API_KEY"))
val ollamaClient = OllamaClient()

val multiExecutor = MultiLLMPromptExecutor(
    LLMProvider.OpenAI to openAIClient,
    LLMProvider.Ollama to ollamaClient,
    fallback = MultiLLMPromptExecutor.FallbackPromptExecutorSettings(
        fallbackProvider = LLMProvider.Ollama,
        fallbackModel = OllamaModels.Meta.LLAMA_3_2
    )
)
```
```java
OpenAILLMClient openAIClient = new OpenAILLMClient(System.getenv("OPENAI_API_KEY"));
OllamaClient ollamaClient = new OllamaClient();

MultiLLMPromptExecutor multiExecutor = new MultiLLMPromptExecutor(
    Map.of(
        LLMProvider.OpenAI, openAIClient,
        LLMProvider.Ollama, ollamaClient
    ),
    new MultiLLMPromptExecutor.FallbackPromptExecutorSettings(
        LLMProvider.Ollama,
        OllamaModels.Meta.LLAMA_3_2
    )
);
```
If you pass a model from an LLM provider that is not included in the MultiLLMPromptExecutor,
the prompt executor will use the fallback model:
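For example, with only the OpenAI and Ollama clients registered as above, a request with an Anthropic model would fall back to the configured Ollama model. This sketch builds on the fallback configuration shown above:

```kotlin
val p = prompt("demo") { user("Summarize this.") }

// Anthropic is not registered in multiExecutor, so the request
// falls back to LLMProvider.Ollama with OllamaModels.Meta.LLAMA_3_2
val response = multiExecutor.execute(p, AnthropicModels.Sonnet_4_5)
```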
Note
Fallbacks are available for the execute() and executeMultipleChoices() methods only.