LLM parameters
This page provides details about LLM parameters in the Koog agentic framework. LLM parameters let you control and customize the behavior of language models.
Overview
LLM parameters are configuration options that let you fine-tune how language models generate responses. These parameters control aspects like response randomness, length, format, and tool usage. By adjusting these parameters, you can optimize model behavior for different use cases, from creative content generation to deterministic structured outputs.
In Koog, the LLMParams class encapsulates LLM parameters and provides a consistent interface for configuring language model behavior. You can use LLM parameters in the following ways:
- When creating a prompt:
```kotlin
val prompt = prompt(
    id = "dev-assistant",
    params = LLMParams(
        temperature = 0.7,
        maxTokens = 500
    )
) {
    // Add a system message to set the context
    system("You are a helpful assistant.")
    // Add a user message
    user("Tell me about Kotlin")
}
```
For more information about prompt creation, see Prompts.
- When creating a subgraph:
```kotlin
val processQuery by subgraphWithTask<String, String>(
    tools = listOf(searchTool, calculatorTool, weatherTool),
    llmModel = OpenAIModels.Chat.GPT4o,
    llmParams = LLMParams(
        temperature = 0.7,
        maxTokens = 500
    ),
    runMode = ToolCalls.SEQUENTIAL,
    assistantResponseRepeatMax = 3,
) { userQuery ->
    """
    You are a helpful assistant that can answer questions about various topics.
    Please help with the following query:
    $userQuery
    """
}
```
For more information about existing subgraph types in Koog, see Predefined subgraphs. To learn how to create and implement your own subgraphs, see Custom subgraphs.
- When updating a prompt in an LLM write session:
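A minimal sketch of this is shown below. It assumes a write session opened with llm.writeSession and a rewritePrompt call that replaces the prompt's parameters; treat these call names as assumptions and check the sessions documentation for the exact API.
```kotlin
llm.writeSession {
    // Replace the prompt's parameters while keeping its messages.
    // rewritePrompt and the copy(params = ...) update are assumptions in this sketch;
    // see the sessions documentation for the exact API.
    rewritePrompt { prompt ->
        prompt.copy(
            params = LLMParams(
                temperature = 0.2,
                maxTokens = 500
            )
        )
    }
}
```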
For more information about sessions, see LLM sessions and manual history management.
LLM parameter reference
The following table provides a reference of LLM parameters included in the LLMParams class and supported by all LLM providers that are available in Koog out of the box.
For a list of parameters that are specific to some providers, see Provider-specific parameters.
| Parameter | Type | Description |
|---|---|---|
| temperature | Double | Controls randomness in the output. Higher values, such as 0.7–1.0, produce more diverse and creative responses, while lower values produce more deterministic and focused responses. |
| maxTokens | Integer | Maximum number of tokens to generate in the response. Useful for controlling response length. |
| numberOfChoices | Integer | Number of alternative responses to generate. Must be greater than 0. |
| speculation | String | A speculative configuration string that influences model behavior. Supported only by certain models, where it may greatly improve response speed and accuracy. |
| schema | Schema | Defines the structure for the model's response format, enabling structured outputs like JSON. For more information, see Schema. |
| toolChoice | ToolChoice | Controls the tool-calling behavior of the language model. For more information, see Tool choice. |
| user | String | Identifier for the user making the request, which can be used for tracking purposes. |
| additionalProperties | Map<String, JsonElement> | Additional properties that can be used to store custom parameters specific to certain model providers. |
For the default values of each parameter, see the corresponding LLM provider documentation.
Schema
The Schema interface defines the structure for the model's response format.
Koog supports JSON schemas, as described in the sections below.
JSON schemas
JSON schemas let you request structured JSON data from language models. Koog supports the following two types of JSON schemas:
1) Basic JSON Schema (LLMParams.Schema.JSON.Basic): Used for basic JSON processing capabilities. This format primarily focuses on nested data definitions without advanced JSON Schema functionalities.
```kotlin
// Create parameters with a basic JSON schema
val jsonParams = LLMParams(
    temperature = 0.2,
    schema = LLMParams.Schema.JSON.Basic(
        name = "PersonInfo",
        schema = JsonObject(mapOf(
            "type" to JsonPrimitive("object"),
            "properties" to JsonObject(
                mapOf(
                    "name" to JsonObject(mapOf("type" to JsonPrimitive("string"))),
                    "age" to JsonObject(mapOf("type" to JsonPrimitive("number"))),
                    "skills" to JsonObject(
                        mapOf(
                            "type" to JsonPrimitive("array"),
                            "items" to JsonObject(mapOf("type" to JsonPrimitive("string")))
                        )
                    )
                )
            ),
            "additionalProperties" to JsonPrimitive(false),
            "required" to JsonArray(listOf(JsonPrimitive("name"), JsonPrimitive("age"), JsonPrimitive("skills")))
        ))
    )
)
```
2) Standard JSON Schema (LLMParams.Schema.JSON.Standard): Represents a standard JSON schema according to json-schema.org. This format is a proper subset of the official JSON Schema specification. Note that the supported flavor might vary across LLM providers, since not all of them support full JSON schemas.
```kotlin
// Create parameters with a standard JSON schema
val standardJsonParams = LLMParams(
    temperature = 0.2,
    schema = LLMParams.Schema.JSON.Standard(
        name = "ProductCatalog",
        schema = JsonObject(mapOf(
            "type" to JsonPrimitive("object"),
            "properties" to JsonObject(mapOf(
                "products" to JsonObject(mapOf(
                    "type" to JsonPrimitive("array"),
                    "items" to JsonObject(mapOf(
                        "type" to JsonPrimitive("object"),
                        "properties" to JsonObject(mapOf(
                            "id" to JsonObject(mapOf("type" to JsonPrimitive("string"))),
                            "name" to JsonObject(mapOf("type" to JsonPrimitive("string"))),
                            "price" to JsonObject(mapOf("type" to JsonPrimitive("number"))),
                            "description" to JsonObject(mapOf("type" to JsonPrimitive("string")))
                        )),
                        "additionalProperties" to JsonPrimitive(false),
                        "required" to JsonArray(listOf(JsonPrimitive("id"), JsonPrimitive("name"), JsonPrimitive("price"), JsonPrimitive("description")))
                    ))
                ))
            )),
            "additionalProperties" to JsonPrimitive(false),
            "required" to JsonArray(listOf(JsonPrimitive("products")))
        ))
    )
)
```
Tool choice
The ToolChoice class controls how the language model uses tools. It provides the following options:
- LLMParams.ToolChoice.Named: the language model calls the specified tool. Takes the name string argument that represents the name of the tool to call.
- LLMParams.ToolChoice.All: the language model calls all tools.
- LLMParams.ToolChoice.None: the language model does not call tools and only generates text.
- LLMParams.ToolChoice.Auto: the language model automatically decides whether to call tools and which tool to call.
- LLMParams.ToolChoice.Required: the language model calls at least one tool.
Here is an example of using the LLMParams.ToolChoice.Named class to call a specific tool. The tool name in this example is illustrative; use the name of a tool that is actually available to your agent:
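```kotlin
// Force the model to call one specific tool by name.
// The tool name below is illustrative; use the name of a tool registered with your agent.
val namedToolParams = LLMParams(
    toolChoice = LLMParams.ToolChoice.Named(name = "searchTool")
)
```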
Provider-specific parameters
Koog supports provider-specific parameters for some LLM providers. These parameters extend the base LLMParams class and add provider-specific functionality. The following classes include parameters that are specific to each provider:
- OpenAIChatParams: Parameters specific to the OpenAI Chat Completions API.
- OpenAIResponsesParams: Parameters specific to the OpenAI Responses API.
- GoogleParams: Parameters specific to Google models.
- AnthropicParams: Parameters specific to Anthropic models.
- MistralAIParams: Parameters specific to Mistral models.
- DeepSeekParams: Parameters specific to DeepSeek models.
- OpenRouterParams: Parameters specific to OpenRouter models.
- DashscopeParams: Parameters specific to Alibaba models.
Here is the complete reference of provider-specific parameters in Koog, grouped by parameter class:
OpenAIChatParams:

| Parameter | Type | Description |
|---|---|---|
| audio | OpenAIAudioConfig | Audio output configuration when using audio-capable models. For more information, see the API documentation for OpenAIAudioConfig. |
| frequencyPenalty | Double | Penalizes frequent tokens to reduce repetition. Higher frequencyPenalty values result in larger variations of phrasing and reduced repetition. Takes a value in the range of -2.0 to 2.0. |
| logprobs | Boolean | If true, includes log-probabilities for output tokens. |
| parallelToolCalls | Boolean | If true, multiple tool calls can run in parallel. Particularly applicable to custom nodes or LLM interactions outside of agent strategies. |
| presencePenalty | Double | Prevents the model from reusing tokens that have already been included in the output. Higher values encourage the introduction of new tokens and topics. Takes a value in the range of -2.0 to 2.0. |
| promptCacheKey | String | Stable cache key for prompt caching. OpenAI uses it to cache responses for similar requests. |
| reasoningEffort | ReasoningEffort | Specifies the level of reasoning effort that the model will use. For more information and available values, see the API documentation for ReasoningEffort. |
| safetyIdentifier | String | A stable and unique user identifier that may be used to detect users who violate OpenAI policies. |
| serviceTier | ServiceTier | OpenAI processing tier selection that lets you prioritize performance over cost or vice versa. For more information, see the API documentation for ServiceTier. |
| stop | List<String> | Strings that signal the model to stop generating content when it encounters any of them. For example, to make the model stop when it produces two consecutive newlines, specify stop = listOf("\n\n"). |
| store | Boolean | If true, the provider may store outputs for later retrieval. |
| topLogprobs | Integer | Number of top most likely tokens per position. Takes a value in the range of 0–20. Requires the logprobs parameter to be set to true. |
| topP | Double | Also referred to as nucleus sampling. Creates a subset of next tokens by adding tokens with the highest probability values to the subset until the sum of their probabilities reaches the specified topP value. Takes a value greater than 0.0 and lower than or equal to 1.0. |
| webSearchOptions | OpenAIWebSearchOptions | Configures web search tool usage (if supported). For more information, see the API documentation for OpenAIWebSearchOptions. |
OpenAIResponsesParams:

| Parameter | Type | Description |
|---|---|---|
| background | Boolean | If true, runs the model response in the background. |
| include | List<OpenAIInclude> | Additional data to include in the model's response, such as the sources of a web search tool call or the search results of a file search tool call. For detailed reference information, see OpenAIInclude in the Koog API reference. To learn more about the include parameter, see OpenAI's documentation. |
| logprobs | Boolean | If true, includes log-probabilities for output tokens. |
| maxToolCalls | Integer | Maximum total number of built-in tool calls allowed in this response. Takes a value equal to or greater than 0. |
| parallelToolCalls | Boolean | If true, multiple tool calls can run in parallel. Particularly applicable to custom nodes or LLM interactions outside of agent strategies. |
| promptCacheKey | String | Stable cache key for prompt caching. OpenAI uses it to cache responses for similar requests. |
| reasoning | ReasoningConfig | Reasoning configuration for reasoning-capable models. For more information, see the API documentation for ReasoningConfig. |
| safetyIdentifier | String | A stable and unique user identifier that may be used to detect users who violate OpenAI policies. |
| serviceTier | ServiceTier | OpenAI processing tier selection that lets you prioritize performance over cost or vice versa. For more information, see the API documentation for ServiceTier. |
| store | Boolean | If true, the provider may store outputs for later retrieval. |
| topLogprobs | Integer | Number of top most likely tokens per position. Takes a value in the range of 0–20. Requires the logprobs parameter to be set to true. |
| topP | Double | Also referred to as nucleus sampling. Creates a subset of next tokens by adding tokens with the highest probability values to the subset until the sum of their probabilities reaches the specified topP value. Takes a value greater than 0.0 and lower than or equal to 1.0. |
| truncation | Truncation | Truncation strategy when nearing the context window. For more information, see the API documentation for Truncation. |
GoogleParams:

| Parameter | Type | Description |
|---|---|---|
| thinkingConfig | GoogleThinkingConfig | Controls whether the model should expose its chain-of-thought and how many tokens it may spend on it. For more information, see the API reference for GoogleThinkingConfig. |
| topK | Integer | Number of top tokens to consider when generating the output. Takes a value greater than or equal to 0 (provider-specific minimums may apply). |
| topP | Double | Also referred to as nucleus sampling. Creates a subset of next tokens by adding tokens with the highest probability values to the subset until the sum of their probabilities reaches the specified topP value. Takes a value greater than 0.0 and lower than or equal to 1.0. |
AnthropicParams:

| Parameter | Type | Description |
|---|---|---|
| container | String | Container identifier for reuse across requests. Containers are used by Anthropic's code execution tool to provide a secure and containerized code execution environment. By providing the container identifier from a previous response, you can reuse containers across multiple requests, which preserves created files between requests. For more information, see Containers in Anthropic's documentation. |
| mcpServers | List<AnthropicMCPServerURLDefinition> | Definitions of MCP servers to be used in the request. Supports at most 20 servers. For more information, see the API reference for AnthropicMCPServerURLDefinition. |
| serviceTier | ServiceTier | Processing tier selection that lets you prioritize performance over cost or vice versa. For more information, see the API documentation for ServiceTier. |
| stopSequences | List<String> | Custom text sequences that cause the model to stop generating content. If matched, the value of stop_reason in the response is stop_sequence. |
| thinking | AnthropicThinking | Configuration for activating Claude's extended thinking. When activated, responses also include thinking content blocks. For more information, see the API reference for AnthropicThinking. |
| topK | Integer | Number of top tokens to consider when generating the output. Takes a value greater than or equal to 0 (provider-specific minimums may apply). |
| topP | Double | Also referred to as nucleus sampling. Creates a subset of next tokens by adding tokens with the highest probability values to the subset until the sum of their probabilities reaches the specified topP value. Takes a value greater than 0.0 and lower than or equal to 1.0. |
MistralAIParams:

| Parameter | Type | Description |
|---|---|---|
| frequencyPenalty | Double | Penalizes frequent tokens to reduce repetition. Higher frequencyPenalty values result in larger variations of phrasing and reduced repetition. Takes a value in the range of -2.0 to 2.0. |
| parallelToolCalls | Boolean | If true, multiple tool calls can run in parallel. Particularly applicable to custom nodes or LLM interactions outside of agent strategies. |
| presencePenalty | Double | Prevents the model from reusing tokens that have already been included in the output. Higher values encourage the introduction of new tokens and topics. Takes a value in the range of -2.0 to 2.0. |
| promptMode | String | Lets you toggle between the reasoning mode and no system prompt. When set to reasoning, the default system prompt for reasoning models is used. For more information, see Mistral's Reasoning documentation. |
| randomSeed | Integer | The seed to use for random sampling. If set, different calls with the same parameters and the same seed value generate deterministic results. |
| safePrompt | Boolean | Specifies whether to inject a safety prompt before all conversations. The safety prompt is used to enforce guardrails and protect against harmful content. For more information, see Mistral's Moderation & Guardrailing documentation. |
| stop | List<String> | Strings that signal the model to stop generating content when it encounters any of them. For example, to make the model stop when it produces two consecutive newlines, specify stop = listOf("\n\n"). |
| topP | Double | Also referred to as nucleus sampling. Creates a subset of next tokens by adding tokens with the highest probability values to the subset until the sum of their probabilities reaches the specified topP value. Takes a value greater than 0.0 and lower than or equal to 1.0. |
DeepSeekParams:

| Parameter | Type | Description |
|---|---|---|
| frequencyPenalty | Double | Penalizes frequent tokens to reduce repetition. Higher frequencyPenalty values result in larger variations of phrasing and reduced repetition. Takes a value in the range of -2.0 to 2.0. |
| logprobs | Boolean | If true, includes log-probabilities for output tokens. |
| presencePenalty | Double | Prevents the model from reusing tokens that have already been included in the output. Higher values encourage the introduction of new tokens and topics. Takes a value in the range of -2.0 to 2.0. |
| stop | List<String> | Strings that signal the model to stop generating content when it encounters any of them. For example, to make the model stop when it produces two consecutive newlines, specify stop = listOf("\n\n"). |
| topLogprobs | Integer | Number of top most likely tokens per position. Takes a value in the range of 0–20. Requires the logprobs parameter to be set to true. |
| topP | Double | Also referred to as nucleus sampling. Creates a subset of next tokens by adding tokens with the highest probability values to the subset until the sum of their probabilities reaches the specified topP value. Takes a value greater than 0.0 and lower than or equal to 1.0. |
OpenRouterParams:

| Parameter | Type | Description |
|---|---|---|
| frequencyPenalty | Double | Penalizes frequent tokens to reduce repetition. Higher frequencyPenalty values result in larger variations of phrasing and reduced repetition. Takes a value in the range of -2.0 to 2.0. |
| logprobs | Boolean | If true, includes log-probabilities for output tokens. |
| minP | Double | Filters out tokens whose relative probability to the most likely token is below the defined minP value. Takes a value in the range of 0.0–0.1. |
| models | List<String> | List of allowed models for the request. |
| presencePenalty | Double | Prevents the model from reusing tokens that have already been included in the output. Higher values encourage the introduction of new tokens and topics. Takes a value in the range of -2.0 to 2.0. |
| provider | ProviderPreferences | Includes a range of parameters that let you explicitly control how OpenRouter chooses which LLM provider to use. For more information, see the API documentation on ProviderPreferences. |
| repetitionPenalty | Double | Penalizes token repetition. Next-token probabilities for tokens that already appeared in the output are divided by the value of repetitionPenalty, which makes them less likely to appear again if repetitionPenalty > 1. Takes a value greater than 0.0 and lower than or equal to 2.0. |
| route | String | Request routing strategy to use. |
| stop | List<String> | Strings that signal the model to stop generating content when it encounters any of them. For example, to make the model stop when it produces two consecutive newlines, specify stop = listOf("\n\n"). |
| topA | Double | Dynamically adjusts the sampling window based on model confidence. If the model is confident (there are dominant high-probability next tokens), it keeps the sampling window limited to a few top tokens. If the confidence is low (there are many tokens with similar probabilities), it keeps more tokens in the sampling window. Takes a value in the range of 0.0–0.1 (inclusive). A higher value means greater dynamic adaptation. |
| topK | Integer | Number of top tokens to consider when generating the output. Takes a value greater than or equal to 0 (provider-specific minimums may apply). |
| topLogprobs | Integer | Number of top most likely tokens per position. Takes a value in the range of 0–20. Requires the logprobs parameter to be set to true. |
| topP | Double | Also referred to as nucleus sampling. Creates a subset of next tokens by adding tokens with the highest probability values to the subset until the sum of their probabilities reaches the specified topP value. Takes a value greater than 0.0 and lower than or equal to 1.0. |
| transforms | List<String> | List of context transforms. Defines how context is transformed when it exceeds the model's token limit. The default transformation is middle-out, which truncates from the middle of the prompt. Use an empty list for no transformations. For more information, see Message Transforms in the OpenRouter documentation. |
DashscopeParams:

| Parameter | Type | Description |
|---|---|---|
| enableSearch | Boolean | Specifies whether to enable web search functionality. For more information, see Alibaba's Web search documentation. |
| enableThinking | Boolean | Specifies whether to enable thinking mode when using a hybrid thinking model. For more information, see Alibaba's documentation on Deep thinking. |
| frequencyPenalty | Double | Penalizes frequent tokens to reduce repetition. Higher frequencyPenalty values result in larger variations of phrasing and reduced repetition. Takes a value in the range of -2.0 to 2.0. |
| logprobs | Boolean | If true, includes log-probabilities for output tokens. |
| parallelToolCalls | Boolean | If true, multiple tool calls can run in parallel. Particularly applicable to custom nodes or LLM interactions outside of agent strategies. |
| presencePenalty | Double | Prevents the model from reusing tokens that have already been included in the output. Higher values encourage the introduction of new tokens and topics. Takes a value in the range of -2.0 to 2.0. |
| stop | List<String> | Strings that signal the model to stop generating content when it encounters any of them. For example, to make the model stop when it produces two consecutive newlines, specify stop = listOf("\n\n"). |
| topLogprobs | Integer | Number of top most likely tokens per position. Takes a value in the range of 0–20. Requires the logprobs parameter to be set to true. |
| topP | Double | Also referred to as nucleus sampling. Creates a subset of next tokens by adding tokens with the highest probability values to the subset until the sum of their probabilities reaches the specified topP value. Takes a value greater than 0.0 and lower than or equal to 1.0. |
The following example shows how to define OpenRouter LLM parameters using the provider-specific OpenRouterParams class:
```kotlin
val openRouterParams = OpenRouterParams(
    temperature = 0.7,
    maxTokens = 500,
    frequencyPenalty = 0.5,
    presencePenalty = 0.5,
    topP = 0.9,
    topK = 40,
    repetitionPenalty = 1.1,
    models = listOf("anthropic/claude-3-opus", "anthropic/claude-3-sonnet"),
    transforms = listOf("middle-out")
)
```
Usage examples
Basic usage
```kotlin
// A basic set of parameters with limited length
val basicParams = LLMParams(
    temperature = 0.7,
    maxTokens = 150,
    toolChoice = LLMParams.ToolChoice.Auto
)
```
Reasoning control
Reasoning control is implemented through provider-specific parameters.
When using the OpenAI Chat Completions API with models that support reasoning, use the reasoningEffort parameter to control how many reasoning tokens the model generates before producing a response. The sketch below sets a higher reasoning effort; the exact constant names are listed in the ReasoningEffort API reference:
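```kotlin
// A minimal sketch: ask an OpenAI reasoning model to spend more effort on reasoning.
// ReasoningEffort.HIGH is an assumed constant name; see the ReasoningEffort API reference
// for the values available in your Koog version.
val reasoningParams = OpenAIChatParams(
    reasoningEffort = ReasoningEffort.HIGH
)
```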
In addition, when using the OpenAI Responses API in stateless mode, you keep an encrypted history of reasoning items and send it to the model on every conversation turn. The encryption is done on the OpenAI side; to request encrypted reasoning tokens, set the include parameter in your requests to reasoning.encrypted_content.
You can then pass the encrypted reasoning tokens back to the model in the next conversation turns.
```kotlin
val openAIStatelessReasoningParams = OpenAIResponsesParams(
    include = listOf(OpenAIInclude.REASONING_ENCRYPTED_CONTENT)
)
```
Custom parameters
To add custom parameters that may be provider-specific and not supported in Koog out of the box, use the additionalProperties property as shown in the example below.
```kotlin
// Add custom parameters for specific model providers
val customParams = LLMParams(
    additionalProperties = additionalPropertiesOf(
        "top_p" to 0.95,
        "frequency_penalty" to 0.5,
        "presence_penalty" to 0.5
    )
)
```
Setting and overriding parameters
The code sample below shows how to define a set of LLM parameters to use as the default, and then create another set that partially overrides values from the original set and adds new ones. This lets you define parameters that are common to most requests and still add more specific parameter combinations without repeating the common values.
```kotlin
// Define default parameters
val defaultParams = LLMParams(
    temperature = 0.7,
    maxTokens = 150,
    toolChoice = LLMParams.ToolChoice.Auto
)

// Create parameters with some overrides, using defaults for the rest
val overrideParams = LLMParams(
    temperature = 0.2,
    numberOfChoices = 3
).default(defaultParams)
```
The values in the resulting overrideParams set are equivalent to the following:
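```kotlin
// temperature and numberOfChoices come from overrideParams;
// maxTokens and toolChoice are filled in from defaultParams by the default(...) call.
val equivalentParams = LLMParams(
    temperature = 0.2,
    maxTokens = 150,
    numberOfChoices = 3,
    toolChoice = LLMParams.ToolChoice.Auto
)
```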