Skip to content

OpenTelemetry support

This page provides details about the support for OpenTelemetry with the Koog agentic framework for tracing and monitoring your AI agents.

Overview

OpenTelemetry is an observability framework that provides tools for generating, collecting, and exporting telemetry data (traces) from your applications. The Koog OpenTelemetry feature allows you to instrument your AI agents to collect telemetry data, which can help you:

  • Monitor agent performance and behavior
  • Debug issues in complex agent workflows
  • Visualize the execution flow of your agents
  • Track LLM calls and tool usage
  • Analyze agent behavior patterns

Key OpenTelemetry concepts

  • Spans: spans represent individual units of work or operations within a distributed trace. They indicate the beginning and end of a specific activity in an application, such as an agent execution, a function call, an LLM call, or a tool call.
  • Attributes: attributes provide metadata about a telemetry-related item such as a span. Attributes are represented as key-value pairs.
  • Events: events are specific points in time during the lifetime of a span (span-related events) that represent something potentially noteworthy that happened.
  • Exporters: exporters are components responsible for sending the collected telemetry data to various backends or destinations.
  • Collectors: collectors receive, process, and export telemetry data. They act as intermediaries between your applications and your observability backend.
  • Samplers: samplers determine whether a trace should be recorded based on the sampling strategy. They are used to manage the volume of telemetry data.
  • Resources: resources represent entities that produce telemetry data. They are identified by resource attributes, which are key-value pairs that provide information about the resource.

The OpenTelemetry feature in Koog automatically creates spans for various agent events, including:

  • Agent execution start and end
  • Node execution
  • LLM calls
  • Tool calls

Installation

To use OpenTelemetry with Koog, add the OpenTelemetry feature to your agent:

val agent = AIAgent(
    executor = simpleOpenAIExecutor(apiKey),
    llmModel = OpenAIModels.Chat.GPT4o,
    systemPrompt = "You are a helpful assistant.",
    installFeatures = {
        install(OpenTelemetry) {
            // Configuration options go here
        }
    }
)

Configuration

Basic configuration

Here is the full list of available properties that you set when configuring the OpenTelemetry feature in an agent:

Name Data type Default value Description
serviceName String ai.koog The name of the service being instrumented.
serviceVersion String Current Koog library version The version of the service being instrumented.
isVerbose Boolean false Whether to enable verbose logging for debugging OpenTelemetry configuration.
sdk OpenTelemetrySdk The OpenTelemetry SDK instance to use for telemetry collection.
tracer Tracer The OpenTelemetry tracer instance used for creating spans.

Note

The sdk and tracer properties are public properties that you can access, but you can only set them using the public methods listed below.

The OpenTelemetryConfig class also includes methods that represent actions related to different configuration items. Here is an example of installing the OpenTelemetry feature with a basic set of configuration items:

install(OpenTelemetry) {
    // Set your service configuration
    setServiceInfo("my-agent-service", "1.0.0")

    // Add the Logging exporter
    addSpanExporter(LoggingSpanExporter.create())
}

For a reference of available methods, see the sections below.

setServiceInfo

Sets the service information including name and version. Takes the following arguments:

Name Data type Required Default value Description
serviceName String Yes The name of the service being instrumented.
serviceVersion String Yes The version of the service being instrumented.

addSpanExporter

Adds a span exporter to send telemetry data to external systems. Takes the following argument:

Name Data type Required Default value Description
exporter SpanExporter Yes The SpanExporter instance to be added to the list of custom span exporters.

addSpanProcessor

Adds a span processor to process spans before they are exported. Takes the following argument:

Name Data type Required Default value Description
processor SpanProcessor Yes The span processor that includes the custom logic to process telemetry data before export.

addResourceAttributes

Adds resource attributes to provide additional context about the service. Takes the following argument:

Name Data type Required Default value Description
attributes Map<AttributeKey<T>, T> Yes The key-value pairs that provide additional details about the service.

setSampler

Sets the sampling strategy to control which spans are collected. Takes the following argument:

Name Data type Required Default value Description
sampler Sampler Yes The sampler instance to set for the OpenTelemetry configuration.

setVerbose

Enables or disables verbose logging for debugging OpenTelemetry configuration. Takes the following argument:

Name Data type Required Default value Description
verbose Boolean Yes false If true, the application collects more detailed telemetry data.

Advanced configuration

For more advanced configuration, you can also customize the following configuration options:

  • Sampler: configure the sampling strategy to adjust the frequency and amount of collected data.
  • Resource attributes: add more information about the process that is producing telemetry data.
install(OpenTelemetry) {
    // Set your service configuration
    setServiceInfo("my-agent-service", "1.0.0")

    // Add the Logging exporter
    addSpanExporter(LoggingSpanExporter.create())

    // Set the sampler 
    setSampler(Sampler.traceIdRatioBased(0.5)) 

    // Add resource attributes
    addResourceAttributes(mapOf(
        AttributeKey.stringKey("custom.attribute") to "custom-value")
    )
}

Sampler

To define a sampler, use a corresponding method of the Sampler class (io.opentelemetry.sdk.trace.samplers.Sampler) from the opentelemetry-java SDK that represents the sampling strategy you want to use.

The default sampling strategy is as follows:

  • Sampler.alwaysOn(): The default sampling strategy where every span (trace) is sampled.

For more information about available samplers and sampling strategies, see the OpenTelemetry Sampler documentation.

Resource attributes

Resource attributes represent additional information about a process producing telemetry data. Koog includes a set of resource attributes that are set by default:

  • service.name
  • service.version
  • service.instance.time
  • os.type
  • os.version
  • os.arch

The default value of the service.name attribute is ai.koog, while the default service.version value is the currently used Koog library version.

In addition to default resource attributes, you can also add custom attributes. To add a custom attribute to an OpenTelemetry configuration in Koog, use the addResourceAttributes() method in an OpenTelemetry configuration that takes a key and a value as its arguments.

addResourceAttributes(mapOf(
    AttributeKey.stringKey("custom.attribute") to "custom-value")
)

Span types and attributes

The OpenTelemetry feature automatically creates different types of spans to track various operations in your agent:

  • CreateAgentSpan: created when you run an agent, closed when the agent is closed or the process is terminated.
  • InvokeAgentSpan: the invocation of an agent.
  • NodeExecuteSpan: the execution of a node in the agent's strategy. This is a custom, Koog-specific span.
  • InferenceSpan: an LLM call.
  • ExecuteToolSpan: a tool call.

Spans are organized in a nested, hierarchical structure. Here is an example of a span structure:

CreateAgentSpan
    InvokeAgentSpan
        NodeExecuteSpan
            InferenceSpan
        NodeExecuteSpan
            ExecuteToolSpan
        NodeExecuteSpan
            InferenceSpan    

Span attributes

Span attributes provide metadata related to a span. Each span has its set of attributes, while some spans can also repeat attributes.

Koog supports a list of predefined attributes that follow OpenTelemetry's Semantic conventions for generative AI events. For example, the conventions define an attribute named gen_ai.conversation.id, which is usually a required attribute for a span. In Koog, the value of this attribute is the unique identifier for an agent run, that is automatically set when you call the agent.run() method.

In addition, Koog also includes custom, Koog-specific attributes. You can recognize most of these attributes by the koog. prefix. Here are the available custom attributes:

  • koog.agent.strategy.name: the name of the agent strategy. A strategy is a Koog-related entity that describes the purpose of the agent. Used in the InvokeAgentSpan span.
  • koog.node.name: the name of the node being run. Used in the NodeExecuteSpan span.

Events

A span can also have an event attached to the span. Events describe a specific point in time when something relevant happened. For example, when an LLM call started or finished. Events also have attributes and additionally include event body fields.

The following event types are supported in line with OpenTelemetry's Semantic conventions for generative AI events:

  • SystemMessageEvent: the system instructions passed to the model.
  • UserMessageEvent: the user message passed to the model.
  • AssistantMessageEvent: the assistant message passed to the model.
  • ToolMessageEvent: the response from a tool or function call passed to the model.
  • ChoiceEvent: the response message from a model.

Note

The optentelemetry-java SDK does not support the event body fields parameter when adding an event. Therefore, in the OpenTelemetry support in Koog, event body fields are a separate attribute whose key is body and value type is string. The string includes the content or payload for the event body field, which is usually a JSON-like object. For examples of event body fields, see the OpenTelemetry documentation. For the state of support for event body fields in opentelemetry-java, see the related GitHub issue.

Exporters

Exporters send collected telemetry data to an OpenTelemetry Collector or other types of destinations or backend implementations. To add an exporter, use the addSpanExporter() method when installing the OpenTelemetry feature. The method takes the following argument:

Name Data type Required Default Description
exporter SpanExporter Yes The SpanExporter instance to be added to the list of custom span exporters.

The sections below provide information about some of the most commonly used exporters from the opentelemetry-java SDK.

Logging exporter

A logging exporter that outputs trace information to the console. LoggingSpanExporter (io.opentelemetry.exporter.logging.LoggingSpanExporter) is a part of the opentelemetry-java SDK.

This type of export is useful for development and debugging purposes.

install(OpenTelemetry) {
    // Add the logging exporter
    addSpanExporter(LoggingSpanExporter.create())
    // Add more exporters as needed
}

OpenTelemetry HTTP exporter

OpenTelemetry HTTP exporter (OtlpHttpSpanExporter) is a part of the opentelemetry-java SDK (io.opentelemetry.exporter.otlp.http.trace.OtlpHttpSpanExporter) and sends span data to a backend through HTTP.

install(OpenTelemetry) {
   // Add OpenTelemetry HTTP exporter 
   addSpanExporter(
      OtlpHttpSpanExporter.builder()
         // Set the maximum time to wait for the collector to process an exported batch of spans 
         .setTimeout(30, TimeUnit.SECONDS)
         // Set the OpenTelemetry endpoint to connect to
         .setEndpoint("http://localhost:3000/api/public/otel/v1/traces")
         // Add the authorization header
         .addHeader("Authorization", "Basic $AUTH_STRING")
         .build()
   )
}

OpenTelemetry gRPC exporter

OpenTelemetry gRPC exporter (OtlpGrpcSpanExporter) is a part of the opentelemetry-java SDK (io.opentelemetry.exporter.otlp.trace.OtlpGrpcSpanExporter). It exports telemetry data to a backend through gRPC and lets you define the host and port of the backend, collector, or endpoint that receives the data. The default port is 4317.

install(OpenTelemetry) {
   // Add OpenTelemetry gRPC exporter 
   addSpanExporter(
      OtlpGrpcSpanExporter.builder()
          // Set the host and the port
         .setEndpoint("http://localhost:4317")
         .build()
   )
}

Integration with Jaeger

Jaeger is a popular distributed tracing system that works with OpenTelemetry. The opentelemetry directory within examples in the Koog repository includes an example of using OpenTelemetry with Jaeger and Koog agents.

Prerequisites

To test OpenTelemetry with Koog and Jaeger, start the Jaeger OpenTelemetry all-in-one process using the provided docker-compose.yaml file, by running the following command:

docker compose up -d

The provided Docker Compose YAML file includes the following content:

# docker-compose.yaml
services:
  jaeger-all-in-one:
    image: jaegertracing/all-in-one:1.39
    container_name: jaeger-all-in-one
    environment:
      - COLLECTOR_OTLP_ENABLED=true
    ports:
      - "4317:4317"
      - "16686:16686"

To access the Jaeger UI and view your traces, open http://localhost:16686.

Example

To export telemetry data for use in Jaeger, the example uses LoggingSpanExporter (io.opentelemetry.exporter.logging.LoggingSpanExporter) and OtlpGrpcSpanExporter (io.opentelemetry.exporter.otlp.trace.OtlpGrpcSpanExporter) from the opentelemetry-java SDK.

Here is the full code sample:

import ai.koog.agents.core.agent.AIAgent
import ai.koog.agents.example.ApiKeyService
import ai.koog.agents.features.opentelemetry.feature.OpenTelemetry
import ai.koog.prompt.executor.clients.openai.OpenAIModels
import ai.koog.prompt.executor.llms.all.simpleOpenAIExecutor
import io.opentelemetry.exporter.logging.LoggingSpanExporter
import io.opentelemetry.exporter.otlp.trace.OtlpGrpcSpanExporter
import kotlinx.coroutines.runBlocking


fun main() = runBlocking {

   val agent = AIAgent(
      executor = simpleOpenAIExecutor(ApiKeyService.openAIApiKey),
      llmModel = OpenAIModels.Reasoning.GPT4oMini,
      systemPrompt = "You are a code assistant. Provide concise code examples."
   ) {
      install(OpenTelemetry) {
         // Add a console logger for local debugging
         addSpanExporter(LoggingSpanExporter.create())

         // Send traces to OpenTelemetry collector
         addSpanExporter(
            OtlpGrpcSpanExporter.builder()
               .setEndpoint("http://localhost:4317")
               .build()
         )
      }
   }

    agent.use { agent ->
        println("Running the agent with OpenTelemetry tracing...")

        val result = agent.run("Tell me a joke about programming")

        println("Agent run completed with result: '$result'." +
                "\nCheck Jaeger UI at http://localhost:16686 to view traces")
    }
}

Troubleshooting

Common issues

  1. No traces appearing in Jaeger or Langfuse

    • Ensure the service is running and the OpenTelemetry port (4317) is accessible.
    • Check that the OpenTelemetry exporter is configured with the correct endpoint.
    • Make sure to wait a few seconds after agent execution for traces to be exported.
  2. Missing spans or incomplete traces

    • Verify that the agent execution completes successfully.
    • Ensure that you're not closing the application too quickly after agent execution.
    • Add a delay after agent execution to allow time for spans to be exported.
  3. Excessive number of spans

    • Consider using a different sampling strategy by configuring the sampler property.
    • For example, use Sampler.traceIdRatioBased(0.1) to sample only 10% of traces.