Long-term memory
Beta
This feature is part of a beta module (1.0.0-beta). The API may change in future releases.
See module versioning for details.
The LongTermMemory feature adds persistent memory to Koog AI agents via two independent groups of settings:
- Retrieval — augments LLM prompts with relevant context from a memory storage (Retrieval-Augmented Generation or RAG)
- Ingestion — persists conversation messages into a memory storage for future retrieval
Quick Start
```kotlin
val myStorage = InMemoryRecordStorage() // or your vector DB adapter

val agent = AIAgent(
    promptExecutor = executor,
    strategy = singleRunStrategy(),
    agentConfig = agentConfig,
    toolRegistry = ToolRegistry.EMPTY
) {
    install(LongTermMemory) {
        retrieval {
            storage = myStorage
            searchStrategy = SimilaritySearchStrategy(topK = 5)
        }
    }
}

agent.run("What did we discuss yesterday?")
```

```java
InMemoryRecordStorage myStorage = new InMemoryRecordStorage();

AIAgent agent = AIAgent.builder()
    .promptExecutor(executor)
    .llmModel(OpenAIModels.Chat.GPT4o)
    .systemPrompt("You are a helpful assistant.")
    .install(LongTermMemory.Feature, config -> {
        config.retrieval(
            new LongTermMemory.RetrievalSettingsBuilder()
                .withStorage(myStorage)
                .withSearchStrategy(
                    SearchStrategy.builder().similarity().withTopK(5).build()
                )
                .build()
        );
    })
    .build();

Object result = agent.run("What did we discuss yesterday?");
```
Retrieval Only (RAG)
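Use retrieval without ingestion when you have a pre-populated knowledge base. Configure only the retrieval block, as in the Quick Start above, and tune it with the components described in the following subsections.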
Prompt Augmenters
| Augmenter | Behavior |
|---|---|
| SystemPromptAugmenter() | Inserts the retrieved context as a system message at the start of the prompt (no-op if there is no system message) |
| UserPromptAugmenter() | Appends the retrieved context as an extra text part at the end of the last user message (no-op if there is no user message) |
| PromptAugmenter { prompt, context -> ... } | Custom augmentation via lambda |
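
To make the two built-in behaviors concrete, here is a minimal standalone sketch. It does not use Koog's classes: ChatRole, ChatMessage, augmentSystem, and augmentUser are hypothetical names invented for this illustration, and placing the context before the existing system message is an assumption about what "at the start of the prompt" means.

```kotlin
// Simplified stand-ins for illustration only; these are NOT Koog's types.
enum class ChatRole { System, User, Assistant }
data class ChatMessage(val role: ChatRole, val text: String)

// Mirrors the described SystemPromptAugmenter behavior: the context becomes a
// system message at the start of the prompt; the prompt is returned unchanged
// if it contains no system message. (Exact placement is an assumption.)
fun augmentSystem(prompt: List<ChatMessage>, context: String): List<ChatMessage> =
    if (prompt.none { it.role == ChatRole.System }) prompt
    else listOf(ChatMessage(ChatRole.System, context)) + prompt

// Mirrors the described UserPromptAugmenter behavior: the context is appended
// to the last user message; the prompt is returned unchanged if it contains
// no user message.
fun augmentUser(prompt: List<ChatMessage>, context: String): List<ChatMessage> {
    val idx = prompt.indexOfLast { it.role == ChatRole.User }
    if (idx < 0) return prompt
    return prompt.mapIndexed { i, m ->
        if (i == idx) m.copy(text = m.text + "\n\n" + context) else m
    }
}

fun main() {
    val prompt = listOf(
        ChatMessage(ChatRole.System, "You are a helpful assistant."),
        ChatMessage(ChatRole.User, "What did we discuss yesterday?")
    )
    val context = "Relevant memory: we discussed the Q3 roadmap."
    println(augmentSystem(prompt, context).first().text)
    println(augmentUser(prompt, context).last().text)
}
```

A custom PromptAugmenter lambda has the same shape as these functions: it receives the prompt and the retrieved context and returns the prompt to send to the LLM.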
Search Query Providers
By default, the retrieval flow uses the last user message as the search query. You can customize this by providing a SearchQueryProvider:
| Provider | Behavior |
|---|---|
| LastUserMessageQueryProvider() | Uses the content of the last user message (default) |
| SearchQueryProvider { prompt -> ... } | Custom query derivation via lambda |
```java
var retrievalSettings = new LongTermMemory.RetrievalSettingsBuilder()
    .withStorage(myStorage)
    .withSearchQueryProvider(prompt -> {
        // Collect all user messages from the prompt
        var userMessages = prompt.getMessages().stream()
            .filter(m -> m.getRole() == Message.Role.User)
            .toList();
        // No user message: return null instead of a query
        if (userMessages.isEmpty()) return null;
        // Otherwise use the most recent user message as the query
        return userMessages.get(userMessages.size() - 1).getContent();
    })
    .build();
```
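
As a further illustration of custom query derivation, the following standalone sketch contrasts the default-style "last user message" rule with a variant that joins the last few user messages. It does not use Koog's types: Speaker, Turn, lastUserQuery, and recentUserQuery are hypothetical names, and the "join the last n messages" rule is only one possible design.

```kotlin
// Simplified stand-ins for illustration only; these are NOT Koog's types.
enum class Speaker { System, User, Assistant }
data class Turn(val speaker: Speaker, val content: String)

// Default-style derivation: content of the last user message, or null if none.
fun lastUserQuery(turns: List<Turn>): String? =
    turns.lastOrNull { it.speaker == Speaker.User }?.content

// Custom derivation: joins the last `n` user messages so that a short
// follow-up still carries the topic of the previous question.
fun recentUserQuery(turns: List<Turn>, n: Int = 2): String? =
    turns.filter { it.speaker == Speaker.User }
        .takeLast(n)
        .joinToString(" ") { it.content }
        .ifBlank { null }

fun main() {
    val turns = listOf(
        Turn(Speaker.User, "Tell me about the Q3 roadmap"),
        Turn(Speaker.Assistant, "Sure, here is a summary."),
        Turn(Speaker.User, "And the deadlines?")
    )
    println(lastUserQuery(turns))
    println(recentUserQuery(turns))
}
```

The second variant trades precision for recall: a terse follow-up such as "And the deadlines?" alone gives a search little to match on, while the joined query still mentions the roadmap.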
Search Strategies
| Strategy | Behavior |
|---|---|
| SimilaritySearchStrategy() | Vector-similarity (semantic) search (default) |
| query -> new SimilaritySearchRequest(query, 20, 0, 0.0, null) | Custom search via lambda |
Ingestion Only
Use ingestion without retrieval to build up a memory storage over time. Configure only the ingestion block and omit the retrieval block.
Ingestion runs once, when the agent run completes: the final accumulated session prompt (the message history) is passed to the configured documentExtractor as a single batch.
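
The "once per run, as a single batch" timing can be sketched with a small standalone model. This is not Koog's implementation: SessionMessage, StoredRecord, and RunSession are hypothetical names invented for this illustration.

```kotlin
// Simplified stand-ins for illustration only; these are NOT Koog's types.
data class SessionMessage(val role: String, val text: String)
data class StoredRecord(val content: String)

// Hypothetical session model: collects messages during a run and hands the
// whole history to the extractor exactly once, when the run completes.
class RunSession(
    private val extractor: (List<SessionMessage>) -> List<StoredRecord>,
    private val store: MutableList<StoredRecord>
) {
    private val history = mutableListOf<SessionMessage>()

    // Appending a message only records it; nothing is ingested here.
    fun append(message: SessionMessage) {
        history += message
    }

    // Called at the end of the run: one extractor call over the full history.
    fun complete() {
        store += extractor(history.toList())
    }
}

fun main() {
    val store = mutableListOf<StoredRecord>()
    val session = RunSession(
        extractor = { msgs -> msgs.map { StoredRecord("${it.role}: ${it.text}") } },
        store = store
    )
    session.append(SessionMessage("user", "Remember that I prefer Kotlin."))
    session.append(SessionMessage("assistant", "Noted."))
    println(store.size) // still empty: the run has not completed
    session.complete()
    println(store.size) // both messages were ingested in one batch
}
```

One consequence of this timing, under the model above, is that an extractor sees the whole conversation at once, so it can deduplicate or summarize across messages rather than handling each message in isolation.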
Disabling Automatic Behavior
By default, retrieval and ingestion run automatically: retrieval runs before each LLM call, and ingestion runs once when the agent completes. You can disable either one (enableAutomaticRetrieval = false, enableAutomaticIngestion = false) while still having access to the configured storage and strategies from within strategy nodes.
This gives you three clean modes:
- Full automatic (default): Install the feature, configure storage — retrieval and ingestion work automatically.
- Manual only: Set enableAutomaticRetrieval = false and enableAutomaticIngestion = false, then use the storage and strategies in your graph strategy nodes.
- Hybrid: Combine automatic ingestion with manual retrieval (or vice versa).
Accessing Long-Term Memory from Strategy Nodes
Use withLongTermMemory { } inside a strategy node to directly search or add records:
```kotlin
val myNode by node<String, Unit> {
    withLongTermMemory {
        // Manually add records
        val record = MemoryRecord(content = "important fact")
        ingestionStorage?.add(listOf(record), namespace = "my-namespace")

        // Manually search
        val request = SimilaritySearchRequest(queryText = input, limit = 5)
        val results = retrievalStorage?.search(request, namespace = "my-namespace")
    }
}
```
Use longTermMemory() to get the feature instance directly:
```kotlin
val myNode by node<String, Unit> {
    val memory = longTermMemory()
    val storage = memory.ingestionStorage
}
```
Custom Document Extractor
Implement DocumentExtractor to control how messages are transformed before storage:
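```kotlin
// summarize(...) is a placeholder for your own summarization function;
// it is not provided by the feature.
val summarizingExtractor = DocumentExtractor { messages ->
    messages
        .filter { it.role == Message.Role.Assistant } // keep assistant messages only
        .map { MemoryRecord(content = summarize(it.content)) }
}

install(LongTermMemory) {
    ingestion {
        storage = myStorage
        documentExtractor = summarizingExtractor
    }
}
```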
Implementing Custom Storage
Implement SearchStorage and/or WriteStorage to connect to your vector database:
```kotlin
class MyVectorDbStorage : SearchStorage<TextDocument, SearchRequest>, WriteStorage<TextDocument> {
    override suspend fun search(
        request: SearchRequest, namespace: String?
    ): List<SearchResult<TextDocument>> {
        // Query your vector DB
        TODO("Query your vector DB and map the hits to SearchResult")
    }

    override suspend fun add(
        records: List<TextDocument>, namespace: String?
    ): List<String> {
        // Upsert into your vector DB and return the IDs of added records
        TODO("Upsert the records and return their IDs")
    }
}
```
For testing, use the built-in InMemoryRecordStorage, which keeps records in memory. It supports both KeywordSearchRequest (implemented as case-insensitive substring matching) and SimilaritySearchRequest (implemented as a Jaccard coefficient over case-insensitive word sets). No vector embeddings are used.
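
The two matching rules named above can be sketched in plain Kotlin. This is a standalone illustration, not the actual InMemoryRecordStorage code: keywordMatches, wordSet, jaccard, and topMatches are hypothetical names, and the tokenization rule (splitting on anything that is not a letter or digit) and the 0.0 score for two empty texts are assumptions of this sketch.

```kotlin
// Standalone illustration; not the actual InMemoryRecordStorage code.

// Keyword-style match: case-insensitive substring test.
fun keywordMatches(content: String, query: String): Boolean =
    content.contains(query, ignoreCase = true)

// Splits text into a set of lowercase words. Splitting on non-letter,
// non-digit characters is an assumption made for this sketch.
fun wordSet(text: String): Set<String> =
    text.lowercase()
        .split(Regex("[^\\p{L}\\p{N}]+"))
        .filter { it.isNotEmpty() }
        .toSet()

// Jaccard coefficient |A ∩ B| / |A ∪ B| over the two word sets,
// defined here as 0.0 when both sets are empty.
fun jaccard(a: String, b: String): Double {
    val sa = wordSet(a)
    val sb = wordSet(b)
    val union = sa union sb
    if (union.isEmpty()) return 0.0
    return (sa intersect sb).size.toDouble() / union.size
}

// Ranks records by Jaccard score against the query and keeps the best `limit`.
fun topMatches(records: List<String>, query: String, limit: Int): List<Pair<String, Double>> =
    records.map { it to jaccard(it, query) }
        .sortedByDescending { it.second }
        .take(limit)

fun main() {
    val records = listOf(
        "The release is planned for Friday",
        "Lunch menu for Friday",
        "Kotlin coroutines overview"
    )
    println(keywordMatches(records[0], "RELEASE"))
    topMatches(records, "when is the release planned", limit = 2)
        .forEach { (text, score) -> println("%.2f  %s".format(score, text)) }
}
```

Because scoring of this kind depends only on shared words, paraphrases with no words in common score 0.0. Tests that rely on semantic matches therefore need a storage backed by real embeddings.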