Configuration

Two entry points share the same tier configuration: createMemoryCapability builds the capability surface, and system() builds that capability plus the auto-capture and lifecycle pipeline. Whichever you pick, the tier configs below are identical. You won't need every knob on day one — start with the defaults and tighten things as you learn what your agent forgets.

    import { system } from "@flow-state-dev/memory";

    const mem = system({
      model: "openai/gpt-5.4-mini",
      working: { capacity: 7, decay: { strategy: "power-law", rate: 0.5 } },
      episodic: { scope: "user", significanceThreshold: 0.6 },
      semantic: { consolidation: { episodicThreshold: 5 } },
      digest: { maxTokens: 400, topN: { facts: 30, episodes: 10 } },
    });

Tier dependencies are validated at construction by both entry points: semantic requires episodic, digest requires semantic. Working-only is allowed. If you wire something inconsistent, you'll hear about it when you build, not at runtime.
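
To make the dependency rules concrete, here is a minimal standalone sketch of the same two checks. It is not the package's validation code: `TierFlags` and `checkTierDependencies` are hypothetical names invented for illustration, and the real entry points may report problems differently.

```typescript
// Hypothetical shape for illustration only; not a type exported by the package.
interface TierFlags {
  working: boolean;
  episodic: boolean;
  semantic: boolean;
  digest: boolean;
}

// Returns one message per broken dependency; an empty array means the combination is valid.
function checkTierDependencies(tiers: TierFlags): string[] {
  const problems: string[] = [];
  if (tiers.semantic && !tiers.episodic) {
    problems.push("semantic requires episodic");
  }
  if (tiers.digest && !tiers.semantic) {
    problems.push("digest requires semantic");
  }
  return problems;
}

// Working-only is allowed.
const workingOnlyProblems = checkTierDependencies({
  working: true, episodic: false, semantic: false, digest: false,
});

// Digest without semantic underneath it is not.
const digestAloneProblems = checkTierDependencies({
  working: true, episodic: false, semantic: false, digest: true,
});
```

The tiers stack in one direction (episodic, then semantic, then digest), so each check only has to look one level down.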

Choosing an entry point

| Need | Reach for |
| --- | --- |
| Read side: context block, recall tool, typed helpers | createMemoryCapability |
| Read side plus auto-capture, consolidation, prune, hygiene | system() |

Both accept the same tier configs below. system() builds createMemoryCapability internally and exposes it as mem.capability, so the read surface is identical — system() just adds the pipeline that writes new observations back into the tiers.

createMemoryCapability options

createMemoryCapability(options) returns the composed capability with the resource maps you register at the flow level. Install it on a generator with uses: [mem] and spread its resources into the flow:

    import { defineFlow, generator } from "@flow-state-dev/core";
    import { createMemoryCapability } from "@flow-state-dev/memory";

    const mem = createMemoryCapability({
      model: "openai/gpt-5.4-mini",
      working: { capacity: 7 },
      episodic: true,
      semantic: true,
    });

    generator({ uses: [mem] });

    defineFlow({
      kind: "reader",
      resources: { ...mem.sessionResources, ...mem.userResources },
      actions: { /* ... */ },
    });

| Field | Type | Description |
| --- | --- | --- |
| model | string \| string[] | Model id (or fallback chain) for the recall tool's filter call. Required. |
| working | WorkingMemorySystemConfig \| true | Working tier config. Required; true for defaults. |
| episodic | EpisodicMemoryConfig \| true | Episodic tier. Omit to disable. |
| semantic | SemanticMemoryConfig \| true | Semantic tier. Omit to disable. Requires episodic. |
| digest | DigestSystemConfig \| true | Digest tier. Omit to disable. Requires semantic. |
| tool | MemoryToolConfig | Recall-tool strategy and defaults. |
| hygiene | HygieneConfig \| true \| false | Only the confidenceDecay slice applies here; it drives recall ranking. Janitor scheduling belongs to system(). |

The result is a DefinedCapability with sessionResources (always workingMemory + memorySystem), userResources (the configured user-scoped tiers), tiers (the per-tier capabilities), and recallToolBlock attached. For type-safe resource registration use sessionResources / userResources — the resource references travel with the capability, so the same defineResource() reference is used everywhere.

system() options

system() accepts every field above plus the capture-pipeline knobs below, and returns the full MemorySystem — the capability (mem.capability), the capture pipeline (mem.capture, mem.captureFromItems), consolidation, prune, and the janitor.

| Field | Type | Description |
| --- | --- | --- |
| consolidationModel | string \| string[] | Model override for the consolidation generator. Defaults to model. |
| pruneModel | string \| string[] | Model override for the prune generator. Defaults to model. |
| source | (input, ctx) => string | Custom source function. Overrides reading from ctx.session.items. |
| maxAssistantChars | number | Max chars of the assistant response captured per turn. Default 500. |
| name / inputSchema | | Optional naming and input schema for the capture pipeline. |
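
If you are wondering what a custom source might look like, here is a rough standalone sketch. Only the `(input, ctx) => string` signature and the 500-character default come from the table above. The `CapturedTurn` shape and the `captureSource` name are hypothetical, and whether `maxAssistantChars` still applies once you supply your own source is not something this page specifies, so the sketch clips the text itself.

```typescript
// Hypothetical input shape; the real pipeline's input type may look different.
interface CapturedTurn {
  userMessage: string;
  assistantResponse: string;
}

// A source function in the (input, ctx) => string shape. The context is unused
// here, so it is typed as unknown instead of guessing at its fields.
function captureSource(
  turn: CapturedTurn,
  _ctx: unknown,
  maxAssistantChars: number = 500,
): string {
  // Clip the assistant text so one long answer cannot dominate the capture input.
  const clipped = turn.assistantResponse.slice(0, maxAssistantChars);
  return `user: ${turn.userMessage}\nassistant: ${clipped}`;
}

// A 600-character response is cut down to the 500-character budget.
const sampleSource = captureSource(
  { userMessage: "Call me Sam.", assistantResponse: "x".repeat(600) },
  undefined,
);
```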

Tier configuration

These configs apply to both entry points.

working

Session-scoped recent observations with a salience-decay model. capacity controls how many entries stick around before older ones get evicted. The decay strategy controls how salience falls off as new turns arrive, so you can tune how aggressively the agent "forgets" what happened a few turns ago.

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| capacity | number | 7 | Max entries retained before eviction (Miller's number) |
| maxPinnedSlots | number | 2 | How many entries can be pinned against eviction |
| decay.strategy | "power-law" \| "exponential" \| "none" | "power-law" | How salience falls off with elapsed turns |
| decay.rate | number | 0.5 | Tunes the decay curve |
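
To get a feel for how the two decay strategies differ, the sketch below compares them using textbook curve shapes. The formulas are assumptions chosen for illustration (`(1 + t)^-rate` for power-law, `e^(-rate * t)` for exponential), and `decayedSalience` is a hypothetical helper. The package's actual curves may be parameterized differently.

```typescript
type DecayStrategy = "power-law" | "exponential" | "none";

// Assumed curve shapes for illustration; not taken from the package source.
function decayedSalience(
  initial: number,
  elapsedTurns: number,
  strategy: DecayStrategy,
  rate: number,
): number {
  switch (strategy) {
    case "power-law":
      // Drops quickly at first, then flattens into a long tail.
      return initial * Math.pow(1 + elapsedTurns, -rate);
    case "exponential":
      // Shrinks by the same proportion every turn.
      return initial * Math.exp(-rate * elapsedTurns);
    case "none":
      // Salience never changes.
      return initial;
  }
}

// Same rate, three turns later: roughly 0.5 for power-law, roughly 0.22 for exponential.
const atThreePowerLaw = decayedSalience(1, 3, "power-law", 0.5);
const atThreeExponential = decayedSalience(1, 3, "exponential", 0.5);
```

Under these assumed formulas, power-law keeps older entries relevant for longer at the same rate, which is one plausible reason to make it the default.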

episodic

User-scoped past sessions stored as encoded Episode records. Pass true for defaults, or an object when you want to override individual fields. The threshold is the dial worth thinking about: too low and you encode noise, too high and important moments slip past.

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| scope | "user" \| "org" | "user" | Persistence scope for episodes |
| significanceThreshold | number | 0.6 | Minimum importance for an item to be encoded as an episode |
| maxEpisodes | number | 200 | Cap on retained episodes |
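
The threshold and the cap interact in a simple way, which this standalone sketch tries to show. `EpisodeCandidate` and `admitEpisodes` are hypothetical names, and dropping the oldest episodes first once the cap is hit is an assumption; this page does not say which episodes are removed.

```typescript
// Hypothetical record shape for illustration.
interface EpisodeCandidate {
  summary: string;
  significance: number; // 0..1 importance score
}

function admitEpisodes(
  stored: EpisodeCandidate[],
  candidates: EpisodeCandidate[],
  significanceThreshold: number = 0.6,
  maxEpisodes: number = 200,
): EpisodeCandidate[] {
  // Only candidates at or above the threshold are encoded.
  const admitted = candidates.filter((c) => c.significance >= significanceThreshold);
  const combined = [...stored, ...admitted];
  // Assumption: once over the cap, the oldest entries (front of the list) go first.
  return combined.slice(Math.max(0, combined.length - maxEpisodes));
}

// With the default threshold, the low-significance item is never encoded.
const admittedSample = admitEpisodes([], [
  { summary: "User prefers metric units", significance: 0.8 },
  { summary: "User said thanks", significance: 0.2 },
]);
```

Lowering `significanceThreshold` lets more items through the filter, which fills the `maxEpisodes` budget faster.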

semantic

User-scoped consolidated facts. Periodically, the system runs an LLM consolidation pass over recent episodes to extract durable facts the agent should keep.

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| scope | "user" \| "org" | inherited from episodic, else "user" | Persistence scope for facts |
| consolidation.episodicThreshold | number | 5 | Run consolidation after N new episodic entries |
| consolidation.onEviction | boolean | true | Also consolidate when persistent items are evicted from working memory |
| consolidation.minInterval | number | framework default | Don't consolidate more than once per N turns |
| pruneThreshold | number | 20 | Prune when fact count reaches this; 0 disables |

Consolidation runs an LLM call, so budget for the latency. If you don't want that on the hot path of a user turn, drive mem.consolidate from a scheduled action instead of the capture pipeline.
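
One way to read the three consolidation fields together is as a single gate, sketched below. This is a guess at how the triggers combine, not the package's scheduler: `ConsolidationState` and `shouldConsolidate` are hypothetical, and the `minInterval` of 3 is a placeholder because the real default is not stated here.

```typescript
interface ConsolidationTriggers {
  episodicThreshold: number;
  onEviction: boolean;
  minInterval: number;
}

// Hypothetical bookkeeping the caller would have to track.
interface ConsolidationState {
  newEpisodesSinceLast: number;
  persistentItemEvicted: boolean;
  turnsSinceLast: number;
}

function shouldConsolidate(state: ConsolidationState, cfg: ConsolidationTriggers): boolean {
  // Assumption: minInterval throttles every trigger, so it is checked first.
  if (state.turnsSinceLast < cfg.minInterval) return false;
  // Enough new episodes have accumulated.
  if (state.newEpisodesSinceLast >= cfg.episodicThreshold) return true;
  // Otherwise only an eviction can trigger a pass, and only if enabled.
  return cfg.onEviction && state.persistentItemEvicted;
}

// Placeholder minInterval; the framework default is not documented on this page.
const exampleTriggers: ConsolidationTriggers = { episodicThreshold: 5, onEviction: true, minInterval: 3 };

const dueByEpisodes = shouldConsolidate(
  { newEpisodesSinceLast: 5, persistentItemEvicted: false, turnsSinceLast: 4 },
  exampleTriggers,
);
```

If you drive consolidation from a scheduled action, a gate like this is the kind of check you would run before paying for the LLM call.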

digest

User-scoped rolling summary that gets regenerated periodically. The digest is the cheapest thing to surface in the prompt: a static blob the agent reads, not a search target. If you want one always-on memory surface and nothing else, this is the one to keep.

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| maxTokens | number | 400 | Hard cap on the regenerated digest |
| topN.facts | number | 30 | Top-N semantic facts (by reinforcement count) fed to regeneration |
| topN.episodes | number | 10 | Top-N recent-and-significant episodes fed to regeneration |
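
The fact selection for topN.facts can be pictured as a plain sort-and-slice, sketched below. `RankedFact` and `topFacts` are hypothetical names, and the sketch covers facts only: the page does not say how "recent-and-significant" is scored for episodes, so that part is left out.

```typescript
// Hypothetical fact shape for illustration.
interface RankedFact {
  text: string;
  reinforcementCount: number;
}

// Pick the n most-reinforced facts without mutating the input array.
function topFacts(facts: RankedFact[], n: number): RankedFact[] {
  return [...facts]
    .sort((a, b) => b.reinforcementCount - a.reinforcementCount)
    .slice(0, n);
}

const sampleFacts: RankedFact[] = [
  { text: "Lives in Lisbon", reinforcementCount: 3 },
  { text: "Prefers TypeScript", reinforcementCount: 10 },
  { text: "Asked about pricing once", reinforcementCount: 1 },
];

// With n = 2, the single-mention fact is left out of the digest input.
const digestFacts = topFacts(sampleFacts, 2);
```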

hygiene

Time-based maintenance for the semantic and episodic stores, on by default. It decays the confidence of stable facts as the time since their last reinforcement grows, and applies durability-based TTLs to episodes. See Hygiene for the full picture and how to tune it.

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| hygiene | HygieneConfig \| true \| false | true | Pass false to revert to pre-hygiene behavior (no decay, unbounded growth) |

Capability presets

mem.capability exposes presets for each contribution, so you can dial in exactly what gets injected into a given block:

    generator({
      // Default: digest + working context + recall tool
      uses: [mem.capability],
    });

    generator({
      // No tool — context-only
      uses: [mem.capability.presets({ recall: false })],
    });

    generator({
      // No context, no tool — capability still installs resources
      uses: [
        mem.capability.presets({ digest: false, working: false, recall: false }),
      ],
    });

Default-on presets: digest, working, recall. Off by default: episodic and semantic context entries. The recall tool covers them already; turn the context entries on when you also want them auto-injected each turn.
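
If it helps to see the defaults as data, the sketch below models preset resolution as a merge of overrides onto the defaults listed above. `PresetFlags`, `DEFAULT_PRESETS`, and `resolvePresets` are hypothetical names, and the real `presets()` call may resolve its options differently. Passing `episodic: true` or `semantic: true` to enable the off-by-default entries is likewise an assumption, mirroring the `false` flags in the examples above.

```typescript
// Hypothetical flag set mirroring the five contributions named above.
interface PresetFlags {
  digest: boolean;
  working: boolean;
  recall: boolean;
  episodic: boolean;
  semantic: boolean;
}

// Defaults as described: digest, working, and recall on; episodic and semantic off.
const DEFAULT_PRESETS: PresetFlags = {
  digest: true,
  working: true,
  recall: true,
  episodic: false,
  semantic: false,
};

// Anything not mentioned in the overrides keeps its default.
function resolvePresets(overrides: Partial<PresetFlags> = {}): PresetFlags {
  return { ...DEFAULT_PRESETS, ...overrides };
}

// Context-only: the recall tool is dropped, everything else stays at its default.
const contextOnly = resolvePresets({ recall: false });

// Assumed opt-in for auto-injecting episodic and semantic entries each turn.
const everythingInjected = resolvePresets({ episodic: true, semantic: true });
```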

Per-tier capabilities

Sometimes you want a single tier without the full unified system. A pre-prompt step that only cares about working memory, for example. Each tier ships as a standalone capability for that case:

    import { workingMemoryCapability } from "@flow-state-dev/memory";

    generator({
      uses: [workingMemoryCapability],
    });

The same applies to episodicMemoryCapability, semanticMemoryCapability, and digestMemoryCapability. Mix them when you need a non-default combination and don't want to route through system().

Standalone working memory

If all you want is a working-memory buffer without the unified observer, skip the unified capture and use workingMemoryCapture directly. It's a parallel pipeline with its own observer schema, and it runs independently of the system's unified observer.

    import { workingMemoryCapture, workingMemoryResource } from "@flow-state-dev/memory";

    const capture = workingMemoryCapture({ model: "openai/gpt-5.4-mini" });

See the overview for the unified path and recall-tool for agent-invocable retrieval.