
Memory

The memory domain (@thought-fabric/core/memory) provides working memory: a bounded, salience-scored store that tracks what stays in cognitive focus during a conversation. It lives in session scope. Entries decay over time based on a configurable strategy. As new information arrives, low-salience entries get evicted. Pinned entries survive eviction, up to a limit.

Quick Start

The fastest way to add working memory is workingMemoryCapture. It's a sequencer that extracts memories from text using an LLM, persists them, then advances the decay clock:

import { workingMemoryCapture } from '@thought-fabric/core/memory'
import { sequencer } from '@flow-state-dev/core'

const memoryCapture = workingMemoryCapture({ model: 'gpt-5-mini' })

const pipeline = sequencer({ name: 'chat', inputSchema: chatInput })
  .work((input) => input.message, memoryCapture)
  .then(chatGenerator)

Capture runs on the user's message — that's where new facts, preferences, and goals live. Use a connector function with .work() to extract the message string from your pipeline's input. The capture block runs in the background while the rest of the pipeline continues, so it doesn't add latency.

The capture block declares its own session resource. The framework installs it automatically when the flow runs. No manual resource setup needed.

Working Memory Model

  • Capacity: Default 7 entries (Miller's number). Configurable.
  • Pinned slots: Default 2. Pinned entries survive eviction; unpinned low-salience entries are evicted first.
  • Decay: Salience = importance × decay(elapsed). Default strategy is power-law (ACT-R style): (1 + elapsed)^(-rate).
  • Eviction: When at capacity, the lowest-salience unpinned entry is removed before adding a new one.
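The model above can be sketched as a small standalone function. This is an illustration of the rules, not the library's internals; the entry shape and helper names here are assumptions.

```typescript
// Illustrative sketch of the working memory model -- entry shape and
// function names are assumptions, not the library's actual internals.
interface Entry {
  id: string
  importance: number // 0..1, assigned at capture time
  pinned: boolean
  addedTurn: number  // turn when the entry was stored
}

// Power-law decay (ACT-R style), the default strategy: (1 + elapsed)^(-rate)
const decay = (elapsed: number, rate = 0.5) => Math.pow(1 + elapsed, -rate)

// Salience = importance x decay(elapsed)
const salience = (e: Entry, currentTurn: number) =>
  e.importance * decay(currentTurn - e.addedTurn)

// At capacity: evict the lowest-salience unpinned entry, then add the new one.
function addWithEviction(
  entries: Entry[],
  next: Entry,
  currentTurn: number,
  capacity = 7
): Entry[] {
  if (entries.length < capacity) return [...entries, next]
  const unpinned = entries.filter((e) => !e.pinned)
  const victim = unpinned.reduce((lo, e) =>
    salience(e, currentTurn) < salience(lo, currentTurn) ? e : lo
  )
  return [...entries.filter((e) => e.id !== victim.id), next]
}
```

Note that an old high-importance entry can still lose to a fresh low-importance one once enough turns have elapsed, while pinned entries are never considered for eviction.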

Blocks

workingMemoryCapture

Bundled sequencer: observe → remember → tick. One block for the common case. Input: a string — typically the user's message, since that's where new facts live. Use a connector function with .work() to extract the message from your pipeline input.

import { workingMemoryCapture } from '@thought-fabric/core/memory'

workingMemoryCapture({
  model: 'gpt-5-mini',
  capacity: 7,
  maxPinnedSlots: 2,
  maxExtractPerTurn: 3,
  decay: { strategy: 'power-law', rate: 0.5 },
})

workingMemoryObserve

Generator that uses an LLM to extract structured observations from input text. Output: { observations: [{ content, importance, pinned?, replaces? }] }. Does not persist anything.

import { workingMemoryObserve } from '@thought-fabric/core/memory'

workingMemoryObserve({
  model: 'gpt-5-mini',
  maxExtractPerTurn: 5,
})

workingMemoryRemember

Handler that persists observations into the resource. Honors replaces, evicting the replaced entry before adding the new one. Errors on individual observations are caught and skipped, so one bad observation doesn't fail the whole batch; partial success is preferred.

workingMemoryTick

Handler that advances the turn counter and recomputes salience for all entries. Use with .tap() since it's a side-effect with no meaningful output.

workingMemorySnapshot

Handler that returns current state: { entries: WorkingMemoryEntry[], currentTurn: number }. Entries are sorted by salience.

workingMemoryAdd

Handler that adds an entry directly without LLM extraction. Input: { content, importance, pinned?, id?, metadata? }.

Composable Pipeline

For more control, wire the blocks yourself:

import {
  workingMemoryObserve,
  workingMemoryRemember,
  workingMemoryTick,
} from '@thought-fabric/core/memory'

const pipeline = sequencer({ name: 'chat', inputSchema: chatInput })
  .work(
    (input) => input.message,
    sequencer({ name: 'memory', inputSchema: z.string() })
      .then(workingMemoryObserve({ model: 'gpt-5-mini', maxExtractPerTurn: 5 }))
      .then(workingMemoryRemember())
      .tap(workingMemoryTick())
  )
  .then(chatGenerator)

Injecting Memory into Prompts

Use workingMemoryContextFormatter in a generator's context array:

import { generator } from '@flow-state-dev/core'
import {
  workingMemoryResources,
  workingMemoryContextFormatter,
} from '@thought-fabric/core/memory'

const chat = generator({
  name: 'chat',
  model: 'gpt-5',
  inputSchema: z.string(),
  sessionResources: workingMemoryResources,
  context: [workingMemoryContextFormatter],
  user: (input) => input,
})

This formats entries as a bullet list ordered by salience. Salience scores are omitted from the formatted output; they're for eviction, not confidence. Ordering already communicates priority.
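As an illustration of that output shape (not the formatter's actual implementation), the behavior is roughly:

```typescript
// Rough sketch of the formatter's behavior: bullet list in salience order,
// scores omitted. The real workingMemoryContextFormatter may differ.
interface Scored {
  content: string
  salience: number
}

function formatEntries(entries: Scored[]): string {
  return [...entries]
    .sort((a, b) => b.salience - a.salience)
    .map((e) => `- ${e.content}`)
    .join('\n')
}
```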

Helpers

For direct resource manipulation outside blocks, use verb-first helpers:

Helper                                    Purpose
addWorkingMemory(ref, entry, config?)     Add entry with auto-eviction at capacity
evictWorkingMemory(ref, id)               Remove by ID (overrides pin)
pinWorkingMemory(ref, id, config?)        Pin to protect from eviction
unpinWorkingMemory(ref, id)               Remove pin
refreshWorkingMemory(ref, id, config?)    Reset access time (access boost)
advanceWorkingMemory(ref, config?)        Advance turn, recompute salience
workingMemoryItems(ref)                   Entries sorted by salience
formatWorkingMemoryEntries(ref)           Bullet list for LLM context

import {
  addWorkingMemory,
  workingMemoryItems,
  pinWorkingMemory,
} from '@thought-fabric/core/memory'

const ref = ctx.session.resources.get('workingMemory')

await addWorkingMemory(ref, {
  content: 'User wants to build a REST API',
  importance: 0.8,
  pinned: true,
})

const sorted = workingMemoryItems(ref)
await pinWorkingMemory(ref, 'entry-id')

Decay Strategies

Strategy              Formula                  Use case
power-law (default)   (1 + elapsed)^(-rate)    ACT-R style; fast initial drop, long tail
exponential           exp(-rate × elapsed)     Steeper, more aggressive decay
none                  1                        No decay; salience = importance forever. Good for testing.

Resource and Schemas

  • workingMemoryResource — Session-scoped resource definition.
  • workingMemoryResources — Pre-keyed { workingMemory: workingMemoryResource } for sessionResources.
  • workingMemoryEntrySchema, workingMemoryStateSchema — Zod schemas for type validation.

Pure Math Functions

computeDecay(elapsed, strategy, rate) and computeSalience(entry, currentTurn, decay) are exported for custom logic or testing.
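A minimal sketch of what these functions could look like, reconstructed from the strategy formulas in the table above. The actual exported implementations and the entry field names may differ; lastAccessTurn here is an assumption.

```typescript
// Sketch reconstructed from the decay-strategy table -- the library's
// actual computeDecay/computeSalience implementations may differ.
type DecayStrategy = 'power-law' | 'exponential' | 'none'

interface DecayConfig {
  strategy: DecayStrategy
  rate: number
}

function computeDecay(elapsed: number, strategy: DecayStrategy, rate: number): number {
  switch (strategy) {
    case 'power-law':   return Math.pow(1 + elapsed, -rate) // ACT-R style long tail
    case 'exponential': return Math.exp(-rate * elapsed)    // steeper drop
    case 'none':        return 1                            // salience = importance forever
  }
}

// Salience = importance x decay(elapsed since last access).
// 'lastAccessTurn' is an assumed field name for illustration.
function computeSalience(
  entry: { importance: number; lastAccessTurn: number },
  currentTurn: number,
  decay: DecayConfig
): number {
  return entry.importance * computeDecay(currentTurn - entry.lastAccessTurn, decay.strategy, decay.rate)
}
```

For example, with the default rate of 0.5, a power-law entry retains half of its importance after three turns: (1 + 3)^(-0.5) = 0.5.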

Naming Convention

  • workingMemory[Verb] — Block or item (e.g. workingMemoryCapture, workingMemoryObserve).
  • [verb]WorkingMemory — Helper (e.g. addWorkingMemory, evictWorkingMemory).

Further Reading

The full working memory API is documented in the API Reference.