Model Groups
How to use semantic model labels with automatic fallback across providers.
The Problem
Every production AI app needs model fallback. API keys expire, providers go down, rate limits hit. Hardcoding a single model ID means a single point of failure.
Model groups solve this. Instead of `model: "gpt-5.4"`, you write `model: "preset/fast"` or pass an array of models. The framework resolves to the best available model at execution time, retries on failure, and falls back to the next provider automatically.
Quick Start
```typescript
import { createModelResolver } from "@flow-state-dev/core/models";
import { generator } from "@flow-state-dev/core";

const resolver = createModelResolver({
  presets: {
    fast: { models: ["anthropic/claude-sonnet-4-6", "openai/gpt-5.4-mini", "google/gemini-3-flash"] },
  },
});

const chat = generator({
  name: "chat",
  model: "preset/fast",
  prompt: "You are a helpful assistant.",
});
```
"preset/fast" resolves to the first available model in the preset's list. No changes to your generator code — it's a drop-in replacement for any model reference.
Generators also support array fallback directly:
```typescript
const chat = generator({
  name: "chat",
  model: ["openai/gpt-5.4", "anthropic/claude-sonnet-4-6"],
  prompt: "You are a helpful assistant.",
});
```
Default Presets
Three built-in presets ship with the framework:
| Preset | Models (preference order) | Defaults |
|---|---|---|
| `fast` | anthropic/claude-sonnet-4-6, openai/gpt-5.4-mini, google/gemini-3-flash | maxTokens: 1024 |
| `thinking` | anthropic/claude-opus-4-6, openai/gpt-5.4, google/gemini-3.1-pro-preview | Anthropic extended thinking enabled |
| `balanced` | anthropic/claude-sonnet-4-6, openai/gpt-5.4, google/gemini-3-flash | None |
The first available model in each list is used. "Available" means the app has an API key for that provider (direct key or gateway).
Provider Detection
The model resolver auto-detects which providers are available by checking environment variables:
| Provider | Environment Variable |
|---|---|
| Anthropic | ANTHROPIC_API_KEY |
| OpenAI | OPENAI_API_KEY |
| Google | GOOGLE_GENERATIVE_AI_API_KEY |
If only ANTHROPIC_API_KEY is set and you use "preset/fast", it resolves to anthropic/claude-sonnet-4-6. If that key later fails, it skips to openai/gpt-5.4-mini — which won't be available either, so it moves to google/gemini-3-flash. If nothing works, you get a clear error listing what was tried.
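The detection walk can be sketched in a few lines. This is an illustration only, assuming the environment-variable names from the table above; `detectProviders` is not the framework's API:

```typescript
// Sketch: env-based provider detection. The env variable names match the
// table above; the function name `detectProviders` is illustrative.
type Provider = "anthropic" | "openai" | "google";

const ENV_KEYS: Record<Provider, string> = {
  anthropic: "ANTHROPIC_API_KEY",
  openai: "OPENAI_API_KEY",
  google: "GOOGLE_GENERATIVE_AI_API_KEY",
};

function detectProviders(env: Record<string, string | undefined>): Provider[] {
  // A provider counts as available when its key variable is set and non-empty.
  return (Object.keys(ENV_KEYS) as Provider[]).filter((p) =>
    Boolean(env[ENV_KEYS[p]]),
  );
}
```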
Explicit Keys
Override auto-detection with explicit keys:
```typescript
const resolver = createModelResolver({
  keys: {
    anthropic: process.env.MY_ANTHROPIC_KEY,
    openai: process.env.MY_OPENAI_KEY,
  },
});
```
Gateways
Gateways are availability multipliers. A single gateway key makes all providers available without needing individual API keys.
Vercel AI Gateway
Zero-config on Vercel deployments. If AI_GATEWAY_API_KEY is set (or auto-provided via Vercel OIDC), all providers are available. Use the vercel/ prefix in model strings to route through the gateway:
"vercel/openai/gpt-5.4" — OpenAI via Vercel gateway
"vercel/anthropic/claude-sonnet-4-6" — Anthropic via gateway
The gateway is auto-detected from AI_GATEWAY_API_KEY even without explicit config. Just deploy to Vercel and it works.
OpenRouter
Uses the `OPENROUTER_API_KEY` environment variable.
Priority
Direct API keys take priority over gateways. If you have ANTHROPIC_API_KEY set and a Vercel gateway configured, Anthropic models use the direct key (lower latency, no intermediary). Other providers route through the gateway.
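The priority rule can be sketched as follows; the `resolveSource` function and its return shape are assumptions for illustration, not the library's API:

```typescript
// Sketch: a direct key always wins over a gateway; a gateway covers
// providers that lack a direct key; otherwise the provider is unavailable.
type Source = { kind: "key" } | { kind: "gateway"; gateway: string } | null;

function resolveSource(
  provider: string,
  directKeys: Record<string, string | undefined>,
  gateway?: string,
): Source {
  if (directKeys[provider]) return { kind: "key" }; // lower latency, no intermediary
  if (gateway) return { kind: "gateway", gateway }; // route through the gateway
  return null; // no key, no gateway: unavailable
}
```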
Custom Presets
Override defaults or add new presets:
```typescript
import { createModelResolver } from "@flow-state-dev/core/models";

const resolver = createModelResolver({
  presets: {
    // Override built-in
    fast: {
      models: ["openai/gpt-5.4-nano", "google/gemini-3.1-flash-lite-preview"],
      defaults: { maxTokens: 512 },
    },
    // Add new
    coding: {
      models: ["anthropic/claude-opus-4-6", "openai/gpt-5.4"],
      defaults: { maxTokens: 8192 },
    },
  },
});

const coder = generator({
  name: "coder",
  model: "preset/coding",
});
```
Preset Defaults
Preset defaults set baseline generation config. Caller config always wins:
```typescript
const resolver = createModelResolver({
  presets: {
    thinking: {
      models: ["anthropic/claude-opus-4-6", "openai/gpt-5.4"],
      defaults: {
        maxTokens: 4096,
        providerOptions: {
          anthropic: { thinking: { budgetTokens: 10000 } },
        },
      },
    },
  },
});
```
Provider-specific options are filtered at runtime. If thinking resolves to an OpenAI model, the anthropic provider options are stripped — they won't leak to the wrong provider.
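The filtering step might look like this sketch, assuming a config shape with a `providerOptions` map keyed by provider name; `filterProviderOptions` is an illustrative name, not the framework's implementation:

```typescript
// Sketch: strip providerOptions entries that don't match the provider the
// preset actually resolved to, so options never leak to the wrong provider.
interface GenConfig {
  maxTokens?: number;
  providerOptions?: Record<string, unknown>;
}

function filterProviderOptions(config: GenConfig, resolvedProvider: string): GenConfig {
  if (!config.providerOptions) return config;
  const kept = Object.fromEntries(
    Object.entries(config.providerOptions).filter(([p]) => p === resolvedProvider),
  );
  return { ...config, providerOptions: kept };
}
```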
Provider Preference
Presets encode a capability tier — "how capable, how fast." They do not encode a brand choice. If you want to say "I prefer Anthropic across the board," that is an orthogonal axis called provider preference.
Two axes, combined at resolution time:
| Axis | Answers | Set by |
|---|---|---|
| Preset / tier | "How capable?" | Preset author; model: "preset/x" |
| Provider preference | "Which brand, when multiple are available?" | User, flow, block, or skill |
The resolver walks the preset's list, but reorders it first: models from preferred providers come first (in the order you give), the rest come after in their original order. Availability filtering and retry/fallback run on the reordered list.
With createFSDProvider
Per-call preference:
```typescript
import { createFSDProvider, defaultGroups } from "@flow-state-dev/core/models";

const provider = createFSDProvider({ groups: defaultGroups });

// Default: preset's natural order
provider("balanced");

// Prefer Anthropic; fall back to the rest of the preset if no Anthropic model is available
provider("balanced", { prefer: "anthropic" });

// Ordered preference
provider("balanced", { prefer: ["anthropic", "google"] });
```
Provider-level default (applies to every call unless overridden):
```typescript
const provider = createFSDProvider({
  groups: defaultGroups,
  providerPreference: "anthropic",
});

provider("balanced"); // uses anthropic models first
provider("balanced", { prefer: "openai" }); // call-site wins
provider("balanced", { prefer: [] }); // explicit "no preference"
```
With createModelResolver
Set the default on the resolver; every "preset/x" string reorders accordingly:
```typescript
const resolver = createModelResolver({
  providerPreference: "anthropic",
});
```
Reordering example
Preset `preset/large` = `[openai/gpt-5.4, anthropic/opus, google/gemini-3, anthropic/sonnet]`.

| `prefer` | Order used |
|---|---|
| `undefined` | openai/gpt-5.4, anthropic/opus, google/gemini-3, anthropic/sonnet |
| `"anthropic"` | anthropic/opus, anthropic/sonnet, openai/gpt-5.4, google/gemini-3 |
| `["anthropic","google"]` | anthropic/opus, anthropic/sonnet, google/gemini-3, openai/gpt-5.4 |
Relative order within a provider bucket is preserved — opus stays before sonnet.
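The reordering rule can be sketched as a stable partition; `reorderByPreference` is an illustrative name, not the public API:

```typescript
// Sketch: preferred providers' models first (in the preference order given),
// then the remaining models in their original order. Order within each
// provider bucket is preserved.
function reorderByPreference(models: string[], prefer: string[]): string[] {
  const providerOf = (m: string) => m.split("/")[0];
  const preferred = prefer.flatMap((p) =>
    models.filter((m) => providerOf(m) === p),
  );
  const rest = models.filter((m) => !prefer.includes(providerOf(m)));
  return [...preferred, ...rest];
}
```

Availability filtering and retry/fallback would then run on the reordered list.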
Strict mode
By default, preference is a soft preference — if no preferred model is available, the rest of the preset is still tried. Opt in to strict mode for compliance-style use cases ("only ever Anthropic"):
provider("balanced", { prefer: "anthropic", strict: true });
Strict mode throws when no model from the preferred providers is available (either because the preset has none, or because the key/gateway for those providers is missing). The error message names the preset and the preferred providers.
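A sketch of the strict check, assuming availability has already been computed; the function name and error text are illustrative, not the framework's:

```typescript
// Sketch: strict mode picks only from preferred providers, and throws
// (naming the preset and preferred providers) when none is usable.
function pickStrict(
  candidates: string[],
  available: Set<string>,
  prefer: string[],
  preset: string,
): string {
  const hit = candidates.find(
    (m) => prefer.includes(m.split("/")[0]) && available.has(m),
  );
  if (!hit) {
    throw new Error(
      `No available model from preferred providers [${prefer.join(", ")}] in preset "${preset}"`,
    );
  }
  return hit;
}
```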
Precedence
Highest wins:
1. Call-site `{ prefer }` on `provider(...)`
2. `createFSDProvider({ providerPreference })` (or `createModelResolver({ providerPreference })`)
3. Nothing — preset's natural order (today's behavior; fully backward-compatible)

Call-site `prefer` is an override, not a merge. An empty array (`prefer: []`) explicitly means "no preference" — it does not re-inherit the provider-level default.
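The precedence rule reduces to a small resolution function. `resolvePreference` is an illustrative name; normalizing the result to an array is an assumption:

```typescript
// Sketch: call-site prefer overrides the provider-level default; an explicit
// empty array means "no preference" and does not re-inherit the default.
function resolvePreference(
  callSite: string | string[] | undefined,
  providerLevel: string | string[] | undefined,
): string[] {
  const pick = callSite !== undefined ? callSite : providerLevel;
  if (pick === undefined) return []; // preset's natural order
  return Array.isArray(pick) ? pick : [pick];
}
```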
Dynamic preference (per-input or per-user)
Use the existing `model: (input, ctx) => ...` callback form, not a new mechanism:

```typescript
const chat = generator({
  name: "chat",
  model: (input, ctx) =>
    provider("balanced", { prefer: ctx.user.state.preferredProvider }),
});
```
The framework has one dynamism mechanism. Call sites that need per-request preference read user state (or input, or any other source) and pass the result explicitly.
Introspection
`provider.explain(groupName, options?)` returns the ordered candidate list with availability status, plus the model the resolver would choose. Useful for debugging and for building UI selectors.

```typescript
provider.explain("balanced", { prefer: "anthropic" });
// {
//   preset: "balanced",
//   prefer: ["anthropic"],
//   candidates: [
//     { modelId: "anthropic/claude-sonnet-4-6", providerName: "anthropic",
//       available: true, source: "key" },
//     { modelId: "openai/gpt-5.4", providerName: "openai",
//       available: true, source: "gateway", gateway: "vercel" },
//     { modelId: "google/gemini-3-flash", providerName: "google",
//       available: false, reason: "no-key-no-gateway" },
//   ],
//   willUse: "anthropic/claude-sonnet-4-6",
// }
```
Retry and Fallback
The fallback behavior is configurable:
```typescript
const resolver = createModelResolver({
  retryPolicy: {
    maxAttemptsPerModel: 3, // default: 2
    baseDelayMs: 500, // default: 1000
    maxDelayMs: 15000, // default: 10000
  },
});
```
When a model call fails:
- If the error is retryable (429, 500, 502, 503, network errors), retry the same model with exponential backoff
- After `maxAttemptsPerModel` attempts, move to the next model in the list
- Non-retryable errors (auth failures, bad requests) skip directly to the next model
- If all models are exhausted, throw with a summary of every error
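The loop above can be sketched as follows. The defaults mirror the `retryPolicy` config shown earlier; the implementation itself is illustrative, not the framework's:

```typescript
// Sketch: retry a model with exponential backoff on retryable errors,
// skip to the next model otherwise, throw when everything is exhausted.
function backoffDelay(attempt: number, baseMs = 1000, maxMs = 10000): number {
  return Math.min(baseMs * 2 ** attempt, maxMs); // attempt 0 -> base, 1 -> 2x, ...
}

async function callWithFallback<T>(
  models: string[],
  call: (model: string) => Promise<T>,
  isRetryable: (err: unknown) => boolean,
  maxAttemptsPerModel = 2,
): Promise<T> {
  const errors: unknown[] = [];
  for (const model of models) {
    for (let attempt = 0; attempt < maxAttemptsPerModel; attempt++) {
      try {
        return await call(model);
      } catch (err) {
        errors.push(err);
        if (!isRetryable(err)) break; // auth/bad-request: straight to next model
        await new Promise((r) => setTimeout(r, backoffDelay(attempt)));
      }
    }
  }
  throw new Error(`All models failed (${errors.length} errors)`);
}
```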
Streaming
Streaming uses a simpler fallback: if a stream fails before yielding its first chunk, the next model is tried. Mid-stream failures propagate to the caller — there's no way to transparently resume a stream from a different model.
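The first-chunk rule can be sketched with an async generator; `streamWithFallback` is an illustrative name, not the framework's API:

```typescript
// Sketch: try each model's stream in order; fall back only if a stream
// fails before its first chunk. Mid-stream errors propagate to the caller.
async function* streamWithFallback(
  models: string[],
  open: (model: string) => AsyncIterable<string>,
): AsyncGenerator<string> {
  let lastError: unknown;
  for (const model of models) {
    let yieldedFirst = false;
    try {
      for await (const chunk of open(model)) {
        yieldedFirst = true;
        yield chunk;
      }
      return; // stream completed
    } catch (err) {
      if (yieldedFirst) throw err; // mid-stream failure: no transparent resume
      lastError = err; // pre-first-chunk failure: try the next model
    }
  }
  throw lastError ?? new Error("no models to stream from");
}
```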
Model String Format
Model strings use slash format:
| Format | Example | Description |
|---|---|---|
| `provider/model` | `"openai/gpt-5.4"` | Direct provider |
| `gateway/provider/model` | `"vercel/openai/gpt-5.4"` | Via gateway |
| `preset/name` | `"preset/fast"` | Built-in preset |
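Parsing these formats is a small exercise; the returned shape and the gateway whitelist below are assumptions for illustration, not the library's internals:

```typescript
// Sketch: classify a model string into the three slash formats above.
type ModelRef =
  | { kind: "preset"; name: string }
  | { kind: "direct"; provider: string; model: string }
  | { kind: "gateway"; gateway: string; provider: string; model: string };

// Assumed whitelist of known gateway prefixes.
const GATEWAYS = new Set(["vercel", "openrouter"]);

function parseModelString(s: string): ModelRef {
  const parts = s.split("/");
  if (parts[0] === "preset") {
    return { kind: "preset", name: parts.slice(1).join("/") };
  }
  if (GATEWAYS.has(parts[0]) && parts.length >= 3) {
    return {
      kind: "gateway",
      gateway: parts[0],
      provider: parts[1],
      model: parts.slice(2).join("/"),
    };
  }
  return { kind: "direct", provider: parts[0], model: parts.slice(1).join("/") };
}
```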
Resolver Introspection
Check what's available at runtime:
```typescript
resolver.presets(); // ["fast", "thinking", "balanced"]
resolver.available("fast"); // ["anthropic/claude-sonnet-4-6", "openai/gpt-5.4-mini"]
```

`available()` returns only the models in a preset that have a working provider configured.
Dynamic Model Selection
Use a function for `model` to pick presets based on input:

```typescript
const adaptive = generator({
  name: "adaptive",
  model: (input, ctx) => {
    return input.needsReasoning
      ? "preset/thinking"
      : "preset/fast";
  },
});
```
Relationship to Model Resolver
`createModelResolver` handles both model resolution and presets in a unified API:

- Model strings like `"openai/gpt-5.4"` are resolved to concrete AI SDK model instances
- Presets like `"preset/fast"` resolve through the preset's model list with built-in fallback
- Array fallback like `["openai/gpt-5.4", "anthropic/claude-sonnet-4-6"]` tries models in order
Zero-config usage auto-detects providers from environment variables:
```typescript
const resolver = createModelResolver();
```