Models
The shared model catalog and managed provider boundaries.
Alumia exposes a shared app model picker with 38 visible models across 14 providers. The full internal catalog tracks 43 model entries across 14 providers. Each agent is configured with one supported model, and you can change that model when the workflow calls for a different balance of quality, speed, and cost.
Managed models
With managed models, Alumia handles provider credentials, request routing, retries, streaming, and metering behind the product surface. Public copy should describe this as managed model reliability rather than exposing internal route details.
Current visible catalog examples include GPT-5.4, Claude Sonnet 4.6, MiniMax M3, Qwen3.7 Max, MiMo V2.5 Pro, GPT OSS 120B, and NVIDIA Llama 3.3 Nemotron Super 49B. Internal route metadata also retains hidden, disabled, or route-only entries such as GPT-5.5 with its 1,050,000-token context window, MiniMax M2.7 for saved-agent compatibility, and NVIDIA Nemotron 3 Super 120B A12B for safe migration and fallback. The visible catalog is validated before deploy so provider-specific model regressions are caught before users hit chat, apps, or agent execution.
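As a rough illustration of a pre-deploy catalog check, the sketch below validates a small list of entries. The `ModelEntry` fields, the provider strings, and the specific rules (no duplicates, no entry that is both visible and disabled, every visible entry has a provider) are assumptions made for the example. The checks Alumia actually runs are not described here.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ModelEntry:
    # Hypothetical shape of a catalog entry; field names are illustrative.
    name: str
    provider: str
    visible: bool = True
    disabled: bool = False


def validate_visible_catalog(entries: List[ModelEntry]) -> List[str]:
    """Return human-readable problems; an empty list means the catalog passes."""
    problems = []
    seen = set()
    for entry in entries:
        if entry.name in seen:
            problems.append(f"duplicate entry: {entry.name}")
        seen.add(entry.name)
        # A model shown in the picker must be usable and routable.
        if entry.visible and entry.disabled:
            problems.append(f"visible but disabled: {entry.name}")
        if entry.visible and not entry.provider:
            problems.append(f"visible without provider: {entry.name}")
    return problems


catalog = [
    ModelEntry("GPT-5.4", "openai"),
    # Hidden entry retained for saved-agent compatibility.
    ModelEntry("MiniMax M2.7", "minimax", visible=False),
]
problems = validate_visible_catalog(catalog)
```

A deploy step could refuse to ship when `problems` is non-empty, which matches the stated goal of catching regressions before users reach chat, apps, or agent execution.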
Runtime settings (per agent)
On the agent detail page, optional runtime controls persist in the agent's settings JSON and apply on every chat run for that agent:
- Fast mode: when the selected model has a mapped faster sibling (for example GPT-5.4 → GPT-5.4-mini), chat resolves to the faster model while the picker still shows your chosen model. Models without a mapping ignore fast mode.
- Reasoning effort: for OpenAI reasoning-capable models, choose low, medium, or high. Other providers ignore this setting.
These settings are ordinary agent saves and do not require step-up unless you are also changing privileged tool or peer restrictions.
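The two runtime controls can be sketched as a small resolution step that runs before each chat call. The settings keys (`fast_mode`, `reasoning_effort`), the `resolve_run` function, and the provider strings are hypothetical names chosen for this example. Only the GPT-5.4 to GPT-5.4-mini mapping and the low/medium/high values come from the text above.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed mapping table; only this one pair is named in the documentation.
FAST_SIBLINGS = {"GPT-5.4": "GPT-5.4-mini"}
VALID_EFFORTS = {"low", "medium", "high"}


@dataclass(frozen=True)
class ResolvedRun:
    picker_model: str                # what the picker keeps showing
    runtime_model: str               # what the chat run actually calls
    reasoning_effort: Optional[str]  # None when the setting is ignored


def resolve_run(model: str, provider: str, settings: dict,
                reasoning_capable: bool = True) -> ResolvedRun:
    """Apply per-agent runtime settings to a single chat run."""
    # Fast mode: swap in the faster sibling only when a mapping exists.
    runtime_model = model
    if settings.get("fast_mode", False):
        runtime_model = FAST_SIBLINGS.get(model, model)

    # Reasoning effort: honored only for OpenAI reasoning-capable models.
    effort = settings.get("reasoning_effort")
    if provider != "openai" or not reasoning_capable or effort not in VALID_EFFORTS:
        effort = None

    return ResolvedRun(model, runtime_model, effort)


settings = {"fast_mode": True, "reasoning_effort": "high"}
openai_run = resolve_run("GPT-5.4", "openai", settings)
other_run = resolve_run("Claude Sonnet 4.6", "anthropic", settings)
```

In this sketch the picker value is never overwritten. The faster sibling exists only on the resolved run, which is one way to keep the displayed model stable while chat uses the mapped model.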
Agents switching their own model
Agents have a set_agent_model tool and a list_models companion, so an agent can move itself to a different model mid-session — for a harder reasoning step, a cheaper draft, or a different provider. Agents only do this when you explicitly ask for a different model, provider, speed/quality tradeoff, or reasoning depth; they never switch silently as an optimization. The change persists on the agent like any other model save.
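The explicit-request rule can be pictured as a guard in front of the model change. The sketch below reuses the tool names set_agent_model and list_models, but the signatures, the `user_requested` flag, and the `Agent` class are assumptions made for illustration, not the real tool interface.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class Agent:
    # Hypothetical stand-in for a saved agent record.
    model: str


def list_models(catalog: Set[str]) -> List[str]:
    """Return the models the agent may choose from, in a stable order."""
    return sorted(catalog)


def set_agent_model(agent: Agent, target: str, catalog: Set[str],
                    user_requested: bool) -> bool:
    """Switch the agent's model; return True only if the change was applied."""
    # Never switch silently as an optimization.
    if not user_requested:
        return False
    # Only supported models are valid targets.
    if target not in catalog:
        return False
    agent.model = target  # persists on the agent like any other model save
    return True


catalog = {"GPT-5.4", "Claude Sonnet 4.6"}
agent = Agent(model="GPT-5.4")
silent = set_agent_model(agent, "Claude Sonnet 4.6", catalog, user_requested=False)
model_after_silent = agent.model
explicit = set_agent_model(agent, "Claude Sonnet 4.6", catalog, user_requested=True)
```

Here the first call is rejected because nobody asked for a switch, and the second succeeds because the request was explicit.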
Managed provider boundaries
BYOK model-token calls cost 0 Alumia credits, including on Personal, while managed model calls draw fractional credits from the org's Billing buckets. The user-facing model picker stays provider-agnostic: users choose logical models, while Alumia keeps managed-provider route details internal.
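A minimal sketch of the credit rule, assuming managed calls are metered per 1,000 tokens at some fractional rate. The function name, the per-token basis, and the 0.25 rate are placeholders, since this page does not state actual rates or the metering unit.

```python
from decimal import Decimal


def credits_for_call(is_byok: bool, tokens: int,
                     credits_per_1k_tokens: Decimal) -> Decimal:
    """Return the Alumia credits charged for one model call."""
    # BYOK model-token calls cost 0 credits on every plan, including Personal.
    if is_byok:
        return Decimal("0")
    # Managed calls draw fractional credits; the rate is a hypothetical input.
    return (Decimal(tokens) / Decimal(1000)) * credits_per_1k_tokens


rate = Decimal("0.25")  # placeholder rate, not a published price
byok_cost = credits_for_call(True, 5000, rate)
managed_cost = credits_for_call(False, 2000, rate)
```

Decimal is used instead of float so that fractional credit amounts add up exactly when many small calls are summed against a billing bucket.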