The control plane
for local intelligence.
Engram remembers. Persona speaks. Cortex knows. Pyre brings them to life.
A multi-agent AI runtime that runs on your hardware. Orchestrates any model from any provider — local or cloud, switched per conversation. Composes Engram for personal memory, Persona for personality, and Cortex for company memory — each one a standalone MCP server you can also use in Claude Desktop, Cursor, Cline, or any MCP-compatible client.
The local cognitive stack.
Free in Core.
Ollama and LM Studio run models. Pyre runs an orchestration system on top — four engineered pieces that turn 16 GB of consumer VRAM into something that feels like a 200K-context cloud model. Co-designed across Engram, Persona, and the Pyre engine, so the components compound instead of fighting each other.
@onenomad/pyre-context-budget
Context Budget Engine
One tested module owns the watermark math: clamp(floor(ctx × 0.65), 2k, 24k). Slot-aware so the runtime decides what to compact instead of dropping the conversation. Renderer and engine read from the same source — no drift between client and server.
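The watermark formula above fits in a few lines of TypeScript. This is an illustrative sketch; `compactionWatermark` is an invented name, not the actual export of @onenomad/pyre-context-budget:

```typescript
// clamp(floor(ctx * 0.65), 2k, 24k) — the compaction watermark scales
// with the model's context window but never drops below 2,000 tokens
// or rises above 24,000.
function compactionWatermark(ctxWindow: number): number {
  const raw = Math.floor(ctxWindow * 0.65);
  return Math.min(Math.max(raw, 2_000), 24_000);
}

console.log(compactionWatermark(32_768));  // 21299 — a 32K model
console.log(compactionWatermark(2_048));   // 2000 — clamped up to the floor
console.log(compactionWatermark(131_072)); // 24000 — clamped at the ceiling
```

Because both the renderer and the engine would call the same function, the client and server agree on when compaction fires by construction.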
@onenomad/pyre-tool-vault
Tool Output Vault
Pyre exposes ~70 tools — that's 70 ways to blow your context. The Vault catches tool returns, stores raw output on disk, and hands the agent a structured summary plus a get_tool_output(id) escape hatch. 40–60% token reduction on agentic sessions.
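The Vault pattern can be sketched as follows. Everything here beyond what the text describes is an assumption: the class name, the truncation-based summarizer, and the method names are invented stand-ins for the real @onenomad/pyre-tool-vault API.

```typescript
type VaultEntry = { id: string; raw: string; summary: string };

class ToolOutputVault {
  private entries = new Map<string, VaultEntry>();
  private counter = 0;

  // Intercept a tool's raw return: persist it, hand back a compact
  // summary plus an id the agent can use to retrieve the full payload.
  store(raw: string): { id: string; summary: string } {
    const id = `tool-${++this.counter}`;
    // A real implementation would summarize with a small model;
    // truncation stands in for that here.
    const summary = raw.length > 200 ? raw.slice(0, 200) + " …" : raw;
    this.entries.set(id, { id, raw, summary });
    return { id, summary };
  }

  // The get_tool_output(id) escape hatch from the text.
  getToolOutput(id: string): string | undefined {
    return this.entries.get(id)?.raw;
  }
}

const vault = new ToolOutputVault();
const { id, summary } = vault.store("x".repeat(10_000)); // a huge tool return
// The agent's context gets the short summary; the 10k-char payload
// stays on disk until explicitly requested.
```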
@onenomad/pyre-compaction-sidecar
Compaction Sidecar
A small model on a parallel slot summarizes scrollback without touching your main inference loop. OpenAI-compatible — works with llama-server, Ollama, vLLM, OpenRouter. Drop-in upgrade for the Vault summarizer; falls back gracefully on outage.
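A sidecar call might look like the sketch below. The endpoint, port, and model name are placeholders; any server speaking the standard `/v1/chat/completions` protocol would slot in.

```typescript
// Summarize scrollback via an OpenAI-compatible sidecar. Returns null
// on any failure so the caller can keep its built-in summarizer.
async function summarizeScrollback(
  scrollback: string,
  baseUrl = "http://localhost:8081/v1", // placeholder sidecar address
): Promise<string | null> {
  try {
    const res = await fetch(`${baseUrl}/chat/completions`, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({
        model: "qwen3-1.7b", // illustrative: a small model on a parallel slot
        messages: [
          { role: "system", content: "Summarize this conversation scrollback tersely." },
          { role: "user", content: scrollback },
        ],
        max_tokens: 512,
      }),
    });
    if (!res.ok) return null;
    const data = await res.json();
    return data.choices?.[0]?.message?.content ?? null;
  } catch {
    // Graceful fallback on outage: signal the caller rather than throw.
    return null;
  }
}
```

The `null` return doubles as the outage signal, so losing the sidecar degrades to the built-in summarizer instead of stalling the main loop.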
@onenomad/persona-mcp
Persona minimal-context mode
Soul + persona at three sizes — minimal (~400 tokens), standard (~1–2K), full (~3–16K). Pick the budget; keep the personality. Drops VOICE & STYLE on minimal so a 14B model on a 16 GB GPU still feels like itself at 30K visible context.
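Tier selection by budget could be as simple as the sketch below. The tier names come from the text; the function name and the context thresholds are illustrative assumptions, not the real Persona API.

```typescript
type PersonaTier = "minimal" | "standard" | "full";

// Pick the largest persona that leaves room for real work.
// Thresholds are illustrative, not Persona's actual cutoffs.
function pickPersonaTier(visibleContext: number): PersonaTier {
  if (visibleContext >= 64_000) return "full";     // room for a ~3-16K persona
  if (visibleContext >= 32_000) return "standard"; // ~1-2K persona
  return "minimal";                                // ~400 tokens, drops VOICE & STYLE
}

console.log(pickPersonaTier(30_000)); // "minimal" — the 16 GB GPU case
```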
Four primitives.
One runtime.
Same runtime from a single user's laptop to a company's on-prem deployment — local models or any cloud provider, your data, your hardware. Engram and Persona are standalone MCP packages on npm — composed by Pyre, usable from any MCP client.
Pyre
The control plane for local intelligence.
Runs Qwen3-14B on a 16 GB consumer GPU with up to 200K effective context and 8-hour agent sessions at zero API spend. Multi-agent runtime that adapts to your hardware, switches between local and any cloud provider per conversation, and remembers everything via Engram. Pro keeps your agents, personal memory, and personality continuous across every device. Enterprise adds on-prem deployment, SSO, audit, and managed company-memory ingestion via Cortex.
Engram
MCP server
Long-term personal memory for AI agents. 92% R@10 on LoCoMo — beats Mem0, Zep, Letta, and ChatGPT memory. Pluggable storage backend: the same codebase serves a local user via LanceDB or a multi-tenant cloud deployment via Postgres + pgvector.
npm install @onenomad/engram-mcp
Persona
MCP server
Evolving personality for AI agents. Three-part soul system keeps a coherent voice that grows with you instead of resetting every session.
- Soul — User territory — PERSONALITY.md, STYLE.md, SKILL.md files you edit directly.
- Role — Overlays for context — developer, designer, pm, writer, researcher.
- Journal — Persona's territory — evolution proposals land here before you apply them.
npm install @onenomad/persona-mcp
Cortex
MCP server
Your company’s knowledge as one searchable surface — Confluence, Jira, Linear, Notion, Obsidian, Bitbucket, GitHub, Slack — with permission-aware retrieval and on-prem deployment. Stateless on the answer side: retrieval and ranking happen in Cortex, LLM calls happen in the connected runtime. No duplicated LLM cost in the multi-tenant tier — enterprises bring their own model.
One memory. One personality. Every client.
Engram and Persona are MCP servers. Any MCP-compatible client connects to the same memory and the same personality — Pyre, Claude Desktop, Cursor, Cline, Windsurf, Continue, ChatGPT connectors. Switch tools without re-introducing yourself.
One memory follows you across every MCP client.
One personality follows you across every MCP client.
Most memory products lock you into their app. Engram and Persona are npm packages with open MCP servers — your data is yours, and so is the choice of client.
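As an example of that portability, wiring Engram and Persona into Claude Desktop is a matter of listing them under `mcpServers` in `claude_desktop_config.json`. The config key format is Claude Desktop's standard; the assumption here is that both packages expose stdio MCP servers runnable via npx:

```json
{
  "mcpServers": {
    "engram": {
      "command": "npx",
      "args": ["-y", "@onenomad/engram-mcp"]
    },
    "persona": {
      "command": "npx",
      "args": ["-y", "@onenomad/persona-mcp"]
    }
  }
}
```

Cursor, Cline, and the other MCP clients take an equivalent server entry in their own config files.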
Free locally.
Pay for continuity.
Open-core funnel: free local runtime → paid cloud features → enterprise contracts. Same product, three deployment shapes.
- Full local runtime — every model, every provider
- Single-machine personal memory (Engram) + personality (Persona)
- Multi-agent orchestration up to your hardware limit
- Desktop · CLI · Web server · Chrome companion
- Local cognitive stack — 200K effective context on 16 GB GPU
- Everything in Core
- Your Persona, personal memory, and project context follow you across every device — E2E encrypted
- Always-on background agents — long sessions keep running when your laptop sleeps
- Nightly E2E-encrypted backups — restore your AI on any new machine in minutes
- Full memory portability — export your Persona and Engram any time. Your data, your call.
- Curated plugin & SKILL.md catalog
- Everything in Pro
- On-prem deployment — Terraform module, Helm chart, air-gap installer
- SSO / SAML — Okta, Azure AD, Google Workspace
- Audit logging, RBAC, configurable retention
- Cortex Enterprise — managed company-memory ingestion
- 99.9% SLA + implementation services
Pyre is not...
It helps to know what Pyre isn't before deciding whether it's for you.
- × A model: We don't train. Bring whichever you trust.
- × A chat wrapper: Not competing with the ChatGPT / Claude / Gemini surface.
- × A cloud inference service: Selling cloud inference contradicts the local-first brand.
- × A workflow builder: Not n8n / Zapier / Make. Different problem.
vs the alternatives
Pyre overlaps with several categories without belonging to any of them. Here's what we do that they don't.
Multi-agent runtime + personal memory + persona on top. The local cognitive stack (CBE · Vault · Sidecar · Persona-min) makes Pyre Core measurably better at long-running agentic work — not just a prettier model launcher.
Local-first by default. Multi-provider — switch model per conversation. Persona persists across sessions; Engram remembers. Your data stays on your machine unless you choose otherwise.
Different surface. Pyre orchestrates broader work — research, docs, agents, glue between tools — not just coding. Use Pyre alongside your IDE, not instead of it.
Cortex returns answers. Pyre takes action. Open-core, on-prem option, MCP-native, multi-source — not locked to one vendor's collaboration suite.
We don't train models.
We orchestrate them.
Most AI tools today are chat wrappers around a single hosted model, or vertical apps that lock you into one stack. Pyre's bet: own the layer between your data and the model. That layer is durable when models change, durable when providers change, and accumulates value as your persona and personal memory deepen.
Local-first by default
Pyre runs on your machine — CPU, Mac Metal, CUDA, ROCm, Vulkan all auto-detected. Your data never leaves unless you choose a cloud provider per conversation. No vendor lock-in, no telemetry tax.
Open-core, not open-bait
Apache 2.0, open source from day one. Cloud and enterprise tiers fund the work. The Core tier is genuinely good, not a crippled demo. Cortex is the one exception — source-available under a separate commercial license.
Own the cognitive layer
Models are commoditizing. The durable value is personal memory, personality, and knowledge — the layer between your data and any model. Pyre owns that layer so you can swap models without losing yourself.
Built by someone who lives in the workflow.
Matt Stvartak. Senior JavaScript engineer. Solo founder of OneNomad.
The whole stack — Pyre, Engram, Persona, Cortex — was shipped in under 30 days using Claude Code as an engineering pair, by someone who lives in the exact AI-augmented workflow Pyre is designed for.
Pyre is in public beta. We're raising a small pre-seed. Investor inquiries: hello@onenomad.dev.
One command. Any machine.
Pyre auto-detects your hardware on first launch — CPU, Mac Metal, CUDA, ROCm, or Vulkan — and picks a model that fits. No config files, no ceremony.
# macOS / Linux
curl -fsSL https://getpyre.dev/install.sh | sh
# Or via npm
npm install -g @onenomad/pyre
# Run
pyre start