Pyre 0.1.x · Public Beta · MCP-native · Apache 2.0 · open-core · Auto-detects your hardware · Latest: Pluggable storage in Engram · v0.2 desktop shipped

The control plane
for local intelligence.

Engram remembers. Persona speaks. Cortex knows. Pyre brings them to life.

A multi-agent AI runtime that runs on your hardware. Orchestrates any model from any provider — local or cloud, switched per conversation. Composes Engram for personal memory, Persona for personality, and Cortex for company memory — each one a standalone MCP server you can also use in Claude Desktop, Cursor, Cline, or any MCP-compatible client.

Qwen3-14B on a 9070 XT · 30K visible / 60–80K effective context · 8-hour agent sessions · $0 API spend.
$0
API spend
Local models, BYOK cloud
200K
Effective context
On a 16 GB consumer GPU
8 hr
Agent sessions
Without context blowup
// The moat

The local cognitive stack.
Free in Core.

Ollama and LM Studio run models. Pyre runs an orchestration system on top — four engineered pieces that turn 16 GB of consumer VRAM into something that feels like a 200K-context cloud model. Co-designed across Engram, Persona, and the Pyre engine, so the components compound instead of fighting each other.

01 · @onenomad/pyre-context-budget

Context Budget Engine

One tested module owns the watermark math: clamp(floor(ctx × 0.65), 2k, 24k). Slot-aware so the runtime decides what to compact instead of dropping the conversation. Renderer and engine read from the same source — no drift between client and server.
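The watermark formula above can be written down directly. A minimal TypeScript sketch; the function names are illustrative, not the actual @onenomad/pyre-context-budget API:

```typescript
// Illustrative sketch of the watermark math: clamp(floor(ctx × 0.65), 2K, 24K).
// Names are assumptions for this example, not the real module's exports.
const clamp = (v: number, lo: number, hi: number): number =>
  Math.min(Math.max(v, lo), hi);

/** Token count at which the runtime starts compacting, given visible context. */
function compactionWatermark(ctxTokens: number): number {
  return clamp(Math.floor(ctxTokens * 0.65), 2_000, 24_000);
}

// A 30K-token visible context compacts at 19,500 tokens; tiny and huge
// contexts are pinned to the 2K / 24K bounds.
console.log(compactionWatermark(30_000));  // 19500
console.log(compactionWatermark(2_000));   // 2000
console.log(compactionWatermark(128_000)); // 24000
```

Because renderer and engine would call the same function, there is a single source of truth for when compaction fires.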

02 · @onenomad/pyre-tool-vault

Tool Output Vault

Pyre exposes ~70 tools — that's 70 ways to blow your context. The Vault catches tool returns, stores raw output on disk, and hands the agent a structured summary plus a get_tool_output(id) escape hatch. 40–60% token reduction on agentic sessions.
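The Vault pattern in miniature: raw output goes off-context, and the agent keeps a short summary plus an escape hatch. A TypeScript sketch in which the names are assumptions and a `Map` stands in for on-disk storage:

```typescript
// Hypothetical sketch of the Vault pattern; not the real
// @onenomad/pyre-tool-vault API.
import { randomUUID } from "node:crypto";

const vault = new Map<string, string>(); // stands in for on-disk storage

/** Stash a raw tool return; give the agent a summary plus a retrieval id. */
function stashToolOutput(raw: string, maxPreview = 200): { id: string; summary: string } {
  const id = randomUUID();
  vault.set(id, raw);
  const preview = raw.length > maxPreview ? raw.slice(0, maxPreview) + " …" : raw;
  return { id, summary: `[${raw.length} chars stored; preview] ${preview}` };
}

/** Escape hatch the agent calls when the summary isn't enough. */
function getToolOutput(id: string): string | undefined {
  return vault.get(id);
}

const big = "x".repeat(50_000); // e.g. a noisy recursive directory listing
const { id, summary } = stashToolOutput(big);
// The context window gets ~230 chars instead of 50K; the full output
// stays retrievable by id.
console.log(summary.length < 300, getToolOutput(id)?.length === 50_000); // → true true
```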

03 · @onenomad/pyre-compaction-sidecar

Compaction Sidecar

A small model on a parallel slot summarizes scrollback without touching your main inference loop. OpenAI-compatible — works with llama-server, Ollama, vLLM, OpenRouter. Drop-in upgrade for the Vault summarizer; falls back gracefully if the sidecar goes down.
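Because the sidecar speaks the OpenAI-compatible protocol, a compaction call is just a POST to `/v1/chat/completions`. A sketch under assumptions: the model name and system prompt here are placeholders, not Pyre's actual defaults:

```typescript
// Sketch of a sidecar summarization call. Any OpenAI-compatible endpoint
// works (llama-server, Ollama, vLLM, OpenRouter); the model name and
// prompt below are example assumptions.
type ChatMessage = { role: "system" | "user"; content: string };

function buildCompactionRequest(scrollback: string, model = "qwen2.5-3b-instruct") {
  const messages: ChatMessage[] = [
    {
      role: "system",
      content:
        "Summarize the conversation faithfully and concisely. " +
        "Keep decisions, open questions, and file paths.",
    },
    { role: "user", content: scrollback },
  ];
  return { model, messages, temperature: 0.2, max_tokens: 512 };
}

async function compact(baseUrl: string, scrollback: string): Promise<string | null> {
  try {
    const res = await fetch(`${baseUrl}/v1/chat/completions`, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(buildCompactionRequest(scrollback)),
    });
    if (!res.ok) return null; // caller falls back to the built-in summarizer
    const data = await res.json();
    return data.choices?.[0]?.message?.content ?? null;
  } catch {
    return null; // sidecar down: degrade gracefully, never block inference
  }
}
```

Returning `null` instead of throwing is what makes the fallback graceful: the main loop keeps its existing summarizer as the default path.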

04 · @onenomad/persona-mcp

Persona minimal-context mode

Soul + persona at three sizes — minimal (~400 tok), standard (~1–2K), full (~3–16K). Pick the budget; keep the personality. Drops VOICE & STYLE on minimal so a 14B model on a 16 GB GPU still feels like itself at 30K visible context.
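Choosing between the three sizes reduces to a budget check. The tier numbers mirror the ones above, but the thresholds and function name here are illustrative assumptions, not persona-mcp behavior:

```typescript
// Hypothetical sketch of budget-based persona sizing. The cutoffs are
// assumptions chosen to fit the ~400 / ~1–2K / ~3–16K tiers above.
type PersonaSize = "minimal" | "standard" | "full";

function pickPersonaSize(spareTokens: number): PersonaSize {
  if (spareTokens >= 16_000) return "full";    // ~3–16K persona fits comfortably
  if (spareTokens >= 2_000) return "standard"; // ~1–2K persona
  return "minimal";                            // ~400 tok; VOICE & STYLE dropped
}

console.log(pickPersonaSize(500));    // "minimal"
console.log(pickPersonaSize(5_000));  // "standard"
console.log(pickPersonaSize(20_000)); // "full"
```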

// Receipts
92%
LoCoMo R@10
Engram, today — beats Mem0 / Zep / Letta
40–60%
Token reduction
Tool Output Vault on agentic runs
8 hr
Agent run, no blowup
Side-by-side video vs vanilla auto-truncation
Soon
LongBench-v2 · HELMET · RULER
In the eval queue · receipts coming
// The Stack

Four primitives.
One runtime.

Same runtime from a single user's laptop to a company's on-prem deployment — local models or any cloud provider, your data, your hardware. Engram and Persona are standalone MCP packages on npm — composed by Pyre, usable from any MCP client.

Flagship · Beta

Pyre

The control plane for local intelligence.

Runs Qwen3-14B on a 16 GB consumer GPU with up to 200K effective context and 8-hour agent sessions at zero API spend. Multi-agent runtime that adapts to your hardware, switches between local and any cloud provider per conversation, and remembers everything via Engram. Pro keeps your agents, personal memory, and personality continuous across every device. Enterprise adds on-prem deployment, SSO, audit, and managed company-memory ingestion via Cortex.

Personal · Free
Pro · $20/mo
Enterprise · Licensed
MCP-native
Mac · Win · Linux
Local
llama.cpp · MLX · Ollama
Cloud
OpenAI · Anthropic · OpenRouter
Personal memory
Engram
Identity
Persona
Apps
Desktop · CLI · Web · Chrome
License
Apache 2.0 · open-core

Engram

MCP server
remembers

Long-term personal memory for AI agents. 92% R@10 on LoCoMo — beats Mem0, Zep, Letta, and ChatGPT memory. Pluggable storage backend: the same codebase serves a local user via LanceDB or a multi-tenant cloud deployment via Postgres + pgvector.

$ npm install @onenomad/engram-mcp
License · Apache 2.0 · GitHub · npm
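The pluggable-storage idea reduces to one interface with swappable backends. A hypothetical TypeScript sketch (the real Engram interface may differ); an in-memory store stands in for the LanceDB and Postgres + pgvector backends:

```typescript
// Hypothetical sketch of pluggable memory storage; names are illustrative,
// not Engram's actual API. One interface, many backends.
interface MemoryStore {
  upsert(id: string, text: string, embedding: number[]): Promise<void>;
  search(embedding: number[], k: number): Promise<string[]>;
}

// A local backend could wrap LanceDB; a multi-tenant cloud backend could
// wrap Postgres + pgvector. Here: an in-memory stand-in with cosine search.
class InMemoryStore implements MemoryStore {
  private rows: { id: string; text: string; embedding: number[] }[] = [];

  async upsert(id: string, text: string, embedding: number[]) {
    this.rows = this.rows.filter((r) => r.id !== id); // replace on same id
    this.rows.push({ id, text, embedding });
  }

  async search(q: number[], k: number) {
    const cos = (a: number[], b: number[]) => {
      let dot = 0, na = 0, nb = 0;
      for (let i = 0; i < a.length; i++) { dot += a[i] * b[i]; na += a[i] ** 2; nb += b[i] ** 2; }
      return dot / (Math.sqrt(na) * Math.sqrt(nb) || 1);
    };
    return this.rows
      .slice()
      .sort((x, y) => cos(y.embedding, q) - cos(x.embedding, q)) // best match first
      .slice(0, k)
      .map((r) => r.text);
  }
}
```

The point of the pattern: retrieval code and MCP tool surface stay identical whichever backend is plugged in underneath.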

Persona

MCP server
speaks

Evolving personality for AI agents. Three-part soul system keeps a coherent voice that grows with you instead of resetting every session.

  • Soul: user territory — PERSONALITY.md, STYLE.md, SKILL.md files you edit directly.
  • Role: overlays for context — developer, designer, pm, writer, researcher.
  • Journal: Persona's territory — evolution proposals land here before you apply them.
$ npm install @onenomad/persona-mcp
License · Apache 2.0 · GitHub · npm

Cortex

MCP server
knows

Your company’s knowledge as one searchable surface — Confluence, Jira, Linear, Notion, Obsidian, Bitbucket, GitHub, Slack — with permission-aware retrieval and on-prem deployment. Stateless on the answer side: retrieval and ranking happen in Cortex, LLM calls happen in the connected runtime. No duplicated LLM cost in the multi-tenant tier — enterprises bring their own model.

License · Source-available, commercial · onenomad.dev/cortex
// Cross-client

One memory. One personality. Every client.

Engram and Persona are MCP servers. Any MCP-compatible client connects to the same memory and the same personality — Pyre, Claude Desktop, Cursor, Cline, Windsurf, Continue, ChatGPT connectors. Switch tools without re-introducing yourself.

MCP client · Pyre
MCP client · Claude Desktop
MCP client · Cursor
MCP client · Cline
MCP client · Windsurf
MCP client · Continue
MCP client · ChatGPT connectors
Cognitive layer
Engram

One memory follows you across every MCP client.

Persona

One personality follows you across every MCP client.

Most memory products lock you into their app. Engram and Persona are npm packages with open MCP servers — your data is yours, and so is the choice of client.
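Wiring both servers into another MCP client is one config entry per server. For example, a sketch of a Claude Desktop `claude_desktop_config.json` using the npm packages above; the `npx` invocation is an assumption about each package's CLI entry point:

```json
{
  "mcpServers": {
    "engram": {
      "command": "npx",
      "args": ["-y", "@onenomad/engram-mcp"]
    },
    "persona": {
      "command": "npx",
      "args": ["-y", "@onenomad/persona-mcp"]
    }
  }
}
```

The same two entries, adapted to each client's config format, point Cursor, Cline, or any other MCP client at the same memory and personality.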

// Tiers

Free locally.
Pay for continuity.

Open-core funnel: free local runtime → paid cloud features → enterprise contracts. Same product, three deployment shapes.

Pyre · Core
Personal · Free forever
$0 · Apache 2.0 · open-core
Install
  • Full local runtime — every model, every provider
  • Single-machine personal memory (Engram) + personality (Persona)
  • Multi-agent orchestration up to your hardware limit
  • Desktop · CLI · Web server · Chrome companion
  • Local cognitive stack — 200K effective context on 16 GB GPU
Most popular
Pyre · Pro
Cloud · Compounding personal AI
$20 per month · or $200/yr
Start free trial
  • Everything in Core
  • Your Persona, personal memory, and project context follow you across every device — E2E encrypted
  • Always-on background agents — long sessions keep running when your laptop sleeps
  • Nightly E2E-encrypted backups — restore your AI on any new machine in minutes
  • Full memory portability — export your Persona and Engram any time. Your data, your call.
  • Curated plugin & SKILL.md catalog
Pyre · Enterprise
Licensed · On-prem
Custom · Annual contracts
Talk to us
  • Everything in Pro
  • On-prem deployment — Terraform module, Helm chart, air-gap installer
  • SSO / SAML — Okta, Azure AD, Google Workspace
  • Audit logging, RBAC, configurable retention
  • Cortex Enterprise — managed company-memory ingestion
  • 99.9% SLA + implementation services
// Not this

Pyre is not...

It helps to know what Pyre isn't before deciding whether it's for you.

  • × A model: We don't train. Bring whichever you trust.
  • × A chat wrapper: Not competing with the ChatGPT / Claude / Gemini surface.
  • × A cloud inference service: Selling cloud inference contradicts the local-first brand.
  • × A workflow builder: Not n8n / Zapier / Make. Different problem.
// Why Pyre

vs the alternatives

Pyre overlaps with several categories without belonging to any of them. Here's what we do that they don't.

vs Ollama · LM Studio
Run models locally

Multi-agent runtime + personal memory + persona on top. The local cognitive stack (CBE · Vault · Sidecar · Persona-min) makes Pyre Core measurably better at long-running agentic work — not just a prettier model launcher.

vs ChatGPT · Claude · Gemini
Hosted chat with one model

Local-first by default. Multi-provider — switch model per conversation. Persona persists across sessions; Engram remembers. Your data stays on your machine unless you choose otherwise.

vs Cursor · Continue · Cline
IDE-embedded AI

Different surface. Pyre orchestrates broader work — research, docs, agents, glue between tools — not just coding. Use Pyre alongside your IDE, not instead of it.

vs Glean · Microsoft 365 Copilot
Workplace search + AI answers

Cortex returns answers. Pyre takes action. Open-core, on-prem option, MCP-native, multi-source — not locked to one vendor's collaboration suite.

// Philosophy

We don't train models.
We orchestrate them.

Most AI tools today are chat wrappers around a single hosted model, or vertical apps that lock you into one stack. Pyre's bet: own the layer between your data and the model. That layer is durable when models change, durable when providers change, and accumulates value as your persona and personal memory deepen.

01

Local-first by default

Pyre runs on your machine — CPU, Mac Metal, CUDA, ROCm, Vulkan all auto-detected. Your data never leaves unless you choose a cloud provider per conversation. No vendor lock-in, no telemetry tax.

02

Open-core, not open-bait

Apache 2.0, open source from day one. Cloud and enterprise tiers fund the work. The Core tier is genuinely good, not a crippled demo. Cortex is the one exception — source-available under a separate commercial license.

03

Own the cognitive layer

Models are commoditizing. The durable value is personal memory, personality, and knowledge — the layer between your data and any model. Pyre owns that layer so you can swap models without losing yourself.

Adapters
OpenAI · Anthropic · OpenRouter · Ollama · llama.cpp · MLX · LM Studio
Apps
Desktop · CLI · Web · Chrome
Hardware
CPU · Metal · CUDA · ROCm · Vulkan
License
Apache 2.0 · open source
// About

Built by someone who lives in the workflow.

Matt Stvartak. Senior JavaScript engineer. Solo founder of OneNomad.

The whole stack — Pyre, Engram, Persona, Cortex — was shipped in under 30 days using Claude Code as an engineering pair, by someone who lives in the exact AI-augmented workflow Pyre is designed for.

Investors

Pyre is in public beta. We're raising a small pre-seed. Investor inquiries: hello@onenomad.dev.

// Get Pyre

One command. Any machine.

Pyre auto-detects your hardware on first launch — CPU, Mac Metal, CUDA, ROCm, or Vulkan — and picks a model that fits. No config files, no ceremony.

terminal
# macOS / Linux
curl -fsSL https://getpyre.dev/install.sh | sh

# Or via npm
npm install -g @onenomad/pyre

# Run
pyre start
// Native installers · All releases →
Windows
Pyre Desktop
Installer.exe
Linux
Pyre Desktop
AppImage · x86_64