continuity · substrate · open source

The continuity layer for everything you do with AI.

One open source server. Every tool you use, every device you work from. No cloud rental, no vendor lock-in, and the continuity is yours.

where it started

I was copying conversations between Claude and ChatGPT, generating handoff docs, re-explaining the same decisions over and over. That's when it hit me: I was the continuity layer between every AI tool, had to be it for them, and worst of all, I'm lossy too.

Brandon Lehmann, creator of Neural Ram

the job you didn't sign up for

You're already doing this by hand.

You've re-explained the same project to a blank chat more times than you can count. You keep a doc of decisions so the next tool can catch up, and it's stale by the time you've switched. The sharpest thing you worked out last week is buried in a conversation you'll never scroll back to. The carrying, the re-explaining, the copy-paste from one window to the next: that's a job, and right now it's yours.

Nobody's good at it. You lose the thread, you lose the nuance, you lose the best version of the idea, and you never notice the moment it slips. Whether you're wiring up agents or just living across a dozen tabs, nram keeps the thread.

a layer, not an app

Your agent reads. nram remembers.

Memory today lives inside a single tool: one app, one agent, one vendor. nram is the layer underneath all of them, not another memory app bolted onto one. Researching on a laptop, coding on a desktop, drafting on a tablet, picking it back up on your phone, switching between Claude, ChatGPT, Grok, Mistral, Perplexity, Cursor, and your own scripts: none of that should reset the work every time you change rooms.

Your agent already reads the PDF, watches the video, runs the test, scrapes the page. nram's job is to keep what mattered. Across every tool. Across every conversation. On infrastructure that belongs to you.

one substrate, many jobs

One server, not four separate tools.

A single server covers work that today is split across four separate products: conversational memory, document and corpus recall, standing rules, and agent state. One substrate does all of it, so there's nothing to stitch together and nothing to keep in sync.

  • Conversational continuity
    Memory that survives across sessions, tools, and vendors, reachable over MCP.
  • Document and corpus recall
    Semantic search and an entity-deduped knowledge graph over your stored corpus. A substrate, not a chat UI.
  • Procedural rules
    Verbatim standing rules and conventions an assistant loads at session start. Returned byte-for-byte, never paraphrased.
  • Agent memory
    Persistent memory for coding, research, and custom agents, with consolidation and a knowledge graph on top.

like sleeping on it

The best thinking happens offline.

You've felt it. The fix that arrives in the shower, the connection that surfaces on a walk, the problem that's somehow simpler after a night's sleep. Your mind keeps working when you step away from it, sorting what counts from what doesn't, settling what was left unsettled.

nram does the same. What matters carries forward, across tools, across devices, across weeks. While nram sits idle, it dreams: folding in what's new, resolving contradictions instead of stacking them, letting the stale fade. You come back to memory that's been refined while you were gone, not just stored.

what "self-hosted" actually means

A server, not a script.

"Self-hosted" in this space usually means a Python library you embed, a localhost shim with no auth, or an open-source wrapper sitting on rented infrastructure. None of them survive the moment "self-hosted" was supposed to matter. nram is a real server.

  • not a library
    A single MIT-licensed binary you run as a server. SQLite with a pure-Go vector index by default, Postgres with pgvector or Qdrant at scale.
  • not a localhost shim
    OAuth 2.0 with PKCE and dynamic client registration. WebAuthn passkeys. Per-org OIDC SSO. Your laptop, desktop, and phone see the same brain.
  • not single-user
    Organizations, projects, hierarchical namespaces. RBAC across five roles. Your server is shared; your memories stay yours.
  • not stdio-only
    MCP over Streamable HTTP. Plus REST, SSE with reconnect, signed webhooks, Prometheus at /metrics. Every tool you use shares one server.
  • not locked to one vendor
    Runs on OpenAI, Anthropic, Gemini, Ollama, OpenRouter, or any OpenAI-compatible endpoint. Swap providers without moving your memory.
  • not a black box
    A Web Console for organizations, projects, providers, the knowledge graph, the dreaming cycle, and usage analytics. See exactly what your memory is doing.
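
To make the "OAuth 2.0 with PKCE" item concrete: the sketch below shows the client side of PKCE as defined in RFC 7636 (the S256 method). It is standard OAuth, not nram-specific code, and the function names are illustrative assumptions rather than anything from nram's API.

```python
import base64
import hashlib
import secrets


def challenge_for(code_verifier: str) -> str:
    """code_challenge = BASE64URL(SHA256(ASCII(code_verifier))), padding stripped."""
    digest = hashlib.sha256(code_verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def make_pkce_pair() -> tuple[str, str]:
    """Return (code_verifier, code_challenge) for one authorization request."""
    # 32 random bytes encode to a 43-character URL-safe verifier,
    # which sits inside the 43..128 character range the RFC allows.
    raw = secrets.token_bytes(32)
    code_verifier = base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")
    return code_verifier, challenge_for(code_verifier)


verifier, challenge = make_pkce_pair()
```

The client sends the challenge with the authorization request and the verifier with the token request, so an intercepted authorization code is useless on its own.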

under the hood

More than a database with vector search.

nram does the part of memory that's actually hard.

Hybrid recall
  • Vector + lexical (FTS5 on SQLite, tsvector on Postgres), fused with reciprocal rank fusion
  • MMR (maximal marginal relevance) rerank kills near-duplicate clusters
  • Six ranking terms: similarity, recency, importance, frequency, graph relevance, confidence
  • Query augmentation paraphrases each memory into retrieval queries, so recall matches the way you ask
  • Every weight tunable per project
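
If reciprocal rank fusion and MMR are new to you, here is a minimal sketch of both techniques in their textbook form. It is not nram's code: the function names, the constant k=60, the lambda value, and the toy vectors are all assumptions chosen for illustration.

```python
import math
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked ID lists: each list adds 1 / (k + rank) to an item's score."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, memory_id in enumerate(ranking, start=1):
            scores[memory_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so the order is deterministic.
    return sorted(scores, key=lambda memory_id: (-scores[memory_id], memory_id))


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr_rerank(query_vec, vectors, lam=0.5, top_n=3):
    """Greedy MMR: reward relevance to the query, penalize similarity to earlier picks."""
    selected = []
    remaining = sorted(vectors)  # sorted IDs keep tie-breaking deterministic
    while remaining and len(selected) < top_n:
        def score(memory_id):
            relevance = cosine(query_vec, vectors[memory_id])
            redundancy = max(
                (cosine(vectors[memory_id], vectors[s]) for s in selected),
                default=0.0,
            )
            return lam * relevance - (1.0 - lam) * redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical result lists from a vector search and a lexical search.
fused = reciprocal_rank_fusion([["a", "b", "c"], ["b", "d", "a"]])

# "x" and "x_dup" are near-duplicates; "y" is less relevant but distinct.
vectors = {"x": [0.98, 0.17], "x_dup": [0.98, 0.20], "y": [0.77, -0.64]}
diverse = mmr_rerank([1.0, 0.0], vectors, lam=0.5, top_n=2)
```

In this toy run, fusion puts "b" first because both searches rank it near the top, and the MMR pass returns "x" and "y" while skipping "x_dup", which adds almost nothing beyond "x".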
Knowledge graph
  • Built from entities and relationships the enrichment pipeline extracts
  • An ingestion judge decides add / update / delete / none against near-duplicates before extraction runs
  • Multi-hop traversal, operator-tunable depth
  • Updates supersede rather than overwrite, with lineage kept
  • Visualized in the Web Console
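
Two of the ideas above, supersede-with-lineage and depth-limited multi-hop traversal, can be sketched in a few lines. This is a toy in-memory model under stated assumptions: the class, field, and method names are hypothetical, and nram's real schema and storage are not shown in this document.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class Fact:
    fact_id: str
    text: str
    supersedes: Optional[str] = None     # lineage pointer to the fact this one replaced
    superseded_by: Optional[str] = None  # set once a newer fact takes over


class GraphSketch:
    def __init__(self):
        self.facts = {}  # fact_id -> Fact
        self.edges = {}  # entity -> set of neighbouring entities

    def add_fact(self, fact_id, text):
        self.facts[fact_id] = Fact(fact_id, text)

    def supersede(self, old_id, new_id, text):
        """Record a newer fact; the old one is kept and linked, not overwritten."""
        self.facts[new_id] = Fact(new_id, text, supersedes=old_id)
        self.facts[old_id].superseded_by = new_id

    def lineage(self, fact_id):
        """Walk back from a fact through everything it superseded."""
        chain = []
        current = fact_id
        while current is not None:
            chain.append(current)
            current = self.facts[current].supersedes
        return chain

    def link(self, a, b):
        """Add an undirected relationship between two entities."""
        self.edges.setdefault(a, set()).add(b)
        self.edges.setdefault(b, set()).add(a)

    def within(self, start, max_depth):
        """Breadth-first traversal: entity -> hop count, up to max_depth hops."""
        depths = {start: 0}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            if depths[node] == max_depth:
                continue  # depth budget spent; do not expand further
            for neighbour in sorted(self.edges.get(node, ())):
                if neighbour not in depths:
                    depths[neighbour] = depths[node] + 1
                    queue.append(neighbour)
        del depths[start]
        return depths
```

The depth argument plays the role of the operator-tunable limit: a small value keeps recall tight, a larger one pulls in more distant connections.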
Dreaming
  • Nine phases that run only when nram is idle and something has actually changed
  • Entity dedup, embedding and augmentation backfill
  • Paraphrase fold, transitive inference, contradiction detection
  • Consolidation with novelty audit, pruning, weight recalc
  • Contradictions get resolved, not stacked; confidence decays and recall reinforces, so memory stays current
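
One way to picture "confidence decays and recall reinforces" is an exponential half-life paired with a bounded boost on each recall. The sketch below is only that picture: the formulas, the 30-day half-life, the boost size, the idle threshold, and the function names are all assumptions, not nram's actual parameters.

```python
def decayed_confidence(confidence, days_since_reinforced, half_life_days=30.0):
    """Halve confidence for every half_life_days that pass without reinforcement."""
    return confidence * 0.5 ** (days_since_reinforced / half_life_days)


def reinforce(confidence, boost=0.2):
    """On a successful recall, move confidence part of the way toward 1.0."""
    return confidence + boost * (1.0 - confidence)


def should_dream(idle_seconds, pending_changes, idle_threshold_seconds=300):
    """Gate a cycle: run only when the server is idle and something has changed."""
    return idle_seconds >= idle_threshold_seconds and pending_changes > 0


# A memory at 0.8 confidence that goes untouched for one half-life drops to 0.4,
# and a single recall afterwards lifts it to about 0.52.
stale = decayed_confidence(0.8, days_since_reinforced=30)
refreshed = reinforce(stale)
```

Because the boost scales with the distance to 1.0, repeated recalls keep a memory strong without ever pushing it past full confidence.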
Memory tiers
  • Procedural tier stores standing rules verbatim, never embedded or rewritten
  • Persona (about_me) tier holds identity and preferences, surfaced on every recall
  • Global tier carries world-knowledge across every project
  • Project memories run the full enrichment and dreaming pipeline

open source. first, last, forever.

Open source. No asterisks.

nram is free and open software, and always will be. You can run it, read it, change it, and build on it, for anything you want. No one, us included, can ever take that back. No bait and switch. No "enterprise edition" that hides the features you actually need behind a contract. The substrate stays open. The economics stay aligned with the people using it.

Because it runs on your infrastructure, nothing happens to your memory that you can't see. Every memory keeps its source and lineage, nothing gets quietly overwritten, and each dreaming cycle writes an audit log. You can always trace why nram knows what it knows.

Run it.

Go 1.26+, Node 18+, a few minutes. SQLite by default, no signup, no cloud.

$ git clone https://github.com/nram-ai/nram && cd nram && make build && ./nram

about the name

Working memory is what your brain holds while you're working. It moves between tools without you thinking about it. Neural Ram is that, for your AI.