Post · May 22, 2026

Six stores, one verb test: how my agent system remembers

Most agent systems have one memory abstraction. Mine has six writable stores plus one view layer, separated by what verb you do with the data. Compiled-truth plus timeline patterns borrowed from GBrain, MemPalace, MemGPT, PARA, and Wikidata. With file paths.

Friday, 2026-04-10, late afternoon. I’m twelve hundred words into a planning doc on how to import Garry Tan’s GBrain compiled-truth pattern into my own agent substrate, and I notice the doc is wrong. I had split the memory architecture into four layers by length of content. Short facts go to MemPalace, longer prose goes to a new entities/ directory, durable rules go to auto-memory, raw transcripts stay in session JSONLs. Read out loud, the split sounds clean. Stress-tested for ten minutes, the split falls apart. “Jane works at Acme” is one short sentence that could plausibly live in any of three stores. Length is the worst possible axis to split on. Drift is guaranteed within weeks.

I deleted the section and rewrote it around the verb instead. What do you intend to do with the fact? Recall it mid-conversation? Read it before a conversation? Derive a pattern across many conversations? Preserve someone’s exact words? Each verb is a different store. The architecture stopped collapsing under questioning the moment I named the verbs.

Six weeks later that document is the canonical filing rule for my agent system. Every Claude Code session in my repo reads it. Every entity page, every MemPalace drawer, every original captured by the scribe pipeline got routed by it. This post is about the six stores, the one view layer over them, the rule that keeps them straight, and the open problems I have not yet solved.

One framing note before the machinery. What follows is how I think about agent memory in my own personal system, Vajra, which I have been running and tuning for myself for months. It is deliberately over-built. Six writable stores is more than most people need, and I would not tell anyone to start there. Neutron, the open-source harness I extracted from this system, does not ship all six. It productizes the part that actually carries the weight for a self-hosted agent: durable, persistent memory, powered by GBrain, Garry Tan’s entity-first knowledge format (one file per entity, a rewritable compiled head, an append-only timeline below a horizontal rule). GBrain is the load-bearing pattern under everything below, and it is the one piece I would tell you to copy first. Read the six-store architecture as one person’s maxed-out version of an idea whose core ships in the box.

Part 1: Six stores, one view layer

Most agent systems I have read about have one memory abstraction. A vector DB, or the session transcript, or some compaction-style window-sliding heuristic glued on top. The framing is usually “memory” singular.

My system has six distinct writable stores and one view layer over them. Each store has its own write trigger, its own read pattern, its own file path on disk, and its own typical content shape. The view layer indexes all of them for hybrid keyword-plus-vector search. Here is the map.

┌────────────────────────────────────────────────────────────────────┐
│                          QMD (view layer)                          │
│        BM25 + vector search across every writable store below      │
└────────────────────────────────────────────────────────────────────┘
           ▲              ▲             ▲             ▲
           │              │             │             │
┌──────────┴────┐ ┌───────┴────┐ ┌──────┴────┐ ┌──────┴────────┐
│ session JSONL │ │ auto-mem   │ │ MemPalace │ │ entities/     │
│ ~/.claude/    │ │ MEMORY.md  │ │ MCP, KG,  │ │ wiki layer    │
│ projects/*    │ │ + per-fact │ │ rooms +   │ │ compiled head │
│ .jsonl        │ │ files      │ │ drawers   │ │ + timeline    │
│               │ │            │ │           │ │               │
│ recent ctx    │ │ durable    │ │ recall-   │ │ read-before-  │
│ for the LLM   │ │ user facts │ │ during    │ │ conversation  │
└───────────────┘ └────────────┘ └───────────┘ └───────┬───────┘
                                                       │ originals/
                                                       │ verbatim
                                                       ▼
                                              ┌────────────────┐
                                              │ PARA folders   │
                                              │ Projects/Areas/│
                                              │ Resources/     │
                                              │ Archive/       │
                                              │ STATUS.md per  │
                                              │ workstream     │
                                              └────────────────┘

The legend, store by store.

1. Session JSONLs (Claude Code native)

Path: ~/.claude/projects/-Users-ryan-vajra/<session-id>.jsonl. Every Claude Code session writes a JSONL transcript here, one event per line, including every user message, every tool call, every tool result, every assistant message. The runtime treats this file as the conversation. When a session resumes by ID, this is what gets loaded.

Write trigger: the runtime appends an event for every turn. Read pattern: the LLM reads the recent N tokens as its working memory. This is the only store the LLM accesses without an explicit tool call. Everything else requires a deliberate query.

Content shape: maximally rich, maximally recent. Lots of noise. Compaction trims it when the window fills. This is the “what just happened” store, not the “what I want to remember” store.
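The one-event-per-line format is easy to consume outside the runtime. A minimal sketch, assuming nothing about the transcript beyond "one JSON object per line"; the `type` and `text` fields in the sample are hypothetical, not the actual Claude Code event schema:

```typescript
// Sketch: parse a JSONL transcript into events, skipping malformed lines.
// Only "one JSON object per line" comes from the post; the field names in
// the sample below are made up for illustration.
type TranscriptEvent = Record<string, unknown>;

function parseJsonl(text: string): TranscriptEvent[] {
  const events: TranscriptEvent[] = [];
  for (const line of text.split("\n")) {
    const trimmed = line.trim();
    if (trimmed === "") continue; // tolerate blank lines and a trailing newline
    try {
      const value: unknown = JSON.parse(trimmed);
      if (typeof value === "object" && value !== null && !Array.isArray(value)) {
        events.push(value as TranscriptEvent);
      }
    } catch {
      // A half-written last line is plausible while a session is live; skip it.
    }
  }
  return events;
}

// Keep only the most recent `n` events: the "what just happened" slice.
function recentEvents(events: TranscriptEvent[], n: number): TranscriptEvent[] {
  return n <= 0 ? [] : events.slice(-n);
}

const sampleTranscript = [
  '{"type":"user","text":"hello"}',
  '{"type":"assistant","text":"hi"}',
  '{"type":"tool_result","ok":true', // truncated line, skipped by the parser
  "",
].join("\n");

const parsedEvents = parseJsonl(sampleTranscript);
console.log(parsedEvents.length, recentEvents(parsedEvents, 1));
```

Skipping unparseable lines instead of throwing is a deliberate choice in this sketch: a reader that races a live append should degrade to "slightly stale", not crash.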

2. Auto-memory (per-project durable facts)

Path: ~/.claude/projects/-Users-ryan-vajra/memory/. Each fact is a markdown file with YAML frontmatter declaring its name, description, and type (user, feedback, project, reference). An index file at MEMORY.md is auto-loaded into every session’s system prompt. Individual fact files are loaded on demand when the index suggests relevance.

Write trigger: I (or an agent observing me) decide a behavioral rule, preference, or durable fact is worth carrying across sessions. The agent writes a new file, appends one line to MEMORY.md, exits. Read pattern: every new session reads MEMORY.md in its system prompt. The agent grep-fetches specific files when context demands.

One real entry, lifted verbatim from memory/feedback_no_validating_openings.md:

---
name: No validating openings
description: Don't open replies with praise or self-deprecating validation
type: feedback
originSessionId: bf16dbc2-39bb-434a-b210-30410c17a346
---
Never open replies with phrases like "Your pushback is valid",
"Your reframing is sharper than mine", "Both corrections good",
"Fair call", "Good point", "You're right", "Love this", "Great
question". Also avoid the self-deprecating form ("scrap mine", "I
was wrong, here's the better version"). Both shapes are
boot-licking.

**Why:** Ryan explicitly called this out 2026-04-21 during Neutron
session - "You've really started to become an ass-kisser,
boot-licker. You're starting every reply with this." [...]

**How to apply:**
- Before sending a reply, check the first sentence. If it's about
  who's right rather than what's true, cut it.
- Start with the actual analysis or substantive point. Warmth
  through precision, not hype.
[...]

That file is 17 lines. Every session I open reads the one-line pointer to it in MEMORY.md. The session adapts even when I do not mention the rule. Forty-eight similar files live in the directory today, ranging from “always use SerpAPI for travel” to “no time estimates in specs.”

Auto-memory is the store with the strictest content rule. It is for durable facts only. Anything volatile (current phase, current branch, what’s running) is forbidden here, because the file rots silently and no process refreshes it. I learned that one the hard way in April when an agent asserted to me that a cutover had not happened, citing a memory file that had been written before the cutover and never updated. Reality had moved; the file had not. The fix was a meta-rule, also in auto-memory: never cite memory for state questions. Always re-read live sources.
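The file shape described above (frontmatter with name, description, and type, plus a one-line pointer in MEMORY.md) can be sketched as a small parser. This is an illustration under assumptions: the frontmatter is treated as flat `key: value` lines, not full YAML, and the pointer format produced by `indexLine` is hypothetical, since the post does not show what a MEMORY.md line looks like.

```typescript
// Sketch: parse an auto-memory fact file and build its index pointer.
// ASSUMPTIONS: flat `key: value` frontmatter; the pointer format is invented.
type MemoryType = "user" | "feedback" | "project" | "reference";

interface MemoryFact {
  name: string;
  description: string;
  type: MemoryType;
  body: string;
}

const MEMORY_TYPES: readonly string[] = ["user", "feedback", "project", "reference"];

function parseMemoryFile(text: string): MemoryFact {
  const lines = text.split("\n");
  if (lines[0]?.trim() !== "---") throw new Error("missing opening frontmatter fence");
  const close = lines.findIndex((line, i) => i > 0 && line.trim() === "---");
  if (close === -1) throw new Error("missing closing frontmatter fence");

  const fields = new Map<string, string>();
  for (const line of lines.slice(1, close)) {
    const colon = line.indexOf(":"); // split on the first colon only
    if (colon === -1) continue;
    fields.set(line.slice(0, colon).trim(), line.slice(colon + 1).trim());
  }

  const name = fields.get("name");
  const description = fields.get("description");
  const type = fields.get("type");
  if (!name || !description || !type) throw new Error("name, description, and type are required");
  if (MEMORY_TYPES.indexOf(type) === -1) throw new Error(`unknown type: ${type}`);

  return { name, description, type: type as MemoryType, body: lines.slice(close + 1).join("\n").trim() };
}

// Hypothetical one-line pointer for MEMORY.md.
function indexLine(fileName: string, fact: MemoryFact): string {
  return `- [${fact.name}](${fileName}) (${fact.type}): ${fact.description}`;
}

const sampleMemoryFile = [
  "---",
  "name: No validating openings",
  "description: Don't open replies with praise or self-deprecating validation",
  "type: feedback",
  "originSessionId: bf16dbc2-39bb-434a-b210-30410c17a346",
  "---",
  "Never open replies with phrases like [...]",
].join("\n");

console.log(indexLine("feedback_no_validating_openings.md", parseMemoryFile(sampleMemoryFile)));
```

Rejecting unknown `type` values at parse time is one cheap way to keep the directory from accumulating ad-hoc categories.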

3. MemPalace (structured KG, recall-during-conversation)

MemPalace is an MCP server. It exposes a structured knowledge graph organized into wings, rooms, and drawers, plus tunnels between rooms in different wings, plus a diary primitive for per-tenant journals. The schema is graph-shaped, not vector-shaped. Queries traverse typed edges. Results are verbatim drawer text with similarity scores when semantic search runs.

My current graph, real numbers from mempalace_graph_stats at the moment of writing:

{
  "total_rooms": 14,
  "tunnel_rooms": 6,
  "rooms_per_wing": {
    "vajra": 8, "biohacking": 5, "general": 5,
    "ryan": 5, "technical": 5, "alchemy": 4,
    "lumina": 4, "neptune": 4, "robobuddha": 4,
    "book": 3
  },
  "top_tunnels": [
    { "room": "technical",     "count": 5036 },
    { "room": "problems",      "count": 1383 },
    { "room": "architecture",  "count": 998  },
    { "room": "planning",      "count": 787  },
    { "room": "decisions",     "count": 107  }
  ]
}

Thirteen wings, fourteen base rooms, the most active tunnel (“technical”) spans nine wings and holds 5036 drawers. Total drawers across the graph: low five figures. The graph grew organically over about 18 months of writes.

Write trigger: an agent or I decide a fact is worth recalling mid-conversation. The agent calls mempalace_kg_add with the entity, the typed edge, and the value. Or it adds a free-form drawer with mempalace_add_drawer when the structure has not crystallized yet. Read pattern: mempalace_search for semantic lookup, mempalace_kg_query for graph traversal, mempalace_find_tunnels when the agent suspects two wings share content.

MemPalace’s job is fast recall during an active conversation. It is optimized for the agent thinking “do I already know this?” mid-reply and pulling back something terse and citable.

4. entities/ (wiki layer, read-before-conversation)

Path: ~/vajra/entities/. A wiki directory on disk. One markdown file per person, company, concept, meeting, idea. The convention is the GBrain pattern: compiled truth above a horizontal rule, append-only timeline below. Frontmatter declares the slug, aliases, confidence, tier, last-verified date.

Current size on disk: 62 people, 131 companies, plus concepts, meetings, ideas, and originals. About 13,000 lines total. Every file is hand-curated or scribe-extracted with a provenance footnote.

Real example, head-and-tail excerpt from entities/people/ryan-junee.md (the principal-user page):

---
slug: ryan-junee
type: person
aliases: ["Ryan", "Ryan Alexander Junee", ...]
confidence: high
tier: 1
last_verified: 2026-04-10
---

# Ryan Alexander Junee

> Principal of this system. Engineer-founder-operator running a
> portfolio of DTC businesses (Tabs, Amascence, Pristine) and a
> paused acquisition target (Neptune) [...]

## What Ryan Believes (compiled)
- "Hope is not a strategy" - applied to Tabs cashflow 2026-03-31.
  ^[src:observed,conf:high,by:ryan,2026-03-31]
- All phenomena are empty of inherent existence AND choices matter
  with karmic consequences. Engaging fully is how mind knows itself.
  ^[src:self-described,conf:high,by:ryan,2026-04-10]

---

## Timeline
- **2026-04-10** | Phase 2 seed - self-page created from USER.md +
  SOUL.md. <!-- agent:forge ts:2026-04-10T22:40Z -->
- **2026-04-13** | telegram - biohacking protocol detail: BPC-157
  dose is 10 units; morning reminders should auto-pull sleep time
  + HRV from Oura. <!-- agent:scribe ts:2026-04-13T05:41:54Z -->
- **2026-04-13** | telegram - ceremony gear design principle stated:
  live expression + improvisation over studio production; Lyra-8
  sound is the benchmark. <!-- agent:scribe ts:2026-04-13T06:52:53Z -->
[...]

Read everything above the horizontal rule and you have the compiled state of play. Read the timeline and you have the audit trail. The compiled head is rewritable. The timeline is append-only. Every claim has a provenance footnote (^[src:observed,conf:high,by:ryan,2026-03-31]) and every line below the rule has an HTML-comment marker tagging the agent that wrote it and the timestamp of the write.

Write trigger: a human or an agent decides an entity needs a curated wiki page. Scribe writes the first version automatically from a Telegram mention (Tier 3 stub plus one timeline entry). Promotions to Tier 1 or 2 happen by hand. Read pattern: agents grep this directory before a conversation about that entity (“who am I about to meet?”). It is the briefing layer.

The structural rule that keeps it honest: agents (scribe, mostly) only append below the rule, and only humans (or human-approved rewrites) write above the rule. That separation is what makes “compiled truth” a useful abstraction. If anything could rewrite the head at any time, the head would just be the most recent dump from whichever agent ran last. Discipline lives in the writer-matrix.
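The head-versus-timeline separation can be enforced mechanically at write time. A minimal sketch, assuming the first two `---` lines fence the frontmatter and the next `---` line is the horizontal rule; the function names and the Jane Doe page are made up for illustration and are not taken from the scribe code:

```typescript
// Sketch: split an entity page at the horizontal rule and append a timeline
// entry without touching the compiled head. All names here are hypothetical.
interface EntityPageParts {
  frontmatter: string; // both fences included
  head: string;        // compiled truth (rewritable)
  timeline: string;    // everything below the rule (append-only)
}

function splitEntityPage(text: string): EntityPageParts {
  const lines = text.split("\n");
  const rules: number[] = [];
  lines.forEach((line, i) => {
    if (line.trim() === "---") rules.push(i);
  });
  const close = rules[1];
  const rule = rules[2];
  if (rules[0] !== 0 || close === undefined || rule === undefined) {
    throw new Error("expected frontmatter fences followed by a horizontal rule");
  }
  return {
    frontmatter: lines.slice(0, close + 1).join("\n"),
    head: lines.slice(close + 1, rule).join("\n"),
    timeline: lines.slice(rule + 1).join("\n"),
  };
}

interface TimelineEntry {
  date: string;
  source: string;
  text: string;
  agent: string;
  ts: string;
}

// Mirrors the entry shape visible in the excerpt above.
function formatTimelineEntry(e: TimelineEntry): string {
  return `- **${e.date}** | ${e.source} - ${e.text} <!-- agent:${e.agent} ts:${e.ts} -->`;
}

// Frontmatter and head come back unchanged; only the timeline grows.
function appendToTimeline(text: string, entry: TimelineEntry): string {
  const parts = splitEntityPage(text);
  const timeline = parts.timeline.replace(/\s+$/, "") + "\n" + formatTimelineEntry(entry) + "\n";
  return [parts.frontmatter, parts.head, "---", timeline].join("\n");
}

const samplePage = [
  "---", "slug: jane-doe", "type: person", "---",
  "", "# Jane Doe", "", "- Works at Acme.", "",
  "---",
  "", "## Timeline",
  "- **2026-04-10** | seed - page created. <!-- agent:forge ts:2026-04-10T22:40Z -->",
  "",
].join("\n");

const updatedPage = appendToTimeline(samplePage, {
  date: "2026-04-13", source: "telegram", text: "mentioned in passing.",
  agent: "scribe", ts: "2026-04-13T05:41:54Z",
});
console.log(splitEntityPage(updatedPage).timeline);
```

Routing every agent write through a function like `appendToTimeline` means the head can only change through a separate, deliberate code path.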

5. entities/originals/ (verbatim Ryan-as-he-said-it)

Path: ~/vajra/entities/originals/. Frozen excerpts of my exact words. No paraphrase, no cleanup, no summarization. Captured when scribe (see Part 3 of this post) decides a Telegram message contains a frame, a thesis, a rant, or a phrase worth preserving as canonical Ryan-language.

387 files in this directory at last count. They tend to look like this (lifted from 2026-04-13-my-current-thinking-for-v1-is-lyra-8-plu.md):

---
slug: 2026-04-13-my-current-thinking-for-v1-is-lyra-8-plu
type: original
confidence: high
tier: 1
source: telegram
source_date: 2026-04-13
---

# Ceremony Gear v1 + Phase 2 Architecture

> Ryan's first explicit gear-phase plan: Lyra-8 + Blackbox as v1
> core; Phase 2 adds controller (Erae 2), spatial/terrain layer
> (Soma Terra or Solar 42N), and vocal looping (Boss RC505 II).

## Verbatim

<!-- agent:ryan ts:2026-04-13T23:45:21Z -->
<!-- EXACT-PHRASE RULE: do not edit the quoted block below. -->

> my current thinking for v1 is lyra-8 plus blackbox as you
> suggest. Phase 2 adds the erae 2 to control the blackbox,
> potentially a soma terra once I understand more of what it
> does (or solar 42N), and potentially the boss rc505 II for
> vocal looping and layering everything

## Context
- **Where it came from:** Telegram, ceremony gear planning thread
- **Why it matters:** First explicit phased plan - v1 is locked
  (Lyra-8 + Blackbox); Phase 2 is exploratory [...]

---

## Timeline
- **2026-04-13** | First captured - source: telegram tg:1237 [...]
- **2026-04-14** | telegram - Updated: "Tier 2 is the right
  approach I think. Wingie sounds like a maybe for phase 1 [...]"

The HTML comment EXACT-PHRASE RULE is a hard constraint: no agent edits the quoted block directly below it. The verbatim block is immutable. Everything else (the title, the context section, the timeline) is rewritable. The framing is captured as I said it, with the meta-shape captured around it.

This store exists because my own framing of an idea is irreplaceable. An agent rewriting “what I mean” loses the load-bearing rhythm of how I think about it. Originals preserve the actual artifact.
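The immutability rule is checkable. A minimal sketch of a pre-write guard, assuming the verbatim text is the first run of `>`-prefixed lines after the EXACT-PHRASE RULE comment; the function names are hypothetical, and nothing here is claimed to be how the author's system enforces the rule:

```typescript
// Sketch: refuse an edit to an original if the quoted block changed.
const EXACT_PHRASE_MARKER = "<!-- EXACT-PHRASE RULE";

function extractVerbatim(text: string): string {
  const lines = text.split("\n");
  const marker = lines.findIndex((line) => line.trim().indexOf(EXACT_PHRASE_MARKER) === 0);
  if (marker === -1) throw new Error("no EXACT-PHRASE RULE marker found");

  const quoted: string[] = [];
  for (const line of lines.slice(marker + 1)) {
    if (line.charAt(0) === ">") {
      quoted.push(line);
    } else if (quoted.length > 0) {
      break; // first non-quote line after the block ends it
    } else if (line.trim() !== "") {
      break; // only blank lines may sit between the marker and the quote
    }
  }
  if (quoted.length === 0) throw new Error("marker is not followed by a quoted block");
  return quoted.join("\n");
}

// True when the proposed rewrite leaves the verbatim block byte-identical.
function verbatimUnchanged(before: string, after: string): boolean {
  return extractVerbatim(before) === extractVerbatim(after);
}

const originalDoc = [
  "# Ceremony Gear v1 + Phase 2 Architecture",
  "",
  "## Verbatim",
  "",
  "<!-- EXACT-PHRASE RULE: do not edit the quoted block below. -->",
  "",
  "> my current thinking for v1 is lyra-8 plus blackbox as you",
  "> suggest.",
  "",
  "## Context",
  "- **Where it came from:** Telegram",
].join("\n");

const contextEdit = originalDoc.replace("Telegram", "Telegram, ceremony gear planning thread");
const quoteEdit = originalDoc.replace("lyra-8", "Lyra-8");
console.log(verbatimUnchanged(originalDoc, contextEdit), verbatimUnchanged(originalDoc, quoteEdit));
```

Comparing the raw lines (not a normalized form) is intentional: even a capitalization cleanup counts as an edit to the exact phrase.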

6. PARA folders (long-form workstream state)

Path: ~/vajra/Projects/<slug>/, ~/vajra/Resources/, ~/vajra/Archive/. PARA is Tiago Forte’s organization scheme (Projects, Areas, Resources, Archive) from Building a Second Brain. In my system, each active workstream gets a Projects/<slug>/ folder containing a STATUS.md, planning docs, research outputs, drafts, and the long-form artifacts that don’t fit anywhere else.

Write trigger: an agent or I update workstream state during work. Read pattern: agents read STATUS.md at the start of any task touching that project. Read patterns are bounded by project slug; the agent doesn’t grep all of PARA, it reads the one STATUS file for the project at hand.

The hard rule: STATUS.md is the single source of truth for project state. Auto-memory does not store project state. MemPalace does not store project state. entities/ does not store project state. STATUS.md does. When something contradicts STATUS.md, STATUS.md wins and the contradicting store gets fixed.

The view layer: QMD

QMD is a local search engine that indexes markdown collections with hybrid BM25 plus vector search. Three query types: lex (BM25 keywords), vec (semantic), hyde (hypothetical-document expansion). My config at ~/.config/qmd/index.yml:

collections:
  memory-dir-main:
    path: /Users/ryan/vajra/Memory/sessions
    pattern: "**/*.md"
  vajra:
    path: /Users/ryan/vajra
    pattern: "**/*.md"
  entities:
    path: /Users/ryan/vajra/entities
    pattern: "**/*.md"

QMD is a view, not a writable store. Nothing originates here. Every entry in QMD is a projection of an underlying markdown file in one of the writable stores. The embedding model is embeddinggemma-300M-Q8_0, a quantized 300M-parameter local model. Search latency is sub-second for the corpus size I have.

A real query, run from a Claude Code session, looking for the canonical “verb test” doc:

searches: [
  { "type": "lex", "query": "\"verb test\" entities" },
  { "type": "vec", "query": "how is memory split between MemPalace and entities" }
]

results: [
  {
    "file": "vajra/docs/plans/2026-04-10-008-feat-gbrain-borrow-list-plan.md",
    "score": 0.93,
    "snippet": "entities/ is a new layer that sits between auto-memory
                (short durable rules) and QMD (fulltext search over
                everything)..."
  },
  {
    "file": "vajra/docs/solutions/2026-04-11-entities-pipeline-compound.md",
    "score": 0.56,
    "snippet": "category: architecture-decisions, tags: [entities,
                scribe, mempalace-rearch, gbrain-pattern]..."
  }
]

Top hit at score 0.93 was the canonical plan. The lex sub-query carried the exact phrase; the vec sub-query carried the meaning. Hybrid is the default because either alone fails on roughly a third of real queries.
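For readers unfamiliar with hybrid retrieval, one common way to merge a keyword ranking with a vector ranking is reciprocal rank fusion (RRF). The sketch below illustrates that general technique only; the post does not say how QMD combines its lex and vec results, and the file names and the constant `k = 60` (a conventional default) are assumptions.

```typescript
// Sketch: reciprocal rank fusion over several ranked result lists.
// Each list contributes 1 / (k + rank) per document, with rank starting at 1.
interface FusedHit {
  id: string;
  score: number;
}

function reciprocalRankFusion(rankings: string[][], k = 60): FusedHit[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + index + 1));
    });
  }
  return Array.from(scores.entries())
    .map(([id, score]) => ({ id, score }))
    // Highest fused score first; ties broken by id for a stable order.
    .sort((a, b) => b.score - a.score || a.id.localeCompare(b.id));
}

// Hypothetical rankings: the keyword search and the vector search disagree
// on the top hit, but both return plan.md somewhere in their lists.
const lexRanking = ["plan.md", "notes.md"];
const vecRanking = ["compound.md", "plan.md"];

const fused = reciprocalRankFusion([lexRanking, vecRanking]);
console.log(fused.map((hit) => hit.id));
```

A document that appears in both lists outranks one that tops only a single list, which is the behavior the "lex carried the phrase, vec carried the meaning" observation relies on.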

Part 2: The verb test

Six stores is a lot of stores. The risk of a multi-store memory architecture is that nobody knows where to write what, the routing rule decays into folklore, drift sets in, and within a quarter the system is back to “memory” singular plus three rotting forks.

The discipline that prevents drift is a single decision rule, named and enforced. Mine is the verb test. Before you decide where a fact goes, decide how you expect to use it. The verb tells you the store.

I want to remember X.

What verb describes how I'll use X?

├─ Recall X during a conversation
│  (mid-reply fact lookup, citation, decision pointer)
│           │
│           └──▶ MemPalace
│               (mempalace_kg_add / mempalace_add_drawer)
│
├─ Read X before a conversation
│  (brief myself on a person, company, concept I'm about to engage)
│           │
│           └──▶ entities/{people,companies,concepts}/<slug>.md
│
├─ Derive a pattern across many observations of X
│  (Ryan tends to fold under pushback, our customers ask for Y, ...)
│           │
│           └──▶ MemPalace drawer ━━tunneled━━▶ entities/<slug>.md
│               (raw observations in MP, synthesis in entities)
│
├─ Preserve X verbatim
│  (Ryan's exact framing, a thesis, a rant, a phrase to immortalize)
│           │
│           └──▶ entities/originals/<date>-<slug>.md
│
├─ Durable user-fact, applies to future agent behavior
│  (preferences, feedback rules, reference pointers)
│           │
│           └──▶ auto-memory file + MEMORY.md pointer
│
└─ Rich recent context (active turn)
            │
            └──▶ session JSONL (the runtime writes this automatically)

That tree is now codified in ~/vajra/entities/RESOLVER.md, the master filing decision document. Every Claude Code session reads it (or its summary line in CLAUDE.md’s Design Principle #2) on boot. Every scribe spawn reads it. Every research agent reads it. When in doubt about where to write a fact, the agent runs the verb test and stops at the first match.

A concrete walkthrough. Suppose I tell my agent in chat: “Don’t book me on red-eyes. I won’t sleep on them and the next day is destroyed.”

Verb test:

Will an agent need to recall this mid-conversation? Sometimes (when booking flights). But the more important question is whether it should shape default behavior. It should.
Is this a durable user-fact that applies to future agent behavior? Yes. It is a preference rule.
Route: auto-memory. File: feedback_no_red_eye_flights.md. Add one line to MEMORY.md. Done.

Next: I tell my agent “Dave at our broker is the wrong person to ask about residency. He’s a CPA, not a tax attorney. Loop in Lily for residency questions.”

Verb test:

Verbatim worth preserving? Not really; the sentence is a routing fact, not a framing.
Read-before-conversation about Dave? Yes. Next time an agent prepares to email Dave, it should know his role boundaries.
Recall-during-conversation about residency? Yes. When an agent is mid-conversation about residency strategy, it should remember the routing rule.
Both verbs apply. Per the resolver, pick the primary use and cross-link the other. Primary use is read-before-conversation (the routing rule is most useful when an agent is briefing itself on Dave). Route: entities/people/dave.md Open Threads + a timeline entry. Cross-link: a MemPalace drawer in tax/decisions tunneled to the entity page.

The verb test is not algorithmic. It is judgmental. But it is judgmental over a fixed dimension (intended-use verb) instead of over a malleable one (content length, content type, agent preference). The same fact routed by length next month would land in a different store. Routed by verb, it lands in the same store consistently.

That stability matters because most memory drift is not a single big bad decision. It is the third reasonable-looking write to the wrong store, six weeks after the architecture got laid down, when the rule has decayed enough that nobody re-derives it from first principles. The verb test resists this because the question “what verb describes how I’ll use this?” is the same question every time.
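The decision tree reduces to a lookup keyed on the verb. A minimal sketch, with the store descriptions copied from the tree; the `Verb` identifiers and the `fileByVerb` helper are hypothetical names, and choosing which verb is primary remains the judgment call described above:

```typescript
// Sketch: the verb test as a table. The judgment (which verb applies)
// stays with the caller; the table only makes the destination consistent.
type Verb =
  | "recall-during-conversation"
  | "read-before-conversation"
  | "derive-pattern"
  | "preserve-verbatim"
  | "durable-user-fact"
  | "recent-context";

const STORE_FOR_VERB: Record<Verb, string> = {
  "recall-during-conversation": "MemPalace",
  "read-before-conversation": "entities/{people,companies,concepts}/<slug>.md",
  "derive-pattern": "MemPalace drawer tunneled to entities/<slug>.md",
  "preserve-verbatim": "entities/originals/<date>-<slug>.md",
  "durable-user-fact": "auto-memory file + MEMORY.md pointer",
  "recent-context": "session JSONL",
};

interface FilingDecision {
  primary: string;      // where the fact is written
  crossLinks: string[]; // stores that get a pointer back to the primary
}

// Pick the primary use; any other verbs that also apply become cross-links.
function fileByVerb(primary: Verb, alsoApplies: Verb[] = []): FilingDecision {
  const crossLinks = alsoApplies
    .filter((verb) => verb !== primary)
    .map((verb) => STORE_FOR_VERB[verb]);
  return { primary: STORE_FOR_VERB[primary], crossLinks };
}

// The two walkthroughs from the text.
const redEyeRule = fileByVerb("durable-user-fact");
const daveRouting = fileByVerb("read-before-conversation", ["recall-during-conversation"]);
console.log(redEyeRule, daveRouting);
```

Because `STORE_FOR_VERB` is typed as `Record<Verb, string>`, adding a seventh verb without naming its store fails to compile, which is a small guard against the routing rule decaying into folklore.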

Part 3: How writes actually happen (the scribe pipeline)

Routing rules are useless if writes don’t happen automatically. A six-store memory architecture that depends on me manually deciding where each fact goes is a six-store memory architecture that has zero facts after three weeks.

The automatic-write pipeline is a sub-agent called scribe. It fires on every Telegram message from me that crosses an 80-character threshold, on inbound emails marked as important by my email-classification worker, on new or edited calendar events, and on dropped audio files in entities/meetings/inbox/. The four trigger sources cover ~90% of where my actual thinking gets externalized.

The fire-and-forget contract is strict. The gateway’s receive thread (where Telegram webhooks land) cannot block on scribe work. Every write to a log file, every spawn, every assertion happens in a detached child after setImmediate defers to the next tick. From gateway/index.ts:3422-3463:

export function maybeSpawnScribe(
  trigger: ScribeTrigger,
  payload: string,
  opts: ScribeSpawnOptions = {},
): boolean {
  if (!SCRIBE_ENABLED) return false
  if (!payload || !payload.trim()) return false

  if (trigger === 'telegram') {
    if (opts.isSystem) return false
    if (opts.isBot) return false
    if (opts.senderId && !ALLOWED_USERS.includes(opts.senderId)) return false
    if (payload.startsWith('SYSTEM:')) return false
    if (payload.startsWith('/')) return false // commands
    if (payload.trim().length < 80) return false
  }

  if (!opts.skipBudget) {
    const acq = scribeTryAcquire(scribeBudgetState, trigger, opts.now)
    if (!acq.ok) {
      console.log(`SCRIBE: rejected trigger=${trigger} reason=${acq.reason}`)
      return false
    }
  }

  // Get the actual spawn work off the receive thread. setImmediate
  // defers until the current microtask queue drains, then the
  // detached child carries the rest. The maybeSpawnScribe call
  // itself is ~2 microseconds.
  setImmediate(() => {
    try {
      doScribeSpawn(trigger, payload, opts)
    } catch (e) {
      console.error(`SCRIBE: spawn failed: ${e}`)
      scribeRelease(scribeBudgetState, false)
    }
  })
  return true
}

The pipeline:

sequenceDiagram
    participant TG as Telegram
    participant GW as Gateway (Bun.serve)
    participant SB as scribe-budget
    participant CH as detached child
    participant WC as whisper-cli
    participant HA as Haiku 4.5 (extract)
    participant FS as entities/ + .raw/

    TG->>GW: webhook (text or voice message)
    alt voice
        GW->>WC: transcribeAudioAsync(audio)
        WC-->>GW: text
    end
    GW->>SB: tryAcquire(trigger=telegram)
    SB-->>GW: ok / rate-limit reason
    GW->>GW: setImmediate (defer off receive thread)
    GW->>CH: spawn-agent.sh scribe <slug> detached
    CH->>HA: claude -p --model haiku-4-5 + scribe.md
    HA->>HA: classify: entity-mention / verbatim / pattern
    HA->>FS: append timeline entry to entities/<type>/<slug>.md
    Note over HA,FS: AND/OR write entities/originals/<date>.md
    Note over HA,FS: AND/OR add MemPalace drawer for derived pattern
    CH-->>GW: completed (writes spawn row to running-agents.jsonl)
    GW->>SB: release(success)

Voice messages get transcribed by whisper-cli first (the binary at /opt/homebrew/bin/whisper-cli, running the large-v3-turbo model). Text messages skip transcription. Either way, the resulting text becomes scribe’s payload. Scribe spawns claude -p with the Haiku 4.5 model on a dedicated API key (deliberately isolated from my interactive Claude Code subscription quota, so a runaway scribe can never throttle my own sessions). The Haiku spawn reads prompts/scribe.md, classifies the input into one or more of {entity-mention, verbatim-original, derived-pattern, nothing-worth-keeping}, and writes the appropriate files.

A real ceremony-gear message lands in entities/originals/2026-04-13-my-current-thinking-for-v1-is-lyra-8-plu.md as a verbatim original, with the exact-phrase block preserved, plus a one-line timeline entry on entities/people/ryan-junee.md citing the ceremony-gear principle. Two files written from one Telegram message. The agent that decided to write them was Haiku 4.5 running for under three seconds in a detached child while the gateway’s receive thread had already moved on to the next webhook.

Budget discipline matters here. Scribe is capped at 500 spawns per day. Inflight is capped at 4 concurrent. A watchdog reaps any scribe child that runs longer than 300 seconds. The reason is unglamorous: an unbounded extraction pipeline burns API spend without bound, and one bad prompt or one stuck child can spiral fast. The cap is the safety net; the verb test is what makes the extracted writes useful.
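The three caps (500 spawns per day, 4 in flight, 300-second watchdog) fit in a few lines of state. A minimal sketch under assumptions: this is not the real scribeTryAcquire, whose internals are not shown; the names, the UTC day boundary, and the reason strings are all hypothetical.

```typescript
// Sketch: a daily-plus-inflight spawn budget with a watchdog predicate.
interface SpawnBudgetState {
  day: string;         // UTC day the counter applies to, e.g. "2026-05-22"
  spawnsToday: number;
  inflight: number;
}

interface SpawnBudgetLimits {
  maxPerDay: number;
  maxInflight: number;
}

type AcquireResult = { ok: true } | { ok: false; reason: "daily-cap" | "inflight-cap" };

function tryAcquireSpawn(state: SpawnBudgetState, limits: SpawnBudgetLimits, now: Date): AcquireResult {
  const today = now.toISOString().slice(0, 10);
  if (state.day !== today) {
    // New day: reset the daily counter. In-flight children carry over.
    state.day = today;
    state.spawnsToday = 0;
  }
  if (state.spawnsToday >= limits.maxPerDay) return { ok: false, reason: "daily-cap" };
  if (state.inflight >= limits.maxInflight) return { ok: false, reason: "inflight-cap" };
  state.spawnsToday += 1;
  state.inflight += 1;
  return { ok: true };
}

// Call when a child exits, whether it succeeded or failed.
function releaseSpawn(state: SpawnBudgetState): void {
  state.inflight = Math.max(0, state.inflight - 1);
}

// Watchdog predicate: has this child outlived its allowance?
function isOverdue(startedAtMs: number, nowMs: number, maxRuntimeMs = 300 * 1000): boolean {
  return nowMs - startedAtMs > maxRuntimeMs;
}

const scribeLimits: SpawnBudgetLimits = { maxPerDay: 500, maxInflight: 4 };
const budget: SpawnBudgetState = { day: "", spawnsToday: 0, inflight: 0 };
console.log(tryAcquireSpawn(budget, scribeLimits, new Date("2026-05-22T12:00:00Z")));
```

Returning a reason alongside the rejection keeps the log line useful: "daily-cap" and "inflight-cap" call for different responses from whoever reads it.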

Part 4: Inspirations and prior art

Nothing about this architecture is novel in its parts. The combination is mine; the parts were taken from other people who already figured them out.

GBrain (Garry Tan) is the closest direct ancestor. GBrain is a personal knowledge system organized as one file per entity, with a rewritable compiled head and an append-only timeline below a horizontal rule. The frontmatter discipline (slug, aliases, confidence, tier) is GBrain’s. The writer-matrix rule (compiled head is human-editable, timeline is agent-appendable) is GBrain’s. The provenance footnote style (^[src:observed,conf:high,by:ryan,2026-03-31]) is GBrain’s. My entities/ directory is a direct port of the GBrain pattern with no improvements to the file format. The plan document at ~/vajra/docs/plans/2026-04-10-008-feat-gbrain-borrow-list-plan.md is literally titled “Steal GBrain’s Entity-First Wiki Into Vajra.”

MemGPT (Charles Packer et al., UC Berkeley) is the tiered-memory precedent. The MemGPT paper showed how to structure an agent’s memory with a small in-context window plus larger out-of-context archival storage, with explicit transition primitives between layers. The specific rule I borrow most directly: when rewriting a compiled-truth head, never hand the LLM the existing head and ask it to revise in place. The LLM will hew to the existing head’s phrasing and miss real changes from the timeline. Instead, re-derive the head with the timeline and the prior head supplied as two separate inputs, then diff the result against the prior head. The dreaming pipeline in my system (the corrective re-derivation cron) follows that rule.

PARA (Tiago Forte) gives me the long-form workstream organization. Projects, Areas, Resources, Archive. My Obsidian vault is structured this way. Projects gets a folder per active workstream, each with its own STATUS.md. When a project goes dormant, the folder moves to Archive. PARA’s contribution is the discipline of asking “is this a project (bounded outcome) or an area (ongoing concern) or a resource (reference) or archive?” before filing. Without that question, I had one giant junk folder.

Wikidata is the source-rank-plus-confidence model. Every claim above the horizontal rule in an entity page has a source label (observed, self-described, inferred) and a confidence grade (low, medium, high). When two agents disagree on a fact, the higher source-rank wins. When sources tie, the more recent claim wins. The discipline is borrowed wholesale.
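The tie-breaking rule is mechanical enough to sketch. Assumptions are flagged in the code: the post names the three source labels but not their order, so the ranking observed > self-described > inferred is a guess, and the parser expects claim and footnote on a single line even though the excerpt earlier wraps them across two.

```typescript
// Sketch: parse a provenance footnote and pick a winner between two claims.
type SourceLabel = "observed" | "self-described" | "inferred";
type Grade = "low" | "medium" | "high";

interface Claim {
  text: string;
  src: SourceLabel;
  conf: Grade;
  by: string;
  date: string; // YYYY-MM-DD
}

// ASSUMPTION: this ordering is not stated in the post.
const SOURCE_RANK: Record<SourceLabel, number> = {
  observed: 3,
  "self-described": 2,
  inferred: 1,
};

const FOOTNOTE =
  /\^\[src:(observed|self-described|inferred),conf:(low|medium|high),by:([^,\]]+),(\d{4}-\d{2}-\d{2})\]/;

function parseClaim(line: string): Claim | null {
  const m = FOOTNOTE.exec(line);
  if (m === null) return null;
  return {
    text: line.slice(0, m.index).replace(/^\s*-\s*/, "").trim(),
    src: m[1] as SourceLabel,
    conf: m[2] as Grade,
    by: m[3] as string,
    date: m[4] as string,
  };
}

// Higher source rank wins; on a tie, the more recent claim wins.
function preferredClaim(a: Claim, b: Claim): Claim {
  const rankDiff = SOURCE_RANK[a.src] - SOURCE_RANK[b.src];
  if (rankDiff !== 0) return rankDiff > 0 ? a : b;
  return b.date > a.date ? b : a; // ISO dates sort correctly as strings
}

const observedClaim = parseClaim("- Jane works at Acme ^[src:observed,conf:medium,by:ryan,2026-03-31]");
const inferredClaim = parseClaim("- Jane works at Initech ^[src:inferred,conf:high,by:scribe,2026-05-01]");
if (observedClaim && inferredClaim) {
  console.log(preferredClaim(observedClaim, inferredClaim).text);
}
```

Note that the confidence grade plays no part in the comparison: as described, source rank decides first and recency breaks ties, so a high-confidence inference still loses to a medium-confidence observation.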

CQRS / event sourcing is the mental model behind the compiled-head-plus-timeline pattern. The timeline is the event log. The compiled head is the projection. Reads serve the projection; writes append to the log; periodically the projection gets re-derived from the log. If you have written CRUD-with-an-audit-trail you know the shape. Fowler’s writeup is the canonical reference.
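The log-and-projection shape in miniature, as a sketch rather than the dreaming pipeline itself (which uses an LLM to re-derive the head): here the "head" is just the latest value per field, and the event shape and field names are hypothetical.

```typescript
// Sketch: an append-only log and a projection re-derived from it.
interface FactEvent {
  date: string;  // YYYY-MM-DD
  field: string; // e.g. "employer"
  value: string;
}

// Writes only ever append; the existing log is never mutated.
function appendEvent(log: readonly FactEvent[], event: FactEvent): FactEvent[] {
  return log.concat([event]);
}

// The projection is a pure function of the log: sort by date, fold, and
// let later events overwrite earlier ones for the same field.
function projectHead(log: readonly FactEvent[]): Record<string, string> {
  const head: Record<string, string> = {};
  const ordered = log.slice().sort((a, b) => a.date.localeCompare(b.date));
  for (const event of ordered) {
    head[event.field] = event.value;
  }
  return head;
}

let factLog: FactEvent[] = [];
factLog = appendEvent(factLog, { date: "2026-04-10", field: "employer", value: "Acme" });
factLog = appendEvent(factLog, { date: "2026-04-10", field: "role", value: "engineer" });
factLog = appendEvent(factLog, { date: "2026-05-01", field: "employer", value: "Initech" });

console.log(projectHead(factLog));
```

Because the head can always be rebuilt from the log, a bad projection is recoverable; a bad edit to the log is not, which is why the log is the part kept append-only.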

MemPalace is an MCP server built by my collaborator for this system. It exposes the structured-KG layer as MCP tools (kg_add, kg_query, search, traverse, find_tunnels, diary_read, diary_write). The schema (wings, rooms, drawers, tunnels) is graph-shaped, with similarity scoring at the drawer level for semantic recall. It is the only piece of the substrate I did not directly choose off the shelf, and it earns its keep by making “recall mid-conversation” actually fast.

QMD is a local vector-plus-BM25 search engine that indexes markdown collections. The embedding model is embeddinggemma-300M-Q8_0 (Google’s quantized small embedder). Hybrid lex-plus-vec retrieval with optional HyDE expansion. Standard hybrid-search architecture, executed locally.

Obsidian is the markdown vault container that holds everything above. I write in Obsidian, my agents read and edit the same files, the vault is a git repo, every change is committed. The wikilink syntax ([[entities/people/garry-tan]]) gives me human-readable cross-references that work in both Obsidian’s UI and as plain markdown for agents.

Each of these gets credit because each of them solves a problem I would have solved badly on my own. The novelty of my system, if there is any, is the verb-test routing rule and the fire-and-forget scribe pipeline tying the writes together. Everything else is borrowed and credited.

Part 5: Open problems

I have been running this architecture for about six weeks in its current form. Long enough to find the edges. Short enough that several of them are not yet patched.

Compiled-head drift. The MemGPT rule (re-derive from timeline plus prior head as two separate inputs, then diff) prevents the worst kind of drift. It does not eliminate it. The dreaming cron, which is the corrective re-derivation pass, runs every six hours. It has been broken three separate times since I built it: a CLI flag deprecation, a prompt that hit a token budget on entity pages with long timelines, a regression in the scribe-vs-dreaming write contention check. Each time the bug was silent (the cron fires, the prompt errors, the agent posts a result that says “no changes,” nothing alerts). I have a passive-health-check rule that says any cron with zero-write outcomes for more than seven days posts an alert; that rule was the tripwire that caught two of the three regressions. The third I caught by hand. The corrective pass is not yet bulletproof.
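The passive health check amounts to "how long since this cron last wrote anything?" A minimal sketch, assuming each run records a finish time and a write count; the `CronRun` shape and function names are hypothetical, and a cron that has never written is measured from its first recorded run:

```typescript
// Sketch: alert when a cron has produced zero writes for more than N days.
interface CronRun {
  finishedAt: Date;
  writes: number; // files or entries changed by this run
}

const DAY_MS = 24 * 60 * 60 * 1000;

// Start of the current quiet period: the last run that wrote something,
// or the first run on record if nothing was ever written.
function quietSince(runs: readonly CronRun[]): Date | null {
  let lastWrite: Date | null = null;
  let firstRun: Date | null = null;
  for (const run of runs) {
    if (firstRun === null || run.finishedAt.getTime() < firstRun.getTime()) {
      firstRun = run.finishedAt;
    }
    if (run.writes > 0 && (lastWrite === null || run.finishedAt.getTime() > lastWrite.getTime())) {
      lastWrite = run.finishedAt;
    }
  }
  return lastWrite ?? firstRun;
}

function shouldAlert(runs: readonly CronRun[], now: Date, maxQuietDays = 7): boolean {
  const since = quietSince(runs);
  if (since === null) return false; // no runs recorded: nothing to judge
  return now.getTime() - since.getTime() > maxQuietDays * DAY_MS;
}

// A cron that keeps firing every six hours but silently stopped writing.
const dreamingRuns: CronRun[] = [
  { finishedAt: new Date("2026-05-01T00:00:00Z"), writes: 3 },
  { finishedAt: new Date("2026-05-01T06:00:00Z"), writes: 0 },
  { finishedAt: new Date("2026-05-08T18:00:00Z"), writes: 0 },
];
console.log(shouldAlert(dreamingRuns, new Date("2026-05-09T00:00:00Z")));
```

The check keys on writes, not on exit status, which is what lets it catch the "cron fires, reports no changes" failure mode described above.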

Confidence drift. Every claim in an entity page carries a confidence grade (low / medium / high). Nothing currently re-evaluates confidence after capture. A claim graded “high confidence” in February stays “high” forever, even if no agent has re-verified it in three months and the world has moved. A stale high-confidence claim is worse than a stale low-confidence claim because it reads as authoritative. The corrective pass needs to age confidence down on claims without recent re-verification. It does not, yet.
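
One possible shape for the missing aging step, as a sketch only. The three grades come from the text; the aged_confidence function, the 90-day step, and the one-grade-per-step decay are assumptions chosen for illustration, not a rule the system implements.

```python
from datetime import date, timedelta

GRADES = ["low", "medium", "high"]  # ordered weakest to strongest


def aged_confidence(grade: str, last_verified: date, today: date,
                    step: timedelta = timedelta(days=90)) -> str:
    """Drop one grade per full `step` since last verification, floored at 'low'."""
    # timedelta // timedelta yields an int; clamp so future dates never raise a grade.
    elapsed_steps = max(0, (today - last_verified) // step)
    index = max(0, GRADES.index(grade) - elapsed_steps)
    return GRADES[index]
```

Re-verifying a claim would reset last_verified, so only the claims nobody has looked at lose their authority.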

Tunneling between MemPalace and entities/. The verb test routes derived-patterns to “a MemPalace drawer tunneled to an entities/ page.” The mechanical implementation of that tunnel is partial. MemPalace exposes find_tunnels and stores edges between rooms, but the second half (auto-updating the entities page when the MemPalace drawer changes) is manual today. I add a tunnel and a timeline entry on the entities page. If the drawer grows new observations later, the entities page does not auto-reflect them until a human or the dreaming pass walks the tunnel. Manual maintenance has held for six weeks; it won’t hold at scale.

Originals deduplication. 387 originals on disk. The scribe extraction heuristic occasionally captures the same framing twice, three days apart, when I say roughly the same thing. There is no automated dedup pass. The weekly brain-lint script does some grep-based clustering, but it is fuzzy-string-based, not semantic. I expect to lift the duplicate detection to a vec query at some point. It is on the backlog.
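
A sketch of what a vector-based duplicate pass could look like, assuming each original already has an embedding computed elsewhere. The near_duplicates function, the 0.9 threshold, and the dictionary input shape are hypothetical; nothing here calls QMD or any real embedding API.

```python
import math
from itertools import combinations
from typing import Dict, List, Tuple


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def near_duplicates(vectors: Dict[str, List[float]],
                    threshold: float = 0.9) -> List[Tuple[str, str]]:
    """Return every pair of paths whose embeddings are at least `threshold` similar."""
    return [
        (a, b)
        for a, b in combinations(sorted(vectors), 2)
        if cosine(vectors[a], vectors[b]) >= threshold
    ]
```

An all-pairs comparison is quadratic, which is fine at a few hundred originals and would need an index at larger scale.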

QMD index lag. QMD is a periodic indexer, not a real-time one. Writes show up after the next qmd update cycle (currently every 30 minutes). For an interactive agent that just wrote something to entities/ and immediately wants to query it via QMD, the freshly-written content is invisible for up to half an hour. The agent works around this by reading the file directly when it knows the path, but for “search across all entities for X” queries the index lag is real.

Multi-tenant memory. Everything above is single-user. I am the only writer. The whole architecture assumes one MemPalace database, one auto-memory directory, one entities/ tree. Productizing this for multiple users (which is the next step, see Part 6) requires per-tenant isolation at every layer, plus cross-tenant access control for shared workspaces. I have a design for this; I have not built it. The risk is real: a multi-tenant memory architecture done wrong leaks one user’s entities/ into another user’s recall path. The privacy quarantine has to be load-bearing, not a soft convention.

The wrong-session-resumed shape. I wrote about this one in the prior post on failure modes. It is a memory-shape failure as much as it is a routing failure: the routing record (a cache) said “session X is active” while the disk (the truth) said “session Y had the most recent turn.” On respawn the gateway trusted the cache. The agent surfaced no memory of work it had actually done, because it had resumed the wrong transcript. The patch is to reconcile the cache against disk on every health probe, but the deeper lesson is that any cache over a memory store is itself a memory store, with its own staleness profile, and treating it as authoritative is a category error.
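
The reconcile-against-disk step can be sketched like this. It assumes one JSONL transcript per session in a single directory, named by session id, and uses file modification time as a stand-in for “most recent turn”; the function names and that layout are assumptions for the example, not the gateway’s actual code.

```python
from pathlib import Path
from typing import Optional


def latest_session_on_disk(session_dir: Path) -> Optional[str]:
    """Return the id of the most recently modified transcript, or None if there are none."""
    transcripts = list(session_dir.glob("*.jsonl"))
    if not transcripts:
        return None
    return max(transcripts, key=lambda p: p.stat().st_mtime).stem


def reconcile_active_session(cached_id: Optional[str], session_dir: Path) -> Optional[str]:
    """Prefer what the disk says; fall back to the cache only when the disk is empty."""
    on_disk = latest_session_on_disk(session_dir)
    return on_disk if on_disk is not None else cached_id
```

Calling this on every health probe means the cache can be wrong for at most one probe interval before the disk overrules it.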

Part 6: What’s next for Neutron

Neutron is the open-source, self-hosted harness I extracted from this substrate, with the source at github.com/rjunee/neutron (Apache 2.0). Its memory layer is built on GBrain: the entity-first, compiled-head-plus-timeline format from store #4 above is the productized core, not all six of my personal stores. The single-user, self-hosted case is the default. Most of what follows is how that core grows up for teams who invite each other into shared projects: the same architecture, with the edges I’m still sanding down.

The shipping path:

Per-tenant SessionDB. Every tenant gets its own SQLite database, its own GBrain-backed entities/ tree, its own Projects/Resources/Archive folder. The cross-tenant API tags every response with origin_tenant, and a tenant’s scribe pipeline must refuse to persist content marked with a foreign origin (or persist under a quarantined namespace). This closes the “Ryan’s solo-project agent gets tricked into pasting his Amascence data into his personal MemPalace” attack surface.
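
A minimal sketch of the refuse-or-quarantine rule, assuming every payload carries the origin_tenant tag. The Payload type, the persist_namespace function, the ForeignOriginError exception, and the namespace strings are hypothetical names for illustration, not Neutron’s API.

```python
from dataclasses import dataclass


class ForeignOriginError(Exception):
    """Raised when a tenant is asked to persist content that came from another tenant."""


@dataclass(frozen=True)
class Payload:
    origin_tenant: str
    body: str


def persist_namespace(payload: Payload, tenant_id: str, quarantine: bool = False) -> str:
    """Decide where (or whether) this tenant's scribe may persist the payload."""
    if payload.origin_tenant == tenant_id:
        return f"{tenant_id}/entities"
    if quarantine:
        # Foreign content is kept, but fenced off from the tenant's normal recall path.
        return f"{tenant_id}/quarantine/{payload.origin_tenant}"
    raise ForeignOriginError(
        f"tenant {tenant_id!r} refused content from {payload.origin_tenant!r}"
    )
```

Making refusal the default and quarantine the opt-in keeps the check load-bearing: forgetting a flag fails closed.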

The user-tenant vs workspace-tenant flavor. Individual users get a user-tenant (one user, has a Telegram bot, has a personal MemPalace). Companies get a workspace-tenant (many members, no bot, member-attributed writes, internal-only API). Both flavors use the same memory architecture; the access control differs.

PARA convention as canonical project shape. Every tenant’s vault is PARA-organized by default. Projects/<slug>/STATUS.md is the canonical workstream-state file. The convention is enforced by the project-creation Core, not by tribal knowledge.

Originals capture as a Neutron Core. The Telegram-voice-to-originals pipeline (scribe plus whisper-cli plus Haiku extraction) becomes a first-class Core that any Neutron tenant can enable. The capture surface generalizes beyond Telegram (web app, mobile app, email, calendar) but the extraction shape stays.

Continuous dreaming pass. The corrective re-derivation pass applies the MemGPT discipline more rigorously than my hacked-up six-hour cron does. The deliverable: every entity page that has new timeline entries since the last successful re-derive gets queued for re-derivation, with the prior head and the timeline-delta as separate prompt inputs, never commingled. The output is a compiled-head proposal plus a diff. Above a confidence threshold the diff auto-applies; below threshold it routes to me for review.
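
The queueing and routing logic can be sketched roughly as below. The data shapes, the function names, and the numeric 0.8 threshold are assumptions for the example (the text does not say how proposal confidence is scored), and the model call that produces the proposal is left out entirely.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, List, Optional


@dataclass
class TimelineEntry:
    at: datetime
    text: str


@dataclass
class EntityPage:
    slug: str
    head: str  # current compiled head
    timeline: List[TimelineEntry]
    last_rederived_at: Optional[datetime]  # None means never re-derived


def rederive_queue(pages: List[EntityPage]) -> List[Dict]:
    """Queue pages with timeline entries newer than their last successful re-derive."""
    jobs = []
    for page in pages:
        delta = [
            e for e in page.timeline
            if page.last_rederived_at is None or e.at > page.last_rederived_at
        ]
        if delta:
            # Prior head and timeline delta stay separate fields, never one merged string.
            jobs.append({"slug": page.slug, "prior_head": page.head, "timeline_delta": delta})
    return jobs


def route(proposal_confidence: float, threshold: float = 0.8) -> str:
    """Auto-apply at or above the threshold, otherwise send to human review."""
    return "auto_apply" if proposal_confidence >= threshold else "human_review"
```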

In-app doc interface deprecating Obsidian. Long-term, Neutron tenants edit their vault in an in-app markdown UI that talks to the same on-disk files. Obsidian stays usable for power users (the files are plain markdown), but the default tenant does not need to install anything. PR #158 in the Neutron repo shipped the first read surface (file tree plus markdown viewer). The editor lands next.

The forward path is in ~/repos/neutron/docs/engineering-plan.md. The substrate that ships these changes is Trident, which I wrote about in the prior post. Trident takes spec’d, bounded memory-layer work and ships it autonomously; I write the brief, Argus reviews, the gateway merges. Several of the memory-architecture sprints in the corpus already shipped this way.

I’m productizing this substrate as Neutron for operators who want a real agent system without rebuilding it themselves. Separately I take on a small number of consulting engagements per quarter for teams shipping into production.
