Every AI coding session starts from zero. You open Claude Code, and the agent has no idea what you built yesterday, what rules your project follows, what you deployed last week, or what your users are complaining about. You spend the first five minutes re-explaining context that the agent knew perfectly well an hour ago. This is the cold start problem, and it's costing developers an hour every week.
This isn't a limitation you have to accept. The Model Context Protocol (MCP) lets you connect external services to your coding tools — and one of the most useful things you can connect is a persistent memory layer.
This guide walks through what that looks like in practice, how it works, and how to set it up.
What MCP Is (30-Second Version)
MCP is an open protocol that lets AI coding tools connect to external services. Your coding tool (Claude Code, Cursor, Windsurf) is the MCP client. External services are MCP servers. The client calls tools exposed by the server — reading data, writing data, triggering actions.
Think of it as plugins for your AI agent. Except instead of adding features to the editor, you're adding capabilities to the AI itself.
When you connect a memory service via MCP, your agent gains new abilities: it can store knowledge, retrieve it later, and build an accumulating understanding of your project that survives across sessions.
What "Persistent Memory" Actually Means
There's a difference between memory and a flat file.
CLAUDE.md is a flat file. It's loaded into every session, it contains static text, and the agent reads it passively. It's useful for small, stable sets of instructions — but as we explored in Why CLAUDE.md Breaks at Scale, it degrades past a few hundred lines.
Persistent memory is active. The agent writes to it during development — storing decisions as they're made, warnings as they're discovered, patterns as they're established. Future sessions query it selectively, loading only what's relevant to the current task. The knowledge base grows over time as the project evolves.
The distinction matters because a flat file stops scaling after a few hundred lines. Persistent memory scales to thousands of entries because the agent only loads what it needs.
The Three Things Your Agent Needs to Remember
Not everything is worth storing. Effective persistent memory focuses on three categories:
1. Rules and Constraints
The non-negotiable stuff. "All CSS must be in external files." "Never use offset pagination." "Always validate input before database queries." These are the guardrails that prevent the agent from making mistakes. Without them, every session is a fresh opportunity to reintroduce bugs you've already fixed.
Rules are high-priority and universally relevant — they should be loaded at the start of every session regardless of what task you're working on.
2. Decisions and Rationale
Why things are the way they are. "We chose cursor-based pagination because offset breaks on large datasets with concurrent inserts." "We use Redis for sessions, not database sessions, because of the multi-server setup." "We rejected GraphQL in favour of REST because our clients are all server-to-server."
Without stored decisions, the agent will periodically suggest approaches you've already considered and rejected. Storing the rationale prevents re-litigation and keeps development moving forward.
3. Warnings and Gotchas
The landmines. "CI4 4.7 changed the rawData default — this breaks encryption of existing data." "The Stripe webhook signature requires the raw request body, not the parsed JSON." "This endpoint returns 403 for revoked API keys, not 401."
These are the things that cost you hours when the agent doesn't know about them. Every production bug, every framework quirk, every integration gotcha — stored once, known forever.
How Structured Memory Works
The best memory systems don't just store text — they store typed, tagged, queryable entries. Each piece of knowledge has metadata that makes it findable:
Type tells you what kind of knowledge it is. A rule is different from a decision, which is different from a warning. The agent handles them differently — rules are constraints to follow, decisions are context to respect, warnings are dangers to avoid.
Tags make entries searchable. An entry tagged css, frontend, styling is found when the agent is working on CSS. An entry tagged stripe, payments, webhooks is found when the agent is working on the payment integration.
Priority determines loading order. High-priority entries are loaded every session. Normal-priority entries are loaded when relevant. Low-priority entries are loaded on demand.
Scope groups entries by area. All entries scoped to auth are loaded together when working on authentication. All entries scoped to deployment are loaded when deploying.
This structure means the agent can make targeted queries:
- "Give me all high-priority rules" — loads the universal guardrails
- "Give me everything tagged css" — loads CSS-specific knowledge for a frontend task
- "Give me all warnings scoped to database" — loads database gotchas before a migration
Instead of reading a 500-line file and hoping attention lands on the right parts, the agent loads 30 lines of precisely relevant context.
Setting Up Persistent Memory with Minolith
Minolith is a hosted memory service designed specifically for AI coding agents. The Context service provides structured, typed entries with tag-based filtering — and it's free.
Here's how to set it up from zero.
Step 1: Create an Account and Project
Go to app.minolith.io/register. Create an account, create a project, and copy your API key from the project settings. The key starts with `mlth_`.
No credit card required. The 14-day trial includes 500 credits. Context operations don't consume credits, but you still need an active account after the trial ends.
Step 2: Connect Your Coding Tool
Claude Code:
```shell
claude mcp add --transport http minolith https://mcp.minolith.io \
  --header "Authorization: Bearer mlth_your_api_key_here"
```
Cursor, Windsurf, or other MCP clients:
Add to your MCP configuration file:
```json
{
  "mcpServers": {
    "minolith": {
      "url": "https://mcp.minolith.io",
      "transport": "http",
      "headers": {
        "Authorization": "Bearer mlth_your_api_key_here"
      }
    }
  }
}
```
That's it. No npm install. No local server. No Docker. Minolith is a hosted API — you connect to it over HTTPS.
Step 3: Store Your First Entry
Start a coding session and tell your agent to store a project rule:
```
Store a context rule: "All CSS must be in external stylesheets.
Never use inline styles or style blocks in templates."
Tag it with css, styling, frontend. Set priority to high.
```
The agent calls the `store_context` MCP tool:
```json
{
  "type": "rule",
  "title": "All CSS must be in external stylesheets",
  "body": "Never use inline style attributes. Never create <style> blocks in templates. All styles go in the shared stylesheet.",
  "tags": ["css", "styling", "frontend"],
  "priority": "high"
}
```
That entry now exists in your project's knowledge base. Every future session can query it.
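For the curious, here is roughly what crosses the wire. MCP is built on JSON-RPC 2.0, so a tool call is a `tools/call` request wrapping the tool name and arguments. Your coding tool constructs and sends this for you; the sketch below just builds the request payload so you can see its shape.

```python
import json

# A JSON-RPC 2.0 "tools/call" request, as defined by the MCP spec.
# The tool name and arguments mirror the store_context example above.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "store_context",
        "arguments": {
            "type": "rule",
            "title": "All CSS must be in external stylesheets",
            "tags": ["css", "styling", "frontend"],
            "priority": "high",
        },
    },
}

print(json.dumps(request, indent=2))
```

The MCP client sends this over the HTTPS connection you configured in Step 2, with the `Authorization` header identifying your project.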
Step 4: Query Context at Session Start
Next session, tell your agent to load project context:
```
Load all high-priority context entries for this project.
```
The agent calls `get_context` with `priority: "high"` and gets back your rule — plus any other high-priority entries you've stored. The agent now knows the CSS rule before it writes a single line of code.
For task-specific context:
```
Load all context entries tagged "auth" — I'm working on the login flow today.
```
The agent loads only auth-related rules, decisions, warnings, and patterns. Focused, relevant, no noise.
What to Store (And When)
The best time to store context is the moment you learn something. Not after the session. Not during a documentation sprint. Right now, while the detail is fresh.
During development:
- You make an architectural decision → store it as a `decision` with the rationale
- You hit a framework gotcha → store it as a `warning` immediately
- You establish a code pattern → store it as a `pattern` so it's followed consistently
- You find a bug → store it as a `bug` with reproduction steps
- You deploy → log it as an `event` (immutable, becomes part of the project timeline)
- You discover a dependency constraint → store it as a `dependency`
- You apply a temporary workaround → store it as a `workaround` with removal criteria
The agent does the storing. You don't need to write entries manually. Tell the agent what you learned and it creates the structured entry with the right type, tags, and priority. The MCP tool handles the rest.
Over time, your project accumulates a comprehensive knowledge base — not because anyone sat down and documented everything, but because knowledge was captured in the moment it was discovered.
The Session Start Workflow
After a few weeks of storing context, your session start workflow looks like this:
1. Discover what's available. The agent calls `list_tags` and `list_scopes` to see what knowledge exists in the project. This informs the queries that follow.
2. Load critical rules. The agent calls `get_context` with `priority: "high"` to load non-negotiable rules and warnings. This set stays small and focused — maybe 10-20 entries.
3. Check recent events. The agent calls `get_recent_events` to see what happened since the last session — deployments, migrations, incidents. This gives situational awareness without the agent having to ask "what did we do last time?"
4. Load task-specific context. If you're working on CSS, the agent loads CSS-tagged entries. If you're working on the API, it loads API-scoped entries. Only what's relevant.
This takes about 2 seconds. The agent makes 3-4 MCP calls and starts the session with full project awareness. Compare that to spending 5 minutes re-explaining your project, or hoping a 500-line CLAUDE.md is read correctly.
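The four-step sequence can be sketched as a small helper. The `call_tool` function here is a hypothetical wrapper around the MCP connection, and the keyword arguments are illustrative; the tool names come from the workflow above.

```python
def start_session(call_tool, todays_tags):
    """Run the session-start workflow: 3 fixed calls plus one per task tag."""
    ctx = {}
    ctx["tags"] = call_tool("list_tags")                        # 1. discover what exists
    ctx["rules"] = call_tool("get_context", priority="high")    # 2. critical rules
    ctx["events"] = call_tool("get_recent_events")              # 3. situational awareness
    ctx["task"] = [call_tool("get_context", tag=t)              # 4. task-specific context
                   for t in todays_tags]
    return ctx

# Stub for demonstration: records which tools were called.
calls = []
def fake_tool(name, **args):
    calls.append(name)
    return []

start_session(fake_tool, ["css"])
print(calls)  # ['list_tags', 'get_context', 'get_recent_events', 'get_context']
```

A real agent does this implicitly from your instructions; you never write this code yourself.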
Beyond Memory: What Else Persistent Services Enable
Once your agent has a persistent connection to a service like Minolith, memory is just the beginning. As we covered in What Cursor, Claude Code, and Windsurf Still Can't Do, coding tools have six major blind spots. The same MCP connection gives the agent access to services that address most of them:
Changelogs — the agent creates changelog entries as it ships features. "Added CSV import." "Fixed timezone handling in exports." These publish to a hosted page your users can see. No manual changelog management.
Feedback — an embeddable widget on your app collects user bug reports and feature requests. Your agent queries the feedback inbox via MCP: "Are there any open bug reports about the export feature?" The agent can factor user needs into its work without you relaying anything.
Runbooks — stored multi-step procedures the agent follows. A deployment runbook with 10 steps, branching logic, and human approval gates. If the session ends mid-deployment, the next session picks up from exactly the right step.
Agent definitions — define who your agent is. An orchestrator that coordinates subagents. A code reviewer that only reads files. A documentation updater that follows your voice guidelines. Each with pre-loaded context specific to their role.
These aren't separate tools to configure. They're services on the same platform, authenticated with the same API key, accessible through the same MCP connection. The agent that stores context is the same agent that publishes changelogs, reads feedback, and follows runbooks.
Common Questions
Does this replace CLAUDE.md?
No. CLAUDE.md is still useful for a small set of critical, universal instructions — project setup commands, the tech stack summary, the one-paragraph project description. Think of CLAUDE.md as the quick-reference card and structured context as the full knowledge base.
What if I switch coding tools?
MCP works with Claude Code, Cursor, Windsurf, and any MCP-compatible client. Your context is stored on the server, not in your editor. Switch tools and your project knowledge comes with you.
What if two agents store contradictory context?
The entries are independent — there's no automatic conflict detection. But because entries have types, tags, and timestamps, you can query for all rules on a topic and see if they conflict. The dashboard also lets you review and edit entries directly.
How much does it cost?
Context operations cost zero credits — storing, querying, updating, and deleting don't consume from your allowance. You still need an active subscription ($5/month base, which includes 500 credits). Minolith's paid services (Changelog, Feedback, Runbooks, Agents) start at $0.01 per action.
Is my data private?
Each project is isolated. Data created with one API key is only accessible to other keys in the same project. There's no cross-project or cross-account access. Minolith doesn't use your data to train AI models.
Getting Started
- Sign up at app.minolith.io/register — 14-day free trial, no card required
- Connect via MCP in one command
- Store your first rule, decision, or warning
- Query at the start of your next session and see the difference
The full API documentation is at docs.minolith.io. The Context API reference is at docs.minolith.io/api/context.
Your agent doesn't have to start from zero. Give it a memory.
Built by Minolith — micro-services for AI coding agents.