MongoDB Atlas

Agentic Memory
on MongoDB

AI agents need more than intelligence — they need memory. Explore how MongoDB Atlas powers every layer of agentic memory: from fleeting context to permanent knowledge, all in a single unified platform.

The Agent Lifecycle

Every intelligent agent operates in a continuous loop. MongoDB sits at the center — the persistent memory that makes each cycle smarter than the last.

👁️ Perceive
🔍 Retrieve
🧠 Reason
Act
💾 Store
MongoDB Memory
Phase 1 of 5

Perceive

The agent receives input — a user message, an API event, a sensor reading, or a scheduled trigger. This raw input is the catalyst for the entire cognitive cycle.

MongoDB Role

Change Streams capture real-time events. Incoming messages are immediately persisted to the working_memory collection.

Eight Layers of Memory

Just like the human mind, an intelligent agent needs multiple types of memory working in concert. MongoDB Atlas unifies them all in a single cluster.

Deep Dive

Each memory type maps to specific MongoDB features. Here's exactly how.

Working Memory

Working memory is the agent's scratchpad — it holds the current conversation context, intermediate reasoning steps, and any temporary data needed for the task at hand. Like human short-term memory, it's fast, focused, and ephemeral.

TTL Indexes automatically expire sessions after inactivity, preventing unbounded growth.

Change Streams let the agent react to new messages in real-time without polling.

LangGraph Checkpointer uses MongoDB to persist graph state for multi-step agent workflows.

SESSION → BUFFER → TTL EXPIRE
working_memory.js
// Auto-expire sessions after 1 hour of inactivity
db.working_memory.createIndex(
  { "created_at": 1 },
  { expireAfterSeconds: 3600 }
)

// Store current conversation context
db.working_memory.insertOne({
  session_id: "sess_abc",
  agent_id: "support-v2",
  messages: [
    { role: "user", content: "My order hasn't arrived" },
    { role: "agent", content: "Let me check that..." }
  ],
  tool_state: { pending: "order_lookup" },
  created_at: new Date()
})
📖

Episodic Memory

Episodic memory records what happened — past conversations, resolved tickets, successful outcomes, and failures. It lets the agent learn from experience: "Last time this user asked about billing, here's what worked."

Rich Documents capture full interaction context: messages, decisions, outcomes, and metadata.

Atlas Search enables fuzzy, full-text search across past episodes by content, tags, or outcome.

Compound Indexes on user + timestamp enable instant lookups for recent history.

INTERACTION → LOG → RECALL
episodic_memory.js
// Recall similar past episodes for context
db.episodes.find({
  user_id: "user_123",
  outcome: "resolved",
  tags: { $in: ["billing", "refund"] }
}).sort({ timestamp: -1 }).limit(5)

// Store a completed episode
db.episodes.insertOne({
  user_id: "user_123",
  agent_id: "support-v2",
  summary: "Resolved billing dispute for $49.99",
  outcome: "resolved",
  tags: ["billing", "refund", "positive"],
  messages: [...],
  decisions: ["escalated_to_supervisor"],
  timestamp: new Date()
})
🌐

Semantic Memory

Semantic memory is the agent's knowledge base — facts, concepts, and domain expertise stored as vector embeddings alongside structured metadata. This is the backbone of RAG (Retrieval-Augmented Generation).

Voyage AI Embeddings — Atlas-native embedding models from Voyage AI generate high-quality vectors directly inside the platform. No external embedding API calls needed.

Voyage AI Reranking — after initial vector retrieval, Atlas reranks results using Voyage AI's cross-encoder models for significantly higher precision in the final result set.

Atlas Vector Search finds semantically similar documents using ANN (Approximate Nearest Neighbor).

Hybrid Search combines vector similarity with keyword and metadata filters in one query.

Co-located Data — embeddings live alongside the source documents. No sync pipeline needed.

Voyage AI EMBED → INDEX → $vectorSearch → RERANK
semantic_memory.js
// Retrieve knowledge via vector similarity
db.knowledge.aggregate([
  { $vectorSearch: {
      index: "vector_index",
      path: "embedding",
      queryVector: queryEmbedding,
      numCandidates: 100,
      limit: 5,
      filter: { domain: "billing", active: true }
  }},
  { $project: {
      content: 1,
      source: 1,
      score: { $meta: "vectorSearchScore" }
  }}
])
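Hybrid search merges a vector ranking with a keyword ranking. Atlas can fuse these server-side, but the core idea — reciprocal rank fusion — is a small pure function worth seeing on its own. A minimal sketch; the document IDs and result lists below are hypothetical, not from a real query:

```javascript
// Reciprocal rank fusion: each document scores 1 / (k + rank) per list,
// and scores are summed across lists. k = 60 is the conventional constant.
function reciprocalRankFusion(rankedLists, k = 60) {
  const scores = new Map();
  for (const list of rankedLists) {
    list.forEach((id, i) => {
      scores.set(id, (scores.get(id) || 0) + 1 / (k + i + 1));
    });
  }
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}

// Hypothetical hit lists from $vectorSearch and Atlas Search:
const vectorHits  = ["doc_refund_policy", "doc_shipping", "doc_warranty"];
const keywordHits = ["doc_warranty", "doc_refund_policy", "doc_faq"];

const fused = reciprocalRankFusion([vectorHits, keywordHits]);
// doc_refund_policy ranks near the top of both lists, so it fuses to first.
```

Documents that appear high in both rankings dominate the fused list — which is exactly why hybrid search tends to beat either signal alone.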
🔧

Procedural Memory

Procedural memory stores how the agent does things — tool definitions, learned workflows, successful action sequences, and decision templates. It's the difference between knowing facts and knowing skills.

Flexible Documents store complex tool schemas, multi-step workflows, and versioned procedures.

Versioning allows agents to evolve their skills — keeping proven procedures while testing new ones.

Success Tracking — each procedure tracks its success rate so the agent picks the best approach.

SKILL → VERSION → EXECUTE
procedural_memory.js
// Retrieve the best-performing refund procedure
db.procedures.findOne({
  skill: "order_refund",
  version: { $gte: 2 },
  success_rate: { $gte: 0.95 }
})

// A procedure document — the agent's playbook
{
  skill: "order_refund",
  version: 3,
  steps: [
    { action: "lookup_order", tool: "orders_api" },
    { action: "check_policy", tool: "policy_engine" },
    { action: "process_refund", tool: "payments_api" },
    { action: "send_confirm", tool: "email_service" }
  ],
  success_rate: 0.97,
  avg_duration_ms: 4200
}
🏛️

Long-term Memory

Long-term memory persists user preferences, behavioral patterns, and accumulated insights across sessions. It enables true personalization — the agent remembers that you prefer concise answers, always want tracking links, and tend to ask about billing on Mondays.

Flexible Schema lets profiles evolve organically — new preferences are added without migrations.

Atomic Updates with $set, $push, $inc modify specific fields without overwriting the full document.

Capped Arrays via $slice keep history bounded while retaining the most recent entries.

LEARN → STORE → PERSONALIZE
longterm_memory.js
// Evolve user profile with every interaction
db.user_profiles.updateOne(
  { user_id: "user_123" },
  {
    $set: {
      "preferences.tone": "concise",
      "preferences.include_links": true,
      "last_active": new Date()
    },
    $push: {
      "interaction_history": {
        $each: [{ topic: "billing", ts: new Date() }],
        $slice: -100
      }
    },
    $inc: { "metrics.total_interactions": 1 }
  },
  { upsert: true }
)

// Read personalization at session start
const profile = db.user_profiles.findOne(
  { user_id: "user_123" },
  { preferences: 1, interaction_history: { $slice: -5 } }
)
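The $push + $slice: -100 pattern is a server-side bounded buffer: append, then keep only the most recent N entries. The same semantics in plain JavaScript, as an illustrative sketch (function and field names are ours, not an API):

```javascript
// Mimics { $push: { "history": { $each: [entry], $slice: -maxLen } } }:
// append the new entry, then retain only the last maxLen entries.
function pushBounded(history, entry, maxLen) {
  return [...history, entry].slice(-maxLen);
}

let history = [];
for (let turn = 1; turn <= 5; turn++) {
  history = pushBounded(history, { topic: "billing", turn }, 3);
}
// Only the 3 most recent entries survive: turns 3, 4, 5.
```

Doing this inside updateOne keeps the operation atomic — no read-modify-write race between concurrent agent sessions.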
🏷️

Taxonomic Memory

Taxonomic memory organizes domain terminology into structured hierarchies and categories. When the agent encounters "WAN issue," it knows this belongs under Networking → Wide Area Network → Troubleshooting. It's the difference between understanding words and understanding how concepts relate.

$graphLookup traverses hierarchical category trees in a single query — no recursive application code needed.

Self-referencing Documents store parent-child relationships natively with a parent_id field.

Post-session Consolidation — back-end processing analyzes working memory and automatically discovers new terms to categorize.

TERM → CLASSIFY → HIERARCHY
taxonomic_memory.js
// Taxonomy as self-referencing documents
db.taxonomy.insertMany([
  { _id: "networking", label: "Networking", parent: null },
  { _id: "wan", label: "Wide Area Network", parent: "networking" },
  { _id: "wan_troubleshoot", label: "WAN Troubleshooting",
    parent: "wan", terms: ["latency", "packet loss", "jitter"] }
])

// Traverse the full hierarchy with $graphLookup
db.taxonomy.aggregate([
  { $match: { _id: "wan_troubleshoot" } },
  { $graphLookup: {
      from: "taxonomy",
      startWith: "$parent",
      connectFromField: "parent",
      connectToField: "_id",
      as: "ancestors"
  }}
])
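$graphLookup does on the server what a recursive parent walk does in application code. A plain-JavaScript equivalent over the same three documents makes the traversal explicit (purely illustrative — in production the single aggregation above replaces all of this):

```javascript
// In-memory mirror of the taxonomy collection above.
const taxonomy = [
  { _id: "networking", label: "Networking", parent: null },
  { _id: "wan", label: "Wide Area Network", parent: "networking" },
  { _id: "wan_troubleshoot", label: "WAN Troubleshooting", parent: "wan" }
];

// Follow parent links to the root — the set $graphLookup returns as "ancestors".
function ancestors(id) {
  const byId = new Map(taxonomy.map(d => [d._id, d]));
  const chain = [];
  let node = byId.get(id);
  while (node && node.parent !== null) {
    node = byId.get(node.parent);
    if (node) chain.push(node._id);
  }
  return chain;
}

const result = ancestors("wan_troubleshoot");
// → ["wan", "networking"]
```

The recursion lives in one pipeline stage instead of N round-trips — one query per hierarchy lookup, regardless of tree depth.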
🪪

Entity Memory

Entity memory stores structured knowledge about the specific people, products, organizations, and concepts the agent interacts with. Unlike semantic memory (embeddings), entity memory is richly structured — enabling precise attribute lookups and relationship traversal.

Document Model naturally represents entities with nested attributes, arrays of relationships, and arbitrary metadata.

Rich Queries on structured fields — filter entities by type, relationship, attribute, or any combination.

CrewAI Compatible — maps directly to CrewAI's EntityMemory interface for multi-agent entity tracking.

EXTRACT → STORE → RELATE
entity_memory.js
// Store structured entity knowledge
db.entities.updateOne(
  { entity_id: "acme_corp" },
  { $set: {
      type: "organization",
      name: "Acme Corporation",
      attributes: { tier: "enterprise", industry: "manufacturing", since: 2019 },
      relationships: [
        { type: "contact", ref: "jane_doe" },
        { type: "product", ref: "widget_pro" }
      ]
  }},
  { upsert: true }
)

// Query entities by relationship
db.entities.find({
  "relationships.ref": "jane_doe",
  type: "organization"
})
🤝

Shared Memory

In multi-agent systems, what one agent learns should benefit the whole team. Shared memory enables cross-agent knowledge sharing — when Engineer A's agent solves a WAN issue, Engineer B's agent can leverage that knowledge a week later. MongoDB's RBAC controls exactly who reads and writes what.

RBAC — Role-Based Access Control governs which agents or teams can read/write to shared collections.

Multi-tenant Design — a team_id field scopes shared memory to teams, departments, or organizations.

Change Streams notify peer agents when shared knowledge is updated — enabling real-time collaboration.

LEARN → SHARE → COLLABORATE
shared_memory.js
// Agent A learns and shares with the team
db.shared_knowledge.insertOne({
  team_id: "noc_team",
  contributed_by: "agent_A",
  topic: "WAN troubleshooting",
  insight: "Rebooting the edge router resolves 80% of WAN latency spikes",
  confidence: 0.92,
  source_episode: ObjectId("..."),
  created_at: new Date()
})

// Agent B queries shared team knowledge
db.shared_knowledge.find({
  team_id: "noc_team",
  topic: { $regex: /wan/i },
  confidence: { $gte: 0.8 }
}).sort({ created_at: -1 })

How It All Connects

A single MongoDB Atlas cluster serves as the unified memory backbone. One connection, one API, every type of memory.

👤

User / Event

Messages, API calls, triggers, sensor data

🤖

Agent Framework

LangChain / LangGraph / CrewAI / MCP

Perceive
Retrieve
Reason
Act

MongoDB Atlas


Working Memory
📖 Episodic Memory
🌐 Semantic Memory
🔧 Procedural Memory
🏛️ Long-term Memory
🏷️ Taxonomic Memory
🪪 Entity Memory
🤝 Shared Memory

Working Memory

Short-term conversation context


Collection
db.working_memory
🍃 One cluster · One connection string · Eight memory types · Zero sync overhead

Memory in Action

Watch a support agent handle a real conversation. Click each step to see which memory layers are read and written — and the exact MongoDB operations behind each one.

1

User sends message

"My last order was damaged, can you help?"

2

Agent recalls history

Pulls past conversations & similar resolved cases

3

Agent retrieves knowledge

Searches refund policy docs via vector similarity

4

Agent selects best procedure

Loads the highest-performing "damaged_item" workflow

5

Agent reads preferences

Checks user prefers concise responses with links

6

Agent responds & stores

Sends reply, logs episode, updates profile

⚡ Working
📖 Episodic
🌐 Semantic
🔧 Procedural
🏛️ Long-term
🏷️ Taxonomic
🪪 Entity
🤝 Shared
🤖 Agent Thought

"New message received. Storing to working memory and beginning the cognitive cycle."


The Sync Tax

Most teams bolt together separate databases for each memory type. The result? Sync pipelines, consistency bugs, and compounding operational cost.

🔩

Bolted-On Architecture

💬 Redis / Memcached session cache
🔍 Pinecone / Weaviate vector store
🗄️ PostgreSQL user profiles
📊 Elasticsearch episode search
🏷️ Neo4j taxonomy graph
🔐 Custom ACL layer shared access

6 databases · 5+ sync pipelines · N consistency bugs

🍃

Unified on MongoDB Atlas

working_memory TTL + Change Streams
📖 episodes Atlas Search
🌐 knowledge $vectorSearch
🔧 procedures Document Model
🏛️ user_profiles Flexible Schema
🏷️ taxonomy $graphLookup
🪪 entities Document Model
🤝 shared_knowledge RBAC + Change Streams

1 cluster · 0 sync pipelines · Atomic consistency

Memory Lifecycle

Data doesn't stay in one layer forever. It flows, consolidates, and transforms as the agent learns — all within a single MongoDB cluster.

T + 0s — Message Arrives
⚡ Working Write

User message is persisted to working_memory with a TTL. The agent's scratchpad is initialized.

T + 50ms — Memory Retrieval
📖 Episodic 🌐 Semantic 🔧 Procedural 🏛️ Long-term 🏷️ Taxonomic 🪪 Entity 🤝 Shared Read

Queries fan out across all memory layers: past episodes, knowledge embeddings, skill definitions, user preferences, domain taxonomy, entity records, and shared team knowledge.
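With a driver, this fan-out is a single Promise.all over independent reads. A sketch with stubbed per-layer lookups standing in for the real collection queries (all names and stub payloads here are hypothetical):

```javascript
// Stub lookups standing in for db.<collection> queries against each layer.
const layers = {
  episodic:   async () => [{ summary: "Resolved billing dispute" }],
  semantic:   async () => [{ content: "Refund policy: 30 days" }],
  procedural: async () => ({ skill: "order_refund", version: 3 }),
  longterm:   async () => ({ preferences: { tone: "concise" } })
};

// Fan out all reads in parallel, then assemble one context object for the LLM.
async function retrieveContext() {
  const names = Object.keys(layers);
  const results = await Promise.all(names.map(n => layers[n]()));
  return Object.fromEntries(names.map((n, i) => [n, results[i]]));
}
```

Because every layer lives in the same cluster, total retrieval latency is bounded by the slowest single query rather than a chain of cross-database hops.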

T + 200ms — LLM Reasoning
⚡ Working Read

The LLM receives the enriched context from working memory — current conversation plus all retrieved memories — and begins generating a response.

T + 2s — Response & Action
⚡ Working Write

Agent response is appended to the working memory conversation buffer. Tool calls and their results are logged to agent_traces.

T + 2.1s — Episode Capture
📖 Episodic Write

The completed interaction is summarized and stored as an episode document — including outcome, tags, and decisions taken — for future recall.

T + 2.2s — Profile Update
🏛️ Long-term Write

User preferences are atomically updated via $set / $inc. Interaction history is appended with $push + $slice.

T + 2.3s — Entity Extraction
🪪 Entity 🤝 Shared Write

Entities mentioned in the conversation (people, products, order IDs) are extracted and upserted. If the agent is part of a team, insights are written to shared_knowledge via RBAC.

T + 5s — Post-Session Consolidation (async)
🔧 Procedural 🏷️ Taxonomic Write

Back-end processing analyzes the completed session: discovers new step sequences and writes them to procedural memory, identifies new domain terms and categorizes them in the taxonomy. This is how the agent learns.

T + 1 hour — Automatic Cleanup
⚡ Working TTL Expire

The TTL index on working_memory expires the session document. The scratchpad is cleared — but the episode, knowledge updates, and profile persist forever.

The key insight: Working memory is ephemeral (TTL). But every interaction enriches the episodic, procedural, taxonomic, entity, shared, and long-term layers — making the agent permanently smarter with each conversation. Post-session consolidation closes the learning loop.

Plug Into Any Framework

MongoDB integrates natively with the leading agent frameworks — and with MCP, agents can talk to their memory store directly. Here's how to wire it up.

📦

Official Packages

First-party langchain-mongodb, llama-index-vector-stores-mongodb packages

🔄

Checkpointing

LangGraph's MongoDBSaver persists graph state for multi-step agent workflows

🧠

Chat History

MongoDBChatMessageHistory provides drop-in conversational memory

🔌

MCP Server

Official mongodb-mcp-server lets agents query, index, and manage memory via Model Context Protocol

🔌 Model Context Protocol

MongoDB MCP Server

The frameworks above show how to wire memory into your agent code. But with the MongoDB MCP Server, AI agents can query, update, and manage their memory store directly — in natural language, with no driver code required.

🔗

Standardized Protocol

MCP provides a universal interface for LLMs to communicate with any data source, eliminating custom integrations for each tool.

🛡️

Secure by Design

MCP servers run locally or in your infrastructure. Your data never leaves your control — the AI only sees what you allow.

🧩

25+ Database Tools

MongoDB's MCP server exposes tools for querying, aggregation, indexes, vector search, Atlas management, and more.

🤖
MCP Host
Claude / VS Code / Cursor
🔌 MCP Server
mongodb-mcp-server
Tools Resources
MongoDB Atlas
All 8 Memory Layers

See It In Action

Watch a complete agentic lifecycle using only MCP tools for every step. See how the agent uses MongoDB for memory, the LLM for reasoning, and MCP for all data operations.

🤖
Support Agent v2
Powered by MongoDB MCP Server
Iteration: 1
👁️
Perceive
User → Agent
🔍
Retrieve
Agent ↔ MongoDB
🧠
Reason
Agent → LLM
Act
Agent → MongoDB
💾
Store
Agent → MongoDB
🔄
Loop
Next Turn
🍃 MongoDB
🧠 LLM Reasoning
Claude Sonnet 4
👁️
Perceive
Store input to working memory
insert-many
🔍
Retrieve
Query all memory layers
find aggregate
🧠
Reason
LLM processes context
collection-schema
Act
Execute tools & respond
find update-many
💾
Store
Persist episode & update profile
insert-many update-many

Quick Setup

Get your agents talking to MongoDB in under 2 minutes. Choose your AI client.

Step 1: Add Configuration

Add this to your claude_desktop_config.json file:

claude_desktop_config.json
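A minimal configuration looks like the following. This is a sketch: the connection string is a placeholder for your own Atlas URI, and the exact flags and environment variables should be checked against the current mongodb-mcp-server README.

```json
{
  "mcpServers": {
    "MongoDB": {
      "command": "npx",
      "args": ["-y", "mongodb-mcp-server"],
      "env": {
        "MDB_MCP_CONNECTION_STRING": "mongodb+srv://<user>:<password>@<cluster>.mongodb.net/"
      }
    }
  }
}
```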

Step 2: Restart & Connect

1️⃣ Restart your AI client to load the MCP server
2️⃣ Look for the 🔌 MCP icon indicating the server is connected
3️⃣ Start asking questions about your MongoDB data!
💡
Tip: The MCP server runs locally via npx — no global installation required. It downloads automatically on first use.

Why MongoDB Atlas

The alternative is bolting together separate databases for vectors, sessions, profiles, and knowledge. Here's why a unified platform wins.

🔗

Unified Platform

All eight memory types in a single cluster. No sync pipelines between a vector DB, a cache, a graph DB, and a document store. Fewer moving parts, lower TCO, faster time-to-production.

🔍

Vector + Document

Atlas Vector Search with built-in Voyage AI embedding and reranking models. Combine semantic search with structured filters, full-text search, and geospatial queries — all in a single aggregation pipeline, no external APIs.

🛡️

Production Ready

TTL indexes, Change Streams, RBAC, CSFLE, horizontal scaling, multi-region replication. Enterprise-grade infrastructure that AI startups and Fortune 500s already trust.