Every intelligent agent operates in a continuous loop. MongoDB sits at the center — the persistent memory that makes each cycle smarter than the last.
Memory
The agent receives input — a user message, an API event, a sensor reading, or a scheduled trigger. This raw input is the catalyst for the entire cognitive cycle.
Change Streams capture real-time events. Incoming messages are immediately persisted to the working_memory collection.
Just like the human mind, an intelligent agent needs multiple types of memory working in concert. MongoDB Atlas unifies them all in a single cluster.
Short-term conversation context — the agent's scratchpad.
Past interactions and experiences — the agent's autobiography.
Knowledge and facts — the agent's world understanding.
Learned skills and workflows — the agent's know-how.
Persistent preferences and profiles that grow over time.
Structured categories and domain terminology.
People, products, and organizations the agent knows.
Cross-agent sharing — one learns, the team benefits.
Each memory type maps to specific MongoDB features. Here's exactly how.
Working memory is the agent's scratchpad — it holds the current conversation context, intermediate reasoning steps, and any temporary data needed for the task at hand. Like human short-term memory, it's fast, focused, and ephemeral.
TTL Indexes automatically expire session documents a set time after creation, preventing unbounded growth.
Change Streams let the agent react to new messages in real-time without polling.
LangGraph Checkpointer uses MongoDB to persist graph state for multi-step agent workflows.
// Auto-expire session documents 1 hour after creation
db.working_memory.createIndex(
  { "created_at": 1 },
  { expireAfterSeconds: 3600 }
)
// Store current conversation context
db.working_memory.insertOne({
  session_id: "sess_abc",
  agent_id: "support-v2",
  messages: [
    { role: "user", content: "My order hasn't arrived" },
    { role: "agent", content: "Let me check that..." }
  ],
  tool_state: { pending: "order_lookup" },
  created_at: new Date()
})
Episodic memory records what happened — past conversations, resolved tickets, successful outcomes, and failures. It lets the agent learn from experience: "Last time this user asked about billing, here's what worked."
Rich Documents capture full interaction context: messages, decisions, outcomes, and metadata.
Atlas Search enables fuzzy, full-text search across past episodes by content, tags, or outcome.
Compound Indexes on user + timestamp enable instant lookups for recent history.
// Recall similar past episodes for context
db.episodes.find({
  user_id: "user_123",
  outcome: "resolved",
  tags: { $in: ["billing", "refund"] }
}).sort({ timestamp: -1 }).limit(5)
// Store a completed episode
db.episodes.insertOne({
  user_id: "user_123",
  agent_id: "support-v2",
  summary: "Resolved billing dispute for $49.99",
  outcome: "resolved",
  tags: ["billing", "refund", "positive"],
  messages: [...],
  decisions: ["escalated_to_supervisor"],
  timestamp: new Date()
})
Semantic memory is the agent's knowledge base — facts, concepts, and domain expertise stored as vector embeddings alongside structured metadata. This is the backbone of RAG (Retrieval-Augmented Generation).
Voyage AI Embeddings — Atlas-native embedding models from Voyage AI generate high-quality vectors directly inside the platform. No external embedding API calls needed.
Voyage AI Reranking — after initial vector retrieval, Atlas reranks results using Voyage AI's cross-encoder models for significantly higher precision.
Atlas Vector Search finds semantically similar documents using ANN (Approximate Nearest Neighbor).
Hybrid Search combines vector similarity with keyword and metadata filters in one query.
Co-located Data — embeddings live alongside the source documents. No sync pipeline needed.
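The scoring idea behind hybrid search can be illustrated with reciprocal rank fusion (RRF), a common way to merge a vector-ranked list with a keyword-ranked list. This plain-JavaScript sketch illustrates the fusion math only; it is not Atlas's internal implementation, and the document IDs are made up:

```javascript
// Merge ranked result lists with reciprocal rank fusion:
// score(doc) = sum over lists of 1 / (k + rank); k dampens top-rank dominance
function reciprocalRankFusion(rankedLists, k = 60) {
  const scores = new Map();
  for (const list of rankedLists) {
    list.forEach((docId, i) => {
      scores.set(docId, (scores.get(docId) || 0) + 1 / (k + i + 1));
    });
  }
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([docId]) => docId);
}

const vectorHits  = ["doc_refunds", "doc_shipping", "doc_billing"];
const keywordHits = ["doc_billing", "doc_refunds", "doc_terms"];

// doc_refunds ranks 1st and 2nd across the two lists, so it wins overall
const fused = reciprocalRankFusion([vectorHits, keywordHits]);
console.log(fused); // [ 'doc_refunds', 'doc_billing', 'doc_shipping', 'doc_terms' ]
```

A document that appears high in both lists beats one that dominates only a single list, which is exactly the behavior you want when combining semantic similarity with keyword relevance.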
// Retrieve knowledge via vector similarity
db.knowledge.aggregate([
  { $vectorSearch: {
      index: "vector_index",
      path: "embedding",
      queryVector: queryEmbedding,
      numCandidates: 100,
      limit: 5,
      filter: {
        domain: "billing",
        active: true
      }
  }},
  { $project: {
      content: 1,
      source: 1,
      score: { $meta: "vectorSearchScore" }
  }}
])
Procedural memory stores how the agent does things — tool definitions, learned workflows, successful action sequences, and decision templates. It's the difference between knowing facts and knowing skills.
Flexible Documents store complex tool schemas, multi-step workflows, and versioned procedures.
Versioning allows agents to evolve their skills — keeping proven procedures while testing new ones.
Success Tracking — each procedure tracks its success rate so the agent picks the best approach.
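The selection logic can be sketched as an in-memory stand-in for the procedures query in this section (the field names mirror the procedure documents shown here; the thresholds are illustrative):

```javascript
// Filter by skill, minimum version, and minimum success rate,
// then take the procedure with the highest success rate
function bestProcedure(procedures, skill, minVersion = 2, minSuccess = 0.95) {
  return procedures
    .filter(p => p.skill === skill &&
                 p.version >= minVersion &&
                 p.success_rate >= minSuccess)
    .sort((a, b) => b.success_rate - a.success_rate)[0] || null;
}

const procedures = [
  { skill: "order_refund", version: 3, success_rate: 0.97 },
  { skill: "order_refund", version: 2, success_rate: 0.95 },
  { skill: "order_refund", version: 1, success_rate: 0.99 }, // version too old
  { skill: "order_lookup", version: 4, success_rate: 0.99 }  // wrong skill
];

const best = bestProcedure(procedures, "order_refund");
console.log(best.version, best.success_rate); // 3 0.97
```

In MongoDB the same shape is expressed as a find with a sort and limit, so the decision "which skill version do I run?" is a single indexed query.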
// Retrieve the best-performing refund procedure
db.procedures.find({
  skill: "order_refund",
  version: { $gte: 2 },
  success_rate: { $gte: 0.95 }
}).sort({ success_rate: -1 }).limit(1)
// A procedure document — the agent's playbook
{
  skill: "order_refund",
  version: 3,
  steps: [
    { action: "lookup_order", tool: "orders_api" },
    { action: "check_policy", tool: "policy_engine" },
    { action: "process_refund", tool: "payments_api" },
    { action: "send_confirm", tool: "email_service" }
  ],
  success_rate: 0.97,
  avg_duration_ms: 4200
}
Long-term memory persists user preferences, behavioral patterns, and accumulated insights across sessions. It enables true personalization — the agent remembers that you prefer concise answers, always want tracking links, and tend to ask about billing on Mondays.
Flexible Schema lets profiles evolve organically — new preferences are added without migrations.
Atomic Updates with $set, $push, $inc modify specific fields without overwriting the full document.
Capped Arrays via $slice keep history bounded while retaining the most recent entries.
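What $push with $each and a negative $slice does can be simulated in a few lines of plain JavaScript (an in-memory stand-in, not driver code):

```javascript
// Append new entries, then keep only the most recent `limit` of them,
// mirroring $push: { $each: [...], $slice: -limit }
function pushCapped(history, entries, limit = 100) {
  const next = history.concat(entries);
  return next.slice(-limit); // negative slice keeps the tail, like $slice: -100
}

let history = [];
for (let day = 1; day <= 120; day++) {
  history = pushCapped(history, [{ topic: "billing", day }], 100);
}

console.log(history.length);  // 100: bounded, no unbounded growth
console.log(history[0].day);  // 21: the oldest 20 entries were trimmed
console.log(history[99].day); // 120: the most recent entry is retained
```

Because MongoDB applies the append and the trim in one atomic update, the history array can never grow past its cap, even under concurrent writers.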
// Evolve user profile with every interaction
db.user_profiles.updateOne(
  { user_id: "user_123" },
  {
    $set: {
      "preferences.tone": "concise",
      "preferences.include_links": true,
      "last_active": new Date()
    },
    $push: {
      "interaction_history": {
        $each: [{ topic: "billing", ts: new Date() }],
        $slice: -100
      }
    },
    $inc: { "metrics.total_interactions": 1 }
  },
  { upsert: true }
)
// Read personalization at session start
const profile = db.user_profiles.findOne(
  { user_id: "user_123" },
  { preferences: 1, interaction_history: { $slice: -5 } }
)
Taxonomic memory organizes domain terminology into structured hierarchies and categories. When the agent encounters "WAN issue," it knows this belongs under Networking → Wide Area Network → Troubleshooting. It's the difference between understanding words and understanding how concepts relate.
$graphLookup traverses hierarchical category trees in a single query — no recursive application code needed.
Self-referencing Documents store parent-child relationships natively with a parent_id field.
Post-session Consolidation — back-end processing analyzes working memory and automatically discovers new terms to categorize.
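What $graphLookup computes for the taxonomy in this section can be simulated by walking parent pointers in memory (a sketch of the traversal semantics, not the server's implementation):

```javascript
// Follow parent pointers from a starting node, collecting every ancestor,
// mirroring $graphLookup with startWith: "$parent" up the tree
function ancestors(taxonomy, startId) {
  const byId = new Map(taxonomy.map(doc => [doc._id, doc]));
  const result = [];
  let current = byId.get(startId);
  while (current && current.parent !== null) {
    current = byId.get(current.parent);
    if (current) result.push(current._id);
  }
  return result;
}

const taxonomy = [
  { _id: "networking", label: "Networking", parent: null },
  { _id: "wan", label: "Wide Area Network", parent: "networking" },
  { _id: "wan_troubleshoot", label: "WAN Troubleshooting", parent: "wan" }
];

console.log(ancestors(taxonomy, "wan_troubleshoot")); // [ 'wan', 'networking' ]
```

The difference in MongoDB is that this loop runs server-side in a single aggregation stage, so the application never issues one query per level of the hierarchy.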
// Taxonomy as self-referencing documents
db.taxonomy.insertMany([
  { _id: "networking", label: "Networking", parent: null },
  { _id: "wan", label: "Wide Area Network", parent: "networking" },
  { _id: "wan_troubleshoot", label: "WAN Troubleshooting",
    parent: "wan",
    terms: ["latency", "packet loss", "jitter"] }
])
// Traverse the full hierarchy with $graphLookup
db.taxonomy.aggregate([
  { $match: { _id: "wan_troubleshoot" } },
  { $graphLookup: {
      from: "taxonomy",
      startWith: "$parent",
      connectFromField: "parent",
      connectToField: "_id",
      as: "ancestors"
  }}
])
Entity memory stores structured knowledge about the specific people, products, organizations, and concepts the agent interacts with. Unlike semantic memory (embeddings), entity memory is richly structured — enabling precise attribute lookups and relationship traversal.
Document Model naturally represents entities with nested attributes, arrays of relationships, and arbitrary metadata.
Rich Queries on structured fields — filter entities by type, relationship, attribute, or any combination.
CrewAI Compatible — maps directly to CrewAI's EntityMemory interface for multi-agent entity tracking.
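The matching behavior of the "relationships.ref" query in this section can be shown with an in-memory stand-in: MongoDB evaluates a dotted path against each element of an array, so an entity matches if any one of its relationships does. The sample entities below are illustrative:

```javascript
// An entity matches when at least one element of its relationships
// array has the requested ref, mirroring { "relationships.ref": ref }
function findByRelationship(entities, ref, type) {
  return entities.filter(e =>
    e.type === type &&
    (e.relationships || []).some(r => r.ref === ref)
  );
}

const entities = [
  { entity_id: "acme_corp", type: "organization",
    relationships: [{ type: "contact", ref: "jane_doe" }] },
  { entity_id: "globex", type: "organization",
    relationships: [{ type: "contact", ref: "john_roe" }] },
  { entity_id: "jane_doe", type: "person", relationships: [] }
];

const matches = findByRelationship(entities, "jane_doe", "organization");
console.log(matches.map(e => e.entity_id)); // [ 'acme_corp' ]
```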
// Store structured entity knowledge
db.entities.updateOne(
  { entity_id: "acme_corp" },
  { $set: {
      type: "organization",
      name: "Acme Corporation",
      attributes: {
        tier: "enterprise",
        industry: "manufacturing",
        since: 2019
      },
      relationships: [
        { type: "contact", ref: "jane_doe" },
        { type: "product", ref: "widget_pro" }
      ]
  }},
  { upsert: true }
)
// Query entities by relationship
db.entities.find({
  "relationships.ref": "jane_doe",
  type: "organization"
})
A single MongoDB Atlas cluster serves as the unified memory backbone. One connection, one API, every type of memory.
[Interactive diagram: inputs (messages, API calls, triggers, sensor data) flow through agent frameworks (LangChain / LangGraph / CrewAI / MCP) into the unified memory layers.]
Watch a support agent handle a real conversation. Click each step to see which memory layers are read and written — and the exact MongoDB operations behind each one.
User sends message
"My last order was damaged, can you help?"
Agent recalls history
Pulls past conversations & similar resolved cases
Agent retrieves knowledge
Searches refund policy docs via vector similarity
Agent selects best procedure
Loads the highest-performing "damaged_item" workflow
Agent reads preferences
Checks user prefers concise responses with links
Agent responds & stores
Sends reply, logs episode, updates profile
"New message received. Storing to working memory and beginning the cognitive cycle."
Most teams bolt together separate databases for each memory type. The result? Sync pipelines, consistency bugs, and compounding operational cost.
Polyglot stack: 6 databases · 5+ sync pipelines · N consistency bugs
MongoDB Atlas: 1 cluster · 0 sync pipelines · Atomic consistency
Data doesn't stay in one layer forever. It flows, consolidates, and transforms as the agent learns — all within a single MongoDB cluster.
User message is persisted to working_memory with a TTL. The agent's scratchpad is initialized.
Queries fan out across all memory layers: past episodes, knowledge embeddings, skill definitions, user preferences, domain taxonomy, entity records, and shared team knowledge.
The LLM receives the enriched context from working memory — current conversation plus all retrieved memories — and begins generating a response.
Agent response is appended to the working memory conversation buffer. Tool calls and their results are logged to agent_traces.
The completed interaction is summarized and stored as an episode document — including outcome, tags, and decisions taken — for future recall.
User preferences are atomically updated via $set / $inc. Interaction history is appended with $push + $slice.
Entities mentioned in the conversation (people, products, order IDs) are extracted and upserted. If the agent is part of a team, insights are written to a shared_knowledge collection, with access scoped by RBAC.
Back-end processing analyzes the completed session: discovers new step sequences and writes them to procedural memory, identifies new domain terms and categorizes them in the taxonomy. This is how the agent learns.
The TTL index on working_memory expires the session document. The scratchpad is cleared — but the episode, knowledge updates, and profile persist forever.
The key insight: Working memory is ephemeral (TTL). But every interaction enriches the episodic, procedural, taxonomic, entity, shared, and long-term layers — making the agent permanently smarter with each conversation. Post-session consolidation closes the learning loop.
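The lifecycle above can be condensed into a minimal in-memory sketch. The collection names come from this page; the plain object below is a stand-in for a real cluster, and the point it demonstrates is the one that matters: the scratchpad is cleared while every learned layer persists.

```javascript
// Minimal in-memory stand-in for the memory lifecycle:
// working memory is ephemeral, episodic and profile layers persist
const store = { working_memory: [], episodes: [], user_profiles: new Map() };

function runSession(userId, messages, outcome) {
  // Perceive: persist the conversation to the scratchpad
  const session = { user_id: userId, messages, created_at: Date.now() };
  store.working_memory.push(session);

  // (retrieval, reasoning, and the response happen here)

  // Consolidate: summarize the session into a permanent episode
  store.episodes.push({ user_id: userId, outcome, turns: messages.length });

  // Personalize: bump the interaction counter, as $inc would
  const profile = store.user_profiles.get(userId) || { total_interactions: 0 };
  profile.total_interactions += 1;
  store.user_profiles.set(userId, profile);

  // Expire: the TTL clears the scratchpad; the learned layers remain
  store.working_memory = store.working_memory.filter(s => s !== session);
}

runSession("user_123", ["My order was damaged", "Refund issued"], "resolved");

console.log(store.working_memory.length); // 0: scratchpad cleared
console.log(store.episodes.length);       // 1: episode persisted
console.log(store.user_profiles.get("user_123").total_interactions); // 1
```

In the real system each of those writes is one of the MongoDB operations shown earlier (insertOne, updateOne with $inc, TTL expiry), all against the same cluster.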
MongoDB integrates natively with the leading agent frameworks — and with MCP, agents can talk to their memory store directly. Here's how to wire it up.
First-party langchain-mongodb, llama-index-vector-stores-mongodb packages
LangGraph's MongoDBSaver persists graph state for multi-step agent workflows
MongoDBChatMessageHistory provides drop-in conversational memory
Official mongodb-mcp-server lets agents query, index, and manage memory via Model Context Protocol
The frameworks above show how to wire memory into your agent code. But with the MongoDB MCP Server, AI agents can query, update, and manage their memory store directly — in natural language, with no driver code required.
MCP provides a universal interface for LLMs to communicate with any data source, eliminating custom integrations for each tool.
MCP servers run locally or in your infrastructure. Your data never leaves your control — the AI only sees what you allow.
MongoDB's MCP server exposes tools for querying, aggregation, indexes, vector search, Atlas management, and more.
Watch a complete agentic lifecycle using only MCP tools for every step. See how the agent uses MongoDB for memory, the LLM for reasoning, and MCP for all data operations.
Get your agents talking to MongoDB in under 2 minutes. Choose your AI client.
Add this to your claude_desktop_config.json file:
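A typical shape for that entry is sketched below. Check the official mongodb-mcp-server README for the exact package name and flags for your version; the connection string here is a placeholder.

```json
{
  "mcpServers": {
    "MongoDB": {
      "command": "npx",
      "args": [
        "-y",
        "mongodb-mcp-server",
        "--connectionString",
        "mongodb+srv://<user>:<password>@<cluster>.mongodb.net/"
      ]
    }
  }
}
```

Restart Claude Desktop after saving, and the MongoDB tools appear in the client's tool list.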
The alternative is bolting together separate databases for vectors, sessions, profiles, and knowledge. Here's why a unified platform wins.
All eight memory types in a single cluster. No sync pipelines between a vector DB, a cache, a graph DB, and a document store. Fewer moving parts, lower TCO, faster time-to-production.
Atlas Vector Search with built-in Voyage AI embedding and reranking models. Combine semantic search with structured filters, full-text search, and geospatial queries — all in a single aggregation pipeline, no external APIs.
TTL indexes, Change Streams, RBAC, CSFLE, horizontal scaling, multi-region replication. Enterprise-grade infrastructure that AI startups and Fortune 500s already trust.