MongoDB Atlas Complete Guide Part 4 — Performance Optimization: Indexing, Query Tuning, and Auto-scaling
Atlas performance optimization starts with diagnosis, not guesswork. Use Performance Advisor, Query Profiler, and explain() to find bottlenecks, design Compound/Partial/TTL/Wildcard indexes with the ESR rule, optimize Aggregation Pipelines, and connect Working Set, Connection Pooling, and Reactive/Predictive Auto-scaling into one operational workflow.
Series
- Part 1 | Concepts, architecture, and why Atlas matters
- Part 2 | Cluster Design Strategy
- Part 3 | Security, Networking, and Access Control
- Part 4 ← You are here | Performance Optimization — Indexing, Query Tuning, Auto-scaling
- Part 5 | Atlas in the AI Era — Vector Search, Stream Processing, RAG Pipelines
Table of Contents
- Seeing the Problem — Three Diagnostic Tools
- Index Types and When to Use Each
- The Golden Rule of Indexing — ESR
- Aggregation Pipeline Optimization
- Working Set and RAM Sizing
- Connection Pooling — The Bottleneck Teams Miss
- Auto-scaling Deep Dive — Reactive & Predictive
- Schema Design Determines Performance
- The Performance Optimization Workflow
1. Seeing the Problem — Three Diagnostic Tools
Before changing indexes or tiers, identify exactly where the bottleneck lives. Atlas provides three key tools for this.
1-1. Performance Advisor — The Core of Automated Index Recommendations
Performance Advisor automatically collects queries taking longer than 100 ms and ranks index suggestions by Impact.
Key metrics to understand:
| Metric | Meaning | Warning sign |
|---|---|---|
| Docs Examined | Documents scanned during query execution | More than 10× Docs Returned is dangerous |
| Docs Returned | Documents actually returned | Baseline |
| Keys Examined | Index keys scanned | 0 means no index used (full collection scan) |
| Avg Query Targeting | Docs Examined / Docs Returned ratio | Close to 1 is optimal; above 1000 is severe |
| Impact | Expected performance gain from adding the index | Process High → Medium → Low in order |
// Performance Advisor detection example
// Namespace: shop.orders
// Avg execution time: 1,243ms | Daily executions: 8,742 | Impact: HIGH
// Currently slow queries
db.orders.find({ status: "pending" }).sort({ createdAt: -1 })
db.orders.find({ status: "shipped", userId: "user123" })
// Index suggested by Performance Advisor
// { status: 1, createdAt: -1 } → covers both queries above
2025 update: Performance Advisor is now accessible via natural language through the MongoDB MCP Server, from AI clients such as Claude, Cursor, and GitHub Copilot. Query Profiler also gained shareable URLs for collaborating on slow-query analysis.
Before applying any recommendation, review it against existing indexes for redundancy, the write load on the collection, and the RAM impact of a new index.
1-2. Query Profiler — Dissecting Individual Queries
If Performance Advisor answers "which index should I create," Query Profiler answers "why is this specific query slow."
Critical patterns to look for:
Danger pattern 1: Docs Examined >> Docs Returned
Example: Examined: 2,000,000 / Returned: 5
→ Scanning the entire collection. Missing or ineffective index.
Danger pattern 2: Keys Examined = 0
→ Full Collection Scan (COLLSCAN)
→ Add an index immediately
Danger pattern 3: hasSort: true with no index-backed sort
→ In-memory sort occurring
→ Include the sort field in the index
Healthy pattern:
Docs Examined ≈ Docs Returned
Keys Examined > 0
hasIndexCoverage: true
1-3. explain() — Reading the Execution Plan Directly
Use this when you need the execution plan from within your code.
// Check the execution plan (plan only, no query execution)
const plan = await db.collection("orders")
.find({ status: "pending", userId: "user123" })
.sort({ createdAt: -1 })
.explain("executionStats");
// Key fields to check
console.log(plan.queryPlanner.winningPlan.stage);
// "COLLSCAN" → no index
// "IXSCAN" → index in use
// "FETCH" → index lookup then document fetch
// "PROJECTION_COVERED" → covered index (best performance)
console.log(plan.executionStats.totalDocsExamined); // docs scanned
console.log(plan.executionStats.totalDocsReturned); // docs returned
console.log(plan.executionStats.executionTimeMillis); // execution time (ms)
Execution plan stage reference:
COLLSCAN → Full collection scan (slow, needs an index)
IXSCAN → Index scan (good)
FETCH → Index lookup followed by document fetch (normal)
SORT → In-memory sort (use index-backed sort instead)
SORT_KEY_GENERATOR → Sort key generation (appears alongside SORT)
PROJECTION_COVERED → Covered index (best possible performance)
2. Index Types and When to Use Each
MongoDB supports several index types. Picking the right one for the query shape is half the optimization work. Indexes speed up reads but slow down writes and consume RAM — create only what you need.
2-1. Single Field Index
The foundation. An index on one field.
// Frequent lookups by userId
db.users.createIndex({ userId: 1 }); // ascending
db.users.createIndex({ createdAt: -1 }); // descending (latest-first sort)
// Unique index
db.users.createIndex({ email: 1 }, { unique: true });
2-2. Compound Index — Most Commonly Used
Combines multiple fields into one index. Far more efficient than separate single-field indexes when queries filter or sort on multiple fields simultaneously.
// Heavy query pattern: status filter + date sort
db.orders.createIndex({ status: 1, createdAt: -1 });
// Heavy query pattern: per-user status lookup
db.orders.createIndex({ userId: 1, status: 1, createdAt: -1 });
Prefix rule — the key concept behind Compound Indexes:
Index: { userId: 1, status: 1, createdAt: -1 }
Query patterns this index covers:
{ userId: "u1" } → first 1 field (prefix)
{ userId: "u1", status: "pending" } → first 2 fields (prefix)
{ userId: "u1", status: "pending", createdAt: ... } → all 3 fields
Query patterns this index does NOT cover:
{ status: "pending" } → no userId prefix — index not used!
{ createdAt: ... } → skips middle field
2-3. Partial Index — The Hidden Gem for Index Size Reduction
Indexes only documents that match a condition. When queries concentrate on a subset of the collection, this dramatically reduces index size while improving performance.
// Example: heavy read on orders where status = "pending"
// 1M total documents, only 50K are pending
// → Up to 95% smaller than a full index
db.orders.createIndex(
{ createdAt: -1, userId: 1 },
{
partialFilterExpression: { status: "pending" }
}
);
// Queries must include status: "pending" to use this index
db.orders.find({ status: "pending", userId: "u1" }).sort({ createdAt: -1 });
2-4. Sparse Index — Exclude Null Fields
Excludes documents that don't have the indexed field. Useful for optional fields.
// Index only documents that have a phoneNumber field
db.users.createIndex(
{ phoneNumber: 1 },
{ sparse: true }
);
2-5. TTL Index — Automatic Document Expiry
Automatically deletes documents past a date threshold. Ideal for sessions, logs, and cached data.
// Auto-delete session data after 24 hours
db.sessions.createIndex(
{ lastAccessedAt: 1 },
{ expireAfterSeconds: 86400 }
);
// Auto-delete log data after 30 days
db.logs.createIndex(
{ createdAt: 1 },
{ expireAfterSeconds: 2592000 }
);
2-6. Wildcard Index — Dynamic Field Structures
Useful when documents have varying field shapes (user-defined metadata, dynamic attributes).
// E-commerce: different attribute structures per product
// { attributes: { color: "red", size: "M" } }
// { attributes: { voltage: "220V", wattage: "100W" } }
// attributes sub-structure varies per document
db.products.createIndex({ "attributes.$**": 1 });
db.products.find({ "attributes.color": "red" }); // uses index
db.products.find({ "attributes.voltage": "220V" }); // uses index
Caution: Wildcard Index covers many field combinations, but a Compound Index is far more efficient for known fields. Use it only when you genuinely need to query arbitrary dynamic fields.
2-7. Hidden Index — Pre-validate Before Dropping
Hides an index from the query planner without deleting it, letting you verify impact before committing.
// Hide the index (query planner stops using it)
db.orders.hideIndex({ status: 1, createdAt: -1 });
// Run your queries and observe performance change
// No impact detected → actually drop it
db.orders.dropIndex({ status: 1, createdAt: -1 });
// Problem found → restore immediately
db.orders.unhideIndex({ status: 1, createdAt: -1 });
3. The Golden Rule of Indexing — ESR
When designing a Compound Index, place fields in ESR (Equality → Sort → Range) order for optimal performance.
E (Equality) → Exact match fields: { field: "exact_value" }
S (Sort) → Sort fields: .sort({ field: 1 })
R (Range) → Range filter fields: { field: { $gte: ..., $lte: ... } }
Practical example:
// Query pattern:
// db.orders.find({
// status: "shipped", Equality (E)
// createdAt: { $gte: last7days } Range (R)
// }).sort({ createdAt: -1 }) Sort (S)
// Wrong index order (Range before Sort)
db.orders.createIndex({ status: 1, createdAt: -1, userId: 1 });
// createdAt serves both Range and Sort roles;
// wrong ordering triggers an in-memory sort
// Correct ESR order
db.orders.createIndex({
status: 1, // E: Equality
createdAt: -1, // S: Sort (also the Range field — Sort takes precedence)
userId: 1 // R: additional filter
});
Why ESR works intuitively:
E first → groups equal values together (minimizes unnecessary scanning)
S second → preserves sort order within each group (no in-memory sort needed)
R last → range filter as the final narrowing step
4. Aggregation Pipeline Optimization
Aggregation Pipeline is powerful, but poorly written pipelines easily hit the 100 MB per-stage memory limit or run for tens of seconds.
4-1. Golden Rule: Push $match and $project to the Front
// Bad: group everything, then filter
db.orders.aggregate([
{ $group: { _id: "$userId", total: { $sum: "$amount" } } },
{ $match: { total: { $gte: 100 } } } // filter after grouping all data
]);
// Good: filter first, then group
db.orders.aggregate([
{ $match: { status: "completed", createdAt: { $gte: last30days } } }, // reduce first
{ $group: { _id: "$userId", total: { $sum: "$amount" } } },
{ $match: { total: { $gte: 100 } } }
]);
// Bad: pass all fields through every stage
db.orders.aggregate([
{ $match: { status: "pending" } },
{ $lookup: { from: "users", localField: "userId", ... } },
{ $project: { orderId: 1, userName: 1 } } // project at the end
]);
// Good: project early to keep only needed fields
db.orders.aggregate([
{ $match: { status: "pending" } },
{ $project: { userId: 1, orderId: 1, amount: 1 } }, // project early
{ $lookup: { from: "users", localField: "userId", ... } }
]);
4-2. $lookup Optimization — The Performance Trap in JOINs
$lookup is convenient but easily becomes the biggest performance bottleneck on large datasets.
// Slow $lookup: no filter, full join
db.orders.aggregate([
{ $lookup: {
from: "products",
localField: "productId",
foreignField: "_id",
as: "product"
}}
]);
// Fast $lookup: pipeline form + field projection
db.orders.aggregate([
{ $match: { status: "pending" } }, // reduce first
{ $lookup: {
from: "products",
let: { pid: "$productId" },
pipeline: [
{ $match: { $expr: { $eq: ["$_id", "$$pid"] } } },
{ $project: { name: 1, price: 1 } } // only needed fields
],
as: "product"
}},
{ $unwind: "$product" }
]);
Design signal: If you need $lookup frequently, that may indicate a schema redesign is warranted. Data that is always queried together should generally be embedded in one document — that is the MongoDB-idiomatic pattern.
4-3. $merge for Materialized Views
If the same expensive aggregation runs repeatedly, pre-storing its result in a separate collection produces large performance gains.
// Runs nightly at 3 AM — stores daily revenue aggregates
db.orders.aggregate([
{ $match: { status: "completed" } },
{ $group: {
_id: { $dateToString: { format: "%Y-%m-%d", date: "$createdAt" } },
dailyRevenue: { $sum: "$amount" },
orderCount: { $sum: 1 }
}},
{ $merge: {
into: "daily_revenue_stats",
on: "_id",
whenMatched: "replace",
whenNotMatched: "insert"
}}
]);
// Reporting queries now hit the aggregated collection, not millions of raw docs
db.daily_revenue_stats.find({ _id: { $gte: "2026-01-01" } });
4-4. allowDiskUse — Bypassing the 100 MB Memory Limit
Each aggregation pipeline stage has a default 100 MB RAM limit. Large aggregations can error when they exceed it.
// Allow spilling to disk when stage exceeds 100 MB
db.orders.aggregate(
[
{ $group: { _id: "$category", total: { $sum: "$amount" } } },
{ $sort: { total: -1 } }
],
{ allowDiskUse: true } // disk I/O will slow things down
);
allowDiskUseis a workaround, not a solution. The right fix is to reduce pipeline input volume with an early$match, or to use indexes that shrink the scan set upstream.
5. Working Set and RAM Sizing
Answering "should I upgrade the cluster tier?" requires understanding the Working Set.
5-1. What Is the Working Set?
The Working Set is the sum of data and indexes actively accessed by live queries. MongoDB's storage engine (WiredTiger) performs best when the Working Set fits in RAM. When it exceeds RAM, disk I/O spikes and performance degrades sharply.
WiredTiger Cache (50% of physical RAM)
Indexes (must stay in RAM at all times) 8 GB
Hot Data (frequently accessed documents) 6 GB
Buffer headroom 2 GB
Total: Working Set = 14 GB
Required WiredTiger Cache = 14 GB or more
Required physical RAM = 28 GB or more (Cache = 50% of RAM, so × 2)
Recommended cluster tier: M40 (32 GB RAM)
5-2. Measuring Working Set Size
// Check index sizes
const stats = await db.collection("orders").stats();
console.log("Total document size:", (stats.size / 1024 / 1024).toFixed(2), "MB");
console.log("Index size:", (stats.totalIndexSize / 1024 / 1024).toFixed(2), "MB");
// Full database stats
const dbStats = await db.stats();
console.log("Total data:", (dbStats.dataSize / 1024 / 1024 / 1024).toFixed(2), "GB");
console.log("Total indexes:", (dbStats.indexSize / 1024 / 1024 / 1024).toFixed(2), "GB");
RAM sizing formula:
Minimum required RAM = (Hot Data size + Total index size) × 2
(Because WiredTiger Cache is 50% of physical RAM)
Example:
Indexes: 8 GB
Hot Data (~25% of total data): 6 GB
Working Set: 14 GB
Required RAM: 14 GB × 2 = 28 GB → M40 (32 GB)
Indexes must live in RAM: When indexes exceed RAM, query performance degrades catastrophically. If index size is growing, removing unused indexes or switching to Partial Indexes should come before a tier upgrade.
5-3. Detecting and Removing Unused Indexes
Indexes speed up reads but slow down writes and consume RAM. Remove indexes that no query is using.
// Query index usage statistics ($indexStats)
const indexUsage = await db.collection("orders").aggregate([
{ $indexStats: {} }
]).toArray();
indexUsage.forEach(idx => {
console.log({
indexName: idx.name,
totalOps: idx.accesses.ops,
since: idx.accesses.since
});
});
// Indexes with ops = 0 or very low are removal candidates
// Use Hidden Index first to verify impact before dropping
6. Connection Pooling — The Bottleneck Teams Miss
Many teams focus solely on indexes and queries, overlooking connection pool misconfiguration as a bottleneck. The problem is especially acute in serverless environments.
6-1. Connection Pool Basics
Without pooling, high traffic exhausts the Atlas connection limit quickly. With a pool, connections are reused, minimizing latency and keeping connection count under control.
6-2. Driver Connection Pool Settings
// Node.js (mongoose)
const mongoose = require('mongoose');
mongoose.connect(uri, {
maxPoolSize: 50, // max connections (default: 5 — too low)
minPoolSize: 10, // minimum connections to maintain
serverSelectionTimeoutMS: 5000, // server selection timeout
socketTimeoutMS: 45000, // socket timeout
maxIdleTimeMS: 30000, // max idle connection lifetime
waitQueueTimeoutMS: 5000 // max wait for available connection
});
# Python (pymongo)
from pymongo import MongoClient
client = MongoClient(
uri,
maxPoolSize=50,
minPoolSize=10,
serverSelectionTimeoutMS=5000,
maxIdleTimeMS=30000,
compressors=['snappy', 'zlib'] # network compression too
)
6-3. Serverless Environments — Special Considerations
In serverless environments like AWS Lambda, Google Cloud Run, and Vercel, instances can be created fresh per request, resetting the connection pool each time.
Problem scenario:
500 Lambda instances running concurrently
Each instance maxPoolSize: 50
Potential connection count: 25,000
Atlas M30 maximum connections: 3,000
→ Connection errors everywhere
Recommended serverless pattern:
// Bad: new client inside the handler every time
exports.handler = async (event) => {
const client = new MongoClient(uri); // new connection every invocation
await client.connect();
// ...
};
// Good: create once at module level (reused on warm starts)
let cachedClient = null;
exports.handler = async (event) => {
if (!cachedClient) {
cachedClient = await new MongoClient(uri, {
maxPoolSize: 1, // 1 is correct for serverless
serverSelectionTimeoutMS: 10000,
socketTimeoutMS: 45000,
}).connect();
}
const db = cachedClient.db("mydb");
// ...
};
7. Auto-scaling Deep Dive — Reactive & Predictive
7-1. Two Auto-scaling Mechanisms
Atlas Auto-scaling runs two mechanisms in parallel.
Reactive Auto-scaling
Traffic spike → CPU/RAM threshold breached → scale-up decision → execute
Reactive — responds after the spike arrives
M10/M20: 5× faster reaction speed (2025 update)
M30+: scales up after 10+ minutes above threshold;
scales down after 24+ hours of low utilization
Predictive Auto-scaling
Learn past patterns (ML) → predict traffic spike → scale up proactively
Proactive — acts before the spike arrives
Especially effective for scheduled batch jobs and day-of-week traffic patterns
Scale-down is still Reactive only
Verify the GA status and supported tiers for Predictive Auto-scaling in the MongoDB official documentation. Auto-scaling does not fix bad queries or schema problems — resolve root causes first, then adjust tier only when genuinely needed.
7-2. Auto-scaling Trigger Conditions (M30+)
Scale-Up trigger:
CPU utilization > 90% sustained for 10+ minutes
or
RAM utilization persistently high
→ Automatically upgrades to the next tier (M30 → M40)
→ Detection starts 10 minutes after the last scaling event
Scale-Down trigger:
All of the following must be true:
CPU < 45% for the last 10 minutes AND last 4 hours
No scaling events in the last 24 hours
No provisioning or restarts in the last 24 hours
Next lower tier is at or above the configured Minimum Tier
→ Downgrades one tier (M40 → M30)
7-3. Scaling Range Strategy
// Set Auto-scaling range via Terraform
resource "mongodbatlas_advanced_cluster" "prod" {
// ...
replication_specs {
region_configs {
auto_scaling {
compute_enabled = true
disk_gb_enabled = true
compute_min_instance_size = "M30"
compute_max_instance_size = "M60"
compute_scale_down_enabled = true
}
}
}
}
Range selection guidelines:
Too wide (risky): M10 ~ M200
Risk of unexpected cost spikes during traffic bursts
Balanced (recommended): M30 ~ M60
Covers typical production traffic variation
Cost-conservative: M30 ~ M40
Tight cost control; may not handle extreme spikes
7-4. Independent Shard Scaling (2025 New Feature)
When traffic concentrates on a single shard (hot shard), individual shard tier settings are now possible.
8. Schema Design Determines Performance
Even perfect indexes have a ceiling if the schema itself is wrong.
8-1. Embed vs Reference — When to Use Each
Embedding (everything in one document)
Use when:
Data is always queried together
1:1 or 1:Few relationship
Sub-data does not need to exist independently
Example: order + shipping address (always fetched together)
{
_id: ObjectId("..."),
orderId: "ORD-001",
amount: 50000,
shippingAddress: {
street: "123 Main St",
city: "Seattle",
zip: "98101"
}
}
Referencing (separate collections)
Use when:
1:Many or Many:Many relationship
Sub-data changes frequently and independently
Arrays can grow without bound (avoid Unbounded Arrays!)
Example: user + order list (orders keep accumulating)
users collection
{ _id: ObjectId("u1"), name: "Jane Doe", email: "jane@example.com" }
orders collection (references userId)
{ _id: ObjectId("o1"), userId: ObjectId("u1"), amount: 50000, ... }
{ _id: ObjectId("o2"), userId: ObjectId("u1"), amount: 30000, ... }
8-2. Unbounded Array Anti-pattern — The Most Common Performance Issue
// Anti-pattern: array that grows indefinitely
{
_id: ObjectId("u1"),
userId: "user123",
orders: [ // document grows without limit as orders accumulate
{ orderId: "o1", amount: 50000 },
{ orderId: "o2", amount: 30000 },
// thousands of entries → risks hitting the 16 MB document size limit
// index performance degrades sharply
]
}
// Correct pattern: separate collection
// users: { _id, userId, ... }
// orders: { _id, userId, amount, ... } ← index on userId
8-3. Schema Performance Suggestions
Atlas Performance Advisor detects schema design problems in addition to missing indexes.
Issues it detects:
Array too large (oversized array fields)
Collection join overuse (excessive $lookup dependency)
Redundant indexes (duplicate indexes)
Document too large (very large documents)
High null ratio (fields that are mostly null)
9. The Performance Optimization Workflow
First action by bottleneck type:
CPU high → Check Performance Advisor for queries not using indexes
RAM high → Measure Working Set; upgrade tier or remove unused indexes
IOPS high → Excessive disk reads = Working Set exceeds RAM → trace to RAM shortage
Connections → Check maxPoolSize settings; serverless client caching
Rolling Index Build — create indexes without downtime:
# Default in Atlas UI is Rolling Build
# When creating via CLI:
atlas clusters indexes create \
--clusterName prod-cluster \
--db myDatabase \
--collection orders \
--key '{"status": 1, "createdAt": -1}' \
--name "idx_status_createdAt"
Part 4 Summary
| Optimization Area | Key Point |
|---|---|
| Diagnostic tools | Performance Advisor (auto-suggest) → Query Profiler (deep analysis) → explain() |
| Index selection | Choose type by query shape (Compound, Partial, TTL, Wildcard); validate with Hidden before dropping |
| ESR rule | Compound Index field order: Equality → Sort → Range |
| Aggregation | Push $match/$project forward; minimize $lookup; materialize with $merge |
| Working Set | Indexes + Hot Data must fit in WiredTiger Cache = RAM × 50% |
| Connection Pool | Set maxPoolSize appropriately; cache client at module level in serverless |
| Auto-scaling | Reactive + Predictive in parallel; set sensible Min/Max range; per-shard scaling |
| Schema design | No unbounded arrays; embed data that is always queried together |
References
- MongoDB Atlas Documentation: Performance Advisor
- MongoDB Atlas Documentation: Query Profiler
- MongoDB Manual: Indexes and Compound Indexes
- MongoDB Atlas Documentation: Cluster Auto-scaling
- MongoDB Manual: Aggregation Pipeline Optimization