Thursday, June 4, 2026
All posts
Lv.2 BeginnerMongoDB
28 min readLv.2 Beginner
SeriesMongoDB Atlas Complete Guide · Part 4View series hub

MongoDB Atlas Complete Guide Part 4 — Performance Optimization: Indexing, Query Tuning, and Auto-scaling

MongoDB Atlas Complete Guide Part 4 — Performance Optimization: Indexing, Query Tuning, and Auto-scaling

Atlas performance optimization starts with diagnosis, not guesswork. Use Performance Advisor, Query Profiler, and explain() to find bottlenecks, design Compound/Partial/TTL/Wildcard indexes with the ESR rule, optimize Aggregation Pipelines, and connect Working Set, Connection Pooling, and Reactive/Predictive Auto-scaling into one operational workflow.

Series

  • Part 1 | Concepts, architecture, and why Atlas matters
  • Part 2 | Cluster Design Strategy
  • Part 3 | Security, Networking, and Access Control
  • Part 4 ← You are here | Performance Optimization — Indexing, Query Tuning, Auto-scaling
  • Part 5 | Atlas in the AI Era — Vector Search, Stream Processing, RAG Pipelines

Table of Contents

  1. Seeing the Problem — Three Diagnostic Tools
  2. Index Types and When to Use Each
  3. The Golden Rule of Indexing — ESR
  4. Aggregation Pipeline Optimization
  5. Working Set and RAM Sizing
  6. Connection Pooling — The Bottleneck Teams Miss
  7. Auto-scaling Deep Dive — Reactive & Predictive
  8. Schema Design Determines Performance
  9. The Performance Optimization Workflow

1. Seeing the Problem — Three Diagnostic Tools

Before changing indexes or tiers, identify exactly where the bottleneck lives. Atlas provides three key tools for this.

1-1. Performance Advisor — The Core of Automated Index Recommendations

Performance Advisor automatically collects queries taking longer than 100 ms and ranks index suggestions by Impact.

Key metrics to understand:

MetricMeaningWarning sign
Docs ExaminedDocuments scanned during query executionMore than 10× Docs Returned is dangerous
Docs ReturnedDocuments actually returnedBaseline
Keys ExaminedIndex keys scanned0 means no index used (full collection scan)
Avg Query TargetingDocs Examined / Docs Returned ratioClose to 1 is optimal; above 1000 is severe
ImpactExpected performance gain from adding the indexProcess High → Medium → Low in order
// Performance Advisor detection example
// Namespace: shop.orders
// Avg execution time: 1,243ms | Daily executions: 8,742 | Impact: HIGH

// Currently slow queries
db.orders.find({ status: "pending" }).sort({ createdAt: -1 })
db.orders.find({ status: "shipped", userId: "user123" })

// Index suggested by Performance Advisor
// { status: 1, createdAt: -1 }  → covers both queries above

2025 update: Performance Advisor is now accessible via natural language through the MongoDB MCP Server, from AI clients such as Claude, Cursor, and GitHub Copilot. Query Profiler also gained shareable URLs for collaborating on slow-query analysis.

Before applying any recommendation, review it against existing indexes for redundancy, the write load on the collection, and the RAM impact of a new index.

1-2. Query Profiler — Dissecting Individual Queries

If Performance Advisor answers "which index should I create," Query Profiler answers "why is this specific query slow."

Critical patterns to look for:

Danger pattern 1: Docs Examined >> Docs Returned
  Example: Examined: 2,000,000 / Returned: 5
  → Scanning the entire collection. Missing or ineffective index.

Danger pattern 2: Keys Examined = 0
  → Full Collection Scan (COLLSCAN)
  → Add an index immediately

Danger pattern 3: hasSort: true with no index-backed sort
  → In-memory sort occurring
  → Include the sort field in the index

Healthy pattern:
  Docs Examined ≈ Docs Returned
  Keys Examined > 0
  hasIndexCoverage: true

1-3. explain() — Reading the Execution Plan Directly

Use this when you need the execution plan from within your code.

// Check the execution plan (plan only, no query execution)
const plan = await db.collection("orders")
  .find({ status: "pending", userId: "user123" })
  .sort({ createdAt: -1 })
  .explain("executionStats");

// Key fields to check
console.log(plan.queryPlanner.winningPlan.stage);
// "COLLSCAN"           → no index
// "IXSCAN"             → index in use
// "FETCH"              → index lookup then document fetch
// "PROJECTION_COVERED" → covered index (best performance)

console.log(plan.executionStats.totalDocsExamined);   // docs scanned
console.log(plan.executionStats.totalDocsReturned);   // docs returned
console.log(plan.executionStats.executionTimeMillis); // execution time (ms)

Execution plan stage reference:

COLLSCAN           → Full collection scan (slow, needs an index)
IXSCAN             → Index scan (good)
FETCH              → Index lookup followed by document fetch (normal)
SORT               → In-memory sort (use index-backed sort instead)
SORT_KEY_GENERATOR → Sort key generation (appears alongside SORT)
PROJECTION_COVERED → Covered index (best possible performance)

2. Index Types and When to Use Each

MongoDB supports several index types. Picking the right one for the query shape is half the optimization work. Indexes speed up reads but slow down writes and consume RAM — create only what you need.

2-1. Single Field Index

The foundation. An index on one field.

// Frequent lookups by userId
db.users.createIndex({ userId: 1 });     // ascending
db.users.createIndex({ createdAt: -1 }); // descending (latest-first sort)

// Unique index
db.users.createIndex({ email: 1 }, { unique: true });

2-2. Compound Index — Most Commonly Used

Combines multiple fields into one index. Far more efficient than separate single-field indexes when queries filter or sort on multiple fields simultaneously.

// Heavy query pattern: status filter + date sort
db.orders.createIndex({ status: 1, createdAt: -1 });

// Heavy query pattern: per-user status lookup
db.orders.createIndex({ userId: 1, status: 1, createdAt: -1 });

Prefix rule — the key concept behind Compound Indexes:

Index: { userId: 1, status: 1, createdAt: -1 }

Query patterns this index covers:
  { userId: "u1" }                                    → first 1 field (prefix)
  { userId: "u1", status: "pending" }                 → first 2 fields (prefix)
  { userId: "u1", status: "pending", createdAt: ... } → all 3 fields

Query patterns this index does NOT cover:
  { status: "pending" }   → no userId prefix — index not used!
  { createdAt: ... }      → skips middle field

2-3. Partial Index — The Hidden Gem for Index Size Reduction

Indexes only documents that match a condition. When queries concentrate on a subset of the collection, this dramatically reduces index size while improving performance.

// Example: heavy read on orders where status = "pending"
// 1M total documents, only 50K are pending
// → Up to 95% smaller than a full index

db.orders.createIndex(
  { createdAt: -1, userId: 1 },
  {
    partialFilterExpression: { status: "pending" }
  }
);

// Queries must include status: "pending" to use this index
db.orders.find({ status: "pending", userId: "u1" }).sort({ createdAt: -1 });

2-4. Sparse Index — Exclude Null Fields

Excludes documents that don't have the indexed field. Useful for optional fields.

// Index only documents that have a phoneNumber field
db.users.createIndex(
  { phoneNumber: 1 },
  { sparse: true }
);

2-5. TTL Index — Automatic Document Expiry

Automatically deletes documents past a date threshold. Ideal for sessions, logs, and cached data.

// Auto-delete session data after 24 hours
db.sessions.createIndex(
  { lastAccessedAt: 1 },
  { expireAfterSeconds: 86400 }
);

// Auto-delete log data after 30 days
db.logs.createIndex(
  { createdAt: 1 },
  { expireAfterSeconds: 2592000 }
);

2-6. Wildcard Index — Dynamic Field Structures

Useful when documents have varying field shapes (user-defined metadata, dynamic attributes).

// E-commerce: different attribute structures per product
// { attributes: { color: "red", size: "M" } }
// { attributes: { voltage: "220V", wattage: "100W" } }
// attributes sub-structure varies per document

db.products.createIndex({ "attributes.$**": 1 });

db.products.find({ "attributes.color": "red" });    // uses index
db.products.find({ "attributes.voltage": "220V" }); // uses index

Caution: Wildcard Index covers many field combinations, but a Compound Index is far more efficient for known fields. Use it only when you genuinely need to query arbitrary dynamic fields.

2-7. Hidden Index — Pre-validate Before Dropping

Hides an index from the query planner without deleting it, letting you verify impact before committing.

// Hide the index (query planner stops using it)
db.orders.hideIndex({ status: 1, createdAt: -1 });

// Run your queries and observe performance change
// No impact detected → actually drop it
db.orders.dropIndex({ status: 1, createdAt: -1 });

// Problem found → restore immediately
db.orders.unhideIndex({ status: 1, createdAt: -1 });

3. The Golden Rule of Indexing — ESR

When designing a Compound Index, place fields in ESR (Equality → Sort → Range) order for optimal performance.

E (Equality) → Exact match fields:  { field: "exact_value" }
S (Sort)     → Sort fields:          .sort({ field: 1 })
R (Range)    → Range filter fields:  { field: { $gte: ..., $lte: ... } }

Practical example:

// Query pattern:
// db.orders.find({
//   status: "shipped",               Equality (E)
//   createdAt: { $gte: last7days }   Range (R)
// }).sort({ createdAt: -1 })         Sort (S)

// Wrong index order (Range before Sort)
db.orders.createIndex({ status: 1, createdAt: -1, userId: 1 });
// createdAt serves both Range and Sort roles;
// wrong ordering triggers an in-memory sort

// Correct ESR order
db.orders.createIndex({
  status: 1,      // E: Equality
  createdAt: -1,  // S: Sort (also the Range field — Sort takes precedence)
  userId: 1       // R: additional filter
});

Why ESR works intuitively:

E first  → groups equal values together (minimizes unnecessary scanning)
S second → preserves sort order within each group (no in-memory sort needed)
R last   → range filter as the final narrowing step

4. Aggregation Pipeline Optimization

Aggregation Pipeline is powerful, but poorly written pipelines easily hit the 100 MB per-stage memory limit or run for tens of seconds.

4-1. Golden Rule: Push $match and $project to the Front

// Bad: group everything, then filter
db.orders.aggregate([
  { $group: { _id: "$userId", total: { $sum: "$amount" } } },
  { $match: { total: { $gte: 100 } } }  // filter after grouping all data
]);

// Good: filter first, then group
db.orders.aggregate([
  { $match: { status: "completed", createdAt: { $gte: last30days } } }, // reduce first
  { $group: { _id: "$userId", total: { $sum: "$amount" } } },
  { $match: { total: { $gte: 100 } } }
]);
// Bad: pass all fields through every stage
db.orders.aggregate([
  { $match: { status: "pending" } },
  { $lookup: { from: "users", localField: "userId", ... } },
  { $project: { orderId: 1, userName: 1 } }  // project at the end
]);

// Good: project early to keep only needed fields
db.orders.aggregate([
  { $match: { status: "pending" } },
  { $project: { userId: 1, orderId: 1, amount: 1 } }, // project early
  { $lookup: { from: "users", localField: "userId", ... } }
]);

4-2. $lookup Optimization — The Performance Trap in JOINs

$lookup is convenient but easily becomes the biggest performance bottleneck on large datasets.

// Slow $lookup: no filter, full join
db.orders.aggregate([
  { $lookup: {
    from: "products",
    localField: "productId",
    foreignField: "_id",
    as: "product"
  }}
]);

// Fast $lookup: pipeline form + field projection
db.orders.aggregate([
  { $match: { status: "pending" } }, // reduce first
  { $lookup: {
    from: "products",
    let: { pid: "$productId" },
    pipeline: [
      { $match: { $expr: { $eq: ["$_id", "$$pid"] } } },
      { $project: { name: 1, price: 1 } }  // only needed fields
    ],
    as: "product"
  }},
  { $unwind: "$product" }
]);

Design signal: If you need $lookup frequently, that may indicate a schema redesign is warranted. Data that is always queried together should generally be embedded in one document — that is the MongoDB-idiomatic pattern.

4-3. $merge for Materialized Views

If the same expensive aggregation runs repeatedly, pre-storing its result in a separate collection produces large performance gains.

// Runs nightly at 3 AM — stores daily revenue aggregates
db.orders.aggregate([
  { $match: { status: "completed" } },
  { $group: {
    _id: { $dateToString: { format: "%Y-%m-%d", date: "$createdAt" } },
    dailyRevenue: { $sum: "$amount" },
    orderCount: { $sum: 1 }
  }},
  { $merge: {
    into: "daily_revenue_stats",
    on: "_id",
    whenMatched: "replace",
    whenNotMatched: "insert"
  }}
]);

// Reporting queries now hit the aggregated collection, not millions of raw docs
db.daily_revenue_stats.find({ _id: { $gte: "2026-01-01" } });

4-4. allowDiskUse — Bypassing the 100 MB Memory Limit

Each aggregation pipeline stage has a default 100 MB RAM limit. Large aggregations can error when they exceed it.

// Allow spilling to disk when stage exceeds 100 MB
db.orders.aggregate(
  [
    { $group: { _id: "$category", total: { $sum: "$amount" } } },
    { $sort: { total: -1 } }
  ],
  { allowDiskUse: true }  // disk I/O will slow things down
);

allowDiskUse is a workaround, not a solution. The right fix is to reduce pipeline input volume with an early $match, or to use indexes that shrink the scan set upstream.


5. Working Set and RAM Sizing

Answering "should I upgrade the cluster tier?" requires understanding the Working Set.

5-1. What Is the Working Set?

The Working Set is the sum of data and indexes actively accessed by live queries. MongoDB's storage engine (WiredTiger) performs best when the Working Set fits in RAM. When it exceeds RAM, disk I/O spikes and performance degrades sharply.

WiredTiger Cache (50% of physical RAM)

  Indexes (must stay in RAM at all times)    8 GB
  Hot Data (frequently accessed documents)  6 GB
  Buffer headroom                            2 GB
  Total: Working Set = 14 GB

Required WiredTiger Cache = 14 GB or more
Required physical RAM = 28 GB or more  (Cache = 50% of RAM, so × 2)
Recommended cluster tier: M40 (32 GB RAM)

5-2. Measuring Working Set Size

// Check index sizes
const stats = await db.collection("orders").stats();
console.log("Total document size:", (stats.size / 1024 / 1024).toFixed(2), "MB");
console.log("Index size:", (stats.totalIndexSize / 1024 / 1024).toFixed(2), "MB");

// Full database stats
const dbStats = await db.stats();
console.log("Total data:", (dbStats.dataSize / 1024 / 1024 / 1024).toFixed(2), "GB");
console.log("Total indexes:", (dbStats.indexSize / 1024 / 1024 / 1024).toFixed(2), "GB");

RAM sizing formula:

Minimum required RAM = (Hot Data size + Total index size) × 2
(Because WiredTiger Cache is 50% of physical RAM)

Example:
  Indexes: 8 GB
  Hot Data (~25% of total data): 6 GB
  Working Set: 14 GB
  Required RAM: 14 GB × 2 = 28 GB → M40 (32 GB)

Indexes must live in RAM: When indexes exceed RAM, query performance degrades catastrophically. If index size is growing, removing unused indexes or switching to Partial Indexes should come before a tier upgrade.

5-3. Detecting and Removing Unused Indexes

Indexes speed up reads but slow down writes and consume RAM. Remove indexes that no query is using.

// Query index usage statistics ($indexStats)
const indexUsage = await db.collection("orders").aggregate([
  { $indexStats: {} }
]).toArray();

indexUsage.forEach(idx => {
  console.log({
    indexName: idx.name,
    totalOps: idx.accesses.ops,
    since: idx.accesses.since
  });
});

// Indexes with ops = 0 or very low are removal candidates
// Use Hidden Index first to verify impact before dropping

6. Connection Pooling — The Bottleneck Teams Miss

Many teams focus solely on indexes and queries, overlooking connection pool misconfiguration as a bottleneck. The problem is especially acute in serverless environments.

6-1. Connection Pool Basics

Without pooling, high traffic exhausts the Atlas connection limit quickly. With a pool, connections are reused, minimizing latency and keeping connection count under control.

6-2. Driver Connection Pool Settings

// Node.js (mongoose)
const mongoose = require('mongoose');

mongoose.connect(uri, {
  maxPoolSize: 50,                   // max connections (default: 5 — too low)
  minPoolSize: 10,                   // minimum connections to maintain
  serverSelectionTimeoutMS: 5000,    // server selection timeout
  socketTimeoutMS: 45000,            // socket timeout
  maxIdleTimeMS: 30000,              // max idle connection lifetime
  waitQueueTimeoutMS: 5000           // max wait for available connection
});
# Python (pymongo)
from pymongo import MongoClient

client = MongoClient(
    uri,
    maxPoolSize=50,
    minPoolSize=10,
    serverSelectionTimeoutMS=5000,
    maxIdleTimeMS=30000,
    compressors=['snappy', 'zlib']  # network compression too
)

6-3. Serverless Environments — Special Considerations

In serverless environments like AWS Lambda, Google Cloud Run, and Vercel, instances can be created fresh per request, resetting the connection pool each time.

Problem scenario:
500 Lambda instances running concurrently
Each instance maxPoolSize: 50
Potential connection count: 25,000

Atlas M30 maximum connections: 3,000
→ Connection errors everywhere

Recommended serverless pattern:

// Bad: new client inside the handler every time
exports.handler = async (event) => {
  const client = new MongoClient(uri); // new connection every invocation
  await client.connect();
  // ...
};

// Good: create once at module level (reused on warm starts)
let cachedClient = null;

exports.handler = async (event) => {
  if (!cachedClient) {
    cachedClient = await new MongoClient(uri, {
      maxPoolSize: 1,                  // 1 is correct for serverless
      serverSelectionTimeoutMS: 10000,
      socketTimeoutMS: 45000,
    }).connect();
  }
  const db = cachedClient.db("mydb");
  // ...
};

7. Auto-scaling Deep Dive — Reactive & Predictive

7-1. Two Auto-scaling Mechanisms

Atlas Auto-scaling runs two mechanisms in parallel.

Reactive Auto-scaling
  Traffic spike → CPU/RAM threshold breached → scale-up decision → execute
  Reactive — responds after the spike arrives
  M10/M20: 5× faster reaction speed (2025 update)
  M30+: scales up after 10+ minutes above threshold;
         scales down after 24+ hours of low utilization

Predictive Auto-scaling
  Learn past patterns (ML) → predict traffic spike → scale up proactively
  Proactive — acts before the spike arrives
  Especially effective for scheduled batch jobs and day-of-week traffic patterns
  Scale-down is still Reactive only

Verify the GA status and supported tiers for Predictive Auto-scaling in the MongoDB official documentation. Auto-scaling does not fix bad queries or schema problems — resolve root causes first, then adjust tier only when genuinely needed.

7-2. Auto-scaling Trigger Conditions (M30+)

Scale-Up trigger:

CPU utilization > 90% sustained for 10+ minutes
  or
RAM utilization persistently high
→ Automatically upgrades to the next tier (M30 → M40)
→ Detection starts 10 minutes after the last scaling event

Scale-Down trigger:

All of the following must be true:
  CPU < 45% for the last 10 minutes AND last 4 hours
  No scaling events in the last 24 hours
  No provisioning or restarts in the last 24 hours
  Next lower tier is at or above the configured Minimum Tier
→ Downgrades one tier (M40 → M30)

7-3. Scaling Range Strategy

// Set Auto-scaling range via Terraform
resource "mongodbatlas_advanced_cluster" "prod" {
  // ...
  replication_specs {
    region_configs {
      auto_scaling {
        compute_enabled            = true
        disk_gb_enabled            = true
        compute_min_instance_size  = "M30"
        compute_max_instance_size  = "M60"
        compute_scale_down_enabled = true
      }
    }
  }
}

Range selection guidelines:

Too wide (risky):        M10 ~ M200
  Risk of unexpected cost spikes during traffic bursts

Balanced (recommended):  M30 ~ M60
  Covers typical production traffic variation

Cost-conservative:       M30 ~ M40
  Tight cost control; may not handle extreme spikes

7-4. Independent Shard Scaling (2025 New Feature)

When traffic concentrates on a single shard (hot shard), individual shard tier settings are now possible.


8. Schema Design Determines Performance

Even perfect indexes have a ceiling if the schema itself is wrong.

8-1. Embed vs Reference — When to Use Each

Embedding (everything in one document)
  Use when:
    Data is always queried together
    1:1 or 1:Few relationship
    Sub-data does not need to exist independently

  Example: order + shipping address (always fetched together)
  {
    _id: ObjectId("..."),
    orderId: "ORD-001",
    amount: 50000,
    shippingAddress: {
      street: "123 Main St",
      city: "Seattle",
      zip: "98101"
    }
  }

Referencing (separate collections)
  Use when:
    1:Many or Many:Many relationship
    Sub-data changes frequently and independently
    Arrays can grow without bound (avoid Unbounded Arrays!)

  Example: user + order list (orders keep accumulating)
  users collection
  { _id: ObjectId("u1"), name: "Jane Doe", email: "jane@example.com" }

  orders collection (references userId)
  { _id: ObjectId("o1"), userId: ObjectId("u1"), amount: 50000, ... }
  { _id: ObjectId("o2"), userId: ObjectId("u1"), amount: 30000, ... }

8-2. Unbounded Array Anti-pattern — The Most Common Performance Issue

// Anti-pattern: array that grows indefinitely
{
  _id: ObjectId("u1"),
  userId: "user123",
  orders: [           // document grows without limit as orders accumulate
    { orderId: "o1", amount: 50000 },
    { orderId: "o2", amount: 30000 },
    // thousands of entries → risks hitting the 16 MB document size limit
    // index performance degrades sharply
  ]
}

// Correct pattern: separate collection
// users: { _id, userId, ... }
// orders: { _id, userId, amount, ... }  ← index on userId

8-3. Schema Performance Suggestions

Atlas Performance Advisor detects schema design problems in addition to missing indexes.

Issues it detects:
  Array too large          (oversized array fields)
  Collection join overuse  (excessive $lookup dependency)
  Redundant indexes        (duplicate indexes)
  Document too large       (very large documents)
  High null ratio          (fields that are mostly null)

9. The Performance Optimization Workflow

First action by bottleneck type:

CPU high     → Check Performance Advisor for queries not using indexes
RAM high     → Measure Working Set; upgrade tier or remove unused indexes
IOPS high    → Excessive disk reads = Working Set exceeds RAM → trace to RAM shortage
Connections  → Check maxPoolSize settings; serverless client caching

Rolling Index Build — create indexes without downtime:

# Default in Atlas UI is Rolling Build
# When creating via CLI:
atlas clusters indexes create \
  --clusterName prod-cluster \
  --db myDatabase \
  --collection orders \
  --key '{"status": 1, "createdAt": -1}' \
  --name "idx_status_createdAt"

Part 4 Summary

Optimization AreaKey Point
Diagnostic toolsPerformance Advisor (auto-suggest) → Query Profiler (deep analysis) → explain()
Index selectionChoose type by query shape (Compound, Partial, TTL, Wildcard); validate with Hidden before dropping
ESR ruleCompound Index field order: Equality → Sort → Range
AggregationPush $match/$project forward; minimize $lookup; materialize with $merge
Working SetIndexes + Hot Data must fit in WiredTiger Cache = RAM × 50%
Connection PoolSet maxPoolSize appropriately; cache client at module level in serverless
Auto-scalingReactive + Predictive in parallel; set sensible Min/Max range; per-shard scaling
Schema designNo unbounded arrays; embed data that is always queried together

References

  • MongoDB Atlas Documentation: Performance Advisor
  • MongoDB Atlas Documentation: Query Profiler
  • MongoDB Manual: Indexes and Compound Indexes
  • MongoDB Atlas Documentation: Cluster Auto-scaling
  • MongoDB Manual: Aggregation Pipeline Optimization

Share This Article

Series Navigation

MongoDB Atlas Complete Guide

Current part 4 · 5 published

Explore this topic·Start with featured series

한국어

Follow new posts via RSS

Use RSS to get new posts and series updates directly.

Open RSS Guide