Thursday, June 4, 2026

MongoDB Atlas Complete Guide Part 2 — Cluster Design Strategy: Flex vs Dedicated, Multi-Region, and Cost Optimization

Understand the 2025 Atlas cluster tier restructuring — Flex GA replacing Serverless and Shared — and learn how to pick the right tier for your workload. Covers the M30 production baseline, the 3-region 5-node high-availability pattern, shard key selection principles, and cost optimization patterns including Auto-scaling bounds, Analytics Nodes, and Online Archive.

Series

  • Part 1 | Concepts, architecture, and why Atlas matters
  • Part 2 ← You are here | Cluster Design Strategy
  • Part 3 | Security, Networking, and Access Control
  • Part 4 | Performance Optimization — Indexing, Query Tuning, Auto-scaling
  • Part 5 | Atlas in the AI Era — Vector Search, Stream Processing, RAG Pipelines

Table of Contents

  1. The 2025 Atlas Tier Restructuring — The End of Serverless
  2. The Four Cluster Tiers Explained
  3. Tier Selection Guide — Which One Is Right for You?
  4. Multi-Region and Multi-Cloud Design Strategy
  5. Replica Set vs Sharding — When Does Sharding Make Sense?
  6. Cost Optimization Patterns
  7. Cluster Design Checklist
  8. Practical application notes

1. The 2025 Atlas Tier Restructuring — The End of Serverless

If you have been using MongoDB Atlas for a while, you have probably noticed that the tier lineup has changed significantly. The once-popular Serverless Instance and Shared Tier (M2/M5) were officially deprecated in early 2025, and Atlas Flex Tier now fully replaces both.

This matters because it is not a simple rename. Flex simultaneously addresses two problems: the unpredictable billing of Serverless and the feature limitations of Shared Tier. MongoDB migrates accounts still running Serverless or M2/M5 to Flex automatically, and the legacy resources reached end-of-life on January 22, 2026.


2. The Four Cluster Tiers Explained

M0 — Free Tier

Item | Details
Price | Completely free, no credit card required
Storage | 512 MB
RAM / vCPU | Shared (minimum spec)
Cloud | AWS, GCP, or Azure
Restrictions | One per project, no production use, no Vector Search

M0 is for learning, prototyping, and personal projects. It comes with connection limits, IOPS limits, and no backup support. It cannot be used for production.


Flex Tier — GA February 2025

Flex is the unified replacement for the old Serverless and Shared (M2/M5) tiers. Its core value is the combination of predictable cost and variable scaling.

Item | Details
Base fee | $8/month (includes 100 ops/sec)
Maximum charge | $30/month cap (no overages)
Storage | 5 GB
Peak throughput | 500 ops/sec (burst)
Supported features | Atlas Search, Vector Search, Change Streams, Triggers
Production use | Possible for light traffic workloads

Flex billing estimates (reference only):

Scenario: 100 ops/sec × 20 days + 250 ops/sec × 5 days + 500 ops/sec × 5 days
→ Estimated charge: ~$13.67

Scenario: 100 ops/sec × 20 days, then deleted
→ Estimated charge: ~$5.28
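
The two estimates above can be reproduced with a short calculation. This is a minimal sketch, not the official billing formula: the throughput bands and hourly rates in `FLEX_BANDS` are assumptions that do not appear in this article (they were chosen because they reproduce both figures), and a day is treated as 24 billable hours. Only the $30 cap comes from the table above.

```javascript
// Assumed hourly rates per throughput band (hypothetical values, not from this article).
const FLEX_BANDS = [
  { maxOps: 100, hourlyUsd: 0.011 },
  { maxOps: 200, hourlyUsd: 0.0205 },
  { maxOps: 300, hourlyUsd: 0.0288 },
  { maxOps: 400, hourlyUsd: 0.0356 },
  { maxOps: 500, hourlyUsd: 0.0411 },
];
const MONTHLY_CAP_USD = 30; // cap stated in the Flex table

// Pick the hourly rate of the first band that covers the given throughput.
function hourlyRate(opsPerSec) {
  const band = FLEX_BANDS.find((b) => opsPerSec <= b.maxOps);
  if (!band) throw new RangeError(`Flex tops out at 500 ops/sec, got ${opsPerSec}`);
  return band.hourlyUsd;
}

// usage: array of { opsPerSec, days } periods within one billing month.
function estimateFlexBill(usage) {
  const total = usage.reduce(
    (sum, { opsPerSec, days }) => sum + hourlyRate(opsPerSec) * days * 24,
    0
  );
  // Round to cents and never exceed the monthly cap.
  return Math.min(Math.round(total * 100) / 100, MONTHLY_CAP_USD);
}

console.log(estimateFlexBill([
  { opsPerSec: 100, days: 20 },
  { opsPerSec: 250, days: 5 },
  { opsPerSec: 500, days: 5 },
])); // 13.67
console.log(estimateFlexBill([{ opsPerSec: 100, days: 20 }])); // 5.28
```

Treat the output as a planning estimate only; check the current Atlas pricing page before budgeting.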

When Flex is a good fit:

  • MVP / side projects
  • Development / staging environments
  • Early-stage startups with irregular traffic
  • Vector Search feature testing

When Flex is not a good fit:

  • Sustained high-traffic services (a Dedicated tier fits better once you regularly hit the 500 ops/sec ceiling)
  • Multi-region or high-availability configurations
  • Advanced network security requirements (VPC Peering, Private Endpoint)

Dedicated Tier — M10 through M700

The backbone of production workloads. Dedicated tiers unlock the Atlas features that M0 and Flex lack, and from M30 up the vCPUs are dedicated rather than shared.

Tier | RAM | vCPU | Storage | Monthly cost (AWS, reference) | Primary use
M10 | 2 GB | Shared | 10 GB+ | ~$57 | Dev/staging, small apps
M20 | 4 GB | Shared | 20 GB+ | ~$100 | Small production
M30 | 8 GB | Dedicated | 40 GB+ | ~$210 | General production (recommended baseline)
M40 | 16 GB | Dedicated | 80 GB+ | ~$390 | Mid-size production
M60 | 32 GB | Dedicated | 160 GB+ | ~$700 | High-traffic services
M80 | 64 GB | Dedicated | 320 GB+ | ~$1,400 | Large enterprise
M140+ | 192 GB+ | Dedicated | 1 TB+ | ~$3,500+ | Massive workloads

M10 and M20 use shared vCPUs, making them vulnerable to CPU-intensive workloads. The recommended production baseline is M30, which guarantees dedicated vCPU. With Auto-scaling enabled, Atlas moves the cluster up and down between the minimum and maximum tiers you configure (for example, M30 through M60) based on load.

Features that require a Dedicated tier (M10 and above unless noted):

  • Sharded clusters (M30 and above)
  • Multi-region / multi-cloud deployments
  • Private Endpoint / VPC Peering
  • LDAP, X.509 authentication
  • Encryption at Rest (bring-your-own KMS key)
  • Online Archive (automatic cold storage tiering)


3. Tier Selection Guide — Which One Is Right for You?

Recommended configurations for the most common real-world scenarios.

Scenario A: Individual developer / side project

Recommended: M0 (free) → upgrade to Flex when needed
Reason: Zero cost to start, cluster ready in 5 minutes

Scenario B: Early-stage startup (MAU under 10,000)

Recommended: Flex Tier
Reason: $8–$30/month predictable billing, includes Vector Search, absorbs traffic spikes
Upgrade signal: Sustained CPU above 70%, or hitting the 500 ops/sec throughput ceiling → upgrade to M30

Scenario C: General production service

Recommended: M30 + Auto-scaling (M30–M60 range)
Reason: Dedicated vCPU, stable SLA (99.995% for M10+), full feature support
Backup: Enable Continuous Cloud Backup (PITR) — required for production

Scenario D: High-volume data + global service

Recommended: M50+ + multi-region (3-region 5-node pattern)
Reason: Survive full-region failures without downtime, minimize global read latency
Cost: Expect roughly 2–3x the cost of a single-region configuration

Scenario E: Global enterprise / compliance

Recommended: Global Cluster (M30+) or independent per-region clusters
Reason: GDPR / data sovereignty compliance, regional data isolation
Core rule: EU data stays in EU regions, domestic data stays in domestic regions
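
The five scenarios can be condensed into a small decision helper. This is a rough sketch under stated assumptions: the function name and the input fields (`production`, `peakOpsPerSec`, `multiRegion`, `dataResidency`) are hypothetical, and the only hard threshold used is the 500 ops/sec Flex ceiling quoted above. Real tier selection also depends on CPU, storage, and feature needs.

```javascript
const FLEX_MAX_OPS = 500; // Flex throughput ceiling from the tier table

// Map a coarse workload description to the scenario recommendations A through E.
function recommendTier({
  production = false,
  peakOpsPerSec = 0,
  multiRegion = false,
  dataResidency = false,
} = {}) {
  // Scenario E: regional data isolation drives the design.
  if (dataResidency) return 'Global Cluster (M30+) or independent per-region clusters';
  // Scenario D: must survive a full-region outage.
  if (multiRegion) return 'M50+ with the 3-region 5-node pattern';
  // Scenario C: beyond what Flex can serve.
  if (peakOpsPerSec > FLEX_MAX_OPS) return 'M30 + Auto-scaling (M30 to M60)';
  // Scenario B: light production traffic.
  if (production) return 'Flex';
  // Scenario A: learning, prototypes, side projects.
  return 'M0, then Flex when needed';
}

console.log(recommendTier({ production: true, peakOpsPerSec: 120 }));
console.log(recommendTier({ production: true, peakOpsPerSec: 900 }));
```

The checks run from the most constraining requirement to the least, so a compliance requirement wins over raw throughput.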

4. Multi-Region and Multi-Cloud Design Strategy

Foundation: Distributing a Replica Set Across Regions

Every Atlas cluster is a 3-node Replica Set by default. In a single-region deployment, Atlas automatically spreads the three nodes across three Availability Zones, absorbing AZ-level failures.

Single-region covers AZ failures with automatic failover, but a full region outage takes the service down. Multi-region is necessary to survive region-level failures.

Multi-Region Configuration — 3-Region 5-Node Pattern (M10+)

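This pattern places two electable nodes in Region A (the preferred primary region), two in Region B, and one in Region C. Failure behavior for this configuration: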

  • Region A fully fails → Region B automatically promoted to Primary
  • Region B fully fails → Region A + C maintain quorum (3 of 5 nodes)
  • Theoretical availability: 99.999% or better
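
The failure behavior follows from simple majority arithmetic: a replica set can elect a primary only while more than half of its voting nodes are reachable. The sketch below assumes the 2 + 2 + 1 layout of electable nodes; the region labels and the `hasQuorum` helper are illustrative names, not part of any Atlas API.

```javascript
// Electable (voting) nodes per region in the 3-region 5-node pattern.
const threeRegion = { A: 2, B: 2, C: 1 };
// For comparison: the same five nodes squeezed into two regions.
const twoRegion = { A: 3, B: 2 };

// True if the surviving regions still hold a strict majority of voting nodes.
function hasQuorum(layout, failedRegions) {
  const total = Object.values(layout).reduce((sum, n) => sum + n, 0);
  const alive = Object.entries(layout)
    .filter(([region]) => !failedRegions.includes(region))
    .reduce((sum, [, n]) => sum + n, 0);
  return alive > total / 2;
}

console.log(hasQuorum(threeRegion, ['A'])); // true  (B + C = 3 of 5)
console.log(hasQuorum(threeRegion, ['B'])); // true  (A + C = 3 of 5)
console.log(hasQuorum(threeRegion, ['A', 'B'])); // false (only 1 of 5 left)
console.log(hasQuorum(twoRegion, ['A'])); // false (2 of 5 left)
```

The last line shows why the third region matters: with only two regions, losing the larger one always costs the majority.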

Terraform example for a multi-region cluster:

resource "mongodbatlas_advanced_cluster" "prod" {
  project_id   = var.project_id
  name         = "prod-cluster"
  cluster_type = "REPLICASET"

  replication_specs {
    region_configs {
      provider_name = "AWS"
      region_name   = "AP_NORTHEAST_2"  # Seoul
      priority      = 7
      electable_specs {
        instance_size = "M30"
        node_count    = 2
      }
    }

    region_configs {
      provider_name = "AWS"
      region_name   = "AP_SOUTHEAST_1"  # Singapore
      priority      = 6
      electable_specs {
        instance_size = "M30"
        node_count    = 2
      }
    }

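    region_configs {
      provider_name = "AWS"
      region_name   = "AP_EAST_1"       # Hong Kong
      priority      = 5
      # The fifth node must be electable so that it counts toward the
      # 3-of-5 majority. Analytics nodes can never become primary, and
      # Atlas expects an odd number of electable nodes (3, 5, or 7).
      electable_specs {
        instance_size = "M30"
        node_count    = 1
      }
    }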
  }
}

Multi-Cloud Configuration — Breaking Vendor Lock-in

Going one step further, Atlas supports running across AWS, Azure, and GCP simultaneously. Atlas is the only managed MongoDB service that offers this.

Reasons to use multi-cloud:

  • Automatic failover when a specific cloud provider experiences an outage
  • Leverage for cloud vendor negotiations (avoiding lock-in)
  • Using cloud-specific strengths (AI on GCP, main workload on AWS, etc.)
  • Meeting data sovereignty requirements (when only a specific cloud is permitted in a given country)

Drawbacks:

  • Cross-cloud traffic costs (ingress/egress)
  • Encryption key management (KMS) separated per cloud
  • Private Endpoint must be configured independently per cloud
  • Increased operational complexity

Most services can achieve what they need with multi-region (single cloud). Multi-cloud is worth evaluating for enterprises with strict compliance requirements in sectors like finance or healthcare.

Global Cluster — For True Global Services

When you need to offer local writes to users worldwide, consider Global Cluster.

Global Cluster supports up to 9 zones and 70 shards, with each zone independently handling its own reads and writes. Asian users' data goes to the Asia shard, EU users' data to the EU shard — satisfying both GDPR compliance and low-latency access simultaneously. That said, the design complexity is high. This is overkill unless you genuinely serve a global audience.


5. Replica Set vs Sharding — When Does Sharding Make Sense?

When to Introduce Sharding

More data does not automatically mean sharding is the right answer.

Situation | Recommended configuration
Data under a few hundred GB, predictable traffic | Single Replica Set with appropriate tier
Data in the multiple-TB range, or write throughput bottleneck | Sharded cluster (M30+)
Global per-region data isolation required | Global Cluster (sharding-based)
Only read traffic is overloaded | Replica Set + Analytics Node

Shard Key Selection — The Most Important Decision

The shard key choice defines how data is distributed across shards. A poor choice leads to "hotspots" — one shard absorbing all the load.

// Bad shard key: monotonically increasing time-based field
// (all writes pile onto the last shard)
{ createdAt: 1 }

// Bad shard key: cardinality too low (e.g., gender)
{ gender: 1 }

// Good shard key: high cardinality + even distribution
{ userId: "hashed" }           // hash-based even distribution
{ region: 1, userId: 1 }       // compound key (region-scoped queries + even distribution)
{ _id: "hashed" }              // universally safe general-purpose choice

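Before MongoDB 5.0, a shard key could not be changed once set. Newer versions support resharding a collection, but that is an expensive operation on a large dataset, so take the time to get this right before committing.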
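
The hotspot effect is easy to see in a toy simulation. The sketch below is a simplified model, not MongoDB's actual chunk logic: it assumes four shards, fixed range boundaries derived from keys already in the collection, and an MD5-based hash reduced modulo the shard count. The function names are illustrative.

```javascript
const crypto = require('crypto');

const SHARDS = 4;

// Ranged model: boundaries were chosen from existing keys (0..999).
// The last range is open-ended, so every larger key lands on the last shard.
function rangedShard(key) {
  const upperBounds = [250, 500, 750]; // exclusive upper bounds of shards 0..2
  const idx = upperBounds.findIndex((upper) => key < upper);
  return idx === -1 ? SHARDS - 1 : idx;
}

// Hashed model: hash the key first, then spread hash values over the shards.
function hashedShard(key) {
  const digest = crypto.createHash('md5').update(String(key)).digest();
  return digest.readUInt32BE(0) % SHARDS;
}

// Count how many keys each shard receives under the given placement function.
function distribute(keys, shardFn) {
  const counts = new Array(SHARDS).fill(0);
  for (const key of keys) counts[shardFn(key)] += 1;
  return counts;
}

// New writes carry ever-increasing keys, like a createdAt timestamp.
const newKeys = Array.from({ length: 10000 }, (_, i) => 1000 + i);
const ranged = distribute(newKeys, rangedShard);
const hashed = distribute(newKeys, hashedShard);

console.log('ranged:', ranged); // [ 0, 0, 0, 10000 ]
console.log('hashed:', hashed); // roughly even across the four shards
```

In a real cluster the balancer eventually splits and migrates chunks, but at any given moment the newest writes still target a single shard, which is the bottleneck the simulation illustrates.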


6. Cost Optimization Patterns

Pattern 1: Keep Auto-scaling Bounds Narrow

Auto-scaling is a double-edged sword. Set the ceiling too high and a single traffic spike can push the cluster to the top of the range, along with the bill.

Recommended: M30 (minimum) – M50 (maximum)
Avoid:       M30 (minimum) – M140 (maximum)  ← risk of an unexpected large bill

Pattern 2: Offload Reporting Queries to Analytics Nodes

Running heavy aggregation or reporting queries against the production Primary leads to performance degradation and unnecessary tier upgrades. Add a dedicated Analytics Node and route analytical reads there instead.

// Route to Analytics Node in driver configuration (Node.js)
// `uri` is your Atlas connection string.
const { MongoClient } = require('mongodb');

const client = new MongoClient(uri, {
  readPreference: 'secondary',
  readPreferenceTags: [{ nodeType: 'ANALYTICS' }]
});

Pattern 3: Automatically Tier Cold Data with Online Archive

Leaving aging log data in the Primary cluster drives up storage costs. Online Archive automatically moves data matching your date rule to cloud object storage managed by Atlas, and the archived data remains queryable through the federated connection string that Atlas provides for the cluster.

// Archive rule: automatically move data older than 30 days
{
  "dataProcessRegion": { "cloudProvider": "AWS", "region": "AP_SOUTHEAST_2" },
  "criteria": {
    "type": "DATE",
    "dateField": "createdAt",
    "dateFormat": "ISODATE",
    "expireAfterDays": 30
  }
}
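
To reason about which documents a rule like this would pick up, the date check can be sketched locally. This is an illustration only: `isArchivable` is a hypothetical helper, it assumes `createdAt` is stored as a real date value, and it treats "older than 30 days" as strictly greater than 30 days. Atlas evaluates the rule server-side and its exact boundary handling may differ.

```javascript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Mirrors the criteria fields of the archive rule above.
const criteria = {
  type: 'DATE',
  dateField: 'createdAt',
  dateFormat: 'ISODATE',
  expireAfterDays: 30,
};

// True if the document's date field is older than the configured age.
function isArchivable(doc, rule, now = new Date()) {
  const value = doc[rule.dateField];
  if (!(value instanceof Date)) return false; // missing or non-date field: never archived here
  const ageDays = (now.getTime() - value.getTime()) / MS_PER_DAY;
  return ageDays > rule.expireAfterDays;
}

const now = new Date('2026-06-04T00:00:00Z');
console.log(isArchivable({ createdAt: new Date('2026-04-01T00:00:00Z') }, criteria, now)); // true
console.log(isArchivable({ createdAt: new Date('2026-05-20T00:00:00Z') }, criteria, now)); // false
```

Running a check like this against a sample of your data before enabling the rule gives a feel for how much would leave the cluster on the first pass.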

Pattern 4: Auto-Pause Non-Production Clusters Outside Business Hours

Running dev and staging clusters around the clock is wasteful. Use the Atlas CLI with a scheduler to automate pause and resume.

# Pause the staging cluster at 11 PM on weekdays
atlas clusters pause staging-cluster --projectId <PROJECT_ID>

# Resume at 8 AM on weekdays
atlas clusters start staging-cluster --projectId <PROJECT_ID>
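
The schedule in the comments above (running on weekdays from 8 AM to 11 PM, paused otherwise) can be expressed as a small helper that a scheduler calls. This is a sketch under assumptions: the function names are hypothetical, times are read in the scheduler's local time zone, and the helper only builds the command string; it does not run the CLI.

```javascript
// Decide which CLI action matches the schedule: running on weekdays 08:00-23:00.
function desiredAction(now) {
  const day = now.getDay(); // 0 = Sunday ... 6 = Saturday
  const hour = now.getHours();
  const isWeekday = day >= 1 && day <= 5;
  const inBusinessHours = hour >= 8 && hour < 23;
  return isWeekday && inBusinessHours ? 'start' : 'pause';
}

// Build the Atlas CLI command for the given cluster and point in time.
function cliCommand(clusterName, projectId, now = new Date()) {
  return `atlas clusters ${desiredAction(now)} ${clusterName} --projectId ${projectId}`;
}

// Thursday, June 4, 2026 at 09:00 local time.
console.log(cliCommand('staging-cluster', '<PROJECT_ID>', new Date(2026, 5, 4, 9, 0)));
// Saturday, June 6, 2026 at 12:00 local time.
console.log(cliCommand('staging-cluster', '<PROJECT_ID>', new Date(2026, 5, 6, 12, 0)));
```

A scheduler should issue the command only when the desired state changes, since asking an already running cluster to start (or a paused one to pause) may return an error.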

Pattern 5: Reduce Data Transfer Costs

Data transfer (network egress) is a surprisingly significant portion of Atlas bills.

Cost order, lowest to highest:
1. Within the same region         → free or minimal
2. Same cloud, different regions  → moderate
3. Different clouds               → high
4. Outbound egress to the internet → highest

Reduction tips:
- Deploy application servers in the same region as your Atlas cluster
- Use query Projection to minimize data returned per query
- Enable network compression in the driver
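// Enable network compression (Node.js)
// 'snappy' needs the optional snappy package installed; zlib is built in.
const { MongoClient } = require('mongodb');

const client = new MongoClient(uri, {
  compressors: ['snappy', 'zlib'],
  zlibCompressionLevel: 6
});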

7. Cluster Design Checklist

Work through this list before provisioning a cluster.

Basic Design

  • Tier selection: choose the right tier for your workload (Flex / M10 / M30+)
  • Cloud and region: match your application server's cloud and region
  • MongoDB version: use the latest stable release (8.x)
  • Cluster name: include environment in the name (prod-myapp, staging-myapp, etc.)

Availability and Recovery

  • Auto-scaling: always enable on Dedicated clusters
  • Continuous Cloud Backup (PITR): required for production (minute-level recovery)
  • Multi-region: use the 3-region 5-node pattern if targeting 99.999% availability
  • Maintenance Window: schedule during your lowest-traffic window (e.g., 3–5 AM)

Security (covered in depth in Part 3)

  • IP Access List: apply least-privilege access
  • Database User: per-application dedicated users with minimal permissions
  • Private Endpoint: strongly recommended for M10+ production clusters

Cost Management

  • Billing Alert: set an alert at 150% of expected monthly spend
  • Online Archive: configure automatic archiving for cold data older than 30 days
  • Non-production clusters: schedule automatic pause during nights and weekends

Part 2 Summary

Key Point | Detail
Serverless retired | Replaced by Flex Tier in 2025 ($8–$30/month cap)
Production baseline | M30 + Auto-scaling is the practical starting point
Multi-region recommendation | 3-region 5-node pattern (99.999% availability target)
Sharding threshold | Multiple TBs of data, or a write throughput bottleneck
Shard key principles | High cardinality + even distribution + no easy rollback
Top cost optimizations | Narrow Auto-scaling bounds + Online Archive + non-production auto-pause
