2026.05.25 · 22 min read · Lv.2 Beginner

Series: MongoDB Atlas Complete Guide · Part 2

MongoDB Atlas Complete Guide Part 2 — Cluster Design Strategy: Flex vs Dedicated, Multi-Region, and Cost Optimization

Understand the 2025 Atlas cluster tier restructuring — Flex GA replacing Serverless and Shared — and learn how to pick the right tier for your workload. Covers the M30 production baseline, the 3-region 5-node high-availability pattern, shard key selection principles, and cost optimization patterns including Auto-scaling bounds, Analytics Nodes, and Online Archive.

Series

Part 1 | Concepts, architecture, and why Atlas matters

Part 2 ← You are here | Cluster Design Strategy

Part 3 | Security, Networking, and Access Control

Part 4 | Performance Optimization — Indexing, Query Tuning, Auto-scaling

Part 5 | Atlas in the AI Era — Vector Search, Stream Processing, RAG Pipelines

The 2025 Atlas Tier Restructuring — The End of Serverless
The Four Cluster Tiers Explained
Tier Selection Guide — Which One Is Right for You?
Multi-Region and Multi-Cloud Design Strategy
Replica Set vs Sharding — When Does Sharding Make Sense?
Cost Optimization Patterns
Cluster Design Checklist
Practical application notes

1. The 2025 Atlas Tier Restructuring — The End of Serverless

If you have been using MongoDB Atlas for a while, you have probably noticed that the tier lineup has changed significantly. The once-popular Serverless Instance and Shared Tier (M2/M5) were officially deprecated in early 2025, and Atlas Flex Tier now fully replaces both.

This matters because it is not a simple rename. Flex addresses two problems at once: the unpredictable billing of Serverless and the feature limitations of Shared Tier. MongoDB migrated accounts still running Serverless or M2/M5 to Flex automatically, and the legacy resources reached end-of-life on January 22, 2026.

2. The Four Cluster Tiers Explained

M0 — Free Tier

Item	Details
Price	Completely free, no credit card required
Storage	512 MB
RAM / vCPU	Shared (minimum spec)
Cloud	AWS, GCP, or Azure
Restrictions	One per project, no production use, no Vector Search

M0 is for learning, prototyping, and personal projects. It comes with connection limits, IOPS limits, and no backup support. It cannot be used for production.

Flex Tier — GA February 2025

Flex is the unified replacement for the old Serverless and Shared (M2/M5) tiers. Its core value is the combination of predictable cost and variable scaling.

Item	Details
Base fee	$8/month (includes 100 ops/sec)
Maximum charge	$30/month cap (no overages)
Storage	5 GB
Peak throughput	500 ops/sec (burst)
Supported features	Atlas Search, Vector Search, Change Streams, Triggers
Production use	Possible for light traffic workloads

Flex billing estimates (reference only):

Scenario: 100 ops/sec × 20 days + 250 ops/sec × 5 days + 500 ops/sec × 5 days
→ Estimated charge: ~$13.67

Scenario: 100 ops/sec × 20 days, then deleted
→ Estimated charge: ~$5.28
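
The estimates above can be approximated with a small calculator. This is a minimal sketch, not Atlas's billing logic: only the $8 base and the $30 cap come from the table above, while the intermediate price bands and the 730-hour billing month are assumptions made for illustration.

```javascript
// ASSUMPTION: monthly price per ops/sec band. Only the $8 floor and the
// $30 cap appear in the table above; the bands in between are illustrative.
const FLEX_BANDS = [
  { maxOps: 100, monthlyUsd: 8 },
  { maxOps: 200, monthlyUsd: 15 },
  { maxOps: 300, monthlyUsd: 21 },
  { maxOps: 400, monthlyUsd: 26 },
  { maxOps: 500, monthlyUsd: 30 },
];
const HOURS_PER_MONTH = 730; // ASSUMPTION: length of a billing month in hours

// Hourly rate for the band that covers the given throughput.
function hourlyRate(opsPerSec) {
  const band = FLEX_BANDS.find((b) => opsPerSec <= b.maxOps);
  if (!band) {
    throw new RangeError(`Flex tops out at 500 ops/sec, got ${opsPerSec}`);
  }
  return band.monthlyUsd / HOURS_PER_MONTH;
}

// usage: array of { opsPerSec, days } periods; returns USD rounded to cents.
function estimateFlexCharge(usage) {
  const total = usage.reduce(
    (sum, period) => sum + hourlyRate(period.opsPerSec) * period.days * 24,
    0
  );
  return Math.round(total * 100) / 100;
}

const mixed = estimateFlexCharge([
  { opsPerSec: 100, days: 20 },
  { opsPerSec: 250, days: 5 },
  { opsPerSec: 500, days: 5 },
]);
const shortLived = estimateFlexCharge([{ opsPerSec: 100, days: 20 }]);
console.log({ mixed, shortLived });
```

Under these assumptions both scenarios land within a few cents of the figures quoted above. The small gap comes from how the hourly rate is rounded.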

When Flex is a good fit:

MVP / side projects
Development / staging environments
Early-stage startups with irregular traffic
Vector Search feature testing

When Flex is not a good fit:

Sustained high-traffic services (throughput tops out at 500 ops/sec, so a Dedicated tier fits better)
Multi-region or high-availability configurations
Advanced network security requirements (VPC Peering, Private Endpoint)

Dedicated Tier — M10 through M700

The backbone of production workloads. Each cluster runs on its own instances, vCPU is fully dedicated from M30 upward, and the full Atlas feature set opens up (a few features start at M30, as noted below).

Tier	RAM	vCPU	Storage	Monthly cost (AWS, reference)	Primary use
M10	2 GB	Shared	10 GB+	~$57	Dev/staging, small apps
M20	4 GB	Shared	20 GB+	~$100	Small production
M30	8 GB	Dedicated	40 GB+	~$210	General production (recommended baseline)
M40	16 GB	Dedicated	80 GB+	~$390	Mid-size production
M60	32 GB	Dedicated	160 GB+	~$700	High-traffic services
M80	64 GB	Dedicated	320 GB+	~$1,400	Large enterprise
M140+	192 GB+	Dedicated	1 TB+	~$3,500+	Massive workloads

M10 and M20 use shared vCPUs, making them vulnerable to CPU-intensive workloads. The recommended production baseline is M30, which guarantees dedicated vCPU. With Auto-scaling enabled, Atlas automatically moves the cluster between tiers within the bounds you set (for example, M30 through M60) based on load.

Features that require a Dedicated cluster (M10 and above unless noted):

Sharded clusters (M30 and above)
Multi-region / multi-cloud deployments
Private Endpoint / VPC Peering
LDAP, X.509 authentication
Encryption at Rest (bring-your-own KMS key)
Online Archive (automatic cold storage tiering)

[Figure: Tier overview at a glance]

3. Tier Selection Guide — Which One Is Right for You?

Recommended configurations for the most common real-world scenarios.

Scenario A: Individual developer / side project

Recommended: M0 (free) → upgrade to Flex when needed
Reason: Zero cost to start, cluster ready in 5 minutes

Scenario B: Early-stage startup (MAU under 10,000)

Recommended: Flex Tier
Reason: $8–$30/month predictable billing, includes Vector Search, absorbs traffic spikes
Upgrade signal: Sustained CPU above 70%, or hitting the 500 ops/sec throughput ceiling → upgrade to M30
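
The upgrade signal for Scenario B can be turned into a simple check that runs against your monitoring data. A minimal sketch: the two thresholds come from the line above, while the metric field names are hypothetical and need to be mapped to whatever your monitoring exports.

```javascript
// Thresholds come from the Scenario B upgrade signal.
// ASSUMPTION: the metric field names are hypothetical placeholders.
const FLEX_UPGRADE_THRESHOLDS = { sustainedCpuPct: 70, opsPerSecCeiling: 500 };

// Returns the list of reasons to move from Flex to M30 (empty = stay on Flex).
function flexUpgradeReasons({ sustainedCpuPct, peakOpsPerSec }) {
  const reasons = [];
  if (sustainedCpuPct > FLEX_UPGRADE_THRESHOLDS.sustainedCpuPct) {
    reasons.push(`sustained CPU ${sustainedCpuPct}% is above 70%`);
  }
  if (peakOpsPerSec >= FLEX_UPGRADE_THRESHOLDS.opsPerSecCeiling) {
    reasons.push(`peak ${peakOpsPerSec} ops/sec hits the 500 ops/sec ceiling`);
  }
  return reasons;
}

console.log(flexUpgradeReasons({ sustainedCpuPct: 45, peakOpsPerSec: 180 }));
console.log(flexUpgradeReasons({ sustainedCpuPct: 82, peakOpsPerSec: 500 }));
```

Returning the reasons rather than a bare boolean makes the result easy to drop into an alert message.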

Scenario C: General production service

Recommended: M30 + Auto-scaling (M30–M60 range)
Reason: Dedicated vCPU, stable SLA (99.995% for M10+), full feature support
Backup: Enable Continuous Cloud Backup (PITR) — required for production

Scenario D: High-volume data + global service

Recommended: M50+ + multi-region (3-region 5-node pattern)
Reason: Survive full-region failures without downtime, minimize global read latency
Cost: Expect roughly 2–3x the cost of a single-region configuration

Scenario E: Global enterprise / compliance

Recommended: Global Cluster (M30+) or independent per-region clusters
Reason: GDPR / data sovereignty compliance, regional data isolation
Core rule: EU data stays in EU regions, domestic data stays in domestic regions

4. Multi-Region and Multi-Cloud Design Strategy

Foundation: Distributing a Replica Set Across Regions

Every Atlas cluster is a 3-node Replica Set by default. In a single-region deployment, Atlas automatically spreads the three nodes across three Availability Zones, absorbing AZ-level failures.

Single-region covers AZ failures with automatic failover, but a full region outage takes the service down. Multi-region is necessary to survive region-level failures.

Multi-Region Configuration — 3-Region 5-Node Pattern (M10+)

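This pattern places two electable nodes in Region A (the preferred primary region), two in Region B, and one in Region C. Failure behavior for this configuration: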

Region A fully fails → Region B automatically promoted to Primary
Region B fully fails → Region A + C maintain quorum (3 of 5 nodes)
Theoretical availability: 99.999% or better
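
The quorum arithmetic behind these failure cases is easy to check by hand or in code. The sketch below assumes the 2 + 2 + 1 split of voting nodes across Regions A, B, and C and treats "a primary can be elected" as "a strict majority of voting nodes is still reachable". The function and variable names are illustrative.

```javascript
// Voting (electable) nodes per region in the 3-region 5-node pattern.
const layout = { A: 2, B: 2, C: 1 };

// True if the surviving regions still hold a strict majority of voting nodes.
function survivesOutage(nodesByRegion, failedRegions) {
  const total = Object.values(nodesByRegion).reduce((a, b) => a + b, 0);
  const majority = Math.floor(total / 2) + 1; // 3 of 5
  const alive = Object.entries(nodesByRegion)
    .filter(([region]) => !failedRegions.includes(region))
    .reduce((sum, [, count]) => sum + count, 0);
  return alive >= majority;
}

for (const failed of [["A"], ["B"], ["C"], ["A", "B"]]) {
  const outcome = survivesOutage(layout, failed) ? "primary available" : "no primary";
  console.log(`${failed.join("+")} down -> ${outcome}`);
}
```

Any single region can fail without losing the majority, but losing A and B together leaves one node out of five, which is why the pattern protects against one region outage at a time.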

Terraform example for a multi-region cluster:

resource "mongodbatlas_advanced_cluster" "prod" {
  project_id   = var.project_id
  name         = "prod-cluster"
  cluster_type = "REPLICASET"

  replication_specs {
    region_configs {
      provider_name = "AWS"
      region_name   = "AP_NORTHEAST_2"  # Seoul
      priority      = 7
      electable_specs {
        instance_size = "M30"
        node_count    = 2
      }
    }

    region_configs {
      provider_name = "AWS"
      region_name   = "AP_SOUTHEAST_1"  # Singapore
      priority      = 6
      electable_specs {
        instance_size = "M30"
        node_count    = 2
      }
    }

    region_configs {
      provider_name = "AWS"
      region_name   = "AP_EAST_1"       # Hong Kong
      priority      = 5
      # Electable so that this node votes: 2 + 2 + 1 gives the 5-node quorum
      electable_specs {
        instance_size = "M30"
        node_count    = 1
      }
    }
  }
}

Multi-Cloud Configuration — Breaking Vendor Lock-in

Going one step further, Atlas supports running across AWS, Azure, and GCP simultaneously. Atlas is the only managed MongoDB service that offers this.

Reasons to use multi-cloud:

Automatic failover when a specific cloud provider experiences an outage
Leverage for cloud vendor negotiations (avoiding lock-in)
Using cloud-specific strengths (AI on GCP, main workload on AWS, etc.)
Meeting data sovereignty requirements (when only a specific cloud is permitted in a given country)

Drawbacks:

Cross-cloud traffic costs (ingress/egress)
Encryption key management (KMS) separated per cloud
Private Endpoint must be configured independently per cloud
Increased operational complexity

Most services can achieve what they need with multi-region (single cloud). Multi-cloud is worth evaluating for enterprises with strict compliance requirements in sectors like finance or healthcare.

Global Cluster — For True Global Services

When you need to offer local writes to users worldwide, consider Global Cluster.

Global Cluster supports up to 9 zones and 70 shards, with each zone independently handling its own reads and writes. Asian users' data goes to the Asia shard, EU users' data to the EU shard — satisfying both GDPR compliance and low-latency access simultaneously. That said, the design complexity is high. This is overkill unless you genuinely serve a global audience.

5. Replica Set vs Sharding — When Does Sharding Make Sense?

When to Introduce Sharding

More data does not automatically mean sharding is the right answer.

Situation	Recommended configuration
Data under a few hundred GB, predictable traffic	Single Replica Set with appropriate tier
Data in the multiple-TB range, or write throughput bottleneck	Sharded cluster (M30+)
Global per-region data isolation required	Global Cluster (sharding-based)
Read traffic only is overloaded	Replica Set + Analytics Node

Shard Key Selection — The Most Important Decision

The shard key choice defines how data is distributed across shards. A poor choice leads to "hotspots" — one shard absorbing all the load.

// Bad shard key: monotonically increasing time-based field
// (all writes pile onto the last shard)
{ createdAt: 1 }

// Bad shard key: cardinality too low (e.g., gender)
{ gender: 1 }

// Good shard key: high cardinality + even distribution
{ userId: "hashed" }           // hash-based even distribution
{ region: 1, userId: 1 }       // compound key (region-scoped queries + even distribution)
{ _id: "hashed" }              // reasonable default when no better key exists (range queries on _id hit every shard)

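Before MongoDB 5.0, a shard key could not be changed once set. Newer versions can reshard a collection (reshardCollection), but on a large collection that is a heavy, slow operation, so take the time to get this right before committing.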
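
To see why a monotonically increasing key creates a hotspot, it helps to simulate where new writes land. The sketch below is a deliberately simplified model: four shards, fixed range boundaries, and a stand-in hash (the MurmurHash3 32-bit finalizer) with modulo placement. MongoDB's real hashed sharding and chunk balancing work differently, so read this as an illustration of the distribution effect only.

```javascript
const SHARDS = 4;

// Ranged sharding: shard i owns keys below upperBounds[i]; the last shard
// owns everything up to infinity. Boundaries reflect data that already exists.
const upperBounds = [250, 500, 750, Infinity];
function rangeShard(key) {
  return upperBounds.findIndex((bound) => key < bound);
}

// ASSUMPTION: stand-in integer hash, not the hash MongoDB uses.
function mix32(n) {
  let h = n >>> 0;
  h ^= h >>> 16;
  h = Math.imul(h, 0x85ebca6b);
  h ^= h >>> 13;
  h = Math.imul(h, 0xc2b2ae35);
  h ^= h >>> 16;
  return h >>> 0;
}
function hashedShard(key) {
  return mix32(key) % SHARDS;
}

// Count how many keys each shard receives under a placement function.
function distribute(keys, pickShard) {
  const counts = new Array(SHARDS).fill(0);
  for (const key of keys) counts[pickShard(key)] += 1;
  return counts;
}

// 10,000 new writes whose key keeps growing past the existing ranges,
// the way a timestamp or auto-increment value would.
const newKeys = Array.from({ length: 10000 }, (_, i) => 1000 + i);
const ranged = distribute(newKeys, rangeShard);
const hashed = distribute(newKeys, hashedShard);
console.log("ranged:", ranged);
console.log("hashed:", hashed);
```

With the ranged key every new write falls into the last shard's open-ended range, while the hashed key spreads the same writes roughly evenly across all four shards.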

6. Cost Optimization Patterns

Pattern 1: Keep Auto-scaling Bounds Narrow

Auto-scaling is a double-edged sword. Set the ceiling too high and a traffic spike can send you straight to the most expensive tier in the range.

Recommended: M30 (minimum) – M50 (maximum)
Avoid:       M30 (minimum) – M140 (maximum)  ← risk of an unexpected large bill

Pattern 2: Offload Reporting Queries to Analytics Nodes

Running heavy aggregation or reporting queries against the production Primary leads to performance degradation and unnecessary tier upgrades. Add a dedicated Analytics Node and route analytical reads there instead.

// Route to Analytics Node in driver configuration (Node.js)
const { MongoClient } = require('mongodb');

// `uri` is your Atlas connection string
const client = new MongoClient(uri, {
  readPreference: 'secondary',
  readPreferenceTags: [{ nodeType: 'ANALYTICS' }]
});

Pattern 3: Automatically Tier Cold Data with Online Archive

Leaving aging log data in the primary cluster drives up storage costs. Online Archive automatically moves documents that match your date rule to lower-cost, Atlas-managed cloud object storage. The archived data stays queryable, but through a separate federated connection string rather than the cluster's own.

// Archive rule: automatically move data older than 30 days
{
  "dataProcessRegion": { "cloudProvider": "AWS", "region": "AP_SOUTHEAST_2" },
  "criteria": {
    "type": "DATE",
    "dateField": "createdAt",
    "dateFormat": "ISODATE",
    "expireAfterDays": 30
  }
}

Pattern 4: Auto-Pause Non-Production Clusters Outside Business Hours

Running dev and staging clusters around the clock is wasteful. Use the Atlas CLI with a scheduler such as cron to automate pause and resume. Pausing is available on Dedicated (M10+) clusters, and Atlas resumes a paused cluster on its own after 30 days.

# Pause the staging cluster (schedule for 11 PM on weekdays)
atlas clusters pause staging-cluster --projectId <PROJECT_ID>

# Resume it (schedule for 8 AM on weekdays)
atlas clusters start staging-cluster --projectId <PROJECT_ID>
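
To get a feel for how much this schedule saves, you can count the paused hours in a week. A minimal sketch, assuming the cluster is paused at 11 PM and resumed at 8 AM on weekdays only, so it stays paused from Friday night until Monday morning. The function name is illustrative.

```javascript
const PAUSE_HOUR = 23; // 11 PM
const RESUME_HOUR = 8; // 8 AM
const isWeekday = (day) => day >= 1 && day <= 5; // 0 = Sunday ... 6 = Saturday

// Walk through one week hour by hour and count the hours spent paused.
function pausedHoursPerWeek() {
  let paused = true; // entering Sunday 00:00, paused since Friday 11 PM
  let pausedHours = 0;
  for (let day = 0; day < 7; day++) {
    for (let hour = 0; hour < 24; hour++) {
      if (isWeekday(day) && hour === RESUME_HOUR) paused = false;
      if (isWeekday(day) && hour === PAUSE_HOUR) paused = true;
      if (paused) pausedHours += 1;
    }
  }
  return pausedHours;
}

const pausedHours = pausedHoursPerWeek();
const pausedShare = Math.round((pausedHours / 168) * 100);
console.log(`${pausedHours} of 168 hours paused (~${pausedShare}%)`);
```

This schedule keeps the cluster paused for 93 of 168 hours, roughly 55% of the week. Storage is typically still billed while a cluster is paused, so the saving applies to the compute portion of the bill.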

Pattern 5: Reduce Data Transfer Costs

Data transfer (network egress) is a surprisingly significant portion of Atlas bills.

Cost order, lowest to highest:
1. Within the same region         → free or minimal
2. Same cloud, different regions  → moderate
3. Different clouds               → high
4. Outbound egress to the internet → highest

Reduction tips:
- Deploy application servers in the same region as your Atlas cluster
- Use query Projection to minimize data returned per query
- Enable network compression in the driver

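// Enable network compression (Node.js)
const { MongoClient } = require('mongodb');

const client = new MongoClient(uri, {
  // snappy needs the optional `snappy` package; zlib is built in
  compressors: ['snappy', 'zlib'],
  zlibCompressionLevel: 6
});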

7. Cluster Design Checklist

Work through this list before provisioning a cluster.

Basic Design

Tier selection: choose the right tier for your workload (Flex / M10 / M30+)
Cloud and region: match your application server's cloud and region
MongoDB version: use the latest stable release (8.x)
Cluster name: include environment in the name (prod-myapp, staging-myapp, etc.)

Availability and Recovery

Auto-scaling: always enable on Dedicated clusters
Continuous Cloud Backup (PITR): required for production (minute-level recovery)
Multi-region: use the 3-region 5-node pattern if targeting 99.999% availability
Maintenance Window: schedule during your lowest-traffic window (e.g., 3–5 AM)

Security (covered in depth in Part 3)

IP Access List: apply least-privilege access
Database User: per-application dedicated users with minimal permissions
Private Endpoint: strongly recommended for M10+ production clusters

Cost Management

Billing Alert: set an alert at 150% of expected monthly spend
Online Archive: configure automatic archiving for cold data older than 30 days
Non-production clusters: schedule automatic pause during nights and weekends

Part 2 Summary

Key Point	Detail
Serverless retired	Replaced by Flex Tier in 2025 ($8–$30/month cap)
Production baseline	M30 + Auto-scaling is the practical starting point
Multi-region recommendation	3-region 5-node pattern (99.999% availability target)
Sharding threshold	Multiple TBs of data, or a write throughput bottleneck
Shard key principles	High cardinality + even distribution + no easy rollback
Top cost optimizations	Narrow Auto-scaling bounds + Online Archive + non-production auto-pause