Tuesday, April 14, 2026
Volume 1.3

MongoDB and DocumentDB in 10 Minutes — A Practical Primer

MongoDB and AWS DocumentDB are names you hear often, yet it is hard to grasp in one pass what they are and how they differ. This guide is for readers who know relational databases but find document-oriented NoSQL unfamiliar. In about ten minutes of reading it covers what a document is, the roles of MongoDB and DocumentDB, the limits of API compatibility, selection criteria, collections, BSON, and embed-versus-reference modeling. It aims for a grounded mental model without hype, and leaves you knowing what to open next in the official docs or a local sandbox.

If you have used relational databases but MongoDB and DocumentDB still feel fuzzy, this article maps the big picture. Expect about ten minutes from start to finish.

Table of contents

  1. Introduction — what is a “document”?
  2. Relational vs document databases — key differences
  3. What is MongoDB?
  4. What is AWS DocumentDB? (MongoDB-compatible API)
  5. MongoDB vs AWS DocumentDB — which one?
  6. Core concepts — collection, document, BSON
  7. Modeling at a glance — embed vs reference
  8. Minimal query cheat sheet
  9. When a document database is a good fit
  10. Wrap-up — ten-minute summary

1. Introduction — what is a “document”?

Databases such as MySQL and PostgreSQL store data in rows and columns—a table.

Real-world data does not always fit that grid.

Suppose you store user profiles.

  • Some users have one phone number
  • Some have three
  • Some have none

In a relational database you typically either add a separate phone-numbers table and join to it, or reserve several phone columns and leave the unused ones NULL.

A document database stores what you actually have.

{
  "name": "Jane Doe",
  "phones": ["+1-555-0100", "+1-555-0199"],
  "address": {
    "city": "Seattle",
    "district": "Capitol Hill"
  }
}

That JSON-like blob is a document.
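To see what this means for application code, here is a minimal sketch in plain JavaScript with no database involved. The `users` array and the `phoneCount` helper are hypothetical names chosen for illustration. The only point is that documents in one collection may differ in shape, so code that reads them handles an absent field rather than a NULL column.

```javascript
// Three user documents with different shapes; none needs a NULL placeholder.
const users = [
  { name: 'Jane Doe', phones: ['+1-555-0100', '+1-555-0199'] },
  { name: 'Sam Lee', phones: ['+1-555-0142'] },
  { name: 'Ana Cruz' }, // no phones field at all
];

// Hypothetical helper: treat a missing `phones` field as an empty list.
function phoneCount(user) {
  return Array.isArray(user.phones) ? user.phones.length : 0;
}

const counts = users.map((u) => ({ name: u.name, phones: phoneCount(u) }));
console.log(counts);
```

The flexibility moves some responsibility into the application: nothing here stops a document from omitting a field, so readers of the data decide what a missing value means.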

2. Relational vs document databases — key differences

| Topic | Relational (e.g. MySQL) | Document (e.g. MongoDB) |
| --- | --- | --- |
| Structure | Fixed schema (tables) | Flexible schema (JSON/BSON) |
| Relationships | JOINs | Embed or reference |
| Scaling | Often discussed as scale-up | Many workloads lean scale-out |
| Transactions | Strong ACID | Supported, but check version and topology [1] |
| Good fit | Structured data, heavy joins | Semi-structured data, fast change |

Note: PostgreSQL and MySQL can also scale horizontally with read replicas, sharding products, and so on. This table captures common talking points; details depend on the product and architecture.

One-line takeaway: when the schema changes often or data is deeply nested, a document model is often easier to work with.
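The "JOINs versus embed" contrast can be made concrete with a small sketch. It uses plain JavaScript arrays as stand-ins for two relational tables (the table and column names are assumptions for illustration) and folds them into the single nested shape a document database would store.

```javascript
// Stand-ins for two relational tables (hypothetical names and columns).
const userRows = [
  { id: 1, name: 'Jane Doe', city: 'Seattle' },
  { id: 2, name: 'Sam Lee', city: 'Austin' },
];
const phoneRows = [
  { user_id: 1, number: '+1-555-0100' },
  { user_id: 1, number: '+1-555-0199' },
];

// Fold the child rows into each parent, the way an embedded array would hold them.
function toDocuments(users, phones) {
  return users.map((u) => ({
    name: u.name,
    phones: phones.filter((p) => p.user_id === u.id).map((p) => p.number),
    address: { city: u.city },
  }));
}

const documents = toDocuments(userRows, phoneRows);
console.log(JSON.stringify(documents, null, 2));
```

In the relational layout the join happens at read time; in the document layout the same grouping is done once, at write time, and stored.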

3. What is MongoDB?

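MongoDB is a widely used document database, first released as open source in 2009 (the company behind it, originally 10gen, was founded in 2007). Since late 2018 the server has been distributed under the Server Side Public License (SSPL), which is source-available and not an OSI-approved open source license.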

Highlights

  • Flexible schema: you do not need rigid columns up front—great for early prototypes.
  • BSON: documents are handled in a binary form related to JSON. Encoding helps with types and parsing, but latency and cost still depend mostly on indexes, query patterns, network, and storage.
  • Horizontal scaling (sharding): you can shard as data grows.
  • Replica sets: replication for availability is standard.
  • Query and analytics: aggregation pipelines, geospatial queries, and more (depends on version and deployment). Atlas Search and similar full-text products are MongoDB Atlas–specific.

Where it runs

| Option | Description |
| --- | --- |
| Self-managed | Install on your own servers |
| MongoDB Atlas | MongoDB’s managed service on AWS, GCP, or Azure |
| Docker | Run in a container |

4. What is AWS DocumentDB? (MongoDB-compatible API)

AWS DocumentDB is Amazon’s fully managed document database service.

It targets a MongoDB-compatible API, so many applications can use MongoDB drivers in a familiar way. It is not a byte-for-byte MongoDB server—validate feature by feature before production (some aggregation stages, indexes, or search features differ or are absent).

Highlights

  • Managed operations: patching, backup, and recovery are operated by AWS.
  • MongoDB-compatible API: easier migration paths, but read the compatibility matrix.
  • Storage and availability: the service is described as maintaining six copies of storage data across three Availability Zones (How It Works — Amazon DocumentDB).
  • Inside a VPC: fits AWS networking and security models.
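Because DocumentDB speaks a MongoDB-compatible API, connecting usually comes down to a connection string with a few service-specific options. The sketch below only builds such a string; it does not connect to anything. The helper name `buildDocumentDbUri`, the host, the credentials, and the CA file name are placeholders. The option set (TLS with a CA bundle, `replicaSet=rs0`, `retryWrites=false`) follows the pattern in AWS's connection guidance as of this writing, so treat it as an assumption and confirm it against the current documentation for your cluster.

```javascript
// Build a connection string for a DocumentDB cluster endpoint.
// Every value passed in below is a placeholder, not a real endpoint or credential.
function buildDocumentDbUri({ user, password, host, port = 27017, caFile }) {
  const params = new URLSearchParams({
    tls: 'true', // assumes TLS is enabled on the cluster
    tlsCAFile: caFile, // path to the CA bundle obtained from AWS
    replicaSet: 'rs0',
    readPreference: 'secondaryPreferred',
    retryWrites: 'false', // assumption: retryable writes are unsupported on the target cluster
  });
  // Percent-encode credentials so characters like "@" or "/" do not break the URI.
  const auth = `${encodeURIComponent(user)}:${encodeURIComponent(password)}`;
  return `mongodb://${auth}@${host}:${port}/?${params.toString()}`;
}

const uri = buildDocumentDbUri({
  user: 'app-user',
  password: 'p@ss/word',
  host: 'docdb.example.internal',
  caFile: 'global-bundle.pem',
});
console.log(uri);
```

The resulting string can be passed to `new MongoClient(uri)` in the same way as the local connection string used later in this article.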

5. MongoDB vs AWS DocumentDB — which one?

| Topic | MongoDB (e.g. Atlas) | AWS DocumentDB |
| --- | --- | --- |
| Operator | MongoDB Inc. | Amazon Web Services |
| Features | Native MongoDB | Compatible; not every new MongoDB feature arrives at the same time |
| Multi-cloud | Multiple cloud options | AWS only |
| Pricing | Product-specific | EC2-style instance, storage, I/O |
| Operations | Can be low-touch managed | Strong fit if you live in AWS |

When to consider which

  • All-in on AWS with RDS, VPC, IAM, etc. → AWS DocumentDB is a candidate.
  • You need the latest MongoDB features or Atlas-only tools, or want multi-cloud flexibility → look at MongoDB Atlas or self-managed MongoDB.
  • Side projects and local dev → MongoDB Community Edition or Docker is a simple start.

6. Core concepts — collection, document, BSON

Terminology mapping

| Relational | Document |
| --- | --- |
| Database | Database |
| Table | Collection |
| Row | Document |
| Column | Field |
| Primary key | _id (unique per collection; default is usually ObjectId) |
| JOIN | Embed / $lookup, etc. |

Example document

{
  "_id": "64a1f2e3b5c6d7e8f9a0b1c2",
  "title": "MongoDB in ten minutes",
  "author": {
    "name": "Jane Doe",
    "email": "jane@example.com"
  },
  "tags": ["database", "nosql", "mongodb"],
  "views": 1024,
  "published": true,
  "createdAt": "2024-07-01T09:00:00Z"
}

Nested objects (author) and arrays (tags) live naturally in one document.

BSON is the on-the-wire and on-disk representation for these documents. Performance is not “fast because BSON”—it is fast or slow because of schema design, indexes, and workload.
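One concrete consequence of BSON's richer types: the default `_id` is an ObjectId, a 12-byte value usually displayed as 24 hex characters, whose first 4 bytes encode a creation time in seconds since the Unix epoch. The example document above shows `_id` as a plain string, presumably because it is rendered as JSON. The sketch below decodes the embedded timestamp from that hex string in plain JavaScript. The helper name `objectIdTimestamp` is made up for this illustration; with the official driver you would normally call `getTimestamp()` on an `ObjectId` instance instead.

```javascript
// Decode the creation time embedded in an ObjectId's 24-character hex form.
function objectIdTimestamp(hex) {
  if (!/^[0-9a-fA-F]{24}$/.test(hex)) {
    throw new Error('expected a 24-character hex string');
  }
  // The first 8 hex characters (4 bytes) are seconds since the Unix epoch.
  const seconds = parseInt(hex.slice(0, 8), 16);
  return new Date(seconds * 1000);
}

const created = objectIdTimestamp('64a1f2e3b5c6d7e8f9a0b1c2');
console.log(created.toISOString());
```

This is also why sorting by `_id` roughly approximates sorting by creation time when ObjectIds are generated by the driver.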

Structure at a glance

The diagram below shows a database, collections, documents, and where embed vs reference strategies apply.

[Figure: database, collections, and documents, with embed and reference strategies]

7. Modeling at a glance — embed vs reference

Common rules of thumb:

  • Embed: data is read and updated together and fits comfortably in one document without consistency or size problems.
  • Reference: entities change independently, or embedding would duplicate data across many documents—store another collection and link with _id.

Production systems often mix both. Start from read patterns (what each screen needs in one round trip).
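The two rules of thumb can be compared side by side in a small sketch. It uses plain JavaScript objects in place of collections; the field name `authorId` and the helper `withAuthors` are assumptions for illustration, not a prescribed schema. Resolving the reference in application code stands in for what a second query or a `$lookup` stage would do.

```javascript
// Embedded: the author travels inside the article (one read, but duplicated data).
const embeddedArticle = {
  _id: 'a1',
  title: 'MongoDB in ten minutes',
  author: { name: 'Jane Doe', email: 'jane@example.com' },
};

// Referenced: articles store only the author's _id; authors live in their own collection.
const authors = [{ _id: 'u1', name: 'Jane Doe', email: 'jane@example.com' }];
const articles = [
  { _id: 'a1', title: 'MongoDB in ten minutes', authorId: 'u1' },
  { _id: 'a2', title: 'Modeling basics', authorId: 'u1' },
];

// Resolve references by _id; unknown ids resolve to null.
function withAuthors(articleDocs, authorDocs) {
  const byId = new Map(authorDocs.map((a) => [a._id, a]));
  return articleDocs.map((doc) => ({ ...doc, author: byId.get(doc.authorId) ?? null }));
}

// Changing the email touches one author document, and every article sees it.
authors[0].email = 'jane.doe@example.com';
const resolved = withAuthors(articles, authors);
console.log(resolved.map((a) => a.author.email));
console.log(embeddedArticle.author.email); // the embedded copy is unchanged
```

The embedded copy keeps the old email until each article is updated, which is the duplication cost the reference rule is meant to avoid. The referenced form pays for that with an extra lookup on every read.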

8. Minimal query cheat sheet

Connect (Node.js)

const { MongoClient } = require('mongodb');

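async function main() {
  // Local default endpoint; replace with your own connection string.
  const client = new MongoClient('mongodb://localhost:27017');
  try {
    await client.connect();
    const db = client.db('myDatabase');
    const col = db.collection('articles');
    // Run CRUD here.
  } finally {
    // Always release the connection, even if an operation throws.
    await client.close();
  }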
}

main().catch(console.error);

CRUD basics

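// These statements belong inside main() from the connect example, where `db` is defined.
const col = db.collection('articles');

// Create
await col.insertOne({ title: 'First post', views: 0 });

// Read: one document, then every matching document
const doc = await col.findOne({ title: 'First post' });
const docs = await col.find({ published: true }).toArray();

// Update: $set changes only the listed fields
await col.updateOne(
  { title: 'First post' },
  { $set: { views: 100 } }
);

// Delete
await col.deleteOne({ title: 'First post' });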

Common patterns

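// Comparison: views greater than or equal to 100
await col.find({ views: { $gte: 100 } }).toArray();
// Array field: matches documents whose tags include any listed value
await col.find({ tags: { $in: ['nosql'] } }).toArray();

// Newest first, first page of 10 (skip = page index * page size)
await col.find({}).sort({ createdAt: -1 }).skip(0).limit(10).toArray();

// Count published articles per author, highest count first
await col.aggregate([
  { $match: { published: true } },
  { $group: { _id: '$author.name', total: { $sum: 1 } } },
  { $sort: { total: -1 } },
]).toArray();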
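If the aggregation pipeline above reads as opaque, it can help to see the same computation written by hand. The sketch below is plain JavaScript over an in-memory array, not the MongoDB API. The `sample` data and the function name `countPublishedByAuthor` are hypothetical, and each step is annotated with the pipeline stage it mirrors.

```javascript
// Hypothetical in-memory stand-in for the articles collection.
const sample = [
  { title: 'A', published: true, author: { name: 'Jane Doe' } },
  { title: 'B', published: true, author: { name: 'Sam Lee' } },
  { title: 'C', published: false, author: { name: 'Jane Doe' } },
  { title: 'D', published: true, author: { name: 'Jane Doe' } },
];

function countPublishedByAuthor(docs) {
  const totals = new Map();
  for (const doc of docs) {
    if (doc.published !== true) continue; // mirrors $match: { published: true }
    const key = doc.author?.name ?? null; // mirrors $group _id: '$author.name'
    totals.set(key, (totals.get(key) ?? 0) + 1); // mirrors total: { $sum: 1 }
  }
  return [...totals]
    .map(([_id, total]) => ({ _id, total }))
    .sort((a, b) => b.total - a.total); // mirrors $sort: { total: -1 }
}

const result = countPublishedByAuthor(sample);
console.log(result);
```

The difference in practice is where the work runs. The pipeline executes inside the database and can use indexes for the `$match` stage, while this version would first have to pull every document into the application.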

9. When a document database is a good fit

Often a good fit

  • Frequent schema changes — early-stage products
  • Nested structures — product options, profiles, configuration blobs
  • Read-heavy workloads — catalogs and content (with sensible indexes)
  • Event logs and IoT — append-heavy pipelines when the model matches

Think twice

  • Domains where large multi-table joins are always central
  • Strict transactional guarantees for financial cores—verify the database meets the bar
  • Highly regular, fixed schemas where relational modeling stays simpler

10. Wrap-up — ten-minute summary

| Keyword | One line |
| --- | --- |
| Document DB | NoSQL family built around flexible JSON/BSON documents |
| MongoDB | The flagship document database and its ecosystem |
| AWS DocumentDB | Managed MongoDB-compatible API on AWS; verify features in the docs |
| Collection / document | Think “table / row” for intuition |
| Use it | When flexibility, nesting, or horizontal scaling patterns matter |
| Skip it | When joins and strict transactions dominate the story |

Footnotes

  1. MongoDB multi-document ACID transactions landed in 4.0 on replica sets and expanded to sharded clusters in 4.2. See Transactions — MongoDB Manual for exact rules.
