MongoDB and DocumentDB in 10 Minutes — A Practical Primer
MongoDB and AWS DocumentDB are names you hear often, yet it is hard to grasp in one pass what they are and how they differ. This guide is for readers who know relational databases but find document-oriented NoSQL unfamiliar. In about ten minutes of reading it covers what a document is, the roles of MongoDB and DocumentDB, the limits of API compatibility, selection criteria, collections, BSON, and embed-versus-reference modeling. It aims for a grounded mental model without hype and leaves you knowing what to open next in the official docs or a local sandbox.
Table of contents
- Introduction — what is a “document”?
- Relational vs document databases — key differences
- What is MongoDB?
- What is AWS DocumentDB? (MongoDB-compatible API)
- MongoDB vs AWS DocumentDB — which one?
- Core concepts — collection, document, BSON
- Modeling at a glance — embed vs reference
- Minimal query cheat sheet
- When a document database is a good fit
- Wrap-up — ten-minute summary
1. Introduction — what is a “document”?
Databases such as MySQL and PostgreSQL store data in rows and columns—a table.
Real-world data does not always fit that grid.
Suppose you store user profiles.
- Some users have one phone number
- Some have three
- Some have none
In a relational database you typically either add a separate phone-number table or add several phone columns and leave the unused ones NULL.
A document database stores what you actually have.
{
  "name": "Jane Doe",
  "phones": ["+1-555-0100", "+1-555-0199"],
  "address": {
    "city": "Seattle",
    "district": "Capitol Hill"
  }
}
That JSON-like blob is a document.
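To make the phone-number case concrete, here is a small sketch in plain JavaScript that needs no database. The `users` array and the `phoneCount` helper are illustrative names, not part of any MongoDB API. The point is only that documents in one collection can have different shapes, and application code decides how to treat an absent field.

```javascript
// Three user documents with different shapes; none needs a NULL placeholder.
const users = [
  { name: 'Jane Doe', phones: ['+1-555-0100'] },
  { name: 'Sam Lee', phones: ['+1-555-0142', '+1-555-0143', '+1-555-0144'] },
  { name: 'Ana Cruz' }, // no phones field at all
];

// Hypothetical helper: treat a missing `phones` field as an empty list.
function phoneCount(user) {
  return Array.isArray(user.phones) ? user.phones.length : 0;
}

const counts = users.map(phoneCount);
console.log(counts);
```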
2. Relational vs document databases — key differences
| Topic | Relational (e.g. MySQL) | Document (e.g. MongoDB) |
|---|---|---|
| Structure | Fixed schema (tables) | Flexible schema (JSON/BSON) |
| Relationships | JOINs | Embed or reference |
| Scaling | Often discussed as scale-up | Many workloads lean scale-out |
| Transactions | Strong ACID | Supported, but check version and topology [1] |
| Good fit | Structured data, heavy joins | Semi-structured data, fast change |
Note: PostgreSQL and MySQL can also scale horizontally with read replicas, sharding products, and so on. This table captures common talking points; details depend on the product and architecture.
One-line takeaway: when the schema changes often or data is deeply nested, a document model is often easier to work with.
3. What is MongoDB?
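MongoDB is a widely used document database, first released as open source in 2009 (the company behind it, originally 10gen, was founded in 2007). Since 2018 the server has been licensed under the Server Side Public License (SSPL), which is source-available rather than OSI-approved open source, so check the license terms if that distinction matters to you.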
Highlights
- Flexible schema: you do not need rigid columns up front—great for early prototypes.
- BSON: documents are handled in a binary form related to JSON. Encoding helps with types and parsing, but latency and cost still depend mostly on indexes, query patterns, network, and storage.
- Horizontal scaling (sharding): you can shard as data grows.
- Replica sets: replication for availability is standard.
- Query and analytics: aggregation pipelines, geospatial queries, and more (depends on version and deployment). Atlas Search and similar full-text products are MongoDB Atlas–specific.
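Flexible schema does not have to mean no schema. As a hedged sketch, MongoDB lets you attach an optional `$jsonSchema` validator when creating a collection. The `articles` collection and the specific rules below are assumptions chosen for illustration, and the function only builds the options object, so it runs without a server.

```javascript
// Builds options intended for db.createCollection('articles', options).
// The field rules are illustrative; tighten or loosen them as the product evolves.
function articleCollectionOptions() {
  return {
    validator: {
      $jsonSchema: {
        bsonType: 'object',
        required: ['title'],
        properties: {
          title: { bsonType: 'string', description: 'must be a string' },
          tags: { bsonType: 'array', items: { bsonType: 'string' } },
        },
      },
    },
  };
}

console.log(JSON.stringify(articleCollectionOptions(), null, 2));
```

With a connected driver you would pass the result as `await db.createCollection('articles', articleCollectionOptions())`. Fields that are not listed under `properties` are still accepted, so the collection stays flexible where you have not added a rule. Validator support can differ on MongoDB-compatible services, so confirm it on your target.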
Where it runs
| Option | Description |
|---|---|
| Self-managed | Install on your own servers |
| MongoDB Atlas | MongoDB’s managed service on AWS, GCP, or Azure |
| Docker | Run in a container |
4. What is AWS DocumentDB? (MongoDB-compatible API)
AWS DocumentDB is Amazon’s fully managed document database service.
It targets a MongoDB-compatible API, so many applications can use MongoDB drivers in a familiar way. It is not a byte-for-byte MongoDB server—validate feature by feature before production (some aggregation stages, indexes, or search features differ or are absent).
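One way to validate feature by feature is to run the aggregation pipelines your application depends on against a test cluster and record which ones the server rejects. The sketch below assumes `col` is a collection handle from the official Node.js driver (anything with the same `aggregate(...).toArray()` shape works). The `probePipelines` name and the shape of its report are hypothetical.

```javascript
// Tries each candidate pipeline and records whether the server accepted it.
// Errors from unsupported stages surface when the cursor is consumed.
async function probePipelines(col, candidates) {
  const report = {};
  for (const [name, pipeline] of Object.entries(candidates)) {
    try {
      await col.aggregate(pipeline).toArray();
      report[name] = 'ok';
    } catch (err) {
      report[name] = `failed: ${err.message}`;
    }
  }
  return report;
}
```

A call might look like `await probePipelines(db.collection('articles'), { byAuthor: [{ $group: { _id: '$author.name' } }, { $limit: 1 }] })`. Adding a `$limit` keeps the probe cheap. A passing probe shows only that the stage is accepted, not that its behavior matches MongoDB in every detail.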
Highlights
- Managed operations: patching, backup, and recovery are operated by AWS.
- MongoDB-compatible API: easier migration paths, but read the compatibility matrix.
- Storage and availability: the service is described as maintaining six copies of storage data across three Availability Zones (How It Works — Amazon DocumentDB).
- Inside a VPC: fits AWS networking and security models.
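Connecting to DocumentDB usually differs from a local MongoDB mainly in the connection string. The sketch below builds one with options AWS commonly shows for DocumentDB clusters: TLS with the Amazon CA bundle, `replicaSet=rs0`, and `retryWrites=false` because DocumentDB does not support retryable writes. Treat every value as an assumption and compare it with the connection string your own cluster displays. The `documentDbUri` helper name is hypothetical.

```javascript
// Builds a DocumentDB-style connection string; all argument values are placeholders.
function documentDbUri({ user, password, host, caFile }) {
  const params = new URLSearchParams({
    tls: 'true',
    tlsCAFile: caFile,                    // path to the downloaded CA bundle
    replicaSet: 'rs0',
    readPreference: 'secondaryPreferred', // send reads to replicas when available
    retryWrites: 'false',
  });
  // Percent-encode credentials so characters like '@' do not break the URI.
  const auth = `${encodeURIComponent(user)}:${encodeURIComponent(password)}`;
  return `mongodb://${auth}@${host}:27017/?${params.toString()}`;
}

const uri = documentDbUri({
  user: 'appUser',
  password: 'p@ss',
  host: 'my-cluster.example.docdb.amazonaws.com',
  caFile: 'global-bundle.pem',
});
console.log(uri);
```

The resulting string is passed to `new MongoClient(uri)` exactly as in a plain MongoDB setup. The cluster is reachable only from inside its VPC unless you set up a tunnel or peering.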
5. MongoDB vs AWS DocumentDB — which one?
| Topic | MongoDB (e.g. Atlas) | AWS DocumentDB |
|---|---|---|
| Operator | MongoDB Inc. | Amazon Web Services |
| Features | Native MongoDB | Compatible—not every new MongoDB feature arrives at the same time |
| Multi-cloud | Multiple cloud options | AWS only |
| Pricing | Varies by product (Atlas tier or self-managed infrastructure) | Billed by instance hours, storage, and I/O |
| Operations | Can be low-touch managed | Strong fit if you live in AWS |
When to consider which
- All-in on AWS with RDS, VPC, IAM, etc. → AWS DocumentDB is a candidate.
- You need latest MongoDB features or Atlas-only tools, or want multi-cloud flexibility → look at MongoDB Atlas or self-managed MongoDB.
- Side projects and local dev → MongoDB Community Edition or Docker is a simple start.
6. Core concepts — collection, document, BSON
Terminology mapping
| Relational | Document |
|---|---|
| Database | Database |
| Table | Collection |
| Row | Document |
| Column | Field |
| Primary key | _id (unique per collection; default is usually ObjectId) |
| JOIN | Embed / $lookup, etc. |
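The last row of the table deserves an example. When data is referenced rather than embedded, the closest thing to a JOIN is the `$lookup` aggregation stage. This sketch assumes a hypothetical layout in which each article stores an `authorId` matching `_id` in a `users` collection. The function only returns the pipeline array, so it runs without a server.

```javascript
// Returns an aggregation pipeline that attaches author documents to articles.
// Assumes articles store `authorId` and authors live in a `users` collection.
function articlesWithAuthorPipeline() {
  return [
    { $match: { published: true } },
    {
      $lookup: {
        from: 'users',          // collection to join against
        localField: 'authorId', // field on the article
        foreignField: '_id',    // field on the user
        as: 'author',           // output field, always an array
      },
    },
    { $unwind: '$author' },     // flatten the one-element array
  ];
}

console.log(JSON.stringify(articlesWithAuthorPipeline()));
```

With a live connection this would be run as `await db.collection('articles').aggregate(articlesWithAuthorPipeline()).toArray()`. Note that a plain `$unwind` drops articles that have no matching author.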
Example document
{
  "_id": "64a1f2e3b5c6d7e8f9a0b1c2",
  "title": "MongoDB in ten minutes",
  "author": {
    "name": "Jane Doe",
    "email": "jane@example.com"
  },
  "tags": ["database", "nosql", "mongodb"],
  "views": 1024,
  "published": true,
  "createdAt": "2024-07-01T09:00:00Z"
}
Nested objects (author) and arrays (tags) live naturally in one document.
BSON is the on-the-wire and on-disk representation for these documents. Performance is not “fast because BSON”—it is fast or slow because of schema design, indexes, and workload.
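What BSON does add over plain JSON is a richer set of types, such as dates, ObjectIds, and distinct integer and decimal types. The short sketch below uses only built-in JavaScript to show what is lost without them: a `Date` that passes through JSON comes back as a string. The variable names are illustrative.

```javascript
// Plain JSON has no date type, so a Date becomes a string after a round trip.
const original = {
  title: 'MongoDB in ten minutes',
  createdAt: new Date('2024-07-01T09:00:00Z'),
};
const roundTripped = JSON.parse(JSON.stringify(original));

const wasDate = original.createdAt instanceof Date;      // true before the round trip
const typeAfter = typeof roundTripped.createdAt;         // 'string' afterwards
console.log(wasDate, typeAfter, roundTripped.createdAt);
```

When you insert a JavaScript `Date` through the Node.js driver it is stored as a BSON date, so range queries and sorting compare real timestamps rather than text.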
Structure at a glance
[Diagram: a database, its collections, their documents, and where embed vs reference strategies apply.]
7. Modeling at a glance — embed vs reference
Common rules of thumb:
- Embed: data is read and updated together and fits comfortably in one document without consistency or size problems.
- Reference: entities change independently, or embedding would duplicate data across many documents. Store the entity in another collection and link to it with _id.
Production systems often mix both. Start from read patterns (what each screen needs in one round trip).
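The two shapes can be compared side by side in plain JavaScript. In this sketch the `authorId` field, the `authors` array, and the `resolveAuthor` helper are all hypothetical. The helper stands in for the second query or `$lookup` that a referenced design needs.

```javascript
// Embedded: the author travels inside the article, so one read is enough.
const embedded = {
  title: 'MongoDB in ten minutes',
  author: { name: 'Jane Doe', email: 'jane@example.com' },
};

// Referenced: the article stores only the author's _id.
const authors = [{ _id: 'u1', name: 'Jane Doe', email: 'jane@example.com' }];
const referenced = { title: 'MongoDB in ten minutes', authorId: 'u1' };

// Stand-in for a second query: find the author a referenced article points to.
function resolveAuthor(article, authorList) {
  return authorList.find((a) => a._id === article.authorId) ?? null;
}

console.log(embedded.author.name, resolveAuthor(referenced, authors).name);
```

The trade-off is visible here: changing Jane's email touches one `authors` entry in the referenced design, but every article that embeds her in the embedded design.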
8. Minimal query cheat sheet
Connect (Node.js)
const { MongoClient } = require('mongodb');

async function main() {
  // Local default address; replace with your own connection string.
  const client = new MongoClient('mongodb://localhost:27017');
  try {
    await client.connect();
    const db = client.db('myDatabase');
    const col = db.collection('articles');
    // Run CRUD here.
  } finally {
    // Always release the connection, even if an operation throws.
    await client.close();
  }
}

main().catch(console.error);
CRUD basics
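// Runs inside main() from the connection example, where `db` is in scope.
const col = db.collection('articles');

// Create
await col.insertOne({ title: 'First post', views: 0 });

// Read: one document, then all matching documents
const doc = await col.findOne({ title: 'First post' });
const docs = await col.find({ published: true }).toArray();

// Update: $set changes only the listed fields
await col.updateOne(
  { title: 'First post' },
  { $set: { views: 100 } }
);

// Delete
await col.deleteOne({ title: 'First post' });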
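The update above overwrites `views` with a fixed value. For counters, a common alternative is `$inc`, which the server applies atomically, so two concurrent requests do not overwrite each other's result. The sketch below assumes `col` is a collection handle from the official Node.js driver. The `recordView` helper name is hypothetical.

```javascript
// Bumps the view counter by one; with upsert, a missing document is created
// with views set to 1. Returns the driver's promise so callers can await it.
function recordView(col, title) {
  return col.updateOne(
    { title },              // which document to change
    { $inc: { views: 1 } }, // atomic increment on the server
    { upsert: true }        // insert if no document matches
  );
}
```

Inside an async function this is called as `await recordView(col, 'First post')`.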
Common patterns
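// Comparison: views greater than or equal to 100
await col.find({ views: { $gte: 100 } }).toArray();

// Array membership: documents whose tags include 'nosql'
await col.find({ tags: { $in: ['nosql'] } }).toArray();

// Newest first, first page of 10
await col.find({}).sort({ createdAt: -1 }).skip(0).limit(10).toArray();

// Aggregation: count published articles per author, most prolific first
await col.aggregate([
  { $match: { published: true } },
  { $group: { _id: '$author.name', total: { $sum: 1 } } },
  { $sort: { total: -1 } },
]).toArray();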
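If the aggregation pipeline is hard to read at first, it can help to see the same computation written as ordinary JavaScript over an in-memory array. This is only an explanatory stand-in, not how the server executes a pipeline. The `countPublishedByAuthor` function and the `sample` data are made up for illustration.

```javascript
// In-memory equivalent of: $match published, $group by author.name, $sort by total.
function countPublishedByAuthor(docs) {
  const totals = new Map();
  for (const doc of docs) {
    if (doc.published !== true) continue;         // $match: { published: true }
    const key = doc.author?.name ?? null;         // _id: '$author.name' (missing becomes null)
    totals.set(key, (totals.get(key) ?? 0) + 1);  // total: { $sum: 1 }
  }
  return [...totals]
    .map(([_id, total]) => ({ _id, total }))
    .sort((a, b) => b.total - a.total);           // $sort: { total: -1 }
}

const sample = [
  { author: { name: 'Jane Doe' }, published: true },
  { author: { name: 'Sam Lee' }, published: true },
  { author: { name: 'Jane Doe' }, published: true },
  { author: { name: 'Sam Lee' }, published: false },
];
const result = countPublishedByAuthor(sample);
console.log(result);
```

Each stage receives the output of the previous one, which is why the order of stages in the array matters.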
9. When a document database is a good fit
Often a good fit
- Frequent schema changes — early-stage products
- Nested structures — product options, profiles, configuration blobs
- Read-heavy workloads — catalogs and content (with sensible indexes)
- Event logs and IoT — append-heavy pipelines when the model matches
Think twice
- Domains where large multi-table joins are always central
- Strict transactional guarantees for financial cores—verify the database meets the bar
- Highly regular, fixed schemas where relational modeling stays simpler
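For the transactional case, it helps to know what a multi-document transaction looks like before judging whether it meets your bar. The sketch below uses the Node.js driver's `startSession` and `withTransaction` calls. The `bank` database, the `accounts` collection, the `balance` field, and the `transfer` function are hypothetical. It assumes `client` is an already connected `MongoClient` pointed at a replica set or sharded cluster, because transactions do not run on a standalone server.

```javascript
// Moves `amount` between two accounts; both updates commit or neither does.
async function transfer(client, fromId, toId, amount) {
  const accounts = client.db('bank').collection('accounts');
  const session = client.startSession();
  try {
    await session.withTransaction(async () => {
      // Debit only if the balance is sufficient.
      const debit = await accounts.updateOne(
        { _id: fromId, balance: { $gte: amount } },
        { $inc: { balance: -amount } },
        { session }
      );
      // Throwing inside the callback aborts the transaction.
      if (debit.modifiedCount !== 1) throw new Error('insufficient funds');
      await accounts.updateOne(
        { _id: toId },
        { $inc: { balance: amount } },
        { session }
      );
    });
  } finally {
    await session.endSession();
  }
}
```

Every operation must receive the same `session`, otherwise it runs outside the transaction. Transaction support and limits differ between MongoDB versions and MongoDB-compatible services, so confirm them for your target before relying on this pattern.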
10. Wrap-up — ten-minute summary
| Keyword | One line |
|---|---|
| Document DB | NoSQL family built around flexible JSON/BSON documents |
| MongoDB | The best-known document database and its ecosystem |
| AWS DocumentDB | Managed MongoDB-compatible API on AWS—verify features in the docs |
| Collection / document | Think “table / row” for intuition |
| Use it | When flexibility, nesting, or horizontal scaling patterns matter |
| Skip it | When joins and strict transactions dominate the story |
Next steps
- Learn: MongoDB University
- Local Docker (official community image):
  docker run -d -p 27017:27017 mongodb/mongodb-community-server:latest
  See Docker Hub: mongodb-community-server.
- GUI: MongoDB Compass
References (official docs)
- MongoDB Manual
- BSON types · bsonspec.org
- Documents overview
- Transactions
- Data modeling introduction
- $lookup
- MongoDB Atlas
- Amazon DocumentDB — supported APIs
- Amazon DocumentDB — how it works
Footnotes
1. MongoDB multi-document ACID transactions landed in 4.0 on replica sets and expanded to sharded clusters in 4.2. See the Transactions page of the MongoDB Manual for exact rules.