Introduction to MongoDB
Beginner · 5 min read
MongoDB is a document-oriented NoSQL database. Instead of rows and tables, it stores data as flexible JSON-like documents (BSON) in collections. It is schema-less by default, horizontally scalable, and optimized for hierarchical or variable-structure data.
Key Concepts
- Database — contains collections (like a schema in SQL)
- Collection — a group of documents (like a table in SQL)
- Document — a JSON/BSON record (like a row in SQL)
- Field — a key-value pair inside a document (like a column)
- _id — unique primary key for every document; an ObjectId is generated automatically if you don't supply one
- BSON — Binary JSON; supports extra types like Date, ObjectId, Binary
SQL vs MongoDB
// SQL concept → MongoDB equivalent
Database → Database
Table → Collection
Row → Document
Column → Field
Primary Key → _id (ObjectId)
JOIN → $lookup (aggregation) or embedding
Foreign Key → Reference (ObjectId) or embedded document
INDEX → Index
SELECT → find()
INSERT → insertOne() / insertMany()
UPDATE → updateOne() / updateMany()
DELETE → deleteOne() / deleteMany()
💡 MongoDB is ideal when your data has variable or nested structure, you need horizontal scaling, or you're building document-centric apps like catalogs, CMSes, or real-time analytics.
Setup & Atlas
Beginner · 7 min read
# Option 1: Local installation
# Download from https://www.mongodb.com/try/download/community
# Start MongoDB service
sudo systemctl start mongod # Linux
brew services start mongodb-community # macOS
# Connect via mongosh (MongoDB Shell)
mongosh
mongosh "mongodb://localhost:27017"
# Basic shell commands
show dbs # list databases
use myapp # switch to a database (it is created on the first write)
show collections # list collections in current DB
db.stats() # database statistics
db.dropDatabase() # delete current database
# Option 2: MongoDB Atlas (cloud — recommended)
# 1. Create free account at https://cloud.mongodb.com
# 2. Create a free M0 cluster
# 3. Add a database user and add your IP address to the IP access list
# 4. Get connection string:
mongodb+srv://<user>:<pass>@cluster0.abc.mongodb.net/myapp
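Usernames and passwords in a connection string must be percent-encoded when they contain characters such as @, :, or /. A minimal sketch of that step, assuming a hypothetical helper named buildAtlasUri and the placeholder host shown above:

```javascript
// Hypothetical helper: builds a mongodb+srv URI with percent-encoded credentials.
function buildAtlasUri({ user, pass, host, dbName }) {
  const u = encodeURIComponent(user);
  const p = encodeURIComponent(pass);
  return `mongodb+srv://${u}:${p}@${host}/${dbName}`;
}

const uri = buildAtlasUri({
  user: "appUser",                  // made-up credentials for illustration
  pass: "p@ss:word/1",
  host: "cluster0.abc.mongodb.net", // placeholder host from the example above
  dbName: "myapp",
});
console.log(uri);
// mongodb+srv://appUser:p%40ss%3Aword%2F1@cluster0.abc.mongodb.net/myapp
```

Keeping the raw password out of source code (for example in an environment variable) and encoding it at startup is usually the safer arrangement.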
Data Modelling Concepts
Beginner · 7 min read
MongoDB stores data as BSON documents. Understanding what you can store and how is fundamental.
// A document in the 'users' collection
{
"_id": ObjectId("60a7c2..."), // auto-generated
"name": "Aftab Khan",
"email": "aftab@example.com",
"age": 28,
"active": true,
"score": 98.5,
"tags": ["java", "spring", "mongodb"], // array
"address": { // embedded document
"street": "MG Road",
"city": "Greater Noida",
"pin": "201310"
},
"orders": [ // array of embedded docs
{ "id": "ORD-001", "amount": 1200, "date": ISODate("2024-06-01") }
],
"createdAt": ISODate("2024-01-15T09:30:00Z")
}
BSON Data Types
- String — UTF-8 strings
- Number — Int32, Int64, Double, Decimal128
- Boolean — true / false
- Array — ordered list of values
- Object — embedded document
- ObjectId — 12-byte unique identifier
- Date — ISODate (milliseconds since epoch)
- Null — null value
- Binary — raw binary (images, files)
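The first 4 of an ObjectId's 12 bytes hold its creation time in seconds since the Unix epoch, so the timestamp can be read back from the id itself. A small sketch in plain JavaScript; the function name and the sample id are made up for illustration:

```javascript
// Hypothetical helper: reads the creation time embedded in an ObjectId hex string.
function objectIdToDate(hex) {
  if (!/^[0-9a-fA-F]{24}$/.test(hex)) {
    throw new Error("expected a 24-character hex string");
  }
  // First 8 hex characters = 4 bytes = seconds since the Unix epoch
  const seconds = parseInt(hex.slice(0, 8), 16);
  return new Date(seconds * 1000);
}

const sampleId = "65a4f1c0a1b2c3d4e5f60718"; // made-up id, not from a real database
const created = objectIdToDate(sampleId);
console.log(created.toISOString());
```

In mongosh, ObjectId("...").getTimestamp() gives the same information without manual parsing.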
Insert Documents
Beginner · 7 min read
// Switch to database (it is created lazily on the first insert)
use myapp
// insertOne — insert a single document
db.users.insertOne({
name: "Aftab Khan",
email: "aftab@example.com",
age: 28,
role: "admin"
})
// Returns: { acknowledged: true, insertedId: ObjectId("...") }
// insertMany — insert multiple documents at once
db.products.insertMany([
{ name: "Laptop", price: 55000, category: "electronics", stock: 10 },
{ name: "Mouse", price: 800, category: "electronics", stock: 50 },
{ name: "Desk", price: 12000, category: "furniture", stock: 5 }
])
// Returns: { acknowledged: true, insertedIds: { 0: ObjectId, 1: ... } }
// Custom _id (must be unique)
db.configs.insertOne({
_id: "app_settings", // string _id
theme: "dark",
lang: "en"
})
Find & Query
Beginner · 10 min read
// find() — returns cursor (all matching documents)
db.users.find() // all documents
db.users.find({}) // same
db.users.find({ role: "admin" }) // where role = 'admin'
db.users.find({ age: 28, active: true }) // AND conditions
// findOne() — returns first matching document
db.users.findOne({ email: "aftab@example.com" })
// Find by ObjectId
db.users.findOne({ _id: ObjectId("60a7c2...") })
// Query nested fields (dot notation)
db.users.find({ "address.city": "Greater Noida" })
// Query array — documents that contain the value
db.users.find({ tags: "java" }) // has 'java' in tags array
db.users.find({ tags: { $all: ["java", "spring"] } }) // has BOTH
// Counting
db.users.countDocuments({ active: true })
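The matching rules used above (implicit AND across keys, dot notation for nested fields, a scalar matching any element of an array) can be mimicked in a few lines of plain JavaScript. This is a simplified illustration of the semantics, not how MongoDB evaluates queries; matches and getPath are hypothetical names, and operators such as $all are not handled.

```javascript
// Resolve "address.city" style paths against a plain object.
function getPath(doc, path) {
  return path.split(".").reduce(
    (value, key) => (value == null ? undefined : value[key]),
    doc
  );
}

// Simplified equality matcher: every key in the filter must match (implicit AND).
function matches(doc, filter) {
  return Object.entries(filter).every(([path, expected]) => {
    const actual = getPath(doc, path);
    // A scalar filter value matches an array field if any element equals it.
    if (Array.isArray(actual) && !Array.isArray(expected)) {
      return actual.includes(expected);
    }
    return actual === expected;
  });
}

// Invented sample data
const users = [
  { name: "Aftab Khan", age: 28, active: true, tags: ["java", "spring"], address: { city: "Greater Noida" } },
  { name: "Guest", age: 28, active: false, tags: ["python"], address: { city: "Delhi" } },
];

const hits = users.filter((u) =>
  matches(u, { age: 28, "address.city": "Greater Noida", tags: "java" })
);
console.log(hits.map((u) => u.name)); // [ 'Aftab Khan' ]
```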
Update Documents
Beginner · 10 min read
// updateOne — update first matching document
db.users.updateOne(
{ email: "aftab@example.com" }, // filter
{ $set: { age: 29, role: "superadmin" } } // update
)
// updateMany — update all matching
db.users.updateMany(
{ role: "guest" },
{ $set: { active: false } }
)
// Update operators
{ $set: { field: value } } // set field value
{ $unset: { field: "" } } // remove field
{ $inc: { age: 1, score: -5 } } // increment/decrement
{ $mul: { price: 1.1 } } // multiply
{ $rename: { oldName: "newName" } } // rename field
{ $push: { tags: "nodejs" } } // push to array
{ $pull: { tags: "java" } } // remove from array
{ $addToSet: { tags: "react" } } // push if not exists
{ $pop: { tags: 1 } } // 1=remove last, -1=remove first
// upsert — insert if not found
db.users.updateOne(
{ email: "new@user.com" },
{ $set: { name: "New User", createdAt: new Date() } },
{ upsert: true }
)
// findOneAndUpdate — returns the document (before or after update)
const updated = db.counters.findOneAndUpdate(
{ _id: "orderId" },
{ $inc: { seq: 1 } },
{ returnDocument: "after", upsert: true }
)
Delete Documents
Beginner · 5 min read
// deleteOne — delete first matching
db.users.deleteOne({ email: "spam@user.com" })
// deleteMany — delete all matching
db.users.deleteMany({ active: false })
// Delete all documents in a collection
db.logs.deleteMany({})
// Drop entire collection (more efficient)
db.logs.drop()
// findOneAndDelete — returns the deleted document
const deleted = db.sessions.findOneAndDelete({ token: "abc123" })
⚠️ deleteMany({}) deletes ALL documents but keeps the collection and its indexes. drop() removes the entire collection including indexes — much faster for clearing large collections.
Query Operators
Intermediate · 10 min read
// Comparison operators
db.products.find({ price: { $gt: 1000 } }) // greater than
db.products.find({ price: { $gte: 1000 } }) // greater than or equal
db.products.find({ price: { $lt: 500 } }) // less than
db.products.find({ price: { $lte: 500 } }) // less than or equal
db.products.find({ price: { $ne: 0 } }) // not equal
db.products.find({ category: { $in: ["electronics", "books"] } })
db.products.find({ category: { $nin: ["clothing"] } })
// Range query
db.products.find({ price: { $gte: 500, $lte: 5000 } })
// Logical operators
db.users.find({ $and: [{ age: { $gte: 18 } }, { active: true }] })
db.users.find({ $or: [{ role: "admin" }, { role: "superadmin" }] })
db.users.find({ $nor: [{ active: false }, { age: { $lt: 18 } }] })
db.users.find({ role: { $not: { $eq: "guest" } } })
// Element operators
db.users.find({ email: { $exists: true } }) // field exists
db.users.find({ avatar: { $exists: false } }) // field missing
db.users.find({ age: { $type: "number" } }) // by BSON type
// Regex
db.users.find({ name: { $regex: /^aftab/i } })
db.users.find({ email: { $regex: "@gmail\\.com$" } })
// Expression operator (use aggregation expressions in find)
db.products.find({ $expr: { $gt: ["$sold", "$stock"] } })
Projection, Sort & Limit
Beginner · 7 min read
// Projection — choose which fields to include/exclude
db.users.find({}, { name: 1, email: 1 }) // include only name & email (+_id)
db.users.find({}, { password: 0 }) // exclude password
db.users.find({}, { _id: 0, name: 1, email: 1 }) // exclude _id too
// Note: can't mix 1 and 0 (except _id)
// Sort
db.products.find().sort({ price: 1 }) // ascending
db.products.find().sort({ price: -1 }) // descending
db.products.find().sort({ category: 1, price: -1 }) // multi-field sort
// Pagination
const page = 2;
const limit = 10;
db.products.find()
.sort({ createdAt: -1 })
.skip((page - 1) * limit)
.limit(limit)
// distinct — unique field values
db.products.distinct("category")
// ['electronics', 'furniture', 'books']
// explain — query execution plan
db.users.find({ email: "aftab@example.com" }).explain("executionStats")
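When page and limit come from user input, it helps to clamp them before calling skip() and limit(). A sketch with a hypothetical pageWindow helper; the default page size of 10 and the cap of 100 are arbitrary choices for illustration:

```javascript
// Hypothetical helper: turns a 1-based page number into safe skip/limit values.
function pageWindow(page, limit, maxLimit = 100) {
  // Fall back to page 1 / limit 10 when the input is missing or not a number
  const safePage = Math.max(1, Math.floor(Number(page)) || 1);
  const safeLimit = Math.min(maxLimit, Math.max(1, Math.floor(Number(limit)) || 10));
  return { skip: (safePage - 1) * safeLimit, limit: safeLimit };
}

console.log(pageWindow(2, 10));       // { skip: 10, limit: 10 }
console.log(pageWindow("abc", 1000)); // { skip: 0, limit: 100 }

// Usage in mongosh (not run here):
// const { skip, limit } = pageWindow(2, 10);
// db.products.find().sort({ createdAt: -1, _id: -1 }).skip(skip).limit(limit)
```

Adding _id to the sort gives a stable order when several documents share the same createdAt. Large skip values get slower because the server still has to walk past the skipped documents.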
Array Operators
Intermediate · 8 min read
// Query array with conditions
db.orders.find({ items: { $size: 3 } }) // array has exactly 3 elements
db.users.find({ tags: { $all: ["java", "spring"] } }) // contains all
// $elemMatch — element in array matches multiple conditions
db.orders.find({
items: {
$elemMatch: { product: "Laptop", qty: { $gte: 2 } }
}
})
// Update specific array element (by index)
db.orders.updateOne(
{ _id: orderId },
{ $set: { "items.0.qty": 5 } } // update first element
)
// $ positional operator — update matched array element
db.orders.updateOne(
{ "items.product": "Laptop" },
{ $set: { "items.$.qty": 3 } } // $ = matched element position
)
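// $push with modifiers (always applied in this order: $each, then $sort, then $slice)
db.users.updateOne(
  { _id: userId },
  {
    $push: {
      notifications: {
        $each: [
          { msg: "Alert A", date: new Date() },
          { msg: "Alert B", date: new Date() }
        ],
        $sort: { date: 1 }, // oldest first
        $slice: -10 // keep only the 10 newest
      }
    }
  }
)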
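With $push modifiers, MongoDB appends the $each values, then applies $sort, then $slice, regardless of the order in which the modifiers are written. The sketch below mimics that order on a plain array to show which elements survive; pushWithModifiers is a hypothetical name and the data is invented.

```javascript
// Hypothetical helper: mimics $push with $each, $sort and $slice on a plain array.
function pushWithModifiers(array, { each, sortBy, direction = 1, slice }) {
  let result = [...array, ...each]; // 1. append the $each values
  if (sortBy) {
    // 2. sort by one field; direction is 1 (ascending) or -1 (descending)
    result.sort(
      (a, b) => (a[sortBy] < b[sortBy] ? -1 : a[sortBy] > b[sortBy] ? 1 : 0) * direction
    );
  }
  if (slice !== undefined) {
    // 3. a negative slice keeps the last N elements, a positive one the first N
    result = slice < 0 ? result.slice(slice) : result.slice(0, slice);
  }
  return result;
}

const existing = [
  { msg: "A", date: "2024-06-01" },
  { msg: "B", date: "2024-06-03" },
];
const kept = pushWithModifiers(existing, {
  each: [{ msg: "C", date: "2024-06-02" }],
  sortBy: "date",
  direction: 1,
  slice: -2,
});
console.log(kept.map((n) => n.msg)); // [ 'C', 'B' ]
```

Sorting ascending and slicing with a negative number keeps the newest entries. Sorting descending with the same negative slice would keep the oldest ones instead.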
Aggregation Pipeline
Intermediate · 10 min read
The aggregation pipeline processes documents through a series of stages: each stage transforms the data and passes its results to the next. It covers the ground of SQL's GROUP BY, HAVING, JOIN, and window functions.
// Basic pipeline structure
db.collection.aggregate([
{ $stage1: { ... } },
{ $stage2: { ... } },
...
])
// Example: revenue report by category
db.orders.aggregate([
{ $match: { status: "completed", year: 2024 } },
{ $unwind: "$items" },
{
$group: {
_id: "$items.category",
revenue: { $sum: { $multiply: ["$items.price", "$items.qty"] } },
orderCount: { $sum: 1 },
avgPrice: { $avg: "$items.price" }
}
},
{ $sort: { revenue: -1 } },
{ $limit: 5 }
])
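To make the stage-by-stage flow concrete, here is roughly the same report computed over an in-memory array in plain JavaScript. The sample orders are invented and avgPrice is left out to keep the sketch short; it only illustrates what each stage does to the data.

```javascript
// Invented sample data
const orders = [
  { status: "completed", year: 2024, items: [
    { category: "electronics", price: 55000, qty: 1 },
    { category: "furniture", price: 12000, qty: 2 },
  ] },
  { status: "completed", year: 2024, items: [
    { category: "electronics", price: 800, qty: 3 },
  ] },
  { status: "cancelled", year: 2024, items: [
    { category: "furniture", price: 12000, qty: 1 },
  ] },
];

// $match: keep completed 2024 orders
const matched = orders.filter((o) => o.status === "completed" && o.year === 2024);

// $unwind: one record per array element, with "items" replaced by that element
const unwound = matched.flatMap((o) => o.items.map((item) => ({ ...o, items: item })));

// $group: accumulate per category
const groups = new Map();
for (const { items } of unwound) {
  const g = groups.get(items.category) ?? { _id: items.category, revenue: 0, orderCount: 0 };
  g.revenue += items.price * items.qty;
  g.orderCount += 1;
  groups.set(items.category, g);
}

// $sort (revenue descending) and $limit (top 5)
const report = [...groups.values()].sort((a, b) => b.revenue - a.revenue).slice(0, 5);
console.log(report);
```

Note that orderCount counts item lines produced by $unwind, not distinct orders: an order with two items in the same category would be counted twice.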
Common Aggregation Stages
Intermediate · 12 min read
// $match — filter (like find query, use early to reduce data)
{ $match: { status: "active", age: { $gte: 18 } } }
// $project — include/exclude/compute fields
{ $project: {
name: 1,
email: 1,
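// Don't add password: 0 here: inclusion and exclusion can't be mixed (except _id);
// fields that aren't listed are left out automatically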
fullName: { $concat: ["$first", " ", "$last"] },
discounted: { $multiply: ["$price", 0.9] }
} }
// $group — aggregate by key
{ $group: {
_id: "$department", // group by this field
count: { $sum: 1 }, // count documents
total: { $sum: "$salary" }, // sum field
avg: { $avg: "$salary" },
max: { $max: "$salary" },
min: { $min: "$salary" },
names: { $push: "$name" } // collect into array
} }
// $sort — sort results
{ $sort: { total: -1, name: 1 } }
// $limit / $skip — pagination
{ $skip: 20 }
{ $limit: 10 }
// $unwind — deconstruct array into separate documents
// { name:"Aftab", tags:["a","b"] } → two docs: {name:"Aftab",tags:"a"}, {name:"Aftab",tags:"b"}
{ $unwind: "$tags" }
{ $unwind: { path: "$items", preserveNullAndEmptyArrays: true } }
// $addFields / $set — add computed fields without removing others
{ $addFields: {
totalPrice: { $multiply: ["$price", "$qty"] },
isExpensive: { $gt: ["$price", 10000] }
} }
// $count
{ $count: "totalUsers" }
// $facet — run multiple pipelines in parallel
{ $facet: {
byCategory: [{ $group: { _id: "$category", count: { $sum: 1 } } }],
priceRange: [{ $group: { _id: null, min: { $min: "$price" }, max: { $max: "$price" } } }]
} }
$lookup (Joins)
Advanced · 10 min read
// Simple $lookup (like LEFT JOIN)
db.orders.aggregate([
{
$lookup: {
from: "users", // collection to join
localField: "userId", // field in orders
foreignField: "_id", // field in users
as: "user" // output array field
}
},
{ $unwind: "$user" }, // flatten array to object (drops orders with no matching user)
{
$project: {
orderId: 1,
total: 1,
"user.name": 1,
"user.email": 1
}
}
])
// Advanced $lookup with pipeline (multiple conditions)
db.orders.aggregate([
{
$lookup: {
from: "products",
let: { productIds: "$items.productId" },
pipeline: [
{ $match: { $expr: { $in: ["$_id", "$$productIds"] } } },
{ $project: { name: 1, price: 1, category: 1 } }
],
as: "productDetails"
}
}
])
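The simple form of $lookup behaves like a left outer join: every input document is kept, and the as field holds an array of matches that is empty when nothing matches. A plain JavaScript sketch of that behaviour, using a hypothetical lookup function and invented data:

```javascript
// Hypothetical helper: mimics { $lookup: { from, localField, foreignField, as } }.
function lookup(localDocs, foreignDocs, { localField, foreignField, as }) {
  return localDocs.map((doc) => ({
    ...doc,
    [as]: foreignDocs.filter((f) => f[foreignField] === doc[localField]),
  }));
}

const users = [{ _id: 1, name: "Aftab Khan" }];
const orders = [
  { orderId: "ORD-001", userId: 1 },
  { orderId: "ORD-002", userId: 99 }, // no matching user
];

const joined = lookup(orders, users, {
  localField: "userId",
  foreignField: "_id",
  as: "user",
});
console.log(joined[1].user); // []

// A plain $unwind on "user" then drops documents whose array is empty
const unwound = joined.flatMap((o) => o.user.map((u) => ({ ...o, user: u })));
console.log(unwound.length); // 1
```

To keep unmatched orders after the unwind, the $unwind stage accepts preserveNullAndEmptyArrays: true.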
Indexes
Advanced · 10 min read
Indexes dramatically speed up queries by allowing MongoDB to scan a sorted data structure instead of the entire collection. Without indexes, MongoDB does a collection scan (COLLSCAN) — reads every document.
// Create indexes
db.users.createIndex({ email: 1 }, { unique: true }) // unique email
db.users.createIndex({ createdAt: -1 }) // newest first
db.users.createIndex({ lastName: 1, firstName: 1 }) // compound index
db.users.createIndex({ age: 1, city: 1 }, { name: "age_city_idx" })
// Sparse — only index documents where field exists
db.users.createIndex({ phone: 1 }, { sparse: true })
// TTL — auto-delete documents after a time
db.sessions.createIndex({ expiresAt: 1 }, { expireAfterSeconds: 0 })
// Text index — full-text search
db.products.createIndex({ name: "text", description: "text" })
db.products.find({ $text: { $search: "laptop gaming" } },
{ score: { $meta: "textScore" } })
.sort({ score: { $meta: "textScore" } })
// Partial index — only index subset of documents
db.orders.createIndex(
{ userId: 1, createdAt: -1 },
{ partialFilterExpression: { status: "active" } }
)
// Manage indexes
db.users.getIndexes() // list all indexes
db.users.dropIndex("email_1") // drop by name
db.users.dropIndexes() // drop all (except _id)
db.users.find({...}).explain("executionStats") // check if index is used
💡 Index design rule: Build indexes based on your actual queries — use explain("executionStats") and look for IXSCAN (good) vs COLLSCAN (bad). Follow the ESR rule: Equality → Sort → Range.
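The ESR rule is about key order inside a compound index: fields compared for equality first, then sort fields, then range fields. A sketch that assembles an index spec in that order for a query such as find({ status: "active", price: { $gte: 500 } }).sort({ createdAt: -1 }); esrIndexSpec is a hypothetical helper and the field names are examples only.

```javascript
// Hypothetical helper: orders compound-index keys as Equality -> Sort -> Range.
function esrIndexSpec({ equality = [], sort = {}, range = [] }) {
  const spec = {};
  for (const field of equality) spec[field] = 1;
  for (const [field, dir] of Object.entries(sort)) {
    if (!(field in spec)) spec[field] = dir; // keep the query's sort direction
  }
  for (const field of range) {
    if (!(field in spec)) spec[field] = 1;
  }
  return spec;
}

const spec = esrIndexSpec({
  equality: ["status"],
  sort: { createdAt: -1 },
  range: ["price"],
});
console.log(JSON.stringify(spec)); // {"status":1,"createdAt":-1,"price":1}

// Usage in mongosh (not run here):
// db.products.createIndex(spec)
```

Treat the result as a starting point and confirm it with explain("executionStats") against real data.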
Schema Design Patterns
Advanced · 12 min read
Embedding vs Referencing
// EMBEDDING — store related data inside the document
// Use when: data is always read together, 1:1 or 1:few relationship
{
_id: ObjectId(),
name: "Aftab Khan",
address: { // embedded — always loaded with user
street: "MG Road",
city: "Greater Noida"
}
}
// REFERENCING — store ObjectId pointing to another collection
// Use when: many-to-many, data is large, data is updated independently
{
_id: ObjectId(),
userId: ObjectId("60a..."), // reference to users collection
productId: ObjectId("60b..."),
qty: 2
}
Common Design Patterns
- Bucket pattern — group time-series data into buckets (IoT sensors)
- Outlier pattern — handle documents that exceed normal size with an overflow flag
- Computed pattern — pre-compute expensive values and store them
- Extended Reference — embed a subset of referenced data to avoid $lookup
- Subset pattern — store only most recent N items in main doc, rest in another collection
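As an illustration of the bucket pattern, the sketch below groups raw sensor readings into one document per sensor per hour, and stores a running count and sum alongside them (a small use of the computed pattern). The function name, field names, and readings are all invented for the example.

```javascript
// Hypothetical helper: groups readings into one bucket document per sensor per hour.
function bucketReadings(readings) {
  const buckets = new Map();
  for (const { sensorId, ts, value } of readings) {
    const hour = new Date(ts);
    hour.setUTCMinutes(0, 0, 0); // truncate to the start of the hour (UTC)
    const key = `${sensorId}|${hour.toISOString()}`;
    const bucket = buckets.get(key) ?? { sensorId, hour, count: 0, sum: 0, readings: [] };
    bucket.readings.push({ ts: new Date(ts), value });
    bucket.count += 1;    // pre-computed values, so averages
    bucket.sum += value;  // don't require scanning the readings array
    buckets.set(key, bucket);
  }
  return [...buckets.values()];
}

const docs = bucketReadings([
  { sensorId: "s1", ts: "2024-06-01T09:05:00Z", value: 21.5 },
  { sensorId: "s1", ts: "2024-06-01T09:40:00Z", value: 22.5 },
  { sensorId: "s1", ts: "2024-06-01T10:01:00Z", value: 23 },
]);
console.log(docs.map((d) => [d.hour.toISOString(), d.count, d.sum]));
```

Each element of docs could then be stored as one document, so an hour of readings costs one document instead of one per reading.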
Mongoose Setup
Beginner · 5 min read
npm install mongoose
// config/db.js
const mongoose = require('mongoose');
const connectDB = async () => {
mongoose.set('strictQuery', true);
try {
await mongoose.connect(process.env.MONGO_URI);
console.log('MongoDB connected');
} catch (err) {
console.error('DB connection failed:', err);
process.exit(1);
}
};
// Graceful disconnect on app shutdown
process.on('SIGINT', async () => {
await mongoose.disconnect();
process.exit(0);
});
module.exports = connectDB;
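// index.js (assumes an Express app)
const express = require('express');
const connectDB = require('./config/db');
const app = express();
// CommonJS has no top-level await, so start the server once the connection resolves
connectDB().then(() => app.listen(3000));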
Mongoose Schema & Models
Intermediate · 10 min read
const { Schema, model, Types } = require('mongoose');
const productSchema = new Schema({
name: { type: String, required: true, trim: true },
slug: { type: String, unique: true, index: true },
price: { type: Number, required: true, min: 0 },
category: { type: String, enum: ['electronics', 'books', 'clothing'] },
tags: [String],
stock: { type: Number, default: 0 },
active: { type: Boolean, default: true },
seller: { type: Schema.Types.ObjectId, ref: 'User' }, // reference
images: [{ url: String, alt: String }], // array of objects
meta: {
views: { type: Number, default: 0 },
rating: { type: Number, min: 1, max: 5 }
}
}, { timestamps: true }); // adds createdAt, updatedAt
// Model = compiled schema
const Product = model('Product', productSchema);
// Mongoose CRUD
await Product.create({ name: 'Laptop', price: 55000 });
await Product.find({ active: true }).sort('-createdAt').limit(10);
await Product.findById(id);
await Product.findByIdAndUpdate(id, { $inc: { 'meta.views': 1 } }, { new: true });
await Product.findByIdAndDelete(id);
// Populate — resolve references
const product = await Product
.findById(id)
.populate('seller', 'name email'); // select only name & email from User
Mongoose Validation
Intermediate · 8 min read
const userSchema = new Schema({
name: {
type: String,
required: [true, 'Name is required'], // custom error msg
trim: true,
minlength: [2, 'Min 2 characters'],
maxlength: [50, 'Max 50 characters']
},
email: {
type: String,
required: true,
unique: true,
lowercase: true,
validate: {
validator: (v) => /^[\w.]+@[\w.]+\.\w{2,}$/.test(v),
message: 'Invalid email format'
}
},
age: {
type: Number,
min: [0, 'Age cannot be negative'],
max: [120, 'Age seems invalid']
},
role: {
type: String,
enum: { values: ['user', 'admin'], message: 'Invalid role' },
default: 'user'
},
password: {
type: String,
required: true,
minlength: 8,
select: false // exclude from queries by default
}
});
// Catch validation errors in controller
try {
  await User.create(data);
} catch (err) {
  if (err.name === 'ValidationError') {
    const errors = Object.values(err.errors).map(e => e.message);
    return res.status(400).json({ errors });
  }
  throw err; // don't swallow errors that aren't validation failures
}
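A Mongoose ValidationError carries an errors object keyed by field path, so the response can report which field failed rather than a flat list of messages. A sketch of that mapping, using a hypothetical formatValidationError helper and a hand-built object that only imitates the error's shape:

```javascript
// Hypothetical helper: maps a ValidationError-shaped object to a 400 response payload.
function formatValidationError(err) {
  if (!err || err.name !== "ValidationError" || !err.errors) return null;
  const fields = {};
  for (const [path, detail] of Object.entries(err.errors)) {
    fields[path] = detail.message;
  }
  return { status: 400, body: { errors: fields } };
}

// Stand-in for an error thrown by User.create(); built by hand for this sketch
const fakeErr = {
  name: "ValidationError",
  errors: {
    name: { message: "Name is required" },
    age: { message: "Age cannot be negative" },
  },
};

console.log(formatValidationError(fakeErr));
console.log(formatValidationError(new Error("boom"))); // null
```

A null result means the error is something else and should be passed on to the regular error handler.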
Mongoose Middleware (Hooks)
Advanced · 8 min read
// Pre-save hook — runs before save()
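// Assumes: const bcrypt = require('bcrypt'); const slugify = require('slugify');
userSchema.pre('save', async function() {
  // 'this' = the document being saved
  if (this.isModified('password')) {
    this.password = await bcrypt.hash(this.password, 12);
  }
  if (this.isNew) {
    this.slug = slugify(this.name, { lower: true }); // needs a 'slug' path in the schema
  }
  // async hooks finish when the returned promise resolves, so next() isn't needed
});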
// Post-save hook — runs after save()
userSchema.post('save', function(doc) {
console.log(`User ${doc.email} saved`);
});
// Pre-query hooks
userSchema.pre(/^find/, function(next) { // matches find, findOne, etc.
this.find({ active: { $ne: false } }); // always exclude inactive
next();
});
// Pre-delete hook — clean up related data
userSchema.pre('findOneAndDelete', async function(next) {
const userId = this.getQuery()._id;
await Order.deleteMany({ userId }); // cascade delete
next();
});
// Error handling middleware
userSchema.post('save', function(err, doc, next) {
if (err.code === 11000) next(new Error('Email already in use'));
else next(err);
});
Transactions
Advanced · 8 min read
Multi-document ACID transactions are supported on replica sets from MongoDB 4.0 and on sharded clusters from 4.2; they are not available on a standalone server. Use them when multiple documents must be updated atomically.
// Mongoose transaction
const session = await mongoose.startSession();
session.startTransaction();
try {
// Pass session to every operation
const [order] = await Order.create([{ // create() with an array returns an array
userId: req.user.id,
items: req.body.items,
total: req.body.total
}], { session });
await Product.updateOne(
{ _id: req.body.productId },
{ $inc: { stock: -req.body.qty } },
{ session }
);
await User.findByIdAndUpdate(
req.user.id,
{ $inc: { balance: -req.body.total } },
{ session }
);
await session.commitTransaction();
res.status(201).json(order);
} catch (err) {
await session.abortTransaction(); // roll back all changes
next(err);
} finally {
await session.endSession();
}