MongoDB Database Administration for Web Applications
MongoDB is a document database with horizontal scaling. Flexible schema is convenient during development, but in production requires discipline: without indexes queries become collection scans, without compaction WiredTiger doesn't free space, without oplog monitoring replicas lag and lose synchronization.
Initial Audit
// mongosh
use mydb
// Database statistics
db.stats({ scale: 1024 * 1024 }) // in MB
// Collection statistics
db.runCommand({ listCollections: 1 }).cursor.firstBatch.forEach(c => {
const stats = db[c.name].stats({ scale: 1024 })
printjson({
name: c.name,
docs: stats.count,
size_kb: stats.size,
storageSize_kb: stats.storageSize,
totalIndexSize_kb: stats.totalIndexSize,
nindexes: stats.nindexes
})
})
// Collection indexes with sizes
db.orders.stats().indexSizes
// Operations running right now (> 1 second)
db.currentOp({ "secs_running": { $gt: 1 } })
Indexing
// Compound index for typical query by user and date
db.orders.createIndex(
{ user_id: 1, created_at: -1 },
{ background: true, name: "idx_user_date" }
)
// Partial index — only for active orders
db.orders.createIndex(
{ created_at: -1 },
{
partialFilterExpression: { status: { $in: ["pending", "processing"] } },
name: "idx_active_orders_date"
}
)
// TTL index for auto-deleting expired sessions
db.sessions.createIndex(
{ expires_at: 1 },
{ expireAfterSeconds: 0, name: "ttl_sessions" }
)
// Check index usage
db.orders.aggregate([
{ $indexStats: {} }
]).forEach(stat => {
if (stat.accesses.ops === 0) {
print("UNUSED INDEX: " + stat.name)
}
})
background: true allows creating indexes without locking the database. In MongoDB 4.2+ this is the default behavior.
Explain and Query Optimization
// Analyze query execution plan
db.orders.find({ user_id: "507f1f77bcf86cd799439011", status: "pending" })
.explain("executionStats")
// Key fields in executionStats:
// - stage: "COLLSCAN" means full collection scan — need an index
// - stage: "IXSCAN" — uses index, good
// - nReturned vs totalDocsExamined — closer to 1:1 ratio, the better the index
// - executionTimeMillis — execution time
Enable profiler to find slow queries:
// Profile queries > 100ms
db.setProfilingLevel(1, { slowms: 100 })
// View slow queries
db.system.profile.find(
{ millis: { $gt: 100 } },
{ ns: 1, command: 1, millis: 1, ts: 1 }
).sort({ millis: -1 }).limit(20)
// Disable profiler (affects performance!)
db.setProfilingLevel(0)
Replica Set: Setup and Monitoring
// Initialize replica set
rs.initiate({
_id: "rs0",
members: [
{ _id: 0, host: "mongo1:27017", priority: 2 },
{ _id: 1, host: "mongo2:27017", priority: 1 },
{ _id: 2, host: "mongo3:27017", arbiterOnly: true }
]
})
// Replication status
rs.status()
// Replica lag (critical: if oplog overflows — replica loses synchronization)
rs.printSecondaryReplicationInfo()
// Oplog size (should cover several hours of operations)
rs.printReplicationInfo()
// oplog size: 5760MB
// log length start to end: 14400 secs (4 hrs)
// If less than 4–6 hours — increase:
db.adminCommand({ replSetResizeOplog: 1, size: 10240 }) // 10 GB
Backup
mongodump — for databases up to several GB:
# Backup with consistency point on replica set
mongodump \
--uri="mongodb://backup_user:password@mongo1:27017,mongo2:27017/mydb?replicaSet=rs0" \
--readPreference=secondary \
--oplog \
--gzip \
--out=/backups/dump_$(date +%Y%m%d_%H%M)
# Restore
mongorestore \
--uri="mongodb://admin:password@localhost:27017" \
--gzip \
--oplogReplay \
/backups/dump_20250101_0300
mongodump with --oplog only blocks reads on secondary — for production always backup from secondary.
For large databases (>100 GB) — filesystem snapshots (LVM, AWS EBS Snapshot) are faster physically.
WiredTiger: Compaction
MongoDB (WiredTiger) doesn't return freed space to OS automatically. After bulk document deletion — run compact:
// Compact specific collection
// WARNING: locks database during execution!
db.runCommand({ compact: "orders" })
// On replica set — compact sequentially on secondary, then switch primary
// Run during maintenance window
Check overhead before and after:
const stats = db.orders.stats({ scale: 1024 * 1024 })
console.log({
dataSize: stats.size.toFixed(1) + ' MB',
storageSize: stats.storageSize.toFixed(1) + ' MB',
overhead_pct: ((1 - stats.size / stats.storageSize) * 100).toFixed(1) + '%'
})
User Management
use mydb
// Create application user
db.createUser({
user: "app_user",
pwd: "strong_password",
roles: [
{ role: "readWrite", db: "mydb" }
]
})
// Read-only for analytics
db.createUser({
user: "analytics",
pwd: "analytics_password",
roles: [
{ role: "read", db: "mydb" }
]
})
// List users
db.getUsers({ showCredentials: false })
Connection Pooling
Mongoose (Node.js) opens 5 connections by default. For high-load applications:
mongoose.connect(process.env.MONGO_URI, {
maxPoolSize: 50, // maximum connections in pool
minPoolSize: 5,
serverSelectionTimeoutMS: 5000,
socketTimeoutMS: 45000,
connectTimeoutMS: 10000
})
For PHP (MongoDB extension):
$client = new MongoDB\Client(
"mongodb://app_user:password@mongo1:27017,mongo2:27017/mydb",
["replicaSet" => "rs0", "readPreference" => "secondaryPreferred"],
["typeMap" => ["root" => "array", "document" => "array"]]
);
Monitoring via mongostat and mongotop
# Real-time operation statistics (each second)
mongostat --uri="mongodb://admin:password@localhost:27017" --discover
# Top collections by read/write time
mongotop --uri="mongodb://admin:password@localhost:27017" 5
# Key metrics to monitor in Prometheus (via mongodb_exporter):
# - mongodb_ss_globalLock_currentQueue_total > 0 — lock queue
# - mongodb_ss_connections_current vs maxIncomingConnections
# - replication lag on secondary
# - opcounters: insert/query/update/delete per second







