PostgreSQL Performance Tuning (shared_buffers, work_mem, effective_cache_size)
Default PostgreSQL configuration — shared_buffers = 128MB, work_mem = 4MB — was written to run on any 2005 hardware. On modern servers with 32–128 GB RAM it leaves 95% of memory unused. Proper tuning is not just "set bigger numbers" but understanding how PostgreSQL uses memory and interacts with the OS.
How PostgreSQL Uses Memory
shared_buffers — common cache of database pages for all processes. Pages that PostgreSQL read from disk go here. If a page is in shared_buffers — the next query gets it from memory.
work_mem — memory for each sort or hash join operation within a single query. If a query has 3 sort nodes, it can consume 3 × work_mem. With 100 parallel connections running heavy queries, consumption can be 100 × 3 × work_mem.
effective_cache_size — not actually allocated memory, but a hint to the planner about how much memory is available for caching (OS + shared_buffers). Affects the choice between index scan and seq scan.
maintenance_work_mem — for VACUUM, CREATE INDEX, ALTER TABLE. Doesn't affect normal queries.
Recommended Values
For a dedicated PostgreSQL server with 32 GB RAM:
# postgresql.conf
# 25% RAM for dedicated server
shared_buffers = 8GB
# Remaining RAM minus shared_buffers and system
effective_cache_size = 24GB # shared_buffers + OS page cache
# Warning: work_mem multiplies by number of parallel operations
# For OLTP with short queries: 4-16 MB
# For analytics with GROUP BY / ORDER BY: 64-256 MB
work_mem = 16MB
# For VACUUM and CREATE INDEX — increase only during maintenance
maintenance_work_mem = 2GB
# WAL buffer — 64 MB is enough for most workloads
wal_buffers = 64MB
Tool pgtune (https://pgtune.leopard.in.ua) generates starting values by workload type (OLTP, DW, Web).
Planner Configuration
# Cost model for SSD (default designed for HDD)
random_page_cost = 1.1 # SSD: 1.1, HDD: 4.0 (default)
seq_page_cost = 1.0 # usually don't change
# Enable parallel queries (PostgreSQL 9.6+)
max_parallel_workers_per_gather = 4 # up to CPU cores
max_parallel_workers = 8
parallel_tuple_cost = 0.1
parallel_setup_cost = 1000.0
# Statistics: more = better planner chooses indexes
default_statistics_target = 100 # default 100, for complex columns — up to 500
For specific columns with skewed data distribution:
ALTER TABLE orders ALTER COLUMN status SET STATISTICS 500;
ANALYZE orders;
Checkpoint and WAL
# Checkpoints happen less frequently — fewer I/O spikes
checkpoint_completion_target = 0.9 # spread checkpoint over 90% of interval
checkpoint_timeout = 15min # maximum interval (default 5min)
max_wal_size = 4GB # with heavy writes — increase
# fsync must NOT be disabled in production (data loss on power failure)
fsync = on
# synchronous_commit = off — allow loss of up to 60ms of transactions
# Appropriate for non-critical data, not for financial
synchronous_commit = on
Tuning for Specific Case: Sorting
Query slowly executes ORDER BY on large table — sorting happens via temporary disk file:
-- Check: are there temp files in the query
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT * FROM events
WHERE user_id = 1
ORDER BY created_at DESC
LIMIT 100;
-- If output shows "Sort Method: external merge Disk: 45678kB" — need index or more work_mem
-- Create covering index
CREATE INDEX CONCURRENTLY idx_events_user_date
ON events(user_id, created_at DESC)
INCLUDE (id, event_type, payload);
-- INCLUDE adds data to index, query works without touching table
Hit Rate Monitoring
-- Efficiency of shared_buffers: should be > 99% for OLTP
SELECT
sum(heap_blks_hit) AS heap_hit,
sum(heap_blks_read) AS heap_read,
round(
sum(heap_blks_hit)::numeric /
nullif(sum(heap_blks_hit) + sum(heap_blks_read), 0) * 100,
2
) AS cache_hit_ratio
FROM pg_statio_user_tables;
-- Similarly for indexes
SELECT
round(
sum(idx_blks_hit)::numeric /
nullif(sum(idx_blks_hit) + sum(idx_blks_read), 0) * 100,
2
) AS index_hit_ratio
FROM pg_statio_user_indexes;
If cache_hit_ratio < 99% — either shared_buffers is too small, or working data set doesn't fit in memory at all.
pg_buffercache Extension
CREATE EXTENSION pg_buffercache;
-- What's currently in shared_buffers?
SELECT c.relname,
count(*) AS buffers,
round(count(*) * 8192.0 / 1024 / 1024, 1) AS size_mb,
round(count(*) * 100.0 / (SELECT count(*) FROM pg_buffercache), 1) AS pct
FROM pg_buffercache b
JOIN pg_class c ON b.relfilenode = c.relfilenode
GROUP BY c.relname
ORDER BY buffers DESC
LIMIT 20;
Connection Parameters
max_connections = 200 # for direct connections; with PgBouncer can be less
# Stack size for each backend process
# Don't reduce below 100KB
# Reducing max_stack_depth is not recommended
When using PgBouncer in transaction mode: max_connections = 50-100 on PostgreSQL (actual backends), thousands of clients connect to PgBouncer.
Order of Applying Changes
| Parameter | Requires Restart |
|---|---|
| shared_buffers | Yes |
| max_connections | Yes |
| work_mem | No (RELOAD) |
| effective_cache_size | No |
| checkpoint_timeout | No |
| random_page_cost | No |
| max_parallel_workers | No |
-- Apply without restart
SELECT pg_reload_conf();
-- Verify change took effect
SHOW work_mem;
Workload Profile Affects Tuning
Web OLTP (many short transactions, INSERT/SELECT by PK): small work_mem (4–16 MB), large shared_buffers, max_connections via PgBouncer.
Analytical queries (GROUP BY, window functions, large JOIN): large work_mem (256 MB – 1 GB), max_parallel_workers_per_gather = 4+, can reduce max_connections.
Mixed workload: PgBouncer for OLTP connections + separate replica for analytics with different work_mem.







