Analysis and Optimization of Slow SQL Queries (EXPLAIN ANALYZE)
A slow query in production is a specific cause of degradation: full table scan on a 50 million row table, sorting without index, cartesian product from missing JOIN condition. EXPLAIN ANALYZE shows what PostgreSQL actually does — not what the planner thinks it will do, but what really happened at runtime.
How to Read EXPLAIN ANALYZE
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT u.name, COUNT(o.id) AS order_count
FROM users u
LEFT JOIN orders o ON o.user_id = u.id
WHERE u.country = 'US'
AND o.created_at > '2025-01-01'
GROUP BY u.id, u.name
ORDER BY order_count DESC
LIMIT 20;
Example output:
Limit (cost=45231.23..45231.28 rows=20) (actual time=892.341..892.345 rows=20)
-> Sort (cost=45231.23..45387.41) (actual time=892.340..892.341 rows=20)
Sort Key: (count(o.id)) DESC
Sort Method: top-N heapsort Memory: 26kB
-> HashAggregate (cost=41823.10..43011.52) (actual time=867.234..880.123 rows=12340)
-> Hash Left Join (cost=12345.00..40234.12) (actual time=234.123..801.234 rows=450000)
Hash Cond: (o.user_id = u.id)
Buffers: shared hit=234 read=12890
-> Seq Scan on orders o (cost=0.00..18234.00 rows=450000) (actual time=0.023..345.234 rows=450000)
Filter: (created_at > '2025-01-01')
Rows Removed by Filter: 1234567
Buffers: shared hit=12 read=12878
-> Hash (cost=9876.00..9876.00 rows=123456) (actual time=234.012..234.012 rows=98765)
-> Seq Scan on users u (cost=0.00..9876.00 rows=123456) (actual time=0.021..189.234 rows=98765)
Filter: (country = 'US')
What we see and what to do about it:
-
Seq Scan on orderswithRows Removed by Filter: 1234567— scans 1.7M rows, filters 1.23M. Need index on(created_at)or(user_id, created_at). -
Buffers: shared hit=12 read=12878— almost all pages read from disk (read), not cache. Either table is larger thanshared_buffersor data is rarely accessed. -
actual time=892ms— for a UI button this is a disaster.
Finding Slow Queries via pg_stat_statements
-- Enable extension
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;
-- postgresql.conf
-- shared_preload_libraries = 'pg_stat_statements'
-- pg_stat_statements.max = 10000
-- pg_stat_statements.track = all
-- Top by total time (most important)
SELECT
left(query, 100) AS query_preview,
calls,
round(total_exec_time::numeric, 0) AS total_ms,
round(mean_exec_time::numeric, 2) AS avg_ms,
round(stddev_exec_time::numeric, 2) AS stddev_ms,
rows
FROM pg_stat_statements
WHERE dbid = (SELECT oid FROM pg_database WHERE datname = current_database())
ORDER BY total_exec_time DESC
LIMIT 20;
-- Reset statistics after optimization
SELECT pg_stat_statements_reset();
Patterns of Slow Queries and Solutions
Seq Scan on Large Table
-- Slow: full scan
SELECT * FROM orders WHERE status = 'pending';
-- EXPLAIN: Seq Scan on orders (rows=5000000) Filter: status='pending'
-- Solution: index
CREATE INDEX CONCURRENTLY idx_orders_status ON orders(status)
WHERE status IN ('pending', 'processing');
-- Partial index — only for active statuses, smaller size
-- Even better: covering index (doesn't access table)
CREATE INDEX CONCURRENTLY idx_orders_status_cover
ON orders(status, created_at DESC)
INCLUDE (id, user_id, total_amount)
WHERE status IN ('pending', 'processing');
Inefficient JOIN
-- Slow: JOIN without index on orders.user_id
SELECT u.name, o.total
FROM users u
JOIN orders o ON o.user_id = u.id
WHERE u.registered_at > '2025-01-01';
-- EXPLAIN: Hash Join ... Seq Scan on orders (rows=5000000)
-- Solution
CREATE INDEX CONCURRENTLY idx_orders_user_id ON orders(user_id);
-- After: Hash Join ... Index Scan on orders
N+1 Query in ORM
This is not a SQL problem but architectural, but shows as slow queries:
-- N+1: 1 query for list + N queries for each related object
-- In pg_stat_statements: simple SELECT executed 10000 times per second
-- In Eloquent: was
$orders = Order::all();
foreach ($orders as $order) {
echo $order->user->name; // N queries to users
}
-- Now: eager loading
$orders = Order::with('user:id,name')->get();
-- One JOIN instead of N queries
Sorting Without Index
-- Slow: Sort Method: external merge Disk: 45678kB
SELECT * FROM events ORDER BY created_at DESC LIMIT 100;
-- EXPLAIN: Sort ... actual time=3400ms
-- Sorting via temporary disk file
-- Solution: index on sort field
CREATE INDEX CONCURRENTLY idx_events_created_at ON events(created_at DESC);
-- After: Index Scan Backward — 0.3ms
LIKE with Prefix Wildcard
-- B-Tree index doesn't work for LIKE '%text%'
SELECT * FROM products WHERE name LIKE '%phone%';
-- EXPLAIN: Seq Scan, Filter: name LIKE '%phone%'
-- Solution 1: pg_trgm for any LIKE
CREATE EXTENSION IF NOT EXISTS pg_trgm;
CREATE INDEX CONCURRENTLY idx_products_name_trgm
ON products USING gin(name gin_trgm_ops);
-- Now LIKE '%phone%' uses index
-- Solution 2: full-text search for text queries
ALTER TABLE products ADD COLUMN search_vector tsvector;
CREATE INDEX idx_products_fts ON products USING gin(search_vector);
UPDATE products SET search_vector = to_tsvector('english', name || ' ' || description);
-- Query:
SELECT * FROM products
WHERE search_vector @@ plainto_tsquery('english', 'phone Samsung');
Function on Indexed Column
-- Bad: function on column disables index usage
SELECT * FROM orders WHERE DATE(created_at) = '2025-01-15';
-- EXPLAIN: Seq Scan (function DATE() applied to each row)
-- Good: range without function
SELECT * FROM orders
WHERE created_at >= '2025-01-15 00:00:00'
AND created_at < '2025-01-16 00:00:00';
-- EXPLAIN: Index Scan (uses index on created_at)
-- Or: functional index
CREATE INDEX idx_orders_date ON orders(DATE(created_at));
SELECT * FROM orders WHERE DATE(created_at) = '2025-01-15';
Analysis Tools
# auto_explain: automatically log plans for slow queries
# postgresql.conf:
# shared_preload_libraries = 'pg_stat_statements,auto_explain'
# auto_explain.log_min_duration = 1000 # ms
# auto_explain.log_analyze = true
# auto_explain.log_buffers = true
# Plan visualization: https://explain.dalibo.com or https://explain.depesz.com
# Copy output of EXPLAIN (ANALYZE, BUFFERS, FORMAT JSON) — get visual tree
Optimization Process
- Find top-10 queries by
total_exec_timeviapg_stat_statements -
EXPLAIN (ANALYZE, BUFFERS)on each - Identify bottleneck: Seq Scan, sort, hash join
- Create or modify index (
CONCURRENTLY— without locking) -
ANALYZE table_name— update statistics - Repeat
EXPLAIN ANALYZE— compare plans -
pg_stat_statements_reset()— reset and observe new statistics
Cycle takes from hours to days depending on number of problem queries and data volume.







