# PostgreSQL Performance Tuning: Indexing, Vacuum, Partitioning, and pgBadger
**TL;DR:** A well-tuned PostgreSQL instance can often sustain several times the throughput of the default configuration. This guide covers the four levers that matter: correct indexing (b-tree, partial, GIN, BRIN), vacuum + autovacuum strategy, partitioning huge tables, and analysing slow queries with pgBadger — all on DomainIndia VPS.
## When to tune
Default PostgreSQL config is conservative — safe for any hardware but underuses modern VPS resources. Tune when:
- Queries suddenly slow after table grew past 1M rows
- `pg_stat_activity` shows many sessions stuck on waits (check `wait_event_type`/`wait_event`; the old `waiting` column was removed in 9.6)
- CPU spikes during business hours
- Disk usage climbs month over month (often: table/index bloat from a lack of VACUUM)
- Long-running maintenance tasks blocking writes
If your site is small (<100K rows) and fast, don't tune. Default PG works.
## Level 1 — Right-size memory settings
`/var/lib/pgsql/data/postgresql.conf` (or `/etc/postgresql/16/main/postgresql.conf` on Ubuntu):
| Setting | Default | 4 GB VPS | 8 GB VPS | 16 GB VPS |
|---|---|---|---|---|
| `shared_buffers` | 128MB | 1GB | 2GB | 4GB |
| `effective_cache_size` | 4GB | 3GB | 6GB | 12GB |
| `work_mem` | 4MB | 8MB | 16MB | 32MB |
| `maintenance_work_mem` | 64MB | 256MB | 512MB | 1GB |
| `max_connections` | 100 | 100 | 200 | 300 |
| `random_page_cost` | 4.0 | 1.1 (SSD) | 1.1 | 1.1 |
| `effective_io_concurrency` | 1 | 200 (SSD) | 200 | 200 |
Apply and restart (`shared_buffers` and `max_connections` only take effect after a full restart; most other settings can be applied with `SELECT pg_reload_conf();`):
```bash
sudo systemctl restart postgresql
```
The online `pgtune` tool generates a starting config for your RAM and workload type.
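The ratios behind the table above can be sketched in a few lines. These ratios are this guide's assumptions (roughly 25% of RAM for `shared_buffers`, 75% for `effective_cache_size`, RAM/16 capped at 1GB for `maintenance_work_mem`), mirroring pgtune-style heuristics, not PostgreSQL defaults:

```python
# Rough sizing heuristic for a dedicated database VPS.
# Ratios are assumptions from the table above, not official defaults.
def suggest_settings(ram_gb: int) -> dict:
    ram_mb = ram_gb * 1024
    return {
        "shared_buffers": f"{ram_mb // 4}MB",            # ~25% of RAM
        "effective_cache_size": f"{ram_mb * 3 // 4}MB",  # ~75% of RAM
        "maintenance_work_mem": f"{min(ram_mb // 16, 1024)}MB",  # capped at 1GB
        "random_page_cost": 1.1,  # SSD assumption
    }

print(suggest_settings(8))
```

Treat the output as a starting point: workload shape (OLTP vs analytics) shifts these numbers more than raw RAM does.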
## Level 2 — Indexing strategies
### b-tree (default) — equality + range
```sql
-- Fast for:
-- WHERE email = ...
-- WHERE created_at > '2026-01-01'
-- ORDER BY price
CREATE INDEX idx_users_email ON users(email);
CREATE INDEX idx_orders_created ON orders(created_at);
```
### Composite — multi-column
```sql
-- Fast for WHERE status='paid' AND user_id=42 ORDER BY created_at
CREATE INDEX idx_orders_status_user_created
ON orders(status, user_id, created_at DESC);
```
Column order matters: put columns filtered with `=` first, then the range or `ORDER BY` column last. An index is only usable from its leading columns, so match the order to your query pattern.
### Partial — index only relevant rows
Huge win for sparse columns:
```sql
-- 95% of orders are 'paid' — only index unusual ones
CREATE INDEX idx_orders_pending
ON orders(id)
WHERE status != 'paid';
```
Index is smaller → faster writes, smaller reads, better cache use.
### GIN — full-text search, JSONB, arrays
```sql
-- Search JSON metadata
CREATE INDEX idx_products_meta
ON products USING gin(metadata);
SELECT * FROM products WHERE metadata @> '{"category": "electronics"}';
-- Full-text search
ALTER TABLE articles ADD COLUMN tsv tsvector
GENERATED ALWAYS AS (to_tsvector('english', title || ' ' || body)) STORED;
CREATE INDEX idx_articles_tsv ON articles USING gin(tsv);
SELECT * FROM articles WHERE tsv @@ plainto_tsquery('english', 'hosting tips');
```
### BRIN — huge append-only tables
```sql
-- Log table with 100M rows by time
CREATE INDEX idx_logs_created_brin
ON logs USING brin(created_at);
```
BRIN index is tiny (KB) vs GB for b-tree. Best when data is physically clustered by the indexed column (append-only).
### Covering — index-only scans
```sql
-- Query: SELECT id, email FROM users WHERE status = 'active'
CREATE INDEX idx_users_active_covering
ON users(status) INCLUDE(id, email)
WHERE status = 'active';
```
PostgreSQL can answer the query from the index alone (an index-only scan), skipping the table entirely: a huge speedup, provided vacuum keeps the visibility map current.
## Level 3 — EXPLAIN ANALYZE
Before tuning, measure. Always.
```sql
EXPLAIN (ANALYZE, BUFFERS, FORMAT JSON)
SELECT u.name, COUNT(o.id)
FROM users u
LEFT JOIN orders o ON o.user_id = u.id
WHERE u.created_at > now() - interval '30 days'
GROUP BY u.name;
```
Look for:
- `Seq Scan` on big tables = missing index
- `Nested Loop` on big tables = likely misestimated rows; test by setting `enable_nestloop = off` in the session, or improve statistics
- Actual rows >> estimated rows = outdated statistics (run `ANALYZE`)
- `Buffers: shared read=10000` = data came from disk, not cache; a larger `shared_buffers` helps
Visualise plans at [explain.depesz.com](https://explain.depesz.com) or [pgmustard.com](https://pgmustard.com).
## Level 4 — Vacuum strategy
PostgreSQL's MVCC keeps "dead tuples" after UPDATE/DELETE. Autovacuum removes them. Without it: bloat grows, queries slow down.
Check bloat:
```sql
SELECT schemaname, relname,
pg_size_pretty(pg_relation_size(relid)) AS size,
n_dead_tup,
last_autovacuum
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 10;
```
If `n_dead_tup > 20% of rowcount`, autovacuum is behind. Tune:
```ini
# postgresql.conf
autovacuum_max_workers = 4
autovacuum_vacuum_scale_factor = 0.05 # vacuum at 5% dead (default 20%)
autovacuum_analyze_scale_factor = 0.02
autovacuum_vacuum_cost_limit = 2000 # aggressive
```
For high-write tables, set per-table:
```sql
ALTER TABLE orders SET (
autovacuum_vacuum_scale_factor = 0.02,
autovacuum_analyze_scale_factor = 0.01
);
```
**Manual vacuum for recovery:**
```sql
VACUUM (VERBOSE, ANALYZE) orders;
-- Nuclear option (locks table briefly, reclaims disk):
VACUUM FULL orders;
```
Don't run `VACUUM FULL` on large tables in live production — it takes an exclusive lock, potentially for hours.
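The "20% dead tuples" rule from earlier in this section can be turned into a simple monitoring check. A sketch — the rows mimic `(relname, n_live_tup, n_dead_tup)` from `pg_stat_user_tables`, and the sample data is invented:

```python
# Flag tables whose dead-tuple ratio exceeds a threshold (default 20%),
# i.e. tables where autovacuum is falling behind.
def tables_behind_on_vacuum(rows, threshold=0.20):
    behind = []
    for relname, live, dead in rows:
        total = live + dead
        if total and dead / total > threshold:
            behind.append((relname, round(dead / total, 2)))
    return behind

sample = [
    ("orders", 1_000_000, 400_000),  # ~28.6% dead -> flagged
    ("users", 50_000, 1_000),        # ~2% dead -> fine
]
print(tables_behind_on_vacuum(sample))
```

Wire this to the bloat query above via your monitoring stack and alert before bloat becomes a disk-space incident.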
## Level 5 — Partitioning huge tables
When a single table > 50M rows, partitioning by time often helps.
```sql
-- Range partition by month
CREATE TABLE events (
id bigserial,
created_at timestamptz NOT NULL,
user_id uuid,
data jsonb
) PARTITION BY RANGE (created_at);
CREATE TABLE events_2026_04 PARTITION OF events
FOR VALUES FROM ('2026-04-01') TO ('2026-05-01');
CREATE TABLE events_2026_05 PARTITION OF events
FOR VALUES FROM ('2026-05-01') TO ('2026-06-01');
-- Indexes on each partition (on PostgreSQL 11+, creating the index once
-- on the parent table propagates it to every partition automatically)
CREATE INDEX ON events_2026_04 (user_id);
CREATE INDEX ON events_2026_05 (user_id);
```
Use `pg_partman` extension to auto-create future partitions + drop old ones.
Benefits:
- Query for "last 30 days" hits 1 partition, not all
- `DROP TABLE events_2023_01` instantly reclaims old data
- Individual partition vacuum/index maintenance
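`pg_partman` automates partition creation, but the monthly DDL above is mechanical enough to generate yourself. A sketch, using the `events` example from this section (adjust names for your schema):

```python
from datetime import date

# Generate CREATE TABLE ... PARTITION OF DDL for one calendar month.
def month_partition_ddl(table: str, year: int, month: int) -> str:
    start = date(year, month, 1)
    # first day of the following month (roll over December into January)
    end = date(year + (month == 12), month % 12 + 1, 1)
    name = f"{table}_{start:%Y_%m}"
    return (f"CREATE TABLE {name} PARTITION OF {table}\n"
            f"  FOR VALUES FROM ('{start}') TO ('{end}');")

print(month_partition_ddl("events", 2026, 6))
```

A nightly cron job that emits and executes DDL for the next few months keeps you from ever inserting into a missing partition.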
## Level 6 — Connection pooling
Every new connection costs 1-2 MB RAM + startup time. Use **PgBouncer** for pooling:
```bash
sudo dnf install pgbouncer
# /etc/pgbouncer/pgbouncer.ini
```
```ini
[databases]
mydb = host=localhost port=5432 dbname=mydb
[pgbouncer]
listen_addr = 127.0.0.1
listen_port = 6432
pool_mode = transaction
max_client_conn = 1000
default_pool_size = 20
```
App connects to `localhost:6432` instead of `:5432`. 1000 app connections → 20 real PG connections.
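How big should `default_pool_size` be? A widely cited rule of thumb from the PostgreSQL wiki is `(cores × 2) + effective_spindles`; this sketch is that heuristic, not a PgBouncer API, and the spindle term is assumed to be 1 on SSD-backed VPSes:

```python
# Rule-of-thumb sizing for the real server-side connection pool.
def suggest_pool_size(cpu_cores: int, effective_spindles: int = 1) -> int:
    return cpu_cores * 2 + effective_spindles

print(suggest_pool_size(4))  # a 4-core VPS -> 9
```

Start there, then raise it only if `pgbouncer` shows clients queueing (`cl_waiting`) while the database CPU is still idle.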
## Level 7 — Slow query log + pgBadger
Log slow queries:
```ini
# postgresql.conf
log_min_duration_statement = 500 # ms — log queries slower than this
log_line_prefix = '%t [%p]: db=%d,user=%u,app=%a,client=%h '
log_statement = 'ddl'
log_duration = off
```
Analyse logs with pgBadger (HTML reports):
```bash
sudo dnf install pgbadger
pgbadger /var/log/postgresql/postgresql-*.log -o /var/www/html/pgbadger.html
```
Serve via nginx with basic auth. See: slowest queries, most-called, peak hours, errors.
## Level 8 — Replication + read replicas
For read-heavy apps, add a streaming replica:
```ini
# On primary (postgresql.conf)
wal_level = replica
max_wal_senders = 10
wal_keep_size = 1GB
```
```bash
# On replica: take a base backup, then start
pg_basebackup -h primary-ip -D /var/lib/pgsql/17/data -U replicator -P -R
sudo systemctl start postgresql
```
Route read queries to replica in your app:
```python
from sqlalchemy import create_engine

write_engine = create_engine(PRIMARY_URL)  # connection URL of the primary
read_engine = create_engine(REPLICA_URL)   # connection URL of the replica
# Use read_engine for SELECTs, write_engine for writes
```
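A minimal routing helper on top of the two engines above. The statement-sniffing rule is a deliberate simplification (this guide's assumption, not a SQLAlchemy feature): real apps route per transaction and account for replication lag, and a `WITH ... INSERT` statement would be misrouted here.

```python
# Naive read/write router: SELECTs go to the replica, everything else
# to the primary. Sketch only - not lag-aware, not transaction-aware.
def pick_engine(sql: str, read_engine, write_engine):
    words = sql.lstrip().split(None, 1)
    first_word = words[0].upper() if words else ""
    return read_engine if first_word == "SELECT" else write_engine

# Illustration with placeholder engine objects:
assert pick_engine("SELECT * FROM users", "replica", "primary") == "replica"
assert pick_engine("UPDATE users SET x=1", "replica", "primary") == "primary"
```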
## Common pitfalls
- Indexing every column "just in case" — each index slows every write and competes for cache; drop unused ones.
- Setting `work_mem` too high — it applies per sort/hash operation per connection, so 100 connections × 64MB can exhaust RAM.
- Running `VACUUM FULL` on a live table during business hours (see Level 4).
- Partitioning small tables — below tens of millions of rows the planner overhead usually outweighs the benefit.
- Forgetting `ANALYZE` after bulk loads, leaving the planner with stale statistics.
## FAQ
**Q: How do I know if I need tuning?**
If you have no slow queries and CPU is <30% during peak — don't tune. If queries occasionally exceed 1 second or CPU stays >70% sustained — start tuning.

**Q: Postgres or MySQL?**
For modern apps: Postgres. Better JSON support, strong consistency guarantees, GIN indexes, transactional DDL, materialised views. MySQL is simpler but less capable. Use MySQL only if you're locked in (WordPress, older apps).

**Q: pg_repack or VACUUM FULL for bloat?**
`pg_repack` rebuilds tables without holding an exclusive lock for the duration — production-safe. `VACUUM FULL` locks the table for the whole rebuild — emergency use only.

**Q: How do I benchmark?**
`pgbench` (ships with Postgres) for generic OLTP load. For your real workload, replay the slow query log against staging.

**Q: Do I need a DBA?**
For <1 TB of data and <1K req/sec, an engineer plus these tuning patterns is enough. At scale (multi-TB, 10K+ req/sec), a DBA pays for themselves.