The Production-Ready VPS Playbook: OS Hardening, Gateways, TLS, Observability & Ops

5 min read | Published 4 Mar 2026 | Updated 15 Apr 2026 | 514 views

Short Summary

This add-on to the main guide fills the gaps teams commonly hit when taking a KVM NVMe VPS from first boot to production: OS baseline, secrets management, TLS and security headers, reverse proxy best practices (Nginx/Caddy), observability, backups/DR, Cloudflare edge config, performance tuning, runbooks, and stack-specific connection snippets (MongoDB included).


Table of Contents

  1. OS & Security Baseline

  2. Secrets & Configuration Management

  3. Edge Gateways: Nginx vs Caddy

  4. TLS & App Security Headers

  5. Process Management Patterns

  6. Logging & Observability

  7. Backups & Disaster Recovery

  8. Networking: Cloudflare, RealIP, WebSockets

  9. Performance Tuning Cheatsheet

  10. Compliance & Data Handling

  11. Runbooks: On-Call & Incident Basics

  12. Templates & Snippets

    • Nginx SSR/WebSockets

    • Caddy reverse proxy

    • Cloudflare RealIP

    • systemd service template

    • .env example

  13. Stack DB Connect QuickRefs (MongoDB)

  14. Conclusion / Next Steps


1) OS & Security Baseline

Goal: A hardened, predictable foundation for all stacks.

Checklist

  • Create a non-root sudo user; disable SSH password authentication; use ed25519 keys

  • UFW: allow 22, 80, 443; deny all other inbound traffic by default

  • Fail2ban with jails for SSH, Nginx, and auth logs

  • unattended-upgrades for security patches

  • Time sync with chrony; set timezone; enable NTP

  • Kernel/VM: disable Transparent Huge Pages (THP), set vm.swappiness=1

  • FD limits: LimitNOFILE=64000 via systemd drop-in

  • Swap: create swap (1-2x RAM) or confirm existing swap is adequate for crash tolerance

  • Filesystem: prefer ext4/xfs on NVMe; consider noatime/lazytime mounts
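
The kernel items in the checklist are easy to verify after a reboot. The following is a minimal sketch, assuming a Linux host that exposes the usual `/sys/kernel/mm/transparent_hugepage/enabled` and `/proc/sys/vm/swappiness` files; the function names are illustrative, not part of any tool mentioned above.

```python
from pathlib import Path

# Standard Linux locations (assumed to exist on the target host).
THP_PATH = Path("/sys/kernel/mm/transparent_hugepage/enabled")
SWAPPINESS_PATH = Path("/proc/sys/vm/swappiness")


def parse_thp_mode(raw: str) -> str:
    """Return the active THP mode; the kernel marks it with brackets."""
    for token in raw.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    raise ValueError(f"no active mode found in {raw!r}")


def audit_kernel(thp_raw: str, swappiness_raw: str) -> list[str]:
    """Compare raw sysfs/procfs values against the checklist targets."""
    problems = []
    if parse_thp_mode(thp_raw) != "never":
        problems.append("THP is not disabled")
    if int(swappiness_raw.strip()) != 1:
        problems.append("vm.swappiness is not 1")
    return problems
```

On a live host, call `audit_kernel(THP_PATH.read_text(), SWAPPINESS_PATH.read_text())`; an empty list means both settings match the checklist.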


2) Secrets & Configuration Management

Goal: Keep credentials safe, versionable, and environment-specific.

  • Store application config in .env with 0600 permissions

  • In production, load the env file via systemd EnvironmentFile=

  • Rotate secrets quarterly; maintain a secrets changelog

  • Optional: age/sops for encrypted config in Git

  • Separate staging and production env files and buckets
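
A deploy script can refuse to start when the env file is too permissive. This is a rough sketch using only the standard library; `env_file_mode_ok` and `parse_env` are hypothetical helper names, and the parser handles plain KEY=VALUE lines only (no quoting or variable expansion).

```python
import os
import stat


def env_file_mode_ok(path: str) -> bool:
    """True when the file is readable/writable by its owner only (0600)."""
    return stat.S_IMODE(os.stat(path).st_mode) == 0o600


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines; blank lines and # comments are skipped."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only, so values may contain "=".
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values
```

Running the permission check before the service starts keeps a mis-copied staging file from silently leaking into production.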


3) Edge Gateways: Nginx vs Caddy

When to choose Nginx

  • Familiarity, fine-grained performance knobs, advanced caching

When to choose Caddy

  • Fastest path to automatic HTTPS, simple reverse proxy, clean config

Rule of thumb: Use Caddy for 1-3 services and quick wins; use Nginx when you need granular control, complex caching, or legacy features.


4) TLS & App Security Headers

Use Let's Encrypt certificates, either via Certbot with Nginx or via Caddy's automatic HTTPS.

Minimum headers

  • Strict-Transport-Security: max-age=31536000; includeSubDomains; preload

  • Content-Security-Policy (start with default-src 'self' and expand)

  • X-Frame-Options: SAMEORIGIN

  • X-Content-Type-Options: nosniff

  • Referrer-Policy: no-referrer-when-downgrade

  • Permissions-Policy as needed
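
A small check like the one below can run in CI or a smoke test to confirm the minimum headers are actually served. It is a sketch under stated assumptions: the response headers are already available as a dict, and Permissions-Policy is left out of the required set because the list above marks it "as needed".

```python
# Headers from the minimum list above; Permissions-Policy is optional.
REQUIRED_HEADERS = (
    "Strict-Transport-Security",
    "Content-Security-Policy",
    "X-Frame-Options",
    "X-Content-Type-Options",
    "Referrer-Policy",
)


def missing_security_headers(headers: dict[str, str]) -> list[str]:
    """Return required header names absent from a response (case-insensitive)."""
    present = {name.lower() for name in headers}
    return [name for name in REQUIRED_HEADERS if name.lower() not in present]
```

The comparison is case-insensitive because HTTP header names are, and proxies often normalize them differently.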


5) Process Management Patterns

Pick one per service; don't double-wrap.

  • Node.js: PM2 or systemd (not both). PM2 for clustering/zero-downtime; systemd for OS-native control

  • Python: Gunicorn/Uvicorn behind Nginx/Caddy, managed by systemd

  • Rails: Puma + systemd; assets precompiled in CI/CD

  • .NET: Kestrel + Nginx/Caddy; systemd service

  • Go/Rust: single binary + systemd; graceful shutdown; health endpoints
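
Graceful shutdown and a health endpoint look much the same in any language. Below is a standard-library Python sketch of the pattern: systemd sends SIGTERM on stop, the handler asks the server to finish, and in-flight requests complete before exit. The `/healthz` path and port 3000 are assumptions for illustration.

```python
import json
import signal
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/healthz":  # hypothetical health path
            body = json.dumps({"status": "ok"}).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

    def log_message(self, fmt, *args):
        pass  # keep the example quiet; real services should log


def make_server(host="127.0.0.1", port=3000):
    server = ThreadingHTTPServer((host, port), HealthHandler)
    # Non-daemon handler threads let server_close() wait for in-flight requests.
    server.daemon_threads = False
    return server


def run(server):
    """Serve until SIGTERM arrives, then shut down cleanly."""
    def handle_sigterm(signum, frame):
        # shutdown() blocks until serve_forever() returns, so call it off-thread.
        threading.Thread(target=server.shutdown, daemon=True).start()

    signal.signal(signal.SIGTERM, handle_sigterm)
    try:
        server.serve_forever()
    finally:
        server.server_close()
```

Start it with `run(make_server())`; the proxy or an external pinger can then poll `/healthz`.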


6) Logging & Observability

Target: See issues before users do.

  • Structured JSON logs (app + proxy); centralize to Loki/ELK

  • Metrics: node_exporter, app exporters (e.g., MongoDB exporter), Prometheus + Grafana dashboards

  • Uptime: Prometheus alerts + external pingers

  • Logrotate policies; compress & age out
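
Structured JSON logs need nothing beyond the standard `logging` module. The sketch below emits one JSON object per line, which log shippers for Loki or ELK can typically ingest; the field names (`ts`, `level`, `logger`, `msg`) are an assumed schema, not a requirement of either stack.

```python
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each log record as a single-line JSON object."""

    def format(self, record):
        entry = {
            "ts": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            "level": record.levelname,
            "logger": record.name,
            "msg": record.getMessage(),
        }
        if record.exc_info:
            entry["exc"] = self.formatException(record.exc_info)
        return json.dumps(entry)


def get_logger(name="app"):
    """Return a logger that writes JSON lines to stderr."""
    logger = logging.getLogger(name)
    if not logger.handlers:  # avoid duplicate handlers on repeated calls
        handler = logging.StreamHandler()
        handler.setFormatter(JsonFormatter())
        logger.addHandler(handler)
        logger.setLevel(logging.INFO)
    return logger
```

Under systemd, stderr lands in the journal, so the same lines can be shipped from there without a separate log file.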


7) Backups & Disaster Recovery

  • Filesystem snapshots (provider) + logical dumps (DB tools)

  • 3-2-1 rule: 3 copies, 2 media, 1 offsite (S3/B2)

  • Restore tests monthly; document RTO/RPO targets

  • DB specifics (see templates): Postgres (pgBackRest), MySQL (Percona/XtraBackup or mysqldump), MongoDB (mongodump + snapshots)
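
For the MongoDB case, a nightly job might look roughly like the sketch below: build a `mongodump` command that writes a gzipped archive, then pick old archives to prune. The helper names, paths, and the retention window are hypothetical; note that passing a URI with credentials on the command line exposes it in the process list, so a config file may be preferable in practice.

```python
from datetime import datetime, timedelta


def mongodump_command(uri: str, archive_path: str) -> list[str]:
    """Build an argv list for a gzipped single-file mongodump archive."""
    return ["mongodump", f"--uri={uri}", "--gzip", f"--archive={archive_path}"]


def expired_backups(
    backups: dict[str, datetime], now: datetime, keep_days: int
) -> list[str]:
    """Return names of backups taken before the retention cutoff."""
    cutoff = now - timedelta(days=keep_days)
    return sorted(name for name, taken in backups.items() if taken < cutoff)
```

The argv list can be passed to `subprocess.run(..., check=True)`; pruning should only run after the new dump has been verified and copied offsite.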


8) Networking: Cloudflare, RealIP, WebSockets

  • Enable the Cloudflare proxy (orange cloud) on DNS A/AAAA records; configure CF-Connecting-IP handling

  • Preserve the real client IP in app and proxy logs (Nginx/Caddy snippets below)

  • WebSockets: ensure upgrade headers and keepalive tuning

  • Rate limits/WAF: start with sensible defaults; log violations
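
If the application itself reads CF-Connecting-IP, it should trust the header only when the connection really comes from Cloudflare; otherwise any client can spoof it. A minimal sketch with the standard `ipaddress` module follows. Only the two ranges shown in this article's Nginx snippet are listed, so the list is deliberately incomplete and must be filled from Cloudflare's published ranges.

```python
import ipaddress

# Incomplete on purpose: add the remaining published Cloudflare ranges.
TRUSTED_PROXIES = [
    ipaddress.ip_network(cidr)
    for cidr in ("173.245.48.0/20", "103.21.244.0/22")
]


def client_ip(peer_ip: str, headers: dict[str, str]) -> str:
    """Return the real client IP, trusting the header only from known proxies."""
    peer = ipaddress.ip_address(peer_ip)
    if any(peer in net for net in TRUSTED_PROXIES):
        return headers.get("CF-Connecting-IP", peer_ip)
    # Direct or unknown peer: ignore any client-supplied header.
    return peer_ip
```

This mirrors what `set_real_ip_from` does in Nginx: the header is honored only for listed source networks.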


9) Performance Tuning Cheatsheet

  • Keepalive: keepalive_timeout 15s; HTTP/2 enabled

  • Compression: Brotli (prefer) with sane min sizes; fallback gzip

  • Static: long Cache-Control + content hashing

  • DB: create indexes early; monitor slow logs; pool sizes appropriate

  • Redis: enable AOF persistence and set an explicit maxmemory policy; for Woo/Magento, keep the session cache on a separate instance

  • Queues: RabbitMQ/Redis with dead-letter policies
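
Long Cache-Control lifetimes are only safe when a file's name changes with its content. The sketch below shows one common way to do that, assuming a build step that renames assets; the 8-character digest length and the `immutable` directive are illustrative choices.

```python
import hashlib
from pathlib import PurePosixPath

# Safe for hashed assets only: the URL changes whenever the content does.
IMMUTABLE_CACHE_CONTROL = "public, max-age=31536000, immutable"


def hashed_name(filename: str, content: bytes, digest_len: int = 8) -> str:
    """Insert a short content hash before the file extension."""
    digest = hashlib.sha256(content).hexdigest()[:digest_len]
    path = PurePosixPath(filename)
    return str(path.with_name(f"{path.stem}.{digest}{path.suffix}"))
```

HTML entry points that reference these assets should keep a short or no-cache policy so that new hashes are picked up promptly.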


10) Compliance & Data Handling

  • Data residency: pin region; document data flows

  • Encrypt PII at rest (DB or app-level). Use KMS or sealed secrets

  • GDPR basics: retention schedules, right-to-erasure playbook


11) Runbooks: On-Call & Incident Basics

  • Who to page: roles & contact ladder

  • First 5 minutes: check health endpoints, error rates, DB connections, disk free, TLS expiry

  • Rollback: documented blue/green or tag-based deployment rollback

  • Postmortem: template with action items & owners
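
Two of the first-five-minutes checks (disk free and TLS expiry) can be scripted with the standard library alone. This is a sketch, not a monitoring replacement; the function names are hypothetical, and any alert thresholds are left to the caller.

```python
import shutil
import socket
import ssl


def disk_free_percent(path: str = "/") -> float:
    """Percentage of free space on the filesystem containing path."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total * 100


def fetch_not_after(host: str, port: int = 443) -> str:
    """Return the certificate's notAfter string as served by the host."""
    ctx = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=5) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


def days_until_expiry(not_after: str, now: float) -> float:
    """Days between now (epoch seconds) and the certificate expiry."""
    return (ssl.cert_time_to_seconds(not_after) - now) / 86400
```

For example, `days_until_expiry(fetch_not_after("example.com"), time.time())` (with `import time`) gives the remaining certificate lifetime for the host.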


12) Templates & Snippets

Nginx (SSR + WebSockets)

map $http_upgrade $connection_upgrade {
    default upgrade;
    ''      close;
}

server {
    listen 80;
    server_name example.com;

    location / {
        proxy_pass http://127.0.0.1:3000;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection $connection_upgrade;
        proxy_read_timeout 60s;
    }
}

Caddy (auto-TLS + reverse proxy)

example.com {
    encode zstd gzip
    reverse_proxy 127.0.0.1:3000
    header {
        Strict-Transport-Security "max-age=31536000; includeSubDomains; preload"
        X-Content-Type-Options "nosniff"
        Referrer-Policy "no-referrer-when-downgrade"
    }
}

Cloudflare RealIP (Nginx)

# Pull latest ranges from Cloudflare docs periodically
set_real_ip_from 173.245.48.0/20;
set_real_ip_from 103.21.244.0/22;
# ... (other ranges)
real_ip_header CF-Connecting-IP;

systemd service (template)

[Unit]
Description=App Service
After=network.target

[Service]
User=app
EnvironmentFile=/etc/app/app.env
WorkingDirectory=/var/www/app
ExecStart=/usr/bin/node server.js
Restart=always
RestartSec=3
LimitNOFILE=64000

[Install]
WantedBy=multi-user.target

.env (example)

PORT=3000
NODE_ENV=production
MONGODB_URI=mongodb://appuser:<password>@<host>:27017/appdb?authSource=appdb
REDIS_URL=redis://127.0.0.1:6379
SESSION_SECRET=replace_me

13) Stack DB Connect QuickRefs (MongoDB)

Node.js (Mongoose)

const mongoose = require('mongoose');
mongoose.connect(process.env.MONGODB_URI, {
 maxPoolSize: 10, serverSelectionTimeoutMS: 5000
});

FastAPI (Motor)

import os

import motor.motor_asyncio as mtr

# Connection pool is shared across requests; create the client once at startup
client = mtr.AsyncIOMotorClient(os.getenv("MONGODB_URI"), maxPoolSize=10)
db = client.appdb

Django (Mongo/AltORM)

import os

DATABASES = {
    'default': {
        'ENGINE': 'djongo',
        'NAME': 'appdb',
        'CLIENT': {'host': os.environ.get('MONGODB_URI')},
    }
}

Rails (Mongoid)

production:
  clients:
    default:
      uri: <%= ENV['MONGODB_URI'] %>

Security notes

  • Auth enabled, local bind or IP allowlist

  • UFW deny 27017 by default; allow only trusted IPs

  • Replica set for change streams/transactions when needed

  • Backups: daily mongodump + provider snapshots; test restores
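
Malformed connection strings are a frequent source of confusing auth errors, so a startup sanity check can help. The sketch below uses only `urllib.parse` and checks what this article's .env template includes (credentials and an authSource query parameter). `check_mongodb_uri` is a hypothetical helper, and a real deployment may legitimately omit authSource.

```python
from urllib.parse import parse_qs, urlsplit


def check_mongodb_uri(uri: str) -> list[str]:
    """Return a list of problems found in a MongoDB connection string."""
    parts = urlsplit(uri)
    problems = []
    if parts.scheme not in ("mongodb", "mongodb+srv"):
        problems.append("unexpected scheme")
    if not parts.username or not parts.password:
        problems.append("missing credentials")
    # A missing "?" makes authSource part of the database name instead.
    if "authSource" not in parse_qs(parts.query):
        problems.append("missing authSource")
    return problems
```

Failing fast on a non-empty result is usually cheaper than debugging a driver timeout later.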


Conclusion / Next Steps

  • Apply the baseline & security hardening

  • Choose Nginx or Caddy per your complexity needs

  • Wire observability + backups from day one

  • Use the snippets to accelerate SSR, WebSockets, and RealIP correctness

  • Add the MongoDB pieces into each stack chapter


Related articles

  • Choosing Nginx vs Caddy for App Gateways (trade-offs & configs)

  • Zero-downtime Deployments (blue/green, canaries, rollbacks)

  • Prometheus, Grafana & Loki on a Single VPS (quick start)

  • Redis vs RabbitMQ for Queues (when to pick which)
