

By DomainIndia Engineering · 24 Apr 2026 · 6 min read
# Redis Data Structures Beyond Caching — Leaderboards, Rate Limits, Pub/Sub, Streams
**TL;DR:** Redis is more than a cache. Its data structures — sorted sets, hashes, streams, bitmaps, HyperLogLog — enable use cases that would take gigabytes of disk and hours of work in PostgreSQL. This guide covers 10 non-cache patterns you can deploy on your DomainIndia VPS today.
## Why Redis isn't just a cache

Most people learn Redis as a "key-value cache". But Redis ships with 10+ data structures, each optimised for specific access patterns. A well-designed app often has Redis doing 4-5 jobs beyond caching.

## Pattern 1 — Leaderboards (sorted sets)

"Top 10 scorers this week" is a classic SQL query that gets expensive at scale. A Redis sorted set does it in O(log N) per update and O(log N + M) per range read.

```python
import redis

r = redis.Redis()

# User 42 scores 500
r.zincrby('leaderboard:weekly', 500, 'user:42')

# Top 10
top = r.zrevrange('leaderboard:weekly', 0, 9, withscores=True)
# [(b'user:42', 500.0), (b'user:17', 450.0), ...]

# User's rank
rank = r.zrevrank('leaderboard:weekly', 'user:42')  # 0-indexed

# Players within ±100 points of the user's score
r.zrevrangebyscore('leaderboard:weekly',
                   max=user_score + 100, min=user_score - 100)
```

Store 1M users in ~100 MB. Reads take under 1 ms. Aggregate leaderboards via `ZUNIONSTORE`.

## Pattern 2 — Rate limiting (atomic counters)

Fixed window:

```python
import time

key = f'ratelimit:{user_id}:{int(time.time() // 60)}'
count = r.incr(key)
if count == 1:
    r.expire(key, 65)  # TTL slightly > window; set once, on first hit
if count > 60:
    raise TooManyRequests()
```

Sliding window using a sorted set:

```python
import time
import uuid

key = f'ratelimit:sliding:{user_id}'
now = time.time()

# Remove requests older than the window
r.zremrangebyscore(key, 0, now - 60)

# Count the current window
count = r.zcard(key)
if count >= 60:
    raise TooManyRequests()

# Record this request
r.zadd(key, {str(uuid.uuid4()): now})
r.expire(key, 65)
```

More accurate than the fixed window, at a slight performance cost.

## Pattern 3 — Distributed locks

Multiple workers, one task.
Use `SET` with the `NX` flag (the modern, atomic form of `SETNX` + TTL):

```python
# Try to acquire the lock
lock = r.set('lock:job:42', worker_id, nx=True, ex=30)
if lock:
    try:
        process_job(42)
    finally:
        # Atomic check-and-delete (Lua): only the owner may release
        lua = """
        if redis.call('get', KEYS[1]) == ARGV[1] then
            return redis.call('del', KEYS[1])
        end"""
        r.eval(lua, 1, 'lock:job:42', worker_id)
else:
    # Another worker has it
    pass
```

For production, redis-py's built-in `r.lock()` or a Redlock implementation handles the edge cases (clock drift, lock extension, etc.).

## Pattern 4 — Session storage

Replaces file-based or DB sessions. Each session = one hash.

```python
import time

# On login
r.hset(f'session:{sid}', mapping={
    'user_id': '42',
    'role': 'admin',
    'created': str(int(time.time())),
})
r.expire(f'session:{sid}', 3600)

# On request
data = r.hgetall(f'session:{sid}')

# On logout
r.delete(f'session:{sid}')

# Invalidating all of a user's sessions (e.g. on password change)
# requires tracking: maintain a set of active session IDs per user.
```

Laravel, Rails, Django, Express — all ship Redis session drivers.

## Pattern 5 — Pub/Sub (fire-and-forget events)

Simple cross-process messaging. No persistence.

```python
import json

# Publisher
r.publish('events:order', json.dumps({'order_id': 42, 'status': 'paid'}))

# Subscriber (in another process)
pubsub = r.pubsub()
pubsub.subscribe('events:order')
for message in pubsub.listen():
    if message['type'] == 'message':
        handle(json.loads(message['data']))
```

**Warning:** pub/sub has no persistence. If a subscriber is offline, its messages are lost. For reliable delivery use Streams (next pattern).

## Pattern 6 — Streams (persistent event log)

Like Kafka, but lightweight.
```python
# Produce
r.xadd('events', {'type': 'order.paid', 'order_id': 42})

# Consume — with consumer groups for work distribution
r.xgroup_create('events', 'workers', id='0', mkstream=True)

while True:
    # XREADGROUP blocks until new messages arrive
    msgs = r.xreadgroup('workers', 'worker-1', {'events': '>'},
                        count=10, block=5000)
    for stream, entries in msgs:
        for msg_id, fields in entries:
            process(fields)
            r.xack('events', 'workers', msg_id)
```

Replaces Kafka for small-to-medium workloads, and runs on the same Redis instance.

## Pattern 7 — Presence / online users

Who's online right now?

```python
import time

# On heartbeat (every 30 s)
r.zadd('online', {user_id: time.time()})

# Trim stale entries
r.zremrangebyscore('online', 0, time.time() - 60)

# Who's online now
online_count = r.zcard('online')
online_ids = r.zrangebyscore('online', time.time() - 60, '+inf')
```

1M users: ~40 MB RAM, and updates are fast.

## Pattern 8 — Deduplication via HyperLogLog

"How many unique visitors today?" — at scale, an exact count means storing every unique ID (gigabytes). HyperLogLog estimates with ~0.8% error in 12 KB.

```python
r.pfadd('visitors:2026-04-24', f'ip:{request.ip}')
count = r.pfcount('visitors:2026-04-24')  # estimate

# Union across days (for a weekly count)
r.pfmerge('visitors:week', 'visitors:2026-04-20', ..., 'visitors:2026-04-24')
r.pfcount('visitors:week')
```

This is how "unique views" counters work at Twitter/Reddit scale.

## Pattern 9 — Bitmaps for feature flags (1 bit per user)

Is user 42 in the beta? Store 1 bit per user ID:

```python
r.setbit('feature:new_checkout', 42, 1)
r.getbit('feature:new_checkout', 42)  # → 1

# How many users enabled?
r.bitcount('feature:new_checkout')

# A/B test: users with even IDs
for uid in range(0, 1000000, 2):
    r.setbit('ab:group_a', uid, 1)
```

1M users = 125 KB (1 bit × 1M). Checks are O(1).
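The 125 KB figure follows directly from the bit layout: a bitmap's size is set by the highest offset ever written, one bit per ID. A quick sanity check in plain Python (no Redis needed; `bitmap_bytes` is an illustrative helper, not a Redis command):

```python
import math

def bitmap_bytes(max_user_id: int) -> int:
    """Bytes a bitmap occupies when its highest set offset is max_user_id."""
    return math.ceil((max_user_id + 1) / 8)

# 1M user IDs, one bit each
print(bitmap_bytes(999_999))  # 125000 bytes, i.e. the "125 KB" above
```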
## Pattern 10 — Geospatial queries

"Restaurants within 5 km of the customer":

```python
r.geoadd('restaurants', (lng, lat, 'pizza-palace'))
r.geoadd('restaurants', (lng2, lat2, 'burger-joint'))

# Nearby
r.geosearch('restaurants', longitude=77.5946, latitude=12.9716,
            radius=5, unit='km', sort='ASC',
            withcoord=True, withdist=True)
```

A lightweight alternative to PostGIS for "find nearby" needs.

## Performance tips

- **Pipelining** — batch commands into one round trip:

  ```python
  pipe = r.pipeline()
  for u in users:
      pipe.zincrby('leaderboard', u.score, u.id)
  pipe.execute()  # single round trip
  ```

- **Lua scripts** — atomic multi-step operations:

  ```lua
  -- Atomic "check balance, deduct, log":
  local balance = redis.call('HGET', KEYS[1], 'balance')
  if tonumber(balance) >= tonumber(ARGV[1]) then
      redis.call('HINCRBY', KEYS[1], 'balance', -ARGV[1])
      return 'OK'
  end
  return 'INSUFFICIENT'
  ```

- **Connection pooling** — don't open/close a connection per request
- **Monitor the slowlog** — `SLOWLOG GET 20` shows the slowest recent queries

## Persistence strategies

Redis can run pure in-memory (data lost on restart) or persistent:

- **RDB snapshots** — periodic DB dumps. Fast restart; some data loss possible.
- **AOF (Append-Only File)** — every write logged. Minimal loss; larger disk footprint and slower.
- **Both** — recommended for production.

Configure in `redis.conf`:

```
save 900 1     # RDB every 15 min if ≥1 change
save 300 10    # every 5 min if ≥10 changes
appendonly yes
appendfsync everysec
```

## On DomainIndia

- **Shared cPanel:** limited Redis access; contact support
- **VPS:** `sudo dnf install redis` — 1 command
- **App Platform:** add a Redis service via the dashboard

## FAQ
**Q: Redis vs Memcached?**

Redis — richer data types, persistence, pub/sub, streams. Memcached — pure key-value, slightly faster for just caching. For modern apps: Redis.

**Q: How much RAM for Redis?**

Rule of thumb: total stored data × 1.3 (overhead). 1M session hashes × 1 KB ≈ 1.3 GB.
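That rule of thumb is easy to script; a tiny illustrative helper (the function name and the fixed 1.3 factor are our assumptions from the rule above, not a Redis API):

```python
def estimate_redis_ram_bytes(items: int, avg_item_bytes: int,
                             overhead: float = 1.3) -> int:
    """Rough RAM estimate: payload size times a ~1.3x structural-overhead factor."""
    return round(items * avg_item_bytes * overhead)

# 1M session hashes at ~1 KB each
est = estimate_redis_ram_bytes(1_000_000, 1000)
print(est / 1e9)  # 1.3 (GB), matching the figure above
```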

**Q: When do you need Redis Cluster?**

At >50 GB of data or >100K ops/sec on a single instance. For most DomainIndia customers, one Redis on a VPS is plenty.

**Q: Redis vs Valkey?**

Valkey is a Redis fork under the Linux Foundation — an API-compatible drop-in replacement. AlmaLinux 10+ ships Valkey. Pick either.

**Q: How do you back up Redis?**

Copy the RDB file (`/var/lib/redis/dump.rdb`) elsewhere. Restore by replacing the file and restarting Redis. See Automated Backups.
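A minimal cron sketch for the copy step, assuming the default dump path from above and a hypothetical `/var/backups/redis/` directory (adjust the schedule and paths to taste):

```
# /etc/cron.d/redis-backup: nightly copy of the RDB snapshot at 03:00
# (sketch; assumes Redis writes /var/lib/redis/dump.rdb, per the answer above)
0 3 * * * root cp /var/lib/redis/dump.rdb "/var/backups/redis/dump-$(date +\%F).rdb"
```

Note the escaped `\%` — `%` is a special character in crontab entries.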

Run Redis for everything on a DomainIndia VPS — start with a VPS today.
