← Tutorials ☕ Java 🧩 DSA 🌿 Spring Boot 🚀 DevOps 🗂 MySQL 🏗 System Design ❓ FAQ 🌐 HTML 🔇 JavaScript ⚛ ReactJS 🌻 NodeJS 🐍 Python 🍏 MongoDB 📋 Cheatsheet
Tutorials › System Design

What is System Design?

🏗️ System Design · Foundations · 5 min read

System design is the process of defining the architecture, components, and data flow of a large-scale software system to meet functional and non-functional requirements.

Why It Matters

  • Core topic in senior/staff engineer interviews (L5+)
  • Teaches you to think about trade-offs, not just code
  • Required to build systems that handle millions of users

Interview Framework

  1. Clarify requirements — functional (what it does) and non-functional (scale, latency)
  2. Estimate scale — QPS, storage, bandwidth
  3. High-level design — components and their connections
  4. Detailed design — deep dive into 2-3 key components
  5. Identify bottlenecks and how to address them

Scalability

🏗️ System Design · Foundations · 6 min read

Vertical Scaling (Scale Up)

Add more CPU/RAM to a single server. Simple but has an upper limit and single point of failure.

Horizontal Scaling (Scale Out)

Add more servers. Needs a load balancer. No theoretical limit. Standard approach for web apps.

Stateless Services

Keep no session state on the server — store it in Redis or a DB. Required for horizontal scaling.

Database Sharding

Split data across multiple DB servers by a shard key (e.g. user_id % N). Enables massive scale.

CAP Theorem

🏗️ System Design · Foundations · 5 min read

In a distributed system, you can only guarantee two out of three:

Consistency (C)

Every read gets the most recent write. All nodes see the same data at the same time.

Availability (A)

Every request gets a response (not guaranteed to be the latest). The system stays up.

Partition Tolerance (P)

The system continues operating even if some network messages are lost between nodes.

In practice, network partitions happen — so you must choose between CP (consistent but may be unavailable during a partition) or AP (available but may serve stale data).

  • CP — ZooKeeper, HBase, traditional SQL
  • AP — Cassandra, DynamoDB, CouchDB

Load Balancers

🏗️ System Design · Core Components · 5 min read

A load balancer distributes incoming traffic across multiple servers to prevent any one server from being overwhelmed.

Algorithms

  • Round Robin — requests distributed in rotation
  • Least Connections — route to the server with fewest active connections
  • IP Hash — same client always goes to same server (sticky sessions)
  • Weighted Round Robin — more powerful servers get more traffic

Layer 4 vs Layer 7

  • L4 (Transport) — routes by IP/TCP without looking at content; faster
  • L7 (Application) — can route by URL, headers, cookies; enables path-based routing
💡 AWS ALB (Application Load Balancer) is L7. AWS NLB (Network Load Balancer) is L4. Use ALB for most web apps.

Caching

🏗️ System Design · Core Components · 6 min read

Caching stores frequently accessed data in fast memory (RAM) to reduce DB load and response latency.

Cache Strategies

  • Cache-Aside (Lazy Loading) — app checks cache first; on miss, loads from DB and writes to cache. Most common.
  • Write-Through — write to cache and DB simultaneously. Consistent but slower writes.
  • Write-Back — write to cache only; flush to DB asynchronously. Fast writes, risk of data loss.

Eviction Policies

  • LRU — Least Recently Used (most common)
  • LFU — Least Frequently Used
  • TTL — expire after a time period

Tools

  • Redis — in-memory data store; supports strings, hashes, lists, sets. Use for session, leaderboards, pub/sub.
  • Memcached — simple key-value cache; faster for pure caching use cases.

Databases — SQL vs NoSQL

🏗️ System Design · Core Components · 7 min read

SQL (Relational)

Structured schema, ACID transactions, complex joins. MySQL, PostgreSQL. Good for financial, inventory, user data.

NoSQL (Document)

Flexible schema, horizontal scaling. MongoDB, DynamoDB. Good for catalogues, user profiles, content.

NoSQL (Wide-Column)

Cassandra, HBase. Excellent write throughput. Good for time-series, IoT, activity logs.

NoSQL (Graph)

Neo4j. Models relationships natively. Good for social graphs, recommendation engines.

Database Replication

  • Master-Slave — writes go to master, reads from replicas. Most common.
  • Multi-Master — writes to any node; complex conflict resolution.

Message Queues

🏗️ System Design · Core Components · 5 min read

Message queues decouple producers from consumers, enabling async processing and absorbing traffic spikes.

Use Cases

  • Sending emails/notifications asynchronously after a user action
  • Processing uploads (video encoding, image resizing) in the background
  • Distributing work across multiple worker instances
  • Event-driven microservices communication

Tools

  • Kafka — high-throughput event streaming; replay messages; used at Netflix, LinkedIn
  • RabbitMQ — traditional message broker; flexible routing; easier to set up than Kafka
  • AWS SQS — fully managed queue; great for AWS-native apps

CDN (Content Delivery Network)

🏗️ System Design · Core Components · 4 min read

A CDN caches static assets (images, CSS, JS, videos) on servers geographically close to users, reducing latency and origin server load.

How It Works

  1. User requests cdn.example.com/logo.png
  2. CDN routes to the nearest edge server (PoP)
  3. If cached → serve immediately (cache hit)
  4. If not → fetch from origin, cache, then serve (cache miss)

Tools

  • CloudFront — AWS CDN, integrates with S3 and EC2
  • Cloudflare — CDN + DDoS protection + DNS
  • Fastly — programmable CDN, used by GitHub, Stripe

Microservices

🏗️ System Design · Patterns · 6 min read

Microservices break a monolith into small, independently deployable services, each owning its domain and data.

Monolith vs Microservices

Monolith

Simpler to build, test, and deploy initially. Single deployment unit. Scales poorly at high load.

Microservices

Independent scale, deploy, and tech stack per service. Complex ops (service discovery, distributed tracing).

When to Use Microservices

  • Different parts of the system have very different scaling needs
  • Multiple teams working independently
  • Different reliability requirements per domain
💡 Start with a monolith. Extract microservices only when you have clear pain points — premature decomposition is a common mistake.

API Gateway

🏗️ System Design · Patterns · 5 min read

An API Gateway is the single entry point for all client requests in a microservices architecture. It handles cross-cutting concerns so services don't have to.

Responsibilities

  • Request routing to the correct microservice
  • Authentication & authorisation (validate JWT before forwarding)
  • Rate limiting & throttling
  • SSL termination
  • Response aggregation (combine data from multiple services)
  • Logging & monitoring

Tools

  • AWS API Gateway
  • Kong
  • Nginx (simple reverse proxy)
  • Spring Cloud Gateway

Rate Limiting

🏗️ System Design · Patterns · 5 min read

Rate limiting controls how many requests a client can make in a time window, protecting APIs from abuse and DDoS attacks.

Algorithms

  • Token Bucket — tokens refill at a fixed rate; burst allowed up to bucket capacity. Most common.
  • Leaky Bucket — requests processed at a constant rate regardless of burst; smooths traffic.
  • Fixed Window Counter — count requests per time window; simple but vulnerable to edge spikes.
  • Sliding Window Log — accurate but memory-intensive.

Implementation

  • Store counters in Redis (fast, shared across server instances)
  • Return 429 Too Many Requests with a Retry-After header
  • Use Nginx limit_req module for simple cases

Case Study: URL Shortener

🏗️ System Design · Case Studies · 8 min read

Design a system like bit.ly that converts long URLs to short codes and redirects users.

Requirements

  • Shorten a URL → return a short code (e.g. 7ds.in/aB3kR)
  • Redirect short URL to original URL in < 10ms
  • 100M new URLs/day; 10B redirects/day (100:1 read/write ratio)

Design Decisions

  • Short code generation — Base62 encode a unique ID (integers → a-zA-Z0-9, 6-7 chars)
  • Storage — MySQL: (id, short_code, long_url, created_at); index on short_code
  • Cache — Redis for hot URLs; 80% of reads likely hit 20% of URLs (Pareto principle)
  • Redirect301 Permanent for SEO (browser caches); 302 Temporary if analytics needed
  • Scaling — stateless app servers behind a load balancer; DB read replicas for redirect queries
GET /aB3kR
→ Check Redis cache → hit → 302 redirect to long URL
→ miss → query MySQL → write to cache → redirect

Case Study: News Feed

🏗️ System Design · Case Studies · 8 min read

Design the Facebook/Twitter home feed — show a user a ranked list of posts from people they follow.

Two Approaches

Fan-Out on Write (Push)

When user A posts, precompute and push the post to all followers' feed caches.

  • Reads are fast — feed is pre-built in Redis
  • Writes are expensive for celebrities (millions of followers)

Fan-Out on Read (Pull)

When user B loads their feed, query posts from everyone they follow and merge.

  • Writes are cheap
  • Reads are slow at scale

Hybrid Approach (used by Instagram/Twitter)

  • Push for normal users (small follower counts)
  • Pull for celebrities — fetch on read, merge with pre-built feed

Key Components

  • Post storage — Cassandra (time-series writes)
  • Feed cache — Redis sorted set (score = timestamp)
  • Media — S3 + CDN
  • Feed generation — async Kafka consumer per post publish event