Common System Design Questions

Some common system design questions

Delivery Framework

Requirements
- functional requirements
- non-functional requirements (the quality of service)
- may start from CAP
Core entities: the entities the database will persist or the API will exchange. They can also be thought of as concepts, e.g., a user, an event, a ticket.
API or interface design, no more than 5 minutes
Data flow
High-level design, to meet the functional requirements
Deep dives, to fulfill the non-functional requirements, e.g., strong consistency

How are system design interviews evaluated?

problem solving: identify & prioritize the core challenges
solution design: create scalable architectures with balanced trade-offs
technical excellence: demonstrate deep knowledge and expertise
communication: clearly explain complex concepts to stakeholders

Why do we need capacity estimation

determine number of servers and databases
cost management
decide the type and specifications of all hardware (servers, databases, etc.)
help us determine whether the system is read heavy or write heavy, based on read/write throughput analysis
- read heavy: choose PostgreSQL with indexing
- write heavy: choose Cassandra, or possibly CouchDB
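
The read/write classification above can be sketched as back-of-the-envelope arithmetic. This is a minimal sketch, not a sizing tool: the function name `estimate_throughput`, the `peak_factor` of 2, the 10:1 thresholds, and the daily volumes in the usage line are all assumptions chosen for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def estimate_throughput(daily_reads, daily_writes, peak_factor=2.0):
    """Return average and peak requests per second plus a rough workload profile.

    The 10:1 and 1:10 thresholds are illustrative assumptions, not a standard.
    """
    read_qps = daily_reads / SECONDS_PER_DAY
    write_qps = daily_writes / SECONDS_PER_DAY
    ratio = read_qps / write_qps if write_qps else float("inf")
    if ratio >= 10:
        profile = "read-heavy"
    elif ratio <= 0.1:
        profile = "write-heavy"
    else:
        profile = "balanced"
    return {
        "read_qps": read_qps,
        "write_qps": write_qps,
        "peak_read_qps": read_qps * peak_factor,
        "peak_write_qps": write_qps * peak_factor,
        "read_write_ratio": ratio,
        "profile": profile,
    }


# Illustrative volumes: 100M reads/day against 1M writes/day.
estimate = estimate_throughput(daily_reads=100_000_000, daily_writes=1_000_000)
print(estimate["profile"], round(estimate["read_qps"]), round(estimate["write_qps"]))
# prints: read-heavy 1157 12
```

Dividing daily volume by 86,400 gives the average rate only; real traffic is bursty, which is why the sketch also reports a peak figure.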

Common System Design Concepts

Network and Infrastructure Constraints

Limited bandwidth or high packet loss forces TCP retransmissions, inflating tail latency, especially for large payloads.
Serving geo-distributed users from a single origin adds 100-200 ms of RTT per continent hop. Push static files to a CDN edge; Cloudflare or AWS CloudFront usually cuts download time by ≥50%.
Blind dependence on third-party APIs (payments, auth, maps) without fallbacks can block every request during an outage. Wrap such calls in circuit-breaker logic plus timeouts shorter than half your SLO.
Inefficient replication schemes: pulling full data dumps hourly balloons egress costs and stalls warm replicas. Switch to delta or change-data-capture (CDC) streams to ship only modified rows/objects.
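
The circuit-breaker advice above could look roughly like the following. It is a minimal single-threaded sketch under stated assumptions: `CircuitBreaker`, `CircuitOpenError`, `call_with_fallback`, and the threshold and cool-down values are hypothetical names and numbers, and the wrapped function is expected to enforce its own timeout (for example through its HTTP client's timeout setting).

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker is open and the call is skipped."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_after_s=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after_s = reset_after_s
        self.clock = clock          # injectable so the cool-down can be tested
        self.failures = 0
        self.opened_at = None       # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after_s:
                raise CircuitOpenError("circuit open; failing fast")
            # Cool-down elapsed: allow one trial call through (half-open).
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = self.clock()  # open, or re-open after a failed trial
            raise
        # Success closes the circuit and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result


def call_with_fallback(breaker, func, fallback):
    """Run func through the breaker; on any failure return the fallback value."""
    try:
        return breaker.call(func)
    except Exception:
        return fallback
```

Once the breaker is open, callers get the fallback immediately instead of waiting on a dependency that is already known to be failing.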

Centralised “Choke-Point” Resources

A single load balancer, cache cluster, or primary DB becomes a reliability and performance bottleneck. Deploy them as N + 1 redundant groups behind anycast or DNS load-balancing, and shard state where possible.
Hot shards or sticky sessions cause one node to hit 90% CPU while siblings idle. Use consistent hashing or randomised load-balancing to level traffic.
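
Consistent hashing, mentioned above, can be sketched as a hash ring with virtual nodes. This is an illustrative sketch only: the class name `ConsistentHashRing`, the choice of SHA-256, and the default of 100 virtual nodes per server are assumptions, not taken from the text.

```python
import bisect
import hashlib


class ConsistentHashRing:
    def __init__(self, nodes=(), vnodes=100):
        self.vnodes = vnodes
        self._ring = []  # sorted list of (hash, node) points on the ring
        for node in nodes:
            self.add_node(node)

    @staticmethod
    def _hash(key):
        # First 8 bytes of SHA-256 as an integer position on the ring.
        return int.from_bytes(hashlib.sha256(key.encode("utf-8")).digest()[:8], "big")

    def add_node(self, node):
        # Each node owns many points so load spreads more evenly.
        for i in range(self.vnodes):
            bisect.insort(self._ring, (self._hash(f"{node}#{i}"), node))

    def remove_node(self, node):
        self._ring = [(h, n) for h, n in self._ring if n != node]

    def get_node(self, key):
        if not self._ring:
            raise LookupError("ring is empty")
        # First point clockwise from the key's hash, wrapping around at the end.
        idx = bisect.bisect_left(self._ring, (self._hash(key),))
        return self._ring[idx % len(self._ring)][1]


ring = ConsistentHashRing(["cache-a", "cache-b", "cache-c"])
owner = ring.get_node("user-42")  # one of the three node names
```

The property that matters for hot-shard relief is that adding or removing a node only remaps the keys that node owned; the rest keep their placement.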

CAP Theorem Trade-Offs

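We can’t have strong consistency, full availability, and partition tolerance simultaneously. Because network partitions cannot be ruled out in a distributed system, the practical choice during a partition is between consistency and availability.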

Database-Related Bottlenecks

Slow queries and inefficient indexing
Read-heavy loads without replicas
Write-heavy traffic, especially in systems not designed for asynchronous processing

Code and Application Design Flaws

Inefficient algorithms or poor code quality, e.g., unnecessary nested loops that burn CPU
Monolithic architectures make it hard to scale parts independently
Inadequate caching

Resource & Auto-Scaling Issues

Resource contention: CPU, memory, or connection pools
Inefficient scaling strategies: no elastic auto-scaling
Third-party dependencies: Slow or rate-limited external services, like payment gateways

Proactive Optimisation Playbook

Profile Resource Usage: Identify hot paths in code and DB queries.
Stress Testing: Simulate max load with tools like JMeter to identify weak spots.
Bulkheading: Isolate dependent services in separate resource pools, with timeouts and fail-fast strategies, so one slow dependency cannot exhaust shared capacity.
Cache Layering: Tier 1 (in-app LRU) → Tier 2 (Redis) → Tier 3 (DB/Backup).
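
The cache layering above can be sketched as a read-through lookup that falls back tier by tier. This is a minimal sketch under assumptions: `LRUCache` and `TieredReader` are hypothetical names, a plain dict stands in for Redis, `load_from_db` is a caller-supplied function, and a value of `None` is treated as a miss.

```python
from collections import OrderedDict


class LRUCache:
    """Tier 1: small in-process cache that evicts the least recently used key."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._data = OrderedDict()

    def get(self, key):
        if key not in self._data:
            return None
        self._data.move_to_end(key)  # mark as most recently used
        return self._data[key]

    def put(self, key, value):
        self._data[key] = value
        self._data.move_to_end(key)
        if len(self._data) > self.capacity:
            self._data.popitem(last=False)  # drop the oldest entry


class TieredReader:
    def __init__(self, local, shared, load_from_db):
        self.local = local                # tier 1: in-app LRU
        self.shared = shared              # tier 2: dict standing in for Redis
        self.load_from_db = load_from_db  # tier 3: database lookup

    def get(self, key):
        value = self.local.get(key)
        if value is not None:
            return value
        value = self.shared.get(key)
        if value is None:
            value = self.load_from_db(key)
            if value is None:
                return None               # not found anywhere; cache nothing
            self.shared[key] = value      # backfill tier 2
        self.local.put(key, value)        # backfill tier 1
        return value
```

Each miss backfills the faster tiers on the way out, so repeated reads of a hot key stop reaching the database. Expiry and invalidation are left out of the sketch.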

Case study: design YouTube

Requirements

Functional Requirements
- users should be able to upload videos
- users should be able to watch/stream videos

Scale: 1M uploads/day, 1000M DAU; maximum video size is 256 GB

Non-functional requirements
- availability >> consistency for video upload
- support uploading and streaming for large videos (256 GB)
- low-latency streaming (< 500 ms)
- scalability: handle 1M uploads/day and 100M views

Core entities

video
video metadata
user

API design

// upload a video
POST /api/v1/videos { videoStoreId, videoMetadata }

// watch a video
GET /api/v1/videos/{videoId} -> video & videoMetadata

HLD

// upload a video
client -> S3 to get videoStoreId
client -> api gateway -> video services -> videoMetadataDB (status: pending)
S3 notification, update videoMetadata status -> uploaded

// to stream a video
client -> api gateway -> video services -> videoMetadataDB: fetch by video id, find s3Url -> S3
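
The two flows above can be sketched with in-memory stand-ins to make the status transition concrete. Everything here is hypothetical and for illustration only: the `VideoService` class, its method names, the `BUCKET_URL` value, and the choice to hide videos that are still pending are assumptions, and a dict replaces the videoMetadataDB.

```python
import uuid

BUCKET_URL = "s3://example-video-bucket"  # hypothetical bucket name


class VideoService:
    def __init__(self):
        self.metadata_db = {}  # videoId -> record; stand-in for videoMetadataDB

    def create_video(self, video_store_id, video_metadata):
        """POST /api/v1/videos: store metadata with status 'pending'."""
        video_id = str(uuid.uuid4())
        self.metadata_db[video_id] = {
            "videoStoreId": video_store_id,
            "videoMetadata": video_metadata,
            "status": "pending",
        }
        return video_id

    def on_storage_notification(self, video_store_id):
        """Object-store callback: mark matching pending records as 'uploaded'."""
        updated = 0
        for record in self.metadata_db.values():
            if record["videoStoreId"] == video_store_id and record["status"] == "pending":
                record["status"] = "uploaded"
                updated += 1
        return updated

    def get_video(self, video_id):
        """GET /api/v1/videos/{videoId}: return the storage URL and metadata."""
        record = self.metadata_db.get(video_id)
        if record is None or record["status"] != "uploaded":
            return None  # assumption: pending videos are not yet streamable
        return {
            "s3Url": f"{BUCKET_URL}/{record['videoStoreId']}",
            "videoMetadata": record["videoMetadata"],
        }
```

The service only ever handles metadata and URLs; the video bytes travel directly between the client and object storage in both directions, which keeps large uploads off the API tier.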

Deep dive

Moss GU
