System design | medium | faang | design-systems | scale
Design a URL shortener like Bitly
Framework
- Clarify: scale (writes/reads/day), latency target, custom aliases?, analytics?
- Capacity: estimate new URLs/day, storage growth, read/write QPS
- API: POST /shorten, GET /:code → 302 redirect
- Schema: id (base62), longUrl, userId, createdAt, expiresAt, hitCount
- ID generation: counter + base62 encode (sequential) or hash-then-collision-check
- Caching: hot URLs in Redis with LRU eviction
- Read path: redirect needs to be <50ms — cache → DB → 404
- Analytics: publish click events to a queue asynchronously, batch-aggregate in a warehouse
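The capacity and ID-generation bullets can be made concrete with a little arithmetic. The following is a minimal sketch, not a definitive sizing: the 100M new URLs/day and ~500 bytes per row are assumed figures chosen for illustration, the function names are hypothetical, and the alphabet order is one arbitrary choice. Only the 100:1 read/write ratio and the 7-character code length come from the sample answer.

```python
# Back-of-envelope sizing plus counter -> base62 encoding.
# ASSUMPTIONS (not from the source): 100M new URLs/day, ~500 bytes per row.
# The 100:1 read/write ratio is the "typical" figure quoted in the sample answer.

ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
BASE = len(ALPHABET)  # 62
SECONDS_PER_DAY = 86_400


def estimate(writes_per_day: int, read_write_ratio: int, bytes_per_row: int) -> dict:
    """Return rough average QPS and yearly storage growth (peaks will be higher)."""
    write_qps = writes_per_day / SECONDS_PER_DAY
    return {
        "write_qps": write_qps,
        "read_qps": write_qps * read_write_ratio,
        "storage_gb_per_year": writes_per_day * 365 * bytes_per_row / 1e9,
    }


def encode_base62(n: int) -> str:
    """Encode a non-negative counter value as a base62 short code."""
    if n < 0:
        raise ValueError("counter values must be non-negative")
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n:
        n, rem = divmod(n, BASE)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))  # most significant digit first


def decode_base62(code: str) -> int:
    """Inverse of encode_base62; raises ValueError on characters outside ALPHABET."""
    n = 0
    for ch in code:
        n = n * BASE + ALPHABET.index(ch)
    return n


if __name__ == "__main__":
    print(estimate(100_000_000, 100, 500))
    print(BASE ** 7)  # number of distinct codes of length 7
    print(encode_base62(125), decode_base62(encode_base62(125)))
```

Because each counter value maps to exactly one code, this scheme needs no collision check; the trade-off is that sequential codes are guessable, which is worth raising if the interviewer asks about enumeration.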
Sample answer
Lead with: "I'd clarify the read/write ratio first, because that drives everything." Reads >> writes (~100:1 is typical), so we optimize for redirect latency:
1. ID generation: distributed counter (e.g. Redis INCR or Snowflake) → base62-encode → 7 chars handles ~3.5T URLs (62^7).
2. Storage: simple Postgres table indexed on the short code.
3. Hot cache: Redis LRU. 95%+ of requests hit cache for popular URLs.
4. Redirect path: cache hit → 302 in <10ms. Cache miss → DB lookup, populate cache, 302 in <50ms.
5. Analytics: asynchronous. Write a click event to Kafka, aggregate offline.
Failure modes to discuss:
- ID collisions (use a distributed counter, not a hash)
- Cache stampede on viral URLs (request coalescing or stale-while-revalidate)
- Abuse (rate-limit per user/IP, scan for malicious target domains)
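The redirect path and the cache-stampede mitigation can be sketched in a few lines. This is an illustrative, single-process approximation under stated assumptions: `LRUCache` is a hypothetical in-memory stand-in for Redis with LRU eviction, `db_lookup` is a hypothetical callable standing in for the Postgres query, and request coalescing is done with one lock per short code so that concurrent misses for the same code trigger a single database lookup.

```python
import threading
from collections import OrderedDict
from typing import Callable, Dict, Optional, Tuple


class LRUCache:
    """Tiny in-process LRU; a stand-in for Redis with an LRU eviction policy."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self._data = OrderedDict()
        self._lock = threading.Lock()

    def get(self, key: str) -> Optional[str]:
        with self._lock:
            if key not in self._data:
                return None
            self._data.move_to_end(key)  # mark as most recently used
            return self._data[key]

    def put(self, key: str, value: str) -> None:
        with self._lock:
            self._data[key] = value
            self._data.move_to_end(key)
            if len(self._data) > self.capacity:
                self._data.popitem(last=False)  # evict least recently used


class Redirector:
    """cache -> DB -> 404, with concurrent misses for one code coalesced."""

    def __init__(self, cache: LRUCache, db_lookup: Callable[[str], Optional[str]]) -> None:
        self.cache = cache
        self.db_lookup = db_lookup  # hypothetical: returns the long URL or None
        self._key_locks: Dict[str, threading.Lock] = {}
        self._guard = threading.Lock()

    def resolve(self, code: str) -> Tuple[int, Optional[str]]:
        url = self.cache.get(code)
        if url is not None:
            return 302, url  # fast path: cache hit

        # Slow path: one lock per code, so only one caller queries the DB.
        # Sketch only: these per-code locks are never cleaned up.
        with self._guard:
            key_lock = self._key_locks.setdefault(code, threading.Lock())
        with key_lock:
            url = self.cache.get(code)  # another caller may have filled it
            if url is None:
                url = self.db_lookup(code)
                if url is not None:
                    self.cache.put(code, url)
        return (302, url) if url is not None else (404, None)
```

Unknown codes are not cached here, so repeated requests for a nonexistent code all reach the database; a short-lived negative cache entry is one possible extension.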
Common pitfalls
- Jumping to implementation without clarifying read/write ratio
- Forgetting analytics (Bitly's actual product value)
- Designing a 302 path with synchronous logging
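To illustrate the last pitfall, here is a minimal sketch of keeping click logging off the redirect's critical path. It assumes a bounded in-process queue as a stand-in for Kafka and a background thread as a stand-in for the offline aggregator; `ClickLogger` and its parameters are hypothetical names, not part of any real library.

```python
import queue
import threading
import time
from collections import Counter


class ClickLogger:
    """Non-blocking click recording with batched aggregation in the background."""

    def __init__(self, max_pending: int = 10_000, batch_size: int = 100) -> None:
        self._events = queue.Queue(maxsize=max_pending)
        self.batch_size = batch_size
        self.hit_counts = Counter()  # aggregated clicks per short code
        self.dropped = 0             # approximate count of shed events
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def record(self, code: str) -> None:
        """Called on the redirect path; returns immediately and never blocks."""
        try:
            self._events.put_nowait((code, time.time()))
        except queue.Full:
            # Shed analytics rather than slow down the redirect.
            self.dropped += 1

    def _run(self) -> None:
        while True:
            batch = [self._events.get()]  # block until at least one event
            while len(batch) < self.batch_size:
                try:
                    batch.append(self._events.get_nowait())
                except queue.Empty:
                    break
            self.hit_counts.update(code for code, _ts in batch)
            for _ in batch:
                self._events.task_done()

    def flush(self) -> None:
        """Wait until every queued event has been aggregated (useful in tests)."""
        self._events.join()


if __name__ == "__main__":
    logger = ClickLogger()
    for _ in range(3):
        logger.record("abc1234")  # the handler would send the 302 right after this
    logger.flush()
    print(logger.hit_counts)
```

The design choice worth stating in an interview is that the redirect never waits on analytics: if the queue is full, the event is dropped and counted, which trades some analytics accuracy for predictable redirect latency.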