Day 3: Timeline Generation Algorithms: The Heart of Social Media

Lesson 3 · 15 min

Building Production-Ready Feed Systems That Scale

What We're Building Today

Today we're tackling the most critical component of any social media platform: timeline generation. Think of it as the brain that decides what content each user sees and when.

High-Level Build Agenda:

  • Three different timeline models (Pull, Push, Hybrid) with smart switching logic

  • Fanout service that distributes tweets to followers in under 200ms

  • Cursor-based pagination system for infinite scroll

  • Redis caching layer with intelligent TTL strategies

  • Performance monitoring dashboard showing real-time metrics

  • Complete React/TypeScript frontend with Twitter-like UI

  • Node.js backend with PostgreSQL database optimization

Our Mission: Transform static tweet storage into dynamic, personalized timelines that update in real-time and automatically choose the best algorithm based on user behavior.


The Timeline Challenge: Why Netflix Gets This Right

When you open Netflix, your homepage isn't generated on-the-spot from their entire catalog. Instead, Netflix pre-computes personalized recommendations and serves them instantly. Social media timelines face the same challenge but with a twist: content changes every second.

Instagram processes over 500 million timeline requests daily. Their secret? They don't use one approach—they use all three timeline models depending on the user type.


Core Concepts: The Three Timeline Architectures

1. Pull Model (On-Demand Generation)

How it works: Generate timeline when user requests it by querying all followed users' recent tweets.

Real-world application: Reddit uses this for smaller subreddits where content velocity is manageable.

Workflow: User opens app → Query follows table → Fetch recent tweets → Sort by timestamp → Return timeline

Perfect for: Users with few followers/following, low-traffic periods

2. Push Model (Pre-computed Timelines)

How it works: When someone posts a tweet, immediately push it to all followers' pre-built timelines.

Real-world application: Facebook's original News Feed architecture used pure push for faster loading.

Workflow: User posts tweet → Fanout service distributes to all followers → Update each follower's materialized timeline → Instant retrieval

Perfect for: Active users, real-time experience priority

3. Hybrid Model (The Production Reality)

How it works: Combine both approaches based on user behavior patterns and follower counts.

Real-world application: Twitter's current architecture—celebrities use pull, regular users use push, power users get hybrid treatment.

Workflow: Dynamic routing based on user classification → Apply appropriate model → Merge results when necessary


System Architecture Overview

Our timeline system consists of four core components:

  1. Timeline Controller: Routes requests based on user type and load

  2. Fanout Service: Distributes tweets to follower timelines

  3. Timeline Cache: Redis-based caching for instant retrieval

  4. Pagination Manager: Handles infinite scroll without performance degradation

Data Flow Sequence:

  1. User requests timeline → Controller classifies user type

  2. Pull users: Query recent tweets → Sort → Cache → Return

  3. Push users: Retrieve pre-computed timeline → Paginate → Return

  4. Hybrid users: Merge cached timeline + real-time queries → Return


The Fanout Service: Distribution at Scale

The fanout service is your system's nervous system—it determines how quickly information spreads through your network.

Smart Distribution Logic:

  • Celebrity posts (>10K followers): Use pull model to avoid fanout explosion

  • Regular posts (<10K followers): Immediate push fanout to every follower's timeline

typescript
if (author.followerCount > 10000) {
  scheduleAsyncFanout(tweet); // Lazy loading for celebrities
} else {
  immediateFanout(tweet); // Real-time for regular users
}
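To make the push branch concrete, here is a minimal in-memory sketch of a fanout service. The `FanoutService` name, the Map-based stores, and the 800-entry timeline cap are illustrative assumptions, not the lesson's actual code — in production the timelines would live in Redis lists or a materialized table:

```typescript
// Minimal in-memory fanout sketch: push each new tweet into every
// follower's pre-built timeline, newest first, with a storage cap.
type TweetRef = { tweetId: string; createdAt: number };

class FanoutService {
  private timelines = new Map<string, TweetRef[]>(); // followerId -> newest-first refs
  private static MAX_TIMELINE = 800; // per-user cap (illustrative)

  constructor(private followers: Map<string, string[]>) {} // authorId -> followerIds

  // Push model: write the new tweet into every follower's timeline.
  fanout(authorId: string, ref: TweetRef): number {
    const targets = this.followers.get(authorId) ?? [];
    for (const followerId of targets) {
      const tl = this.timelines.get(followerId) ?? [];
      tl.unshift(ref); // newest first
      this.timelines.set(followerId, tl.slice(0, FanoutService.MAX_TIMELINE));
    }
    return targets.length; // how many timelines were updated
  }

  timelineOf(userId: string): TweetRef[] {
    return this.timelines.get(userId) ?? [];
  }
}
```

The cap is what keeps push-model storage bounded: a user who scrolls past it falls back to a pull query.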

Timeline State Management

Timeline systems maintain three critical states:

  • Fresh: Timeline was generated within the last few minutes and is served as-is

  • Warm: Timeline is usable but may be missing the newest tweets, so it is refreshed in the background

  • Stale: Timeline needs regeneration before serving (> 30 minutes old)
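A small helper can derive these states from a timeline's last generation timestamp. The 5- and 30-minute boundaries below are illustrative assumptions, not values fixed by the lesson:

```typescript
// Classify a timeline's freshness from when it was last generated.
// FRESH_MS / STALE_MS boundaries are illustrative defaults.
type TimelineState = "fresh" | "warm" | "stale";

const FRESH_MS = 5 * 60 * 1000;  // under 5 minutes old
const STALE_MS = 30 * 60 * 1000; // over 30 minutes old

function timelineState(generatedAt: number, now: number = Date.now()): TimelineState {
  const age = now - generatedAt;
  if (age < FRESH_MS) return "fresh"; // serve straight from cache
  if (age < STALE_MS) return "warm";  // serve, then refresh in background
  return "stale";                     // regenerate before serving
}
```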


Implementation Deep Dive

Prerequisites Setup

Before we start building, make sure you have:

  • Node.js 18+ installed

  • Docker Desktop running

  • Basic understanding of React and TypeScript

Project Structure Creation

bash
# Create the main project structure
mkdir twitter-timeline-system && cd twitter-timeline-system
mkdir -p {frontend/src/{components,services,types},backend/src/{controllers,services,models},database,docker}

Database Architecture Implementation

User Classification Algorithm:

typescript
function determineTimelineModel(user: User): TimelineModel {
  if (user.followerCount > 10000) {
    return 'pull';  // Avoid fanout explosion
  } else if (user.followingCount < 100 && user.followerCount < 1000) {
    return 'push';  // Cheap to pre-compute
  }
  return 'hybrid';  // Power users get both
}

Frontend Infinite Scroll Handler:

typescript
const loadMore = useCallback(async () => {
  const response = await timelineApi.getTimeline(cursor);
  // Append new tweets without duplicates
  setTweets(prev => {
    const seen = new Set(prev.map(t => t.id));
    return [...prev, ...response.tweets.filter(t => !seen.has(t.id))];
  });
}, [cursor]);

Backend Service Implementation

Timeline Service Architecture:
The TimelineService class implements all three models:

Pull Model Implementation:

  • Queries user's following list

  • Fetches recent tweets from followed users

  • Sorts by timestamp and paginates

  • Perfect for celebrity users to avoid fanout explosion
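The pull flow above can be sketched in a few lines, with in-memory Maps standing in for the `follows` and `tweets` tables (all names here are illustrative, not the lesson's `TimelineService` internals):

```typescript
// Pull model sketch: build the timeline on demand from followed users'
// recent tweets, sorted newest-first, then take one page.
type Tweet = { id: string; authorId: string; createdAt: number };

function pullTimeline(
  userId: string,
  follows: Map<string, string[]>,       // userId -> followed authorIds
  tweetsByAuthor: Map<string, Tweet[]>, // authorId -> that author's tweets
  limit = 20,
): Tweet[] {
  const followed = follows.get(userId) ?? [];
  const candidates = followed.flatMap((a) => tweetsByAuthor.get(a) ?? []);
  return candidates.sort((a, b) => b.createdAt - a.createdAt).slice(0, limit);
}
```

Nothing is written ahead of time — the cost is paid on every read, which is exactly why this model suits accounts whose fanout would be too expensive to pre-compute.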

Push Model Implementation:

  • Retrieves pre-computed timeline from materialized table

  • Uses Redis caching for sub-50ms responses

  • Ideal for regular users with predictable patterns

Hybrid Model Implementation:

  • Combines cached timeline with real-time queries

  • Merges and deduplicates results

  • Balances performance with freshness
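The merge-and-deduplicate step can be sketched as below, assuming both inputs carry an `id` and `createdAt` (names are illustrative):

```typescript
// Hybrid merge sketch: combine a cached (push) timeline with fresh
// (pull) results, drop duplicates, keep newest-first order.
type Entry = { id: string; createdAt: number };

function mergeTimelines(cached: Entry[], fresh: Entry[], limit = 20): Entry[] {
  const seen = new Set<string>();
  const merged: Entry[] = [];
  for (const e of [...fresh, ...cached].sort((a, b) => b.createdAt - a.createdAt)) {
    if (!seen.has(e.id)) {
      seen.add(e.id); // first occurrence wins
      merged.push(e);
    }
  }
  return merged.slice(0, limit);
}
```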

Performance Optimization Strategies

Cursor-Based Pagination Advantage:

sql
-- Slow offset-based query
SELECT * FROM tweets OFFSET 10000 LIMIT 20;

-- Fast cursor-based query  
SELECT * FROM tweets WHERE created_at < '2024-01-01' AND id < 'abc123' LIMIT 20;

Cursor pagination maintains O(log n) performance regardless of page depth, crucial for infinite scroll.
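One common way to build such a cursor — an assumption here, not something the lesson mandates — is to base64-encode the last row's `(created_at, id)` pair into an opaque token the client echoes back, so the server can resume strictly after it:

```typescript
// Cursor sketch: pack the last item's (createdAt, id) into an opaque
// token; decoding it yields the values for the WHERE clause above.
type Cursor = { createdAt: number; id: string };

function encodeCursor(c: Cursor): string {
  return Buffer.from(JSON.stringify(c)).toString("base64url");
}

function decodeCursor(token: string): Cursor {
  return JSON.parse(Buffer.from(token, "base64url").toString("utf8"));
}
```

Keeping the token opaque lets you change the cursor's contents later without breaking clients.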

Caching Strategy Implementation:

  • Hot Cache: Active users get 5-minute TTL

  • Warm Cache: Regular users get 15-minute TTL

  • Cold Cache: Inactive users get 30-minute TTL
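A sketch of how these tiers might be selected from a user's last-activity age — the one-hour and one-day boundaries are assumptions for illustration:

```typescript
// TTL tier sketch: map how recently a user was active onto a cache TTL.
// Hot/warm/cold activity boundaries are illustrative.
const MINUTE = 60; // seconds

function cacheTtlSeconds(lastActiveMsAgo: number): number {
  const minutesAgo = lastActiveMsAgo / 60_000;
  if (minutesAgo < 60) return 5 * MINUTE;       // hot: active within the hour
  if (minutesAgo < 24 * 60) return 15 * MINUTE; // warm: active today
  return 30 * MINUTE;                           // cold: inactive
}
```

The returned value would be passed straight to the cache write (e.g. Redis `SET key value EX ttl`).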


Build and Testing Instructions

1. Environment Setup

bash
# Start infrastructure services
cd docker && docker-compose up -d postgres redis

# Verify services are healthy
docker-compose ps
# Expected: postgres (healthy), redis (healthy)

2. Backend Development

bash
cd backend
npm install
npm run build
npm run dev
# Expected: Server running on http://localhost:5000

The backend automatically sets up the database schema and seeds demo data with different user types for testing all three timeline models.

3. Frontend Development

bash
cd frontend  
npm install
npm run dev
# Expected: Vite dev server on http://localhost:3000

4. Performance Testing

Timeline Generation Speed Test:

bash
# Test response times for all models
for i in {1..10}; do 
  curl -w "%{time_total}\n" -o /dev/null -s http://localhost:5000/api/timeline
done
# Expected: All results under 0.200 seconds

Load Testing with Multiple Users:

bash
# Simulate 100 concurrent users
npx artillery run load-test.yml
# Expected: 95th percentile under 300ms

5. Demo Verification

Timeline Model Switching Demo:

  1. Open http://localhost:3000

  2. Notice the timeline model indicator showing which algorithm is active

  3. Refresh the page and observe generation time metrics

  4. Test infinite scroll functionality

Performance Monitoring:
The UI displays real-time metrics:

  • Generation Time: Should consistently stay under 200ms

  • Cache Hit Ratio: Should show >80% hit rates

  • Database query optimization with proper indexing


What's Next: Preparing for Real-Time

Lesson 4 will transform our static timelines into living streams with WebSocket connections and Redis pub/sub. Every timeline update will flow in real-time, creating that addictive "always something new" experience that keeps users engaged.

Today's timeline foundation makes real-time updates possible without sacrificing performance. You're building the engine that will power instant notifications, live reactions, and seamless user experiences.

The architecture we've built today includes:

  • WebSocket connection points for real-time updates

  • Event hooks for timeline invalidation

  • Cache invalidation patterns for instant updates

  • Performance baseline for measuring real-time improvements


Assignment Challenge

Implement timeline caching with TTL optimization by analyzing user activity patterns. Your goal is to improve cache hit rates by at least 15%.

Solution Approach:

  1. Track last_active timestamps for all users

  2. Implement sliding TTL windows (5min for active users, 30min for inactive)

  3. A/B test the optimization against current fixed TTL

  4. Measure and document performance improvements

Success Metrics: Cache hit ratio improvement, reduced database load, maintained response times under 200ms.

This challenge prepares you for production-level optimizations where small improvements in cache efficiency can save thousands in infrastructure costs while improving user experience.
