Configuring RoadRunner: The rr.yaml supervisor setup.

Lesson 2 15 min

Architecting for Velocity: Unlocking Peak Performance with RoadRunner's rr.yaml Supervisor

Welcome back, engineers! Yesterday, we laid the groundwork for our High-Scale PHP CMS, recognizing the monumental shift from traditional shared-nothing PHP to a persistent runtime model. We saw how this paradigm allows PHP to finally shed its "bootload tax" – that constant, repetitive overhead of bootstrapping the entire application on every single request.

Today, we're diving deeper into the beating heart of this new architecture: RoadRunner's rr.yaml configuration file, specifically focusing on its role as a process supervisor. This isn't just a config file; it's the operational blueprint, the manifest that dictates how your PHP application workers will live, breathe, and perform under immense pressure. Understanding this file is the difference between a high-performance system and one that crumbles under load.

Agenda for Day 2:

  1. Core Concepts: The Supervisor Pattern, rr.yaml as an Orchestration Manifest, Worker Lifecycle Management.

  2. Architectural Fit: How rr.yaml configures the RoadRunner server within our overall CMS.

  3. Control & Data Flow: Tracing a request through a supervised PHP worker.

  4. Real-world Insights: Sizing and tuning for 100M+ requests per second.

  5. Hands-on Implementation: Setting up a basic RoadRunner-managed PHP application.

  6. Assignment & Solution: Practical exercises to solidify your understanding.

Core Concepts: The Unseen Hand of the Supervisor

At its essence, a supervisor is a guardian process that monitors and manages other processes. Think of it as a benevolent overseer ensuring that critical components of your system remain healthy and operational. In the context of RoadRunner and high-scale PHP, this concept is nothing short of revolutionary.

Why is a Supervisor Critical for High-Scale PHP?

Traditional PHP, with its "shared-nothing" architecture, was inherently resilient to memory leaks and stale state because every request started fresh. But this freshness came at a cost: the bootload tax. RoadRunner eliminates this by keeping PHP workers alive and ready to serve multiple requests, a "persistent runtime" model in which each worker process carries its in-memory state from one request to the next.

This persistence, while a performance boon, introduces new challenges:

  • Memory Leaks: Long-running processes are susceptible to gradual memory accumulation, leading to performance degradation and eventual crashes.

  • Stale State: Application-level caches or global variables might become outdated if not properly managed across requests.

  • Resource Exhaustion: A single runaway worker could consume excessive CPU or memory, impacting the entire pool.

The RoadRunner supervisor, configured via rr.yaml, is the elegant solution to these problems. It actively monitors your PHP workers and, crucially, recycles them gracefully before they become problematic.
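
To make the stale-state and memory-growth risks concrete, here is a minimal sketch in plain PHP. The function handleRequest and its per-process cache are hypothetical, and the three calls simply stand in for one worker serving three requests in a row; none of this is RoadRunner's API.

```php
<?php
// Hypothetical handler: the static variables survive between calls because
// the process never exits, which is what happens inside a persistent worker.
function handleRequest(string $path): array
{
    static $cache = [];   // lives as long as the worker process
    static $served = 0;   // number of requests this worker has handled

    $served++;
    if (!isset($cache[$path])) {
        // First time this worker sees the path: render and remember it.
        $cache[$path] = 'rendered:' . $path . '@request' . $served;
    }

    return [
        'served'    => $served,
        'body'      => $cache[$path],  // may be stale on later requests
        'cacheSize' => count($cache),  // grows with every new path
    ];
}

$first  = handleRequest('/home');
$second = handleRequest('/about');
$third  = handleRequest('/home');      // answered from the entry made by request 1

printf(
    "request %d -> %s (cache entries: %d)\n",
    $third['served'],
    $third['body'],
    $third['cacheSize']
);
```

The third call returns the body rendered during the first one, and the cache never shrinks. Recycling the worker is what eventually clears both.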

rr.yaml: Your Application's Operational Manifest

The rr.yaml file is where you declare your intentions for RoadRunner. It covers various services such as http, rpc, and jobs, but our focus today is on the php section, which is where the supervisor magic happens.

yaml
# rr.yaml - The Brain of Your High-Scale PHP System

http:
  address: 0.0.0.0:8080
  middleware: ["compress"]

php:
  # The command to execute your PHP worker. This is your application's entry point.
  # It's crucial this script is designed to accept requests via STDIN and return via STDOUT.
  command: "php src/public/index.php"

  # The number of PHP worker processes RoadRunner will spawn and maintain.
  # This is a critical knob for scaling: typically, 1-2 workers per CPU core.
  num_workers: 2

  # The maximum number of requests a single PHP worker will serve before being gracefully restarted.
  # This is your primary defense against memory leaks and stale state.
  # A lower number means more frequent recycling, higher stability, but slightly more overhead.
  max_requests: 500

  # The maximum memory (in bytes) a PHP worker can consume before being restarted.
  # Another safeguard against memory leaks.
  # Example: 128MB = 134217728 bytes
  max_memory: 134217728 # 128MB

  # The maximum time a PHP worker is allowed to live, regardless of requests served.
  # Ensures workers are eventually recycled, preventing very slow leaks.
  max_lifetime: 3600s # 1 hour

  # How long an idle worker (not processing requests) will wait before being terminated.
  # Useful for reducing resource consumption during low traffic periods.
  idle_ttl: 10s

  # The maximum time a worker is allowed to process a single request.
  # Prevents runaway scripts from hanging your application.
  exec_timeout: 60s

  # Environment variables passed to your PHP workers.
  # Critical for configuring your application (e.g., database credentials, API keys).
  env:
    APP_ENV: "production"
    APP_DEBUG: "false"
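
Because max_memory is given in raw bytes, it is easy to mistype. A small helper like the one below (the name megabytesToBytes is hypothetical, and it assumes the lesson's convention of 1 MB = 1024 * 1024 bytes) can be used to double-check the numbers before they go into rr.yaml.

```php
<?php
// Converts a size in megabytes to bytes, using 1 MB = 1024 * 1024 bytes
// as the lesson's rr.yaml comments do.
function megabytesToBytes(int $megabytes): int
{
    return $megabytes * 1024 * 1024;
}

echo megabytesToBytes(128), "\n"; // the production example's max_memory
echo megabytesToBytes(32), "\n";  // the tight limit used in the hands-on config
```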

Worker Lifecycle Management: The Art of Graceful Recycling

The num_workers, max_requests, max_memory, and max_lifetime parameters are your most potent weapons in the fight for stability and performance.

  • num_workers: Directly impacts concurrency. On a multi-core server, you'd typically set this to CPU_CORES * N, where N is 1 or 2, depending on whether your application is CPU-bound or I/O-bound. Too many workers can lead to context switching overhead; too few, and you're leaving performance on the table.

  • max_requests: This is the golden knob. By setting a reasonable max_requests (e.g., 500-1000), you instruct RoadRunner to gracefully shut down and replace a worker after it has processed that many requests. This proactively mitigates memory leaks and ensures your workers are always starting from a relatively "fresh" state, without incurring the full bootload tax. The new worker spins up in the background, minimizing downtime.

  • max_memory: A hard limit. If a worker exceeds this, it's restarted. Essential for catching unexpected memory spikes.

  • max_lifetime: A time-based fallback. Even if max_requests isn't hit, workers will eventually be recycled.
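
The way these limits combine can be sketched as a single decision function. This is only an illustration of the rules described above, not RoadRunner's internal code; the LIMITS constant and the recycleReason function are hypothetical names, and the order in which the limits are checked is an assumption.

```php
<?php
// Hypothetical limits mirroring the values in the lesson's rr.yaml.
const LIMITS = [
    'max_requests' => 500,
    'max_memory'   => 134217728, // bytes (128 MB)
    'max_lifetime' => 3600,      // seconds
];

/**
 * Returns the name of the first limit the worker has reached,
 * or null when the worker may keep serving requests.
 */
function recycleReason(int $served, int $memoryBytes, int $ageSeconds, array $limits = LIMITS): ?string
{
    if ($served >= $limits['max_requests']) {
        return 'max_requests';
    }
    if ($memoryBytes > $limits['max_memory']) {
        return 'max_memory';
    }
    if ($ageSeconds >= $limits['max_lifetime']) {
        return 'max_lifetime';
    }
    return null;
}

echo recycleReason(120, 40000000, 600) ?? 'keep', "\n";
echo recycleReason(500, 40000000, 600) ?? 'keep', "\n";
```

A healthy worker yields null and stays in the pool; any other return value names the limit that triggered its replacement.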

Component Architecture: rr.yaml in the Grand Scheme

[Figure: Component Architecture Diagram. The rr.yaml orchestration layer (num_workers: N, max_requests: 500, max_memory: 128MB, exec_timeout: 60s) configures the supervisor engine of the RoadRunner server, which receives HTTP requests and manages a persistent PHP worker pool: Worker #1 (Active), Worker #2 (Idle), Worker #3 (Recycling).]

In our High-Scale PHP CMS, rr.yaml defines how the RoadRunner server interacts with your PHP application. It's not a component itself, but rather the configuration for the central component: the RoadRunner server.

The RoadRunner server acts as both the HTTP front end and the process manager. It listens for HTTP requests, then dispatches each one to an available PHP worker from its pool. The rr.yaml defines the size of this pool, the lifecycle rules for each worker, and the environment they operate within.

Control Flow and Data Flow

[Figure: Flowchart Diagram. RoadRunner (Go) handles protocol marshalling, passes JSON request data to the PHP worker over STDIN, and reads the JSON response body back from STDOUT; the PHP worker runs the application logic.]

  1. Request Ingress: An HTTP request hits our system (potentially via a load balancer).

  2. RoadRunner Interception: The RoadRunner server, configured by rr.yaml, receives the request.

  3. Worker Selection: RoadRunner selects an Idle PHP worker from its pool.

  4. Data Ingestion: The HTTP request details (headers, body, method, path) are passed via STDIN to the selected PHP worker.

  5. PHP Application Execution: Your src/public/index.php (or your framework's entry point) processes the request, interacting with databases, caches, etc.

  6. Response Egress: The PHP application writes its HTTP response (headers, body) to STDOUT.

  7. RoadRunner Relay: RoadRunner captures the STDOUT and sends the HTTP response back to the client.

  8. Supervisor Check: After each request, the supervisor checks if the worker has hit max_requests, max_memory, or max_lifetime. If so, it marks the worker for graceful shutdown and replacement, ensuring a fresh worker is always ready.
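
The steps above can be played through with a toy model. The sketch below assumes round-robin dispatch and checks only max_requests; the ToyPool class, its methods, and the integer worker ids are illustrative stand-ins, not RoadRunner's API or its actual scheduling strategy.

```php
<?php
// Toy model of the flow: a fixed pool, round-robin dispatch, and a check
// after each request that replaces any worker that reached max_requests.
final class ToyPool
{
    /** @var array<int, array{id: int, served: int}> */
    private array $workers = [];
    private int $nextId = 1;
    private int $cursor = 0;
    private int $maxRequests;
    public int $recycled = 0;

    public function __construct(int $numWorkers, int $maxRequests)
    {
        $this->maxRequests = $maxRequests;
        for ($i = 0; $i < $numWorkers; $i++) {
            $this->workers[] = $this->spawn();
        }
    }

    private function spawn(): array
    {
        return ['id' => $this->nextId++, 'served' => 0];
    }

    /** Dispatches one request and returns the id of the worker that served it. */
    public function handle(): int
    {
        $slot = $this->cursor;
        $this->cursor = ($this->cursor + 1) % count($this->workers);

        $this->workers[$slot]['served']++;
        $servedBy = $this->workers[$slot]['id'];

        // Supervisor check once the request is complete.
        if ($this->workers[$slot]['served'] >= $this->maxRequests) {
            $this->workers[$slot] = $this->spawn();
            $this->recycled++;
        }

        return $servedBy;
    }
}

$pool = new ToyPool(2, 5);
$ids = [];
for ($i = 0; $i < 12; $i++) {
    $ids[] = $pool->handle();
}
echo implode(' ', $ids), "\n";
```

With two workers and max_requests set to 5, the first ten requests alternate between workers 1 and 2, after which both have been replaced and requests 11 and 12 land on workers 3 and 4.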

Sizing for 100 Million Requests Per Second: The Nuance of max_requests

[Figure: State Machine Diagram. Over time, Worker A (PID 1024) handles requests 498, 499, and 500 and hits the threshold; Worker B (PID 1025) spins up and bootstraps, traffic is handed over seamlessly, and Worker A exits gracefully.]

Handling 100M RPS is not just about raw power; it's about surgical precision in resource management. For such extreme scale, max_requests is your most critical operational lever.

  • The Trade-off: A higher max_requests means fewer worker restarts, less overhead, potentially better throughput if your application is perfectly leak-free. A lower max_requests means more frequent restarts, slightly higher overhead, but significantly improved stability and memory footprint control.

  • Real-world Insight: In systems handling massive traffic, even tiny memory leaks can accumulate rapidly across millions of requests. Setting max_requests to a conservative value (e.g., 500-1000) is often preferred for stability, even if it introduces a minuscule performance penalty. The cost of a few extra worker spawns is dwarfed by the cost of a memory-exhausted server or an unstable application.

  • Dynamic Tuning: In very advanced setups, max_requests might even be dynamically adjusted based on real-time memory profiling of workers. But for most high-scale systems, a well-chosen static value is sufficient.

  • CPU/Memory Allocation: The num_workers should be carefully calibrated against your server's CPU cores and available memory. A good starting point is num_workers = number_of_CPU_cores * 1.5 to allow for some I/O wait. Then, monitor memory usage closely. If workers are frequently hitting max_memory, either increase the limit (if resources allow) or reduce max_requests to recycle them sooner.
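
The rules of thumb above reduce to simple arithmetic, sketched below. The function names planPool and restartsPerSecond are hypothetical, and the 8-core host and the 2000 requests per second are made-up inputs chosen only to show the calculation.

```php
<?php
// Pool size from the "cores * factor" rule, plus the memory needed if every
// worker sits right at its max_memory limit.
function planPool(int $cpuCores, float $workersPerCore, int $maxMemoryBytes): array
{
    $numWorkers = (int) ceil($cpuCores * $workersPerCore);

    return [
        'num_workers'      => $numWorkers,
        'worst_case_bytes' => $numWorkers * $maxMemoryBytes,
    ];
}

// Worker restarts per second implied by a request rate and max_requests.
function restartsPerSecond(float $requestsPerSecond, int $maxRequests): float
{
    return $requestsPerSecond / $maxRequests;
}

$plan = planPool(8, 1.5, 134217728); // assumed host: 8 cores, 128 MB per worker
printf(
    "%d workers, worst case %.1f GiB\n",
    $plan['num_workers'],
    $plan['worst_case_bytes'] / 1024 ** 3
);
printf("%.1f restarts/s at 2000 req/s\n", restartsPerSecond(2000, 500));
```

Halving max_requests doubles the restart rate, which is the overhead side of the trade-off described above; the worst-case figure shows whether the host can afford the chosen max_memory at all.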

Hands-on: Setting Up Our First Supervised PHP Application

Let's get our hands dirty. We'll set up a minimal PHP application and configure RoadRunner to supervise it using rr.yaml.

1. Project Structure

We'll create a simple src/public/index.php that will serve as our web entry point.

Code
high-scale-cms-rr/
├── rr.yaml
└── src/
    └── public/
        └── index.php

2. src/public/index.php

This script will simulate a basic web request and show us some memory usage, demonstrating the persistent nature.

php
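<?php
// NOTE: the top of this listing was lost; the lines above 'headers' are a minimal
// reconstruction that assumes one JSON-encoded request per line on STDIN.
$requestCount = 0; // lives as long as this worker process

while (($line = fgets(STDIN)) !== false) { // request processing loop
    $requestCount++;
    $responseBody = sprintf("PID %d | requestCount: %d | memory: %d bytes\n",
        getmypid(), $requestCount, memory_get_usage());

    $response = [
        'status' => 200,
        'headers' => [
            'Content-Type' => ['text/plain'],
            'X-Powered-By' => ['RoadRunner/PHP']
        ],
        'body' => $responseBody
    ];

    // Write the JSON response to STDOUT, one line per response
    echo json_encode($response), "\n";
}

// The loop ends only when STDIN is closed, i.e. when RoadRunner retires this worker.
// Exit 0 to signal a clean shutdown.
exit(0);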

3. rr.yaml

This will be our supervisor configuration.

yaml
# rr.yaml - Our RoadRunner Supervisor Configuration
http:
  address: 0.0.0.0:8080
  middleware: ["compress"]

php:
  command: "php src/public/index.php"
  num_workers: 2 # Start with 2 workers for demonstration
  max_requests: 5 # Set low for easy observation of recycling
  max_memory: 33554432 # 32MB - For demonstration, keep it tight
  exec_timeout: 10s
  idle_ttl: 5s
  env:
    APP_ENV: "dev"

Note: We've intentionally set max_requests to a very low 5 here. This is purely for demonstration purposes so you can quickly observe worker recycling. In a production system, this would typically be 500-1000.

Assignment: Master the Supervisor

Your mission, should you choose to accept it, is to experiment with the rr.yaml configuration and observe its impact.

  1. Initial Setup: Follow the start.sh script to get the application running.

  2. Observe Recycling: Hit your application (curl http://localhost:8080) more than 5 times. Watch the requestCount in the response and RoadRunner's logs. You should see workers restarting after 5 requests.

  3. Tune max_requests: Change max_requests to 100. Restart RoadRunner. Observe how many requests a worker now handles before recycling.

  4. Tune num_workers: Change num_workers to 4. Restart RoadRunner. Run a concurrent load test before and after the change (e.g., ab -n 50 -c 10 http://localhost:8080/, noting that ab needs the trailing slash) and compare the requests per second and time per request it reports.

  5. Simulate Memory Leak (Optional but insightful): In src/public/index.php, add a line like $GLOBALS['leak_array'][] = str_repeat('A', 1024 * 10); inside the request processing loop. This will intentionally leak 10KB per request. Then, set max_memory to a tight value (e.g., 16777216 for 16MB) and max_requests to a high value (e.g., 1000). Observe if workers are now recycled by max_memory instead of max_requests. Remember to revert this change for production!
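
Before running the leak experiment, it helps to estimate which limit should fire first. The sketch below does that arithmetic; requestsUntilMemoryLimit is a hypothetical helper, and the 2 MB baseline is an assumed figure that you should replace with the memory a freshly started worker actually reports.

```php
<?php
// Number of the request on which a steadily leaking worker first exceeds
// max_memory. $baselineBytes is the footprint of a freshly started worker
// (an assumed input; measure it in your own setup).
function requestsUntilMemoryLimit(int $maxMemoryBytes, int $baselineBytes, int $leakPerRequestBytes): int
{
    $headroom = max(0, $maxMemoryBytes - $baselineBytes);

    return intdiv($headroom, $leakPerRequestBytes) + 1;
}

$limit    = 16777216;        // max_memory: 16 MB
$leak     = 1024 * 10;       // 10 KB leaked per request
$baseline = 2 * 1024 * 1024; // assumed 2 MB baseline

$n = requestsUntilMemoryLimit($limit, $baseline, $leak);
echo $n, "\n";
```

With these assumed numbers the limit is only crossed on request 1434, so a max_requests of 1000 would recycle the worker first. If you do not see max_memory recycling in the exercise, raise max_requests, increase the leak size, or lower max_memory until the estimate falls below max_requests.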

Solution Hints:

  • RoadRunner Logs: Pay close attention to the console output from RoadRunner when it starts and when you make requests. It will explicitly log when workers are "spawned," "recycled," or "terminated."

  • requestCount: Our simple index.php's $requestCount variable lives for as long as the worker process does, so it keeps counting across requests within a single worker. When a worker is recycled, the new worker starts its $requestCount from 1. This is your visual cue.

  • memory_get_usage(): The memory reported will show the worker's current memory footprint. You'll see it reset when a worker recycles.

  • Restarting RoadRunner: After any change to rr.yaml, you must restart RoadRunner for the changes to take effect. The stop.sh and start.sh scripts will help with this.

By mastering rr.yaml and its supervisor capabilities, you're not just configuring a server; you're designing a resilient, self-healing system capable of handling the demands of a truly high-scale CMS. This understanding is foundational for building robust distributed systems, not just in PHP, but across any language where persistent processes are managed.
