Day 1: Deconstructing the architecture of LeetCode and SPOJ.

Lesson 1 30 min

Unlocking the Secrets of Online Judges: Building Your First Code Execution Engine

Welcome, engineers, to a journey unlike any other. This isn't just another system design course; it's your backstage pass to the architectural decisions that power the world's most demanding systems. Today, we peel back the layers of something you might use every day: online judges like LeetCode and SPOJ. But we're not just solving problems on them; we're going to build a foundational piece of their very heart.

Why start here? Because understanding how these platforms securely and efficiently run your untrusted code is a masterclass in distributed task execution, sandboxing, resource management, and robust error handling – concepts that are absolutely critical whether you're building serverless functions, CI/CD pipelines, or even advanced AI training environments.

The Challenge: Running Untrusted Code Safely and at Scale

Imagine a system that accepts millions of code submissions daily, written in various languages, from millions of users. Each submission needs to be compiled (if applicable), executed, and tested against a set of hidden inputs, while the system guarantees four properties:

Isolation: Preventing malicious code from accessing or harming the host system or other users' submissions.
Resource Limiting: Ensuring one submission doesn't hog all CPU, memory, or time, leading to denial of service for others.
Scalability: Handling a massive, fluctuating load of concurrent executions.
Accuracy: Providing precise feedback on correctness, runtime, and memory usage.

This is the core problem LeetCode, SPOJ, and even your company's internal code review systems solve daily.
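
Of the four requirements above, resource limiting is the easiest to sketch with the standard library alone. The following is a minimal, POSIX-only illustration, not how LeetCode or SPOJ actually do it: the helper name `run_with_limits` and its default limits are assumptions made for this example. It uses `resource.setrlimit` inside a `preexec_fn` so the kernel enforces a CPU-time cap and an address-space cap on the child process.

```python
import resource
import subprocess
import sys


def run_with_limits(cmd, cpu_seconds=2, memory_bytes=512 * 1024 * 1024, wall_timeout=5):
    """Run cmd with kernel-enforced CPU-time and address-space limits (POSIX only).

    Hypothetical helper for illustration; the default limits are arbitrary.
    """

    def apply_limits():
        # Runs in the child between fork() and exec(), so the limits
        # apply to the submission only, never to the worker itself.
        resource.setrlimit(resource.RLIMIT_CPU, (cpu_seconds, cpu_seconds))
        resource.setrlimit(resource.RLIMIT_AS, (memory_bytes, memory_bytes))

    return subprocess.run(
        cmd,
        stdin=subprocess.DEVNULL,   # the child gets no access to the worker's stdin
        capture_output=True,
        text=True,
        timeout=wall_timeout,       # wall-clock backstop (e.g. for sleeping code)
        preexec_fn=apply_limits,
    )


if __name__ == "__main__":
    ok = run_with_limits([sys.executable, "-c", "print('ok')"])
    hog = run_with_limits([sys.executable, "-c", "x = bytearray(1024 ** 3)"])
    print("small program exit code:", ok.returncode)
    print("1 GiB allocation exit code:", hog.returncode)
```

Limits like these bound CPU and memory but do not provide isolation: the child can still read files and open network connections, which is why production judges add containers or similar sandboxes on top.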

Core Concept: The Code Execution Worker – Your First Step into Distributed Computing

For our first deep dive, we're focusing on the "Code Execution Worker" – the unsung hero that takes a piece of code, runs it, and tells us what happened. Think of it as a specialized, highly controlled virtual machine that performs one job: executing code and reporting back.

Component Architecture: Our Minimalist Online Judge


At its simplest, our system today will consist of:

Submission Receiver (Implicit): For this lesson, we'll simulate this by directly feeding code to our worker. In a real system, this would be an API Gateway receiving user submissions.
Code Execution Worker (Our Focus): This is the component we'll build. It's responsible for:

Receiving code and execution parameters (language, input).
Setting up an isolated environment (conceptually, for now).
Running the code.
Capturing output, errors, execution time, and exit status.
Returning structured results.

Result Store (Implicit): For now, our worker will print results to the console. In a real system, this would be a database or message queue to store and relay results.
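
One way to pin down what "structured results" could mean is to give the worker's input and output explicit types. The sketch below is an assumption for illustration only: the class names `ExecutionRequest` and `ExecutionResult` and the `exit_code` field are not part of the lesson's script, which uses a plain dictionary.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class ExecutionRequest:
    """What the Submission Receiver would hand to the worker (hypothetical shape)."""
    code: str
    language: str = "python"
    input_data: str = ""
    timeout_seconds: float = 5.0


@dataclass
class ExecutionResult:
    """What the worker would hand to the Result Store (hypothetical shape)."""
    status: str
    stdout: str = ""
    stderr: str = ""
    exit_code: Optional[int] = None
    execution_time_ms: int = 0
    error: Optional[str] = None

    def to_json(self) -> str:
        # asdict() turns the dataclass into a plain dict that json can serialize
        return json.dumps(asdict(self), indent=2)


if __name__ == "__main__":
    result = ExecutionResult(status="Accepted", stdout="Hello", exit_code=0, execution_time_ms=12)
    print(result.to_json())
```

Typed records like these make it harder to misspell a key and give a database or message queue a single, documented payload to store later.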

Control Flow: A Journey from Code to Result

Flowchart

A user's code snippet (e.g., Python) and optional input are provided.
Our code_executor_service (the worker) receives this.
The worker writes the code to a temporary file.
It invokes the appropriate language interpreter/compiler (e.g., python3) in a subprocess, feeding the temporary file and input.
Crucially, it sets a timeout on the execution so that an infinite loop cannot run forever.
The subprocess runs, potentially producing stdout and stderr.
Once the subprocess finishes (or is killed at the timeout), the worker collects stdout, stderr, and the exit code, and computes the elapsed wall-clock time.
It then reports these structured results.
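
The step "invokes the appropriate language interpreter/compiler" can be sketched as a lookup table from language name to command. This is only an illustration: `RUN_COMMANDS` and `build_command` are hypothetical names, and the table covers interpreted languages only (a compiled language would need a separate compile step before running).

```python
import sys
from typing import Dict, List

# Hypothetical command templates; "{src}" is replaced by the submission's file path.
RUN_COMMANDS: Dict[str, List[str]] = {
    "python": [sys.executable, "{src}"],
    "javascript": ["node", "{src}"],
}


def build_command(language: str, source_path: str) -> List[str]:
    """Return the argv list used to run source_path for the given language."""
    try:
        template = RUN_COMMANDS[language]
    except KeyError:
        raise ValueError(f"Unsupported language: {language}") from None
    # Building a list (not a shell string) avoids shell-injection through the path
    return [part.replace("{src}", source_path) for part in template]


if __name__ == "__main__":
    print(build_command("python", "/tmp/submission.py"))
```

Passing the resulting list straight to `subprocess.run` keeps the shell out of the picture entirely.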

Data Flow: What Flows Where

State Machine

Input: (code_string, language, input_data) -> Code Execution Worker
Intermediate: submission.py and input.txt, written by the worker into a temporary directory
Output: stdout, stderr, exit_code, execution_time
State: Running -> Finished, where Finished can be Accepted, Wrong Answer, Runtime Error, or Time Limit Exceeded. Today, we'll focus on Running -> Finished with basic error/timeout detection.
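
The Running -> Finished transition can be made explicit with an enum and a small classification function. This is a sketch under assumptions: the names `State`, `finish`, and `transition` are invented here, and the order of the checks (timeout first, then exit code, then output comparison) is one reasonable choice rather than a rule taken from any particular judge.

```python
from enum import Enum
from typing import Optional


class State(Enum):
    RUNNING = "Running"
    ACCEPTED = "Accepted"
    WRONG_ANSWER = "Wrong Answer"
    RUNTIME_ERROR = "Runtime Error"
    TIME_LIMIT_EXCEEDED = "Time Limit Exceeded"


# Only Running may move on; every Finished state is terminal.
ALLOWED = {
    State.RUNNING: {
        State.ACCEPTED,
        State.WRONG_ANSWER,
        State.RUNTIME_ERROR,
        State.TIME_LIMIT_EXCEEDED,
    }
}


def finish(timed_out: bool, exit_code: Optional[int], output_matches: Optional[bool] = None) -> State:
    """Pick the Finished state for one run; output_matches=None means 'not judged'."""
    if timed_out:
        return State.TIME_LIMIT_EXCEEDED
    if exit_code != 0:
        return State.RUNTIME_ERROR
    if output_matches is False:
        return State.WRONG_ANSWER
    return State.ACCEPTED


def transition(current: State, new: State) -> State:
    """Validate a state change and return the new state."""
    if new not in ALLOWED.get(current, set()):
        raise ValueError(f"Illegal transition: {current.value} -> {new.value}")
    return new


if __name__ == "__main__":
    state = transition(State.RUNNING, finish(timed_out=False, exit_code=1))
    print(state.value)
```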

Real-time Production System Application: Why This Matters Beyond LeetCode

This isn't just academic. Every time you push code to GitHub and a CI/CD pipeline runs tests, a similar execution worker is at play. When you deploy a serverless function (like AWS Lambda or Google Cloud Functions), your code runs within a highly optimized, sandboxed execution environment that shares many principles with our worker. Understanding how to isolate, run, and monitor arbitrary code is fundamental to building secure, scalable, and reliable cloud infrastructure.

Hands-on Build-Along: Your First Code Execution Worker

Our goal for today is to build a Python-based code_executor_service that can:

Accept a Python code string and an optional input string.
Execute it with a timeout.
Capture its stdout, stderr, and execution time.
Report these results in a structured format.

This will be a simple, single-file Python script to start.

python

# src/worker.py
import subprocess
import tempfile
import os
import time
import json

def execute_python_code(code_string: str, input_string: str = "", timeout_seconds: int = 5):
    """
    Executes a given Python code string with optional input and a timeout.
    Returns a dictionary with execution results.
    """
    result = {
        "status": "Unknown",
        "stdout": "",
        "stderr": "",
        "execution_time_ms": 0,
        "error": None
    }

    with tempfile.TemporaryDirectory() as temp_dir:
        code_file_path = os.path.join(temp_dir, "submission.py")
        input_file_path = os.path.join(temp_dir, "input.txt")

        # Write the user's code to a temporary file
        try:
            with open(code_file_path, "w") as f:
                f.write(code_string)
        except IOError as e:
            result["status"] = "Internal Error"
            result["error"] = f"Failed to write code file: {e}"
            return result

        # Write input to a temporary file if provided
        input_stream = None
        if input_string:
            try:
                with open(input_file_path, "w") as f:
                    f.write(input_string)
                input_stream = open(input_file_path, "r")
            except IOError as e:
                result["status"] = "Internal Error"
                result["error"] = f"Failed to write input file: {e}"
                return result

        start_time = time.perf_counter()
        try:
            # Command to execute the Python script
            cmd = ["python3", code_file_path]

            process = subprocess.run(
                cmd,
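                # Fall back to DEVNULL so the child never inherits (and blocks on) the worker's own stdin
                stdin=input_stream if input_stream else subprocess.DEVNULL,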
                capture_output=True,
                text=True,  # Capture stdout/stderr as text
                timeout=timeout_seconds,
                check=False # Don't raise CalledProcessError for non-zero exit codes
            )
            end_time = time.perf_counter()
            result["execution_time_ms"] = int((end_time - start_time) * 1000)
            result["stdout"] = process.stdout.strip()
            result["stderr"] = process.stderr.strip()

            if process.returncode == 0:
                result["status"] = "Accepted"
            else:
                if result["stderr"]:
                    result["status"] = "Runtime Error"
                else:
                    # This case might indicate a logic error or unexpected exit
                    result["status"] = "Execution Failed"
                result["error"] = f"Process exited with code {process.returncode}"

        except subprocess.TimeoutExpired as exc:
            end_time = time.perf_counter()
            result["execution_time_ms"] = int((end_time - start_time) * 1000)
            result["status"] = "Time Limit Exceeded"
            result["error"] = f"Execution timed out after {timeout_seconds} seconds"
            # subprocess.run() raised before `process` was assigned, so any partial
            # output lives on the exception; it may be None, bytes, or str.
            for stream in ("stdout", "stderr"):
                partial = getattr(exc, stream)
                if isinstance(partial, bytes):
                    partial = partial.decode(errors="replace")
                if partial:
                    result[stream] = partial.strip()
        except Exception as e:
            result["status"] = "Internal Error"
            result["error"] = f"An unexpected error occurred: {e}"
        finally:
            if input_stream:
                input_stream.close() # Ensure input file handle is closed

    return result

if __name__ == "__main__":
    print("Code Execution Worker Demo")
    print("--------------------------")

    # Example 1: Simple print statement
    code_1 = "print('Hello, world!')"
    print("\n--- Running Example 1 (Simple Print) ---")
    res_1 = execute_python_code(code_1)
    print(json.dumps(res_1, indent=2))

    # Example 2: Code with input
    code_2 = "name = input()\nprint(f'Hello, {name}!')"
    input_2 = "Alice"
    print("\n--- Running Example 2 (With Input) ---")
    res_2 = execute_python_code(code_2, input_string=input_2)
    print(json.dumps(res_2, indent=2))

    # Example 3: Runtime Error
    code_3 = "print(1 / 0)"
    print("\n--- Running Example 3 (Runtime Error) ---")
    res_3 = execute_python_code(code_3)
    print(json.dumps(res_3, indent=2))

    # Example 4: Time Limit Exceeded
    code_4 = "while True:\n    pass"
    print("\n--- Running Example 4 (Time Limit Exceeded) ---")
    res_4 = execute_python_code(code_4, timeout_seconds=2)
    print(json.dumps(res_4, indent=2))
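
The worker above reports wall-clock time, which includes interpreter start-up and any time the process spent waiting. Judges usually also care about CPU time and peak memory. A minimal, POSIX-only sketch of how those could be read uses `resource.getrusage(resource.RUSAGE_CHILDREN)`; the helper name `run_and_measure` and the shape of the returned dictionary are assumptions for this example.

```python
import resource
import subprocess
import sys
import time


def run_and_measure(cmd, timeout_seconds=5):
    """Run cmd and report wall time, CPU time, and peak memory (POSIX only)."""
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    start = time.perf_counter()
    proc = subprocess.run(
        cmd,
        stdin=subprocess.DEVNULL,
        capture_output=True,
        text=True,
        timeout=timeout_seconds,
    )
    wall_ms = (time.perf_counter() - start) * 1000
    after = resource.getrusage(resource.RUSAGE_CHILDREN)

    # CPU counters are cumulative over all finished children, so take the difference.
    cpu_seconds = (after.ru_utime - before.ru_utime) + (after.ru_stime - before.ru_stime)
    return {
        "exit_code": proc.returncode,
        "wall_time_ms": round(wall_ms, 1),
        "cpu_time_ms": round(cpu_seconds * 1000, 1),
        # High-water mark over every child waited for so far, not only this one.
        # Units are platform-dependent: KiB on Linux, bytes on macOS.
        "max_rss": after.ru_maxrss,
    }


if __name__ == "__main__":
    print(run_and_measure([sys.executable, "-c", "print(sum(range(10**6)))"]))
```

Because `ru_maxrss` is a running maximum across children, a worker that wants per-submission memory figures would need to run each submission from a fresh helper process or use a different mechanism such as cgroups.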

Assignment: Level Up Your Worker

Now that you have a basic worker, let's make it smarter.

Task: Enhance the execute_python_code function (or create a new judge_python_code function) to support basic test case evaluation.

Detailed Steps:

Input Structure: Modify the function to accept a list of (input_data_string, expected_output_string) tuples as test_cases.
Execution Loop: For each test case in the list:

Execute the user's code_string with the input_data_string from the current test case.
Capture the stdout.
Compare the captured stdout with the expected_output_string for that test case.
Keep track of how many test cases pass and fail.

Result Reporting: The final result dictionary should include:

overall_status: "Accepted" if all test cases pass; "Time Limit Exceeded" or "Runtime Error" if any test case times out or crashes (these take precedence); otherwise "Wrong Answer" if any test case produces the wrong output.
passed_test_cases: Count of passed tests.
total_test_cases: Total number of tests.
details: A list of dictionaries, each containing results for an individual test case (e.g., {"test_id": 1, "status": "Passed", "actual_output": "...", "expected_output": "..."}).

Error Propagation: If any test case results in a "Time Limit Exceeded" or "Runtime Error", the overall_status should immediately reflect that, and subsequent test cases can be skipped (or marked as "Skipped").

Solution Hints:

Iteration: Use a for loop to go through your test_cases list.
Reusability: You already have the execute_python_code function. Call it inside your loop!
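Comparison: Simple string comparison (actual_output == expected_output) will work for exact matches. Note that execute_python_code already strips leading and trailing whitespace from stdout, so strip expected_output the same way before comparing. For more robust OJs, you'd need to handle interior whitespace, floating-point precision, etc., but let's keep it simple for now.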
Early Exit: If a critical error (TLE, RTE) occurs on any test case, you can set the overall_status and break the loop.
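
For reference, the "more robust" comparison mentioned in the hints could look roughly like the sketch below. It is not required for the assignment, and it is only one possible policy: the function name `outputs_match`, the default tolerance, and the decision to compare token by token are all assumptions made for illustration.

```python
import math


def _token_matches(actual: str, expected: str, tol: float) -> bool:
    if actual == expected:
        return True
    try:
        # Integers are compared exactly, whatever their size
        return int(actual) == int(expected)
    except ValueError:
        pass
    try:
        # Anything else that parses as a number is compared within a tolerance
        return math.isclose(float(actual), float(expected), rel_tol=tol, abs_tol=tol)
    except ValueError:
        return False


def outputs_match(actual: str, expected: str, float_tolerance: float = 1e-6) -> bool:
    """Compare outputs line by line, ignoring spacing differences within a line."""
    actual_lines = [line.split() for line in actual.strip().splitlines()]
    expected_lines = [line.split() for line in expected.strip().splitlines()]
    if len(actual_lines) != len(expected_lines):
        return False
    for a_tokens, e_tokens in zip(actual_lines, expected_lines):
        if len(a_tokens) != len(e_tokens):
            return False
        if not all(_token_matches(a, e, float_tolerance) for a, e in zip(a_tokens, e_tokens)):
            return False
    return True


if __name__ == "__main__":
    print(outputs_match("1  2 3\n", "1 2 3"))
    print(outputs_match("hello", "Hello"))
```

A policy like this treats "1.0" and "1" as equal, which suits numeric problems but may be too lenient where the exact format is part of the task.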

This assignment will solidify your understanding of how test cases are handled, a crucial part of any automated judging system. You're not just running code; you're judging it, just like LeetCode does!

Learning Objectives

✓ Understand how online judges safely execute untrusted user code
✓ Design a code execution worker with clear control, data, and state flows
✓ Run submitted code in a separate subprocess with temporary files, as a first step toward real isolation
✓ Enforce timeouts so that runaway submissions such as infinite loops are terminated
✓ Capture and analyze stdout, stderr, exit codes, and execution time
✓ Classify execution outcomes (Accepted, Runtime Error, TLE, Execution Failed)
✓ Extend a basic executor into a test-case–based judging system
✓ Connect execution workers to real-world systems like CI/CD and serverless platforms
