Day 3: Wait Faster: The Loom Philosophy

Lesson 3 · 60 min

Component Architecture

Diagram: a client reaches the Spring Boot microservice through a load balancer; inside the JVM, a small pool of platform threads (P-T1 to P-T3) carries many virtual threads (V-T1 to V-T6) that perform external I/O.

Welcome back, fellow architects and engineers. In Day 2, we demystified the crucial difference between Virtual Threads and Platform Threads, setting the stage for a fundamental shift in how we build concurrent applications. Today, we're diving deeper into the philosophy behind Project Loom – not just what it is, but why it's a game-changer for our microservices, especially when scaling to the astronomical demands of 100 million requests per second.

The title, "Wait Faster," might sound like a paradox. How do you wait faster? The trick isn't to eliminate waiting, but to make the time spent waiting incredibly efficient, unlocking unprecedented throughput without the usual architectural gymnastics.

Agenda for Today:

Flowchart: request arrives → Spring Boot controller → virtual thread (V-T) assigned → if no I/O, continue; on I/O, the V-T parks and its platform thread is freed → when the I/O completes, the V-T resumes → return response.
  1. Revisiting the Bottleneck: Why traditional concurrency struggles with I/O.

  2. The Loom Philosophy Unpacked: How it redefines "blocking" code for hyperscale.

  3. Impact on Microservice Design: Simplification, scalability, and cost efficiency.

  4. Hands-on: Embracing Virtual Threads in Spring Boot.

  5. Assignment: Deepening your understanding.

Why Waiting is the Real Bottleneck (Revisited)

Think about your typical microservice. It receives a request, perhaps fetches data from a database, calls another internal service, or integrates with a third-party API. What's happening most of the time? The CPU is often idle, waiting for data to arrive from the network or disk.

In the old world, each incoming request often got its own heavy platform thread. When that thread hit an I/O operation (like waiting for a database query to return), it would block. The operating system would then context-switch to another thread. While this works, platform threads are a finite, expensive resource. They consume significant memory (megabytes per thread) and context switching is not free. At scale, say, handling hundreds of thousands or millions of concurrent connections, this model collapses under its own weight, leading to thread exhaustion, high memory usage, and poor throughput.
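To make that ceiling concrete, here is a small, self-contained sketch (plain JDK, no Spring; the class name, pool size, and timings are illustrative): a fixed pool of four platform threads serving eight "requests" that each block 200 ms on simulated I/O. Wall time is bounded by the pool size, not by available CPU.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class PlatformPoolBottleneck {

    // Submits `requests` tasks that each block `ioMillis` on simulated I/O
    // to a fixed pool of `poolSize` platform threads; returns wall time in ms.
    static long run(int poolSize, int requests, long ioMillis) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        long start = System.nanoTime();
        for (int i = 0; i < requests; i++) {
            pool.submit(() -> {
                try {
                    // The platform thread is pinned here, doing no useful work.
                    TimeUnit.MILLISECONDS.sleep(ioMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(30, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        // 8 requests / 4 threads * 200 ms of I/O => roughly 400 ms of wall time:
        // throughput is capped by poolSize / latency, regardless of CPU headroom.
        System.out.println("Elapsed: " + run(4, 8, 200) + " ms");
    }
}
```

Doubling the request count doubles the wall time; the only traditional fix is a bigger (and more expensive) pool.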

This challenge led to the rise of reactive programming frameworks like Spring WebFlux, which explicitly embrace asynchronous, non-blocking I/O. They solved the scalability problem by avoiding blocking calls, but at a cost: increased cognitive load, complex callback chains, and a steep learning curve.

The Loom Philosophy: Simplicity at Hyperscale

This is where Project Loom offers a profound shift. The "Loom Philosophy" can be distilled into this: Write simple, synchronous-looking, blocking code, and let the JVM handle the non-blocking magic under the hood.

  1. Abstraction of Waiting: Virtual threads are cheap, almost infinitely available, and managed by the JVM. When a virtual thread encounters a blocking I/O operation, it doesn't block an expensive platform thread. Instead, the virtual thread parks itself, and the underlying platform thread is immediately freed up to execute another virtual thread. When the I/O operation completes, the virtual thread is unparked and resumed, often on a different platform thread. This means your application's platform threads are almost always doing useful work, not waiting.

  2. Cognitive Load Reduction: This is perhaps the most underrated benefit. Developers can write straightforward, sequential code that's easy to read, debug, and reason about. No more Mono or Flux transformations just to achieve scalability for I/O. It brings back the simplicity of the thread-per-request model but with the scalability benefits of non-blocking I/O.

  3. Cost-Efficiency: By maximizing the utilization of platform threads and reducing memory overhead per concurrent task, your services can handle significantly more concurrent requests on the same hardware. This directly translates to lower cloud infrastructure costs, a critical factor when operating at the scale of 100 million RPS.
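The three points above fit in a few lines of plain JDK 21 code (no Spring; the class name, task count, and sleep duration are arbitrary): ten thousand concurrent tasks, each parking for 100 ms, complete in roughly the latency of a single task, carried by only a handful of platform threads.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class ManyVirtualThreads {

    // Runs `tasks` concurrent jobs, each parking ~100 ms, one virtual thread per task.
    static int run(int tasks) {
        AtomicInteger done = new AtomicInteger();
        // No pool sizing, no tuning: one cheap virtual thread per task.
        try (ExecutorService exec = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < tasks; i++) {
                exec.submit(() -> {
                    // Parks the virtual thread; its carrier platform thread is released.
                    TimeUnit.MILLISECONDS.sleep(100);
                    done.incrementAndGet();
                    return null;
                });
            }
        } // close() waits for all submitted tasks to finish
        return done.get();
    }

    public static void main(String[] args) {
        System.out.println("Completed: " + run(10_000));
    }
}
```

The code reads like the naive blocking version; the JVM supplies the multiplexing.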

How it Fits into Our Hyperscale System

Imagine our payment processing microservice. It talks to multiple external APIs (fraud detection, bank gateways), and a database. Each of these interactions involves network latency – waiting. With the Loom philosophy, each incoming payment request can be processed by a virtual thread. When it calls the fraud detection API, the virtual thread parks, freeing up its underlying platform thread to handle another incoming payment. This cycle repeats, allowing a small pool of platform threads to orchestrate thousands, even millions, of concurrent payment flows efficiently.

Core Concept: The system design concept at play here is Structured Concurrency combined with Efficient Resource Multiplexing. Virtual threads enable structured concurrency by allowing developers to express concurrent tasks naturally, while the JVM handles the low-level multiplexing of these tasks onto a smaller number of platform threads, optimizing resource utilization.

Control Flow: A request comes in, a virtual thread is assigned. This thread executes business logic. When it hits an I/O boundary (e.g., calling an external service), the virtual thread yields its platform thread. Another virtual thread takes over. When the I/O completes, the original virtual thread resumes.

Data Flow: Data flows through the virtual thread's execution context, just as it would with a platform thread. The key is that the state of the waiting virtual thread is preserved and restored seamlessly.

State Changes: The primary state change for a virtual thread is between RUNNABLE and PARKED. When PARKED, it's not consuming a platform thread, allowing the system to handle more concurrent operations without increasing resource footprint.
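You can observe the parked state directly with the standard Thread API. This is a minimal sketch; the sleep durations are arbitrary, and exactly which Thread.State a parked virtual thread reports (WAITING vs. TIMED_WAITING) is a JDK implementation detail.

```java
public class ParkedState {

    // Starts a virtual thread that sleeps, and samples its state mid-sleep.
    static Thread.State stateWhileSleeping() {
        Thread vt = Thread.ofVirtual().start(() -> {
            try {
                Thread.sleep(500); // blocking call: the virtual thread parks
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        try {
            Thread.sleep(100); // give it time to reach the sleep
            Thread.State s = vt.getState(); // parked => WAITING / TIMED_WAITING
            vt.join();
            return s;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return vt.getState();
        }
    }

    public static void main(String[] args) {
        System.out.println("State while sleeping: " + stateWhileSleeping());
    }
}
```

While in that state the virtual thread occupies no carrier thread, only a small heap footprint.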

Hands-on: Embracing Virtual Threads in Spring Boot

State Machine: Start → RUNNABLE; on an I/O call the virtual thread moves to PARKED (its platform thread becomes available to others); when the I/O completes it returns to RUNNABLE, finishes, and ends.

Let's build a simple Spring Boot microservice that simulates an I/O-bound operation. We'll enable Virtual Threads and see how straightforward it is.

Component Architecture:
Our service will be a DataProcessingService that exposes a REST endpoint. This endpoint will simulate a call to an external, slow service.

```java
// src/main/java/com/example/loomdemo/DataProcessingService.java
package com.example.loomdemo;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.stereotype.Service;

import java.util.concurrent.TimeUnit;

@Service
public class DataProcessingService {

    private static final Logger log = LoggerFactory.getLogger(DataProcessingService.class);

    public String processData(String input) {
        log.info("Processing data for input: {} on thread: {}", input, Thread.currentThread().getName());
        try {
            // Simulate a slow external API call or database operation (I/O bound)
            TimeUnit.SECONDS.sleep(2); // This will park the virtual thread
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            log.error("Data processing interrupted for input: {}", input, e);
            return "Error: Interrupted";
        }
        log.info("Finished processing data for input: {} on thread: {}", input, Thread.currentThread().getName());
        return "Processed: " + input + " (via Virtual Thread)";
    }
}
```
```java
// src/main/java/com/example/loomdemo/DataProcessingController.java
package com.example.loomdemo;

import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class DataProcessingController {

    private final DataProcessingService dataProcessingService;

    public DataProcessingController(DataProcessingService dataProcessingService) {
        this.dataProcessingService = dataProcessingService;
    }

    @GetMapping("/api/process")
    public String process(@RequestParam String data) {
        return dataProcessingService.processData(data);
    }
}
```
```java
// src/main/java/com/example/loomdemo/LoomDemoApplication.java
package com.example.loomdemo;

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;

@SpringBootApplication
public class LoomDemoApplication {

    public static void main(String[] args) {
        SpringApplication.run(LoomDemoApplication.class, args);
    }
}
```

Enabling Virtual Threads:
The magic happens in src/main/resources/application.properties:

```properties
spring.threads.virtual.enabled=true
server.port=8080
```

That's it. With spring.threads.virtual.enabled=true (Spring Boot 3.2+), Spring Boot runs servlet request handling on virtual threads and backs its application task executor with them as well. When our TimeUnit.SECONDS.sleep(2) is called, the virtual thread will park, and the underlying platform thread will be free to serve other requests.

Notice the Thread.currentThread().getName() in the logs. When you run this, the request-handling threads no longer carry the familiar http-nio-8080-exec-N names; they are virtual threads (in current Spring Boot versions typically named with a tomcat-handler- prefix, though the exact naming is an implementation detail). This is your confirmation that virtual threads are active.

Assignment

Your mission, should you choose to accept it, is to deepen your understanding of the Loom philosophy through practical application:

  1. Expand the Service: Add a second endpoint, /api/aggregate, which calls the existing /api/process twice concurrently using CompletableFuture.supplyAsync (and join()) before returning an aggregated result. Observe the thread names in the logs. How does supplyAsync behave with virtual threads enabled?

  2. Introduce an External Call: Instead of TimeUnit.SECONDS.sleep(2), simulate a real external HTTP call using RestTemplate or WebClient to a dummy endpoint (you can use a public test API like https://httpbin.org/delay/2). RestTemplate needs no special configuration: it is a blocking client, so the call simply parks the virtual thread it runs on.

  3. Observe Resource Usage (Conceptual): While we don't have a load generator set up, reflect on how this design conceptually reduces resource overhead compared to a traditional blocking-thread model if you were to simulate 10,000 concurrent requests. What would be the difference in active platform threads vs. active virtual threads?

Solution Hints

  1. For aggregate endpoint:

  • Create a new method in DataProcessingService called aggregateData(String input1, String input2).

  • Inside this method, use CompletableFuture.supplyAsync for each call to processData. Note that supplyAsync defaults to ForkJoinPool.commonPool(), which uses platform threads; to run the branches on virtual threads, pass an executor such as Executors.newVirtualThreadPerTaskExecutor() as the second argument.

  • CompletableFuture<String> future1 = CompletableFuture.supplyAsync(() -> processData(input1));

  • CompletableFuture<String> future2 = CompletableFuture.supplyAsync(() -> processData(input2));

  • Use future1.join() and future2.join() to wait for results, then combine them.

  • Create a new @GetMapping("/api/aggregate") in DataProcessingController that uses this service method.
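The steps above can be sketched as a standalone class (plain JDK stand-ins for the Spring beans: processData here just sleeps to mimic the service call, and the explicit executor argument is the key difference from the commonPool default):

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class AggregateSketch {

    // Stand-in for DataProcessingService.processData: sleeps to mimic an I/O-bound call.
    static String processData(String input) {
        try {
            TimeUnit.MILLISECONDS.sleep(200);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "Processed: " + input;
    }

    static String aggregateData(String input1, String input2) {
        // supplyAsync defaults to ForkJoinPool.commonPool() (platform threads);
        // passing a virtual-thread executor lets each branch park cheaply.
        try (ExecutorService exec = Executors.newVirtualThreadPerTaskExecutor()) {
            CompletableFuture<String> f1 = CompletableFuture.supplyAsync(() -> processData(input1), exec);
            CompletableFuture<String> f2 = CompletableFuture.supplyAsync(() -> processData(input2), exec);
            return f1.join() + " | " + f2.join(); // both branches run concurrently
        }
    }

    public static void main(String[] args) {
        System.out.println(aggregateData("a", "b"));
    }
}
```

Because the two branches overlap, the aggregate takes roughly one call's latency rather than two.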

  2. For External Call:

  • Add the spring-boot-starter-web dependency (for RestTemplate) or spring-boot-starter-webflux (for WebClient).

  • Inject RestTemplate or WebClient into DataProcessingService.

  • Replace TimeUnit.SECONDS.sleep(2) with restTemplate.getForObject("https://httpbin.org/delay/2", String.class); or webClient.get().uri("https://httpbin.org/delay/2").retrieve().bodyToMono(String.class).block();. WebClient is inherently reactive, but calling block() on a virtual thread is fine: it parks the virtual thread rather than tying up a platform thread.

  3. For Resource Usage:

  • With virtual threads, you'd see far more distinct virtual-thread entries in the logs than underlying platform (carrier) threads. The carrier-thread count would stay low, roughly matching your CPU core count, even with thousands of virtual threads actively processing or waiting. This is the core benefit: multiplexing.
