
Scaling Distributed Systems: Patterns and Anti-Patterns


When you transition from a monolithic architecture to distributed microservices, the complexity of your system shifts from the code to the network.

Suddenly, network partitions, cascading failures, and resource exhaustion become everyday realities. In this post, we’ll explore how modern distributed systems handle massive, unpredictable traffic spikes without collapsing.

The Thundering Herd Problem

Imagine you launch a highly anticipated feature. A million users hit your API at exactly 9:00 AM.

In a poorly architected system, your API Gateway blindly forwards all these requests to your backend services. The services attempt to process them simultaneously, run out of memory, crash, and restart—only to immediately crash again due to the backlog of pending requests. This is the Thundering Herd.

Interactive Load Simulation

Let’s visualize this with a model of a standard 3-node microservice cluster behind an API Gateway.

[Simulation: users/clients send requests through an API Gateway / Load Balancer to three backend services, each starting under light load — Service A: 10%, Service B: 12%, Service C: 9%.]

As the simulation shows, without proper protection mechanisms, nodes quickly hit 100% capacity and enter an overload state. When one node fails, the load balancer reroutes its traffic to the remaining healthy nodes, which immediately overload and fail in turn. This is known as a cascading failure.

Defense Mechanisms

How do we prevent this? We use three primary architectural patterns.

1. Circuit Breakers

A circuit breaker wraps a vulnerable network call. If the failure rate of that call exceeds a certain threshold (e.g., 50% of requests fail within 10 seconds), the circuit breaker “trips” and opens.

Once open, all subsequent requests immediately fail fast, returning a cached or fallback response. This gives the struggling downstream service time to recover without being hammered by retries. After a cooldown period, the breaker lets a trial request through (the half-open state); if it succeeds, the circuit closes again.
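To make the mechanics concrete, here is a minimal, library-free sketch of the pattern. The class and method names are hypothetical; production libraries such as Resilience4j track failure rates over sliding time windows rather than counting consecutive failures as this sketch does:

```java
import java.util.function.Supplier;

// Minimal circuit breaker sketch (illustrative only). Trips to OPEN after
// `failureThreshold` consecutive failures, then fails fast (returning the
// fallback) until `resetTimeoutMillis` elapses, after which a trial call
// is allowed through again.
public class CircuitBreaker {
    enum State { CLOSED, OPEN }

    private final int failureThreshold;
    private final long resetTimeoutMillis;
    private int consecutiveFailures = 0;
    private long openedAt = 0;
    private State state = State.CLOSED;

    public CircuitBreaker(int failureThreshold, long resetTimeoutMillis) {
        this.failureThreshold = failureThreshold;
        this.resetTimeoutMillis = resetTimeoutMillis;
    }

    public <T> T call(Supplier<T> remoteCall, Supplier<T> fallback) {
        if (state == State.OPEN) {
            if (System.currentTimeMillis() - openedAt < resetTimeoutMillis) {
                return fallback.get();   // fail fast: don't touch the network
            }
            state = State.CLOSED;        // cooldown elapsed: allow a trial call
            consecutiveFailures = 0;
        }
        try {
            T result = remoteCall.get();
            consecutiveFailures = 0;     // success resets the failure count
            return result;
        } catch (RuntimeException e) {
            if (++consecutiveFailures >= failureThreshold) {
                state = State.OPEN;      // trip the breaker
                openedAt = System.currentTimeMillis();
            }
            return fallback.get();
        }
    }

    public State state() { return state; }
}
```

With a threshold of 3, three failed calls trip the breaker; every call after that returns the fallback immediately until the reset timeout passes.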

2. Backpressure

In a system utilizing backpressure, downstream services actively communicate their load to upstream services. If Service B is overwhelmed, it signals Service A to slow down its request rate.

// A Project Reactor (the library underlying Spring WebFlux) example
// demonstrating backpressure. `Data` and expensiveProcessing are
// application-specific placeholders.
import reactor.core.publisher.BufferOverflowStrategy;
import reactor.core.publisher.Flux;

public Flux<Data> processStream(Flux<Data> inbound) {
    return inbound
        // Buffer at most 100 pending elements; when full, discard the oldest.
        .onBackpressureBuffer(100, BufferOverflowStrategy.DROP_OLDEST)
        .flatMap(data -> expensiveProcessing(data));
}

By dropping older requests or refusing to accept new ones until the queue clears, the service ensures it never runs out of memory.

3. Load Shedding

Load shedding is the ultimate fail-safe. If the API Gateway detects that the backend is reaching critical capacity, it begins selectively dropping requests before they ever reach the internal network.

Usually, non-critical endpoints (like syncing profile pictures) are shed first, ensuring that core functionalities (like processing payments) remain available.
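A priority-aware admission check at the gateway can sketch this idea. The class, limits, and priority names below are hypothetical, chosen only to illustrate the two-tier policy described above: below a soft limit everything is admitted, between the soft and hard limits only critical traffic gets through, and at the hard limit everything is shed:

```java
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative load-shedding sketch. The gateway admits every request while
// in-flight count is below `softLimit`, sheds NORMAL traffic between
// `softLimit` and `hardLimit`, and sheds everything at `hardLimit`.
public class LoadSheddingGateway {
    enum Priority { CRITICAL, NORMAL }

    private final int softLimit;   // above this, shed non-critical traffic
    private final int hardLimit;   // above this, shed everything
    private final AtomicInteger inFlight = new AtomicInteger();

    public LoadSheddingGateway(int softLimit, int hardLimit) {
        this.softLimit = softLimit;
        this.hardLimit = hardLimit;
    }

    /** Returns true if admitted; the caller must invoke release() when done. */
    public boolean tryAdmit(Priority priority) {
        int current = inFlight.get();
        if (current >= hardLimit) return false;  // critical capacity: shed all
        if (current >= softLimit && priority != Priority.CRITICAL) {
            return false;                        // shed non-critical traffic first
        }
        inFlight.incrementAndGet();
        return true;
    }

    public void release() { inFlight.decrementAndGet(); }
}
```

The key design choice is that shedding happens before any backend work is scheduled, so a rejected request costs the cluster almost nothing.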

Conclusion

Building resilient distributed systems is about accepting that failure is inevitable. By implementing circuit breakers, backpressure, and load shedding, you design systems that bend under extreme pressure rather than break.