The Strangler Fig Pattern: Using Cloud Load Balancing to Route Traffic

August 15, 2021

Every legacy migration I've worked on starts with the same conversation: someone wants to freeze features for six months and rewrite the monolith in Go on Kubernetes. It never works. The market moves, the rewrite slips, and the team is back to patching the original system anyway.

The pattern that does work is the strangler fig: keep the monolith running, peel off one piece of functionality at a time, and route traffic to the new implementation as each piece is ready. On GCP, the Global External HTTP(S) Load Balancer (GCLB from here on) is a surprisingly good tool for this. Below is how we're using it on a current project.

What the strangler fig actually buys you

The name comes from the fig tree that grows around a host tree until the host dies and only the fig is left. The cloud version is the same idea: the monolith stays in place, the new services grow alongside it, and traffic gradually moves over until the monolith is small enough to retire (or ignore).

The lever that makes this work without users noticing is the load balancer. You change where requests go, not what the URL looks like.

The architecture

We aren't touching the monolith's code. We're putting a routing layer in front of it.

                ┌──────────────────┐
                │  User / Client   │
                └────────┬─────────┘
                         │ HTTPS
                         ▼
              ┌──────────────────────┐
              │  Global HTTP/S LB    │
              └──────────┬───────────┘
                         │ URL Map
                         ▼
                  ┌─────────────┐
                  │ Path Matcher│
                  └──┬───────┬──┘
       /api/v1/cart  │       │ /*  (default)
                     ▼       ▼
        ┌──────────────────┐ ┌──────────────────────┐
        │   Modern World   │ │    Legacy World      │
        │ ┌──────────────┐ │ │ ┌──────────────────┐ │
        │ │ NEG: GKE     │ │ │ │ MIG: Monolith    │ │
        │ │ Cart Service │ │ │ │ VMs              │ │
        │ └──────────────┘ │ │ └──────────────────┘ │
        └──────────────────┘ └──────────────────────┘

The flow is straightforward:

  1. All traffic hits a single GCLB anycast IP.
  2. The URL map decides where the request goes. Anything matching /api/v1/cart goes to the GKE cluster (via a Network Endpoint Group); everything else falls through to the instance group running the monolith.
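
To make the routing decision concrete, here is a minimal Python model of it. This is a sketch of the matching behaviour, not how the load balancer is implemented; the backend names (cart-neg-backend, monolith-mig-backend) are placeholders I made up for illustration. It assumes the usual path-rule semantics: a rule without a wildcard matches only that exact path, a rule ending in /* matches anything beneath it, and the longest match wins.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class PathRule:
    pattern: str  # an exact path, or a prefix ending in "/*"
    backend: str  # hypothetical backend service name


def pick_backend(path: str, rules: list[PathRule], default: str) -> str:
    """Return the backend for a request path.

    Exact patterns must equal the whole path; "/*" patterns match any path
    under the prefix. The longest matching pattern wins, and anything
    unmatched falls through to the default backend.
    """
    best = None
    for rule in rules:
        if rule.pattern.endswith("/*"):
            prefix = rule.pattern[:-1]  # drop "*", keep the trailing "/"
            matched = path.startswith(prefix)
        else:
            matched = path == rule.pattern
        if matched and (best is None or len(rule.pattern) > len(best.pattern)):
            best = rule
    return best.backend if best else default


# Placeholder names: the cart slice goes to the NEG, the rest to the MIG.
RULES = [
    PathRule("/api/v1/cart", "cart-neg-backend"),
    PathRule("/api/v1/cart/*", "cart-neg-backend"),
]


def route(path: str) -> str:
    return pick_backend(path, RULES, default="monolith-mig-backend")
```

Note that a path like /api/v1/cartoons still lands on the monolith here, because neither the exact rule nor the /* rule covers it.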

Implementation

Take the cart functionality of an e-commerce app as the slice we're carving off.

1. Stand up the new service. Build "Cart" as a microservice. For this project it's Spring Boot on GKE; Cloud Run would also be reasonable. Deploy it, expose it internally, verify it works in isolation.

2. Configure two backend services. In the GCP console (or in Terraform):

  • Backend A (legacy): the Managed Instance Group containing the monolith VMs.
  • Backend B (modern): a Network Endpoint Group pointed at the GKE service for the cart.

If you haven't switched to NEGs for container-native load balancing yet, this is the moment to do it. NEGs skip the kube-proxy hop and send traffic directly to the pod IP, which lowers latency and removes a moving part.
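
For reference, a standalone NEG is requested through the cloud.google.com/neg annotation on the Kubernetes Service. The sketch below builds such a manifest in Python so the annotation's JSON value is easy to get right. The service name, selector label, and port are assumptions for illustration, and the cluster is assumed to be VPC-native.

```python
import json


def cart_service_manifest(port: int = 8080) -> dict:
    """Build a Service manifest that asks GKE to create a standalone NEG.

    The name "cart", the "app: cart" selector, and the port are
    placeholders; adjust them to whatever the real deployment uses.
    """
    # The annotation value is itself a JSON string keyed by service port.
    neg_annotation = json.dumps({"exposed_ports": {str(port): {}}})
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {
            "name": "cart",
            "annotations": {"cloud.google.com/neg": neg_annotation},
        },
        "spec": {
            "type": "ClusterIP",
            "selector": {"app": "cart"},
            "ports": [{"port": port, "targetPort": port, "protocol": "TCP"}],
        },
    }


if __name__ == "__main__":
    # kubectl accepts JSON manifests, so this output can be piped to
    # `kubectl apply -f -`.
    print(json.dumps(cart_service_manifest(), indent=2))
```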

3. Update the URL map. Add path rules for /api/v1/cart and /api/v1/cart/* pointing to backend B (a rule without the /* wildcard matches only that exact path). DNS doesn't change. The user's URL doesn't change. They browse products on the monolith, add to cart on the microservice, and never notice the handoff.
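
As a rough sketch of what the resulting URL map looks like, the Python below builds the resource as a plain dict, using the /api/v1/cart path from the diagram. The field names (defaultService, hostRules, pathMatchers, pathRules) follow the Compute Engine urlMaps resource; the project ID, URL map name, and backend service names are made up for illustration. Since JSON is valid YAML, the printed output should be usable with gcloud compute url-maps import, but treat that as something to verify in a test project first.

```python
import json

API_ROOT = "https://www.googleapis.com/compute/v1"


def backend_url(project: str, name: str) -> str:
    """Full resource URL of a global backend service."""
    return f"{API_ROOT}/projects/{project}/global/backendServices/{name}"


def build_url_map(project: str, legacy: str, modern: str) -> dict:
    """URL map that sends the cart paths to `modern`, all else to `legacy`."""
    legacy_url = backend_url(project, legacy)
    modern_url = backend_url(project, modern)
    return {
        "name": "storefront-url-map",  # hypothetical name
        "defaultService": legacy_url,
        "hostRules": [{"hosts": ["*"], "pathMatcher": "storefront"}],
        "pathMatchers": [
            {
                "name": "storefront",
                "defaultService": legacy_url,  # the "/*" fall-through
                "pathRules": [
                    {
                        "paths": ["/api/v1/cart", "/api/v1/cart/*"],
                        "service": modern_url,
                    }
                ],
            }
        ],
    }


if __name__ == "__main__":
    # All three identifiers below are placeholders.
    url_map = build_url_map("my-project", "monolith-backend", "cart-backend")
    print(json.dumps(url_map, indent=2))
```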

Why GCLB instead of Nginx or HAProxy

You can absolutely do this with Nginx. The reasons I keep reaching for GCLB on GCP migrations:

  • Single global anycast IP. No matter what happens behind it, the entry point doesn't change.
  • Mixed backends. The same load balancer can front VMs, GKE pods, and Cloud Run. As you migrate from one to the other, you don't touch the front door.
  • Fast rollback. If the new cart service breaks under load, reverting the URL map is one operation. Traffic flows back to the monolith as soon as the change propagates, typically within minutes and without redeploying anything. That's the whole reason to do this incrementally.
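
The rollback point is easiest to see as a data change. The sketch below, under the assumption that the URL map is held as a dict shaped like the Compute Engine urlMaps resource, removes every path rule that targets a given backend so those paths fall through to the default again. The sample map and the backend names in it are hypothetical.

```python
import copy


def rollback_service(url_map: dict, service: str) -> dict:
    """Return a copy of url_map with all path rules for `service` removed.

    Paths that were routed to `service` then fall through to each path
    matcher's defaultService. The input map is left untouched.
    """
    rolled_back = copy.deepcopy(url_map)
    for matcher in rolled_back.get("pathMatchers", []):
        matcher["pathRules"] = [
            rule
            for rule in matcher.get("pathRules", [])
            if rule.get("service") != service
        ]
    return rolled_back


# Hypothetical map: cart paths on the new backend, the rest on the monolith.
SAMPLE_URL_MAP = {
    "name": "storefront-url-map",
    "defaultService": "monolith-backend",
    "pathMatchers": [
        {
            "name": "storefront",
            "defaultService": "monolith-backend",
            "pathRules": [
                {
                    "paths": ["/api/v1/cart", "/api/v1/cart/*"],
                    "service": "cart-backend",
                }
            ],
        }
    ],
}
```

Keeping the previous URL map in version control gets you the same result; the point is that the revert is a config change, not a deploy.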

What goes wrong

A few things bit me on this project that are worth flagging.

Sticky sessions. The monolith probably uses JSESSIONID and keeps session state in process memory. The new microservice is stateless and uses JWTs or Redis. The user logs in on the monolith, clicks "Cart," lands on the microservice, and gets bounced back to login. Fix: move to a shared session store (Memorystore for Redis) or migrate to token-based auth before you start splitting traffic.
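
To illustrate the token-based option: if both the monolith and the cart service can verify the same signed token, it no longer matters which backend the load balancer picks. The sketch below hand-rolls an HS256-signed, JWT-shaped token with only the Python standard library so the mechanics are visible. The claim names and the shared-secret setup are assumptions, and a real service should use a maintained JWT library and proper key management rather than this code.

```python
from __future__ import annotations

import base64
import hashlib
import hmac
import json
import time


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str, secret: bytes) -> str:
    digest = hmac.new(secret, signing_input.encode(), hashlib.sha256).digest()
    return _b64url(digest)


def issue_token(user_id: str, secret: bytes, ttl_seconds: int = 900,
                now: float | None = None) -> str:
    """Issued at login, e.g. by the monolith."""
    issued = int(time.time() if now is None else now)
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    claims = {"sub": user_id, "exp": issued + ttl_seconds}
    payload = _b64url(json.dumps(claims).encode())
    return f"{header}.{payload}.{_sign(f'{header}.{payload}', secret)}"


def verify_token(token: str, secret: bytes,
                 now: float | None = None) -> str | None:
    """Run by either backend; returns the user id, or None if invalid."""
    parts = token.split(".")
    if len(parts) != 3:
        return None
    header, payload, signature = parts
    expected = _sign(f"{header}.{payload}", secret)
    # Constant-time comparison on bytes to avoid timing leaks.
    if not hmac.compare_digest(expected.encode(), signature.encode()):
        return None
    claims = json.loads(_b64url_decode(payload))
    current = time.time() if now is None else now
    if claims.get("exp", 0) <= current:
        return None  # expired
    return claims.get("sub")
```

With this shape, the user logs in once, the token travels in a cookie or header, and the cart service accepts it without ever seeing the monolith's in-memory session.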

The shared database. Splitting code doesn't split data. The new cart service connecting to the same Oracle/MySQL as the monolith is fine for day one, but it's not a long-term answer: the monolith will hold locks on tables the cart service needs. Plan the data split as a separate workstream.

Path rewrites. The monolith expects /app/cart. The container expects /. GCLB supports URL rewriting in the URL map (a urlRewrite action with pathPrefixRewrite on the matching rule); configure it or you'll be debugging 404s.
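
Before touching the load balancer config, it helps to write down exactly what the rewrite should do to each path. The function below is a small Python model of the intended mapping, assuming /app/cart is the public prefix and / is what the container serves. It describes the desired outcome, not the load balancer's internal behaviour, so check the real thing against a few of these cases.

```python
def rewrite_path(path: str, prefix: str = "/app/cart",
                 replacement: str = "/") -> str:
    """Map a public path onto the path the cart container expects.

    Paths outside the prefix are returned unchanged.
    """
    if path == prefix:
        return replacement
    if path.startswith(prefix + "/"):
        # Swap the matched prefix for the replacement, keep the remainder.
        return replacement.rstrip("/") + path[len(prefix):]
    return path
```

For example, /app/cart/items/42 should reach the container as /items/42, while /app/cartoons is not under the prefix and passes through untouched.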

Wrap up

The strangler fig is a risk-management pattern more than an architecture pattern. Slicing traffic by path turns a "big bang" migration into a series of small, reversible changes. In ops, small and reversible is the whole game.

Up next, in part 2: containerizing the legacy Java side. We dropped Dockerfiles in favor of Jib + Cloud Build, and the build times stopped being a daily annoyance.