Devlog #12: Building for 100k concurrent — lessons so far

2022-05-01 · From the vault

We're not at 100k yet. We're building for it. This post is about what we've learned so far: bottlenecks, tradeoffs, and the kind of problems that only show up when you stress the system. No magic—just work.

Traditional scaling means sharding and instancing—chop the world into pieces, cap concurrency per piece, done. The problem: you don't get one world. You get parallel copies. We're aiming for something harder: distributed simulation. One world as a computation that can be partitioned across many cooperating machines. Think mass-scale events, dense spaces where thousands of people share the same moment, persistent logistics where the movement of people and goods is simulated as a first-class mechanic. That's the ceiling we're pushing toward.
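To make "one world partitioned across cooperating machines" concrete, here's a minimal sketch in Go of the basic idea: divide the world grid into cells and assign each cell an owning simulation node via a stable hash, so every machine agrees on ownership without a central coordinator. The names (`Cell`, `nodeForCell`) are hypothetical illustrations, not our actual stack.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// Cell identifies one fixed-size tile of the world grid.
type Cell struct{ X, Y int }

// nodeForCell maps a cell to one of n simulation nodes using a
// stable hash, so all machines compute the same owner for a cell
// without asking a coordinator. (Illustrative; a real scheme would
// also handle node joins/leaves, e.g. with consistent hashing.)
func nodeForCell(c Cell, nodes int) int {
	h := fnv.New32a()
	fmt.Fprintf(h, "%d:%d", c.X, c.Y)
	return int(h.Sum32()) % nodes
}

func main() {
	const nodes = 8
	for _, c := range []Cell{{0, 0}, {12, 40}, {-3, 7}} {
		fmt.Printf("cell %v -> node %d\n", c, nodeForCell(c, nodes))
	}
}
```

The catch is the seams: anything near a cell boundary needs state from a neighboring node, and that cross-partition traffic is exactly where the next section's problems live.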


What breaks at scale

When world state is split across machines, the central problem is consistency. An action on one side of the world can't wait for a global consensus protocol to complete before it matters; it has to be visible locally and then reconciled. We're learning to build mechanics that stay fair and legible under partial consistency. Some inconsistency has to be tolerated; the design challenge is making it invisible to players. The bottlenecks so far: synchronization overhead between partitions, a wider exploit surface once authoritative state lives on more than one machine, and the simple fact that our current stack wasn't built for this. We're refactoring.
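Here's a minimal sketch of that "visible locally, reconciled later" pattern, in the same hypothetical Go style as above. Each replica applies actions immediately and bumps a per-entity version; when the authoritative owner's update arrives, last-writer-wins by version resolves the conflict. That conflict rule is a deliberate simplification for illustration; real mechanics would need per-mechanic rules.

```go
package main

import "fmt"

// Entity state with a version counter. A node applies actions to its
// local replica immediately, then reconciles when the authoritative
// owner's update arrives.
type Entity struct {
	ID      string
	X, Y    float64
	Version uint64
}

type Replica struct{ entities map[string]Entity }

// ApplyLocal makes an action visible immediately on this node,
// bumping the local version so a stale remote echo can't revert it.
func (r *Replica) ApplyLocal(id string, dx, dy float64) {
	e := r.entities[id]
	e.ID = id
	e.X += dx
	e.Y += dy
	e.Version++
	r.entities[id] = e
}

// Reconcile folds in an authoritative update from the owning node,
// keeping whichever state carries the higher version.
func (r *Replica) Reconcile(auth Entity) {
	if local, ok := r.entities[auth.ID]; !ok || auth.Version > local.Version {
		r.entities[auth.ID] = auth
	}
}

func main() {
	r := &Replica{entities: map[string]Entity{}}
	r.ApplyLocal("player-1", 1, 0) // visible on this node right away
	r.Reconcile(Entity{ID: "player-1", X: 5, Y: 2, Version: 7}) // owner wins
	fmt.Printf("%+v\n", r.entities["player-1"])
}
```

The window between ApplyLocal and Reconcile is the inconsistency players shouldn't notice; the design work is keeping what happens in that window fair.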


Why 100k matters

100k concurrent in one coherent world isn't a vanity number. It's the threshold where a multiplayer world starts to resemble a city-scale simulation platform, the kind of system that can host not just play but training, events, and real coordination. We're not there yet. The lessons so far all point the same way: no magic, just work. More devlogs as we go.
