Comment by closeparen

16 hours ago

> Modules cannot call each other, except through specific interfaces (for our Python monolith, we put those in a file called some_module/api.py, so other modules can do from some_module.api import some_function, SomeClass and call things that way).

This is a solution to a large chunk of what people want out of microservices. There are just two problems, both of which feel tractable to a language/runtime that really wanted to solve them:

1. If the code implementing the module API is private, it must all be colocated in one package. If it is public, then anyone can import it, breaking the module boundary. You need a visibility system that can say "this subtree of packages can cooperate with each other, but code outside the subtree can only use the top-level package."
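Lacking language support, a visibility rule like this can be approximated as an ad-hoc lint step. Here's a minimal Python sketch (the module names, like billing, are invented for illustration; real tools do this more completely):

```python
import ast

# Hypothetical layout:
#   billing/api.py          <- the public surface
#   billing/internal/db.py  <- private implementation
# Rule: outside the "billing" subtree, only "billing.api" may be imported.

def violations(source: str, importing_module: str, boundary: str = "billing") -> list:
    """Return imports in `source` that reach past the boundary's public api module."""
    inside = importing_module == boundary or importing_module.startswith(boundary + ".")
    bad = []
    for node in ast.walk(ast.parse(source)):
        targets = []
        if isinstance(node, ast.Import):
            targets = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module:
            targets = [node.module]
        for t in targets:
            crosses = t == boundary or t.startswith(boundary + ".")
            if crosses and not inside and t != boundary + ".api":
                bad.append(t)
    return bad

# A module outside the subtree importing internals gets flagged:
print(violations("from billing.internal.db import Txn", "orders.checkout"))
# -> ['billing.internal.db']
# The sanctioned path is allowed:
print(violations("from billing.api import charge", "orders.checkout"))
# -> []
```

Modules inside the subtree (importing_module starting with billing.) can still cooperate freely, which is the "subtree" semantics described above.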

2. If a change in module A has a problem, you must roll back the entire monolith, preventing a good change in module B from reaching users. You need a way to change the deployed version of different modules independently. Short of microservices, they could be separate processes doing some kind of IPC, or you need a runtime with hot reloading (and a reasonably careful backwards-compatibility story).

Though even the solution to 1 doesn't solve "ACLs", which distribution does. If you want to ensure your module is only called by an approved list of upstream modules, public/private isn't granular enough. (You can solve it with attributes or some build tools, but it's ad hoc and complex, doesn't integrate with the editor, etc. I've always thought something more granular and configurable should have been built into Java/OOP theory from the start.)
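The ACL case extends the same kind of check with an explicit allowlist of approved upstream modules. A sketch under the same assumptions (all module names hypothetical):

```python
# Hypothetical ACL: which top-level modules may import each guarded module.
# public/private is binary; an allowlist names the approved callers.
ACLS = {
    "billing.api": {"orders", "refunds"},  # approved upstream modules
}

def acl_violations(imports: list, importing_module: str) -> list:
    """Flag imports of guarded modules coming from non-approved callers."""
    caller = importing_module.split(".")[0]
    return [m for m in imports if m in ACLS and caller not in ACLS[m]]

print(acl_violations(["billing.api"], "orders.checkout"))    # approved -> []
print(acl_violations(["billing.api"], "analytics.reports"))  # -> ['billing.api']
```

This is exactly the "ad hoc and complex" flavor of solution the parent describes: it works, but nothing in the language or editor knows about it.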

That said, 2 is really the big problem. As things scale, this tends to cause problems on every deployment: it slows the whole company down, lets important new features get blocked by minor unrelated changes, creates a lot of extra feature-flag maintenance, etc. 90% of the time, that should be the gating factor by which you decide when to split a service into multiple physical components.

As the author said, an additional reason for distribution is physical scale: conflicts between subservice A needing high throughput, B needing low latency, C needing high availability, and D needing high IOPS can blow your budget on VMs that satisfy every characteristic at once, or be outright impossible in memory-managed languages. But usually I see this being done way too early, based more on ideals than numbers.

  • Exactly. There should be no problem having tens of thousands of stateless, I/O-bound database-transaction-wrapper endpoints in the same service. You're not going to run out of memory to hold the binary or something. If you want to partition physical capacity between groups of endpoints, you can probably accomplish that at the load balancer.

    Having the latent capability to serve endpoint A in the binary does not interfere with endpoint B's QPS unless it implies some kind of crazy background job or huge in-memory dataset. Even in this case, monoliths normally have a few different components grouped by function: API, DB, cache, queue, background worker, etc. You can group workloads by their structure even if their business purposes are diverse.

> This is a solution to a large chunk of what people want out of microservices.

Yeah, just put some gRPC services in a single pod and have them communicate over a unix domain socket. You now have a single unit of modules that can only communicate over IPC through a well-defined, type-safe boundary. As a bonus, you can set resource quotas separately according to the needs of each module.

Want to scale? Rewrite some config to expose a few of the internal services and have them communicate over TCP.
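A sketch of the mechanism with plain stdlib sockets standing in for gRPC (the address tuples, handler, and module name are invented): the same server and client code bind either a unix domain socket for in-pod IPC or a TCP port once a module is split out.

```python
import os
import socket
import tempfile
import threading

def serve(addr, handler, ready):
    """One-shot server. addr is ('unix', path) or ('tcp', host, port)."""
    if addr[0] == "unix":
        srv = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        srv.bind(addr[1])
    else:
        srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        srv.bind((addr[1], addr[2]))
    srv.listen(1)
    ready.set()  # signal that the socket is accepting connections
    conn, _ = srv.accept()
    with conn:
        conn.sendall(handler(conn.recv(1024)))
    srv.close()

def call(addr, payload):
    """Client side: connect, send one request, read one response."""
    if addr[0] == "unix":
        c = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        c.connect(addr[1])
    else:
        c = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        c.connect((addr[1], addr[2]))
    c.sendall(payload)
    resp = c.recv(1024)
    c.close()
    return resp

# In-pod: a hypothetical "billing" module listens on a unix domain socket.
addr = ("unix", os.path.join(tempfile.mkdtemp(), "billing.sock"))
ready = threading.Event()
t = threading.Thread(target=serve, args=(addr, lambda req: req.upper(), ready))
t.start()
ready.wait()
print(call(addr, b"charge"))  # b'CHARGE'
t.join()
```

Splitting the module out later is the config change described above: swap in addr = ("tcp", "10.0.0.5", 9000) and the serve/call code is unchanged. Real gRPC gives you the same property, since it accepts unix-socket addresses as well as host:port targets.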

> 1. If the code implementing the module API is private, it must all be colocated in one package. If it is public, then anyone can import it, breaking the module boundary. You need a visibility system that can say "this subtree of packages can cooperate with each other, but code outside the subtree can only use the top-level package."

This is easily achieved in Scala with qualified access modifiers on package paths: private[somePackage] makes a symbol visible only within that package and everything nested under it, so a subtree of packages can cooperate while outside code sees only the top-level API.

  • Unfortunately the thought pattern “modern software development is way too complex, we need something radically simpler… I know! I’ll use Scala!” is not a thing.

Visibility systems are great!

> If a change module A has a problem, you must roll back the entire monolith, preventing a good change in module B from reaching users.

eh. In these setups you really want to be fixing forward, for the reason you describe - so you revert the commit for feature A, or turn off its feature flag, or something, rather than reverting the deployment. If you do have to roll back, well, then it's probably worth the small cost of feature B being delayed. But there are good solutions to shipping multiple features at the same time without conflicting.
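The flag-flip version of a revert might look like this minimal sketch (flag names are invented; a real system would read flags from a live config store rather than a dict):

```python
# Hypothetical runtime flag store: disabling feature A is a config
# change, not a redeploy, so feature B's code keeps shipping.
FLAGS = {"new_checkout": True, "bulk_export": True}

def enabled(name: str) -> bool:
    return FLAGS.get(name, False)

def checkout(cart: str) -> str:
    if enabled("new_checkout"):
        return "v2:" + cart  # the change that turned out to be broken
    return "v1:" + cart      # known-good path, still in the binary

FLAGS["new_checkout"] = False   # "revert" feature A only
print(checkout("cart42"))       # v1:cart42 -- bulk_export is untouched
```

The cost the parent mentions is visible here too: both code paths have to stay in the binary and stay compatible until the flag is cleaned up.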

  • There is a point in commit throughput at which finding a working combination of reverts becomes unsustainable. I've seen it. Feature flagging can delay this point but probably not prevent it unless you’re isolating literally every change with its own flag (at which point you have a weird VCS).