The first fallacy of distributed computing

(This was supposed to go out yesterday, but I made a mistake in how my site software works. Apologies! We’ll be back to our regular flow of content with today’s.)

You can have a second computer once you’ve shown you know how to use the first one. – Paul Barham

For the time being, our computers are subject to the laws of physics, and this has an unfortunate-but-often-left-out-of-tool-vendor-marketing consequence: networks fail.

Believing otherwise is the first fallacy of distributed computing.

This typically manifests as a team finishing an implementation with, say, Kafka and turning it on, only to start noticing weird duplication in their workloads. Examples include duplicate charges to customer accounts, duplicate emails, etc.

In Kafka’s case, the implication of the first fallacy goes something like:

1. Network transmissions can get lost
2. Kafka uses acknowledgement messages to avoid duplication
3. Acknowledgement messages are network transmissions
4. Goto 1
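
To make that loop concrete, here is a minimal sketch in Python. It is not Kafka's API or protocol; `FlakyNetwork` and `send_until_acked` are hypothetical names made up for illustration, and the "network" is just a counter that drops whichever transmissions you tell it to. The point is only that a sender who retries until it sees an acknowledgement cannot tell a lost message from a lost acknowledgement.

```python
class FlakyNetwork:
    """Hypothetical network that loses the transmissions at the given positions."""

    def __init__(self, drop_positions):
        self.drop_positions = set(drop_positions)
        self.count = 0

    def transmit(self):
        """Return True if this transmission arrives, False if it is lost."""
        self.count += 1
        return self.count not in self.drop_positions


def send_until_acked(network, handler, message, max_attempts=5):
    """Resend the message until an acknowledgement makes it back to the sender."""
    for attempt in range(1, max_attempts + 1):
        if not network.transmit():  # the message itself was lost
            continue
        handler(message)            # the receiver does the work
        if network.transmit():      # the acknowledgement travels back
            return attempt
        # Ack lost: from the sender's side this looks the same as a lost message.
    raise RuntimeError("gave up without an acknowledgement")


charges = []
# The 2nd transmission overall is the first acknowledgement, so drop that one.
network = FlakyNetwork(drop_positions={2})
attempts = send_until_acked(
    network, charges.append, {"customer": "c-1", "amount": 10}
)
print(attempts, len(charges))
```

One lost acknowledgement is enough: the sender needs two attempts, and the receiver records the charge twice.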

Now, “unreliable” does not equal “happens all the time,” which makes this sort of issue difficult to debug and understand. I (and probably your customers!) want you and your teams to understand it, so the next few days will cover how to enable your systems to safely inhabit reality.

For those who want to start googling now, your countermeasure is idempotence.
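
As a rough preview, an idempotent handler is one where processing the same message twice has the same effect as processing it once. The sketch below assumes every message carries a unique `id` field, which is my assumption and not something Kafka hands you in this form. `IdempotentCharger` is a hypothetical name, and the in-memory set stands in for what would need to be durable storage, updated together with the charge, in a real system.

```python
class IdempotentCharger:
    """Applies each charge at most once, keyed by an assumed unique message id."""

    def __init__(self):
        self.balances = {}     # customer -> total amount charged
        self.seen_ids = set()  # ids of messages already applied (in memory only)

    def handle(self, message):
        """Apply the charge unless this message id was handled before."""
        if message["id"] in self.seen_ids:
            return False  # duplicate delivery: do nothing
        customer = message["customer"]
        self.balances[customer] = self.balances.get(customer, 0) + message["amount"]
        self.seen_ids.add(message["id"])
        return True


charger = IdempotentCharger()
message = {"id": "msg-1", "customer": "c-1", "amount": 10}
first = charger.handle(message)
second = charger.handle(message)  # same message delivered again after a lost ack
print(first, second, charger.balances)
```

The second delivery is recognised and skipped, so the customer is charged once no matter how many times the message shows up.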
