Comment by IOT_Apprentice

2 days ago

This seems weird to me. The number of records is minuscule compared to internet scale tech.

The data model for this sounds like it would be simple. Exactly how many use cases are there to be implemented?

Build this with modern tech on HA Linux backends. Eliminate the batch job nonsense.

This could be written up as a project for bootcamps or even a YouTube series.

I suspect internal politics are at play here: a fight between moving forward and clinging to old methods.

Perhaps someone could build an open source platform if the requirements were made public.

The thing is that a lot of internet scale stuff tends to be non-critical. It’s not a big deal if 1% of users don’t see a post to a social network site. It’ll show up later, maybe, or never, but nobody will care.

On the other hand, with transactions like banking or licensing or health insurance, it’s absolutely essential to maintain ACID compliance for every single transaction, which is something that many “internet-scale” data solutions do not and often cannot promise. I have a vague recollection of some of the data issues at a large health insurance company where I worked a couple of years ago. They made it really clear why there would be an overnight period where the system would be offline: it was essential to make sure that systems could be brought to a consistent state. It also became clear why enrolling someone in a new plan was not simply a matter of adding a record to a database somewhere.
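To make the "not simply adding a record" point concrete, here is a minimal sketch in Python using the standard-library sqlite3 module. The schema (members, enrollments, premiums) and the function names are made up for illustration and are not from any real insurer's system. It only shows the atomicity part of ACID: switching a plan touches several rows, and if any step fails, none of the changes should stick.

```python
import sqlite3


def make_db():
    """Build a toy in-memory database. The schema is hypothetical."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE members (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE enrollments (
            member_id INTEGER NOT NULL REFERENCES members(id),
            plan TEXT NOT NULL,
            active INTEGER NOT NULL
        );
        CREATE TABLE premiums (
            member_id INTEGER NOT NULL REFERENCES members(id),
            plan TEXT NOT NULL,
            monthly_cents INTEGER NOT NULL CHECK (monthly_cents > 0)
        );
    """)
    conn.execute("INSERT INTO members (id, name) VALUES (1, 'Pat')")
    conn.execute("INSERT INTO enrollments VALUES (1, 'bronze', 1)")
    conn.execute("INSERT INTO premiums VALUES (1, 'bronze', 20000)")
    conn.commit()
    return conn


def switch_plan(conn, member_id, new_plan, monthly_cents):
    """Move a member to a new plan as one all-or-nothing transaction."""
    try:
        # Used as a context manager, the connection commits if the block
        # succeeds and rolls back every statement if it raises.
        with conn:
            conn.execute(
                "UPDATE enrollments SET active = 0 "
                "WHERE member_id = ? AND active = 1",
                (member_id,),
            )
            conn.execute(
                "INSERT INTO enrollments VALUES (?, ?, 1)",
                (member_id, new_plan),
            )
            # A non-positive premium violates the CHECK constraint here,
            # which undoes the two statements above as well.
            conn.execute(
                "INSERT INTO premiums VALUES (?, ?, ?)",
                (member_id, new_plan, monthly_cents),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def active_plans(conn, member_id):
    """Return the plans the member is currently enrolled in."""
    rows = conn.execute(
        "SELECT plan FROM enrollments "
        "WHERE member_id = ? AND active = 1 ORDER BY plan",
        (member_id,),
    )
    return [row[0] for row in rows]
```

Calling switch_plan with a bad premium leaves the member on the old plan rather than half-enrolled. A real system would spread these steps across several services and databases, which is where keeping everything consistent gets hard.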

Not to mention that I suspect bank transaction records and health insurance claims rival “internet scale” as genuinely big data operations.

  • These "internet scale" solutions are challenging to operate because of their latency and availability targets.

    If you threw into the requirements "can go down nightly, for hours, for writes AND reads", they could absolutely provide the transactional guarantees you're looking for.