
Comment by Kaliboy

6 days ago

Not sure how they do it, but I would do it like so:

Have the old database be the master and the new one a slave. Load in latest db dump, may take as long as it wants.

Then start replication and catch up on the delay.

You would need, depending on the DB type, a load balancer/failover manager; PgBouncer and Pgpool-II come to mind, and MySQL has equivalents as well. Point that layer at the master and the slave, and connect the application to the database through it.

Then trigger a failover. That should be it.
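The order of those last steps is the whole trick, so here is a minimal sketch of it with stubbed actions. In a real setup "pause"/"resume" would be PgBouncer admin-console `PAUSE`/`RESUME` commands and "promote" a standby promotion; the class and method names below are illustrative, not a real API.

```python
# Hypothetical sketch of the cutover order: drain traffic at the pooler,
# wait for the replica to catch up, promote it, repoint, resume.

def cutover(pooler, replica, max_lag_bytes=0):
    """Fail over from the old primary to the replica with queries held,
    not dropped, at the pooler."""
    steps = []
    pooler.pause()                 # new queries queue; in-flight ones finish
    steps.append("paused")
    while replica.lag_bytes() > max_lag_bytes:
        pass                       # wait for replication to drain the backlog
    steps.append("caught-up")
    replica.promote()              # make the new database writable
    steps.append("promoted")
    pooler.repoint(replica)        # swap the backend the pooler talks to
    steps.append("repointed")
    pooler.resume()                # queued application queries proceed
    steps.append("resumed")
    return steps


class FakePooler:
    """Stand-in for a PgBouncer-like pooler (assumed interface)."""
    def __init__(self):
        self.events = []
    def pause(self):
        self.events.append("pause")
    def resume(self):
        self.events.append("resume")
    def repoint(self, target):
        self.events.append("repoint")


class FakeReplica:
    """Replica whose reported lag shrinks each time it is polled."""
    def __init__(self, lags):
        self.lags = list(lags)
    def lag_bytes(self):
        return self.lags.pop(0) if self.lags else 0
    def promote(self):
        pass
```

With `FakeReplica([4096, 512, 0])`, `cutover` runs the five steps in order. The point is that the application's queries are held at the pooler only while the last bit of lag drains, so clients see a brief stall rather than an outage.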

> Load in latest db dump, may take as long as it wants.

400TB? That's about a week or more.

> Then start replication and catch up on the delay.

Then you have roughly ±1TB of changes accumulated during that delay, which means a few more days of syncing while new changes keep coming in.
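For what it's worth, the timing can be sanity-checked with simple rate math; every throughput number below is an assumption for illustration, not a figure from the article:

```python
# Back-of-envelope timing for the restore-then-catch-up plan.
# All rates are assumed for illustration.
TB = 10**12          # bytes
DAY = 86400          # seconds

def restore_days(size_bytes, rate_bytes_per_s):
    """How long the initial dump/restore takes at a sustained rate."""
    return size_bytes / rate_bytes_per_s / DAY

def catchup_days(backlog_bytes, apply_rate, change_rate):
    """Replication catch-up converges only if the replica applies changes
    faster than new ones arrive; otherwise the lag grows forever."""
    if apply_rate <= change_rate:
        return float("inf")
    return backlog_bytes / (apply_rate - change_rate) / DAY

restore = restore_days(400 * TB, 500 * 10**6)             # 400TB at 500MB/s
catchup = catchup_days(1 * TB, 160 * 10**6, 150 * 10**6)  # 1TB backlog, 10MB/s headroom
```

At an assumed 500MB/s the restore alone is roughly nine days, consistent with "a week or more", and a 1TB backlog with only 10MB/s of apply headroom takes about another day; with no headroom at all, catch-up never finishes.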

They said "current requests are buffered", which is impossible, especially for long-running distributed transactions still in progress; analytics queries can take hours or even days.

Overall, this article is either BS or some super-custom case that's irrelevant for common systems. You can't migrate without downtime; it's physically impossible.

  • Feels the same to me.

    "Take snapshot and begin streaming replication"... like to where? The snapshot isn't even fully prepared yet and definitely hasn't reached the target. Where are you dumping/keeping those replication logs in the meantime?

    Secondly, how are you handling database state changes from realtime update queries? Those are definitely still going into the source tables at this point.

    I don't get this. I'm still stuck on point 1... I've read it twice already.
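One common answer to the "where are the logs kept" question: in a Postgres-style setup the snapshot is taken at a known WAL position (LSN), and a replication slot makes the source retain its WAL from that position until the target connects and catches up. Realtime updates just keep landing in the source's ordinary write-ahead log. Here is a toy model of that idea; it is not actual Postgres internals, just the retention behavior:

```python
# Toy model of a replication slot: the primary retains every change from
# the snapshot position onward until the replica has streamed it.
class Primary:
    def __init__(self):
        self.wal = []          # retained change log
        self.slot_lsn = None   # oldest position a slot still needs

    def snapshot_with_slot(self):
        """Take a snapshot and remember its position; everything written
        after this must be kept until the replica confirms receipt."""
        self.slot_lsn = len(self.wal)
        return self.slot_lsn

    def write(self, change):
        self.wal.append(change)     # realtime updates keep flowing in

    def stream_since(self, lsn):
        """The replica fetches everything after its last confirmed position."""
        return self.wal[lsn:], len(self.wal)
```

So the snapshot restore can take as long as it needs; the slot simply pins the log on the source until the target is ready to consume it, at the cost of extra disk on the source while the backlog grows.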

  • So you don't understand how something works. That's fine. But to then say the article and/or tech are BS is... a choice.

    This work has been and is being used by some of the largest sites / apps in the world including Uber, Slack, GitHub, Square... But sure, "it's BS, super custom, and irrelevant". Gee, yer super smart! Thank you for the amazing insights. 5 stars.