← Back to context

Comment by fxtentacle

1 day ago

"All that to say, a small point query on a well-designed schema should easily execute in sub-msec times if the pages are in the DB’s buffer pool"

Shopify is hosting a large number of webshops with billions of product descriptions, but each store only has a low visitor count. So we are talking about a very large and, hence, uncacheable dataset with sparse access. That means almost every DB query to fetch a product description will hit the disk. I'd even assume a RAID of spinning HDDs for price reasons.

Shopify runs a heavily sharded MySQL backend. Their Shop app uses Vitess; last I knew the main Shopify backend wasn’t on Vitess (still sharded, just in-house), but I could be wrong.

I would be very surprised if “almost every query” was hitting disk, and I’d be even more surprised to learn that they used spinners.