Comment by sgarland

9 months ago

Because MySQL got a bad rap (rightfully so) before it adopted InnoDB as the default storage engine, and then tech influencers happened. I love Postgres, but I also love MySQL, and 99% of the time I see people gushing about Postgres, they aren’t using any features that MySQL doesn’t have.

The single biggest thing about MySQL that should be a huge attraction for devs without RDBMS administration experience is that, by and large, it doesn’t need much care and feeding. You’re not going to get paged for txid wraparound because you didn’t know autovacuum wasn’t keeping up on your larger tables. Unfortunately, the Achilles’ heel of MySQL (technically InnoDB) is also its greatest strength: its clustering index. This is fantastic for range queries, IFF you design your schema to exploit it and don’t use a non-k-sortable PK like UUIDv4. Need to grab every FooRecord for a given user? If your PK is (user_id, <some_other_id>), then congrats: they’re all physically colocated on the same DB pages, and the DB only has to walk the B+tree once from the root node and then follow a linked list.
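
As a rough sketch of what that looks like (untested; the table name, the second key column, and the payload column are made up for illustration):

    -- Illustrative only: FooRecord rows keyed by (user_id, record_id).
    CREATE TABLE foo_records (
      user_id   bigint unsigned NOT NULL,
      record_id bigint unsigned NOT NULL,
      payload   varchar(255)    NOT NULL,
      PRIMARY KEY (user_id, record_id)
    ) ENGINE=InnoDB;

    -- One descent of the clustered B+tree to the first matching leaf page,
    -- then a walk along physically adjacent rows for this user:
    SELECT * FROM foo_records WHERE user_id = 123;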

On the contrary, when the PK has to be a B-tree, it already ties my hands, because I can’t have a good disk layout for, say, time-series data, where I might use a lighter index like BRIN at the cost of somewhat slower queries but much better index update rates.
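
Concretely, the Postgres side of that trade-off looks something like the sketch below (readings, metric_id, ts, and value are placeholder names): a BRIN index only stores a summary (min/max) per range of table blocks, so it stays tiny and cheap to update, at the cost of less precise lookups than a B-tree.

    -- Postgres: the heap has no enforced clustered order, and a BRIN index
    -- over ts only records per-block-range min/max values, so inserts barely
    -- touch it.
    CREATE TABLE readings (
      metric_id integer          NOT NULL,
      ts        timestamptz      NOT NULL,
      value     double precision NOT NULL
    );

    CREATE INDEX readings_ts_brin ON readings USING BRIN (ts);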

  • I would not personally build a TSDB atop MySQL, though I worked at a place that did, and it worked OK. I don't remember their schema, though.

    If I had to, I'd probably do something like this (haven't tested it beyond validating that it creates):

        mysql> SHOW CREATE TABLE ts\G
        *************************** 1. row ***************************
               Table: ts
        Create Table: CREATE TABLE `ts` (
          `metric_id` smallint unsigned NOT NULL,
          `ts` datetime NOT NULL,
          PRIMARY KEY (`metric_id`,`ts`)
        ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
        /*!50100 PARTITION BY RANGE (dayofmonth(`ts`))
        SUBPARTITION BY HASH (`metric_id`)
        SUBPARTITIONS 3
        (PARTITION p1 VALUES LESS THAN (2) ENGINE = InnoDB,
         PARTITION p2 VALUES LESS THAN (3) ENGINE = InnoDB,
         PARTITION p3 VALUES LESS THAN MAXVALUE ENGINE = InnoDB) */
        1 row in set (0.012 sec)
    

    Obviously there would be more partitions for different days (or whatever other date chunk you wanted: months, weeks, etc.), and the number of subpartitions would depend on how many metrics you were tracking (a sample read query against this layout is sketched at the end of this comment). You could also simplify this, at the expense of more tables, by having a table per metric.

  • Postgres is a lot more flexible, so if you’re building a TSDB, handling geospatial data, and so on, it is usually the better choice (not to say MySQL can’t still be used effectively for a lot of those use cases).

    I just see lots of people making CRUD web apps and choosing these new Postgres solutions, and that seems like the one thing that MySQL is almost always better at.
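
As referenced above, here is a rough idea of the read path against that partitioned ts table (untested; the metric ID and time range are made up). Whether the optimizer can actually prune partitions through dayofmonth() is worth checking; the partitions column of the EXPLAIN output shows which ones get touched:

    -- One metric, a slice of one day. The (metric_id, ts) PK makes this a
    -- single clustered range scan within each partition that is read.
    EXPLAIN
    SELECT *
      FROM ts
     WHERE metric_id = 42
       AND ts >= '2024-06-01 00:00:00'
       AND ts <  '2024-06-01 06:00:00';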