Comment by bob1029

20 hours ago

The only GitHub identifier I've ever bothered to store explicitly (i.e., in its own dedicated column) is an immutable URL key like issue/pr # or commit hash. I've stored comment IDs, but I've never given them any thought. They just get sucked up with the rest of the JSON blob.

Not everything has to be forced through some normalizing layer. You can maintain coarse rows at the grain of each issue/PR and keep everything else in the blob. Parsing JSON is super fast. Unless you're making cross-cutting queries along comment dimensions, I don't think it would ever show up on a profiler.
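
To make the "coarse rows plus a blob" idea concrete, here's a rough sketch using SQLite from Python. The table name, column names, and the payload shape are all made up for illustration (this is not GitHub's actual API response format); the only point is that comment IDs ride along inside the blob and get parsed out on demand.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")

# One coarse row per issue/PR. Only the issue number gets its own column;
# everything else (including comment IDs) stays in the JSON payload.
conn.execute(
    "CREATE TABLE issue (number INTEGER PRIMARY KEY, payload TEXT NOT NULL)"
)


def save_issue(number, payload):
    """Store (or overwrite) the whole issue document as a single JSON blob."""
    conn.execute(
        "INSERT OR REPLACE INTO issue (number, payload) VALUES (?, ?)",
        (number, json.dumps(payload)),
    )


def load_issue(number):
    """Return the parsed blob for an issue, or None if it is not stored."""
    row = conn.execute(
        "SELECT payload FROM issue WHERE number = ?", (number,)
    ).fetchone()
    return json.loads(row[0]) if row else None


# Hypothetical payload shape, invented for this example.
save_issue(42, {
    "title": "Example issue",
    "comments": [
        {"id": 1001, "body": "first"},
        {"id": 1002, "body": "second"},
    ],
})

# Comment IDs are recovered by parsing the blob, not from a dedicated column.
comment_ids = [c["id"] for c in load_issue(42)["comments"]]
```

This only stays cheap while lookups are per issue. A query across all comments would have to parse every blob, which is the cross-cutting case mentioned above.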

> immutable URL key like issue/pr

They are not immutable, because repositories can change URLs (when renamed or moved to a different org).

  • Issue #s, commit hashes, etc. are still immutable in this scenario. When you rename or transfer a GitHub repository, all of these keys are preserved.

    What I do is store 2 tuples:

    Repository: (Id, Org, Repo)

    Issue/PR: (Repository.Id, #)

    Transferring or renaming a repository is an update to 1 row in this schema.
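
A minimal sketch of those two tuples, using SQLite from Python. The table and column names are my own, and whether Repository.Id is GitHub's own repository ID or a local surrogate key is an assumption left open here; the org and repo names are placeholders.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("PRAGMA foreign_keys = ON")

db.executescript("""
CREATE TABLE repository (
    id   INTEGER PRIMARY KEY,   -- stable key; never changes on rename/transfer
    org  TEXT NOT NULL,
    repo TEXT NOT NULL,
    UNIQUE (org, repo)
);
CREATE TABLE issue_pr (
    repository_id INTEGER NOT NULL REFERENCES repository(id),
    number        INTEGER NOT NULL,   -- the issue/PR #
    PRIMARY KEY (repository_id, number)
);
""")

# Placeholder data: one repository with three issues.
db.execute(
    "INSERT INTO repository (id, org, repo) VALUES (1, 'old-org', 'old-name')"
)
db.executemany(
    "INSERT INTO issue_pr (repository_id, number) VALUES (?, ?)",
    [(1, 1), (1, 2), (1, 3)],
)


def issue_url(repository_id, number):
    """Build the URL from the repository's current org/repo at read time."""
    org, repo = db.execute(
        "SELECT org, repo FROM repository WHERE id = ?", (repository_id,)
    ).fetchone()
    return f"https://github.com/{org}/{repo}/issues/{number}"


# A rename plus a transfer touches exactly one row; issue_pr rows are untouched.
cursor = db.execute(
    "UPDATE repository SET org = 'new-org', repo = 'new-name' WHERE id = 1"
)
rows_changed = cursor.rowcount
```

Because the URL is derived from the repository row instead of being stored per issue, every issue picks up the new org and name without being rewritten.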