Comment by loxias
6 hours ago
I would love, _love_ to know more about your data formats, your tools, what the JSON looks like, basically as much as you're willing to share. :)
For about a month now I've been working on a suite of tools for dealing with JSON, written specifically for an imagined audience of people who like CLIs or TUIs, have to deal with PILES AND PILES of JSON, and care deeply about performance.
I've been writing them just to scratch an itch. I like writing high-performance, efficient software, and there were a few gaps that bugged me and that I knew I could fill.
I'm having fun and will be happy when I finish, regardless, but it would be so cool if it happened to solve a problem for someone else.
I maintain some tools for the videogame World of Warships. The developer has a file called GameParams.bin which is Python-pickled data (their scripting language is Python).
Working with this is pretty painful, so I convert the pickled structure to other formats, including JSON.
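As a rough illustration of that kind of conversion (not the actual converter, and ignoring whatever wrapping or custom classes the real GameParams.bin has), here is a minimal Python sketch. The names `to_jsonable` and `pickle_to_json` are hypothetical, and the fallback rules for sets, bytes, and objects are assumptions:

```python
import json
import pickle


def to_jsonable(obj):
    """Fallback for values the json module cannot encode natively (assumed rules)."""
    if isinstance(obj, (set, frozenset)):
        return list(obj)
    if isinstance(obj, bytes):
        # latin-1 maps every byte to one character, so this never fails
        return obj.decode("latin-1")
    if hasattr(obj, "__dict__"):
        return vars(obj)
    return repr(obj)


def pickle_to_json(raw: bytes) -> str:
    """Unpickle raw bytes and re-serialize the result as indented JSON."""
    data = pickle.loads(raw)  # only unpickle data you trust
    return json.dumps(data, default=to_jsonable, indent=2)


# Usage with stand-in data
sample = pickle.dumps({"a": {1}, "b": b"xy", "c": (1, 2)})
print(pickle_to_json(sample))
```

Note that `default` is only consulted for values, not for dictionary keys, so data with tuple or object keys would need extra handling.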
The prettified file has always been around 500MB, but it recently expanded to about 3GB, I think because they've added extra regional parameters.
The file inflates so much because Pickle memoizes objects it has already written, so shared references are stored only once, whereas that deduplication is obviously lost in JSON.
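The size difference is easy to reproduce with a toy example. This is a sketch with made-up stand-in data (the `shared` dict and the count of 1000 are assumptions, not anything from GameParams.bin):

```python
import json
import pickle

# Hypothetical stand-in: one parameter block referenced by many entries.
shared = {"name": "example", "values": list(range(100))}
entries = [shared] * 1000  # 1000 references to the same dict object

pickled = pickle.dumps(entries)
as_json = json.dumps(entries)

# Pickle writes `shared` once and emits short memo back-references afterwards;
# JSON has no reference mechanism, so the dict is written out 1000 times.
print(len(pickled), len(as_json))

restored = pickle.loads(pickled)
decoded = json.loads(as_json)
print(restored[0] is restored[1])  # shared identity survives a pickle round trip
print(decoded[0] is decoded[1])    # JSON decoding yields independent copies
```

The pickled form stays small because the repeated object costs only a few bytes per reference, while the JSON output grows linearly with the number of references.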
I care about speed and about tools not choking on large inputs, so I use jaq for querying and instruct LLMs operating on the data to do the same.