
Comment by messe

13 hours ago

So what's the use case for keeping them in that format rather than something more easily indexed and queryable?

I'd probably just shove it all into Postgres, but even a multi terabyte SQLite database seems more reasonable.
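
Something like this is the kind of thing I have in mind; a rough sketch assuming newline-delimited JSON logs, with made-up ts/level/msg field names:

    # One-time schema; index whatever columns you actually query on.
    sqlite3 logs.db <<'EOF'
    CREATE TABLE logs(ts TEXT, level TEXT, msg TEXT);
    CREATE INDEX logs_ts ON logs(ts);
    EOF
    # Once loaded, ad-hoc queries come back fast and are repeatable:
    sqlite3 logs.db "SELECT count(*) FROM logs WHERE level = 'error';"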

Replying here because the other comment is too deeply nested to reply to.

Even if it's a once-off, some people handle a lot of once-offs; that's exactly where you need good CLI tooling to support them.

Sure, jq isn't exactly slow, but I've still avoided it in pipelines where I just needed faster throughput.

rg was insanely useful on a project I once worked on where they had about 5 GB of source files, many of them auto-generated, and you needed to find things in there. People were using Notepad++ and waiting minutes for a search to find something in the haystack; rg returned results in seconds.
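
For a sense of the ergonomics (the symbol name and glob here are made up):

    # Recursive, line-numbered search; rg skips binary and .gitignore'd files by default.
    rg -n 'LegacyWidgetFactory' src/
    # Excluding the auto-generated files shrinks the haystack further:
    rg -n 'LegacyWidgetFactory' -g '!*generated*' src/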

  • You make some good points. I've worked in support before, so I shouldn't have discounted how frequent "once-offs" can be.

The use case could be, for example, exactly this: processing an old trove of logs into something more easily indexed and queryable, and you might want jq as part of that processing pipeline.
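
Roughly this kind of thing, to sketch it with the same made-up ts/level/msg fields (and a made-up log path) as in the sketch above:

    # jq flattens each JSON record to TSV; sqlite3 bulk-imports the result.
    zcat old-logs/*.json.gz \
      | jq -r '[.ts, .level, .msg] | @tsv' > logs.tsv
    sqlite3 logs.db <<'EOF'
    .mode tabs
    .import logs.tsv logs
    EOF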

  • Fair, but for a once-off thing performance isn't usually a major factor.

    The comment I was replying to implied this was something more regular.

    EDIT: why is this being downvoted? I didn't think I was rude. The person I responded to made a good point; I was just clarifying that it wasn't quite the situation I was asking about.

    • At scale, low performance can very easily mean "longer than the lifetime of the universe to execute." The question isn't how quickly something will get done, but whether it can be done at all.


    • Certain people/businesses deal with one-off tasks every day. And even for something truly one-off, a tool that's too slow can be the difference between being able to do it once and not being able to do it at all.