Comment by latexr

7 hours ago

Because the output you get can have hallucinations, which don’t happen with a deterministic tool. Furthermore, by getting the `jq` command you get something which is reusable, fast, offline, local, doesn’t send your data to a third-party, doesn’t waste a bunch of tokens, … Using an LLM to filter the data is worse in every metric.
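For instance, extracting a field from an array of objects is a one-line jq filter (the sample data here is made up purely for illustration):

```shell
# Hypothetical input: pull the "name" field out of each record.
# -r emits raw strings instead of JSON-quoted ones.
echo '[{"name":"alice","age":30},{"name":"bob","age":25}]' \
  | jq -r '.[].name'
# prints:
# alice
# bob
```

Once you have that filter, it runs the same way every time, on any machine with jq installed.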

I get that AI isn’t deterministic by definition, but IMHO that’s become the go-to reason not to use AI, regardless of the use case.

I’ve never seen AI “hallucinate” on basic data transformation tasks. If you tell it to convert JSON to YAML, that’s what you’re going to get. Most LLMs are probably using something like jq to do the conversion in the background anyway.

AI experts say AI models don’t hallucinate, they confabulate.

  • Just because you haven't seen it hallucinate on these tasks doesn't mean it can't.

    When I'm deciding what tool to use, my question is "does this need AI?", not "could AI solve this?" There are plenty of cases where it's hard to write a deterministic script to do something, but if a deterministic option exists, why would you choose something that might give you the wrong answer? It's also more expensive.

    The jq script (or other script) that an LLM generates is way easier to spot check than the output you get if you ask it to transform the data directly, and you can reuse it.

You can use a local LLM and you can ask it to use tools so it is faster.

  • "so it is faster" than what? A cloud hosted LLM? That's a pretty low bar. It's certainly not faster than jq.

  • There is hardware that can run jq but not a local AI model powerful enough to make the filtering reliable, e.g. a Raspberry Pi.