Comment by DocTomoe

16 hours ago

So ... you are letting a nondeterministic LLM operate on the shell via quasi-shell-script. This will appeal mostly to people who do not have the skill set to write an actual shell script.

In short, isn't that like giving a voice-controlled scalpel to a random guy on the street, telling him 'just tell it to do neurosurgery', and hoping it accidentally performs the right procedure?

I know this will not appeal to developers who see no legitimate role for AI coding tools with nondeterministic output.

It is intended to be a useful complement to traditional shell scripting, Python scripting, etc. for people who want to add composable AI tooling to their automation pipelines.

I also find that it improves the reliability of AI in workflows when you can break prompts down into reusable, single-task-focused modules that use LLMs for tasks they are good at (format.md, summarize-logs.md, etc.). These can then be chained with traditional shell scripts and command-line tools.

Examples include summarizing reports or formatting content; these become composable building blocks.
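For what it's worth, the chaining pattern can be sketched in plain shell. Everything here is hypothetical (the prompt-module filename, the `ai_run` wrapper, the stubbed AI call); the point is only that a single-task prompt module composes like any other filter:

```shell
#!/bin/sh
# Minimal sketch of the pattern, with hypothetical names throughout.
# `ai_run` is a stand-in for whatever LLM CLI you use; here it is stubbed
# so the shape of the pipeline can be run end to end. A real version
# would send the prompt module named in "$1", plus stdin, to the model.
ai_run() {
    echo "[summary of $(wc -l | tr -d ' ') filtered lines]"
}

# A small fabricated log so the example is self-contained.
printf 'INFO started\nERROR disk full\nERROR timeout\n' > app.log

# Traditional tools narrow and structure the input; the LLM step only
# does the single task its prompt module (summarize-logs.md) describes.
grep 'ERROR' app.log | ai_run summarize-logs.md > error-summary.txt
cat error-summary.txt
```

Because the AI step reads stdin and writes stdout, it slots into ordinary pipelines, and the deterministic tools around it bound what the model ever sees.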

So I hope it has practical utility even for users like yourself who don’t see a role for plain-language prompting in automation per se.

In practice this is a way to add composable AI-based tooling into scripts.

Many people are concerned about (or outright opposed to) the use of AI coding tools. I get that this will not be useful for them. Many folks like myself find tools like Claude helpful, and this just makes it easier to use them in automation pipelines.

  • I'm more concerned that someone decides to prompt for 'analyze these logfiles, then clean up', and the LLM randomly decides the best way to 'clean up' is a 'rm -rf /' - not on the first run, but on the 27th.

    That kind of failure mode is fundamentally different from traditional scripting: it passes tests, builds trust, and then fails catastrophically once the implicit interpretation shifts.

    In short: I believe it's nice that this works for the engineer who knows exactly what he or she is doing - but those folks usually don't need LLMs; they just write the code. The people this appeals to - who may not begin to think about the side effects of innocent-sounding prompts - are being handed a foot-machine-gun that may act like a genie, with hilarious unintended consequences.