Show HN: qqqa – A fast, stateless LLM-powered assistant for your shell

5 days ago (github.com)

I built qqqa as an open-source project, because I was tired of bouncing between shell, ChatGPT / the browser for rather simple commands. It comes with two binaries: qq and qa.

qq means "quick question" - it is read-only, perfect for the commands I always forget.

qa means "quick agent" - it is qq's sibling that can run things, but only after showing its plan and getting an approval by the user.

It is built entirely around the Unix philosophy of focused tools, stateless by default - pretty much the opposite of what most coding agent are focusing on.

Personally I've had the best experience using Groq + gpt-oss-20b, as it feels almost instant (up to 1k tokens/s according to Groq) - but any OpenAI-compatible API will do.

Curious if the HN crowd finds it useful - and of course, AMA.

For inspiration (and, ofc, PR since I'm salty that this gets attention while my pet project doesn't), you can checkout clai[0] which works very similarly but has a year or so's worth of development behind it.

So feature suggestions:

* Pipe data into qq ("cat /tmp/stacktrace | qq What is wrong with this: "),

* Profiles (qq -profile legal-analysis Please checkout document X and give feedback)

* Conversations (this is simply appending a new message to a previous query)

[0]: https://github.com/baalimago/clai/blob/main/EXAMPLES.md

Just about everyone has already written one of these. Mine are called "ask" and "please". My "ask" has a memory though, since I often needed to ask followup questions:

https://github.com/pmarreck/dotfiles/blob/master/bin/ask

I have a local version of ask that works with ollama: https://github.com/pmarreck/dotfiles/blob/master/bin/ask_loc...

And here is "please" as in "please rename blahblahblah in this directory to blahblah": https://github.com/pmarreck/dotfiles/blob/master/bin/please

  • Since we're sharing, I have a "claude" command that lets me get quick answers but also saves the conversation and outputs an identifier so in the rare case I want a follow-up, I can ask a question with the ID to continue the conversation.

    https://gist.github.com/rbitr/bfbc43b806ac62a5230555582d63d4...

    • Neat idea! Although as an identifier, instead of a hash, I'd probably ask it to summarize the conversation into 3 to 7 underscore-separated words and use that as the identifier (plus maybe a timestamp), since a list of them will more easily tell you which is relevant

I can suggest our service (previously here https://ch.at/v1/chat/completions (supports streamed responses)

Also accessible via HTTP/SSH/DNS for quick tests: curl ch.at/?q=… , ssh ch.at Privacy note: we don’t log anything, but upstream LLM providers might...

  • That would be pretty cool for testing the waters, will give it a thought!

    How do you guys pay for this? I guess the potential for abuse is huge.

    • Cool! Right now it's just IP address rate limiting and the costs have not mattered too much, but yes long term I am not sure what we'll do...

I built a similar tool called “lmsh” (LM shell) that uses Claude-code non-interactive mode (hence no API keys needed, since it uses your CC subscription): it presents the shell command on a REPL like line that you can edit first and hit enter to run it. Used Rust to make it a bit snappier:

https://github.com/pchalasani/claude-code-tools?tab=readme-o...

It’s pretty basic, and could be improved a lot. E.g make it use Haiku or codex-CLI with low thinking etc. Another thing is have it bypass reading CLAUDE.md or AGENTS.md. (PRs anyone? ;)

  • >it presents the shell command on a REPL like line that you can edit first and hit enter to run it.

    Oh genius, that's the best UX idea for the situation of asking an LLM to flesh out the CLI command without relying entirely on blind faith.

    Even better if we can have that kind of behavior in the shell itself. For example if we started typing "cat list | grep foo | " and then suddenly realized we want help with the awk command so that it drops the first column.

  • This a pretty neat approach, indeed. Having to use the API might be an inconvenience for some people indeed. I guess having the Claude or ChatGPT subscription and using it with the CLI tools is what makes developers stick with these tools, instead of using what is out there.

    • Right, when we’re already paying $100 or $200 per month, leveraging that “almost-all-you-can eat buffet” is always going to be more attractive than spending more on per token API billing.

On the stateless part - I increasingly believe that state keeping is an absolute necessity. Not necessarily across requests but on the local storage. Handoffs are proving invaluable in overcoming context limitations and I would like more tools to support a higher level of coordination and orchestration across sessions and with sub-agents.

I believe the best “worker” agents of the future are going to be great at following instructions, have a fantastic intuition but not so much knowledge. They’ll be very fast but will need to retain their learnings so they can build on it, rather than relearning everything in every request - which is slow and a complete waste a resources. Much like what Claude is trying to achieve with skills.

I’m not suggesting that every tool reinvent this paradigm in its own unique way. Perhaps we a single system that can do all the necessary state keeping so each tool can focus on doing its job really well.

Unfortunately, this is more art than science - for example, asking each model to carry out handoff in the expected way will be a challenge. Especially on current gen small models. But many people are using frontier models, that are slowly converging in their intuition and ability to comprehend instructions. So it might still be worth the effort.

What a phenomenal launch it has been! Thanks a lot to everyone, for the many ideas and feedback. It has really made me push harder to make qqqa even cooler.

Since I launched it yesterday, I added a few new features - check out the latest version on Github!

Here is what we have now:

* added support for OpenRouter

* added support for local LLMs (Ollama)

* qqqa can be installed via Homebrew, to avoid signing issues on MacOS

* qq/qa can ingest piped input from stdin

* qa now preserves ANSI colors and TTY behavior

* hardened the agent sandbox - execute_command can't escape the working directory anymore

* history is disabled by default - can be enabled at --init, via config or flag

* qq --init refuses to override an existing .qq/config.json

llm cmdcomp is better:

    - it puts the command in the shell editor line so you can edit it (for example to specify filenames using the line editor after the fact and make use of the shell tools like glob expansion etc.) 
    - it goes into the history. 
    - It can use a binding so you can start writing something without remembering to prefix it with a command and invoke the cmd completion at any place in the line editor. 
    - It also allows you to refine the command interactively.

I haven't see any of the other of the myriad of tools do these very obvious things.

https://github.com/CGamesPlay/llm-cmd-comp

  • Thanks. I guess it all depends on the perspective. I do not see how editing the command is a good tradeoff here in terms of complexity+UI. Once you get the command suggested by the LLM, you can quickly copy and modify it, before running it.

    qqqa uses history - although in a very limited fashion for privacy reasons.

    I am taking note of these ideas though, never say never!

    • > Once you get the command suggested by the LLM, you can quickly copy and modify it, before running it.

      Copying and pasting tends to be a very tedious operation in the shell, which usually requires moving your hands away from the keyboard to the mouse (there are terminals which allow you to quick-select and insert lines but they are still more tedious than simply pressing enter to have the command on the line editor). Maybe try using llm-cmd-comp for a while.

      > I do not see how editing the command is a good tradeoff here in terms of complexity+UI.

      I don't find it a tradeoff, I think it's strictly superior in every way including complexity. llm-cmd-comp is probably the way I most often interface with llms (maybe second to basic search-engine-replacement) and I almost always either 1. don't have the file glob or the file names themselves ready (they may not exist yet!) at the time when I want to start writing the command or they are easier to enter using a fuzzy selector like fzf 2. don't want the llm to do weird things with globs when I pass them directly and having the shell expand them is usually difficult because the prompt is not a command (so the completion system won't do the right thing).

      But even in your own demo it is faster to use llm-cmd-comp and you also get the benefit that the command goes into the history and you can optionally edit it if you want or further revise the prompt! It does require pressing enter twice instead of "y" but I don't find that a huge inconvenience especially since I almost always edit the command anyway.

      Again, try installing llm-cmd-comp and try out your demo case.

apparently everyone has made their own, some better, others worse. but here's my implementation (not as full-featured as this one but it does the job): https://github.com/Jotalea/FRIDAY

it's inspired on F.R.I.D.A.Y. from the Marvel Cinematic Universe, a digital assistant with access to all of the (fictional) hardware.

There is also the llm tool written by simonwillison: https://github.com/simonw/llm

I personally use "claude -p" for this

  • Compared to the llm tool, qqqa is as lightweight as it gets. In the Ruby world it would be Sinatra, not Rails.

    I have no interest in adding too many complex features. It is supposed to be fast and get out of your way.

    Different philosophies.

This is nice. Reminds me how in warp terminal you can (could?) just type `# question` and it would call some LLM under the hood. Good UX.

  • Thank you - appreciate it. I really tried to create something simple, that solve one problem really well.

And of course, if you find any bugs or feature requests, report them via issues on Github.

very cool, can be useful for simple commands, but i find github cli's copilot extension useful for this, i just do `ghcs <question>` and it gives me an command, i can ask it how it works, or make it better, copy it, or run it

This looks really cool and I love the idea but I will stick with opencode run ”query” and for specific agents which have specific models, I can just configure that also in an agent.md then add opencode run ”query” -agent quick

Looks interesting! Does it support multiple tool calls in a chain, or only terminating with a single tool use?

Why is there a flag to not upload my terminal history and why is that the default?

  • Thanks!

    It does not support chaining multiple tool calls - if it did, it would not be a lightweight assistant anymore, I guess.

    The history is there to allow referencing previous commands - but now that I think about it, it should clearly not be on by default.

    Going to roll out a new version soon. Thanks for the feedback!

Nice! Do you have plans to make it work with a CC subscription? Great idea but not really interested in paying for another API key

[flagged]

  • It is highly disturbing that you would go through my private profiles and nicknames to prove what? Ever heard of nicknames on the Internet? Ever heard a person can have multiple projects over the many years?

    I published an open source library, it is not even v1.0 yet.

    I kindly ask you to delete this comment.

    • The act of looking is normal. Running your code on their computer requires a lot of trust, after all.

      But there’s nothing suspicious about having multiple nicknames. I don’t really get what they are talking about there.

      8 replies →

    • Frankly, I second that sentiment.

      I'm not sure how extensive your search was to find OP's LinkedIn, but it's clearly not in his HN profile, and that's enough to be unwarranted imho.

      4 replies →

  • So if the random guy who posted it on HN wasn't the OP, it would've been a thousand times more untrustworthy, obviously?

    I don't see your point, and I squinted very hard.

    • And since when does publishing open source software require you to present any credentials at all? I am not hiding anything, I just published using my regular accounts - some of which I have been using for more than a decade.

      1 reply →

  • Mate, it's a free project on github they shared with us. Let's keep things in perspective.

    • The perspective is that it is a free project shared on Github which prompts a OS-level warning message on macOS, which might certainly intimidate some people.

      I really want to see this project succeed, and thus gave feedback on this — what else is a "Show HN" good for, then?

      3 replies →

  • Hey dang, can you please remove the above comment 45834359, as per iagooar's request?

    I don't see an option for removal on the HN ui.

    Same for 45834692 if possible, as this also contains the name.