Comment by kristopolous

9 months ago

I've got a similar approach, coming from the Unix philosophy.

Look at the savebrace screenshot here

https://github.com/kristopolous/Streamdown?tab=readme-ov-fil...

There's a markdown renderer that can extract code samples, a code sample viewer, and a tool for the tmux handling. It all uses things like fzf and simple tools like simonw's llm. It's all I/O, so it's all swappable.

It sits adjacent and you can go back and forth, using the chat when you need to but not doing everything through it.

You can also make it go away and then when it comes back it's the same context so you're not starting over.

Since I offload the actual llm loop, you can use whatever you want. The hooks are at the interface and parsing level.

When rendering the markdown, streamdown saves the code blocks as null-delimited chunks in the configurable /tmp/sd/savebrace. This allows things like xargs, fzf, or a suite of unix tools to manipulate it in sophisticated chains.
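To make the null-delimited idea concrete, here is a minimal Python sketch of reading those chunks back. The path comes from the description above; the exact layout (UTF-8 text chunks separated by NUL bytes) and the function names are assumptions for illustration, not Streamdown's documented API.

```python
from pathlib import Path

# Default location named in the comment above. The on-disk layout
# (UTF-8 chunks separated by NUL bytes) is an assumption.
SAVEBRACE_PATH = Path("/tmp/sd/savebrace")


def split_chunks(data: bytes) -> list[str]:
    """Split NUL-delimited bytes into text chunks, dropping empty ones."""
    return [
        part.decode("utf-8", errors="replace")
        for part in data.split(b"\0")
        if part
    ]


def read_code_blocks(path: Path = SAVEBRACE_PATH) -> list[str]:
    """Return the saved code blocks, or an empty list if the file is missing."""
    if not path.exists():
        return []
    return split_chunks(path.read_bytes())
```

NUL is a convenient delimiter because it cannot appear inside the code text itself, so multi-line blocks survive intact. It is also the format that `xargs -0` and `fzf --read0` consume.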

Again, it's not a package, it's an open architecture.

I know I don't have a slick pitch site but it's intentionally dispersive like Unix is supposed to be.

It's ready to go, just ask me. Everyone I've shown in person has followed up with things like "This has changed my life".

I'm trying to make llm workflow components. The WIMP of the LLM era. Things that are flexible, primitive in a good way, and also very easy to use.

Bug reports, contributions, and even opinionated designers are highly encouraged!

rane 9 months ago

Maybe if you could explain what exactly is happening in the savebrace example because it's not clear how it relates to this.

Wilfred 9 months ago
If I've understood this interesting workflow correctly, there are two major components:
streamdown: a markdown renderer for the terminal, intended for consuming LLM output. It has affordances to make it easier to run the code snippets: no indentation, easy insertion in the clipboard, fzf access to previous items.
llmehelp: tools to slurp the current tmux text content (i.e. recent command output) as well as slurp the current zsh prompt (i.e. the command you're currently writing).
I think the idea is then you bounce between the LLM helping you and just having a normal shell/editor tmux session. The LLM has relevant context to your work without having to explicitly give it anything.
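As a rough illustration of the "slurp the current tmux text content" step, the sketch below shells out to `tmux capture-pane`. This is not necessarily how llmehelp does it; the function names and the 200-line default are made up for the example.

```python
import subprocess


def capture_pane_command(history_lines: int = 200) -> list[str]:
    """Build a tmux command that prints the current pane plus some scrollback."""
    # -p prints the capture to stdout; -S with a negative number starts
    # that many lines back in the pane history.
    return ["tmux", "capture-pane", "-p", "-S", f"-{history_lines}"]


def capture_pane_text(history_lines: int = 200) -> str:
    """Run the command from inside a tmux session and return the text."""
    result = subprocess.run(
        capture_pane_command(history_lines),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

The returned text can then be prepended to an LLM prompt as context, which is what lets the model see recent command output without you pasting it in.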
- kristopolous 9 months ago
  
  Basically.
  About 20 years ago I had a since-long-disappeared article called "The Great Productivity Mental Experiment" which we can extend now for the AI era:
  You've got 3 equally competent programmers with the same task, estimated to take on the order of days.
  #1 has no Internet access and only debuggers, source, and on-system documentation
  #2 has all that + search engines and the Internet
  #3 has all #2 + all the SOTA AI tools.
  They are all given the same task, and a timer starts.
  Who gets to "first run" the fastest? 90% success rate? 99.9%?
  The point of the exercise is the answer: "I don't know"
  Ergo there is no clear objective time saver.
  The next question is what would establish a clear victor without having to make a taxonomy of the tasks. We're looking for best time practice.
  The answer is workflow, engagement, and behavior.
  Current AI flow will get you to first run faster. But to the 99.9% pass? Without new flows it can't. It's a phenomenon called automation complacency and it can make bad workflows very costly.
  (The original point of the exercise was to point out how better tools don't fix bad practice and how frameworks, linters, the internet, stronger type systems ... these can either solve problems or create larger ones based on how you use them. There is no silver bullet, as Fred Brooks said in the 1980s.)
kristopolous 9 months ago

thanks. that's exactly the feedback I need! Appreciated. I put more screenshots here: https://github.com/kristopolous/llmehelp ... maybe that's clearer?

jarbus 9 months ago

Just wanted to say I love this, didn't know I needed this until now.

kristopolous 9 months ago
There's this new thing I'm currently working on. I have a tool that does a clean execvp of what you pass it through, as a total wrapper.
You can do ./tool "bash" and then open up nvim, emacs, do whatever, while the tool sits there passing things back and forth cleanly. Full modern terminal support.
Now here's the thing. You get context. Lots of it. Here's what it can do:
psql# <ctrl-x - the tool sees this, looks at the previous N I/O bytes and reverses video to symbolize it's in a mode> I need to join the users and accounts table <enter>
Then it knows from the PPID chain that you're in PostgreSQL, it knows the output of previous commands, and it sends that to an LLM, which processes it and gives you this:
psql# I need to join the users and accounts table [ select * from users as u ... (Y/n) ]
Then it shows it. Here's the nice thing. You're STILL IN THE MODE and now you have more context. You can get out of it at any time through another ctrl-x toggle.
This way it follows you throughout your session and you can selectively beckon the LLM at your leisure, typing in English where you need to.
SSH into a host and you're still in it. Stuck in a weird emacs mode? Just press the hotkey and the i/o gets redirected as you ask the LLM to get you out.
But more importantly this is generic. It's a tool that allows you to intercept terminal session context windows and inject middleware, generically and then tie it to hotkeys.
As a result it works with any shell, inside of tmux, outside, in the vscode terminal, wherever you want... and you can make as many tools for it as you want.
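For readers trying to picture the mechanism, here is a minimal Python sketch of the general technique: run the child under a pseudo-terminal and filter keystrokes through a hotkey toggle. It is not the tool described here. `HotkeyInterceptor` and `run_wrapped` are hypothetical names, and the sketch only collects the typed request instead of calling an LLM.

```python
import os
import pty

CTRL_X = b"\x18"  # the byte a terminal sends for Ctrl-X


class HotkeyInterceptor:
    """Forward keystrokes to the child, except while the hotkey mode is on."""

    def __init__(self) -> None:
        self.in_mode = False
        self.request = bytearray()        # text typed while in the mode
        self.completed: list[str] = []    # requests finished with Enter

    def feed(self, data: bytes) -> bytes:
        """Consume typed bytes and return only those meant for the child."""
        forwarded = bytearray()
        for value in data:
            byte = bytes([value])
            if byte == CTRL_X:
                # Toggle the mode; the hotkey itself never reaches the child.
                self.in_mode = not self.in_mode
                self.request.clear()
            elif not self.in_mode:
                forwarded += byte
            elif byte in (b"\r", b"\n"):
                # Enter ends one request; a real tool would call the LLM here.
                self.completed.append(
                    self.request.decode("utf-8", errors="replace")
                )
                self.request.clear()
            else:
                self.request += byte
        return bytes(forwarded)


def run_wrapped(argv: list[str]) -> int:
    """Run argv under a pseudo-terminal, filtering keystrokes through feed()."""
    interceptor = HotkeyInterceptor()

    def stdin_read(fd: int) -> bytes:
        # pty.spawn treats an empty result as EOF, so keep reading until
        # there is something to forward. Simplification: child output is
        # not serviced while this loop is swallowing keystrokes.
        while True:
            data = os.read(fd, 1024)
            if not data:
                return data
            forwarded = interceptor.feed(data)
            if forwarded:
                return forwarded

    return pty.spawn(argv, stdin_read=stdin_read)
```

Calling `run_wrapped(["bash"])` gives a normal shell where Ctrl-X diverts typing into `interceptor.request` until the next Ctrl-X. Because the wrapper owns the terminal side of the PTY, it sees both what you type and what the child prints, which is the context a pipe cannot provide. Echoing the diverted text and the reverse-video indicator are left out.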
I think it's fundamentally a new unix primitive. And I'm someone that researches this stuff (https://siliconfolklore.com/scale/ is a conference talk I gave last year).
If you know of anything else that's like this, please tell me. I haven't been able to find it.
Btw, you cannot do this through pipes: the input of the left process isn't available to the piped process on the right. You can intercept stdin, but you don't get the input file descriptor of the left process. The shell starts the two processes at the same time and then passes things through, so you can't even use PPID cleanly without heuristic guessing. Trust me, I tried doing things this way many times. That's why nothing else works like this; you need new tricks.
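The pipe limitation is easy to check with a small Python sketch that emulates `left | right`. The two inline programs are stand-ins made up for the demonstration: the left one uppercases its input, the right one just reports what it received.

```python
import subprocess
import sys

# Stand-in programs: left transforms its input, right echoes what it gets.
UPPERCASE = "import sys; sys.stdout.write(sys.stdin.read().upper())"
ECHO = "import sys; sys.stdout.write(sys.stdin.read())"


def run_pipeline(typed: bytes) -> bytes:
    """Emulate `left | right` and return everything the right process received."""
    left = subprocess.Popen(
        [sys.executable, "-c", UPPERCASE],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
    )
    right = subprocess.Popen(
        [sys.executable, "-c", ECHO],
        stdin=left.stdout,
        stdout=subprocess.PIPE,
    )
    left.stdout.close()  # leave right as the only reader of left's output
    left.stdin.write(typed)
    left.stdin.close()
    seen_by_right, _ = right.communicate()
    left.wait()
    return seen_by_right
```

Feeding `b"select 1\n"` into the pipeline, the right process receives only the uppercased output. The original keystrokes went to the left process's stdin, which the right side has no handle on.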
I intend to package this up and release it in the next few days.
- sshine 9 months ago
  
  Mind blown.
  This is so simple.
  It’s like rlwrap, but generic. So you could reimplement rlwrap with this.
  I’ve been experimenting with schemesh recently. It’s a shell with ask control structures in scheme. Amazing, but a little immature still. Having a scheme middleware would be stronger: I can have a full zsh where a control key drops me onto a Scheme interpreter.
  Now, how far up your process tree do you want to host it?
  
- jarbus 9 months ago
  
  Would need to re-read this a few more times to fully understand it, but very interested in the direction. Still struggling to wrap my mind around how it would work from nvim automatically, though. Excited to see what you've got