Comment by Jerry2

10 months ago

> I typically access the backend UIs provided by each LLM service, which serve as a light wrapper over the API functionality

Hey Max, do you use a custom wrapper to interface with the API, or is there an already-established client you like to use?

If anyone else has a suggestion, please let me know too.

I'm going to plug my own LLM CLI project here; I use it on a daily basis now for coding tasks like this one:

llm -m o4-mini -f github:simonw/llm-hacker-news -s 'write a new plugin called llm_video_frames.py which takes video:path-to-video.mp4 and creates a temporary directory which it then populates with one frame per second of that video using ffmpeg - then it returns a list of [llm.Attachment(path="path-to-frame1.jpg"), ...] - it should also support passing video:video.mp4?fps=2 to increase to two frames per second, and if you pass ?timestamps=1 or &timestamps=1 then it should add a text timestamp to the bottom right corner of each image with the mm:ss timestamp of that frame (or hh:mm:ss if more than one hour in) and the filename of the video without the path as well.' -o reasoning_effort high

Any time I use it like that, the prompt and response are logged to a local SQLite database.

More on that example here: https://simonwillison.net/2025/May/5/llm-video-frames/#how-i...
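For reference, a rough hand-written sketch of the core logic that prompt asks for (not the actual generated plugin): extract one frame per second with ffmpeg into a temporary directory and return a list of llm.Attachment objects, as the prompt describes. It assumes ffmpeg is on PATH, and the timestamp-overlay option is omitted for brevity.

    # Rough sketch, not the generated plugin: extract fps frames per second
    # with ffmpeg into a temp dir and return llm.Attachment objects as the
    # prompt describes. Assumes ffmpeg is on PATH; timestamp overlay omitted.
    import subprocess
    import tempfile
    from pathlib import Path

    import llm

    def video_frames(path: str, fps: int = 1) -> list:
        outdir = Path(tempfile.mkdtemp(prefix="llm_video_frames_"))
        subprocess.run(
            ["ffmpeg", "-i", path, "-vf", f"fps={fps}",
             str(outdir / "frame_%06d.jpg")],
            check=True,
            capture_output=True,
        )
        return [llm.Attachment(path=str(p))
                for p in sorted(outdir.glob("frame_*.jpg"))]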

  • Seconded, everyone should be using CLIs.

    Simon: while that example is impressive, it is also complicated and hard to read in an HN comment.

I was developing an open-source library for interfacing with LLMs in a provider-agnostic way (https://github.com/minimaxir/simpleaichat), and although it still works, I unfortunately haven't had the time to maintain it.

Nowadays, when writing code to interface with LLMs, I don't use client SDKs unless required; I just hit the HTTP endpoints directly with libraries such as requests and httpx. That also makes it easier to upgrade to async if needed.
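A minimal sketch of that pattern, assuming an OpenAI-compatible chat completions endpoint and an OPENAI_API_KEY environment variable (the model name is illustrative):

    # Hit the chat completions endpoint directly with httpx, no client SDK.
    import os
    import httpx

    API_URL = "https://api.openai.com/v1/chat/completions"
    HEADERS = {"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}"}

    def chat(prompt: str, model: str = "gpt-4o-mini") -> str:
        payload = {"model": model,
                   "messages": [{"role": "user", "content": prompt}]}
        r = httpx.post(API_URL, headers=HEADERS, json=payload, timeout=60)
        r.raise_for_status()
        return r.json()["choices"][0]["message"]["content"]

    # Upgrading to async is mostly a client swap:
    async def chat_async(prompt: str, model: str = "gpt-4o-mini") -> str:
        payload = {"model": model,
                   "messages": [{"role": "user", "content": prompt}]}
        async with httpx.AsyncClient(timeout=60) as client:
            r = await client.post(API_URL, headers=HEADERS, json=payload)
            r.raise_for_status()
            return r.json()["choices"][0]["message"]["content"]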

Most services have a "studio mode" for the models they serve.

As an alternative, you could always use OpenWebUI.

I built an open-source CLI coding agent for this purpose[1]. It combines Claude/Gemini/OpenAI models in a single agent, using the best or most cost-effective model for different steps in the workflow and for different context sizes. You might find it interesting.

It uses OpenRouter for the API layer to simplify using APIs from multiple providers, though I'm also working on direct integration of model-provider API keys and should release it this week.

1 - https://github.com/plandex-ai/plandex
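For context on the OpenRouter approach: it exposes an OpenAI-compatible endpoint, so the same direct-HTTP pattern sketched above can route to models from different providers just by changing the model slug. This is only an illustration of the idea, not Plandex's actual code; the model slugs and the OPENROUTER_API_KEY variable are assumptions.

    # Illustrative only (not Plandex's code): one endpoint, many providers.
    import os
    import httpx

    OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"
    HEADERS = {"Authorization": f"Bearer {os.environ['OPENROUTER_API_KEY']}"}

    def complete(model: str, prompt: str) -> str:
        r = httpx.post(
            OPENROUTER_URL,
            headers=HEADERS,
            json={"model": model,
                  "messages": [{"role": "user", "content": prompt}]},
            timeout=120,
        )
        r.raise_for_status()
        return r.json()["choices"][0]["message"]["content"]

    # A cheaper model for planning, a stronger one for the heavier step
    # (example slugs, not Plandex's actual routing):
    plan = complete("openai/gpt-4o-mini", "Outline the steps to add a --json flag.")
    code = complete("anthropic/claude-3.5-sonnet", "Implement step 1:\n" + plan)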