Comment by konart

13 hours ago

>Real work

This part should have featured something about real work. But instead it features a paragraph about one-shot bs that creates "something".

Unless your job is to create thousands of WordPress templates to sell, this is not "real work".

Give it a repository (any kind of OSS project will do as an example) and a GitHub issue requesting a new feature or describing a confirmed bug. (You can and probably should write a prompt for the LLM though, don't just provide the issue itself.)

And then watch it go.

And then judge the result and its quality.

Sorry, but in my experience 27B is just useless. You do get a result and sometimes it does work, but most of the time it is not even at junior dev level. And it takes a lot of time to do the thing, unless you have an extremely expensive machine.

If your expectation is to treat it as a coworker, then you're right.

If your expectation is to treat it as a tool, then you're wrong.

I guess that's where the disconnect lies.

  • Define "a tool" for me and we can talk.

    I already have tools for autocomplete, for working with structured data and for many other things. Deterministic tools.

    Obviously you do not expect something like that from a model with some harness. It can read some input (the user's or other tools') and give you some output.

    My expectation is that this tool, given some meaningful input (instructions, expectations, motivations and optional source files to work with), will produce something that is actually aligned with the input.

    For example: say I have a service that creates some sort of events now and then. I want those events to be available to other services. So I decide to give it a transactional outbox and an observer that pulls events from the outbox and puts them into a Kafka topic.

    My expectation is that I can give this tool some context (source code and description), state my instructions, expectations, motivations, design decisions and have an implementation as a result.

    My other expectation is that, given that my context etc. and the agent's context (skills etc.) were correct and adequate, the output will also be correct and adequate.
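
For illustration, the outbox-plus-observer setup described above could look roughly like the sketch below. It is not anyone's actual implementation: the `orders` and `outbox` tables, the `order-events` topic and the function names are all made up for the example, SQLite stands in for the service database, and `publish` is a placeholder callable where a real Kafka producer client would be wired in.

```python
import json
import sqlite3
from typing import Callable


def init_db(conn: sqlite3.Connection) -> None:
    # Hypothetical schema: one business table plus the outbox table.
    conn.executescript(
        """
        CREATE TABLE IF NOT EXISTS orders (
            id INTEGER PRIMARY KEY,
            item TEXT NOT NULL
        );
        CREATE TABLE IF NOT EXISTS outbox (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            topic TEXT NOT NULL,
            payload TEXT NOT NULL,
            published INTEGER NOT NULL DEFAULT 0
        );
        """
    )


def create_order(conn: sqlite3.Connection, item: str) -> int:
    # The business write and the outbox write share one transaction,
    # so either both are committed or neither is.
    with conn:
        cur = conn.execute("INSERT INTO orders (item) VALUES (?)", (item,))
        order_id = cur.lastrowid
        event = {"type": "order_created", "order_id": order_id, "item": item}
        conn.execute(
            "INSERT INTO outbox (topic, payload) VALUES (?, ?)",
            ("order-events", json.dumps(event)),
        )
    return order_id


def relay_once(
    conn: sqlite3.Connection,
    publish: Callable[[str, bytes], None],
    batch_size: int = 100,
) -> int:
    # The "observer": read unpublished events in insertion order,
    # hand each to the publisher, then mark it as published.
    rows = conn.execute(
        "SELECT id, topic, payload FROM outbox "
        "WHERE published = 0 ORDER BY id LIMIT ?",
        (batch_size,),
    ).fetchall()
    sent = 0
    for row_id, topic, payload in rows:
        # If publish raises, the row stays unpublished and is retried
        # on the next call.
        publish(topic, payload.encode("utf-8"))
        with conn:
            conn.execute("UPDATE outbox SET published = 1 WHERE id = ?", (row_id,))
        sent += 1
    return sent
```

Calling `relay_once` in a loop or on a timer gives at-least-once delivery: an event can be published twice if the process dies between the publish and the update, so consumers would need to tolerate duplicates. Whether a model produces something equivalent from a description like the one above is exactly the kind of thing the comment proposes to judge.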