Comment by gabriel666smith

6 days ago

Ah, got you! That makes sense, and makes it much clearer. Thanks. In that case, I totally retract the criticism in my previous comment. Appreciate it.

So - just so I completely understand - the variant we're calling 897ccd25-4776-4077-a9e6-0da34abb32a4 emerged during batch 5 and doesn't have a parent in any prior batch? It's very interesting to compare the iterations.

I currently run some very similar scripts in one of my own workflows. I understand that making LLMs do 'good creative writing' wasn't necessarily the point here - perhaps the goal was solely to prove that LLMs can improve their own work according to their own (prompted) metrics - but the blog post is right to point out that there's a huge limitation around prompt sensitivity (not to mention the subjectivity of judging art).

As a human using LLMs to create work that suits my own (naturally subjective) tastes and preferences, I currently get around this issue by giving feedback on variants manually, having the LLM update its own prompt (much like a cursorrules file, but for prose) based on that feedback, and only then generating new variants to be fed back on, and so on.

It's extremely hard to one-shot-prompt everything you do or don't like in writing, but by asking the LLM to iterate on its own instructions this way, you can build a really beefy ruleset - which even tiny LLMs are very good at following - incredibly quickly. A rough sketch of the loop is below.
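In case it helps, here's a minimal sketch of that loop. Everything in it is illustrative rather than taken from the project: `generate` is a hypothetical stand-in for whatever LLM call you already make, and `prose_rules.txt` is an assumed filename for the cursorrules-style ruleset.

```python
def generate(prompt: str) -> str:
    """Hypothetical stand-in: wire this to whatever LLM call you already use."""
    raise NotImplementedError

RULES_FILE = "prose_rules.txt"  # assumed filename for the ruleset

def load_rules() -> str:
    try:
        with open(RULES_FILE) as f:
            return f.read()
    except FileNotFoundError:
        return "Write clear, vivid prose."  # seed ruleset for the first run

def save_rules(rules: str) -> None:
    with open(RULES_FILE, "w") as f:
        f.write(rules)

def iterate(brief: str) -> None:
    rules = load_rules()
    # Generate a variant under the current ruleset.
    variant = generate(f"{rules}\n\nTask: {brief}")
    print(variant)
    feedback = input("Feedback on this variant: ")
    # Fold the feedback into the ruleset itself, rather than patching the draft.
    updated = generate(
        "Here is a ruleset for prose style:\n"
        f"{rules}\n\n"
        "A reader gave this feedback on text written under that ruleset:\n"
        f"{feedback}\n\n"
        "Rewrite the ruleset so future drafts satisfy the feedback. "
        "Return only the updated ruleset."
    )
    save_rules(updated)

if __name__ == "__main__":
    iterate("Write a 200-word opening paragraph for a short story.")
```

The key design choice is that feedback never patches the individual draft; it gets folded into the ruleset, so every future generation benefits from it.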

Like I said, I'm not sure whether your goal is to 'prove improvement is possible' or to 'create a useful creative writing assistant', but if it's the latter, that's the technique that has created the most value for me personally over the last couple of years. Sharing in case it's useful.

Grats on the cool project!