In the case of those big 'foundation models': Fine-tune for whom and how? I doubt it is possible to fine-tune things like this in a way that satisfies all audiences and training set instances. Much of this is probably due to the training set itself containing a lot of propaganda (advertising) or just bad style.
I'm pretty sure Mistral is doing fine-tuning for their enterprise clients. OpenAI and Anthropic are probably not?
I'm thinking more of startups doing the fine-tuning.
Seems more like the kind of thing you would handle with prompting.
I can totally see someone taking that page and throwing it into whatever bot and going "Make up a comprehensive style guide that does the opposite of whatever is mentioned here".
https://github.com/blader/humanizer
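For what it's worth, a minimal sketch of that prompting approach: invert the article's observations into a system prompt and ask the model to rewrite text under those rules. The prompt wording and model name below are my own assumptions for illustration, not taken from the linked repo.

    # Sketch: steer style via a system prompt instead of fine-tuning.
    # Assumes the openai Python SDK and OPENAI_API_KEY in the environment;
    # the style rules and model name are illustrative placeholders.
    from openai import OpenAI

    client = OpenAI()

    # An "anti style guide" distilled from the article, inverted into instructions.
    STYLE_RULES = (
        "Avoid the telltale LLM register: no 'delve', no 'it's important to note', "
        "no rule-of-three lists, no moralizing wrap-up paragraph. "
        "Vary sentence length and keep the tone plain."
    )

    def rewrite(text: str) -> str:
        # Ask the model to rewrite the input while following the style rules.
        response = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[
                {"role": "system", "content": STYLE_RULES},
                {"role": "user", "content": f"Rewrite this in that style:\n\n{text}"},
            ],
        )
        return response.choices[0].message.content

Whether this actually moves the needle is exactly what the reply below disputes.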
That's not quite how that works, though. It's possible, for example, that fine-tuning a model to avoid the styles described in the article causes the LLM to stop functioning as well as it could. It might just be an artefact of the architecture itself that, to be effective, it has to follow these rules. If it were as easy as providing data and having the LLM 'encode' that as a rule, we would be advancing much faster than we currently are.