In the case of those big 'foundation models': Fine-tune for whom and how? I doubt it is possible to fine-tune things like this in a way that satisfies all audiences and training set instances. Much of this is probably due to the training set itself containing a lot of propaganda (advertising) or just bad style.
I'm pretty sure Mistral is doing fine-tuning for their enterprise clients. OpenAI and Anthropic are probably not?
I'm thinking more of startups doing the fine-tuning.
Seems more like the kind of thing you would handle with prompting.
I can totally see someone taking that page and throwing it into whatever bot and going "Make up a comprehensive style guide that does the opposite of whatever is mentioned here".
https://github.com/blader/humanizer
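For what it's worth, a minimal sketch of that prompting approach: invert the article's observations into a system prompt and ask the model to rewrite text under those rules. The prompt wording and model name below are my own assumptions for illustration, not taken from the linked repo.

    # Sketch: steer style via a system prompt instead of fine-tuning.
    # Assumes the openai Python SDK and OPENAI_API_KEY in the environment;
    # the style rules and model name are illustrative placeholders.
    from openai import OpenAI

    client = OpenAI()

    # An "anti style guide" distilled from the article, inverted into instructions.
    STYLE_RULES = (
        "Avoid the telltale LLM register: no 'delve', no 'it's important to note', "
        "no rule-of-three lists, no moralizing wrap-up paragraph. "
        "Vary sentence length and keep the tone plain."
    )

    def rewrite(text: str) -> str:
        # Ask the model to rewrite the input while following the style rules.
        response = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[
                {"role": "system", "content": STYLE_RULES},
                {"role": "user", "content": f"Rewrite this in that style:\n\n{text}"},
            ],
        )
        return response.choices[0].message.content

Whether this actually moves the needle is exactly what the reply below disputes.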
That's not quite how that works, though. It's possible, for example, that fine-tuning a model to avoid the styles described in the article causes the LLM to stop functioning as well as it could. It might just be an artefact of the architecture itself that, to be effective, it has to follow these rules. If it were as easy as providing data and having the LLM 'encode' that as a rule, we would be advancing much faster than we currently are.