Comment by Yemoshino
16 hours ago
I found a few papers in this direction with Perplexity, like this one https://ceur-ws.org/Vol-4005/paper1.pdf, and it doesn't seem to be that relevant for now.
The progress of a handful of models seems to be so much better (because of limited compute, we only have a handful of big ones, I presume) that these finetunings are just not yet relevant.
I'm also curious what an English + Java + HTML + CSS + JavaScript-only model would look like in terms of size and speed, for example.
Unfortunately, whenever I ask myself the question of finetuning tokens (just a few days ago this question came up again), deep diving takes too much time.
Claude only got LSP support in November, I think. And it's not even clear to me to what extent. So despite the feeling that we are moving fast, tons of basic ideas haven't even made it in yet.
If you have a corpus of code snippets to train the manifold (Laplacian) on (and a good embedding model), it is definitely possible to try something like this.
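To make that concrete, here is a minimal sketch of the "train a Laplacian on embeddings" step, assuming the embedding model has already turned each code snippet into a vector (the random vectors below are stand-ins for real embeddings). It builds a k-NN similarity graph, forms the normalized graph Laplacian, and uses its low eigenvectors as manifold coordinates (Laplacian eigenmaps). The function name and parameters are my own choices, not from any particular library:

```python
import numpy as np

def laplacian_eigenmaps(X, k=5, n_components=2):
    """Toy Laplacian eigenmaps on embedding vectors X (n x d).

    Builds a symmetric k-NN graph with Gaussian weights, forms the
    normalized graph Laplacian, and returns the eigenvectors of the
    smallest non-trivial eigenvalues as manifold coordinates.
    """
    n = X.shape[0]
    # pairwise squared distances between all embeddings
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    sigma = np.median(sq[sq > 0]) + 1e-12        # bandwidth heuristic
    W = np.exp(-sq / sigma)                      # Gaussian affinities
    np.fill_diagonal(W, 0.0)
    # keep only the k strongest edges per node, then symmetrize
    drop = np.argsort(-W, axis=1)[:, k:]
    for i in range(n):
        W[i, drop[i]] = 0.0
    W = np.maximum(W, W.T)
    d = W.sum(1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(d, 1e-12))
    # normalized Laplacian: L = I - D^{-1/2} W D^{-1/2}
    L = np.eye(n) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    vals, vecs = np.linalg.eigh(L)
    # skip the trivial eigenvector (eigenvalue ~ 0)
    coords = vecs[:, 1:1 + n_components] * d_inv_sqrt[:, None]
    return vals, coords

# stand-in for "embed a corpus of snippets"
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(30, 8))
vals, coords = laplacian_eigenmaps(embeddings, k=5, n_components=2)
```

In practice you would feed in real snippet embeddings and many more points; the point is just that the Laplacian side of the idea is a few dozen lines once the embedding model exists.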