Comment by wizzwizz4
6 months ago
> As a first try, I just copy+pasted the whole library and my whole program into GPT 4.1 and told it to rewrite it using the library.
That's a translation task. Transformer models are excellent at translation tasks (and, for the same reasons, half-decent at fuzzy search and compression); that's basically all they can do. But decoder-only generative models tend to be worse at translation than encoder-decoder seq2seq models.
So the fact that a GPT model can one-shot this correspondence, given a description of the library, suggests there's a better way to wire up a transformer model that'd be way more powerful. Unfortunately, this isn't my field, so I'm not familiar with the literature and don't know which approaches would be promising.
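To make the generative vs. seq2seq distinction concrete, here's a minimal sketch (mine, not from the comment) that runs the same English sentence through an encoder-decoder translation model and a decoder-only model prompted to translate. The checkpoints ("Helsinki-NLP/opus-mt-en-de", "gpt2") are just illustrative Hugging Face models; any comparable pair would show the same structural difference.

```python
# Sketch: encoder-decoder (seq2seq) translation vs. decoder-only prompting.
from transformers import (
    AutoTokenizer,
    AutoModelForSeq2SeqLM,
    AutoModelForCausalLM,
)

text = "The library exposes a streaming API."

# Seq2seq: the encoder reads the whole source, the decoder emits the target.
s2s_tok = AutoTokenizer.from_pretrained("Helsinki-NLP/opus-mt-en-de")
s2s = AutoModelForSeq2SeqLM.from_pretrained("Helsinki-NLP/opus-mt-en-de")
out = s2s.generate(**s2s_tok(text, return_tensors="pt"), max_new_tokens=40)
print(s2s_tok.decode(out[0], skip_special_tokens=True))

# Decoder-only: translation is just next-token prediction on a prompt,
# so source and target share one context window.
lm_tok = AutoTokenizer.from_pretrained("gpt2")
lm = AutoModelForCausalLM.from_pretrained("gpt2")
prompt = f"English: {text}\nGerman:"
out = lm.generate(**lm_tok(prompt, return_tensors="pt"), max_new_tokens=40)
print(lm_tok.decode(out[0], skip_special_tokens=True))
```

The point of the sketch is architectural, not about quality of these particular checkpoints: in the seq2seq model the source/target split is built into the wiring, while the decoder-only model has to recover it from the prompt.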