Comment by wizzwizz4
6 months ago
> As a first try, I just copy+pasted the whole library and my whole program into GPT 4.1 and told it to rewrite it using the library.
That's a translation task. Transformer models are excellent at translation tasks (and, for the same reasons, half-decent at fuzzy search and compression); that's basically all they can do. But decoder-only generative models tend to be worse at translation than encoder-decoder seq2seq models.
So the fact that a GPT model can one-shot this correspondence, given a description of the library, suggests there's a better way to wire up a transformer model that'd be way more powerful. Unfortunately, this isn't my field, so I'm not familiar with the literature and don't know which approaches would be promising.
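To make the generative vs. seq2seq distinction concrete, here's a minimal sketch (mine, not from the comment) that runs the same English sentence through an encoder-decoder translation model and a decoder-only model prompted to translate. The checkpoints ("Helsinki-NLP/opus-mt-en-de", "gpt2") are just illustrative Hugging Face models; any comparable pair would show the same structural difference.

```python
# Sketch: encoder-decoder (seq2seq) translation vs. decoder-only prompting.
from transformers import (
    AutoTokenizer,
    AutoModelForSeq2SeqLM,
    AutoModelForCausalLM,
)

text = "The library exposes a streaming API."

# Seq2seq: the encoder reads the whole source, the decoder emits the target.
s2s_tok = AutoTokenizer.from_pretrained("Helsinki-NLP/opus-mt-en-de")
s2s = AutoModelForSeq2SeqLM.from_pretrained("Helsinki-NLP/opus-mt-en-de")
out = s2s.generate(**s2s_tok(text, return_tensors="pt"), max_new_tokens=40)
print(s2s_tok.decode(out[0], skip_special_tokens=True))

# Decoder-only: translation is just next-token prediction on a prompt,
# so source and target share one context window.
lm_tok = AutoTokenizer.from_pretrained("gpt2")
lm = AutoModelForCausalLM.from_pretrained("gpt2")
prompt = f"English: {text}\nGerman:"
out = lm.generate(**lm_tok(prompt, return_tensors="pt"), max_new_tokens=40)
print(lm_tok.decode(out[0], skip_special_tokens=True))
```

The point of the sketch is architectural, not about quality of these particular checkpoints: in the seq2seq model the source/target split is built into the wiring, while the decoder-only model has to recover it from the prompt.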