catlifeonmars · 1 month ago
Doesn't that apply to the OP as well?

verdverm · 1 month ago
Yes, I'm not going to fill my precious context with documentation for a programming language. This seems like a research dead end to me; the fundamentals are not there.

catlifeonmars · 1 month ago
It seems kind of silly that you can't teach an LLM new tricks though, doesn't it? This doesn't sound like an intrinsic limitation and more an artifact of how we produce model weights today.

verdverm · 1 month ago
Getting tricks embedded into the weights is expensive; it doesn't happen in a single pass. That's why we teach them new tricks on the fly (in-context learning) with instruction files.

7 replies →
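The "instruction files" approach mentioned above can be sketched minimally: rather than fine-tuning weights, the relevant documentation is prepended to each prompt so the model can use it in-context. This is a hypothetical illustration (the `build_prompt` helper and the docs snippet are made up for the example), not any particular tool's implementation.

```python
def build_prompt(instruction_text: str, user_query: str) -> str:
    """In-context learning sketch: prepend instruction/doc text to the query
    so the model sees it at inference time, with no weight updates."""
    return f"{instruction_text}\n\n---\n\nUser question: {user_query}"

# Hypothetical instruction-file contents for a niche programming language.
docs = "CUE is a data validation language; constraints and data unify."
prompt = build_prompt(docs, "How do I define a constraint in CUE?")
```

The trade-off being debated is exactly this: the documentation occupies context-window tokens on every request, whereas embedding the same knowledge in the weights requires an expensive training pass.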