Comment by vichle 3 hours ago
What type of hardware do I need to run a small model like this? I don't do Apple.
bodegajed 3 hours ago
1.5B models can run on CPU inference at around 12 tokens per second, if I remember correctly.
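For reference, a minimal sketch of CPU-only inference with llama-cpp-python. The model path, quantization, and thread count are assumptions for illustration, not from the comment, and actual throughput depends heavily on your CPU:

    # CPU-only inference sketch with llama-cpp-python.
    # Assumes a quantized ~1.5B GGUF file exists at this (hypothetical) path.
    import time
    from llama_cpp import Llama

    llm = Llama(
        model_path="models/qwen2.5-coder-1.5b-q4_k_m.gguf",  # hypothetical path
        n_ctx=4096,      # context window
        n_threads=8,     # tune to your physical core count
        n_gpu_layers=0,  # force pure CPU inference
        verbose=False,
    )

    start = time.perf_counter()
    out = llm("Write a Python function that reverses a string.", max_tokens=128)
    elapsed = time.perf_counter() - start

    # Rough number: includes prompt processing time, not just generation.
    n_tokens = out["usage"]["completion_tokens"]
    print(f"{n_tokens} tokens in {elapsed:.1f}s -> {n_tokens / elapsed:.1f} tok/s")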
moffkalast 3 hours ago
Ingesting multiple code files will take forever in prompt processing without a GPU, though; tg (token generation) will be the least of your worries. Especially when you don't just append but change the prompt in random places, so caching doesn't work.
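To make the caching point concrete, here is a rough sketch using llama-cpp-python's prompt cache: reuse of the key-value cache only applies to a shared token prefix, so appending is cheap while an edit near the start forces the whole prompt to be reprocessed. The model path is the same hypothetical one as above, and exact timings will vary by hardware:

    # Sketch of why mid-prompt edits defeat prompt caching.
    import time
    from llama_cpp import Llama, LlamaRAMCache

    llm = Llama(
        model_path="models/qwen2.5-coder-1.5b-q4_k_m.gguf",  # hypothetical path
        n_ctx=8192, n_threads=8, n_gpu_layers=0, verbose=False,
    )
    llm.set_cache(LlamaRAMCache())  # keep KV state between calls

    # Stand-in for a large source file.
    code = "\n".join(f"def f{i}(x): return x + {i}" for i in range(400))

    def prefill_time(prompt):
        t = time.perf_counter()
        llm(prompt, max_tokens=1)  # 1 token out, so time ~= prompt processing
        return time.perf_counter() - t

    print("cold ingest:       ", prefill_time(code + "\n# explain f3"))
    print("append (cache hit):", prefill_time(code + "\n# explain f3 in detail"))
    # An early edit changes token 0, so nothing of the cached prefix matches
    # and the whole file is reprocessed even though most of it is unchanged.
    print("early edit (miss): ", prefill_time("# v2\n" + code + "\n# explain f3"))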
jychang 3 hours ago
A 1.54GB model? You can run this on a Raspberry Pi.