decide1000, 6 months ago:
I use it on a 24 GB Tesla P40 GPU. Very happy with the result.

hkt, 6 months ago:
Out of interest, roughly how many tokens per second do you get on that?

edude03, 6 months ago:
Like 4. Definitely single digit. The P40s are slow af

coolspot, 6 months ago:
The P40 has a memory bandwidth of 346 GB/s, which means it should be able to do around 14+ t/s running a 24 GB model+context.
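
For reference, the 14+ t/s figure follows from a rough memory-bound view of decoding: each generated token needs the whole model plus context read from VRAM once, so throughput can't exceed bandwidth divided by the bytes read per token. A minimal sketch of that arithmetic, assuming this simplified model (the function name is made up for illustration). It gives a theoretical ceiling only and ignores compute limits and kernel efficiency, which may be part of why measured numbers come in lower:

```python
def max_tokens_per_second(bandwidth_gb_s: float, gb_read_per_token: float) -> float:
    """Upper bound on decode speed if generation is purely memory-bandwidth bound.

    Assumes every token requires reading the full model + context once.
    """
    return bandwidth_gb_s / gb_read_per_token


# Figures quoted in the thread
p40_bandwidth_gb_s = 346.0      # Tesla P40 memory bandwidth
model_plus_context_gb = 24.0    # model + context filling the 24 GB card

upper_bound = max_tokens_per_second(p40_bandwidth_gb_s, model_plus_context_gb)
print(f"{upper_bound:.1f} t/s")  # 14.4 t/s
```

A smaller model moves the ceiling up proportionally, e.g. 12 GB of weights+context would put it around twice that under the same assumption.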