Comment by rahimnathwani

1 year ago

Wow! Only $2k with no quantization.

  hit between 4.25 to 3.5 TPS (tokens per second) on the Q4 671b full model