Comment by throwa356262

1 day ago

Better performance than TQ and better quality than FP16?

Am I reading this right??

any divergence (even if the benchmark is better) from full precision is error

  • Just pretend that it is the next step update when training. You didn’t train your model to step=inf, I hope?