Comment by beastman82

5 months ago

> No brainer if you're sitting on a >$100k inference server.

Sure, that's fair if you're aiming for state-of-the-art performance. Otherwise, you can get close on reasonably priced hardware by using smaller distilled and/or quantized variants of llama/r1.
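
For example (just a sketch, not from the original thread), something like this runs on a single consumer GPU or even CPU, assuming you've grabbed a quantized GGUF of one of the distilled R1 variants and the llama-cpp-python bindings; the model path and parameters below are placeholders:

```python
# Rough sketch: load a quantized, distilled R1-style model via llama-cpp-python.
# The model path is a placeholder; any GGUF build of a distilled variant works.
from llama_cpp import Llama

llm = Llama(
    model_path="DeepSeek-R1-Distill-Qwen-14B-Q4_K_M.gguf",  # placeholder path
    n_ctx=4096,        # context window
    n_gpu_layers=-1,   # offload all layers to GPU if one is available
)

out = llm(
    "Briefly explain why quantized models fit on cheaper hardware.",
    max_tokens=200,
)
print(out["choices"][0]["text"])
```

Roughly speaking, a 4-bit 14B model fits in around 10 GB of memory, which is a long way from a >$100k box.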

Really though, I just meant "it's a no-brainer that they are popular here on HN".