The "M" nomenclature has been around since at least BERT and T5/FLAN. It's valid to use it even if today's LLM devs are more familiar with billion-scale models.
I was so confused by many comments in this post but thanks to you I realized that some people are apparently reading it as 26B and that's why their comments make no sense.
Haha, we were trying not to be too hand-wavy :)
Oh hey it's Henry. I met you a couple weeks ago at an event in SF. Nice to see you on here.
[flagged]
Can you please make your substantive points without sharp elbows? We're trying for something different here, and would appreciate it if you'd post in the intended spirit.
https://news.ycombinator.com/newsguidelines.html
I’d edit it if I could, but it seems to be past the timeout.
As the other poster noted, the post wasn’t meant to be read as a personal attack.
Pardon me, do I know you?
Why are you attacking me?
I don't think they're attacking you, but suggesting you read more carefully. The information provided is correct and clear, but you need to let go of your own biases when consuming it.
I personally prefer the M to the B. I guess as an engineer, noticing the units comes pretty naturally.
I read it as 26B as well.