Comment by zozbot234

1 month ago

The weights likely won't be released for this model, since it is part of the Max series, which has always been closed. The most "open" you get is the API.

5 comments


storystarling 1 month ago

The closed nature is one thing, but the opaque billing on reasoning tokens is the real dealbreaker for integration. If you are bootstrapping a service, I don't see how you can model your margins when the API decides arbitrarily how long to think and bill for a prompt. It makes unit economics impossible to predict.

TobTobXX 1 month ago
Doesn't ClosedAI do the same? Its thinking models bill for the thinking tokens, but the thinking steps themselves are encrypted.
- Rastonbury 1 month ago
  
  Destroying unit economics is a bit dramatic... you can choose the thinking effort for modern models/APIs and add guidance to the system prompt.
czl 1 month ago

FYI: Newer LLM hosting APIs offer control over the amount of "thinking" (as well as the length of the reply): some by token count, others by an enum (low, medium, high, etc.).
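
To make the two control styles concrete, here is a rough sketch that resolves either one into a single upper bound on thinking tokens. Everything in it is hypothetical: the `Effort` enum, the `ASSUMED_CAPS` numbers, and `thinking_cap` are made up for illustration and do not correspond to any particular provider's API.

```python
from enum import Enum
from typing import Union


class Effort(Enum):
    """Hypothetical effort levels, standing in for a provider's enum."""
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


# Assumed caps per effort level. Real providers may not publish such
# numbers, so treat these as placeholders you would measure yourself.
ASSUMED_CAPS = {Effort.LOW: 1_000, Effort.MEDIUM: 4_000, Effort.HIGH: 16_000}


def thinking_cap(setting: Union[int, Effort]) -> int:
    """Resolve a token-count or enum setting into a max thinking-token count."""
    if isinstance(setting, Effort):
        return ASSUMED_CAPS[setting]
    if setting < 0:
        raise ValueError("token budget must be non-negative")
    return setting
```

With an explicit token count the bound is exact; with an enum you only get whatever bound you have assumed or observed for that level.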
zozbot234 1 month ago

You just have to plan for the worst case.
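
A back-of-the-envelope sketch of what planning for the worst case could look like, assuming thinking tokens are billed at the output-token rate and that you can cap both thinking and reply length. The function name, parameters, and prices are all illustrative assumptions, not any provider's actual pricing.

```python
def worst_case_cost(
    input_tokens: int,
    max_thinking_tokens: int,
    max_output_tokens: int,
    input_price_per_mtok: float,
    output_price_per_mtok: float,
) -> float:
    """Upper bound on the cost of one request, in the same currency as the prices.

    Assumption: thinking tokens are billed like output tokens.
    Prices are given per million tokens.
    """
    billed_output = max_thinking_tokens + max_output_tokens
    total = input_tokens * input_price_per_mtok + billed_output * output_price_per_mtok
    return total / 1_000_000


# Illustrative numbers only: 2,000 input tokens, thinking capped at 8,000,
# reply capped at 1,000, at 1.00 and 4.00 per million tokens.
cost = worst_case_cost(2_000, 8_000, 1_000, 1.00, 4.00)
print(f"worst case per request: {cost:.3f}")  # prints 0.038
```

If the price you charge per request covers this bound, any request that thinks for less is extra margin.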