Comment by ianhawes
5 months ago
> Include the beta header output-128k-2025-02-19 in your API request to increase the maximum output token length to 128k tokens for Claude 3.7 Sonnet.
This is pretty big! Previously, most models could accept massive inputs but were restricted to 4096 or 8192 output tokens.
In practice this amounts to a cost-saving measure: you could already generate arbitrarily many tokens by appending the output and re-invoking the model, but every re-invocation sends the accumulated context as input again.
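
To make the append-and-re-invoke point concrete, here is a rough sketch of that loop. Everything in it is hypothetical: `call_model` is a stand-in for whatever API client you use (not a real library call), the stub model just emits a fixed text in capped chunks, and characters stand in for tokens. It only illustrates how the input sent grows when the output cap is small.

```python
from typing import Callable, Tuple

# Hypothetical interface: takes the full context, returns (chunk, finished).
ModelFn = Callable[[str], Tuple[str, bool]]


def generate_long(prompt: str, call_model: ModelFn, max_rounds: int = 10) -> Tuple[str, int]:
    """Append each chunk to the context and re-invoke until the model finishes.

    Returns the full output and the total input size sent across all calls.
    """
    output = ""
    input_sent = 0
    for _ in range(max_rounds):
        context = prompt + output      # the whole history is re-sent every round
        input_sent += len(context)
        chunk, finished = call_model(context)
        output += chunk
        if finished:
            break
    return output, input_sent


def make_stub_model(prompt: str, target: str, limit: int) -> ModelFn:
    """Fake model (assumption for the demo): emits `target`, at most `limit` chars per call."""
    def call_model(context: str) -> Tuple[str, bool]:
        already_written = context[len(prompt):]
        remaining = target[len(already_written):]
        return remaining[:limit], len(remaining) <= limit
    return call_model


prompt = "P" * 100
target = "x" * 1000

# Small output cap: four calls, each re-sending the prompt plus prior output.
out_small, sent_small = generate_long(prompt, make_stub_model(prompt, target, 250))
# Large output cap: one call, the prompt is sent once.
out_big, sent_big = generate_long(prompt, make_stub_model(prompt, target, 1000))
```

With these toy numbers the capped run sends 1900 characters of input across four calls, versus 100 for the single uncapped call, for the same 1000 characters of output.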