Comment by egorfine

1 day ago

Native MCP:

For Qwen 35B, enabling native MCP on MLX models slows token generation down by 10%.

For Qwen 27B, enabling native MCP on MLX models speeds up token generation by almost exactly 1.5x.

(All tested on an M5 Pro.)
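
For anyone comparing their own runs, here is a minimal sketch of how a "10% slower" and a "1.5x faster" figure fall out of two throughput readings (tokens/s with the feature off and on). The function name and the sample numbers are hypothetical placeholders, not the measurements from the tests above.

```python
def speed_change(baseline_tps: float, enabled_tps: float) -> tuple[float, float]:
    """Return (ratio, percent_change) of throughput relative to the baseline.

    baseline_tps: tokens/s with the feature disabled.
    enabled_tps:  tokens/s with the feature enabled.
    """
    if baseline_tps <= 0:
        raise ValueError("baseline_tps must be positive")
    ratio = enabled_tps / baseline_tps
    return ratio, (ratio - 1.0) * 100.0


def describe(baseline_tps: float, enabled_tps: float) -> str:
    """Format the change as e.g. '1.50x (+50.0%)'."""
    ratio, pct = speed_change(baseline_tps, enabled_tps)
    return f"{ratio:.2f}x ({pct:+.1f}%)"


if __name__ == "__main__":
    # Hypothetical readings, chosen only to illustrate the arithmetic.
    print(describe(50.0, 45.0))  # a 10% slowdown
    print(describe(40.0, 60.0))  # a 1.5x speedup
```

Reporting both the ratio and the percentage avoids the usual ambiguity between "1.5x faster" and "50% faster".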
