I just followed the Quickstart[1] in the GitHub repo, refreshingly straightforward. Using the pip package worked fine, as did installing the editable version from the git repository. Just install the CUDA version of PyTorch[2] first.
The HF demo is very similar to the GitHub demo, so easy to try out.
That's for CUDA 12.8; change the PyTorch install command accordingly.
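For reference, a quick sanity check before installing anything else. This is just generic PyTorch, not something from the Qwen3-TTS repo; it confirms you actually got a CUDA build rather than the CPU-only wheel:

    # Generic PyTorch check, not specific to Qwen3-TTS.
    import torch

    print(torch.__version__)          # e.g. ends in +cu128 for a CUDA 12.8 wheel
    print(torch.version.cuda)         # CUDA version the wheel was built against; None on CPU-only builds
    print(torch.cuda.is_available())  # True if a usable GPU and driver are present

If the last line prints False, you most likely got the CPU-only wheel and need to reinstall from the CUDA index URL on the PyTorch site.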
Skipped FlashAttention since I'm on Windows and I haven't gotten FlashAttention 2 to work there yet (I found some precompiled FA3 files[3] but Qwen3-TTS isn't FA3 compatible yet).
[1]: https://github.com/QwenLM/Qwen3-TTS?tab=readme-ov-file#quick...
[2]: https://pytorch.org/get-started/locally/
[3]: https://windreamer.github.io/flash-attention3-wheels/
https://github.com/sdbds/flash-attention-for-windows/release... - FA2 binaries for you
It flat-out didn't work for me on mps. CUDA only until someone patches it.
The demo ran fine for me, if very slowly, on CPU only using "--device cpu". It defaults to CUDA though.
Try using mps, I guess; I saw multiple references in the code checking whether the device is not mps, so it seems like it should be supported. If not, fall back to CPU.
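Rough sketch of the fallback order I mean, in plain PyTorch (not the project's actual device-selection code):

    # Generic device selection, not Qwen3-TTS's own logic:
    # prefer CUDA, then Apple's MPS backend, then plain CPU.
    import torch

    if torch.cuda.is_available():
        device = "cuda"
    elif torch.backends.mps.is_available():
        device = "mps"
    else:
        device = "cpu"

    print(device)

Then pass that as --device (or move the model to it manually) and see what breaks.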