Comment by qeternity
1 day ago
PyTorch is only part of it. There is still a huge amount of CUDA that isn’t just wrapped by PyTorch and isn’t easily portable.
... but not in deep learning, or am I missing something important here?
Yes, absolutely in deep learning. Custom fused CUDA kernels everywhere.
Yep. MoE, FlashAttention, or sparse retrieval architectures, for example.
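To make "custom fused kernels" concrete, here is a minimal, hypothetical sketch (not from this thread): a hand-written CUDA kernel that fuses a bias-add and a ReLU into a single pass over memory. The kernel name, shapes, and data are made up for illustration; real fusions like FlashAttention are far larger, but they live in hand-tuned CUDA in the same way, outside the framework's standard ops.

```cuda
// Illustrative sketch only: a tiny hand-fused bias-add + ReLU kernel.
// Without fusion these would typically be two kernels and two passes
// over global memory; the fusion does both in one read/write per element.
#include <cstdio>
#include <cuda_runtime.h>

__global__ void fused_bias_relu(const float* __restrict__ x,
                                const float* __restrict__ bias,
                                float* __restrict__ out,
                                int rows, int cols) {
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx < rows * cols) {
        // One global-memory round trip: add per-column bias, then clamp.
        float v = x[idx] + bias[idx % cols];
        out[idx] = v > 0.0f ? v : 0.0f;
    }
}

int main() {
    const int rows = 4, cols = 8, n = rows * cols;
    float hx[n], hb[cols], ho[n];
    for (int i = 0; i < n; ++i) hx[i] = (i % 2 ? 1.0f : -1.0f) * i;
    for (int j = 0; j < cols; ++j) hb[j] = 0.5f;

    float *dx, *db, *dout;
    cudaMalloc((void**)&dx, n * sizeof(float));
    cudaMalloc((void**)&db, cols * sizeof(float));
    cudaMalloc((void**)&dout, n * sizeof(float));
    cudaMemcpy(dx, hx, n * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemcpy(db, hb, cols * sizeof(float), cudaMemcpyHostToDevice);

    int threads = 128;
    int blocks = (n + threads - 1) / threads;
    fused_bias_relu<<<blocks, threads>>>(dx, db, dout, rows, cols);
    cudaMemcpy(ho, dout, n * sizeof(float), cudaMemcpyDeviceToHost);

    printf("out[0..3] = %f %f %f %f\n", ho[0], ho[1], ho[2], ho[3]);
    cudaFree(dx); cudaFree(db); cudaFree(dout);
    return 0;
}
```

The portability point in the thread follows from this: the fusion logic is expressed directly against CUDA's programming model rather than as framework-level graph ops, so it doesn't move to another backend just because PyTorch does.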