Comment by svara

3 months ago

... but not in deep learning, or am I missing something important here?

qeternity 3 months ago

Yes, absolutely in deep learning. Custom fused CUDA kernels everywhere.

Scene_Cast2 3 months ago

Yep. MoE, FlashAttention, or sparse retrieval architectures, for example.