svara 1 day ago
... but not in deep learning or am I missing something important here?

qeternity 1 day ago
Yes, absolutely in deep learning. Custom fused CUDA kernels everywhere.

Scene_Cast2 20 hours ago
Yep. MoE, FlashAttention, or sparse retrieval architectures for example.