Comment by phi-go
18 hours ago
Does this have a compute benefit or could one use different specialized LLM architectures / models for the subnetworks?