Comment by phi-go
1 day ago
Does this have a compute benefit, or could one use different specialized LLM architectures / models for the subnetworks?