Comment by xiphias2
20 hours ago
I guess they will upload it later; it seems like an honest mistake to me.
Anyway, the SwiTransformer paper looks interesting, and so does doing post-training to optimize for it.
Looks like they couldn't find the correct weights. https://x.com/IplanRio_rj/status/2066693494769348946