Comment by BloondAndDoom
10 hours ago
This is a bit more akin to distill - https://github.com/samuelfaj/distill
The advantage of putting a small language model (SLM) in between is that some outputs cannot be compressed without losing context, so a small model does that job. It works, but most of these solutions still have some tradeoff in real-world applications.