Comment by winwang

1 year ago

Depends on if we can interpret the final hidden layer. It's plausible we evolve models to _have_ interpretable (final/reasoning) hidden layers, just that they aren't constrained to the (same representation of) input/output domains (i.e. tokens).

0 comments