Comment by yogurt-male
15 hours ago
Mostly out of reach. There is a ton of research on figuring out how to do this coming out every day, including both proposals of new ways to do things and (often strong) critiques of old or recently proposed ways of doing things. Interpretability (esp. for large, modern models) is very, very far from being a solved problem.
No comments yet
Contribute on Hacker News ↗