Comment by ritvikpandey21
15 days ago
We completely agree - mechanistic interpretability might help keep these language models in check, but it's going to be very difficult to run it on closed-source frontier models. I'm excited to see where that field progresses.