Comment by minimaxir
19 hours ago
Golden Gate Claude was two years ago and it's surprising there hasn't been as much research into targeted activations since.
There’s been some, but naive activation steering reliably makes models dumber, and training an SAE is a pretty heavy lift.