Comment by nomel

2 hours ago

Like I said, it would be neat if someone benchmarked it. It's definitely an anecdote.

Try it though. If it's context rot, then I don't think the weekend reset I mentioned should work? For me, it very reliably does. Or maybe the weekend reset is just putting the current context into a more "productive" latent space. But if that's possible, then that would suggest it was previously in a less productive space?

Maybe a test would be to ask the LLM what time it thinks it is, or just whether it's tired, once per session, across sessions of different lengths (not repeatedly within the same session, since that could pollute the context), to see if there's any relation between session length and how often it gives a late/tired type of response?
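
A minimal sketch of how the results of such a test could be tallied, assuming you have already collected one probe response per session along with that session's length in turns. The keyword list, the function names, and the bucket size are all hypothetical, and substring matching is a crude stand-in for a real classifier (it would, for instance, count "not tired" as tired).

```python
from collections import defaultdict

# Hypothetical markers of a "late/tired" answer; purely illustrative.
TIRED_MARKERS = ("tired", "late", "long day", "wrap up")


def looks_tired(response: str) -> bool:
    """Crude check: does the response contain any tired/late marker?"""
    text = response.lower()
    return any(marker in text for marker in TIRED_MARKERS)


def tired_rate_by_length(samples, bucket_size=10):
    """Fraction of tired-looking responses per session-length bucket.

    samples: iterable of (turns_in_session, response_text), with exactly
    one probe per session so the probe itself cannot pollute the context.
    """
    counts = defaultdict(lambda: [0, 0])  # bucket start -> [tired, total]
    for turns, response in samples:
        bucket = (turns // bucket_size) * bucket_size
        counts[bucket][0] += looks_tired(response)
        counts[bucket][1] += 1
    return {b: tired / total for b, (tired, total) in sorted(counts.items())}
```

If the rate climbs with the bucket, that would at least be consistent with the anecdote. It would take many sessions per bucket before the numbers mean anything.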

Again, I'm sure all this will go away. They're getting good at beating these "unhelpful" statistics out of the base models.
