Comment by radarsat1

1 year ago

I think this may be the best explanation I've seen on the topic!

But shouldn't that situation be handled, at least somewhat, by backtracking sampling techniques like beam search? Then again, maybe that isn't used much in practice because it's more expensive... I don't know.

Thanks!

I'm not sure beam search would help much, since it just combines word probabilities rather than assessing overall meaning. But something like "tree of thoughts" presumably would, since I doubt these models would rate their own hallucinated outputs too highly!
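
To illustrate what I mean by "combining word probabilities": here's a toy beam search sketch (the step() function and the tiny vocabulary are made up for illustration). Candidates are ranked purely by their summed token log-probabilities, so nothing in the scoring reflects whether the resulting text is actually grounded.

```python
import math

def step(prefix):
    """Hypothetical model step: returns (token, probability) pairs for the next token."""
    # In a real model this would be a forward pass; here it's just a stub.
    toy = {
        (): [("the", 0.6), ("a", 0.4)],
        ("the",): [("cat", 0.5), ("moon", 0.5)],
        ("a",): [("dog", 0.7), ("unicorn", 0.3)],
    }
    return toy.get(prefix, [("<eos>", 1.0)])

def beam_search(beam_width=2, max_len=3):
    # Each beam is (token_tuple, cumulative_log_prob).
    beams = [((), 0.0)]
    for _ in range(max_len):
        candidates = []
        for prefix, score in beams:
            for token, p in step(prefix):
                candidates.append((prefix + (token,), score + math.log(p)))
        # Keep only the highest-scoring candidates. Note the score says
        # nothing about whether the continuation is true, only how likely
        # each word was given the previous ones.
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_width]
    return beams

if __name__ == "__main__":
    for tokens, score in beam_search():
        print(" ".join(tokens), f"(log-prob {score:.2f})")
```

Tree of thoughts, by contrast, asks the model itself to evaluate whole candidate branches, which is where a hallucinated continuation could in principle get pruned.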