Comment by jcuenod
10 days ago
So I had a similar experience with your prompt (on the f16 model). But I do think that, at this size, prompting differences have a bigger impact. I ran into this when trying to get it to list entities: it kept trying to give me a bulleted list while I was trying to coerce it into some sort of structured output. When I finally just said "give me a bulleted list and nothing else", the success rate went from around 0-0.1 to 0.8+.
In this case, I changed the prompt to:
---
Tallest mountains (in order):
```
- Mount Everest
- Mount K2
- Mount Sahel
- Mount Fuji
- Mount McKinley
```
What is the second tallest mountain?
---
Suddenly, it got the answer right 95+% of the time.
Still pretty sad that it's only 95% instead of 99%.
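
For anyone wanting to reproduce this kind of number, here is a rough sketch of how a success rate like the one above could be measured. It is not the setup used in this comment: `generate` is a hypothetical stand-in for whatever call actually runs the model, `build_prompt` and `success_rate` are made-up helper names, and the substring check is a deliberately crude way to score an answer.

```python
from typing import Callable

FENCE = "`" * 3  # the three-backtick fence used around the list in the prompt


def build_prompt(mountains: list[str], question: str) -> str:
    """Format the ranked names as a fenced bulleted list, then append the question."""
    bullets = "\n".join(f"- {name}" for name in mountains)
    return f"Tallest mountains (in order):\n{FENCE}\n{bullets}\n{FENCE}\n{question}"


def success_rate(
    generate: Callable[[str], str],  # hypothetical: prompt in, model text out
    prompt: str,
    expected: str,
    trials: int = 100,
) -> float:
    """Run the prompt `trials` times and return the fraction of answers
    that contain `expected` (case-insensitive substring match)."""
    hits = sum(expected.lower() in generate(prompt).lower() for _ in range(trials))
    return hits / trials
```

With sampling enabled, each call to `generate` can return a different answer, so the result is only an estimate; telling 95% apart from 99% reliably takes far more than 100 trials.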