Comment by boroboro4
3 days ago
Ok, bonus content #2.
I took the Qwen3 1.7B model and did the same, but rather than using the embedding vector I used the hidden state after the 1st/2nd/etc. layer. Below are accuracies for the 1st letter position:
- embeddings: 0.855
- 1st: 0.913
- 2nd: 0.870
- 3rd: 0.671
- 16th: 0.676
- 20th: 0.683
And now mega bonus content: the same, but with the prefix "count letters in ":
- 1st: 0.922
- 2nd: 0.924
- 3rd: 0.920
- 16th: 0.877
- 20th: 0.895
And for the 2nd letter:
- embeddings: 0.686
- 1st: 0.679
- 2nd: 0.682
- 3rd: 0.674
- 16th: 0.572