Comment by boroboro4

3 days ago

Ok, bonus content #2.

I took the Qwen3 1.7B model and did the same, but rather than using the embedding vector I used the hidden-state vector after the 1st/2nd/etc. layer. Below are the accuracies for the 1st letter (see the sketch after this list):

- embeddings: 0.855

- 1st: 0.913

- 2nd: 0.870

- 3rd: 0.671

- 16th: 0.676

- 20th: 0.683
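
For anyone who wants to poke at this: a minimal sketch of this kind of layer-wise probe, assuming the Hugging Face id Qwen/Qwen3-1.7B, alphabetic single vocab tokens as the dataset, and a logistic-regression probe. These are illustrative choices, not necessarily the exact setup behind the numbers above:

```python
# Minimal sketch, NOT the exact script behind the numbers above.
# Assumptions: HF id "Qwen/Qwen3-1.7B", alphabetic vocab tokens as data,
# logistic-regression probe, 80/20 split.
import torch
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "Qwen/Qwen3-1.7B"
tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL)
model.eval()

# Dataset: purely alphabetic single tokens from the vocabulary
# (capped so the sketch runs in reasonable time).
words = sorted(t for t in tok.get_vocab() if t.isalpha() and len(t) >= 3)[:5000]

def hidden_at(word: str, layer: int, prefix: str = "") -> torch.Tensor:
    """Hidden state at the word's last token position after `layer` blocks.
    hidden_states[0] is the embedding output; [k] is after block k."""
    ids = tok(prefix + word, return_tensors="pt").input_ids
    with torch.no_grad():
        out = model(ids, output_hidden_states=True)
    return out.hidden_states[layer][0, -1]

def probe_accuracy(layer: int, letter_pos: int = 0, prefix: str = "") -> float:
    """Fit a linear probe predicting the letter at `letter_pos` of each word."""
    X = torch.stack([hidden_at(w, layer, prefix) for w in words]).numpy()
    y = [w[letter_pos].lower() for w in words]
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
    return LogisticRegression(max_iter=2000).fit(X_tr, y_tr).score(X_te, y_te)

print(probe_accuracy(layer=0))  # "embeddings" row
print(probe_accuracy(layer=1))  # "1st" row
```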

And now the mega bonus content: the same, but with the prefix "count letters in ":

- 1st: 0.922

- 2nd: 0.924

- 3rd: 0.920

- 16th: 0.877

- 20th: 0.895

And for the 2nd letter:

- embeddings: 0.686

- 1st: 0.679

- 2nd: 0.682

- 3rd: 0.674

- 16th: 0.572
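
In the sketch above, the prefixed runs correspond to `probe_accuracy(layer, prefix="count letters in ")` and the 2nd-letter runs to `probe_accuracy(layer, letter_pos=1)`, again assuming that reconstruction matches the actual setup.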