Comment by wongarsu
4 hours ago
I just tried it on qwen3-embedding:8b with a little vibe-coded 100-line script that does the obvious linear math and compares the result against the embeddings of a couple of candidate words using cosine similarity, and it did prefer the expected words. Same 22 candidates for both questions (a rough sketch of the approach is below the results):
king - man + woman ≈ queen (0.8510)
Top similarity
0.8510 queen
0.8025 king
0.7674 princess
0.7424 woman
0.7212 queen Elizabeth
Berlin - Germany + France ≈ Paris (0.8786)
Top similarity
0.8786 Paris
0.8309 Berlin
0.8057 France
0.7824 London
Sure, 0.85 is not an exact match, so things are not exactly linear, and if I dump an entire dictionary in there it might be worse, but the idea very much works.
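The sketch below shows roughly what the script does, assuming the official Ollama Python client and its embeddings() call; the model name, candidate list, and helper names are just illustrative, so adapt to whatever embedding API you actually use:

    import numpy as np
    import ollama  # assumes the official Ollama Python client is installed

    MODEL = "qwen3-embedding:8b"

    def embed(word: str) -> np.ndarray:
        # one embedding vector per word
        return np.array(ollama.embeddings(model=MODEL, prompt=word)["embedding"])

    def cosine(a: np.ndarray, b: np.ndarray) -> float:
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

    # king - man + woman, scored against a handful of candidate words
    target = embed("king") - embed("man") + embed("woman")
    candidates = ["queen", "king", "princess", "woman", "queen Elizabeth"]

    scores = {w: cosine(target, embed(w)) for w in candidates}
    for word, score in sorted(scores.items(), key=lambda kv: -kv[1]):
        print(f"{score:.4f}  {word}")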
Edit: after running a 100k wordlist through qwen3-embedding:0.6b, the closest matches are:
king – man + woman ≈ queen (0.7873)
berlin – germany + france ≈ paris (0.9038)
london – england + france ≈ paris (0.9137)
stronger – strong + weak ≈ weaker (0.8531)
stronger – strong + nation ≈ country (0.8047)
walking – walk + run ≈ running (0.9098)
So clearly throwing a dictionary at it doesn't break it: the closest matches are still the expected ones. The next closest matches got a lot more interesting too; for example, the four closest matches for london – england + france are (in order) paris, strasbourg, bordeaux, marseilles.
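For the wordlist version the same comparison vectorizes nicely: embed the 100k words once, stack them into a matrix, and pull the top matches with a single matrix product. Rough sketch, assuming the embeddings have already been collected into a NumPy array whose rows line up with the wordlist:

    import numpy as np

    def top_k(target, embeddings, words, k=4):
        # normalize rows once so a dot product equals cosine similarity
        normed = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
        sims = normed @ (target / np.linalg.norm(target))
        idx = np.argpartition(-sims, k)[:k]
        idx = idx[np.argsort(-sims[idx])]  # order just the top-k hits
        return [(words[i], float(sims[i])) for i in idx]

    # e.g. top_k(embed("london") - embed("england") + embed("france"), embs, wordlist)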