Comment by magimas

1 month ago

This completely misses how crazy word2vec is. The model is never told anything about word meanings or relationships, and yet training produces incredibly meaningful representations that capture many properties of those words.

And in practice you can use it for much broader applications than just words. I once threw it at the session data of an online shop, using just the visited item_ids, one after another, for each individual session (the session is the sentence, the item_id is the word). You end up with really powerful embeddings for the items based on how users actually shop.

And you can go further by mixing other features in. By adding tokens like "season_summer"/"season_autumn"/"season_winter"/"season_spring" to the session sentences based on when each session took place, you can then project the item_id embeddings onto those season embeddings and get a measure of which items are the most "summer-y", etc.
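A minimal sketch of the idea, with assumptions: the data shapes, the `add_season_token` helper, and the toy embedding vectors are all hypothetical (in reality the vectors would come from training something like gensim's `Word2Vec` on the session "sentences"). The "summer-ness" of an item is just the cosine similarity between its vector and the season token's vector.

```python
import numpy as np

# Each shop session is a list of visited item_ids, treated as one
# "sentence" for word2vec-style training. (Toy data, for illustration.)
sessions = [
    ["item_1", "item_7", "item_3"],
    ["item_2", "item_7"],
]

def add_season_token(session, month):
    # Hypothetical helper: prepend a season token to the session so the
    # model learns season embeddings alongside the item embeddings.
    seasons = {12: "season_winter", 1: "season_winter", 2: "season_winter",
               3: "season_spring", 4: "season_spring", 5: "season_spring",
               6: "season_summer", 7: "season_summer", 8: "season_summer",
               9: "season_autumn", 10: "season_autumn", 11: "season_autumn"}
    return [seasons[month]] + session

# After training (e.g. gensim's Word2Vec over the augmented sessions),
# every token has a vector. Toy vectors stand in for trained embeddings:
emb = {
    "season_summer": np.array([1.0, 0.0]),
    "item_1": np.array([0.9, 0.1]),   # co-occurs mostly with summer sessions
    "item_2": np.array([0.1, 0.9]),   # co-occurs mostly elsewhere
}

def season_score(item, season, emb):
    # Cosine similarity = projection of the item onto the season direction.
    a, b = emb[item], emb[season]
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
```

Ranking all items by `season_score(item, "season_summer", emb)` then gives a "most summer-y items" list directly from shopping behaviour.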