Comment by simonw
2 years ago
If you want to play with a demo of this kind of thing, I suggest this Colab notebook: https://linus.zone/contra-colab - via https://twitter.com/thesephist/status/1711597804974739530
It demonstrates a neat model that's specifically designed to let you embed text, manipulate the embeddings (combine them, average them or whatever) and then turn them back into text again. It's fascinating.
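To make the embed → manipulate → decode idea concrete, here's a minimal sketch in Python. The encode and decode callables are hypothetical stand-ins supplied by the caller (whatever interface the notebook's model exposes), not its actual API:

    from typing import Callable
    import numpy as np

    Encoder = Callable[[str], np.ndarray]   # text -> embedding vector
    Decoder = Callable[[np.ndarray], str]   # embedding vector -> text

    def interpolate(encode: Encoder, decode: Decoder,
                    text_a: str, text_b: str, alpha: float = 0.5) -> str:
        """Blend two sentences by mixing their embeddings, then decode the mix."""
        vec_a = encode(text_a)
        vec_b = encode(text_b)
        blended = (1.0 - alpha) * vec_a + alpha * vec_b  # linear mix in embedding space
        return decode(blended)

Setting alpha to 0.5 is the averaging trick mentioned above; sweeping it from 0 to 1 walks between the two sentences.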
vec2text has a pretty nice demo notebook too! You just need to provide your OpenAI API key, since we use their API to get embeddings. Here's a link: https://colab.research.google.com/drive/14RQFRF2It2Kb8gG3_YD... and a link to vec2text on GitHub: https://github.com/jxmorris12/vec2text.
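For anyone who'd rather skim than open the notebook, the core vec2text workflow looks roughly like this. This is a sketch based on the project's README-style API; exact function names and the OpenAI client interface may have changed since:

    import os
    import torch
    import vec2text
    from openai import OpenAI  # assumes the openai>=1.0 client

    client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

    def get_ada_embeddings(texts: list[str]) -> torch.Tensor:
        """Fetch text-embedding-ada-002 embeddings for a batch of strings."""
        response = client.embeddings.create(input=texts, model="text-embedding-ada-002")
        return torch.tensor([item.embedding for item in response.data])

    # Pretrained corrector that inverts ada-002 embeddings back to text.
    corrector = vec2text.load_pretrained_corrector("text-embedding-ada-002")

    target = get_ada_embeddings(["The quick brown fox jumps over the lazy dog."])

    # Iteratively refine a hypothesis until its embedding matches the target.
    recovered = vec2text.invert_embeddings(
        embeddings=target,
        corrector=corrector,
        num_steps=20,
    )
    print(recovered)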
This is an awesome notebook.
Just want to note that the difference in this paper is that it works without direct access to the embedding model (the encoder), so it can't design the embedding space itself.
Thanks for sharing, Simon! I will note that by training an adapter layer between this autoencoder's embedding space and OpenAI's, it's possible to recover a significant amount of detail from text-embedding-ada-002's embeddings with this model too[0]. But as the paper author's reply in a different thread points out, their iterative refinement approach recovers much more detail with a smaller model. A rough sketch of what such an adapter might look like follows the link below.
[0] https://twitter.com/thesephist/status/1698095739899974031
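For the curious, here's a sketch of what an adapter between the two embedding spaces could look like: a small MLP trained with MSE on paired embeddings of the same texts. The hidden and target dimensions (and the toy random data) are placeholder assumptions, not thesephist's actual setup; ada-002 embeddings are 1536-dimensional.

    import torch
    import torch.nn as nn

    ADA_DIM = 1536      # text-embedding-ada-002 output size
    LATENT_DIM = 1024   # placeholder for the autoencoder's bottleneck size (assumption)

    # Small MLP mapping OpenAI embeddings into the autoencoder's latent space.
    adapter = nn.Sequential(
        nn.Linear(ADA_DIM, 2048),
        nn.GELU(),
        nn.Linear(2048, LATENT_DIM),
    )

    def train_adapter(ada_embs: torch.Tensor, latent_embs: torch.Tensor,
                      epochs: int = 100, lr: float = 1e-3) -> None:
        """Fit the adapter on paired embeddings of the same texts from both models."""
        opt = torch.optim.Adam(adapter.parameters(), lr=lr)
        loss_fn = nn.MSELoss()
        for _ in range(epochs):
            opt.zero_grad()
            loss = loss_fn(adapter(ada_embs), latent_embs)
            loss.backward()
            opt.step()

    # Toy run with random stand-in data; real training would use paired
    # embeddings of the same sentences from both models.
    train_adapter(torch.randn(256, ADA_DIM), torch.randn(256, LATENT_DIM))

At inference time you'd pass adapter(ada_embedding) to the autoencoder's decoder to get text back out.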