Comment by jxmorris12

2 months ago

> Given the demonstrated risk of information leakage from embeddings, have you explored any methods for hardening, obfuscating, or 'watermarking' embedding spaces to resist universal translation and inversion?

No, we haven't tried anything like that. There's definitely a need for it. People are using embeddings all over the place, not to mention all of the other representations people pass around (KV caches, model weights, etc.).

One consideration is that there's likely going to be a tradeoff between embedding usefulness and invertibility. So if we watermark our embedding space somehow, or apply some other 'defense' to make inversion difficult, we will probably sacrifice some quality. It's not clear yet how much that would be.
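To make the usefulness/invertibility tradeoff concrete, here's a minimal sketch of one naive 'defense': adding Gaussian noise to embeddings before releasing them. This is just an illustration (the noise scales and dimensions are made up, and NumPy stands in for a real embedding model); the point is that the same perturbation that would frustrate an inverter also moves the embedding away from the original, degrading its quality for retrieval:

```python
import numpy as np

def noise_defense(emb, sigma, rng):
    """Perturb unit-norm embeddings with isotropic Gaussian noise, then re-normalize."""
    noisy = emb + rng.normal(scale=sigma, size=emb.shape)
    return noisy / np.linalg.norm(noisy, axis=-1, keepdims=True)

rng = np.random.default_rng(0)

# Stand-in for real embeddings: 4 random unit vectors in 256 dims.
emb = rng.normal(size=(4, 256))
emb /= np.linalg.norm(emb, axis=-1, keepdims=True)

# More noise -> harder to invert, but also less similar to the original,
# so retrieval/similarity quality drops along with it.
for sigma in (0.01, 0.1, 0.5):
    noisy = noise_defense(emb, sigma, rng)
    mean_cos = np.sum(emb * noisy, axis=-1).mean()
    print(f"sigma={sigma}: mean cosine to original = {mean_cos:.3f}")
```

The mean cosine similarity falls as sigma grows, which is exactly the quality you'd be sacrificing. How much noise you'd need to actually defeat a learned inverter is an open question.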

Are you continuing research? Is there somewhere we can follow along?

  • Yes! For now just our Twitters:

    - Rishi, the first author: x.com/rishi_d_jha
    - Me: x.com/jxmnop

    And there's obviously always arXiv. Maybe we should make a blog or something, but the updates really don't come that often.