Comment by rafaelero
9 months ago
VQVAE is trained to reconstruct the image, so in theory it should contain all the information (both content and location) inside its embeddings.
9 months ago
VQVAE is trained to reconstruct the image, so in theory it should contain all the information (both content and location) inside its embeddings.
No comments yet
Contribute on Hacker News ↗