Comment by rafaelero
2 years ago
A VQ-VAE is trained to reconstruct the image, so in theory its embeddings should contain all the information, both content and location.
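
A rough illustration of why location can survive: in a typical VQ-VAE the encoder outputs a spatial grid of latent vectors, and each one is snapped to its nearest codebook entry. The position in the grid carries the "where" and the code index carries the "what". Here is a minimal toy sketch in plain Python, assuming a made-up two-entry codebook and hypothetical names (`quantize`, `codebook`, `latents`); it is not any library's API or a real trained model.

```python
# Toy sketch: nearest-codebook quantization over a small latent grid.
# All names and values here are hypothetical, chosen only for illustration.

def quantize(latents, codebook):
    """Map each latent vector in an H x W grid to its nearest codebook index."""
    def nearest(vec):
        # Squared Euclidean distance from this latent to every codebook entry.
        dists = [sum((a - b) ** 2 for a, b in zip(vec, code)) for code in codebook]
        return dists.index(min(dists))
    # The output keeps the same row/column layout as the input grid.
    return [[nearest(vec) for vec in row] for row in latents]

codebook = [(0.0, 0.0), (1.0, 1.0)]   # two made-up code vectors
latents = [                            # a 2 x 2 grid of made-up encoder outputs
    [(0.1, -0.1), (0.9, 1.2)],
    [(1.1, 0.8), (0.0, 0.2)],
]

indices = quantize(latents, codebook)
print(indices)  # grid of code indices, same spatial layout as `latents`
```

The index grid has the same shape as the latent grid, so which code was chosen and where it sits are both available to whatever consumes the embeddings.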