Comment by throwaway314155
3 days ago
Deeply uninformed person here:
Is the inference cost of generating this tree to be pruned something of a hindrance? In particular I'm watching your MNIST example and thinking - does each cell in that video require a full inference? Or is this done in parallel at least? In any case, you're basically trading memory for "faster" runtime (for more correct outputs), no?
This understanding is incorrect. The video samples all the leaf nodes of the entire tree only to visualize the distribution in latent space. In normal use, only the L outputs along a single path are generated.
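To put rough numbers on the difference, here is a minimal sketch. It assumes a tree of depth L with a fixed branching factor K at every level; K, the function names, and the uniform random choice of child are all illustrative assumptions on my part, not taken from the project's code.

```python
import random


def sample_single_path(depth, branching, rng):
    """Walk from the root to one leaf, picking one child index per level.

    The uniform random pick is a stand-in for however a child is
    actually selected; only the count of steps matters here.
    """
    return [rng.randrange(branching) for _ in range(depth)]


def count_leaves(depth, branching):
    """Number of leaves in a full tree with the given depth and branching."""
    return branching ** depth


L, K = 3, 8  # hypothetical values, chosen only for illustration
path = sample_single_path(L, K, random.Random(0))

print("outputs generated along one path:", len(path))         # equals L
print("leaves enumerated for a full view:", count_leaves(L, K))  # equals K**L
```

Under these assumptions, normal use touches L outputs, while enumerating every leaf for a visualization grows as K to the power L.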
Interesting, thanks for clarifying.