Comment by scribu
3 months ago
Would be curious to know how this stacks up against Coconut [1], which also uses latent space for reasoning.
Definitely curious; this looks very similar to Coconut, even down to the CoT encoding process in Figure 2. They go into a lot more detail, though. Seems like parallel innovation.
I wonder whether even the models that emit thinking tokens actually do most of the work in latent space, so that the difference is only superficial.
I'm behind on reading, but don't all models use continuous embeddings to represent reasoning?
I believe the "continuous" in Coconut means that the CoT itself lives in the continuous latent space, instead of in the output tokens (see Fig. 1): the model's last hidden state is fed back as the next input embedding rather than being decoded into a token.
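To make the distinction concrete, here's a minimal sketch of the continuous-CoT idea. This is not Coconut's actual code: the model name, the fixed number of latent steps, and the loop structure are all illustrative assumptions; Coconut trains its models specially for this.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical stand-in model; works because GPT-2's hidden size equals
# its embedding size, so hidden states are shape-compatible as inputs.
model_name = "gpt2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

prompt = "Question: what is 17 * 24?"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids
inputs_embeds = model.get_input_embeddings()(input_ids)

# Discrete CoT would decode a token each step and re-embed it.
# Continuous CoT skips the vocabulary: the last hidden state is
# appended directly as the next input embedding.
num_latent_steps = 4  # assumption: a fixed budget of latent "thoughts"
with torch.no_grad():
    for _ in range(num_latent_steps):
        out = model(inputs_embeds=inputs_embeds, output_hidden_states=True)
        last_state = out.hidden_states[-1][:, -1:, :]  # final position
        inputs_embeds = torch.cat([inputs_embeds, last_state], dim=1)

# After the latent steps, one would switch back to ordinary token
# decoding to emit the visible answer.
```

The point of the sketch is just the feedback path: the reasoning steps never pass through a softmax over the vocabulary, which is what distinguishes this from models that merely *compute* in continuous embeddings but still commit to a discrete token at every step.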