Comment by another_poster

1 year ago

Is “multimodal reasoning” as big a deal as it sounds? Does this technique mean LLMs can generate chains of thought that map to other modalities, such as sound and images?

2 comments

ygouzerh 1 year ago

From what I understood (not an expert), that seems to be the goal: to see whether knowledge in one modality can be translated into another. For example, if a model trained on sound could leverage knowledge of music theory, that would be quite interesting.

exclipy 1 year ago

It'd be cool to see its reasoning for solving visual puzzles, as imagery.