Comment by tinyhouse

4 months ago

They can take something like an image of a graph and provide a description of it. From my understanding, these are multimodal models with reasoning capabilities.

0 comments