Comment by tinyhouse
3 months ago
They can take something like an image of a graph and provide a description of it. From my understanding, these are multimodal models with reasoning capabilities.
3 months ago
They can take something like an image of a graph and provide a description of it. From my understanding, these are multimodal models with reasoning capabilities.
No comments yet
Contribute on Hacker News ↗