Comment by embedding-shape

8 hours ago

If you have a model that only knows how to do CAD modeling, but doesn't know history and wasn't trained on the visual language of said history, how is it supposed to model the Pantheon in the first place? It would only be able to model exactly what you can describe in text or, even worse for "vision models", exactly what it can extract from images via its vision encoder. My guess is that would be a far cry from what you see in this blog post.
