Comment by nohat
1 day ago
I tried this a few months back with Claude 3.5 writing CadQuery code in Cline, with rendered images as feedback. I got it to model a few simple things, like a Terraforming Mars city, fairly nicely. However, it still involved a fair bit of coaching. I wrote a simple script to automate the process further, but it went off the rails too often.
I wonder if the models' improved image understanding also leads to better spatial understanding.
How did you feed the rendered photos back in, or was it a manual copy-paste step?
Just a Python script to render, then an API call with a prompt to check whether the render looks right.
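For anyone curious what that kind of loop looks like, here is a rough sketch, not the actual script. The names refine_model, generate, render and check are all made up for illustration: generate stands in for the model call that writes CadQuery code, render for whatever turns that code into an image, and check for the API call that asks whether the render looks right. They are passed in as plain callables so that no particular API is assumed.

```python
from typing import Callable, Optional, Tuple


def refine_model(
    generate: Callable[[str, Optional[str]], str],
    render: Callable[[str], bytes],
    check: Callable[[bytes, str], Tuple[bool, str]],
    task: str,
    max_rounds: int = 5,
) -> Tuple[str, bool, int]:
    """Generate CAD code, render it, ask a checker if it looks right, repeat.

    All three callables are hypothetical stand-ins supplied by the caller:
      generate(task, feedback) -> CAD source code as a string
      render(code)             -> image bytes of the rendered model
      check(image, task)       -> (looks_right, feedback_text)
    Returns (last_code, accepted, rounds_used).
    """
    feedback: Optional[str] = None  # no feedback exists before the first render
    code = ""
    for round_no in range(1, max_rounds + 1):
        code = generate(task, feedback)
        image = render(code)
        looks_right, feedback = check(image, task)
        if looks_right:
            return code, True, round_no
    # Hard cap on rounds so a run that goes off the rails stops on its own.
    return code, False, max_rounds
```

The max_rounds cap is there because an unattended loop can drift. When it gives up, the last attempt and the round count are returned so a person can take over the coaching from there.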