Comment by nohat

1 day ago

I tried this a few months back with Claude 3.5 writing CadQuery code in Cline, with rendered images as feedback. I got it to model a few simple things, like a Terraforming Mars city, fairly nicely. However, it still involved a fair bit of coaching. I wrote a simple script to automate the process further, but it went off the rails too often.

I wonder if the models' improved image understanding also leads to better spatial understanding.

How did you feed back the rendered photos, or was it a manual copy-paste step?

  • Just a Python script to render, then an API call with a prompt to check if the render looks right.
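For anyone curious what that kind of loop might look like, here is a rough sketch. It is not the actual script: the `render`, `critique`, and `revise` callables, the `Verdict` type, and the prompt text are all hypothetical placeholders for whatever CadQuery export and vision-model API call you use. The round cap is there because an unattended loop can drift.

```python
import base64
from dataclasses import dataclass
from typing import Callable, Tuple

# Hypothetical prompt; the wording used in the original script is not known.
CHECK_PROMPT = "Does this render match the requested model? Say what is wrong if not."


@dataclass
class Verdict:
    ok: bool        # True when the critic thinks the render looks right
    feedback: str   # free-text notes to pass back to the code-writing model


def encode_image(png_bytes: bytes) -> str:
    """Base64-encode image bytes, the form most vision APIs accept."""
    return base64.b64encode(png_bytes).decode("ascii")


def refine(
    source: str,
    render: Callable[[str], bytes],           # assumed: CAD source -> PNG bytes
    critique: Callable[[str, str], Verdict],  # assumed: (image_b64, prompt) -> Verdict
    revise: Callable[[str, str], str],        # assumed: (source, feedback) -> new source
    max_rounds: int = 3,
) -> Tuple[str, bool]:
    """Render, ask the critic, revise; stop on approval or after max_rounds."""
    for _ in range(max_rounds):
        image_b64 = encode_image(render(source))
        verdict = critique(image_b64, CHECK_PROMPT)
        if verdict.ok:
            return source, True
        source = revise(source, verdict.feedback)
    # Out of rounds: the last revision is returned without being checked.
    return source, False
```

Passing the three steps in as plain functions keeps the loop testable with stubs before wiring in a real renderer or paid API calls.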