Comment by westoque

1 month ago

> I simply booted up a VM with an H100, ssh’d into it with Cursor, and prompted the agent to set up an inference server that I could ping from my web generation app. What used to take hours or days of painful, slow debugging now takes literally minutes.
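
For context, the "ping it from my web app" step usually means hitting an OpenAI-compatible HTTP endpoint on the VM. A minimal sketch, assuming something like vLLM's `vllm serve <model>` is running on port 8000 (the URL, model name, and port are assumptions, not from the quote):

```python
import json
import urllib.request

# Hypothetical endpoint on the GPU VM -- replace with your VM's address.
BASE_URL = "http://<vm-ip>:8000/v1"

def build_chat_request(prompt: str, model: str = "your-model-name") -> dict:
    """Build the JSON body for an OpenAI-style /chat/completions call."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 128,
    }

body = build_chat_request("Hello from my web generation app")

# To actually send it from the web app (requires the server to be up):
# req = urllib.request.Request(
#     f"{BASE_URL}/chat/completions",
#     data=json.dumps(body).encode(),
#     headers={"Content-Type": "application/json"},
# )
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])

print(json.dumps(body, indent=2))
```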

An awesome takeaway from this is that self-hosted models are the future! Can't wait for hardware to catch up so we can run many more experiments on our laptops!