Comment by rurban

1 month ago

Local models are not comparable to the SOTA models at all. I know what I'm saying because I have 4 local H100s in my server and could run the very best local models. It's night and day. They are unusable and stupid.

I get perfectly acceptable results from a Strix Halo PC the size of a shoebox, man. An APU that uses ~150 W, has 0 discrete GPUs, and a bill of $0/month. What's more, it doesn't go down every week, limit use, or change the terms on a whim.

I'll burn/discard 'frontier' tokens (at work) only because they're mandated and they foot the bill. I'd rather resell them; meet the asinine requirement from $EMPLOYER, provide cover for outsourcing to my equipment, and get a return for the hassle.

TLDR: perhaps you're holding it wrong or haven't tried the latest, as we so often hear. That's a lot of GPU for not much utility.

  • Well, my Python and TypeScript folks are also happy with the simpler local models. But I'm using more advanced stuff: C/C++ embedded real-time, vision AI, and compilers.

    • Fair point. I treat LLMs like the forgetful junior we often hear about. The things I don't care to do, they (both local and hosted/'frontier') can. Boilerplate, very-well-described edits, some research/report, etc; a lot is riding on 'acceptable'.

      It's easier to spawn another terminal pane/browser tab than to hire a contractor; I just don't find the 'frontier' services/terms compelling.