Comment by rurban

1 month ago

Local models are not comparable to the FOTA models at all. I know what I'm talking about, because I have 4 local H100s in my server and could run the very best local models. It's night and day. They are unusable and stupid.

8 comments

poisonborz 1 month ago

For what do you use the 4 local H100s then?

rurban 1 month ago

For training our AI model of course. Inference is for the cheaper machines.

wasabinator 1 month ago

Not all tasks require a frontier model

y0eswddl 1 month ago

What is "FOTA"?

rurban 1 month ago

That's my autocorrect, because I do too much embedded work (firmware-over-the-air updates). It should say "frontier".

bravetraveler 1 month ago

I get perfectly acceptable results from a Strix Halo PC the size of a shoebox, man. An APU that uses ~150 W, has 0 discrete GPUs, and a bill of $0/month. What's more, it doesn't go down every week, limit use, or change the terms on a whim.

I'll burn/discard 'frontier' tokens (at work) only because they're mandated and they foot the bill. I'd rather resell them; meet the asinine requirement from $EMPLOYER, provide cover for outsourcing to my equipment, and get a return for the hassle.

TLDR: perhaps you're holding it wrong or haven't tried the latest, as we so often hear. That's a lot of GPU for not much utility.

rurban 1 month ago

Well, my Python and TypeScript folks are also happy with the simpler local models. But I'm using more advanced stuff: C/C++ embedded real-time, vision AI, and compilers.

bravetraveler 1 month ago

Fair point. I treat LLMs like the forgetful junior we often hear about. The things I don't care to do, they (both local and hosted/'frontier') can. Boilerplate, very-well-described edits, some research/report, etc; a lot is riding on 'acceptable'.

Easier to spawn another terminal pane/browser tab than hire a contractor; I just don't find the 'frontier' services/terms compelling.