Comment by jon-wood

1 day ago

I’m not usually one to ask this because learning to do a thing can be fun, but why exactly have you spent 25 thousand dollars on getting an LLM someone else made to answer maths exam questions?

The cost is obviously not as big a factor for OP as it might be for others. It's actually refreshing to hear the candid viewpoint that he expresses here.

  • 25k is definitely a lot, but I did the risk analysis and figured that worst case I would lose 1000-2000 after a year of playing around with it, so I look at it more like renting (I'm going to keep the MacBook Pro no matter what since I needed a new one).

I didn't spend that much, only $6500 AUD for a GB10-based Asus GX10, which is even slower than OP's, but I spent that because it makes for a great learning platform. There's not much else that lets me fiddle with 128GB of RAM for my graphics processor, and it's quite lovely to be able to run things as long as I like without worrying about my cloud instance being shut down.

It's not financially a good idea: renting really does beat owning, and cloud beats both if you're only running inference on these machines. But I'm not just doing inference, and as a thing I can do silly stuff on to learn, it's hard to beat!

  • When you say you are not just doing inference, do you mean you are also training your own LLMs? I am curious what other things can be done.

    • Fine tuning, and yeah training my own, experimenting with architectures and learning how it all works. Been a lot of fun

  • $6500 AUD can get you a good chunk of B200 time on any of the GPU neoclouds :)

    • Less than I expected, though! And I get to run this all through the night

      I do still use Vast and Runpod for things too, but it’s much nicer to test a fine tuning run here to make sure I’m in the ballpark

      I also did literally say “It's not financially a good idea, renting is better than owning” so I’m confused why I have two people telling me that

      Also it’s just far more fun to play with something tangible to me :)

  • You could just rent a bare metal server with those specs

    • Yes I could, but that is annoying because of spot pricing and having my instance shut down, and it has fluctuating prices

      It’s also annoying because then I need to make sure my little “lab” setup is well automated, and I’m lazy :)

      Also, I literally said “It's not financially a good idea” so I’m confused why you think I don’t know that.

Privacy and offline operation are valuable or non-negotiable in some cases, but the difference is pretty categorical between what can run on a single card and what can run on a DGX GB200 NVL72 cabinet. Doesn't mean it's not worth seeing how far local models can be pushed. Not every problem needs a senior engineer.

  • I know it's one of those "if you have to ask" situations, but curiosity got the better of me. Here's the search assist response:

    "The DGX GB200 NVL72 AI server costs approximately $3 million per unit. This system includes 72 Blackwell GPUs and 36 Grace CPUs, making it one of the most powerful AI servers available."

    The search assist actually credited a source used with: https://www.tweaktown.com/news/98292/nvidias-new-gb200-super...

    That $25k spend by GGGP seems like nothing in comparison. That's ~1/3 of one chip in that cabinet. Good gawd, I'm old and out of touch with modern AI data centers.

    • By comparison, the Colossus 1 data center had 32,000 GB200s (as well as 150,000 H100 GPUs, 50,000 H200 GPUs), and they are bringing another 110,000 GB200s online (although this might be Colossus 2?)

      There are bigger data centers than Colossus 1 around too.

      There is a reason NVidia is the most valuable company on the planet.

      https://en.wikipedia.org/wiki/Colossus_(supercomputer)#Curre...

    • It's The Circle of Computing Life. The pendulum swings between centralised mainframe timesharing-for-hire and desktop individuality.

      We've been in a centralised phase for longer than usual - first cloud everything, then AI - but at some point in the next decade prices will crash and a market will appear for personal, local intelligence.

  • > the difference is pretty categorical between what can run on a single card and what can run on a DGX GB200 NVL72 cabinet.

    A better way of putting it is that you can run plenty of things on a single ordinary system, but you may be disappointed at the performance. Generally, you can't expect inference to be as quick as with cloud for SOTA-like models. You have to run smaller models for quick replies, and large models with a lot of real-world knowledge for less time-critical inference, possibly batching many requests simultaneously to improve throughput.
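The throughput point about batching can be illustrated with a back-of-the-envelope model. This is only a sketch: it assumes token generation is limited purely by memory bandwidth, and the model size (60 GB) and bandwidth (240 GB/s) are made-up round numbers, not measurements of any machine mentioned in this thread.

```python
def decode_throughput(model_gb: float, bandwidth_gb_s: float, batch_size: int) -> float:
    """Aggregate tokens/second under an idealized bandwidth-bound model.

    Assumption: each decode step streams the full weights from memory once
    and yields one token per request in the batch. KV-cache traffic and the
    compute ceiling are ignored, so real gains flatten at larger batches.
    """
    step_seconds = model_gb / bandwidth_gb_s
    return batch_size / step_seconds


# Hypothetical numbers: 60 GB of weights, 240 GB/s of memory bandwidth.
for batch in (1, 8):
    print(batch, decode_throughput(60, 240, batch))
```

In this toy model a single request gets 4 tokens/s, while eight batched requests share the same weight reads and reach 32 tokens/s in aggregate. Each individual reply is no faster, which is why batching suits the less time-critical work.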

One year ago, fine-tuned local LLMs had a significant edge over ChatGPT or Claude. Look up on YouTube all the DIY videos testing LLMs on their own machines with different setups.

Remember: one year turned out to be a gigantic leap in the quality of results and in innovation in the AI space. Agents weren't really a thing, and vibe coding hadn't even been invented as a term, because the top-notch tools at the time were lousy, with Lovable being the frontrunner with its - in my view - sorry Tailwind recombination tool shaming AI to do the work.

Then fall 2025 hit us, then New Year's Eve, and suddenly there was a massive surge of innovation and competition, with ChatGPT Codex showing up.

Remember: one year ago many now commonly used tools, like Nano Banana or Codex, weren't yet available.

"The 25k are so vast" - Yes, and no. For example, if the machine is bought for business usage I can deduct the costs from taxes. This amounts to roughly 50% of the financial burden.

So I like to joke that I pay only half the price for my Apple business machines. And yes, I am strict in this regard. Business means business. No private emails etc., nothing on my company computers.

Maybe there are other options as well to reduce the financial expenses the dude mentions, but it doesn't seem so.

I would also go for leasing; that way the monthly payments can already be deducted and I don't need to buy and maybe resell the machine.

Apple is a luxury good. Without business usage or at least partly using it for business as well as private (mixed usage in tax reports) I wouldn't buy the devices or think twice.

Apple under Cook evolved into a Gucci-like luxury brand that is more and more a rip-off than quality delivered, especially considering the latest OS updates for Mac, iOS and iPad. Apple is a mess, happily following in Microsoft Windows' footsteps, because the CEO is, as has been correctly assessed, no product guy.

But I'll stop my rant here.

Always try to use tax deduction as leverage for your computer expenses. Every citizen should invest in basic knowledge about that.

Even a 10-20% professional usage for work (mixed usage) gives you a noticeable advantage over normal pay.

It's just a project I'm working on. I'm working on projects where AIs are processing and classifying large amounts of data that would be a lot of work for humans to do.

  • I think of LLMs as being well equipped for handling dynamic data or adapting to unforeseen circumstances (random code requests, websites' ever-changing layouts, typos, non-standard formatting in docs, grokking out important info, etc.), but math problems are by definition a very specific set of instructions to run, so is the overhead and "thinking" aspect of an LLM/AI even needed here? I'm genuinely curious, btw, I'm not asking sarcastically. Can't these math problems just be yanked from some test file and rapid-fired directly at a GPU/compute unit?

    • > Can't these math problems just be yanked from some test file and rapid fired directly at a gpu/compute unit?

      Yes this is exactly what I'm doing. I isolated the actual math question, and then sent it to my two servers to process and that's what's taking 10m+ to return. I'm asking them to solve the question and return the full answer along with their steps. I care about correctness so taking time is okay but I can't use 10m per solution.

This is making me feel a lot better about my plan to lease a $25k EV simply because it's available at a massive discount. I'll probably end up using less electricity, too.

That hardware is costing him ~$1/hour over 3 years. Presumably having it answer math questions was a tiny fraction of what he was using it for.
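As a rough check on that figure, here is a minimal sketch of the arithmetic. It assumes the $25k price from the thread, round-the-clock ownership for three years, and no electricity costs; the `resale_value` parameter is a hypothetical extra for illustration.

```python
HOURS_PER_YEAR = 365 * 24  # ignores leap days


def hourly_cost(price: float, years: float, resale_value: float = 0.0) -> float:
    """Price minus resale value, spread evenly over every hour of the period."""
    return (price - resale_value) / (years * HOURS_PER_YEAR)


print(f"${hourly_cost(25_000, 3):.2f}/hour")  # $0.95/hour
```

Any resale value at the end of the period lowers the effective rate further, which is in line with the "more like renting" view quoted earlier in the thread.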

I’ve spent twice that on hosting movies and tv for Plex, so… I think they are worthy of my praise. What a healthy outlet for money.

Because buying Macs is not about performance, it's about feeling like you are rich.

That money could have been spent on way more bang/buck performance in the form of a set of 4 graphics cards.

Also I would probably put the odds at 70:30 that Apple marketing is astroturfing on HN, given the number of posts about running LLMs on MacBooks, because in reality the inference speed of any decent LLM is unusable on a MacBook despite the ability to fit it into RAM.

  • 40-80 tok/s is unusable to you? Ok.

    If you like having a box with 8-12 fans blasting hot air and noise into your office all day, nobody's stopping you.

  • Or it could have had way more bang/buck by feeding a family of real brains for a year or two

    • Excuse me for this comment, really, but I can't comprehend the absurdity, some people are buying GPUs when other people have no money for insulin so they literally die. I don't mean anything towards op or gp, quite the opposite I'm truly happy they have this kind of freedom, it must feel really nice, I just hate this game so much.