Google having their own hardware for training and inference is newsworthy, but the link is pretty bad. Here is a much better source https://blog.google/products/google-cloud/ironwood-tpu-age-o...
Not much real data or news there.
I'm an idiot and I know nothing
But I wonder if there could be room for an ARM-like spec, but for AI chips, that Google could try to own and license. Arm is to RISC CPUs as this google-thing would be to AI ASICs.
Prolly a dumb idea, better to sell the chips or access to them?
I'm not sure the chip spec (or instruction set) is the level of abstraction here?
Something like DirectX (or OpenGL) might be the better level to target? In practice, CUDA is that level of abstraction, but it only really works for Nvidia cards.
It's not that it only works on Nvidia cards; it's that it's only allowed to work on Nvidia cards. A non-clean-room implementation of CUDA for other hardware has been done, but it violates the EULA (of the thing that was reverse engineered), copyright on the driver binary interface, and often patents. Nvidia aggressively sends cease-and-desist letters and threatens lawsuits (it successfully killed ZLUDA and threatened others). It's an artificial (in the technical sense) moat.
> CUDA is that level of abstraction, but it only really works for Nvidia cards.
There are people actively working on that.
https://scale-lang.com/
Not really, because as usual people misunderstand what CUDA is.
CUDA is hardware designed according to the C++ memory model, with first-tier support for C, C++, Fortran, and Python GPGPU DSLs, and several other languages also have compiler backends targeting PTX.
Followed by IDE integration, a graphical debugger and profiler for GPU workloads, and an ecosystem of libraries and frameworks.
Saying "just use DirectX, Vulkan, or OpenGL instead" mistakes one tree for the forest that is CUDA, and misses why researchers would rather use CUDA than deal with yet another shading language or C99 dialect with nothing else around it.
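To make that concrete, here is a minimal sketch of what first-tier host-language support looks like, assuming Numba's CUDA target as the Python GPGPU DSL (an illustrative choice, not the only one): the kernel is ordinary Python compiled for the GPU, not a separate shading language.

    # Minimal sketch, assuming Numba's CUDA target (illustrative, not the only Python GPGPU DSL).
    import numpy as np
    from numba import cuda

    @cuda.jit
    def saxpy(a, x, y, out):
        i = cuda.grid(1)          # global thread index
        if i < out.size:          # guard threads past the end of the array
            out[i] = a * x[i] + y[i]

    n = 1 << 20
    x = np.random.rand(n).astype(np.float32)
    y = np.random.rand(n).astype(np.float32)
    out = np.zeros_like(x)

    threads = 256
    blocks = (n + threads - 1) // threads
    saxpy[blocks, threads](np.float32(2.0), x, y, out)  # Numba copies host arrays to/from the device

The same kernel logic in a graphics API would mean a shader language, descriptor plumbing, and none of the debugger/profiler/library ecosystem the parent describes.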
They tried selling years ago with Coral; not much happened.
Now they don't want to sell them. Why power local inference when they can have you subscribe forever and get your juicy data too?
These are only available in Iowa on GCP, which to me raises this question: do they have them all over the world for their own purposes, or does this limited geography also mean that users of Google AI features get varied experiences depending on their location?
Things needing the most compute (LLMs, image and video generation) tend not to be latency-sensitive.
100ms of latency is nothing when added to 10 seconds of generation time.
Running on v6 vs v7 should just be different performance.
If a search feature runs on a deadline then different performance could be observable as more work done in 100ms or whatever unit of time.
I think we need an analysis of tokens/$1 and tokens/second for Nvidia Blackwell vs Ironwood.
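For concreteness, a back-of-the-envelope sketch of the shape of that analysis; the figures below are placeholders, not measured Blackwell or Ironwood numbers.

    # Back-of-the-envelope shape of the comparison; inputs are assumptions, not real data.
    def tokens_per_dollar(tokens_per_second: float, price_per_hour: float) -> float:
        return tokens_per_second * 3600 / price_per_hour

    # Hypothetical example: 10,000 tok/s at $40/hour -> 900,000 tokens per dollar.
    print(tokens_per_dollar(10_000, 40.0))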
It depends on how they're utilized; especially at these scales, you have to squeeze every bit out.
> It’s designed for AI with AI
CUDA engineers, your job security has never felt more certain.
So what's the difference between their announcement in April and now?
So we will be getting wrong answers faster now.
I'll never understand this attitude. Recently I set up a full network with 5 computers, opnsense, xcp-ng and a few things like a pi, switch, AP, etc.
I was migrating from pfsense to Opnsense so I wasn't too familiar with some of the nitty gritty. Was migrating to xcp-ng 8.3 from 8.2 which has some major CLI differences. It was a pretty big migration that took me a full weekend.
OpenAI got things wrong (mostly because it was using old documentation - opnsense had just upgraded) maybe 8 times in the whole project and was able to quickly correct itself when I elaborated on the problem.
If I just had google this would've been a 2 week project easily. I'd have to trudge through extremely dry documentation that mostly doesn't apply to anything I'm doing, and read a bunch of toxic threads demeaning users who don't know everything. Instead I had ChatGPT 5 do all that for me and got to the exact same result with a tenth of the effort.
The AI is useless crowd truly makes me scratch my head.
> The AI is useless crowd truly makes me scratch my head.
I think it's because, past autocomplete, for AI to be useful professionally you need to already have a lot of background and experience in what you are using it for, in addition to the engineering and project management to keep the scope on track. While demos with agents are impressive, in practice the autonomy is not there; they need strong guidance, so it only works as a very smart assistant. What you are describing is very representative of this.
If you don't have that level of seniority, you'll struggle to get value from AI, because it'll be hard to guide and keep on track, and hard to spot and navigate its errors and wrong lines of thinking. You can't use it as an assistant; you can only take what it says at face value, and given that it will randomly be wrong, that makes it useless.
Feeling glad that one is insulated from the knowledgeable users who trained the "AI" that stole their IP is just strange.
"AI" is also larger than plagiarizing Stack Overflow: Google's AI answers, which are what most people use, are pretty poor on any topic.
Coming back to sysadmin/programming. There are many migration guides from pfsense to Opnsense, for example (note there are no mean people in that thread):
https://forum.opnsense.org/index.php?topic=32793.0
The estimates are days, which is not that different from a weekend.
OpenAI now basically has your firewall configuration and who knows what else, so I would not recommend using "AI" for such sensitive matters.
> If I just had google this would've been a 2 week project easily.
But you'd know something new by the end of it.
So many are so fast to skip the human experience element of life that they're turning themselves into mere prompt generators, happy to regurgitate others' knowledge without feeling or understanding.
For this, you might not care to gain meaningful experience, and as a conscious choice that's fine. But there are an increasing number of developer and developer-adjacent people who reach for the LLM first, and who don't understand "their" contributions to projects.
The haters are those of us who have to deal with this slop, and the sloppy people submitting it without thought, care or understanding.