Comment by mythz
7 days ago
I consider HuggingFace more "Open AI" than OpenAI - one of the few quiet heroes (along with Chinese OSS) helping bring on-premise AI to the masses.
I'm old enough to remember when traffic was expensive, so I've no idea how they've managed to offer free hosting for so many models. Hopefully it's backed by a sustainable business model, as the ecosystem would be meaningfully worse without them.
We still need good value hardware to run Kimi/GLM in-house, but at least we've got the weights and distribution sorted.
Can we toss in the work unsloth does too as an unsung hero?
They provide excellent documentation and they’re often very quick to get high quality quants up in major formats. They’re a very trustworthy brand.
Yeah, they're the good guys. I suspect the open source work is mostly advertisements for them to sell consulting and services to enterprises. Otherwise, the work they do doesn't make sense to offer for free.
Haha for now our primary goal is to expand the market for local AI and educate people on how to do RL, fine-tuning and running quants :)
8 replies →
I hope that is exactly what is happening. It benefits them, and it benefits us.
not that unsung! we've given them our biggest workshop spot every single year we've been able to and will do until they are tired of us https://www.youtube.com/@aiDotEngineer/search?query=unsloth
Appreciate it immensely haha :) Never tired - always excited and pumped for this year!
Oh thank you - appreciate it :)
I'm a big fan of their work as well, good shout.
Thank you!
It's insane how much traffic HF must be pushing out of the door. I routinely download models that are hundreds of gigabytes in size from them. A fantastic service to the sovereign AI community.
My fear is that these large "AI" companies will lobby to have these open source options removed or banned; it's a growing concern. I'm not sure how else to explain how much I enjoy using what HF provides, I religiously browse their site for new and exciting models to try.
ModelScope is the Chinese equivalent of Hugging Face and a good backup. All the open models are Chinese anyway.
22 replies →
They can try. I don't think they'll be able to get the toothpaste back in the tube. The data will just move out of the country.
1 reply →
How do you choose which models to try for which workflows? Do you have objective tests that you run, or do you just get a feel for them while using them in your daily workflow?
It's only a matter of time. We have all seen first hand how ... wrong ... these companies behave, almost on a regular basis.
There's a small tinfoil-hat part of me that suspects part of their obscene investments and cornering of the hardware market is a conscious attempt to stop local open source from taking off. They want it all: the money, the control, and to be the only source of information.
Bandwidth is not that expensive. The Big 3 clouds just want to milk customers via egress. Look at Hetzner or Cloudflare R2 if you want to get an idea of commodity bandwidth costs.
Yup, I have downloaded probably a terabyte in the last week, especially with the Step 3.5 model being released and Minimax quants. I wonder what my ISP thinks. I hope they don't cut me off. They gave me a fast lane, they better let me use it, lol
Even fairly restrictive data caps are in the range of 6 TB per month. P2P at a mere 100 Mb/s works out to about 1 TiB per 24 hours.
Hypothetically my ISP will sell me unmetered 10 Gb/s service, but I wonder if they would actually make good on their word ...
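The cap-vs-throughput arithmetic above is easy to sanity-check. A minimal sketch; the 100 Mb/s link speed and ~6 TB monthly cap are just the figures from the comment:

```python
# Data transferred at a sustained link speed, assuming the link is saturated
# 24/7 (as a seeding P2P client would try to do).
def tb_per_day(mbit_per_s: float) -> float:
    """Terabytes moved in 24 hours at the given megabit/s rate."""
    bits = mbit_per_s * 1e6 * 86_400   # bits per day
    return bits / 8 / 1e12             # -> terabytes

print(round(tb_per_day(100), 2))      # ~1.08 TB/day at 100 Mb/s
print(round(tb_per_day(100) * 30))    # ~32 TB/month, far past a 6 TB cap
```

So a single always-on 100 Mb/s seeder blows through a restrictive cap in under a week.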
7 replies →
> We still need good value hardware to run Kimi/GLM in-house
If you stream weights in from SSD storage and freely use swap to extend your KV cache it will be really slow (multiple seconds per token!) but run on basically anything. And that's still really good for stuff that can be computed overnight, perhaps even by batching many requests simultaneously. It gets progressively better as you add more compute, of course.
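A rough lower bound on the "multiple seconds per token" claim, assuming the SSD read is the bottleneck and every token has to re-read the model's active parameters from disk (real setups cache hot experts in RAM, so this is pessimistic; the 32B-active / 4-bit / 7 GB/s figures are illustrative, not from the source):

```python
# Estimate seconds/token when streaming weights from SSD.
# Assumption: per token, the active parameter bytes must be read from disk.
def sec_per_token(active_params_billions: float, bytes_per_param: float,
                  ssd_gb_per_s: float) -> float:
    bytes_per_token = active_params_billions * 1e9 * bytes_per_param
    return bytes_per_token / (ssd_gb_per_s * 1e9)

# e.g. ~32B active params at 4-bit (0.5 bytes/param) on a 7 GB/s NVMe drive
print(round(sec_per_token(32, 0.5, 7), 1))  # ~2.3 s/token
```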
> it will be really slow (multiple seconds per token!)
This is fun for proving that it can be done, but that's 100X slower than hosted models and 1000X slower than GPT-Codex-Spark.
That's like going from real time conversation to e-mailing someone who only checks their inbox twice a day if you're lucky.
You'd need real rack-scale/datacenter infrastructure to properly match the hosted models that are keeping everything in fast VRAM at all times, and then you only get reasonable utilization on that by serving requests from many users. The ~100X slower tier is totally okay for experimentation and non-conversational use cases (including some that are more agentic-like!), and you'd reach ~10X (quite usable for conversation) by running something like a good homelab.
At a certain point the energy starts to cost more than renting some GPUs.
Yeah, that is hard to argue with, because I just go to OpenRouter and play around with a lot of models before I decide which ones I like. But there's something special about running it locally in your basement.
1 reply →
Aren't decent GPU boxes in excess of $5 per hour? At $0.20 per kWh (which is on the high side in the US), running a 1 kW workstation 24/7 comes to about $4.80 per day, roughly the price of 1 hour of GPU time.
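The break-even arithmetic, using the assumed figures from the comment ($0.20/kWh electricity, 1 kW draw, $5/h GPU rental):

```python
# Daily electricity cost of running a box 24/7 at a constant power draw.
def daily_power_cost(kw: float, usd_per_kwh: float) -> float:
    return kw * 24 * usd_per_kwh

cost = daily_power_cost(1.0, 0.20)
print(f"${cost:.2f}/day")                 # $4.80/day
print(f"= {cost / 5.0:.2f} GPU-hours")    # ~1 hour of $5/h rented GPU
```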
The issue you'll actually run into is that most residential housing isn't wired for more than ~2 kW per room.
Why doesn't HF support BitTorrent? I know about hf-torrent and hf_transfer, but those aren't nearly as accessible as a link in the web UI.
> Why doesn't HF support BitTorrent?
Harder to track downloads then. Clients would only report download stats when they hit the tracker, and forget about private repositories or the "gated" ones that Meta/Facebook does for their "open" models.
Still, if vanity metrics weren't so important, it'd be a great option. I've even thought of creating my own torrent mirror of HF to provide as a public service, as eventually access to models will be restricted, and it would be nice to be a bit better prepared for that moment.
I thought of the tracking and gate questions, too, when I vibed up an HF torrent service a few nights ago. (Super annoying BTW to have to download the files just to hash the parts, especially when webseeds exist.) Model owners could disable or gate torrents the same way they gate the models, and HF could still measure traffic by .torrent downloads and magnet clicks.
It's a bit like any legalization question: the black market exists anyway, so a regulatory framework could bring at least some of it into the sunlight.
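For what it's worth, the webseed idea mentioned above needs nothing exotic on the client side: magnet links can carry HTTP seed URLs (BitTorrent's BEP 19 webseeding, the "ws" parameter), so an existing CDN could act as the always-on seed while peers share among themselves. A minimal sketch; the infohash and URL below are made-up placeholders, not real HF endpoints:

```python
# Build a magnet URI that includes a webseed URL alongside the infohash.
from urllib.parse import quote, urlencode

def magnet_with_webseed(infohash: str, name: str, webseed_url: str) -> str:
    # "dn" = display name, "ws" = webseed (HTTP source for the same payload)
    params = urlencode({"dn": name, "ws": webseed_url}, quote_via=quote)
    return f"magnet:?xt=urn:btih:{infohash}&{params}"

link = magnet_with_webseed(
    "0123456789abcdef0123456789abcdef01234567",   # placeholder infohash
    "example-model.safetensors",
    "https://example.org/example-model.safetensors",  # placeholder webseed
)
print(link)
```

A client that doesn't find peers just falls back to plain HTTP from the webseed, so the worst case matches today's behavior.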
3 replies →
That would be a very nice service. I think folks might rely on it for a number of reasons, including that we'll want to see how biases changed over time. What got sloppier, shillier...
Wouldn’t it still provide massive benefits if they could convince/coerce their most popular downloaded models to move to torrenting?
1 reply →
how are all the private trackers tracking ratios?
most of the traffic is probably from open weights, just seed those, host private ones as is
I still don't know why they are not running on torrent. It's the perfect use case.
How can you be the man in the middle in a truly P2P environment?
That would shut out most people working for big corp, which is probably a huge percentage of the user base. It's dumb, but that's just the way corp IT is (no torrenting allowed).
It's a sensible option, even when not everyone can really use it. Linux distros are routinely transferred via torrent, so why not other massive, open-licensed data?
4 replies →