> Most likely, without any intention on their part.
I think this is a very, very naive assumption.
The founder is a quant with involvements in domestic investments and market design and pricing for decades - in China.
As seen with the case of Jack Ma, after you cross a certain level, there is no such thing as "not involved with politics" in China.
Liang knows exactly what he's doing.
> During 2021, Liang started buying thousands of Nvidia GPUs for his AI side project while running High-Flyer. Some industry insiders viewed it as the eccentric actions of a billionaire looking for a new hobby. One of Liang's business partners said they initially did not take Liang seriously and described their first meeting as seeing a very nerdy guy with a terrible hairstyle who could not articulate his vision. Liang simply said he wanted to build something and it will be a game changer which his business partners thought was only possible from giants such as ByteDance and Alibaba Group.
> During that month in an interview with 36Kr, Liang stated that High-Flyer had acquired 10,000 Nvidia A100 GPUs before the US government imposed AI chip restrictions on China.
> On 20 January 2025, Liang was invited to the Symposium with Experts, Entrepreneurs and Representatives from the Fields of Education, Science, Culture, Health and Sports (专家、企业家和教科文卫体等领域代表座谈会) hosted by Premier Li Qiang in Beijing. Liang, being considered as an industry expert, was asked to provide opinions and suggestions on a draft for comments of the annual 2024 government work report.
> On 17 February 2025, Liang along with the heads of other Chinese technology companies attended a symposium hosted by President Xi Jinping at the Great Hall of the People in Beijing.
Whether he intended to or not initially, what happens with DeepSeek is now out of this man's hand and will be 100% influenced by politics.
The chip bans and dual use nature of the technology have catapulted Liang to the first row of CCP tech strategists' attention, for sure.
I am not sure what you mean by AI bubble. Do you mean the valuation of some companies? Of course some won't do well in the future. In the meantime, a significant part of the population relies on it to accelerate their tasks (be it admin work, legal questions, learning, getting inspiration). There is no way back. It feels like saying the video streaming bubble will burst in 2020. No. It is too valuable. But yes, some players will die. Nothing special here. IMHO.
A bubble bursting does not mean the industry in the bubble ceases to exist. It means the market hype dies down and only the things that have actual value survive. When it comes to AI, realistically most of the hype is fluff, so calling it a bubble is fair.
I mean the whole world still uses the Internet after the dot-com bubble burst. A significant number of “AI companies” are valued at revenue multiples never used before. 44x in the case of OpenAI for example. I agree there is no going back, but this bubble will burst, and hard. IMHO.
Kinda interesting to see where the moat is in the AI space. Good base models can always be distilled when you have access to the API. System prompts can get leaked, and UI tricks can be copied. In the end, the moat might be in the hardware and vertical integration.
> the moat might be in the hardware and vertical integration.
The moat is the products that can be built. The moat is always the product - because a differentiated product can't be a commodity. And an LLM is not a product.
Google and MSFT and Meta have already "won" because they have profitable products they can build LLMs onto. Every other company seems to be burning cash to build a product, and only ChatGPT is getting the brand recognition to realistically compete.
Building an LLM is like building a database. Sure a good one unlocks new uses, but consumers aren't buying something for the database. Meanwhile enterprise customers will shop around and drive the price of a commodity down while open source alternatives grow from in-house uses to destroy moats.
Even hardware isn't a true moat. Only Google has strong vertical integration with their TPUs, and that gives them a lead. BUT Microsoft, AWS, Meta and a whole bunch of startups are building out custom silicon which will surely put pressure on them and Nvidia to keep innovating and earning that price edge.
See, I kind of buy the database argument but also kind of don't. A database needs an operator whereas an LLM doesn't. You're basically melting the product into a piece of goo and the UI can be approached using natural language.
For products that still need a UI you could claim that LLM operators take over, so that's still a tax you pay to the incumbents as you interact with a product. It's sort of like we take the money which was paid to SQL operators and engineers and instead pay it to the hyperscalers.
How many times have we been down this path? TCP/IP, DOS/Windows, Linux, virtualization, and on and on. Open platforms always seem to find a way to usurp everyone else. In the end, it's better to be a service provider.
You can use the outputs of a closed source model (or DeepSeek -> Llama; see the Llama 70B DeepSeek distill) to create a synthetic training data set, which lets you fine-tune (distill) most of the benefits of the "smarter" model into a "dumber" model. This is why OpenAI does not show the actual full chain of thought but a summarized version: to stop exfiltration of their IP, which has proven immensely difficult.
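To make the distillation step concrete, here is a minimal sketch of collecting a synthetic fine-tuning set from a teacher model. Everything in it is an assumption for illustration: `build_distillation_set`, `to_jsonl` and `fake_teacher` are hypothetical names, the teacher is a stand-in function rather than a real API call, and the prompt/completion JSONL layout is just one common convention, not what any particular lab uses.

```python
import json
from typing import Callable, Dict, Iterable, List


def build_distillation_set(
    prompts: Iterable[str],
    teacher: Callable[[str], str],
) -> List[Dict[str, str]]:
    """Query the teacher once per prompt and keep the non-empty answers."""
    records = []
    for prompt in prompts:
        completion = teacher(prompt)
        # Skip empty answers so they do not pollute the training set.
        if completion.strip():
            records.append({"prompt": prompt, "completion": completion})
    return records


def to_jsonl(records: List[Dict[str, str]]) -> str:
    """Serialize records as one JSON object per line."""
    return "\n".join(json.dumps(r, ensure_ascii=False) for r in records)


def fake_teacher(prompt: str) -> str:
    """Hypothetical stand-in for a call to the stronger model."""
    return f"Step-by-step answer to: {prompt}"


dataset = build_distillation_set(["What is 2 + 2?", "Name a prime number."], fake_teacher)
print(to_jsonl(dataset))
```

The resulting file would then go to whatever supervised fine-tuning pipeline the smaller model uses. If the teacher only returns a summarized chain of thought, the summaries are all that ends up in `completion`, which is the point of hiding the full reasoning.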
This is great to see! Open-sourcing infrastructure tools can really accelerate innovation in the AI space. I've found that having access to well-documented repos makes it much easier to experiment and build on existing work. Are there any specific areas these repos focus on, like distributed training or model serving?
How do the valuations of foundation model companies compete with them being firmly open sourced by Facebook and DeepSeek? It seems likely that building these models will not produce hundreds of billions in value given China and Facebook are giving them away largely for free.
Those valuations are built on an imaginary future the founders made investors believe.
The idea is: if we reach true AGI first, we are going to own ALL THE MONEY!
Which erroneously assumes that models can't be siphoned off/recreated, as deepseek proved possible and even reasonably doable. Which in turn fundamentally shows that both openai and anthropic very likely have basically no moat.
I can almost smell another AI winter arriving, once all those valuations meet reality.
I can't see a future where AGI exists and money in general isn't worthless within 6 months of it existing.
Either it kills us all, or makes the creator so much money that it's essentially worthless because they're the only one with money, or creates a utopia where money isn't needed.
IMO it's harder to move away from Oracle DB than from OpenAI. The type of businesses that rely on Oracle DB have all the characteristics of a "tech kidnap victim". Huge DB-driven projects, old bad code with few tests, and a profit margin low enough to not be able to fund a migration to a different DB.
I think businesses that rely on new AI models are very different.
Looking forward to it! I'll generally make an effort to use Open Models over proprietary alternatives when the use-case permits, as Open Models getting better and more popular encourages more models to become open as well. That's a prerequisite for a future where we can build self-hosted solutions that aren't beholden to the control of mega corps and AI monopolies.
Is this actually going to be open source? Or is it going to be just an open weights release? Seeing training code would be interesting.
Personally I don’t think even a true open source release would erase the downsides of the model incorporating CCP propaganda and censorship. I would prefer control of megacorps to control of an untrustworthy dictatorship.
> In economics, the Jevons paradox occurs when technological advancements make a resource more efficient to use (thereby reducing the amount needed for a single application); however, as the cost of using the resource drops, if the price is highly elastic, this results in overall demand increases causing total resource consumption to rise.
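As a toy illustration of the quoted definition, here is a sketch under a constant-elasticity demand model. The model and every number in it are assumptions picked for illustration, not data about GPUs or any real market; `resource_consumption` is a hypothetical helper.

```python
def resource_consumption(
    efficiency: float,
    elasticity: float,
    unit_cost: float = 1.0,
    scale: float = 100.0,
) -> float:
    """Total resource use when demand has constant price elasticity.

    efficiency: units of service produced per unit of resource.
    elasticity: price elasticity of demand for the service (positive number).
    """
    # Higher efficiency makes each unit of service cheaper.
    service_price = unit_cost / efficiency
    # Constant-elasticity demand: quantity = scale * price^(-elasticity).
    services_demanded = scale * service_price ** (-elasticity)
    # Each unit of service consumes 1/efficiency units of resource.
    return services_demanded / efficiency


for elasticity in (0.5, 1.5):
    before = resource_consumption(efficiency=1.0, elasticity=elasticity)
    after = resource_consumption(efficiency=2.0, elasticity=elasticity)
    print(f"elasticity={elasticity}: {before:.1f} -> {after:.1f}")
```

In this model total consumption scales as efficiency^(elasticity - 1), so doubling efficiency raises resource use only when elasticity is above 1. Whether demand for inference is that elastic is exactly what the thread is arguing about.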
Tencent recently bought 100k-200k H20s to serve R1. [1] I think it's not clear that open source will tank Nvidia's price. And you won't place a lot of bets if the outcome is anything but certain.
> Why? Because every line shared becomes collective momentum that accelerates the journey.
Truly admirable on their part and a great paradigm for others. The reasons for this don't really matter to me, but I can't help but wonder if somehow they were obliged or otherwise indebted to follow this route.
> These are humble building blocks of our online service: documented, deployed and battle-tested in production. No vaporware, just code that moved our tiny moonshot forward.
My not-so-innocent guess is that they are looking to crowd-source their online platform (the front-end essentially) in order to reduce costs. Still acceptable though, as they made the model open weight and partially reproducible.
Everyone who has ever open-sourced anything knows that it just isn't cost cutting. You suddenly get an army of people posting issues and opinions, and those who try to contribute often make more mess than it's worth.
Well, although R1-671b is way too expensive for me to self-host, given their past open source (or weight) contributions, I DO have high expectations of them.
Each and every contribution to open source community will be helpful. Thanks DeepSeek!
>> Amodei's / Hassabis' comments in particular came off as so arrogant and annoying.
Exactly which part of their writings comes off as arrogant to you? The only point in Amodei's article[0] that could remotely be interpreted as arrogant is this:
> All of this is to say that DeepSeek-V3 is not a unique breakthrough or something that fundamentally changes the economics of LLM’s; it’s an expected point on an ongoing cost reduction curve. What’s different this time is that the company that was first to demonstrate the expected cost reductions was Chinese.
Maybe I'm different, but it really does sound like reasonable judgement to me.
Not only did DeepSeek opensource their model, they also showed the user chain-of-thought right up front, which everyone else rushed to emulate when they saw how much users liked it.
I really like this definition of "AGI": When everyone (yes everyone) benefits from very powerful AI models released for free and it is not gate-kept by one company and it costs $0 to use commercially or for research and you can do whatever you want with it.
Unlike the other counterpart which believes that "AGI" means: "raising billions of dollars to achieve $100BN of profits to their investors". (Which is complete nonsense).
While not totally "open source" by the strictest definition, it is at least better than having no model released with no mention of the architecture on the system card or paper and just vague comments about the 'performance'.
Ladies and gentlemen, this is closer to being a better "Open AI". Unlike the other alleged $157BN "non-profit" scam.
I think you know which one really is beneficial to humanity and is the real "Open AI".
You’re assuming they won’t follow in OpenAI’s footsteps. OpenAI published a lot for a while and truly changed the world, far more than deepseek has. Only time will tell.
But I think it’d be a mistake to think that this is necessarily beneficial for humanity just because the weights are open. It’s maybe great to commoditize models, but their displacement in jobs, original thought and work, facilitation of disinformation and population psychological warfare doesn’t change… if anything it’s accelerated and harder to temper the bad elements.
Of course. Except we know what happens when one tries to close them up again - someone else will release another more powerful AI model for free.
So it doesn't matter when there are multiple players competing to destroy each other in this race to zero.
> But I think it’d be a mistake to think that this is necessarily beneficial for humanity just because the weights are open. It’s maybe great to commoditize models, but their displacement in jobs, original thought and work, facilitation of disinformation and population psychological warfare doesn’t change… if anything it’s accelerated and harder to temper the bad elements.
It is unrealistic to close it up and hope that no-one catches up and releases a better AI model for free since the cat's already out of the bag and the progress of these AI models cannot be delayed, stopped or gate-kept for long.
By that time, someone will release a more powerful AI model for free.
I really admire their mindset of striving for the betterment of humanity.
There was a time when OpenAI, Anthropic, and even Musk used to talk with that same lofty vision. But now, they've all shifted to competing for national interests instead, which is honestly quite disappointing.
Well, it’s a highly effective PR tactic that works well for the small fish. You say your competition is too selfish and you just want to help people and it creates a bunch of goodwill you can use to grow. Once you grow, your view on things changes, and you’re able to be more selfish. It’s not guaranteed things will go that way, but it’s certainly true that this is a good PR tactic for new entrants into a crowded field. It can also be genuine. When you’re new you don’t have much to lose and it’s easier to be truly altruistic.
I think DeepSeek is trying to push the idea that LLMs are not marketable products themselves, but are a part of the 'digital commons', as in software that is hard to develop and maintain and that in and of itself does not produce value, but can be the foundation of a product that does. This is very similar to what Facebook is doing with Llama, or what is going on with big open source projects, like databases or the Linux kernel.
I also think that the companies that are doing that have a different idea on how to make money. Facebook's competitive edge lies in all the people using their social media, and for the Chinese, I think their edge lies in manufacturing physical products, so they try to commodify the software component.
Which is in stark contrast to the US, who have a world-beating software and silicon industry, but are merely competent in other areas, so it makes sense for them to want to avoid that.
Why not for now just applaud them for their actions rather than focus on some potential 3rd order plan?
Who knows what any of them might do in the future? For now I'm cheering for DeepSeek, Meta and anyone publishing open models, as I strongly believe that the potential "danger" of AI in the hands of everyone is far outstripped by the concrete dangers of AI dictated by a select small group of corps/gov symbionts.
From what I know, DeepSeek is a small company that made a lot of money from other businesses, which makes their lack of focus on commercial interests feel more genuine. Plus, even back when they were relatively unknown, they had a habit of donating over $100 million annually to charitable causes. That makes their claim of striving for humanity a lot more believable.
Yes, it is PR. While individuals can be altruistic visionaries, shareholders will protest any action that is not in the company's interest.
For a smaller player, open-sourcing might be a strategic move. It would likely go unnoticed if a small Chinese company released a model "almost as good as" ones from the top US players. But releasing it as open source is a game-changer.
However, open source isn't just for small players. Microsoft develops Visual Studio Code and Meta develops PyTorch - to name a few examples out of hundreds. In these cases, it's also PR - they can afford it, and it doesn't compete with their core business.
There's a story about someone asking the Dalai Lama whether all altruism is actually a form of egoism, since we do good things to feel better. He responded that if that's the case, we need more of this type of egoism. (I can't find the exact source, but it aligns with his quote "Being wisely selfish means taking a broader view and recognizing that our own long-term individual interest lies in the welfare of everyone.")
True, in the end you are not sure if companies like meta / deepseek are promoting opensource because they genuinely care or it is just a differentiated marketing strategy to win over the developers.
Some companies will play on opensource, some will play on pricing, some on quality.
Almost all of the open source companies which do good eventually start an enterprise / paid division as well.
I get the urge to be cynical all the time, but this isn't that time. "Once you grow": they have already grown and are competing with the SoTA models, and they are still giving it all back to the community.
I just wish this smear campaign against them stops sometime soon.
My intuition suggests that because they are not the leaders, they will not stay in the news for long.
This way you stay on people's lips for a longer period, and by publishing code you hurt established giants by allowing much smaller players to compete.
There is no PR tactic; the only company that will stay on top will be the one that open-sources its models and makes them free to use. There are other ways to monetize. People around the globe are not going to use anything paid on a daily basis.
LLMs are not that different from programming languages. Imagine Guido van Rossum charging $200 so you can use Python...
Striving for the betterment of humanity, or striving for their peer technology competitor to have their intellectual property moat atom-bombed? I don't think altruism has any real role in this.
How will their mindset not be exploited (even, given time and power, by the exact same now-honest idealists) in the same way as the other people and companies you mention? It's a hard pill to swallow but especially after I read "The Power Broker" it's very true that some of the most inspiring idealists really do turn into amoral pragmatists.
I suspect the Chinese government fears being locked into US SaaS much more than the loss of control from open source. After all censorship can still be enforced at the level of App Stores / DNS for most consumers even with open source models.
And before you get carried away, let's wait and see. A Chinese company's claims of simply being open source are hard to buy, especially in an era of fake promises made at the beginning.
Saying that Musk "doesn't have the mindset" for the betterment of humanity is just ignorant in a very short-sighted way. Sure, he currently has a side project of fixing the US government and ensuring the US doesn't stray too far outside of its core interests, but SpaceX and Tesla are still his bread and butter, which he has spent most of his time on besides this scenic route.
I've followed him closely since ~2016 so I can say this with some conviction. He's exactly the same guy he was back then. He even talks of the exact same things with the same excitement. Sure, "American boots on MARS!" instead of just "boots on Mars" like he did after the inauguration, but it's quite clear he has seen the US falling apart as an existential risk for the more lofty goals especially SpaceX has for Humanity.
https://www.youtube.com/watch?v=wubITdJ_MCw
> I've followed him closely since ~2016 so I can say this with some conviction.
It's sad that you fell for it then. Read Phillip Long's post on him, not someone who follows him but someone who has worked with him for years. It should be eye opening about the kind of man he is.
There will be no Mars terraforming; his goal is being the world's first trillionaire. The emperor has no clothes, the companies run despite him, not because of him, and the cult of personality only appeals to people who somehow still fall for it.
Look past what he says and into what is actually happening.
He is actively helping take health care from poor people. He is firing thousands of people with families, mortgages and medical bills without cause. He is closing our national parks. All so he can personally have a tax cut.
His ex-wife is frantically posting in his replies for him to help with the healthcare of their own son. He can't even manage his family, so I don't think he has the betterment of humanity on his mind.
I think you've been drinking the koolaid too much. He's only in it to enrich himself and his cronies. There's a reason he's on course to become a trillionaire and it ain't because of altruism.
I don't really care. True intelligent discussions happen in some closed groups everywhere. It's been this way since forever. Only open discussions always attract unwanted users.
Am I the only one excited for the release but not overanalyzing their words? This thread feels full of personal interpretations. DeepSeek is still a business—great release, but expectations and motivations seem inflated.
Probably it's because there's nothing specific here to discuss. In the absence of specific new information, discussions turn generic [1] and that tends to make for shallow/indignant discussion. That's one reason why an announcement of announcement (like "Starting next week, we'll open-source 5 repos") is off topic on HN [2].
The releases themselves may turn out to be interesting, of course, and then there may be something substantive to have a thread about. The best submission would be to pick the most interesting release once it shows up.
The "launch week" pattern isn't great for HN, because we end up with a bunch of follow-ups that we have to downweight [3], and there's no guarantee that the largest thread(s) will be about the most interesting element(s) in the sequence. But startups do it anyway so we'll adapt.
[1] https://hn.algolia.com/?dateRange=all&page=0&prefix=true&que...
[2] https://hn.algolia.com/?dateRange=all&page=0&prefix=true&sor...
[3] https://hn.algolia.com/?dateRange=all&page=0&prefix=true&que...
In China businesses are not treated as a type of person under law. The word "business" does not mean the same thing there.
“Pure garage-energy” is a great phrase.
Most interested to see their inference stack, hope that’s one of the 5. I think most people are running R1 on a single H200 node but Deepseek had much lower RAM per GPU for their inference and so had some cluster based MoE deployment.
Their tech report says one inference deployment is around 400 GPUs...
You need that to optimize load balancing. Unfortunately that gain is not available to small or individual deployment.
I don't think the RAM size of the H800 was nerfed (80GB), but rather the interconnect bandwidth between GPUs.
But yeah, would be interesting to see how they optimized for that.
Correct. There are 3 main ways to "gimp" high end GPUs meant for training - "cores", "on-chip memory speed" and "interconnects". IIUC the H800 had the first 2 unchanged but halved the interconnect speeds.
H20 is the next iteration of the "sanctions" that I believe also limited the "cores" but left the on-chip memory intact, or slightly higher (from the new generation).
You know you're doing well as a company when someone uses bots to boycott
You do realize that... it's literally their entire company right? It's pretty damn cool that they're including everyone.
“Pure garage-energy” with 10,000 A100s, apparently. I’d love to have a garage like that.
From https://semianalysis.com/2025/01/31/deepseek-debates/
> We believe DeepSeek has access to around 10,000 of these H800s and about 10,000 H100s. Furthermore they have orders for many more H20’s, with Nvidia having produced over 1 million of the China specific GPU in the last 9 months.
This is more exciting to me than OpenAI's 12 days of Christmas
Emotionally I agree, but... o1 was a paradigm shift. Nothing DeepSeek has done is on that level yet. DeepSeek themselves would agree. Supposedly Liang Wenfeng himself flew to US to gather information when o1 was launched.
The paradigm shift is the actual 'Open' part, which OpenAI seems to be struggling with.
Maybe in terms of advancing scientific knowledge but DeepSeek has achieved a paradigm shift back from opex to capex. Certain applications are now economically viable when you don't have to pay per request and don't have to fight NVIDIA/sanctions for the privilege
Yeah OpenAI's 12 days was pure Altman bs
> Starting next week, we'll open-source 5 repos – one daily drop
Probably counts as announcement of announcement? Let’s wait for the actual repo drops before discussing them, especially because there are no details about what will be open sourced other than
> These are humble building blocks of our online service: documented, deployed and battle-tested in production.
You are right for sure saying to wait for the actual repos.
But on the other hand, compare this announcement in a README.md file in a GitHub repo with the EU's slideware approach: https://openeurollm.eu/
If I had to bet on someone providing some value, unfortunately I wouldn't bet on Europe.
I'm saying this as a European, deeply convinced that Europe is a good place to live. I've also worked for a couple of EU funded research projects, so I have some background experience on the outcome of these projects.
You’re not wrong, it’s a hell of a lot more exciting to watch players organically emerging from a competitive landscape with stuff you can put your hands on today (or next week) than players hand-picked and tasked by governments, making hollow announcements before they have anything interesting to show.
Yup, I posted https://news.ycombinator.com/item?id=43129444 before I saw that you'd made the point already.
On a completely innocuous side note, I kind of like to see the 'drop' language used by electronic dance music and hip hop producers used in software.
I think before "drop" in electronic music was a widely used term, "dropping a new track" (ie releasing new music) was a common hip-hop term, since forever.
Honestly I think this is drop as in drop shipping.
Deep respect for DeepSeek and what they've done regarding all the innovations and research they have been putting out in the open.
"Because every line shared becomes collective momentum that accelerates the journey. Daily unlocks begin soon. No ivory towers - just pure garage-energy and community-driven innovation" is a great phrase.
I'm empathetic to this argument, but it feels like it's doing a disservice to the open source ethos in general.
Pragmatic as China is, they may actually see the long-term value of being open research leaders over short-term profit. They are not as bound to immediate and constant growth as we are; their horizons do not change so dramatically every 4 years, to say the least.
Don't many respected developers care deeply about their research being open source? I'm no expert but I've read many an article (maybe I'll buy your bridge, too) that suggests willingness to open source research holds at least some weight in some researchers choosing their company. It strikes me as at least possible some of that is earnest. Sure, even DeepSeek isn't open source open source (no training set, etc.), but it feels like they deserve the benefit of the doubt.
All that said I'm still a student, a master of none, so cannot speak first hand to any of this. Just offering another point of view
In fact they are totally dismantling OpenAI. Most likely, without any intention on their part.
LLMs are a more legitimate version of the "blockchain" era, when most CIO magazines ran filler essays along the lines of "What's your blockchain strategy?"
The AI bubble will burst and will burst hard. By the end of 2026 at the latest.
Doesn't OpenAI have like 400M weekly active users now?
Is that app/website or API or both?
I mostly agree with you. Google has a good strategy of driving down costs, for example. I am amazed by the large number of API providers who host either the original DeepSeek R1 or a distilled version.
When cost approaches zero, use cases increase exponentially.
> Most likely, without any intention on their part.
I think this is a very, very naive assumption.
The founder is a quant with involvements in domestic investments and market design and pricing for decades - in China.
As seen with the case of Jack Ma, after you cross a certain level, there is no such thing as "not involved with politics" in China.
Liang knows exactly what he's doing.
> During 2021, Liang started buying thousands of Nvidia GPUs for his AI side project while running High-Flyer. Some industry insiders viewed it as the eccentric actions of a billionaire looking for a new hobby. One of Liang's business partners said they initially did not take Liang seriously and described their first meeting as seeing a very nerdy guy with a terrible hairstyle who could not articulate his vision. Liang simply said he wanted to build something and it will be a game changer which his business partners thought was only possible from giants such as ByteDance and Alibaba Group.
> During that month in an interview with 36Kr, Liang stated that High-Flyer had acquired 10,000 Nvidia A100 GPUs before the US government imposed AI chip restrictions on China.
> On 20 January 2025, Liang was invited to the Symposium with Experts, Entrepreneurs and Representatives from the Fields of Education, Science, Culture, Health and Sports (专家、企业家和教科文卫体等领域代表座谈会) hosted by Premier Li Qiang in Beijing. Liang, being considered as an industry expert, was asked to provide opinions and suggestions on a draft for comments of the annual 2024 government work report.
> On 17 February 2025, Liang along with the heads of other Chinese technology companies attended a symposium hosted by President Xi Jinping at the Great Hall of the People in Beijing.
Whether he intended to or not initially, what happens with DeepSeek is now out of this man's hands and will be 100% influenced by politics.
The chip bans and dual use nature of the technology have catapulted Liang to the first row of CCP tech strategists' attention, for sure.
Source: https://en.wikipedia.org/wiki/Liang_Wenfeng
I am not sure what you mean by AI bubble. Do you mean the valuation of some companies? Of course some won't do well in the future. In the meantime, a significant part of the population relies on it to accelerate their tasks (be it admin work, legal questions, learning, getting inspiration). There is no way back. It feels like saying the video streaming bubble will burst in 2020. No. It is too valuable. But yes, some players will die. Nothing special here. IMHO.
A bubble bursting does not mean the industry in the bubble ceases to exist. It means the market hype dies down and only the things that have actual value survive. When it comes to AI, realistically most of the hype is fluff, so calling it a bubble is fair.
I mean the whole world still uses the Internet after the dot-com bubble burst. A significant amount of “AI companies” are valued with revenue multipliers never used before. 44x in the case of OpenAI for example. I agree there is no going back, but this bubble will burst, and hard. IMHO.
Kinda interesting to see where the moat is in the AI space. Good base models can always be distilled when you have access to the API. System prompts can get leaked, and UI tricks can be copied. In the end, the moat might be in the hardware and vertical integration.
> the moat might be in the hardware and vertical integration.
The moat is the products that can be built. The moat is always the product - because a differentiated product can't be a commodity. And an LLM is not a product.
Google and MSFT and Meta have already "won" because they have profitable products they can build LLMs onto. Every other company seems to be burning cash to build a product, and only ChatGPT is getting the brand recognition to realistically compete.
Building an LLM is like building a database. Sure a good one unlocks new uses, but consumers aren't buying something for the database. Meanwhile enterprise customers will shop around and drive the price of a commodity down while open source alternatives grow from in-house uses to destroy moats.
Even hardware isn't a true moat. Only Google has strong vertical integration with their TPUs, and that gives them a lead. BUT Microsoft, AWS, Meta and a whole bunch of startups are building out custom silicon which will surely put pressure on them and Nvidia to keep innovating and earning that price edge.
See, I kind of buy the database argument but also kind of don't. A database needs an operator whereas an LLM doesn't. You're basically melting the product into a piece of goo and the UI can be approached using natural language.
For products that still need a UI you could claim that LLM operators take over, so that's still a tax you pay to the incumbents as you interact with a product. It's sort of like we take the money which was paid to SQL operators and engineers and instead pay it to the hyperscalers.
Oracle is doing great just selling databases. Having your data is a moat.
How many times have we been down this path? TCP/IP, DOS/Windows, Linux, virtualization, and on and on. Open platforms always seem to find a way to usurp everyone else. In the end, it's better to be a service provider.
Open source finds a way.
Good enough + open (and free) is a very appealing proposition.
> Good base models can always be distilled when you have access to the API.
What does that mean?
You can use the outputs of a closed source model (or DeepSeek -> Llama; see the DeepSeek-distilled Llama 70B) to create a synthetic training data set, which lets you fine-tune (distill) most of the benefits of the "smarter" model into a "dumber" model. This is why OpenAI does not show the actual full chain of thought but a summarized version: to stop exfiltration of their IP, which has proven immensely difficult.*
*disclaimer; i am an expert of nothing
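A rough sketch of the data-collection half of that idea, in Python. Everything here is an assumption for illustration: `toy_teacher` is a hypothetical stand-in for a call to a stronger model's API, and the prompt/completion JSONL layout is just one common shape for fine-tuning data, not any particular vendor's format. The actual fine-tuning of the smaller model is not shown.

```python
import json
from typing import Callable, Dict, List


def toy_teacher(prompt: str) -> str:
    # Hypothetical stand-in for an API call to a stronger "teacher" model.
    return "Reasoned answer to: " + prompt


def build_distillation_set(
    teacher: Callable[[str], str], prompts: List[str]
) -> List[Dict[str, str]]:
    # Query the teacher once per prompt and keep (prompt, completion) pairs.
    return [{"prompt": p, "completion": teacher(p)} for p in prompts]


def to_jsonl(records: List[Dict[str, str]]) -> str:
    # One JSON object per line, a layout many fine-tuning tools accept.
    return "\n".join(json.dumps(r) for r in records)


prompts = ["What is 2 + 2?", "Name a prime number."]
dataset = build_distillation_set(toy_teacher, prompts)
jsonl = to_jsonl(dataset)
print(jsonl)
```

The "student" model would then be fine-tuned on these pairs. If the teacher only exposes a summarized chain of thought, the summaries are all that ends up in the completions, which is the point made above.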
Why do we need a moat?
_We_ don't. Investors do. Because without being able to gatekeep the rest of the world, there is little money in LLMs.
So a company can make enough money to fund the next breakthrough/training run
there is no open source alternative to GPU farm, that's the moat
that's why they can open source their model and be fine because running this shit is actually hard, let alone maintaining SLA for millions of users??
How long until laptops are able to run high end models? What's the use case that requires a server farm for end users?
>Kinda interesting to see where the moat is in AI space.
Where we're going, we don't need moats.
ecosystem
Could DeepSeek and OpenAI swap names?
OpenSeek and DeepAI?
I think GP means that DeepSeek is actually open and thus should be named OpenAI.
This is great to see! Open-sourcing infrastructure tools can really accelerate innovation in the AI space. I've found that having access to well-documented repos makes it much easier to experiment and build on existing work. Are there any specific areas these repos focus on, like distributed training or model serving?
How do the valuations of foundation model companies compete with them being firmly open sourced by Facebook and DeepSeek? It seems likely that building these models will not produce hundreds of billions in value given China and Facebook are giving them away largely for free.
Those valuations are built on an imaginary future the founders made investors believe.
The idea is: if we reach true AGI first, we are going to own ALL THE MONEY!
Which erroneously assumes that models can't be siphoned off/recreated, as DeepSeek proved possible and even reasonably doable. Which in turn fundamentally shows that both OpenAI and Anthropic very likely have basically no moat.
I can almost smell another AI winter arriving, once all those valuations meet reality.
I can't see a future where AGI exists and money in general isn't worthless within 6 months of it existing. Either it kills us all, or makes the creator so much money that it's essentially worthless because they're the only one with money, or creates a utopia where money isn't needed.
winter won't come soon enough this time
Postgres and MySQL are free, but that hasn't stopped Oracle from making tens of billions each year in database subscriptions.
IMO it's harder to move away from Oracle DB than from Open AI. The type of businesses that rely on Oracle DB have all the characteristics of a "tech kidnap victim". Huge DB-driven projects, old bad code with few tests, and a profit margin low enough to not be able to fund a migration to a different DB.
I think businesses that rely on new AI models are very different.
It's pretty disrespectful to DeepSeek to say "Facebook and China".
Looking forward to it! I'll generally make an effort to use Open Models over proprietary alternatives when the use-case permits, as Open Models getting better and more popular encourages more models to become open as well - a prerequisite for a future where we can build self-hosted solutions that aren't beholden to the control of mega corps and AI monopolies.
Is this actually going to be open source? Or is it going to be just an open weights release? Seeing training code would be interesting.
Personally I don’t think even a true open source release would erase the downsides of the model incorporating CCP propaganda and censorship. I would prefer control of megacorps to control of an untrustworthy dictatorship.
[dead]
I wonder if they are just shorting Nvidia...
With how they are releasing models and keeping the open source spirit alive? I hope to god they are. Let the quants cook!
This could boost Nvidia. https://en.wikipedia.org/wiki/Jevons_paradox
> In economics, the Jevons paradox occurs when technological advancements make a resource more efficient to use (thereby reducing the amount needed for a single application); however, as the cost of using the resource drops, if the price is highly elastic, this results in overall demand increases causing total resource consumption to rise.
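A quick back-of-the-envelope version of that definition, as a Python sketch. It assumes a constant-elasticity demand curve, which is a textbook simplification and not something claimed in the thread; the function name and the numbers are purely illustrative. If efficiency improves by a factor g, the cost per unit of service falls by g, demand for the service rises by g^e (e being the price elasticity), and total resource use changes by g^e / g = g^(e-1).

```python
def resource_use_ratio(efficiency_gain: float, elasticity: float) -> float:
    # New total resource consumption divided by old consumption, under
    # an assumed constant-elasticity demand curve (illustrative only).
    return efficiency_gain ** (elasticity - 1.0)


# Hypothetical numbers: a 10x efficiency gain under different elasticities.
elastic = resource_use_ratio(10.0, 1.5)    # demand is highly elastic
unit = resource_use_ratio(10.0, 1.0)       # demand exactly offsets the gain
inelastic = resource_use_ratio(10.0, 0.5)  # demand barely responds

print(elastic > 1.0, unit == 1.0, inelastic < 1.0)
```

Total consumption only rises when the elasticity is above 1, so whether cheaper inference helps or hurts GPU demand comes down to how price-sensitive demand for inference really is.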
Tencent recently bought 100k-200k H20s to serve R1. [1] I think it's not clear open source will tank Nvidia's price. And you won't place a lot of bets if the outcome is anything but certain.
[1]:https://aiproem.substack.com/p/ai-at-the-speed-of-light-tenc...
What does that have to do with anything? Trading and stocks have no correlation whatsoever with actual company sales and products.
> Why? Because every line shared becomes collective momentum that accelerates the journey.
Truly admirable on their part and a great paradigm for others. The reasons for this don't really matter to me, but I can't help but wonder if somehow they were obliged or otherwise indebted to follow this route.
This team is truly something special.
> These are humble building blocks of our online service: documented, deployed and battle-tested in production. No vaporware, just code that moved our tiny moonshot forward.
My not-so-innocent guess is that they are looking to crowd-source their online platform (the front-end essentially) in order to reduce costs. Still acceptable though, as they made the model open weight and partially reproducible.
Everyone who has ever open-sourced anything knows that it just isn't cost cutting. You suddenly get an army of people posting issues and opinions, and those who try to contribute often make more mess than it's worth.
Their frontend is probably just Open WebUI: https://github.com/open-webui/open-webui
I always consider open-sourcing to be a great social experiment. It may fail one day, but its effects will remain and benefit everyone.
Well, although R1-671b is way too expensive for me to self-host, given their past open source (or weight) contributions, I DO have high expectations of them.
Each and every contribution to open source community will be helpful. Thanks DeepSeek!
Would love another MoE that fits in 120GB VRAM for the 128gb Mac owners
Deepseek seems to be having huge PR wins as the "oh shucks" modest boy genius, while the Americans seem like pouty jerks.
Amodei's / Hassabis' comments in particular came off as so arrogant and annoying.
>> Amodei's / Hassabis' comments in particular came off as so arrogant and annoying.
Exactly which part of their writings comes off as arrogant to you? The only point in Amodei's article[0] that could remotely be interpreted as arrogant is this:
Maybe I'm different, but it really does sound like a reasonable judgement to me.
[0]: https://darioamodei.com/on-deepseek-and-export-controls#deep...
[flagged]
[dead]
The funding company holds the assets, the news makes the stock market boom, and they make money!
God bless the DeepSeek team with more innovative ideas to share with us all!
R1 is a better o1, this is a better devdays.
DeepSeek seems like Hisoka helping Gon and Killua ... just for a more challenging battle at some point xD
More like the reverse? -- Gon and Killua (young, with tons of room to grow) helping Hisoka (very experienced, smaller runway).
Speaking of DeepSeek, anyone here used SambaNova - are they reliable?
DuckDuckGo also has one, so not sure if this makes a difference
Is it out of the realm of possibility to look at this move as a way to take down the moat of closed source AI companies?
I mean strategically this could be the first use of open source in this way.
No turning back...
Remember when OpenAI was doing this:
"OpenAI threatens to revoke o1 access for asking it about its chain of thought"
https://news.ycombinator.com/item?id=41534474
Not only did DeepSeek open source their model, they also showed the user chain-of-thought right up front, which everyone else rushed to emulate when they saw how much users liked it.
DeepSeek is seeking deep to Open AI.
irony
I really hope DeepSeek is going to open source their entire training pipeline.
Tbh this just feels like the same playbook as OAI. Open start and then less so over time.
Mistral has been holding the line on that topic remarkably well.
Beatings will continue until openness improves, apparently. Kudos to Deepseek, about time someone spilled some significant beans.
odds on r1.5/r2 release?
Looking forward to it
deepseek just keeps on giving. kudos to them.
i can almost hear sam altman and dario amodei cry every time deepseek does something amazing.
launch weeks ftw
I really like this definition of "AGI": When everyone (yes everyone) benefits from very powerful AI models released for free and it is not gate-kept by one company and it costs $0 to use commercially or for research and you can do whatever you want with it.
Unlike the other counterpart which believes that "AGI" means: "raising billions of dollars to achieve $100BN of profits to their investors". (Which is complete nonsense).
While not totally "open source" by the strictest definition, it is at least better than having no model released with no mention of the architecture on the system card or paper and just vague comments about the 'performance'.
Ladies and gentlemen, this is closer to being a better "Open AI". Unlike the other alleged $157BN "non-profit" scam.
I think you know which one really is beneficial to humanity and is the real "Open AI".
You’re assuming they won’t follow in OpenAI’s footsteps. OpenAI published a lot for a while and truly changed the world, far more than deepseek has. Only time will tell.
But I think it’d be a mistake to think that this is necessarily beneficial for humanity just because the weights are open. It’s maybe great to commoditize models, but their displacement in jobs, original thought and work, facilitation of disinformation and population psychological warfare doesn’t change… if anything it’s accelerated and harder to temper the bad elements.
Of course. Except we know what happens when one tries to close them up again - someone else will release another more powerful AI model for free.
So it doesn't matter when there are multiple players competing to destroy each other in this race to zero.
> But I think it’d be a mistake to think that this is necessarily beneficial for humanity just because the weights are open. It’s maybe great to commoditize models, but their displacement in jobs, original thought and work, facilitation of disinformation and population psychological warfare doesn’t change… if anything it’s accelerated and harder to temper the bad elements.
It is unrealistic to close it up and hope that no-one catches up and releases a better AI model for free since the cat's already out of the bag and the progress of these AI models cannot be delayed, stopped or gate-kept for long.
By that time, someone will release a more powerful AI model for free.
I really admire their mindset of striving for the betterment of humanity. There was a time when OpenAI, Anthropic, and even Musk used to talk with that same lofty vision. But now, they've all shifted to competing for national interests instead, which is honestly quite disappointing.
Well, it’s a highly effective PR tactic that works well for the small fish. You say your competition is too selfish and you just want to help people and it creates a bunch of goodwill you can use to grow. Once you grow, your view on things changes, and you’re able to be more selfish. It’s not guaranteed things will go that way, but it’s certainly true that this is a good PR tactic for new entrants in to a crowded field. It can also be genuine. When you’re new you don’t have much to lose and it’s easier to be truly altruistic.
I think DeepSeek is trying to push the idea that LLMs are not marketable products themselves, but are a part of the 'digital commons', as in software that is hard to develop and maintain and which in and of itself does not produce value, but can be the foundation of a product that does. This is very similar to what Facebook is doing with Llama, or what is going on with big open source projects, like databases or the Linux kernel.
I also think that the companies that are doing that have a different idea on how to make money. Facebook's competitive edge lies in all the people using their social media, and for the Chinese, I think their edge lies in manufacturing physical products, so they try to commodify the software component.
Which is in stark contrast to the US, who have a world-beating software and silicon industry, but are merely competent in other areas, so it makes sense for them to want to avoid that.
Why not for now just applaud them for their actions rather than focus on some potential 3rd order plan?
Who knows what any of them might do in the future? For now I'm cheering for DeepSeek, Meta and anyone publishing open models as I strongly believe that the potential "danger" of AI in the hands of everyone is far outstripped by the concrete dangers of AI dictated by a select small group of corps/gov symbionts.
From what I know, DeepSeek is a small company that made a lot of money from other businesses, which makes their lack of focus on commercial interests feel more genuine. Plus, even back when they were relatively unknown, they had a habit of donating over $100 million annually to charitable causes. That makes their claim of striving for humanity a lot more believable.
Yes, it is PR. While individuals can be altruistic visionaries, shareholders will protest any action that is not in the company's interest.
For a smaller player, open-sourcing might be a strategic move. It would likely go unnoticed if a small Chinese company released a model "almost as good as" ones from the top US players. But releasing it as open source is a game-changer.
However, open source isn't just for small players. Microsoft develops Visual Studio Code and Meta develops PyTorch - to name a few examples out of hundreds. In these cases, it's also PR - they can afford it, and it doesn't compete with their core business.
There's a story about someone asking the Dalai Lama whether all altruism is actually a form of egoism, since we do good things to feel better. He responded that if that's the case, we need more of this type of egoism. (I can't find the exact source, but it aligns with his quote "Being wisely selfish means taking a broader view and recognizing that our own long-term individual interest lies in the welfare of everyone.")
So yes, I want to see more of this kind of PR.
True, in the end you are not sure if companies like Meta / DeepSeek are promoting open source because they genuinely care or if it is just a differentiated marketing strategy to win over the developers.
Some companies will play on open source, some will play on pricing, some on quality.
Almost all of the open source companies which do well eventually start an enterprise / paid division as well.
I get the urge to be cynical all the time, but this isn't that time. "Once you grow": they have already grown and are competing with the SoTA models, and are still giving it all back to the community.
I just wish this smear campaign against them would stop sometime soon.
My intuition suggests that because they are not the leaders, they will not stay in the news for long. This way you stay on people's lips for a longer period, and by publishing code you hurt established giants by allowing much smaller players to compete.
There is no PR tactic. The only company that will stay on top will be the one that open sources its models and makes them free to use. There are other ways to monetize. People around the globe are not going to use anything paid on a daily basis.
LLMs are not that different from programming languages. Imagine Guido van Rossum charging $200 so you can use Python...
Literally how OpenAI attracted talent with DeepMind as the boogeyman. It's a playbook that works.
[flagged]
Power does terrible things to people, we really need to stop letting that happen.
"Power attracts pathological personalities. It is not that power corrupts but that it is magnetic to the corruptible." - Frank Herbert
Rich nations see risk, rising giants see leverage.
Striving for the betterment of humanity, or striving for their peer technology competitor to have their intellectual property moat atom-bombed? I don't think altruism has any real role in this.
Really it just shows the beauty of market competition.
They just stopped pretending.
How will their mindset not be exploited (even, given time and power, by the exact same now-honest idealists) in the same way as the other people and companies you mention? It's a hard pill to swallow but especially after I read "The Power Broker" it's very true that some of the most inspiring idealists really do turn into amoral pragmatists.
It’s greed not national interests unless you know something I don’t about greedy people.
OpenAI is the biggest irony, it's not even bothered with national interests, it's on a pure profit maximising goal without regard to anything else.
It's just an Nvidia short, so they can get the yuuge amount of graphic cards they need for further training even cheaper (joke).
Don’t forget Google who typically make their best AI products available only to large customers. For “safety” of course.
To me it's notable that the Chinese government didn't care (or know) about this going open source.
I suspect the Chinese government fears being locked into US SaaS much more than the loss of control from open source. After all censorship can still be enforced at the level of App Stores / DNS for most consumers even with open source models.
We are making the world a better place more than our competitors
you forgot to add "/sarcasm"
And before you get carried away, let's wait and see. A Chinese company's claims of being just open source are hard to buy, especially in an era of making fake promises in the beginning.
The CPC seem to be encouraging open source, gitee (Chinese github) is run by the government.
Isn't Musk still on the open side? Isn't that what the whole Musk - Altman conflict is about?
Maybe. We’ll see if he open sources Grok 2 or if he just wants others to open source their models and weights.
[dead]
Saying that Musk "doesn't have the mindset" for betterment of humanity is just ignorant in a very short-sighted way. Sure, he currently has a side project of fixing the US government and ensuring US doesn't stray too far outside of its core interests, but SpaceX and Tesla are still his bread and butter he has spent most of his time on beside this scenic route.
I've followed him closely since ~2016 so I can say this with some conviction. He's exactly the same guy he was back then. He even talks of the exact same things with the same excitement. Sure, "American boots on MARS!" instead of just "boots on Mars" like he did after the inauguration, but it's quite clear he has seen the US falling apart as an existential risk for the more lofty goals especially SpaceX has for Humanity. https://www.youtube.com/watch?v=wubITdJ_MCw
> I've followed him closely since ~2016 so I can say this with some conviction.
It's sad that you fell for it then. Read Phillip Long's post on him, not someone who follows him but someone who has worked with him for years. It should be eye opening about the kind of man he is.
There will be no Mars terraforming; his goal is being the world's first trillionaire. The emperor has no clothes, the companies run despite him, not because of him, and the cult of personality only appeals to people who somehow still fall for it.
Look past what he says and into what is actually happening.
He is actively helping take health care from poor people. He is firing thousands of people with families, mortgages and medical bills without cause. He is closing our national parks. All so he can personally have a tax cut.
His ex-wife is frantically posting for him to help with the healthcare of their own son in his replies. He can't even manage his family; I don't think he has the betterment of humanity on his mind.
There was a time when I was this naive, but it's surely a very long time ago
I think you've been drinking the koolaid too much. He's only in it to enrich himself and his cronies. There's a reason he's on course to become a trillionaire and it ain't because of altruism.
He is not fixing anything, he is just a human, the kind with flaws, that thinks he isn't.
"Fixing" the US government
[flagged]
Long live LLMs. I hope they infest every part of the internet with low-level comments. The clear, the deep, and the dark.
Imagine no more human interactions, just a permanent flood of meaningless, thoughtless word salad.
I think the Chinese are perfect to introduce such a product, very in line with what they usually produce.
Get ready for web3.o
I don't really care. True intelligent discussions happen in some closed groups everywhere. It's been this way since forever. Only open discussions always attract unwanted users.
This may be my cynical take, but this cannot be out of good will or noble intentions. There has to be an ulterior motive.
Pop the US AI bubble?
It's cynical, probably because you have only been consuming cynical news.
Wasn't it already caught sending data to China in a sneaky way? Why use it for anything?
Reference supporting precisely that: https://www.wired.com/story/deepseek-ai-china-privacy-data/