Comment by walrus01

18 hours ago

Reminds me a bit of the anecdote of Steve Jobs complaining about people ripping off the Mac GUI, in the mid to late 1980s, when he gave no public acknowledgement to the work done by Xerox on the Alto and Star operating system.

"you're trying to rip off what I've already ripped off!"

Crawl the whole Internet to build a gargantuan sized LLM and then complain you're being copied...

I think you meant a quote attributed to Bill Gates:

"Well, Steve, I think there's more than one way of looking at it. I think it's more like we both had this rich neighbor named Xerox and I broke into his house to steal the TV set and found out that you had already stolen it."

Yeah, the whole AI industry is just people ripping off each other. It started with AI companies gulping up all the information that technical or altruistic people shared on the Internet in the past 40 years to help other fellow humans, then moved to AI companies consuming pirated and copyrighted material, and now it's AI companies ripping off each other.

Information really does want to become free, but AI companies want to be gatekeepers. Long term I bet on the open weights to win, as the more sustainable approach.

  • I'm very pro distillation. I think there need to be distillation nonprofits that curate massive corpora of super high value training data from frontier models. They could have an "anonymous contribution" system where regular people with max subscriptions upload their conversation histories. It's a rough concept, but it would surely be a huge boon to humanity.

    • Sort of sounds like "project tapestry" by Yann LeCun. Build projected data silos of highly valuable information, train in a distributed manner and share the weights upwards where they're combined and fine tuned.

Apple gave Xerox the right to buy $1 million of pre-IPO stock before the meeting took place.

  • Glad you pointed this out. I believe the sequence was that Jobs himself got a shorter demo during his first visit with no prior arrangements. He then negotiated bringing back a group of his key people to get a more in depth demo and that included the stock deal.

    When Apple was accused of 'ripping off' PARC, Steve didn't seem keen to bring up this rather salient point. I suspect it may have been a combination of wanting Apple to continue receiving credit for these innovations from consumers and also the fact that, in retrospect, the million dollar stock deal could seem a bit like trading beads to Native Americans for Manhattan Island. Another point worth noting is that Apple's PARC visit was in December 1979 and the Xerox Star was publicly announced in April 1981, so Apple got a 15 month head start (the Apple Lisa shipped in Jan 83).

    I've also heard that Xerox didn't hold on to the Apple stock for very long, so never gained the windfall they could have. As is well documented, Xerox senior management didn't understand what they had in PARC and also didn't understand how rapidly microcomputers would become ubiquitous. So, of course, they didn't think Apple's stock price would skyrocket either.

    • Lisa and early MacOS are tremendously different in their details than the Alto operating system. While there was clearly a transfer of inspiration, Apple engineers like Bill Atkinson made countless small and large innovations to simplify the Xerox GUI model and improve its usability based on extensive in-house R&D and user testing (and in some cases implement features that the Apple team presumed Xerox had but actually didn't exist on the Alto). It is simply ahistoric to build narratives around Apple stealing Xerox ideas wholesale.

      For more details on Apple's early UI evolution, Atkinson kept polaroids of a variety of prototypes and mockups: https://www.youtube.com/watch?v=Qg0mHFcB510

    • > the million dollar stock deal could seem a bit like trading beads to Native Americans for Manhattan Island

      But in both cases the value only existed because of the people offering the deal. Xerox doing nothing with a UI or Native Americans doing nothing with some land would mean the UI and the land would continue to be worth nothing. It was the others coming with ideas and effort that made them valuable.

[flagged]

  • The websites, music, movies, books, photos, art that they stole didn't appear out of thin air. The amount of time and effort people have collectively poured into creating these works throughout history far, far surpasses Anthropic's own effort of converting them into model weights.

  • The equivocation is crawling website <-> crawling LLM responses.

    Both Anthropic and Alibaba are trying to build bleeding edge LLMs. That part is the same. The way they source their data is slightly different, but they would both argue it constitutes fair use under Copyright law.

  • "Your extremely efficient multi petabyte internet content suction machine is ripping off my extremely efficient multi petabyte internet content suction machine"

    Sucking down petabytes of people's copyrighted content that they never granted you a specific license to use seems to be an unavoidable and default part of the process of building any huge LLM.

  • It's not really equivocation in this instance. This feels like a 'bad faith' comment. We can do better.

    LLMs literally wouldn't work without the sum total of knowledge (in the form of books and other copyrighted content) being used as 'training data'.

    The 'bleeding edge' LLMs required many things, but:

    1. Tech innovation ('attention')
    2. Lots of compute
    3. Data
    4. Pre + post training

    #4 doesn't happen without #3.

    It's pretty obvious at this point that the major providers have stolen vast amounts of #3 - they have paid nearly 0 of the creators.

    We can argue about the impact (I'd lean net good) vs. the cost. But arguing there isn't a cost is a bit silly.