Comment by handoflixue

13 hours ago

"Copyright violation of a published work" and "stealing private trade secrets" are in fact very different crimes.

Humans have spent millennia harvesting and distilling each other's IP - "the shoulders of giants" and all that, so it's an especially disingenuous take.

For something to be a trade secret, you have to actually keep it secret. If I get the ingredients of Coca-Cola from an ex-employee, I've stolen a trade secret. If I work it out by doing a chemical analysis, I've stolen nothing.

There is a difference with Anthropic, as no-one signs a licence agreement to buy a Coke. But Anthropic are also not saying you can't publish the output of their models. It's not clear to me if trade secret law will (or should) cover a secret which can be extracted from information that licensees are not restricted from publishing.

  • Wait, really? So why doesn't someone just reverse-engineer Coca-Cola like that? My understanding was that a "clean room" implementation is fine, but not reverse-engineering. If you can just copy everything on the market, why isn't someone already doing that?

    • In the case of Coca-Cola, because use of coca leaves is highly regulated due to the fact that they also contain cocaine. There is a YouTuber who claims to have reverse engineered Coca-Cola, but he had to use tea-tree oil instead of actual coca leaf extract.

      Here's EFF on reverse engineering and the law: https://www.eff.org/issues/coders/reverse-engineering-faq

      Historically, a lot of competition in physical products was very much reverse engineering, because you can buy them without signing your rights away. That's why companies are keen on patents and click-through agreements.

      If you look at how "clean room" processes work, they are actually a form of reverse engineering. Also, the clean room technique exists to avoid your new implementation infringing copyright, not trade secrets.

    • The Coca-Cola formula was reverse engineered in 2026 by a sufficiently motivated individual.

      Here it is.

      Per liter of cola:

      104 g sugar

      1 mL Flavor Solution A

      10 mL Flavor Solution B

      Carbonated water to volume

      Flavor Solution A (Essential Oils):

      Dilute 20–21 mL of the following oil mixture to 1 L using 95% ethanol:

      45.8 mL lemon oil

      36.5 mL lime oil

      8 mL tea tree oil (emulates decocainized coca leaf extract)

      4.5 mL Cassia cinnamon oil

      2.7 mL nutmeg oil

      1.2 mL orange oil

      0.7 mL coriander oil

      0.6 mL fenchol

      Flavor Solution B (Chemical and Color Base):

      Dilute the following ingredients to a volume of 1 L using water:

      320 mL Shank's caramel color or 190 mL Durkee caramel color

      160 g glycerin

      45 mL 85% phosphoric acid

      10 mL vinegar (5% acidity)

      10 mL vanilla extract

      10 g wine tannins (emulates decocainized coca leaf extract)

      9.65 g caffeine
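
      The two-stage dilution above hides how little of each ingredient ends up in the drink. As a rough sketch of that arithmetic only (the function and constant names are made up for illustration, and it assumes the lower bound of 20 mL from the "20–21 mL" range), the per-liter amounts work out like this:

```python
# Oil mixture for Flavor Solution A, in mL (values copied from the recipe above).
OIL_MIX_ML = {
    "lemon oil": 45.8,
    "lime oil": 36.5,
    "tea tree oil": 8.0,
    "Cassia cinnamon oil": 4.5,
    "nutmeg oil": 2.7,
    "orange oil": 1.2,
    "coriander oil": 0.7,
    "fenchol": 0.6,
}

# Ingredients of Flavor Solution B that the recipe gives by mass, in g per 1 L.
SOLUTION_B_G = {"glycerin": 160.0, "wine tannins": 10.0, "caffeine": 9.65}

# Assumption: the recipe says 20-21 mL; the lower bound is used here.
OIL_MIX_PER_L_SOLUTION_A_ML = 20.0
SOLUTION_A_PER_L_COLA_ML = 1.0
SOLUTION_B_PER_L_COLA_ML = 10.0


def oils_per_liter_cola_ul():
    """Microliters of each oil in one liter of finished cola."""
    total_ml = sum(OIL_MIX_ML.values())
    # mL of oil mixture carried into 1 L of cola by the dose of Solution A.
    mix_ml = OIL_MIX_PER_L_SOLUTION_A_ML * SOLUTION_A_PER_L_COLA_ML / 1000.0
    return {
        name: ml / total_ml * mix_ml * 1000.0
        for name, ml in OIL_MIX_ML.items()
    }


def solution_b_solids_per_liter_cola_mg():
    """Milligrams of each mass-measured Solution B ingredient per liter of cola."""
    fraction = SOLUTION_B_PER_L_COLA_ML / 1000.0
    return {name: g * fraction * 1000.0 for name, g in SOLUTION_B_G.items()}


if __name__ == "__main__":
    for name, ul in oils_per_liter_cola_ul().items():
        print(f"{name}: {ul:.2f} uL per liter")
    for name, mg in solution_b_solids_per_liter_cola_mg().items():
        print(f"{name}: {mg:.1f} mg per liter")
```

      Under those assumptions the whole oil mixture comes to about 0.02 mL per liter of cola, and the caffeine to 96.5 mg per liter.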


    • Because having the nominal rights and having the economic means, societal incentives and actual desire to do so can be highly disjoint sets?

      Plus, Coca-Cola itself doesn't even use the same formula across time and place, IIRC, which clearly shows that what people buy when they reach for a Coca-Cola is not even the exact taste. You can't replicate the whole customer experience that a given company provides at some point by cloning only the tip of the iceberg that they showcase as the product.

> Humans have spent millennia harvesting and distilling each other's IP

You may be somewhat correct, but copyright lawyers wouldn't have work if others' IP were up for grabs willy-nilly just because "shoulders of giants and all that".

  • I mean, there's an obvious difference between "distributing copies" (which is what the law was designed to prevent) and "training an LLM". We already managed "banning LLM output that contains copyrighted text" - it's much easier to just pirate a copy of the text. So I think the copyright lawyers will continue to have work as long as human-written texts are worth buying.

    • > I mean, there's an obvious difference between "distributing copies" (which is what the law was designed to prevent) and "training an LLM".

      What's the difference between you or me downloading an mp3 through torrents for personal use (not distributing), while risking criminal punishment in most of the Western world, and BigCorp downloading petabytes of copyrighted works "to train an LLM" and reselling it?

      Can you or I do the same when the police come to our door?

      "Dear police, don't lock me up - I was just going to train an LLM!"
