Comment by cheema33

3 hours ago

As others have pointed out, humans train on existing codebases as well. And then use that knowledge to build clean room implementations.

5 comments

cheema33

mxey 2 hours ago

That’s the opposite of clean-room. The whole point of clean-room design is that you have your software written by people who have not looked into the competing, existing implementation, to prevent any claim of plagiarism.

“Typically, a clean-room design is done by having someone examine the system to be reimplemented and having this person write a specification. This specification is then reviewed by a lawyer to ensure that no copyrighted material is included. The specification is then implemented by a team with no connection to the original examiners.”

kelnos 2 hours ago

No they don't. One team meticulously documents and specs out what the original code does, and then a completely independent team, who has never seen the original source code, implements it.

Otherwise it's not clean-room, it's plagiarism.

regularfry 2 hours ago

What they don't do is read the product they're clean-rooming. That's kinda disqualifying. Impossible to know if the GCC source is in 4.6's training set but it would be kinda weird if it wasn't.

pizlonator 2 hours ago

Not the same.

I have read nowhere near as much code (or anything) as what Claude has to read to get to where it is.

And I can write an optimizing compiler that isn't slower than GCC -O0

cermicelli 3 hours ago

If that's what clean room means to you, I do know AI can definitely replace you. As even ChatGPT is better than that.

(prompt: what does a clean room implementation mean?)

From ChatGPT without login BTW!

> A clean room implementation is a way of building something (usually software) without copying or being influenced by the original implementation, so you avoid copyright or IP issues.

> The core idea is separation.

> Here’s how it usually works:

> The basic setup

> Two teams (or two roles):

> Specification team (the “dirty room”)

> Looks at the original product, code, or behavior

> Documents what it does, not how it does it

> Produces specs, interfaces, test cases, and behavior descriptions

> Implementation team (the “clean room”)

> Never sees the original code

> Only reads the specs

> Writes a brand-new implementation from scratch

> Because the clean team never touches the original code, their work is considered independently created, even if the behavior matches.

> Why people do this

> Reverse-engineering legally

> Avoid copyright infringement

> Reimplement proprietary systems

> Create open-source replacements

> Build compatible software (file formats, APIs, protocols)

I really am starting to think we have achieved AGI. > Average (G)Human Intelligence

LMAO