Comment by acqq

7 years ago

> Consider that the name of local variables is never part of the binaries, only public symbols are.

"Never" and "only public" are wrong in the statement above, because non-public symbols were indeed released by Microsoft.

I guess you are young enough not to know that Microsoft did accidentally release some NT builds with the names of internal variables, and that such builds were intentionally made with fewer compiler optimizations, allowing for easier reversing. These releases of internal names resulted in some very interesting stories and statements:

https://en.wikipedia.org/wiki/NSAKEY

"_NSAKEY was a variable name discovered in Windows NT 4 Service Pack 5 (which had been released unstripped of its symbolic debugging data) in August 1999 by Andrew D. Fernandes of Cryptonym Corporation."

Also, the Windows code has been available under a shared source license for nearly 20 years. It is not really the sensitive secret or crown jewels that many want us to believe.

  • Sure, and the source code has also been leaked before. But a "clean room" implementation is not allowed to look at such things.

Back in NT 3.x times, for quad-processor machines, and IIRC even NT 4.x with 8-processor machines (like AXEL crazy SMP monsters), Microsoft shipped Windows NT as code, and you had to compile it on the machine it was to run on.

Microsoft does not release the kernel’s private symbols, trust me on that. But yes, there were some leaks in the past: small portions of NT4 and W2K were leaked. I think I link to a Google query pointing to articles discussing the leaks in the Quora reply.

  • Well, that's categorically untrue. Sure, they don't release private symbols intentionally, but they have done so accidentally in the past. At that point it becomes a bit of a grey area. Leaked or stolen source code is undoubtedly a no-no, but reversing from private symbols when they do leak seems harder to quantify, as you still need to reverse engineer the code; it's just that structures, names, etc. are already known.

    Private symbols are not the only way of gleaning more information, other examples I can think of are:

    * Checked builds (prior to Win10). These builds shipped de-optimized kernels (e.g. no inlines) typically with copious debug strings which gave away important details. For example I gleaned a lot of knowledge of ALPC MSRPC from the checked build of rpcrt4.dll from Windows 8.

    * SDK/DDK headers. Especially in the brave new world of insider previews with preview SDK/DDKs, there is sometimes information present which should not have been released, including "private" information. Again, a bit of a grey area.

    * The private symbols MS do ship. For example a significant proportion of the COM runtime has private symbols, intentionally. You can extract from those a surprising amount of system call structure information.

    I'd recommend watching Alex Ionescu's talk at OffensiveCon about how he does reverse engineering on Windows to see many of these things in action. https://www.youtube.com/watch?v=2D9ExVc0G10

    I'm not saying any of this would make it a clean-room re-implementation, but to say that ReactOS cannot possibly have been reverse engineered without just up and copying source isn't true.

    edit: Formatting.

    • I’d love for you to point to an instance where the private symbols of the kernel actually shipped.

      It is very possible that some private symbols were part of some leak, but stolen data does not qualify as “shipping” :)

      Again, I stand behind my opinion. I eyeballed some of the code side-by-side and there were portions where I could literally see a line-by-line correlation, which I can hardly explain.

      Then, if reversing the kernel is so doable using legitimate means, why is ReactOS still largely stuck in the early 2000s, coincidentally when the major leaks happened?

    • > Sure they don't release private symbols intentionally but they have done in the past accidentally.

      Even if that is the case, it's an incredibly poor idea to use them, because the code then ends up with spurious similarities in spite of being (otherwise) cleanly developed.

  • Are you arguing that MS mistakenly releasing a version of Windows with debug symbols didn't happen or that it constitutes a leak and isn't fair game?

    Because the former is fairly well documented and the latter doesn't seem right in the context of this discussion. If MS themselves messed up and published the symbols through an official channel that's fair game IMO. Although obviously IANAL etc... I'm talking from an ethical perspective, not a legal one.

    I don't know much about ReactOS or the NT kernel, but we have this type of controversy regularly in the emulation scene. While it's sometimes true that people reuse docs they shouldn't have, a lot of the time people underestimate the skill and cunning of reverse engineers in figuring out how things work without access to any restricted information.

    • Kind of late to the game here, but MS could ship actual source code and copying that code would still be copyright infringement. Reverse engineering for interoperability is legal in many places, but copying implementations is not. Even having seen the code, you would have to reimplement it in a way that worked equivalently but was different. The OP is claiming that even things like macros have the same implementation and names (i.e. code that is never exposed publicly in any way). Even if you could deduce this from debug symbols (which would be quite tricky, but probably not impossible), you've got to find another way to do that work.

      I don't really subscribe to a belief of absolute morality, but in the context of the discussion, I think that no matter how you got access to that code, if you say that you are reverse engineering it, then copying an implementation is not doing what you are saying you are doing (as well as being copyright infringement).

    • > Are you arguing that MS mistakenly releasing a version of Windows with debug symbols didn't happen or that it constitutes a leak and isn't fair game?

      I think he's saying that even having access to leaked or accidentally released originals is not implicit permission to use them freely. Otherwise any piece of software that was ever legitimately released would be fair game: just throw it at a decompiler and profit.

      If you're making a clean room design, having so many similarities to the original is unlikely to happen accidentally.

      Anybody implementing a clean room design should theoretically have no prior knowledge of the original's inner workings. The specs are written by one person, checked to not include any of the original material by a second one, before being passed to a third to be implemented.

      From far enough away, a piece of wire and an isolation transformer do the same thing. The secret sauce is in that isolation: you can't just shunt it and pretend it's the same.