Comment by Natsu

3 days ago

It's funny, because while that's a terrible educational experience, you actually learned some important lessons despite it.

I remember the first time I found out that the software documentation I had been relying upon was simply and utterly wrong. It was so freeing to start looking at how things actually behaved instead of believing the false documentation - the world finally made sense again.

Sometimes it's not even rare for documentation to be wrong. The documentation for a vendor whom I won't name - but who might be at Series J and worth north of $50 billion - seems to be wrong more often than it's right.

We frequently say, "Don't blame the tools, it's you." That pushes "blame the tools" outside of the Overton window, and when we need to blame a tool, we're looked at like we have five heads.

Ten years ago, I was dealing with a bizarre problem in RHEL where we'd stand up an EC2 instance with 4GB of memory, have 4.4GB of memory reported to the system, and be able to use 3.6GB of it. I spent _a long_ time trying to figure out what was going on. (This was around the time we started using Node.js at that company, and needed 4GB of RAM just for Jenkins to run Webpack, and we couldn't justify the expense of 8GB nodes.)

I did a deep dive into how the BIOS advertises memory to the system, how Linux maps it, and so forth, before finally filing a bug with Red Hat. 36 hours later, they identified a commit in the upstream kernel that they had forgotten to cherry-pick into the RHEL kernel.
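
For anyone who wants to see the first half of that discrepancy for themselves, here's a minimal Python sketch (my own illustration, not the tooling I used back then) that compares the kernel's reported MemTotal against what the instance was nominally provisioned with. EXPECTED_MIB is a made-up placeholder for a 4GB instance, and it's Linux-only, since it reads /proc/meminfo.

    # Minimal sketch: compare what the kernel reports in /proc/meminfo
    # against what the instance was nominally provisioned with.
    # EXPECTED_MIB is a made-up placeholder; Linux-only, since it reads /proc.

    EXPECTED_MIB = 4096  # a nominal 4GB instance

    def reported_mib(path="/proc/meminfo"):
        """Return MemTotal from /proc/meminfo, converted from kB to MiB."""
        with open(path) as f:
            for line in f:
                if line.startswith("MemTotal:"):
                    return int(line.split()[1]) // 1024
        raise RuntimeError("MemTotal not found in " + path)

    print(f"provisioned ~{EXPECTED_MIB} MiB, kernel reports {reported_mib()} MiB")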

That's a very human mistake, and not one I'd ever dreamed the humans at Red Hat - the ones far smarter than me, making far more money than me - could make! Yet here we were, and I'd wasted a bunch of time convinced that a support ticket was not the right way to go.

  • > Yet here we were, and I'd wasted a bunch of time convinced that a support ticket was not the right way to go.

    From my experience with public issue trackers for big projects, it's very reasonable to postpone creating a new issue and instead chase my own hypothesis/solution first:

    * creating a new issue takes significant effort to be concise, provide examples, add annotated screenshots, follow the reporting template, etc., in hopes of convincing the project members that the issue is worth their time.

    Failing to do so often results in project members not understanding or misunderstanding the problem, and all too often leads to them directly closing the issue.

    And even when you report a well-written issue, it can still just be ignored or stall, and then get autoclosed by GitHubBot.

  • In my case, it was egregiously bad, because someone had cribbed docs from an entirely separate scripting language that did almost the same things. Most of the same features were there, but the docs were utter lies, and failures were silent. So you'd go down the wrong branch of an if statement because it wasn't checking the conditions it claimed to check.

    Once I started actually testing the scripts against the docs and rewriting them, life got so much better. The worst part is that it had been that way for years and somehow nobody noticed, because the people using that horrible scripting language mostly weren't programmers; they'd just tweak things until they could hit the happy path just enough to kinda-sorta work.

I took and then TA'd a class where the semester-long project was to control robots (it was a software engineering principles class; the actual code writing could be done in a single weekend, but you had to do all the other stuff of software engineering - requirements analysis, documentation, etc.).

We had a software simulator of the robots. In the first lab, everyone dutifully wrote code that worked great on the simulator, and only then did we unlock the real robots and give you 2-3 minutes with the real thing. And the robot never moved in that first lab, because the simulator had a bug and didn't actually behave like the real robot did. We didn't deliberately design it that way - it came like that - but in a decade of doing the class we never once tried to fix the simulator, because it was an incredibly important lesson we wanted to teach the students: documentation lies. Simulators aren't quite right. Trust no one, not even your mentor/TA/professor.

We did not actually grade anyone on their robot failing to move; no grade was given on that first lab experience, because everyone failed to move the robot. But they still learned the lesson.

  • > because the simulator had a bug

    I had something similar happen when I was taking microcomputers (a HW/SW codesign class at my school). We had hand-built (as in everything was wire-wrapped) 68k computers we were using, and we could only download our code over a 1200-baud serial line. Needless to say, it was slow as hell, even for the day (early 2000s). So, we used a 68k emulator to do most of our development work and testing.

    Late one night (it was seriously like 1 or 2 am), our prof happened by the lab as we were working and asked to see how it was going. I was project lead, had been keeping him apprised, and was confident we were almost complete. After waiting the 20 minutes to download our code (it was seriously only a couple dozen KB of code), it immediately failed, yet we could show it worked on the simulator.

    We single-stepped through the code (the only "debugger" we had available was a toggle switch for the clock and an LED hex readout of the 16-bit data bus). I had spent enough time staring at the bus over the course of the semester that I'd gotten quite good at decoding the instructions in my head, and I immediately saw that we were doing a word compare (16-bit) instead of a long compare (32-bit) on an address. The simulator treated all address compares as 32-bit, regardless of the actual instruction. The real hardware, of course, did not. It was a simple fix - literally one bit. We made it in-memory on the computer instead of going through the 20-minute download again, and everything magically worked. The professor was impressed, too.
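
    To make the mismatch concrete, here's a small Python sketch of the effect (the addresses are invented for illustration; the real code was 68k assembly): the simulator effectively compared all 32 bits of an address no matter what, while the hardware honored the word-sized instruction and compared only the low 16 bits, so two different addresses could look "equal" on the real machine.

        # Hypothetical illustration of the word-vs-long compare mismatch.
        # The addresses are made up; nothing here is the original code.

        target = 0x0004_2000     # address the code meant to match
        candidate = 0x0001_2000  # different address, same low 16 bits

        def word_compare(a, b):
            """What the real 68k did for a word-sized compare: low 16 bits only."""
            return (a & 0xFFFF) == (b & 0xFFFF)

        def long_compare(a, b):
            """What the simulator did for every address compare: all 32 bits."""
            return (a & 0xFFFF_FFFF) == (b & 0xFFFF_FFFF)

        print(word_compare(target, candidate))  # True  -> hardware saw a false match and took the wrong path
        print(long_compare(target, candidate))  # False -> simulator accidentally did what the code intended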

  • Just out of curiosity, were you up-front after the fact that this was part of the exercise?

    We had a first-semester freshman year course that all incoming students were required to take. The first assignment in that class was an essay, pretty typical stuff, I don't even remember what about.

    A day after handing it in, roughly half of the class would be given a formal academic citation for plagiarism. That half of the class hadn't cited their sources. "This one time only", the citation could be removed if the students re-submitted an essay with a bibliography.

    While it was obvious, in hindsight, that the point of the exercise was to get you to understand that the university took plagiarism seriously, especially with the "this one time only" string attached, it felt dishonest in that nobody ever came out and said so. I luckily wasn't on the receiving end of one of those citations, but I can only imagine the panic of a typical first-semester freshman being formally accused of plagiarism.

    • If someone complained to us TAs during or after the lab that the simulators were incorrect, we were quite open that indeed they were, and that was not our doing, but we were okay with it because lying documentation was a part of the real world.

      The professor had been doing the class with those robots for several years when I took the class the first time, but I don't know if he acquired that brand of robots because their simulator was broken or if that was just a happy accident that he took advantage of.

      The lesson certainly has stuck with me- this was one lab in a class I took almost a quarter-century ago and I vividly remember both the frustration of not moving the robot and the frustration of everyone in the sections that I TA'd.

    • Right. I'm all for making freshmen learn it early but this is just hazing.