Comment by crazygringo

1 day ago

> the lack of comments is simply not an issue

I'm looking at the code and just cannot agree. If I look at a command like "TRotateFloatCommand.DoIt" in URotate.p, it's 200 lines long without a single comment. I look at a section like this and there's nothing literate about it. I have no idea what it's doing or why at a glance:

  pt.h := BSR (r.left + ORD4 (r.right), 1);
  pt.v := BSR (r.top + ORD4 (r.bottom), 1);
  
  pt.h := pt.h - BSR (width, 1);
  pt.v := pt.v - BSR (height, 1);
  
  pt.h := Max (0, Min (pt.h, fDoc.fCols - width));
  pt.v := Max (0, Min (pt.v, fDoc.fRows - height));
  
  IF width > fDoc.fCols THEN
    pt.h := pt.h - BSR (width - fDoc.fCols - 1, 1);
  
  IF height > fDoc.fRows THEN
    pt.v := pt.v - BSR (height - fDoc.fRows - 1, 1);
  

Just breaking up the function with comments delineating its four main sections and what they do would be a start. As would simple things like commenting e.g. what purpose 'pt' serves -- the code block above is where it is first defined, but you can't guess what its purpose is until later when it's used to define something else.

Good code does not make comments unnecessary or redundant or harmful. This is a myth that needs to die. Comments help you understand code much faster, understand the purpose of variables before they get used, understand the purpose of functions and parameters before reading the code that defines them, etc. They vastly aid in comprehension. And those are just "what" comments I'm talking about -- the additional necessity of "why" comments (why the code uses x approach instead of seemingly more obvious approach y or z, which were tried and failed) is a whole other subject.

That particular code is idiomatic to anyone who worked with 2D bitmap graphics in that era.

pt == point, r == rect, h, v == horizontal, vertical, BSR(...,1) is a fast integer divide by 2, ORD4 promotes an expression to a 4-byte (32-bit) integer.

The algorithms are extremely common for 2D graphics programming. The first is to find the center of a 2D rectangle, the second offsets a point by half the size, the third clips a point to be in the range of a rectangle, and so on.
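
As a rough C sketch of the two idioms named above (BSR as a halving shift, ORD4 as widening before an addition), assuming non-negative coordinates. The names `half` and `midpoint` are illustrative only and do not come from the original source:

```c
#include <assert.h>
#include <stdint.h>

/* Counterpart of BSR(x, 1): a one-bit right shift.
   For non-negative x this equals x / 2. */
static inline int32_t half(int32_t x)
{
    assert(x >= 0);  /* shift and division only agree for non-negative values */
    return (int32_t)((uint32_t)x >> 1);
}

/* Counterpart of BSR(lo + ORD4(hi), 1): widen to 32 bits before adding,
   so the sum of two 16-bit coordinates cannot overflow, then halve. */
int32_t midpoint(int16_t lo, int16_t hi)
{
    int32_t sum = (int32_t)lo + (int32_t)hi;
    return half(sum);
}
```

For example, `midpoint(30000, 30000)` yields 30000 even though the intermediate sum, 60000, would not fit in a 16-bit integer.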

Converting the idiomatic math into non-idiomatic words would not be an improvement in clarity in this case.

(Mac Pascal didn't have macros or inline functions, so inline expressions like this were the way to go for performance.)

It's like using i,j,k for loop indexes, or x,y,z for graphics axes.

  • > Converting the idiomatic math into non-idiomatic words would not be an improvement in clarity in this case.

    You seem to be missing my point. It's not about improving "clarity" about the math each line is doing -- that's precisely the kind of misconception so many people have about comments.

    It's about, how long does it take me to understand the purpose of a block of code? If there was a simple comment at the top that said [1]:

      # Calculate top-left point of the bounding box
    

    then it would actually be helpful. You'd understand the purpose, and understand it immediately. You wouldn't have to decode the code -- you'd just read the brief remark and move on. That's what literate programming is about, in spirit -- writing code to be easily read at every level of the hierarchy. And very specifically not having to read every single line to figure out what it's doing.

    The original assertion that "This code is so literate, so easy to read" is demonstrably false. Naming something "pt" is the antithesis of literate programming. And if you insist on no comments, you'd at least need to name it something like "bbox_top_left". A generic variable name like "pt", that isn't even introduced in the context of a loop or anything, is a cardinal sin here.

    [1] https://news.ycombinator.com/item?id=46366341

    • To a graphics programmer this can feel like a call for comments to explain that

          i++
      

      increments the loop variable. A newbie to programming might find such a comment useful, but to people who are maintaining such a piece of code that would be distracting line noise.

      It all depends on who your professional peer is that you are writing the code for. It's totally fine to write for a peer who is familiar with the domain, as it's fine to write for a beginner, for pedagogy, such as in a text book.

    • I think at certain calibers of work, like graphics programming in lower level languages, the best you can do is be readable and clear to others who are experts in your field. In other words, you aren't the target audience. There is likely no way to write this specific kind of code in a way that satisfies all audiences. I'm willing to concede then that the 'best' way to write this type of code is determined by the ones writing it, not us with standard views on software.

    • It all depends on how much context the reader has. For some audiences a comment explaining bounding boxes would be helpful; for others your example comment adds nothing that isn't immediately apparent from the code.

      Part of figuring out a reasonable level of commenting (and even variable naming) is a solid understanding of your audience. When in doubt aiming low is good practice, but keep in mind that this was 2D graphics software written at a 2D graphics software company.

    • A graphics programmer does not need that.

      To help understand, you need to see this code as math. Graphics programming algorithms are literally math.

      You're asking for training wheels comments, which just get in the way for those who are familiar with the domain.

      I'm sure a few graphics programming engineers might want calls to react useState(), useEffect(), etc. to be documented in a codebase, yet a react programmer would scoff at the idea.

  • Xyz makes sense because that is what those axes are literally labeled, but ijk I will rail against until I die.

    There's no context in those names to help you understand them, you have to look at the code surrounding it. And even the most well-intentioned, small loops with obvious context right next to it can over time grow and add additional index counters until your obvious little index counter is utterly opaque without reading a dozen extra lines to understand it.

    (And i and j? Which look so similar at a glance? Never. Never!)

    • > but ijk I will rail against until I die.

      > There's no context in those names to help you understand them, you have to look at the code surrounding it.

      Hard disagree. Using "meaningful" index names is a distracting anti-pattern, for the vast majority of loops. The index is a meaningless structural reference -- the standard names allow the programmer to (correctly) gloss over it. To bring the point home, such loops could often (in theory, if not in practice, depending on the language) be rewritten as maps, where the index reference vanishes altogether.
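
      As a hedged illustration of that last point, here is a small C sketch. C has no built-in map, so `map_in_place` and `twice` are hypothetical helpers written for this example. The first function uses a conventional index; in the second, the caller never sees an index at all:

      ```c
      #include <stddef.h>

      /* Index-based loop: i is purely structural and carries no domain meaning. */
      void double_all_indexed(int *values, size_t count)
      {
          for (size_t i = 0; i < count; i++)
              values[i] *= 2;
      }

      /* Hypothetical map helper: applies fn to every element in place. */
      void map_in_place(int *values, size_t count, int (*fn)(int))
      {
          for (int *p = values, *end = values + count; p != end; ++p)
              *p = fn(*p);
      }

      /* Element-wise operation passed to the map helper. */
      static int twice(int x) { return x * 2; }
      ```

      A call such as `map_in_place(values, count, twice)` has the same effect as `double_all_indexed(values, count)`, with no index name left to argue about.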

    • ijk are standard in linear algebra for vector components.

      > (And i and j? Which look so similar at a glance? Never. Never!)

      This I agree with.

As other comments have mentioned, context does matter, and as someone with a lot of 2D image/pixel processing experience, other than the 'BSR' and 'ORD4' items - which are clearly common in the codebase and in that era of computing, all that code makes perfect sense.

Also, breaking things down to more atomic functions wasn't the best idea for performance-sensitive things in those days, as compilers were not as good about knowing when to inline and not: compiler capabilities are a lot better today than they were 35 years ago...

This actually looks surprisingly straightforward for what the function is doing - certainly if you have domain context of image editing or document placement. You'll find it in a lot of UI code - this one uses bit shifts for efficiency but what it's doing is pretty straightforward.

For clarity and to demonstrate, this is basically what this function is doing, but in css:

    .container {
      position: relative;
    }

    .obj {
      position: absolute;
      left: 50%;
      top: 50%;
      transform: translate(-50%, -50%);
    }

BSR = bitwise right-shift

ORD4 = cast as 32bit integer.

BSR(x,1) simply means x divided by 2. This was a very common coding idiom back in the days when compilers did hardly any optimization and a bitwise shift was much faster than a division.

The snippet in C would be:

    pt.h = (r.left + (int32_t)r.right) / 2;
    pt.v = (r.top + (int32_t)r.bottom) / 2;

    pt.h -= (width / 2);
    pt.v -= (height / 2);
  
    pt.h = max(0, min(pt.h, fDoc.fCols - width));
    pt.v = max(0, min(pt.v, fDoc.fRows - height));
  
    if (width > fDoc.fCols) {
      pt.h -= (width - fDoc.fCols - 1) / 2;
    }
  
    if (height > fDoc.fRows) {
      pt.v -= (height - fDoc.fRows - 1) / 2;
    }

Are you familiar with the domain?

Because it's quite clear, everything is well named, and the filename also gives the context.

Finds the center of a rectangle r.

Positions a width × height region centered on that rectangle.

Clamps the result so it doesn’t go outside the document.

If the region is bigger than the document, it re-centers instead of snapping to (0,0).
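
The four steps above can be sketched in C roughly as follows, with one comment per step. This is only an illustration under assumptions: `Rect`, `LongPoint`, `place_axis`, and `region_top_left` are hypothetical names rather than identifiers from the original Pascal source, the arithmetic follows the C translation earlier in the thread (`/ 2` in place of BSR), and rectangle coordinates are assumed to be non-negative.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the Pascal records used in the snippet. */
typedef struct { int16_t top, left, bottom, right; } Rect;
typedef struct { int32_t h, v; } LongPoint;

static int32_t min32(int32_t a, int32_t b) { return a < b ? a : b; }
static int32_t max32(int32_t a, int32_t b) { return a > b ? a : b; }

/* Places a region of length `size` along one axis of a document of
   length `doc_size`, centered on the span [lo, hi]. */
static int32_t place_axis(int32_t lo, int32_t hi, int32_t size, int32_t doc_size)
{
    /* 1. Center of the rectangle along this axis. */
    int32_t pos = (lo + hi) / 2;

    /* 2. Step back by half the region size to get its leading edge. */
    pos -= size / 2;

    /* 3. Clamp so the region stays inside the document. */
    pos = max32(0, min32(pos, doc_size - size));

    /* 4. Region larger than the document: shift it back so it straddles
          the document instead of snapping to 0. */
    if (size > doc_size)
        pos -= (size - doc_size - 1) / 2;

    return pos;
}

/* Top-left corner of a width x height region centered on r. */
LongPoint region_top_left(Rect r, int32_t width, int32_t height,
                          int32_t doc_cols, int32_t doc_rows)
{
    LongPoint pt;
    pt.h = place_axis(r.left, r.right, width, doc_cols);
    pt.v = place_axis(r.top, r.bottom, height, doc_rows);
    return pt;
}
```

With a 100 × 100 rectangle at the origin, a 20 × 40 region, and a 200 × 200 document, `region_top_left` returns (40, 30). A 10 × 10 region in a 4 × 4 document comes out at (-2, -2), which is the re-centering case.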

The code's functionality is immediately obvious to me as someone who works a lot with graphics coordinate systems.

I'm sure the code would be immediately obvious to anyone who would be working on it at the time.

Comments aren't unnecessary, they can be very helpful, but they also come with a high maintenance cost that should be considered when using them. They are a long-term maintenance liability because by design the compiler ignores them, so it's very easy to change or refactor code, miss updating a comment, and end up with a comment that is misleading or just plain wrong.

These days one could make some sort of case (though I wouldn't entirely buy it, yet) that an LLM-based linter could be used to make sure comments do not get disconnected from the code they are documenting, but in 1990? not so much.

Would I have used longer variable names for slightly more clarity? Today, sure. In 1990, probably not. Temporal context is important and compilers/editors/etc have come a long way since then.

Man I just don’t know who to believe, you or the Chief Scientist for Software Engineering at IBM research Almaden.

It’s not a myth, it’s a sound software engineering principle.

Every comment is a line of code, and every line of code is a liability, and, worse, comments are a liability waiting to rot, to be missed in a refactor, and waiting to become a source of confusion. It’s an excuse to name things poorly, because “good comment.” The purpose of variables should be in their name, including units if it’s a measurement. Parameters and return values should only be documented when not obvious from the name or type, for example if you’re returning something like a generic Pair, especially if left and right have the same type. We’ve had autocomplete for decades; you don’t need to keep variable names short to save typing.

The problem with AI-generated code is that the myth that good code is thoroughly commented code is so pervasive that the default output mode for generated code is to comment every darn line it generates. After all, in software education, they don’t deduct points for needless comments, and students think their code is now better w/ the comments, because they almost never teach writing good code. Usually you get kudos for extensive comments. And then you throw away your work. The computer science field is littered with math-formula-influenced, space-saving one- or two-letter identifiers with barely any recognizable semantic meaning.

  • No amount of good names will tell you why something was done a certain way, or just as importantly why it wasn't done a certain way.

    A name and signature is often not sufficient to describe what a function does, including any assumptions it makes about the inputs or guarantees it makes about the outputs.

    That isn't to say that it isn't necessary to have good names, but that isn't enough. You need good comments too.

    And if you say that all of that information should be in your names, you end up with very unwieldy names, that will bitrot even worse than comments, because instead of updating a single comment, you now have to update every usage of the variable or function.

  • >> Every comment is a line of code, and every line of code is a liability, and, worse, comments are a liability waiting to rot,

    This is exactly my view. Comments, while they can be helpful, can also interrupt the reading of the code. They are also not verified by the compiler; curiously, in an era when everyone goes crazy for Rust's safety, nothing is less safe than comments, because they are completely ignored.

    I do not oppose comments. But they should be used only when needed.

  • No. What you are describing is exactly the myth that needs to die.

    > comments are a liability waiting to rot, to be missed in a refactor, and waiting to become a source of confusion

    This gets endlessly repeated, but it's just defending laziness. It's your job to update comments as you update code. Indeed, they're the first thing you should update. If you're letting comments "rot", then you're a bad programmer. Full stop. I hate to be harsh, but that's the reality. People who defend no comments are just saying, "I can't be bothered to make this code easier for others to understand and use". It's egotistical and selfish. The solution for confusing comments isn't no comments -- it's good comments. Do your job. Write code that others can read and maintain. And when you update code, start with the comments. It's just professionalism, pure and simple.

    • For all we know, the comment came from someone who was doing their job (by your definition) and was bitten in the behind by colleagues who did not do their job. We do not live in an ideal world. Some people are sloppy because they don't know, don't care, or simply don't have the time to do it properly. One cannot put their full faith into comments because of that.

      (Please note: I'm not arguing against comments. I'm simply arguing that trusting comments is problematic. It is understandable why some people would prefer to have clearly written code over clearly commented code.)

    • I appreciate your attempt to defend this position and I, and others, wish you good luck. In my many decades of working with humans writing code it simply has never happened.