Comment by mianos

1 year ago

I have an alternate theory: about 10% of developers can actually start something from scratch because they truly understand how things work (not that they always do it, but they could if needed). Another 40% can get the daily job done by copying and pasting code from local sources, Stack Overflow, GitHub, or an LLM—while kinda knowing what’s going on. That leaves 50% who don’t really know much beyond a few LeetCode puzzles and have no real grasp of what they’re copying and pasting.

Given that distribution, I’d guess that well over 50% of Makefiles are just random chunks of copied and pasted code that kinda work. If they’re lifted from something that already works, job done—next ticket.

I’m not blaming the tools themselves. Makefiles are well-known and not too verbose for smaller projects. They can be a bad choice for a 10,000-file monster—though I’ve seen some cleanly written Makefiles even for huge projects. Personally, it wouldn’t be my first choice. That said, I like Makefiles and have been using them on and off for at least 30 years.

83 comments

mianos

huijzer 1 year ago

> That leaves 50% who don’t really know much beyond a few LeetCode puzzles and have no real grasp of what they’re copying and pasting.

Small nuance: I think people often don’t know because they don’t have the time to figure it out. There are only so many battles you can fight during a day. For example if I’m a C++ programmer working on a ticket, how many layers of the stack should I know? For example, should I know how the CPU registers are called? And what should an AI researcher working always in Jupyter know? I completely encourage anyone to learn as much about the tools and stack as possible, but there is only so much time.

kragen 1 year ago
If you spend 80% of your time (and mental energy) applying the knowledge you already have and 20% learning new things, you will very quickly be able to win more battles per day than someone who spends 1% of their time learning new things.
Specifically for the examples at hand:
- at 20%, you will be able to write a Makefile from scratch within the first day of picking up the manual, rather than two or three weeks if you only invest 1%.
- if you don't know what the CPU registers are, the debugger won't be able to tell you why your C++ program dumped core, which will typically enable you to resolve the ticket in a few minutes (because most segfaults are stupid problems that are easy to fix when you see what the problem is, though the memorable ones are much hairier.) Without knowing how to use the disassembly in the debugger, you're often stuck debugging by printf or even binary search, incrementally tweaking the program until it stops crashing, incurring a dog-slow C++ build after every tweak. As often as not, a fix thus empirically derived will merely conceal the symptom of the bug, so you end up fixing it two or three times, taking several hours each time.
Sometimes the source-level debugger works well enough that you can just print out C++-level variable values, but often it doesn't, especially in release builds. And for performance regression tickets, reading disassembly is even more valuable.
(In C#, managed C++, or Python, the story is of course different. Until the Python interpreter is segfaulting.)
How long does it take to learn enough assembly to use the debugger effectively on C and C++ programs? Tens of hours, I think, not hundreds. At 20% you get there after a few dozen day-long debugging sessions, maybe a month or two. At 1% you may take years.
What's disturbing is how many programmers never get there. What's wrong with them? I don't understand it.
- remus 1 year ago
  
  You make it sound easy, but I think it's hard to know where to invest your learning time. For example, I could put some energy into getting better at shell scripting but realistically I don't write enough of it that it'll stick so for me I don't think it'd be a good use of time.
  Perhaps in learning more shell scripting I have a breakthrough and realise I can do lots of things I couldn't before and overnight can do 10% more, but again it's not obvious in advance that this will happen.
  
  10 replies →
- icameron 1 year ago
  
  That’s an insightful comment, but there is a whole universe of programmers who never have to directly work in C/C++ and are productive in safe languages that can’t segfault usually. Admittedly we are a little jealous of those elite bitcrashers who unlock the unbridled power of the computer with C++… but yeah a lot of day jobs pay the bills with C#, JavaScript, or Python and are considered programmers by the rest of the industry
  
  5 replies →
ajross 1 year ago
> I completely encourage anyone to learn as much about the tools and stack as possible, but there is only so much time.
That seems like a weird way to think about this. I mean, sure, there's no time today to learn make to complete your C++ ticket or whatever. But yesterday? Last month? Last job?
Basically, I think this matches the upthread contention perfectly. If you're a working C++ programmer who's failed to learn the Normal Stable of Related Tools (make, bash, python, yada yada) across a ~decade of education and experience, you probably never will. You're in that 50% of developers who can't start stuff from scratch. It's not a problem of time, but of curiosity.
- Joker_vD 1 year ago
  
  > I mean, sure, there's no time today to learn make to complete your C++ ticket or whatever. But yesterday? Last month? Last job?
  That seems like a weird way to think about this. Of course there was no time in the past to learn this stuff, if you still haven't learned it by the present moment. And even if there were, trying to figure out whether there perhaps was some free time in the past is largely pointless, as opposed to trying to schedule things in the future: you can't change the past anyhow, but the future is somewhat more malleable.
  
  7 replies →
silveraxe93 1 year ago

This is the 40% that OP mentioned. But there's a proportion on people/engineers that are just clueless and are incapable of understanding code. I don't know the proportion so can't comment on the 50% number, but hey definitely exist.
If you never worked with them, you should count yourself lucky.
silver_silver 1 year ago
We can’t really call the field engineering if this is the standard. A fundamental understanding of what one’s code actually makes the machine do is necessary to write quality code regardless of how high up the abstraction stack it is
- kragen 1 year ago
  
  Steam engines predate the understanding of not just the crystalline structure of steel but even the basics of thermodynamics by quite a few decades.
  
  12 replies →
- retros3x 1 year ago
  
  The problem is that software is much more forgiving than real life engineering project. You can't build a skyscraper with duct tape. With software, especially the simple webapps most devs work on, you don't NEED good engineering skills to get it running. It will suck of course, but it will not fall apart immediately. So of course most "engineers" will go the path of least resistance and never leave the higher abstractions to dive deep in concrete fundamentals.
- cudgy 1 year ago
  
  Sure if you are doing embedded programming in C. How does one do this in web development though where there are hundreds of dependencies that get updated monthly and still add functionality and keep their job?
  
  10 replies →
godelski 1 year ago
Funny enough I'm an A̶I̶ML researcher and started in HPC (High Performance Computing).
> if I’m a C++ programmer ... should I know how the CPU registers are called?
Probably.
Especially with "low level"[0] languages knowing some basics about CPU operations goes a long way. You can definitely get away without knowing these things but this knowledge will reap rewards. This is true for a lot of system based information too. You should definitely know about things like SIMD, MIMD, etc because if you're writing anything in C/C++ these days it should be because you care a lot about performance. There's a lot of stuff that should be parallelized that isn't. Even stuff that could be trivially parallelized with OpenMP.
> what should an AI researcher working always in Jupyter know?
Depends on what they're researching. But I do wish a lot more knew some OS basics. I see lots of things in papers where they're like "we got 10x" performance on some speed measurement but didn't actually measure it correctly (e.g. you can't use time.time and be accurate because there's lots of asynchronous operations). There's lots of easy pitfalls here that are not at all obvious and will look like they are working correctly. There's things about GPUs that should be known. Things about math and statistics. Things about networking. But this is a broad field so there are of course lots of answers here. I'd at least say anyone working on AI should read at least some text on cognitive science and neuroscience because that's a super common pitfall too.
I think it is easy to not recognize that information is helpful until after you have that information. So it becomes easy to put off as not important. You are right that it is really difficult to balance everything though but I'm not convinced this is the problem with those in that category of programmers. There's quite a number of people who insist that they "don't need to" learn things or insist certain knowledge isn't useful based on their "success."
IMO the key point is that you should always be improving. Easier said than done, but it should be the goal. At worst, I think we should push back on anyone insisting that we shouldn't be (I do not think you're suggesting this).
[0] Quotes because depends who you talk to. C++ historically was considered a high level language but then what is Python, Lua, etc?
oweiler 1 year ago
That's why suggestions like RTFM! are stupid. I just don't have time to read every reference documentation of every tool I use.
- aulin 1 year ago
  
  you don't have the time because you spend it bruteforcing solutions by trial and error instead of reading the manual and doing them right the first time
- kccqzy 1 year ago
  
  I feel like your working environment might be to blame: maybe your boss is too deadline-driven so that you have no time to learn; or maybe there is too much pressure to fix a certain number of tickets. I encourage you to find a better workplace that doesn't punish people who take the time to learn and improve themselves. This also keeps your skills up to date and is helpful in times of layoffs like right now.
- prerok 1 year ago
  
  Seriously? Yes, you should read the docs of every API you use and every tool you use.
  I mean, it's sort of ok if you read somewhere how to use it and you use it in the same way, but I, for one, always check the docs and more often even the implementation to see what I can expect.

adrian_b 1 year ago

Actually it is trivial to write a very simple Makefile for a 10,000 file project, despite the fact that almost all Makefiles that I have ever seen in open-source projects are ridiculously complicated, far more complicated than a good Makefile would be.

In my opinion, it is a mistake almost always when you see in a Makefile an individual rule for making a single file.

Normally, there should be only generic building rules that should be used for building any file of a given type.

A Makefile should almost never contain lists of source files or of their dependencies. It should contain only a list with the directories where the source files are located.

Make should search the source directories, find the source files, classify them by type, create their dependency lists and invoke appropriate building rules. At least with GNU make, this is very simple and described in its user manual.

If you write a Makefile like this, it does not matter whether a project has 1 file or 10,000 files, the effort in creating or modifying the Makefile is equally negligible. Moreover, there is no need to update the Makefile whenever source files are created, renamed, moved or deleted.

mianos 1 year ago
If everything in your tree is similar, yes. I agree that's going to be a very small Makefile.
While this is true, for much larger projects, that have lived for a long time, you will have many parts, all with slight differences. For example, over time the language flavour of the day comes and goes. Structure changes in new code. Often different subtrees are there for different platforms or environments.
The Linux kernel is a good, maybe extreme, but clear example. There are hundreds of Makefiles.
- adrian_b 1 year ago
  
  Different platforms and environments are handled easily by Make "variables" (actually constants), which have platform-specific definitions, and which are sequestered into a platform-specific Makefile that contains only definitions.
  Then the Makefiles that build a target file, e.g. executable or library, include the appropriate platform-specific Makefile, to get all the platform-specific definitions.
  Most of my work is directed towards embedded computers with various architectures and operating systems, so multi-platform projects are the norm, not the exception.
  A Makefile contains 3 kinds of lines: definitions, rules and targets (typical targets may be "all", "release", "debug", "clean" and so on).
  I prefer to keep these in separate files. If you parametrize your rules and targets with enough Make variables to allow their customization for any environment and project, you must almost never touch the Makefiles with rules and targets. For each platform/environment, you write a file with appropriate definitions, like the names of the tools and their command-line options.
  The simplest way to build a complex project is to build it in a directory with a subdirectory for each file that must be created. In the parent directory you put a Makefile that is the same for all projects, which just invokes all the Makefiles from the subdirectories that it finds below, passing any CLI options.
  In the subdirectory for each generated file, you just put a minimal Makefile, with only a few lines, which includes the Makefiles with generic rules and targets and the Makefile with platform-specific definitions, adding the only information that is special for the generated file, i.e. what kind of file it is, e.g. executable, static library, dynamic library etc., a list of directories where to search for source files, the strings that should be passed to compilers for their include directory list and their preprocessor definition list, and optionally and infrequently you may override some Make definitions, e.g. for providing some special tool options, e.g. when you generate from a single source multiple object files.
- ori_b 1 year ago
  
  Engineering is deciding that everything in your source tree will be designed to be similar, so you can have a small makefile.
imtringued 1 year ago

Sure, but this will require you to know how to tell the compiler to generate your Makefile header dependencies and if you end up making a mistake, this will cause silent failures.

Loic 1 year ago

I like Makefiles, but just for me. Each time I create a new personal project, I add a Makefile at the root, even if the only target is the most basic of the corresponding language. This is because I can't remember all the variations of all the languages and frameworks build "sequences". But "$ make" is easy.

1aqp 1 year ago

I'd say: you are absolutely using the right tool. :-)
choeger 1 year ago
You're probably using the wrong tool and should consider a simple plain shell script (or a handful of them) for your tasks. test.sh, build.sh, etc.
- poincaredisk 1 year ago
  
  I disagree. Make is - at it's simplest form - exactly a "simple plain shell script" for your tasks, with some very nice bonus features like dependency resolution.
  Not the parent, bit I usually start with a two line makefile and add new commands/variables/rules when necessary.
- nrclark 1 year ago
  
  (not the parent)
  Make is - at its core - a tool for expressing and running short shell-scripts ("recipes", in Make parlance) with optional dependency relationships between each other.
  Why would I want to spread out my build logic across a bunch of shell scripts that I have to stitch together, when Make is a nicely integrated solution to this exact problem?
  
  4 replies →

f1shy 1 year ago

I would just change the percentages, but is about as true as it gets.

mianos 1 year ago
I’d be curious to hear your ratio. It really varies. In some small teams with talented people, there are hardly any “fake” developers. But in larger companies, they can make up a huge chunk.
Where I am now, it’s easily over 50%, and most of the real developers have already left.
PS: The fakes aren’t always juniors. Sometimes you have junior folks who are actually really good—they just haven’t had time yet to discover what they don’t know. It’s often absolutely clear that certain juniors will be very good just from a small contribution.
- f1shy 1 year ago
  
  My personal experience: - 5% geniuses. This are people who are passionate about what they do, they are always up to date. Typically humble, not loud people. - 15% good, can do it properly. Not passionate, but at least have a strong sense of responsibility. Want to do “the right thing” or do it right. Sometimes average intelligence, but really committed. - 80% I would not hire. People who talk a lot, and know very little. Probably do the work just because they need the money.
  That applies for doctors, contractors, developers, taxi drivers, just about anything and everything. Those felt percentages had been consistent across 5 countries, 3 continents and 1/2 a century of life
  PS: results are corrected for seniority. Even in the apprentice level I could tell who was in each category.
  
  1 reply →

afiori 1 year ago

Being able to set up things and truly understanding how they work are quite different imo.

I agree with the idea that a lot of productive app developers would not be able to set up a new project ex novo but often it is not about particularly true understanding but rather knowing the correct set of magic rules and incantations to make many tools work well together

znpy 1 year ago

> They can be a bad choice for a 10,000-file monster

Whether they are a bad choice really depends on what are the alternatives though

sebazzz 1 year ago

> That leaves 50% who don’t really know much beyond a few LeetCode puzzles and have no real grasp of what they’re copying and pasting.

Who likely wouldn't have a job if it weren't for LLMs.

raziel2p 1 year ago
pretty sure we've made this complaint about a subset of developers since way before chatgpt and the like.
- f1shy 1 year ago
  
  And that happens not only with developers, but in any profession, which gives me shivers when I go to the doctor!