Comment by Lerc
1 year ago
I have tried to convince people that ASM is reasonable as a first-stage teaching language. The reputation as a nearly mystical art practiced by a few doesn't help. The thing is, instructions are simple. Getting them to do things is not hard; the difficulty comes from tasks exceeding a scale where you can think about things at their most basic level.
It quickly becomes tedious to do large programs (not really hard, just unmanageable), which is precisely why it should be taught as a first language. You learn how to do simple things and you learn why programming languages are used. You teach the problem that is being solved before teaching the more advanced programming concepts that solve it.
The biggest problem with using ASM as a first language to teach beginners is that it is extremely tedious, error prone, and sensitive to details. It is also unstructured: it uses entirely different control flow primitives from any language they will learn in the future, meaning they will not be well prepared for learning a real language that does scale to programs more complex than a few additions and a call to an OS output routine.
So why teach someone a language that doesn't have if, while, (local) variables, scopes, types, nor even real function calls?
It's a very nice exercise for understanding how a computer functions, and it has a clear role in education - I'm not arguing people shouldn't learn it at all. But I think it's a terrible first language to learn.
Because these are the primitives that are in use when programming in any language, and there is a benefit to learning the primitives before learning higher level abstractions. For instance, we teach arithmetic before calculus.
I see lots of people become pretty helpless when their framework isn’t working as expected or an abstraction becomes leaky. Most people don’t really need to know assembly in order to get past this, but the general intuition of “there is something underneath the abstraction that I could understand” is very useful.
The primitives of control flow in programming languages are sequencing, if, while, for, switch, return, and "early return" (goto restricted to exit a containing block). We might compile these into a form that represents everything using conditional jumps, unconditional jumps, and jump tables, but that's not how people think about it, definitely not at the level of programming languages (and even in the compiler IR phase, we're often mentally retranslating the conditional jump/unconditional jump model back into the high-level control flows).
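To make the contrast concrete, here's a rough C sketch (my own illustration, nothing authoritative): the same loop written the way people actually think about it, then rewritten in the conditional-jump/unconditional-jump style it gets lowered to, with gotos standing in for jcc/jmp.

    #include <stdio.h>

    int main(void) {
        /* How people think about it: a structured while loop. */
        int i = 0;
        while (i < 5) {
            printf("%d\n", i);
            i++;
        }

        /* Roughly what it lowers to: a test label, a conditional jump out,
           and an unconditional jump back. */
        int j = 0;
    loop_test:
        if (!(j < 5)) goto loop_end;
        printf("%d\n", j);
        j++;
        goto loop_test;
    loop_end:
        return 0;
    }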
And I could go on with other topics. High-level languages, even something like C, are just a completely different way of looking at the world from machine language, and truly understanding how machines work means adopting quite an alien model. There's a reason that people try to pretend that C is portable assembler rather than actually trying to work with a true portable assembler language.
The relationship you're looking for is not arithmetic to calculus, but set theory to arithmetic. Yes, you can build the axioms of arithmetic on top of set theory as a purer basis. But people don't think about arithmetic in terms of set theory, and we certainly don't try to teach set theory before arithmetic.
> For instance we teach arithmetic before calculus.
I don’t think that’s a fitting analogy.
Nearly everyone on the planet uses (basic) arithmetic. Very few use calculus.
By contrast, very few (programmers) use ASM, but nearly all of them use higher level languages.
5 replies →
I think comparing assembly with arithmetic is dead wrong. Arithmetic is something that you use constantly in virtually any mathematical activity you will ever do, at least at the undergraduate level. There is literally zero calculus, statistics, or algebra you could understand if you didn't know arithmetic.
In contrast, you can have a very successful, very advanced career in computer science or in programming without once in your life touching a line of assembler code. It's not very likely, and you'll be all the poorer for it, but it's certainly possible.
Assembly language is much more like learning the foundations of mathematics, like Hilbert's program (except, of course, historically that came a few millennia after).
> extremely tedious, error prone, and sensitive to details
I've taught people Python as their first language, and this was their exact opinion of it.
When you're an experienced programmer you tend to have a poor gauge of how newcomers internalize things. For people who are brand new it is basically all noise. We're just trying to gradually get them used to the noise. Getting used to the noise while also trying to figure out the difference between strings, numbers, booleans, lists, etc. is more difficult for newcomers than many people realize. Even the concept of scoping can sometimes be too high-level for a beginner, IME.
I like asm from the perspective that its semantics are extremely simple to explain. And JMP (GOTO) maps cleanly onto the flowchart model of programming that most people intuit first.
IMO Python used to be a great first language, but it's gotten much more complicated over the years. When I'm teaching programming, I want an absolute minimum number of things where I have to say "don't worry about that, it's just boilerplate, you'll learn what it means later."
In particular, Python having generators and making range() be a generator means that in order to fully explain a simple for loop that's supposed to do something X times, I have to explain generators, which are conceptually complicated. When range() just returned a list, it was much easier to explain that it was iterating over a list that I could actually see.
2 replies →
My kid is just finishing up a high school intro CS class. A full school year in, and they still have trouble with the fact that their variable and type names must have the exact same capitalization everywhere they're used.
I do realize how difficult this all is; I still have some recollection of how I started to program and how alien it all seemed. And note that I first started with 4 years of C in high school.
However, I don't agree at all that having strings and numbers as different things was ever a problem. On the contrary, explaining that the same data can be interpreted as both 48 and "0" is mystifying and very hard to grok, in my experience. And don't get me started on how hard it is to conceptualize pointers. Or working with the (implicit) stack in assembly instead of being able to use named variables.
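(For concreteness, the kind of thing I mean; a minimal C sketch of my own, using the actual ASCII pairing:)

    #include <stdio.h>

    int main(void) {
        char c = '0';               /* one byte in memory */
        printf("%c\n", c);          /* read as a character: prints 0 */
        printf("%d\n", c);          /* read as a number: prints 48, its ASCII code */
        printf("%d\n", c - '0');    /* the digit's numeric value: prints 0 */
        return 0;
    }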
> So why teach someone a language that doesn't have if, while, (local) variables, scopes, types, nor even real function calls?
You can teach them how to implement function calls, variables and loops using assembly, to show them how these work under the hood and why they should be thankful for having a simple if in their high level languages like C.
That often leaves people with very bad mental models of how programs actually compile in modern optimizing compilers and in modern operating systems (e.g. people end up believing that variables always live on the stack, that function parameters are passed on the stack, that loops are executed in the same way regardless of how you write them, etc).
13 replies →
> it is extremely tedious, error prone, and sensitive to details.
That sounds like the perfect beginner language! If they survive that experience, they'll do very well in almost any type of programming, as it's mostly the same just a tiny bit less tedious. A bit like "hardening" but for programmers.
Perhaps if you want to gatekeep for the most stubborn individuals, but you'll lose a lot of talent that way.
1 reply →
So much this.
This is like learning to read by first being asked to memorize all the rules of grammar and being quizzed on them, or being forced to learn all the ins and outs of book binding and ink production.
It's tedious, unproductive, miserable.
There's very little reward for a lot of complexity, and the complexity isn't the "stimulating" complexity of thinking through a problem; it's complexity in the sense of "I put the wrong bit in the wrong spot and everything is broken with very little guidance on why, and I don't have the mental model to even understand".
There's a perfectly fine time to learn assembly and machine instructions, and they're useful skills to have - but they really don't need to be present at the beginning of the learning process.
---
My suggestion is to go even farther the other way. Start at the "I can make a real thing happen in the real world with code" stage as soon as possible.
Kids & adults both light up when they realize they can make a motor turn, or an LED blink, with code.
It's similarly "low level" in that there isn't much going on and they'll end up learning more about computers as machines, but much more satisfying and rewarding.
The best way to go about that is to use a simulator for an old cpu like EdSim51[0]. Can do a lot of things with just a few lines of code.
> it's complexity in the sense of "I put the wrong bit in the wrong spot and everything is broken with very little guidance on why, and I don't have the mental model to even understand"
That's the nice thing about assembly: it always runs, but the result may not be what you expected. Instead of having a whole lot of magic between what is happening and how you model it, it's easy to reason about the program. You don't have to deal with stack traces, types, garbage collection, or null pointer exceptions. Execution and programming share the same mental model: linear, unless you say otherwise.
You can start with assembly and then switch to C or Python and tell them: for bigger projects, assembly is tedious, and this is what we invented instead.
[0]: https://edsim51.com/about-the-simulator/
2 replies →
"is that it is extremely tedious, error prone, and sensitive to details. It is also unstructured,"
That's why it's such an important first language! Pedagogically it's the foundation motivating all the fancy things languages give you.
You don't teach a kid to cut wood with a table saw. You give them a hand saw!
No, it is not the foundation motivating what other languages give you, not at all.
Programming languages are usually designed based on formal semantics. They include constructs that have been found either through experience or certain formal reasons to be good ways to structure programs.
Haskell's lazy evaluation model, for example, has no relationship to assembly code. It was not in any way designed with thought to how assembly code works, it was designed to have certain desirable theoretical properties like referential transparency.
It's also important to realize that there is no "assembly language". Each processor family has its own specific assembly code with its own particular semantics that may vary wildly from any other processor. Not to mention, there are abstract assembly codes like WebAssembly or JVM bytecode, which often have even more alien semantics.
1 reply →
You give them a hand saw because power tools are far easier to inflict serious injuries with. But if you're teaching a kid who's old enough, there's no reason to start on a hand saw if you have the power tools available.
1 reply →
> You don't teach a kid to cut wood with a table saw. You give them a hand saw!
Okay but that's not for pedagogical reasons, it's because power saws are MUCH more dangerous than hand saw.
Contrariwise, you don't teach a kid to drill wood with a brace & bit, because a power drill is easier to use.
Controversial opinion but we should be teaching new programmers how a CPU works and not hand-wave the physical machine away to the cloud.
Not doing this is how you get Electron.
We teach math this way. Addition and subtraction. Then multiplication. Then division. Fractions. Once those are understood we start diversifying and teaching different techniques where these make up the building blocks, statistics, finance, algebra, etc.
It may put people off a programming career, but perhaps that is good. There are a lot of people who work in programming who don't understand the machines they use, who don't understand algorithms and data structures, and who have no idea of the impact of latency, memory use, etc. Their entire career is predicated on never having to solve a problem that hasn't already been solved in general terms.
We teach math starting from the middle, with basic arithmetic. We don't go explaining what numbers are in terms of sets, and we don't teach Peano arithmetic or other theories that can define arithmetic logically from the ground up.
Plus, it is literally impossible to do any kind of math without knowing arithmetic. It is very possible to build a modestly advanced career knowing no assembly language.
1 reply →
> We teach math this way. Addition and subtraction. Then multiplication. Then division
The first graders in my neighbourhood school are currently learning about probability. While they did cover addition earlier in the year, they have not yet delved into topics like multiplication, fractions, or anything of that sort. What you suggest is how things were back in my day, to be fair, but it is no longer the case.
Starting with assembly makes it pretty clear why higher level languages had been invented. E.g. a speed run through computing:
- machine code
- assembly
- Lisp and Forth
- C
- Pascal
- maybe a short detour into OOP and functional languages
...but in the end, all you need to understand for programming computers are "sequences, conditions and loops" (that's what my computer club teacher used to say - still good advice).
I'd change the end of that list to C, Pascal, Lisp, Python.
But in the end no one learns "assembler". Everyone learns a specific ISA, and they all have different strengths and limitations. Assembler on a 36-bit PDP-10, with 16 registers and native floating point, is a completely different experience to assembler on a Z80 with an 8-bit accumulator and no multiply or divide.
You can learn about the heap and the stack and registers and branches and jumps on both, but you're still thinking in terms of toy matchstick architecture, not modern building design.
> but in the end, all you need to understand for programming computers are "sequences, conditions and loops"
I fully agree - and assembly language teaches you precisely 0 of these.
2 replies →
> I have tried to convince people that ASM is reasonable as a first stage teaching language.
Unless you're teaching people preparing for engineering hardware perhaps, I think ASM is absolutely the wrong language for this. The first reason is that programming is about problem solving, not fiddling with the details of some particular architecture, and ASM is pretty bad at clearly expressing solutions in the language of the problem domain. Instead of programming in the language of the domain, you're busy flipping bits which are an implementation detail. It is really a language for interfacing with and configuring hardware.
The more insidious result is that teaching ASM will make an idol out of hardware by reinforcing the notion that computer science or programming are about computing devices. It is not. The computing device is totally auxiliary wrt subject matter. It is utterly indispensable practically, yes, but it is not what programming is concerned with per se. It is good for an astronomer to be able to operate his telescope well, but he isn't studying telescopes. Telescope engineers do that.
> The first reason is that programming is about problem solving, not fiddling with the details of some particular architecture, and ASM is pretty bad at clearly expressing solutions in the language of the problem domain. Instead of programming in the language of the domain, you're busy flipping bits which are an implementation detail.
"How do I use bits to represent concepts in the problem domain?" is the fundamental, original problem of computer science.
And to teach this, you use much simpler problems.
> ... reinforcing the notion that computer science or programming are about computing devices. It is not.
It is, however, about concepts like binary place-value arithmetic, and using numbers (addresses) as a means of indirection, and about using indirection to structure data, and about being able to represent the instructions themselves as data (such that they can be stored somewhere with the same techniques, even if we don't assume a Von Neumann machine), and (putting those two ideas together) about using a number as a way to track a position in the program, and manipulating that number to alter the flow of the program.
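Those last ideas fit in a few lines of C (a toy illustration of mine, not from any particular course): the program is just data in an array, and the "program counter" is an ordinary number that we step forward and occasionally overwrite to alter the flow.

    #include <stdio.h>

    /* A toy machine: instructions are plain data, control flow is
       arithmetic on the program counter. */
    enum op { INC, DEC, JNZ, HALT };
    struct ins { enum op op; int arg; };

    int main(void) {
        /* "Program": count up to 3, then back down to 0. */
        struct ins prog[] = {
            {INC, 0}, {INC, 0}, {INC, 0},   /* acc = 3 */
            {DEC, 0},                       /* acc -= 1 */
            {JNZ, 3},                       /* if acc != 0, jump back to the DEC */
            {HALT, 0}
        };
        int acc = 0;   /* the machine's single register */
        int pc = 0;    /* the program counter: just a number */

        for (;;) {
            struct ins i = prog[pc++];      /* fetch, then step the pc forward */
            if (i.op == INC)      acc++;
            else if (i.op == DEC) acc--;
            else if (i.op == JNZ) { if (acc != 0) pc = i.arg; }  /* alter the flow */
            else                  break;    /* HALT */
            printf("pc=%d acc=%d\n", pc, acc);
        }
        return 0;
    }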
In second year university I learned computer organization more or less in parallel with assembly. And eventually we got to the point of seeing - at least in principle - how a basic CPU could be designed, with its basic components - an ALU, instruction decoder, bus etc.
Similarly:
> It is good for an astronomer to be able to operate his telescope well, but he isn't studying telescopes.
The astronomer is, however, studying light. And should therefore have a basic mental model of what a lens is, how lenses relate to light, how they work, and why telescopes need them.
> "How do I use bits to represent concepts in the problem domain?" is the fundamental, original problem of computer science.
> It is, however, about concepts like binary place-value arithmetic
That is the original problem of using a particular digital machine architecture. One shouldn't confuse the practical/instrumental problems at the time with the field proper. There's nothing special about bits per se. They're an implementation detail. We might study them for practical reasons, we may study the limits of what can be represented by or computed using binary encodings, or efficient ways to do so or whatever, but that's not the central concern of computer science.
> In second year university I learned computer organization more or less in parallel with assembly.
Sure. But just because a CS major learns these things doesn't make it computer science per se. It's interesting to learn, sure, and has practical utility, but particular computer architectures are not the domain of computer science. They're the domain of computer engineering.
> The astronomer is, however, studying light.
No, physicists studying optics study light in this capacity. Astronomers know about light, because knowledge of light is useful for things like computing interstellar distances or determining the composition of stellar objects or for calculating the rate of expansion or whatever. The same goes for knowledge of lenses and telescopes: they learn about them so they can use them, but they don't study them.
Ooh, very much disagree with a lot of these assertions. The problem I always encounter when trying to teach programming is that students completely lack an understanding of how to imagine and model the state of the computational system in their heads. This leads to writing code that looks kinda like it should do what the student wants, but betrays the fact that they really don't understand what the code actually means.
In order to successfully program a solution to a problem, it is necessary to understand the system you are working with. A machine-level programming language cuts through the squishiness of that and presents a discrete and concrete system whose state can be fully explained and understood without difficulty. The part where it's all implementation details is the benefit here.
I suspect your background is dominated by imperative languages, because these often bleed low-level, machine concepts into the domain of discourse, causing a conceptual muddle.
From a functional perspective, you see things properly as a matter of language. When you describe to someone some solution in English, do you worry about a "computational system"? When writing proofs or solving mathematical problems using some formal notation, are you thinking of registers? Of course not. You are using the language of the domain with its own rules.
Computer science is firmly rooted in the formal language tradition, but for historical reasons, the machine has assumed a central role it does not rightly possess. The reason students are confused is because they're still beholden to the machine going into the course, causing a compulsion to refer to the machine to know "what's really going on" at the machine level. Instead of thinking of the problem, they are worrying about distracting nonsense.
The reason why your students might feel comforted after you explain the machine model is because they already tacitly expect the machine to play a conceptual role in what they're doing. They stare and think "Okay, but what does this have to do with computers?". The problem is caused by the prior idolization of the machine in the first place.
But machine code and a machine model are not the "really real", with so-called "high-level languages" hovering above them like some illusory phantom that's just a bit of theater put on by 1s and 0s. Languages exist in our heads; machines are just instruments for simulating them. And assembly language itself is just another language. Its domain just is, loosely, the machine architecture.
So my view is that introductory classes should beat the machine out of students' heads. There is no computer, no machine. The first few classes in programming should omit the computer and begin with paper and pencil and a small toy language (a pure, lispy language tends to be very good here). They should gain facility in this small language first. The aim should be to make it clear that the language is about talking about the domain, and that it stands on its own, as it were; the computer is to programming as the calculator is to mathematical calculation. Only once this has been achieved are computers permitted, because large programs are impractical to deal with using pen and paper.
This intuition is the foundational difference between a bona fide computer science curriculum and dilettante tinkering.
5 replies →
IMO It depends a lot on the assembly flavour.
The best ISA for learning is probably the Motorola 68000, followed by some 8-bit CPUs (6502, 6809, Z80), also probably ARM1, although I never had to deal with it. I always thought that x86 assembly is ugly (no matter if Intel or AT&T).
> It quickly becomes tedious to do large programs
IME with modern tooling, assembly coding can be surprisingly productive. For instance I wrote a VSCode extension for 8-bit home computers [1], and dog-fooded a little demo with it [2], and that felt a lot more productive than back in the day with an on-device assembler (or even typing in machine code by numbers).
[1] https://marketplace.visualstudio.com/items?itemName=floooh.v...
[2] https://floooh.github.io/kcide-sample/kc854.html?file=demo.k...
Oh nice, I was talking just yesterday about how I liked chips as a programming paradigm.
I agree about tooling, I made a pacman game in a dcpu16 emulator in a couple of days.
I experimented with a fantasy console idea using an in-browser assembler as well. https://k8.fingswotidun.com/static/ide/
I think you can build environments that give immediate feedback and the ability to do real things quickly in ASM. I would still recommend moving swiftly on to something higher level as soon as it started to feel like a grind.
Sure, but learning an old ISA can leave you with a very very wrong idea about how modern processors work. Even x86 assembly paints a very misleading image of how modern processors actually work. For example, someone learning x86-64 assembly will likely believe all of the following:
- assembly instructions are executed in the order they appear in the source code
- an x86 processor only has a handful of registers
- writing to a register is an instruction like any other and will take roughly the same time
- the largest registers on an x86 processor are 64-bit
The realities behind all of those beliefs are implementation details hidden behind the ISA, and completely irrelevant here. The x86-64 ISA promises execution of instructions in the specified order, a certain number of registers, etc., and that's all they need to know.
5 replies →
They will be disabused of any of those notions simply by reading the relevant portions of the architecture handbook. In a pedagogical environment that's very simple to arrange.
2 replies →
Peeking under the hood is a later step after getting comfortable with assembly coding. E.g. none of those details are really relevant when starting out, instead it makes a lot of sense to do a speed run through computing history in order to really understand why modern CPUs (and computers as a whole) work like they do.
I agree that M68k is nice, as are the 8-bit ones you mention. I just find it strange that you like Z80 and dislike x86 - they are fundamentally not that different and both are descended from 8080.
Yeah the Z80 instruction set is quite messy (mainly because it had to fill gaps of the 8080 instruction set for backward compatibility). But as an evolution of the 8080 instruction set, the Z80 is still cleaner than x86 (IMHO!).
Also, the Z8000 looks quite interesting and like the better 16-bit alternative to the x86, but it never took off: https://en.wikipedia.org/wiki/Zilog_Z8000
I started with Z-80 assembly, then BASIC, then 6502 assembly, then higher-level languages like C and perl, and I think the assembly gave me a useful foundation for what was going on under the hood. I'm not sure I'd even call assembly a "language" in the sense of the others. It has instructions, not statements, and there's really no syntax.
If I were teaching a general-interest programming course, I'd probably start with just a bit of BASIC to introduce a few broad concepts like variables and looping, then a little assembly to say, "And this is what's going on when you do those things," and then move up the chain. Then when they get to something like C and go to look at the assembly it produces for debugging, they'll at least be familiar with concepts like registers and branching. So not quite the order I happened to do it in, but similar.
I was a TA for an intro to assembly language course, which means I got my office hours full of all of the students who struggled with assembly language and had to work with them one-on-one to get them over their roadblocks to pass the class.
Assembly language is not a reasonable first programming language. There's just so many things about it that make it a poor choice for programming instruction.
Chiefly, assembly lacks structure. There's no such thing as variables. There's no such thing as functions. You can fake some of this stuff with convention, but if you make mistakes--and students in intro-to-programming will make mistakes--there is nothing that's going to poke you that you did something wrong, you just get the wrong result.
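To show what "faking it with convention" looks like, here's a toy C sketch of my own, with globals standing in for the registers a calling convention would name; nothing warns you when the convention is broken, you just get a wrong answer:

    #include <stdio.h>

    /* "Functions" by convention only: arguments and results live in
       agreed-upon globals (stand-ins for registers). Nothing enforces it. */
    int arg0;     /* by convention, the first argument goes here */
    int result;   /* by convention, the result comes back here */

    void square(void) {          /* the "callee" trusts the convention */
        result = arg0 * arg0;
    }

    int main(void) {
        arg0 = 7;
        square();
        printf("%d\n", result);  /* 49 */

        /* Typical student mistake: forget to set arg0 before the "call".
           Nothing complains; you silently square a stale value. */
        square();
        printf("%d\n", result);  /* still 49, for the wrong reason */
        return 0;
    }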
If you have a good macro assembler, it is only a little more difficult than C. There's just more to learn up front (things like calling conventions, register usage, etc...).
I wouldn't teach it first, but after a person knows the basics in another language, seeing how it all actually works can be fun.
I think in most CS programs, students do learn assembly early on, perhaps not as the first language, but definitely as a second language, as required by most Arch courses.
This almost feels like an argument that we should teach computer science via bare metal bootstrapping.
Start out at "here's your machine code. Let's understand how x86_64 gets started" and work your way up to "now you have the automation to compile Linux and a modern compiler".
Which would certainly have stops along the way for the things we usually include.
So... NAND to Tetris?
Personally way back when, I first learned BASIC, then tried to learn C, but didn't get pointers, then learned ASM, and then pointers became obvious, and went back to C. If you're going to be using C or doing anything with hardware, learning ASM IMO is very useful just to understand how the machine really works.
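The thing that clicked was realizing a pointer is just a number holding an address, the same thing you'd put in a register before a load or store. A minimal C sketch of that idea (purely illustrative):

    #include <stdio.h>

    int main(void) {
        int x = 42;
        int *p = &x;                              /* p holds the address of x */
        printf("value via p: %d\n", *p);          /* follow the address: 42 */
        printf("address in p: %p\n", (void *)p);  /* the address itself is just a number */
        *p = 7;                                   /* store through the address */
        printf("x is now: %d\n", x);              /* 7 */
        return 0;
    }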
Assembly is a good first language if you have a simple instruction set or machine. When I saw new people learn Java, easily the hardest initial bump to get over was "what the hell is public static void main(String[] args)?" or "Eclipse didn't build it for some reason".
Python is much easier to introduce someone to because there's no boilerplate and the tooling is very simple. Assembly on x86 machines is a royal PITA to set up, and you also need some kind of debugger to actively inspect the program counter and registers.
When I took Computer Organization & Architecture, they had us play around with MARIE[1] which really made assembly make sense to me. After that, I wrote an 8080 emulator and it made even MORE sense to me.
---
[1] https://marie.js.org/
> Getting them to do things is not hard, the difficulty comes from tasks exceeding a scale where you can think about things at their most basic level.
Indeed - you don't actually need to work on difficult tasks to get the intellectual benefit. Once you've properly understood what a computer is, you can absorb the ideas of SICP.
Beginners should have immediate positive feedback. It's not possible with assembly language.
It's just as straightforward as in higher level languages, just not quite as interactive as with interpreted languages, but I've never seen an "intro to programming" that started in a REPL even when using an interpreted language. Hello world is even shorter and simpler than in most languages (in a modern OS environment).
The old D86 debugger[1][2] comes close to being a REPL for assembly language, helped me a lot with learning it when I found it on a shareware collection CD as a kid.
Load registers, call DOS or BIOS with 'int', etc. all interactively and with a nice full screen display of registers, flags and memory. Of course entering single instructions to run immediately only gets you so far, but you can also enter short programs into memory and single step through them.
It's too bad nothing like this seems to exist for modern systems! With the CPU virtualization features now available, you could even experiment with ring 0 code without fear of crashing your machine.
[1] http://eji.com/a86
[2] to run it under DOSBox, you may need to add the '+b4' command line switch in order to bypass some machine detection code
Shenzhen I/O begs to disagree :P
https://www.zachtronics.com/shenzhen-io/
You start with a simulator like this one:
https://edsim51.com/about-the-simulator/
Very immediate and positive feedback.
You can with a decent monitor that shows the values of registers, can look up values in memory, etc.
x86 ASM absolutely is NOT a good first language due to it being a complete mess.
16 bit x86 isn't that complicated and (IMO) still helpful in learning some of the more modern stuff. But I'd recommend starting with either 6502, or the 8080, which is like the 8 bit "grandparent" of x86.
Avoid:
- Z80: at least as a first language. Extended 8080 with completely different syntax, even messier and less orthogonal than x86!
- RISC-V: an architecture designed by C programmers, pretty much exclusively as a target for compiling C to & omitting anything not necessary for that goal