In response to: Programming is not Algebra by Andy Skelton

In that post, Andy explains a problem one of his friends had with programming: she imported her knowledge from mathematics and, since it worked in the simple cases, kept it as her basis for reading the symbols in programming. In particular, this led her to misunderstand variables and =.
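To make that misunderstanding concrete, here is a minimal sketch in Python (the language and the variable names are chosen for illustration, not taken from Andy's post). Read as algebra, the second line is an equation with no solution; read as code, it is an instruction that runs once.

```python
# In algebra, "x = x + 1" has no solution. In Python it is a command:
# evaluate the right-hand side, then store the result under the name x.
x = 5
x = x + 1      # read the old value of x (5), add 1, rebind x to 6

# Assignment is a one-time copy of a value, not a lasting relationship.
a = 1
b = a + 1      # b is computed now, from the current value of a
a = 10         # changing a afterwards does not update b

print(x, a, b)  # prints: 6 10 2
```

The algebraic reading happens to work as long as each name is assigned exactly once, which is probably why it survives the simple cases.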

My experience with students of programming

I actually don't see this importation of math ideas as a common problem in students of programming. Andy mentions:

Sadly, academia and industry accept a high drop-out rate as if it were a natural consequence of the subject matter or some natural characteristic of the failing student. This acceptance is wrong. The drop-out rate is just as much a consequence of the teaching. Teachers who fail that many students have failed themselves.

I agree that this is a problem of teaching at the wrong level, but when I see programmers fail to learn how code works, it is usually not because they imported mathematical terms as programming terms. That's a rare problem, and as he's said, one that's easily fixed by teaching at the right level. The most common problem I've seen is programmers who expect the computer to "make sense of" the code. Professors and teachers try in various ways to teach that a computer does exactly and only what you tell it to, but these students still have trouble. I think this is also solved by teaching at the right level.

Personally, I think that misunderstanding begins when they start with a high level language that hides the machine, most commonly Java. I think Java is a bad choice as a teaching language for various reasons, but the most relevant is that it is, by design, disconnected from the machine. This hinders people's ability to see what the machine is doing in response to what they've written. While most practical programming does happen at a high level, for good programmers, their model of the machine is what's driving their decisions. Premature optimization is an unwanted side effect of that, but I think it shows that they are willing to dig into their model of the machine at whatever level they happen to be working, which is a very, very good thing.

I take a different approach to teaching programming, one which deliberately avoids these misunderstandings by starting at assembly/machine language and talking about pointers, how text is encoded to fit the model of "bytes," etc. I might add in a few notes about the theoretical models of computation as well, but only if I'm teaching mathy students, and I always transition quickly back to my fake but rapidly explainable machine. I explain at a high level how the hardware works, just enough to give a good mental model for the next step. I teach them basic assembly, using a fake assembly language on a simplified machine, and most importantly, we interactively step through the "Hello, $name" and Fibonacci assembly programs on a whiteboard.

I step through one program myself, then guide them through stepping through the other mostly on their own.
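The fake machine itself isn't specified in this post, so the following is only a rough sketch of what stepping through the Fibonacci program might look like. Everything in it is an assumption made for illustration: the five instructions (MOV, ADD, DEC, JNZ, HALT), the register names, and the `run` function. It is written in Python so it can print the whole machine state after every step, much as one would on a whiteboard.

```python
def run(program, registers, trace=True):
    """Execute a program on a hypothetical toy machine, one instruction at a time."""
    regs = dict(registers)
    pc = 0  # program counter: index of the next instruction to execute
    while pc < len(program):
        op, *args = program[pc]
        next_pc = pc + 1
        if op == "MOV":      # MOV dst, src : copy register src into dst
            regs[args[0]] = regs[args[1]]
        elif op == "ADD":    # ADD dst, src : dst = dst + src
            regs[args[0]] += regs[args[1]]
        elif op == "DEC":    # DEC reg : reg = reg - 1
            regs[args[0]] -= 1
        elif op == "JNZ":    # JNZ reg, target : jump to target if reg != 0
            if regs[args[0]] != 0:
                next_pc = args[1]
        elif op == "HALT":   # stop the machine
            break
        else:
            raise ValueError(f"unknown instruction: {op}")
        if trace:
            # Show the full state after each step.
            print(f"pc={pc:<2} {op:<4} {args!s:<12} regs={regs}")
        pc = next_pc
    return regs


# Fibonacci: after n trips through the loop, register a holds F(n).
# Assumes n >= 1 on entry; n = 0 would never reach zero by decrementing.
FIB = [
    ("MOV", "t", "a"),   # 0: t = a
    ("ADD", "t", "b"),   # 1: t = a + b
    ("MOV", "a", "b"),   # 2: a = b
    ("MOV", "b", "t"),   # 3: b = t
    ("DEC", "n"),        # 4: n = n - 1
    ("JNZ", "n", 0),     # 5: loop back to instruction 0 while n != 0
    ("HALT",),           # 6: done
]

final = run(FIB, {"a": 0, "b": 1, "t": 0, "n": 10})
print(final["a"])  # 55
```

Each printed line corresponds to one row on the whiteboard: which instruction ran and what every register holds afterwards. Passing `trace=False` runs the same program silently.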

This whole machine language start takes about 2 hours from start to finish. I think it's important to explain it, work with it, and then move on and repeat that process at the next level. That again fosters a good mentality: one of learning what's happening, perhaps not to the deepest extent, but enough to know what's really going on and to be able to find information about the details if that's really needed.

I explain operating system kernels and compilers in general for a few minutes, then go into a real language. Thus far this has been C, partly because it's relatively easy to point to the machine language behind it, and partly because I think it really is a nice language. I have a bunch of strong ideas on why I do what I do here, but they're largely irrelevant to this discussion. Later I add Python on top of that, using a similar "thin layer on the hardware" explanation, this time using C as the "hardware," explaining the VM behind Python, the GC and how it differs from manual memory management in C (some handwaving here to not waste time, but again, I make the lack of knowledge/explanation explicit), and various other differences like the type system.
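One way to make the VM behind Python visible is the standard library's `dis` module, which lists the bytecode instructions a function compiles to. The sketch below assumes CPython (the usual implementation); the function `add_one` is a made-up example, and the exact instruction names differ between Python versions, so no particular listing is shown here.

```python
import dis


def add_one(x):
    return x + 1


# Print the bytecode instructions the CPython VM executes for add_one.
# The exact instruction names vary between Python versions.
dis.dis(add_one)

# The same listing is available as data, e.g. for stepping through it by hand.
opnames = [ins.opname for ins in dis.get_instructions(add_one)]
print(opnames)
```

The listing is short enough to trace by hand in the same way as an assembly program: load a value, load a constant, add them, return the result.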

What this means

Andy writes:

Expert programmers would agree that one of their most indispensable traits is their ability to mentally trace the execution of their programs–to emulate a computer. It makes sense to cultivate that ability as early as possible.

I didn't think of this when I came up with the idea of starting at a fake machine, but it makes sense now. Doing that gives students a real, solid mental model that they can use to think about how computers execute code. It is, of course, too low level to be of practical use by itself, but it fosters a mentality of stepping through code in your head, understanding every step along the way, and knowing where your abstraction is - where you're throwing your trust that "it does what I expect," and also being able to drop down a level of abstraction when necessary. I make the lack of knowledge explicit in the "Hello, $name" program when talking about the I/O through the kernel.

Starting at that super low level, it's easier to point to "this is what the computer is doing" and have them understand it because they have a mental model they can use. They can step through things. This is not the case when you start with a high level language like Java, which hides the machine from you. Experienced programmers can step through that because they have some understanding of how each step is actually implemented. They're thinking at a high level because it works. However, when the code breaks (and it will at some point), they have this deeper understanding to fall back on if they need it to figure out why it broke.

I haven't tried this structure of teaching programming on many people, and it may need some adjustments, especially for the impatient, very hands-on people. The people I've tried this on have been mathy, in-their-head thinkers, so I haven't looked into it too far yet. Perhaps an emulator for my fake machine which can be toyed with and stepped through with all the state of the machine visible would be a good tool to integrate into this.

I do think something similar to this method of starting low-level is much more appropriate for a beginning programming class than jumping into an "easy" language that abstracts really, really far away, or jumping into a heavy language with a lot of machinery behind it, like Java or C++.

I used to think doing this with the JVM as the hardware would be a reasonable option, but the more I think about it the more I see that as a bad idea. The JVM is powerful and abstracted from the real hardware. That gives it practical advantages when you use it, but not when you learn it. Having that powerhouse behind you as your last level of understanding means when GC causes issues it may take you a while to figure that out, since it's not in your model that the GC may have issues on actual hardware. In the opposite direction, you may limit your understanding of the strengths of the machine underneath, like the various SIMD instruction sets. Worst of all, only Java gives you direct access to the JVM's power - as soon as you switch to another language, you have to learn the commonalities and differences between what Java is implemented on and what the other language is implemented on. Until you do, you will write poor code in the new language.

Practicality notes

The low-level-first approach may not seem very useful for people who only want to do scripting. If all you'll ever do with coding is write stuff in MATLAB, script a short automation in JavaScript, make a little interaction in your Excel spreadsheets with VBA, or something of that sort, you're never thinking at the low level. However, the approach does teach you how to learn about coding. It teaches you about debugging. It does that in a very direct way, which is hard to get purely from those languages. Those skills are useful anywhere code is written, no matter how small or abstracted it is.

So the ultimate question about this whole thing: Is it worth the time?

For people looking to code for a living, yes, absolutely. For everyone else, it depends.