
Comment by Spivak

11 hours ago

In C-ish languages the statement

    int x = "thing"

is perfectly valid. It means reserve a spot for a 32-bit int and then shove the pointer to the string "thing" at the address of x. It will do the wrong thing and also overflow memory, but you could generate code for it. The type checker is what stops you. It's the same in Python: if you make type checking a build breaker, then the annotations mean something. Types aren't checked at runtime, but C doesn't check them at runtime either.

In C, int may be as small as 16 bits. You may get 32 bits (or more), but it's not guaranteed. I don't see how you get a memory overflow, though?
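For anyone who wants to check what their own platform hands out, here is a minimal sketch. The function names are made up for illustration, and it assumes int has no padding bits, which is true on mainstream targets but not promised by the standard.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>

/* Width of int in bits on this platform, sign bit included.
   Assumption: int has no padding bits. */
int int_width_bits(void) {
    return (int)(sizeof(int) * CHAR_BIT);
}

/* Print what this implementation provides. The only portable promise
   is that INT_MAX is at least 32767, i.e. at least 16 bits. */
void report_int_width(void) {
    assert(INT_MAX >= 32767);
    printf("int: %d bits, INT_MAX = %d\n", int_width_bits(), INT_MAX);
}
```

Calling report_int_width() on a typical desktop target will most likely report 32 bits, but code that needs exactly 32 should ask for int32_t from <stdint.h> instead.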

I'd be surprised if a compiler with -Wall -Werror agreed to compile this.

Trying to cast the int back to a char* might work if pointers are the same size as int on the target platform, but it's actually Undefined Behaviour IIRC.

  • I guess an overflow would be possible if the sizes of a pointer and an int differ.
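A rough sketch of the size question, with hypothetical helper names. It round-trips a pointer through uintptr_t, the integer type from <stdint.h> that is meant for this (optional in the standard, but present on common platforms), and separately reports whether a pointer is wider than int on the current target.

```c
#include <assert.h>
#include <stdint.h>

/* Round-trip a pointer through uintptr_t. Going through void* first is
   the conversion the standard describes for this type. */
const char *round_trip_wide(const char *p) {
    uintptr_t bits = (uintptr_t)(const void *)p;
    return (const char *)(const void *)bits;
}

/* Returns 1 if a pointer has more bytes than an int on this platform,
   in which case squeezing a pointer into an int can lose bits. */
int pointer_wider_than_int(void) {
    return sizeof(char *) > sizeof(int);
}

/* Usage: the wide round trip gives back an equal pointer. */
void check_round_trip(void) {
    const char *s = "thing";
    assert(round_trip_wide(s) == s);
}
```

On a typical 64-bit desktop target pointer_wider_than_int() would be expected to return 1, which is the case where a round trip through plain int stops being safe.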

It's valid in C, due to semantics around pointers. Try that in Java and you'll quickly find that it's not valid in "C-ish languages". C absolutely checks types, it's just weakly typed. Python doesn't check types at all, which I wouldn't have a problem with, if the language didn't have type annotations that sure look like they'll do something.

It won't "overflow memory".

This says there will be an immutable array of six bytes, with the ASCII letters for "thing" in the first five and zero in the sixth. This array can be coerced to the pointer type char* (a pointer to bytes) and then (though a modern C compiler will tell you this is a terrible idea) coerced to the signed integer type int.

The six-byte array will end up in the "read only data" section of the executable; it doesn't "overflow memory" and isn't stored in x. Even if you gave x a more sensible type, char*, the word "thing" isn't somehow stored in your variable; it's a pointer.
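The array-versus-pointer distinction can be checked directly. This is only a sketch with made-up function names; where the array actually lives (read-only data or elsewhere) is up to the toolchain and is not something the code below can show.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* sizeof applied to the literal itself sees the array:
   five letters plus the terminating zero byte. */
size_t literal_array_size(void) {
    return sizeof("thing");
}

/* A char* variable only holds an address, so its size does not
   depend on how long the text is. */
size_t pointer_variable_size(void) {
    const char *p = "thing";
    return sizeof p;
}

/* Usage: six bytes in the array, five of them letters, last one zero. */
void check_literal_layout(void) {
    const char *p = "thing";
    assert(literal_array_size() == 6);
    assert(strlen(p) == 5);
    assert(p[5] == '\0');
    assert(pointer_variable_size() == sizeof(char *));
}
```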

So, this isn't the same at all and you don't understand C as well as you thought you did.

Edited: fix escaping bold markers

  • I was talking about the int being 32 bits and the pointer being 64 bits but go off. If you did a naive codegen of this without type checking where the compiler just said "yes ma'am blindly copying the value to &x" then you would clobber adjacent memory. That's the point I'm making, you rely on the type checker to make the types actually mean things and give you safety guarantees.

    It feels stronger in languages where you can't even produce a running program if type checking fails, but it's conceptually the same.
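The clobbering scenario can be simulated without invoking undefined behaviour. This is a sketch under stated assumptions: struct frame and naive_store are hypothetical names, the struct stands in for a 32-bit x and whatever sits next to it, and a fixed 64-bit value plays the role of the pointer. The copy targets the whole struct, so it is well defined here; in a real naive codegen the second half would land in unrelated memory.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Two adjacent 32-bit slots: x is the declared "int", neighbor is
   whatever happens to live next to it. */
struct frame {
    int32_t x;
    int32_t neighbor;
};

/* Simulate blindly storing a 64-bit pointer-sized value starting at x. */
struct frame naive_store(uint64_t pointer_bits) {
    struct frame f = { 0, 42 };
    /* Assumption: the two fields are laid out back to back, no padding. */
    assert(sizeof f == sizeof pointer_bits);
    memcpy(&f, &pointer_bits, sizeof pointer_bits);
    return f;
}
```

After naive_store(0x1122334455667788ULL), neighbor no longer holds 42: it ends up with one half of the 64-bit value, and which half depends on the platform's byte order.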