Comment by benj111
3 years ago
It is undefined behaviour if I write GCC --hlep
Does that mean it's acceptable for GCC to reformat my hard drive?
Just because something is UB doesn't give anyone a license to do crazy things.
If I misspell --help I expect the program to do something reasonable. If I invoke UB I still expect the program to do something reasonable.
Removing checks for an overflow because overflows 'can't happen' is just crazy.
UB is supposed to allow C to be implemented on different architectures: if you don't know whether it will overflow to INT_MIN, it makes sense to leave the implementation open. If I, the user, know what happens when an int overflows, then I should be able to make use of that and guard against it myself. A compiler undermining that is a bug and user-hostile.
> It is undefined behaviour if I write GCC --hlep
No, it's not, and I don't know why you'd think so. UB is a concept applying to C programs, not GCC invocations.
> UB is supposed to allow C to be implemented on different architectures: if you don't know whether it will overflow to INT_MIN, it makes sense to leave the implementation open. If I, the user, know what happens when an int overflows, then I should be able to make use of that and guard against it myself.
I think you're confusing UB with unspecified and implementation defined behavior. It's fine if you think something shouldn't be UB, but you have to go lobbying the C standard for that. Compiler writers aren't to blame here.
This has come up before, because, in some technical sense, the C standard does indeed not define what a "gcc" is, so "gcc --help" is undefined behavior according to the C standard, because the C standard does not define the behavior. By the same token, instrument flight rules are undefined behavior.
A slightly less textualist approach to language recognizes that when we talk about C and UB, we mean behavior, which is undefined, of operations otherwise defined by the C standard.
I think this is confusing undefined behavior with behavior of something that is undefined. And either way, the C standard explicitly applies to C programs, so even this cute "textualist" interpretation would be wrong, IMO.
Do you know what a metaphor is?
No, GCC --hlep isn't in the C standard.
But it is a simple example to illustrate how a program reacts when it receives something that isn't in the spec. GCC could do anything with GCC --hlep, just like it could do anything with INT_MAX + 1. That doesn't mean that all options open to it are reasonable.
If I typed in GCC --hlep I would be reasonably pissed that it deleted my hard drive. You pointing out that GCC never made any claims about what would happen if I did that doesn't make it ok.
If you come across UB, there's reasonable and unreasonable ways to deal with that. Reformatting your hard drive, which is presumably allowed by the C standard, isn't reasonable. I would contend that removing checks is also unreasonable.
>No, it's not, and I don't know why you'd think so. UB is a concept applying to C programs, not GCC invocations
What should happen when I invoke --hlep then? The program could give an error, or could warn that it's an unrecognised flag. It could ask you if you meant --help, infer that you mean help and give you that, or it could give you a choo-choo train running across the screen. Or it could reformat your hard drive. Just because it isn't specifically listed as UB doesn't mean it's not. If it isn't defined then it's undefined. The question is what is the reasonable thing to do when someone types --hlep. I hope we can agree reformatting your hard drive isn't the most reasonable thing to do.
>I think you're confusing UB with unspecified and implementation defined behavior
Am I? What's the reason for not defining integer overflow? Yes unspecified behaviour could be used to allow portability, but so can undefined.
>It's fine if you think something shouldn't be UB, but you have to go lobbying the C standard for that. Compiler writers aren't to blame here.
I'm not saying it shouldn't be UB. I'm saying there's reasonable and unreasonable things to do when you encounter UB. In the article the author took reasonable steps to protect themselves and the compiler undermined that. That isn't reasonable. In exactly the same way that --hlep shouldn't lead to my hard drive getting reformatted.
C gives you enough rope to hang yourself. It isn't required for GCC to tie the noose and stick your head in it though.
> What should happen when I invoke --hlep then? The program could give an error, or could warn that it's an unrecognised flag. It could ask you if you meant --help, infer that you mean help and give you that, or it could give you a choo-choo train running across the screen. Or it could reformat your hard drive. Just because it isn't specifically listed as UB doesn't mean it's not. If it isn't defined then it's undefined. The question is what is the reasonable thing to do when someone types --hlep. I hope we can agree reformatting your hard drive isn't the most reasonable thing to do.
I honestly don't understand the point of this paragraph.
> Am I? What's the reason for not defining integer overflow? Yes unspecified behaviour could be used to allow portability, but so can undefined.
Yes, you are confused about that. UB is precisely the kind of behavior where the C standard deemed it unsuitable to define as implementation defined or whatever, and it usually has really good reasons to do so. You could look them up instead of asking rhetorically.
> I'm not saying it shouldn't be UB. I'm saying there's reasonable and unreasonable things to do when you encounter UB. In the article the author took reasonable steps to protect themselves and the compiler undermined that. That isn't reasonable. In exactly the same way that --hlep shouldn't lead to my hard drive getting reformatted.
Again, you seem to fundamentally misunderstand how compilers work in this case. They largely don't "encounter" UB; their optimization passes are coded with the assumption that UB can't happen. The ability to do that is fundamentally the point of UB. Situations like in the article are not a specific act of the compiler to screw you in particular, but an emergent result.
Additionally, I think you're also confusing Undefined Behavior with 'behavior of something that is undefined'. These are not the same things.
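To make the "passes are coded with the assumption that UB can't happen" point concrete, here is a minimal sketch. The function names are made up for illustration and are not taken from the article. The first guard only works if signed overflow wraps, so an optimizer is allowed to fold it to a constant; the second asks the same question without ever overflowing.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical guard that relies on signed wraparound.
 * x + 1 overflows when x == INT_MAX, which is UB, so the compiler may
 * assume that case never happens and reduce this to "return false". */
bool increment_overflows_naive(int x)
{
    return x + 1 < x;
}

/* Same question, asked before doing the arithmetic. Nothing overflows,
 * so there is nothing for the optimizer to assume away. */
bool increment_overflows_safe(int x)
{
    return x == INT_MAX;
}
```

Nobody wrote a pass that hunts for overflow checks to delete. The naive version simply has no defined meaning in exactly the one case it was written to catch.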
> It's fine if you think something shouldn't be UB, but you have to go lobbying the C standard for that. Compiler writers aren't to blame here.
I'm glad I don't live in your country, where the C standard has been incorporated into law, making it illegal for compiler writers to do things that are helpful to programmers and end users, but aren't required by the standard.
> UB is supposed to allow C to be implemented on different architectures
No, that's wrong. Implementation-Defined Behavior is supposed to allow C to be implemented on different architectures. In those cases, the implementation must define the behavior itself, and stick with it. UB, on the other hand, exists for compiler authors to optimize.
If you want to be mad at someone, be mad at the C standard for defining so much stuff as UB instead of implementation-defined behavior. Integer overflow should really be implementation-defined instead.
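For anyone following along, a rough sketch of the three categories being argued about. The function names are hypothetical; the classification in the comments is as I understand the C standard.

```c
/* Well-defined: unsigned arithmetic wraps modulo 2^N on every
 * implementation, so UINT_MAX + 1u is always 0. */
unsigned wrap_unsigned(unsigned x)
{
    return x + 1u;
}

/* Implementation-defined: right-shifting a negative signed value.
 * The compiler has to pick a result and document it (commonly an
 * arithmetic shift), and then stick with it. */
int shift_negative(int x)
{
    return x >> 1;
}

/* Undefined when x == INT_MAX: the standard imposes no requirement,
 * so the optimizer may assume this call never happens with INT_MAX. */
int increment_signed(int x)
{
    return x + 1;
}
```

Only the middle case comes with a promise you can look up in your compiler's manual.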
>No, that's wrong. Implementation-Defined Behavior is supposed to allow C to be implemented on different architectures.
Is it? We're talking about integer overflow here.
I wasn't in the meetings when all the C standards were written. I'm not convinced this is purely an optimisation thing though.
I would guess the story is more like:
Interested party X: "can integer overflow do X?"
Party Y: "no because our processor doesn't work like that."
Party Z: "and it breaks K and R"
Party X: "how about implementation defined?"
Party A: "but our compiler targets 5 different processors"
Party B: "plus that precludes certain optimisations"
Not only to optimize but to write safety tools. If you defined all the behavior, and then someone used some rare behavior like integer overflow by accident, it'd be harder to detect that since you have to assume it was intentional.
> UB, on the other hand, exists for compiler authors to optimize.
Was this really the original reason why there's UB in the C standard, or has this been retconned by 'malicious compiler authors'? ;)
It is the original reason. For example, register allocation is possible because stack smashing is UB.
But then again, UB doesn't mean the compiler author can't treat it as implementation-defined and do something reasonable.
You're getting it backward. The only reason UB doesn't immediately stop compilation is backward compatibility: you don't want to break compilation of existing programs each time the compiler converges on the C spec and identifies another case of undefined behavior.
And since you want some cross-compiler compatibility, you also import third parties' implementation-defined handling of UB.
This is not some principled design decision; the proper way would be to fail compilation on every instance of UB. The reality is that the proper way would be too harsh on existing codebases, pushing people to use a less strict compiler or to stop updating versions, which are undesirable outcomes for compiler writers.
And sanitizers that throw warnings on undefined behavior do indeed exist.
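As a minimal sketch of what that looks like in practice: GCC and Clang accept `-fsanitize=undefined` to report signed overflow at run time, and `-fwrapv` to make signed overflow wrap. If you would rather check in the code itself, both compilers also provide `__builtin_add_overflow`. It is a compiler extension, not standard C, and the `checked_add` wrapper below is a hypothetical helper of my own.

```c
#include <stdbool.h>

/* Hypothetical helper: add two ints and report overflow instead of
 * invoking UB. __builtin_add_overflow (GCC/Clang extension) returns
 * true when the mathematically correct result does not fit in *sum. */
bool checked_add(int a, int b, int *sum)
{
    return !__builtin_add_overflow(a, b, sum);   /* true when the result fits */
}
```

A caller would test the return value before using `*sum`, e.g. `if (!checked_add(x, y, &total)) { /* handle overflow */ }`.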
>UB, on the other hand, exists for compiler authors to optimize.
s/exists for/has been exploited by/g
The worst part is the optimizations aren't even that significant. (I recall a blog post of somebody testing this but I can't find it rn)
> It is undefined behaviour if I write GCC --hlep
Well no, it's a compilation error, you need at the very least a semicolon after hlep and from there on it depends on what GCC is. If it's a function you need parentheses around --hlep, if it's a type you need to remove the --, if it's a variable you need to put a semicolon after it,...
Because GCC is all-caps I'm guessing it's a macro, so here's an example of how you could write it (though it won't be UB): https://godbolt.org/z/dYMddrTjj
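For readers who don't want to click through, one way the macro reading could look. This is my own sketch under assumed definitions, not necessarily what the linked snippet contains.

```c
/* Assumed for illustration: GCC is a macro that expands to nothing. */
#define GCC

int demo(void)
{
    int hlep = 1;
    GCC --hlep;        /* after expansion: --hlep; a plain pre-decrement */
    return hlep;       /* 0 */
}
```

Everything here is well-defined: the statement decrements a local variable and that's all.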
I'm not sure if you're supporting my pov by showing the absurdity of the other position???
Yeah sure, if my phone auto incorrects gcc to GCC then that is technically meaningless so you're completely free to interpret my comment how you want.
..... Although..... GCC stands for GNU Compiler Collection so it can be reasonably capitalised, so maybe then, rather than saying anything goes we should do something reasonable because then you aren't left saying something really stupid if you're wrong???
Parent's point is that when the standard talks about UB, it refers to translating C code. So parent cheekily interpreted your comment about command-line flags (which are outside the remit of the standard) as code instead. I thought it was fitting.