← Back to context

Comment by tpush

3 years ago

Again, you still don't understand.

> It will happen and you need to deal with it when it does, and there are reasonable and unreasonable ways of dealing with that.

A compiler's handling of UB simply can't work the same way handling flag passing works in GCC. Fundamentally.

With GCC, the example is something like:

  if (strcmp(argv[1], "--help") == 0) { /* do help */ } else { /* handle it not being help, for example 'hlep' or whatever */ }

Here, GCC can precisely control what happens when you pass 'hlep'.

Compilers don't and can't work this way. There is no 'if (is_undefined_behavior(ast)) { /screw the user / }'. UB is a property of an execution, i.e. what happens at runtime, and can't _generally_ be detected at compile time. And you very probably do not want checks for every operation that can result in UB at runtime! (But if you do, that's what UBSan is!).

So, the only way to handle UB is either

1) Leaving the semantics of those situation undefined (== not occuring), and coding the transformation passes (so also opt passes) that way.

or

2) Defining some semantics for those cases.

But 2) is just implementation defined behavior! And that is what you're arguing for here. You want signed integer overflow to be unspecified or implementation defined behavior. That's fine, but a job for the committee.

I get what's happening.

It's basically dead code removal. X supposedly can't happen so you never need to check for X.

The instance in the article is about checking for an overflow. The author was handling the situation. C handed him the rope, he used the rope sensibly checking for overflow. GCC took the rope and wrapped it around his neck. Fine GCC (and C) can't detect overflow at compile time and doesn't want to get involved in runtime checks. Leave it to the user then. But GCC isn't leaving it to the user it's undermining the user.

Re 2) (are you referring to gccs committee or the c committee?)

I don't mind what it's deemed to be, I expect GCC to do something reasonable with it. Whatever happens a behavior needs to be decided by someone. Some of those behaviours are reasonable some aren't. If you're doing a check for UB, the reasonable thing, to me is to maintain that check.

I could make a choice when I write an app to assume that user input never exceeds 100 bytes. I could document it saying anything could happen, then reasonably (well many people would disagree) leave it there, that is my choice.

If you come along and put 101bytes of input in you would complain if my app then reformatted your hard drive. Wouldn't you also complain if GCC did the same?

There's atleast a post a week complaining about user hostile practices with regard to apps. Why do compiler writers get a free pass? If I put up code assuming user input would be less than 100 bytes documented or not, someone would raise that as an issue so why the double standard.

I'm not even advocating the equavalent of safe user input. I'm advocating that just because you go outside the bounds of what is defined, you do something reasonable.

  • > If you're doing a check for UB, the reasonable thing, to me is to maintain that check.

    The problem is that you need to do the check before you cause UB, not after, and here the check appears after. If you do the check before, the compiler will not touch it.

    The compiler can't know that this code is part of a UB check (so it should leave it alone), whereas this other code here isn't a UB check but is just computation (so it should assume no UB and optimise it). It just optimises everything, and assumes you don't cause UB anywhere.

    Now, I'm not defending this approach, but C works like this for performance and portability reasons. There are modern alternatives that give you most or all of the performance without all these traps.

    • >for performance and portability reasons

      Is it more performant?

      How would you do the check in the article in a more performant way?

      Philosophically I'm not sure it's even possible. Sure you could do the check before the overflow but any way you slice it that calculation ultimately applies to something that is going to be UB so the compiler is free to optimise it out? Yes you can make it unrelated enough that the compiler doesn't realise. But really if the compiler can always assume you aren't going to overflow integers, then it should be able to optimise away 'stupid' questions like 'if I add X and y, would that be an overflow?'.

      >The compiler can't know that this code is part of a UB check

      If it doesn't know what the code is then it shouldn't be deleting it. It has just rearranged code that it knows is UB, it is now faced with a check on that UB. It could (and does) decide that can't possibly happen, because 'UB'. It could instead decide that it is UB and so doesn't know if this check is meaningful or not, and not delete the check, this to me is the original point of UB, C doesn't know whether your machine is 1s complement, 2s complement or 3s complement, it leaves it to the programmer to deal with the situation, if the programmer knows he's working on 2s complement machines that overflow predictably he can work on that assumption, the compiler isn't expected to know, but it should stay out of the way because the programmer does. The performance of c as I understood it is that overflow check is optional, you aren't forced to check. But you are required to ensure that the check is done if needed, or deal with the consequences.

      Would you get rid of something you don't understand because you can't see it doing something useful. Or would you keep it because you don't know what you might break when you delete it? GCC in this case is deleting something it doesn't understand. Why is that not a bug?

      4 replies →