← Back to context

Comment by Fraterkes

3 days ago

Dumb question about contracts: I was reading the docs (https://c3-lang.org/language-common/contracts/) and this jumped out

"Contracts are optional pre- and post-condition checks that the compiler may use for static analysis, runtime checks and optimization. Note that conforming C3 compilers are not obliged to use pre- and post-conditions at all.

However, violating either pre- or post-conditions is unspecified behaviour, and a compiler may optimize code as if they are always true – even if a potential bug may cause them to be violated.

In safe mode, pre- and post-conditions are checked using runtime asserts."

So I'm probably missing something, but it reads to me like you're adding checks to your code, except there's no guarantee that they will run and whether it's at compile or runtime. And sometimes instead of catching a mistake, these checks will instead silently introduce undefined behaviour into your program. Isn't that kinda bad? How are you supposed to use this stuff reliably?

(otherwise C3 seems really cool!)

Contracts are a way to express invariants, "This shall always be true".

There are three main things you could do with these invariants, the exact details of how to do them, and whether people should be allowed to specify which of these things to do, and if so whether they can pick only for a whole program, per-file, per-function, or whatever, is separate.

1. Ignore the invariants. You wrote them down, a human can read them, but the machine doesn't care. You might just as well use comments or annotate the documentation, and indeed some people do.

2. Check the invariants. If the invariant wasn't true then something went wrong and we might tell somebody about that.

3. Assume these invariants are always true. Therefore the optimiser may use them to emit machine code which is smaller or faster but only works if these invariants were correct.

So for example maybe a language lets you say only that the whole program is checked, or, that the whole program can be assumed true, or, maybe the language lets you pick, function A's contract about pointer validity we're going to check at runtime, but function B's contract that you must pick an odd number, we will use assumption, we did tell you about that odd number requirement, have the optimiser emit that slightly faster machine code which doesn't work for N=0 -- because zero isn't an odd number assumption means it's now fine to use that code.

  • I guess the reason I found it surprising is that I would only use 3 (ie risk introducing UB) for invariants that I was very certain were true, whereas I would mostly use 2 for invariants that I had reason to believe might not always be true. It struck me as odd that you'd use the same tool for scenario's that feel like opposites, especially when you can just switch between these modes with a compiler flag

    • It feels pretty clear to me that these Contracts should only be used for the “very certain” case. Writing code for a specific compiler flag seems very sketchy, so the programmer should assume the harshest interpretation.

      The runtime check thing just sounds like a debugging feature.

  • In other words, in production mode it makes your code faster and less safe; in debug mode it makes your code slower and more safe.

    That's a valid trade-off to make. But it's unexpected for a language that bills itself as "The Ergonomic, Safe and Familiar Evolution of C".

    Those pre/post-conditions are written by humans (or an LLM). Occasionally they're going to be wrong, and occasionally they're not going to be caught in testing.

    It's also unexpected for a feature that naive programmers would expect to make a program more safe.

    To be clear this sounds like a good feature, it's more about expectations management. A good example of that done well is Rust's unsafe keyword.

    • > That's a valid trade-off to make. But it's unexpected for a language that bills itself as "The Ergonomic, Safe and Familiar Evolution of C".

      No, I think this is a very ergonomic feature. It fits nicely because it allows better compilers to use the constraints to optimize more confidently than equivalently-smart C compilers.

      3 replies →

  • Maybe also worth mentioning is that some static analysis is done using these contracts as well. With more coming.

  • Is C3 using a different terminology than standard design by contract?

    Design by contract (as implemented by Eiffel, Ada, etc.) divides the set of conditions into three: Preconditions, postconditions, and invariants. Pre- and postconditions are not invariants by predicate checks on input and output parameters.

    Invariants are conditions expressed on types, and which must be checked on construction and modification. E.g. for a "time range" struct with start/end dates, the invariant should be that the start must precede the end.

  • So the compiler could have debug mode where it checks the invariants and release mode where it assumes they are true and optimizes around that without checking?

    • Yes, and that same pattern already does exist in C and C++. Asserts that are checked in debug builds but presumed true for optimization in release builds.

      2 replies →

  • You've described three different features with three different sets of semantics. Which set of semantics is honored? Unknown!

    This is not software engineering. This is an appeal to faith. Software engineering requires precise semantics, not whatever the compiler feels like doing. You can't even declare that this feature has no semantics, because it actually introduces a vector for UB. This is the sort of "feature" that should not be in any language selling itself as an improved C. It would be far better to reduce the scope to the point where the feature can have precise semantics.

    • > Which set of semantics is honored?

      Typically it's configurable. For example C++ 26 seems to be intending you'll pick a compiler flag to say if you want its do-nothing semantics, or its "tell me about the problem and press on" semantics or just exit immediately and report that. They're not intending (in the standard at least) to have the assume semantic because that is, as you'd expect, controversial. Likewise more fine-grained configuration they're hoping will be taken up as a vendor extension.

      My understanding is that C3 will likely offer the coarse configuration as part of their ordinary fast versus safe settings. Do I think that's a good idea? No, but that's definitely not "Unknown".

      1 reply →

    • not enforced for any given implementation is hardly "unknown". presumably the tin comes with a label saying what's inside

- "Note that conforming C3 compilers are not obliged to use pre- and post-conditions at all." means a compiler doesn't have to use the conditions to select how the code will be compiled, or if there's a compile-time error.

- "However, violating either pre- or post-conditions is unspecified behaviour, and a compiler may optimize code as if they are always true – even if a potential bug may cause them to be violated." basically, it just states the obvious. the compler assumes a true condition is what the code is meant to address. it won't guess how to compile the code when the condition is false.

- "In safe mode, pre- and post-conditions are checked using runtime asserts." it means that there's a 'mode' to activate the conditions during run-time analysis, which implies there's a mode to turn it off. this allows the conditions to stay in the source code without affecting runtime performance when compiled for production/release.

It’s giving you an expression capability so that you can state your intent, in a standardized way, that other tooling can build off. But it’s recognizing that the degree of enforcement depends on applied context. A big company team might want to enforce them rigidly, but a widely used tool like Visual Studio would not want to prevent code from running, so that folks who are introducing themselves to the paradigm can start to see how it would work, through warnings, while still being able to run code.

  • This is not just expressing intent. The documentation clearly states that it's UB to violate them, so you need to be extra careful when using them.

    • Perhaps another helpful paradigm are traffic/construction cones with ‘do not cross’ messages. Sometimes nothing happens, other times you run into wet concrete, other times you get a ticket. They’re just plastic objects, easy to move, but you are not meant to cross them in your vehicle. While concrete bollards are a thing, they are only preferable in some situations.

      1 reply →

    • > documentation clearly states that it's UB to violate them

      Only in "fast" mode. The developer has the choice:

      > Compilation has two modes: “safe” and “fast”. Safe mode will insert checks for out-of-bounds access, null-pointer deref, shifting by negative numbers, division by zero, violation of contracts and asserts.

      3 replies →

It seems to me like a way to standardize what happens all the time anyway. Compilers are always looking for ways to optimize, and that generally means making assumptions. Specifying those assumptions in the code, instead of in flags to the compiler, seems like a win.

I think they are there to help the compiler so the optimizer might (but doesn't have to) assume they are true. It's sometimes very useful to be able to do so. For example if you know that two numbers are always different or that some value is always less than x. In standard C it's impossible to do but major compilers have a way to express it as extensions. GCC for example has:

  if (x)
    __builtin_unreachable();

C3 makes it a language construct. If you want runtime checks for safety you can use assert. The compiler turns those into asserts in safe/debug mode because that help catching bugs in non performance critical builds.

The way I reason about it is that the contracts are more soft conditions that you expect to not really reach. If something always has to be true, even on not-safe mode, you use "actual" code inside the function/macro to check that condition and fail in the desired way.

  • >The way I reason about it is that the contracts are more soft conditions that you expect to not really reach

    What's the difference from an assert then?

    • The difference from an assert is that for "require" they are compiled into the caller frame, so things like stack traces (which is available in safe mode) will point exactly to where the violation happened.

      Because of inlining them at the call site happens, static analysis will already pick up some obvious violations.

      Finally, these contracts may be used to compile time check otherwise untyped arguments to macros.

  • “However, violating either pre- or post-conditions is unspecified behaviour, and a compiler may optimize code as if they are always true – even if a potential bug may cause them to be violated”

    This implies that a compiler would be permitted to remove precisely that actual code that checks the condition in non-safe mode.

    Seems like a deliberately introduced footgun.

    • My understanding of this was that the UB starts only after the value is passed/returned. So if foo() has a contract to only return positive integers, the code within foo can check and ensure this, but if the calling code does it, the compiler might optimize it away.

      2 replies →

Design by contract is good. I've used it in some projects.

https://en.wikipedia.org/wiki/Design_by_contract

I first came across it when I was reading Bertrand Meyer's book, Object-oriented Software Construction.

https://en.wikipedia.org/wiki/Object-Oriented_Software_Const...

From the start of the article:

[ Object-Oriented Software Construction, also called OOSC, is a book by Bertrand Meyer, widely considered a foundational text of object-oriented programming.[citation needed] The first edition was published in 1988; the second edition, extensively revised and expanded (more than 1300 pages), in 1997. Many translations are available including Dutch (first edition only), French (1+2), German (1), Italian (1), Japanese (1+2), Persian (1), Polish (2), Romanian (1), Russian (2), Serbian (2), and Spanish (2).[1] The book has been cited thousands of times. As of 15 December 2011, The Association for Computing Machinery's (ACM) Guide to Computing Literature counts 2,233 citations,[2] for the second edition alone in computer science journals and technical books; Google Scholar lists 7,305 citations. As of September 2006, the book is number 35 in the list of all-time most cited works (books, articles, etc.) in computer science literature, with 1,260 citations.[3] The book won a Jolt award in 1994.[4] The second edition is available online free.[5] ]

https://en.wikipedia.org/wiki/Bertrand_Meyer