Comment by jmull

The analysis you link to is insufficient.

E.g., the first case is "Inferring aliasing". He presents some examples and states, "The compiler cannot infer a function’s aliasing requirements from its declaration or even from its definition."

But why not?

The aliasing requirements come directly from vector. If the compiler knows vector's requirements, then determining the aliasing requirements of the functions that use it is straightforward.
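
For concreteness, here is the kind of requirement I mean (my sketch, not an example from the paper): push_back may reallocate, so a function taking both a vector and a reference inherits a no-aliasing requirement between its parameters.

  #include <vector>

  // Sketch (mine, not the paper's): push_back may reallocate, which
  // invalidates any reference into the vector's old storage.
  void grow(std::vector<int>& v, int& x) {
      v.push_back(0); // may reallocate v's buffer
      x = 1;          // dangling write if x referred into v's storage
  }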

Now, maybe there is some argument that a C++ compiler cannot determine the aliasing requirements of vector, but if that's the claim, then the paper should make it, and back it up.

The paper continues in the same vein in the next section, as if the lifetime requirements of map and min cannot be known or cannot bubble up through the functions that call them.

As written, the paper says almost nothing about the feasibility of static analysis of C++ to achieve safety goals for C++.

I imagine it's (implicitly?) referring to avoiding whole-of-program analysis.

For example, given a declaration

  int* func(int* a);

What's the relationship between the return value and the input? You can't know without diving into 'func' itself: the return value could be the same pointer as the input, or a freshly allocated one, without getting into the even more esoteric options.
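
For instance, both of these definitions (hypothetical, purely for illustration; one or the other, not both) match that declaration, yet have opposite aliasing behaviour:

  // Alternative A: the result aliases the argument.
  int* func(int* a) { return a; }

  // Alternative B: the result is a fresh allocation; no aliasing.
  int* func(int* a) { return new int(*a); }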

Trying to solve this without recursively analysing a whole program at once is infeasible.

Rust's approach was to require that information in function signatures (lifetime annotations), but that's new syntax, and not backwards compatible, so not a palatable option for C++.

  • > avoiding whole-of-program analysis

    Why, though?

    Perhaps it's unfeasibly complex? But if that's the argument, then that's an argument that needs to be made. The paper sets out to refute the idea that C++ already has the information needed for safety analysis, but the examples throw away most of the information C++ does have, without explanation. I can't really take it seriously.

    • In general, there are three reasons to avoid whole program analysis:

      1. Complexity. This manifests as compile time: analysing the whole program takes much longer.

      2. Usability. Error messages are poor, because changes have nonlocal effects.

      3. Stability. This is related to 2. Without requirements expressed in the signature, changes in the body change the API, meaning keeping APIs stable is much harder.

      There’s really a simple reason why it’s not fully feasible in C++, though: C++ supports separate compilation, so the whole program is never required to be available. You can’t analyse a whole program you don’t have.
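
      A minimal sketch (file names illustrative), reusing the func example from upthread: the translation unit being analysed only ever sees the declaration.

        // func.h -- all the analyser sees while compiling caller.cpp
        int* func(int* a);

        // caller.cpp
        #include "func.h"
        int use(int* p) { return *func(p); } // func's body is unavailable here

        // func.cpp -- compiled separately, possibly by someone else, later
        #include "func.h"
        int* func(int* a) { return a; }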

    • Local reasoning is the foundation of everything formal (this includes type systems), and anyone in the type-system-design space knows that. Graydon Hoare (ex-Rust dev) wrote a post about it too (which links to a great post by withoutboats in its very first line): https://graydon2.dreamwidth.org/312681.html

      The entire point of having a static type system is to enable local reasoning. Otherwise we would just do whole-program analysis on JS instead of inventing TypeScript.

The Profiles authors are the ones claiming this uses local analysis only: https://news.ycombinator.com/item?id=41942126

They are clear that Profiles infers everything from function types and not function bodies. Obviously that won't work, but that's what they say.

  • In that post (I think your own?) it says, "Local analysis only. It's not looking in function definitions."

    But local analysis means analysis of function definitions. At least it does to me. I can't think of what else it could mean. I think there must be some aspect of people talking past each other here, using the same words to mean different things.

    Further, I don't think local analysis of the code comprising a function means throwing away the results of that analysis rather than passing it up the line to the analysis of callers of the function. E.g., local analysis of std::sort would establish its aliasing limitations, which would be available to analysis of the body of "f1" from the example in the paper (the results of which, in turn, would be available to callers of f1).

    Now, I don't know if that's actually feasible/workable without the "heavy" annotation that C++ profiles wants to forbid. That's the key question to me.
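
    Roughly what I mean by passing analysis results up the line (a sketch of the idea, not a real checker, and not necessarily the paper's exact f1):

      #include <algorithm>
      #include <vector>

      // Suppose local analysis of std::sort's body produced the summary
      // "writes every element in [first, last)". A checker could verify
      // f1 against that summary alone, without revisiting sort's body...
      void f1(std::vector<int>& v) {
          std::sort(v.begin(), v.end()); // checked via sort's summary
      }
      // ...and f1 in turn gets a summary ("mutates *v") for its callers.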

    •   template<typename _RandomAccessIterator>
          _GLIBCXX20_CONSTEXPR
          inline void
          sort(_RandomAccessIterator __first, _RandomAccessIterator __last)
          {
            // concept requirements
            __glibcxx_function_requires(_Mutable_RandomAccessIteratorConcept<
                  _RandomAccessIterator>)
            __glibcxx_function_requires(_LessThanComparableConcept<
                  typename iterator_traits<_RandomAccessIterator>::value_type>)
            __glibcxx_requires_valid_range(__first, __last);
            __glibcxx_requires_irreflexive(__first, __last);

            std::__sort(__first, __last, __gnu_cxx::__ops::__iter_less_iter());
          }

      That's the definition of std::sort in libstdc++. All the real work is delegated to std::__sort, so what aliasing information can be gleaned from local analysis of this function alone? Absolutely nothing.