Comment by yakubin
3 years ago
> told me this form was non-ideal and it was worth voting against (and that they’d want the pure, beautiful C++ version only[1])
I'd heard about #embed, but I hadn't heard about std::embed before. After looking at the proposal, it does look a lot better to me than #embed, because reading binary data, converting it to text, and then converting it back to binary again seems needlessly complex and wasteful. I also don't like that #embed extends the preprocessor, when IMHO the preprocessor should at worst be left as-is, and at best be slowly deprecated in favour of features which compose well with C proper.
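For concreteness, here is a sketch of the two styles side by side. The std::embed call is written roughly as the P1040 proposal described it (it was never standardized), and "art.bin" is a made-up file name, so don't treat this as compilable code:

```cpp
#include <cstddef>
#include <span>

// C23 / C++26 #embed: behaves as if it expanded to a
// comma-separated list of integer literals in place.
static const unsigned char art[] = {
#embed "art.bin"
};

// P1040's proposed std::embed (never adopted): a compile-time
// library call that hands the bytes straight to the compiler,
// with no detour through textual tokens.
constexpr std::span<const std::byte> art2 = std::embed("art.bin");
```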
Going beyond the gut reaction and moving on to hard data: as you would expect from this design, std::embed is of course faster to compile than #embed for bigger files (comparable for moderately-sized files, and a bit slower for tiny files).
I'm not a huge fan of C++, but the fact that it removed trigraphs in C++17 and is generally adding features that replace the preprocessor scores a point with me.
[1]: <https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/p10...>
Compilers follow the "as if" principle: they don't have to literally follow the formal rules given by the standard. They could implement #embed by doing as you say, pretty-printing the numbers and then parsing them back in again, but that would be an extremely roundabout way to do it, so I doubt anyone will actually do it that way, unless you're running the compiler in some kind of debugging mode like GCC's -E.
I don’t think the implication is that the C compiler must encode the binary file as a comma-separated integer list and then re-parse it, only that it must act as if it did so.
How would that work? It would need to depend on the grammar of the surrounding C code. This directive isn't limited to variable initialisers: you can use it anywhere, e.g. inside a structure declaration, or between "int main()" and "{", etc. Those placements will generate errors in subsequent phases, but during preprocessing the compiler doesn't know about them. There are plenty of cases where it will all behave differently. And if you're going to pretend even harder that the preprocessor understands C syntax, then why not just give this job to the compiler proper, which actually understands it?
Preprocessing produces a series of tokens, so you would implement it as a new type of token. If you're using something like `-E`, you would just pretty-print it as a comma-delimited list of integers. If you're moving on to translation phase 7, you'd have some sort of rules in your parser about where those kinds of tokens can appear. Just like you can't have a return token in an initializer, you wouldn't be allowed to have an embed token outside of one (or whatever the rules are). And you can directly instantiate some kind of node that contains the binary data.
The preprocessor is a great tool to reduce duplication and boilerplate.
People who don't like it generally just don't know how to use it.
People don't dislike it because they are unaware of how helpful it can be. They dislike it because they are aware of how hacky, fragile and error-prone it is. They want something more robust than text substitution.
People who don't like it have generally used macros more sophisticated than ones that just blindly copy-paste text into your source files, and have become aware of how absurd that is.
Or perhaps those people can think of better ways to get those benefits, ways that also don't allow macro abuses which obliterate tooling such as IDEs. Of course, any such example is contrived, but the preprocessor is just one big footgun, which offers no benefits over other ways of solving the problems you mentioned, such as constexpr and perhaps additional, currently unimplemented solutions.
It being possible to misuse a tool does not mean the tool is not very useful.