
Comment by sylware

3 years ago

There is way too much in C already.

The first commandment of C is: 'writing a naive C compiler should be "reasonable" for a small team or even one individual'. That's getting harder and harder, longer and longer.

I did move from C being "the best compromise" to "the least bad compromise".

I wish we had a "C-like" language, which would kind of be a high-level assembler which: has no integer promotion or implicit casts, has compile-time/runtime casts (without the horrible c++ syntax), has sized primitive types (u64/s64,f32/f64,etc) at its core, has sized literals (42b,12w,123dw,2qw,etc), has no typedef/generic/volatile/restrict/etc well that sort of horrible things, has compile-time and runtime "const"s, and I am forgetting a lot.
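To make the integer promotion complaint concrete, here is a minimal sketch in standard C. The function names are made up for illustration. Both operands of `+` are promoted to `int` before the addition, so 8-bit arithmetic only stays 8 bits wide if you cast the result back.

```c
#include <assert.h>
#include <stdint.h>

/* How C evaluates a sum of two 8-bit values: both operands are
   promoted to int first, so the result does not wrap at 255. */
static int add_promoted(uint8_t a, uint8_t b) {
    return a + b;             /* 200 + 100 is computed as int: 300 */
}

/* What a language without promotion would presumably compute:
   the sum stays 8 bits wide and wraps modulo 256.
   In C this takes an explicit cast. */
static uint8_t add_u8(uint8_t a, uint8_t b) {
    return (uint8_t)(a + b);  /* 300 mod 256 is 44 */
}

/* Hypothetical self-check tying the two together. */
static void promotion_demo(void) {
    assert(add_promoted(200, 100) == 300);
    assert(add_u8(200, 100) == 44);
}
```

Calling `promotion_demo()` passes both assertions: the same two inputs give 300 or 44 depending on where the cast sits.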

Among the main issues: the kernel's GCC C dialect (roughly speaking, each Linux release uses more GCC extensions). Aggressive optimizations can break some code (while programming hardware, for instance).
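A small illustration of the hardware case, as a sketch only: the status register and its "ready" bit are hypothetical, not taken from any real device. Without the `volatile` qualifier, an optimizing compiler is allowed to load the register once and hoist the load out of the loop, so the poll can spin forever or be removed outright.

```c
#include <assert.h>
#include <stdint.h>

#define STATUS_READY 0x1u  /* hypothetical "device ready" bit */

/* Polls a status register until STATUS_READY is set or max_polls is
   reached, and returns how many unsuccessful polls happened.
   `volatile` forces a fresh load from *status on every iteration. */
static unsigned poll_until_ready(volatile const uint32_t *status,
                                 unsigned max_polls) {
    unsigned polls = 0;
    while (polls < max_polls && (*status & STATUS_READY) == 0) {
        polls++;
    }
    return polls;
}
```

In real driver code `status` would point at a memory-mapped register; here any `uint32_t` variable can stand in for it.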

Maybe I should write assembly, expect RISC-V to be a success, and forget about all of this.

I wish we had something like typed Lua without Lua’s weird quirks (e.g. indexing from 1), designed with performance and safety in mind, and with the features you mention.

But like Lua, the base compiler is really small and simple and can be embedded. And it’s “pseudo-interpreted”: ultimately it’s an ahead-of-time language to support things like function declarations after references and proper type checking, but compiling unoptimized is practically instant and you can load new sources at runtime, start a REPL, and do everything else you can with an interpreted language. Now, having a simple compiler with all these features may be impossible, so in the worst case there is just a simple interpreter, a separate type-checker, and a separate performance-optimized JIT compiler (like Lua and LuaJIT).

Also like Lua and high-level assembly, debugging unoptimized is also really simple and direct. By default, there aren’t optimizations which elide variables, move instructions around, and otherwise clobber the data so the debugger loses information, not even tail-call optimization. Execution is so simple someone will create a reliable record-replay, time-travel debugger which is fast enough you could run it in production, and we can have true in-depth debugging.

Now that I’ve written all that, I realize this is basically ML. But OCaml still has weird quirks (the object system), SML too honestly, and I doubt their compilers are small and simple enough to be embedded. So maybe a modern ML dialect with a few new features and none of the more confusing things which are in Standard ML.

  • Check out Nim! It does much of what you describe and it's great. The core language is fairly small (not quite Lua simple, but probably comparable to ML). It compiles fast enough that a Nim REPL like `inim` is usable for checking features and for basic maths. It does require a C compiler, but TCC [4] works perfectly. Essentially Nim + TCC is pretty close to your description, IMHO, though I'm not sure TCC supports non-x86 targets.

    I've never used it, but Nim does support some hot reloading as well [3]. It also has a real VM if you want to run user scripts and has a nice library for it [1]. It's not quite as flexible as Lua, but for a generally compiled language it's impressive.

    Recently I made a wrapper to embed access to the Nim compiler's macros at runtime [2]. It took probably 3-4 hours and still compiles in 10s of seconds despite building in a fair bit of the compiler! It was useful for making a code generator for a serializer format. I'm not sure it's small enough to live on even beefy m4/m7 microcontrollers, though I'm tempted to try.

    1: https://github.com/beef331/nimscripter
    2: https://github.com/elcritch/cdecl/blob/main/src/cdecl/compil...
    3: https://nim-lang.org/docs/hcr.html
    4: https://bellard.org/tcc/

    • Nim is great, shame it isn't more popular; it's my go-to for what would previously have been Go/Rust/Python.

GCC or Clang with all warnings turned on will give you almost what you want. -Wconversion -Wdouble-promotion and 100s of others. A good way to learn about warning flags (apart from reading the docs) is Clang -Weverything, which will give you many, many warnings.
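For a feel of what those two flags catch, here is a minimal sketch (function names are made up for illustration). It compiles silently with default settings. As far as I know, building with `-Wconversion -Wdouble-promotion` makes GCC and Clang warn on the two marked lines, while the explicit version stays quiet.

```c
#include <assert.h>
#include <stdint.h>

/* Implicit narrowing from int to uint8_t. Accepted silently by default;
   -Wconversion warns that the conversion may change the value. */
static uint8_t narrow_implicit(int value) {
    uint8_t out = value;    /* flagged by -Wconversion */
    return out;
}

/* The same truncation spelled out, so it is visibly intentional. */
static uint8_t narrow_explicit(int value) {
    return (uint8_t)(value & 0xFF);
}

/* `x` is a float, `2.0` is a double literal, so `x` is implicitly
   promoted to double before the multiplication. */
static double scale_promoted(float x) {
    return x * 2.0;         /* flagged by -Wdouble-promotion */
}
```

Both narrowing functions return 44 for an input of 300; the warnings are about intent, not about a difference in the result.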

I agree (with a lot of caveats), but a key value of C is that we do not break people's code, and that means that we can't easily remove things. If we do, we create a lot of problems. This makes it very difficult to keep the language as easy to implement as we would like. As a member of WG14, I intend to propose that we make this our top priority going forward.

> I wish we had a "C-like" language, which would kind of be a high-level assembler which: has no integer promotion or implicit casts, has compile-time/runtime casts (without the horrible c++ syntax), has sized primitive types (u64/s64,f32/f64,etc) at its core, has sized literals (42b,12w,123dw,2qw,etc), has no typedef/generic/volatile/restrict/etc well that sort of horrible things, has compile-time and runtime "const"s, and I am forgetting a lot.

Unsafe Rust code I think fits this model better than C does: it relies on sized primitive types, it has support for both wrapping and non-wrapping arithmetic rather than C's quite frankly odd rules here, it has no automatic implicit casts, it has no strict aliasing rules.
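For comparison, a rough sketch of what the wrapping versus non-wrapping split looks like when written by hand in C. The helper names are made up and only loosely modeled on Rust's `wrapping_add` and `checked_add`. Unsigned arithmetic wraps by definition, while signed overflow is undefined behavior, so the checked version has to test the operands before adding.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Unsigned addition is defined to wrap modulo 2^32. */
static uint32_t wrapping_add_u32(uint32_t a, uint32_t b) {
    return a + b;
}

/* Signed overflow is undefined behavior in C, so check before adding.
   Returns false on overflow and leaves *out untouched. */
static bool checked_add_i32(int32_t a, int32_t b, int32_t *out) {
    if ((b > 0 && a > INT32_MAX - b) ||
        (b < 0 && a < INT32_MIN - b)) {
        return false;
    }
    *out = a + b;
    return true;
}
```

So `wrapping_add_u32(UINT32_MAX, 1)` is 0, and `checked_add_i32(INT32_MAX, 1, &r)` reports failure instead of overflowing.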

> The first commandment of C is: 'writing a naive C compiler should be "reasonable" for a small team or even one individual'. That's getting harder and harder, longer and longer.

100% agreed. I've always viewed C as a "bootstrappable" language, in which it is relatively straightforward to write a working compiler (in a lower level language, likely Asm) which can then be used to bring up the rest of an environment. The preprocessor is actually a little more difficult in some respects to get completely correct, and arguably #embed belongs there, so it's debatable whether this feature is actually adding complexity to the core language.

Your wish for a "C-like" language sounds very much like B.

  • Time for a B+ language?

    There is so much more to remove: one loop statement is enough (loop {}), enum should go away along with the likes of typeof, etc.

    I wonder if all that makes writing a naive "B+" compiler easier (time/complexity/size) than a plain C compiler. I stay humble, since I know removing things does not always mean easier and faster; the real complexity may be hidden somewhere else.

Are you a programmer? #embed is the easiest feature to implement that I have ever heard of.

  • I think the blog post provides some insight into the challenges of implementing this.

    • That's implementing it without #embed. With #embed, the author says:

      > It feels like I wasted a lot of my life achieving something so ridiculously basic that it’s almost laughable

      Which makes me think I should never get involved with an ISO committee, at least not for anything I want done fast.
