Zen-C: Write like a high-level language, run like C

3 days ago (github.com)

From what I can see in the codegen, defer is not implemented "properly": the deferred statements are only executed when the block exits normally; leaving the block via "return", "break", "continue" (including their labelled variants! those interact subtly with outer defers), or "goto" skips them entirely. Which, arguably, should not happen:

    var f = fopen("file.txt", "r");
    defer fclose(f);

    if fread(&ch, 1, 1, f) <= 0 { return -1; }
    return 0;

would not close the file if it was empty. In fact, I am not sure how it works even for a normal "return 0": it looks like the deferred statements are emitted after the "return", textually, so they only work properly in void-returning functions and inner blocks.

  • Did you manage to compile this example?

    • Yes, actually:

          $ cat kekw.zc
          include <stdio.h>
          
          fn main() {
              var f = fopen("file.txt", "r");
              defer fclose(f);
          
              var ch: byte;
              if fread(&ch, 1, 1, f) <= 0 { return -1; }
              return 0;
          }
          $ ./zc --emit-c kekw.zc
          [zc] Compiling kekw.zc...
          $ tail -n 12 out.c
          int main()
          {
              {
              __auto_type f = fopen("file.txt", "r");
              uint8_t ch;
          if ((fread((&ch), 1, 1, f) <= 0))     {
              return (-1);
              }
              return 0;
          fclose(f);
              }
          }
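For comparison, here is a minimal sketch of what the emitted C could look like if every `return` first stored its value and then jumped to a single cleanup label. This is not what zc generates, only one common way to lower defer by hand; the function name `first_byte_status`, the label `run_defers`, and the extra NULL check are my own assumptions for illustration.

```c
#include <stdio.h>

/* Hypothetical hand-written lowering of:
 *     var f = fopen(path, "r");
 *     defer fclose(f);
 *     var ch: byte;
 *     if fread(&ch, 1, 1, f) <= 0 { return -1; }
 *     return 0;
 * Each `return` stores its value and jumps to one cleanup label,
 * so the deferred fclose runs on every exit path. */
int first_byte_status(const char *path)
{
    int ret;                      /* value of whichever return fires */
    unsigned char ch;
    FILE *f = fopen(path, "r");

    if (f == NULL) {
        return -2;                /* added check: nothing to clean up yet */
    }

    if (fread(&ch, 1, 1, f) == 0) {
        ret = -1;                 /* was: return -1; */
        goto run_defers;
    }
    ret = 0;                      /* was: return 0; */

run_defers:
    fclose(f);                    /* the deferred statement */
    return ret;
}
```

With this shape an empty file yields -1 and a non-empty one yields 0, and the file is closed in both cases. Nested blocks and `break`/`continue` need one such label per scope, which is where it gets fiddly.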

> Mutability

> By default, variables are mutable. You can enable Immutable by Default mode using a directive.

> //> immutable-by-default

> var x = 10;

> // x = 20; // Error: x is immutable

> var mut y = 10;

> y = 20; // OK

Wait, but this means that if I’m reading somebody’s code, I won’t know if variables are mutable or not unless I read the whole file looking for such a directive. Imagine if someone even defined custom directives; that doesn’t make it readable.

  • Given an option that is configurable, why would the default setting be the one that increases probability of errors?

    For some niches the answer is "because the convenience is worth it" (e.g. game jams). But I personally think the error prone option should be opt in for such cases.

    Or to be blunt: correctness should not be opt-in. It should be opt-out.

    I have considered such a flag for my future language, which I named #explode-randomly-at-runtime ;)

    • > Or to be blunt: correctness should not be opt-in. It should be opt-out.

      One can perfectly fine write correct programs using mutable variables. It's not a security feature, it's a design decision.

      That being said, I agree with you that the author should decide if Zen-C should be either mutable or immutable by default, with special syntax for the other case. As it is now, it's confusing when reading code.

    • But why make it a global meta-switch instead of having a different type inferred from the qualifier used at the initial assignment?

      Example:

          let integer answer be 42 — this is a constant
          set integer temperature be 37.2 — this is a mutable
      
      

      Or in a more esoglyphomaniac fashion

          cst ↦ 123 // a constant is just a trivial map
          st ← 29.5 // initial assignment inferring float

    • > Given an option that is configurable, why would the default setting be the one that increases probability of errors?

      They're objecting to the "given", though. They didn't comment either way on what the default should be.

      Why should it be configurable? Who benefits from that? If it's to make it so people don't have to type "var mut" then replace that with something shorter!

      (Also neither one is more 'correct')


  • It's not ideal but it seems like something an LSP could tell you on a hover event. I didn't see an LSP (I didn't look that hard either) but presumably that's within the scope of their mission statement to deliver modern language ergonomics. (But I agree with sibling comments that this should be a keyword. Another decent alternative would be that it's only global in scope.)

  • Other languages also have non-local ways of influencing compiler behavior, for example attributes in Rust (standard) or compiler pragmas in C (non-standard).

    When reading working code, it doesn't matter whether the language mode allows variable reassignment. It only matters when you want to change it. And even then, the compiler will yell at you when you do the wrong thing. Testing it out is probably much faster than searching the codebase for a directive. It doesn't seem like a big deal to me.

It's odd that the async/await syntax _exclusively_ uses threads under the hood. I guess it makes for a straightforward implementation, but in every language I've seen the point of async/await is to use an event loop/cooperative multitasking.

  • I’d say that the point of async/await is to create a syntax demarcation between functions which may suspend themselves (or be suspended by a supervisory system) and those functions that process through completely and cannot be suspended (particularly by a supervisory system). The means to enable the suspension of computation and allow other computations to proceed following that suspension are implementation details.

    So, having an async function run on a separate thread from those functions that are synchronous seems a viable way to achieve the underlying goal of continuous processing in the face of computations that involve waiting for some resource to become available.

    I will agree that, given C#’s origination and then JavaScript’s popularization of the syntax, it is not a stretch to assume async/await is implemented with an event loop (since both languages use one for their implementation).

  • Noob question: if it just compiles to threads, is there any need for special syntax in the first place? My understanding was that no language support should be required for blocking on a thread.

    • One advantage is that it gives you the opportunity to move to a more sophisticated implementation later without breaking backwards compatibility (assuming the abstraction does not leak).

    • Async/await should do a little more under the hood than what the typical OS threading APIs provide, for example forwarding function parameters and return values automatically instead of making the user write their own boilerplate structs for that.
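To make the boilerplate point concrete, this is roughly what you end up writing by hand with pthreads just to pass two ints in and get one int back. It is only a sketch of the pattern, not what Zen-C emits; `add_task`, `add_async`, and `add_await` are names I made up for the example.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hand-written equivalent of a thread-backed
 * `async fn add(a, b) -> int` (hypothetical example). */
struct add_task {
    int a, b;           /* forwarded arguments */
    int result;         /* forwarded return value */
    pthread_t thread;
};

static int add(int a, int b) { return a + b; }

/* pthreads only passes a void*, so the arguments travel in the struct. */
static void *add_trampoline(void *arg)
{
    struct add_task *t = arg;
    t->result = add(t->a, t->b);
    return NULL;
}

/* "async add(a, b)": start the thread; returns NULL on failure. */
struct add_task *add_async(int a, int b)
{
    struct add_task *t = malloc(sizeof *t);
    if (t == NULL) {
        return NULL;
    }
    t->a = a;
    t->b = b;
    t->result = 0;
    if (pthread_create(&t->thread, NULL, add_trampoline, t) != 0) {
        free(t);
        return NULL;
    }
    return t;
}

/* "await task": join the thread, fetch the result, release the task. */
int add_await(struct add_task *t)
{
    int r;
    pthread_join(t->thread, NULL);
    r = t->result;
    free(t);
    return r;
}
```

Calling `add_await(add_async(2, 40))` gives back 42 after a NULL check on the task. Every async function needs its own struct and trampoline, which is exactly the part a compiler can generate for you.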

Syntax aside, how does this compare to Nim? Nim does similar, I think Crystal does as well? Not entirely sure about Crystal tbh. I guess Nim and Vala, since I believe both transpile to C, so you really get "like C" output from both.

  • From what I see, Zen-C aims to be "C with super-powers". It still uses C pointers for arrays and strings. It transpiles to a single human-readable C file without symbol mangling. No safety. Not portable (yet?).

    Nim is a full, independent modern language that uses C as one of its backends. It has its own runtime, optional GC, Unicode strings, bounds checking, and a huge stdlib. You write high-level Nim code and it spits out optimized C you usually don't touch.

    Here’s a little comparison I put together from what I can find in the readme and code:

        Comparison          ZenC           Nim
        
        written in          C              Self-Hosted
        targets             C              C, C++, ObjC, JS, LLVM (via nlvm), Native (in-progress)
        platforms           POSIX          Linux, Windows, MacOS, POSIX, baremetal
        mm strategy         manual/RAII    ARC, ORC(ARC with cycle collector), multiple gc, manual
        generated code      human-readable optimized
        mangling            no             yes
        
        stdlib              bare           extensive/batteries-included
        
        compile-time code   yes            yes
        macros              comptime?      AST manipulation
        
        arrays              C arrays       type and size is retained at all times
        strings             C strings      have capacity and length, support Unicode
        bounds-checking     no             yes (optional)

  • Nim (Python-like) and Crystal (Ruby-like) are not C-like languages. Arguably, those languages are targeting a different audience. There are other C-family and C-style-syntax languages that compile directly to C or have it as one of their backends.

  • man I haven't heard anything about Vala in ages. is it still actively developed/used? how is it?

    • Yes, it is actively being developed.

      Quite easy to make apps with it and GNOME Builder makes it really easy to package it for distribution (creates a proper flatpak environment, no need to make all the boilerplate). It's quite nice to work with, and make stuff happen. Gtk docs and awful deprecation culture (deprecate functions without any real alternative) are still a PITA though.

    • There's a surprising number of GUI apps built using Vala, if you've used Linux long enough, there's a chance you may have used a Vala based GUI and not even known you were. It's just such a nice language, it's a shame it's not more prevalent since Gnome libraries can compile basically anywhere.

    • Vala is still being developed and used in the GNOME ecosystem. Boo, on the other hand, is pretty dead.

  • Crystal compiles directly to object code, using LLVM. It does provide the ability to interoperate with C code; as an example, I use this feature to call ncursesw functions from Crystal.

  • I was also going to mention this reminds me of Vala, which I haven't seen or heard from in 10+ years.

    • Surprisingly, there's a shocking number of GUI programs for Linux made with Vala, and ElementaryOS is built using Vala, and all their custom software uses Vala. So it's not dead, just a little-known interesting language. :)

An interesting bit to me is that it compiles to (apparently) readable C, I'm not sure how one would use that to their advantage

I am not too familiar with C - is the idea that it's easier to incrementally have some parts of your codebase in this language, with other parts being in regular C?

  • one benefit is that a lot of tooling e.g. for verification etc. is built around C.

    another is that it only has a C runtime requirement, so no weird runtime stuff to implement if you'd, say, want to run on bare metal... you could output the C code and compile it to your target.

  • C2 (http://c2lang.org) similarly compiles to C, but arguably more readable C code from what I can see. The benefits are (1) easy access to pretty much any platform with little extra work (2) significantly less long term work compared to integrating with LLVM or similar (3) if it's readable enough, it might be submitted as "C code" in working environments which mandate C.

  • i think so. The biggest hurdle with new languages is that you are cut off from a 3rdparty library ecosystem. Being compatible with C 3rd party libraries is a big win.

Initial commit was 24h ago, 363 stars, 20 forks already. Man, this goes fast.

  • man has been posting a lot before the initial commit about his library. following the guy on linkedin.

  • Could be bots.

    • It’s not, it’s just how hackernews works. You’ll see new projects hit 1k-10k stars in a matter of a day. You can have what is to you the best project or the best article, but if everyone else doesn’t think so it’ll always be at the bottom. Some luck involved too. I doubt a post upvoted by bots rather than organically is gonna live long on the first page.


    • Definitely could be, but the dev has been posting updates on Twitter for a while now. It could be just some amount of hype they have built.

So, the point of this language is to be able to write code with high productivity, but with the benefit of compiling it to a low-level language? Overall it seems like the language repeats what Zig does, including the C ABI support, manual memory management with additional ergonomics, and the comptime feature. The biggest difference that comes to mind quickly is that the creator of Zen-C states that it can allow for the productivity of a high-level language.

  • It has stringly typed macros. It's not comparable to Zig's comptime, even if it calls it comptime:

        fn main() {
            comptime {
                var N = 20;
                var fib: long[20];
                fib[0] = (long)0;
                fib[1] = (long)1;
                for var i=2; i<N; i+=1 {
                    fib[i] = fib[i-1] + fib[i-2];
                }
    
                printf("// Generated Fibonacci Sequence\n");
                printf("var fibs: int[%d] = [", N);
                for var i=0; i<N; i+=1 {
                    printf("%ld", fib[i]);
                    if (i < N-1) printf(", ");
                }
                printf("];\n");
            }
    
            print "Compile-time generated Fibonacci sequence:\n";
            for i in 0..20 {
                print f"fib[{i}] = {fibs[i]}\n";
            }
        }
    

    It just literally outputs characters, not even tokens like Rust's macros, into the compiler's view of the current source file. It has no access to type information, as Zig's does, and can't really be used for any sort of reflection as far as I can tell.

    The Zig equivalent of the above comptime block would just be:

        const fibs = comptime blk: {
            var f: [20]u64 = undefined;
            f[0] = 0;
            f[1] = 1;
            for (2..f.len) |i| {
                f[i] = f[i-1] + f[i-2];
            }
            break :blk f; 
        };
    

    Notice that there's no code generation step, the value is passed seamlessly from compile time to runtime code.
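To spell out what "stringly typed" means here, the Zen-C comptime block is doing the same job as this plain C helper: it computes values and then prints source text that something else has to parse again. This is only an illustration of text-based generation, not code from the Zen-C compiler; `emit_fib_decl` and its limits are made up for the example.

```c
#include <stdio.h>
#include <stddef.h>

/* Writes source text such as "var fibs: int[5] = [0, 1, 1, 2, 3];\n"
 * into out. Returns the number of characters written (excluding the
 * terminating NUL), or -1 if n is out of range or the buffer is too small.
 * n is capped at 40 so the values fit even in a 32-bit long. */
int emit_fib_decl(char *out, size_t cap, int n)
{
    long fib[40];
    size_t used;
    int w, i;

    if (n < 2 || n > 40) {
        return -1;
    }
    fib[0] = 0;
    fib[1] = 1;
    for (i = 2; i < n; i++) {
        fib[i] = fib[i - 1] + fib[i - 2];
    }

    w = snprintf(out, cap, "var fibs: int[%d] = [", n);
    if (w < 0 || (size_t)w >= cap) {
        return -1;
    }
    used = (size_t)w;

    for (i = 0; i < n; i++) {
        w = snprintf(out + used, cap - used, "%ld%s",
                     fib[i], i < n - 1 ? ", " : "");
        if (w < 0 || (size_t)w >= cap - used) {
            return -1;
        }
        used += (size_t)w;
    }

    w = snprintf(out + used, cap - used, "];\n");
    if (w < 0 || (size_t)w >= cap - used) {
        return -1;
    }
    used += (size_t)w;
    return (int)used;
}
```

The generator knows nothing about the type of `fibs`; if the format string and the values disagree, you only find out when the generated text is compiled.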

  • I wonder, how can a programming language have the productivity of a high-level language ("write like a high-level language"), if it has manual memory management? This just doesn't add up in my view.

    I'm writing my own programming language that tries "Write like a high-level language, run like C.", but it does not have manual memory management. It has reference counting with lightweight borrowing for performance sensitive parts: https://github.com/thomasmueller/bau-lang

  • I am working on mine as well. I think it is very sane to have some activity in this field. I hope we will have high level easy to write code that is fully optimized with very little effort.

  • There are going to be lots of languages competing with Rust and Zig. It's a popular, underserved market. They'll all have their unique angle.

    • It has been served for several decades; however, since the late 90s many decided that reducing to only C and C++ was the way forward. Now the world is rediscovering it doesn't have to be like that.

    • There are certainly going to be lots of languages, because now with LLMs it's easier (trivial?) to make one plus a library (case in point: just within the last month ~20 new langs with codebases of 20k~100k LOC have been posted here), but I don't really see them competing. Rust and Zig brought actual improvements and are basically replacing use cases that C++/C had, limiting the space available to others.

    • Uhm, no? There is barely enough space for Rust, which happens to have a unique feature/value proposition that raises it above the vast majority of its competitors. If you're fine with UB or memory-unsafe code, then you go with C simply because it's deeply entrenched.

      In that sense Zen-C changed too many things at once for no good reason. If it was just C with defer, there would have been an opportunity to include defer in the next release of the C standard.

Very similar to any other C-like language compiling to C (like nim, V, and many smaller hobbyist ones), but I love the keyword "embed". It looks like unlimited potential for fast debugging, and for testing the code without writing boilerplate to read the file and so on.

Nice! Compiles in 2s on my unexceptional hardware. But it lacks my other main desiderata in a new language: string interpolation and kebab-case.

  • Oh, it _does_ have string interpolation, my bad. Sadly, not by default -- you still have to go back and add an "f" before the string once you've started typing it and then realize that you want an interpolated string. Also, it doesn't always work -- if I define two interpolated string variables in one function, GCC chokes in a way I'm not understanding. And every interpolated string variable consumes 4K of global memory.

> String Interpolation (F-strings)

This is so nice. It's crazy how other low-level langs don't have it. I know Dlang and Rust have it. Maybe Swift too? The way Dlang does it is nice because you can do a lot of stuff with them at compile time.

i really like this project. for me it's the next level to your own custom C lib.

first you write 'tutorial C'. then after enough segfaults and double frees you start every project with a custom allocator because you've become obsessed with not having that again..., then you implement a library with a custom more generic one as you learn how to implement them, and add primitives you commonly build that lean on that allocator, it will have your refcounters, maybe ctors, dtors etc etc.. (this at least is my learning path i guess? still have a loooong way to go as always!)

i dont see myself going for a language like this, but i think its inspirational to see where your code can evolve to with enough experience and care

Impressive repos. I've been toying with the ideas myself but it's hard to stay on track with this sort of extremely demanding task. I am however not exporting to C but to low level jit.

A lot of the ideas in there are worth being inspired by.

This feels like a mix of "Cex.C" and "dasae-headers" projects I've seen somewhere before - maybe it's just the Rust and Zig trend.

The author includes some easter-eggs (printing random facts about Zen and various C constructs) which trigger randomly -- check out the file src/zen/zen_facts.c in the repository...

That's something I used to try to write, but failed due to complexity. A meta-preprocessor for C to make it a little bit more bearable...

KUDOS

What about "Cex.C" and "dasae-headers"? they are integrated directly into the C ecosystem

That's a very nice project.

List of remarks:

> var ints: int[5] = {1, 2, 3, 4, 5};

> var zeros: [int; 5]; // Zero-initialized

The zero initialized array is not intuitive IMO.

> // Bitfields

If it's deterministically packed.

> Tagged unions

Same, is the memory layout deterministic (and optimized)?

> 2 | 3 => print("Two or Three")

Any reason not to use "2 || 3"?

> Traits

What if I want to remove or override the "trait Drawing for Circle" because the original implementation doesn't fit my constraints? As long as traits are not required to be in a totally different module than the struct I will likely never welcome them in a programming language.

  • C uses `|` for bitwise OR and `||` for logical OR. I'm assuming this inherited the same operator paradigm since it compiles to C.
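For reference, this is how the two operators behave in plain C, plus what "2 or 3" looks like in a C switch. In the quoted match arm the `|` reads like pattern alternation (the way Rust's match uses it), which is a different thing from either C operator; how Zen-C actually lowers it is not something I can confirm from the thread. The function names are made up for the example.

```c
#include <stdbool.h>

/* Bitwise OR combines bits: 2 | 3 is 0b10 | 0b11 = 3. */
int bitwise_or(int a, int b) { return a | b; }

/* Logical OR only asks "is either non-zero?" and yields 0 or 1. */
int logical_or(int a, int b) { return a || b; }

/* C has no alternation pattern; "2 or 3" is written as stacked labels. */
bool is_two_or_three(int x)
{
    switch (x) {
    case 2:
    case 3:
        return true;
    default:
        return false;
    }
}

/* Writing the expression 2 | 3 as a case label is legal C, but it
 * constant-folds to `case 3:` and silently stops matching 2. */
bool matches_case_2_bitor_3(int x)
{
    switch (x) {
    case 2 | 3:
        return true;
    default:
        return false;
    }
}
```

So if a transpiler pasted `2 | 3` straight into a C case label, the arm would only fire for 3; and `2 || 3` as an expression is just 1, which would be even more confusing in a pattern position.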

The whole language examples seem pretty rational, and I'm especially pleased / shocked by the `loop / repeat 5` examples. I love the idea of having syntax support for "maximum number of iterations", eg:

    repeat 3 {
       try { curl(...) && break }
       except { continue }
    }

...obviously not trying to start any holy wars around exceptions (which don't seem supported) or exponential backoff (or whatever), but I guess I'm kind of shocked that I haven't seen any other languages support what seems like an obvious syntax feature.

I guess you could easily emulate it with `for x in range(3): ...break`, but `repeat 3: ...break` feels a bit more like that `print("-"*80)` feature but for loops.
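In C terms, the bounded retry that `repeat 3 { ... break }` expresses is just a counted loop with an early exit. A minimal sketch, with no exceptions or backoff; `retry`, `attempt_fn`, and `succeed_after` are hypothetical names, not anything from Zen-C:

```c
#include <stdbool.h>

/* One attempt at the operation; returns true on success. */
typedef bool (*attempt_fn)(void *ctx);

/* Runs `attempt` at most max_attempts times and stops at the first
 * success. Returns the 1-based number of the attempt that succeeded,
 * or 0 if every attempt failed. */
int retry(int max_attempts, attempt_fn attempt, void *ctx)
{
    int i;
    for (i = 1; i <= max_attempts; i++) {
        if (attempt(ctx)) {
            return i;             /* the `break` in the repeat block */
        }
    }
    return 0;
}

/* Example attempt: fails while the counter behind ctx is positive,
 * decrementing it on each failure. */
bool succeed_after(void *ctx)
{
    int *failures_left = ctx;
    if (*failures_left > 0) {
        (*failures_left)--;
        return false;
    }
    return true;
}
```

With a counter starting at 2, `retry(3, succeed_after, &counter)` succeeds on the third attempt; with a counter of 5 it gives up after three tries and returns 0.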

Am I the only one who saw this syntax and immediately thought "Man, this looks almost identical to Rust with a few slight variations"?

  • It seems to just be Rust for people who are allergic to using Rust.

    It looks like a fun project, but I'm not sure what this adds to the point where people would actually use it over C or just going to Rust.

  • I thought the same and felt it looked really out of place to have I8 and F32 instead of i8 and f32 when so much else looks just like Rust. Especially when the rest of the types are all lower case.

    • Agreed, that really stood out as a ... questionable design decision, and felt extremely un-ergonomic which seems to go against the stated goals of the language.


Why not compile to rust or assembly? C seems like an odd choice.

In fact why not simply write rust to begin with?