Comment by pizlonator

2 years ago

See here: https://github.com/pizlonator/llvm-project-deluge/blob/delug...

I just got the OpenSSH client to work last night.

Here's an example of the kinds of changes you have to make: https://github.com/pizlonator/deluded-openssh-portable/commi...

Most of the changes are just using zalloc and friends instead of malloc and friends. If I reaaaallly wanted to, I could have made it automatic (like, `malloc(sizeof(Foo))` could be interpreted by the compiler as being just `zalloc(Foo, 1)` ... I didn't do that because I sorta think it's too magical and C programmers don't like too much magic).
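For readers who want to see the shape of that change, here is a minimal sketch in plain C. The `zalloc` macro is a hypothetical stand-in built on `calloc` so the example compiles with an ordinary compiler; it is not Fil-C's implementation, and only the `zalloc(Foo, 1)` call shape comes from the comment above. `Foo`, `make_foo_malloc`, and `make_foo_zalloc` are made-up names.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a typed allocation call: the type is passed
   by name instead of being erased into a byte count. NOT Fil-C's real
   zalloc; it just forwards to calloc so the sketch builds anywhere. */
#define zalloc(type, count) ((type *)calloc((count), sizeof(type)))

typedef struct Foo {
    int id;
    struct Foo *next;
} Foo;

/* Before: the allocator only sees a size in bytes. */
Foo *make_foo_malloc(int id) {
    Foo *f = malloc(sizeof(Foo));
    assert(f != NULL);
    f->id = id;
    f->next = NULL;
    return f;
}

/* After: the allocation site names the element type and the count. */
Foo *make_foo_zalloc(int id) {
    Foo *f = zalloc(Foo, 1);
    assert(f != NULL);
    f->id = id;
    f->next = NULL;
    return f;
}
```

The two functions differ only in the allocation line, which matches the description of the port as mostly mechanical.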

How do you handle unions?

I'm also having a hard time fully convincing myself of this:

> the allocator will return a pointer to memory that had always been exactly that type. Use-after-free does not lead to type confusion in Fil-C

In the worst case, this seems like you must simply never reallocate memory, or we're discarding parts of the type. If I successively allocate integer arrays of growing lengths, it seems to me it must either return memory that had previously been used with a different type (e.g., an int[5] and an int[3] occupying the same memory at disjoint times) or address space usage in such a program is quadratic, or we're not considering array length as "part of the type", i.e., we're discarding it. (I'm not sure if this is acceptable or not. I think that should be fine, but I'll have to think harder.)

  • > How do you handle unions?

    This union is fine:

        union {
            int* x;
            foo* y;
        }
    

    It's fine because they're both pointers. This union is also fine:

        union {
            int x;
            float y;
        }
    

    It's fine because Fil-C treats both members as "ints" in the underlying type system.

    This union has to change:

        union {
            char* a;
            int b;
        }
    

    You can turn it into a struct or you can move the `char*` member out of it.
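A rough sketch of those two rewrites in plain C, under the assumption that the goal is simply to stop a pointer and an int from sharing storage. All type and function names here are invented for illustration, and the `float c` member is added only so the second union still has more than one member.

```c
#include <assert.h>
#include <stddef.h>

/* Option 1: replace the union with a struct, so the pointer and the int
   each get their own storage. A tag records which member is meaningful. */
enum value_kind { VALUE_STR, VALUE_INT };

struct value_as_struct {
    enum value_kind kind;
    char *a;   /* meaningful when kind == VALUE_STR */
    int b;     /* meaningful when kind == VALUE_INT */
};

/* Option 2: move the pointer out and keep a union of non-pointer members. */
struct value_ptr_outside {
    char *a;       /* the pointer lives outside the union */
    union {
        int b;
        float c;   /* extra non-pointer member, for illustration only */
    } prim;
};

struct value_as_struct make_str(char *s) {
    struct value_as_struct v = { VALUE_STR, s, 0 };
    return v;
}

struct value_as_struct make_int(int n) {
    struct value_as_struct v = { VALUE_INT, NULL, n };
    return v;
}

/* Reads the int member only when the tag says it is the active one. */
int value_int_or(const struct value_as_struct *v, int fallback) {
    assert(v != NULL);
    return v->kind == VALUE_INT ? v->b : fallback;
}
```

The cost of option 1 is the extra storage for the member that is not in use, which is the trade-off raised in the reply below.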

    > In the worst case, this seems like you must simply never reallocate memory, or we're discarding parts of the type. If I successively allocate integer arrays of growing lengths, it seems to me it must either return memory that had previously been used with a different type (e.g., an int[5] and an int[3] occupying the same memory at disjoint times) or address space usage in such a program is quadratic, or we're not considering array length as "part of the type", i.e., we're discarding it. (I'm not sure if this is acceptable or not. I think that should be fine, but I'll have to think harder.)

    int[3] and int[5] are both integer typed but have different sizes. The allocator also uses size segregation, so the virtual memory used for int[3] will never be reused for int[5].

    There's no problem with this; it's just a more aggressive version of segregated allocation.

    The allocator still returns physical pages when they go free. It's just the virtual memory that only gets reused in a way that conforms to type.

    And, that part of the system is the most well-tested. The isoheap allocator has been shipping in WebKit for years.
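To illustrate the idea of segregating by both type and size, here is a toy allocator in plain C. It is only a sketch under stated assumptions: the integer `type_id` tag, the `iso_alloc`/`iso_free` names, and the per-pool free list are all invented for this example, it works at the level of `malloc` blocks rather than virtual memory pages, and it never returns memory to the system. It is not how the isoheap allocator is implemented.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

enum { MAX_POOLS = 8 };

typedef struct FreeSlot { struct FreeSlot *next; } FreeSlot;

typedef struct {
    int type_id;          /* hypothetical tag for the element type */
    size_t size;          /* allocation size in bytes */
    FreeSlot *free_list;  /* freed slots, reusable only by this pool */
} Pool;

static Pool pools[MAX_POOLS];
static int pool_count;

/* Finds the pool for (type_id, size), creating it on first use. */
static Pool *find_pool(int type_id, size_t size) {
    for (int i = 0; i < pool_count; i++)
        if (pools[i].type_id == type_id && pools[i].size == size)
            return &pools[i];
    assert(pool_count < MAX_POOLS);
    Pool *p = &pools[pool_count++];
    p->type_id = type_id;
    p->size = size;
    p->free_list = NULL;
    return p;
}

void *iso_alloc(int type_id, size_t size) {
    assert(size >= sizeof(FreeSlot));
    Pool *p = find_pool(type_id, size);
    if (p->free_list != NULL) {      /* reuse a slot of the same type and size */
        FreeSlot *s = p->free_list;
        p->free_list = s->next;
        return s;
    }
    return malloc(size);             /* otherwise take fresh memory */
}

void iso_free(void *ptr, int type_id, size_t size) {
    Pool *p = find_pool(type_id, size);
    FreeSlot *s = ptr;               /* the slot goes back to its own pool only */
    s->next = p->free_list;
    p->free_list = s;
}
```

In this sketch, a slot freed as an `int[3]` is handed out again only for another `int[3]` request; an `int[5]` request always gets different memory.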

    • > This union is fine:

      I can type pun int & float pointers. (I think this is the same sort of behavior that I'm going to note at the end.)

      > You can turn it into a struct or you can move the `char*` member out of it.

      A struct is a product type. A union (at least one combined with a tag) is a sum type. I suppose one could use a struct like a union, and the members not corresponding to the variant are just wasted memory …

      This is also a breakage from vanilla C, but it's not the first in your language, so I'm assuming that's acceptable. (And honestly sum types beyond `enum` are pretty rare in C, as C really doesn't help you.)

      > [isoheaps]

      Yeah, so I think your isoheaps are "memory-safe". I would stress that Rust's guarantees are a good bit stricter, and remove behavior you'd still see in your language. Your language wouldn't crash … but it would still go on to do undefined-ish things. (E.g., you'd get an integer, but it might not be the one you expect, or you might write to memory that is in use elsewhere, assuming the write was in-bounds within a current allocation.)
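A small simulation of that "integer, but not the one you expect" case, as a sketch only. To keep the demo free of real undefined behavior it uses a hypothetical one-slot pool backed by static storage, so the stale pointer always points at live memory of the same type. The pool functions and variable names are invented for this example and say nothing about how Fil-C behaves internally.

```c
#include <assert.h>
#include <stddef.h>

/* One-slot pool of ints backed by static storage, so the stale pointer
   below always points at live memory and the demo has defined behavior. */
static int slot_storage;
static int slot_in_use;

int *int_pool_alloc(void) {
    assert(!slot_in_use);
    slot_in_use = 1;
    return &slot_storage;
}

void int_pool_free(int *p) {
    assert(p == &slot_storage);
    slot_in_use = 0;
}

/* Returns what the stale pointer reads after the slot is recycled. */
int stale_read_demo(void) {
    int *balance = int_pool_alloc();
    *balance = 100;
    int_pool_free(balance);           /* "balance" is now logically dangling */

    int *retries = int_pool_alloc();  /* same slot, same type, new owner */
    *retries = 3;

    int seen = *balance;              /* stale read: sees the new owner's 3 */
    *balance = 0;                     /* stale write: clobbers retries */
    assert(*retries == 0);

    int_pool_free(retries);
    return seen;
}
```

Nothing here is type-confused, since every access is an `int` access to an `int` slot, but the program logic is still wrong: the stale read returns 3 instead of 100, and the stale write changes a value owned by unrelated code.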


Attaching capabilities to pointers is sort of what CHERI does, isn't it? And presumably CHERI can have better performance thanks to the direct hardware support. (Your manifesto mentions a 200x performance impact currently.)

  • CHERI's capabilities are more permissive. For example, if you use-after-free in CHERI, then you can access the "free" memory (or whatever ends up there after another malloc) without trapping regardless of what type ends up there.

    Fil-C never allows pointer memory to be reused as primitive memory or vice versa, and use-after-free means you're at least pointing at data of the same type. Also, Fil-C has a clear path to using concurrent GC and then not having the UaF problem at all, while CHERI has no path to concurrent GC (they can stop the world, kinda maybe).

    It's not meaningful to conclude anything from Fil-C's current perf. In the limit, it's easier to make Fil-C fast than it is to make CHERI fast. For example, in CHERI, if you want to have a thin pointer then you have to throw safety out of the window. The Fil-C plan is to give you thin pointers provided that you opt into more static typing.
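For readers unfamiliar with the thin versus fat pointer distinction, here is a generic sketch in plain C of a pointer that carries its own bounds. It is an illustration of the general idea only: the `FatIntPtr` layout and the `fat_*` functions are assumptions made up for this example and do not describe the actual representation used by Fil-C or CHERI.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* A "fat" pointer: the address plus the bounds it is allowed to touch.
   A "thin" pointer would be just the int* with no bounds attached. */
typedef struct {
    int *base;        /* start of the allocation */
    size_t length;    /* number of elements in the allocation */
    ptrdiff_t index;  /* current position, may be out of range */
} FatIntPtr;

FatIntPtr fat_from_array(int *base, size_t length) {
    FatIntPtr p = { base, length, 0 };
    return p;
}

/* Pointer arithmetic only moves the index; bounds travel along unchanged. */
FatIntPtr fat_add(FatIntPtr p, ptrdiff_t n) {
    p.index += n;
    return p;
}

/* Checked load: succeeds only when the index is inside the allocation. */
bool fat_load(FatIntPtr p, int *out) {
    assert(out != NULL);
    if (p.index < 0 || (size_t)p.index >= p.length)
        return false;
    *out = p.base[p.index];
    return true;
}
```

The bounds check on every access and the larger pointer size are the costs a thin pointer avoids, which is why dropping the extra fields requires some other way of proving the access is safe.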