Comment by thedufer

2 months ago

> which has a pub date from Oct 19, 2022

I think you're misinterpreting this. That's just the date the most recent version of the library was published. The library is something like 15 years old.

> the standard integer data types (and, therefore, the standard language), that not only have no signedness

I'm not sure what you mean by this - they're signed integers. Maybe you just mean that there aren't unsigned ints in the stdlib?

> and only have Int32 and Int64, but have "one bit is reserved for OCaml's runtime operation".

The "one bit is reserved" is only true for the `int` type (which varies in size depending on the runtime between 31 and 63 bits). Int32 and Int64 really are normal 32- and 64-bit ints. The trade-off is that they're boxed (although IIRC there is work being done to unbox them) so you pay some extra indirection to use them.

> The stdint package also depends on Jane Street's "Dune", which they call a "Fast, portable, and opinionated build system". I don't need or want or need any of its capabilities.

Most packages are moving this way. Building OCaml without a proper build system is a massive pain and completely inscrutable to most people; Dune is a clear step forward. You're free to write custom makefiles all the time for your own code, but most people avoid that.

> The library is something like 15 years old.

It's not clear from the docs, but, yeah, I suspected that might be the case. Thanks.

> I'm not sure what you mean by this - they're signed integers. Maybe you just mean that there aren't unsigned ints in the stdlib?

Yes, that's what I mean. And doesn't that mean that it's fully unsuitable for systems programming, as this entire topic is focused on?

> The "one bit is reserved" is only true for the `int` type (which varies in size depending on the runtime between 31 and 63 bits).

I don't get it. What is it reserved for then, if the int size is determined when the runtime is built? How can that possibly affect the runtime use of ints? Or is any build of an OCaml program able to target (at compile-time) either 32- or 64-bit targets, or does it mean that an OCaml program build result is always a single format that will adapt at runtime to being in either environment?

Once again, I don't see how any of this is suitable for systems programming. Knowing one's runtime details is intrinsic at design-time for dealing with systems-level semantics, by my understanding.

> Building OCaml without a proper build system

But I don't want to build the programming language, I want to use it. Sure, I can recompile gcc if I need to, but that shouldn't be a part of my dev process for building software that uses gcc, IMO.

It looks to me like JaneStreet has taken over OCaml and added a ton of apparatus to facilitate their various uses of it. Of course, I admit that I am very specific and focused on small, tightly-defined software, so multi-target, 3rd-party utilizing software systems are not of interest to me.

It looks to me like OCaml's intrinsic install is designed to facilitate far more advanced features than I care to use, and that looks like those features make it a very ill-suited choice for a systems programming language, where concise, straightforward semantics will win the day for long-term success.

Once again, it looks like we're all basically forced to fall back to C for systems code, even if our bright-eyed bushy tails can dream of nicer ways of getting the job done.

Thanks for your patient and excellent help on this topic.

  • > I don't get it. What is it reserved for then, if the int size is determined when the runtime is built? How can that possibly affect the runtime use of ints?

    Types are fully erased after compilation of an OCaml program. However, the GC still needs to know things about the data it is looking at - for example, whether a given value is a pointer (and thus needs to be followed when resolving liveness questions) or is plain data. Values of type `int` can be stored right alongside pointers because they're distinguishable - the lowest bit is always 0 for pointers (this is free by way of memory alignment) and 1 for ints (this is the 1 bit ints give up - much usage of ints involves some shifting to keep this property without getting the wrong values).

    Other types of data (such as Int64s, strings, etc) can only be handled (at least at function boundaries) by way of a pointer, regardless of whether they fit in, say, a register. Then the whole block that the pointer points to is tagged as being all data, so the GC knows there are no pointers to look for in it.

    > Or is any build of an OCaml program able to target (at compile-time) either 32- or 64-bit targets, or does it mean that an OCaml program build result is always a single format that will adapt at runtime to being in either environment?

    To be clear, you have to choose at build time what you're targeting, and the integer sized is part of that target specification (most processor architectures these days are 64-bit, for example, but compilation to javascript treats javascript as a 32-bit platform, and of course there's still support for various 32-bit architectures).

    > Knowing one's runtime details is intrinsic at design-time for dealing with systems-level semantics, by my understanding.

    Doesn't this mean that C can't be used for systems programming? You don't know the size of `int` there, either.

    > But I don't want to build the programming language, I want to use it.

    I meant building OCaml code, not the compiler.

    • Thanks for the fantastic explanation for how ints are handled in OCaml, but I've got to say that having the low bit be the flag is a strange design decision, IMO, but I understand that aligning the pointers will make the low bit or two irrelevant for them. But, oh!, the poor ints.

      All this said, thanks for putting to bed, once and for all, any notion anyone should have that OCaml can be used as a systems language. Yikes!

      > Doesn't this mean that C can't be used for systems programming? You don't know the size of `int` there, either.

      You know that at compile time, surely, when you set the build target, no? Even the pointer sizes. Besides, after years of C programming, I got to where I never used the nonspecific versions; if I wanted 64-bits unsigned, I would specifically typedef them at the top, and then there's no ambiguity because I specifically declared all vars. (You can see how I did the same thing in F# at the bottom of this reply.)

      It makes working with printf much less problematic, where things can easily go awry. Anyway, I want my semantics to percolate down pyramid-style from a small set of definitions into larger and larger areas of dependence, but cleanly and clearly.

      Sure, DEFINEs can let you do transparent multi-targetting, but it ends up being very brittle, and the bugs are insidious.

      Thanks for your excellence. It's been a joy learning from you here.

      ---

      As an aside, here's a small part of my defs section from the final iteration of my F# base libs, where I created an alias for the various .NET types for standard use in my code:

         type tI4s = System.Int32
         type tI1s = System.SByte
         type tI2s = System.Int16
         type tI8s = System.Int64
      
         type tI1u = System.Byte
         type tI2u = System.UInt16
         type tI4u = System.UInt32
         type tI8u = System.UInt64
      

      Why risk relying on implicit definitions (or inconsistent F# team alias naming conventions) when, instead, everything can be explicity declared and thus unambiguous? (It's really helpful for syscall interop declarations, as I remember it from so many years ago). Plus, it's far more terse, and .NET not being able to compile to a 64-bit executable (IIRC) made it simpler than C/C++'s two kinds of executable targets.

      1 reply →