Comment by jandrewrogers

1 year ago

Honestly, current best practice puts that number right around zero, which is what you see in ambitious implementations.

A non-obvious issue is that database engines have peculiar requirements for how libraries are designed and implemented which almost no conventional library satisfies. To make matters worse, two different database implementations may have different requirements in this regard, so you can't even share libraries between databases. There are no black boxes in good database engines.

Compression libraries, OpenSSL, ICU, etc. are all common dependencies for databases.

Looking at the dependencies list (https://gist.github.com/tisonkun/06550d2dcd9cf6551887ee6305e...) I see plenty of reasonable things like:

* Base64/checksum/compression encoding libraries

* Encryption/hash libraries

* Platform-specific bindings (likely conditional dependencies)

* Bit hacking/casting/zero-copy libraries like bytemuck, zerocopy, zero-vec, etc.

* "Small"/stack allocated data structure libraries (smallvec, tinystr, etc.)

* Unicode libraries

There are certainly things that would add bloat too, but I think it's silly to pretend like everything here is something a database engine would need custom implementations of.

  • I think you'd be surprised how many of these things are custom implementations in databases. The main motivation is performance. Databases tend to have detailed and well-specified constraints on each use case for data structures and algorithms, and those constraints can be used to codegen narrowly optimized implementations. You can do significantly better than generic library codecs or data structures in most cases; those generic implementations lack the context and metaprogramming hooks to make it feasible.

    Combine this with the challenge of implementations being async, non-allocating, compatible with explicitly paged memory, etc., and it generally becomes worth the effort.

    You'll find more libraries used at the periphery for integration and compatibility where it matters less but not in the core.
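
    To make the "narrowly optimized" point concrete, here is a minimal sketch of the kind of thing meant, not code from any particular engine. It assumes a column whose values are known to fit in a fixed bit width, and uses a Rust const generic so the width is a compile-time constant. The names `pack` and `unpack` are hypothetical, and both write into caller-provided buffers so nothing is heap-allocated:

```rust
/// Pack `values` (each assumed to fit in WIDTH bits) into `out`,
/// least-significant bits first. Returns the number of bytes written.
/// WIDTH is a compile-time constant, so the mask and shift amounts are
/// known to the compiler; the function itself never allocates.
pub fn pack<const WIDTH: u32>(values: &[u32], out: &mut [u8]) -> usize {
    assert!(WIDTH >= 1 && WIDTH <= 32);
    let needed = (values.len() * WIDTH as usize + 7) / 8;
    assert!(out.len() >= needed);
    let mask: u64 = (1u64 << WIDTH) - 1;
    let mut acc: u64 = 0; // bit accumulator
    let mut bits: u32 = 0; // number of valid bits in `acc`
    let mut pos = 0;
    for &v in values {
        acc |= (v as u64 & mask) << bits;
        bits += WIDTH;
        // Flush every complete byte.
        while bits >= 8 {
            out[pos] = acc as u8;
            pos += 1;
            acc >>= 8;
            bits -= 8;
        }
    }
    // Flush the final partial byte, if any.
    if bits > 0 {
        out[pos] = acc as u8;
        pos += 1;
    }
    pos
}

/// Inverse of `pack`: fill `out` with `out.len()` values of WIDTH bits
/// each, read from `input`.
pub fn unpack<const WIDTH: u32>(input: &[u8], out: &mut [u32]) {
    assert!(WIDTH >= 1 && WIDTH <= 32);
    let needed = (out.len() * WIDTH as usize + 7) / 8;
    assert!(input.len() >= needed);
    let mask: u64 = (1u64 << WIDTH) - 1;
    let mut acc: u64 = 0;
    let mut bits: u32 = 0;
    let mut pos = 0;
    for slot in out.iter_mut() {
        // Refill the accumulator until it holds a whole value.
        while bits < WIDTH {
            acc |= (input[pos] as u64) << bits;
            pos += 1;
            bits += 8;
        }
        *slot = (acc & mask) as u32;
        acc >>= WIDTH;
        bits -= WIDTH;
    }
}
```

    A caller would write something like `pack::<5>(&values, &mut page_buf)` with a buffer carved out of its own page memory. A generic library codec usually takes the width as a runtime argument and returns a freshly allocated `Vec`, which is exactly the context it lacks.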

    • Pretty sure it's not due to performance, but due to the age of most database codebases and, in some cases, licensing. How annoying it is to have dependencies in C and C++ is also probably a contributing factor.

      I'd rather an author pull in tinyvec or serde than try to write a bespoke implementation.


> current best practice puts that number right around zero

If the answer is "zero", then one does not actually need a package manager at all, in which case the features of the package manager are not relevant to the choice of language. This would imply that the parent commenter has no reason to reject Rust.