Comment by a_t48

1 day ago

I've found that tar processing tends to dominate the time used to do anything with standard OCI layers. I have a more efficient format (that splits apart the layer into metadata+chunks) that I'm open sourcing soon if y'all are interested in using it.

interested. is the split for dedup, parallel pulls, or lazy loading specific files? maybe all.

we've played with some chunking ideas on our end but haven't landed on a format. drop a link when it's out.

  • All of the above, plus being able to reflink to skip copies of large files, plus not having to round-trip from disk a few times for tar layers, plus a number of other side benefits. Only using lazy loading for buildkit right now, as it does require FUSE and I want it to be opt-in (for robotics contexts, for instance, you never want to lazy load).
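
For anyone trying to picture the metadata+chunks idea, here's a rough sketch of what such a split could look like. The actual format isn't published yet, so everything here is an assumption for illustration: the names `split_layer`, `read_file`, and `make_demo_layer` are hypothetical, and fixed-size chunks keyed by SHA-256 are just one plausible choice, not necessarily what the author does.

```python
import hashlib
import io
import tarfile


def split_layer(layer_bytes, chunk_size=1 << 20):
    """Split an uncompressed tar layer into (metadata, chunks).

    metadata: one dict per tar entry, listing the chunk digests for its content.
    chunks:   digest -> bytes, so identical chunks are stored only once.
    Hypothetical layout, for illustration only.
    """
    metadata = []
    chunks = {}
    with tarfile.open(fileobj=io.BytesIO(layer_bytes), mode="r:") as tar:
        for member in tar:
            entry = {
                "path": member.name,
                "mode": member.mode,
                "size": member.size,
                "type": member.type.decode(),
                "chunks": [],
            }
            if member.isreg():
                f = tar.extractfile(member)
                while True:
                    block = f.read(chunk_size)
                    if not block:
                        break
                    digest = hashlib.sha256(block).hexdigest()
                    chunks.setdefault(digest, block)  # dedup by content
                    entry["chunks"].append(digest)
            metadata.append(entry)
    return metadata, chunks


def read_file(metadata, chunks, path):
    """Rebuild one file from its chunks without touching the rest of the layer."""
    for entry in metadata:
        if entry["path"] == path:
            return b"".join(chunks[d] for d in entry["chunks"])
    raise FileNotFoundError(path)


def make_demo_layer():
    """Build a tiny in-memory tar with two identical files and one other file."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in [("a.bin", b"x" * 10), ("b.bin", b"x" * 10), ("c.txt", b"hello")]:
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


if __name__ == "__main__":
    metadata, chunks = split_layer(make_demo_layer(), chunk_size=4)
    print(len(metadata), "entries,", len(chunks), "unique chunks")
    print(read_file(metadata, chunks, "c.txt"))
```

With the metadata separated out, a reader can find one file's chunks without scanning the whole tar stream, which is what makes dedup, parallel chunk fetches, and on-demand (lazy) reads possible in principle. Reflinking and FUSE-backed lazy loading aren't shown here since they depend on filesystem support.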