Hi, I work at Docker. The registry identifies each layer by its SHA digest and, for obvious reasons, does not store multiple copies of the same digest. This is not unique to Hub; it's part of the registry design spec.
Registry is open source (https://github.com/docker/distribution) and implements the OCI Distribution Specification (https://github.com/opencontainers/distribution-spec/blob/mas...) if you want to dig into it.
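To make the layer-level behaviour concrete, here is a minimal sketch of content-addressed storage. It is not the registry's actual code: the `BlobStore` class and its methods are hypothetical names, and it only assumes that a blob is keyed by the SHA-256 digest of its bytes.

```python
import hashlib


class BlobStore:
    """Toy content-addressed store (hypothetical, not the registry's code)."""

    def __init__(self) -> None:
        self._blobs: dict[str, bytes] = {}

    def put(self, data: bytes) -> str:
        # The key is derived from the content, so identical bytes always
        # map to the same digest and are stored only once.
        digest = "sha256:" + hashlib.sha256(data).hexdigest()
        self._blobs.setdefault(digest, data)
        return digest

    def get(self, digest: str) -> bytes:
        return self._blobs[digest]

    def __len__(self) -> int:
        return len(self._blobs)


store = BlobStore()
base_layer = b"base image layer contents"

d1 = store.put(base_layer)                      # first push of the layer
d2 = store.put(base_layer)                      # same layer pushed again
d3 = store.put(b"application layer contents")   # a different layer
```

Pushing the same bytes twice returns the same digest and leaves a single stored copy, so `len(store)` is 2 here rather than 3.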
Yes, that's what I meant when I mentioned layers; clearly copies of the same layer are not kept :) My question was about block-level, or other forms of deduplication.
Deduplication at the block level would depend on the choice of storage driver (https://docs.docker.com/registry/storage-drivers/). In the case of Hub, S3 is the storage medium, and that's an object store rather than a block store.
In theory you could modify the spec/application to try to break layers down into smaller pieces but I have a feeling you would reach the point of diminishing returns for normal use cases pretty quickly.
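As a rough illustration of what "breaking layers down into smaller pieces" could look like, here is a sketch of fixed-size chunk deduplication. This is not something the registry does: the chunk size, function names, and sample data are all made up for the example.

```python
import hashlib

CHUNK_SIZE = 4  # hypothetical and tiny, purely for illustration


def chunk_digests(layer: bytes, size: int = CHUNK_SIZE) -> list[str]:
    """Split a layer into fixed-size chunks and hash each chunk."""
    return [
        hashlib.sha256(layer[i:i + size]).hexdigest()
        for i in range(0, len(layer), size)
    ]


def unique_chunks(layers: list[bytes], size: int = CHUNK_SIZE) -> set[str]:
    """Digests a chunk-level store would need to keep for these layers."""
    seen: set[str] = set()
    for layer in layers:
        seen.update(chunk_digests(layer, size))
    return seen


layer_a = b"AAAABBBBCCCCDDDD"
layer_b = b"AAAABBBBXXXXDDDD"   # differs from layer_a in one chunk only

total_chunks = len(chunk_digests(layer_a)) + len(chunk_digests(layer_b))
stored_chunks = len(unique_chunks([layer_a, layer_b]))

# One byte inserted at the front shifts every fixed-size chunk boundary.
shifted = b"Z" + layer_a
shared_after_shift = set(chunk_digests(shifted)) & set(chunk_digests(layer_a))
```

Here the two layers produce 8 chunks in total but only 5 unique ones, so chunking helps when the changes line up with chunk boundaries. The shifted layer shares no chunks with the original at all, which is one way the returns could fall off quickly in practice.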