
Comment by fc417fc802

3 days ago

> imply that normal API token publishing is somehow not authenticated

Fair enough, although the same reasoning would imply that API token publishing isn't trusted ... well, after the recent npm attacks I suppose it might not be, at that.

> With what key?

> And there is a signature involved,

So there's already a key involved. I realize its lifetime might not be suitable but presumably the pipeline itself either already possesses or could generate a long lived key to be registered with the central service.

> but it only verifies the identity of the pipeline,

I thought verifying the identity of the pipeline was the entire point? The pipeline signing a fingerprint of the package would enable anyone to verify the provenance of the complete contents (either they'd need a way to look up the key or you could do TOFU, but I digress). There's value in being able to verify the integrity of the artifacts in your local cache.

Also, the more independent layers of authentication there are the fewer options an attacker will have. A hypothetical artifact that carried signatures from the developer, the pipeline, and the registry would have a very clear chain of custody.

> it being hard to directly sign arbitrary inputs with just OIDC in a meaningful way

At the end of the day you just need to somehow end up in a situation where the pipeline holds a key that has been authenticated by the package registry. From that point on I'd think that the particular signature scheme would become a trivial implementation detail; you stuff the output into some json or something similar and get on with life.
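Something like this toy sketch is all I have in mind (not a real format, and the step where the registry learns about the key is exactly the part I'm hand-waving):

```python
import base64
import hashlib
import json

from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

# Stand-in for whatever key the registry would have on file for this project.
pipeline_key = Ed25519PrivateKey.generate()


def signed_envelope(artifact: bytes) -> str:
    """Sign a fingerprint of the built artifact and stuff it into some JSON."""
    digest = hashlib.sha256(artifact).digest()
    signature = pipeline_key.sign(digest)
    return json.dumps({
        "artifact_sha256": digest.hex(),
        "signature": base64.b64encode(signature).decode(),
    })
```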

Has some key complexity gone over my head here?

BTW, please don't take this the wrong way. It's not my intent to imply that I know better, and as long as the process works it's not my intent to critique it. I was just honestly surprised to learn that the package content itself isn't signed by the pipeline to prove provenance for downstream consumers, and from there I'm just responding to the reasoning you gave. But if the current process does what it set out to do then I've no grounds to object.

> So there's already a key involved. I realize its lifetime might not be suitable but presumably the pipeline itself either already possesses or could generate a long lived key to be registered with the central service.

The key involved is the OIDC IdP's key, which isn't controlled by the maintainer of the project. I think it would be pretty risky to allow this key to directly sign for packages, because this would imply that any party that can use that key for signing can sign for any package. This would mean that any GitHub Actions workflow anywhere would be one signing bug away from impersonating signatures for every PyPI project, which would be exceedingly not good. It would also make the insider risk from a compromised CI/CD provider much larger.

(Again, I really recommend taking a look at the talks I linked. Both Trusted Publishing and attestations were multi-year projects that involved multiple companies, cryptographers, and implementation engineers, and most of your - very reasonable! - questions came up for us as well while designing and planning this work.)

> I thought verifying the identity of the pipeline was the entire point? The pipeline signing a fingerprint of the package would enable anyone to verify the provenance of the complete contents (either they'd need a way to look up the key or you could do TOFU, but I digress). There's value in being able to verify the integrity of the artifacts in your local cache.

There are two things here:

1. Trusted Publishing provides a verifiable link between a CI/CD provider (the "machine identity") and a packaging index. This verifiable link is used to issue short-lived, self-scoping credentials. Under the hood, Trusted Publishing relies on a signature from the CI/CD provider (which is an OIDC IdP) to verify that link, but that signature is only over a set of claims about the machine identity, not the package identity (there's a rough sketch of those claims below this list).

2. Attestations are a separate digital signing scheme that can use a machine identity. In PyPI's case, we bootstrap trust in a given machine identity by seeing if a project is already enrolled against a Trusted Publisher that matches that identity. But other packaging ecosystems may do other things; I don't know how NPM's attestations work, for example. This digital signing scheme uses a different key, one that's short-lived and isn't managed by the IdP, so that signing events can be made transparent (in the "transparency log" sense) and are associated more meaningfully with the machine identity, not the IdP that originally asserted the machine identity.
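To make point 1 concrete, the claims the IdP signs look roughly like the sketch below. This is only a sketch: the claim names are GitHub Actions', the function assumes you already have a token in hand, and a real verifier must also check the token's signature against the IdP's published keys (elided here).

```python
import jwt  # PyJWT


def machine_identity_claims(oidc_token: str) -> dict:
    """Decode the claims the IdP signed. Signature verification against the
    IdP's JWKS is skipped here for brevity; a real verifier must do it."""
    return jwt.decode(oidc_token, options={"verify_signature": False})

# Typical GitHub Actions claims: who/where the workflow is, and nothing about
# the package that was built:
#   sub              -> "repo:org/project:ref:refs/heads/main"
#   repository       -> "org/project"
#   job_workflow_ref -> "org/project/.github/workflows/release.yml@refs/heads/main"
```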

> At the end of the day you just need to somehow end up in a situation where the pipeline holds a key that has been authenticated by the package registry. From that point on I'd think that the particular signature scheme would become a trivial implementation detail; you stuff the output into some json or something similar and get on with life.

Yep, this is what attestations do. But a key piece of nuance: the pipeline doesn't "hold" a key per se; it generates a new short-lived key on each run and binds that key to the verified identity sourced from the IdP. This achieves the best of both worlds: users don't need to maintain a long-lived key, and the IdP itself is only trusted as an identity source (and is made auditable for issuance behavior via transparency logging). The end result is that clients that verify attestations don't verify using a specific key; they verify using an identity, and ensure that any particular key matches that identity as chained through an X.509 CA. That entire process is called Sigstore[1].
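In very rough sketch form, that flow looks something like the following. The `bind_key_to_identity` helper is just a stand-in for what Fulcio (Sigstore's CA) does, not a real API; the point is that the key is fresh per run and the thing verifiers pin is the identity.

```python
import hashlib

from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey


def bind_key_to_identity(public_key, oidc_token: str) -> str:
    """Stand-in for Sigstore's CA (Fulcio): exchange a fresh public key plus an
    OIDC-verified identity for a short-lived certificate, with the issuance
    recorded in a transparency log. Not a real API."""
    return "<short-lived X.509 certificate bound to the machine identity>"


def sign_release(artifact: bytes, oidc_token: str):
    key = Ed25519PrivateKey.generate()  # fresh key for this run only
    cert = bind_key_to_identity(key.public_key(), oidc_token)
    signature = key.sign(hashlib.sha256(artifact).digest())
    # The private key is simply discarded after this. Verifiers don't pin a key;
    # they check the signature against the certificate and check that the
    # certificate's identity is the one they expect.
    return signature, cert
```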

And no offense taken, these are good questions. It's a very complicated system!

[1]: https://www.sigstore.dev

  • > I think it would be pretty risky to allow this key to directly sign for packages, because this would imply that any party that can use that key for signing can sign for any package.

    There must be some misunderstanding. For Trusted Publishing, a short-lived API token is issued that can be used to upload the finished product. You could instead imagine negotiating a key (ephemeral or otherwise) and then verifying the signature on upload.

    Obviously the signing key can't be shared between projects any more than the API token is. I think I see where the misunderstanding arose now: because I said "just verify the pipeline identity", you interpreted that as "let end users get things signed by a single global provider key" or something to that effect, right?

    The only difference I had intended to communicate was the ability of the downstream consumer to verify the same claim (via signature) that the registry currently verifies via token. But it sounds like that's more or less what attestation is? (Hopefully I understood correctly.) But that leaves me wondering why Trusted Publishing exists at all. By the time you've done the OIDC dance why not just sign the package fingerprint and be done with it? ("We didn't feel like it" is of course a perfectly valid answer here. I'm just curious.)

    I did see that attestation has some other stuff about Sigstore, countersignatures, etc. I'm not saying that additional stuff is bad; I'm asking whether Trusted Publishing wouldn't be improved by offering a signature so that downstream could verify for itself. Was there some technical blocker to doing that?

    > the IdP itself is only trusted as an identity source

    "Only"? Doesn't being an identity source mean it can do pretty much anything if it goes rogue? (We "only" trust AD as an identity source.)

    • > There must be some misunderstanding. For Trusted Publishing, a short-lived API token is issued that can be used to upload the finished product. You could instead imagine negotiating a key (ephemeral or otherwise) and then verifying the signature on upload.

      From what authority? Where does that key come from, and why would a verifying party have any reason to trust it?

      (I'm not trying to be tendentious, so sorry if it comes across that way. But I think you're asking good questions that lead to the design that we arrived at with attestations.)

      > I did see that attestation has some other stuff about Sigstore, countersignatures, etc. I'm not saying that additional stuff is bad; I'm asking whether Trusted Publishing wouldn't be improved by offering a signature so that downstream could verify for itself. Was there some technical blocker to doing that?

      The technical blocker is that there's no obvious way to create a user-originated key that's verifiably associated with a machine identity, as originally verified from the IdP's OIDC credential. You could do something like mash a digest into the audience claim, but this wouldn't be very auditable in practice (since there's no easy way to shoehorn transparency atop that). But some people have done some interesting exploration in that space with OpenPubKey[1], and maybe future changes to OIDC will make something like that more tractable.
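      Purely as an illustration of that idea (none of this is a real protocol), "mashing a digest into the audience claim" would amount to something like:

```python
import hashlib


def audience_for(artifact: bytes) -> str:
    """Commit to the artifact in the OIDC token's `aud` claim, so the IdP's
    signature transitively covers the digest. Illustration only."""
    return "pkg-sha256:" + hashlib.sha256(artifact).hexdigest()

# A workflow would then request its OIDC token with this audience (GitHub
# Actions accepts an `audience` parameter on the token request), and a
# verifier would recompute the digest and compare it against the token's `aud`.
```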

      > "Only"? Doesn't being an identity source mean it can do pretty much anything if it goes rogue? (We "only" trust AD as an identity source.)

      Yes, but that's why PyPI (and everyone else who uses Sigstore) mediates its use of OIDC IdPs through a transparency logging mechanism. This is in effect similar to the situation with CAs on the web: a CA can always go rogue, but doing so would (1) be detectable in transparency logs, and (2) get the CA immediately evicted from trust roots. If we observed rogue activity from GitHub's IdP in terms of identity issuance, the response would be similar.

      [1]: https://github.com/openpubkey/openpubkey
