SigningConfig proto to have start dates? #474
Comments
As long as the sharding does not take place until a complete expiration period of the TUF metadata has passed, this shouldn't happen. For PGI, that would mean a) add the new key to TrustedRoot and do a target signing, b) wait a week, c) start the sharding. With that said, I do like the idea of simplifying sharding to require only one root signing event. I could see us having the same validity windows between the key material in the TrustedRoot and SigningConfig. Do you think including validity windows at signing time overly complicates signing? How do we handle overlapping windows? How do we handle overlapping windows and private logs concurrently? To give a specific example:
At time Y-1 month, when we roll out rekorv2, I would want a client publishing to rekorv2 and also internal.log, but not rekorv1. |
technically TrustedRoot + SigningConfig might already have the required data for this, and the only thing strictly needed is the rules for how clients should operate:
This btw is a good example of how keeping SigningConfig and TrustedRoot separate is not as great as it sounded in theory... |
Thanks for calling this out.
My current thinking is that this is the most desirable option. In general this gives a flexible way to plan for a sharding, or a change in a service, at the time of a signing event, giving clients time to learn about it, so that when the shift happens there should be no outages or cases where multiple clients are out of sync. Note that the time ranges (between TrustedRoot and SigningConfig) don't necessarily need to be 100% in sync: to account for some leeway (i.e., 30 minutes or so), the old service should remain valid for verification a bit longer than the point at which the new service was commissioned. |
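The leeway idea above can be sketched as a simple check. This is a minimal illustration with hypothetical timestamps and function names, not code from any sigstore client:

```python
from datetime import datetime, timedelta, timezone

# The old service's verification window should extend a bit past the moment
# the new service starts accepting writes, so verifiers that are slightly
# behind on time never hit a gap. The 30-minute leeway mirrors the figure
# mentioned in the comment above; all names here are illustrative.
LEEWAY = timedelta(minutes=30)

def windows_overlap_with_leeway(old_valid_until, new_valid_from, leeway=LEEWAY):
    """Return True if the old window covers the new start plus the leeway."""
    return old_valid_until >= new_valid_from + leeway

old_until = datetime(2025, 6, 1, 12, 0, tzinfo=timezone.utc)
new_from = datetime(2025, 6, 1, 11, 0, tzinfo=timezone.utc)
print(windows_overlap_with_leeway(old_until, new_from))  # True: 1h overlap > 30min leeway
```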
So how's this for a proposal?

```diff
 message SigningConfig {
   string media_type = 5;
   string ca_url = 1;
   string oidc_url = 2;
-  repeated string tlog_urls = 3;
+  repeated TimeBoundURL tlog_urls = 3;
   repeated string tsa_urls = 4;
 }

+message TimeBoundURL {
+  string url = 1;
+  TimeRange valid_for = 2; // TimeRange is what we use everywhere else in trusted_root
+}
```

It's not obvious to me that we need to do this for the other types, but we can. Rekor is the only one that will be sharded and URL-swapped. But we can make all of these |
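A minimal sketch of how a client might consume the proposed `TimeBoundURL` shape, using dict entries with hypothetical `start`/`end` fields standing in for the proto's `TimeRange` (these names are assumptions, not the actual schema):

```python
from datetime import datetime, timezone

def active_tlog_urls(tlog_urls, now=None):
    """Return the tlog URLs whose validity window covers `now`.

    A missing start means no lower bound; a missing end means open-ended.
    """
    now = now or datetime.now(timezone.utc)
    active = []
    for entry in tlog_urls:
        start = entry.get("start")
        end = entry.get("end")
        if (start is None or start <= now) and (end is None or now <= end):
            active.append(entry["url"])
    return active

# Hypothetical config: an old log being wound down and its replacement.
urls = [
    {"url": "https://rekor.sigstore.dev", "start": None,
     "end": datetime(2025, 1, 1, tzinfo=timezone.utc)},
    {"url": "https://rekor-v2.example.dev",
     "start": datetime(2024, 12, 1, tzinfo=timezone.utc), "end": None},
]
# During the overlap window, both logs are active.
print(active_tlog_urls(urls, now=datetime(2024, 12, 15, tzinfo=timezone.utc)))
```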
Also may need to include information about the protocol. If the signingConfig is updated to include an endpoint that is communicating over protocol version V+1, then maybe clients need to know? |
Did you have thoughts on how to handle overlapping windows like in the example I noted above? Maybe the answer is also including log operator, and a client should only write to the latest log for each operator? +1 on including protocol version, this will make transitioning between APIs simpler. |
If we include log operator, we should key the repeated URLs on the operator name, I think, to simplify the client-side processing. I'm assuming the log operator names don't bear any specific meaning in general? Clients who care ought to know which operators they trust. Or would the behaviour be that the client puts a message on one log from each log operator? This should probably go into the client spec once we know the details. My gut feeling is that the client should put a message on one log from each configured log operator as the default mode, but allow for configuration. |
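The "one log per configured operator" default mode could look like the following sketch. The `operator` field and the newest-first ordering are assumptions from this discussion, not fields in the actual proto:

```python
def one_log_per_operator(logs):
    """Pick one log URL per operator, assuming the list is ordered by
    preference (e.g. newest first); the first entry per operator wins."""
    chosen = {}
    for log in logs:
        chosen.setdefault(log["operator"], log["url"])
    return list(chosen.values())

# Hypothetical example: a private deployer writing to the public log and
# their internal log, as in the example earlier in the thread.
logs = [
    {"operator": "sigstore", "url": "https://rekor-v2.sigstore.dev"},
    {"operator": "sigstore", "url": "https://rekor.sigstore.dev"},
    {"operator": "internal", "url": "https://internal.log.example"},
]
print(one_log_per_operator(logs))  # one URL per operator
```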
I assume you want this because you want to allow clients to use the v1 log for a time even though we have v2 available already? At least I can't think of other use cases where this would be useful. What kind of time frame are we talking about when both would be valid? Would it be problematic if clients just used both logs during that time? |
Assuming we can avoid clients thinking about operators (as described in the previous comment), I suppose this might be enough:

```diff
 message SigningConfig {
   ...
-  repeated string tlog_urls = 3;
+  repeated SigningTLog tlogs = 3;
 }

+message SigningTLog {
+  string url = 1;
+  string apiVersion = 2; // hand waving actual values, but something like "v1" and "v2"
+  TimeRange valid_for = 3; // TimeRange is what we use everywhere else in trusted_root
+}
```
|
@jku, the example I was thinking of is for users that want to concurrently publish to multiple logs run by different operators, whether for reliability, verification, or trust. This could be for when we have multiple logs in the ecosystem, or for private deployers who want to write to both their internal log and the public log. I don't think this is far-fetched: Certificate Transparency requires publishing to multiple logs with distinct operators. I think we should include operator for signing. We'll need operator for verification as well, since verification policies like "a signature is published to >= 2 logs" should be grouped by log operator, not by log instance. |
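The verification side of this can be sketched as counting distinct operators rather than log instances, so two shards run by the same operator only count once. Field names here are illustrative, not from the actual proto:

```python
def satisfies_operator_threshold(entries, threshold=2):
    """True if the inclusion proofs cover at least `threshold` distinct
    log operators (grouping by operator, not by log instance)."""
    operators = {e["operator"] for e in entries}
    return len(operators) >= threshold

# Hypothetical proofs: two shards from the same operator do NOT satisfy
# a ">= 2 operators" policy, which is the point of grouping by operator.
proofs = [
    {"log_url": "https://rekor.sigstore.dev", "operator": "sigstore"},
    {"log_url": "https://rekor-v2.sigstore.dev", "operator": "sigstore"},
]
print(satisfies_operator_threshold(proofs))  # False: one operator, two logs
```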
I'm just saying that verifying correct behaviour of a sigstore client is already really difficult, and adding complexity to policy management multiplies the difficulty for each added policy knob, so let's be really sure this is required. It feels like defining "operators" only works for non-malicious cases, like running Rekor v1 and v2 at the same time during a migration -- in other cases, who gets to decide when two logs have separate operators? |
Focusing only on this first migration: how do we want to signify to a client that, when given a signing config with both Rekor v1 and Rekor v2, it should always prefer the latter? When we roll out v2, we can't specify an end date for v1 (initially) since some clients may not yet have support, and we shouldn't have clients writing to both. We could use the API version: prefer the highest API version, and write to all logs with that version. For sharding, if the client policy is to write to all active logs, we could keep the overlapping validity window as small as possible. Since creating a new shard can happen before we publish the new key (which is not true with v1, where sharding instantly swaps writes over to the new log), the procedure could look like:
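The "prefer the highest supported API version, then write to every log at that version" rule above could be sketched like this. Integer versions and the field names are assumptions from this thread, not the shipped proto:

```python
def logs_to_write(tlogs, supported=frozenset({1, 2})):
    """Filter to API versions this client supports, then keep only the
    logs at the highest of those versions."""
    usable = [t for t in tlogs if t["api_version"] in supported]
    if not usable:
        raise RuntimeError("no tlog with a supported API version")
    best = max(t["api_version"] for t in usable)
    return [t["url"] for t in usable if t["api_version"] == best]

# Hypothetical config during the v1 -> v2 migration.
tlogs = [
    {"url": "https://rekor.sigstore.dev", "api_version": 1},
    {"url": "https://rekor-v2.sigstore.dev", "api_version": 2},
]
print(logs_to_write(tlogs))                    # a v2-aware client writes only to v2
print(logs_to_write(tlogs, supported={1}))     # an old client stays on v1
```

This matches the goal stated above: old clients keep working against v1, while updated clients never double-write.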
No
Whoever distributes the signing config/trust bundle decides. As we bring up more logs in the ecosystem, we should know who operates which logs. Though I think I'm convinced now we can make signing configs work without |
Update from the clients meeting: maybe each item should be a message, for flexibility, so we can add fields later.

```proto
message SigningConfig {
  string media_type = 5;
  repeated SigningCA cas = 1;
  repeated SigningOIDC oidc = 2;
  repeated SigningTLog tlogs = 3;
  repeated SigningTSA tsas = 4;
}
```

Although these may still each just contain the same kind of information (and be redundant):

```proto
message SigningX {
  string url = 1;
  string api_version = 2;
  TimeRange valid_for = 3;
}
```

For the Public Good Instance, these will have one of each (unless we're transitioning), but otherwise the workflow for clients could be:
|
Defining "valid", what do you think about "Clients pick the entry with the latest API version that the client supports"? Or do you want it to be the responsibility of trust root maintainers to order the SigningX entries? |
Yeah I guess that's kind of vague
This can fail for old clients
We have to move them to a new version of the client, or backport. As adoption grows, the need for backports grows. We might want to strongly define which versions of sigstore clients are supported and have passed conformance (and TUF conformance). Let's also hope we don't have to make breaking changes after Rekor v2.
The API version is listed as a string; do we want to be open-ended here? Should we let the service define the API version string, and assume clients of that service understand how it should be decoded? Can we force this to two integers, one for major and one for minor? Or possibly even a single sequence number? |
I'm taking a stab at this now, I'll have a PR up shortly. @kommendorkapten, I agree, I've added the major API version as an int. I'm proposing leaving out minor. Minor would be nice to have for signifying new methods (as non-breaking changes), but it's going to complicate the selection logic and how we update the SigningConfig with new versions. If I add a method to Fulcio, will I update the current entry? What happens if a client doesn't yet support that minor version? Clients would now also be keeping track of per-minor-version methods. |
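One concrete reason to prefer an integer over a free-form version string: strings compare lexicographically, which breaks version ordering once versions reach double digits. A two-line illustration of the pitfall:

```python
# Lexicographic comparison orders "v10" before "v2" ('1' < '2' at the
# second character), so "prefer the highest version" silently picks the
# wrong log. Integers order correctly with no parsing rules to agree on.
print(max(["v10", "v2"]))  # "v2" -- wrong for versions
print(max([10, 2]))        # 10 -- what we actually want
```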
In order to facilitate clients gracefully handling breaking API changes, the SigningConfig will now include API versions for each of the service URLs so that clients can determine which services they are compatible with. Additionally, we've included validity periods which will be used to facilitate Rekor log sharding, when we spin up new log shards and distribute new key material. Fixes sigstore#474 Signed-off-by: Hayden Blauzvern <[email protected]>
A few things to note:
|
When rotating keys for rekor (while doing v2 sharding), currently we would require two signing events for root signing
Root signing is a somewhat expensive process but we don't want the ecosystem to end up in a situation where verifiers slightly behind on time can't verify new signatures.
I think this can be solved by making SigningConfig more flexible to key rotations. Potentially by adding startTimes or some sort of time for when signers should start using it. Signers could go down the list of providers and pick the first one that is valid.
fyi: @jku
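The "go down the list and pick the first valid provider" behaviour proposed above could be sketched like this, assuming entries carry an optional start time and are ordered by preference (field names are illustrative, not from the proto):

```python
from datetime import datetime, timezone

def first_valid(providers, now=None):
    """Walk the preference-ordered list and return the first provider
    whose start time (if any) has passed."""
    now = now or datetime.now(timezone.utc)
    for p in providers:
        start = p.get("start_time")
        if start is None or start <= now:
            return p["url"]
    return None

# Hypothetical rotation: the new key's entry is published ahead of time
# with a future start_time; signers keep using the old entry until then.
providers = [
    {"url": "https://rekor-new.example.dev",
     "start_time": datetime(2025, 7, 1, tzinfo=timezone.utc)},
    {"url": "https://rekor.sigstore.dev", "start_time": None},
]
print(first_valid(providers, now=datetime(2025, 6, 1, tzinfo=timezone.utc)))
```

This is how a single root signing event could pre-announce the rotation: both entries ship at once, and clients switch over automatically when the start time arrives.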