How to do flattening ? #37

rohe · 2019-01-11T09:33:44Z

One of the cornerstones of this draft is to allow a trust anchor (federation operator), or for that matter any intermediate entity, to limit/restrict the metadata of a leaf entity (e.g. RP/OP).
If for instance the trust anchor decides that the only allowed signing algorithms are elliptic curve algorithms then the evaluated metadata of a leaf entity must only contain such algorithms.

The first attempt at supporting this is what's in the draft right now. A simple system that allows a superior to state the limits of what a leaf entity's metadata can look like.

The reality is of course a bit more complex than what the model assumes.
Take the contacts claim as an example. Here you probably want the subordinates to add their contacts to the ones that are defined by superiors.

Or imagine a claim (there is none in OP or RP metadata to my knowledge) that has an integer as its value. In some cases that value could be a point on a scale, in which case the superior might want to state that you can't have a value >= 10 or < 5. In other cases the value would instead be one of a set and not something you could handle as a point on an ordinal scale.

It would be nice if we could find simple rules, one for each value type, regardless of which claim it is.
If we got into exceptions (like contacts mentioned above) they would have to be very few; anything else would be a disaster.

alejandro-perez · 2019-01-11T14:24:47Z

We could define a bunch of "policies" for flattening, with "is_subset" as the default.
Another one could be "ignore", which would mean that any definition in a lower MS should prevail.
Another one could be "is_superset", meaning you can add but not remove (e.g. imagine a "notification" claim containing a list of emails that should be notified in case of issues).

Finally, we define an additional claim called "policy", a JSON object in which each key is a claim name and each value is the name of the policy to apply:

[.....],
"policy": {
   "contacts": "is_superset",
   "a_random_claim": "ignore"
}

This would allow for further policies to be defined.

Just an idea though :)

alejandro-perez · 2019-01-11T14:39:33Z

I thought of two more: max and min. max is the one applied implicitly to exp, for instance.

alejandro-perez · 2019-01-11T14:41:14Z

Another one would be frozen/fixed/readonly. Indicating it MUST stay as it is. I.e. cannot be redefined.

alejandro-perez · 2019-01-11T14:42:56Z

IMO policy should be "is_superset", so lower MS could add policy to undefined claims, but cannot modify those already having a policy.

alejandro-perez · 2019-01-11T15:02:42Z

Alternatively, if more complexity were desired, a policy could itself be a JSON object with a name and a set of parameters, but IMO that would be too much complexity.

"policy": {
   "contacts": "is_superset",
   "a_random_claim": "ignore",
   "some_integer_claim": {"name": "range", "min": 5, "max": 15}
}
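
A minimal sketch of how a verifier might handle both forms, the plain policy name and the parameterised object. The helper names `normalize_policy` and `check_value` are hypothetical, and treating `min` and `max` as inclusive bounds is an assumption; the thread does not say which it should be.

```python
def normalize_policy(policy):
    """Turn either "is_superset" or {"name": "range", ...} into (name, params)."""
    if isinstance(policy, str):
        return policy, {}
    params = dict(policy)
    name = params.pop("name")
    return name, params


def check_value(value, policy):
    """Check a single value against a policy; only "range" is sketched here."""
    name, params = normalize_policy(policy)
    if name != "range":
        raise NotImplementedError(f"policy not covered by this sketch: {name}")
    # Assumption: both bounds are inclusive and either may be omitted.
    low = params.get("min")
    high = params.get("max")
    if low is not None and value < low:
        return False
    if high is not None and value > high:
        return False
    return True


range_policy = {"name": "range", "min": 5, "max": 15}
print(check_value(10, range_policy))
print(check_value(20, range_policy))
print(normalize_policy("is_superset"))
```

Normalising first keeps the simple string form cheap to write while leaving room for parameters where a claim needs them.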

nckroy · 2019-01-11T18:42:16Z

Here is another set of policies:
must_supply_any_value - the claim must be present with one or more values, but the values required are unspecified
must_supply_uri_value - the claim must be present with a value that is a valid URI, values required are unspecified
must_supply_url_value - the claim must be present with a value that is a valid URL, values required are unspecified
must_supply_https_value - the claim must be present with a value that is a valid HTTPS URL, values required are unspecified
must_supply_urn_value - the claim must be present with a value that is a valid URN
must_supply_email_value - the claim must be present with a value that is a valid e-mail address

nckroy · 2019-01-11T18:58:30Z

These are meant to allow a superior to require specific types of values being supplied, but not what those values have to be. This would support, for example, InCommon's Baseline Expectations program: https://spaces.at.internet2.edu/display/BE/Implementing+Baseline+Expectations+in+InCommon+Metadata

rohe · 2019-01-11T20:49:36Z

Looks like we're about to define a domain specific language.
Different policies can be applied to claims dependent on the claim value type.
Sounds interesting but complex. Also, I'd like to see how a set of policies defined in entity statements in a trust chain are interpreted. How to 'flatten' policies :-/

alejandro-perez · 2019-01-11T22:46:27Z

Looks like we're about to define a domain specific language.

Yeah :(. I would not think of it to be that complex, though.

Sounds interesting but complex. Also, I'd like to see how a set of policies defined in entity statements in a trust chain are interpreted. How to 'flatten' policies :-/

In some previous comment I mentioned that "policies" should implicitly be of type "is_superset", so lower MS could add policies, but could not delete or modify existing ones.

But all of this could be defined in a different document. It could even be defined just for a particular federation. Or a federation could just define in its SLAs how claims should be flattened (I think we already discussed this in Copenhagen, didn't we?).

In any case, those were my 2 cents :)

rohe · 2019-01-12T07:29:16Z

I don't believe in federations defining their own flattening functions. That would lead to an interoperability nightmare.
We need one model for how flattening is done.
Now, @alejandro-perez what you propose gets me thinking about claims requests using the claims parameter https://openid.bitbucket.io/connect/openid-connect-core-1_0.html#IndividualClaimsRequests . If we could do something similar to that ...

rohe · 2019-01-12T07:39:04Z

@nckroy You must remember that a leaf entity in the general case may not know which federations it belongs to (and it shouldn't have to).
Think of it this way: every leaf entity states "this is what I'm able to do, whomever I talk to", and then the policies reduce this to what is actually going to be used. So having a policy that says you have to have this or you have to have that just doesn't work. For instance, what would the leaf entity do if it belonged to 2 federations with conflicting views on which crypto algorithms to use, one stating that everyone should use RS256 and the other adamant about ES256 being used? The idea in the draft is that the leaf just states "I can do RS256 and ES256" (provided it can) and then at run time that statement is filtered by the policies, such that if you choose to work within federation-RS256 then the resulting metadata statement for the leaf would only list RS256, and vice versa.
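
As a rough illustration of that filtering idea, here is a small sketch. The function name `filter_algs` and the federation name `federation-ES256` are made up for the example, and representing each federation's policy as a plain list of allowed algorithms is an assumption.

```python
def filter_algs(leaf_algs, allowed_algs):
    """Keep only the algorithms the leaf supports that the federation allows."""
    return [alg for alg in leaf_algs if alg in allowed_algs]


# The leaf states everything it can do, independent of any federation.
leaf_algs = ["RS256", "ES256"]

# Hypothetical per-federation policies with conflicting views.
federation_policies = {
    "federation-RS256": ["RS256"],
    "federation-ES256": ["ES256"],
}

for name, allowed in federation_policies.items():
    print(name, filter_algs(leaf_algs, allowed))
```

The leaf publishes one statement, and the conflict between the two federations only shows up in the evaluated metadata for each context.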

rohe · 2019-01-12T08:55:26Z

Ok, thought a bit more along the claims request path and came up with these examples:

OP metadata policy:

{
  "scopes_supported": {
    "subset_of": ["openid", "email", "profile"]
  },
  "claims_parameter_supported": {
    "value": true
  },
  "op_policy_uri": {
    "default": "https://op.example.com/policy.html"
  }
}

RP metadata policy:

{
  "response_types": {
    "subset_of": ["code", "code token"]
  },
  "grant_types": {
    "subset_of": ["authorization_code", "implicit"]
  },
  "application_type": {
    "value": "web"
  },
  "contacts": {
    "add" : "support@federation.example.com"
  },
  "policy_uri": {
    "add": "https://federation.example.com/policy.html"
  },
  "id_token_signed_response_alg": {
    "one_of": ["ES256", "ES384", "ES512"]
  },
  "token_endpoint_auth_method": {
    "value": "private_key_jwt"
  }
}

The pattern should be obvious. The key words are:

subset_of: Only these values are allowed. If the list of allowed values is ["A","B","C"] and the OP lists ["A","C","D"] as its values, the flattening would result in the set ["A","C"].

value: The value of this claim is fixed to this one allowed value.

one_of: The value of the claim can be one of the listed values.

add: There is no limitation on which values to use. This value should be added to the resulting list.

default: If no other value is given this one should be used.

Just an idea :-)

rohe · 2019-01-13T09:07:47Z

Did a Proof-of-concept implementation and this is what I get:

The federation operator's policy:

{
    "scopes": {
        "subset_of": ["openid", "eduperson"]
    },
    "response_types": {
        "subset_of": ["code", "code id_token"]
    }
}

The organisation's policy:

{
    "contacts": {
        "add": ["helpdesk@example.com"]
    },
    "logo_uri": {
        "one_of": ["https://example.com/logo1.jpg", "https://example.com/logo2.jpg"],
        "default": "https://example.com/logo1.jpg"
    },
    "policy_uri": {
        "value": "https://example.com/policy.html"
    },
    "tos_uri": {
        "value": "https://example.com/tos.html"
    }
}

The metadata statement from the RP:

{
    "contacts": ["rp_admins@cs.example.com"],
    "redirect_uris": ["https://cs.example.com/rp1"],
    "response_types": ["code"]
}

And the result after applying the policies:

{
    'contacts': ['rp_admins@cs.example.com', 'helpdesk@example.com'],
    'redirect_uris': ['https://cs.example.com/rp1'],
    'response_types': ['code'],
    'logo_uri': 'https://example.com/logo1.jpg',
    'policy_uri': 'https://example.com/policy.html',
    'tos_uri': 'https://example.com/tos.html'
}
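
For readers who want to follow the steps, here is a minimal sketch of how such an evaluation could work. It is not the proof-of-concept code itself. The function name `apply_policy`, the fixed operator order, raising `ValueError` on a `one_of` violation, and applying the policies one after another from the federation operator down are all assumptions made for this example.

```python
# Assumed evaluation order when one claim carries several operators.
OPERATOR_ORDER = ("value", "add", "default", "subset_of", "one_of")


def apply_policy(metadata, policy):
    """Return a copy of metadata with one policy applied to it."""
    result = dict(metadata)
    for claim, rules in policy.items():
        for op in OPERATOR_ORDER:
            if op not in rules:
                continue
            arg = rules[op]
            if op == "value":
                result[claim] = arg  # fixed value, overrides the leaf
            elif op == "add":
                extra = arg if isinstance(arg, list) else [arg]
                current = result.get(claim, [])
                if not isinstance(current, list):
                    current = [current]
                result[claim] = current + [v for v in extra if v not in current]
            elif op == "default":
                result.setdefault(claim, arg)  # only used if the claim is absent
            elif op == "subset_of" and claim in result:
                result[claim] = [v for v in result[claim] if v in arg]
            elif op == "one_of" and claim in result:
                if result[claim] not in arg:
                    raise ValueError(f"{claim}: {result[claim]!r} is not allowed")
    return result


federation_policy = {
    "scopes": {"subset_of": ["openid", "eduperson"]},
    "response_types": {"subset_of": ["code", "code id_token"]},
}
organisation_policy = {
    "contacts": {"add": ["helpdesk@example.com"]},
    "logo_uri": {
        "one_of": ["https://example.com/logo1.jpg", "https://example.com/logo2.jpg"],
        "default": "https://example.com/logo1.jpg",
    },
    "policy_uri": {"value": "https://example.com/policy.html"},
    "tos_uri": {"value": "https://example.com/tos.html"},
}
rp_metadata = {
    "contacts": ["rp_admins@cs.example.com"],
    "redirect_uris": ["https://cs.example.com/rp1"],
    "response_types": ["code"],
}

result = rp_metadata
for policy in (federation_policy, organisation_policy):
    result = apply_policy(result, policy)
print(result)
```

Note that a policy on a claim the leaf does not publish (scopes here) is simply skipped in this sketch, so it cannot require a claim to be present.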

alejandro-perez · 2019-01-14T09:41:45Z

LGTM

rohe · 2019-01-14T09:43:37Z

That's all the confirmation I needed :-)

daserzw · 2019-01-15T12:05:54Z

Looks great to me as well!

Other considerations:

it would be great to have a standard set of policies that MUST be implemented.
extensions to the standard set are possible through other specs or profiles.
a regex policy would probably be a useful addition to the standard set --- it should be applied to strings and list of strings.

nckroy · 2019-01-15T17:58:32Z

Davide, are you thinking certain elements like at least a technical contact should always be a MUST? What other elements would be MUSTs? Probably worth aligning with the new version of saml2int(?) https://kantarainitiative.github.io/SAMLprofiles/saml2int.html

nckroy · 2019-01-15T18:41:56Z

Roland, would it be possible for the Federation Operators policy to include a requirement to supply at least one contact that is an email address? Eventually, would it be possible for the Federation Operators policy to require at least one contact that is an email address, and a specific type, for example "Technical Contact"?

alejandro-perez · 2019-01-15T18:52:09Z

You could do it with the regex policy Davide proposed, couldn't you?

rohe · 2019-01-15T19:55:44Z

By using regex as @daserzw proposes you can probably check that a contact is in fact an email address. But there is no way to demand that there is a value at all. At least not for the time being.

In https://openid.bitbucket.io/connect/openid-connect-core-1_0.html#IndividualClaimsRequests there is the verb essential, but it's always up to the supplier of the information (in the case of claims requests the OP) to do as they please. There is no way to directly enforce anything.

The only thing you can do is run a checking service that runs around and validates the metadata for all the RPs/OPs in the federation.
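
A small sketch of what such a regex check could look like and why it cannot demand a value. The function name `check_regex` and the deliberately loose e-mail pattern are hypothetical, and the pattern uses the flavour of Python's `re` module, which is itself an assumption about which regex dialect would be chosen.

```python
import re

# Deliberately loose pattern for illustration only, not a full e-mail validator.
EMAIL_PATTERN = r"[^@\s]+@[^@\s]+\.[^@\s]+"


def check_regex(metadata, claim, pattern):
    """Return True if every value of the claim matches the pattern.

    If the claim is absent the rule is never evaluated, so the check passes.
    """
    if claim not in metadata:
        return True
    value = metadata[claim]
    values = value if isinstance(value, list) else [value]
    if not all(isinstance(v, str) for v in values):
        raise TypeError(f"{claim}: regex applies only to strings and lists of strings")
    return all(re.fullmatch(pattern, v) is not None for v in values)


print(check_regex({"contacts": ["rp_admins@cs.example.com"]}, "contacts", EMAIL_PATTERN))
print(check_regex({"contacts": ["helpdesk"]}, "contacts", EMAIL_PATTERN))
print(check_regex({}, "contacts", EMAIL_PATTERN))
```

The last call passes even though there is no contact at all, which is exactly the gap described above.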

nckroy · 2019-01-15T21:17:01Z

The running around checking values thing is not going to scale. Can we include the essential part in the policies? Re:regular expressions, that has proven to be problematic for things like shibmd:Scope in the SAML world due to the proliferation of different regex implementations in different languages/frameworks.

c00kiemon5ter · 2019-01-16T02:08:18Z

Re:regular expressions, that has proven to be problematic for things like shibmd:Scope in the SAML world due to the proliferation of different regex implementations in different languages/frameworks.

You can define the regex to be of a certain standard; see POSIX BRE, POSIX ERE, PCRE, etc. Pick one to dictate the valid expression, i.e. don't depend on an implementation or programming language.
Note that different standards have different capabilities, and these affect the performance of the implementation.

a regex policy would probably be a useful addition to the standard set --- it should be applied to strings

why not numbers, too?

and list of strings.

why a list of strings and not a list of list of strings? I guess, what we care about are the items of those lists, no matter how nested they are. Even though, I have no use case, should this be limited by the standard?

What happens when the regex (or any rule for that matter) is applied to an incompatible value-type?

By using regex as @daserzw proposes you can probably check that a contact is in fact an email address. But there is no way to demand that there is a value at all.

an empty value will not match the regex rule; but, I think what you're saying is that the rule will not be checked, if no such claim is in the response.

"subset_of": ["code", "code id_token"]

is id_token code a valid subset? I think it should be.
Sets by definition are not ordered.
A better representation would be a list of sets:

"subset_of": [{"code"}, {"code", "id_token"}]

This allows you to implement this check (using Python's set datatype) as:

rulesets = [{"code"}, {"code", "id_token"}]
value = set("id_token code".split())
any(value.issubset(rs) for rs in rulesets)

appropriate datatypes are there for other langs.

rohe · 2019-01-16T08:19:45Z

@nckroy We can include essential as a key word but what will the consequences be ?
To go back to something I said a while ago: if we are talking about the general case, not the special 1-level deep version federations use today, then a leaf entity may not know which federations it belongs to, so it will not know which policies will be applied to its metadata.

This, together with Andreas' leading point (a leaf entity MUST, regardless of whether it's an RP or an OP, have an identity that is independent of who it's going to talk to and which federations it belongs to), will result in a leaf entity publishing: This is what I can do!

What it doesn't mean is that the metadata used in one context (an RP talking to an OP within the confines of one federation) is absolutely the same as in another context. But what it has meant so far is that an entity's basic view of itself is the same in all contexts.

Whether an entity's metadata lives up to the expectations of a certain federation can be checked by anyone who can collect the trust chain starting with the leaf entity and ending in the trust anchor of the federation.

Such a check will definitely happen at run time when 2 parties are gathering metadata about each other.

I guess the best one can do is have the members in the federation be responsible for the entities they own.

daserzw · 2019-01-16T08:23:46Z

Re:regular expressions, that has proven to be problematic for things like shibmd:Scope in the SAML world due to the proliferation of different regex implementations in different languages/frameworks.

You can define the regex to be of a certain standard; see POSIX BRE, POSIX ERE, PCRE, etc. Pick one to dictate the valid expression, i.e. don't depend on an implementation or programming language.
Note that different standards have different capabilities, and these affect the performance of the implementation.

I totally agree.

a regex policy would probably be a useful addition to the standard set --- it should be applied to strings

why not numbers, too?

Maybe I'm missing your point, but usually regex does not know about arithmetic and/or quantity, right? So basically numbers will be treated as characters in a string.

and list of strings.

why a list of strings and not a list of list of strings? I guess, what we care about are the items of those lists, no matter how nested they are. Even though, I have no use case, should this be limited by the standard?

Probably not, but consider that the less specific we are, the less clean and interoperable implementations will come out. So, for example you can have also a

What happens when the regex (or any rule for that matter) is applied to an incompatible value-type?

IMO this is a very good question. I think there are two main strategies to deal with that:

1. keeping a map of all the claims and matching policies, and then fire an error if you spot a wrong match --- it's really cumbersome...
2. you have basically single-valued and multiple-valued claims:
2.1. subset and add can be applied only to multiple-valued claims.
2.2. one_of, value, default can be applied only to single-valued claims.
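
The second strategy could be sketched roughly like this. The function name `check_operator` is hypothetical, the operator names follow the subset_of/add/one_of/value/default keywords from the earlier examples, and treating any JSON array as "multiple-valued" is an assumption.

```python
MULTI_VALUED_OPS = {"subset_of", "add"}
SINGLE_VALUED_OPS = {"one_of", "value", "default"}


def check_operator(op, claim_value):
    """Raise if the operator is applied to an incompatible kind of claim value."""
    is_multi = isinstance(claim_value, list)
    if op in MULTI_VALUED_OPS:
        if not is_multi:
            raise TypeError(f"{op} can only be applied to multiple-valued claims")
    elif op in SINGLE_VALUED_OPS:
        if is_multi:
            raise TypeError(f"{op} can only be applied to single-valued claims")
    else:
        raise ValueError(f"unknown operator: {op}")


check_operator("subset_of", ["code"])  # compatible, returns silently
try:
    check_operator("one_of", ["ES256", "ES384"])
except TypeError as err:
    print("rejected:", err)
```

This avoids a per-claim map: the check only needs the operator and the shape of the value it is about to be applied to.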

By using regex as @daserzw proposes you can probably check that a contact is in fact an email address. But there is no way to demand that there is a value at all.

an empty value will not match the regex rule; but, I think what you're saying is that the rule will not be checked, if no such claim is in the response.

"subset_of": ["code", "code id_token"]

is id_token code a valid subset? I think it should be.
Sets by definition are not ordered.
A better representation would be a list of sets:

"subset_of": [{"code"}, {"code", "id_token"}]

This allows you to implement this check (using Python's set datatype) as:
rulesets = [{"code"}, {"code", "id_token"}]
value = set("id_token code".split())
any(value.issubset(rs) for rs in rulesets)
appropriate datatypes are there for other langs.

rohe · 2019-01-16T08:34:39Z

a regex policy would probably be a useful addition to the standard set --- it should be applied to strings

why not numbers, too?

Why not :-)

and list of strings.

why a list of strings and not a list of list of strings? I guess, what we care about are the items of those lists, no matter how nested they are. Even though, I have no use case, should this be limited by the standard?

There is a fine line between having a solution that covers the 'known' universe of data types and one that covers everything we can think up. I've by design not considered JSON objects for instance !
I think we should stay with simple data types that we know are in use.

What happens when the regex (or any rule for that matter) is applied to an incompatible value-type?

It MUST fail !

By using regex as @daserzw proposes you can probably check that a contact is in fact an email address. But there is no way to demand that there is a value at all.

an empty value will not match the regex rule; but, I think what you're saying is that the rule will not be checked, if no such claim is in the response.

Correct !

"subset_of": ["code", "code id_token"]

is id_token code a valid subset? I think it should be.

This is something I've always thought problematic with the standard.
It's sort of a set but at the same time not.
https://openid.net/specs/oauth-v2-multiple-response-types-1_0.html
defines the set of response types and they are ordered lists.
In reality though, any decent OAuth2/OIDC library treats them internally as sets.

c00kiemon5ter · 2019-01-16T09:04:17Z

This is something I've always thought problematic with the standard.
It's sort of a set but at the same time not.
https://openid.net/specs/oauth-v2-multiple-response-types-1_0.html
defines the set of response types and they are ordered lists.
In reality though, any decent OAuth2/OIDC library treats them internally as sets.

From the linked document:

Multiple-Valued Response Types

The OAuth 2.0 specification allows for registration of space-separated response_type parameter values. If a Response Type contains one of more space characters (%20), it is compared as a space-delimited list of values in which the order of values does not matter.

I would say it is an unordered list; this differs from a set in the sense that it can contain duplicate entries.

rohe · 2019-01-16T10:23:55Z

Well, what I've always been baffled about is that they actually explicitly registered multiple-valued response types. Why not just say you can combine response types and that on the wire they must be represented as strings with space-separated values, separating the value (a set of values) from the encoding?
But alas no. This has led some implementors to assume that the registered values are the only ones allowed by the standard.

rohe · 2019-01-16T13:48:45Z

I want us to use "subset_of": ["code", "code id_token"]
instead of "subset_of": [{"code"}, {"code", "id_token"}]
since the standard says the values are of the form "code id_token".
We just have to make the comparison function smart enough to understand that for response_types,
"code id_token" is equivalent to "id_token code".
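
One way such a comparison function might look, as a sketch only. The names `normalize` and `response_types_subset` are hypothetical, and collapsing each space-separated value into a frozenset (which also discards duplicate tokens) is a simplifying assumption.

```python
def normalize(response_type):
    """Turn "code id_token" into an order-independent frozenset of its tokens."""
    return frozenset(response_type.split())


def response_types_subset(values, allowed):
    """Keep the values whose token set equals one of the allowed token sets."""
    allowed_sets = {normalize(a) for a in allowed}
    return [v for v in values if normalize(v) in allowed_sets]


policy = {"subset_of": ["code", "code id_token"]}
rp_values = ["id_token code", "code", "token"]
print(response_types_subset(rp_values, policy["subset_of"]))
# → ['id_token code', 'code']
```

The policy stays in the wire format the standard uses, and only the comparison knows that token order is irrelevant. Equality of token sets is used here rather than issubset, so a bare "id_token" would not be accepted just because it is part of "code id_token".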

daserzw · 2019-01-16T13:59:45Z

I think "subset_of": ["code", "code id_token"] can be deserialized as a set of sets, something like:

metadata_policies = {"subset_of": ["code", "code id_token"]}
subset_of_ruleset = Set([Set(rule.split()) for rule in metadata_policies["subset_of"]])
value = set("id_token code".split())
any(value.issubset(rs) for rs in subset_of_ruleset)

c00kiemon5ter · 2019-01-16T14:06:35Z

 metadata_policies = {"subset_of": ["code", "code id_token"]}
-subset_of_ruleset = Set([Set(rule.split()) for rule in metadata_policies["subset_of"]])
+subset_of_ruleset = [set(rule.split()) for rule in metadata_policies["subset_of"]]
 value = set("id_token code".split())
 any(value.issubset(rs) for rs in subset_of_ruleset)

daserzw · 2019-01-16T14:07:33Z

simpler is better ;-)

rohe added the Discussion Needed label Jan 11, 2019

alejandro-perez mentioned this issue Jan 11, 2019

Limiting or expanding arrays and flattening metadata #6
