Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WASM Configuration v2 #35

Closed
eguzki opened this issue Jun 12, 2023 · 5 comments · Fixed by #37
Closed

WASM Configuration v2 #35

eguzki opened this issue Jun 12, 2023 · 5 comments · Fixed by #37
Assignees
Labels

Comments

@eguzki
Copy link
Contributor

eguzki commented Jun 12, 2023

Current configuration struct

Example:

failure_mode_deny: true
rate_limit_policies:
  - name: toystore
    rate_limit_domain: toystore-app
    upstream_cluster: rate-limit-cluster
    hostnames: ["*.toystore.com"]
    gateway_actions:
      - rules:
          - paths: ["/admin/toy"]
            methods: ["GET"]
            hosts: ["pets.toystore.com"]
        configurations:
          - actions:
            - generic_key:
                descriptor_key: admin
                descriptor_value: "1"

Proposal for a new configuration struct to support RLP v2

Motivation
RFC for the RLP v2 brings routeSelectors and when conditions to qualify traffic to be rate limited. Even thought it is technically possible to implement these rules on the Limitador side using Limitador's conditions, the wasm-shim has all the available info to implement those rules for qualifying traffic at the gateway, thus saving rate limiting traffic at the Limitador side. Current wasm-shim rules based on paths, methods and hosts are not enough to support routeSelectors and when conditions.

Example to illustrate the new wasm shim configuration struct

failureMode: deny
rateLimitPolicies:
  - name: rlp-ns-A/rlp-name-A
    domain: rlp-ns-A/rlp-name-A
    service: rate-limit-cluster
    hostnames: ["*.toystore.com"]
    rules:   
      - conditions:      
        - allOf:
          - selector: <value>
            operator: <value>
            value: <value>
          - selector: <value>
            operator: <value>
            value: <value>
        - allOf:
          - selector: <value>
            operator: <value>
            value: <value>
          - selector: <value>
            operator: <value>
            value: <value>
        data:
        - selector:
            selector: <value>       # Any attribute from a well-known specification TBD ([ref](https://www.envoyproxy.io/docs/envoy/latest/intro/arch_overview/advanced/attributes#arch-overview-request-attributes))
            key: <value>             # Optional. If not set it defaults to `selector` field value as the descriptor key.
            default: <value>        # Optional. An optional value to use if metadata_key is empty. If not set and no value is present under the metadata_key then no descriptor is generated.
        - static:   
            key: <value>             # Required
            value: <value>          # Required
@eguzki
Copy link
Contributor Author

eguzki commented Jun 12, 2023

@guicassolato open for discussion!

@guicassolato
Copy link
Contributor

I still see too much of Envoy's RL protocol. Also, a mixture of snake case and camel case.

Suggestion:

failureMode: deny # Default. Alternative (more permissive) value: 'allow'
rateLimitPolicies:
- name: toystore
  domain: toystore-app
  service: rate-limit-cluster
  hostnames: ["*.toystore.com"]
  rules: # Each `rule` object has `conditions` and `data` objects; if any block of conditions matches, the component will build a RL request payload ("descriptors") with `data`
  - conditions:      
    - allOf:
      - selector: <value>
        operator: <value>
        value: <value>
      - selector: <value>
        operator: <value>
        value: <value>
    - allOf:
      - selector: <value>
        operator: <value>
        value: <value>
      - selector: <value>
        operator: <value>
        value: <value>
    data:
    - selectors:   
        selector: <value>  # Any attribute from a well-known specification TBD ([ref](https://www.envoyproxy.io/docs/envoy/latest/intro/arch_overview/advanced/attributes#arch-overview-request-attributes))
        key: <value>       # Optional. If not set it defaults to `selector` field value as the descriptor key.
        default: <value>   # Optional. An optional value to use if metadata_key is empty. If not set and no value is present under the metadata_key then no descriptor is generated.
    - static:              # For RLP controller's usage only - not exposed in the RLP itself
        key: <value>       # Required
        value: <value>     # Required

@eguzki
Copy link
Contributor Author

eguzki commented Jun 12, 2023

LGTM

@guicassolato
Copy link
Contributor

guicassolato commented Jun 26, 2023

I was thinking on a different way to structure the config now that RLPv2's route selectors allow to specify host names.

Today, we map hostnames[] → RLPs. I understand that this was a design decision thought to reflect the way the wasm-shim functions internally when it processes a request in the data plane. Indexing the configs by host name provides a straightforward path from info available in the request, to the set of rules to evaluate. It also mimics how Envoy listeners' and filter chains' configuration is organized, I reckon.

This organization, however, leads to repetition of rules whenever these apply to more than one host name. Additionally, given the current state of kuadrant-operator#204 as of Kuadrant/kuadrant-operator@62a4a43, some incongruous rules may also be generated in the config, effectively mapping host names to rules that are known to never apply to those host names.

Although the later could be fixed by applying stricter filters on what rules (data and conditions) are mapped to each host name, I still believe we here may be mixing how the config has to be structured with the need for an internal hostnames[] → set of rules index for the wasm-shim to function.

For one who reads the wasm config, what we call rateLimitPolicies[].name is actually filled with a host name, despite having another field hostnames at the same level. This is also weird IMO.

With all that, my proposal for a re-org of the wasm config consist of:

  1. rateLimitPolicies is a list of RLPs – not of host names;
  2. RLPs do not repeat within the rateLimitPolicies list – I believe this may simplify reconciliation of the wasm config as well;
  3. the names of the RLPs in the rateLimitPolicies list are filled with the actual names of the RLPs – not with host names;
  4. the rateLimitPolicies[].hostnames field lists the union of all hostnames that the policy applies to;
  5. whatever hostname → RLPs index that is needed by the wasm-shim is internal and built by the wasm-shim

E.g.:

apiVersion: extensions.istio.io/v1alpha1
kind: WasmPlugin
metadata:
  name: kuadrant-istio-ingressgateway
  namespace: istio-system
spec:
  phase: STATS
  pluginConfig:
    failureMode: deny
    rateLimitPolicies:
    - name: rlp-ns-1/rlp-name-1
      domain: rlp-ns-1/rlp-name-1 # not to be confused with hostnames; redundant with `name`, but good for namespacing the rlps in the future (e.g. for multi-tenancy support)
      rules:
      - conditions:
        - allOf:
          - operator: startswith
            selector: request.url_path
            value: /toys
          - operator: eq
            selector: request.method
            value: GET
          - operator: eq
            selector: request.host
            value: '*.toystore.acme.com'
          - operator: neq
            selector: auth.identity.group
            value: admin
        - allOf:
          - operator: startswith
            selector: request.url_path
            value: /toys
          - operator: eq
            selector: request.method
            value: POST
          - operator: eq
            selector: request.host
            value: '*.toystore.acme.com'
          - operator: neq
            selector: auth.identity.group
            value: admin
        data:
        - static:
            key: default/toystore-per-endpoint/toys
            value: "1"
        - selector:
            selector: auth.identity.username
      - conditions:
        - allOf:
          - operator: startswith
            selector: request.url_path
            value: /assets/
          - operator: eq
            selector: request.host
            value: '*.toystore.acme.com'
        - allOf:
          - operator: startswith
            selector: request.url_path
            value: /assets/
          - operator: eq
            selector: request.host
            value: '*.toystore.io'
        data:
        - static:
            key: default/toystore-per-endpoint/assets
            value: "1"
      hostnames:
      - '*.toystore.acme.com'
      - '*.toystore.io'
      service: kuadrant-rate-limiting-service

    - name: rlp-ns-2/rlp-name-2
      domain: rlp-ns-2/rlp-name-2
      rules: […] # some rlp rules that also target '*.toystore.acme.com', but do not target '*.toystore.io'
      hostnames:
      - '*.toystore.acme.com'
      service: kuadrant-rate-limiting-service

  selector:
    matchLabels:
      app: istio-ingressgateway
      istio: ingressgateway
  url: oci://quay.io/kuadrant/wasm-shim:latest

To be fair, this design does not fix the problem of some occasional incongruous rules. For instance, take the 1st rule of the 1st RLP above when hostname == *.toystore.io in a request; the wasm-shim has to evaluate the entire rule to realize it won't generate any data for the RL request related to that rule, but because the 2nd rule applies and it's part of the same RLP, this RLP is nonetheless mapped to the *.toystore.io host name.

On the other hand, the design solves the issue of separating indexing from definition, and therefore the unnecessary repetition of rules.

Happy to hear your thoughts, @eguzki. WDYT?

@guicassolato
Copy link
Contributor

guicassolato commented Jul 5, 2023

@eguzki, I've updated the proposal above after our chat offline today. To sum up,

  • pluginConfig.rateLimitPolicies[].hostnames []string stays and it's filled with the union of all hostnames where the policy applies – even if sometimes some rules of the policy may apply to only some of the hostnames and not all of them
  • No need for pluginConfig.hostnames map[string][]string – the wasm shim will take care of building its index that allows to lookup for a RLP to apply for a given hostname by going through the list of policies and reading their hostnames field.

Additionally,

  • because the RLP controller will no longer group gateway actions per hostname, if the wasm shim spots a RLP listing a hostname already taken by another (previous) RLP in the config, then it will reject the second RLP in association with that hostname.

This makes the structure of the wasm shim configuration, regarding the relation between policies and hostnames, and the behaviour before cases of multiple policies trying to be linked to a same hostname consistent with how Authorino works.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
No open projects
Status: Done
Development

Successfully merging a pull request may close this issue.

2 participants