Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FLINK-26570][statefun] Remote module configuration interpolation #309

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

FilKarnicki
Copy link
Contributor

What is the purpose of the change

The goal of this PR is to add the ability to define ${placeholders} in the module.yaml's spec object, e.g.:

kind: com.foo.bar/test
spec:
  something: ${REPLACE_ME}
  transport:
    password: ${REPLACE_ME_WITH_A_SECRET}
    array:
      - ${REPLACE_ME}
      - sthElse 

These placeholders are then replaced using

  1. System properties
  2. Environment variables
  3. flink-conf.yaml entries with prefix 'statefun.module.global-config.'
  4. Command line args

where (4) override (3) which override (2) which override (1).

Main changes are

  • RemoteModule resolves placeholders by traversing through the spec json node
  • ComponentJsonObject has a new constructor that allows the use of resolved spec json node (note: rawObjectNode still contains unresolved ${placeholders}, but does not seem to be used anywhere)

Verifying this change

  • Added tests for resolving placeholders, and a test for a value that is not found in the combined key-value config map, where an IllegalArgumentException is thrown.
  • Updated Java, Go, and JS e2e tests to use placeholders

Dependencies (does it add or upgrade a dependency): no
The public API, i.e., is any changed class annotated with @public(Evolving): N/A
The serializers: no
The runtime per-record code paths (performance sensitive): no
Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: no
The S3 file system connector: no

@FilKarnicki FilKarnicki force-pushed the FLINK-26570_remote_module_configuration_interpolation_final branch from 3d3ff6f to 4f63228 Compare March 22, 2022 11:48
Copy link
Contributor

@tillrohrmann tillrohrmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for creating this PR @FilKarnicki. I like the flexibility it gives to the user. The main thing I am asking myself is whether it gives users too much power so that they risk shooting themselves into the foot. I don't have a perfect answer for it yet.

Comment on lines +67 to +70
1. System properties
2. Environment variables
3. flink-conf.yaml entries with prefix 'statefun.module.global-config.'
4. Command line args
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is more common to give env variables precedence over flink-conf.yaml values.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree in principal. That said, because globalConfiguration is already an combination of args and flink-conf.yaml entries with the statefun.module.global-config. prefix, there's no easy way to put env variables in between them without affecting other parts of the system.

I made a start in my fork and the number of changes is pretty high for what we gain. Please have a look and let me know if I Should working on the change you mentioned in this comment

FilKarnicki@02cd6a9

Comment on lines +76 to +92
kind: io.statefun.endpoints.v2/http
spec:
functions: com.example/*
urlPathTemplate: ${FUNC_PROTOCOL}://${FUNC_DNS}/{function.name}
---
kind: io.statefun.kafka.v1/ingress
spec:
id: com.example/my-ingress
address: ${KAFKA_ADDRESS}:${KAFKA_PORT}
consumerGroupId: my-consumer-group
topics:
- topic: ${KAFKA_INGRESS_TOPIC}
(...)
properties:
- ssl.truststore.location: ${SSL_TRUSTSTORE_LOCATION}
- ssl.truststore.password: ${SSL_TRUSTSTORE_PASSWORD}
(...)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really like the flexibility the templating mechanism gives to the user. The thing I am asking myself is whether it gives too much power so that the user can shoot himself into his foot.

The danger I see is that we add level of indirections that make it harder to reason about what the effective yaml will look like for a user. The underlying problem the issue wants to solve is to pass in information that is only available to the process that runs SF but not the user (e.g. secrets). I am wondering whether there is an alternative to achieve the same but with a bit less power (e.g. allowing substitution only for selected fields (also confusing)). I don't have a perfect answer here. I mainly wanted to hear your and @igalshilman's opinion here.

Maybe one idea could be to log the effective/resolved yaml somewhere so that the user sees what is actually run, if this does not happen already.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the idea of logging the effective yaml somewhere. We should probably make it an opt-in kind of a deal, since we don't want to be automatically logging secrets. I'll hold off on coding this until we hear from @igalshilman

@FilKarnicki
Copy link
Contributor Author

I'm beginning to think that we can achieve the same with a 'jar -uf' at deploy time. This means that this PR can be closed, do you agree @igalshilman / @tillrohrmann ?

@nicklester
Copy link

I still see value in this PR. When deploying into a managed environment such as Kinesis Data Analytics on AWS it is not possible to use something like 'jar -uf' to modify the jar at deploy time. We see value in being able to pass a hostname for API Gateway as an environment variable, for instance, to simplify multi environment deployment in an AWS CDK stack.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants