Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dryrun Output for Topic Creation / Topic configuration applying changed values only #220

Closed
schnaker85 opened this issue Mar 3, 2021 · 6 comments
Labels
dry-run enhancement New feature or request

Comments

@schnaker85
Copy link

Is your feature request related to a problem? Please describe.
While managing hundreds of topics in the topics.yml and applying the whole file the actual change of the configuration is lost. We don't know if we changed a single configuration for a topic or all topics. We are lost with the dryrun option as it does not show "what" changed, only if a topic is newly created or updated (but updated is it always).

Describe the solution you'd like
Running with --dryrun and applying updated configuration for existing topics I would like to see what configuration changes for the current state of the cluster. If a configuration entry is set, deleted or updated with the run

e.g. output of an existing topic with changed cleanup.policy and removed retention.ms would look like:

{
  "Operation" : "com.purbon.kafka.topology.actions.topics.SyncTopicAction",
  "Topic" : "project.test-existing-topic.a0",
  "Action" : "update",
  "Changes" : {
    "cleanup.policy" : "[Update] delete -> cleanup",
    "min.insync.replicas" : "[Update] 2 -> 3",
    "retention.ms" : "[Unset] 2592000000 ->  "
  }
}

This goes hand in hand with only updating/applying topics configuration which have changed. Synchronization with the current state and the julie defined topics.yaml configuration is needed to skip configurations thare are the same as before.

The above should be configurable in the properties to enable/disable the behaviour and make it backwards compatible.

Describe alternatives you've considered
Only incremental changes to topics.yml. So every change to the cluster state is a new file thjatwill be freshly applied (aka with a changelog). But this won't work in our case and is cumbersome.

I already implemented this and will provide a feature branch for a PR soon.

@purbon
Copy link
Collaborator

purbon commented Mar 3, 2021

Hi @schnaker85 , what do you mean by:

While managing hundreds of topics in the topics.yml and applying the whole file the actual change of the configuration is lost.

You mention as well:

We don't know if we changed a single configuration for a topic or all topics.

This should be known by calculating the diff of both files, the old and the new one, right?

@schnaker85
Copy link
Author

schnaker85 commented Mar 4, 2021

We have 3 different topics files for 3 different projects and those contain >50 topics. If they are "new/fresh" we know that the dryrun pointed out the action "create" but in the case we do an update on the file and run it again we only kind if see which topic has changed and not what in that topic has changed. As well as that all the topics are shown as Updated even though nothing has changed.

e.g. before the output was this:
first run:

{
  "Operation" : "com.purbon.kafka.topology.actions.topics.SyncTopicAction",
  "Topic" : "project.test-topic.1",
  "Action" : "create",
},
{
  "Operation" : "com.purbon.kafka.topology.actions.topics.SyncTopicAction",
  "Topic" : "project.test-topic.0",
  "Action" : "create",
}

.... 50more output of topics create

now we change for the 44th topic the retention period and rerun it again, the output was

{
  "Operation" : "com.purbon.kafka.topology.actions.topics.SyncTopicAction",
  "Topic" : "project.test-topic.1",
  "Action" : "update",
},
{
  "Operation" : "com.purbon.kafka.topology.actions.topics.SyncTopicAction",
  "Topic" : "project.test-topic.0",
  "Action" : "update",
}

.... 50more output of topics update

and what we would like to see is the following:

{
  "Operation" : "com.purbon.kafka.topology.actions.topics.SyncTopicAction",
  "Topic" : "project.test-topic.44",
  "Action" : "update",
 "Changes" : {
    "retention.ms" : "[Update] 2592000000 -> 60555505000  "
  }
}

and the rest should not be touched (re-applied on the cluster) or outputed as action update as it has not changed. Of course you would see this on the diff of the topic file but what if something got changed manually on the cluster? In our case we would see this.

do you get my point? The diff is not based on the previous/actual state of the topics.yml file but instead of the topic.yml file and the clusterstate (including all configuration)

@purbon purbon added the dry-run label Mar 8, 2021
@purbon
Copy link
Collaborator

purbon commented Mar 8, 2021

related to #102

@purbon
Copy link
Collaborator

purbon commented Mar 8, 2021

Copy here a comment from the PR as is has a more design or global approach.


Hi @schnaker85, I hope this message find you good. I have been thinking a bit about an idea I would love to pass by you here.

Currently, the DryRun is build-out of the output of actions, so in case an action generates an update, as you do here in your PR, you need to return this data from the client to the action and then print it out somehow there.

I was thinking about these topics:

Having a metadata component in the execution plan. This metadata would be the responsibility of having "the view".
Actions, when run in DryRun mode, would then be executed and operate only on this metadata view, to be able to show the diff. Ala what terraform is doing somehow.
However, I am having one interesting question:

Does the dry-run mode should need a connection to the cluster?
From one side, topologies have "the view" and if we include these metadata holders, we can enrich it with all defaults/other configs.

From the other side, people could have changed the config in the cluster using another methodology, the same thing could happen for terraform.

So I guess, I am trying to think about a generic solution for all the actions.

A dry-run process would be:

(Optional, as require connection) update cluster metadata
apply actions to the metadata.
output the result as text.
Does this makes sense to you?

@harshal-shah
Copy link

Another good example of a similar pattern is how helm does it, helm (v3) stores the last applied template as a secret, then does a 3 way merge between old state, current cluster state and target state, maybe a similar approach could help?

Yes, a working connection to the cluster becomes a pre-requisite in such an implementation but its more realistic and gives the real picture.

@purbon
Copy link
Collaborator

purbon commented Dec 10, 2021

This should be fixed with the merge of #383, closing it for now. Feel free to reopen if not completely solved with the previous PR.

@purbon purbon closed this as completed Dec 10, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
dry-run enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

3 participants