Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use Elasticsearch's logging configuration #1371

Open
danielmitterdorfer opened this issue Oct 28, 2021 · 6 comments
Open

Use Elasticsearch's logging configuration #1371

danielmitterdorfer opened this issue Oct 28, 2021 · 6 comments
Labels
:Benchmark Candidate Management Anything affecting how Rally sets up Elasticsearch enhancement Improves the status quo
Milestone

Comments

@danielmitterdorfer
Copy link
Member

danielmitterdorfer commented Oct 28, 2021

Historically Rally needed to modify Elasticsearch's logging configuration, e.g. to enable verbose logging for the IndexingMemoryController. We've removed the surrounding infrastructure in Rally a while ago. Therefore, Rally can just use the logging configuration that is shipped with Elasticsearch.

Implementation notes

Before applying the configuration in rally-teams, Rally wipes the Elaticsearch's configuration directory:

self.es_installer.delete_pre_bundled_configuration()

Then, it applies the configuration; it writes files in append mode (this allows mixins to contribute to the configuration):

for name in files:
source_file = os.path.join(root, name)
target_file = os.path.join(absolute_target_root, name)
if plain_text(source_file):
logger.info("Reading config template file [%s] and writing to [%s].", source_file, target_file)
# automatically merge config snippets from plugins (e.g. if they want to add config to elasticsearch.yml)
with open(target_file, mode="a", encoding="utf-8") as f:
f.write(_render_template(env, config_vars, source_file))
else:
logger.info("Treating [%s] as binary and copying as is to [%s].", source_file, target_file)
shutil.copy(source_file, target_file)

A possible solution is to delete all but the log4j2.properties file and then apply the config as is. We should not hardcode that name though but instead introduce a "allowlist" concept (files to keep from the original config). The allowlist would be defined in rally-teams and evaluated by Rally. For backwards-compatibility we should also explicitly skip copying any files on the allowlist.

We also have a dedicated provisioner for the Docker image which writes config files to a directory on the host and then provides each file via a mount:

for name in files:
source_file = os.path.join(root, name)
target_file = os.path.join(absolute_target_root, name)
mounts[target_file] = os.path.join("/usr/share/elasticsearch", relative_root, name)
if plain_text(source_file):
self.logger.info("Reading config template file [%s] and writing to [%s].", source_file, target_file)
with open(target_file, mode="a", encoding="utf-8") as f:
f.write(_render_template(env, self.config_vars, source_file))
else:
self.logger.info("Treating [%s] as binary and copying as is to [%s].", source_file, target_file)
shutil.copy(source_file, target_file)

Here, it is sufficient to only skip copying any file on the allowlist.

@danielmitterdorfer danielmitterdorfer added enhancement Improves the status quo :Benchmark Candidate Management Anything affecting how Rally sets up Elasticsearch labels Oct 28, 2021
@danielmitterdorfer danielmitterdorfer added this to the 2.x milestone Oct 28, 2021
@pquentin
Copy link
Member

Thanks! This make a lot of sense.

One question: would it make more sense to use a denylist instead? It would be more verbose today, but it would match with the list we have in rally-teams, which seems more explicit. And it won't break if Elasticsearch adds a new configuration file (even though that's unlikely).

@danielmitterdorfer
Copy link
Member Author

danielmitterdorfer commented Oct 28, 2021

I'd be fine either way as long as the file name is not hardcoded in Rally. I'd treat my summary above as a proposal / hint rather than a prescription how it must be done. :)

it won't break if Elasticsearch adds a new configuration file (even though that's unlikely).

In fact, I would prefer if benchmarks break on a new configuration file because it explicitly alerts us of a change that might influence benchmarks so we need to make a concious decision how to handle it.

@pquentin
Copy link
Member

A new configuration file would be in the release notes, hopefully? However, elastic/elasticsearch#61474 was only mentioned in passing in https://www.elastic.co/guide/en/elasticsearch/reference/7.10/release-notes-7.10.0.html. Nothing in the release notes warns users about updating their log4j configuration.

I'm wondering because the natural evolution of my suggestion is to just overwrite the files that are in rally-teams instead of having to maintain a list at all. Would that make sense?

@danielmitterdorfer
Copy link
Member Author

A new configuration file would be in the release notes, hopefully?

Even if that would be the case, we benchmark against master so by the time we see the release notes it's likely already too late?

[...] natural evolution of my suggestion is to just overwrite the files that are in rally-teams instead

I like the direction. We'd rather not break our users without any chance for them to upgrade so for a grace period I still think that we need to keep log4j2.properties in rally-teams and treat it specially. I am also not certain whether I fully understand how your proposal would work with the Docker image?

@pquentin
Copy link
Member

I am also not certain whether I fully understand how your proposal would work with the Docker image?

I don't know what makes Docker different here, sorry.

@danielmitterdorfer
Copy link
Member Author

I don't know what makes Docker different here, sorry.

The config files are already embedded in the Docker image. I think it's sufficient that we just don't provide a file outside the container and mount it. So it's just the semantics of "overwriting" files in rally-teams are different but now that I think of it more I don't think that's a problem.

Nevertheless, we still have the issue with backwards-compatibility, both for Docker and the tar.gz distribution. So I still think a deny / allowlist concept is necessary, at least for a transition period.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Benchmark Candidate Management Anything affecting how Rally sets up Elasticsearch enhancement Improves the status quo
Projects
None yet
Development

No branches or pull requests

2 participants