Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consider removing support for project-side logging.yml #2281

Closed
Tracked by #2205
antonymilne opened this issue Feb 6, 2023 · 11 comments
Closed
Tracked by #2205

Consider removing support for project-side logging.yml #2281

antonymilne opened this issue Feb 6, 2023 · 11 comments
Labels
Stage: Technical Design 🎨 Ticket needs to undergo technical design before implementation Stage: User Research 🔬 Ticket needs to undergo user research before implementation

Comments

@antonymilne
Copy link
Contributor

antonymilne commented Feb 6, 2023

The project-side logging.yml configuration file has been optional since 0.18.2. Following #2206, a user will be able to specify a logging.yml configuration that takes effect from the beginning of the kedro process (i.e. not when the session is created). This leaves the question of whether there's any point still looking at this file at all.

Pros of removing

  • Identical functionality can be achieved by using KEDRO_LOGGING_CONFIG environment variable
  • Would simplify our code and make our currently quite convoluted setup of logging configuration much nicer: no need to set up framework-side logging configuration that is overriden by project-side any more - instead logging would just be configured once

Cons of removing

  • You could no longer put different things in logging.yml for different kedro run environments (but I doubt anyone does this)
  • Slight technical complication pointed out in next comment

Pro or con depending on your opinion

@antonymilne antonymilne added the Stage: User Research 🔬 Ticket needs to undergo user research before implementation label Feb 6, 2023
@antonymilne antonymilne changed the title Remove project-side logging.yml? Consider removing support for project-side logging.yml Feb 6, 2023
@antonymilne antonymilne added the Stage: Technical Design 🎨 Ticket needs to undergo technical design before implementation label Feb 6, 2023
@antonymilne
Copy link
Contributor Author

Note that if this does get removed there's a subtle piece of functionality that exists in project-side logging.yml that's not easily reproduced in the framework-side one: the project-side logging.yml has something like this:

loggers:
  kedro:
    level: INFO

  iris:
    level: INFO

It's not possible to put this in the framework-side logging.yml because you don't know what the project name is there. We can achieve this by adding it in the Python code in framework/project/__init__.py though by using PACKAGE_NAME (defined in configure_project), something like this:

logging_config["loggers"].append(PACKAGE_NAME: {"level": "INFO"})

@antonymilne
Copy link
Contributor Author

Notes from tech design discussion on 8 February

  • We can introduce a .env file which contains KEDRO_LOGGING_CONFIG and use something like python-dotenv to load it up. This would just be a single file in the project root, not per-kedro run environment. @idanov said there's other use cases for this too?
  • Removing logging.yml file is consistent with general simplification of project template and move to docs. This is a particularly good candidate because the format of the file is quite weird (it's Python logging config, not kedro config)
  • No one seemed to think that it was essential to have per-kedro run environment logging config, but still room for discussion about whether file-based logging should be opt-in or opt-out and how a user should enable/disable it

More research needed...

@amandakys proposed doing more research. Relevant questions to try and answer would be:

  1. Can we move logging.yml to conf (not in base)? i.e. file-based logging enabled by default and it would not depend on run environment
  2. What is the file log used for? (N.B. error.log appears completely useless)
  3. Can we remove the file altogether? i.e. file-based logging disabled by default
  4. Some more advanced starter that has the file but more basic ones don't?
  5. What about users who don't even know that file-based logging is possible?

@sbrugman
Copy link
Contributor

A practical example as user input: we use global logging configuration for adding a handler for mlflow, and to be able to enable/disable.

@antonymilne
Copy link
Contributor Author

@sbrugman thanks for this. Can you give an example of what your logging configuration looks like in this case and how you enable/disable the mlflow handling?

@sbrugman
Copy link
Contributor

It's similar to the rich logger here. We have a custom MlflowHandler that starts the experiment and passes a subset of the logging messages to mlflow.

@noklam
Copy link
Contributor

noklam commented Feb 15, 2023

I do think there are some merit to keep the logging.yml.

As mentioned you can't really configure this log level easily once it is moved into framework. IMO it also makes thing looks more like magic.

loggers:
  kedro:
    level: INFO

  iris:
    level: INFO

In general, I think Python Logging is not a well-understood topic and I still like the idea that it's is setup already in a project and I can easily configure or find out what's the default settings (similar to how we comment out the default options in settings.py. Advance users may go with more sophisticated custom logger implementation.

@noklam
Copy link
Contributor

noklam commented Feb 15, 2023

I have probably missed out on the previous discussion about KEDRO_LOGGING_CONFIG and .env. What is expected to go into the .env file and how much of that will overlap with the new OmegaConfigLoader resolving environment and the TemplatedConfigLoader template variables? i.e. What belongs to globals.yml and .env?

@antonymilne
Copy link
Contributor Author

Sorry for the slow reply @noklam, I only just saw this.

Basically the plan is to do #2206 first, which will introduce the KEDRO_LOGGING_CONFIG environment variable and enable users to directly modify the framework logging setup (e.g. so you could change the logging level to DEBUG from the beginning).

The .env file would just be a way to automatically setup your KEDRO_LOGGING_CONFIG variable so kedro would read that file automatically rather than you needing to do export to set the variable on the shell. It's just a convenience thing and we're not sure whether we actually want to do it. @idanov proposed it but @merelcht and me aren't so sure. Either way, it's a minor convenience and not a core feature really.

This is all unrelated to OmegaConfigLoader and TemplatedConfigLoader or globals.yml. In fact, if @amandakys's user research demonstrates that people don't alter their logging.yml differently for different run environment (very likely I think) then we will be able to completely decouple the logging setup from the config loader which would big simplification and improvement. This way your project logging.yml would just live in conf/logging.yml, not in any run environment, and wouldn't need to use the config loader at all.

@antonymilne
Copy link
Contributor Author

P.S. I definitely see the room for confusion between .env and globals.yml here - it's a very good point. But we're not decided on whether we actually want the .env file. If it does exist then it would be purely to specify KEDRO_LOGGING_CONFIG, nothing to do with run environments. As above we're hoping to completely divorce the logging configuration from the config loader so I think it should be fine.

@antonymilne
Copy link
Contributor Author

antonymilne commented Mar 16, 2023

Conclusions from @amandakys's excellent user research:

  • as expected, very, very few people use different logging.yml files in different run environments
  • a significant number of people do modify their logging.yml for various reasons (changing log formatting, location, suppressing warnings, etc.)
  • consistent with my much less thorough user research from before User research to see if we should remove the file-based logging handlers #1472 a significant number of people are aware of file-based logging

Resulting actions:

  • allowing different logging.yml files in different run environments has minimal value and causes extreme confusion and complication, so we are going to remove the ability to do this: Divorce config loader from logging.yml #2426. (Those who really want to do this will still be able to just by providing different values of KEDRO_CONFIG_LOGGING in their different run environments)
  • logging.yml remains in the project template but moves from conf/base/logging.yml to just conf/logging.yml logging.yml is removed from the project template (see below comment)
  • in order for this file to be used a user needs to set KEDRO_CONFIG_LOGGING, as in Make framework-side logging.yml user-specifiable #2206. If not set, the default framework logging config is used
  • we remove error.log but keep info.log, which moves from logs/info.log to the project root so we can remove the logs folder - Remove logs folder #2429
  • (optional, but probably desired) we make a system so that KEDRO_CONFIG_LOGGING can be set automatically from a .env file or the like - Consider whether we should introduce a .env file to setup logging #2428 see below comment

After these changes, the flow of logging will change from the convoluted setup outlined in #2205 to the following:

  1. LOGGING = _ProjectLogging() is called. This sets up logging according to the file given in optional environment variable KEDRO_CONFIG_LOGGING or, if not specified, the framework default_logging.yml (see Make framework-side logging.yml user-specifiable #2206)
  2. That's it (apart from the stupid complication of the parallel runner, which needs to setup the logging again for each subprocess). Nothing happens with logging configuration when the session is created.

@antonymilne
Copy link
Contributor Author

Modifications to the above plan following technical design discussion on 12 April:

  • the introduction of a new .env file as in Consider whether we should introduce a .env file to setup logging #2428 was generally not popular with the Kedro team due to the fact that it introduces another file to the project template
  • the idea of including logging.yml file in the project template that doesn't get used (and is packaged with conf) was also generally unpopular
  • the number of people who responded to the user research is very low, indicating that in general people probably do not care that much about what we decide to do here
  • hence we will remove logging.yml from the default project template and not pursue Consider whether we should introduce a .env file to setup logging #2428 for now
  • it will still be possible for a user to customise logging.yml and set the KEDRO_LOGGING_CONFIG variable. The only difference is that this file will not be shipped with the default project or starter templates. Instead, instructions on how to create it will live in documentation
  • in future, logging might become a utility as in Introducing Utility Modules to Kedro  #2388

FYI @amandakys

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Stage: Technical Design 🎨 Ticket needs to undergo technical design before implementation Stage: User Research 🔬 Ticket needs to undergo user research before implementation
Projects
Archived in project
Development

No branches or pull requests

3 participants