update docs to reflect change to /conf file structure (#2913)
* update docs to reflect change to /conf file structure

Signed-off-by: Dmitry Sorokin <dmd40in@gmail.com>

* update docs, folder structure diagrams

Signed-off-by: Dmitry Sorokin <dmd40in@gmail.com>

---------

Signed-off-by: Dmitry Sorokin <dmd40in@gmail.com>
DimedS authored Aug 9, 2023
1 parent 9b3ab30 commit c833c24
Showing 6 changed files with 13 additions and 15 deletions.
4 changes: 2 additions & 2 deletions docs/source/data/data_catalog.md
@@ -359,10 +359,10 @@ The list of all available parameters is given in the [Paramiko documentation](ht

You can use the [`kedro catalog create` command to create a Data Catalog YAML configuration](../development/commands_reference.md#create-a-data-catalog-yaml-configuration-file).

-This creates a `<conf_root>/<env>/catalog/<pipeline_name>.yml` configuration file with `MemoryDataSet` datasets for each dataset in a registered pipeline if it is missing from the `DataCatalog`.
+This creates a `<conf_root>/<env>/catalog_<pipeline_name>.yml` configuration file with `MemoryDataSet` datasets for each dataset in a registered pipeline if it is missing from the `DataCatalog`.

```yaml
-# <conf_root>/<env>/catalog/<pipeline_name>.yml
+# <conf_root>/<env>/catalog_<pipeline_name>.yml
rockets:
type: MemoryDataSet
scooters:
4 changes: 2 additions & 2 deletions docs/source/development/commands_reference.md
@@ -376,7 +376,7 @@ kedro micropkg pull <link-to-micro-package-sdist-file>
The above command will take the bundled `.tar.gz` file and do the following:

* Place source code in `src/<package_name>/pipelines/<pipeline_name>`
-* Place parameters in `conf/base/parameters/<pipeline_name>.yml`
+* Place parameters in `conf/base/parameters_<pipeline_name>.yml`
* Pull out tests and place in `src/tests/pipelines/<pipeline_name>`
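
As a rough sketch, assuming a micro-package named `data_processing` (the name is purely illustrative), the pulled files land roughly as follows:

```text
├── conf
│   └── base
│       └── parameters_data_processing.yml   <-- Pulled parameters
└── src
    ├── <package_name>
    │   └── pipelines
    │       └── data_processing               <-- Pulled source code
    └── tests
        └── pipelines
            └── data_processing               <-- Pulled tests
```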

`kedro micropkg pull` works with PyPI, local and cloud storage:
@@ -512,7 +512,7 @@ kedro catalog create --pipeline=<pipeline_name>

The command also accepts an optional `--env` argument that allows you to specify a configuration environment (defaults to `base`).

-The command creates the following file: `<conf_root>/<env>/catalog/<pipeline_name>.yml`
+The command creates the following file: `<conf_root>/<env>/catalog_<pipeline_name>.yml`
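
For illustration, assuming a pipeline named `data_processing` (the pipeline and dataset names below are hypothetical), the generated file might look like this:

```yaml
# conf/base/catalog_data_processing.yml
# One MemoryDataSet entry is added for each dataset used by the pipeline
# but missing from the DataCatalog.
preprocessed_companies:
  type: MemoryDataSet
model_input_table:
  type: MemoryDataSet
```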

#### Notebooks

7 changes: 3 additions & 4 deletions docs/source/nodes_and_pipelines/micro_packaging.md
@@ -19,8 +19,7 @@ When you package your micro-package, such as a modular pipeline for example, Ked
```text
├── conf
│   └── base
-│       └── parameters
-│           └── {{pipeline_name*}} <-- All parameter file(s)
+│       └── parameters_{{pipeline_name*}} <-- All parameter file(s)
└── src
    ├── my_project
    │   ├── __init__.py
@@ -35,7 +34,7 @@ When you package your micro-package, such as a modular pipeline for example, Ked
Kedro will also include any requirements found in `src/<package_name>/pipelines/<micropkg_name>/requirements.txt` in the micro-package tar file. These requirements will later be taken into account when pulling a micro-package via `kedro micropkg pull`.

```{note}
-Kedro will not package the catalog config files even if those are present in `conf/<env>/catalog/<micropkg_name>.yml`.
+Kedro will not package the catalog config files even if those are present in `conf/<env>/catalog_<micropkg_name>.yml`.
```

If you plan to publish your packaged micro-package to some Python package repository like [PyPI](https://pypi.org/), you need to make sure that your micro-package name doesn't clash with any of the existing packages in that repository. However, there is no need to rename any of your source files if that is the case. Simply alias your package with a new name by running `kedro micropkg package --alias <new_package_name> <micropkg_name>`.
@@ -71,7 +70,7 @@ You can pull a micro-package from a tar file by executing `kedro micropkg pull <
* The `<package_name>` must either be a package name on PyPI or a path to the source distribution file.
* Kedro will unpack the tar file, and install the files in following locations in your Kedro project:
* All the micro-package code in `src/<package_name>/<micropkg_name>/`
-* Configuration files in `conf/<env>/parameters/<micropkg_name>.yml`, where `<env>` defaults to `base`.
+* Configuration files in `conf/<env>/parameters_<micropkg_name>.yml`, where `<env>` defaults to `base`.
* To place parameters from a different config environment, run `kedro micropkg pull <micropkg_name> --env <env_name>`
* Unit tests in `src/tests/<micropkg_name>`
* Kedro will also parse any requirements packaged with the micro-package and add them to project level `requirements.in`.
3 changes: 1 addition & 2 deletions docs/source/nodes_and_pipelines/modular_pipelines.md
@@ -52,8 +52,7 @@ Running the `kedro pipeline create` command adds boilerplate folders and files f
```text
├── conf
│   └── base
-│       └── parameters
-│           └── {{pipeline_name}}.yml <-- Pipeline-specific parameters
+│       └── parameters_{{pipeline_name}}.yml <-- Pipeline-specific parameters
└── src
    ├── my_project
    │   ├── __init__.py
8 changes: 4 additions & 4 deletions docs/source/tutorial/add_another_pipeline.md
@@ -17,7 +17,7 @@ The data science pipeline is made up of the following:
* Two python files within `src/spaceflights/pipelines/data_science`
* `nodes.py` (for the node functions that form the data processing)
* `pipeline.py` (to build the pipeline)
-* A yaml file: `conf/base/parameters/data_science.yml` to define the parameters used when running the pipeline
+* A yaml file: `conf/base/parameters_data_science.yml` to define the parameters used when running the pipeline
* `__init__.py` files in the required folders to ensure that Python can import the pipeline
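
As a minimal sketch, such a parameters file typically holds the model options used by the nodes; the values and feature names below are illustrative:

```yaml
# conf/base/parameters_data_science.yml (illustrative values)
model_options:
  test_size: 0.2        # fraction of data held out for testing
  random_state: 3       # seed for reproducible splits
  features:
    - engines
    - passenger_capacity
    - crew
```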


@@ -42,7 +42,7 @@ def split_data(data: pd.DataFrame, parameters: Dict) -> Tuple:
    Args:
        data: Data containing features and target.
-        parameters: Parameters defined in parameters/data_science.yml.
+        parameters: Parameters defined in parameters_data_science.yml.
    Returns:
        Split data.
    """
@@ -89,7 +89,7 @@ def evaluate_model(

## Input parameter configuration

-Parameters that are used by the `DataCatalog` when the pipeline executes are stored in `conf/base/parameters/data_science.yml`:
+Parameters that are used by the `DataCatalog` when the pipeline executes are stored in `conf/base/parameters_data_science.yml`:

<details>
<summary><b>Click to expand</b></summary>
@@ -276,7 +276,7 @@ candidate_modelling_pipeline.regressor:
```
</details><br/>

-2. Update the parameters file for the data science pipeline in `conf/base/parameters/data_science.yml` to replace the existing contents for `model_options` with the following for the two instances of the template pipeline:
+2. Update the parameters file for the data science pipeline in `conf/base/parameters_data_science.yml` to replace the existing contents for `model_options` with the following for the two instances of the template pipeline:

<details>
<summary><b>Click to expand</b></summary>
2 changes: 1 addition & 1 deletion docs/source/tutorial/create_a_pipeline.md
@@ -14,7 +14,7 @@ The data processing pipeline prepares the data for model building by combining t
* Two python files within `src/spaceflights/pipelines/data_processing`
* `nodes.py` (for the node functions that form the data processing)
* `pipeline.py` (to build the pipeline)
-* A yaml file: `conf/base/parameters/data_processing.yml` to define the parameters used when running the pipeline
+* A yaml file: `conf/base/parameters_data_processing.yml` to define the parameters used when running the pipeline
* `__init__.py` files in the required folders to ensure that Python can import the pipeline

```{note}
