[FEATURE] Reduce Mandatory arguments from code and use rule run settings from rules table #13

Closed
asingamaneni opened this issue Aug 13, 2023 · 0 comments
@asingamaneni
Collaborator

Is your feature request related to a problem? Please describe.
When we started developing Spark-Expectations around a decorator pattern, we put all the options into the decorator and made some of them mandatory for the user to provide in code. This has grown into a lot of options to maintain in code, which can be avoided by reading the same settings from the rules table.

Below are the options that currently have to be provided:

@se.with_expectations(
    se.reader.get_rules_from_table(
        product_rules_table="dq_spark_local.dq_rules",
        target_table_name="dq_spark_local.customer_order",
        dq_stats_table_name="dq_spark_local.dq_stats",
    ),
    write_to_table=True,
    row_dq=True,
    agg_dq={
        user_config.se_agg_dq: True,
        user_config.se_source_agg_dq: True,
        user_config.se_final_agg_dq: True,
    },
    query_dq={
        user_config.se_query_dq: True,
        user_config.se_source_query_dq: True,
        user_config.se_final_query_dq: True,
        user_config.se_target_table_view: "order",
    },
    spark_conf=global_spark_Conf,
)

Describe the solution you'd like
We could remove most of the mandatory arguments from the decorator and source them from the rules table. The options can still exist for engineers to use in debugging, feature development, and testing, but the rules table would take first preference.

@se.with_expectations(
    se.reader.get_rules_from_table(
        product_rules_table="dq_spark_local.dq_rules",
        target_table_name="dq_spark_local.customer_order",
        dq_stats_table_name="dq_spark_local.dq_stats",
    ),
    write_to_table=True,
    write_to_temp_table=True,
    spark_conf=global_spark_Conf
)

The row_dq, agg_dq, and query_dq arguments removed from the decorator can instead be determined from the rules table columns rule_type, is_active, enable_for_source_dq_validation, and enable_for_target_dq_validation.
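
A minimal sketch of how such flags might be derived, assuming the active rules have already been read from the rules table into plain dictionaries keyed by the column names above, and that is_active and the two enable_* columns hold booleans. The helper name `derive_run_settings`, the returned keys, and the sample rows are hypothetical and not part of the Spark-Expectations API.

```python
from typing import Dict, List

Rule = Dict[str, object]

SRC = "enable_for_source_dq_validation"
TGT = "enable_for_target_dq_validation"


def derive_run_settings(rules: List[Rule]) -> Dict[str, bool]:
    """Derive decorator-style flags from rule rows (hypothetical helper)."""
    # Only active rules take part in any decision.
    active = [r for r in rules if r.get("is_active")]

    def any_rule(rule_type: str, column: str = "") -> bool:
        # True if at least one active rule has this type and, when a column
        # is given, also has that column enabled.
        return any(
            r.get("rule_type") == rule_type and (not column or bool(r.get(column)))
            for r in active
        )

    return {
        "row_dq": any_rule("row_dq"),
        "source_agg_dq": any_rule("agg_dq", SRC),
        "final_agg_dq": any_rule("agg_dq", TGT),
        "source_query_dq": any_rule("query_dq", SRC),
        "final_query_dq": any_rule("query_dq", TGT),
    }


# Illustrative rows only; real rows would come from the rules table.
sample_rules = [
    {"rule_type": "row_dq", "is_active": True, SRC: False, TGT: False},
    {"rule_type": "agg_dq", "is_active": True, SRC: True, TGT: False},
    {"rule_type": "query_dq", "is_active": False, SRC: True, TGT: True},
]
settings = derive_run_settings(sample_rules)
print(settings)
```

With these sample rows, only row_dq and source_agg_dq come out enabled, because the query_dq rule is inactive and the agg_dq rule is not enabled for target validation.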

If the user wants to override the configuration provided in the rules table, they should be able to pass the overrides as a single dict keyword argument. This would mainly help engineers who are testing feature branches or new rules.

Example:

# Example 1
override_config = {
    "row_dq": False,
    "agg_dq": False,
    "target_dq": False,
}

# Example 2
override_config = {
    "row_dq": False,
    "agg_dq": False,
    "source_dq": True,
    "target_dq": False,
}
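
One way the override could be applied is a plain dictionary merge in which user-supplied keys win over the values taken from the rules table. This is only a sketch: the function name `apply_overrides`, the `table_settings` dictionary, and the choice to reject unknown keys are assumptions made for illustration, not existing Spark-Expectations behavior.

```python
from typing import Dict, Optional


def apply_overrides(
    table_settings: Dict[str, bool],
    overrides: Optional[Dict[str, bool]] = None,
) -> Dict[str, bool]:
    """Return the settings to run with; override values take precedence."""
    overrides = overrides or {}
    # Reject keys the rules table does not define, so typos fail loudly.
    unknown = set(overrides) - set(table_settings)
    if unknown:
        raise ValueError(f"Unknown override keys: {sorted(unknown)}")
    return {**table_settings, **overrides}


# Hypothetical settings as they might be derived from the rules table.
table_settings = {
    "row_dq": True,
    "agg_dq": True,
    "query_dq": True,
    "source_dq": True,
    "target_dq": True,
}
effective = apply_overrides(
    table_settings,
    {"row_dq": False, "agg_dq": False, "target_dq": False},
)
print(effective)
```

Keys that are not overridden (here query_dq and source_dq) keep the value that came from the rules table.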

Below is a possible implementation strategy:

  • fetch the rules which are active

    • check if at least one rule has source_agg enabled
      • check if at least one rule has agg_dq type
        • run for each rule
      • check if at least one rule has query_dq type
        • run for each rule
    • check if at least one rule has row_dq enabled
      • run row_dq and return row_dq_df
      • check if at least one rule has target_agg enabled
        • check if at least one rule has agg_dq type
          • run for each rule on row_dq_df
        • check if at least one rule has query_dq type
          • run for each rule on row_dq_df
  • send stats by default, irrespective of success or failure
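
The strategy above can be sketched as a small planner that returns the ordered stages to execute instead of running Spark jobs. It assumes rules are plain dictionaries with the columns named in this issue plus a `rule` column holding the rule name. The function `plan_run`, the stage labels, and the sample rows are hypothetical.

```python
from typing import Dict, List

Rule = Dict[str, object]

SRC = "enable_for_source_dq_validation"
TGT = "enable_for_target_dq_validation"


def plan_run(rules: List[Rule]) -> List[str]:
    """Return the ordered stages the strategy above would execute."""
    # Step 1: keep only the active rules.
    active = [r for r in rules if r.get("is_active")]

    def names(rule_type: str, column: str) -> List[str]:
        # Names of active rules of this type that have the column enabled.
        return [
            str(r["rule"])
            for r in active
            if r.get("rule_type") == rule_type and r.get(column)
        ]

    stages: List[str] = []
    # Source-side checks: agg_dq rules first, then query_dq rules.
    for kind in ("agg_dq", "query_dq"):
        stages += [f"source_{kind}:{name}" for name in names(kind, SRC)]
    # Target-side checks run on row_dq_df, so they only follow a row_dq run.
    if any(r.get("rule_type") == "row_dq" for r in active):
        stages.append("row_dq")
        for kind in ("agg_dq", "query_dq"):
            stages += [f"target_{kind}:{name}" for name in names(kind, TGT)]
    # Stats are always sent, whatever happened before.
    stages.append("send_stats")
    return stages


# Illustrative rows only.
demo_rules = [
    {"rule": "sum_sales", "rule_type": "agg_dq", "is_active": True, SRC: True, TGT: True},
    {"rule": "not_null_id", "rule_type": "row_dq", "is_active": True, SRC: False, TGT: False},
    {"rule": "count_check", "rule_type": "query_dq", "is_active": True, SRC: False, TGT: True},
    {"rule": "old_rule", "rule_type": "agg_dq", "is_active": False, SRC: True, TGT: True},
]
plan = plan_run(demo_rules)
print(plan)
```

In a real runner the final step would more likely sit in a `try`/`finally` block, so that stats are still sent when an earlier stage raises.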

Describe alternatives you've considered
NA

Additional context
As the product matures, it is probably time to rethink the strategy and make implementation easier.

@asingamaneni asingamaneni added the enhancement New feature or request label Aug 13, 2023
@asingamaneni asingamaneni added this to the v1.0.0 milestone Sep 8, 2023
asingamaneni pushed a commit that referenced this issue May 29, 2024
* Initializing theme files (#11)

* Adding a basic mantine theme file. It will be updated as the code base develops.

Any colors or gaps will not be hardcoded.

They should first be defined in the theme and then should be used in the code.

* updating .gitignore

* Includes a comprehensive test setup with mock data and support structure. (#12)

setting up tests in the future will be easier.

* 9 tech debt setup public private routing (#13)

* Working version of routing

* Implemented routing; however, tests for the login page still need to be figured out.

* implemented notification from mantine (#14)

* 8 feature implement login functionality using GitHub oauth (#15)

* clean up

* A simple but beautiful login page

* Github Oauth setup

* 16 tech debt complete todos (#17)

* Fixed testing setup and have all tests for OAuth.

* Fixed remaining TODOs, added tests.

* Adding a DEVELOPER.md and .env.example
asingamaneni added a commit that referenced this issue Jul 29, 2024
* Merge pull request #1

* Base setup for react ui

* Merge pull request #2

* Zustand + React Query setup

* Integrating Zustand, RQ and api.github with a basic implementation

* Completed the mantine + dev tooling setup and implemented a layout  (#3)

* Local test, vite, linting and prettier setup.
Switching to mantine instead of material ui

* Updating readme.md with run scripts details

* Adding basic app layout with mantine

* Adding auth provider which shows a modal for users to enter their github token (#4)

* GitHub Integration + Render Repos List And User Card  (#5)

* 1. Segregating header into a new component
2. Updating header with new styles

* Implement a ReposList component with dropdown.
Supports smooth scrolling and autocomplete search
ReposList shows in Header component.

* Implemented a UserButton Component.
UserButton is in Header

* Implemented use repos
Integrated it with ReposList

Implemented useUser
Integrated it with UserButton

* Update Readme.md

* Reverting IDE changes

* 1. Improved token handling
2. Improved api-client
3. Added a Loading component

* Linting fixes

* Failed attempt at wrapping Error and Loading using a HOC. Will investigate this later.

* Updates to Header and UserMenu

* Add a sample NavBar

* updating tests

* Implement Base Structure and GitHub OAuth Login (#97)

* Initializing theme files (#11)

* Adding a basic mantine theme file. It will be updated as the code base develops.

Any colors or gaps will not be hardcoded.

They should first be defined in the theme and then should be used in the code.

* updating .gitignore

* Includes a comprehensive test setup with mock data and support structure. (#12)

setting up tests in the future will be easier.

* 9 tech debt setup public private routing (#13)

* Working version of routing

* Implemented routing; however, tests for the login page still need to be figured out.

* implemented notification from mantine (#14)

* 8 feature implement login functionality using GitHub oauth (#15)

* clean up

* A simple but beautiful login page

* Github Oauth setup

* 16 tech debt complete todos (#17)

* Fixed testing setup and have all tests for OAuth.

* Fixed remaining TODOs, added tests.

* Adding a DEVELOPER.md and .env.example

* Restructuring + layout (#19)

* adding yaml file

* 20 load list of organizations on the left cell (#21)

* WIP

* fetch user name before rendering main page

* implement api for user repos

* show available repos on the left hand side

* Rendering an un nested list of files

* rendering yaml files in each project

* implemented rules table

* editing rules.yaml

* Build a working version (#22)

* working modal

* working flow

* Updated repos list

* Refactored CommitsList.tsx and apis

* minor fixes

* clean up

---------

Co-authored-by: Ashok Singamaneni <ashok.singamaneni@nike.com>