Common SQLCheckOperators Various Functionality Update #25164

Merged

Commits on Jul 18, 2022

  1. Add batching to SQL Check Operators

    Commit adds a WHERE clause to the sql statement that allows for
    arbitrary batching in a given table.
    denimalpaca committed Jul 18, 2022
    dc01866
  2. Fix bug with multiple table checks

    When multiple table checks are given to the SQLTableCheckOperator
    and at least one is not a fully aggregate statement, a GROUP BY
    clause was previously needed. This commit updates the operator to
    use the get_pandas_df() method instead of _get_first() to return a
    pandas dataframe object that contains the check names and check
    results from the new style of query. The new style of query uses
    UNION ALL to run each test as its own SELECT statement, bypassing
    the need to do a GROUP BY.
    denimalpaca committed Jul 18, 2022
    7c25227
  3. Update test failure logic

    Changed name of method from _get_failed_tests to _get_failed_checks
    to better match naming, and updated logic of the method to include
    an optional column param. The query in the column check operator
    is removed from the failed test exception message, as it was only
    ever showing the last query, instead of the relevant one(s). This is
    replaced by the column, which will be more useful in debugging.
    denimalpaca committed Jul 18, 2022
    2a3df61
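
The batching and UNION ALL commits above can be illustrated with a minimal sketch. It is not the operator's actual code: the helper names `build_table_checks_query` and `run_checks`, the output column names, and the `check_table` subquery alias are all assumptions made for this example, and it runs against an in-memory SQLite database rather than through `get_pandas_df()`. The idea is that each check becomes its own SELECT, so aggregate and row-level checks can be mixed without a GROUP BY, and an optional WHERE clause restricts every check to one batch of the table.

```python
import sqlite3
from typing import Dict, List, Optional, Tuple


def build_table_checks_query(
    table: str, checks: Dict[str, str], where: Optional[str] = None
) -> str:
    """Return one SELECT per check, combined with UNION ALL (hypothetical helper).

    Values are interpolated directly into the SQL text, so this sketch is
    only suitable for trusted input.
    """
    where_sql = f" WHERE {where}" if where else ""
    selects = []
    for name, statement in checks.items():
        # The inner query scores the check as 1 or 0. The outer MIN collapses
        # row-level results into a single pass/fail row for this check.
        selects.append(
            f"SELECT '{name}' AS check_name, MIN(check_result) AS check_result "
            f"FROM (SELECT CASE WHEN {statement} THEN 1 ELSE 0 END AS check_result "
            f"FROM {table}{where_sql}) AS check_table"
        )
    return " UNION ALL ".join(selects)


def run_checks(
    conn: sqlite3.Connection,
    table: str,
    checks: Dict[str, str],
    where: Optional[str] = None,
) -> List[Tuple[str, int]]:
    """Execute the combined query and return (check_name, check_result) rows."""
    return conn.execute(build_table_checks_query(table, checks, where)).fetchall()
```

Calling `run_checks(conn, "my_table", {"row_count": "COUNT(*) = 3", "sum_ok": "a + b = c"}, where="part = 'x'")` would evaluate both checks only over rows where `part = 'x'`; the table and column names here are placeholders.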

Commits on Jul 21, 2022

  1. Add table alias to SQLTableCheckOperator query

    Without a table alias, the query does not run on Postgres and
    other databases. The alias is arbitrary and used only for
    proper query execution.
    denimalpaca committed Jul 21, 2022
    66922f0
  2. 554e8ba
  3. Add batching to SQL Check Operators

    Commit adds a WHERE clause to the sql statement that allows for
    arbitrary batching in a given table.
    denimalpaca committed Jul 21, 2022
    cf90083
  4. Fix bug with multiple table checks

    When multiple table checks are given to the SQLTableCheckOperator
    and at least one is not a fully aggregate statement, a GROUP BY
    clause was previously needed. This commit updates the operator to
    use the get_pandas_df() method instead of _get_first() to return a
    pandas dataframe object that contains the check names and check
    results from the new style of query. The new style of query uses
    UNION ALL to run each test as its own SELECT statement, bypassing
    the need to do a GROUP BY.
    denimalpaca committed Jul 21, 2022
    7c20bf6
  5. Update test failure logic

    Changed name of method from _get_failed_tests to _get_failed_checks
    to better match naming, and updated logic of the method to include
    an optional column param. The query in the column check operator
    is removed from the failed test exception message, as it was only
    ever showing the last query, instead of the relevant one(s). This is
    replaced by the column, which will be more useful in debugging.
    denimalpaca committed Jul 21, 2022
    d364e96
  6. Add table alias to SQLTableCheckOperator query

    Without a table alias, the query does not run on Postgres and
    other databases. The alias is arbitrary and used only for
    proper query execution.
    denimalpaca committed Jul 21, 2022
    3c300e7
  7. 5645b5d
  8. Merge branch 'sql_check_operators_various_functionality_update' of github.com:denimalpaca/airflow into sql_check_operators_various_functionality_update

    denimalpaca committed Jul 21, 2022
    98a28c3
  9. Move alias to proper query build statement

    The table alias should be in the self.sql query build statement
    as that is where the table it needs to alias is defined.
    denimalpaca committed Jul 21, 2022
    ee697e3
  10. Add batching to SQL Check Operators

    Commit adds a WHERE clause to the sql statement that allows for
    arbitrary batching in a given table.
    denimalpaca committed Jul 21, 2022
    1754fdc
  11. Fix bug with multiple table checks

    When multiple table checks are given to the SQLTableCheckOperator
    and at least one is not a fully aggregate statement, a GROUP BY
    clause was previously needed. This commit updates the operator to
    use the get_pandas_df() method instead of _get_first() to return a
    pandas dataframe object that contains the check names and check
    results from the new style of query. The new style of query uses
    UNION ALL to run each test as its own SELECT statement, bypassing
    the need to do a GROUP BY.
    denimalpaca committed Jul 21, 2022
    31d0e0e
  12. Update test failure logic

    Changed name of method from _get_failed_tests to _get_failed_checks
    to better match naming, and updated logic of the method to include
    an optional column param. The query in the column check operator
    is removed from the failed test exception message, as it was only
    ever showing the last query, instead of the relevant one(s). This is
    replaced by the column, which will be more useful in debugging.
    denimalpaca committed Jul 21, 2022
    bc4140e
  13. Add table alias to SQLTableCheckOperator query

    Without a table alias, the query does not run on Postgres and
    other databases. The alias is arbitrary and used only for
    proper query execution.
    denimalpaca committed Jul 21, 2022
    24ce964
  14. 987f6c2
  15. Merge branch 'sql_check_operators_various_functionality_update' of github.com:denimalpaca/airflow into sql_check_operators_various_functionality_update

    denimalpaca committed Jul 21, 2022
    a27fc2c
  16. Bug fixes and updates to test and operator

    Fixed bug in test where the dataframe column names did not match
    the operator's expected dataframe column names. Added more info
    to the SQLColumnCheckOperator's batch arg. Fixed the location of
    table aliasing in SQLTableCheckOperator.
    denimalpaca committed Jul 21, 2022
    404eef5
  17. 01bfb2f
  18. Rename parameter batch to partition_clause

    Gives a clearer name to the parameter and adds templating to
    the SQLTableCheckOperator.
    denimalpaca committed Jul 21, 2022
    75d59d1
  19. Fix typo in docstring

    denimalpaca committed Jul 21, 2022
    0388ac4
  20. Reformat operator file

    denimalpaca committed Jul 21, 2022
    719a830
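
The "Update test failure logic" commits describe collecting failed checks with an optional column parameter and reporting the column instead of the query. A rough sketch of that idea follows. It is an illustration only, not the operator's implementation: the public names `get_failed_checks` and `raise_if_failed`, the `"success"` flag on each check, the message layout, and the use of `ValueError` in place of Airflow's own exception type are all assumptions.

```python
from typing import Any, Dict, List, Optional


def get_failed_checks(
    checks: Dict[str, Dict[str, Any]], col: Optional[str] = None
) -> List[str]:
    """Describe every check whose "success" flag is missing or falsy.

    When a column name is given, it is prepended to each description so the
    failure message points at the relevant column.
    """
    prefix = f"Column: {col}\n" if col else ""
    return [
        f"{prefix}Check: {name},\nCheck Values: {values}\n"
        for name, values in checks.items()
        if not values.get("success")
    ]


def raise_if_failed(column_mapping: Dict[str, Dict[str, Dict[str, Any]]]) -> None:
    """Gather failures across all columns and raise a single error if any exist."""
    failed: List[str] = []
    for col, checks in column_mapping.items():
        failed.extend(get_failed_checks(checks, col))
    if failed:
        # ValueError stands in for the operator's exception type in this sketch.
        raise ValueError("Test failed.\nFailed checks:\n" + "".join(failed))
```

Reporting the column with each failed check keeps the message tied to the checks that actually failed, which is the motivation the commit message gives for dropping the query from the exception text.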