Reverting error_df optimization to fix #101 (#104)
Merged
Description
Removes the sequence_nbr optimization logic from writer.py.
Related Issue
#101
Motivation and Context
This reverts an earlier change to write_error_records_final() that caused reported error metrics to be lower than the actual number of errors.
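For context, the kind of undercounting being reverted can be illustrated with a minimal, hypothetical PySpark sketch. Only error_df, sequence_nbr, and write_error_records_final() come from this PR; the record_id and rule columns and the row_number-based dedup are assumptions for illustration, not the project's actual code:

```python
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.window import Window

spark = SparkSession.builder.getOrCreate()

# Hypothetical error_df: one row per (record, failed rule).
error_df = spark.createDataFrame(
    [(1, "rule_a"), (1, "rule_b"), (2, "rule_a")],
    ["record_id", "rule"],
)

# Optimized path (now reverted): keep only the first error row per record
# via a sequence_nbr, then aggregate. Records that fail multiple rules
# contribute only one row, so per-rule counts come out low.
w = Window.partitionBy("record_id").orderBy("rule")
deduped = (
    error_df.withColumn("sequence_nbr", F.row_number().over(w))
    .filter(F.col("sequence_nbr") == 1)
)
deduped.groupBy("rule").count().show()   # rule_b disappears entirely

# Reverted behavior: aggregate over the full error_df.
error_df.groupBy("rule").count().show()  # rule_a: 2, rule_b: 1
```

Aggregating over the full error_df, as the revert does, keeps every failed-rule row and restores the correct per-rule counts.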
How Has This Been Tested?
Tested with Databricks Runtime 14.3 LTS and Spark 3.5.0.
Compared the stats table and error table output against the actual errors for 50 row_dq rules on a 500-million-record dataset; the counts matched.
Performance was no slower than v2.1.0 for this use case.
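A cross-check of the kind described above might look roughly like this; the table names dq_error_table and dq_stats_table and the error_count column are hypothetical and do not appear in this PR:

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

# Hypothetical names: count rows in the error table and compare them
# against the total the stats table reports.
actual_errors = spark.table("dq_error_table").count()
stats_errors = (
    spark.table("dq_stats_table")
    .agg(F.sum("error_count").alias("total"))
    .first()["total"]
)
assert actual_errors == stats_errors, (
    f"stats table reports {stats_errors} errors, expected {actual_errors}"
)
```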
Screenshots (if appropriate):
Types of changes
Checklist: