Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix calc_dataset_view join with label values #2202

Merged
merged 1 commit into from
Dec 12, 2024

Conversation

lampajr
Copy link
Member

@lampajr lampajr commented Dec 5, 2024

Fixes Issue

Fixes #2201

Check List (Check all the applicable boxes)

  • My code follows the code style of this project.
  • My change requires changes to the documentation.
  • I have updated the documentation accordingly.
  • All new and existing tests passed.

@lampajr lampajr requested a review from johnaohara December 5, 2024 09:59
@lampajr lampajr self-assigned this Dec 5, 2024
@lampajr lampajr force-pushed the fix_calc_dataset_view branch from 274d003 to 751f5a3 Compare December 5, 2024 10:36
Comment on lines -639 to -641
// ensure the recalculation is removed, with this approach we should guarantee
// it gets removed even if transaction-level exception occurs, e.g., timeout
Util.registerTxSynchronization(tm, txStatus -> recalculations.remove(testId, status));
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@johnaohara I also had to remove this because the run transformation is executed async making use of mediator.executeBlocking(.. therefore this recalculations.remove was getting called immediately before returning this method but much before the recalculation was completed which is wrong.. and it does not prevent multiple recalculation on the same test to happen concurrently

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should be fairly safe even without that check as we are wrapping the mediator.transform with try-finally that should ensure the finally clause to get called even if transaction-level exceptions are thrown:

                mediator.executeBlocking(() -> {
                    int newDatasets = 0;
                    try {
                        newDatasets = mediator.transform(runId, true);
                    } finally {
                        synchronized (status) {
                            status.finished++;
                            status.datasets += newDatasets;
                            if (status.finished == status.totalRuns) {
                                recalculations.remove(testId, status);
                            }
                        }
                    }
                });

Signed-off-by: Andrea Lamparelli <a.lamparelli95@gmail.com>
@lampajr lampajr force-pushed the fix_calc_dataset_view branch from 751f5a3 to 1d550ce Compare December 5, 2024 17:02
Comment on lines +4674 to +4683
WITH view_agg AS (
SELECT
vc.view_id, vc.id as vcid, array_agg(DISTINCT label.id) as label_ids, jsonb_object_agg(label.name, lv.value) as value FROM dataset_schemas ds
JOIN label ON label.schema_id = ds.schema_id
JOIN viewcomponent vc ON vc.labels ? label.name
JOIN label_values lv ON lv.label_id = label.id AND lv.dataset_id = ds.dataset_id
WHERE ds.dataset_id = datasetId
AND vc.view_id IN (SELECT view.id FROM view JOIN dataset ON view.test_id = dataset.testid WHERE dataset.id = datasetId)
GROUP BY vc.view_id, vcid
)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes are:

  • Added lv.dataset_id = ds.dataset_id in the JOIN condition on label_values
  • Added AND vc.view_id IN (SELECT view.id FROM view JOIN dataset ON view.test_id = dataset.testid WHERE dataset.id = datasetId) to ensure we are not including view_components coming from other views!

@lampajr
Copy link
Member Author

lampajr commented Dec 5, 2024

I discovered that the calc_dataset_view is also creating wrong association between the current dataset and views belonging to other tests just because the viewcomponents were not properly filtered out.

Fixed it by adding AND vc.view_id IN (SELECT view.id FROM view JOIN dataset ON view.test_id = dataset.testid WHERE dataset.id = datasetId).

This was referenced Dec 6, 2024
Copy link
Member

@johnaohara johnaohara left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@lampajr lampajr merged commit 9187a06 into Hyperfoil:master Dec 12, 2024
2 checks passed
@lampajr lampajr deleted the fix_calc_dataset_view branch December 12, 2024 15:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Horreum procedure calc_dataset_view computes wrong data
2 participants