Query anonymization should work in fallback log #2804

Open
wants to merge 3 commits into
base: main

@LantaoJin (Member) commented Jul 4, 2024

Description

Query anonymization does not work in the fallback log. JOIN queries, and other queries that trigger a fallback to the old engine, are not anonymized in the logs. For example, the request below produces a log line that contains the raw query:

POST /_plugins/_sql
{
  "query" : """
  SELECT s.FlightNum, s.OriginCityName, d.DestCityName
    FROM opensearch_dashboards_sample_data_flights s
      JOIN opensearch_dashboards_sample_data_flights d
        ON s.DestCityName = d.OriginCityName
  """
}

Request SQLQueryRequest(jsonContent={"query":"\n SELECT s.FlightNum, s.OriginCityName, d.DestCityName FROM opensearch_dashboards_sample_data_flights s JOIN opensearch_dashboards_sample_data_flights d ON s.DestCityName = d.OriginCityName\n "}, query=
SELECT s.FlightNum, s.OriginCityName, d.DestCityName FROM opensearch_dashboards_sample_data_flights s JOIN opensearch_dashboards_sample_data_flights d ON s.DestCityName = d.OriginCityName
, path=/_plugins/_sql, format=jdbc, params={pretty=true}, sanitize=true, cursor=Optional.empty) is not supported and falling back to old SQL engine

This PR fixes the problem by adding a new toString() overload to SQLQueryRequest that takes the anonymizer as a function parameter. The anonymizer is passed in as a lambda because QueryDataAnonymizer is in the org.opensearch.sql.legacy package.

With this fix, the fallback log looks like this:

Request SQLQueryRequest(query=( SELECT identifier, identifier, identifier FROM table s JOIN table d ON identifier = identifier ), path=/_plugins/_sql, format=jdbc, params={pretty=true}, sanitize=true, cursor=Optional.empty) is not supported and falling back to old SQL engine
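The approach described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code in this PR: `FallbackLogSketch`, its nested `Request` class, and `maskLiterals` are hypothetical names, and the toy anonymizer only masks literals, whereas the real QueryDataAnonymizer also rewrites identifiers and table names. The point is only that the request class accepts a `Function<String, String>` and so needs no compile-time reference to the package that owns the anonymizer.

```java
import java.util.function.Function;

/** Illustrative sketch only; all names here are hypothetical, not the plugin's classes. */
final class FallbackLogSketch {

  /** Simplified stand-in for SQLQueryRequest with a subset of its fields. */
  static final class Request {
    private final String query;
    private final String path;
    private final String format;

    Request(String query, String path, String format) {
      this.query = query;
      this.path = path;
      this.format = format;
    }

    /** Renders the request, passing the query through the caller-supplied anonymizer. */
    String toString(Function<String, String> anonymizer) {
      return "Request(query=" + anonymizer.apply(query)
          + ", path=" + path
          + ", format=" + format + ")";
    }
  }

  /** Toy anonymizer (assumption): masks quoted strings and standalone numbers only. */
  static String maskLiterals(String sql) {
    return sql.replaceAll("'[^']*'", "'?'").replaceAll("\\b\\d+\\b", "?");
  }

  public static void main(String[] args) {
    Request request =
        new Request("SELECT name FROM accounts WHERE age = 30", "/_plugins/_sql", "jdbc");
    // The caller, which can see the anonymizer, supplies it as a method reference.
    System.out.println(
        "Request " + request.toString(FallbackLogSketch::maskLiterals)
            + " is not supported and falling back to old SQL engine");
  }
}
```

In this sketch the logging call site decides which anonymizer to apply, so the request class stays independent of any particular anonymizer implementation.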

Issues Resolved

Resolves #2803

Check List

  • New functionality includes testing.
    • All tests pass, including unit test, integration test and doctest
  • [ ] New functionality has been documented.
    • [ ] New functionality has javadoc added
    • [ ] New functionality has user manual doc added
  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@LantaoJin (Member, Author) commented:

The build failures are unrelated to this PR. It looks like the GLIBC library is missing from the current Linux Docker image. All CI tasks passed in the Windows Docker image.

LantaoJin added 3 commits July 9, 2024 14:45
Signed-off-by: Lantao Jin <ltjin@amazon.com>
Signed-off-by: Lantao Jin <ltjin@amazon.com>
Signed-off-by: Lantao Jin <ltjin@amazon.com>
@LantaoJin LantaoJin force-pushed the pr/bugfix_issue_2803 branch from 873ecee to 4572e25 Compare July 9, 2024 06:45
@@ -148,4 +149,22 @@ private boolean shouldSanitize(Map<String, String> params) {
    }
    return true;
  }

  /** A new toString() with anonymizer parameter to anonymize its query statement. */
  public String toString(Function<String, String> anonymizer) {
Collaborator

Just wondering: do you foresee a future need for different anonymizers? Would toString(boolean anonymized), or copying to a new request, be clear enough?

Development

Successfully merging this pull request may close these issues.

[BUG] Query anonymization doesn't work in fallback log
2 participants