Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Coral-Service] Add rewriteType field to translation endpoint #455

Merged

Conversation

KevinGe00
Copy link
Contributor

@KevinGe00 KevinGe00 commented Sep 21, 2023

What changes are proposed in this pull request, and why are they necessary?

Adding a rewrite field in the api/translations/translate endpoint to support new coral-service rewrite feature. This PR also modifies coral service to accept Spark as a source language.

How was this patch tested?

no rewrite param specified, returns a translation with no rewrite

curl --header "Content-Type: application/json" \
  --request POST \
  --data '{
    "fromLanguage":"hive", 
    "toLanguage":"trino", 
    "query":"SELECT * FROM db1.airport"
  }' \
  http://localhost:8080/api/translations/translate`

rewrite param is "none", returns a translation with no rewrite

curl --header "Content-Type: application/json" \
    --request POST \                                                                                                              
    --data '{                                  
    "sourceLanguage":"hive",
    "targetLanguage":"trino",
    "query":"SELECT * FROM db1.airport",
    "rewriteType":"none"
    }' \

valid rewrite param, returns translation with requested rewrite

curl --header "Content-Type: application/json" \
    --request POST \                                                                                                              
    --data '{                                  
    "sourceLanguage":"hive",
    "targetLanguage":"trino",
    "query":"SELECT * FROM db1.airport",
    "rewriteType":"incremental"
    }' \

invalid rewrite param, returns a translation with no rewrite

curl --header "Content-Type: application/json" \
    --request POST \                                                                                                              
    --data '{                                  
    "sourceLanguage":"hive",
    "targetLanguage":"trino",
    "query":"SELECT * FROM db1.airport",
    "rewriteType":"notavalidrewritetype"
    }' \

Comment on lines 73 to 74
return ResponseEntity.status(HttpStatus.BAD_REQUEST)
.body("Currently, only Hive, and Trino are supported as source/input languages.\n");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no need for /input.

Comment on lines 125 to 126
// TODO: replace getSparkIncrementalQueryFromUserSql with getIncrementalQuery
String incrementalQuery = getSparkIncrementalQueryFromUserSql(query);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does this work when the source/target specified by the user is not Spark?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getIncrementalQuery(String query, String sourceLanguage, String targetLanguage) takes in a sourceLanguage param.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still not quite clear. Why are not we using getIncrementalQuery here and leaving it as a TODO?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In order to use getIncrementalQuery, we would need to /api/incremental/rewrite to take in a targetLanguage field in the request.

Since changes to this endpoint could break the existing DBT integration, it might be worth it to make this it's own separate effort especially since this is a more complicated endpoint to test.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we hardcode "spark" and get rid of getSparkIncrementalQueryFromUserSql? The TODO will look different as well.

return ResponseEntity.status(HttpStatus.BAD_REQUEST)
.body("Currently, only Hive, Spark, and Trino are supported as engines to generate graphs using.\n");
.body("Currently, only Hive, and Trino are supported as engines to generate graphs using.\n");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let us allow Spark and route it to Hive. Also you can say
Currently, only Hive, Spark, and Trino SQL are supported as source languages for Coral IR visualization

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we also do this for the translation endpoint?

@wmoustafa wmoustafa merged commit c86d2eb into linkedin:master Sep 22, 2023
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants