Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Read Native Query SQL from files #372

Merged
merged 10 commits into from
Mar 21, 2024
Merged

Read Native Query SQL from files #372

merged 10 commits into from
Mar 21, 2024

Conversation

soupi
Copy link
Contributor

@soupi soupi commented Mar 20, 2024

What

Currently we can specify native queries by inlining their SQL as a string in the sql field.
We'd like to be able to reference a .sql file instead, so we can avoid escaping characters and can use newlines.

The new config format will accept the following structures:

"sql": "<inline sql>" (as before, but also:)

"sql": { "inline": "<inline sql>" }

"sql": { "file": "<relative file path>" }

How

  • We parse the NativeQuerySql which represents the SQL field using an intermediate new type NativeQuerySqlExternal.
  • as part of the parse_configuration step and after serde deserialization, we traverse the metadata and convert NativeQuerySqlExternal to NativeQuerySql by reading from file (performing IO) if needed, and parsing to NativeQueryParts. We use the configuration directory that the user supplied to locate the relative path.

We replace the existing sql field in nativequeryinfo with an Either external or internal representation. This is not ideal because it breaks the parse don't validate principle.
We would like to mitigate this in the future by separating the RawConfiguration types and the (execution time) Configuration types completely.
We will do that as part of a larger ticket of redesigning the configuration format so it can be split across multiple files.

@soupi soupi changed the title Gil/sql files Read Native Query SQL from files Mar 21, 2024
fs::read_to_string(&configuration_file)
.await
.map_err(|err| {
Error::IoErrorButStringified(format!("{}: {}", &configuration_file.display(), err))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just want rust to tell me which file it couldn't read, so I added these everywhere. Sorry.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps we could instead change Error::IoError to take an Option<PathBuf> in addition.

serde_json::from_str(&configuration_file_contents).map_err(|error| Error::ParseError {
file_path: configuration_file.clone(),
line: error.line(),
column: error.column(),
message: error.to_string(),
})?;
// look for native query sql file references and read from disk.
for native_query_sql in configuration.metadata.native_queries.0.values_mut() {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

step after serialization - read from disk.

}
}

struct NQVisitor;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@plcplc taught me that we can avoid all of this cruft by just implementing From/Into.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🌺

#[test]
fn parse_inline_untagged() {
assert_eq!(
serde_json::from_str::<NativeQuerySqlExternal>(r#""select 1""#).unwrap(),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We still want to be able to parse untagged inline expressions just as before.

@@ -31,6 +31,8 @@ pub fn translate(env: &Env, state: State) -> Result<Vec<sql::ast::CommonTableExp
let sql: Vec<sql::ast::RawSql> = native_query
.info
.sql
.sql()
.map_err(Error::InternalError)?
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might fail because of our types shape, we'll fix this in the future.

@@ -819,7 +819,45 @@ expression: schema
}
}
},
"Native_query_sql": {
"NativeQuerySql": {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new schema for the sql field.


let new_ndc_metadata_dir = temp_deploys_path.join(db_name);

copy_dir::copy_dir(&ndc_metadata_dir_path, &new_ndc_metadata_dir).map_err(|err| {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We now also copy the entire configuration directory.

@@ -1484,7 +1502,7 @@
"description": "A native query used to test support for composite types"
},
"summarize_organizations": {
"sql": "SELECT 'The organization ' || org.name || ' has ' || no_committees::text || ' committees, ' || 'the largest of which has ' || max_members || ' members.' AS result FROM (SELECT orgs.* FROM unnest({{organizations}}) as orgs) AS org JOIN LATERAL ( SELECT count(committee.*) AS no_committees, max(members_agg.no_members) AS max_members FROM unnest(org.committees) AS committee JOIN LATERAL ( SELECT count(*) as no_members FROM unnest(committee.members) AS members) AS members_agg ON true) AS coms ON true",
"sql": { "file": "./native_queries/summarize_organizations.sql" },
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New thing!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New thing!

@soupi soupi marked this pull request as ready for review March 21, 2024 08:57
@soupi soupi requested a review from plcplc March 21, 2024 08:57
Copy link
Contributor

@plcplc plcplc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good stuff! A few easy (stylistic) improvements in comments.

We'll get back to parse-don't-validate eventually 🌺

}

/// A Native Query SQL after file resolution.
/// This is the file that is expected in the metadata when translating requests.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// This is the file that is expected in the metadata when translating requests.
/// This is the `NativeQuerySqlEither`-variant that is expected in the metadata when translating requests.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Rephrased the comment.

/// used for deserialization.
fn from(value: NativeQuerySql) -> Self {
match value {
NativeQuerySql::Inline { sql } => NativeQuerySqlExternal::Inline { inline: sql },
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, serialization/deserialization roundtripping a configuration turns a

sql: "SELECT.."

into

sql: { inline: "SELECT.." }

...which is fine

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is intended so that we could phase out raw string eventually.

}
}

impl JsonSchema for NativeQuerySqlEither {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you want to, you can avoid this schema impl with the #[schemars(with="NativeQuerySqlExternal")] attribute on NativeQuerySqlEither and #[schemars(rename="NativeQuerySql")] on NativeQuerySqlExternal.

Although it's arguably mostly a matter of style for this example.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this doesn't work because it doesn't want to generate a JsonSchema without NativeQuerySql also having a JsonSchema instance.

Thanks for the suggestions though, I would definitely prefer something like this :)

}
}

struct NQVisitor;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🌺

}

impl JsonSchema for NativeQuerySql {
impl JsonSchema for NativeQueryParts {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can probably also avoid this instance with a #[schemars(rename=..)] attribute.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this doesn't work because it attempts to create a schema for Vec<NativeQueryPart>, even if I add with = "String".

@@ -0,0 +1,22 @@
SELECT
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, how I wished I could have had a separate file when I made this query. Boy is it easier on the eyes without string escaping~

@soupi soupi enabled auto-merge March 21, 2024 13:17
@soupi soupi added this pull request to the merge queue Mar 21, 2024
Merged via the queue into main with commit b94fd3a Mar 21, 2024
35 checks passed
@soupi soupi deleted the gil/sql-files branch March 21, 2024 13:30
Copy link
Contributor

@SamirTalwar SamirTalwar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the delay here. I only have some minor comments on delaying error stringification.

fs::read_to_string(&configuration_file)
.await
.map_err(|err| {
Error::IoErrorButStringified(format!("{}: {}", &configuration_file.display(), err))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps we could instead change Error::IoError to take an Option<PathBuf> in addition.

pub fn parse_native_query_from_file(file: std::path::PathBuf) -> Result<NativeQuerySql, String> {
let contents: String = match fs::read_to_string(&file) {
Ok(ok) => Ok(ok),
Err(err) => Err(format!("{}: {}", &file.display(), err)),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, here seems like a good place for Error::IoError { path: Option<PathBuf>, inner: io::Error }.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would approve such a PR :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants