Failing to run a dataset's SQL can leave the database in an inconsistent state #60

Open
toolness opened this issue Dec 3, 2020 · 0 comments

Last night the auto-update of hpd_registrations failed. The actual cause of this error is documented in #61. A side effect of the error, though, is that some of the functions created by hpd_registrations up to that point (specifically, anyarray_uniq) were removed from the public schema and were not re-created once the error was encountered.

I think this happened because nycdb dataset SQL has code like the following:

DROP FUNCTION IF EXISTS blah();
CREATE OR REPLACE FUNCTION blah()

The behavior of code like this is not straightforward because of our connection's search_path. At the time this code executes, search_path is set to a temporary, randomly named schema, with the public schema as a fallback (see #4 for why this is done). I think the first line of the above code is actually dropping the function from the public schema (as it doesn't exist in the newly created temporary one), while the second line is creating the function in the new temporary schema. Thus, the function in question now exists in the temporary schema, but doesn't exist in the public one. (It is normally made to exist in the public one further down the line, when we re-run the dataset's SQL in the context of the public schema, which was added in #22. If an error occurs before that step, the function stays missing from public.)
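
To make the suspected sequence concrete, here is a minimal reproduction sketch for a scratch PostgreSQL database. It is not nycdb's actual loader code: the schema name temp_abc123 and the function body are placeholders I made up, and blah() stands in for a real function such as anyarray_uniq. Everything runs inside a transaction that is rolled back at the end.

```sql
-- Hypothetical reproduction: temp_abc123 and blah() are placeholder names.
BEGIN;

-- Start with a function in public, as after a previous successful load.
CREATE OR REPLACE FUNCTION public.blah() RETURNS integer
    LANGUAGE sql AS 'SELECT 1';

-- Mimic the loader: new temporary schema first, public as the fallback.
CREATE SCHEMA temp_abc123;
SET LOCAL search_path TO temp_abc123, public;

-- The unqualified name is resolved through search_path. temp_abc123 has
-- no blah() yet, so the match is public.blah(), and that one is dropped.
DROP FUNCTION IF EXISTS blah();

-- New objects are created in the first schema of search_path,
-- so this creates temp_abc123.blah(), not public.blah().
CREATE OR REPLACE FUNCTION blah() RETURNS integer
    LANGUAGE sql AS 'SELECT 1';

-- List every schema that now contains a function named blah.
SELECT n.nspname AS schema_name, p.proname AS function_name
FROM pg_proc p
JOIN pg_namespace n ON n.oid = p.pronamespace
WHERE p.proname = 'blah';

-- Undo everything, including the schema and the search_path change.
ROLLBACK;
```

If the reasoning above is right, the final query should list blah only under the temporary schema. Schema-qualifying the DROP with the temporary schema's name would be one way to confirm that the unqualified name is what reaches into public.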
