Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Mandate everything runs in some fixed locale en_US.UTF8 #3118

Closed
Tracked by #1837
saihaj opened this issue Jan 4, 2022 · 8 comments
Closed
Tracked by #1837

Mandate everything runs in some fixed locale en_US.UTF8 #3118

saihaj opened this issue Jan 4, 2022 · 8 comments
Assignees

Comments

@saihaj
Copy link
Member

saihaj commented Jan 4, 2022

We should mandate that everything runs in some fixed locale (maybe en_US.UTF8, though we'd need to look into if there are other locales that are more palatable to the non-English world) graph-node could then check on startup that its locale and the locale of the database are the expected one.

Originally posted by @lutter in #1837 (comment)

@saihaj
Copy link
Member Author

saihaj commented Jan 23, 2022

I search the codebase and I see this comment

// This check will break if users run Postgres (or even graph-node)
// in a locale other than English. In that case, their database will
// be marked as unavailable even though it is perfectly fine.
if msg.contains("canceling statement")
|| msg.contains("no connection to the server")
|| msg.contains("terminating connection due to conflict with recovery")
which means non english DBs shouldn't work right cause they get marked as unavailable. That means we don't need to panic or anything? cc @lutter

@azf20
Copy link
Contributor

azf20 commented Feb 17, 2022

@saihaj I wonder if we can add a specific check for en_US.UTF8? I think for determinism it's not enough for it to be in English, I think it needs to be specific

@leoyvens
Copy link
Collaborator

The POSIX locale is better for performance, so it seems worth discussing if we should choose that.

@lutter
Copy link
Collaborator

lutter commented Feb 19, 2022

This discussion is a good start. One thing to note is that switching collation would require rebuilding all indexes with text columns in them.

@azf20
Copy link
Contributor

azf20 commented Apr 29, 2022

The majority of indexers responding to the straw poll are running en_US.UTF8. Proposing that we adopt that as the standard on the network, but @lutter am wondering what the migration path is for any indexers currently running a different locale? I don't know what the implications are - ideally it would be as simple as a reconfiguration, but presumably indices would need to be regenerated?

@dotansimha dotansimha moved this from Todo to In Progress in GraphQL API May 2, 2022
@lutter
Copy link
Collaborator

lutter commented May 2, 2022

The 'right' choice would be to use the C locale because it performs better in queries etc. than en_US.UTF8 (I am told that it can speed up comparisons by 10x)

Either way, the safest way to change locale is to recreate the database - the default locale is baked in when the database is initialized. That means the indexers would have to dump their database, recreate it, and restore from backup. There might be ways for us to change that when generating queries, but it'll require some more work within graph-node

@azf20
Copy link
Contributor

azf20 commented May 5, 2022

OK - so they wouldn't need to re-index, just restore from backup? Interested in @fordN's opinion from an indexer perspective here?

@saihaj saihaj moved this from In Progress to In review in GraphQL API Jun 15, 2022
@dotansimha dotansimha moved this from In review to Awaiting release in GraphQL API Aug 25, 2022
@saihaj saihaj moved this from Awaiting release to In review in GraphQL API Sep 14, 2022
@azf20 azf20 assigned lutter and unassigned saihaj Nov 21, 2022
@saihaj
Copy link
Member Author

saihaj commented Nov 1, 2023

#4163

@saihaj saihaj closed this as completed Nov 1, 2023
@github-project-automation github-project-automation bot moved this from In review to Done in GraphQL API Nov 1, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Done
Development

Successfully merging a pull request may close this issue.

5 participants