Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add PostgreSQL support #255

Open
ormsbee opened this issue Nov 5, 2024 · 6 comments
Open

Add PostgreSQL support #255

ormsbee opened this issue Nov 5, 2024 · 6 comments
Assignees

Comments

@ormsbee
Copy link
Contributor

ormsbee commented Nov 5, 2024

This repo currently has a set of collation-related helpers that it uses to help normalize model collation behavior between MySQL and SQLite:

They manifest in the migrations files like this:

('key', openedx_learning.lib.fields.MultiCollationCharField(db_collations={'mysql': 'utf8mb4_bin', 'sqlite': 'BINARY'}, max_length=500)),
('title', openedx_learning.lib.fields.MultiCollationCharField(db_collations={'mysql': 'utf8mb4_unicode_ci', 'sqlite': 'NOCASE'}, max_length=500)),

The MultiCollationCharField is also used in one place in edx-platform

Anyone looking to add PostgreSQL support would need to look for all instances of MultiCollationCharField and MultiCollationTextField and put the postgres-specific mappings there. That includes the convenience methods like case_insensitive_char_field in fields.py, as well as in the individual migration files that have the sqlite/mysql arguments serialized out.

@ormsbee
Copy link
Contributor Author

ormsbee commented Nov 5, 2024

@qasimgulzar: These are some of the specifics that have to happen for this repo to support PostgreSQL. Thank you for taking this work on.

@qasimgulzar
Copy link

Thank you @ormsbee for sharing this with me. I am going to take care of it.

@qasimgulzar
Copy link

@ormsbee could you please add some details of the background of this field, e.g: why this field is there any why is it required to be MultiCollationCharField. It will be very helpful while implementing and testing the change.

@qasimgulzar
Copy link

Update.-.1.1.-.min.mov

@ormsbee I have managed to run edx-platform with postgresql. I think it's going to bring some joy here.

@ormsbee
Copy link
Contributor Author

ormsbee commented Nov 14, 2024

There were two primary motivations for MultiCollationCharField:

1. To force utf8mb4 encoding, even in databases that were still configured for utf8mb3.

Up until Redwood, the Tutor default database connection was utf8, which in MySQL translated into their utf8mb3 encoding, which lacks support for emojis and certain other Unicode code points that require 4 bytes to encode in UTF8. Many people had already upgraded their instances through their own scripts and configuration, and some instances that didn't use Tutor were already connecting via utf8mb4.

In the middle of this weird, transitional period, we were creating new data models to store library (and eventually course) content, and we wanted to make sure that the generated tables could properly store 4-byte characters no matter what the database connection settings were for that particular instance at the time.

2. To normalize case sensitivity behavior for certain fields.

We have fields where case-sensitivity is important, like identifier keys. We've gotten into issues with this sort of mismatch in the past, where CourseOverviews store their course keys in a case insensitive manner, but Modulestore stores them in a case sensitive manner, making collision possible (e.g. two courses in Modulestore that map to the same CourseOverview). We also run many of our tests using in-memory sqlite, which is case sensitive by default–as opposed to MySQL's default case insensitivity.

Given that, it's convenient to have something like MultiCollationCharField in order to force consistent constraint-checking and ordering behavior between the two. That way, models can declare that, "this should be a case-insensitive char field" and get consistent behavior across databases, without the error-prone process of altering collations manually in the migration file.

@qasimgulzar
Copy link

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants