
[SIP-59] Proposal for Database migration standards #13351

Closed
robdiciuccio opened this issue Feb 25, 2021 · 17 comments
Labels: preset-io, sip (Superset Improvement Proposal)

robdiciuccio (Member) commented Feb 25, 2021

[SIP] Proposal for database migration standards

Motivation

Reduce pain around metadata database migrations by ensuring standards are followed and appropriate reviews are obtained before merging.

Proposed Change

SIP-57 (Semantic Versioning) introduced standards for avoiding breaking changes and general best practices for database migrations. The proposed changes below are in addition to those standards:

  • All migrations must support rollbacks. Migrations must have a functional downgrade method to effectively roll back schema changes introduced in the upgrade method. If a migration makes changes to data that are not easily undone (e.g. fix: Retroactively add granularity param to charts #12960), the changes introduced must be non-breaking and idempotent.
  • Migrations should be atomic and configured to complete fully in a single run, using a single transaction where appropriate. Any failures should trigger a rollback to the previous state. Partial migrations should be avoided.
  • Any constraints added within a migration should include an explicit name, e.g. sa.ForeignKeyConstraint(["user_id"], ["ab_user.id"], name='fk_user_id').
  • PRs introducing database migrations must include runtime estimates and downtime expectations.
  • Care should be taken to not introduce expensive DDL operations such as adding unnecessary constraints/indexes or setting column default values on tables potentially containing thousands of rows. [1][2]
    • Indexes in Postgres tables should be added and removed CONCURRENTLY.
  • Migrations for breaking changes and cleanup (e.g. removal of columns) that should be held for the next major version, per the guidelines in SIP-57, should be accumulated in /superset/migrations/next/ for evaluation and inclusion in a future release.
  • Establish GitHub code owners on the superset/migrations directory to ensure PMC members are notified of new or updated migrations.
  • Require two approvals for PRs that include database migrations, including committers from multiple organizations.
  • PRs including database migrations should be kept open for a minimum review period of two business days to allow for adequate review, unless circumstances such as a critical vulnerability or breakage require faster turnaround.

New or Changed Public Interfaces

None.

New dependencies

No additional package dependencies.

Migration Plan and Compatibility

Workflow changes only. PR template will be updated with guidelines. Process for running migrations unchanged.

Rejected Alternatives

The status quo, which has resulted in quite a bit of thrash, deployment roadblocks and external discussions between Superset users.

robdiciuccio added the sip Superset Improvement Proposal label on Feb 25, 2021
etr2460 (Member) commented Feb 26, 2021

Love the suggestions, thanks for driving this! A couple pieces of feedback:

Migration files must have a functional down method to effectively rollback changes introduced in the up method.

This isn't always possible, especially when the migration is repairing metadata (vs. adding/removing columns and tables). See #12960 for an example of a migration for which a down method is impossible to write. Maybe this can be made more precise by saying that all migrations that modify the structure of the DB or its columns must have a down method?

PRs introducing database migrations must include runtime estimates and downtime expectations.

Love this, let's plan to add these as fields in the PR template?

Establish Github code owners

This will be our first use of code owners I think, do you have any thoughts about using this more broadly across the repo? Or have you only thought about the migration use case so far?

Nothing here besides my first point should be considered blocking though, and I'll happily vote +1 on this initiative once the thread is created!

john-bodley (Member) commented:

Should we also consider how we could provide near zero-downtime for migrations which involve DDL operations or is this outside the scope of this SIP?

mistercrunch (Member) commented Mar 2, 2021

I was just talking to an engineer today (Arash @ Preset) about the idea of using ExtraJSONMixin or a similar pattern to accumulate/delay database migrations. In his case he wanted to add a few new fields to the highly contentious Query model, and I pointed him to it.
https://github.com/apache/superset/blob/master/superset/models/helpers.py#L456-L477

It seems like ExtraJSONMixin could be further improved to be more seamless if we wanted to, but I'm not sure how people feel about it.
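For readers unfamiliar with the pattern, here is a minimal self-contained sketch of the general "extra JSON" idea — not Superset's actual ExtraJSONMixin (linked above), just an illustration of how a single JSON text column can absorb new attributes without a schema migration:

```python
import json


class ExtraJSONMixin:
    """Sketch of the extra-JSON pattern (not Superset's implementation):
    loosely structured attributes live inside one JSON text column, so
    adding a new attribute requires no DDL and no Alembic migration."""

    def __init__(self):
        self.extra_json = "{}"  # in a real model this is a sa.Text column

    @property
    def extra(self):
        return json.loads(self.extra_json or "{}")

    def set_extra_json_key(self, key, value):
        extra = self.extra
        extra[key] = value
        self.extra_json = json.dumps(extra)


class Query(ExtraJSONMixin):
    """Hypothetical model gaining ad-hoc fields without schema changes."""


q = Query()
q.set_extra_json_key("progress", 42)
assert q.extra == {"progress": 42}
```

The trade-off, of course, is that JSON-blob fields are not indexed or constrained by the database, so the pattern suits metadata rather than queryable columns.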

A few other ideas around this SIP:

  • I would recommend using an accumulation pattern for cleanup that is not immediately needed. If, for instance, we want to remove columns from the database, we can remove the field from the model but delay the related database cleanup in some sort of migrations/next.py, where we accumulate those cleanup migration scripts and defer them until the next major release (say 2.0.0), where downtime may be expected.
  • It'd be so nice to have blue/green forward-compatible stamps on migrations, meaning the previous version of the app is guaranteed to work with a future version of the database. If a migration is not blue/green compatible, it should be clearly identified as requiring downtime. I'd recommend really pushing PRs to meet this requirement, and pushing toward the accumulation pattern when that's not the case.

mistercrunch (Member) commented Mar 2, 2021

Migration files must have a functional down method to effectively rollback changes introduced in the up method.

This isn't always possible, especially when the migration is doing some repair of the metadata (vs. adding/removing columns and tables).

It could be possible in some cases by keeping the data as a backup (e.g. renaming the column) to enable just that. Of course that doesn't always work: new objects get created that may be missing the backup, and it can get very tricky to provide that guarantee, since you may have to maintain both the old and new field along with the related old/new logic. Probably over-complicated, but we can decide on a case-by-case basis whether it makes sense to try to guarantee the down-migration. If it's not possible, we may want to delay that migration until a bigger release, if possible.

rusackas (Member) commented Mar 2, 2021

must include runtime estimates and downtime expectations

We've seen instances in the past where one contributor thought runtime/downtime would be minimal based on their perceived use cases. When merged, other orgs had significantly/exponentially more data that needed migration, and the execution time was a pain point. How can we most accurately provide realistic/reasonable estimates given the fairly disparate use cases and datasets of Superset users/institutions?

robdiciuccio (Member, Author) commented:

Migration files must have a functional down method to effectively rollback changes introduced in the up method.

This isn't always possible, especially when the migration is doing some repair of the metadata (vs. adding/removing columns and tables).

Good point. The primary goal here is to be able to successfully roll back from any migration. The example you provided is idempotent and additive, which fits the criteria. How about this updated language?

All migrations must support rollbacks. Migrations must have a functional downgrade method to effectively roll back schema changes introduced in the upgrade method. If a migration makes changes to data that are not easily undone (e.g. #12960), the changes introduced must be non-breaking and idempotent.
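As an illustration of the "non-breaking and idempotent" criterion, a data-repair step can be written so that re-running it changes nothing and never overwrites existing values. This is a hypothetical helper, not the actual #12960 logic:

```python
import json


def repair_params(params_json, default_granularity="P1D"):
    # Fill the key only when it is missing: running the migration twice
    # yields the same result (idempotent), and existing user-set values
    # are never overwritten (non-breaking, purely additive).
    params = json.loads(params_json or "{}")
    params.setdefault("granularity_sqla", default_granularity)
    return json.dumps(params, sort_keys=True)


once = repair_params('{"metric": "count"}')
assert repair_params(once) == once  # idempotent: second run is a no-op
```

Because the change is additive, a no-op downgrade is acceptable: older app versions simply ignore the extra key.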

robdiciuccio (Member, Author) commented:

This will be our first use of code owners I think, do you have any thoughts about using this more broadly across the repo? Or have you only thought about the migration use case so far?

Another use case I'm thinking about for code owners is the new ephemeral test environment workflow code: adding Preset code owners to ensure AWS resources are not changed without account owner approval.
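For reference, the migrations use case might look like the following hypothetical CODEOWNERS fragment; the team handle is illustrative, not an actual Superset team:

```
# Hypothetical fragment of .github/CODEOWNERS; PRs touching anything
# under superset/migrations/ would automatically request review from
# the named team.
/superset/migrations/  @apache/superset-committers
```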

robdiciuccio (Member, Author) commented:

We've seen instances in the past where one contributor thought runtime/downtime would be minimal based on their perceived use cases. When merged, other orgs had significantly/exponentially more data that needed migration, and the execution time was a pain point. How can we most accurately provide realistic/reasonable estimates given the fairly disparate use cases and datasets of Superset users/institutions?

Yeah, that's a bit tricky. One idea is to provide run times for different row counts, which could then be reasonably extrapolated for larger datasets. In general, committers notified via the proposed GitHub code owners should know if the tables being altered will incur significant migration overhead.

Require two approvals for PRs that include database migrations, including committers from multiple organizations.

Should we also require that the PR be open for review for a minimum period of time (48h?) to ensure committers from different orgs have time to review?
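One rough way to produce the per-row-count numbers suggested above is sketched below. This is only an illustration: SQLite stands in for the real metadata database, and the DDL and backfill statements are stand-ins for an actual migration, so real estimates would run the actual Alembic migration against MySQL/Postgres snapshots of varying size:

```python
import sqlite3
import time


def time_migration(row_count):
    """Time a stand-in schema change + backfill against row_count rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE charts (id INTEGER PRIMARY KEY, params TEXT)")
    conn.executemany(
        "INSERT INTO charts (params) VALUES (?)", [("{}",)] * row_count
    )
    conn.commit()

    start = time.perf_counter()
    conn.execute("ALTER TABLE charts ADD COLUMN granularity TEXT")  # stand-in DDL
    conn.execute("UPDATE charts SET granularity = 'P1D'")           # stand-in backfill
    conn.commit()
    return time.perf_counter() - start


# Report timings at a few sizes so reviewers can extrapolate.
for n in (1_000, 10_000, 100_000):
    print(f"{n:>7} rows: {time_migration(n):.4f}s")
```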

robdiciuccio (Member, Author) commented:

Should we also consider how we could provide near zero-downtime for migrations which involve DDL operations or is this outside the scope of this SIP?

Making this work for all metadata DB types will be difficult, as the pitfalls and tooling are different for each. We could add some guidance around things like setting default values and creating indexes on tables with many rows, but DDL is going to potentially cause downtime on some systems unless you're using a tool like pt-online-schema-change (for MySQL).

robdiciuccio (Member, Author) commented:

Ran across this guidance in the Alembic docs about naming constraints. Thoughts on including this as a requirement for migrations?

craig-rueda (Member) commented:

To build on Rob's point above, I've noticed several migrations that do things like call commit() on their current session multiple times (usually in a loop), which breaks the atomicity guarantee of migrations. I'm sure Alembic wraps the current session and intercepts calls to commit() under the covers, but we should still be checking for this sort of thing.
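The commit-in-a-loop anti-pattern and its atomic rewrite can be sketched like so (SQLite for illustration; a real migration would go through the Alembic/SQLAlchemy session):

```python
import sqlite3


def migrate_rows_atomically(conn, transform):
    """Apply transform to every row's params in ONE transaction.

    Committing inside the loop would leave partial state behind if a
    later row failed; with a single commit, any failure rolls the whole
    migration back to the previous state.
    """
    rows = conn.execute("SELECT id, params FROM charts").fetchall()
    try:
        for row_id, params in rows:
            conn.execute(
                "UPDATE charts SET params = ? WHERE id = ?",
                (transform(params), row_id),
            )
        conn.commit()  # one commit for the whole migration
    except Exception:
        conn.rollback()  # failure restores the previous state
        raise


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE charts (id INTEGER PRIMARY KEY, params TEXT)")
conn.executemany("INSERT INTO charts (params) VALUES (?)", [("a",), ("b",)])
migrate_rows_atomically(conn, str.upper)
rows = [r[0] for r in conn.execute("SELECT params FROM charts ORDER BY id")]
assert rows == ["A", "B"]
```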

robdiciuccio (Member, Author) commented:

I would recommend using an accumulation pattern around cleanup that are not immediately needed, meaning if for instance we want to remove columns in the database, we can remove the field from the model, but delay the related database field cleanup in some sort of migrations/next.py where we accumulate those cleanup migration scripts and defer until the next big number release (say 2.0.0) where downtime may be expected

@mistercrunch agreed, I added an item for accumulating breaking/cleanup migrations for the next major release

It'd be so nice to have blue/green forward compatible stamps on migrations, meaning previous version of the app is guaranteed to work with future version of the database. In many cases if the migration is not blue-green compatible it should be clearly identified as it requires downtime. I'd recommend really pushing PRs to meet this req and pushing to using the accumulation pattern when that's not the case.

I think the standards set forth in SIP-57 re: breaking changes should accomplish this goal, unless you have something else in mind?

robdiciuccio (Member, Author) commented:

@craig-rueda I added some detail around atomicity of migrations

robdiciuccio (Member, Author) commented:

Updated the SIP above based on feedback in this thread. Will send for a vote on Friday if there's no other discussion items.

betodealmeida (Member) commented:

@robdiciuccio @evans regarding "PRs introducing database migrations must include runtime estimates and downtime expectations", I'm working on a script to run benchmarks on migrations that pre-populates the models:

#13561

robdiciuccio (Member, Author) commented:

The SIP has been approved with nine binding +1 votes, four non-binding +1 votes, zero 0 votes and zero -1 votes.

Status: Implemented / Done