Trigger events before and after drop table statements #1036
Conversation
Hi. As mentioned in the template and in the code of conduct, we would appreciate it if, before opening a PR, an issue explaining the problem or use case were opened.
Hi @CaselIT …
Note that this PR does not solve geoalchemy/geoalchemy2#374, because the table given as parameter to …
The SQLAlchemy Table object is not part of the op.create_table() or op.drop_table() API; these methods are passed string table names and an optional set of columns, so in that regard the "events" don't make much sense. The reason create_table() has support for the on-table-create events is to support triggering DDL for the Enum and Boolean datatypes when they make use of CHECK constraints or PG native ENUM types. If you need full blown Table objects with state on them passed to the event, then I'd not use op.drop_table(); I'd run my_table.drop(op.get_bind()), which will run the event handlers for that Table also.
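To illustrate the point above, here is a minimal sketch (table and listener names are illustrative, and an in-memory SQLite engine stands in for op.get_bind()) showing that SQLAlchemy's before_drop/after_drop DDL events fire when Table.drop() is called on a real Table object:

```python
from sqlalchemy import Column, Integer, MetaData, Table, create_engine, event

engine = create_engine("sqlite://")
metadata = MetaData()
my_table = Table("my_table", metadata, Column("id", Integer, primary_key=True))

fired = []

@event.listens_for(my_table, "before_drop")
def _before_drop(target, connection, **kw):
    # runs just before the DROP TABLE statement is emitted
    fired.append("before_drop")

@event.listens_for(my_table, "after_drop")
def _after_drop(target, connection, **kw):
    # runs just after the DROP TABLE statement has executed
    fired.append("after_drop")

metadata.create_all(engine)
my_table.drop(engine)  # in a migration you would pass op.get_bind() instead
print(fired)  # -> ['before_drop', 'after_drop']
```

By contrast, op.drop_table("my_table") only ever sees the string name, so there is no Table on which these handlers could have been registered.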
Ok, I see. So I guess the best would be to define helpers in …
Does GeoAlchemy need to DROP other constructs that go along with the Table? The "Alembic" way to do this would be to have those DROP commands right in the migration script in a literal way. We have a similar problem with PG ENUM: right now people add separate CREATE/DROP for PG ENUM objects, and our autogenerate solution (someday) would be to render these statements in the migration script directly. The autogenerate process can be customized, so if you wanted to auto-generate additional geoalchemy-specific directives, there are ways to make that happen; https://alembic.sqlalchemy.org/en/latest/api/autogenerate.html#customizing-revision-generation contains some background on this. The approach of having the directives right in the migration file makes it clear what's actually going on.
Adding: my suggestion of …
Ok I see, thanks for these details. |
Yes, we have this problem with ENUM, and despite many years of people complaining about it, I haven't taken the time to figure out a single solution that can be the official "way to do it". As far as intercepting "DROP TABLE" to write the SQL for that single statement in a different way, that in itself is straightforward, and you wouldn't use events for that; you'd use the …
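The truncated comment above is almost certainly pointing at SQLAlchemy's compiler extension. A sketch of that route, intercepting the single DROP TABLE statement (the IF EXISTS rewrite is just an illustrative customization, not anything the thread prescribes):

```python
from sqlalchemy import Column, Integer, MetaData, Table
from sqlalchemy.dialects import sqlite
from sqlalchemy.ext.compiler import compiles
from sqlalchemy.schema import DropTable

@compiles(DropTable, "sqlite")
def _compile_drop_table(element, compiler, **kw):
    # customize how this one statement renders on SQLite; no events needed
    return "DROP TABLE IF EXISTS %s" % compiler.preparer.format_table(
        element.element
    )

t = Table("my_table", MetaData(), Column("id", Integer, primary_key=True))
sql = str(DropTable(t).compile(dialect=sqlite.dialect()))
print(sql)
```

Unlike the event-based approach, this hook rewrites the SQL string itself and applies process-wide for the given dialect.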
Hi @zzzeek
I tried to gather everything we need from the different docs; I hope it's accurate enough.

**PostgreSQL + PostGIS 1** (we should probably deprecate support of PostGIS < 2)

Create table:

```sql
CREATE TABLE my_table (
    id SERIAL NOT NULL,
    PRIMARY KEY (id)
);
SELECT AddGeometryColumn('my_table_schema', 'my_table', 'geom', 4326, 'POINT', 2);
```

If a spatial index is requested:

```sql
CREATE INDEX idx_my_table_geom ON my_table USING gist (geom);
```

Drop table:

```sql
SELECT DropGeometryColumn('my_table_schema', 'my_table', 'geom');
DROP TABLE my_table;
```

or

```sql
SELECT DropGeometryTable('my_table_schema', 'my_table');
```

**PostgreSQL + PostGIS 2 or PostGIS 3**

Create table:

```sql
CREATE TABLE my_table (
    id SERIAL NOT NULL,
    geom geometry(POINT, 4326),
    PRIMARY KEY (id)
);
```

If a spatial index is requested:

```sql
CREATE INDEX idx_my_table_geom ON my_table USING gist (geom);
```

Drop table:

```sql
DROP TABLE my_table;
```

**SQLite + SpatiaLite 4**

Create table:

```sql
CREATE TABLE my_table (
    id SERIAL NOT NULL,
    geom GEOMETRY,
    PRIMARY KEY (id)
);
SELECT RecoverGeometryColumn('my_table', 'geom', 4326, 'XY');
```

Note: Creating a dummy column then calling …

If a spatial index is requested:

```sql
SELECT CreateSpatialIndex('my_table', 'geom');
```

Drop table:

```sql
SELECT DropGeoTable('my_table');
```

**SQLite + SpatiaLite 5**

Create table:

```sql
CREATE TABLE my_table (
    id SERIAL NOT NULL,
    geom GEOMETRY,
    PRIMARY KEY (id)
);
SELECT RecoverGeometryColumn('my_table', 'geom', 4326, 'XY');
```

Note: Creating a dummy column then calling …

If a spatial index is requested:

```sql
SELECT CreateSpatialIndex('my_table', 'geom');
```

Drop table:

```sql
SELECT DropTable(NULL, 'my_table');
```

Note: This will automatically remove the table and all associated indexes (which are tables themselves), but it is only available for SpatiaLite >= 5.

Rename table:

```sql
SELECT RenameTable(NULL, 'my_table', 'my_new_table');
```

Rename column:

```sql
SELECT RenameColumn(NULL, 'my_table', 'old_col_name', 'new_col_name');
```

Miscellaneous: …

**MySQL (not supported yet)**

Create table:

```sql
CREATE TABLE my_table (
    id SERIAL NOT NULL,
    geom GEOMETRY NOT NULL SRID 4326
);
```

Note that the geometry type is not given here.

If a spatial index is requested:

```sql
CREATE TABLE my_table (
    id SERIAL NOT NULL,
    geom GEOMETRY NOT NULL SRID 4326,
    SPATIAL INDEX(geom)
);
```

or

```sql
CREATE TABLE my_table (
    id SERIAL NOT NULL,
    geom GEOMETRY NOT NULL SRID 4326
);
ALTER TABLE my_table ADD SPATIAL INDEX(geom);
```

or

```sql
CREATE TABLE my_table (
    id SERIAL NOT NULL,
    geom GEOMETRY NOT NULL SRID 4326
);
CREATE SPATIAL INDEX idx_geom ON my_table(geom);
```

Drop table:

```sql
DROP TABLE my_table;
```
So it looks like maybe you would be compiling on op.drop_table("my_table"); that directive simply does not supply the information you need to know that there are geometry constructs on the table. I think the cleanest approach for geoalchemy would be that you have your own directives. It seems like for the CREATE side things are column-specific, while on the DROP side they are table-specific:

```python
op.create_geometry_column("my_table", Column("gist", GEOMETRY), ...)
op.drop_geometry_table("my_table", extra_geom_info=...)
```

With this kind of API, you can add any kinds of arguments and options you need explicitly, and you then have a straightforward way to run all the DDL you need. A complete example of how to create a custom op directive is at: https://alembic.sqlalchemy.org/en/latest/api/operations.html#operation-plugins. Decorators like …

I am getting the impression that for the moment, you have "CREATE" working because you get the whole Table and you can look around for geometry information. So you can start with just the drop operation above.

For autogenerate support, there are two APIs that may be of use. The first is to be able to support detection of a totally new kind of object, like a trigger or a sequence; that's documented at https://alembic.sqlalchemy.org/en/latest/api/autogenerate.html#autogenerating-custom-operation-directives. However, at least for …

```python
from alembic.autogenerate import rewriter
from alembic.operations import ops

writer = rewriter.Rewriter()

@writer.rewrites(ops.DropTableOp)
def rewrite_drop_table(context, revision, op):
    # has_geometry_things() and DropGeometryTableOp are placeholders here
    if has_geometry_things(op.table):
        return [
            DropGeometryTableOp(op.table_name, extra_geometry_things)
        ]
    else:
        return op
```

your …

This is a lot to take in, but alembic should have everything you need to make a first class migration API here. There is already one project using these APIs extensively which you can see at https://olirice.github.io/alembic_utils/. This user came to me a few years ago and I got them started with all these APIs and they seem to have all worked out.
Ok, I think I see what to do. I just have two questions:
Thanks! |
I think for the moment that's how it would have to be, if rewriter is what works best here.

produce_migrations gives you the MigrationScript structure alone; that's before a hook like this would be called. While you can call the rewriter directly on that, if you wanted to test the integration into "process_revision_directives" you'd want to use the RevisionContext object. Clearly you can test at higher or lower levels here, but the most end-to-end test of all the APIs would be if you were calling into command.revision(). It returns the Script structure that references the generated module.
Ok, I think I am almost good now, thanks to your help. As far as I can see, I just have a last issue with spatial indexes.

**PostgreSQL**

```python
def upgrade() -> None:
    # ### commands auto generated by Alembic - please adjust! ###
    op.add_geospatial_column('new_spatial_table', sa.Column('new_geom_with_idx', Geometry(geometry_type='LINESTRING', srid=4326, spatial_index=False, from_text='ST_GeomFromEWKT', name='geometry'), nullable=True))
    op.add_geospatial_column('new_spatial_table', sa.Column('new_geom_without_idx', Geometry(geometry_type='LINESTRING', srid=4326, spatial_index=False, from_text='ST_GeomFromEWKT', name='geometry'), nullable=True))
    op.drop_geospatial_index('idx_new_spatial_table_geom_with_idx', table_name='new_spatial_table', column_name='geom_with_idx')
    op.create_geospatial_index('idx_new_spatial_table_new_geom_with_idx', 'new_spatial_table', ['new_geom_with_idx'], unique=False, postgresql_using='gist', postgresql_ops={})
    op.drop_geospatial_column('new_spatial_table', 'geom_without_idx')
    op.drop_geospatial_column('new_spatial_table', 'geom_with_idx')
    # ### end Alembic commands ###


def downgrade() -> None:
    # ### commands auto generated by Alembic - please adjust! ###
    op.add_geospatial_column('new_spatial_table', sa.Column('geom_with_idx', Geometry(geometry_type='LINESTRING', srid=4326, spatial_index=False, from_text='ST_GeomFromEWKT', name='geometry'), autoincrement=False, nullable=True))
    op.add_geospatial_column('new_spatial_table', sa.Column('geom_without_idx', Geometry(geometry_type='LINESTRING', srid=4326, spatial_index=False, from_text='ST_GeomFromEWKT', name='geometry'), autoincrement=False, nullable=True))
    op.drop_geospatial_index('idx_new_spatial_table_new_geom_with_idx', table_name='new_spatial_table', postgresql_using='gist', postgresql_ops={}, column_name='new_geom_with_idx')
    op.create_geospatial_index('idx_new_spatial_table_geom_with_idx', 'new_spatial_table', ['geom_with_idx'], unique=False, postgresql_using='gist', postgresql_ops={})
    op.drop_geospatial_column('new_spatial_table', 'new_geom_without_idx')
    op.drop_geospatial_column('new_spatial_table', 'new_geom_with_idx')
    # ### end Alembic commands ###
```

**SQLite**

```python
def upgrade() -> None:
    # ### commands auto generated by Alembic - please adjust! ###
    op.add_geospatial_column('new_spatial_table', sa.Column('new_geom_with_idx', Geometry(geometry_type='LINESTRING', srid=4326, spatial_index=False, from_text='ST_GeomFromEWKT', name='geometry'), nullable=True))
    op.add_geospatial_column('new_spatial_table', sa.Column('new_geom_without_idx', Geometry(geometry_type='LINESTRING', srid=4326, spatial_index=False, from_text='ST_GeomFromEWKT', name='geometry'), nullable=True))
    op.create_geospatial_index('idx_new_spatial_table_new_geom_with_idx', 'new_spatial_table', ['new_geom_with_idx'], unique=False, postgresql_using='gist', postgresql_ops={})
    op.drop_geospatial_column('new_spatial_table', 'geom_without_idx')
    op.drop_geospatial_column('new_spatial_table', 'geom_with_idx')
    # ### end Alembic commands ###


def downgrade() -> None:
    # ### commands auto generated by Alembic - please adjust! ###
    op.add_geospatial_column('new_spatial_table', sa.Column('geom_with_idx', Geometry(geometry_type='LINESTRING', srid=4326, spatial_index=False, from_text='ST_GeomFromEWKT', name='geometry'), nullable=True))
    op.add_geospatial_column('new_spatial_table', sa.Column('geom_without_idx', Geometry(geometry_type='LINESTRING', srid=4326, spatial_index=False, from_text='ST_GeomFromEWKT', name='geometry'), nullable=True))
    op.drop_geospatial_index('idx_new_spatial_table_new_geom_with_idx', table_name='new_spatial_table', postgresql_using='gist', postgresql_ops={}, column_name='new_geom_with_idx')
    op.drop_geospatial_column('new_spatial_table', 'new_geom_without_idx')
    op.drop_geospatial_column('new_spatial_table', 'new_geom_with_idx')
    # ### end Alembic commands ###
```

As far as I can see, it is because the batch operation renames the temporary table using a regular …
The code example you have there isn't using batch... but overall, yeah, all this great extensibility is kind of non-existent for batch mode. If the issue is just the rename_table() operation, you could add a directive to override alembic.ddl.base.RenameTable:

```python
from sqlalchemy.ext.compiler import compiles
from sqlalchemy.sql.compiler import DDLCompiler

from alembic.ddl.base import RenameTable, alter_table, format_table_name

@compiles(RenameTable, "sqlite")
def visit_rename_table(
    element: "RenameTable", compiler: "DDLCompiler", **kw
) -> str:
    # your code here
    return "%s RENAME TO %s" % (
        alter_table(compiler, element.table_name, element.schema),
        format_table_name(compiler, element.new_table_name, element.schema),
    )
```

But you just have a table name there. So maybe if you had some global registry of table names -> which of them are geo, something like that, to tell when to use the "rename geo" version of things. Or maybe it works in all cases, dunno.
Oops, sorry, I didn't commit the last change. The batch operation is here: https://github.com/adrien-berchet/geoalchemy2/blob/fix_sqlite_reflection/geoalchemy2/alembic_helpers.py#L141
Ah yeah, I could try that. I am not a fan of global registries, but let's try :) Thanks again!
Why use batch mode internally like that? Alembic's documented approach has the batch operation publicly inside the migration script itself. The reason is so that a series of changes to a single table can all be done at once, since batch involves copying the entire table; if someone has a SQLite database important enough that they actually need to run migrations on it, it might also have millions of rows of data. Allowing the batch directive to be in the migration script also allows space for some additional directives like constraints and things that batch mode won't detect; there are probably geo-specific concepts that can benefit from this part of it being exposed.
I naively tried to use batch mode here because SQLite cannot drop a column directly, but that's indeed not a satisfactory solution.
Hi there, |
Hey @adrien-berchet - sorry I didn't reply previously. I have no time to do anything these days, and it looked like you were on the right track. Looks like 1.7.8 has a lot to release. That said, we should probably still merge the actual PR here since it's already done and nice to have.
OK, this is sqla-tester setting up my work on behalf of zzzeek to try to get revision ea44e7f of this pull request into gerrit so we can run tests and reviews and stuff
New Gerrit review created for change ea44e7f: https://gerrit.sqlalchemy.org/c/sqlalchemy/alembic/+/3882
No problem, you already helped me a lot and I was finally able to make it work :)
Ok, as you prefer. In that case, I suggest making it clear in the doc that the table passed to these events is not the actual table but a dummy one (without all the columns and indexes attached).
Yes please, please open an issue "use case", otherwise it will get lost :)
That's definitely not possible without writing a dialect subclass. Can you provide specifics on this index reflection? In alembic, this should be done using a custom comparison plugin that would compare these additional indexes: https://alembic.sqlalchemy.org/en/latest/api/autogenerate.html#registering-a-comparison-function
good point |
Off the top of my head, my idea was that we could accept a function that gets passed the connection, the indexes loaded, and the table(s) reflected, and could modify the loaded indexes. Or alternatively, it could make sense to have something like this in the dialect events (so for every reflection method?). Probably not something for v2, but maybe it could stay there as an idea?
It depends on what the problem is. If index reflection is producing the wrong answer, we should just fix that. If it's omitting things because they rely on plugins, they should write custom comparators on the alembic side. Basically, I don't think reflection should have injectable alt-code.
Ok, I will do that 👍
The spatial indexes must be queried in a specific way with the SQLite dialect (see https://github.com/adrien-berchet/geoalchemy2/blob/fix_sqlite_reflection/geoalchemy2/alembic_helpers.py#L57) and added to the regular indexes. Nevertheless, these indexes should not be propagated during copies (that's why I use …
This could be very convenient indeed. In both cases (function or events), I could just use the function I wrote for the monkey patch, except it would not call the "normal behavior" at the beginning, since it would get the regular indexes as parameters.
Gerrit review https://gerrit.sqlalchemy.org/c/sqlalchemy/alembic/+/3882 has been merged. Congratulations! :)
Description

Add `before_drop` and `after_drop` events in `alembic.ddl.impl.DefaultImpl.drop_table()`.

Fixes #1037