Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Data in enum tables should be tracked in console-generated migrations #2817

Closed
shugydw opened this issue Sep 2, 2019 · 19 comments
Closed

Data in enum tables should be tracked in console-generated migrations #2817

shugydw opened this issue Sep 2, 2019 · 19 comments
Assignees
Labels
c/cli Related to CLI c/console Related to console k/enhancement New feature or improve an existing feature

Comments

@shugydw
Copy link

shugydw commented Sep 2, 2019

Version: Beta.6
Image: Docker CLI Migrations
Platform: K8s
Host: macOS Mojave

The new Enum table seems it requires at least one row - to be able to be connected foreign key constraint. It breaks migrations as well.

Here is an excerpt from the logs:

...
...
...
level=fatal msg="apply failed: [constraint-violation] the table \"my_enum_table\" 
cannot be used as an enum because the table must have at least one row
($.args[6].args)\r\nFile: 
'1567433643579_alter_table_public_my_enum_table_set_enum_true.up.yaml
...
...
...

Thanks!

@shugydw shugydw changed the title Enum tables break migrations for foreign key constraint Empty Enum tables cannot be used for foreign key constraints Sep 2, 2019
@lexi-lambda
Copy link
Contributor

GraphQL enums must contain at least one value, according to the specification:

Type Validation

Enum types have the potential to be invalid if incorrectly defined.

  1. An Enum type must define one or more unique enum values.

So insert at least one value into your enum table before marking it as an enum.

@rikinsk
Copy link
Member

rikinsk commented Sep 3, 2019

I believe the issue is mixing two separate things.

You can definitely have foreign key constraints involving an empty table (whether enum table or not). I am curious to know if there was any error which made you believe otherwise.

The root problem here seems to be that you cannot have an enum table with no rows.
This will cause a problem while applying migrations as generally inserting data is not tracked as a migration. Your problem will be fixed if you have an intermediate migration file between creating the enum table and setting it as enum table which inserts the enum values into the table.

The ideal flow for this would be as follows:

  • Create an enum table
  • Use the RawSQL tab of the console to insert the enum values and mark the insert as a migration.
  • Set the table as an enum.

This will ensure that applying migrations won't break.

Will be adding this to the docs as well.
Let me know if any other issues are faced.

@rikinsk rikinsk added the support/needs-more-info Needs more details/info/repro instructions label Sep 3, 2019
@shugydw
Copy link
Author

shugydw commented Sep 3, 2019

Thanks, @rikinsk

I've created an intermediate migration with an insert. Although I wanted to have the values in the enum table to be dynamically created (by the admin or someone who takes care of the backend), I guess I am fine with the migration as long as it works!

Although not related to this, but how do I automatically track the foreign key relations. I'm not sure if I have to pass any env, but I couldn't find any. It seems it is not automatic, as I read from #2726 ?

@rikinsk
Copy link
Member

rikinsk commented Sep 4, 2019

@shugydw The insert migration is a necessity as the GraphQL spec doesn't allow defining GraphQL enums without any values. I guess you could simply have a dummy placeholder value inserted via migrations to begin with which can then dynamically be updated.

Regarding automatically tracking foreign key relations, if I understand your requirement correctly, you can check out the thread at #1418 (and upvote to push it forward :)). I hope that is the use-case you were looking for.

@rikinsk rikinsk removed the support/needs-more-info Needs more details/info/repro instructions label Sep 4, 2019
@rikinsk rikinsk changed the title Empty Enum tables cannot be used for foreign key constraints empty enum tables break migrations Sep 4, 2019
@lexi-lambda
Copy link
Contributor

Responding to a comment @rikinsk13 made on slack:

Is it possible to not enforce the "needs to have at least one row" policy? Can we just ignore an enum table if it has no rows. ie: not generate the enum out of it unless it has values.

I’ve been thinking about this, too. It sounds like it’s been more of a hiccup for people than I expected it to be, so it would be nice to somehow avoid the need to worry about it. Here are my current thoughts:

  • Inserting or deleting rows from an enum table is a (GraphQL) schema change, so I think those changes should usually be in the migrations. The scenario @shugydw mentioned is admittedly a little different though, and I hadn’t really considered that use case.

  • If we allow empty enum tables, we need to decide what to do with columns that reference the enum table. Here are the approaches I’ve already considered so far:

    1. Just give columns that reference empty enums type text. This isn’t a good solution, since that type is woefully inaccurate, and the whole point of using enum tables is to add type safety. (Consider: every insertion into that value will fail, which is confusing.)

    2. Omit columns that reference empty enum tables from the schema. This doesn’t really work, either, since those fields have no default values (they can’t, since there are no valid values), so every insertion would always fail. Plus, it’s just confusing.

    3. Omit entire tables with empty enum columns from the schema. Consider that, if an enum table has no rows, then any tables with columns that reference the enum table can’t have any rows either (since the foreign key constraint is unsatisfiable). Therefore, including the table in the schema is rather pointless. On the other hand, this behavior sounds pretty confusing.

    4. Generate a GraphQL scalar instead of an enum when the enum table has no values. That is, instead of generating a type like enum foo_enum { one, two, ... }, we’d just generate scalar foo_enum.

      I think this is the most reasonable solution, since it lies the least about the type without confusingly dropping schema elements, but it’s still not ideal: general-purpose GraphQL typecheckers don’t know anything at all about scalars, so they accept everything. Which is the opposite of what we want! But I don’t know how to do anything better.

@0x777
Copy link
Member

0x777 commented Sep 6, 2019

So far in our metadata system, we don't have anything which fails silently and I think we should aim to keep that up unless it is detrimental.

Is it possible to not enforce the "needs to have at least one row" policy? Can we just ignore an enum table if it has no rows. ie: not generate the enum out of it unless it has values.

My biggest concern is that if we relax the schema generation, the same metadata could generate different GraphQL schema based on the data in the tables which I don't think is a good idea. Instead, adding such enums should fail and I completely agree with this:

Inserting or deleting rows from an enum table is a (GraphQL) schema change, so I think those changes should usually be in the migrations.

We should just improve the console experience for this:

  1. add an option to add data as a migration.
  2. and once a table is marked as an enum, make (1) the default for this table.

@lexi-lambda
Copy link
Contributor

So far in our metadata system, we don't have anything which fails silently and I think we should aim to keep that up unless it is detrimental.

I agree. This feels like a bit of a gray area, though, since there’s really nothing morally wrong with a type that is not inhabited, and there’s no way for it to break any kind of data integrity were it allowed. The only problem is that GraphQL does not provide any support for uninhabited types.

In general I share your instinct that reporting an error is much better than trying to implement some kind of DWIM interpretation. I don’t really like any of the “solutions” I listed, and I’d be reluctant to implement any of them. I just also sympathize with the constraint feeling a little artificial.

My biggest concern is that if we relax the schema generation, the same metadata could generate different GraphQL schema based on the data in the tables

I think this is already the case, as the actual values of the enum are not stored in the metadata, they’re stored in the tables. However, that’s admittedly a smaller kind of difference than changing a column to an entirely different type.

We should just improve the console experience for this

I agree!

@rikinsk
Copy link
Member

rikinsk commented Sep 6, 2019

#1766 has already been requested for similar purposes

@marionschleifer marionschleifer added c/server Related to server and removed k/question labels Sep 8, 2019
@Inch4Tk
Copy link

Inch4Tk commented Sep 22, 2019

To add to this, it would be great if
hasura migrate create "init" --from-server
would also include the current enum states.
Otherwise following this guide: https://blog.hasura.io/resetting-hasura-migrations/ will lead to constraint violation errors upon applying the migrations to a new environment.

My current workaround is to simply keep a separate file around with all the sql insert statements for inserting the enums and copy pasting that to the end of the created init file. So it is not the most urgent thing, as resetting migrations is something done rarely, but a good quality of life thing, so nobody will get confused after resetting his migrations.

@GavinRay97
Copy link
Member

Ran into this issue today, deploying with auto-migrations image from k8s caused the Hasura instance to fail due to the enums breaking everything.

If I could make a suggestion @lexi-lambda, I think one of the easier ways to solve this issue would be to bundle the rows in the enum table as part of the migration (essentially a seed), since the Hasura enum spec defines that there must exist at least one entry in the table for it to be valid.

@lexi-lambda
Copy link
Contributor

If I could make a suggestion @lexi-lambda, I think one of the easier ways to solve this issue would be to bundle the rows in the enum table as part of the migration (essentially a seed), since the Hasura enum spec defines that there must exist at least one entry in the table for it to be valid.

Yes, I agree. I think that we should treat DML changes to enum tables as if they were DDL changes (since they morally are), and they should go in the console-generated migrations. I think it makes sense to repurpose this issue as a feature request for that feature, so I’ll do that.

@lexi-lambda lexi-lambda added c/console Related to console k/enhancement New feature or improve an existing feature and removed c/server Related to server labels Dec 20, 2019
@lexi-lambda lexi-lambda changed the title empty enum tables break migrations Data in enum tables should be tracked in console-generated migrations Dec 20, 2019
@joshuarobs
Copy link

@Inch4Tk
Your suggestion inspired me to create this command that I run. It generates a list of SQL INSERT statements for all the tables in the public schema (minus _Migrations). It basically gets a list of all the table names, then cleans it up, and puts it into a command to be used.

I made this because I don't want to bother keeping track of changes of my rows. This is basically my hacky workaround while there is no official seeding functionality in Hasura.

I intend to use this to help migrate data from my local Docker server to a cloud one, mainly part of my DevOps pipeline. I haven't actually gotten there yet, but knowing that I have this command to automate obtaining data, makes it a little more easier to seed data to the production server.

docker exec docker_container_name pg_dump -U postgres -d postgres --column-inserts --data-only $(docker exec docker_container_name psql -q -A -t -U postgres -c "SELECT table_name FROM information_schema.tables WHERE table_schema = 'public' ORDER BY table_schema, table_name" | sed -e 's/^/-t /' | sed 's/$/ /' | tail -n +2 | tr -d '\n') | sed -n -e '/^INSERT INTO public/p' >> dump.sql

Usage: Replace docker_container_name with the name of the actual Postgres container in docker. (Run docker_ps and look for the one with the Image of postgres). Then run the command. After that, you'll get a dump file containing all the data in your public table. You can then do what @Inch4Tk suggested, copy and paste these INSERT statements at the end of the init.sql file.

I could probably extend the command so that it can do this automatically too, but I'm still trying to figure out my workflow, so I guess I'll settle with having to copy and paste it for now. If anyone wishes to extend the command, feel free to do so.

@shahidhk
Copy link
Member

Data in enum tables should be exported while exporting the migrations from server using cli

@joshuarobs
Copy link

@shahidhk Thanks for the update! I have a few questions regarding this:

  1. What is the command or flags in the command line to make this work and/or to disable it?
  2. I have some tables that are "enum-like", as in, they are essential static data, but consist of more than just two string fields of value and description. My server treats these tables similar to actual enums. Is there a way to have additional non-enum table rows being exported, e.g. through some flags?

@maxcan
Copy link

maxcan commented Apr 1, 2020

Second what @Inch4Tk has said. However, I do see one potential issue which is that sometimes the enum might grow overtime in a way that isn't desirable to include in squashed migrations. I think making an addition flag on enum tables: "Include Data in Migrations" or something would be a good middle ground.

@GavinRay97
Copy link
Member

Second what @Inch4Tk has said. However, I do see one potential issue which is that sometimes the enum might grow overtime in a way that isn't desirable to include in squashed migrations. I think making an addition flag on enum tables: "Include Data in Migrations" or something would be a good middle ground.

The ability to include data in migrations as a flag --include-data is present here in my PR:
#3614

@maxcan
Copy link

maxcan commented Apr 1, 2020

@GavinRay97 thanks! it would be nice if --include-data could be restricted to only enum tables or just take a list of tables to include. We dont necessarily want to include all the data we've put in during development.

@tirumaraiselvan
Copy link
Contributor

The console side of the issue is solved by #4993 and will be out in the next release (v1.4). Since this issue is primarily a discussion around this I am closing it.

The CLI side of the issue (enum rows while initializing migrations using --from-server flag) was originally reported here: #4222 and is a better place to continue the discussion.

@Yash-hegde
Copy link

Is there any way in which I can directly mark a table as an enum (using migrations) without having to do it manually via the console?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
c/cli Related to CLI c/console Related to console k/enhancement New feature or improve an existing feature
Projects
None yet
Development

No branches or pull requests