managing seed or data-only migrations #2431
@marionschleifer Currently I seed data via a pipeline that waits until Hasura becomes ready (which means the schema migration is done, so I can insert into the database) and then executes INSERT statements. Could I ask how seed or data-only migrations are supposed to be managed?
@marionschleifer Thank you for the prompt response :)
If I understand what you mentioned correctly, then migrations would look like this, where
The reason I want seed data is to easily populate data in a local development environment. So seed data must be prevented from being fed to the production database in our case. But satisfying this with the migration model introduces, at least in my project, an inconvenient workflow. Let's say I add a new migration named
While I want to just simply execute
In my opinion, "seed" should be an independent concept from migration. That's because migrations can be applied to both dev and production environments, whereas seeds, at least in my case, should only be applied to the dev environment. For example, Prisma provides
Therefore I suggest a new approach: users add and manage seed SQL files under
And the command below would execute every .sql file in the seeds folder in alphanumeric order:

```
hasura seed
```

What's more, when using the graphql-engine.cli-migrations docker image, attaching the seeds directory as a volume would apply the seeds automatically at startup:

```
# other options like `-p` are omitted for brevity
docker run \
  -v hasura/migrations:/hasura-migrations \
  -v hasura/seeds:/hasura-seeds \
  hasura/graphql-engine:v1.0.0-alpha42.cli-migrations
```

Or, rather than attaching two volumes, just attach a single directory which contains both:

```
docker run \
  -v hasura:/hasura \
  hasura/graphql-engine:v1.0.0-alpha42.cli-migrations
```

Applying seeds via graphql-engine.cli-migrations matters, because I frequently remove docker volumes and want to re-initialize Postgres and Hasura to the desired "state" (metadata, schema, and seed data) by just executing a single command.
So this would allow users to easily build a local development workflow, while still providing easy migration to production. How do you feel?
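The alphanumeric-order behaviour proposed above can be sketched in Go. The `orderSeedFiles` helper is invented for this sketch (the proposed `hasura seed` command does not exist yet); a real implementation would read the directory and then execute each file against Postgres.

```go
package main

import (
	"fmt"
	"path/filepath"
	"sort"
	"strings"
)

// orderSeedFiles keeps only the .sql files and sorts them alphanumerically —
// the order in which the proposed `hasura seed` command would execute them.
func orderSeedFiles(names []string) []string {
	var sqls []string
	for _, n := range names {
		if strings.EqualFold(filepath.Ext(n), ".sql") {
			sqls = append(sqls, n)
		}
	}
	sort.Strings(sqls)
	return sqls
}

func main() {
	files := []string{"02_orders.sql", "README.md", "01_users.sql", "10_misc.sql"}
	fmt.Println(orderSeedFiles(files)) // prints: [01_users.sql 02_orders.sql 10_misc.sql]
}
```

Zero-padded prefixes (`01_`, `02_`, …) keep the lexicographic sort aligned with the intended execution order, which is why most migration tools recommend them.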
@rikinsk @shahidhk Based on @jjangga0214's comments, do you think we can solve this by allowing a user to run an arbitrary migration/SQL file without running it as a migration that updates the migration state on the database?
This would make it easy to run a seed-data-style migration, and we wouldn't need to explicitly code a seed directory into the CLI command flow.
Why isn't executing an INSERT SQL after the migrations are run good enough? Any reason for this to be a migration if it's not to migrate from one system to another?
Edit: I might have responded without understanding the solutions and requirements properly. @coco98's solution is basically the same as what I was going for. But I don't see how it would help to add the seed data automatically at startup, like @jjangga0214's solution does.
Any news?
For the CLI migrations image, we can provide another environment variable. Then the user can keep one directory for schema/metadata migrations and another for seed data.
I'd like to suggest an additional proposal for seed data management.

```
hasura seed apply # or `hasura seed push`?
```

This command reads seed data and inserts it into Postgres.

```
hasura seed create # or `hasura seed pull`?
```

This command reads existing data from Postgres and generates seed data in the format the user wants (e.g. chosen by a flag), possibly one of SQL, JSON, or GraphQL. It should probably write the seed data to the file system, or print to stdout for pipelines.
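A minimal sketch of what the proposed `hasura seed create` export could emit in SQL format. The `insertStatement` helper is a toy invented for this sketch, not a real CLI function; it only handles string values with basic single-quote escaping, whereas real code would need proper type-aware SQL rendering.

```go
package main

import (
	"fmt"
	"strings"
)

// insertStatement renders one row as a SQL INSERT statement — a toy version
// of what `hasura seed create` could emit when exporting existing data.
func insertStatement(table string, cols []string, vals []string) string {
	quoted := make([]string, len(vals))
	for i, v := range vals {
		// Double embedded single quotes, per standard SQL string literals.
		quoted[i] = "'" + strings.ReplaceAll(v, "'", "''") + "'"
	}
	return fmt.Sprintf("INSERT INTO %s (%s) VALUES (%s);",
		table, strings.Join(cols, ", "), strings.Join(quoted, ", "))
}

func main() {
	fmt.Println(insertStatement("users", []string{"id", "name"}, []string{"1", "Ada"}))
	// prints: INSERT INTO users (id, name) VALUES ('1', 'Ada');
}
```

Statements generated this way could be written straight into the proposed seeds directory, so `hasura seed apply` and `hasura seed create` would round-trip through the same files.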
@jjangga0214 thank you for your observations. We have labelled it as
I am also in the process of trying to come up with a solution for this at the moment. I think the optimal solution here is to use the
My initial thought was to implement calling the
Have a look at this:
I figure the way it would work is that, either through CLI flags or the Console, users can select whether they want to track just the schema, or the data plus the schema, in migrations. This may be a bit of an undertaking, because it is going to require (I assume):
A short-term solution would be to modify the Go CLI to accept a flag for whether or not it should dump the data as well. Modify:
graphql-engine/cli/commands/migrate_create.go Lines 85 to 98 in f8ffbda
graphql-engine/cli/migrate/database/hasuradb/schema_dump.go Lines 8 to 18 in f8ffbda
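As a rough illustration of that short-term idea, a hypothetical `--with-data` flag could be translated into pg_dump arguments as below. The flag name and the `buildDumpOpts` helper are invented for this sketch; the option strings (`--inserts`, `--schema-only`, `--schema`, `-t`, `--no-owner`, `--no-acl`) come from pg_dump itself.

```go
package main

import (
	"flag"
	"fmt"
)

// buildDumpOpts translates the hypothetical --with-data flag (plus schema and
// table selections) into pg_dump-style options.
func buildDumpOpts(includeData bool, schemas, tables []string) []string {
	opts := []string{"--no-owner", "--no-acl"}
	if includeData {
		// Dump data as portable INSERT statements alongside the DDL.
		opts = append(opts, "--inserts")
	} else {
		opts = append(opts, "--schema-only")
	}
	for _, t := range tables {
		opts = append(opts, "-t", t)
	}
	for _, s := range schemas {
		opts = append(opts, "--schema", s)
	}
	return opts
}

func main() {
	withData := flag.Bool("with-data", false, "also dump table data as INSERT statements")
	flag.Parse()
	fmt.Println(buildDumpOpts(*withData, []string{"public"}, nil))
}
```

The CLI would then pass these options through to the server's schema-dump API, exactly as the existing schema-only path does.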
I am not sure who best to tag about this. @marionschleifer @coco98 |
I do not have any experience writing Go, but here is what I imagine the function would roughly look like (barring the fact that if we are going to pass config here, it should probably be grouped into a struct):

```go
func (h *HasuraDB) ExportSchemaDump(schemaNames []string, tableNames []string, includeData bool) ([]byte, error) {
	// For whatever it is worth, I would actually pass these flags as
	// --no-owner and --no-acl because they carry more immediate semantic
	// meaning. I had to look them up in the pg_dump documentation to find
	// out what they did.
	opts := []string{"--no-owner", "--no-acl"}
	if includeData {
		opts = append(opts, "--inserts")
	} else {
		opts = append(opts, "--schema-only")
	}
	// Restrict the dump to specific tables, if any were requested.
	// (Ranging over a nil/empty slice is a no-op, so no guard is needed.)
	for _, table := range tableNames {
		opts = append(opts, "-t", table)
	}
	for _, s := range schemaNames {
		opts = append(opts, "--schema", s)
	}
	query := SchemaDump{
		Opts:        opts,
		CleanOutput: true,
	}
	resp, body, err := h.sendSchemaDumpQuery(query)
	if err != nil {
		return nil, err
	}
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("schema dump failed: %s", string(body))
	}
	return body, nil
}
```
@shahidhk could you please help @GavinRay97 to get started on this? ✨ |
@lexi-lambda This has cross-cutting concerns with #2817 and would potentially take care of that too.
I added functionality for seed scripts into the CLI, but have some minor details to work out. I have not heard back from the Hasura team on this, so it may be that they are on holiday or that it's not super high on the roadmap. I have never written Go before and am unsure how much more needs to be done to properly integrate it, so no promises on if/when it will get merged into core. It would be really neat if it did, though.
I would actually love an option to update the seed data from the Hasura console tab: adjust the rows and, after I've updated a couple of them, have the opportunity to save them as a migration. So instead of writing a new SQL statement for every small change, I would prefer to take a database snapshot from time to time. Any ideas?
Often one would like to add some data into tables as part of the db init process. We should document how to achieve that.
Edit: The above task is done. There are further suggestions for managing seed data migrations in the comments. Leaving this issue open for that discussion.