Add JSON output support to pg-schema-diff plan. #175

tonycosentini · 2024-09-18T05:03:50Z

Description

This adds support for different output formats when running the plan subcommand. Currently it only supports pretty printing (which is what already exists today) or JSON.

I rarely/never write golang so this PR was more or less a first attempt to get a discussion going, happy to refactor as needed.

Motivation

I'm working on a tool written in Python that is using the pg-schema-diff to generate schema plans. It would be great to have a structured output option so I can just parse the output in Python.

Alternatively I can build some kind of bindings for this library, that would require waaay more overhead than just being able to parse the JSON output.

I think this would make the tool much more versatile.

Testing

Just manual testing. I'm happy to add test coverage, but I didn't see any other CLI tests (besides some options parsing tests) so I wasn't really sure where to start.

CLAassistant · 2024-09-18T05:03:57Z

All committers have signed the CLA.

bplunkett-stripe

Looks really solid! A couple tiny nits. Thanks for implementing this :)

bplunkett-stripe · 2024-09-19T02:21:46Z

cmd/pg-schema-diff/plan_cmd.go

+		*e = outputFormat(v)
+		return nil
+	default:
+		return errors.New(`must be one of "pretty" or "json"`)


nit: include the unknown value the user used. Also let's group with the other plan flag structs

It looks like Cobra prepends this. I added the value the user used and it came out like this:

Error: invalid argument "test" for "--output-format" flag: must be one of "pretty" or "json", value passed in was "test"

I'm going to leave it out, but if you think it's useful because the error might be used outside of Cobra let me know and I can add it back in.

Here's how it looks by default:

Error: invalid argument "test" for "--output-format" flag: must be one of "pretty" or "json"

bplunkett-stripe · 2024-09-19T02:22:21Z

cmd/pg-schema-diff/plan_cmd.go

@@ -139,6 +172,8 @@ func createPlanFlags(cmd *cobra.Command) *planFlags {
 		),
 	)

+	cmd.Flags().Var(&flags.outputFormat, "output-format", "Change the output format for what is printed. Defaults to pretty-printed human-readable output. (options: pretty, json)")


nit: set default value to pretty and make a required param

There's a default value now, but it has to be on the struct itself, when using Var there is no way to set a default.

I also didn't mark it as required because when it's marked as required via MarkFlagRequired the CLI will always require something to be passed in, even though the struct instance already has it set.

Makes sense!

bplunkett-stripe · 2024-09-19T02:24:44Z

cmd/pg-schema-diff/plan_cmd.go

-		fmt.Printf("\n%s\n", header("Generated plan"))
-		fmt.Println(planToPrettyS(plan))
+
+		if planFlags.outputFormat == "" || planFlags.outputFormat == outputFormatPretty {


Let's not support an empty string for output format (once the param is default and required). Let's also error out if the enum value is not known, i.e., if neither if condition matches, then return an error.

Alternatively (up to you!), you could add to the output format type an additional property called print:

{ name string print func(plan) } .... ... planFlags.outputFormat.print(plan)

Then we wouldn't need any sort of conditional switchhere.

I reworked this as suggested - now each output format has a function to convert the plan to an output string.

Thanks for the suggestion, it's definitely much cleaner.

tonycosentini · 2024-09-19T13:05:26Z

@bplunkett-stripe I think I addressed everything - apologies if something is off, just let me know and I can sort it out.

Also thanks for jumping on this, much appreciated.

bplunkett-stripe · 2024-09-19T17:14:27Z

cmd/pg-schema-diff/plan_cmd.go

@@ -78,6 +77,11 @@ type (
 		targetDatabaseDSN string
 	}

+	outputFormat struct {
+		identifier            string
+		convertToOutputString func(diff.Plan) string


bplunkett-stripe · 2024-09-19T17:16:48Z

cmd/pg-schema-diff/plan_cmd.go

@@ -139,6 +172,8 @@ func createPlanFlags(cmd *cobra.Command) *planFlags {
 		),
 	)

+	cmd.Flags().Var(&flags.outputFormat, "output-format", "Change the output format for what is printed. Defaults to pretty-printed human-readable output. (options: pretty, json)")


Makes sense!

bplunkett-stripe · 2024-09-19T17:18:24Z

@tonycosentini before merging, could you paste some example outputs of a "pretty" plan and a "json" plan. Just want to make sure those look okay! (Currently CLI tests are just by-hand right now, so the current practice to just to include some example runs in your PR)!

tonycosentini · 2024-09-20T02:56:13Z

@bplunkett-stripe here are some examples

Normal output without specifying any format (defaults to pretty):

➜  pg-schema-diff git:(json_output) pg-schema-diff plan --dsn postgresql://postgres@localhost/empty --schema-dir ../test-schema.sql 
################################ Generated plan ################################
1. CREATE SCHEMA "sample_code_review_schema";
	-- Statement Timeout: 3s

2. CREATE TABLE "sample_code_review_schema"."test" (
	"id" bigint NOT NULL,
	"created_at" timestamp without time zone,
	"updated_at" timestamp without time zone,
	"product" character varying(255) COLLATE "pg_catalog"."default",
	"cost" numeric,
	"user_name" character varying(100) COLLATE "pg_catalog"."default",
	"email" character varying(255) COLLATE "pg_catalog"."default"
);
	-- Statement Timeout: 3s

3. CREATE UNIQUE INDEX CONCURRENTLY test_pkey ON sample_code_review_schema.test USING btree (id);
	-- Statement Timeout: 20m0s
	-- Lock Timeout: 3s

4. ALTER TABLE "sample_code_review_schema"."test" ADD CONSTRAINT "test_pkey" PRIMARY KEY USING INDEX "test_pkey";
	-- Statement Timeout: 3s

5. CREATE INDEX CONCURRENTLY test_product_idx ON sample_code_review_schema.test USING btree (product) WHERE (cost > (100)::numeric);
	-- Statement Timeout: 20m0s
	-- Lock Timeout: 3s

6. DROP TABLE "public"."test";
	-- Statement Timeout: 20m0s
	-- Lock Timeout: 3s
	-- Hazard DELETES_DATA: Deletes all rows in the table (and the table itself)

Passing pretty as output format:

➜  pg-schema-diff git:(json_output) pg-schema-diff plan --dsn postgresql://postgres@localhost/empty --schema-dir ../test-schema.sql --output-format pretty
################################ Generated plan ################################
1. CREATE SCHEMA "sample_code_review_schema";
	-- Statement Timeout: 3s

2. CREATE TABLE "sample_code_review_schema"."test" (
	"id" bigint NOT NULL,
	"created_at" timestamp without time zone,
	"updated_at" timestamp without time zone,
	"product" character varying(255) COLLATE "pg_catalog"."default",
	"cost" numeric,
	"user_name" character varying(100) COLLATE "pg_catalog"."default",
	"email" character varying(255) COLLATE "pg_catalog"."default"
);
	-- Statement Timeout: 3s

3. CREATE UNIQUE INDEX CONCURRENTLY test_pkey ON sample_code_review_schema.test USING btree (id);
	-- Statement Timeout: 20m0s
	-- Lock Timeout: 3s

4. ALTER TABLE "sample_code_review_schema"."test" ADD CONSTRAINT "test_pkey" PRIMARY KEY USING INDEX "test_pkey";
	-- Statement Timeout: 3s

5. CREATE INDEX CONCURRENTLY test_product_idx ON sample_code_review_schema.test USING btree (product) WHERE (cost > (100)::numeric);
	-- Statement Timeout: 20m0s
	-- Lock Timeout: 3s

6. DROP TABLE "public"."test";
	-- Statement Timeout: 20m0s
	-- Lock Timeout: 3s
	-- Hazard DELETES_DATA: Deletes all rows in the table (and the table itself)

JSON output:

➜  pg-schema-diff git:(json_output) pg-schema-diff plan --dsn postgresql://postgres@localhost/empty --schema-dir ../test-schema.sql --output-format json
{
  "statements": [
    {
      "ddl": "CREATE SCHEMA \"sample_code_review_schema\"",
      "timeout_ms": 3000,
      "lock_timeout_ms": 3000,
      "hazards": null
    },
    {
      "ddl": "CREATE TABLE \"sample_code_review_schema\".\"test\" (\n\t\"id\" bigint NOT NULL,\n\t\"created_at\" timestamp without time zone,\n\t\"updated_at\" timestamp without time zone,\n\t\"product\" character varying(255) COLLATE \"pg_catalog\".\"default\",\n\t\"cost\" numeric,\n\t\"user_name\" character varying(100) COLLATE \"pg_catalog\".\"default\",\n\t\"email\" character varying(255) COLLATE \"pg_catalog\".\"default\"\n)",
      "timeout_ms": 3000,
      "lock_timeout_ms": 3000,
      "hazards": null
    },
    {
      "ddl": "CREATE UNIQUE INDEX CONCURRENTLY test_pkey ON sample_code_review_schema.test USING btree (id)",
      "timeout_ms": 1200000,
      "lock_timeout_ms": 3000,
      "hazards": null
    },
    {
      "ddl": "ALTER TABLE \"sample_code_review_schema\".\"test\" ADD CONSTRAINT \"test_pkey\" PRIMARY KEY USING INDEX \"test_pkey\"",
      "timeout_ms": 3000,
      "lock_timeout_ms": 3000,
      "hazards": null
    },
    {
      "ddl": "CREATE INDEX CONCURRENTLY test_product_idx ON sample_code_review_schema.test USING btree (product) WHERE (cost \u003e (100)::numeric)",
      "timeout_ms": 1200000,
      "lock_timeout_ms": 3000,
      "hazards": null
    },
    {
      "ddl": "DROP TABLE \"public\".\"test\"",
      "timeout_ms": 1200000,
      "lock_timeout_ms": 3000,
      "hazards": [
        {
          "type": "DELETES_DATA",
          "message": "Deletes all rows in the table (and the table itself)"
        }
      ]
    }
  ],
  "current_schema_hash": "620ef80a9065c439"
}

Invalid output format:

➜  pg-schema-diff git:(json_output) pg-schema-diff plan --dsn postgresql://postgres@localhost/empty --schema-dir ../test-schema.sql --output-format invalid
Error: invalid argument "invalid" for "--output-format" flag: must be one of "pretty" or "json"
Usage:
  pg-schema-diff plan [flags]

Aliases:
  plan, diff

Flags:
      --data-pack-new-tables                     If set, will data pack new tables in the plan to minimize table size (re-arranges columns). (default true)
      --disable-plan-validation                  If set, will disable plan validation. Plan validation runs the migration against a temporarydatabase with an identical schema to the original, asserting that the generated plan actually migrates the schema to the desired target.
      --dsn string                               Connection string for the database (DB password can be specified through PGPASSWORD environment variable)
      --exclude-schema stringArray               Exclude the specified schema in the plan
  -h, --help                                     help for plan
      --include-schema stringArray               Include the specified schema in the plan
  -s, --insert-statement stringArray             'index=<index> statement="<statement>" timeout=<duration> lock_timeout=<duration>' values. Will insert the statement at the index in the generated plan. This follows normal insert semantics. Example: -s 'index=1 statement="SELECT pg_sleep(5)" timeout=5s lock_timeout=1s'
  -l, --lock-timeout-modifier stringArray        list of 'pattern="<regex>" timeout=<duration>', where if a statement matches the regex, the statement will have the target lock timeout. If multiple regexes match, the latest regex will take priority. Example: -t 'pattern="CREATE TABLE" timeout=5m'
      --output-format outputFormat               Change the output format for what is printed. Defaults to pretty-printed human-readable output. (options: pretty, json) (default pretty)
      --schema-dir stringArray                   Directory of .SQL files to use as the schema source (can be multiple). Use to generate a diff between the target database and the schema in this directory.
      --schema-source-dsn string                 DSN for the database to use as the schema source. Use to generate a diff between the target database and the schema in this database.
  -t, --statement-timeout-modifier stringArray   list of 'pattern="<regex>" timeout=<duration>', where if a statement matches the regex, the statement will have the target statement timeout. If multiple regexes match, the latest regex will take priority. Example: -t 'pattern="CREATE TABLE" timeout=5m'

I kind of think the casing on the JSON keys is a bit strange, I think ideally it would be snake case or camel case, I rarely (never?) see JSON with capitalized key names. However it seems like this isn't super easy to tweak with the built-in JSON library, I'm curious if you think it's worth reformatting it. edit: I just pushed another commit that cleans up the formatting a bit - the keys are now all lowercase + the times are converted to milliseconds, before it was nanoseconds which was a bit confusing. The example has been updated.

…illiseconds.

bplunkett-stripe · 2024-09-21T00:23:39Z

Sorry I hit the merge button for you! You don't have write permissions on the repo, so you can't hit the merge button unfortunately. The commit is authored by you though! 🎉

This is a very nice improvement to the repo! Thanks for making it.

Add JSON output support to pg-schema-diff plan.

aleclarson · 2024-11-27T21:48:06Z

This is exactly what I needed. Thanks @tonycosentini!

Add JSON output support to pg-schema-diff plan.

4b855ac

tonycosentini force-pushed the json_output branch from 4b4fdd6 to 4b855ac Compare September 18, 2024 06:48

bplunkett-stripe requested changes Sep 19, 2024

View reviewed changes

Code review follow up.

e796397

tonycosentini requested a review from bplunkett-stripe September 19, 2024 13:04

bplunkett-stripe previously approved these changes Sep 19, 2024

View reviewed changes

Update JSON formatting - use lowercase key names + convert times to m…

f95a4b0

…illiseconds.

tonycosentini dismissed bplunkett-stripe’s stale review via f95a4b0 September 20, 2024 03:08

bplunkett-stripe approved these changes Sep 20, 2024

View reviewed changes

bplunkett-stripe merged commit f036cb2 into stripe:main Sep 21, 2024
7 checks passed

tonycosentini deleted the json_output branch October 3, 2024 00:20

aleclarson pushed a commit to pg-nano/pg-schema-diff that referenced this pull request Nov 11, 2024

Add JSON output support to pg-schema-diff plan. (stripe#175)

ae17d9f

Add JSON output support to pg-schema-diff plan.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add JSON output support to pg-schema-diff plan. #175

Add JSON output support to pg-schema-diff plan. #175

tonycosentini commented Sep 18, 2024

CLAassistant commented Sep 18, 2024 •

edited

Loading

bplunkett-stripe left a comment

bplunkett-stripe Sep 19, 2024

tonycosentini Sep 19, 2024 •

edited

Loading

bplunkett-stripe Sep 19, 2024

tonycosentini Sep 19, 2024

bplunkett-stripe Sep 19, 2024

bplunkett-stripe Sep 19, 2024

tonycosentini Sep 19, 2024

tonycosentini commented Sep 19, 2024

bplunkett-stripe Sep 19, 2024

bplunkett-stripe Sep 19, 2024

bplunkett-stripe commented Sep 19, 2024

tonycosentini commented Sep 20, 2024 •

edited

Loading

bplunkett-stripe commented Sep 21, 2024

aleclarson commented Nov 27, 2024

Add JSON output support to pg-schema-diff plan. #175

Add JSON output support to pg-schema-diff plan. #175

Conversation

tonycosentini commented Sep 18, 2024

Description

Motivation

Testing

CLAassistant commented Sep 18, 2024 • edited Loading

bplunkett-stripe left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

tonycosentini Sep 19, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

tonycosentini commented Sep 19, 2024

Choose a reason for hiding this comment

Choose a reason for hiding this comment

bplunkett-stripe commented Sep 19, 2024

tonycosentini commented Sep 20, 2024 • edited Loading

bplunkett-stripe commented Sep 21, 2024

aleclarson commented Nov 27, 2024

CLAassistant commented Sep 18, 2024 •

edited

Loading

tonycosentini Sep 19, 2024 •

edited

Loading

tonycosentini commented Sep 20, 2024 •

edited

Loading