
Import an existing schema #51

Open
jmporchet opened this issue Aug 12, 2024 · 7 comments
Labels
enhancement New feature or request

Comments

@jmporchet

Feature request

Is your feature request related to a problem? Please describe.

I would like to load an existing schema into the tool in order to work on it

Describe the solution you'd like

Upload an SQL file that contains the schema of my database.
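(For context, that would be the kind of file pg_dump --schema-only produces; roughly something like the sketch below, where the table and column names are just placeholders, not my real schema:)

-- Hypothetical excerpt of a schema-only dump (placeholder names)
CREATE TABLE public.customers (
    id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    name text NOT NULL,
    created_at timestamptz NOT NULL DEFAULT now()
);

CREATE TABLE public.orders (
    id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    customer_id bigint NOT NULL REFERENCES public.customers (id),
    total numeric(10,2)
);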

Describe alternatives you've considered

The LLM suggests using the importCsv and exportCsv methods, but the CSV upload functionality is meant for creating a new database.

Additional context

I would love to be able to talk about advanced use cases like pg functions, triggers, vectors, etc.

@jmporchet jmporchet added the enhancement New feature or request label Aug 12, 2024
@alphanumericaracters

OK, when I export from Supabase as CSV using:

SELECT 
    table_schema,
    table_name,
    column_name,
    ordinal_position AS position,
    data_type,
    is_nullable,
    column_default
FROM 
    information_schema.columns
WHERE 
    table_schema = 'public'
ORDER BY 
    table_name, ordinal_position;
  • It says that it is an Excel file (and it is not an .xls or .xlsm, etc.).

[Screenshot: Postgres Sandbox, 2024-08-26 02:27]

  • There is no way I can talk with the AI about my existing DB architecture, looking for features and optimizations.
  • It would be great to be able to load this CSV with the schema (as I’ve shown) to populate it with 5 or 10 example data entries, run some tests, explore alternative relationships, and so on.
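(For what it's worth, here is one way to get that CSV out with psql, assuming a direct connection to the database; the output path is a placeholder:)

-- Export the schema listing to a client-side CSV file from psql (path is a placeholder)
\copy (SELECT table_schema, table_name, column_name, ordinal_position AS position, data_type, is_nullable, column_default FROM information_schema.columns WHERE table_schema = 'public' ORDER BY table_name, ordinal_position) TO 'schema_columns.csv' CSV HEADER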

With all the code, categories, roles, relationships, schemas, functions, triggers, etc., it becomes difficult to spot optimization mistakes or missing relationships, or to search for features.

Especially when working alone.

I am currently working with a group of social workers, aiming to map the multiple dimensions of a social group, family, and the individual situation of each member. The goal is to create new studies and projects that help improve the quality of life for people in vulnerable situations with limited resources and members with disabilities.

@amerryma

So, what I'm hoping to do is actually embed postgres-new into my development workflow. I want it to be able to look at my current supabase schema for whatever branch I have checked out. I then want to be able to iterate on a new feature quickly within a local running copy of postgres-new. Then when I'm satisfied with the schema, I want to just export the "diff" as a migration and commit the change. This is why it would be important for me to be able to load an existing schema (and even test data).
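(Purely as an illustration, this is the kind of "diff" migration I would want it to spit out so I could commit it under supabase/migrations; the table and column here are hypothetical:)

-- Hypothetical generated migration: only the statements that differ from the branch's current schema
ALTER TABLE public.orders
    ADD COLUMN shipped_at timestamptz;

CREATE INDEX IF NOT EXISTS orders_shipped_at_idx
    ON public.orders (shipped_at);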

@jmporchet
Author

OK, when I export from Supabase as CSV using:

SELECT 
    table_schema,
    table_name,
    column_name,
    ordinal_position AS position,
    data_type,
    is_nullable,
    column_default
FROM 
    information_schema.columns
WHERE 
    table_schema = 'public'
ORDER BY 
    table_name, ordinal_position;
* It says that it is an Excel file (and it is not an .xls or .xlsm, etc.).

I think the CSV upload is only meant for uploading tabular data, and postgres.new will create a database schema from it.
It's not clear enough from the UI or from the docs, IMHO.

@alphanumericaracters

Okay, I kept trying different options to see if I could achieve what I needed. The first attempt was to ask Mr. Postgre (that's what I affectionately call it) if it accepted Markdown notation, to which it responded yes. Then I created a function with a SELECT so that it could run the query and I could copy the result as Markdown.

However, when I used the prompt 'create all those tables and their relationships', it threw an error because the tables were not created in the correct order (tables without relationships first, then the ones with relationships).

[Image: a_solution]

-- Lists every column in the public schema, plus basic REFERENCES info pulled from information_schema
CREATE OR REPLACE FUNCTION get_schema_info()
RETURNS TABLE (
    schema_name TEXT,
    table_name TEXT,
    column_name TEXT,
    ordinal_position INTEGER,
    data_type TEXT,
    is_nullable BOOLEAN,
    foreign_key_info TEXT
) AS $$
BEGIN
    RETURN QUERY
    SELECT 
        c.table_schema::TEXT AS schema_name,
        c.table_name::TEXT,
        c.column_name::TEXT,
        c.ordinal_position::INTEGER,
        c.data_type::TEXT,
        (c.is_nullable = 'YES')::BOOLEAN AS is_nullable,
        COALESCE(
            'REFERENCES ' || fk.foreign_table_name || '(' || fk.foreign_column_name || ')',
            NULL
        ) AS foreign_key_info
    FROM 
        information_schema.columns c
    LEFT JOIN (
        SELECT
            kcu.table_schema,
            kcu.table_name,
            kcu.column_name,
            ccu.table_name AS foreign_table_name,
            ccu.column_name AS foreign_column_name
        FROM 
            information_schema.key_column_usage kcu
        JOIN 
            information_schema.constraint_column_usage ccu
            ON kcu.constraint_name = ccu.constraint_name
            AND kcu.constraint_schema = ccu.constraint_schema
        WHERE 
            kcu.table_schema = 'public'
    ) fk ON c.table_schema = fk.table_schema 
        AND c.table_name = fk.table_name 
        AND c.column_name = fk.column_name
    WHERE 
        c.table_schema = 'public'
    ORDER BY 
        c.table_name, 
        c.ordinal_position;
END;
$$ LANGUAGE plpgsql;
SELECT * FROM get_schema_info();
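(One caveat I noticed afterwards: information_schema.constraint_column_usage also covers primary-key and unique constraints, so those can show up as spurious "REFERENCES" entries. A possible refinement, untested here, is to keep only real foreign keys by joining table_constraints, for example as a drop-in replacement for the fk subquery:)

-- Only true foreign keys: join table_constraints and filter on constraint_type
SELECT
    kcu.table_schema,
    kcu.table_name,
    kcu.column_name,
    ccu.table_name AS foreign_table_name,
    ccu.column_name AS foreign_column_name
FROM information_schema.key_column_usage kcu
JOIN information_schema.table_constraints tc
    ON kcu.constraint_name = tc.constraint_name
    AND kcu.constraint_schema = tc.constraint_schema
JOIN information_schema.constraint_column_usage ccu
    ON tc.constraint_name = ccu.constraint_name
    AND tc.constraint_schema = ccu.constraint_schema
WHERE tc.constraint_type = 'FOREIGN KEY'
    AND kcu.table_schema = 'public';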

The second attempt involved ordering the function's output so that tables without relationships come first.

However, I forgot that some tables are purely relationship (join) tables themselves, and an error message appeared. So, until the ban is lifted, I won't be able to contribute further to the cause.

[Screenshot: Postgres Sandbox, 2024-08-26 22:14]

Next time, I'll try adding another level of ordering to the function. For example:

  • The barrios table has a foreign key referencing the localidad table, which in turn has a foreign key referencing the provincia table.

  • The creation order should be specified in Markdown so that Mr. Postgre doesn't encounter issues during SQL execution.

Therefore, the provincia table should be created before the localidad table, and the localidad table should be created before the barrios table.
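(In other words, something like this; the column names are just placeholders:)

-- Parent tables first, then the tables that reference them
CREATE TABLE provincia (
    id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    nombre text NOT NULL
);

CREATE TABLE localidad (
    id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    nombre text NOT NULL,
    provincia_id bigint NOT NULL REFERENCES provincia (id)
);

CREATE TABLE barrios (
    id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    nombre text NOT NULL,
    localidad_id bigint NOT NULL REFERENCES localidad (id)
);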

I hope this information helps others to know 'how not to do it,' which is just as important as, if not more important than, 'how to do it right.'

@alphanumericaracters

Okay, the ban is gone, and I tested it with this function and SELECT statement.

It ran perfectly and returned the table as expected; I copied it as Markdown, and Mr. Postgre ran the code without any issues.

So I prompted: 'Create all the tables and relations' (then pasted the Markdown in the same prompt).

CREATE OR REPLACE FUNCTION get_schema_info()
RETURNS TABLE (
    schema_name TEXT,
    table_name TEXT,
    column_name TEXT,
    ordinal_position INTEGER,
    data_type TEXT,
    is_nullable BOOLEAN,
    foreign_key_info TEXT
) AS $$
BEGIN
    -- Create a temporary table to store the dependency information
    CREATE TEMPORARY TABLE IF NOT EXISTS temp_table_dependencies (
        tbl_name TEXT,
        depends_on TEXT,
        dependency_level INTEGER
    );

    -- Populate the temporary table with initial dependency information
    INSERT INTO temp_table_dependencies (tbl_name, depends_on)
    SELECT DISTINCT
        kcu.table_name AS tbl_name,
        ccu.table_name AS depends_on
    FROM 
        information_schema.key_column_usage kcu
    JOIN 
        information_schema.constraint_column_usage ccu
        ON kcu.constraint_name = ccu.constraint_name
        AND kcu.constraint_schema = ccu.constraint_schema
    WHERE 
        kcu.table_schema = 'public'
    AND kcu.table_name != ccu.table_name;  -- Exclude self-references

    -- Calculate dependency levels
    WITH RECURSIVE dependency_cte AS (
        SELECT 
            td.tbl_name, 
            td.depends_on, 
            1 AS level
        FROM 
            temp_table_dependencies td
        UNION ALL
        SELECT 
            d.tbl_name, 
            cte.depends_on, 
            cte.level + 1
        FROM 
            temp_table_dependencies d
        JOIN 
            dependency_cte cte ON d.depends_on = cte.tbl_name
    )
    UPDATE temp_table_dependencies td
    SET dependency_level = subquery.max_level
    FROM (
        SELECT tbl_name, MAX(level) AS max_level
        FROM dependency_cte
        GROUP BY tbl_name
    ) AS subquery
    WHERE td.tbl_name = subquery.tbl_name;

    -- Set dependency_level to 0 for tables not in the dependency chain
    UPDATE temp_table_dependencies
    SET dependency_level = 0
    WHERE dependency_level IS NULL;

    -- Return the query result
    RETURN QUERY
    SELECT 
        c.table_schema::TEXT AS schema_name,
        c.table_name::TEXT AS table_name,
        c.column_name::TEXT AS column_name,
        c.ordinal_position::INTEGER AS ordinal_position,
        c.data_type::TEXT AS data_type,
        (c.is_nullable = 'YES')::BOOLEAN AS is_nullable,
        COALESCE(
            'REFERENCES ' || fk.foreign_table_name || '(' || fk.foreign_column_name || ')',
            NULL
        ) AS foreign_key_info
    FROM 
        information_schema.columns c
    LEFT JOIN (
        SELECT
            kcu.table_schema,
            kcu.table_name,
            kcu.column_name,
            ccu.table_name AS foreign_table_name,
            ccu.column_name AS foreign_column_name
        FROM 
            information_schema.key_column_usage kcu
        JOIN 
            information_schema.constraint_column_usage ccu
            ON kcu.constraint_name = ccu.constraint_name
            AND kcu.constraint_schema = ccu.constraint_schema
        WHERE 
            kcu.table_schema = 'public'
    ) fk ON c.table_schema = fk.table_schema 
        AND c.table_name = fk.table_name 
        AND c.column_name = fk.column_name
    LEFT JOIN (
        -- Deduplicate: a table with several foreign keys appears several times in temp_table_dependencies
        SELECT tbl_name, MAX(dependency_level) AS dependency_level
        FROM temp_table_dependencies
        GROUP BY tbl_name
    ) td ON c.table_name = td.tbl_name
    WHERE 
        c.table_schema = 'public'
    ORDER BY 
        COALESCE(td.dependency_level, 0),
        c.table_name, 
        c.ordinal_position;

    -- Clean up: drop the temporary table
    DROP TABLE IF EXISTS temp_table_dependencies;
END;
$$ LANGUAGE plpgsql;

-- Execute the function to see the results
SELECT * FROM get_schema_info();

Obviously, it will all depend on the number of tables and columns within them, as I don't know how many characters can be included in a single prompt.

  • In this example, the function returned 147 rows, and as you can see, there are 7 columns in the resulting table.


@alphanumericaracters

All the material shared here by me (code and intellectual property) is made available under the 'programmer with two coffees and three hands' agreement (MIT, Apache, and blah, blah, blah... the things you like so much).

[Image: licence]

I don’t need a law to do what’s right… and you?

@alphanumericaracters

Mr. Postgre's image looks like this:

[Image: mrpostgre, generated with DALL·E 2]

"...You better wait, wait a minute (Wait, wait a minute, Mr. Postgre)
Wait a minute, wait a minute, wait a minute (Wait, wait a minute, Mr. Postgre)
Please, Mr. Postgre (Wait, wait a minute, Mr. Postgre)
[{Deliver the Json, the sooner the better}]..."

The Marvelettes: 'Please Mr. Postman'
