Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use glaredb with dbt/great expectations: fix postgres functionality #2436

Closed
talagluck opened this issue Jan 17, 2024 · 12 comments
Closed

Use glaredb with dbt/great expectations: fix postgres functionality #2436

talagluck opened this issue Jan 17, 2024 · 12 comments
Labels
bug Something isn't working
Milestone

Comments

@talagluck
Copy link
Contributor

talagluck commented Jan 17, 2024

Description

When attempting to create a connection from dbt to a glaredb cloud instance using a postgresql connection string, I hit the following error:

14:37:27  1 check failed:
14:37:27  dbt was unable to connect to the specified database.
The database returned the following error:

  >Database Error
  error with status PGRES_EMPTY_QUERY and no message from the libpq

Check your database credentials and try again. For more information, visit:
https://docs.getdbt.com/docs/configure-your-profile

Given that dbt should work fine with a postgres connector, I would expect this to work, but there's a mismatch here. Sean mentioned that he tested this a few months back and it worked, so there may have been a regression here.

I also get the same error when attempting to run Great Expectations with GlareDB - Great Expectations uses SqlAlchemy under the surface.

@talagluck talagluck added the bug Something isn't working label Jan 17, 2024
@talagluck talagluck changed the title Use glaredb with dbt - fix postgres functionality Use glaredb with dbt: fix postgres functionality Jan 17, 2024
@talagluck talagluck changed the title Use glaredb with dbt: fix postgres functionality Use glaredb with dbt/great expectations: fix postgres functionality Jan 17, 2024
@universalmind303 universalmind303 added this to the next milestone Jan 17, 2024
@talagluck
Copy link
Contributor Author

Steps to reproduce for Great Expectations:

pip install "great_expectations[postgresql]"

In a notebook (or python script) run:

import great_expectations as gx
context = gx.get_context() # gets a great expectations project context 
ds = context.sources.add_postgres(name="glaredb", connection_string=<CONNECTION_STRING>)

The connection string I used was "postgresql://6AhiEN7GQDmo:<PASSWORD>@o_PRocU0j.proxy.glaredb.com:6543/rough_glitter"

@tychoish
Copy link
Contributor

I'm pretty sure that this is #2273, and if you can you specify autocommit=True in some way that it gets set on the psycopg2 (I found some docs but I don't know the best way here.) and see if it helps, that'd be cool.

@scsmithr
Copy link
Member

Good news, I think I got further (thanks to correcting the transaction stubs and connecting to qa). Bad news, seems great expectation is expecting a pg_catalog.version() function:

~/Code/github.com/glaredb/glaredb/bindings/python [1] % just example ge
../../.venv/bin/python examples/ge.py
Traceback (most recent call last):
  File "/Users/sean/Code/github.com/glaredb/glaredb/.venv/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 1910, in _execute_context
    self.dialect.do_execute(
  File "/Users/sean/Code/github.com/glaredb/glaredb/.venv/lib/python3.11/site-packages/sqlalchemy/engine/default.py", line 736, in do_execute
    cursor.execute(statement, parameters)
psycopg2.errors.InternalError_: Error during planning: Invalid function 'pg_catalog.version'.


The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/sean/Code/github.com/glaredb/glaredb/.venv/lib/python3.11/site-packages/great_expectations/datasource/fluent/sql_datasource.py", line 1143, in test_connection
    engine.connect()
  File "/Users/sean/Code/github.com/glaredb/glaredb/.venv/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 3325, in connect
    return self._connection_cls(self, close_with_result=close_with_result)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/sean/Code/github.com/glaredb/glaredb/.venv/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 96, in __init__
    else engine.raw_connection()
         ^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/sean/Code/github.com/glaredb/glaredb/.venv/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 3404, in raw_connection
    return self._wrap_pool_connect(self.pool.connect, _connection)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/sean/Code/github.com/glaredb/glaredb/.venv/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 3371, in _wrap_pool_connect
    return fn()
           ^^^^
  File "/Users/sean/Code/github.com/glaredb/glaredb/.venv/lib/python3.11/site-packages/sqlalchemy/pool/base.py", line 327, in connect
    return _ConnectionFairy._checkout(self)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/sean/Code/github.com/glaredb/glaredb/.venv/lib/python3.11/site-packages/sqlalchemy/pool/base.py", line 894, in _checkout
    fairy = _ConnectionRecord.checkout(pool)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/sean/Code/github.com/glaredb/glaredb/.venv/lib/python3.11/site-packages/sqlalchemy/pool/base.py", line 493, in checkout
    rec = pool._do_get()
          ^^^^^^^^^^^^^^
  File "/Users/sean/Code/github.com/glaredb/glaredb/.venv/lib/python3.11/site-packages/sqlalchemy/pool/impl.py", line 145, in _do_get
    with util.safe_reraise():
  File "/Users/sean/Code/github.com/glaredb/glaredb/.venv/lib/python3.11/site-packages/sqlalchemy/util/langhelpers.py", line 70, in __exit__
    compat.raise_(
  File "/Users/sean/Code/github.com/glaredb/glaredb/.venv/lib/python3.11/site-packages/sqlalchemy/util/compat.py", line 211, in raise_
    raise exception
  File "/Users/sean/Code/github.com/glaredb/glaredb/.venv/lib/python3.11/site-packages/sqlalchemy/pool/impl.py", line 143, in _do_get
    return self._create_connection()
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/sean/Code/github.com/glaredb/glaredb/.venv/lib/python3.11/site-packages/sqlalchemy/pool/base.py", line 273, in _create_connection
    return _ConnectionRecord(self)
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/sean/Code/github.com/glaredb/glaredb/.venv/lib/python3.11/site-packages/sqlalchemy/pool/base.py", line 388, in __init__
    self.__connect()
  File "/Users/sean/Code/github.com/glaredb/glaredb/.venv/lib/python3.11/site-packages/sqlalchemy/pool/base.py", line 704, in __connect
    )._exec_w_sync_on_first_run(self.dbapi_connection, self)
      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/sean/Code/github.com/glaredb/glaredb/.venv/lib/python3.11/site-packages/sqlalchemy/event/attr.py", line 320, in _exec_w_sync_on_first_run
    self(*args, **kw)
  File "/Users/sean/Code/github.com/glaredb/glaredb/.venv/lib/python3.11/site-packages/sqlalchemy/event/attr.py", line 334, in __call__
    fn(*args, **kw)
  File "/Users/sean/Code/github.com/glaredb/glaredb/.venv/lib/python3.11/site-packages/sqlalchemy/util/langhelpers.py", line 1695, in go
    return once_fn(*arg, **kw)
           ^^^^^^^^^^^^^^^^^^^
  File "/Users/sean/Code/github.com/glaredb/glaredb/.venv/lib/python3.11/site-packages/sqlalchemy/engine/create.py", line 670, in first_connect
    dialect.initialize(c)
  File "/Users/sean/Code/github.com/glaredb/glaredb/.venv/lib/python3.11/site-packages/sqlalchemy/dialects/postgresql/psycopg2.py", line 793, in initialize
    super(PGDialect_psycopg2, self).initialize(connection)
  File "/Users/sean/Code/github.com/glaredb/glaredb/.venv/lib/python3.11/site-packages/sqlalchemy/dialects/postgresql/base.py", line 3441, in initialize
    super(PGDialect, self).initialize(connection)
  File "/Users/sean/Code/github.com/glaredb/glaredb/.venv/lib/python3.11/site-packages/sqlalchemy/engine/default.py", line 408, in initialize
    self.server_version_info = self._get_server_version_info(
                               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/sean/Code/github.com/glaredb/glaredb/.venv/lib/python3.11/site-packages/sqlalchemy/dialects/postgresql/base.py", line 3687, in _get_server_version_info
    v = connection.exec_driver_sql("select pg_catalog.version()").scalar()
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/sean/Code/github.com/glaredb/glaredb/.venv/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 1770, in exec_driver_sql
    return self._exec_driver_sql(
           ^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/sean/Code/github.com/glaredb/glaredb/.venv/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 1674, in _exec_driver_sql
    ret = self._execute_context(
          ^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/sean/Code/github.com/glaredb/glaredb/.venv/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 1953, in _execute_context
    self._handle_dbapi_exception(
  File "/Users/sean/Code/github.com/glaredb/glaredb/.venv/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 2134, in _handle_dbapi_exception
    util.raise_(
  File "/Users/sean/Code/github.com/glaredb/glaredb/.venv/lib/python3.11/site-packages/sqlalchemy/util/compat.py", line 211, in raise_
    raise exception
  File "/Users/sean/Code/github.com/glaredb/glaredb/.venv/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 1910, in _execute_context
    self.dialect.do_execute(
  File "/Users/sean/Code/github.com/glaredb/glaredb/.venv/lib/python3.11/site-packages/sqlalchemy/engine/default.py", line 736, in do_execute
    cursor.execute(statement, parameters)
sqlalchemy.exc.InternalError: (psycopg2.errors.InternalError_) Error during planning: Invalid function 'pg_catalog.version'.

[SQL: select pg_catalog.version()]
(Background on this error at: https://sqlalche.me/e/14/2j85)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/sean/Code/github.com/glaredb/glaredb/bindings/python/examples/ge.py", line 4, in <module>
    ds = context.sources.add_postgres(
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/sean/Code/github.com/glaredb/glaredb/.venv/lib/python3.11/site-packages/great_expectations/datasource/fluent/sources.py", line 472, in add_datasource
    datasource.test_connection()
  File "/Users/sean/Code/github.com/glaredb/glaredb/.venv/lib/python3.11/site-packages/great_expectations/datasource/fluent/sql_datasource.py", line 1145, in test_connection
    raise TestConnectionError(
great_expectations.datasource.fluent.interfaces.TestConnectionError: Attempt to connect to datasource failed with the following error message: (psycopg2.errors.InternalError_) Error during planning: Invalid function 'pg_catalog.version'.

[SQL: select pg_catalog.version()]
(Background on this error at: https://sqlalche.me/e/14/2j85)
error: Recipe `example` failed on line 29 with exit code 1

But this seems easy enough to add in.

@talagluck
Copy link
Contributor Author

Cool! Yes, it can be kind of challenging to pass in arguments to Great Expectations that it isn't expecting. But my hunch is something like what Sam said, that there is an issue with psycopg2 or SQL Alchemy, which is why this isn't working with a few different tools.

@universalmind303
Copy link
Contributor

@scsmithr Is the next logical step for this then to implement pg_catalog.version()

@scsmithr
Copy link
Member

I added this last week, and a test to confirm: https://github.com/GlareDB/glaredb/blob/main/testdata/sqllogictests/functions/version.slt#L6-L14

That combined with the pg_tables thing I think will get us further here. I think we can retest (w/ qa) and see what happens.

@talagluck
Copy link
Contributor Author

I hit a new issue now with dbt:

Database Error
  Error during planning: Unable to fetch table provider for 'pg_matviews': failed to resolve: failed to find table: pg_matviews

universalmind303 added a commit that referenced this issue Jan 29, 2024
@talagluck this should fix the great-expectations issue you referenced
in #2436

---------

Co-authored-by: Sean Smith <scsmithr@gmail.com>
@tychoish
Copy link
Contributor

tychoish commented Feb 1, 2024

Just to double check, it would be good to verify if there are other issues, and understand where we're at on this. (@talagluck)

It might also be good, to see what the script is that you're using to test this, then we can cut out the longer iteration cycle.

tychoish pushed a commit that referenced this issue Feb 1, 2024
@talagluck this should fix the great-expectations issue you referenced
in #2436

---------

Co-authored-by: Sean Smith <scsmithr@gmail.com>
@talagluck
Copy link
Contributor Author

On the latest test, I get:

Database Error
  Error during planning: Unable to fetch table provider for 'pg_rewrite': failed to resolve: failed to find table: pg_rewrite

I've been working on a script today. It's a bit more involved than I expected, but I'll have something soon.

@talagluck
Copy link
Contributor Author

talagluck commented Feb 2, 2024

PR with test is here: #2580

universalmind303 pushed a commit that referenced this issue Feb 8, 2024
These are to make another step toward fixing #2436. 

I built the definitions off of
https://www.postgresql.org/docs/16/catalog-pg-depend.html and
https://www.postgresql.org/docs/16/catalog-pg-rewrite.html, but please
double-check that I did this correctly.

---------

Co-authored-by: Grey <grey@glaredb.com>
@greyscaled
Copy link
Contributor

@talagluck is this closable now, or what's the tl:dr status update?

@talagluck
Copy link
Contributor Author

Yup! I'll close this and open up more specific issues as needed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

5 participants