Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CT-852] [CT-836] [Bug] externalbrowser authenticator opens a browser tab for each dbt thread #198

Closed
1 task done
Tolsto opened this issue Jul 12, 2022 · 22 comments
Closed
1 task done
Assignees
Labels
enhancement New feature or request Stale

Comments

@Tolsto
Copy link

Tolsto commented Jul 12, 2022

Is there an existing issue for this?

  • I have searched the existing issues

Current Behavior

Running dbt with Snowflake and authenticator: externalbrowser will open a browser login tab for each dbt thread.

Expected Behavior

Only a single browser window/tab should open regardless how many threads are configured.

Steps To Reproduce

Execute dbt run with the following config (assuming you have at least 4 models to utilize all threads)

org:
  outputs:
    dev:
      [...]
      authenticator: externalbrowser
      threads: 4
      type: snowflake

Relevant log output

No response

Environment

- OS: MacOS 12.4 (arm64)
- Python: 3.9.13
- dbt: 1.1.0 (snowflake plugin 1.1.0)

What database are you using dbt with?

snowflake

Additional Context

No response

@Tolsto Tolsto added bug Something isn't working triage labels Jul 12, 2022
@github-actions github-actions bot changed the title [Bug] externalbrowser authenticator opens a browser tab for each dbt thread [CT-836] [Bug] externalbrowser authenticator opens a browser tab for each dbt thread Jul 12, 2022
@jtcohen6
Copy link
Contributor

Thanks for opening @Tolsto! I'm going to transfer this to the dbt-snowflake repository, since that's where the fix would be needed.

@jtcohen6 jtcohen6 transferred this issue from dbt-labs/dbt-core Jul 18, 2022
@github-actions github-actions bot changed the title [CT-836] [Bug] externalbrowser authenticator opens a browser tab for each dbt thread [CT-852] [CT-836] [Bug] externalbrowser authenticator opens a browser tab for each dbt thread Jul 18, 2022
@jtcohen6
Copy link
Contributor

jtcohen6 commented Aug 2, 2022

@Tolsto Sorry you're running into this — it's definitely not the intended behavior, nor the behavior I've been seeing while running dbt-snowflake on my own machine, also macOS 12.4 with M1 (arm64). It's also not logic that really lives in dbt-snowflake — it's being passed through to the authentication mechanism defined in snowflake-connector-python.

dbt does open separate connections for each thread, and doesn't reuse connections between threads. For some authentication methods, each connection requires its own auth handshake (#201).

If you're using externalbrowser-based authentication, though, you should be able to cache the authentication token between runs (and between threads in the same run) by using keyring / macOS Keychain Access.

  • Could you try reinstalling dbt-snowflake in a new virtual environment, and re-running?
  • Could you check to see if you have keyring installed in your Python environment running dbt?
$ pip freeze | grep 'keyring'
keyring==21.8.0

@Tolsto
Copy link
Author

Tolsto commented Aug 3, 2022

The caching works but only after the first invocation of dbt (and authentication). As soon as the token expires, dbt shows the mentioned behaviour. I should add that several of my co-workers experience the same issues (on non-M1 Macs).

@jtcohen6
Copy link
Contributor

jtcohen6 commented Aug 3, 2022

@Tolsto Ah, okay, got it! Glad to hear the caching works generally, and that this is just an issue when re-authenticating.

Are you materializing your models in multiple logical databases? That is, do you have multiple database values configured for models in your project?

I've been able to reproduce this behavior by configuring multiple. At the start of each invocation, dbt runs caching queries in its main thread: show terse schemas in database {database}. If you have multiple databases, it runs a show query for each one, and it runs them in parallel. This is what I see in my debug-level logs (logs/dbt.log) when two browser windows open:

[0m08:38:19.105985 [debug] [ThreadPool]: On list_dbt_test: /* {"app": "dbt", "dbt_version": "1.3.0b1", "profile_name": "sandbox-snowflake", "target_name": "dev", "connection_name": "list_dbt_test"} */
show terse schemas in database dbt_test
    limit 10000
�[0m08:38:19.106283 [debug] [ThreadPool]: Acquiring new snowflake connection "list_ad_hoc"
�[0m08:38:19.106415 [debug] [ThreadPool]: Opening a new connection, currently in state init
�[0m08:38:19.107906 [debug] [ThreadPool]: Using snowflake connection "list_ad_hoc"
�[0m08:38:19.108326 [debug] [ThreadPool]: On list_ad_hoc: /* {"app": "dbt", "dbt_version": "1.3.0b1", "profile_name": "sandbox-snowflake", "target_name": "dev", "connection_name": "list_ad_hoc"} */
show terse schemas in database ad_hoc
    limit 10000
�[0m08:38:19.108419 [debug] [ThreadPool]: Opening a new connection, currently in state init

@Tolsto
Copy link
Author

Tolsto commented Aug 3, 2022

Yes, we have multiple databases.

@yingqiaozheng
Copy link

yingqiaozheng commented Oct 17, 2022

Is this issue fixed in any version of dbt-snowflake yet? Or is there any workaround by updating the local configuration? or anything else?

@JudyCamps
Copy link

JudyCamps commented Jan 17, 2023

I am also looking for a work-around to this. I can't run dbt core due to this authentication issue.
I see a similar ticket here #208 CT-880

@yingqiaozheng
Copy link

I had this issue when I install dbt using brew, then I uninstall the dbt and re-install it using pip, the issue was gone. I thing the installation packages are different between them . Give it a try and good luck .

@JudyCamps
Copy link

Thanks @yingqiaozheng . I removed the brew version and re-installed with pip. Unfortunately, I have the same behavior - infinite loop of tabs opening affirming the snowflake connection, but dbt doesn't seem to accept it.

@almekpoh
Copy link

almekpoh commented Jan 27, 2023

I have the same behaviour, and it's a bit annoying! Doing a fix on this ?🙏🏾🙏🏾 or a workaround?

@system1-dersek
Copy link

system1-dersek commented Feb 2, 2023

We're also seeing a ton of browser tabs opened. Looking at the dbt.log, it appears that it's opening a separate connection per database, per schema. It seems like it's not just the model materialization databases its considering, but also the databases of sources as well - that or our project is referencing another project.

Maybe it's reasonable to require that each 'database' is a different connection in non-snowflake databases, but since snowflake databases are more like a schema, and schema like a sub-schema, it feels like there's an assumption that isn't holding up well. (Maybe snowflake's additional scoping has lead to us organizing data in a way that other databases would consider an antipattern, which makes it tricky for a multi-database tool like dbt)

Could dbt treat a snowflake account as a single database for connection purposes and switch between databases/schemas/roles in sql when something needs to run in a specific context? I think that would still require a connection per-thread, but it's a lot better than per-database, per-schema by about 2 orders of magnitude in our case.

Another alternative might be for snowflake to provide a way to get a time-limited token back from a general connection that could be used to create new connections without needing to use the original authentication method.

@system1-dersek
Copy link

system1-dersek commented Feb 3, 2023

Looks like the limited time token already exists: https://docs.snowflake.com/en/sql-reference/parameters.html#label-allow-id-token

https://docs.getdbt.com/reference/warehouse-setups/snowflake-setup#sso-authentication
Once I had pip installed keyring, DBT was able to make use of this to only create a single tab for SSO once ALLOW_ID_TOKEN was enabled on our account.

Sounds like Tolsto already is using the keyring approach, so the issues with expiration may still apply

@Fleid Fleid self-assigned this Feb 17, 2023
@Fleid
Copy link
Contributor

Fleid commented Feb 22, 2023

I'm a bit puzzled about what we can do here.
It looks like dbt-snowflake is behaving as expected?

@Tolsto do you still experience the issue? Have you tried this workaround?
Is your Snowflake account set up for connection caching?

@Tolsto
Copy link
Author

Tolsto commented Feb 23, 2023

I'm a bit puzzled about what we can do here. It looks like dbt-snowflake is behaving as expected?

@Tolsto do you still experience the issue? Have you tried this workaround? Is your Snowflake account set up for connection caching?

Still have the problem, connection caching is activated.
As far as I can tell, the workaround is for people whose keyring integration doesn't work. That's not the case for me, it just tries to authenticate 4 times when I run dbt. Each successful authentication then saves the token in the keychain.
A possible solution might be to do a single Snowflake query to trigger the authentication and only then proceed with executing any further queries as the token is then cached in the keyring and no further authentications would be required.

@Fleid
Copy link
Contributor

Fleid commented Feb 27, 2023

That makes sense, thank you.

I'll mark that as an enhancement then. The code is working as expected, but it needs to be changed to optimize the behavior.
How painful would you say this is to you in using dbt? The scale starting at paper cut, all the way to making you question the state of software engineering in 2023? ;)

@Fleid Fleid added enhancement New feature or request and removed bug Something isn't working awaiting_response labels Feb 27, 2023
@Tolsto
Copy link
Author

Tolsto commented Feb 28, 2023

It's mildly annoying. However, I have to explain to all new dbt users why this is happening as everybody thinks that something is wrong with their setup.

@github-actions
Copy link
Contributor

This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please comment on the issue or else it will be closed in 7 days.

@github-actions github-actions bot added the Stale label Aug 27, 2023
@github-actions
Copy link
Contributor

github-actions bot commented Sep 3, 2023

Although we are closing this issue as stale, it's not gone forever. Issues can be reopened if there is renewed community interest. Just add a comment to notify the maintainers.

@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Sep 3, 2023
@samkessaram
Copy link

Commenting to signify continued interest; this would greatly improve the development experience!

@markproctor1
Copy link

This is an ongoing annoyance! We use around 20 database/schema combinations and each opens a tab.

  • Usually 1/20 or so will not load.
  • A wall of text is created on the terminal to reflect the wall of tabs.
  • If any tabs don't load the terminal hangs and can't be interrupted (tried extensively)
  • I close all tabs, I kill the terminal
  • Reconnect

Not show stopping but definitely highest annoyance.

@Xemptuous
Copy link

I am also having this issue with externalbrowser auth with Snowflake.

We use 4 threads, and on first connect we get 4 browser tab popups for auth - of which only one or so can actually work - but after that it's good to go. Problem is we have to pkill dbt and rerun for it to work. This auth lasts for a while until the token expires.

My team's solution has become to limit threads in dev to 1, but this limits performance. Would love to see a PR to address this!

@busbina
Copy link

busbina commented Jun 28, 2024

I set reuse_connections to True and now it opens 3 tabs, instead of 83, two of which look like they are for the history. Still a bug, but at least its manageable.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request Stale
Projects
None yet
Development

No branches or pull requests