Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The organisation is an integer #88

Closed
wants to merge 4 commits into from

Conversation

Fokko
Copy link
Contributor

@Fokko Fokko commented May 23, 2020

While rebasing #85 against the latest master, I've encountered an issue with the connection:

root@f425d97e576a:/dbtlake# dbt docs generate
Running with dbt=0.17.0-rc2
Encountered an error while reading profiles:
  ERROR Runtime Error
  Credentials in profile "default", target "dev" invalid: 1285381041234567 is not of type 'string'
Defined profiles:
 - default

With the following config (slightly modified of course):

 default:
   target: dev
   outputs:
     dev:
       method: http
       type: spark
       schema: fokko
       organization: "1285381041234567"
       host: "westeurope.azuredatabricks.net"
       port: 443
       token: "..."
       cluster: "..."
       connect_retries: 20
       connect_timeout: 60
       threads: 8

 config:
   send_anonymous_usage_stats: False

@Fokko
Copy link
Contributor Author

Fokko commented May 23, 2020

The big question here is, can an organization start with a zero 🤔

@jtcohen6
Copy link
Contributor

jtcohen6 commented May 25, 2020

Hey @Fokko, check out this similar issue: dbt-labs/dbt-core#2384

The resolution there was to introduce a special Jinja filter as_text (dbt-labs/dbt-core#2395). I'm hopeful that would work for you here, and enable someone to also include a leading 0 if need be.

@Fokko
Copy link
Contributor Author

Fokko commented Jun 1, 2020

I just checked, and this works at the latest master. However, I think it would be better to change it to an integer. the as_text solution feels a bit awkward.

@jtcohen6
Copy link
Contributor

jtcohen6 commented Jun 1, 2020

I'm all for switching this to be an integer if we feel good about that going forward. I'm cognizant of the README note added by this older commit: 9ab1da3. (I've since removed the full comment only to summarize it a bit more succinctly.) Have you ever seen a non-numeric organization ID, or one that requires a leading 0?

If you feel good about this change, could you update the "Configuring your profile" section of the README accordingly?

@Fokko
Copy link
Contributor Author

Fokko commented Jun 1, 2020

For a current project, we have many (20+) workspaces, and I haven't seen any with a leading zero, so I think we're safe on this one. The Python ints are of unlimited size, so I think we're safe there as well. @poidra02 any thoughts on this one?

@Fokko
Copy link
Contributor Author

Fokko commented Jun 8, 2020

@jtcohen6 Great work on the 0.17 release! Would it be possible to include this in 0.17 dbt-spark? :)

@poidra02
Copy link
Contributor

poidra02 commented Jun 8, 2020

hey @Fokko - sorry for the delay in getting back to you. I could get on board, but there are a couple things that I suggest we consider:

  • this org id is a field entirely owned by Databricks, and I haven't seen any guidance around how it's generated. whether it will stay the same now or into the future is an open question. the original choice to keep it a string was to avoid chasing any changes given we were reverse engineering it.
  • it looks like Azure Databricks will keep the workspace IDs in the URL, but they're changing the URL to be region independent (which is a good thing): [https://docs.microsoft.com/en-us/azure/databricks/workspace/migrate-workspace-urls]. while there's no timing the old URL will go away, if we do make changes, we should consider moving to the new URL format.

if we feel that the risk of chasing changing workspace ids/orgs is small (or the level of effort to change and test that is small) then I could get behind the change. though the choice to make it a strings was deliberate to avoid that obviously, and I'm not sure I'm understanding just yet what challenge a string is introducing here to weigh the trade off myself.

but happy to talk about it further...

@jtcohen6
Copy link
Contributor

jtcohen6 commented Jun 9, 2020

this org id is a field entirely owned by Databricks, and I haven't seen any guidance around how it's generated.

I agree with this hesitation. I asked some folks from Databricks with whom I've been in contact whether an organization ID can ever contain non-numeric characters or a leading 0. I'm waiting on a response from them.

Given that this is functional (albeit clunky) with organization ID as a string, and that switching it to be an integer has potential risk—likelihood unknown—of being a breaking change for some future accounts, I'm inclined to keep it as a string for now until we receive more explicit direction from Databricks.

@jtcohen6
Copy link
Contributor

This will be resolved by the change we made to native env rendering in v0.17.1 (discourse)

@jtcohen6 jtcohen6 mentioned this pull request Jul 20, 2020
@jtcohen6 jtcohen6 closed this in #99 Jul 20, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants