Use profile specified in --profile with dbt init #7450

Merged
merged 5 commits into from
Sep 15, 2023

Conversation

ezraerb
Contributor

@ezraerb ezraerb commented Apr 24, 2023

resolves #6154

Description

If --profile is specified with dbt init, and the profile exists, initialize the project with that as the profile instead of creating a new profile. If the profile does not exist, or the command is run inside an existing project, it fails with an error.

Checklist

@ezraerb ezraerb requested a review from a team April 24, 2023 20:32
@ezraerb ezraerb requested a review from a team as a code owner April 24, 2023 20:32
@cla-bot cla-bot bot added the cla:yes label Apr 24, 2023
@dbeatty10
Contributor

Thanks for submitting this pull request @ezraerb !

The core team is hard at work making lots of improvements so it may be a while before an engineer is assigned to work on this PR.

In the meantime, #7609 is updating the same area of code (to resolve a different issue), so one or the other will need to be updated depending on the merge order. We'll probably want to merge #7609 first, so you'd need to resolve the conflicts afterwards.

@dbeatty10 dbeatty10 added the ready_for_review Externally contributed PR has functional approval, ready for code review from Core engineering label May 19, 2023
Co-authored-by: Doug Beatty <44704949+dbeatty10@users.noreply.github.com>
@ezraerb
Contributor Author

ezraerb commented May 23, 2023

Sure thing. Let me know when I need to make edits after the other merge.

@dbeatty10
Contributor

@ezraerb #7609 is merged now ✅

@ezraerb ezraerb requested review from a team as code owners May 25, 2023 04:12
@ezraerb ezraerb requested review from emmyoop and aranke and removed request for iknox-fa and a team May 25, 2023 04:12
@ezraerb
Contributor Author

ezraerb commented May 25, 2023

Merged code has been pushed to my branch

core/dbt/task/init.py (two outdated, resolved review threads)
with open(os.path.join(project.project_root, project_name, "dbt_project.yml"), "r") as f:
    assert (
        f.read()
        == f"""
Contributor

Can we make this a fixture in a constant at file top or in a fixture.py?

Contributor Author

Checking the other init tests shows that all of them use this pattern. It's probably worth getting more opinions before changing it.

core/dbt/task/init.py (outdated, resolved review thread)
):
manager = Mock()
manager.attach_mock(mock_prompt, "prompt")
manager.prompt.side_effect = [
Contributor

Curious why Black didn't reformat this or the line below it to be the same.

core/dbt/task/init.py (outdated, resolved review thread)
Contributor

@VersusFacit VersusFacit left a comment

Hey there! Thanks a bundle for all this great work to pull these arguments through such that they obey the stated flag interface!! 🙇‍♀️

I had some time to review PRs and do a bunch of task directory work, so I've offered some thoughts on how to rework your code to be in line with our conventions and practices.

I may not be the one to pull this PR over the finish line, but I hope my comments help you on your way. Welcome to the community as a first-time contributor 🎉

Comment on lines 300 to 317
# If the user specified an existing profile to use, use it instead of generating a new one
user_profile_name = getattr(get_flags(), "PROFILE", None)
if user_profile_name:
    # Verify it exists. Can't use the regular profile validation routine because it assumes
    # the project file exists
    raw_profiles = read_profile(profiles_dir)
    if user_profile_name not in raw_profiles:
        print("Could not find profile named '{}'".format(user_profile_name))
        sys.exit(1)
    profile_name = user_profile_name
    profile_specified = True
else:
    profile_name = project_name
    profile_specified = False
self.create_new_project(project_name, profile_name)

# Ask for adapter only if skip_profile_setup flag is not provided and no profile to use was specified.
if not self.args.skip_profile_setup and not profile_specified:
Contributor

User experience

From a user experience perspective, I don't think we need to raise an error if the specified profile is not found. I'd rather just create it anytime it doesn't exist.

Implementation

I defer to whoever ends up being the code reviewer for this, but see below for some suggestions for refactoring.

The run method is long and has a lot of conditionals, which makes it harder to read. Refactoring this would make it easier to maintain in the future.

So I'd suggest refactoring this logic into its own method (similar to how check_if_can_write_profile is its own method). Maybe something similar to this (completely untested!):

def check_if_profile_exists(self, profile_name: Optional[str] = None) -> bool:
    profile_exists = False  # assume it doesn't exist unless proven otherwise
    user_profile_name = getattr(get_flags(), "PROFILE", None)
    profiles_dir = get_flags().PROFILES_DIR
    if user_profile_name:
        raw_profiles = read_profile(profiles_dir)
        profile_exists = user_profile_name in raw_profiles
    return profile_exists

Then this method can be applied in one or more places to use the specified profile if it exists (and create it otherwise).

Contributor Author

For user experience, I think more input is needed. The risk with creating the profile if it does not exist is the classic typo problem, where a misspelling creates a new profile instead of using the one the user actually wanted. The requirements said "existing profile" so I put in the check.

For implementation, the test is only done once currently, but shrinking a big method is usually a good idea. Refactored.

Contributor

Did you push the refactored code @ezraerb?

This is the most recent that I'm seeing:
image

Contributor Author

Still working on a few other comments, so have not pushed yet.

Contributor Author

Changes have been pushed.

Contributor

My first instinct was to agree with @dbeatty10 on this point:

From a user experience perspective, I don't think we need to raise an error if the specified profile is not found. I'd rather just create it anytime it doesn't exist.

I definitely appreciate the typo annoyance: dbt init --profile defaut would lead to the writing of a whole new defaut profile, when you intended to use the existing default profile. In that case, it would be better to get an explicit error.

But it means that, when initializing a new project from scratch, you have exactly two options:

  1. Do not pass --profile flag: Initialize a new project and a new profile, with the same name as your project.
  2. Pass the --profile flag. It must match an existing profile.

If we took Doug's recommendation, there would be three options:

  1. Do not pass --profile flag to initialize a new project and a new profile. The profile name will match your supplied project name (reasonable default behavior).
  2. Pass the --profile flag, and its value matches an existing profile: Initialize a new project, do not write a new profile.
  3. Pass the --profile flag, and its value doesn't match an existing profile: Initialize a new project and a new profile. The profile name will match what you passed into the --profile flag.
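
The two policies compared above can be sketched side by side. This is only an illustration under stated assumptions, not the code in this PR: `plan_init`, `InitPlan`, and the `create_missing` switch are hypothetical names, the list of existing profiles is passed in directly instead of being read from profiles.yml, and `--skip-profile-setup` is ignored.

```python
from typing import Iterable, NamedTuple, Optional


class InitPlan(NamedTuple):
    """Hypothetical result type for this sketch."""
    profile_name: str        # profile the new project will reference
    write_new_profile: bool  # whether a new profile entry would be written


def plan_init(
    project_name: str,
    profile_flag: Optional[str],
    existing_profiles: Iterable[str],
    create_missing: bool = False,
) -> InitPlan:
    """Decide which profile a new project uses (hypothetical helper).

    create_missing=False models the strict behavior (unknown profile is an error);
    create_missing=True models the alternative of creating the missing profile.
    """
    if not profile_flag:
        # No --profile: a new profile named after the project.
        return InitPlan(project_name, True)
    if profile_flag in set(existing_profiles):
        # --profile matches an existing profile: reuse it, write nothing.
        return InitPlan(profile_flag, False)
    if create_missing:
        # Lenient policy: a typo such as "defaut" silently becomes a new profile.
        return InitPlan(profile_flag, True)
    # Strict policy: surface the typo as an error.
    raise ValueError(f"Could not find profile named '{profile_flag}'")


profiles = ["default", "warehouse"]
print(plan_init("my_project", None, profiles))
print(plan_init("my_project", "default", profiles))
print(plan_init("my_project", "defaut", profiles, create_missing=True))
```

With `create_missing=False`, the last call would raise instead of returning a plan for a new `defaut` profile, which is the typo protection discussed in this thread.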

We could even take this one step further, and provide the same flexibility that we offer when running dbt init within an existing project:

The profile <profile-name> already exists in /Users/jerco/.dbt/profiles.yml. Do you wish to overwrite it? [y/N]: N

I don't have strong feelings either way. As a heuristic to make this determination, I'm thinking about how there are two "modes" of dbt init:

  1. Interactive. Likely first time using dbt. Need to set up everything.
  2. Programmatic. Someone who has used dbt before, likely on this machine. Wants to skip the click interactivity and jump straight to dbt init <project_name> --skip-profile-setup, or dbt init <project_name> --profile <existing_profile_name>.

This --profile flag feels designed for persona / use case (2). The first user is less likely to want fine-grained control over exactly how the profile is being named; we should provide the easiest path from start to finish, with some sensible defaults along the way. And the second user (slightly more experienced) is less likely to want the interactive walkthrough for setting up their profile.

Which is to say: I'm convinced enough that the behavior implemented in this PR is an acceptable user experience. We'll need to document the behavior in a new "Existing profile" section here: https://docs.getdbt.com/reference/commands/init

Comment on lines +316 to +318
if not self.args.skip_profile_setup:
    profile_name = self.get_profile_name_from_current_project()
    self.setup_profile(profile_name)
Contributor

@ezraerb Can you help me understand why we'd want or need self.get_profile_name_from_current_project() to be guarded by if not self.args.skip_profile_setup:?

Could we safely do the following outside of if statement?

            profile_name = self.get_profile_name_from_current_project()

If so, then we could avoid calling self.setup_profile(profile_name) twice (once for each case) and instead just do it once at the very end (if and only if profile setup is not being skipped).

Contributor Author

There are four cases that need to be considered:

  1. Initializing inside an existing project, no profile specified. The profile is created (the code being commented on)
  2. Initializing inside an existing project, profile specified. This raises an error.
  3. Initializing outside an existing project, no profile specified. Create the project and then create the profile.
  4. Initializing outside an existing project, profile specified. Create the project only.

The difference between cases 3 and 4 prohibits having a catch-all profile creation at the end of the method. Either signal variables are needed or the calls need to be duplicated in the various paths. Previous reviewers expressed a preference for the latter, as they believe it is easier to follow.
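
The four cases can be sketched as a small dispatch function. This is a rough illustration, not the merged implementation: `init_steps`, `InitError`, the step names, and the error message are all hypothetical, and the sketch assumes the specified profile exists and that `--skip-profile-setup` was not passed.

```python
from typing import List, Optional


class InitError(Exception):
    """Hypothetical error type for this sketch; not dbt's real exception class."""


def init_steps(inside_project: bool, profile_flag: Optional[str]) -> List[str]:
    """Return the steps init would take for each of the four cases."""
    if inside_project:
        if profile_flag:
            # Case 2: --profile inside an existing project is rejected.
            raise InitError("--profile cannot be used inside an existing project")
        # Case 1: existing project, no profile specified: set up the profile.
        return ["setup_profile"]
    if profile_flag:
        # Case 4: new project with an existing profile: create the project only.
        return ["create_project"]
    # Case 3: new project, no profile specified: create both.
    return ["create_project", "setup_profile"]


for inside, flag in [(True, None), (False, None), (False, "default")]:
    print(inside, flag, init_steps(inside, flag))
```

Because case 4 ends without a profile step while case 3 ends with one, a single unconditional profile setup at the end of the function would not fit all four paths.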

@jtcohen6 jtcohen6 added the user docs [docs.getdbt.com] Needs better documentation label Sep 8, 2023
Member

@emmyoop emmyoop left a comment

@ezraerb this is looking great! We have some time set aside to help you get this over the finish line and merged in.

There's just one more small change I'd like to request. Updating the help text will make it more obvious that the --profile flag must refer to an existing profile.

The help text is here. Just adding in the word "existing" would be beneficial:

help="Which existing profile to load. Overrides setting in dbt_project.yml.",

Other than that, you'll need to rebase off main to pull in the most recent version of core so we can run some tests. I'm not seeing a reason the current test is failing, but we can get a better picture once your branch is updated.

@ezraerb
Contributor Author

ezraerb commented Sep 15, 2023

I pushed an update with the help text change. When I merged the upstream main branch, git claimed there was nothing to merge. Should I do a full rebase instead?

@codecov

codecov bot commented Sep 15, 2023

Codecov Report

❗ No coverage uploaded for pull request base (main@7c1bd91).
Patch has no changes to coverable lines.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #7450   +/-   ##
=======================================
  Coverage        ?   86.55%           
=======================================
  Files           ?      175           
  Lines           ?    25638           
  Branches        ?        0           
=======================================
  Hits            ?    22192           
  Misses          ?     3446           
  Partials        ?        0           
Flag Coverage Δ
integration 83.34% <0.00%> (?)
unit 65.10% <0.00%> (?)


@emmyoop emmyoop self-requested a review September 15, 2023 14:41
Member

@emmyoop emmyoop left a comment

@ezraerb this looks great! Thanks for making that last small change. The tests are passing and we're ready to merge this in! Thanks for your work on getting this fixed!

@emmyoop emmyoop merged commit 3f5ebe8 into dbt-labs:main Sep 15, 2023
48 checks passed
@FishtownBuildBot
Collaborator

Opened a new issue in dbt-labs/docs.getdbt.com: dbt-labs/docs.getdbt.com#4080

peterallenwebb pushed a commit that referenced this pull request Sep 25, 2023
* Use profile specified in --profile with dbt init

* Update .changes/unreleased/Fixes-20230424-161642.yaml

Co-authored-by: Doug Beatty <44704949+dbeatty10@users.noreply.github.com>

* Refactor run() method into functions, replace exit() calls with exceptions

* Update help text for profile option

---------

Co-authored-by: Doug Beatty <44704949+dbeatty10@users.noreply.github.com>
peterallenwebb added a commit that referenced this pull request Sep 29, 2023
* Add new get_catalog_relations macro, allowing dbt to specify which relations in a schema the adapter should return data about

* Implement postgres adapter support for relation filtering on catalog queries

* Code review changes adding feature flag for catalog-by-relation-list support

* Use profile specified in --profile with dbt init (#7450)

* Use profile specified in --profile with dbt init

* Update .changes/unreleased/Fixes-20230424-161642.yaml

Co-authored-by: Doug Beatty <44704949+dbeatty10@users.noreply.github.com>

* Refactor run() method into functions, replace exit() calls with exceptions

* Update help text for profile option

---------

Co-authored-by: Doug Beatty <44704949+dbeatty10@users.noreply.github.com>

* add TestLargeEphemeralCompilation (#8376)

* Fix a couple of issues in the postgres implementation of get_catalog_relations

* Add relation count limit at which to fall back to batch retrieval

* Better feature detection mechanism for adapters.

* Code review changes to get_catalog_relations and adapter feature checking

* Add changelog entry

---------

Co-authored-by: ezraerb <ezraerb@alum.mit.edu>
Co-authored-by: Doug Beatty <44704949+dbeatty10@users.noreply.github.com>
Co-authored-by: Michelle Ark <MichelleArk@users.noreply.github.com>
QMalcolm pushed a commit that referenced this pull request Oct 9, 2023
Labels
cla:yes ready_for_review Externally contributed PR has functional approval, ready for code review from Core engineering user docs [docs.getdbt.com] Needs better documentation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[CT-1418] [Bug] CLI argument --profile ignored in dbt init
6 participants