Move starter project into dbt repo #3474
'py.typed',
]
},
include_package_data = True,
A couple of articles I read said to use `include_package_data` instead of `package_data` with the MANIFEST.in file, so that's why I made this change. You don't have to list the files then.
Example:
https://newbedev.com/how-include-static-files-to-setuptools-python-package
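For readers unfamiliar with the pattern, here is a minimal sketch of what it looks like. The package name `example_pkg`, the `MANIFEST.in` pattern in the comment, and the helper `build_setup_kwargs` are all hypothetical and are not taken from this PR's diff.

```python
# Sketch of a setup.py that relies on MANIFEST.in instead of package_data.
#
# MANIFEST.in (next to setup.py) would carry the file patterns, for example:
#     recursive-include example_pkg/include *.sql *.yml *.md
#
# With include_package_data=True, setuptools picks up files matched by
# MANIFEST.in that live inside a package, so they are not listed one by one.


def build_setup_kwargs():
    """Return the keyword arguments a setup.py would pass to setuptools.setup()."""
    return dict(
        name="example_pkg",          # hypothetical package name
        version="0.0.1",
        packages=["example_pkg"],
        include_package_data=True,   # replaces an explicit package_data mapping
    )


# In a real setup.py this would be followed by:
#     from setuptools import setup
#     setup(**build_setup_kwargs())
```

Note that only files located inside a package directory are installed this way; files matched by `MANIFEST.in` outside any package end up in the source distribution only.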
I really like it! We did the same in dbt-spark a few months ago: dbt-labs/dbt-spark#151. To quote a wise person, "This is much easier."
🙏
LGTM!
Nice work!! I just took this for a spin, the new approach seems great.
- Did we lose the model files (`models/example/*.sql`) themselves?
- Have you checked in with our Experience colleagues about whether this approach is something they could leverage in the future (rather than reimplementing)?
- I left some stylistic comments, things I've long been annoyed by but never mustered the energy to change. Now's our chance!
- I also left a comment around some related-but-separate improvements to `init`. It shouldn't be considered blocking to this PR, but if we could find a neat way to do it, it will have been a long time coming.
- While taking this for a spin, I played around with a fix to "dbt init --adapter default to the available adapter" (#2814). I'll open a separate PR for that based off this branch.
name: 'my_new_project'
version: '1.0.0'
config-version: 2

# This setting configures which "profile" dbt uses for this project.
profile: 'default'
I would love it if we could figure out a way to automatically populate `name` and `profile` with the `project_name` argument supplied to `dbt init`. Right now, that `project_name` arg is just used to name the file directory, but not the actual project name. Pretty confusing! Also, we tend to discourage using a profile named `default`, and yet here we are...
I know all we're really doing here is `shutil.copytree`. Is there any sane way to try editing the files after copying?
For reference, I was just triaging a related issue, so I mentioned this over there as well: #3462 (comment)
We have a lot of expertise with jinja templates :). Perhaps we could make a dbt_project jinja template and substitute the project name. Or just do a python substitution if it's simpler. I guess jinja would only make sense if there are other ways we could leverage it to customize the project file.
Heh, you're not wrong. Back in the day, we did a lot of customization for client projects over here: https://github.com/fishtown-analytics/dbt-init
The purpose there was cross-applying some pretty opinionated configs for a given adapter. I think, for our purposes, we shouldn't go much further beyond `name` and `profile`.
I'll take a crack at doing this in another PR to follow this one up. So we want `project_name` to be used for both the `name` and `profile` fields in the dbt_project.yml file (just to verify)?
Yes, that's right!
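The "plain Python substitution" option discussed in this thread could be sketched roughly as follows. This is only an illustration of the idea, not the follow-up PR's implementation: the function name `init_project` and its parameters are hypothetical, and it assumes `name:` and `profile:` each appear once at the start of a line in `dbt_project.yml`.

```python
import re
import shutil
from pathlib import Path


def init_project(starter_dir: Path, target_dir: Path, project_name: str) -> Path:
    """Copy the starter project, then point name/profile at project_name.

    Hypothetical helper: dbt's own init code may be structured differently.
    """
    # Step 1: copy the packaged starter project, as the PR already does.
    shutil.copytree(starter_dir, target_dir)

    # Step 2: rewrite the two top-level keys in the copied project file.
    project_file = target_dir / "dbt_project.yml"
    text = project_file.read_text()
    for key in ("name", "profile"):
        # A callable replacement keeps project_name from being parsed
        # as a regex replacement template (e.g. backslashes).
        text = re.sub(
            rf"^{key}:.*$",
            lambda match, key=key: f"{key}: '{project_name}'",
            text,
            count=1,
            flags=re.MULTILINE,
        )
    project_file.write_text(text)
    return project_file
```

A line-based substitution like this leaves comments and formatting in the file untouched. Any other place where the placeholder project name appears in the starter files (for example under `models:`, if it is repeated there) would need the same treatment, which is where a Jinja template might start to pay off.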
🤦♀️ I had my global .gitignore file set to filter out sql and I missed this, so good catch.
Co-authored-by: Jeremy Cohen <jeremy@fishtownanalytics.com>
👍 looks good!
I confirmed that the source and wheel distributions include the starter project files. In the `dbt/core` directory, I ran `python setup.py sdist bdist_wheel`, which creates a source (`*.tar.gz`) and a wheel (`*.whl`) distribution. I installed these in separate temporary virtualenvs and confirmed that the project existed and the `init` command worked for both.
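The same check could also be scripted without installing anything, by listing the contents of the built archives. The sketch below uses only the standard library; the function names are hypothetical, and the path suffix passed in (for example `starter_project/dbt_project.yml`) is an assumption about where the files land inside the archives.

```python
import tarfile
import zipfile
from pathlib import Path
from typing import List


def list_distribution_files(dist_path: Path) -> List[str]:
    """Return the member names of a wheel (*.whl) or source (*.tar.gz) distribution."""
    if dist_path.suffix == ".whl":
        # A wheel is a zip archive.
        with zipfile.ZipFile(dist_path) as wheel:
            return wheel.namelist()
    if dist_path.name.endswith(".tar.gz"):
        with tarfile.open(dist_path, "r:gz") as sdist:
            return sdist.getnames()
    raise ValueError(f"Unsupported distribution type: {dist_path.name}")


def distribution_contains(dist_path: Path, path_suffix: str) -> bool:
    """True if any archive member ends with path_suffix."""
    return any(name.endswith(path_suffix) for name in list_distribution_files(dist_path))
```

Running `distribution_contains` over each file in `dist/` after the build would flag a missing starter project before the packages are published. It does not replace the install-and-run test, since it cannot tell whether `init` finds the files at runtime.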
I think if we want a comprehensive project initialization, we could consider using a scaffolding tool like cookiecutter. This might be overkill for the current starter project, though.
We've had this idea before: dbt-labs/dbt-init#31. I'm open to it! Agree that it's probably not necessary for now, but might be a good idea if we want
Addresses issue #3005
Description
I moved the starter_project from its own repo into this repo. I copied the starter project from the `dbt-yml-config-version-2` tag. With this change, `dbt init` stops cloning the project from the other repo; instead, the starter project is packaged within the dbt release and copied to the given directory.
Checklist
- I have updated the `CHANGELOG.md` and added information about my change to the "dbt next" section.