Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,98 @@ | ||
# Automatic dependency installation | ||
|
||
This document outlines dependency handling for the tools used by the template. | ||
|
||
## Summary | ||
|
||
The global SCons download can be replaced by a distribution of scons-local at the top level of the template. This entirely removes the dependency on a system-wide SCons installation. | ||
|
||
Package dependencies for each language should be enumerated in config-global. We write a script in gslab_python that reads config-global and calls additional scripts in other languages to check for, install, and optionally update these dependencies. The Python script is executed at runtime before SCons runs any build steps, and it runs a language-specific script only if that language's builder is active. All dependencies are tracked in the same way and installed without user interaction. | ||
|
||
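The dispatch step described above could be sketched roughly as follows. This is only an illustration: the dictionary layout standing in for config-global, the installer script names, and the `--update` flag are all hypothetical, since the note does not fix a file format or script names.

```python
import sys

# Hypothetical stand-in for the parsed contents of config-global.
# The real file format and key names are not specified in this note.
EXAMPLE_CONFIG = {
    "python": {"packages": ["PyYAML"]},
    "r": {"packages": ["ggplot2"]},
}

# Hypothetical mapping from language to the script that checks for and
# installs that language's packages.
INSTALLERS = {
    "python": [sys.executable, "install_python_deps.py"],
    "r": ["Rscript", "install_r_deps.R"],
}


def plan_installs(config, active_builders, update=False):
    """Return one installer command per language whose builder is active."""
    commands = []
    for language in sorted(config):
        if language not in active_builders:
            continue  # the build never uses this language, so skip it
        packages = config[language].get("packages", [])
        if not packages:
            continue  # nothing to install for this language
        command = list(INSTALLERS[language]) + list(packages)
        if update:
            command.append("--update")  # hypothetical flag of the installer scripts
        commands.append(command)
    return commands
```

Each returned command could then be passed to `subprocess.check_call` before SCons starts its build steps. Keeping the planning separate from the execution makes it easy to print what would be installed without touching the user's machine.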
We could ask people to download a distribution of `anaconda` for its bundled Python and R binaries. This seems like more trouble than just asking people to install them according to the basic instructions. | ||
|
||
There's no workaround for git, git-lfs, Stata, MATLAB, LyX, or LaTeX. These should (as necessary) be enumerated in the top-level `README.md` alongside any other dependencies. | ||
|
||
## Scons-local | ||
|
||
We should include a [scons-local](http://scons.org/faq.html#What.27s_the_difference_between_the_scons.2C_scons-local.2C_and_scons-src_packages.3F) distribution at the top level of the template. | ||
|
||
> It's intended to be dropped in to and shipped with packages of other software that want to build with SCons, but which don't want to have to require that their users install SCons. | ||
|
||
The actual use is similar to that of scons, except we'd run the build from a directory using | ||
|
||
```bash | ||
python ../scons/scons.py <optional args> | ||
``` | ||
|
||
instead of | ||
|
||
```bash | ||
scons <optional args> | ||
``` | ||
|
||
### Open questions | ||
|
||
* Do we include the unpacked directory (2.3 MB, assumed above) or a compressed archive of scons-local (650 KB)? | ||
* Cloning will be slightly faster with the compressed archive, though the difference is negligible compared to the PDFs. | ||
* SCons seems to assume that only the compressed archive is included. | ||
* There's room for paths to break when unpacking the archive. | ||
* There's one more step for each fresh clone if we only include the compressed archive. | ||
|
||
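If we ship only the compressed archive, the extra step on a fresh clone could be automated along these lines. This is a sketch under assumptions: the archive and target paths are hypothetical, and it assumes `scons.py` sits at the root of the archive.

```python
import tarfile
from pathlib import Path


def ensure_scons_local(archive, target_dir):
    """Unpack the archive into target_dir unless scons.py is already there.

    Returns the path to scons.py. Assumes scons.py is at the archive root.
    """
    archive = Path(archive)
    target_dir = Path(target_dir)
    entry_point = target_dir / "scons.py"
    if not entry_point.exists():
        target_dir.mkdir(parents=True, exist_ok=True)
        with tarfile.open(archive) as tar:
            tar.extractall(target_dir)
    if not entry_point.exists():
        # Fail loudly if the archive layout is not what we assumed.
        raise FileNotFoundError(
            "{} missing after unpacking {}".format(entry_point, archive)
        )
    return entry_point
```

The explicit check after unpacking is there to surface the "paths can break" concern immediately, rather than at the first build.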
## Packages | ||
|
||
The proposed package management solution is parsimonious and standard, but it's off the beaten track as far as "real people" are concerned. Below I outline the way developers seem to manage their packages for Python and R. I also | ||
|
||
### Python | ||
|
||
Packages, versions, and remote locations can be listed in `requirements.txt`. It looks like this: | ||
|
||
```txt | ||
PyYaml | ||
-e git+https://github.com/gslab-econ/gslab_python@4.1.0#egg=gslab-tools | ||
``` | ||
|
||
These packages can be installed at the specified versions from the specified locations via | ||
|
||
```bash | ||
pip install -r requirements.txt | ||
``` | ||
|
||
This will overwrite other packages of the same name in the default package storage location. | ||
|
||
See [here](https://pip.pypa.io/en/stable/user_guide/) for more detail. Also note that we can give `requirements.txt` an arbitrary name. | ||
|
||
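A wrapper that runs this install step without user interaction might look like the sketch below. The function names are hypothetical. Invoking pip as `python -m pip` through `sys.executable` is a design choice here, so that packages land in the same interpreter that runs the build.

```python
import subprocess
import sys


def pip_install_command(requirements="requirements.txt", upgrade=False):
    """Build the pip command for a requirements file of any name."""
    command = [sys.executable, "-m", "pip", "install", "-r", str(requirements)]
    if upgrade:
        command.append("--upgrade")  # also update packages already installed
    return command


def install_requirements(requirements="requirements.txt", upgrade=False):
    """Run the install and raise CalledProcessError if pip fails."""
    subprocess.check_call(pip_install_command(requirements, upgrade))
```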
We could use `requirements.txt` to track all Python package dependencies for each project, instead of tracking them in the `README.md`. This would cut down on clutter and speed up development. | ||
|
||
#### Open questions | ||
|
||
##### Virtual environments | ||
|
||
Most people seem to use a `requirements.txt` in conjunction with a project-specific virtual environment ("virtualenv"). The virtualenv keeps package installation local to the project, so `requirements.txt` can't overwrite packages the user has already installed. | ||
|
||
Python virtualenvs are [platform dependent and cannot be moved to machines with different environments](https://virtualenv.pypa.io/en/stable/userguide/#making-environments-relocatable). Real people use virtualenvs for their own sanity, but they're not something we can easily force on other users. I'm in favor of using `requirements.txt` for global package installation. | ||
|
||
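For comparison, the project-local setup discussed above can be sketched with the standard-library `venv` module. Note the swap: `venv` is used here in place of the third-party `virtualenv` tool, and it assumes Python 3. The `.venv` directory name is a hypothetical choice.

```python
import venv
from pathlib import Path


def create_project_env(project_dir, name=".venv"):
    """Create a virtual environment inside the project directory."""
    env_dir = Path(project_dir) / name
    # with_pip=False keeps the sketch offline; pass with_pip=True to
    # bootstrap pip so a requirements file can be installed into the env.
    venv.create(str(env_dir), with_pip=False)
    return env_dir
```

The environment is created fresh on each machine rather than copied between machines, which sidesteps the relocation problem but still adds a setup step for every user.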
##### `conda` | ||
|
||
We can supposedly use `conda` to create platform-independent virtualenvs bound to a particular version of Python. I spent a few hours playing with this and was finally able to load `PyYaml` in an environment, but I couldn't get `gslab_scons` working. The `conda` method might be more robust, but it seems tough to work with. It also introduces dependencies on the `conda` distribution. | ||
|
||
### R | ||
|
||
We can use [`packrat`](http://rstudio.github.io/packrat/) to store a snapshot of all loaded R packages for a project in `template/analysis`. When a user (or scons) starts R in a `packrat` directory, those packages (and no others) are automatically loaded. If any required package (including `packrat` itself) isn't installed, then `packrat` will install that package as well. | ||
|
||
This seems like a great tool. It's developed by the RStudio team, and I'm definitely in favor of making it standard. We can move R package dependencies from `README.md` into `packrat` to help cut down on clutter and speed up development. | ||
|
||
#### Open questions | ||
|
||
##### Masking packages | ||
|
||
One issue is that `packrat` masks all packages not in a snapshot when you're working inside its directory, so you might need to re-install `ggplot2` locally even if you have it globally. I could see this upsetting some users if code breaks in unexpected places or they need to install packages they're used to having. | ||
|
||
##### Subdirectories | ||
|
||
By default, `packrat` only runs when R is started from within its directory, not from any subdirectory. This could slow down interactive coding if packages are stored in `packrat` that aren't available system-wide. It's not a huge deal though, as missing packages can always be installed a la carte, and `packrat` can be `init`ed from any directory. | ||
|
||
This isn't a problem for `scons` because each script is executed from the same directory as the `SConstruct`, so we just need to put the `packrat` directory there. It's also not a problem if scripts in `source` load packages that aren't stored in `packrat`, as `packrat` will automatically attempt to install and save them at runtime. | ||
|
||
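Because the `packrat` directory has to sit next to the `SConstruct`, a small pre-build check could confirm it is in place. The sketch below assumes the usual `packrat/packrat.lock` layout; the function name is hypothetical.

```python
from pathlib import Path


def packrat_lockfile(sconstruct_dir):
    """Return the packrat lockfile next to the SConstruct, or None if absent."""
    lockfile = Path(sconstruct_dir) / "packrat" / "packrat.lock"
    return lockfile if lockfile.is_file() else None
```

A build script could call this before running any R builder and print a clear message when the snapshot is missing, instead of letting R fail partway through the build.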
## Other Notes | ||
|
||
Maybe add a link to [chocolatey](https://chocolatey.org/) for Windows users. |