Python Environment Search

Don Jayamanne edited this page May 20, 2024 · 6 revisions

Environment types supported and info retrieved

Kind Env Path Python Exe Python Version Tool Exe Tool Version Python run Command

Notes on each column:

  • Env Path: Same as `sysprefix`. Path to the environment folder. In the case of conda, this is the conda env folder, even if Python is not found in it.
  • Python Exe: Present, except for Conda environments, as we can have conda without Python (such as a Conda env for Rust or Java).
  • Python Version: E.g. 3.12.1. Must contain all 3 parts of the version. `3.12` is not enough, it must be `3.12.1`. Can also be `3.10.0-a1` or the like.
  • Tool Exe: Path to the tool, e.g. fully qualified path to conda, pyenv, etc.
  • Tool Version: Version of the tool, such as conda, pyenv, etc.
  • Python run Command: Cmd to run Python. E.g. when dealing with a global Python install, this value is `[]`. However for conda this would be `[, run, -n, python]`.

Windows Registry N/A N/A
Windows Store
❌ Best we can get is the first two parts of the version (e.g. 3.12).
However this has been deemed not desirable.
N/A N/A
PyEnv
❌ No way to get this without spawning the pyenv executable.
Conda
PipEnv
❌ This can be installed anywhere. Even if found globally, a user can install pipenv into a virtual environment or some other Python env. It can be done, but we cannot guarantee that it will be the right one. It will be a best-guess effort. Everything else (other items discovered/identified) is guaranteed to be correct.
❌ Same as tool exe
Homebrew N/A N/A
VirtualEnvWrapper
❌ This can be installed anywhere. Even if found globally, a user can install the tool into a virtual environment or some other Python env. It can be done, but we cannot guarantee that it will be the right one. It will be a best-guess effort. Everything else (other items discovered/identified) is guaranteed to be correct.
❌ Same as tool exe
Venv N/A N/A
VirtualEnv
❌ This can be installed anywhere. Even if found globally, a user can install the tool into a virtual environment or some other Python env. It can be done, but we cannot guarantee that it will be the right one. It will be a best-guess effort. Everything else (other items discovered/identified) is guaranteed to be correct.
❌ Same as tool exe
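
The three-part rule for Python Version is easy to check mechanically. The following is a minimal sketch, not the project's implementation: the function name `is_full_version` and the exact suffix pattern are assumptions made for illustration.

```python
import re

# Three numeric parts, optionally followed by a pre-release style suffix
# such as "-a1" or "rc1". The accepted suffix shape is an assumption.
_FULL_VERSION = re.compile(r"^\d+\.\d+\.\d+([.\-]?[A-Za-z][A-Za-z0-9.]*)?$")


def is_full_version(version: str) -> bool:
    """Return True only when all three version parts are present."""
    return _FULL_VERSION.match(version) is not None
```

A two-part value such as `3.12` is rejected, so a locator could leave the version empty instead of reporting a partial one.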

TODO

  • Searching Python executables in current Path
  • Getting missing information of Python environments (version, envPath, etc). The current assumption is that this will be done in the TS layer.
  • Searching for Python in Workspace folders.

Design principles

  • Avoid spawning processes (this is slow, e.g. conda)
  • Async only if required
    The sync version has proved fast enough that async is not warranted (yet).
  • Report environments as they are discovered.
    Not done. Unfortunately this hasn't been done in the current implementation. Performance was good enough, as we find environments in a few milliseconds. However, if a user has mounted a network drive, the search could take a few seconds. Finding all conda envs on a network drive could therefore be very slow, and reporting them as they are found is more efficient.
  • Identify environment information in one sweep.
    Avoid incremental buildup of env information (unnecessary and problematic).

Global Search algorithm

  1. First search the registry. Why? Because this is a reliable source of truth.
    It will never contain duplicates.
    It will always contain the latest and accurate information.
    It can point to Conda environments, and we can search for conda from the path provided in the registry.

  2. Search virtualenvwrapper. Why? Because there's a specific location where these are stored.
    It will never contain duplicates or symlinks.
    If the envs in the special location are not pipenv, then they must be virtualenvwrapper and nothing else.

  3. Search pyenv. Why? Because there's a specific location where these are stored.
    It will never contain duplicates or symlinks.

  4. Homebrew. Why? Because there's a specific location where these are stored.
    It will contain symlinks, but they are in known locations.

  5. Conda. Why? It's impossible for environments to be mistakenly identified as conda.
    It's either a conda env or not.
    But this is done after pyenv and Windows Registry, as both of those can also contain conda.

  6. Windows Store. This is done after Registry.

  7. Finally, go through all known directories where environments can be found, e.g. /usr/bin, /usr/local/bin, /opt, etc., and look for Python environments.
    These environments can only be one of:

  • virtualenvwrapper
  • pipenv
  • venv
  • virtualenv

At every stage we need to ensure the subsequent search does not try to process an environment that was already processed in a previous step.
This way we avoid duplicates.
Some of the above steps can definitely be done in parallel.
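
The ordering and de-duplication rule above can be sketched as follows. This is an illustrative outline only: `Env`, `Locator` and `search_all` are hypothetical names, and the sketch filters duplicates after each locator returns instead of preventing the later locator from visiting the folder at all.

```python
from pathlib import Path
from typing import Callable, Iterable, List, NamedTuple, Set


class Env(NamedTuple):
    kind: str
    path: Path


# A locator is any callable returning the environments it can identify.
Locator = Callable[[], Iterable[Env]]


def search_all(locators: List[Locator]) -> List[Env]:
    """Run locators in priority order; the first claim on a path wins."""
    seen: Set[Path] = set()
    found: List[Env] = []
    for locate in locators:
        for env in locate():
            if env.path in seen:
                continue  # already processed by an earlier step
            seen.add(env.path)
            found.append(env)
    return found
```

Passing the locators in the order listed above (registry, virtualenvwrapper, pyenv, and so on) means an environment that a later, more generic step would also find keeps the kind assigned by the earlier step.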

Details of each environment type

1. Windows Registry

OS

  • Windows

Limitations

  • Currently only looks for installations from the PythonCore and ContinuumAnalytics keys. Should we iterate through all the keys under HKLM/Software & HKCU/Software? /cc @karthiknadig

Pseudo code for algorithm
for company of [PythonCore, ContinuumAnalytics]:
    for each key in [HKLM, HKCU]:
        for each installed_version in `<key>/Software/Python/<company>`
            // installed_version are values like 3.12, 3.10, 3.9, etc
            install_key = `<key>/Software/Python/<company>/<installed_version>/InstallPath`
            env_path = `install_key/(Default)`
            exe = `install_key/(ExecutablePath)`

            if company == PythonCore and `exe` exists on disc:
                version = `install_key/(Version)` // SysVersion contains only first 2 parts of version
                display_name = `install_key/(DisplayName)`
                

                👍 track this environment

            else if company == ContinuumAnalytics and `exe` exists on disc:
                We treat this as a conda env
                👍 Now use `conda` algorithm to get all conda environments under the directory `env_path`.

2. Windows Store

OS

  • Windows

Limitations

  • No way to get the full version (we can only get the first two parts of the version)

Pseudo code for algorithm
for each directory under `<home>/AppData/Local/Microsoft/WindowsApps`:
    if directory does not start with `PythonSoftwareFoundation.Python.`:
        continue

    if `python.exe` does not exist in the directory:
        continue
    app_model_key = `HKCU/Software/Classes/Local Settings/Software/Microsoft/Windows/CurrentVersion/AppModel`;
    package_name = `<app_model_key>/SystemAppData/<directory name>/Schemas/(PackageFullName)`
    key = `<app_model_key>/Repository/Packages/<package_name>`
    env_path = `<key>/(PackageRootFolder)`
    display_name = `<key>/(DisplayName)`
    exe = `python.exe`
    // No way to get the full version information.
    👍 track this environment

3. Homebrew

OS

  • MacOS
  • Linux

Notes

Pseudo code for algorithm
homebrew_dir = find this folder (either `HOMEBREW_PREFIX` or default to directory defined here https://docs.brew.sh/Installation)
for each file under `<homebrew_dir>/bin`:
    if we have a python executable and it's a symlink, then proceed
    if not, then skip this file

    resolve the symlink and verify the file is in one of the known homebrew directories.
    if not, then skip this file

    Extract the version from the file path.

    Compute the env_path by extracting the version information.
    The env_path is known directories on MacOS (Intel & Silicon) and Linux.
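
The Homebrew steps above could look roughly like this in Python. It is a sketch under stated assumptions: the names `HomebrewPython` and `find_homebrew_pythons` are hypothetical, and the resolved path is assumed to follow the Cellar layout `<prefix>/Cellar/python@3.12/3.12.3/bin/python3.12`, from which the version is read.

```python
import re
from pathlib import Path
from typing import Iterator, NamedTuple


class HomebrewPython(NamedTuple):
    symlink: Path
    resolved: Path
    version: str


# Assumed layout: <prefix>/Cellar/python@<major.minor>/<full version>/bin/...
_CELLAR_VERSION = re.compile(r"/Cellar/python(?:@[\d.]+)?/(\d+\.\d+\.\d+)[^/]*/")
_PYTHON_NAME = re.compile(r"^python(\d+(\.\d+)*)?$")


def find_homebrew_pythons(homebrew_dir: Path) -> Iterator[HomebrewPython]:
    """Yield python symlinks in <homebrew_dir>/bin that resolve into the Cellar."""
    for entry in sorted((homebrew_dir / "bin").iterdir()):
        # Only symlinked python executables are of interest.
        if not (entry.is_symlink() and _PYTHON_NAME.match(entry.name)):
            continue
        resolved = entry.resolve()
        match = _CELLAR_VERSION.search(resolved.as_posix())
        if match is None:
            continue  # does not point into a known Homebrew directory
        yield HomebrewPython(entry, resolved, match.group(1))
```

No process is spawned: the version comes purely from the resolved file path, in line with the design principles.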

4. Conda

OS

  • MacOS
  • Linux
  • Windows

Notes

  • Structure of a conda environment
    • A conda install folder will have a default environment with Python
    • The envs subdirectory in a conda installation contains all conda environments belonging to that conda installation
  • Conda environments
    • The list is computed from hardcoded locations (for Windows, Linux, macOS)
    • This includes locations in the .condarc file
    • This includes locations in the environments.txt file
    • This includes all subdirectories under the envs directory in all known conda installation folders
  • .condarc
    • The .condarc file can be configured to store conda environments in other locations instead of <conda install folder>/envs
    • There can be multiple conda installations on a machine, each with its own set of environments
    • There can be multiple directories defined in .condarc file to store conda environments
    • The environments.txt contains all known conda environments on the current machine (including base env, which is the same as install location of conda)
    • The directories returned by conda info --json are the directories where conda environments are stored
  • Version of conda can be found in <conda install folder>/conda-meta/conda-<version>.json file
  • Users can install multiple versions of conda:
    • In usual (default) locations & also in custom locations
    • Also can be installed via pyenv, in which case the conda installations are located under ~/.pyenv/versions/<conda install version>
  • conda-meta\<package name>-<version>.json files
    • This contains information about all conda packages installed into a conda environment.
    • The root (base) conda environment too has such a directory.
    • Given conda is a package in its own right, the version of conda can also be found in the conda-meta directory.
      (after all, you can update conda using conda update conda).
    • Similarly if Python is installed in a conda environment, the version of Python can be found in the conda-meta directory.
  • conda-meta\history file
    • This contains a history of all installations into this conda environment.
    • One key piece of information is the command used to create the conda environment itself.
    • The entry looks like this: # cmd: /Users/donjayamanne/.pyenv/versions/miniconda3-latest/bin/conda create -n myenv python=3.10
    • Thus, using the history file, we can find the conda installation folder. This is useful in cases where conda environments are created using the -p option.
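
As an illustration of reading versions from conda-meta without spawning conda, here is a minimal sketch. The helper name `package_version` is hypothetical, and the file name shape `<package>-<version>[-<build>].json` is an assumption (the notes above only show `<package name>-<version>.json`).

```python
import re
from pathlib import Path
from typing import Optional


def package_version(env_dir: Path, package: str) -> Optional[str]:
    """Return the version of `package` recorded in <env_dir>/conda-meta, if any."""
    conda_meta = env_dir / "conda-meta"
    if not conda_meta.is_dir():
        return None
    # Assumed file name shape: <package>-<version>[-<build>].json
    pattern = re.compile(rf"^{re.escape(package)}-(\d[^-]*)(?:-.*)?\.json$")
    for entry in sorted(conda_meta.iterdir()):
        match = pattern.match(entry.name)
        if match:
            return match.group(1)
    return None
```

Calling it with `"conda"` on the install folder would give the tool version, and with `"python"` on any env folder the Python version; a `None` result for `"python"` corresponds to a conda env without Python.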

What if conda is installed in some custom locations that we have no idea about? In such cases the assumption is that the environments.txt file will contain an entry to the base env. Using that information we can get the conda directory and get the conda exe and version info.

Even if environments.txt file is empty, we will look for environments in known locations and from there we can find the conda install folder (recall history file).

What if we have a custom conda env created in current workspace folder, and we do not know where Conda is installed? In such cases we can just inspect the conda-meta/history file in the conda env folder and get the conda installation folder.
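
Recovering the conda installation from a history file might be sketched like this. The function names are hypothetical, only the `create` form of the `# cmd:` line shown earlier is handled, and the POSIX layout `<install>/bin/conda` is assumed (Windows differs).

```python
import re
from pathlib import PurePosixPath
from typing import Optional

# Matches lines such as "# cmd: /path/to/bin/conda create -n myenv python=3.10"
_CMD_LINE = re.compile(r"^#\s*cmd:\s*(?P<exe>.+?)\s+create\s")


def conda_exe_from_history(history_text: str) -> Optional[PurePosixPath]:
    """Return the conda executable named in the first `# cmd: ... create` line."""
    for line in history_text.splitlines():
        match = _CMD_LINE.match(line)
        if match:
            return PurePosixPath(match.group("exe"))
    return None


def conda_install_dir(conda_exe: PurePosixPath) -> PurePosixPath:
    # Assumes <install>/bin/conda; the Windows layout is not covered here.
    return conda_exe.parent.parent
```

The text would come from `<env folder>/conda-meta/history`; a `None` result means the env cannot be tied to an installation this way.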

How do we generate the command to run Python in an environment? If the Conda env is in the envs folder, then use <conda exe> run -n <env name> python. If the Conda env is the root (base) folder, then use <conda exe> run -n base python. For all other cases use <conda exe> run -p <fully qualified path to folder> python.
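
The three cases for the run command can be written as a small function. This is a sketch with a hypothetical name, `conda_run_command`; it includes the `run` subcommand, following the run-command column note and the conda pseudo code on this page, and it only treats `<install>/envs` as the named-env location.

```python
from pathlib import Path
from typing import List


def conda_run_command(conda_exe: Path, install_dir: Path, env_dir: Path) -> List[str]:
    """Build the argument list used to run Python inside a conda env."""
    if env_dir == install_dir:
        # The root (base) environment is the install folder itself.
        return [str(conda_exe), "run", "-n", "base", "python"]
    if env_dir.parent == install_dir / "envs":
        # Named environment stored under <install>/envs.
        return [str(conda_exe), "run", "-n", env_dir.name, "python"]
    # Anything else is addressed by its fully qualified path.
    return [str(conda_exe), "run", "-p", str(env_dir), "python"]
```

Returning a list instead of a single string keeps paths with spaces intact when the command is later spawned.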

Pseudo code for algorithm
// Step 1
// Get a list of all known directories where conda environments can be found
// 1. environments.txt file
// 2. .condarc file
// 3. Other known locations

// Step 2
// We hardcode some of the commonly known install directories of conda, miniconda, miniforge, etc for all platforms.
for each known_install_folder in [<home>/anaconda3, <home>/miniconda3, etc]:
    conda_exe = `<known_install_folder>/bin/conda` (windows is slightly different)
    conda_version = `<known_install_folder>/conda-meta/conda-<version>.json`
    python_exe = `<known_install_folder>/bin/python` (windows is slightly different)
    python_version = `<known_install_folder>/conda-meta/python-<version>.json`

    // Step 2.1
    // We now have conda exe, version, default python information
    // Conda run command is computed as `[<fully qualified path to conda_exe>, run, -n, <name>, python]`
    for each env in `<known_install_folder>/envs`:
        // env is a conda environment
        // Find python exe and version in the conda-meta directory
        // If python is not found, then this is a conda env without Python.
        // These are named environments that are activated (run) using the `-n` option.

    // Previously we captured a list of all known conda envs 
    // Go through those one by one and inspect the conda-meta/history file
    // And check whether that env was created by this current conda installation
    // If so, then associate that env with this conda installation
    // Next remove that env folder from the list captured in step 1.

// Step 3
// Finally go through all of the remaining conda envs that were captured in step 1
// & did not get removed by step 2.1.
// Go into the env folder one by one
// Inspect the conda-meta/history file and try to find the installation location of the conda by parsing the `cmd:` line.
// If we find a conda installation, then process that folder as we did inside step 2.1

5. Pyenv

OS

  • MacOS
  • Linux
  • Windows

Notes

  • Structure of pyenv directory
    • The versions folder contains all installed versions/copies of Python
  • Windows has a separate implementation of pyenv
    • Windows version does not support installation of various flavours of Python, not even conda.
  • For Python we only focus on the standard python installations shipped by pyenv. I.e. jython, pypy, etc are not considered. We only look for folders in versions directory that contains just version info.
  • The version of pyenv can be extracted from the file path of the pyenv installation. If installed using Homebrew, pyenv is found at /opt/homebrew/bin/pyenv or /usr/local/bin/pyenv, which resolves to a real path such as /opt/homebrew/Cellar/pyenv/2.4.0/libexec/pyenv or /usr/local/Cellar/pyenv/2.4.1/libexec/pyenv.

Conda:

  • The versions directory is the only place where Python installations are stored.
  • Conda can be installed into the versions directory via pyenv.
  • Identifying a conda install is easy, just look for conda exe and a few other files. Once we find such folders, we pass that onto the conda algorithm to get all conda environments. These are treated as conda environments and not pyenv.

Virtual Envs:

  • Using the pyenv virtual envs plugin, one can create virtual environments.
  • These are also located in the same versions directory.
  • However these have a pyvenv.cfg file in the root of the environment.
  • When creating virtual envs using this plugin, pyenv creates an envs directory in the root of the Python environment that was used to create this env.
    I.e. the virtual env is symlinked in ~/.pyenv/versions as well as ~/.pyenv/<python version>/envs. We never list these environments (under the envs directory); there is no need, as they are just duplicates.

Commands used to run Python:

  • For conda environments, the commands used to run/activate the envs are generated by the conda algorithm.
  • For all others we just use <fully qualified python exe> as the command.

Pseudo code for algorithm
for each dir in pyenv directory:
    if dir is a conda installation folder:
        // Pass this onto the conda algorithm
        continue

    version = extract version from folder name
    env_path = the directory itself

    if <dir>/pyvenv.cfg exists:
        // This is a virtual env in pyenv
    else:
        // This is a regular pyenv installation
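
A runnable sketch of this classification follows. The names are hypothetical, and two details are assumptions: a conda install is recognised by a conda executable plus a conda-meta directory, and a "version only" folder name is one that starts with three numeric parts.

```python
import re
from pathlib import Path
from typing import Iterator, NamedTuple, Optional


class PyenvEntry(NamedTuple):
    kind: str                # "conda", "pyenv-virtualenv" or "pyenv"
    env_path: Path
    version: Optional[str]   # only known from the folder name for "pyenv"


# Folder names such as "3.12.1" or "3.13.0a1"; "pypy3.10-7.3.12" is excluded.
_VERSION_ONLY = re.compile(r"^\d+\.\d+\.\d+[A-Za-z0-9.\-]*$")


def is_conda_install(folder: Path) -> bool:
    # Assumed markers of a conda install: the conda exe and a conda-meta folder.
    has_exe = (folder / "bin" / "conda").exists() or (folder / "Scripts" / "conda.exe").exists()
    return has_exe and (folder / "conda-meta").is_dir()


def classify_pyenv_versions(versions_dir: Path) -> Iterator[PyenvEntry]:
    for folder in sorted(versions_dir.iterdir()):
        if not folder.is_dir():
            continue
        if is_conda_install(folder):
            yield PyenvEntry("conda", folder, None)  # hand over to the conda algorithm
        elif (folder / "pyvenv.cfg").exists():
            yield PyenvEntry("pyenv-virtualenv", folder, None)
        elif _VERSION_ONLY.match(folder.name):
            yield PyenvEntry("pyenv", folder, folder.name)
```

Folders that match none of the three cases (for example other Python flavours) are skipped, matching the note that only standard installations are considered.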

6. PipEnv

OS

  • MacOS
  • Linux
  • Windows

Notes

  • They have a .project file in the root of the environment. This file contains the path to the project directory that's associated with this environment.
  • They have a pyvenv.cfg file in the root of the environment. This file contains the version of Python used in this environment.
  • Very similar to virtualenvwrapper environments, but the difference is that the project directory will have Pipfile and/or Pipfile.lock files.

Pseudo code for algorithm
for each dir in globally known directories:
    if `.project` file does not exist in the directory:
        continue
    if the path in the `.project` file contains a `Pipfile` file:
        // This is a pipenv environment

    version = extracted from `pyvenv.cfg` file
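
The `.project` check could be sketched as below. `pipenv_project_dir` is a hypothetical helper name; it assumes the `.project` file holds a single path as UTF-8 text.

```python
from pathlib import Path
from typing import Optional


def pipenv_project_dir(env_dir: Path) -> Optional[Path]:
    """Return the project folder if `env_dir` looks like a pipenv environment."""
    project_file = env_dir / ".project"
    if not project_file.is_file():
        return None
    text = project_file.read_text(encoding="utf-8").strip()
    if not text:
        return None  # an empty .project file tells us nothing
    project_dir = Path(text)
    # A Pipfile in the associated project marks the env as pipenv.
    if (project_dir / "Pipfile").is_file():
        return project_dir
    return None
```

A `None` result leaves the environment to a later step, such as the virtualenvwrapper check.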

7. VirtualEnvWrapper

OS

  • MacOS
  • Linux
  • Windows

Notes

  • They are regular Python environments created in a specific location.
  • The location is defined in the WORKON_HOME environment variable.
  • Else defaults to ~/.virtualenvs
  • They too have a .project file in the root of the environment. This file contains the path to the project directory that's associated with this environment.
  • They have a pyvenv.cfg file in the root of the environment. This file contains the version of Python used in this environment.
  • Very similar to pipenv environments, but the difference is that the project directory will not have Pipfile or Pipfile.lock files.

Pseudo code for algorithm
for each dir in $WORKON_HOME:
    if this is a pipenv environment:
        continue

    if `.project` file exists in the directory:
        // This is a virtualenvwrapper
    version = extracted from `pyvenv.cfg` file
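
A sketch of the same idea in Python, with hypothetical function names. It resolves the search folder from WORKON_HOME (falling back to ~/.virtualenvs) and treats an env as virtualenvwrapper when it has a `.project` file whose project folder has no Pipfile or Pipfile.lock.

```python
import os
from pathlib import Path
from typing import Iterator, Mapping, Optional


def workon_home(environ: Mapping[str, str] = os.environ) -> Path:
    """Folder holding virtualenvwrapper envs: $WORKON_HOME or ~/.virtualenvs."""
    value = environ.get("WORKON_HOME")
    return Path(value).expanduser() if value else Path.home() / ".virtualenvs"


def _project_dir(env_dir: Path) -> Optional[Path]:
    project_file = env_dir / ".project"
    if not project_file.is_file():
        return None
    text = project_file.read_text(encoding="utf-8").strip()
    return Path(text) if text else None


def find_virtualenvwrapper_envs(home: Path) -> Iterator[Path]:
    if not home.is_dir():
        return
    for env_dir in sorted(home.iterdir()):
        project_dir = _project_dir(env_dir)
        if project_dir is None:
            continue
        if (project_dir / "Pipfile").exists() or (project_dir / "Pipfile.lock").exists():
            continue  # this one belongs to pipenv
        yield env_dir
```

Taking the environment mapping as a parameter keeps the lookup testable without changing the real process environment.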

8. venv

OS

  • MacOS
  • Linux
  • Windows

Notes

  • They have a pyvenv.cfg file in the root of the environment. This file contains the version of Python used in this environment.

Pseudo code for algorithm
for each dir in $WORKON_HOME:
    version = extracted from `pyvenv.cfg` file
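
Reading the version from pyvenv.cfg needs no process spawn. The following is a minimal sketch with a hypothetical function name, assuming the file contains a `version = <value>` line as written by the standard library venv module.

```python
from pathlib import Path
from typing import Optional


def version_from_pyvenv_cfg(env_dir: Path) -> Optional[str]:
    """Return the value of the `version` key in <env_dir>/pyvenv.cfg, if present."""
    cfg = env_dir / "pyvenv.cfg"
    if not cfg.is_file():
        return None
    for line in cfg.read_text(encoding="utf-8").splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip().lower() == "version":
            return value.strip()
    return None
```

The returned value would still need to satisfy the three-part version rule before being reported.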

9. VirtualEnv

OS

  • MacOS
  • Linux
  • Windows

Notes

  • They do not have a pyvenv.cfg file
  • They have activation scripts in the bin (or Scripts) directory