From 66661a7445a8b122c523b91074445c233ffda33d Mon Sep 17 00:00:00 2001
From: Benjamin Straus <37525700+bstraus1@users.noreply.github.com>
Date: Mon, 12 Oct 2020 14:34:38 -0400
Subject: [PATCH] Remove the uncertainty forest tutorials

These tutorials are on the doc branch in their most up-to-date form.
---
 ...taintyForest_Tutorial_1-Installation.ipynb |  83 ----------
 ...aintyForest_Tutorial_2-Package-Setup.ipynb |  72 --------
 ...ertaintyForest_Tutorial_3-Data-Setup.ipynb | 156 ------------------
 3 files changed, 311 deletions(-)
 delete mode 100644 tutorials/UncertaintyForestTutorials/UncertaintyForest_Tutorial_1-Installation.ipynb
 delete mode 100644 tutorials/UncertaintyForestTutorials/UncertaintyForest_Tutorial_2-Package-Setup.ipynb
 delete mode 100644 tutorials/UncertaintyForestTutorials/UncertaintyForest_Tutorial_3-Data-Setup.ipynb

diff --git a/tutorials/UncertaintyForestTutorials/UncertaintyForest_Tutorial_1-Installation.ipynb b/tutorials/UncertaintyForestTutorials/UncertaintyForest_Tutorial_1-Installation.ipynb
deleted file mode 100644
index 41386e56e0..0000000000
--- a/tutorials/UncertaintyForestTutorials/UncertaintyForest_Tutorial_1-Installation.ipynb
+++ /dev/null
@@ -1,83 +0,0 @@
-{
- "cells": [
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "# Tutorial Overview\n",
-    "This set of five tutorials (installation, package setup, data setup, running, analyzing) will explain the UncertaintyForest class. After following the steps below, you should have the ability to run the code on your own machine and interpret the results.\n",
-    "\n",
-    "# 1. Installation\n",
-    "## *Goal: Clone the repository on your local machine and understand what it includes*\n",
-    "\n",
-    "### Let's clone the repository\n",
-    "Steps:\n",
-    "1. Open the command line on your local machine (called \"Terminal\" on Mac)\n",
-    "2. Navigate to the location where you'd like to put the repository.\n",
-    "    1. Find a location in a file explorer (\"Finder\" on Mac)\n",
-    "    2. Type \"cd \" in the command prompt\n",
-    "    3. Drag and drop the folder where you'd like to place the repository from the file explorer to the command line\n",
-    "       The command prompt should show something like:\n",
-    "       `bstraus@BS-Mac ~ % cd /Users/bstraus/Desktop `\n",
-    "3. Type `git clone REPOSITORY_URL` where `REPOSITORY_URL` is replaced by the URL of the neurodata/progressive-learning repository (as of 2020-09-21, it is https://github.com/neurodata/progressive-learning)\n",
-    "4. Wait for the process to finish. You'll know it's done when the command prompt reappears. For me, that looks like: `bstraus@BS-Mac ~ %`\n",
-    "\n",
-    "Congrats! You've now cloned the progressive-learning repository.\n",
-    "\n",
-    "Last step here: from inside the cloned `progressive-learning` folder, install the package with:\n",
-    "`python3 setup.py install`\n",
-    "\n",
-    "### Let's take a tour\n",
-    "Currently, you're looking at this tutorial, which lives in progressive-learning/tutorials/.\n",
-    "This folder also currently houses a notebook running one of the experiments.\n",
-    "\n",
-    "In the root directory, we have:\n",
-    "* `progressive-learning/docs` : contains files that will tell you requirements (we'll use this later), contributing guidelines, and some other administrative files\n",
-    "\n",
-    "* `progressive-learning/experiments` : contains notebooks and results for many of the experiments that utilize the functions/classes in the repository\n",
-    "\n",
-    "* `progressive-learning/proglearn` : the heart of the repository containing the python files for the progressive learning classes. We'll focus on the UncertaintyForest class which lives in the `forest.py` file in this directory.\n",
-    "\n",
-    "* `progressive-learning/tests` : contains python files for various tests\n",
-    "\n",
-    "* `progressive-learning/tutorials` : contains python notebooks (like this one) that will guide you through using the classes in the repository and running the experiments\n",
-    "\n",
-    "In future notebooks of this tutorial, we'll discuss how to prepare to run the code for the UncertaintyForest class. That code lives in the `progressive-learning/proglearn/forest.py` file. \n",
-    "\n",
-    "But, for now, we'll prepare to do that by making a virtual environment and installing the required packages to run that code.\n",
-    "\n",
-    "### You're done with part 1 of the tutorial!\n",
-    "\n",
-    "### Move on to part 2 (called \"UncertaintyForest_Tutorial_2-Package-Setup\")\n"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": []
-  }
- ],
- "metadata": {
-  "kernelspec": {
-   "display_name": "Python 3",
-   "language": "python",
-   "name": "python3"
-  },
-  "language_info": {
-   "codemirror_mode": {
-    "name": "ipython",
-    "version": 3
-   },
-   "file_extension": ".py",
-   "mimetype": "text/x-python",
-   "name": "python",
-   "nbconvert_exporter": "python",
-   "pygments_lexer": "ipython3",
-   "version": "3.7.0"
-  }
- },
- "nbformat": 4,
- "nbformat_minor": 4
-}
diff --git a/tutorials/UncertaintyForestTutorials/UncertaintyForest_Tutorial_2-Package-Setup.ipynb b/tutorials/UncertaintyForestTutorials/UncertaintyForest_Tutorial_2-Package-Setup.ipynb
deleted file mode 100644
index 8b244c65ab..0000000000
--- a/tutorials/UncertaintyForestTutorials/UncertaintyForest_Tutorial_2-Package-Setup.ipynb
+++ /dev/null
@@ -1,72 +0,0 @@
-{
- "cells": [
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "# Tutorial Overview\n",
-    "This set of five tutorials (installation, package setup, data setup, running, analyzing) will explain the UncertaintyForest class. After following the steps below, you should have the ability to run the code on your own machine and interpret the results.\n",
-    "\n",
-    "If you haven't seen it already, take a look at the first part of this set of tutorials, called `UncertaintyForest_Tutorial_1-Installation`.\n",
-    "\n",
-    "# 2: Package Setup\n",
-    "## *Goal: Create a virtual environment and install requirements per requirements.txt in order to run the UncertaintyForest class*\n",
-    "\n",
-    "### First, let's create the virtual environment \n",
-    "**Note:** The following instructions were designed for Mac operating systems. If you're running another OS, look for the equivalent steps tailored to that OS.\n",
-    "\n",
-    "1. Open the command line on your local machine (called \"Terminal\" on Mac)\n",
-    "2. Navigate to the location where you'd like to put the virtual environment.\n",
-    "    1. Find a location in a file explorer (\"Finder\" on Mac)\n",
-    "    2. Type \"cd \" in the command prompt\n",
-    "    3. Drag and drop the folder where you'd like to place the virtual environment from the file explorer to the command line\n",
-    "       The command prompt should show something like:\n",
-    "       `bstraus@BS-Mac ~ % cd /Users/bstraus/Desktop `\n",
-    "3. Create the virtual environment by typing `python3 -m venv UncertaintyForestEnv`\n",
-    "\n",
-    "### Next, let's install the requirements for running the UncertaintyForest class\n",
-    "4. Activate the virtual environment by typing `source UncertaintyForestEnv/bin/activate`\n",
-    "5. Navigate to the folder `progressive-learning/docs/`. You can do this with the same process as in step 2 above.\n",
-    "6. Install necessary packages by typing `pip install -r requirements.txt`\n",
-    "7. You'll also want to install the following packages by typing the code below:\n",
-    "    1. `pip install jupyterlab`\n",
-    "    2. `pip install notebook`\n",
-    "    3. `pip install numpy scipy pandas scikit-learn matplotlib seaborn joblib keras tensorflow tqdm ipywidgets`\n",
-    "\n",
-    "You have now set up your virtual environment and installed the necessary packages. Note that you'll need to activate your virtual environment each time you want to run things for this class. You can do this easily by repeating steps 1, 2, and 4.\n",
-    "\n",
-    "### You're done with part 2 of the tutorial!\n",
-    "\n",
-    "### Move on to part 3 (called \"UncertaintyForest_Tutorial_3-Data-Setup\")"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": []
-  }
- ],
- "metadata": {
-  "kernelspec": {
-   "display_name": "Python 3",
-   "language": "python",
-   "name": "python3"
-  },
-  "language_info": {
-   "codemirror_mode": {
-    "name": "ipython",
-    "version": 3
-   },
-   "file_extension": ".py",
-   "mimetype": "text/x-python",
-   "name": "python",
-   "nbconvert_exporter": "python",
-   "pygments_lexer": "ipython3",
-   "version": "3.7.0"
-  }
- },
- "nbformat": 4,
- "nbformat_minor": 4
-}
diff --git a/tutorials/UncertaintyForestTutorials/UncertaintyForest_Tutorial_3-Data-Setup.ipynb b/tutorials/UncertaintyForestTutorials/UncertaintyForest_Tutorial_3-Data-Setup.ipynb
deleted file mode 100644
index c03dd3966e..0000000000
--- a/tutorials/UncertaintyForestTutorials/UncertaintyForest_Tutorial_3-Data-Setup.ipynb
+++ /dev/null
@@ -1,156 +0,0 @@
-{
- "cells": [
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "# Tutorial Overview\n",
-    "This set of five tutorials (installation, package setup, data setup, running, analyzing) will explain the UncertaintyForest class. After following the steps below, you should have the ability to run the code on your own machine and interpret the results.\n",
-    "\n",
-    "If you haven't seen them already, take a look at the first and second parts of this set of tutorials, called `UncertaintyForest_Tutorial_1-Installation` and `UncertaintyForest_Tutorial_2-Package-Setup`.\n",
-    "\n",
-    "# 3: Data Setup\n",
-    "## *Goal: Understand the data and the parameters that will be passed to the UncertaintyForest instance*"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "### First, we have to import some modules to have everything we need. "
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "The first three blocks import standard packages (numerical and plotting libraries, scikit-learn, and progress-bar and parallelism utilities), and the final block imports the actual UncertaintyForest class."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 1,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "import numpy as np\n",
-    "import seaborn as sns\n",
-    "import matplotlib.pyplot as plt\n",
-    "\n",
-    "from sklearn.ensemble import RandomForestClassifier\n",
-    "from sklearn.calibration import CalibratedClassifierCV\n",
-    "from sklearn.model_selection import train_test_split\n",
-    "from sklearn.ensemble import BaggingClassifier\n",
-    "from sklearn.tree import DecisionTreeClassifier\n",
-    "\n",
-    "from tqdm.notebook import tqdm\n",
-    "from joblib import Parallel, delayed\n",
-    "\n",
-    "from proglearn.forest import UncertaintyForest"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "### Now, we create the function that will make data that we'll train on.\n",
-    "Here, we use simulated data: each of the two classes is drawn from a Gaussian with a mean of -1 or +1 in every dimension."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 14,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "def generate_data(n, d, var):\n",
-    "    '''\n",
-    "    Parameters\n",
-    "    ---\n",
-    "    n : int\n",
-    "        The number of data points to generate (split evenly between the two classes)\n",
-    "    d : int\n",
-    "        The number of features to generate for each data point\n",
-    "    var : float\n",
-    "        The variance in the data\n",
-    "    '''\n",
-    "    # create the mean vectors for the two classes (-1 and +1 in every dimension)\n",
-    "    means = [np.ones(d) * -1, np.ones(d)]\n",
-    "\n",
-    "    # create the data with the given parameters (variance)\n",
-    "    X = np.concatenate([np.random.multivariate_normal(mean, var * np.eye(len(mean)),\n",
-    "                                                      size=int(n / 2)) for mean in means])\n",
-    "\n",
-    "    # create the labels for the data\n",
-    "    y = np.concatenate([np.ones(int(n / 2)) * mean_idx for mean_idx in range(len(means))])\n",
-    "\n",
-    "    return X, y"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "### Lastly, the parameters of the uncertainty forest are defined."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 15,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "# Real Params.\n",
-    "n_train = 50\n",
-    "n_test = 10000\n",
-    "d = 100\n",
-    "var = 0.25\n",
-    "num_trials = 10\n",
-    "n_estimators = 100"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "It will be important to understand each of these parameters, so we'll go into more depth on what they mean:\n",
-    "* `n_train` is the number of training data points that will be used to train the learner\n",
-    "* `n_test` is the number of test data points that will be used to assess how well the learner classifies\n",
-    "* `d` is the dimensionality of the input space (i.e. how many features the data has)\n",
-    "* `var` is the variance of the data\n",
-    "* `num_trials` is the number of times we'll generate data, train, and test to make sure our results are not outliers\n",
-    "* `n_estimators` is the number of trees in the forest"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "### You're done with part 3 of the tutorial!\n",
-    "\n",
-    "### Move on to part 4 (called \"UncertaintyForest_Tutorial_4-Running\")"
-   ]
-  }
- ],
- "metadata": {
-  "kernelspec": {
-   "display_name": "Python 3",
-   "language": "python",
-   "name": "python3"
-  },
-  "language_info": {
-   "codemirror_mode": {
-    "name": "ipython",
-    "version": 3
-   },
-   "file_extension": ".py",
-   "mimetype": "text/x-python",
-   "name": "python",
-   "nbconvert_exporter": "python",
-   "pygments_lexer": "ipython3",
-   "version": "3.7.0"
-  }
- },
- "nbformat": 4,
- "nbformat_minor": 4
-}