diff --git a/.github/workflows/ci-notebook.yml b/.github/workflows/ci-notebook.yml index 08f72687042..97b57cb580a 100644 --- a/.github/workflows/ci-notebook.yml +++ b/.github/workflows/ci-notebook.yml @@ -45,8 +45,27 @@ jobs: pip install --requirement requirements/test.txt --quiet --find-links https://download.pytorch.org/whl/torch_stable.html pip install --requirement requirements/notebooks.txt --quiet --find-links https://download.pytorch.org/whl/torch_stable.html + - name: Cache datasets + uses: actions/cache@v2 + with: + path: flash_examples/finetuning # This path is specific to Ubuntu + # Look to see if there is a cache hit for the corresponding requirements file + key: flash-datasets_finetuning + + - name: Cache datasets + uses: actions/cache@v2 + with: + path: flash_examples/predict # This path is specific to Ubuntu + # Look to see if there is a cache hit for the corresponding requirements file + key: flash-datasets_predict + - name: Run Notebooks run: | - #treon --threads=1 --exclude=notebooks/kaggle_tumor.ipynb # with more threads this requires to much RAM - # ignore error code 5 (no tests) - sh -c 'pytest notebooks/ ; ret=$?; [ $ret = 5 ] && exit 0 || exit $ret' + set -e + jupyter nbconvert --to script flash_notebooks/finetuning/tabular_classification.ipynb + jupyter nbconvert --to script flash_notebooks/predict/classify_image.ipynb + jupyter nbconvert --to script flash_notebooks/predict/classify_tabular.ipynb + + ipython flash_notebooks/finetuning/tabular_classification.py + ipython flash_notebooks/predict/classify_image.py + ipython flash_notebooks/predict/classify_tabular.py \ No newline at end of file diff --git a/CHANGELOG.md b/CHANGELOG.md new file mode 100644 index 00000000000..4a589789b76 --- /dev/null +++ b/CHANGELOG.md @@ -0,0 +1,19 @@ +# Changelog + +All notable changes to this project will be documented in this file. + +The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/). 
+ + +## [0.1.0] - 01/02/2021 + +### Added + +- Added flash_notebook examples ([#9](https://github.com/PyTorchLightning/pytorch-lightning/pull/9)) + +### Changed + +### Fixed + + +### Removed \ No newline at end of file diff --git a/flash/vision/classification/data.py b/flash/vision/classification/data.py index 552bba297a1..1f8c0f6f968 100644 --- a/flash/vision/classification/data.py +++ b/flash/vision/classification/data.py @@ -318,7 +318,7 @@ def from_folders( train/dog/xxz.png train/cat/123.png train/cat/nsdf3.png - train/cat/asd932_.png + train/cat/asd932.png Args: train_folder: Path to training folder. diff --git a/flash_notebooks/finetuning/image_classification.ipynb b/flash_notebooks/finetuning/image_classification.ipynb new file mode 100644 index 00000000000..80e9fa4a045 --- /dev/null +++ b/flash_notebooks/finetuning/image_classification.ipynb @@ -0,0 +1,282 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "purple-muscle", + "metadata": {}, + "source": [ + "[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/PyTorchLightning/lightning-flash/blob/master/flash_notebooks/finetuning/image_classification.ipynb)" + ] + }, + { + "cell_type": "markdown", + "id": "usual-israeli", + "metadata": {}, + "source": [ + "In this notebook, we'll go over the basics of Lightning Flash by finetuning an ImageClassifier on the [Hymenoptera Dataset](https://www.kaggle.com/ajayrana/hymenoptera-data), which contains images of ants and bees.\n", + "\n", + "# Finetuning\n", + "\n", + "Finetuning consists of four steps:\n", + " \n", + " - 1. Train a source neural network model on a source dataset. For computer vision, it is traditionally the [ImageNet dataset](http://www.image-net.org/search?q=cat). As training is costly, libraries such as [Torchvision](https://pytorch.org/docs/stable/torchvision/index.html) provide popular pre-trained model architectures. 
In this notebook, we will be using their [resnet-18](https://pytorch.org/hub/pytorch_vision_resnet/).\n", + " \n", + " - 2. Create a new neural network called the target model. Its architecture replicates the source model and its parameters, except the last layer, which is removed. This model without its last layer is traditionally called a backbone.\n", + " \n", + " - 3. Add new layers after the backbone, where the last output size is the number of target dataset categories. Those new layers, traditionally called the head, will be randomly initialized, while the backbone will keep its pre-trained weights from ImageNet.\n", + " \n", + " - 4. Train the target model on a target dataset, such as the Hymenoptera Dataset with ants and bees. At the start of training, the backbone will be frozen, meaning its parameters won't be updated. Only the model head will be trained to properly distinguish ants and bees. On reaching the first finetuning milestone, the last layers of the backbone will be unfrozen and start to be trained. On reaching the second finetuning milestone, the remaining layers of the backbone will be unfrozen and the entire model will be trained. In Flash, this is expressed as `trainer.finetune(..., unfreeze_milestones=(first_milestone, second_milestone))`.\n", + "\n", + " \n", + "\n", + "---\n", + " - Give us a ⭐ [on GitHub](https://www.github.com/PytorchLightning/pytorch-lightning/)\n", + " - Check out [Flash documentation](https://lightning-flash.readthedocs.io/en/latest/)\n", + " - Check out [Lightning documentation](https://pytorch-lightning.readthedocs.io/en/latest/)\n", + " - Join us [on Slack](https://join.slack.com/t/pytorch-lightning/shared_invite/zt-f6bl2l0l-JYMK3tbAgAmGRrlNr00f1A)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "sapphire-counter", + "metadata": {}, + "outputs": [], + "source": [ + "%%capture\n", + "! 
pip install lightning-flash" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "chubby-incidence", + "metadata": {}, + "outputs": [], + "source": [ + "import flash\n", + "from flash.core.data import download_data\n", + "from flash.vision import ImageClassificationData, ImageClassifier" + ] + }, + { + "cell_type": "markdown", + "id": "central-netscape", + "metadata": {}, + "source": [ + "### 1. Download the data\n", + "The data are downloaded from a URL and saved in a 'data' directory." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "through-munich", + "metadata": {}, + "outputs": [], + "source": [ + "download_data(\"https://pl-flash-data.s3.amazonaws.com/hymenoptera_data.zip\", 'data/')" + ] + }, + { + "cell_type": "markdown", + "id": "chief-footwear", + "metadata": {}, + "source": [ + "

### 2. Load the data

\n", + "\n", + "Flash Tasks have built-in DataModules that you can use to organize your data. Pass in train, validation, and test folders and Flash will take care of the rest.\n", + "This creates an ImageClassificationData object from folders of images arranged in this way:\n", + "\n", + "\n", + " train/dog/xxx.png\n", + " train/dog/xxy.png\n", + " train/dog/xxz.png\n", + " train/cat/123.png\n", + " train/cat/nsdf3.png\n", + " train/cat/asd932.png\n", + "\n", + "\n", + "Note: The content of each sub-folder will be treated as a separate class." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "helpful-glass", + "metadata": {}, + "outputs": [], + "source": [ + "datamodule = ImageClassificationData.from_folders(\n", + " train_folder=\"data/hymenoptera_data/train/\",\n", + " valid_folder=\"data/hymenoptera_data/val/\",\n", + " test_folder=\"data/hymenoptera_data/test/\",\n", + ")" + ] + }, + { + "cell_type": "markdown", + "id": "extraordinary-tablet", + "metadata": {}, + "source": [ + "### 3. Build the model\n", + "Create the ImageClassifier task. By default, the ImageClassifier task uses a [resnet-18](https://pytorch.org/hub/pytorch_vision_resnet/) backbone to train or finetune your model.\n", + "For the [Hymenoptera Dataset](https://www.kaggle.com/ajayrana/hymenoptera-data), which contains images of ants and bees, ``datamodule.num_classes`` will be 2.\n", + "The backbone can easily be changed with `ImageClassifier(backbone=\"resnet50\")`, or you can provide your own with `ImageClassifier(backbone=my_backbone)`." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "adjusted-acrobat", + "metadata": {}, + "outputs": [], + "source": [ + "model = ImageClassifier(num_classes=datamodule.num_classes)" + ] + }, + { + "cell_type": "markdown", + "id": "sweet-pottery", + "metadata": {}, + "source": [ + "### 4. Create the trainer. Run three times on data\n", + "\n", + "The trainer object can be used for training or fine-tuning tasks on new sets of data. 
\n", + "\n", + "You can pass in parameters to control the training routine: limit the number of epochs, run on GPUs or TPUs, etc.\n", + "\n", + "For more details, read the [Trainer Documentation](https://pytorch-lightning.readthedocs.io/en/latest/trainer.html).\n", + "\n", + "In this demo, we limit fine-tuning to three epochs using max_epochs=3." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "molecular-string", + "metadata": {}, + "outputs": [], + "source": [ + "trainer = flash.Trainer(max_epochs=3)" + ] + }, + { + "cell_type": "markdown", + "id": "criminal-string", + "metadata": {}, + "source": [ + "### 5. Finetune the model\n", + "The `unfreeze_milestones=(0, 1)` argument unfreezes the last layers of the backbone at epoch `0` and the rest of the backbone at epoch `1`. " + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "documentary-donna", + "metadata": {}, + "outputs": [], + "source": [ + "trainer.finetune(model, datamodule=datamodule, unfreeze_milestones=(0, 1))" + ] + }, + { + "cell_type": "markdown", + "id": "civic-wednesday", + "metadata": {}, + "source": [ + "### 6. Test the model" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "public-regard", + "metadata": {}, + "outputs": [], + "source": [ + "trainer.test()" + ] + }, + { + "cell_type": "markdown", + "id": "above-dietary", + "metadata": {}, + "source": [ + "### 7. Save it!" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "canadian-nudist", + "metadata": {}, + "outputs": [], + "source": [ + "trainer.save_checkpoint(\"image_classification_model.pt\")" + ] + }, + { + "cell_type": "markdown", + "id": "worthy-february", + "metadata": {}, + "source": [ + "\n", + "

Congratulations - Time to Join the Community!

\n", + "
\n", + "\n", + "Congratulations on completing this notebook tutorial! If you enjoyed it and would like to join the Lightning movement, you can do so in the following ways!\n", + "\n", + "### Help us build Flash by adding support for new data-types and new tasks.\n", + "Flash aims to become the first task hub, so that anyone can get started building amazing applications using deep learning. \n", + "If you are interested, please open a PR with your contributions! \n", + "\n", + "\n", + "### Star [Lightning](https://github.com/PyTorchLightning/pytorch-lightning) on GitHub\n", + "The easiest way to help our community is just by starring the GitHub repos! This helps raise awareness of the cool tools we're building.\n", + "\n", + "* Please star [Lightning](https://github.com/PyTorchLightning/pytorch-lightning)\n", + "\n", + "### Join our [Slack](https://join.slack.com/t/pytorch-lightning/shared_invite/zt-f6bl2l0l-JYMK3tbAgAmGRrlNr00f1A)!\n", + "The best way to keep up to date on the latest advancements is to join our community! Make sure to introduce yourself and share your interests in the `#general` channel.\n", + "\n", + "### Interested in SOTA AI models? Check out [Bolts](https://github.com/PyTorchLightning/pytorch-lightning-bolts)\n", + "Bolts has a collection of state-of-the-art models, all implemented in [Lightning](https://github.com/PyTorchLightning/pytorch-lightning), that can be easily integrated into your own projects.\n", + "\n", + "* Please star [Bolts](https://github.com/PyTorchLightning/pytorch-lightning-bolts)\n", + "\n", + "### Contributions!\n", + "The best way to contribute to our community is to become a code contributor! At any time you can go to the [Lightning](https://github.com/PyTorchLightning/pytorch-lightning) or [Bolts](https://github.com/PyTorchLightning/pytorch-lightning-bolts) GitHub Issues page and filter for \"good first issue\". 
\n", + "\n", + "* [Lightning good first issue](https://github.com/PyTorchLightning/pytorch-lightning/issues?q=is%3Aopen+is%3Aissue+label%3A%22good+first+issue%22)\n", + "* [Bolts good first issue](https://github.com/PyTorchLightning/pytorch-lightning-bolts/issues?q=is%3Aopen+is%3Aissue+label%3A%22good+first+issue%22)\n", + "* You can also contribute your own notebooks with useful examples!\n", + "\n", + "### Many thanks from the entire PyTorch Lightning Team for your interest!\n", + "\n", + "" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.6.9" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/flash_notebooks/finetuning/tabular_classification.ipynb b/flash_notebooks/finetuning/tabular_classification.ipynb new file mode 100644 index 00000000000..ec1b59db6c7 --- /dev/null +++ b/flash_notebooks/finetuning/tabular_classification.ipynb @@ -0,0 +1,252 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "least-injury", + "metadata": {}, + "source": [ + "[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/PyTorchLightning/lightning-flash/blob/master/flash_notebooks/finetuning/tabular_classification.ipynb)" + ] + }, + { + "cell_type": "markdown", + "id": "effective-being", + "metadata": {}, + "source": [ + "In this notebook, we'll go over the basics of Lightning Flash by training a TabularClassifier on the [Titanic Dataset](https://www.kaggle.com/c/titanic).\n", + "\n", + "---\n", + " - Give us a ⭐ [on GitHub](https://www.github.com/PytorchLightning/pytorch-lightning/)\n", + " - Check out [Flash documentation](https://lightning-flash.readthedocs.io/en/latest/)\n", + " - 
Check out [Lightning documentation](https://pytorch-lightning.readthedocs.io/en/latest/)\n", + " - Join us [on Slack](https://join.slack.com/t/pytorch-lightning/shared_invite/zt-f6bl2l0l-JYMK3tbAgAmGRrlNr00f1A)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "infinite-profession", + "metadata": {}, + "outputs": [], + "source": [ + "%%capture\n", + "! pip install lightning-flash" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "organized-faculty", + "metadata": {}, + "outputs": [], + "source": [ + "from pytorch_lightning.metrics.classification import Accuracy, Precision, Recall\n", + "\n", + "import flash\n", + "from flash.core.data import download_data\n", + "from flash.tabular import TabularClassifier, TabularData" + ] + }, + { + "cell_type": "markdown", + "id": "daily-participation", + "metadata": {}, + "source": [ + "### 1. Download the data\n", + "The data are downloaded from a URL and saved in a 'data' directory." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "australian-showcase", + "metadata": {}, + "outputs": [], + "source": [ + "download_data(\"https://pl-flash-data.s3.amazonaws.com/titanic.zip\", 'data/')" + ] + }, + { + "cell_type": "markdown", + "id": "skilled-master", + "metadata": {}, + "source": [ + "### 2. Load the data\n", + "Flash Tasks have built-in DataModules that you can use to organize your data. Pass in your train, validation, and test data and Flash will take care of the rest.\n", + "\n", + "Creating a TabularData object relies on [Pandas DataFrame](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.html). 
" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "focal-checkout", + "metadata": {}, + "outputs": [], + "source": [ + "datamodule = TabularData.from_csv(\n", + " \"./data/titanic/titanic.csv\",\n", + " test_csv=\"./data/titanic/test.csv\",\n", + " categorical_input=[\"Sex\", \"Age\", \"SibSp\", \"Parch\", \"Ticket\", \"Cabin\", \"Embarked\"],\n", + " numerical_input=[\"Fare\"],\n", + " target=\"Survived\",\n", + " val_size=0.25,\n", + ")\n" + ] + }, + { + "cell_type": "markdown", + "id": "universal-holiday", + "metadata": {}, + "source": [ + "### 3. Build the model\n", + "\n", + "Note: Categorical columns will be mapped to the embedding space. Embedding space is set of tensors to be trained associated to each categorical column. " + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "electoral-guide", + "metadata": {}, + "outputs": [], + "source": [ + "model = TabularClassifier.from_data(datamodule, metrics=[Accuracy(), Precision(), Recall()])" + ] + }, + { + "cell_type": "markdown", + "id": "suspended-corrections", + "metadata": {}, + "source": [ + "### 4. Create the trainer. Run 10 times on data" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "molecular-gateway", + "metadata": {}, + "outputs": [], + "source": [ + "trainer = flash.Trainer(max_epochs=10)" + ] + }, + { + "cell_type": "markdown", + "id": "convinced-wesley", + "metadata": {}, + "source": [ + "### 5. Train the model" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "aboriginal-shield", + "metadata": {}, + "outputs": [], + "source": [ + "trainer.fit(model, datamodule=datamodule)" + ] + }, + { + "cell_type": "markdown", + "id": "ambient-huntington", + "metadata": {}, + "source": [ + "### 6. 
Test model" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "hired-membrane", + "metadata": {}, + "outputs": [], + "source": [ + "trainer.test()" + ] + }, + { + "cell_type": "markdown", + "id": "amateur-extension", + "metadata": {}, + "source": [ + "### 7. Save it!" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "boxed-performer", + "metadata": {}, + "outputs": [], + "source": [ + "trainer.save_checkpoint(\"tabular_classification_model.pt\")" + ] + }, + { + "cell_type": "markdown", + "id": "corporate-humanity", + "metadata": {}, + "source": [ + "\n", + "

Congratulations - Time to Join the Community!

\n", + "
\n", + "\n", + "Congratulations on completing this notebook tutorial! If you enjoyed it and would like to join the Lightning movement, you can do so in the following ways!\n", + "\n", + "### Help us build Flash by adding support for new data-types and new tasks.\n", + "Flash aims to become the first task hub, so that anyone can get started building amazing applications using deep learning. \n", + "If you are interested, please open a PR with your contributions! \n", + "\n", + "\n", + "### Star [Lightning](https://github.com/PyTorchLightning/pytorch-lightning) on GitHub\n", + "The easiest way to help our community is just by starring the GitHub repos! This helps raise awareness of the cool tools we're building.\n", + "\n", + "* Please star [Lightning](https://github.com/PyTorchLightning/pytorch-lightning)\n", + "\n", + "### Join our [Slack](https://join.slack.com/t/pytorch-lightning/shared_invite/zt-f6bl2l0l-JYMK3tbAgAmGRrlNr00f1A)!\n", + "The best way to keep up to date on the latest advancements is to join our community! Make sure to introduce yourself and share your interests in the `#general` channel.\n", + "\n", + "### Interested in SOTA AI models? Check out [Bolts](https://github.com/PyTorchLightning/pytorch-lightning-bolts)\n", + "Bolts has a collection of state-of-the-art models, all implemented in [Lightning](https://github.com/PyTorchLightning/pytorch-lightning), that can be easily integrated into your own projects.\n", + "\n", + "* Please star [Bolts](https://github.com/PyTorchLightning/pytorch-lightning-bolts)\n", + "\n", + "### Contributions!\n", + "The best way to contribute to our community is to become a code contributor! At any time you can go to the [Lightning](https://github.com/PyTorchLightning/pytorch-lightning) or [Bolts](https://github.com/PyTorchLightning/pytorch-lightning-bolts) GitHub Issues page and filter for \"good first issue\". 
\n", + "\n", + "* [Lightning good first issue](https://github.com/PyTorchLightning/pytorch-lightning/issues?q=is%3Aopen+is%3Aissue+label%3A%22good+first+issue%22)\n", + "* [Bolts good first issue](https://github.com/PyTorchLightning/pytorch-lightning-bolts/issues?q=is%3Aopen+is%3Aissue+label%3A%22good+first+issue%22)\n", + "* You can also contribute your own notebooks with useful examples!\n", + "\n", + "### Many thanks from the entire PyTorch Lightning Team for your interest!\n", + "\n", + "" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.6.9" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/flash_notebooks/finetuning/text_classification.ipynb b/flash_notebooks/finetuning/text_classification.ipynb new file mode 100644 index 00000000000..98e0c36a423 --- /dev/null +++ b/flash_notebooks/finetuning/text_classification.ipynb @@ -0,0 +1,297 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "coordinate-grounds", + "metadata": {}, + "source": [ + "[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/PyTorchLightning/lightning-flash/blob/master/flash_notebooks/finetuning/text_classification.ipynb)" + ] + }, + { + "cell_type": "markdown", + "id": "olive-consensus", + "metadata": {}, + "source": [ + "In this notebook, we'll go over the basics of Lightning Flash by finetuning a TextClassifier on the [IMDB Dataset](https://www.imdb.com/interfaces/).\n", + "\n", + "# Finetuning\n", + "\n", + "Finetuning consists of four steps:\n", + " \n", + " - 1. Train a source neural network model on a source dataset. 
For text classification, it is traditionally a transformer model such as BERT [Bidirectional Encoder Representations from Transformers](https://arxiv.org/abs/1810.04805) trained on Wikipedia.\n", + "As those models are costly to train, the [Transformers](https://github.com/huggingface/transformers) and [FairSeq](https://github.com/pytorch/fairseq) libraries provide popular pre-trained model architectures for NLP. In this notebook, we will be using [tiny-bert](https://huggingface.co/prajjwal1/bert-tiny).\n", + "\n", + " \n", + " - 2. Create a new neural network called the target model. Its architecture replicates the design and parameters of the source model, except the last layer, which is removed. This model without its last layer is traditionally called a backbone.\n", + " \n", + "\n", + "- 3. Add new layers after the backbone, where the last output size is the number of target dataset categories. Those new layers, traditionally called the head, will be randomly initialized, while the backbone will keep its pre-trained weights from the source dataset.\n", + " \n", + "\n", + "- 4. Train the target model on a target dataset, such as the IMDB Dataset, to learn to predict the sentiment of movie reviews. At the start of training, the backbone will be frozen, meaning its parameters won't be updated. Only the model head will be trained to distinguish between negative and positive reviews. On reaching the first finetuning milestone, the last layers of the backbone will be unfrozen and start to be trained. On reaching the second finetuning milestone, the remaining layers of the backbone will be unfrozen and the entire model will be trained. 
In Flash, `unfreeze_milestones` controls those milestones and can be used as follows: `trainer.finetune(..., unfreeze_milestones=(first_milestone, second_milestone))`.\n", + "\n", + "---\n", + " - Give us a ⭐ [on GitHub](https://www.github.com/PytorchLightning/pytorch-lightning/)\n", + " - Check out [Flash documentation](https://lightning-flash.readthedocs.io/en/latest/)\n", + " - Check out [Lightning documentation](https://pytorch-lightning.readthedocs.io/en/latest/)\n", + " - Join us [on Slack](https://join.slack.com/t/pytorch-lightning/shared_invite/zt-f6bl2l0l-JYMK3tbAgAmGRrlNr00f1A)" + ] + }, + { + "cell_type": "markdown", + "id": "photographic-reggae", + "metadata": {}, + "source": [ + "### Setup \n", + "Lightning Flash is easy to install. Simply run ```pip install lightning-flash```" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "trying-malaysia", + "metadata": {}, + "outputs": [], + "source": [ + "%%capture\n", + "! pip install lightning-flash" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "atlantic-insulin", + "metadata": {}, + "outputs": [], + "source": [ + "import flash\n", + "from flash.core.data import download_data\n", + "from flash.text import TextClassificationData, TextClassifier" + ] + }, + { + "cell_type": "markdown", + "id": "effective-amino", + "metadata": {}, + "source": [ + "### 1. Download the data\n", + "The data are downloaded from a URL and saved in a 'data' directory." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "refined-embassy", + "metadata": {}, + "outputs": [], + "source": [ + "download_data(\"https://pl-flash-data.s3.amazonaws.com/imdb.zip\", 'data/')" + ] + }, + { + "cell_type": "markdown", + "id": "ecological-positive", + "metadata": {}, + "source": [ + "

### 2. Load the data

\n", + "\n", + "Flash Tasks have built-in DataModules that you can use to organize your data. Pass in train, validation, and test files and Flash will take care of the rest.\n", + "This creates a TextClassificationData object from CSV files." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "intense-mediterranean", + "metadata": {}, + "outputs": [], + "source": [ + "datamodule = TextClassificationData.from_files(\n", + " train_file=\"data/imdb/train.csv\",\n", + " valid_file=\"data/imdb/valid.csv\",\n", + " test_file=\"data/imdb/test.csv\",\n", + " input=\"review\",\n", + " target=\"sentiment\",\n", + " batch_size=512\n", + ")" + ] + }, + { + "cell_type": "markdown", + "id": "typical-surveillance", + "metadata": { + "jupyter": { + "outputs_hidden": true + } + }, + "source": [ + "### 3. Build the model\n", + "\n", + "Create the TextClassifier task. By default, the TextClassifier task uses a [tiny-bert](https://huggingface.co/prajjwal1/bert-tiny) backbone to train or finetune your model. You can use any model from [transformers - Text Classification](https://huggingface.co/models?filter=text-classification,pytorch)\n", + "\n", + "The backbone can easily be changed, for example `TextClassifier(backbone='bert-tiny-mnli')`" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "simplified-bernard", + "metadata": {}, + "outputs": [], + "source": [ + "model = TextClassifier(num_classes=datamodule.num_classes)" + ] + }, + { + "cell_type": "markdown", + "id": "convertible-fiber", + "metadata": { + "jupyter": { + "outputs_hidden": true + } + }, + "source": [ + "### 4. Create the trainer. 
Run once on data" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "leading-generic", + "metadata": {}, + "outputs": [], + "source": [ + "trainer = flash.Trainer(max_epochs=1)" + ] + }, + { + "cell_type": "markdown", + "id": "meaningful-anderson", + "metadata": { + "jupyter": { + "outputs_hidden": true + } + }, + "source": [ + "### 5. Fine-tune the model" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "bearing-composite", + "metadata": {}, + "outputs": [], + "source": [ + "trainer.finetune(model, datamodule=datamodule, unfreeze_milestones=(0, 1))" + ] + }, + { + "cell_type": "markdown", + "id": "comfortable-butler", + "metadata": { + "jupyter": { + "outputs_hidden": true + } + }, + "source": [ + "### 6. Test model" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "regular-wednesday", + "metadata": {}, + "outputs": [], + "source": [ + "trainer.test()" + ] + }, + { + "cell_type": "markdown", + "id": "sudden-acquisition", + "metadata": { + "jupyter": { + "outputs_hidden": true + } + }, + "source": [ + "### 7. Save it!" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "minimal-possession", + "metadata": {}, + "outputs": [], + "source": [ + "trainer.save_checkpoint(\"text_classification_model.pt\")" + ] + }, + { + "cell_type": "markdown", + "id": "administrative-rapid", + "metadata": {}, + "source": [ + "\n", + "

Congratulations - Time to Join the Community!

\n", + "
\n", + "\n", + "Congratulations on completing this notebook tutorial! If you enjoyed it and would like to join the Lightning movement, you can do so in the following ways!\n", + "\n", + "### Help us build Flash by adding support for new data-types and new tasks.\n", + "Flash aims to become the first task hub, so that anyone can get started building amazing applications using deep learning. \n", + "If you are interested, please open a PR with your contributions! \n", + "\n", + "\n", + "### Star [Lightning](https://github.com/PyTorchLightning/pytorch-lightning) on GitHub\n", + "The easiest way to help our community is just by starring the GitHub repos! This helps raise awareness of the cool tools we're building.\n", + "\n", + "* Please star [Lightning](https://github.com/PyTorchLightning/pytorch-lightning)\n", + "\n", + "### Join our [Slack](https://join.slack.com/t/pytorch-lightning/shared_invite/zt-f6bl2l0l-JYMK3tbAgAmGRrlNr00f1A)!\n", + "The best way to keep up to date on the latest advancements is to join our community! Make sure to introduce yourself and share your interests in the `#general` channel.\n", + "\n", + "### Interested in SOTA AI models? Check out [Bolts](https://github.com/PyTorchLightning/pytorch-lightning-bolts)\n", + "Bolts has a collection of state-of-the-art models, all implemented in [Lightning](https://github.com/PyTorchLightning/pytorch-lightning), that can be easily integrated into your own projects.\n", + "\n", + "* Please star [Bolts](https://github.com/PyTorchLightning/pytorch-lightning-bolts)\n", + "\n", + "### Contributions!\n", + "The best way to contribute to our community is to become a code contributor! At any time you can go to the [Lightning](https://github.com/PyTorchLightning/pytorch-lightning) or [Bolts](https://github.com/PyTorchLightning/pytorch-lightning-bolts) GitHub Issues page and filter for \"good first issue\". 
\n", + "\n", + "* [Lightning good first issue](https://github.com/PyTorchLightning/pytorch-lightning/issues?q=is%3Aopen+is%3Aissue+label%3A%22good+first+issue%22)\n", + "* [Bolts good first issue](https://github.com/PyTorchLightning/pytorch-lightning-bolts/issues?q=is%3Aopen+is%3Aissue+label%3A%22good+first+issue%22)\n", + "* You can also contribute your own notebooks with useful examples!\n", + "\n", + "### Many thanks from the entire PyTorch Lightning Team for your interest!\n", + "\n", + "" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.6.9" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/flash_notebooks/generic_task.ipynb b/flash_notebooks/generic_task.ipynb new file mode 100644 index 00000000000..6303e0b7c8a --- /dev/null +++ b/flash_notebooks/generic_task.ipynb @@ -0,0 +1,249 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "improved-minnesota", + "metadata": {}, + "source": [ + "[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/PyTorchLightning/lightning-flash/blob/master/flash_notebooks/generic_task.ipynb)" + ] + }, + { + "cell_type": "markdown", + "id": "commercial-reunion", + "metadata": {}, + "source": [ + "In this notebook, we'll go over the basics of Lightning Flash by creating a ClassificationTask with a custom fully connected model and training it on the [MNIST Dataset](http://yann.lecun.com/exdb/mnist/).\n", + "\n", + "---\n", + " - Give us a ⭐ [on GitHub](https://www.github.com/PytorchLightning/pytorch-lightning/)\n", + " - Check out [Flash documentation](https://lightning-flash.readthedocs.io/en/latest/)\n", + " - Check out [Lightning 
documentation](https://pytorch-lightning.readthedocs.io/en/latest/)\n", + " - Join us [on Slack](https://join.slack.com/t/pytorch-lightning/shared_invite/zt-f6bl2l0l-JYMK3tbAgAmGRrlNr00f1A)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "compliant-address", + "metadata": {}, + "outputs": [], + "source": [ + "%%capture\n", + "! pip install lightning-flash" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "innovative-aquatic", + "metadata": {}, + "outputs": [], + "source": [ + "import pytorch_lightning as pl\n", + "from torch import nn, optim\n", + "from torch.utils.data import DataLoader, random_split\n", + "from torchvision import datasets, transforms\n", + "\n", + "from flash import ClassificationTask" + ] + }, + { + "cell_type": "markdown", + "id": "dress-perspective", + "metadata": {}, + "source": [ + "### 1. Load a basic backbone" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "under-conditions", + "metadata": {}, + "outputs": [], + "source": [ + "model = nn.Sequential(\n", + " nn.Flatten(),\n", + " nn.Linear(28 * 28, 128),\n", + " nn.ReLU(),\n", + " nn.Linear(128, 10),\n", + ")" + ] + }, + { + "cell_type": "markdown", + "id": "flying-supply", + "metadata": {}, + "source": [ + "### 2. Load a dataset" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "accepting-graphics", + "metadata": {}, + "outputs": [], + "source": [ + "dataset = datasets.MNIST('./data', download=True, transform=transforms.ToTensor())" + ] + }, + { + "cell_type": "markdown", + "id": "quality-reception", + "metadata": {}, + "source": [ + "### 3. Split the data randomly" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "fifteen-tunnel", + "metadata": {}, + "outputs": [], + "source": [ + "train, val, test = random_split(dataset, [50000, 5000, 5000])" + ] + }, + { + "cell_type": "markdown", + "id": "realistic-bradley", + "metadata": {}, + "source": [ + "### 4. 
Create the model" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "dental-spouse", + "metadata": {}, + "outputs": [], + "source": [ + "classifier = ClassificationTask(model, loss_fn=nn.functional.cross_entropy, optimizer=optim.Adam, learning_rate=10e-3)" + ] + }, + { + "cell_type": "markdown", + "id": "naval-invention", + "metadata": {}, + "source": [ + "### 5. Create the trainer" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "paperback-browse", + "metadata": {}, + "outputs": [], + "source": [ + "trainer = pl.Trainer(\n", + " max_epochs=10,\n", + " limit_train_batches=128,\n", + " limit_val_batches=128,\n", + ")" + ] + }, + { + "cell_type": "markdown", + "id": "hindu-title", + "metadata": {}, + "source": [ + "### 6. Train the model" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "naked-beauty", + "metadata": {}, + "outputs": [], + "source": [ + "trainer.fit(classifier, DataLoader(train), DataLoader(val))" + ] + }, + { + "cell_type": "markdown", + "id": "according-defense", + "metadata": {}, + "source": [ + "### 7. Test the model" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "deadly-narrow", + "metadata": {}, + "outputs": [], + "source": [ + "results = trainer.test(classifier, test_dataloaders=DataLoader(test))" + ] + }, + { + "cell_type": "markdown", + "id": "searching-chester", + "metadata": {}, + "source": [ + "\n", + "

Congratulations - Time to Join the Community!

\n", + "
\n", + "\n", + "Congratulations on completing this notebook tutorial! If you enjoyed it and would like to join the Lightning movement, you can do so in the following ways!\n", + "\n", + "### Help us build Flash by adding support for new data types and new tasks.\n", + "Flash aims to become the first task hub, so anyone can get started building amazing applications using deep learning. \n", + "If you are interested, please open a PR with your contributions! \n", + "\n", + "\n", + "### Star [Lightning](https://github.com/PyTorchLightning/pytorch-lightning) on GitHub\n", + "The easiest way to help our community is just by starring the GitHub repos! This helps raise awareness of the cool tools we're building.\n", + "\n", + "* Please, star [Lightning](https://github.com/PyTorchLightning/pytorch-lightning)\n", + "\n", + "### Join our [Slack](https://join.slack.com/t/pytorch-lightning/shared_invite/zt-f6bl2l0l-JYMK3tbAgAmGRrlNr00f1A)!\n", + "The best way to keep up to date on the latest advancements is to join our community! Make sure to introduce yourself and share your interests in the `#general` channel.\n", + "\n", + "### Interested in SOTA AI models? Check out [Bolts](https://github.com/PyTorchLightning/pytorch-lightning-bolts)\n", + "Bolts has a collection of state-of-the-art models, all implemented in [Lightning](https://github.com/PyTorchLightning/pytorch-lightning), which can be easily integrated within your own projects.\n", + "\n", + "* Please, star [Bolt](https://github.com/PyTorchLightning/pytorch-lightning-bolts)\n", + "\n", + "### Contributions !\n", + "The best way to contribute to our community is to become a code contributor! At any time you can go to the [Lightning](https://github.com/PyTorchLightning/pytorch-lightning) or [Bolt](https://github.com/PyTorchLightning/pytorch-lightning-bolts) GitHub Issues page and filter for \"good first issue\". 
\n", + "\n", + "* [Lightning good first issue](https://github.com/PyTorchLightning/pytorch-lightning/issues?q=is%3Aopen+is%3Aissue+label%3A%22good+first+issue%22)\n", + "* [Bolt good first issue](https://github.com/PyTorchLightning/pytorch-lightning-bolts/issues?q=is%3Aopen+is%3Aissue+label%3A%22good+first+issue%22)\n", + "* You can also contribute your own notebooks with useful examples !\n", + "\n", + "### Great thanks from the entire Pytorch Lightning Team for your interest !\n", + "\n", + "" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.6.9" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/flash_notebooks/predict/classify_image.ipynb b/flash_notebooks/predict/classify_image.ipynb new file mode 100644 index 00000000000..a4c92c64be2 --- /dev/null +++ b/flash_notebooks/predict/classify_image.ipynb @@ -0,0 +1,223 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "thousand-hormone", + "metadata": {}, + "source": [ + "[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/PyTorchLightning/lightning-flash/blob/master/flash_notebooks/predict/classify_image.ipynb)" + ] + }, + { + "cell_type": "markdown", + "id": "working-spyware", + "metadata": {}, + "source": [ + "In this notebook, we'll go over the basics of lightning Flash for making predictions with ImageClassifier on [Hymenoptera Dataset](https://www.kaggle.com/ajayrana/hymenoptera-data) containing ants and bees images.\n", + "\n", + "---\n", + " - Give us a ⭐ [on Github](https://www.github.com/PytorchLightning/pytorch-lightning/)\n", + " - Check out [the 
documentation](https://pytorch-lightning.readthedocs.io/en/latest/)\n", + " - Join us [on Slack](https://join.slack.com/t/pytorch-lightning/shared_invite/zt-f6bl2l0l-JYMK3tbAgAmGRrlNr00f1A)\n", + " - Find finetuning notebook used to generate the weights [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/PyTorchLightning/lightning-flash/blob/master/flash_notebooks/finetuning/image_classification.ipynb)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "floral-system", + "metadata": {}, + "outputs": [], + "source": [ + "%%capture\n", + "! pip install lightning-flash" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "upper-shoot", + "metadata": {}, + "outputs": [], + "source": [ + "from flash import Trainer\n", + "from flash.core.data import download_data\n", + "from flash.vision import ImageClassificationData, ImageClassifier" + ] + }, + { + "cell_type": "markdown", + "id": "square-gospel", + "metadata": {}, + "source": [ + "### 1. Download the data" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "decent-surgery", + "metadata": {}, + "outputs": [], + "source": [ + "download_data(\"https://pl-flash-data.s3.amazonaws.com/hymenoptera_data.zip\", 'data/')" + ] + }, + { + "cell_type": "markdown", + "id": "covered-studio", + "metadata": {}, + "source": [ + "### 2. Load the model from a checkpoint\n", + "\n", + "`ImageClassifier.load_from_checkpoint` accepts either a URL or a local path to a checkpoint. If provided with a URL, the checkpoint will first be downloaded and then loaded to re-create the model. 
" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "necessary-candle", + "metadata": {}, + "outputs": [], + "source": [ + "model = ImageClassifier.load_from_checkpoint(\"https://flash-weights.s3.amazonaws.com/image_classification_model.pt\")" + ] + }, + { + "cell_type": "markdown", + "id": "three-colors", + "metadata": {}, + "source": [ + "### 3a. Predict what's on a few images! ants or bees?\n", + "\n", + "`ImageClassifier.predict` supports a list of image paths to make an inference on." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "acknowledged-anger", + "metadata": {}, + "outputs": [], + "source": [ + "predictions = model.predict([\n", + " \"data/hymenoptera_data/val/bees/65038344_52a45d090d.jpg\",\n", + " \"data/hymenoptera_data/val/bees/590318879_68cf112861.jpg\",\n", + " \"data/hymenoptera_data/val/ants/540543309_ddbb193ee5.jpg\",\n", + "])" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "latter-checklist", + "metadata": {}, + "outputs": [], + "source": [ + "print(predictions)" + ] + }, + { + "cell_type": "markdown", + "id": "dimensional-ferry", + "metadata": {}, + "source": [ + "### 3b. 
Or generate predictions from a whole folder!\n", + "\n", + "Scaling inference to 32 GPUs is as simple as `Trainer(gpus=32).predict(...)`" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "inside-bailey", + "metadata": {}, + "outputs": [], + "source": [ + "datamodule = ImageClassificationData.from_folder(folder=\"data/hymenoptera_data/predict/\")" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "drawn-synthetic", + "metadata": {}, + "outputs": [], + "source": [ + "predictions = Trainer().predict(model, datamodule=datamodule)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "chinese-musical", + "metadata": {}, + "outputs": [], + "source": [ + "print(predictions)" + ] + }, + { + "cell_type": "markdown", + "id": "sudden-asbestos", + "metadata": {}, + "source": [ + "\n", + "

Congratulations - Time to Join the Community!

\n", + "
\n", + "\n", + "Congratulations on completing this notebook tutorial! If you enjoyed it and would like to join the Lightning movement, you can do so in the following ways!\n", + "\n", + "### Help us build Flash by adding support for new data types and new tasks.\n", + "Flash aims to become the first task hub, so anyone can get started building amazing applications using deep learning. \n", + "If you are interested, please open a PR with your contributions! \n", + "\n", + "\n", + "### Star [Lightning](https://github.com/PyTorchLightning/pytorch-lightning) on GitHub\n", + "The easiest way to help our community is just by starring the GitHub repos! This helps raise awareness of the cool tools we're building.\n", + "\n", + "* Please, star [Lightning](https://github.com/PyTorchLightning/pytorch-lightning)\n", + "\n", + "### Join our [Slack](https://join.slack.com/t/pytorch-lightning/shared_invite/zt-f6bl2l0l-JYMK3tbAgAmGRrlNr00f1A)!\n", + "The best way to keep up to date on the latest advancements is to join our community! Make sure to introduce yourself and share your interests in the `#general` channel.\n", + "\n", + "### Interested in SOTA AI models? Check out [Bolts](https://github.com/PyTorchLightning/pytorch-lightning-bolts)\n", + "Bolts has a collection of state-of-the-art models, all implemented in [Lightning](https://github.com/PyTorchLightning/pytorch-lightning), which can be easily integrated within your own projects.\n", + "\n", + "* Please, star [Bolt](https://github.com/PyTorchLightning/pytorch-lightning-bolts)\n", + "\n", + "### Contributions !\n", + "The best way to contribute to our community is to become a code contributor! At any time you can go to the [Lightning](https://github.com/PyTorchLightning/pytorch-lightning) or [Bolt](https://github.com/PyTorchLightning/pytorch-lightning-bolts) GitHub Issues page and filter for \"good first issue\". 
\n", + "\n", + "* [Lightning good first issue](https://github.com/PyTorchLightning/pytorch-lightning/issues?q=is%3Aopen+is%3Aissue+label%3A%22good+first+issue%22)\n", + "* [Bolt good first issue](https://github.com/PyTorchLightning/pytorch-lightning-bolts/issues?q=is%3Aopen+is%3Aissue+label%3A%22good+first+issue%22)\n", + "* You can also contribute your own notebooks with useful examples !\n", + "\n", + "### Great thanks from the entire Pytorch Lightning Team for your interest !\n", + "\n", + "" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.6.9" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/flash_notebooks/predict/classify_tabular.ipynb b/flash_notebooks/predict/classify_tabular.ipynb new file mode 100644 index 00000000000..98429de0acd --- /dev/null +++ b/flash_notebooks/predict/classify_tabular.ipynb @@ -0,0 +1,180 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "electronic-moscow", + "metadata": {}, + "source": [ + "[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/PyTorchLightning/lightning-flash/blob/master/flash_notebooks/predict/classify_tabular.ipynb)" + ] + }, + { + "cell_type": "markdown", + "id": "typical-lotus", + "metadata": {}, + "source": [ + "In this notebook, we'll go over the basics of lightning Flash for making predictions with TabularClassifier on [Titanic Dataset](https://www.kaggle.com/c/titanic).\n", + "\n", + "---\n", + " - Give us a ⭐ [on Github](https://www.github.com/PytorchLightning/pytorch-lightning/)\n", + " - Check out [Flash documentation](https://lightning-flash.readthedocs.io/en/latest/)\n", + " - Check out [Lightning 
documentation](https://pytorch-lightning.readthedocs.io/en/latest/)\n", + " - Join us [on Slack](https://join.slack.com/t/pytorch-lightning/shared_invite/zt-f6bl2l0l-JYMK3tbAgAmGRrlNr00f1A)\n", + " - Find finetuning notebook used to generate the weights [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/PyTorchLightning/lightning-flash/blob/master/flash_notebooks/finetuning/tabular_classification.ipynb)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "interracial-builder", + "metadata": {}, + "outputs": [], + "source": [ + "%%capture\n", + "! pip install lightning-flash" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "grave-giant", + "metadata": {}, + "outputs": [], + "source": [ + "from flash.core.data import download_data\n", + "from flash.tabular import TabularClassifier" + ] + }, + { + "cell_type": "markdown", + "id": "governmental-found", + "metadata": {}, + "source": [ + "### 1. Download the data" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "documentary-mandate", + "metadata": {}, + "outputs": [], + "source": [ + "download_data(\"https://pl-flash-data.s3.amazonaws.com/titanic.zip\", 'data/')" + ] + }, + { + "cell_type": "markdown", + "id": "optimum-coordinator", + "metadata": {}, + "source": [ + "### 2. Load the model from a checkpoint\n", + "\n", + "`TabularClassifier.load_from_checkpoint` accepts either a URL or a local path to a checkpoint. If provided with a URL, the checkpoint will first be downloaded and then loaded to re-create the model. 
" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "quantitative-horizontal", + "metadata": {}, + "outputs": [], + "source": [ + "model = TabularClassifier.load_from_checkpoint(\n", + " \"https://flash-weights.s3.amazonaws.com/tabular_classification_model.pt\")" + ] + }, + { + "cell_type": "markdown", + "id": "genuine-feelings", + "metadata": {}, + "source": [ + "### 3. Generate predictions from a sheet file! Who would survive?\n", + "\n", + "`TabularClassifier.predict` support both DataFrame and path to `.csv` file." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "proved-favorite", + "metadata": {}, + "outputs": [], + "source": [ + "predictions = model.predict(\"data/titanic/titanic.csv\")" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "alpha-access", + "metadata": {}, + "outputs": [], + "source": [ + "print(predictions)" + ] + }, + { + "cell_type": "markdown", + "id": "perfect-disposal", + "metadata": {}, + "source": [ + "\n", + "

Congratulations - Time to Join the Community!

\n", + "
\n", + "\n", + "Congratulations on completing this notebook tutorial! If you enjoyed it and would like to join the Lightning movement, you can do so in the following ways!\n", + "\n", + "### Help us build Flash by adding support for new data types and new tasks.\n", + "Flash aims to become the first task hub, so anyone can get started building amazing applications using deep learning. \n", + "If you are interested, please open a PR with your contributions! \n", + "\n", + "\n", + "### Star [Lightning](https://github.com/PyTorchLightning/pytorch-lightning) on GitHub\n", + "The easiest way to help our community is just by starring the GitHub repos! This helps raise awareness of the cool tools we're building.\n", + "\n", + "* Please, star [Lightning](https://github.com/PyTorchLightning/pytorch-lightning)\n", + "\n", + "### Join our [Slack](https://join.slack.com/t/pytorch-lightning/shared_invite/zt-f6bl2l0l-JYMK3tbAgAmGRrlNr00f1A)!\n", + "The best way to keep up to date on the latest advancements is to join our community! Make sure to introduce yourself and share your interests in the `#general` channel.\n", + "\n", + "### Interested in SOTA AI models? Check out [Bolts](https://github.com/PyTorchLightning/pytorch-lightning-bolts)\n", + "Bolts has a collection of state-of-the-art models, all implemented in [Lightning](https://github.com/PyTorchLightning/pytorch-lightning), which can be easily integrated within your own projects.\n", + "\n", + "* Please, star [Bolt](https://github.com/PyTorchLightning/pytorch-lightning-bolts)\n", + "\n", + "### Contributions !\n", + "The best way to contribute to our community is to become a code contributor! At any time you can go to the [Lightning](https://github.com/PyTorchLightning/pytorch-lightning) or [Bolt](https://github.com/PyTorchLightning/pytorch-lightning-bolts) GitHub Issues page and filter for \"good first issue\". 
\n", + "\n", + "* [Lightning good first issue](https://github.com/PyTorchLightning/pytorch-lightning/issues?q=is%3Aopen+is%3Aissue+label%3A%22good+first+issue%22)\n", + "* [Bolt good first issue](https://github.com/PyTorchLightning/pytorch-lightning-bolts/issues?q=is%3Aopen+is%3Aissue+label%3A%22good+first+issue%22)\n", + "* You can also contribute your own notebooks with useful examples !\n", + "\n", + "### Great thanks from the entire Pytorch Lightning Team for your interest !\n", + "\n", + "" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.6.9" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/flash_notebooks/predict/classify_text.ipynb b/flash_notebooks/predict/classify_text.ipynb new file mode 100644 index 00000000000..c3bc61accb6 --- /dev/null +++ b/flash_notebooks/predict/classify_text.ipynb @@ -0,0 +1,229 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "blessed-program", + "metadata": {}, + "source": [ + "[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/PyTorchLightning/lightning-flash/blob/master/flash_notebooks/predict/classify_text.ipynb)" + ] + }, + { + "cell_type": "markdown", + "id": "automotive-store", + "metadata": {}, + "source": [ + "In this notebook, we'll go over the basics of lightning Flash for making predictions with TextClassifier on the [IMDB Dataset](https://www.imdb.com/interfaces/).\n", + "\n", + "---\n", + " - Give us a ⭐ [on Github](https://www.github.com/PytorchLightning/pytorch-lightning/)\n", + " - Check out [the 
documentation](https://pytorch-lightning.readthedocs.io/en/latest/)\n", + " - Join us [on Slack](https://join.slack.com/t/pytorch-lightning/shared_invite/zt-f6bl2l0l-JYMK3tbAgAmGRrlNr00f1A)\n", + " - Find finetuning notebook used to generate the weights [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/PyTorchLightning/lightning-flash/blob/master/flash_notebooks/finetuning/text_classification.ipynb)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "capable-board", + "metadata": {}, + "outputs": [], + "source": [ + "%%capture\n", + "! pip install lightning-flash" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "gentle-boards", + "metadata": {}, + "outputs": [], + "source": [ + "from pytorch_lightning import Trainer\n", + "\n", + "from flash.core.data import download_data\n", + "from flash.text import TextClassificationData, TextClassifier" + ] + }, + { + "cell_type": "markdown", + "id": "rocky-chocolate", + "metadata": {}, + "source": [ + "### 1. Download the data" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "permanent-curve", + "metadata": {}, + "outputs": [], + "source": [ + "download_data(\"https://pl-flash-data.s3.amazonaws.com/imdb.zip\", 'data/')" + ] + }, + { + "cell_type": "markdown", + "id": "legal-drink", + "metadata": {}, + "source": [ + "### 2. Load the model from a checkpoint\n", + "\n", + "`TextClassifier.load_from_checkpoint` accepts either a URL or a local path to a checkpoint. If provided with a URL, the checkpoint will first be downloaded and then loaded to re-create the model. 
" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "refined-passenger", + "metadata": {}, + "outputs": [], + "source": [ + "model = TextClassifier.load_from_checkpoint(\"https://flash-weights.s3.amazonaws.com/text_classification_model.pt\")" + ] + }, + { + "cell_type": "markdown", + "id": "illegal-adjustment", + "metadata": {}, + "source": [ + "### 2a. Classify a few sentences! How was the movie?\n", + "\n", + "The model can perform sentimennt predictions directly from a list of sentences." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "derived-current", + "metadata": {}, + "outputs": [], + "source": [ + "predictions = model.predict([\n", + " \"Turgid dialogue, feeble characterization - Harvey Keitel a judge?.\",\n", + " \"The worst movie in the history of cinema.\",\n", + " \"I come from Bulgaria where it 's almost impossible to have a tornado.\"\n", + " \"Very, very afraid\"\n", + " \"This guy has done a great job with this movie!\",\n", + "])" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "geological-chart", + "metadata": {}, + "outputs": [], + "source": [ + "print(predictions)" + ] + }, + { + "cell_type": "markdown", + "id": "ceramic-blackjack", + "metadata": {}, + "source": [ + "### 2b. 
Or generate predictions from a CSV file!\n", + "\n", + "Scaling inference to 32 GPUs is as simple as `Trainer(gpus=32).predict(...)`" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "cellular-breach", + "metadata": {}, + "outputs": [], + "source": [ + "datamodule = TextClassificationData.from_file(\n", + " predict_file=\"data/imdb/predict.csv\",\n", + " input=\"review\",\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "signed-holmes", + "metadata": {}, + "outputs": [], + "source": [ + "predictions = Trainer().predict(model, datamodule=datamodule)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "statistical-therapist", + "metadata": {}, + "outputs": [], + "source": [ + "print(predictions)" + ] + }, + { + "cell_type": "markdown", + "id": "different-origin", + "metadata": {}, + "source": [ + "\n", + "

Congratulations - Time to Join the Community!

\n", + "
\n", + "\n", + "Congratulations on completing this notebook tutorial! If you enjoyed it and would like to join the Lightning movement, you can do so in the following ways!\n", + "\n", + "### Help us build Flash by adding support for new data types and new tasks.\n", + "Flash aims to become the first task hub, so anyone can get started building amazing applications using deep learning. \n", + "If you are interested, please open a PR with your contributions! \n", + "\n", + "\n", + "### Star [Lightning](https://github.com/PyTorchLightning/pytorch-lightning) on GitHub\n", + "The easiest way to help our community is just by starring the GitHub repos! This helps raise awareness of the cool tools we're building.\n", + "\n", + "* Please, star [Lightning](https://github.com/PyTorchLightning/pytorch-lightning)\n", + "\n", + "### Join our [Slack](https://join.slack.com/t/pytorch-lightning/shared_invite/zt-f6bl2l0l-JYMK3tbAgAmGRrlNr00f1A)!\n", + "The best way to keep up to date on the latest advancements is to join our community! Make sure to introduce yourself and share your interests in the `#general` channel.\n", + "\n", + "### Interested in SOTA AI models? Check out [Bolts](https://github.com/PyTorchLightning/pytorch-lightning-bolts)\n", + "Bolts has a collection of state-of-the-art models, all implemented in [Lightning](https://github.com/PyTorchLightning/pytorch-lightning), which can be easily integrated within your own projects.\n", + "\n", + "* Please, star [Bolt](https://github.com/PyTorchLightning/pytorch-lightning-bolts)\n", + "\n", + "### Contributions !\n", + "The best way to contribute to our community is to become a code contributor! At any time you can go to the [Lightning](https://github.com/PyTorchLightning/pytorch-lightning) or [Bolt](https://github.com/PyTorchLightning/pytorch-lightning-bolts) GitHub Issues page and filter for \"good first issue\". 
\n", + "\n", + "* [Lightning good first issue](https://github.com/PyTorchLightning/pytorch-lightning/issues?q=is%3Aopen+is%3Aissue+label%3A%22good+first+issue%22)\n", + "* [Bolt good first issue](https://github.com/PyTorchLightning/pytorch-lightning-bolts/issues?q=is%3Aopen+is%3Aissue+label%3A%22good+first+issue%22)\n", + "* You can also contribute your own notebooks with useful examples !\n", + "\n", + "### Great thanks from the entire Pytorch Lightning Team for your interest !\n", + "\n", + "" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.6.9" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/notebooks/general_model.py b/notebooks/general_model.py deleted file mode 100644 index 1c2b8bed460..00000000000 --- a/notebooks/general_model.py +++ /dev/null @@ -1,24 +0,0 @@ -import pytorch_lightning as pl -from torch import nn, optim -from torch.utils.data import DataLoader, random_split -from torchvision import datasets, transforms - -from flash import Task - -# model -model = nn.Sequential( - nn.Flatten(), - nn.Linear(28 * 28, 128), - nn.ReLU(), - nn.Linear(128, 10), -) - -# data -dataset = datasets.MNIST('./data_folder', download=True, transform=transforms.ToTensor()) -train, val = random_split(dataset, [55000, 5000]) - -# task -classifier = Task(model, loss_fn=nn.functional.cross_entropy, optimizer=optim.Adam) - -# train -pl.Trainer().fit(classifier, DataLoader(train), DataLoader(val)) diff --git a/notebooks/image-classification.ipynb b/notebooks/image-classification.ipynb deleted file mode 100644 index ce6d14a8afc..00000000000 --- a/notebooks/image-classification.ipynb +++ /dev/null @@ -1,609 +0,0 @@ -{ - "cells": [ - { - "cell_type": 
"code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - " " - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "In this article we will go over how to use the `Flash.vision.ImageClassifier` to train your own deep learning model! We'll start off by importing the various libraries we will need to use:" - ] - }, - { - "cell_type": "code", - "execution_count": 1, - "metadata": {}, - "outputs": [], - "source": [ - "from pathlib import Path\n", - "from urllib.request import urlopen\n", - "from zipfile import ZipFile\n", - "from io import BytesIO\n", - "from PIL import Image\n", - "from sklearn.model_selection import train_test_split\n", - "from pprint import pprint\n", - "import pandas as pd\n", - "import matplotlib.pyplot as plt\n", - "\n", - "\n", - "# Flash and PyTorch Lightning\n", - "from pl_flash.vision import ImageClassifier, ImageClassificationData\n", - "import pytorch_lightning as pl" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Next, we will download some data. Here we will be using a dataset consisting of images of cats and dogs, so we can train a model to differentiate between the two. Feel free to use any other dataset with as many categories as you would like!" - ] - }, - { - "cell_type": "code", - "execution_count": 2, - "metadata": {}, - "outputs": [ - { - "name": "stderr", - "output_type": "stream", - "text": [ - "/home/teddy/anaconda3/lib/python3.7/site-packages/ipykernel/ipkernel.py:287: DeprecationWarning: `should_run_async` will not call `transform_cell` automatically in the future. 
Please pass the result to `transformed_cell` argument and any exception that happen during thetransform in `preprocessing_exc_tuple` in IPython 7.17 and above.\n", - " and should_run_async(code)\n" - ] - } - ], - "source": [ - "data_path = Path(\"data/dogs-vs-cats/\")\n", - "\n", - "if not data_path.exists():\n", - " with urlopen(\"https://pl-flash-data.s3.amazonaws.com/dogs-vs-cats.zip\") as resp:\n", - " with ZipFile(BytesIO(resp.read())) as file:\n", - " file.extractall('data/')" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "With the files downloaded, we can gather all of the images in a list simply by looking for files with the `.jpg` extension:" - ] - }, - { - "cell_type": "code", - "execution_count": 3, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "[PosixPath('data/dogs-vs-cats/dog.3781.jpg'),\n", - " PosixPath('data/dogs-vs-cats/dog.10546.jpg'),\n", - " PosixPath('data/dogs-vs-cats/dog.9858.jpg'),\n", - " PosixPath('data/dogs-vs-cats/cat.12197.jpg'),\n", - " PosixPath('data/dogs-vs-cats/cat.10233.jpg')]\n" - ] - } - ], - "source": [ - "files = list(data_path.glob(\"*.jpg\"))\n", - "pprint(files[:5]) # print first 5 files" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "We can see that each image includes either `cat` or `dog` in the filename, so we can use this to generate our labels for each image:" - ] - }, - { - "cell_type": "code", - "execution_count": 4, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "['dog', 'dog', 'dog', 'cat', 'cat']\n" - ] - } - ], - "source": [ - "labels = [\"cat\" if \"cat\" in f.name else \"dog\" for f in files]\n", - "pprint(labels[:5]) # print first 5 labels" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Before we create our dataset, lets just make sure the images and labels look correct:" - ] - }, - { - "cell_type": "code", - 
"execution_count": 5, - "metadata": {}, - "outputs": [ - { - "data": { - "image/png": "\n", - "text/plain": [ - "
" - ] - }, - "metadata": { - "needs_background": "light" - }, - "output_type": "display_data" - } - ], - "source": [ - "fig = plt.figure(figsize=(6,4), dpi=150)\n", - "for i in range(9):\n", - " fig.add_subplot(3, 3, i + 1)\n", - " plt.imshow(Image.open(files[i]))\n", - " plt.title(labels[i]); plt.axis(\"off\")\n", - "fig.tight_layout()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": 6, - "metadata": {}, - "outputs": [], - "source": [ - "train_files, test_files, train_labels, test_labels = train_test_split(\n", - " files, labels, test_size=0.10\n", - ")\n", - "train_files, valid_files, train_labels, valid_labels = train_test_split(\n", - " train_files, train_labels, test_size=0.10\n", - ")" - ] - }, - { - "cell_type": "code", - "execution_count": 11, - "metadata": {}, - "outputs": [], - "source": [ - "data = ImageClassificationData.from_filepaths(\n", - " train_filepaths=train_files, \n", - " train_labels=train_labels,\n", - " valid_filepaths=valid_files,\n", - " valid_labels=valid_labels,\n", - " test_filepaths=test_files,\n", - " test_labels=test_labels,\n", - ")" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": 12, - "metadata": {}, - "outputs": [], - "source": [ - "task = 
ImageClassifier(num_classes=2, metrics=pl.metrics.Accuracy())" - ] - }, - { - "cell_type": "code", - "execution_count": 13, - "metadata": {}, - "outputs": [ - { - "name": "stderr", - "output_type": "stream", - "text": [ - "GPU available: True, used: True\n", - "TPU available: False, using: 0 TPU cores\n", - "LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]\n", - "\n", - " | Name | Type | Params\n", - "----------------------------------------\n", - "0 | metrics | ModuleDict | 0 \n", - "1 | backbone | Sequential | 11 M \n", - "2 | head | Sequential | 1 K \n" - ] - }, - { - "data": { - "application/vnd.jupyter.widget-view+json": { - "model_id": "", - "version_major": 2, - "version_minor": 0 - }, - "text/plain": [ - "HBox(children=(FloatProgress(value=1.0, bar_style='info', description='Validation sanity check', layout=Layout…" - ] - }, - "metadata": {}, - "output_type": "display_data" - }, - { - "data": { - "application/vnd.jupyter.widget-view+json": { - "model_id": "b0bd3a484f414afb91a56e2544400e42", - "version_major": 2, - "version_minor": 0 - }, - "text/plain": [ - "HBox(children=(FloatProgress(value=1.0, bar_style='info', description='Training', layout=Layout(flex='2'), max…" - ] - }, - "metadata": {}, - "output_type": "display_data" - }, - { - "data": { - "application/vnd.jupyter.widget-view+json": { - "model_id": "", - "version_major": 2, - "version_minor": 0 - }, - "text/plain": [ - "HBox(children=(FloatProgress(value=1.0, bar_style='info', description='Validating', layout=Layout(flex='2'), m…" - ] - }, - "metadata": {}, - "output_type": "display_data" - }, - { - "data": { - "application/vnd.jupyter.widget-view+json": { - "model_id": "", - "version_major": 2, - "version_minor": 0 - }, - "text/plain": [ - "HBox(children=(FloatProgress(value=1.0, bar_style='info', description='Validating', layout=Layout(flex='2'), m…" - ] - }, - "metadata": {}, - "output_type": "display_data" - }, - { - "data": { - "application/vnd.jupyter.widget-view+json": { - "model_id": "", - 
"version_major": 2, - "version_minor": 0 - }, - "text/plain": [ - "HBox(children=(FloatProgress(value=1.0, bar_style='info', description='Validating', layout=Layout(flex='2'), m…" - ] - }, - "metadata": {}, - "output_type": "display_data" - }, - { - "data": { - "application/vnd.jupyter.widget-view+json": { - "model_id": "", - "version_major": 2, - "version_minor": 0 - }, - "text/plain": [ - "HBox(children=(FloatProgress(value=1.0, bar_style='info', description='Validating', layout=Layout(flex='2'), m…" - ] - }, - "metadata": {}, - "output_type": "display_data" - }, - { - "data": { - "application/vnd.jupyter.widget-view+json": { - "model_id": "", - "version_major": 2, - "version_minor": 0 - }, - "text/plain": [ - "HBox(children=(FloatProgress(value=1.0, bar_style='info', description='Validating', layout=Layout(flex='2'), m…" - ] - }, - "metadata": {}, - "output_type": "display_data" - }, - { - "data": { - "application/vnd.jupyter.widget-view+json": { - "model_id": "", - "version_major": 2, - "version_minor": 0 - }, - "text/plain": [ - "HBox(children=(FloatProgress(value=1.0, bar_style='info', description='Validating', layout=Layout(flex='2'), m…" - ] - }, - "metadata": {}, - "output_type": "display_data" - }, - { - "data": { - "application/vnd.jupyter.widget-view+json": { - "model_id": "", - "version_major": 2, - "version_minor": 0 - }, - "text/plain": [ - "HBox(children=(FloatProgress(value=1.0, bar_style='info', description='Validating', layout=Layout(flex='2'), m…" - ] - }, - "metadata": {}, - "output_type": "display_data" - }, - { - "data": { - "application/vnd.jupyter.widget-view+json": { - "model_id": "", - "version_major": 2, - "version_minor": 0 - }, - "text/plain": [ - "HBox(children=(FloatProgress(value=1.0, bar_style='info', description='Validating', layout=Layout(flex='2'), m…" - ] - }, - "metadata": {}, - "output_type": "display_data" - }, - { - "data": { - "application/vnd.jupyter.widget-view+json": { - "model_id": "", - "version_major": 2, - 
"version_minor": 0 - }, - "text/plain": [ - "HBox(children=(FloatProgress(value=1.0, bar_style='info', description='Validating', layout=Layout(flex='2'), m…" - ] - }, - "metadata": {}, - "output_type": "display_data" - }, - { - "data": { - "application/vnd.jupyter.widget-view+json": { - "model_id": "", - "version_major": 2, - "version_minor": 0 - }, - "text/plain": [ - "HBox(children=(FloatProgress(value=1.0, bar_style='info', description='Validating', layout=Layout(flex='2'), m…" - ] - }, - "metadata": {}, - "output_type": "display_data" - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "\n" - ] - }, - { - "data": { - "text/plain": [ - "1" - ] - }, - "execution_count": 13, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "trainer = pl.Trainer(\n", - " gpus=1, \n", - " max_epochs=5,\n", - " log_every_n_steps=1,\n", - ")\n", - "\n", - "trainer.fit(task, data)" - ] - }, - { - "cell_type": "code", - "execution_count": 10, - "metadata": {}, - "outputs": [ - { - "data": { - "application/vnd.jupyter.widget-view+json": { - "model_id": "07b97a2eb8624924b8cbdd5e65120a33", - "version_major": 2, - "version_minor": 0 - }, - "text/plain": [ - "HBox(children=(FloatProgress(value=1.0, bar_style='info', description='Testing', layout=Layout(flex='2'), max=…" - ] - }, - "metadata": {}, - "output_type": "display_data" - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "--------------------------------------------------------------------------------\n", - "DATALOADER:0 TEST RESULTS\n", - "{'test_accuracy': tensor(0.9400, device='cuda:0'),\n", - " 'test_cross_entropy': tensor(0.2033, device='cuda:0'),\n", - " 'train_accuracy': tensor(1., device='cuda:0'),\n", - " 'train_cross_entropy': tensor(0.2229, device='cuda:0'),\n", - " 'val_accuracy': tensor(0.9256, device='cuda:0'),\n", - " 'val_cross_entropy': tensor(0.2121, device='cuda:0')}\n", - 
"--------------------------------------------------------------------------------\n", - "\n" - ] - } - ], - "source": [ - "trainer.test();" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.7.7" - } - }, - "nbformat": 4, - "nbformat_minor": 4 -} diff --git a/notebooks/image_classifier.py b/notebooks/image_classifier.py deleted file mode 100644 index 85381793821..00000000000 --- a/notebooks/image_classifier.py +++ /dev/null @@ -1,48 +0,0 @@ -# -*- coding: utf-8 -*- -# + -import os -from io import BytesIO -from urllib.request import urlopen -from zipfile import ZipFile - -import pytorch_lightning as pl -import torch - -from flash.vision import ImageClassificationData, ImageClassifier - -# - 
- -# First we'll download our data: - -with urlopen("https://download.pytorch.org/tutorial/hymenoptera_data.zip") as resp: - with ZipFile(BytesIO(resp.read())) as file: - file.extractall('data/') - -# Our data is sorted by class in train and val folders: -# ``` -# hymenoptera_data -# ├── train -# │ ├── ants -# │ └── bees -# └── val -# ├── ants -# └── bees -# ``` -# We can create a `pl.DataModule` from this like so: - -data = ImageClassificationData.from_folders( - train_folder="data/hymenoptera_data/train/", - valid_folder="data/hymenoptera_data/val/", - batch_size=4, -) - -model = ImageClassifier( - backbone="resnet18", - num_classes=2, - metrics=pl.metrics.Accuracy(), - optimizer=torch.optim.SGD, - learning_rate=0.001, -) - -trainer = pl.Trainer(max_epochs=25, fast_dev_run=os.getenv("TEST_ENV", False)) -trainer.fit(model, data) diff --git a/notebooks/text-classification.ipynb b/notebooks/text-classification.ipynb deleted file mode 100644 index 5e4731c39be..00000000000 --- a/notebooks/text-classification.ipynb +++ /dev/null @@ -1,502 +0,0 @@ -{ - "cells": [ - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": 1, - "metadata": {}, - "outputs": [ - { - "name": "stderr", - "output_type": "stream", - "text": [ - "/home/teddy/anaconda3/lib/python3.7/site-packages/tensorflow/python/data/ops/iterator_ops.py:546: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated since Python 3.3,and in 3.9 it will stop working\n", - " class IteratorBase(collections.Iterator, trackable.Trackable,\n", - "/home/teddy/anaconda3/lib/python3.7/site-packages/tensorflow/python/data/ops/dataset_ops.py:106: 
DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated since Python 3.3,and in 3.9 it will stop working\n", - " class DatasetV2(collections.Iterable, tracking_base.Trackable,\n", - "PyTorch version 1.6.0 available.\n", - "TensorFlow version 2.3.0 available.\n" - ] - } - ], - "source": [ - "from pathlib import Path\n", - "from urllib.request import urlopen\n", - "from zipfile import ZipFile\n", - "from io import BytesIO\n", - "from PIL import Image\n", - "from sklearn.model_selection import train_test_split\n", - "from pprint import pprint\n", - "import pandas as pd\n", - "import matplotlib.pyplot as plt\n", - "\n", - "\n", - "# Flash and PyTorch Lightning\n", - "from pl_flash.text import TextClassificationData, TextClassifier\n", - "import pytorch_lightning as pl" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": 2, - "metadata": {}, - "outputs": [ - { - "name": "stderr", - "output_type": "stream", - "text": [ - "/home/teddy/anaconda3/lib/python3.7/site-packages/ipykernel/ipkernel.py:287: DeprecationWarning: `should_run_async` will not call `transform_cell` automatically in the future. 
Please pass the result to `transformed_cell` argument and any exception that happen during thetransform in `preprocessing_exc_tuple` in IPython 7.17 and above.\n", - " and should_run_async(code)\n" - ] - } - ], - "source": [ - "data_path = Path(\"data/imdb\")\n", - "\n", - "if not data_path.exists():\n", - " with urlopen(\"https://pl-flash-data.s3.amazonaws.com/imdb.zip\") as resp:\n", - " with ZipFile(BytesIO(resp.read())) as file:\n", - " file.extractall(\"data/\")" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": 3, - "metadata": {}, - "outputs": [ - { - "data": { - "text/html": [ - "
\n", - "\n", - "\n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - "
                                              review sentiment
0  Japanese indie film with humor and philosophy ...  positive
1  Isaac Florentine has made some of the best wes...  negative
2  After seeing the low-budget shittier versions ...  negative
3  I've seen the original English version on vide...  positive
4  Ahh, nuthin' like cheesy, explopitative, semi-...  negative
\n", - "
" - ], - "text/plain": [ - " review sentiment\n", - "0 Japanese indie film with humor and philosophy ... positive\n", - "1 Isaac Florentine has made some of the best wes... negative\n", - "2 After seeing the low-budget shittier versions ... negative\n", - "3 I've seen the original English version on vide... positive\n", - "4 Ahh, nuthin' like cheesy, explopitative, semi-... negative" - ] - }, - "execution_count": 3, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "pd.read_csv(data_path/\"train.csv\").head()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": 4, - "metadata": {}, - "outputs": [ - { - "name": "stderr", - "output_type": "stream", - "text": [ - "/home/teddy/anaconda3/lib/python3.7/site-packages/ipykernel/ipkernel.py:287: DeprecationWarning: `should_run_async` will not call `transform_cell` automatically in the future. Please pass the result to `transformed_cell` argument and any exception that happen during thetransform in `preprocessing_exc_tuple` in IPython 7.17 and above.\n", - " and should_run_async(code)\n", - "Checking /home/teddy/.cache/huggingface/datasets/d927c670fd53408efaea423294f823daf050357872041f191bfea8af06a952b6.03756fef6da334f50a7ff73608e21b5018229944ca250416ce7352e25d84a552.py for additional imports.\n", - "Found main folder for dataset https://raw.githubusercontent.com/huggingface/datasets/1.0.1/datasets/csv/csv.py at /home/teddy/.cache/huggingface/modules/datasets_modules/datasets/csv\n", - "Found specific version folder for dataset https://raw.githubusercontent.com/huggingface/datasets/1.0.1/datasets/csv/csv.py at /home/teddy/.cache/huggingface/modules/datasets_modules/datasets/csv/0d06ce3712951dae7909fb214283b88efab3578535edb5eebd37c498b7a35277\n", - "Found script file from https://raw.githubusercontent.com/huggingface/datasets/1.0.1/datasets/csv/csv.py to 
/home/teddy/.cache/huggingface/modules/datasets_modules/datasets/csv/0d06ce3712951dae7909fb214283b88efab3578535edb5eebd37c498b7a35277/csv.py\n", - "Couldn't find dataset infos file at https://raw.githubusercontent.com/huggingface/datasets/1.0.1/datasets/csv/dataset_infos.json\n", - "Found metadata file for dataset https://raw.githubusercontent.com/huggingface/datasets/1.0.1/datasets/csv/csv.py at /home/teddy/.cache/huggingface/modules/datasets_modules/datasets/csv/0d06ce3712951dae7909fb214283b88efab3578535edb5eebd37c498b7a35277/csv.json\n", - "Using custom data configuration default\n", - "Overwrite dataset info from restored data version.\n", - "Loading Dataset info from /home/teddy/.cache/huggingface/datasets/csv/default-4c551d7a04ff9804/0.0.0/0d06ce3712951dae7909fb214283b88efab3578535edb5eebd37c498b7a35277\n", - "Reusing dataset csv (/home/teddy/.cache/huggingface/datasets/csv/default-4c551d7a04ff9804/0.0.0/0d06ce3712951dae7909fb214283b88efab3578535edb5eebd37c498b7a35277)\n", - "Constructing Dataset for split train, validation, test, from /home/teddy/.cache/huggingface/datasets/csv/default-4c551d7a04ff9804/0.0.0/0d06ce3712951dae7909fb214283b88efab3578535edb5eebd37c498b7a35277\n", - "100%|██████████| 3/3 [00:00<00:00, 708.74it/s]\n", - "Testing the mapped function outputs\n", - "Testing finished, running the mapping function on the dataset\n", - "Loading cached processed dataset at /home/teddy/.cache/huggingface/datasets/csv/default-4c551d7a04ff9804/0.0.0/0d06ce3712951dae7909fb214283b88efab3578535edb5eebd37c498b7a35277/cache-58e611b9754fe092.arrow\n", - "Testing the mapped function outputs\n", - "Testing finished, running the mapping function on the dataset\n", - "Loading cached processed dataset at /home/teddy/.cache/huggingface/datasets/csv/default-4c551d7a04ff9804/0.0.0/0d06ce3712951dae7909fb214283b88efab3578535edb5eebd37c498b7a35277/cache-58e4215b5b071b22.arrow\n", - "Testing the mapped function outputs\n", - "Testing finished, running the mapping function on 
the dataset\n", - "Loading cached processed dataset at /home/teddy/.cache/huggingface/datasets/csv/default-4c551d7a04ff9804/0.0.0/0d06ce3712951dae7909fb214283b88efab3578535edb5eebd37c498b7a35277/cache-fb7a4e9ad680462c.arrow\n", - "Testing the mapped function outputs\n", - "Testing finished, running the mapping function on the dataset\n", - "Caching processed dataset at /home/teddy/.cache/huggingface/datasets/csv/default-4c551d7a04ff9804/0.0.0/0d06ce3712951dae7909fb214283b88efab3578535edb5eebd37c498b7a35277/cache-7a9447f8aca59275.arrow\n" - ] - }, - { - "data": { - "application/vnd.jupyter.widget-view+json": { - "model_id": "89c14577165545abb8caeb0031f47d8c", - "version_major": 2, - "version_minor": 0 - }, - "text/plain": [ - "HBox(children=(FloatProgress(value=0.0, max=23.0), HTML(value='')))" - ] - }, - "metadata": {}, - "output_type": "display_data" - }, - { - "name": "stderr", - "output_type": "stream", - "text": [ - "Done writing 22500 examples in 99118186 bytes /home/teddy/.cache/huggingface/datasets/csv/default-4c551d7a04ff9804/0.0.0/0d06ce3712951dae7909fb214283b88efab3578535edb5eebd37c498b7a35277/tmpt2m67mqy.\n", - "Testing the mapped function outputs\n", - "Testing finished, running the mapping function on the dataset\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "\n" - ] - }, - { - "name": "stderr", - "output_type": "stream", - "text": [ - "Caching processed dataset at /home/teddy/.cache/huggingface/datasets/csv/default-4c551d7a04ff9804/0.0.0/0d06ce3712951dae7909fb214283b88efab3578535edb5eebd37c498b7a35277/cache-16a4b77e3363d2e5.arrow\n" - ] - }, - { - "data": { - "application/vnd.jupyter.widget-view+json": { - "model_id": "9e28ddbb86154051bd54680287e1759d", - "version_major": 2, - "version_minor": 0 - }, - "text/plain": [ - "HBox(children=(FloatProgress(value=0.0, max=3.0), HTML(value='')))" - ] - }, - "metadata": {}, - "output_type": "display_data" - }, - { - "name": "stderr", - "output_type": "stream", - "text": [ - "Done 
writing 2500 examples in 11005031 bytes /home/teddy/.cache/huggingface/datasets/csv/default-4c551d7a04ff9804/0.0.0/0d06ce3712951dae7909fb214283b88efab3578535edb5eebd37c498b7a35277/tmpbqcwwuwh.\n", - "Testing the mapped function outputs\n", - "Testing finished, running the mapping function on the dataset\n", - "Caching processed dataset at /home/teddy/.cache/huggingface/datasets/csv/default-4c551d7a04ff9804/0.0.0/0d06ce3712951dae7909fb214283b88efab3578535edb5eebd37c498b7a35277/cache-8b6ae889fbd14d76.arrow\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "\n" - ] - }, - { - "data": { - "application/vnd.jupyter.widget-view+json": { - "model_id": "c4039a2fb24f469aaaf0f4cbcf19c13f", - "version_major": 2, - "version_minor": 0 - }, - "text/plain": [ - "HBox(children=(FloatProgress(value=0.0, max=25.0), HTML(value='')))" - ] - }, - "metadata": {}, - "output_type": "display_data" - }, - { - "name": "stderr", - "output_type": "stream", - "text": [ - "Done writing 25000 examples in 110161107 bytes /home/teddy/.cache/huggingface/datasets/csv/default-4c551d7a04ff9804/0.0.0/0d06ce3712951dae7909fb214283b88efab3578535edb5eebd37c498b7a35277/tmp87iwmjei.\n", - "Set __getitem__(key) output type to torch for ['input_ids', 'labels'] columns (when key is int or slice) and don't output other (un-formatted) columns.\n", - "Set __getitem__(key) output type to torch for ['input_ids', 'labels'] columns (when key is int or slice) and don't output other (un-formatted) columns.\n", - "Set __getitem__(key) output type to torch for ['input_ids', 'labels'] columns (when key is int or slice) and don't output other (un-formatted) columns.\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "\n" - ] - } - ], - "source": [ - "data = TextClassificationData.from_files(\n", - " train_file=data_path/\"train.csv\",\n", - " valid_file=data_path/\"valid.csv\",\n", - " test_file=data_path/\"test.csv\",\n", - " input=\"review\",\n", - " 
target=\"sentiment\",\n", - " batch_size=32\n", - ")" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": 5, - "metadata": {}, - "outputs": [ - { - "name": "stderr", - "output_type": "stream", - "text": [ - "Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertForSequenceClassification: ['cls.predictions.bias', 'cls.predictions.transform.dense.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.decoder.weight', 'cls.seq_relationship.weight', 'cls.seq_relationship.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.LayerNorm.bias']\n", - "- This IS expected if you are initializing BertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. 
initializing a BertForSequenceClassification model from a BertForPretraining model).\n", - "- This IS NOT expected if you are initializing BertForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).\n", - "Some weights of BertForSequenceClassification were not initialized from the model checkpoint at bert-base-uncased and are newly initialized: ['classifier.weight', 'classifier.bias']\n", - "You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.\n" - ] - } - ], - "source": [ - "task = TextClassifier(num_classes=2, metrics=pl.metrics.Accuracy())" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [ - { - "name": "stderr", - "output_type": "stream", - "text": [ - "GPU available: True, used: True\n", - "TPU available: False, using: 0 TPU cores\n", - "LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]\n", - "\n", - " | Name | Type | Params\n", - "----------------------------------------------------------\n", - "0 | metrics | ModuleDict | 0 \n", - "1 | model | BertForSequenceClassification | 109 M \n" - ] - }, - { - "data": { - "application/vnd.jupyter.widget-view+json": { - "model_id": "", - "version_major": 2, - "version_minor": 0 - }, - "text/plain": [ - "HBox(children=(FloatProgress(value=1.0, bar_style='info', description='Validation sanity check', layout=Layout…" - ] - }, - "metadata": {}, - "output_type": "display_data" - }, - { - "name": "stderr", - "output_type": "stream", - "text": [ - "/home/teddy/anaconda3/lib/python3.7/site-packages/datasets/arrow_dataset.py:835: UserWarning: The given NumPy array is not writeable, and PyTorch does not support non-writeable tensors. This means you can write to the underlying (supposedly non-writeable) NumPy array using the tensor. 
You may want to copy the array to protect its data or make it writeable before converting it to a tensor. This type of warning will be suppressed for the rest of this program. (Triggered internally at /pytorch/torch/csrc/utils/tensor_numpy.cpp:141.)\n", - " return torch.tensor(x, **format_kwargs)\n", - "/home/teddy/anaconda3/lib/python3.7/site-packages/datasets/arrow_dataset.py:835: UserWarning: The given NumPy array is not writeable, and PyTorch does not support non-writeable tensors. This means you can write to the underlying (supposedly non-writeable) NumPy array using the tensor. You may want to copy the array to protect its data or make it writeable before converting it to a tensor. This type of warning will be suppressed for the rest of this program. (Triggered internally at /pytorch/torch/csrc/utils/tensor_numpy.cpp:141.)\n", - " return torch.tensor(x, **format_kwargs)\n", - "/home/teddy/anaconda3/lib/python3.7/site-packages/datasets/arrow_dataset.py:835: UserWarning: The given NumPy array is not writeable, and PyTorch does not support non-writeable tensors. This means you can write to the underlying (supposedly non-writeable) NumPy array using the tensor. You may want to copy the array to protect its data or make it writeable before converting it to a tensor. This type of warning will be suppressed for the rest of this program. (Triggered internally at /pytorch/torch/csrc/utils/tensor_numpy.cpp:141.)\n", - " return torch.tensor(x, **format_kwargs)\n", - "/home/teddy/anaconda3/lib/python3.7/site-packages/datasets/arrow_dataset.py:835: UserWarning: The given NumPy array is not writeable, and PyTorch does not support non-writeable tensors. This means you can write to the underlying (supposedly non-writeable) NumPy array using the tensor. You may want to copy the array to protect its data or make it writeable before converting it to a tensor. This type of warning will be suppressed for the rest of this program. 
(Triggered internally at /pytorch/torch/csrc/utils/tensor_numpy.cpp:141.)\n", - " return torch.tensor(x, **format_kwargs)\n", - "/home/teddy/anaconda3/lib/python3.7/site-packages/datasets/arrow_dataset.py:835: UserWarning: The given NumPy array is not writeable, and PyTorch does not support non-writeable tensors. This means you can write to the underlying (supposedly non-writeable) NumPy array using the tensor. You may want to copy the array to protect its data or make it writeable before converting it to a tensor. This type of warning will be suppressed for the rest of this program. (Triggered internally at /pytorch/torch/csrc/utils/tensor_numpy.cpp:141.)\n", - " return torch.tensor(x, **format_kwargs)\n", - "/home/teddy/anaconda3/lib/python3.7/site-packages/datasets/arrow_dataset.py:835: UserWarning: The given NumPy array is not writeable, and PyTorch does not support non-writeable tensors. This means you can write to the underlying (supposedly non-writeable) NumPy array using the tensor. You may want to copy the array to protect its data or make it writeable before converting it to a tensor. This type of warning will be suppressed for the rest of this program. (Triggered internally at /pytorch/torch/csrc/utils/tensor_numpy.cpp:141.)\n", - " return torch.tensor(x, **format_kwargs)\n" - ] - }, - { - "data": { - "application/vnd.jupyter.widget-view+json": { - "model_id": "45e080a5ccdd4dcb97db0f21a13666c8", - "version_major": 2, - "version_minor": 0 - }, - "text/plain": [ - "HBox(children=(FloatProgress(value=1.0, bar_style='info', description='Training', layout=Layout(flex='2'), max…" - ] - }, - "metadata": {}, - "output_type": "display_data" - }, - { - "name": "stderr", - "output_type": "stream", - "text": [ - "/home/teddy/anaconda3/lib/python3.7/site-packages/datasets/arrow_dataset.py:835: UserWarning: The given NumPy array is not writeable, and PyTorch does not support non-writeable tensors. 
This means you can write to the underlying (supposedly non-writeable) NumPy array using the tensor. You may want to copy the array to protect its data or make it writeable before converting it to a tensor. This type of warning will be suppressed for the rest of this program. (Triggered internally at /pytorch/torch/csrc/utils/tensor_numpy.cpp:141.)\n", - " return torch.tensor(x, **format_kwargs)\n", - "/home/teddy/anaconda3/lib/python3.7/site-packages/datasets/arrow_dataset.py:835: UserWarning: The given NumPy array is not writeable, and PyTorch does not support non-writeable tensors. This means you can write to the underlying (supposedly non-writeable) NumPy array using the tensor. You may want to copy the array to protect its data or make it writeable before converting it to a tensor. This type of warning will be suppressed for the rest of this program. (Triggered internally at /pytorch/torch/csrc/utils/tensor_numpy.cpp:141.)\n", - " return torch.tensor(x, **format_kwargs)\n", - "/home/teddy/anaconda3/lib/python3.7/site-packages/datasets/arrow_dataset.py:835: UserWarning: The given NumPy array is not writeable, and PyTorch does not support non-writeable tensors. This means you can write to the underlying (supposedly non-writeable) NumPy array using the tensor. You may want to copy the array to protect its data or make it writeable before converting it to a tensor. This type of warning will be suppressed for the rest of this program. 
(Triggered internally at /pytorch/torch/csrc/utils/tensor_numpy.cpp:141.)\n", - " return torch.tensor(x, **format_kwargs)\n" - ] - } - ], - "source": [ - "trainer = pl.Trainer(\n", - " gpus=1, \n", - " max_epochs=5,\n", - " log_every_n_steps=1,\n", - ")\n", - "\n", - "trainer.fit(task, data)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.7.7" - } - }, - "nbformat": 4, - "nbformat_minor": 4 -}