nikolaypavlov/spark-nlp-workshop

Spark NLP Workshop

Prerequisites

Docker version 18.09.2 or later
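The version requirement above can be checked from a terminal before going further. The following is a minimal sketch, not part of the workshop itself: the helper name `version_ge` is hypothetical, and it assumes a `sort` that supports `-V` (version sort), as in GNU coreutils.

```shell
#!/bin/sh
# Minimum version taken from the Prerequisites section.
REQUIRED_DOCKER_VERSION="18.09.2"

# version_ge A B: succeeds when version A >= version B.
# Hypothetical helper; relies on `sort -V` (version sort).
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

if command -v docker >/dev/null 2>&1; then
  # `docker --version` prints a line such as "Docker version 18.09.2, build <hash>".
  installed=$(docker --version | sed -E 's/^Docker version ([0-9.]+).*/\1/')
  if version_ge "$installed" "$REQUIRED_DOCKER_VERSION"; then
    echo "Docker $installed is new enough."
  else
    echo "Docker $installed is older than $REQUIRED_DOCKER_VERSION."
  fi
else
  echo "docker not found on PATH."
fi
```

The script only reports the result and does not stop on an older version, so it is safe to paste into an interactive shell.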

Installation

  1. Clone the repo.
git clone https://github.com/nikolaypavlov/spark-nlp-workshop.git
cd spark-nlp-workshop
  2. Download the News Category Dataset from Kaggle and unzip it into the data directory.

  3. Build the container and start it.

If the make utility is available in your environment:

make build
make run

If make is not available:

docker build -t spark-nlp-workshop .
docker run -it --rm -e LANG=C.UTF-8 -e LC_ALL=C.UTF-8 -v `pwd`/data:/app/data -p 8888:8888 spark-nlp-workshop

Note: on Windows, use %cd% (or the full path to the data directory) in place of `pwd` when starting the container.
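Because the `-v` flag mounts the host data directory into the container at /app/data, an empty or missing directory only shows up later, inside the notebook. A small pre-flight check, run from the repository root before `docker run`, might look like the sketch below. The function name `data_dir_ready` is hypothetical, and the check only confirms that the directory is non-empty, since the dataset file name is not stated here.

```shell
#!/bin/sh
# data_dir_ready DIR: succeeds when DIR exists and contains at least one entry.
# Hypothetical helper; it does not verify which files are present.
data_dir_ready() {
  [ -d "$1" ] && [ -n "$(ls -A "$1" 2>/dev/null)" ]
}

# Same host path that the -v flag mounts to /app/data.
DATA_DIR="$(pwd)/data"

if data_dir_ready "$DATA_DIR"; then
  echo "Found files in $DATA_DIR"
else
  echo "$DATA_DIR is missing or empty; unzip the dataset there first."
fi
```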

  4. Open Jupyter Notebook in the browser at http://localhost:8888 and paste the session token from the terminal into the login form.

Session token

  5. Open the spark-spacy.ipynb notebook and run the code blocks to check that everything works.
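If the session token is hard to spot in the startup output, it can be pulled out of the log text with a small filter. This is a sketch under assumptions: it presumes the container prints a Jupyter URL containing `?token=<hex>`, which is typical for Jupyter but not confirmed by this README. The function name `extract_token` and the sample token are made up for illustration.

```shell
#!/bin/sh
# extract_token: reads log text on stdin and prints the first hex token
# found in a "...?token=<hex>" URL. Hypothetical helper.
extract_token() {
  sed -n 's/.*[?&]token=\([0-9a-f]\{1,\}\).*/\1/p' | head -n 1
}

# Demo with a made-up log line (not real output from this image).
sample_line='    http://127.0.0.1:8888/?token=0123abcd'
echo "$sample_line" | extract_token
```

With the container running, the same filter could be applied from a second terminal, for example `docker logs "$(docker ps -q --filter ancestor=spark-nlp-workshop)" 2>&1 | extract_token`, assuming only one container from that image is running.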
