This example extracts named entities from text with spacyr, run across Spark workers through spark_apply() from sparklyr. spacyr is an R binding for spaCy, a Python library for NLP, so it requires a Python installation with spaCy.
- Run install_spacy.sh on the CDSW terminal.
- Install the following packages in an R session:
    devtools::install_github("rstudio/sparklyr")
    install.packages(c("janeaustenr"))
- Open sparcyr-sparklyr.R in a CDSW session and run all of it.
If you are not a CDSW user, run these steps on a gateway node instead.
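
For orientation, the general pattern of calling spacyr inside spark_apply() might look like the sketch below. This is not the contents of sparcyr-sparklyr.R. The local Spark master, the 1000-line sample, the table name austen_books, and the helper name extract_entities are all assumptions made for illustration. It also assumes that every worker has R, spacyr, and a Python environment with spaCy and an English model available, and that dplyr is installed alongside sparklyr.

```r
library(sparklyr)
library(dplyr)
library(janeaustenr)

# Assumption: a local master for illustration; a cluster would use its own
# master setting instead.
sc <- spark_connect(master = "local")

# Take a small sample of non-empty lines from the Jane Austen corpus.
books <- austen_books() %>%
  filter(nchar(text) > 0) %>%
  mutate(book = as.character(book)) %>%
  head(1000)

books_tbl <- copy_to(sc, books, "austen_books", overwrite = TRUE)

# Hypothetical helper: spark_apply() calls it once per partition on the
# workers, passing that partition as an R data frame.
extract_entities <- function(df) {
  library(spacyr)
  spacy_initialize()  # may need extra arguments depending on the Python setup
  parsed <- spacy_parse(df$text, entity = TRUE)
  ents <- entity_extract(parsed, type = "named")
  data.frame(
    entity = as.character(ents$entity),
    entity_type = as.character(ents$entity_type),
    stringsAsFactors = FALSE
  )
}

# Declare the output schema up front so Spark does not have to infer it.
entities_tbl <- spark_apply(
  books_tbl,
  extract_entities,
  columns = c(entity = "character", entity_type = "character")
)

# Show the most frequent named entities in the sample.
entities_tbl %>%
  count(entity, entity_type, sort = TRUE) %>%
  head(20)

spark_disconnect(sc)
```

Loading spacyr inside the function rather than at the top of the script matters here: the function body executes in separate R processes on the workers, which do not share the driver's loaded packages or its spaCy session.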