Project showing implementation of AutoSuggestion and AutoCorrection using Elasticsearch, PySpark and Flask.

Wolvarun9295/AutoComplete--Flask-Elasticsearch-PySpark

AutoSuggestion and AutoCorrection using Flask, Elasticsearch and PySpark

This project demonstrates AutoSuggestion and AutoCorrection, similar to the suggestions some websites show while a query is being typed into the search bar.

The project is implemented in two ways:

  1. Using Completion Suggester with Mapping
  2. Using Fuzzy Query without Mapping
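As a rough illustration of how the two approaches differ, the sketch below builds the relevant request bodies as plain Python dictionaries. The field names `title` and `title_suggest`, the suggester name `movie_suggest`, and the function names are assumptions made for this example, not names taken from the project. The mapping uses the typeless format of Elasticsearch 7 or later, which is also an assumption.

```python
# Sketch only: field, suggester, and function names are assumed for illustration.

# Approach 1: an explicit mapping with a dedicated `completion` field.
COMPLETION_MAPPING = {
    "mappings": {
        "properties": {
            "title": {"type": "text"},
            "title_suggest": {"type": "completion"},
        }
    }
}


def completion_suggest_body(prefix, size=5):
    """Build a search body asking the Completion Suggester for matches of `prefix`."""
    return {
        "suggest": {
            "movie_suggest": {
                "prefix": prefix,
                "completion": {
                    "field": "title_suggest",
                    "size": size,
                    # Tolerate small typos in the prefix (the AutoCorrection part).
                    "fuzzy": {"fuzziness": "AUTO"},
                },
            }
        }
    }


# Approach 2: no explicit mapping; a fuzzy query runs against a dynamically mapped field.
def fuzzy_query_body(text, size=5):
    """Build a search body that matches terms within a small edit distance of `text`."""
    return {
        "size": size,
        "query": {
            "fuzzy": {
                "title": {"value": text, "fuzziness": "AUTO"}
            }
        },
    }
```

With the `elasticsearch` Python client, the mapping would typically be supplied when the index is created, and either query dictionary would be sent as the body of a search request. The Completion Suggester needs the mapping up front because it searches a specialised field type, whereas the fuzzy query can run against whatever fields Elasticsearch mapped automatically.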

Prerequisites:

  • Python 3 (a version earlier than 3.8, to avoid compatibility issues)
$ sudo apt-get install python3
$ sudo apt-get install python3-pip
  • Java JDK 8 (required for Spark) and JDK 11 (required for Elasticsearch)
  • Apache Spark (v2.4.x preferred, to avoid compatibility issues), plus the pyspark and elasticsearch Python packages, installed with pip
$ sudo pip3 install pyspark
$ sudo pip3 install elasticsearch

Steps to run the code

To see how data is ingested into Elasticsearch using PySpark, check out this repo: Spark-Elasticsearch-5MilData

  1. The dataset is a tar.gz archive, so extract it first:
$ tar -xzf file.tar.gz
  2. Add your dataset path to the dataToES.py file and run it to ingest the data into Elasticsearch. The script first cleans the data and then ingests it into the index.

    NOTE: With the Completion Suggester option, the cleaned data is stored in a folder named movies_clean_dataset, and the script creates an index with the Completion Suggester mapping.

$ python3 dataToES.py
  3. Finally, run the run.py file and open the landing page at localhost:5000 if running on a local system, or at 0.0.0.0:5000 if running on AWS or GCP.
$ python3 run.py
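
The clean-then-ingest step above might look roughly like the following sketch. It does not reproduce dataToES.py, and it uses plain Python rather than PySpark to stay self-contained. The cleaning rule, the index name `movies`, the field names `title` and `title_suggest`, and both function names are assumptions chosen for illustration. The generated dictionaries follow the action format accepted by `elasticsearch.helpers.bulk`.

```python
import re


def clean_title(raw):
    """Hypothetical cleaning rule: collapse runs of whitespace and trim the ends."""
    return re.sub(r"\s+", " ", raw).strip()


def to_bulk_actions(titles, index="movies"):
    """Yield one bulk-index action per non-empty title, skipping case-insensitive duplicates."""
    seen = set()
    for raw in titles:
        title = clean_title(raw)
        key = title.lower()
        if not title or key in seen:
            continue
        seen.add(key)
        yield {
            "_index": index,
            "_source": {
                "title": title,
                # Input for a `completion` field; only relevant to the
                # Completion Suggester option (assumed field name).
                "title_suggest": {"input": [title]},
            },
        }


if __name__ == "__main__":
    sample = ["  Toy   Story ", "toy story", "", "Jumanji"]
    for action in to_bulk_actions(sample):
        print(action["_source"]["title"])
```

With a running cluster and a client object, the generator could be passed to `elasticsearch.helpers.bulk(client, to_bulk_actions(titles))`. Yielding actions lazily avoids building the whole dataset in memory before indexing.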

Directory Structure of the Flask App:

The following screenshot shows the working of this project:

License and Copyright

© Varun I. Nagrare

Licensed under the MIT License
