TIMIDY - Statistical NLP Final Project

A chat system that generates its responses from the input text.

Team: Ismayil Hasanov, Swati Raghuwanshi, Han Wang

Instructor: Yassine Benajiba

Introduction

In this project we experimented with current state-of-the-art NLP techniques for generating chat responses from input text. We used recurrent neural networks (RNNs), specifically the sequence-to-sequence (seq2seq) deep learning approach.

Dataset

Chat corpus repo (https://github.com/Marsan-Ma/chat_corpus)

Methods

  • Preprocessing the data
  • Building the models
  • Testing the models
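As an illustration of the preprocessing step, here is a minimal sketch of how question/answer pairs could be turned into padded integer sequences for a word-level seq2seq model. It is not taken from the project scripts: the special tokens, the whitespace tokenizer, the helper names (`tokenize`, `build_vocab`, `encode`) and the toy `pairs` list are all assumptions made for this example.

```python
from collections import Counter

# Hypothetical special tokens; the project scripts may use different ones.
PAD, START, END, UNK = "<pad>", "<start>", "<end>", "<unk>"


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def build_vocab(sentences, min_count=1):
    """Map each token seen at least min_count times to an integer id."""
    counts = Counter(tok for s in sentences for tok in tokenize(s))
    vocab = {PAD: 0, START: 1, END: 2, UNK: 3}
    for tok, n in sorted(counts.items()):
        if n >= min_count:
            vocab[tok] = len(vocab)
    return vocab


def encode(text, vocab, max_len):
    """Wrap the tokens in START/END, map them to ids, then pad to max_len.

    Sequences longer than max_len are truncated, which can drop the END token.
    """
    tokens = [START] + tokenize(text) + [END]
    ids = [vocab.get(tok, vocab[UNK]) for tok in tokens][:max_len]
    return ids + [vocab[PAD]] * (max_len - len(ids))


# Toy (input, response) pairs standing in for the chat corpus.
pairs = [("how are you", "i am fine"), ("hello", "hi there")]
vocab = build_vocab([s for pair in pairs for s in pair])
encoded = [(encode(q, vocab, 6), encode(a, vocab, 6)) for q, a in pairs]
```

Each entry of `encoded` holds an (input ids, target ids) pair of equal length, which is the shape a seq2seq encoder and decoder would typically consume.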

Files

  • Word-based: word_seq2seq.py
  • Character-based: Character_seq2seq.py
  • Word-based with embedding: word_embedding_seq2seq.py
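To make the difference between the three variants concrete, the sketch below contrasts word-level and character-level tokenization, and a one-hot input with an embedding lookup. It is only an illustration under assumptions: the helper names, the embedding dimension and the random initial values are hypothetical and are not taken from the scripts listed above.

```python
import random


def word_tokens(text):
    """Word-level units: small sequences, large vocabulary."""
    return text.lower().split()


def char_tokens(text):
    """Character-level units: long sequences, small vocabulary."""
    return list(text.lower())


def one_hot(index, size):
    """Sparse input vector with a single 1.0 at the token's index."""
    vec = [0.0] * size
    vec[index] = 1.0
    return vec


def make_embedding_table(vocab_size, dim, seed=0):
    """Dense vectors, one row per token; a real model would learn these values."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(vocab_size)]


reply = "hi there"
words = word_tokens(reply)   # 2 tokens
chars = char_tokens(reply)   # 8 tokens, including the space

word_vocab = {tok: i for i, tok in enumerate(sorted(set(words)))}
table = make_embedding_table(len(word_vocab), dim=4)

sparse = one_hot(word_vocab["hi"], len(word_vocab))  # length equals vocabulary size
dense = table[word_vocab["hi"]]                      # length equals embedding dim
```

The trade-off this illustrates: character-level input keeps the vocabulary tiny at the cost of longer sequences, while an embedding layer replaces vocabulary-sized one-hot vectors with short dense vectors.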
