The static webpage is currently down because its Firebase hosting is no longer supported.
~~See the static web app on my GitHub Pages (use Wi-Fi: the model downloads ~14 MB on page load).~~
See the Python code on my Google Colab here.
Demonstration of the project:
All Georgian names with a count >= 5 are used to train the model: ~14k names in total, with an approximately 45:55 F:M distribution. The dataset was legally acquired from the Georgian government; I am not making it public to avoid any trouble.
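For illustration, the loading and filtering step could look like the sketch below. The filename and column names are assumptions, since the dataset is not public:

```python
import pandas as pd

# "names.csv" and its columns (name, gender, count) are assumed
# placeholders; the real dataset is not public.
df = pd.read_csv("names.csv")

# keep only names that appear at least 5 times
df = df[df["count"] >= 5]

print(len(df))                                    # ~14k names
print(df["gender"].value_counts(normalize=True))  # roughly 45:55 F:M
```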
We use a 2-stacked LSTM model with MAXLEN LSTM cells (time steps) per stack. Each cell accepts a vector of length VOCABLEN. In short, names are one-hot encoded, so each name is represented by a matrix of shape (MAXLEN, VOCABLEN).
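A minimal Keras sketch of such a model. The hidden size of 128, the optimizer, and the concrete MAXLEN/VOCABLEN values are assumptions, not taken from the project:

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

MAXLEN = 12    # assumed value; a hyperparameter in the original
VOCABLEN = 33  # assumed value; derived from char_idx in practice

model = Sequential([
    # first stack: return the full sequence so the second stack
    # sees one hidden vector per time step
    LSTM(128, input_shape=(MAXLEN, VOCABLEN), return_sequences=True),
    # second stack: only the final hidden state is needed
    LSTM(128),
    # distribution over the vocabulary for the next character
    Dense(VOCABLEN, activation="softmax"),
])
model.compile(loss="categorical_crossentropy", optimizer="adam")
```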
MAXLEN is a hyperparameter, while VOCABLEN is derived after reading the input data: it is the size of the char_idx dictionary, which maps every character present in the data to an index (e.g. 'a': 0, and so on).
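A sketch of how char_idx and the one-hot encoding could be built. The placeholder names and the MAXLEN value are assumptions:

```python
import numpy as np

names = ["ana", "giorgi", "nino"]  # placeholder data; the real names are not public

# build char_idx: map every character present in the data to an index
chars = sorted({c for name in names for c in name})
char_idx = {c: i for i, c in enumerate(chars)}
VOCABLEN = len(char_idx)
MAXLEN = 12  # hyperparameter; names longer than this are truncated here

def one_hot(name):
    """Encode a name as a (MAXLEN, VOCABLEN) one-hot matrix."""
    x = np.zeros((MAXLEN, VOCABLEN), dtype=np.float32)
    for t, c in enumerate(name[:MAXLEN]):
        x[t, char_idx[c]] = 1.0
    return x
```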
Moreover, we reshuffle the train/test split for N iterations of M epochs each to help reduce overfitting. I did not find much supporting literature on this after researching; it was simply a choice I thought would help the training process.
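A sketch of that loop, assuming X and y hold the one-hot inputs and targets and model is the network above. The concrete N, M, split ratio, and batch size are stand-ins:

```python
from sklearn.model_selection import train_test_split

N_ITERATIONS = 10  # N and M are unspecified in the original; these are stand-ins
M_EPOCHS = 5

for i in range(N_ITERATIONS):
    # reshuffle the train/test split on every iteration
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, shuffle=True, random_state=i)
    model.fit(X_train, y_train,
              validation_data=(X_test, y_test),
              epochs=M_EPOCHS, batch_size=128)
```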
- Use the count variable for each name to give the model more information about each name's popularity (see the sketch after this list).
- Further reduce overfitting (this is hard to fix without trial and error).
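One possible way to realize the first item is to pass the counts as per-sample weights via Keras's sample_weight; the log-scaling here is my own illustration, not part of the project:

```python
import numpy as np

def count_weights(counts):
    """Turn raw name frequencies into per-sample training weights.

    Log-scaling keeps very common names from dominating the loss while
    still telling the model which names are popular.
    """
    counts = np.asarray(counts, dtype=np.float32)
    weights = np.log1p(counts)
    return weights / weights.mean()  # normalise so the average weight is 1.0

# possible usage with the model above (counts aligned with X):
# model.fit(X, y, sample_weight=count_weights(counts), epochs=M_EPOCHS)
```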
The model architecture follows the one constructed by prdeepakbabu (see LSTM_RNN_architecture.jpg).