A simple and self-descriptive implementation of neural networks with support for different topologies, activation functions and optimization strategies.
A useful tool if you are interested in discovering the intuition behind neural networks through experimentation.
For a quick demo you can check the notebooks with detailed examples of how to train and use a NN:
- Feedforward NN on binary classification problem
- Using mini-batches for training
- Multi-class classification with FNN
Or you can see the following snippet on how to train a binary classifier with 2 hidden layers (50 and 20 nodes), `tanh` activation for the hidden layers and `sigmoid` for the output, on a dataset of `m` examples with a 10-sized feature vector:
```python
from mynn import FNN
from mynn.activation import TanhActivation, SigmoidActivation

nn = FNN(
    layers_config=[
        (50, TanhActivation),
        (20, TanhActivation),
        (1, SigmoidActivation)
    ],
    n_x=10
)

# To train the network we need to ensure that the input
# matrix is in (n, m) shape, where n is the size of the input
# feature vector and m the number of samples
X, Y = load_dataset()
nn.train(X, Y)

# To predict the class you can use the `predict()` method
Y_pred = nn.predict(X_test)

# If you need the prediction probabilities you can
# just perform a forward propagation
Y_proba = nn.forward(X_test)
```
The library provides an abstract interface for activation functions: any activation can be plugged in as long as it provides a forward computation and a derivative estimation (see the sketch after the list below). The package is shipped with 4 implementations of activation functions:

- `ReLU`
- `Tanh`
- `Sigmoid`
- `Softmax`
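As an illustration of what such an implementation could look like, here is a minimal sketch of a custom activation. The class and method names (`compute()`, `derivative()`) are assumptions made for the example only; check the `mynn.activation` module for the exact base class and method names.

```python
import numpy as np

class LeakyReLUActivation:
    """Hypothetical custom activation; the real MyNN base class and
    method names may differ from this sketch."""

    def __init__(self, alpha=0.01):
        self.alpha = alpha

    def compute(self, Z):
        # Forward computation: element-wise leaky ReLU
        return np.where(Z > 0, Z, self.alpha * Z)

    def derivative(self, Z):
        # Derivative estimation used during back-propagation
        return np.where(Z > 0, 1.0, self.alpha)
```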
The optimization strategy is a core component of any optimization problem. Depending on the strategy, training can be faster and more effective. For this reason it is very beneficial to understand the mechanics behind each algorithm before choosing one. Sebastian Ruder has published a very nice blog post where he presents different optimization algorithms in detail.
MyNN is shipped with different implementations of optimization strategies (an illustrative sketch of the Adam update follows the list):

- `GradientDescent`: Typical gradient descent algorithm.
- `GradientDescentMomentum`: Works like gradient descent but takes the previous gradients into account in order to keep momentum.
- `AdaptiveGradientDescent`: A gradient descent that adapts the learning rate per optimized parameter.
- `RMSProp`: Gradient descent that adapts the learning rate based on the average of recent magnitudes of the gradients for the weight.
- `Adam` (default): Adaptive moment estimation, an algorithm that encapsulates both the ideas of momentum and RMSProp.
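To see how Adam combines the two ideas, here is a minimal NumPy sketch of a single Adam update step. It only illustrates the algorithm itself and is not MyNN's internal implementation; the hyper-parameter names follow the original paper.

```python
import numpy as np

def adam_step(w, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update of parameter `w` given its gradient `grad`.
    `m`/`v` are running first/second moment estimates, `t` is the
    1-based iteration counter."""
    m = beta1 * m + (1 - beta1) * grad          # momentum-style first moment
    v = beta2 * v + (1 - beta2) * grad ** 2     # RMSProp-style second moment
    m_hat = m / (1 - beta1 ** t)                # bias correction
    v_hat = v / (1 - beta2 ** t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v
```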
The library supports different ways to initialize the weights of the linear functions. By default it ships with the following strategies:

- `ConstantWeightInitializer`: Initializes weights from a predefined list of constant values.
- `NormalWeightInitializer`: Initializes weights with random values following a normal distribution.
- `VarianceScalingWeightInitializer` (default with `scale=2`): Initializes weights with random values following a normal distribution, scaled to ensure a specific variance per layer.
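The idea behind variance scaling (with `scale=2` this is essentially He initialization) can be sketched in a few lines of NumPy. The function below is only an illustration; the actual `VarianceScalingWeightInitializer` API may differ.

```python
import numpy as np

def variance_scaling_init(n_out, n_in, scale=2.0):
    # Draw weights from a normal distribution and rescale them so the
    # variance of each layer's pre-activations stays roughly constant.
    return np.random.randn(n_out, n_in) * np.sqrt(scale / n_in)

W1 = variance_scaling_init(50, 10)  # e.g. first hidden layer: 50 nodes, 10 inputs
```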
MyNN supports regularization, but only one method per training run. As it does not distinguish between the training and prediction stages, you need to disable regularization after training in order to get reasonable results. Currently the following algorithms are supported (a sketch of the dropout trick follows the list):

- `L2Regularization`: Regularization of the cost function using the L2 norm of the neuron weights.
- `DropoutRegularization`: Inverted dropout with support for a different `keep_probability` per layer.
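For reference, inverted dropout itself is a small trick: activations are zeroed with probability `1 - keep_probability` and the survivors are scaled up so their expected magnitude is unchanged, which is why it can simply be disabled at prediction time. A rough NumPy sketch, not MyNN's implementation:

```python
import numpy as np

def inverted_dropout(A, keep_probability=0.8):
    # Randomly zero activations and rescale the rest so the
    # expected value of A stays the same (hence "inverted" dropout).
    mask = np.random.rand(*A.shape) < keep_probability
    return A * mask / keep_probability
```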
There are many problems that demand encoding of the variable `Y` at train time and decoding at prediction time.
The following en(de)coders are currently supported by MyNN (illustrated in the sketch after the list):

- `ThresholdBinaryEncoderDecoder`: The most basic encoder for binary classification problems. It can encode class labels to the probabilities `0.0` and `1.0`, and can decode probabilities back to labels using a standard threshold.
- `OneHotEncoderDecoder`: The one-hot encoder can be used for multi-class classification problems where the class is a single value in the `Y` variable. At encoding it will turn each value of `Y` into a one-hot vector of size `C` (where `C = len(classes)`) with only one element being `1.0`. At decoding it will translate a probability vector to a unique value using hard-max.
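The mechanics of one-hot encoding and hard-max decoding can be illustrated with plain NumPy. This follows the (C, m) column-per-sample convention used elsewhere in the README, but it is only a sketch of the idea, not the `OneHotEncoderDecoder` API.

```python
import numpy as np

classes = np.array([0, 1, 2])        # C = 3 classes
Y = np.array([2, 0, 1, 2])           # labels of m = 4 examples

# Encoding: one-hot matrix of shape (C, m), a single 1.0 per column
Y_encoded = (Y == classes[:, None]).astype(float)

# Decoding: hard-max over the class axis maps probabilities back to labels
Y_proba = np.array([[0.1, 0.7, 0.2, 0.3],
                    [0.3, 0.2, 0.6, 0.3],
                    [0.6, 0.1, 0.2, 0.4]])
Y_decoded = classes[np.argmax(Y_proba, axis=0)]   # -> [2, 0, 1, 2]
```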
MyNN only works on Python >= 3.6. It is recommended to use `virtualenv` to perform the installation.
```bash
$ virtualenv -ppython3.6 venv
$ source venv/bin/activate
```
You can install the dependencies using the `requirements.txt` file:
```bash
$ pip install -r requirements.txt
```