This repo is part of an effort to develop various neural network models from scratch in Python and to provide alternative implementations of them for devices with CUDA-enabled GPUs.
- NumPy 1.20.3
- PyCUDA 2021.1
pip install numpy pycuda
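To verify the installation and confirm that a CUDA device is visible (a quick sanity check, not part of the repo itself):

```python
import numpy as np
import pycuda.autoinit          # initializes the first available CUDA device
import pycuda.driver as drv

print("NumPy version:", np.__version__)
print("CUDA device:", drv.Device(0).name())
```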
Implements common activation functions (e.g. sigmoid, tanh, ReLU, softmax) and their derivatives.
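As a rough sketch (the function names here are illustrative, not necessarily the ones used in the module), two of these activations and their derivatives can be written in NumPy as:

```python
import numpy as np

def sigmoid(z):
    # logistic sigmoid, squashes values into (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_derivative(z):
    s = sigmoid(z)
    return s * (1.0 - s)

def relu(z):
    # rectified linear unit: elementwise max(0, z)
    return np.maximum(0.0, z)

def relu_derivative(z):
    return (z > 0).astype(z.dtype)
```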
Implements a layer in an artificial neural network. Supports common weight initialization schemes such as Glorot normal and Glorot uniform.
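For reference, a minimal NumPy sketch of Glorot (Xavier) initialization for a weight matrix of shape (fan_out, fan_in); the shapes and function names are assumptions, not the module's actual interface:

```python
import numpy as np

def glorot_uniform(fan_in, fan_out):
    # Glorot/Xavier uniform: U(-limit, limit) with limit = sqrt(6 / (fan_in + fan_out))
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return np.random.uniform(-limit, limit, size=(fan_out, fan_in))

def glorot_normal(fan_in, fan_out):
    # Glorot/Xavier normal: N(0, sigma^2) with sigma = sqrt(2 / (fan_in + fan_out))
    sigma = np.sqrt(2.0 / (fan_in + fan_out))
    return np.random.randn(fan_out, fan_in) * sigma
```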
Implements common loss functions (e.g. binary cross-entropy, MSE, categorical cross-entropy) and their derivatives.
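A minimal sketch of two of these losses in NumPy (names and argument order are illustrative):

```python
import numpy as np

def binary_crossentropy(y_true, y_pred, eps=1e-12):
    # clip predictions to avoid log(0)
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    return -np.mean(y_true * np.log(y_pred) + (1.0 - y_true) * np.log(1.0 - y_pred))

def mse(y_true, y_pred):
    # mean squared error
    return np.mean((y_true - y_pred) ** 2)

def mse_derivative(y_true, y_pred):
    # gradient of MSE with respect to y_pred
    return 2.0 * (y_pred - y_true) / y_true.size
```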
Contains the implementation of the ANN model with support for vector output. Batching schemes for gradient descent (batch, stochastic, and mini-batch gradient descent) have been implemented. Second-moment-based optimizers such as Adagrad and RMSProp have been implemented, as well as Adam, which uses both first and second moments.
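For intuition, a sketch of the Adam update for a single parameter matrix (variable names are illustrative and do not reflect the repo's internals):

```python
import numpy as np

def adam_update(w, grad, m, v, t, alpha=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    # m: running first moment (mean of gradients), v: running second moment
    # t: 1-based timestep used for bias correction
    m = beta1 * m + (1.0 - beta1) * grad
    v = beta2 * v + (1.0 - beta2) * grad ** 2
    # bias-corrected moment estimates
    m_hat = m / (1.0 - beta1 ** t)
    v_hat = v / (1.0 - beta2 ** t)
    # parameter update
    w = w - alpha * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v
```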
model = network.Network(inputDim=inputDim, initializationScheme=initializationScheme)
inputDim: (int) the number of input features (also the number of neurons in the input layer)
initializationScheme: (string) default: randn. can be any of the values below:
1. randn: random normal with mean 0
2. he: He initialization
3. glorot-normal: Glorot (Xavier) normal initialization
4. glorot-uniform: Glorot (Xavier) uniform initialization
model.addLayer(dim=dim, activation=activation)
dim: (int) the number of neurons (units) in the layer
activation: (string) the activation function of the layer. can be any one of the following:
1. relu
2. tanh
3. sigmoid
4. leaky-relu
5. softmax
model.compile(loss=loss_func, optimizer=optimizer, batch_type=batch_type, batch_size=batch_size)
loss: (string) the loss function of the network. default: binary-crossentropy. can be any one of the following:
1. binary-crossentropy
2. mse
3. cat-crossentropy
optimizer: (string) the gradient descent optimizer for reducing loss. default: gd. can be any one of the following:
1. gd: gradient descent
2. adagrad
3. rmsprop
4. adam
batch_type: (string) the batching for gradient descent. default: bgd. can be any one of the following:
1. bgd: batch gradient descent
2. sgd: stochastic gradient descent
3. mbgd: mini-batch gradient descent
batch_size: (int) batch size if mbgd is used for batching
model.train(X_train, y_train, epochs=epochs, alpha=alpha, verbose=verbose)
epochs: (int) number of epochs to train the neural network. default: 100
alpha: (float) the learning rate for gradient descent. default: 0.1
verbose: (boolean) default: false
pred, accuracy = model.predict(X, y)
returns
- pred: (numpy array) the prediction matrix
- accuracy: (float) the accuracy of the predictions (if the target variable is categorical)
overall code to train and test:
model = network.Network(inputDim=4, initializationScheme='glorot-uniform')
model.addLayer(dim=3, activation='sigmoid')
model.addLayer(dim=1, activation='sigmoid')
model.compile(optimizer='adam', batch_type='mbgd', batch_size=32)
model.train(X_train.T, y_train.T, alpha=0.1, epochs=100)
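To complete the test step, predictions and accuracy can then be obtained with model.predict (X_test and y_test are assumed here to be prepared in the same transposed layout as the training data):

```python
# assumes X_test and y_test follow the same transposed layout as X_train and y_train
pred, accuracy = model.predict(X_test.T, y_test.T)
```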
Using a 2-layer ANN with 300 hidden units and 784 input units (28 x 28 images), a sketch of which is shown after the results below:
- achieved 97.78% accuracy on the test set when trained on grayscale images of handwritten letters
- achieved around 96% accuracy on the MNIST digits dataset
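With the API above, such a network could be set up roughly as follows; the output layer size, activations, loss, and batching shown here are assumptions for a 10-class digit task, not taken from the repo:

```python
# X_train/y_train are assumed to be the flattened 784-feature images and their labels
model = network.Network(inputDim=784, initializationScheme='glorot-uniform')
model.addLayer(dim=300, activation='relu')       # hidden layer with 300 units
model.addLayer(dim=10, activation='softmax')     # one output unit per digit class (assumed)
model.compile(loss='cat-crossentropy', optimizer='adam', batch_type='mbgd', batch_size=32)
model.train(X_train.T, y_train.T, alpha=0.1, epochs=100)
```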