deep-scratch

This repo is part of an attempt to develop various neural network models from scratch in Python and to provide alternative implementations of them for devices with CUDA-enabled GPUs.

dependencies

  • Numpy 1.20.3
  • PyCUDA 2021.1

installation

pip install numpy pycuda

description

activations.py

common activation functions (e.g. sigmoid, tanh, ReLU, softmax) and their derivatives
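
for reference, a minimal sketch of what such functions typically look like in NumPy; the names and exact signatures below are illustrative and may not match activations.py exactly:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_prime(z):
    s = sigmoid(z)
    return s * (1.0 - s)

def relu(z):
    return np.maximum(0.0, z)

def relu_prime(z):
    return (z > 0).astype(z.dtype)

def softmax(z):
    # subtract the column-wise max for numerical stability
    e = np.exp(z - np.max(z, axis=0, keepdims=True))
    return e / np.sum(e, axis=0, keepdims=True)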

layer.py

implements a layer in an artificial neural network. supports common weight initialization schemes like glorot normal and glorot uniform.
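
a rough sketch of how these initialization schemes are usually computed (the function name init_weights is illustrative, not necessarily the API exposed by layer.py):

import numpy as np

def init_weights(fan_in, fan_out, scheme='glorot-uniform'):
    # weight matrix shaped (fan_out, fan_in), initialized per scheme
    if scheme == 'glorot-uniform':
        limit = np.sqrt(6.0 / (fan_in + fan_out))
        return np.random.uniform(-limit, limit, size=(fan_out, fan_in))
    elif scheme == 'glorot-normal':
        std = np.sqrt(2.0 / (fan_in + fan_out))
        return np.random.randn(fan_out, fan_in) * std
    elif scheme == 'he':
        return np.random.randn(fan_out, fan_in) * np.sqrt(2.0 / fan_in)
    else:  # 'randn': plain random normal with mean 0
        return np.random.randn(fan_out, fan_in)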

loss.py

implements common loss functions (e.g. binary cross-entropy, MSE, categorical cross-entropy) and their derivatives
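
as an illustration, binary cross-entropy and MSE with their derivatives typically take this form in NumPy (names here are illustrative, not necessarily those used in loss.py):

import numpy as np

def binary_crossentropy(y_true, y_pred, eps=1e-12):
    # clip predictions to avoid log(0)
    y_pred = np.clip(y_pred, eps, 1 - eps)
    return -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))

def binary_crossentropy_prime(y_true, y_pred, eps=1e-12):
    # element-wise derivative with respect to y_pred
    y_pred = np.clip(y_pred, eps, 1 - eps)
    return (y_pred - y_true) / (y_pred * (1 - y_pred))

def mse(y_true, y_pred):
    return np.mean((y_pred - y_true) ** 2)

def mse_prime(y_true, y_pred):
    return 2 * (y_pred - y_true) / y_true.size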

network.py

contains the implementation of the ANN model with support for vector output. batching schemes such as batch gradient descent, stochastic gradient descent and mini-batch gradient descent are implemented, along with 2nd-moment-based optimizers (adagrad, RMSprop) and Adam, which uses both 1st and 2nd moments.
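
for reference, a single Adam update step roughly follows this form (a sketch, not the repo's exact code; m and v are the running 1st and 2nd moment estimates and t is the step count starting at 1):

import numpy as np

def adam_update(w, grad, m, v, t, alpha=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    # update biased moment estimates
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad ** 2
    # bias correction
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    # parameter update
    w = w - alpha * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v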

documentation

model = network.Network(inputDim=inputDim, initializationScheme=initializationScheme)

      inputDim: (int) the number of features in input (also the number of neurons in input layer)
      initializationScheme: (string) default: randn. can be any of the values below:
            1. randn: random normal with mean 0
            2. he : he initialization
            3. glorot-normal
            4. glorot-uniform

model.addLayer(dim=dim, activation=activation)

      dim: (int) the number of neurons in the layer
      activation: (string) the activation function of the layer. can be any one of the following:
            1. relu
            2. tanh
            3. sigmoid
            4. leaky-relu
            5. softmax

model.compile(loss=loss_func, optimizer=optimizer, batch_type=batch_type, batch_size=batch_size)

      loss: (string) the loss function of the network. default: binary-crossentropy. can be any one of the following:
            1. binary-crossentropy
            2. mse
            3. cat-crossentropy
      optimizer: (string) the gradient descent optimizer for reducing loss. default: gd. can be any one of the following:
            1. gd: gradient descent
            2. adagrad
            3. rmsprop
            4. adam
      batch_type: (string) the batching for gradient descent. default: bgd. can be any one of the following:
            1. bgd: batch gradient descent
            2. sgd: stochastic gradient descent
            3. mbgd: mini-batch gradient descent
      batch_size: (int) batch size if mbgd is used for batching

model.train(X_train, y_train, epochs=epochs, alpha=alpha, verbose=verbose)

      epochs: (int) number of epochs to train the neural network. default: 100
      alpha: (float) learning rate for gradient descent. default: 0.1
      verbose: (boolean) default: false

pred, accuracy = model.predict(X, y)

returns

  1. pred: (numpy array) the prediction matrix
  2. accuracy: (float) the accuracy of the prediction (if the target variable is categorical)

overall code to train and test:

model = network.Network(inputDim=4, initializationScheme='glorot-uniform')
model.addLayer(dim = 3, activation='sigmoid')
model.addLayer(dim = 1, activation='sigmoid')
model.compile(optimizer='adam', batch_type="mbgd", batch_size=32)
model.train(X_train.T, y_train.T, alpha=0.1, epochs = 100)
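
to evaluate on held-out data, predict can then be called on the test split (X_test and y_test are assumed here and transposed like the training data):

pred, accuracy = model.predict(X_test.T, y_test.T)
print('test accuracy:', accuracy)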

results

using a 2-layer ANN with 300 hidden units and 784 input units (28 x 28 images)

  • achieved 97.78% accuracy over test set on training over grayscale images of handwritten letters
  • achieved around 96% accuracy over mnist digits dataset
