This workflow allows you to train and optimize a neural network on an arbitrary data set for regression and binary classification problems.
The training data, however, must be aggregated in the correct form.
The path to the data is controlled from the config as follows:
dataDir = Configs.defPath / Configs.instrument / Configs.pathLabel
For each date there should be a separate folder in the 'dataDir' directory.
For example, dataDir should contain folders such as "2019-01-01", "2019-01-02", "2019-01-03", etc.
Each of these folders should contain two files:
- indicators.npy - data set with features
- targets.npy - data set with targets.
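For illustration, a minimal sketch of writing one day's folder in this layout (the save_day helper, the example path "data/EURUSD/v1", and the array shapes are hypothetical, not part of the workflow):

import os
import numpy as np

def save_day(data_dir, day, indicators, targets):
    """Save one day's features and targets in the layout the trainer expects."""
    day_dir = os.path.join(data_dir, day)  # e.g. <dataDir>/2019-01-01
    os.makedirs(day_dir, exist_ok=True)
    np.save(os.path.join(day_dir, "indicators.npy"), indicators)  # features, one row per sample
    np.save(os.path.join(day_dir, "targets.npy"), targets)        # targets, one value per sample

# hypothetical example: 10000 samples with 25 features and binary targets for one day
save_day("data/EURUSD/v1", "2019-01-01",
         np.random.rand(10000, 25).astype(np.float32),
         np.random.randint(0, 2, size=10000).astype(np.float32))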
There are two ways to train the model, depending on the size of the input data set.
If there is plenty of free RAM, you can load all the data into memory (batchTraining = False).
If the amount of training data is very large, you can train in parts (batchTraining = True).
If batchTraining = True, one day (i.e., one file pair) is used per batch, as sketched below.
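A rough sketch of the two loading modes, assuming the folder layout above (the function names are hypothetical; the real logic lives in trainDNN.py):

import os
import numpy as np

def day_dirs(data_dir, start_day, end_day):
    """Day folders between startDay and endDay; ISO date names sort chronologically."""
    return [os.path.join(data_dir, d) for d in sorted(os.listdir(data_dir))
            if start_day <= d <= end_day]

def load_all(data_dir, start_day, end_day):
    """batchTraining = False: concatenate every day into memory at once."""
    dirs = day_dirs(data_dir, start_day, end_day)
    X = np.concatenate([np.load(os.path.join(d, "indicators.npy")) for d in dirs])
    y = np.concatenate([np.load(os.path.join(d, "targets.npy")) for d in dirs])
    return X, y

def iter_days(data_dir, start_day, end_day):
    """batchTraining = True: yield one day (one file pair) per training batch."""
    for d in day_dirs(data_dir, start_day, end_day):
        yield (np.load(os.path.join(d, "indicators.npy")),
               np.load(os.path.join(d, "targets.npy")))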
Requirements:
- python 3.6
- pandas
- numpy
- keras
- tensorflow
- sklearn
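One possible way to install the dependencies, assuming standard pip package names: pip install pandas numpy keras tensorflow scikit-learn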
Config parameters:
- startDay - start day for training in the dataDir folder
- endDay - end day for training in the dataDir folder
- defPath - base path used to build dataDir (see above)
- key - label that will be added to the results folder name
- instrument - instrument subfolder appended to defPath when building dataDir
- pathLabel - path label subfolder appended after instrument when building dataDir
- target - target column name
- test2train - test-to-train split ratio
- interleaveSplit - use an interleaved test-to-train split (every three consecutive days go to train, every 4th day to test); see the sketch after this list
- trainLabel - label that will be added to the model
- batchTraining - whether to use batch training
- batchNormalization - whether to use batch normalization
- saveMeansStd - whether to save means and standard deviations
- predictionTask - classification or regression task
- maxEvals - number of hyperparameter evaluations
- earlyStop - early stopping for the NN: if there is no improvement for more than earlyStop epochs, training stops
- earlyStop_minDelta - an improvement (prev - current) larger than this delta counts as an improvement
- fImportanceFile - feature importance file, if it exists
- useTopImportantFeatures - number of best features from fImportanceFile to use in training ("all", or a number > 0)
- useTensorBoard - whether to log training to TensorBoard (e.g., useTensorBoard = False)
- logLevel - with logLevel=2 the full training history is not shown; output appears only at the end of each epoch
- batchTrMaxBS - maximum batch size for batch training (e.g., batchTrMaxBS = 1)
- clth - if clth = 0, the regression target is rescaled to 1 if > 0, else 0
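As referenced above, a minimal sketch of the interleaveSplit pattern (interleave_split is a hypothetical name, not the function used in trainDNN.py):

def interleave_split(days):
    """Interleaved split: every three consecutive days go to train, every 4th day to test."""
    train_days = [d for i, d in enumerate(days) if (i + 1) % 4 != 0]
    test_days  = [d for i, d in enumerate(days) if (i + 1) % 4 == 0]
    return train_days, test_days

# days 1-3 -> train, day 4 -> test, days 5-7 -> train, day 8 -> test, ...
train, test = interleave_split(["2019-01-01", "2019-01-02", "2019-01-03", "2019-01-04",
                                "2019-01-05", "2019-01-06", "2019-01-07", "2019-01-08"])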
The hyperparameter search space is defined by hpSpace:

hpSpace = {
    'epochs': [10000],                                          # number of epochs to train
    'hu': [100, 150, 200, 250, 300, 350, 400],                  # number of hidden units in the first hidden layer
    'hu2': [0],                                                 # number of hidden units in the second hidden layer
    'dropout': [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9],   # dropout rate
    'act1': ['sigmoid'],                                        # activation of the first hidden layer
    'act2': ['sigmoid'],                                        # activation of the second hidden layer
    'optimizer': ['adam'],
    'loss': ['binary_crossentropy'],
    'bSize': [50000]                                            # batch size; if batchTraining = True, batch size = 1 day
}
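To illustrate how one combination drawn from hpSpace could map onto a model, a hedged sketch (build_model is hypothetical; the actual architecture is defined in trainDNN.py):

from keras.models import Sequential
from keras.layers import Dense, Dropout

def build_model(n_features, hp):
    """Build a feed-forward NN from one hyperparameter combination."""
    model = Sequential()
    model.add(Dense(hp['hu'], activation=hp['act1'], input_shape=(n_features,)))
    model.add(Dropout(hp['dropout']))
    if hp['hu2'] > 0:                                  # optional second hidden layer
        model.add(Dense(hp['hu2'], activation=hp['act2']))
    model.add(Dense(1, activation='sigmoid'))          # binary classification head
    model.compile(optimizer=hp['optimizer'], loss=hp['loss'])
    return model

# one sample from hpSpace (assuming 25 input features)
model = build_model(25, {'hu': 200, 'hu2': 0, 'dropout': 0.3,
                         'act1': 'sigmoid', 'act2': 'sigmoid',
                         'optimizer': 'adam', 'loss': 'binary_crossentropy'})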
Check the Configs class in trainDNN.py.
If you want to run multiple targets, check the targets list in trainDNN.py before the 'for' loop.
If you want to add your own loss function, you can add it to lossFunctions.py and then specify it in the trainDNN.py Configs class, for example:
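A minimal sketch of what such a custom loss in lossFunctions.py could look like (the 2x positive-class weighting here is purely illustrative):

import keras.backend as K

def weighted_binary_crossentropy(y_true, y_pred):
    """Binary cross-entropy that weights positive samples twice as much (illustrative)."""
    y_pred = K.clip(y_pred, K.epsilon(), 1.0 - K.epsilon())   # avoid log(0)
    loss = -(2.0 * y_true * K.log(y_pred) + (1.0 - y_true) * K.log(1.0 - y_pred))
    return K.mean(loss)

You would then refer to this function instead of 'binary_crossentropy'.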
Run training: python trainDNN.py
Run TensorBoard (from the directory containing trainDNN.py): tensorboard --logdir='TensorBoardLogs/' --host=10.12.1.59 --port=8999 (check that the port is not busy)