Under the `PGPortfolio/pgportfolio` directory, there is a JSON file called `net_config.json`, which holds all the configuration of the agent and can be modified outside the program code.

- `"layers"`
    - list of the layers of the CNN, including the output layer
    - `"type"`
        - domain is {"ConvLayer", "FullyLayer", "DropOut", "MaxPooling", "AveragePooling", "LocalResponseNormalization", "SingleMachineOutput", "LSTMSingleMachine", "RNNSingleMachine"}
    - `"filter shape"`
        - shape of the filter (kernel) of the convolution layer
- `"input"`
    - `"window_size"`
        - number of columns of the input matrix
    - `"coin_number"`
        - number of rows of the input matrix
    - `"feature_number"`
        - number of features (just like RGB channels in computer vision)
        - domain is {1, 2, 3}
        - 1 means the feature is ["close"], the last price of each period
        - 2 means the features are ["close", "volume"]
        - 3 means the features are ["close", "high", "low"]
- `"input"`
    - `"start_date"`
        - start date of the global data matrix
        - format is yyyy/MM/dd
    - `"end_date"`
        - end date of the global data matrix
        - format is yyyy/MM/dd
        - the performance can vary a lot over different time ranges
    - `"volume_average_days"`
        - number of days of volume used to select the coins
    - `"test_portion"`
        - portion of backtest data, ranging from 0 to 1; the rest is training data
    - `"global_period"`
        - trading period and the period of prices in the input window
        - should be a multiple of 300 (seconds)
    - `"coin_number"`
        - number of assets to be traded
        - does not include cash (i.e. BTC)
    - `"online"`
        - if it is not online, the program will select coins and generate inputs from the local database
        - if it is online, new data that does not exist in the database will be saved
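To make the nesting of these keys concrete, here is a small, purely illustrative sketch of the two sections in Python. The key names come from the descriptions above, but the placeholder values and the exact per-layer fields are assumptions, so treat the shipped `net_config.json` as the authoritative reference.

```python
import json

# Purely illustrative sketch of the configuration structure described above.
# Layer types are taken from the domain listed for "type"; the placeholder
# values and any additional per-layer fields required by the real
# net_config.json are assumptions, not documentation.
example_config = {
    "layers": [
        {"type": "ConvLayer", "filter shape": [1, 3]},  # convolution over the time axis
        {"type": "DropOut"},
        {"type": "SingleMachineOutput"},                # output layer
    ],
    "input": {
        "window_size": 50,           # columns of the input matrix
        "coin_number": 11,           # rows of the input matrix; excludes cash (BTC)
        "feature_number": 3,         # ["close", "high", "low"]
        "start_date": "2016/01/01",  # yyyy/MM/dd
        "end_date": "2017/01/01",
        "volume_average_days": 30,
        "test_portion": 0.08,        # fraction of the data reserved for backtesting
        "global_period": 1800,       # trading period; a multiple of 300 seconds
        "online": True,              # save data missing from the local database
    },
}

# The real file can be inspected the same way any JSON file is loaded:
# with open("pgportfolio/net_config.json") as f:
#     config = json.load(f)
print(json.dumps(example_config, indent=2))
```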
- First, modify the `PGPortfolio/pgportfolio/net_config.json` file.
- Make sure the current directory is under `/PGPortfolio/` and type `python main.py --mode=generate --repeat=1`
    - this will make 1 subfolder under the `train_package` directory
    - in each subfolder, there is a copy of the `net_config.json`
    - in `--repeat=n`, n can be any positive integer; it creates n subfolders whose random seeds run from 0 to n-1 sequentially
    - notably, the random seed can also have a large effect on performance
- Type `python main.py --mode=train --processes=1`
    - this will start training the n folders created just now, one by one
    - do not start more than 1 process if you want to download data online
    - `--processes=n` means start n processes running in parallel
    - add `--device=gpu` if your tensorflow supports GPU
    - on a GTX 1080 Ti you should be able to run 4-5 training processes together
    - on a GTX 1060 you should be able to run 2-3 trainings together
- Each training process is made up of 2 stages:
    - Pre-training, log example:
INFO:root:average time for data accessing is 0.00070324587822
INFO:root:average time for training is 0.0032548391819
INFO:root:==============================
INFO:root:step 3000
INFO:root:------------------------------
INFO:root:the portfolio value on test set is 2.24213
log_mean is 0.00029086
loss_value is -0.000291
log mean without commission fee is 0.000378
INFO:root:==============================
    - Backtest with rolling train, log example:
DEBUG:root:==============================
INFO:root:the step is 1433
INFO:root:total assets are 17.732482 BTC
- After that, check the result summary of the training in `train_package/train_summary.csv` (a small sketch for inspecting this file follows below).
- Tune the hyper-parameters based on the summary, and go back to the first step again.
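Since `train_summary.csv` is a plain CSV file, a quick way to compare runs during tuning is to load it with Python's standard library. The sketch below assumes nothing about the column names, which depend on the program version; the path is relative to the working directory and may need adjusting.

```python
import csv

# Minimal sketch for inspecting the training summary while tuning.
# Adjust the path to wherever your train_package folder lives.
with open("train_package/train_summary.csv", newline="") as f:
    rows = list(csv.DictReader(f))

if rows:
    print("columns:", ", ".join(rows[0].keys()))
    for row in rows:
        print(row)
else:
    print("no training runs summarized yet")
```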
There are three types of logging of each training.
- In each subfolder:
    - There is a text file called `programlog`, which is the log generated by the running program.
    - There is a `tensorboard` folder that saves data about the training process, which can be viewed with tensorboard.
        - type `tensorboard --logdir=train_package/1` to use tensorboard
- The summary information of this training, including the network configuration, portfolio value on the validation set and test set, etc., will be saved in `train_summary.csv` under the `train_package` folder.
- The trained weights of the network are saved in `train_package/1`, named `netfile` (including 3 files).
- Type `python main.py --mode=download_data` to download data without starting training.
    - The program will use the configuration in `PGPortfolio/pgportfolio/net_config.json` to select coins and download the data necessary to train the network.
    - The downloading speed can be very slow and sometimes even produces errors in China.
- For those who can't download data, please check the first release, where there is a `Data.db` file. Put it in the database folder, make sure `online` in `input` in `net_config.json` is set to `false` (a small helper sketch for this follows below), and run the example.
    - Note that when using this file, you shouldn't make any changes to the input data configuration (for example `start_date`, `end_date` or `coin_number`), otherwise incorrect results might be presented.
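If you prefer not to edit the file by hand, a small script along these lines could flip the flag. It relies only on the `online` key inside the `input` section mentioned above; the path is an assumption that depends on your working directory.

```python
import json

# Set "online" to false in net_config.json, as required when running
# against the pre-built Data.db file instead of downloading data.
config_path = "pgportfolio/net_config.json"  # adjust to your working directory

with open(config_path) as f:
    config = json.load(f)

config["input"]["online"] = False

with open(config_path, "w") as f:
    json.dump(config, f, indent=4)
```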
Note: Before back-testing, you need to successfully finish training of the algo first.
- Type `python main.py --mode=backtest --algo=1` to execute a backtest with rolling train (i.e. online learning in supervised learning) on the target model.
    - `--algo` could be either the name of a traditional method or the index of a training folder
OLPS summary:
- Type `python main.py --mode=plot --algos=crp,olmar,1 --labels=crp,olmar,nnagent`, for example, to plot.
    - `--algos` could be the names of the tdagent algorithms or the index of the nnagent
    - `--labels` are the names of the related algorithms that will be shown in the legend
    - the result is a plot of the portfolio values of the selected algorithms
- Type `python main.py --mode=table --algos=1,olmar,ons --labels=nntrader,olmar,ons`
    - `--algos` and `--labels` are the same as in the plotting case
    - result:
|          | average  | max drawdown | negative day | negative periods | negative week | portfolio value | positive periods | positive day | positive week | sharpe ratio |
|----------|----------|--------------|--------------|------------------|---------------|-----------------|------------------|--------------|---------------|--------------|
| nntrader | 1.001311 | 0.225874     | 781          | 1378             | 114           | 25.022516       | 1398             | 1995         | 2662          | 0.074854     |
| olmar    | 1.000752 | 0.604886     | 1339         | 1451             | 1217          | 4.392879        | 1319             | 1437         | 1559          | 0.035867     |
| ons      | 1.000231 | 0.217216     | 1144         | 1360             | 731           | 1.770931        | 1416             | 1632         | 2045          | 0.032605     |
- Use the `--format` argument to change the format of the table; it can be `raw`, `html`, `csv` or `latex`. The default is `raw`.
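For readers who want to sanity-check the columns above, the sketch below computes the usual definitions of portfolio value, average period return, maximum drawdown and Sharpe ratio from a series of per-period change factors. These are standard formulas, not necessarily the exact conventions the program uses to build its table.

```python
import math

def summarize(period_returns):
    """Standard metric definitions; the program's exact conventions may differ."""
    # cumulative portfolio value after each period
    values = []
    total = 1.0
    for r in period_returns:       # r is the change factor of one period, e.g. 1.002
        total *= r
        values.append(total)

    # maximum drawdown: largest relative drop from a running peak
    peak, max_dd = values[0], 0.0
    for v in values:
        peak = max(peak, v)
        max_dd = max(max_dd, (peak - v) / peak)

    mean = sum(period_returns) / len(period_returns)
    var = sum((r - mean) ** 2 for r in period_returns) / len(period_returns)
    sharpe = (mean - 1.0) / math.sqrt(var) if var > 0 else float("nan")

    return {"portfolio value": total, "average": mean,
            "max drawdown": max_dd, "sharpe ratio": sharpe}

print(summarize([1.002, 0.999, 1.003, 1.001]))
```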