I was doing a binary classification task on a relatively large dataset (2.5 GB, with 16 GB of RAM on my machine). At around the 500th round, one iteration is abnormal, causing the error on both the training set and the validation set to go up. These anomalies happen from time to time; sometimes simply changing a parameter such as bagging_fraction from 0.8 to 0.9 solves the problem, but it happens again if I add a new feature.
At first I thought it was a CPU core communication problem, but changing num_threads didn't seem to help.
As you can see in the picture, the highlighted output line is obviously abnormal. In the GBDT algorithm the training error should only decrease, never increase. However, this line is very different from the normal ones: its logloss is higher, and very different from the line before and the line after.
I have also seen this problem once with LightGBM on my Linux server. It is very disturbing and seriously affects the accuracy of my model. I wonder why this happens and how to solve it.
BTW: I wonder if this is a memory-related problem, because my data file is very large and sometimes causes a "MemoryError" when the code executes the lgb.train() line.
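In case it matters, here is a minimal sketch of how the memory pressure during Dataset construction could be reduced, assuming the data is loaded directly from a file (the path is hypothetical; `two_round` and `max_bin` are standard LightGBM parameters):

```python
import lightgbm as lgb

# Hypothetical file path; adjust to your data.
# 'two_round' makes LightGBM scan the file twice instead of
# holding it all in memory at once, and a smaller 'max_bin'
# shrinks the feature histograms it builds.
train_data = lgb.Dataset(
    'train.csv',
    params={
        'two_round': True,  # slower, but uses less memory
        'max_bin': 63,      # default is 255; fewer bins = less memory
    },
)
```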
Another occurrence, with different features and different parameters:
Also, the abnormal situation is NOT random: I can reproduce the result as many times as I want if I rerun my code.
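To show what I mean by "abnormal", here is a simplified sketch of how the spike can be detected programmatically. It assumes `train_data` and `valid_data` are `lgb.Dataset` objects and `params` is the dict below; strictly speaking, tiny increases in training logloss are possible with bagging_fraction < 1, but the spike I see is far larger than that:

```python
import lightgbm as lgb

# Record per-iteration metrics for both sets.
eval_result = {}
booster = lgb.train(
    params,
    train_data,
    num_boost_round=1000,
    valid_sets=[train_data, valid_data],
    valid_names=['train', 'valid'],
    callbacks=[lgb.record_evaluation(eval_result)],
)

# Flag any iteration where the *training* logloss goes up,
# which plain GBDT should essentially never do.
losses = eval_result['train']['binary_logloss']
for i in range(1, len(losses)):
    if losses[i] > losses[i - 1]:
        print('abnormal iteration %d: %.6f -> %.6f'
              % (i, losses[i - 1], losses[i]))
```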
Environment info
Operating System: Windows 10
Python Version: Python 2.7 (Anaconda) + LightGBM
Error Message:
My parameters:

```python
params = {
    'boosting_type': 'gbdt',
    'objective': 'binary',
    'metric': 'binary_logloss',
    'num_leaves': 96 - 1,      # 95 leaves per tree
    'learning_rate': 0.03,
    'feature_fraction': 0.6,   # sample 60% of features per tree
    'bagging_fraction': 0.8,   # sample 80% of rows
    'bagging_freq': 1,         # re-sample rows every iteration
    'verbose': 1
}
```
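Since changing bagging_fraction changes which rows each tree sees, one diagnostic I can think of (a sketch, not a fix; the seed values are arbitrary) is to hold everything above fixed and vary only the sampling seeds. If the abnormal iteration moves or disappears, it is tied to a particular sampled subset:

```python
import lightgbm as lgb

# 'train_data' is assumed to be an lgb.Dataset built elsewhere.
for seed in (1, 7, 42):
    trial = dict(params)
    trial['bagging_seed'] = seed           # controls row sampling
    trial['feature_fraction_seed'] = seed  # controls column sampling
    booster = lgb.train(trial, train_data, num_boost_round=600)
```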