-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[optional] support loading and saving models with protobuf #908
Conversation
@wxchan |
LICENSE: https://github.com/google/protobuf/blob/master/LICENSE I add a test and it passes on both linux and osx, I am not sure about windows. test on osx: https://travis-ci.org/wxchan/LightGBM/jobs/274043104, it takes too long. |
@wxchan what is the compression ratio of using protobuf, compared with raw text format ? I think using protobuf is too heavy, can we just use a simple binary format (like the dataset bin file) ? |
@guolinke I didn't test on big model yet. My computer takes too long to load big datasets. The good thing of protobuf is, it's readable through .proto file. We can easily know what features are saved. |
@guolinke I made a test on Higgs dataset text ->proto |
.travis.yml
Outdated
- os: osx | ||
env: TASK=proto | ||
- os: osx | ||
env: TASK=regular PYTHON_VERSION=2.7 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please check #904 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@henry0312 I saw. Do you have an agreement yet? I can add it later, I think it's a waste of resource to run those python tests I didn't alter multiple time now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you have an agreement yet?
No, I don't know if @guolinke agrees with it yet, but I think we should run tests with Python 2.7.x and Python 3.6.x at least.
I think it's a waste of resource to run those python tests I didn't alter multiple time now.
I got it!
Yeah, Travis always faces with lacks of instances of OSX.
We have no choice but to reduce test cases for OSX 😢
one similar thing: https://github.com/dmlc/treelite/blob/master/src/tree.proto |
@wxchan any updates of this ? |
@wxchan it seems that the treelite is like our if-else solution ? |
@hcho3 |
@wxchan No, in treelite, protobuf is only for input not for output. (Protobuf is one of four methods to input a model into treelite compiler.) So LightGBM model isn't being translated into protobuf. Not sure how effective protobuf is for compression. |
I found useful information: https://github.com/thekvs/cpp-serializers |
@guolinke I already finished this thread. |
@wxchan this PR will break the build of VS. I will revert it and merge back after fixing this. |
I think I forgot changing some path of file in windows/LightGBM.vcxproj after moving it, is this the reason? |
@wxchan it cannot find some ".h" and ".cpp" files, seems are some the auto-generated protobuf files. |
@guolinke maybe this one https://github.com/wxchan/LightGBM/blob/1e6091c01f75cbbc86b55b104ae3231d302a0a9f/src/proto/not_implemented/model.pb.h? what's the error message? |
@wxchan |
@guolinke yes, I forgot adding it. btw, there are some files for when proto=on, some files for when proto=off, they are controlled by cmake, should I add them all to windows/LightGBM.vcxproj? |
@wxchan it seems VS cannot support proto, so we can just disable it |
related to #372