-
Notifications
You must be signed in to change notification settings - Fork 4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enable load_from_bytes of model and ExtScorer #3331
base: master
Are you sure you want to change the base?
Conversation
No Taskcluster jobs started for this pull requestThe `allowPullRequests` configuration for this repository (in `.taskcluster.yml` on the
default branch) does not allow starting tasks for this pull request. |
this will be problematic for compatibility with some systems
Please introduce a new
I really don't see how this feature can work well with the tensorflow infra without forcing allocating huge strings |
native_client/tfmodelstate.cc
Outdated
MemmappedFileSystem::kMemmappedPackageDefaultGraphDef, | ||
&graph_def_); | ||
if (init_from_bytes){ | ||
// Need some help |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it makes any sense at all?
native_client/tflitemodelstate.cc
Outdated
std::cerr << "Error at reading model file " << model_path << std::endl; | ||
return DS_ERR_FAIL_INIT_MMAP; | ||
if (init_from_bytes){ | ||
fbmodel_ = tflite::FlatBufferModel::BuildFromBuffer(model_string.c_str(), model_string.size_t); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How does that works, performance-wise?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When I see this, much alloc: https://github.com/tensorflow/tensorflow/blob/r2.3/tensorflow/lite/model_builder.cc#L104-L115
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't performed any benchmark, but I haven't noticed any slow down (for loading the pb model). When loading the Scorer, the test seems a bit slower due to reading/converting more than 900MB.
When running for our Language Model (size<20MB), it runs smoothly
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, if you use a 20MB scorer, yeah, it might not be too bad, but I guess it can also depend on use device you target. My fear is people using that without understanding the consequences and then complaining about memory usage.
@bernardohenz I fail to get the usecase where this can be useful for mmap-able file formats (PBMM, TFLite), given how much this is going to force TensorFlow to allocate some memory. |
Does it? From looking at the implem, I'm unsure this can work without trashing memory of the device: https://github.com/tensorflow/tensorflow/blob/r2.3/tensorflow/lite/model_builder.cc#L104-L115 |
Also, when you introduce that, please try and expose it in as much bindings as you can. |
That's a good point, maybe we need to modify TensorFlow to take ownership of the pointer, or assume it'll outlive the session. |
That's much more invasive and risky, except if we can upstream that, but we know how long it can take. Depending on @bernardohenz use-case, maybe it makes sense to just expose that for non protobuf mmap files ? or even only for TFLite models ? |
@bernardohenz I see you have code already for TFLite, can you describe what kind of help you need? |
@lissyx I will be addressing the implementation of new methods for the API. I have no answer for using |
I want to make sure we don't put ourselves in a position where people will misuse and have unrealistic expectations
This does impact how C++ code handles strings, so maybe it depends on your env and in your case it makes sense? If you can share more of a stack, for example. |
@bernardohenz can you:
I could take a look and run CI on that, so we can cross-check |
Maybe this should be warned about in the docs: "using this feature can have negative impact on the amount of memory consumed by your application" |
@bernardohenz Your PR needs rebasing, it seems. |
@bernardohenz Please don't change file's mode from 644 to 755. |
@@ -69,45 +71,60 @@ void Scorer::setup_char_map() | |||
} | |||
} | |||
|
|||
int Scorer::load_lm(const std::string& lm_path) | |||
int Scorer::load_lm(const std::string& lm_string, bool load_from_bytes) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Either provide a default value, split the implementation in two load_lm
or update all call sites: this PR does not build because of load_lm
calls in generate_scorer_package.cpp
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've provided a default value (false) for the boolean
} | ||
|
||
int Scorer::load_trie(std::ifstream& fin, const std::string& file_path) | ||
int Scorer::load_trie(std::stringstream& fin, const std::string& file_path, bool load_from_bytes) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
stringstream
is compatible with ifstream
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as I tested yes
native_client/deepspeech.cc
Outdated
} | ||
|
||
int | ||
DS_CreateModel_(const std::string &aModelString, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure how this is implemented, we go straight from const char*
to std::string &
native_client/deepspeech.h
Outdated
* @return Zero on success, non-zero on failure. | ||
*/ | ||
DEEPSPEECH_EXPORT | ||
int DS_CreateModelFromBuffer(const std::string &aModelBuffer, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
std::string
here makes that C++
limited, the API is C
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll be changing all the strings to char *
. But I'll have to pass the buffer size as parameter (I was having problems when a buffer encountered the \0
in the middle of a buffer)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thats perfectly fine.
* @return Zero on success, non-zero on failure (invalid arguments). | ||
*/ | ||
DEEPSPEECH_EXPORT | ||
int DS_EnableExternalScorerFromBuffer(ModelState* aCtx, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same, this is limiting to C++
callers when we expose C
API.
@@ -17,9 +17,13 @@ DontBhiksha::DontBhiksha(const void * /*base*/, uint64_t /*max_offset*/, uint64_ | |||
const uint8_t kArrayBhikshaVersion = 0; | |||
|
|||
// TODO: put this in binary file header instead when I change the binary file format again. | |||
void ArrayBhiksha::UpdateConfigFromBinary(const BinaryFormat &file, uint64_t offset, Config &config) { | |||
void ArrayBhiksha::UpdateConfigFromBinary(const BinaryFormat &file, uint64_t offset, Config &config, bool load_from_bytes) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this be upstreamed ?
4726e4a
to
8113f98
Compare
@bernardohenz You should now be able to run PRs yourself. |
I've fixed the API functions exposition, tested the loading for TFLite. I've performed a small time benchmark for running the client (both for loading a .b and a .tflite): Loading a dummy .pb
Loading a .tflite
|
@bernardohenz There are still failures related to |
Just for the sake of completeness, can you add figures for before your changes?
Good, so now it's just about:
Right? |
@bernardohenz I see some merges, can you make sure this branch is clean of merges ? |
@lissyx yep, I'm trying to squash the commits, sorry. |
e42c7cb
to
5a5ef46
Compare
Just writing down some notes from my talk with Bernardo last week on the status of this PR:
|
5a5ef46
to
df40e22
Compare
This PR allows the loading of model and ExtScorer through array of bytes (
string
)Some observations:
D_GLIBCXX_USE_CXX11_ABI=1
when compiling the lib--init_from_bytes
) in the client for testing the loadingDS_CreateModel
for instance)