Adding function: Allow users to designate the number of batches #320
Conversation
modified: src/libclipper/include/clipper/containers.hpp
modified: src/libclipper/include/clipper/redis.hpp
modified: src/libclipper/src/containers.cpp
modified: src/libclipper/src/redis.cpp
modified: src/management/src/management_frontend.hpp
Can one of the admins verify this patch?
OK to test.
jenkins ok to test
Test FAILed.
@Unous1996 There's an issue with the jenkins cluster right now; it's not your code that's causing the tests to fail. Hang tight, working to get the issue resolved.
OK. So is there anything I need to do right now?
batch_size : int, optional
    The user defined batch size.
    The default value is -1, which means that the user did not define a specific batch size.
Replace lines 284-285 with: "If the default value of -1 is used, Clipper will adaptively calculate the batch size for individual replicas of this model."
@@ -278,7 +279,10 @@ def build_and_deploy_model(self,
    The number of replicas of the model to create. The number of replicas
    for a model can be changed at any time with
    :py:meth:`clipper.ClipperConnection.set_num_replicas`.

batch_size : int, optional
    The user defined batch size.
user defined >> user-defined
batch_size : int, optional
    The user defined batch size.
    The default value is -1, which means that the user did not define a specific batch size.
Update the documentation for this method in the same way as for build_and_deploy_model.
@@ -436,6 +445,10 @@ def deploy_model(self,
    The number of replicas of the model to create. The number of replicas
    for a model can be changed at any time with
    :py:meth:`clipper.ClipperConnection.set_num_replicas`.
batch_size : int, optional
    The user defined batch size.
    The default value is -1, which means that the user did not define a specific batch size.
Update the documentation for this method in the same way as for build_and_deploy_model.
'''
self.cm.deploy_model(name=name, version=version, input_type=input_type, image=image, batch_size=batch_size)
Add the num_replicas keyword argument.
src/libclipper/src/containers.cpp
Outdated
@@ -108,8 +138,17 @@ void ActiveContainers::add_container(VersionedModelId model, int connection_id,
    model.get_name(), model.get_id(), connection_id,
    replica_id, get_readable_input_type(input_type));
boost::unique_lock<boost::shared_mutex> l{m_};

// Set a default batch size of 1
int batch_size = 1;
The default value should be -1 if the user has not defined a batch size for the container's associated model.
src/libclipper/src/containers.cpp
Outdated
int batch_size = 1;
auto batch_size_search = batch_sizes_.find(model);
if(batch_size_search != batch_sizes_.end()) {
  std::cout<<"MODEL_FOUND"<<std::endl;
Remove print statement
@@ -670,6 +670,7 @@ class RequestHandler {
InputType input_type = clipper::parse_input_type(input_type_raw);
std::string container_name = get_string(d, "container_name");
std::string model_data_path = get_string(d, "model_data_path");
boost::optional<int> batch_size = get_int(d, "batch_size");
Add a check to ensure that batch_size is either -1 or a positive integer, and throw a ManagementOperationError if this condition is not met.
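A minimal sketch of the suggested check. The exception class below is a hypothetical stand-in for Clipper's ManagementOperationError, and the helper name validate_batch_size is invented for illustration:

```cpp
#include <stdexcept>
#include <string>

// Hypothetical stand-in for Clipper's ManagementOperationError
class ManagementOperationError : public std::runtime_error {
 public:
  explicit ManagementOperationError(const std::string& msg)
      : std::runtime_error(msg) {}
};

// batch_size must be either -1 (adaptive batching) or a positive integer.
// Note that a single `batch_size < 0` check would wrongly accept 0, which
// is the case the review comment below calls out.
void validate_batch_size(int batch_size) {
  if (batch_size != -1 && batch_size <= 0) {
    throw ManagementOperationError(
        "batch_size must be -1 or a positive integer, got " +
        std::to_string(batch_size));
  }
}
```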
OK. I will change it as soon as possible.
auto model_info = clipper::redis::get_model_by_key(redis_connection_, key);
VersionedModelId model_id = VersionedModelId(model_info["model_name"], model_info["model_version"]);
int batch_size = std::stoi(model_info["batch_size"]);
std::cout<<"registered_batch_size"<<batch_size<<std::endl;
Replace this with log_info_formatted("Registered batch size of {} for model {}:{}", batch_size, model_id.get_name(), model_id.get_id())
//#int batch_size = std::stoi(container_info["batch_size"]);
boost::optional<int> batch_size = -1;
for( auto it = container_info.begin(); it != container_info.end(); it++){
  std::cout<<"it->first="<<it->first<<std::endl;
Remove these print statements
@Unous1996 This is definitely on the right track! Most of the review comments are concerned with code cleanup and readability.
Test FAILed.
Is there anything wrong with the cluster right now, or is there some issue with my code?
A few more comments
// The barch_size should be either positive or -1
if( batch_size<0 && batch_size!= -1){
This condition doesn't handle the invalid case where batch_size is zero.
src/libclipper/src/containers.cpp
Outdated
@@ -76,6 +77,10 @@ double ModelContainer::get_average_throughput_per_millisecond() {
}

size_t ModelContainer::get_batch_size(Deadline deadline) {
  if (this->batch_size_ != -1){
There's no need to dereference this here. Just refer to batch_size.
// The barch_size should be either positive or -1
Fix typo: "barch_size" >> "batch_size"
containers/python/noop_container.py
Outdated
@@ -3,13 +3,15 @@
import os
Revert this file
@@ -45,4 +45,4 @@ def signal_handler(signal, frame):
    predict(clipper_conn.get_query_addr(), np.random.random(200))
    time.sleep(0.2)
except Exception as e:
    clipper_conn.stop_all()
    clipper_conn.stop_all()
Revert this file
It seems that I could not find the difference. Both seem to call clipper_conn.stop_all().
src/libclipper/src/containers.cpp
Outdated
@@ -108,8 +113,16 @@ void ActiveContainers::add_container(VersionedModelId model, int connection_id,
    model.get_name(), model.get_id(), connection_id,
    replica_id, get_readable_input_type(input_type));
boost::unique_lock<boost::shared_mutex> l{m_};

// Set a default batch size of -1
int batch_size = -1;
Define a constexpr DEFAULT_BATCH_SIZE = -1 constant in container.hpp and refer to it in all relevant locations, instead of using the value -1 directly.
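As a sketch of what the reviewer is asking for: declare the constant once and refer to it wherever a fallback batch size is needed. The helper function and its names below are hypothetical, invented only to show the constant in use:

```cpp
// containers.hpp (sketch): one named default instead of scattered -1 literals
constexpr int DEFAULT_BATCH_SIZE = -1;

// Hypothetical helper: fall back to the named default when no batch size
// has been registered for the model, rather than writing -1 inline.
int resolve_batch_size(bool found, int registered_batch_size) {
  return found ? registered_batch_size : DEFAULT_BATCH_SIZE;
}
```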
Test FAILed.
Test FAILed.
The tests are failing right now because of the format checker. Run ./bin/format_code.sh to format the code correctly.
OK. Thanks.
Well, it seems that I had a problem when running format_code.sh.
I have already installed clang-format, moved it to the ./bin/ directory, and changed the name to "clang-format", but it did not work.
Test FAILed.
@Unous1996 This just needs a few more changes
    parse_input_type(container_info["input_type"]));

auto model_info = redis::get_model(redis_connection_, vm);
VersionedModelId model_id = VersionedModelId(model_info["model_name"], model_info["model_version"]);
The value of this variable will be identical to vm (defined on line 291) and can be deleted.
auto batch_size_search = model_info.find("batch_size");
if(batch_size_search != model_info.end()){
  batch_size = std::stoi(model_info["batch_size"]);
  active_containers_->register_batch_size(model_id, batch_size);
Move this outside of the if statement and replace model_id with vm.
int batch_size = DEFAULT_BATCH_SIZE;
auto batch_size_search = model_info.find("batch_size");
if(batch_size_search != model_info.end()){
  batch_size = std::stoi(model_info["batch_size"]);
Replace with: batch_size = std::stoi(batch_size_search->second)
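The point of this suggestion is to reuse the iterator returned by find instead of performing a second map lookup with operator[]. A self-contained sketch, where the map type and function name are assumptions standing in for the Redis model_info returned as string key/value pairs:

```cpp
#include <string>
#include <unordered_map>

// Stand-in for the DEFAULT_BATCH_SIZE constant suggested earlier in the review
constexpr int kDefaultBatchSize = -1;

int parse_batch_size(
    const std::unordered_map<std::string, std::string>& model_info) {
  int batch_size = kDefaultBatchSize;
  auto batch_size_search = model_info.find("batch_size");
  if (batch_size_search != model_info.end()) {
    // Reuse the iterator's value instead of a second lookup via operator[]
    batch_size = std::stoi(batch_size_search->second);
  }
  return batch_size;
}
```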
OK. I will try to fix it.
Test FAILed.
LGTM. Will merge when code formatting completes
Test FAILed.
Test FAILed.
Test FAILed.
Test FAILed.
Test PASSed.
Test PASSed.
Now when calling the register_model function, users can designate a batch size so that all containers for the model use that specific batch size instead of the default batching strategy. For example:

connection.register_model(name="model_name", version="model_version15", input_type="floats", batch_size=5)

The default value of the batch_size parameter is -1, in which case Clipper applies its default adaptive batching strategy.