
Adding function: Allow users to designate the number of batches #320

Merged
merged 16 commits into ucbrise:develop on Jan 10, 2018

Conversation

Unous1996
Contributor

Now when calling the register_model function, users can designate the batch size, so that all containers for the model use that specific batch size instead of the default batching strategy. For example:
connection.register_model(name="model_name", version="model_version15", input_type="floats", batch_size=5)
The default value of the "batch_size" parameter is -1, in which case Clipper applies the default strategy.
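The intended behavior — an explicit batch_size overrides the default strategy, while the -1 sentinel leaves it in place — can be sketched as follows. This register_model is a hypothetical stand-in used only to illustrate the semantics, not the real clipper_admin API:

```python
def register_model(name, version, input_type, batch_size=-1):
    # batch_size == -1 is the sentinel for "no user-defined batch size";
    # Clipper then applies its default (adaptive) batching strategy.
    strategy = "adaptive" if batch_size == -1 else "fixed(%d)" % batch_size
    return {"name": name, "version": version,
            "input_type": input_type, "batching": strategy}

print(register_model("model_name", "model_version15", "floats", batch_size=5))
print(register_model("model_name", "model_version15", "floats"))
```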

李鳌 and others added 4 commits November 11, 2017 19:30
	modified:   src/libclipper/include/clipper/containers.hpp
	modified:   src/libclipper/include/clipper/redis.hpp
	modified:   src/libclipper/src/containers.cpp
	modified:   src/libclipper/src/redis.cpp
	modified:   src/management/src/management_frontend.hpp
@AmplabJenkins

Can one of the admins verify this patch?

@Unous1996
Contributor Author

OK to test.

@dcrankshaw
Contributor

jenkins ok to test

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Clipper-PRB/785/
Test FAILed.

@dcrankshaw
Contributor

@Unous1996 There's an issue with the jenkins cluster right now; it's not your code that's causing the tests to fail. Hang tight, we're working to get the issue resolved.

@Unous1996
Contributor Author

OK. So is there anything I need to do right now?


batch_size : int, optional
The user defined batch size.
The default value is -1, which means that the user did not define a specific batch size.
Contributor

Replace lines 284-285 with: "If the default value of -1 is used, Clipper will adaptively calculate the batch size for individual replicas of this model."

@@ -278,7 +279,10 @@ def build_and_deploy_model(self,
The number of replicas of the model to create. The number of replicas
for a model can be changed at any time with
:py:meth:`clipper.ClipperConnection.set_num_replicas`.

batch_size : int, optional
The user defined batch size.
Contributor

user defined >> user-defined


batch_size : int, optional
The user defined batch size.
The default value is -1, which means that the user did not define a specific batch size.
Contributor

Update the documentation for this method in the same way as for build_and_deploy_model.

@@ -436,6 +445,10 @@ def deploy_model(self,
The number of replicas of the model to create. The number of replicas
for a model can be changed at any time with
:py:meth:`clipper.ClipperConnection.set_num_replicas`.
batch_size : int, optional
The user defined batch size.
The default value is -1, which means that the user did not define a specific batch size.
Contributor

Update the documentation for this method in the same way as for build_and_deploy_model.

'''
'''
'''
self.cm.deploy_model(name=name, version=version, input_type=input_type, image=image, batch_size=batch_size)
Contributor

add the num_replicas keyword argument
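The fix being requested is to forward num_replicas alongside the other keyword arguments instead of dropping it. A minimal sketch, with a hypothetical container manager object standing in for self.cm:

```python
class Connection:
    """Illustrative stand-in for the Clipper client connection."""

    def __init__(self, cm):
        self.cm = cm  # container manager

    def deploy_model(self, name, version, input_type, image,
                     num_replicas=1, batch_size=-1):
        # Pass num_replicas through as well; omitting it would silently
        # leave the replica count at the container manager's default.
        self.cm.deploy_model(name=name, version=version,
                             input_type=input_type, image=image,
                             num_replicas=num_replicas, batch_size=batch_size)
```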

@@ -108,8 +138,17 @@ void ActiveContainers::add_container(VersionedModelId model, int connection_id,
model.get_name(), model.get_id(), connection_id,
replica_id, get_readable_input_type(input_type));
boost::unique_lock<boost::shared_mutex> l{m_};

// Set a default batch size of 1
int batch_size = 1;
Contributor

The default value should be -1 if the user has not defined a batch size for the container's associated model.

int batch_size = 1;
auto batch_size_search = batch_sizes_.find(model);
if(batch_size_search != batch_sizes_.end()) {
std::cout<<"MODEL_FOUND"<<std::endl;
Contributor

Remove print statement

@@ -670,6 +670,7 @@ class RequestHandler {
InputType input_type = clipper::parse_input_type(input_type_raw);
std::string container_name = get_string(d, "container_name");
std::string model_data_path = get_string(d, "model_data_path");
boost::optional<int> batch_size = get_int(d, "batch_size");
Contributor

Add a check to ensure that batch_size is either -1 or a positive integer, and throw a ManagementOperationError if this condition is not met.
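The requested check accepts -1 (the sentinel) or any strictly positive integer and rejects everything else; note that a naive negativity test alone would let zero through, a case flagged later in the review. A Python sketch of the logic, raising ValueError where Clipper's management frontend would throw a ManagementOperationError:

```python
def validate_batch_size(batch_size):
    # Valid values: the -1 sentinel, or a strictly positive integer.
    # A check like `batch_size < 0 and batch_size != -1` misses zero.
    if batch_size != -1 and batch_size <= 0:
        raise ValueError("batch_size must be -1 or a positive integer, "
                         "got %d" % batch_size)
    return batch_size

validate_batch_size(5)   # accepted
validate_batch_size(-1)  # accepted (sentinel)
```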

Contributor Author

OK. I will change it as soon as possible.

auto model_info = clipper::redis::get_model_by_key(redis_connection_, key);
VersionedModelId model_id = VersionedModelId(model_info["model_name"], model_info["model_version"]);
int batch_size = std::stoi(model_info["batch_size"]);
std::cout<<"registered_batch_size"<<batch_size<<std::endl;
Contributor

Replace this with log_info_formatted("Registered batch size of {} for model {}:{}", batch_size, model_id.get_name(), model_id.get_id())

//#int batch_size = std::stoi(container_info["batch_size"]);
boost::optional<int> batch_size = -1;
for( auto it = container_info.begin(); it != container_info.end(); it++){
std::cout<<"it->first="<<it->first<<std::endl;
Contributor

Remove these print statements

@Corey-Zumar
Contributor

@Unous1996 This is definitely on the right track! Most of the review comments are concerned with code cleanup and readability.

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Clipper-PRB/787/
Test FAILed.

@Unous1996
Contributor Author

Is there anything wrong with the cluster right now, or is there some issue with my code?

Contributor
@Corey-Zumar left a comment

A few more comments


// The barch_size should be either positive or -1
if( batch_size<0 && batch_size!= -1){
Contributor

This condition doesn't handle the invalid case where batch_size is zero

@@ -76,6 +77,10 @@ double ModelContainer::get_average_throughput_per_millisecond() {
}

size_t ModelContainer::get_batch_size(Deadline deadline) {
if (this->batch_size_ != -1){
Contributor

There's no need to dereference this here. Just refer to batch_size
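Putting the review points together, get_batch_size should return the registered fixed size when one exists and otherwise fall back to the adaptive estimate. A Python sketch of that control flow; the throughput-times-deadline estimate is an assumption for illustration, not Clipper's exact adaptive formula:

```python
DEFAULT_BATCH_SIZE = -1  # sentinel: no user-defined batch size

class ModelContainer:
    def __init__(self, batch_size=DEFAULT_BATCH_SIZE, throughput_per_ms=0.5):
        self.batch_size = batch_size
        self.throughput_per_ms = throughput_per_ms  # measured items/ms

    def get_batch_size(self, deadline_ms):
        # A user-designated batch size overrides adaptive batching entirely.
        if self.batch_size != DEFAULT_BATCH_SIZE:
            return self.batch_size
        # Otherwise, estimate how many items can be served by the deadline.
        return max(1, int(self.throughput_per_ms * deadline_ms))

print(ModelContainer(batch_size=5).get_batch_size(100))  # fixed -> 5
print(ModelContainer().get_batch_size(100))              # adaptive -> 50
```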


// The barch_size should be either positive or -1
Contributor

Fix typo: "barch_size" >> "batch_size"

@@ -3,13 +3,15 @@
import os
Contributor

Revert this file

@@ -45,4 +45,4 @@ def signal_handler(signal, frame):
predict(clipper_conn.get_query_addr(), np.random.random(200))
time.sleep(0.2)
except Exception as e:
clipper_conn.stop_all()
clipper_conn.stop_all()
Contributor

Revert this file

Contributor Author

It seems that I cannot find the difference. Both seem to call clipper_conn.stop_all().

@@ -108,8 +113,16 @@ void ActiveContainers::add_container(VersionedModelId model, int connection_id,
model.get_name(), model.get_id(), connection_id,
replica_id, get_readable_input_type(input_type));
boost::unique_lock<boost::shared_mutex> l{m_};

// Set a default batch size of -1
int batch_size = -1;
Contributor

Define a constexpr DEFAULT_BATCH_SIZE = -1 constant in container.hpp and refer to it in all relevant locations, instead of using the value -1 directly

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Clipper-PRB/789/
Test FAILed.

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Clipper-PRB/790/
Test FAILed.

@dcrankshaw
Contributor

The tests are failing right now because of the format checker. Run ./bin/format_code.sh to format the code correctly.

@Unous1996
Contributor Author

Unous1996 commented Nov 27, 2017 via email

@Unous1996
Contributor Author

Well, it seems that I had a problem when running format_code.sh:

/Users/tim/Documents/Clipper_proj/clipper/bin/.. xargs: clang-format: No such file or directory xargs: clang-format: No such file or directory

I have already installed the clang-format file, moved it to the ./bin/ directory, and changed the name to "clang-format", but it did not work.

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Clipper-PRB/804/
Test FAILed.

Contributor
@Corey-Zumar left a comment

@Unous1996 This just needs a few more changes

parse_input_type(container_info["input_type"]));

auto model_info = redis::get_model(redis_connection_, vm);
VersionedModelId model_id = VersionedModelId(model_info["model_name"], model_info["model_version"]);
Contributor

The value of this variable will be identical to vm (defined on line 291) and can be deleted

auto batch_size_search = model_info.find("batch_size");
if(batch_size_search != model_info.end()){
batch_size = std::stoi(model_info["batch_size"]);
active_containers_->register_batch_size(model_id, batch_size);
Contributor

move this outside of the if statement and replace model_id with vm

int batch_size = DEFAULT_BATCH_SIZE;
auto batch_size_search = model_info.find("batch_size");
if(batch_size_search != model_info.end()){
batch_size = std::stoi(model_info["batch_size"]);
Contributor

Replace with: batch_size = std::stoi(batch_size_search->second)
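The pattern under review — look up an optional "batch_size" field, reuse the value the lookup already found, and fall back to the default sentinel — reads like this in Python (parse_batch_size is an illustrative name):

```python
DEFAULT_BATCH_SIZE = -1

def parse_batch_size(model_info):
    # Reuse the value returned by the lookup itself (the analogue of
    # batch_size_search->second) instead of performing a second lookup.
    value = model_info.get("batch_size")
    return int(value) if value is not None else DEFAULT_BATCH_SIZE

print(parse_batch_size({"batch_size": "5"}))  # -> 5
print(parse_batch_size({}))                   # -> -1
```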

@Unous1996
Contributor Author

Unous1996 commented Dec 30, 2017 via email

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Clipper-PRB/817/
Test FAILed.

Contributor
@Corey-Zumar left a comment

LGTM. Will merge when code formatting completes

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Clipper-PRB/836/
Test FAILed.

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Clipper-PRB/841/
Test FAILed.

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Clipper-PRB/842/
Test FAILed.

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Clipper-PRB/843/
Test FAILed.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Clipper-PRB/846/
Test PASSed.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Clipper-PRB/847/
Test PASSed.

@dcrankshaw dcrankshaw merged commit d3a32dd into ucbrise:develop Jan 10, 2018