Model container testing function to clipper admin (Vanilla python) #394

Merged
merged 35 commits into from
Mar 13, 2018
Changes from 7 commits
Commits
4495bba
Vanilla python model
rohsuresh Feb 1, 2018
2c1474c
Merge branch 'develop' of https://github.com/ucbrise/clipper into dev…
rohsuresh Feb 1, 2018
bed8072
Merge branch 'develop' into modelTest
dcrankshaw Feb 5, 2018
ffa7525
PR edits and integration test
rohsuresh Feb 8, 2018
a3aa5eb
integration test
rohsuresh Feb 8, 2018
2b6cb33
Merge branch 'modelTest' of https://github.com/ucbrise/clipper into m…
rohsuresh Feb 8, 2018
8637156
deleting extra python files
rohsuresh Feb 8, 2018
a215b62
Addressed more PR comments: check for input_batch, batch input for in…
rohsuresh Feb 20, 2018
8cba54a
Ran formatting script
rohsuresh Feb 21, 2018
41c111d
Merge branch 'develop' into modelTest
dcrankshaw Feb 22, 2018
f4e005d
fix formatting issues
dcrankshaw Feb 22, 2018
80819ff
Fixing import error failing jenkins
rohsuresh Feb 25, 2018
e3951fb
merge issue
rohsuresh Feb 25, 2018
1214ba5
Merge branch 'modelTest' of https://github.com/ucbrise/clipper into m…
rohsuresh Feb 25, 2018
83ff62b
trying to fix import
rohsuresh Feb 25, 2018
5ab12c5
Removed import
rohsuresh Feb 25, 2018
2190404
Registered app in integration test
rohsuresh Feb 26, 2018
3609f3a
Removing reformatted files
rohsuresh Feb 26, 2018
a53355f
Fixed style errors
rohsuresh Feb 26, 2018
f9a19d1
Fixing integration test
rohsuresh Feb 26, 2018
b689450
fixed url
rohsuresh Mar 2, 2018
7f66829
fixing integration test
rohsuresh Mar 5, 2018
fcae76c
linked model to app
rohsuresh Mar 5, 2018
1678e2e
Playing around with connected model for query
rohsuresh Mar 5, 2018
e606a2e
trying to fix connection issues
rohsuresh Mar 5, 2018
f261d68
connection issues
rohsuresh Mar 6, 2018
c6b6fec
Fixed connection issue in integration test
rohsuresh Mar 9, 2018
917aeca
Uncommented tests
rohsuresh Mar 9, 2018
9972853
Merge branch 'develop' into modelTest
dcrankshaw Mar 9, 2018
8fa0806
retrigger
rohsuresh Mar 9, 2018
b4f998d
retry
rohsuresh Mar 9, 2018
d45addf
retrigger jenkins
rohsuresh Mar 10, 2018
a250da2
reverted accidental changes
dcrankshaw Mar 12, 2018
4964ece
Merge branch 'develop' into modelTest
dcrankshaw Mar 13, 2018
53cc658
Merge branch 'develop' into modelTest
dcrankshaw Mar 13, 2018
Binary file added bin/python
Binary file not shown.
68 changes: 68 additions & 0 deletions clipper_admin/clipper_admin/clipper_admin.py
@@ -11,6 +11,9 @@
import os
import tarfile
import six
from cloudpickle import CloudPickler
import pickle
import numpy as np

from .container_manager import CONTAINERLESS_MODEL_IMAGE
from .exceptions import ClipperException, UnconnectedException
@@ -1187,3 +1190,68 @@ def stop_all(self):
        """
        self.cm.stop_all()
        logger.info("Stopped all Clipper cluster and all model containers")

    def test_predict_function(self, query, func, input_type):
        """Tests that the user's function has the correct signature and can be properly saved and loaded.

        The function should take a dict request object like the query frontend expects JSON,
        the predict function, and the input type for the model.

        Parameters
        ----------
        query: JSON or list of dicts
            Inputs to test the prediction function on.
        func: function
            Predict function to test.
        input_type: str
            The input_type to be associated with the registered app and deployed model.
            One of "integers", "floats", "doubles", "bytes", or "strings".
Contributor

Can you add an example in the method comment?

        """
        query_data = list(x for x in list(query.values()))
Contributor

What's going on here?

Contributor

You should be checking the JSON key as well, to ensure that the user's input is properly formatted.

Contributor Author

Assuming the query is a JSON/dict structure, I'm getting the value for each key. I then check that the values match the given input_type before converting them into a numpy array of the corresponding dtype. What formatting of the key needs to be checked?

Contributor

If the user wants to provide a single input, the key should be "input". If they want to provide a list of inputs, the key should be "input_batch". See the website for details.

Contributor Author

Got it. Thanks for the clarification.
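
The key convention described in this thread ("input" for a single input, "input_batch" for a list of inputs) could be checked along the following lines. This is a minimal sketch, not the code that was merged: `extract_inputs` is a hypothetical helper, and rejecting any query that carries other keys is an assumption made here for simplicity.

```python
def extract_inputs(query):
    """Return the query's inputs as a list (a batch of one or more inputs).

    Hypothetical helper: accepts a dict with exactly one of the keys
    "input" (a single input) or "input_batch" (a list of inputs).
    """
    if not isinstance(query, dict):
        raise ValueError("query must be a dict")

    keys = set(query.keys())
    if keys == {"input"}:
        # A single input is wrapped so callers always receive a batch.
        return [query["input"]]
    if keys == {"input_batch"}:
        batch = query["input_batch"]
        if not isinstance(batch, list):
            raise ValueError('"input_batch" must map to a list of inputs')
        return batch

    # Assumption: any other key layout is treated as malformed.
    raise ValueError('query must have exactly one key: "input" or "input_batch"')
```

Returning a batch in both cases means the predict function under test can always be called with a list of inputs, regardless of which key the caller used.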


        if type(query_data[0][0]) == list:
            query_data = query_data[0]
Contributor

What are you checking for here?

Contributor Author

I'm checking the nesting of the query data, whether it is a list or a list of lists.
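
The nesting check described here (a list versus a list of lists) can be written without indexing into the data, which avoids an IndexError on empty input. This is a rough sketch under stated assumptions: `is_batch` and `as_batch` are hypothetical names, and the shape test only makes sense for list-valued inputs such as integers, floats, or doubles, not for string inputs.

```python
def is_batch(data):
    """True if data is a non-empty list whose elements are all lists."""
    return (isinstance(data, list) and len(data) > 0
            and all(isinstance(item, list) for item in data))


def as_batch(data):
    """Wrap a single input in a list; leave a list of lists unchanged."""
    return data if is_batch(data) else [data]
```

Inferring the shape from the data is inherently ambiguous in edge cases, so a check based on the query key is likely the more robust choice where the key is available.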


        flattened_data = [item for sublist in query_data for item in sublist]
        numpy_data = None

        if input_type == "bytes":
            numpy_data = list(np.int8(x) for x in query_data)
            for x in flattened_data:
                if type(x) != bytes:
                    return "Invalid input type"

        if input_type == "integers":
            numpy_data = list(np.int32(x) for x in query_data)
            for x in flattened_data:
                if type(x) != int:
                    return "Invalid input type"

        if input_type == "floats" or input_type == "doubles":
            if input_type == "floats":
                numpy_data = list(np.float32(x) for x in query_data)
            else:
                numpy_data = list(np.float64(x) for x in query_data)
            for x in flattened_data:
                if type(x) != float:
                    return "Invalid input type"

        if input_type == "string":
            numpy_data = list(np.str_(x) for x in query_data)
            for x in flattened_data:
                if type(x) != str:
                    return "Invalid input type"
Contributor

Can you add checks for the other two accepted input types (floats and bytes)?
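
One way to cover all five accepted input types without a separate branch per type is a lookup table. The sketch below is illustrative only: `EXPECTED_TYPES` and `check_input_type` are hypothetical names, the numpy dtype conversion from the diff is left out, and the table uses the docstring's spelling "strings" (the branch in the diff compares against "string").

```python
# Hypothetical mapping from input_type names to the Python type
# each element is expected to have.
EXPECTED_TYPES = {
    "integers": int,
    "floats": float,
    "doubles": float,
    "bytes": bytes,
    "strings": str,
}


def check_input_type(values, input_type):
    """Return True if every element of values has the expected Python type."""
    if input_type not in EXPECTED_TYPES:
        raise ValueError("Unknown input_type: %r" % (input_type,))
    expected = EXPECTED_TYPES[input_type]
    # type(v) is expected (rather than isinstance) mirrors the exact-type
    # comparison in the diff and also rejects bool for "integers".
    return all(type(v) is expected for v in values)
```

Raising on an unknown `input_type` makes a misspelled type name fail loudly instead of silently skipping validation.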


        s = six.StringIO()
        c = CloudPickler(s, 2)
        c.dump(func)
        serialized_func = s.getvalue()
        reloaded_func = pickle.loads(serialized_func)

        try:
            assert reloaded_func
        except AssertionError as e:
            logger.error("Function does not properly serialize and reload")
            return "Function does not properly serialize and reload"

        return reloaded_func(numpy_data)
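
The save-and-reload check at the end of `test_predict_function` can be sketched with the standard library alone. This is a minimal sketch, not the PR's code: `roundtrip` is a hypothetical helper, the stdlib `pickle.Pickler` stands in for `CloudPickler` (which could be passed as `pickler_cls`, assuming it accepts the same `(file, protocol)` arguments used in the diff), and the buffer is `io.BytesIO` because pickle writes bytes, which `io.StringIO` (what `six.StringIO` resolves to on Python 3) would reject.

```python
import io
import pickle


def roundtrip(func, pickler_cls=pickle.Pickler, protocol=2):
    """Serialize func into an in-memory bytes buffer and load it back."""
    buf = io.BytesIO()
    pickler_cls(buf, protocol).dump(func)
    reloaded = pickle.loads(buf.getvalue())
    if not callable(reloaded):
        raise TypeError("reloaded object is not callable")
    return reloaded


# The stdlib pickler handles importable callables such as the builtin sum;
# closures and lambdas are the case a cloudpickle-style pickler is meant for.
reloaded_sum = roundtrip(sum)
```

Raising an exception when the reload fails, rather than returning an error string, keeps a failure from being mistaken for a prediction result.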
3 changes: 2 additions & 1 deletion clipper_admin/clipper_admin/deployers/python.py
@@ -4,6 +4,8 @@
import os
import posixpath
import shutil
import pickle
import numpy as np
Contributor

Revert the changes to this file

from ..version import __version__

from .deployer_utils import save_python_function
@@ -87,7 +89,6 @@ def create_endpoint(

    clipper_conn.link_model_to_app(name, name)


def deploy_python_closure(
        clipper_conn,
        name,
24 changes: 22 additions & 2 deletions integration-tests/clipper_admin_tests.py
@@ -11,6 +11,7 @@
import os
import json
import time
import numpy as np
import requests
import tempfile
import shutil
@@ -25,6 +26,7 @@
from clipper_admin.deployers.python import create_endpoint as create_py_endpoint
from clipper_admin.deployers.python import deploy_python_closure
from clipper_admin import __version__ as clipper_version
from clipper_admin import test_predict_function

sys.path.insert(0, os.path.abspath('%s/util_direct_import/' % cur_dir))
from util_package import mock_module_in_package as mmip
@@ -343,6 +345,20 @@ def predict_func(inputs):
        })
        self.assertEqual(len(containers), 1)

    def test_test_predict_function(self):
        def predict_func(xs):
            return [sum(x) for x in xs]

        deploy_python_closure(self.clipper_conn, name="sum-model", version=1, input_type="doubles", func=predict_func)
        self.clipper_conn.link_model_to_app(app_name="hello-world", model_name="sum-model")

        headers = {"Content-type": "application/json"}
        test_input = list(np.random.random(10))
        pred = requests.post("http://localhost:1337/hello-world/predict", headers=headers, data=json.dumps({"input": test_input})).json()
        test_predict_result = test_predict_function(self.clipper_conn, query={"input": test_input}, func=predict_func, input_type="doubles")
Contributor

You should call the function as a method, like this: self.clipper_conn.test_predict_function(query=...), rather than passing in self.clipper_conn as an argument.

Can you add a second test prediction that uses the batch_predict interface? e.g.

batch_input = [list(np.random.random(10)) for _ in range(4)]
batch_pred = requests.post("http://localhost:1337/hello-world/predict", headers=headers, data=json.dumps({"input_batch": batch_input})).json()
test_batch_predict_result = self.clipper_conn.test_predict_function(query={"input_batch": batch_input}, func=predict_func, input_type="doubles")


        self.assertEqual(pred, test_predict_result)


class ClipperManagerTestCaseLong(unittest.TestCase):
@classmethod
@@ -374,8 +390,10 @@ def setUpClass(self):
self.latency_slo_micros)

self.clipper_conn.register_application(
self.app_name_4, self.input_type, self.default_output,
self.latency_slo_micros)
self.app_name_4,
self.input_type,
self.default_output,
slo_micros=30000000)

@classmethod
def tearDownClass(self):
@@ -481,6 +499,7 @@ def test_fixed_batch_size_model_processes_specified_query_batch_size_when_satura
        model_version = 1

        def predict_func(inputs):
            time.sleep(.5)
            batch_size = len(inputs)
            return [str(batch_size) for _ in inputs]

@@ -534,6 +553,7 @@ def predict_func(inputs):
    'test_stop_models',
    'test_python_closure_deploys_successfully',
    'test_register_py_endpoint',
    'test_test_predict_function'
]

LONG_TEST_ORDERING = [