Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Monitoring API #1691

Merged
merged 61 commits into from
Apr 22, 2016
Merged

Monitoring API #1691

merged 61 commits into from
Apr 22, 2016

Conversation

rimey
Copy link
Contributor

@rimey rimey commented Apr 3, 2016

This initial version of the monitoring client supports querying of time series, metric descriptors, and monitored resource descriptors.

The authors of this code are @supriyagarg and myself, both at Google.

@rimey rimey added the do not merge Indicates a pull request not ready for merge, due to either quality or timing. label Apr 3, 2016
@googlebot googlebot added the cla: yes This human has signed the Contributor License Agreement. label Apr 3, 2016
:param key: The name of the label.

:type value_type: string
:param value_type: The type of the label. It must be one of ``"STRING"``,

This comment was marked as spam.

This comment was marked as spam.

This comment was marked as spam.

This comment was marked as spam.

This comment was marked as spam.

@jgeewax
Copy link
Contributor

jgeewax commented Apr 4, 2016

Awesome - this is great. I think we have a few high-level issues, but thanks so much for throwing this in.

Issues I can see that we need to figure out:

  1. The namedtuple parent class: I don't recall seeing that pattern in this project. Not sure if that's a thing we want to use -- the main question I'd have is about mutability -- do any of those objects need to be manipulated at all?
  2. The nested methods for iterating over lists of things: I think we have an iterator class that does this... Might be worthwhile to know that we don't want to use that.
  3. Squashing commits: I think we might want to squash these 3 into a single PR-commit, but not 100% sure.
  4. Coverage: I think we lose some test coverage here, so we'd need to get that back up.

Thanks again -- this is awesome

@rimey
Copy link
Contributor Author

rimey commented Apr 9, 2016

To your points:

  1. I converted MetricDescriptor, ResourceDescriptor, and LabelDescriptor into ordinary classes. This could well be helpful when we add support for defining new custom metrics.

    I left Metric, Resource, TimeSeries, and Point as named tuples. If there is a need to revisit them, it would be best to do it in the context of writing data to custom metrics.

    There are already a couple of named tuple classes of this sort in streaming/http_wrapper.py, but they are not exposed.

  2. The code for listing metric descriptors and resource descriptors no longer uses nested functions (not that there is anything wrong with nested functions). It would not be improved by rewriting to use the Iterator base class.

    Moreover, it is better for these particular calls to return lists than iterators.

  3. I don't have a preference as to whether to squash or not.

  4. Test coverage is now at 100%. However, a number of the unit tests fail because pandas is not installed in the test environment. See How to handle an optional package dependency in tests? #1710.

supriyagarg added a commit to googledatalab/datalab that referenced this pull request Apr 11, 2016
This initial version of the monitoring client for Datalab allows users
to query timeseries data. It introduces a new module under gcp that
users can import using:

    from gcp.stackdriver import monitoring

Querying timeseries data:

There is a `Query` class that allows users to query timeseries data
for their monitored resources. They can initialize the query by specifying a
metric type and time interval, and refine it by adding filters to it.

The timeseries data is returned as a pandas DataFrame that allows users
to further manipulate the data and visualize it within Datalab.

IPython magics:

There are a couple of IPython magic commands to allow users to list
details of the available metrics and resource types. E.g.:

    %%monitoring list metrics

The base library:

The _impl directory is a snapshot of an earlier version of a
monitoring client library that we have submitted to gcloud-python:

googleapis/google-cloud-python#1691

It works as the base library on top of which the Datalab library adds
interactive features.

The authors of this code are @rimey and myself, both at Google.
rimey added 12 commits April 11, 2016 17:28
1. Removed start_time from the query constructor/factory.
2. Added a select_interval() method for when you actually
   do have a start time in your hands, or for when you want
   to set the time interval on an existing query object.
Several related changes:

 - Removed resource_type from the constructor/factory parameters.
 - Added support for filtering on "resource_type" as a pseudo-label.
 - Renamed select_resource_labels() to select_resources() and
   select_metric_labels() to select_metrics().
The signature of the reduce() method is now as follows:

    reduce(self, cross_series_reducer, *group_by_fields)
I had dropped this, but I think we want it.
@rimey
Copy link
Contributor Author

rimey commented Apr 20, 2016

@dhermes, I'm done processing your comments. Could you put together a short list of the significant issues that you feel are still open? At this point a VC might indeed be helpful. Would you be available tomorrow morning?

supriyagarg added a commit to googledatalab/datalab that referenced this pull request Apr 21, 2016
This initial version of the monitoring client for Datalab allows users
to query timeseries data. It introduces a new module under gcp that
users can import using:

    from gcp.stackdriver import monitoring

Querying timeseries data:

There is a `Query` class that allows users to query timeseries data
for their monitored resources. They can initialize the query by specifying a
metric type and time interval, and refine it by adding filters to it.

The timeseries data is returned as a pandas DataFrame that allows users
to further manipulate the data and visualize it within Datalab.

IPython magics:

There are a couple of IPython magic commands to allow users to list
details of the available metrics and resource types. E.g.:

    %%monitoring list metrics

The base library:

The _impl directory is a snapshot of an earlier version of a
monitoring client library that we have submitted to gcloud-python:

googleapis/google-cloud-python#1691

It works as the base library on top of which the Datalab library adds
interactive features.

The authors of this code are @rimey and myself, both at Google.
name=info['name'],
type=info['type'],
labels=tuple(LabelDescriptor._from_dict(label)
for label in info.get('labels', [])),

This comment was marked as spam.

This comment was marked as spam.

supriyagarg added a commit to googledatalab/datalab that referenced this pull request Apr 21, 2016
This initial version of the monitoring client for Datalab allows users
to query timeseries data. It introduces a new module under gcp that
users can import using:

    from gcp.stackdriver import monitoring

Querying timeseries data:

There is a `Query` class that allows users to query timeseries data
for their monitored resources. They can initialize the query by specifying a
metric type and time interval, and refine it by adding filters to it.

The timeseries data is returned as a pandas DataFrame that allows users
to further manipulate the data and visualize it within Datalab.

IPython magics:

There are a couple of IPython magic commands to allow users to list
details of the available metrics and resource types. E.g.:

    %%monitoring list metrics

The base library:

The _impl directory is a snapshot of an earlier version of a
monitoring client library that we have submitted to gcloud-python:

googleapis/google-cloud-python#1691

It works as the base library on top of which the Datalab library adds
interactive features.

The authors of this code are @rimey and myself, both at Google.
@dhermes
Copy link
Contributor

dhermes commented Apr 22, 2016

@rimey RE:

Could you put together a short list of the significant issues that you feel are still open?

Only 4 left to get through

  1. Trailing backslashes
  2. Double-list comp.
  3. filter vs. filter_ and type vs. type_
  4. Using existing datetime helpers

@rimey
Copy link
Contributor Author

rimey commented Apr 22, 2016

@dhermes PTAL

  • Trailing backslashes

Done.

  • Double-list comp.

Done.

  • filter vs. filter_ and type vs. type_

Done for filter. I'd like to keep type.

  • Using existing datetime helpers

I changed this code in the unit test to parse its constants using the timestamp format string in gcloud._helpers. I don't want to introduce time zone logic just so I can use a "DRY" helper function instead of a function from the Python standard library.

Good enough?

@dhermes
Copy link
Contributor

dhermes commented Apr 22, 2016

Yes I noticed! Thanks a lot.

AFAICT you only use type as a positional argument in MetricDescriptor.__init__ and ResourceDescriptor.__init__. So the name is somewhat less important since positional. Though maybe that isn't true for you, I noticed most calls to these constructors uses all keyword args.

As for setting self.type = type_, that isn't problematic since self.type is on the objects namespace, which cascades through self.__dict__, type(self).__dict__, and then down the MRO of the parent class. Thus there is no clobbering of __builtins__ when a builtin name is used as an instance / class property.

Are you still opposed to changing type -> type_ given this distinction between positional argument and instance attribute?

@rimey
Copy link
Contributor Author

rimey commented Apr 22, 2016

@dhermes I'd like to stick with type. You can't judge the external usage from my internal usage, and we will soon have more of both when we add creation and deletion of descriptors for custom metrics.

@dhermes
Copy link
Contributor

dhermes commented Apr 22, 2016

I just sent out #1737 to re-enable to the redefined-builtin error. What clarity is lost when moving from type to type_? Do we think the extra typing will overwhelm the subset of users that decides to use it as a keyword rather than positional?

@rimey
Copy link
Contributor Author

rimey commented Apr 22, 2016

@dhermes: Done, under protest.

I don't agree with this decision, for reasons I have already explained. I intend to propose reverting the change later if it has a bad effect on the experience for end users.

Okay?

@dhermes
Copy link
Contributor

dhermes commented Apr 22, 2016

SGTM. Happy to have a non-blocking discussion in an issue.

@dhermes
Copy link
Contributor

dhermes commented Apr 22, 2016

All issues are resolved, LGTM. I'm going to do a squash merge for the very first time. Here goes.

@dhermes dhermes merged commit 7983938 into googleapis:master Apr 22, 2016
@rimey rimey added the api: monitoring Issues related to the Cloud Monitoring API. label Apr 23, 2016
supriyagarg added a commit to googledatalab/datalab that referenced this pull request Apr 29, 2016
This initial version of the monitoring client for Datalab allows users
to query timeseries data. It introduces a new module under gcp that
users can import using:

    from gcp.stackdriver import monitoring

Querying timeseries data:

There is a `Query` class that allows users to query timeseries data
for their monitored resources. They can initialize the query by specifying a
metric type and time interval, and refine it by adding filters to it.

The timeseries data is returned as a pandas DataFrame that allows users
to further manipulate the data and visualize it within Datalab.

IPython magics:

There are a couple of IPython magic commands to allow users to list
details of the available metrics and resource types. E.g.:

    %%monitoring list metrics

The base library:

The _impl directory is a snapshot of an earlier version of a
monitoring client library that we have submitted to gcloud-python:

googleapis/google-cloud-python#1691

It works as the base library on top of which the Datalab library adds
interactive features.

The authors of this code are @rimey and myself, both at Google.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api: monitoring Issues related to the Cloud Monitoring API. cla: yes This human has signed the Contributor License Agreement.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants