Monitoring API (#1691)
* Add gcloud.monitoring with fetching of metric descriptors.

* Add fetching of resource descriptors.

* Add querying of time series.

* Add unit tests for everything but time series.

* Added some links to the documentation, and a comment.

* Dropped two nested generators.

* Require start time and end time to be datetimes rather than strings.

* Change how query intervals are specified.

1. Removed start_time from the query constructor/factory.
2. Added a select_interval() method for when you actually
   do have a start time in your hands, or for when you want
   to set the time interval on an existing query object.

* Change how one filters on the resource type.

Several related changes:

 - Removed resource_type from the constructor/factory parameters.
 - Added support for filtering on "resource_type" as a pseudo-label.
 - Renamed select_resource_labels() to select_resources() and
   select_metric_labels() to select_metrics().

* Add some pseudo-enums providing constants you can use if you wish.

* More unit tests, and a few related changes.

The signature of the reduce() method is now as follows:

    reduce(self, cross_series_reducer, *group_by_fields)
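A toy stand-in (not the real Query class) illustrating the shape of that signature, with one positional reducer followed by any number of group-by field names:

```python
class Query(object):
    """Toy stand-in for the real Query class, showing only the new
    reduce() signature: one positional reducer, then any number of
    group-by field names."""

    def __init__(self, reductions=None):
        self.reductions = reductions or []

    def reduce(self, cross_series_reducer, *group_by_fields):
        # Queries are immutable in spirit: return a new object.
        return Query(self.reductions +
                     [(cross_series_reducer, list(group_by_fields))])

query = Query().reduce('REDUCE_MEAN', 'resource.zone', 'resource.project_id')
```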

* Added a metric_type property.

I had dropped this, but I think we want it.

* The descriptor classes are no longer named tuples.

    MetricDescriptor
    ResourceDescriptor
    LabelDescriptor

* Move Query into a new query.py file.

* Isolated the pandas-dependent code in _dataframe.py.

* Add unit tests for _build_dataframe().

* Make the pseudo-enums importable from gcloud.monitoring.

* Skip tests of _build_dataframe() if pandas is not available.

* Fix lint errors.

* Add "NO COVER" pragmas due to not having pandas in the testenv:cover environment.

* Add a py27-pandas tox test environment.

* as_matrix() -> values

* Utilize the pseudo-enums in the docstrings and add a few more usage examples.

* Add some basic usage documentation.

* Fix some remaining references to the pseudo-enums in a docstring.

* Add prompts (">>>") to examples showing interactive usage.

* Add basic system tests.

* Expect that the service will never return an empty name, type, or key.

* Use print() in docstring examples.

* Add pandas to the intersphinx configuration.

* Alphabetical order.

* Change TOP_RESOURCE_LABELS from list to tuple.

* Rephrase a complex if-statement.

* p -> point

* Use DataFrame.from_records() to build from a list of rows.
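A minimal sketch of that pattern (illustrative data, not the commit's actual code path): build a frame from a list of records, then transpose so each record becomes a column.

```python
import pandas

# Illustrative data only; the commit builds one pandas.Series per time
# series and transposes so each record becomes a column.
records = [{'t0': 1.0, 't1': 2.0},
           {'t0': 3.0, 't1': 4.0}]
frame = pandas.DataFrame.from_records(records).T
```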

* Add a comment about the system test getting no time series data.

* Use Query.DEFAULT_METRIC_TYPE cross-reference in doc strings.

* Add some docstring text about NotFound exceptions.

* Rename "filter" parameter to "filter_" in private methods.

* Move status comments into module docstrings.

* Use :data: for docstring cross references to pseudo-enums.

* Change some empty lists to empty tuples.

* _NOW -> _UTCNOW

* Remove docstring text explaining about named tuples.

* Reformat a docstring example.

* _page_size -> page_size

* Introduce an _iter_fragments() helper method.

* Make _build_query_params() a generator.

* Use six.iteritems().

* Reorganize test data around DIMENSIONS instead of N and M.

* Restyle an import.

* Unpack points more elegantly.

* Remove pylint comment disabling redefined-builtin.

(It's disabled project-wide.)

* Cache the label map without having to disable pylint.

* Exercise the test harness fully by reading past the available responses.

* Change some more empty lists into empty tuples.

* filter/filter_ -> filter_string

* '%Y-%m-%dT%H:%M:%S.%fZ' -> _RFC3339_MICROS
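That format string renders a UTC datetime in RFC 3339 form with microseconds. A minimal sketch (the constant's value copied from the message above):

```python
from datetime import datetime

# Value copied from the commit message above; in the library it now
# lives behind the _RFC3339_MICROS constant.
_RFC3339_MICROS = '%Y-%m-%dT%H:%M:%S.%fZ'

stamp = datetime(2016, 4, 22, 12, 30, 0, 250000).strftime(_RFC3339_MICROS)
print(stamp)  # 2016-04-22T12:30:00.250000Z
```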

* Use parentheses instead of backslashes.

* Reorganize a hard-to-understand bit of code for building multi-level column headers.

* type -> type_
rimey authored and dhermes committed Apr 22, 2016
1 parent 18c92fe commit 7983938
Showing 31 changed files with 4,307 additions and 1 deletion.
Binary file added docs/.DS_Store
1 change: 1 addition & 0 deletions docs/conf.py
@@ -292,5 +292,6 @@
intersphinx_mapping = {
    'httplib2': ('http://bitworking.org/projects/httplib2/doc/html/', None),
    'oauth2client': ('http://oauth2client.readthedocs.org/en/latest/', None),
    'pandas': ('http://pandas.pydata.org/pandas-docs/stable/', None),
    'python': ('https://docs.python.org/', None),
}
13 changes: 13 additions & 0 deletions docs/index.rst
@@ -119,6 +119,19 @@
   logging-metric
   logging-sink

.. toctree::
   :maxdepth: 0
   :hidden:
   :caption: Cloud Monitoring

   monitoring-usage
   Client <monitoring-client>
   monitoring-metric
   monitoring-resource
   monitoring-query
   monitoring-timeseries
   monitoring-label

.. toctree::
   :maxdepth: 0
   :hidden:
16 changes: 16 additions & 0 deletions docs/monitoring-client.rst
@@ -0,0 +1,16 @@
Monitoring Client
=================

.. automodule:: gcloud.monitoring.client
   :members:
   :undoc-members:
   :show-inheritance:

Connection
~~~~~~~~~~

.. automodule:: gcloud.monitoring.connection
   :members:
   :undoc-members:
   :show-inheritance:

7 changes: 7 additions & 0 deletions docs/monitoring-label.rst
@@ -0,0 +1,7 @@
Label Descriptors
=================

.. automodule:: gcloud.monitoring.label
   :members:
   :undoc-members:
   :show-inheritance:
7 changes: 7 additions & 0 deletions docs/monitoring-metric.rst
@@ -0,0 +1,7 @@
Metric Descriptors
==================

.. automodule:: gcloud.monitoring.metric
   :members:
   :undoc-members:
   :show-inheritance:
7 changes: 7 additions & 0 deletions docs/monitoring-query.rst
@@ -0,0 +1,7 @@
Time Series Query
=================

.. automodule:: gcloud.monitoring.query
   :members:
   :undoc-members:
   :show-inheritance:
7 changes: 7 additions & 0 deletions docs/monitoring-resource.rst
@@ -0,0 +1,7 @@
Resource Descriptors
====================

.. automodule:: gcloud.monitoring.resource
   :members:
   :undoc-members:
   :show-inheritance:
7 changes: 7 additions & 0 deletions docs/monitoring-timeseries.rst
@@ -0,0 +1,7 @@
Time Series
===========

.. automodule:: gcloud.monitoring.timeseries
   :members:
   :undoc-members:
   :show-inheritance:
155 changes: 155 additions & 0 deletions docs/monitoring-usage.rst
@@ -0,0 +1,155 @@
Using the API
=============


Introduction
------------

With the Monitoring API, you can work with Stackdriver metric data
pertaining to monitored resources in Google Cloud Platform (GCP)
or elsewhere.

Essential concepts:

- Metric data is associated with a **monitored resource**. A monitored
  resource has a *resource type* and a set of *resource labels* —
  key-value pairs — that identify the particular resource.
- A **metric** further identifies the particular kind of data that
  is being collected. It has a *metric type* and a set of *metric
  labels* that, when combined with the resource labels, identify
  a particular time series.
- A **time series** is a collection of data points associated with
  points or intervals in time.
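A hypothetical sketch (invented names and values, not the library's data model) of how these pieces combine to identify one time series:

```python
# Hypothetical sketch: a time series is identified by its monitored
# resource and its metric, and carries the data points.
resource = {'type': 'gce_instance',
            'labels': {'zone': 'us-central1-a', 'instance_id': '1234'}}
metric = {'type': 'compute.googleapis.com/instance/cpu/utilization',
          'labels': {'instance_name': 'mycluster-a1'}}
points = [('2016-04-22T12:00:00Z', 0.17),
          ('2016-04-22T12:01:00Z', 0.21)]

# Two series are the same series only if all four components match.
identity = (resource['type'],
            tuple(sorted(resource['labels'].items())),
            metric['type'],
            tuple(sorted(metric['labels'].items())))
```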

Please refer to the documentation for the `Monitoring API`_ for
more information.

At present, this client library supports querying of time series,
metric descriptors, and resource descriptors.

.. _Monitoring API: https://cloud.google.com/monitoring/api/


The Monitoring Client Object
----------------------------

The monitoring client library generally makes its
functionality available as methods of the monitoring
:class:`~gcloud.monitoring.client.Client` class.
A :class:`~gcloud.monitoring.client.Client` instance holds
authentication credentials and the ID of the target project with
which the metric data of interest is associated. This project ID
will often refer to a `Stackdriver account`_ binding multiple
GCP projects and AWS accounts. It can also simply be the ID of
a monitored project.

Most often the authentication credentials will be determined
implicitly from your environment. See :doc:`gcloud-auth` for
more information.

It is thus typical to create a client object as follows::

   >>> from gcloud import monitoring
   >>> client = monitoring.Client(project='target-project')

If you are running in Google Compute Engine or Google App Engine,
the current project is the default target project. This default
can be further overridden with the :envvar:`GCLOUD_PROJECT`
environment variable. Using the default target project is
even easier::

   >>> client = monitoring.Client()

If necessary, you can pass in ``credentials`` and ``project`` explicitly::

   >>> client = monitoring.Client(project='target-project', credentials=...)

.. _Stackdriver account: https://cloud.google.com/monitoring/accounts/


Monitored Resource Descriptors
------------------------------

The available monitored resource types are defined by *monitored resource
descriptors*. You can fetch a list of these with the
:meth:`~gcloud.monitoring.client.Client.list_resource_descriptors` method::

   >>> for descriptor in client.list_resource_descriptors():
   ...     print(descriptor.type)

Each :class:`~gcloud.monitoring.resource.ResourceDescriptor`
has a type, a display name, a description, and a list of
:class:`~gcloud.monitoring.label.LabelDescriptor` instances.
See the documentation about `Monitored Resources`_
for more information.

.. _Monitored Resources:
   https://cloud.google.com/monitoring/api/v3/monitored-resources


Metric Descriptors
------------------

The available metric types are defined by *metric descriptors*.
They include `platform metrics`_, `agent metrics`_, and `custom metrics`_.
You can list all of these with the
:meth:`~gcloud.monitoring.client.Client.list_metric_descriptors` method::

   >>> for descriptor in client.list_metric_descriptors():
   ...     print(descriptor.type)

See :class:`~gcloud.monitoring.metric.MetricDescriptor` and the
`Metric Descriptors`_ API documentation for more information.

.. _platform metrics: https://cloud.google.com/monitoring/api/metrics
.. _agent metrics: https://cloud.google.com/monitoring/agent/
.. _custom metrics: https://cloud.google.com/monitoring/custom-metrics/
.. _Metric Descriptors:
   https://cloud.google.com/monitoring/api/ref_v3/rest/v3/\
   projects.metricDescriptors


Time Series Queries
-------------------

A time series includes a collection of data points and a set of
resource and metric label values.
See :class:`~gcloud.monitoring.timeseries.TimeSeries` and the
`Time Series`_ API documentation for more information.

While you can obtain time series objects by iterating over a
:class:`~gcloud.monitoring.query.Query` object, usually it is
more useful to retrieve time series data in the form of a
:class:`pandas.DataFrame`, where each column corresponds to a
single time series. For this, you must have :mod:`pandas` installed;
it is not a required dependency of ``gcloud-python``.

You can display CPU utilization across your GCE instances during
the last five minutes as follows::

   >>> METRIC = 'compute.googleapis.com/instance/cpu/utilization'
   >>> query = client.query(METRIC, minutes=5)
   >>> print(query.as_dataframe())

:class:`~gcloud.monitoring.query.Query` objects provide a variety of
methods for refining the query. You can request temporal alignment
and cross-series reduction, and you can filter by label values.
See the client :meth:`~gcloud.monitoring.client.Client.query` method
and the :class:`~gcloud.monitoring.query.Query` class for more
information.

For example, you can display CPU utilization during the last hour
across GCE instances with names beginning with ``"mycluster-"``,
averaged over five-minute intervals and aggregated per zone, as
follows::

   >>> from gcloud.monitoring import Aligner, Reducer
   >>> METRIC = 'compute.googleapis.com/instance/cpu/utilization'
   >>> query = (client.query(METRIC, hours=1)
   ...          .select_metrics(instance_name_prefix='mycluster-')
   ...          .align(Aligner.ALIGN_MEAN, minutes=5)
   ...          .reduce(Reducer.REDUCE_MEAN, 'resource.zone'))
   >>> print(query.as_dataframe())

.. _Time Series:
   https://cloud.google.com/monitoring/api/ref_v3/rest/v3/TimeSeries
34 changes: 34 additions & 0 deletions gcloud/monitoring/__init__.py
@@ -0,0 +1,34 @@
# Copyright 2016 Google Inc. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

"""Google Monitoring API wrapper."""

from gcloud.monitoring.client import Client
from gcloud.monitoring.connection import Connection
from gcloud.monitoring.label import LabelDescriptor
from gcloud.monitoring.label import LabelValueType
from gcloud.monitoring.metric import Metric
from gcloud.monitoring.metric import MetricDescriptor
from gcloud.monitoring.metric import MetricKind
from gcloud.monitoring.metric import ValueType
from gcloud.monitoring.query import Aligner
from gcloud.monitoring.query import Query
from gcloud.monitoring.query import Reducer
from gcloud.monitoring.resource import Resource
from gcloud.monitoring.resource import ResourceDescriptor
from gcloud.monitoring.timeseries import Point
from gcloud.monitoring.timeseries import TimeSeries


SCOPE = Connection.SCOPE
116 changes: 116 additions & 0 deletions gcloud/monitoring/_dataframe.py
@@ -0,0 +1,116 @@
# Copyright 2016 Google Inc. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

"""Time series as :mod:`pandas` dataframes."""

import itertools

TOP_RESOURCE_LABELS = (
    'project_id',
    'aws_account',
    'location',
    'region',
    'zone',
)


def _build_dataframe(time_series_iterable,
                     label=None, labels=None):  # pragma: NO COVER
    """Build a :mod:`pandas` dataframe out of time series.

    :type time_series_iterable:
        iterable over :class:`~gcloud.monitoring.timeseries.TimeSeries`
    :param time_series_iterable:
        An iterable (e.g., a query object) yielding time series.

    :type label: string or None
    :param label:
        The label name to use for the dataframe header. This can be the name
        of a resource label or metric label (e.g., ``"instance_name"``), or
        the string ``"resource_type"``.

    :type labels: list of strings, or None
    :param labels:
        A list or tuple of label names to use for the dataframe header.
        If more than one label name is provided, the resulting dataframe
        will have a multi-level column header.

        Specifying neither ``label`` nor ``labels`` results in a dataframe
        with a multi-level column header including the resource type and
        all available resource and metric labels.

        Specifying both ``label`` and ``labels`` is an error.

    :rtype: :class:`pandas.DataFrame`
    :returns: A dataframe where each column represents one time series.
    """
    import pandas  # pylint: disable=import-error

    if labels is not None:
        if label is not None:
            raise ValueError('Cannot specify both "label" and "labels".')
        elif not labels:
            raise ValueError('"labels" must be non-empty or None.')

    columns = []
    headers = []
    for time_series in time_series_iterable:
        pandas_series = pandas.Series(
            data=[point.value for point in time_series.points],
            index=[point.end_time for point in time_series.points],
        )
        columns.append(pandas_series)
        headers.append(time_series.header())

    # Implement a smart default of using all available labels.
    if label is None and labels is None:
        resource_labels = set(itertools.chain.from_iterable(
            header.resource.labels for header in headers))
        metric_labels = set(itertools.chain.from_iterable(
            header.metric.labels for header in headers))
        labels = (['resource_type'] +
                  _sorted_resource_labels(resource_labels) +
                  sorted(metric_labels))

    # Assemble the columns into a DataFrame.
    dataframe = pandas.DataFrame.from_records(columns).T

    # Convert the timestamp strings into a DatetimeIndex.
    dataframe.index = pandas.to_datetime(dataframe.index)

    # Build a multi-level stack of column headers. Some labels may
    # be undefined for some time series.
    levels = []
    for key in labels or [label]:
        level = [header.labels.get(key, '') for header in headers]
        levels.append(level)

    # Build a column Index or MultiIndex. Do not include level names
    # in the column header if the user requested a single-level header
    # by specifying "label".
    dataframe.columns = pandas.MultiIndex.from_arrays(
        levels,
        names=labels or None)

    # Sort the rows just in case (since the API doesn't guarantee the
    # ordering), and sort the columns lexicographically.
    return dataframe.sort_index(axis=0).sort_index(axis=1)


def _sorted_resource_labels(labels):
    """Sort label names, putting well-known resource labels first."""
    head = [label for label in TOP_RESOURCE_LABELS if label in labels]
    tail = sorted(label for label in labels
                  if label not in TOP_RESOURCE_LABELS)
    return head + tail