Monitoring API (#1691)
* Add gcloud.monitoring with fetching of metric descriptors.

* Add fetching of resource descriptors.

* Add querying of time series.

* Add unit tests for everything but time series.

* Added some links to the documentation, and a comment.

* Dropped two nested generators.

* Require start time and end time to be datetimes rather than strings.

* Change how query intervals are specified.

1. Removed start_time from the query constructor/factory.
2. Added a select_interval() method for when you actually
   do have a start time in your hands, or for when you want
   to set the time interval on an existing query object.

* Change how one filters on the resource type.

Several related changes:

 - Removed resource_type from the constructor/factory parameters.
 - Added support for filtering on "resource_type" as a pseudo-label.
 - Renamed select_resource_labels() to select_resources() and
   select_metric_labels() to select_metrics().

* Add some pseudo-enums providing constants you can use if you wish.

* More unit tests, and a few related changes.

The signature of the reduce() method is now as follows:

    reduce(self, cross_series_reducer, *group_by_fields)
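A toy stand-in (not the real Query class) illustrating the shape of that signature, with one positional reducer followed by any number of group-by field names:

```python
class Query(object):
    """Toy stand-in for the real Query class, showing only the new
    reduce() signature: one positional reducer, then any number of
    group-by field names."""

    def __init__(self, reductions=None):
        self.reductions = reductions or []

    def reduce(self, cross_series_reducer, *group_by_fields):
        # Queries are immutable in spirit: return a new object.
        return Query(self.reductions +
                     [(cross_series_reducer, list(group_by_fields))])

query = Query().reduce('REDUCE_MEAN', 'resource.zone', 'resource.project_id')
```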

* Added a metric_type property.

I had dropped this, but I think we want it.

* The descriptor classes are no longer named tuples.

    MetricDescriptor
    ResourceDescriptor
    LabelDescriptor

* Move Query into a new query.py file.

* Isolated the pandas-dependent code in _dataframe.py.

* Add unit tests for _build_dataframe().

* Make the pseudo-enums importable from gcloud.monitoring.

* Skip tests of _build_dataframe() if pandas is not available.

* Fix lint errors.

* Add "NO COVER" pragmas due to not having pandas in the testenv:cover environment.

* Add a py27-pandas tox test environment.

* as_matrix() -> values

* Utilize the pseudo-enums in the docstrings and add a few more usage examples.

* Add some basic usage documentation.

* Fix some remaining references to the pseudo-enums in a docstring.

* Add prompts (">>>") to examples showing interactive usage.

* Add basic system tests.

* Expect that the service will never return an empty name, type, or key.

* Use print() in docstring examples.

* Add pandas to the intersphinx configuration.

* Alphabetical order.

* Change TOP_RESOURCE_LABELS from list to tuple.

* Rephrase a complex if-statement.

* p -> point

* Use DataFrame.from_records() to build from a list of rows.
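A minimal sketch of that pattern (illustrative data, not the commit's actual code path): build a frame from a list of records, then transpose so each record becomes a column.

```python
import pandas

# Illustrative data only; the commit builds one pandas.Series per time
# series and transposes so each record becomes a column.
records = [{'t0': 1.0, 't1': 2.0},
           {'t0': 3.0, 't1': 4.0}]
frame = pandas.DataFrame.from_records(records).T
```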

* Add a comment about the system test getting no time series data.

* Use Query.DEFAULT_METRIC_TYPE cross-reference in doc strings.

* Add some docstring text about NotFound exceptions.

* Rename "filter" parameter to "filter_" in private methods.

* Move status comments into module docstrings.

* Use :data: for docstring cross references to pseudo-enums.

* Change some empty lists to empty tuples.

* _NOW -> _UTCNOW

* Remove docstring text explaining about named tuples.

* Reformat a docstring example.

* _page_size -> page_size

* Introduce an _iter_fragments() helper method.

* Make _build_query_params() a generator.

* Use six.iteritems().

* Reorganize test data around DIMENSIONS instead of N and M.

* Restyle an import.

* Unpack points more elegantly.

* Remove pylint comment disabling redefined-builtin.

(It's disabled project-wide.)

* Cache the label map without having to disable pylint.

* Exercise the test harness fully by reading past the available responses.

* Change some more empty lists into empty tuples.

* filter/filter_ -> filter_string

* '%Y-%m-%dT%H:%M:%S.%fZ' -> _RFC3339_MICROS
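That format string renders a UTC datetime in RFC 3339 form with microseconds. A minimal sketch (the constant's value copied from the message above):

```python
from datetime import datetime

# Value copied from the commit message above; in the library it now
# lives behind the _RFC3339_MICROS constant.
_RFC3339_MICROS = '%Y-%m-%dT%H:%M:%S.%fZ'

stamp = datetime(2016, 4, 22, 12, 30, 0, 250000).strftime(_RFC3339_MICROS)
print(stamp)  # 2016-04-22T12:30:00.250000Z
```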

* Use parentheses instead of backslashes.

* Reorganize a hard-to-understand bit of code for building multi-level column headers.

* type -> type_
rimey authored and dhermes committed Apr 22, 2016
1 parent 18c92fe commit 7983938
Showing 31 changed files with 4,307 additions and 1 deletion.
Binary file added docs/.DS_Store
1 change: 1 addition & 0 deletions docs/conf.py
@@ -292,5 +292,6 @@
intersphinx_mapping = {
    'httplib2': ('http://bitworking.org/projects/httplib2/doc/html/', None),
    'oauth2client': ('http://oauth2client.readthedocs.org/en/latest/', None),
    'pandas': ('http://pandas.pydata.org/pandas-docs/stable/', None),
    'python': ('https://docs.python.org/', None),
}
13 changes: 13 additions & 0 deletions docs/index.rst
@@ -119,6 +119,19 @@
   logging-metric
   logging-sink

.. toctree::
   :maxdepth: 0
   :hidden:
   :caption: Cloud Monitoring

   monitoring-usage
   Client <monitoring-client>
   monitoring-metric
   monitoring-resource
   monitoring-query
   monitoring-timeseries
   monitoring-label

.. toctree::
   :maxdepth: 0
   :hidden:
16 changes: 16 additions & 0 deletions docs/monitoring-client.rst
@@ -0,0 +1,16 @@
Monitoring Client
=================

.. automodule:: gcloud.monitoring.client
   :members:
   :undoc-members:
   :show-inheritance:

Connection
~~~~~~~~~~

.. automodule:: gcloud.monitoring.connection
   :members:
   :undoc-members:
   :show-inheritance:

7 changes: 7 additions & 0 deletions docs/monitoring-label.rst
@@ -0,0 +1,7 @@
Label Descriptors
=================

.. automodule:: gcloud.monitoring.label
   :members:
   :undoc-members:
   :show-inheritance:
7 changes: 7 additions & 0 deletions docs/monitoring-metric.rst
@@ -0,0 +1,7 @@
Metric Descriptors
==================

.. automodule:: gcloud.monitoring.metric
   :members:
   :undoc-members:
   :show-inheritance:
7 changes: 7 additions & 0 deletions docs/monitoring-query.rst
@@ -0,0 +1,7 @@
Time Series Query
=================

.. automodule:: gcloud.monitoring.query
   :members:
   :undoc-members:
   :show-inheritance:
7 changes: 7 additions & 0 deletions docs/monitoring-resource.rst
@@ -0,0 +1,7 @@
Resource Descriptors
====================

.. automodule:: gcloud.monitoring.resource
   :members:
   :undoc-members:
   :show-inheritance:
7 changes: 7 additions & 0 deletions docs/monitoring-timeseries.rst
@@ -0,0 +1,7 @@
Time Series
===========

.. automodule:: gcloud.monitoring.timeseries
   :members:
   :undoc-members:
   :show-inheritance:
155 changes: 155 additions & 0 deletions docs/monitoring-usage.rst
@@ -0,0 +1,155 @@
Using the API
=============


Introduction
------------

With the Monitoring API, you can work with Stackdriver metric data
pertaining to monitored resources in Google Cloud Platform (GCP)
or elsewhere.

Essential concepts:

- Metric data is associated with a **monitored resource**. A monitored
  resource has a *resource type* and a set of *resource labels* —
  key-value pairs — that identify the particular resource.
- A **metric** further identifies the particular kind of data that
  is being collected. It has a *metric type* and a set of *metric
  labels* that, when combined with the resource labels, identify
  a particular time series.
- A **time series** is a collection of data points associated with
  points or intervals in time.
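A hypothetical sketch (invented names and values, not the library's data model) of how these pieces combine to identify one time series:

```python
# Hypothetical sketch: a time series is identified by its monitored
# resource and its metric, and carries the data points.
resource = {'type': 'gce_instance',
            'labels': {'zone': 'us-central1-a', 'instance_id': '1234'}}
metric = {'type': 'compute.googleapis.com/instance/cpu/utilization',
          'labels': {'instance_name': 'mycluster-a1'}}
points = [('2016-04-22T12:00:00Z', 0.17),
          ('2016-04-22T12:01:00Z', 0.21)]

# Two series are the same series only if all four components match.
identity = (resource['type'],
            tuple(sorted(resource['labels'].items())),
            metric['type'],
            tuple(sorted(metric['labels'].items())))
```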

Please refer to the documentation for the `Monitoring API`_ for
more information.

At present, this client library supports querying of time series,
metric descriptors, and resource descriptors.

.. _Monitoring API: https://cloud.google.com/monitoring/api/


The Monitoring Client Object
----------------------------

The monitoring client library generally makes its
functionality available as methods of the monitoring
:class:`~gcloud.monitoring.client.Client` class.
A :class:`~gcloud.monitoring.client.Client` instance holds
authentication credentials and the ID of the target project with
which the metric data of interest is associated. This project ID
will often refer to a `Stackdriver account`_ binding multiple
GCP projects and AWS accounts. It can also simply be the ID of
a monitored project.

Most often the authentication credentials will be determined
implicitly from your environment. See :doc:`gcloud-auth` for
more information.

It is thus typical to create a client object as follows::

   >>> from gcloud import monitoring
   >>> client = monitoring.Client(project='target-project')

If you are running in Google Compute Engine or Google App Engine,
the current project is the default target project. This default
can be further overridden with the :envvar:`GCLOUD_PROJECT`
environment variable. Using the default target project is
even easier::

   >>> client = monitoring.Client()

If necessary, you can pass in ``credentials`` and ``project`` explicitly::

   >>> client = monitoring.Client(project='target-project', credentials=...)

.. _Stackdriver account: https://cloud.google.com/monitoring/accounts/


Monitored Resource Descriptors
------------------------------

The available monitored resource types are defined by *monitored resource
descriptors*. You can fetch a list of these with the
:meth:`~gcloud.monitoring.client.Client.list_resource_descriptors` method::

   >>> for descriptor in client.list_resource_descriptors():
   ...     print(descriptor.type)

Each :class:`~gcloud.monitoring.resource.ResourceDescriptor`
has a type, a display name, a description, and a list of
:class:`~gcloud.monitoring.label.LabelDescriptor` instances.
See the documentation about `Monitored Resources`_
for more information.

.. _Monitored Resources:
   https://cloud.google.com/monitoring/api/v3/monitored-resources


Metric Descriptors
------------------

The available metric types are defined by *metric descriptors*.
They include `platform metrics`_, `agent metrics`_, and `custom metrics`_.
You can list all of these with the
:meth:`~gcloud.monitoring.client.Client.list_metric_descriptors` method::

   >>> for descriptor in client.list_metric_descriptors():
   ...     print(descriptor.type)

See :class:`~gcloud.monitoring.metric.MetricDescriptor` and the
`Metric Descriptors`_ API documentation for more information.

.. _platform metrics: https://cloud.google.com/monitoring/api/metrics
.. _agent metrics: https://cloud.google.com/monitoring/agent/
.. _custom metrics: https://cloud.google.com/monitoring/custom-metrics/
.. _Metric Descriptors:
   https://cloud.google.com/monitoring/api/ref_v3/rest/v3/\
   projects.metricDescriptors


Time Series Queries
-------------------

A time series includes a collection of data points and a set of
resource and metric label values.
See :class:`~gcloud.monitoring.timeseries.TimeSeries` and the
`Time Series`_ API documentation for more information.

While you can obtain time series objects by iterating over a
:class:`~gcloud.monitoring.query.Query` object, usually it is
more useful to retrieve time series data in the form of a
:class:`pandas.DataFrame`, where each column corresponds to a
single time series. For this, you must have :mod:`pandas` installed;
it is not a required dependency of ``gcloud-python``.

You can display CPU utilization across your GCE instances during
the last five minutes as follows::

   >>> METRIC = 'compute.googleapis.com/instance/cpu/utilization'
   >>> query = client.query(METRIC, minutes=5)
   >>> print(query.as_dataframe())

:class:`~gcloud.monitoring.query.Query` objects provide a variety of
methods for refining the query. You can request temporal alignment
and cross-series reduction, and you can filter by label values.
See the client :meth:`~gcloud.monitoring.client.Client.query` method
and the :class:`~gcloud.monitoring.query.Query` class for more
information.

For example, you can display CPU utilization during the last hour
across GCE instances with names beginning with ``"mycluster-"``,
averaged over five-minute intervals and aggregated per zone, as
follows::

   >>> from gcloud.monitoring import Aligner, Reducer
   >>> METRIC = 'compute.googleapis.com/instance/cpu/utilization'
   >>> query = (client.query(METRIC, hours=1)
   ...          .select_metrics(instance_name_prefix='mycluster-')
   ...          .align(Aligner.ALIGN_MEAN, minutes=5)
   ...          .reduce(Reducer.REDUCE_MEAN, 'resource.zone'))
   >>> print(query.as_dataframe())

.. _Time Series:
   https://cloud.google.com/monitoring/api/ref_v3/rest/v3/TimeSeries
34 changes: 34 additions & 0 deletions gcloud/monitoring/__init__.py
@@ -0,0 +1,34 @@
# Copyright 2016 Google Inc. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

"""Google Monitoring API wrapper."""

from gcloud.monitoring.client import Client
from gcloud.monitoring.connection import Connection
from gcloud.monitoring.label import LabelDescriptor
from gcloud.monitoring.label import LabelValueType
from gcloud.monitoring.metric import Metric
from gcloud.monitoring.metric import MetricDescriptor
from gcloud.monitoring.metric import MetricKind
from gcloud.monitoring.metric import ValueType
from gcloud.monitoring.query import Aligner
from gcloud.monitoring.query import Query
from gcloud.monitoring.query import Reducer
from gcloud.monitoring.resource import Resource
from gcloud.monitoring.resource import ResourceDescriptor
from gcloud.monitoring.timeseries import Point
from gcloud.monitoring.timeseries import TimeSeries


SCOPE = Connection.SCOPE
116 changes: 116 additions & 0 deletions gcloud/monitoring/_dataframe.py
@@ -0,0 +1,116 @@
# Copyright 2016 Google Inc. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

"""Time series as :mod:`pandas` dataframes."""

import itertools

TOP_RESOURCE_LABELS = (
    'project_id',
    'aws_account',
    'location',
    'region',
    'zone',
)


def _build_dataframe(time_series_iterable,
                     label=None, labels=None):  # pragma: NO COVER
    """Build a :mod:`pandas` dataframe out of time series.

    :type time_series_iterable:
        iterable over :class:`~gcloud.monitoring.timeseries.TimeSeries`
    :param time_series_iterable:
        An iterable (e.g., a query object) yielding time series.

    :type label: string or None
    :param label:
        The label name to use for the dataframe header. This can be the name
        of a resource label or metric label (e.g., ``"instance_name"``), or
        the string ``"resource_type"``.

    :type labels: list of strings, or None
    :param labels:
        A list or tuple of label names to use for the dataframe header.
        If more than one label name is provided, the resulting dataframe
        will have a multi-level column header.

        Specifying neither ``label`` nor ``labels`` results in a dataframe
        with a multi-level column header including the resource type and
        all available resource and metric labels.

        Specifying both ``label`` and ``labels`` is an error.

    :rtype: :class:`pandas.DataFrame`
    :returns: A dataframe where each column represents one time series.
    """
    import pandas  # pylint: disable=import-error

    if labels is not None:
        if label is not None:
            raise ValueError('Cannot specify both "label" and "labels".')
        elif not labels:
            raise ValueError('"labels" must be non-empty or None.')

    columns = []
    headers = []
    for time_series in time_series_iterable:
        pandas_series = pandas.Series(
            data=[point.value for point in time_series.points],
            index=[point.end_time for point in time_series.points],
        )
        columns.append(pandas_series)
        headers.append(time_series.header())

    # Implement a smart default of using all available labels.
    if label is None and labels is None:
        resource_labels = set(itertools.chain.from_iterable(
            header.resource.labels for header in headers))
        metric_labels = set(itertools.chain.from_iterable(
            header.metric.labels for header in headers))
        labels = (['resource_type'] +
                  _sorted_resource_labels(resource_labels) +
                  sorted(metric_labels))

    # Assemble the columns into a DataFrame.
    dataframe = pandas.DataFrame.from_records(columns).T

    # Convert the timestamp strings into a DatetimeIndex.
    dataframe.index = pandas.to_datetime(dataframe.index)

    # Build a multi-level stack of column headers. Some labels may
    # be undefined for some time series.
    levels = []
    for key in labels or [label]:
        level = [header.labels.get(key, '') for header in headers]
        levels.append(level)

    # Build a column Index or MultiIndex. Do not include level names
    # in the column header if the user requested a single-level header
    # by specifying "label".
    dataframe.columns = pandas.MultiIndex.from_arrays(
        levels,
        names=labels or None)

    # Sort the rows just in case (since the API doesn't guarantee the
    # ordering), and sort the columns lexicographically.
    return dataframe.sort_index(axis=0).sort_index(axis=1)


def _sorted_resource_labels(labels):
    """Sort label names, putting well-known resource labels first."""
    head = [label for label in TOP_RESOURCE_LABELS if label in labels]
    tail = sorted(label for label in labels
                  if label not in TOP_RESOURCE_LABELS)
    return head + tail