Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Monitoring: Write Timeseries Values Usage Plan #1867

Closed
waprin opened this issue Jun 17, 2016 · 4 comments
Closed

Monitoring: Write Timeseries Values Usage Plan #1867

waprin opened this issue Jun 17, 2016 · 4 comments
Assignees
Labels
api: monitoring Issues related to the Cloud Monitoring API. priority: p2 Moderately-important priority. Fix may not be included in next release.

Comments

@waprin
Copy link
Contributor

waprin commented Jun 17, 2016

Currently the Monitoring client does not support writing timeseries values, though that's probably the canonical use case of the API.

@rimey is planning on adding it, I was hoping to add some samples sooner so he said I could start to spec it out, and whoever gets to it first will do the PR, but either way we should come to some consensus on the usage patterns first.

I know he was also concerned about #1732 since datetimes are pretty important in monitoring, but I'm not sure yet what's actionable. We should definitely minimize needs for user to play with timestamps.

Anyway high level overview of Monitoring API from my understanding

MetricDescriptor - This object describes a type of Metric, can be parametrized with LabelDescriptors
MonitoredResourceDescriptor - This object describes a type of thing to be monitored, can be parametrized with LabelDescriptors

Metric - A MetricDescriptor parametrized with labels
Resource - A ResourceDescriptor paramettrized with labels

TimeSeries - Metric + Resource + points. The points have intervals, although for GAUGE metrics (one of the MetricKind) that interval will have the same start and end time and usually be the current time. CUMULATIVE metrics will have a separate start and end time, with the end time usually being the current time. DELTA metrics can not be used for custom metrics, so I think can be ignored for the purpose of writing timeseries, because I think you only ever write custom metrics (non-custom metrics are the default ones written for you).

See the current usage docs for more context.

I think the most common use case will be to write a custom GAUGE metric for the current time.

from gcloud import monitoring
client = monitoring.Client() 

# Create a Metric by its MetricDescriptor type name, assign labels as a dict
# Possibly doesn't exist but we won't check yet, unless they reload()
gauge_metric = client.metric('custom.googleapis.com/my_metric', labels={'response_code': 404})

# Create a Resource by its resource name, assign labels as dict
resource = client.resource('gce_instance', labels={'instance_id': 'my-instance', 'zone': 'us-central1-f'})

# Create and save the timeseries resource
client.write_timeseries(metric=metric, resource=resource, points=[1,2,3]) # API call

# GAUGE metrics should only specify end_time if now is not correct, start_time will default to end_time
client.write_timeseries(metric=metric, resource=resource, points=[4,5], end_time=get_a_time())

cumulative_metric = client.metric('custom.googleapis.com/my_cumulative_metric')

# end time always defaults to now
cilent.write_timeseries(metric=cumulative_metric, resource=resource, points=[7, 8, 9], start_time=get_start_time()) # API call

I was originally thinking something like

# Abandoned pattern
# Add 3 time series points, all default to interval of start/end at now
timeseries = monitoring.TimeSeries(metric=metric, resource=resource)
timeseries.add_points([1, 2, 3])
# Write the 3 time series values
timeseries.save() # API call

The disadvantage is that when you want to add more points to the same Metric+Resource, you either have to create a new TimeSeries object anyway, or have strange semantics where you add points to an object you already created, even though really a new object is being created. So I think it's better if TimeSeries are immutable client-side and you just create them directly with the call on the client.

Another small point is the start_time and end_time defaults. The idea is that end_time always defaults to now, unless specified, and start_time always defaults to end_time, unless specified. This makes sense, though it might look slightly strange that sometimes we are only explicitly setting start_time (to indicate when a CUMULATIV metric starts) and other times only explicitly setting end_time (to change the entire interval for a GAUGE metric). I think it's ok though.

One of the big questions is how much client-side validation do we want to do. For example given a GAUGE metric descriptor, we know that startTime needs to equal endTime. However if we want to enforce this client-side, we have to pull down the metric descriptor first, introducing an extra API call. So it's probably best to just leave most of the validation server side.

Do we need an actual Label class or are dicts good enough? I think dicts are good enough.

cc / @sharbison3

@sharbison3
Copy link

If the gcloud-python library does not support custom metrics, then we
cannot use it in the code samples to be shipped before June 30. We must use
the older client libraries, and we'll try to include gcloud-python in the
Q3 Stackdriver GA release.

If we don't support writing timeseries data, then we don't support custom
metrics. The fact that you can create new metric descriptors is not
sufficient.

On Thu, Jun 16, 2016 at 8:48 PM Bill Prin notifications@github.com wrote:

Currently the Monitoring client does not support writing timeseries
values, though that's probably the canonical use case of the API.

@rimey https://github.com/rimey is planning on adding it, I was hoping
to add some samples sooner so he said I could start to spec it out, and
whoever gets to it first will do the PR, but either way we should come to
some consensus on the usage patterns first.

I know he was also concerned about #1732
#1732 since
datetimes are pretty important in monitoring, but I'm not sure yet what's
actionable. We should definitely minimize needs for user to play with
timestamps.

Anyway high level overview of Monitoring API from my understanding

MetricDescriptor - This object describes a type of Metric, can be
parametrized with LabelDescriptors
MonitoredResourceDescriptor - This object describes a type of thing to be
monitored, can be parametrized with LabelDescriptors

Metric - A MetricDescriptor parametrized with labels
Resource - A ResourceDescriptor paramettrized with labels

TimeSeries - Metric + Resource + points. The points have intervals,
although for GAUGE metrics (one of the MetricKind) that interval will have
the same start and end time and usually be the current time. CUMULATIVE
metrics will have a separate start and end time, with the end time usually
being the current time. DELTA metrics can not be used for custom metrics,
so I think can be ignored for the purpose of writing timeseries, because I
think you only ever write custom metrics (non-custom metrics are the
default ones written for you).

See the current usage docs
http://googlecloudplatform.github.io/gcloud-python/stable/monitoring-usage.html
for more context.

I think the most common use case will be to write a custom GAUGE metric
for the current time.

`
from gcloud import monitoring
client = monitoring.Client()
Create a Metric by its MetricDescriptor type name, assign labels as a dict Possibly
doesn't exist but we won't check yet, unless they reload()

gauge_metric = client.metric('custom.googleapis.com/my_metric',
labels={'response_code': 404})
Create a Resource by its resource name, assign labels as dict

resource = client.resource('gce_instance', labels={'instance_id':
'my-instance', 'zone': 'us-central1-f'})
Create and save the timeseries resource

client.write_timeseries(metric=metric, resource=resource, points=[1,2,3])

API call

GAUGE metrics should only specify end_time if now is not correct,
start_time will default to end_time

client.write_timeseries(metric=metric, resource=resource, [4,5],
end_time=get_a_time())

cumulative_metric = client.metric('
custom.googleapis.com/my_cumulative_metric')
end time always defaults to now

cilent.write_timeseries([7, 8, 9], start_time=get_start_time()) # API call
`

I was originally thinking something like

`
Abandoned pattern Add 3 time series points, all default to interval of
start/end at now

timeseries = monitoring.TimeSeries(metric=metric, resource=resource)
timeseries.add_points([1, 2, 3])
Write the 3 time series values

timeseries.save() # API call
`

The disadvantage is that when you want to add more points to the same
Metric+Resource, you either have to create a new TimeSeries object anyway,
or have strange semantics where you add points to an object you already
created, even though really a new object is being created. So I think it's
better if TimeSeries are immutable client-side and you just create them
directly with the call on the client.

Another small point is the start_time and end_time defaults. The idea is
that end_time always defaults to now, unless specified, and start_time
always defaults to end_time, unless specified. This makes sense, though it
might look slightly strange that sometimes we are only explicitly setting
start_time (to indicate when a CUMULATIV metric starts) and other times
only explicitly setting end_time (to change the entire interval for a GAUGE
metric). I think it's ok though.

One of the big questions is how much client-side validation do we want to
do. For example given a GAUGE metric descriptor, we know that startTime
needs to equal endTime. However if we want to enforce this client-side, we
have to pull down the metric descriptor first, introducing an extra API
call. So it's probably best to just leave most of the validation server
side.

Do we need an actual Label class or are dicts good enough? I think dicts
are good enough.

cc / @sharbison3 https://github.com/sharbison3


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#1867, or mute
the thread
https://github.com/notifications/unsubscribe/ALStW8Zj_bHVmdKDS3pR7escaltzu_XLks5qMe7UgaJpZM4I38VP
.

@waprin
Copy link
Contributor Author

waprin commented Jun 17, 2016

Sam, agreed we will stick with the Apiary (legacy? Google APIs client?) samples until this is ready. Besides the official docs, I am trying to add some sections to cloud.google.com/python, and all the tutorials there use gcloud to date. So it's not urgent since we will be using the older samples, but nice-to-have sooner rather than later.

@dhermes dhermes added the api: monitoring Issues related to the Cloud Monitoring API. label Jun 29, 2016
@rimey
Copy link
Contributor

rimey commented Jun 30, 2016

This convenience interface will be the most usual way of writing custom metrics. Defining it first is definitely the way to go. I think we'll also want to make some breaking changes to TimeSeries and related classes, but those can follow later.

The setup portion of the example above looks okay:

from gcloud import monitoring
client = monitoring.Client() 

# Create a Metric given a metric type name and a label dictionary.
# ??? Possibly doesn't exist but we won't check yet, unless they reload() ???
gauge_metric = client.metric('custom.googleapis.com/my_metric', labels={'response_code': 404})

# Create a Resource given a monitored resource type name and a label dictionary.
resource = client.resource('gce_instance', labels={'instance_id': 'my-instance', 'zone': 'us-central1-f'})

I don't think the second line of the first comment makes sense.

The metric and resource factory methods are well-aligned with gcloud-python style. I think it would be nice to be able to use keyword arguments to specify labels, but that can just as well be added later.

There are a few problems with the proposal for write_timeseries:

  • It doesn't make sense to specify multiple points.
  • We should provide for two different methods: a convenience method along the lines outlined, and also a lower-level method for writing multiple time series in one call.
  • The comment says "create and save the timeseries resource", and I'm not sure what that means. I think these calls should only invoke projects.timeSeries.create, which writes time series data, and not, for instance, projects.metricDescriptors.create.

I suggest the following:

# Write a point to a time series.
client.write_point(metric=metric, resource=resource, value=3.14) # API call

# Write data to several time series. The argument is a sequence of TimeSeries objects.
client.write_time_series([ts1, ts2, ..., tsN]) # API call

It's reasonable for write_point to take optional start_time and end_time arguments as proposed, with end_time defaulting to the current time. start_time can simply default to None, since the server defaults it to the value of end_time if it is not specified.

I would like to stick to the convention that these are specified as Python datetime objects, assumed to be UTC if naive.

Leaving most validation to the server is good. Definitely let's not make any extra calls.

@lukesneeringer lukesneeringer added the priority: p2 Moderately-important priority. Fix may not be included in next release. label Apr 19, 2017
@lukesneeringer
Copy link
Contributor

Hello,
One of the challenges of maintaining a large open source project is that sometimes, you can bite off more than you can chew. As the lead maintainer of google-cloud-python, I can definitely say that I have let the issues here pile up.

As part of trying to get things under control (as well as to empower us to provide better customer service in the future), I am declaring a "bankruptcy" of sorts on many of the old issues, especially those likely to have been addressed or made obsolete by more recent updates.

My goal is to close stale issues whose relevance or solution is no longer immediately evident, and which appear to be of lower importance. I believe in good faith that this is one of those issues, but I am scanning quickly and may occasionally be wrong. If this is an issue of high importance, please comment here and we will reconsider. If this is an issue whose solution is trivial, please consider providing a pull request.

Thank you!

parthea pushed a commit that referenced this issue Oct 21, 2023
…oogleCloudPlatform/python-docs-samples#1867)

* [Asset] Test: fix bucket clean up logic.
* [Asset] Add quickstart code for BatchGetAssetsHistory API.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api: monitoring Issues related to the Cloud Monitoring API. priority: p2 Moderately-important priority. Fix may not be included in next release.
Projects
None yet
Development

No branches or pull requests

5 participants