Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enhancements to Prometheus exporter #609

Merged
merged 12 commits into from
Jun 21, 2021

Conversation

NajmudheenCT
Copy link
Member

@NajmudheenCT NajmudheenCT commented Jun 15, 2021

What this PR does / why we need it:

Special notes for your reviewer:
Change set from PR #595 and #588 are included in this
Release note:

Test Report
Prometheus graph displays all metrics for one port with properties
{resource_type="port", resource_id="port0", storage_sn="6013136b-f5ee-4acf-9146-d20e927f090b" }
Screenshot from 2021-06-15 20-05-36


for timestamp, value in values.items():
f.write("storage_%s{%s} %f %d\n" % (metric, labels,
value, timestamp))
timestamp += self.timestamp_offset_ms
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Timestamp update can be moved out of _write_to_prometheus_format() function.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

values dictionary contains multiple timestamps.. each timestamp is distinct .. so adding offset has to be inside this loop


class PrometheusExporter(object):

def __init__(self):
self.timestamp_offset_ms = self.set_timestamp_offset_from_utc_ms()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using timestamp offset may result in incorrect results in some cases. (Eg. Day Light Saving changes in some regions)

Copy link
Member Author

@NajmudheenCT NajmudheenCT Jun 16, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prometheus uses Unix time internally to persist metrics. for display Prometheus support timezone in display layer .. https://prometheus.io/docs/introduction/faq/

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After initializing PrometheusExporter class, if day light saving changes, offset become wrong.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm right.. Actually Prometheus do not suggest timestamp from the exporter. This could be one reason. But on our case we have to pass the timestamp as we export historic values. Could not see any way to determine daylight saving. We will have to document that behavior.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i remember python time libraray has functions to determine if DST

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we do not cache offset, and get it just before export, it will be wrong just one time when day light savings change

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.. we get timezone info before every export now

@codecov
Copy link

codecov bot commented Jun 16, 2021

Codecov Report

Merging #609 (28378d7) into master (ee87076) will decrease coverage by 0.06%.
The diff coverage is 39.39%.

@@            Coverage Diff             @@
##           master     #609      +/-   ##
==========================================
- Coverage   71.95%   71.88%   -0.07%     
==========================================
  Files         141      141              
  Lines       12492    12562      +70     
  Branches     1493     1499       +6     
==========================================
+ Hits         8988     9030      +42     
- Misses       2989     3017      +28     
  Partials      515      515              
Impacted Files Coverage Δ
delfin/exporter/prometheus/exporter_server.py 0.00% <0.00%> (ø)
delfin/exporter/prometheus/prometheus.py 26.08% <24.07%> (-5.92%) ⬇️
delfin/drivers/fake_storage/__init__.py 95.14% <100.00%> (+8.64%) ⬆️
delfin/drivers/api.py 88.09% <0.00%> (+2.38%) ⬆️

cfg.StrOpt('metrics_cache_file', default='/var/lib/delfin/delfin_exporter'
'.txt',
help='The temp cache file used for persisting metrics'),
cfg.StrOpt('metrics_dir', default='/var/lib/delfin/metrics',
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can the dir name be CONSTANT driven?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 including folder and file names could be that way

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, Thanks

for file in glob.glob("*.prom"):
file_list.append(file)
data = ''
for file in file_list:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this loop can be merged into loop at 49, right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done


class PrometheusExporter(object):

def __init__(self):
self.timestamp_offset_ms = self.set_timestamp_offset_from_utc_ms()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i remember python time libraray has functions to determine if DST

@@ -76,7 +76,7 @@
'maxLength': 255}
},
},
'writeRequests': {
'writeIops': {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is these metric names are common metric names? I see it is changed now. What is the possibility of further change? Can we use any key mapping to keep the code more maintainable?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

applicable for all

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes These are common metric names, based on our analysis on different platforms we derived common names now. hence to align we have changed this metric names. This is schema would be validated against driver capabilities.. drivers would use their mapping to map storage metric names with delfin metric names

for i in range(MIN_PERF_VALUES, MAX_PERF_VALUES):
timestamp = int(float(datetime.datetime.now().timestamp()
) * 1000)
timestamp = start_time
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it system time?!/regional specific?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how is this used?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is timestamp equivalent to UTC. since we support historic collection , this is the fake driver side emulation to support collection from provided start_time to end_time

cfg.StrOpt('metrics_cache_file', default='/var/lib/delfin/delfin_exporter'
'.txt',
help='The temp cache file used for persisting metrics'),
cfg.StrOpt('metrics_dir', default='/var/lib/delfin/metrics',
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 including folder and file names could be that way


grp = cfg.OptGroup('PROMETHEUS_EXPORTER')
METRICS_CACHE_DIR = '/var/lib/delfin/metrics'
Copy link
Collaborator

@sushanthakumar sushanthakumar Jun 18, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be defined seperately in both exporter server and promethus files? Currenlty in it is in 2 places

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes because both code run on different processes .


def push_to_prometheus(self, storage_metrics):
with open(cfg.CONF.PROMETHEUS_EXPORTER.metrics_cache_file, "a+") as f:
self.timestamp_offset_ms = self.set_timestamp_offset_from_utc_ms()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In error case, set_timestamp_offset_from_utc_ms returns 0, will that be fine from promethues side?if not we can handle this and next statements as errors and return

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we return error this collection will be lost.. if we set 0 atleast there is a chance that prometheus wil receive it and accepets/rejects based on its timezone and rejection time window..

Copy link
Collaborator

@kumarashit kumarashit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@sushanthakumar
Copy link
Collaborator

LGTM

Copy link
Collaborator

@joseph-v joseph-v left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Collaborator

@skdwriting skdwriting left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[v1.2.2 Testing] Gaps in Prometheus and Grafana graphs when Prometheus exporter is enabled
5 participants