-
Notifications
You must be signed in to change notification settings - Fork 543
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[watermarkorch] only perform periodic clear if the polling is on #781
Conversation
orchagent/watermarkorch.cpp
Outdated
m_telemetryTimer->start(); | ||
m_isTimerEnabled = true; | ||
} | ||
else if (i.first == "FLEX_COUNTER_STATUS" && i.second == "disable" && m_isTimerEnabled == true) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If "FLEX_COUNTER_STATUS" of the first processed key, say PG_WATERMARK, is "enable", while the second "QUEUE_WATERMARK" is "disable", the telemetryTimer will not start.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mykolaf can you fix the logic?
orchagent/watermarkorch.cpp
Outdated
m_telemetryTimer->start(); | ||
m_isTimerEnabled = true; | ||
} | ||
else if (i.first == "FLEX_COUNTER_STATUS" && i.second == "disable" && m_isTimerEnabled == true) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mykolaf can you fix the logic?
934d354
to
5a119d2
Compare
private: | ||
bool m_isTimerEnabled = false; | ||
bool m_qWmEnabled = false; | ||
bool m_pgWmEnabled = false; | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of defining one Boolean variable for each watermark category, we may choose to use only one uint8_t variable, with each category being a bit set in it, up to 8 categories.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now status is stored as a uint8_t
if (i.first == "interval") | ||
{ | ||
auto intervT = timespec { .tv_sec = to_uint<uint32_t>(i.second.c_str()) , .tv_nsec = 0 }; | ||
m_telemetryTimer->setInterval(intervT); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just checked the implementation of setInterval(), https://github.com/Azure/sonic-swss-common/blob/master/common/selectabletimer.cpp#L63. It only changes the value of the itimerspec struct, not sync the value to the timer. If the timer is not armed, it will not start the timer. This is good as we may not want to start the timer, but only want to change the interval. If the timer is armed, we will need a call to timerfd_settime(), which is called by SelectableTimer::start() and SelectableTimer::reset(), to sync the actual change to the timer. This means the change will take effect at a later time when the current timer expires and the SelectableTimer doTask handler is called. If we do not care about the interval to take immediate effect. This is also fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The timer is expected to be changed after current interval has ended. This is stated in the design.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, let's follow the design for now. Then, we need a call to sync the telemetry interval change to the actual timer in the doTask(SelectableTimer &timer) if we see there is a change in the telemetry interval
* if it is disabled we will not restart the timer at the end of current interval */ | ||
if (isTimerEnabled()) | ||
{ | ||
m_telemetryTimer->reset(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The struct timerspec it_interval field of timerfd_settime(), when calling SelectableTimer::start(), is not zero, so the timer will re-arm automatically. If telemetry is enabled, we do not need to call SelectableTimer::reset() in the SelectableTimer doTask handler, because the timer will automatically re-arm. Your comment on line 210 looks wrong. Instead, we need to deal with the opposite case---if the telemetry is disabled, we need to explicitly call SelectableTimer::stop() (, which is made in the handleFcConfigUpdate?). Have you tested your change?
} | ||
if (isTimerEnabled()) | ||
{ | ||
m_telemetryTimer->start(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the telemetry timer has been started, a call to SelectableTimer::start() will reset the timer settings. This resets the current time to expiration, causing the intervals to be uneven that the interval length, where an enable cli issued, are different from others. We should refrain from calling SelectableTimer::start() if there is no change to the timer interval value itself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added logic to avoid uneven intervals.
@@ -170,16 +207,18 @@ void WatermarkOrch::doTask(SelectableTimer &timer) | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For line 198 to 206, how can you be so sure that periodic timer kicks in first before user issues the clear watermark through cli? That m_pg_ids, m_unicast_queue_ids, and m_multicast_queue_ids is already populated for the latter? Can you give your rationale behind?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idea of introducing m_pg_ids, m_unicast_queue_ids, and m_multicast_queue_ids , I guess, is to have a cached copy of the oids at hand to remove the need of further queries into obtaining these oids for clear command. The only opportunity/condition to update this cached copy is when a watermark clear command is issued by timer and the doTask() handler finds that the vector is empty. So we assume here that when this update opportunity comes, the pg, uni_queue, and multi_queue oids are generated complete and will not be subject to any further changes.
The sources of the oids comes from the COUNTERS_DB *MAP table, which is set when FlexCounterOrch receives any signal on "FLEX_COUNTER_STATUS" field to call the generate*Map() function. This signal in the current orchagent initialization sequence, I guess, comes for granted after the port initialization is done, with all pg and queue oids being settled. So we will get a list of pg and queue oids complete and ready at the time the cached oid copy still hits the emptiness condition.
Also, to introduce cached copy, we need to evaluate the risk of running out-of-sync to the original source. I think it should be fine for pg and queue oids as they remain unchanged upon creation during the life cycle of SONiC running
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The main idea of having m_pg_ids, m_unicast_queue_ids, m_multicast_queue_ids is to avoid querying them every time, performing deserialization of oids. This things are not expected to change.
This question is not on the PR per se. There are two sources that will set a TABLE in COUNTERS_DB, one from user/timer clear command and the other from the flex counter thread lus script execution. In this case, does redis provide a lock mechanism internally to ensure mutual exclusion on writing to a table? Also @qiluo-msft for insights |
@wendani if the Redis written by single transactions, they are mutual exclusive. |
Signed-off-by: Mykola Faryma <mykolaf@mellanox.com>
Signed-off-by: Mykola Faryma <mykolaf@mellanox.com>
Signed-off-by: Mykola Faryma <mykolaf@mellanox.com>
if (i.first == "interval") | ||
{ | ||
auto intervT = timespec { .tv_sec = to_uint<uint32_t>(i.second.c_str()) , .tv_nsec = 0 }; | ||
m_telemetryTimer->setInterval(intervT); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, let's follow the design for now. Then, we need a call to sync the telemetry interval change to the actual timer in the doTask(SelectableTimer &timer) if we see there is a change in the telemetry interval
Synced offline with Qi. A single transaction is in terms of a hget call or a hset call. The watermark lua script (b750a4b#diff-f8446e5c7632fc3a23e06c9f9419828c) updates a value in the typical 'read-modify-write' fashion as hget-compare-hset. A user or a timer clear, either directly issued through CLI or delegated to WatermarkOrch via the APPL_DB event channel, runs in a different context from the syncd flex counter thread. If the user clears the value in between the hget and hset, we can have a race condition situation. One thing we can do in the lua script is to have a value's hset call follow as close as possible to its hget call, rather than hgetting all three values and then hsetting all three values if any update is needed. This may helping reduce the time interval between hget and hset, at least not introducing any more delay between the two operations, and alleviate the problem to some extent. |
@wendani I can enhance the lua script in separate PR. |
Signed-off-by: Mykola Faryma <mykolaf@mellanox.com>
@wendani the changes in the Lux script should go with the PR or a seperate PR? anything else from this PR is missing? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approve on the condition that it is verified on mlnx dut before merge:
- telemetry timer is auto rearmed
- change of telemetry interval
- enable/disable watermark polling, telemetry timer is started/stopped
@wendani Just doublechecked, updated the PR message. Ready to merge |
* [watermarkorch] only perform periodic clear if the polling is on Signed-off-by: Mykola Faryma <mykolaf@mellanox.com> * [watermarkorch] only set timer interval on change Signed-off-by: Mykola Faryma <mykolaf@mellanox.com> * fix logic Signed-off-by: Mykola Faryma <mykolaf@mellanox.com> * fix comments Signed-off-by: Mykola Faryma <mykolaf@mellanox.com> * fix change timer Signed-off-by: Mykola Faryma <mykolaf@mellanox.com>
Signed-off-by: Mykola Faryma mykolaf@mellanox.com
No need to periodically clear watermarks if the polling is disabled.
What I did
Changes:
Why I did it
How I verified it
Details if related