Add metric emitter lua module #2611
@fmejia97 please check the lint errors https://travis-ci.org/kubernetes/ingress-nginx/jobs/388896002#L721
a735193 to d8398b4
table.insert(_M.queue, jsonPayload)

local ok, err = ngx.timer.at(0, flush_queue)
What's the behavior of ngx.timer.at when nginx is receiving more than 1024 rps (the default limit for pending timers)?
https://github.com/openresty/lua-nginx-module#lua_max_pending_timers
It will return nil, together with an error (from the documentation you linked)
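To make the answer above concrete, here is a minimal sketch of checking the result of ngx.timer.at instead of discarding it. The names fake_ngx, limit and schedule_flush are hypothetical and exist only so the snippet runs under plain Lua outside OpenResty; the error string returned by the stub is an assumption, not something taken from this PR.

```lua
-- Stand-in for the OpenResty `ngx` API so the sketch runs under plain Lua.
-- Assumption: timer.at starts failing once `limit` timers are pending,
-- which imitates hitting lua_max_pending_timers.
local function fake_ngx(limit)
  local pending = 0
  local logged = {}
  local stub = {
    ERR = "error",
    log = function(level, ...)
      logged[#logged + 1] = table.concat({ ... })
    end,
    timer = {
      at = function(delay, callback)
        if pending >= limit then
          return nil, "too many pending timers"
        end
        pending = pending + 1
        return true
      end,
    },
  }
  return stub, logged
end

-- Hypothetical helper: schedule a flush and report failure instead of ignoring it.
local function schedule_flush(ngx, flush_queue)
  local ok, err = ngx.timer.at(0, flush_queue)
  if not ok then
    ngx.log(ngx.ERR, "failed to create timer: ", err)
    return false
  end
  return true
end

local ngx, logged = fake_ngx(1)
local first = schedule_flush(ngx, function() end)   -- fits under the limit
local second = schedule_flush(ngx, function() end)  -- rejected by the stub
```

In this sketch the second call fails and leaves one log entry, so a dropped payload would at least be visible in the error log.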
@aledbf Also worth mentioning: we use an nginx module to expose the number of active running timers. Possibly worth merging this upstream to https://github.com/openresty/lua-nginx-module? Wdyt?
Wdyt?
That sounds like a good idea 👍
@fmejia97 @andrewlouis93 why don't you forklift defer.lua from our fork and just do defer.to_timer_phase(<your function that sends the JSON payload>) in this middleware?
Just did :) @ElvinEfendi
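For readers without the fork at hand, the idea behind defer.to_timer_phase can be sketched roughly as follows. This is not the actual defer.lua: new_defer, the injected ngx table and the fake used at the bottom are assumptions made so the snippet runs under plain Lua. It only illustrates the two behaviours mentioned in this thread: run the callback immediately when already in the timer phase, otherwise queue it and share a single pending timer.

```lua
-- Hypothetical sketch, not the defer.lua from the fork.
local function new_defer(ngx)
  local queue = {}
  local timer_pending = false

  -- Runs every queued callback; meant to be invoked by the timer.
  local function flush_queue()
    timer_pending = false
    local batch = queue
    queue = {}
    for _, func in ipairs(batch) do
      func()
    end
  end

  local _M = {}

  function _M.to_timer_phase(func)
    if ngx.get_phase() == "timer" then
      func()
      return true
    end
    queue[#queue + 1] = func
    if not timer_pending then
      local ok, err = ngx.timer.at(0, flush_queue)
      if not ok then
        queue[#queue] = nil -- drop the callback that could not be scheduled
        return nil, err
      end
      timer_pending = true
    end
    return true
  end

  return _M
end

-- Fake `ngx` (assumption): records timers instead of firing them.
local scheduled = {}
local fake = {
  get_phase = function() return "log" end,
  timer = {
    at = function(delay, callback)
      scheduled[#scheduled + 1] = callback
      return true
    end,
  },
}

local defer = new_defer(fake)
local calls = 0
defer.to_timer_phase(function() calls = calls + 1 end)
defer.to_timer_phase(function() calls = calls + 1 end)
local timers_created = #scheduled -- both callbacks share one timer
scheduled[1]()                    -- simulate nginx firing the timer
```

Because the two callbacks share one timer in this sketch, a burst of requests would create far fewer pending timers than one ngx.timer.at call per request.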
d8398b4 to e08c118
Codecov Report
@@            Coverage Diff             @@
##           master    #2611      +/-   ##
==========================================
- Coverage   40.83%   40.69%   -0.14%
==========================================
  Files          75       75
  Lines        5123     5123
==========================================
- Hits         2092     2085       -7
- Misses       2750     2756       +6
- Partials      281      282       +1
Continue to review full report at Codecov.
end

function _M.call()
  -- Create JSON Metrics Payload --
no need to explain the obvious
upstreamStatus = ngx.var.upstream_status or "-",
namespace = ngx.var.namespace or "-",
ingress = ngx.var.ingress_name or "-",
service = ngx.var.service_name or "-",
Can you avoid aligning values with extra spacing like this? Later on, when we add a longer line, we will have to adjust all the others.
Where does this list come from? From VTS?
The stats on the list were given by @aledbf on the kubernetes slack channel. This does not yet cover all VTS metrics.
@fmejia97 feel free to add all the fields you think are required to match the VTS module. I only added the minimum needed to start the work with the UDP server.
local assert = assert

local _M = {
  queue = {}
queue can be local, no? Why does it have to be exported?
This is not needed anymore since we are going to use the defer module to delegate deferring callbacks (with queue optimization) to the timer phase.
@fmejia97 could you use this version of the module instead: https://github.com/Shopify/ingress/blob/master/rootfs/etc/nginx/lua/util/defer.lua?
This doesn't export things that don't need to be exported
What about |
local jsonPayload = cjson.encode({
  host = ngx.var.host or "-",
  status = ngx.var.status or "-",
  remoteAddr = ngx.var.realip_remote_addr or "-",
This is a bit confusing. ngx.var.realip_remote_addr is the original client IP, but remoteAddr indicates client IP (obtained from X-Forwarded-For if set). I'd say ngx.var.remote_addr is more valuable since it will be the true client IP.
Maybe have both?
Good point. I'll add both to avoid confusion.
bytesSent = tonumber(ngx.var.bytes_sent) or -1,
protocol = ngx.var.server_protocol or "-",
method = ngx.var.request_method or "-",
path = ngx.var.uri or "-",
Why not key it as uri instead of path, for better clarity and consistency with the rest of the mappings?
method = ngx.var.request_method or "-",
path = ngx.var.uri or "-",
requestLength = tonumber(ngx.var.request_length) or -1,
requestDuration = tonumber(ngx.var.request_time) or -1,
same as above, requestTime
path = ngx.var.uri or "-",
requestLength = tonumber(ngx.var.request_length) or -1,
requestDuration = tonumber(ngx.var.request_time) or -1,
upstreamName = ngx.var.upstream or "-",
ngx.var.upstream does not seem like a standard Nginx variable. Are we setting it ourselves somewhere?
Good catch, that was a mistake. It should've been ngx.var.proxy_upstream_name.
requestDuration = tonumber(ngx.var.request_time) or -1,
upstreamName = ngx.var.upstream or "-",
upstreamIP = ngx.var.upstream_addr or "-",
upstreamResponseTime = tonumber(ngx.var.upstream_response_time) or -1,
How is this gonna work? When would tonumber return nil?
It returns nil when it fails to convert its argument to a number.
nil or -1 evaluates to -1. The UDP listener expects some fields to be of type float64, and it will error when calling Unmarshal if the value of the field is not a number.
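The fallback discussed above can be tried in plain Lua. The helper name to_metric_number is hypothetical and the inputs are illustrative. The comma-separated case reflects nginx's documented behaviour of reporting several upstream times in one variable; how the module should treat that case is not settled in this thread.

```lua
-- Hypothetical helper wrapping the `tonumber(x) or -1` idiom from the diff.
local function to_metric_number(value)
  return tonumber(value) or -1
end

local parsed = to_metric_number("0.005")        -- numeric string converts
local zero = to_metric_number("0")              -- 0 is truthy in Lua, so it is kept
local dash = to_metric_number("-")              -- not a number, falls back to -1
local missing = to_metric_number(nil)           -- unset variable, falls back to -1
local multi = to_metric_number("0.001, 0.002")  -- several upstream values, falls back to -1
```

Every result is a number, which is what a float64 field on the receiving side needs; the cost is that -1 becomes a sentinel the consumer has to recognise.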
@ElvinEfendi @andrewlouis93 Addressed your comments 👍
b9e8f43 to 6c282df
@fmejia97 We're still not matching all the fields published by the VTS module. I think this is the last thing we need to do before this can be merged in.
Code looks good to me! Can you add unit test for
Regarding matching the VTS metrics: I have not checked that, so I'll wait for @andrewlouis93 to approve it. But we can always revisit that part as long as we have the basics.
  return true
end

return _M
Newline
@andrewlouis93 @aledbf @ElvinEfendi Added metrics for
Nice work @fmejia97, just some quick unit tests left now :)
Added unit tests for
571ee57 to a97baf9
get_phase = function() return "timer" end,
var = {}
}
_G.ngx = _ngx
redundant indentation
describe("Monitor", function()
  local monitor = require("monitor")
  describe("encode_nginx_stats()", function()
    ngx.var =
what does this do?
@ElvinEfendi encode_nginx_stats() creates a JSON-encoded structure using the current NGINX context stats (from ngx.var). In this test case, I'm setting a specific environment to test the method.
🤔 I see you are doing it below, but ngx.var = does not make sense.
Ahh woops 😄 didn't realize this. Removing 👍
describe("encode_nginx_stats()", function()
  ngx.var =
  it("successfuly encodes the current stats of nginx to JSON", function()
    local nginxEnvironment = {
please use underscore naming convention when writing Lua code
describe("Defer", function()
  describe("to_timer_phase", function()
    it("executes passed callback immediately if called on timer phase", function()
      assert.equal(true, defer.to_timer_phase(function() return 1 end))
how does this assert that the function was executed?
👍 Create a table here, and have the callback modify the table's value. E.g. initialize defer.call_count = 0, and then in the callback increment that value by 1.
Your assertion can check for that value having been increased.
The problem with this right now is that your test will pass regardless of whether or not the callback passed in to to_timer_phase is executed.
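The point above can be checked in plain Lua. Both defer tables below are hypothetical stand-ins rather than the module under review: one never runs its callback, the other runs it immediately. Only the call counter tells them apart, since both return true.

```lua
-- Hypothetical stand-ins for the module under test.
local broken_defer = {
  to_timer_phase = function(func) return true end, -- never calls func
}
local working_defer = {
  to_timer_phase = function(func)
    func()
    return true
  end,
}

-- Returns the function's result and how often the callback actually ran.
local function run_callback_test(defer)
  local call_count = 0
  local ok = defer.to_timer_phase(function()
    call_count = call_count + 1
  end)
  return ok, call_count
end

local ok_broken, count_broken = run_callback_test(broken_defer)
local ok_working, count_working = run_callback_test(working_defer)
```

An assertion on the return value alone passes for both stand-ins; asserting that the counter equals 1 fails for the broken one, which is the behaviour the test is meant to guard.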
Why are you putting the tests into defer/ and monitor/ folders?
  assert(s:close())
end

function _M.encode_nginx_stats()
If you test the public .call method with proper assertions, this will be covered as well and you won't need to export this method just for testing purposes.
You can export it only if you can justify that testing .call is unnecessarily complex.
@ElvinEfendi My initial approach was to test the .call method. Yesterday I spent a fair bit of the day trying to figure out how to make it work but couldn't find a clean, non-hacky way of testing it. This is mostly due to the nature of the module: 1) Generate a JSON struct from nginx context variables. 2) Create a timer with the send_data callback. 3) Send the JSON struct as a UDP message. The main logic of the module is building the JSON structure (it then uses the interface of the other modules, which are already tested). After pairing with Andrew we decided that this would be the cleanest way of going forward.
  service = "test-app",
}

assert.are.same(decodedJSONStats, expectedJSONStats)
this is some nice syntactic sugar 🍬
@fmejia97 @andrewlouis93 @ElvinEfendi what's missing to merge this?
🚀 🚀 🚀
@aledbf this looks good, let's ship this.
@fmejia97 please squash this into fewer commits and we are ready to go.
dc43bdf to 966e9f5
@aledbf Squashed commits ✅
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: aledbf, fmejia97
The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
What this PR does / why we need it:
This PR adds the metric_emitter Lua module that emits UDP messages with NGINX stats to the controller.
Which issue this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close that issue when PR gets merged): fixes #
Special notes for your reviewer: