-
Notifications
You must be signed in to change notification settings - Fork 5.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
386fix #4076
386fix #4076
Conversation
…re out how to separate a 32bit build from a 64 bit build. Enhancement: Allow collecting of perfmon data from a remote node Enhancement: Allow collecting data using the timestamp from the source and don't re-timestamp
Fix: 32 and 64 bit versions split out into separate files
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @srclosson for taking the time to look at this code, getting these types right is quite tricky. Below are some comments, but also make sure take a look at the unit test failures on appveyor, they run on a 64-bit Windows system.
if ret == ERROR_SUCCESS { | ||
ret = PdhGetFormattedCounterArrayDouble(metric.counterHandle, &bufSize, | ||
&bufCount, &emptyBuf[0]) // uses null ptr here according to MSDN. | ||
ret = PdhGetFormattedCounterArrayDouble(metric.counterHandle, &bufSize, &bufCount, &emptyBuf[0]) // uses null ptr here according to MSDN. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Revert this line since no change and had better wrapping, we try to wrap at about 80chars.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Roger doger
if ret == PDH_MORE_DATA { | ||
filledBuf := make([]PDH_FMT_COUNTERVALUE_ITEM_DOUBLE, bufCount*size) | ||
if len(filledBuf) == 0 { | ||
continue | ||
} | ||
ret = PdhGetFormattedCounterArrayDouble(metric.counterHandle, | ||
&bufSize, &bufCount, &filledBuf[0]) | ||
ret = PdhGetFormattedCounterArrayDouble(metric.counterHandle, &bufSize, &bufCount, &filledBuf[0]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Revert this line
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reverted
@@ -90,6 +97,7 @@ type item struct { | |||
counter string | |||
instance string | |||
measurement string | |||
nodename string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we call this computer
to match the PDH documentation: https://msdn.microsoft.com/en-us/library/windows/desktop/aa373193(v=vs.85).aspx
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure. Good idea.
|
||
if instance == "------" { | ||
query = "\\" + objectname + "\\" + counter | ||
query = "\\\\" + nodename + "\\" + objectname + "\\" + counter |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Based on the documentation this doesn't seem like a valid path if there is no computer name, you would get:
\\\Process(*)\% Processor Time
Can you verify if this works?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can tell you now. The node name is required to be specified. I personally like this, because the measurement gets tagged with the node name, which is now different from the collecting host, but it's not backwards compatible.
Let me check a different idea in which would be backwards compatible.
[[inputs.win_perf_counters.object]] | ||
# Processor usage, alternative to native, reports on a per core. | ||
# Processor usage, alternative to native, reports on a per core. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use 2 space indention in the sample config
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm, I must have used a tab when I meant a space.
@@ -38,6 +38,12 @@ It is recommended NOT to use this on OSes starting with Vista and newer because | |||
Example for Windows Server 2003, this would be set to true: | |||
`PreVistaSupport=true` | |||
|
|||
#### UseWinTimestamps |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe a more descriptive name would be UsePerfCounterTime
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup. I like it. Changed
} | ||
|
||
func PdhUseWinTimestamps() { | ||
pdh_CollectQueryDataWithTime = libpdhDll.MustFindProc("PdhCollectQueryDataWithTime") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's do like above on AddEnglishCounterW
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I learn something new everyday. Much cleaner. Thank-you
pdh_CollectQueryDataWithTime = libpdhDll.MustFindProc("PdhCollectQueryDataWithTime") | ||
} | ||
|
||
func PdhConnectMachine(szMachineName string) uint32 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't seem to find where this is called from, is it used?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, and I neglected to write a comment header stating this. In experimenting I had added the call, but ended up not requiring it (it was actually orders of magnitude slower doing the connect call and then querying for the counters. I've noted this in a comment header.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you instead remove all the calls that are not used?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For sure. I can. I just seems a shame, the call works, and may be useful in the future. Perhaps there is an alternative place to put it? I'll remove it for now. At least it will be in my git history if I need to access it again.
// } | ||
// fmt.Println("Converted bytes: ", n) | ||
// | ||
func WideCharToMultiByte(codePage uint32, dwFlags uint32, wchar *uint16, nwchar int32, str *byte, nstr int32) (nwrite int32, err error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where is this called from?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not. It was one of those things I used to try and fix the win32 bug without padding the structures. The call itself works good and I thought it might be useful in the future. I've noted this in a comment, but I can remove it as well.
} | ||
|
||
var ( | ||
// Library | ||
libpdhDll *syscall.DLL | ||
libkrnDll *syscall.DLL |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you move all the kernel32 code to a new file: kernel32.go
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea. Done
Renamed nodename to computer moved out kernel32 calls to their own library Added tests
…g with "computer" if computer was not specified. (Backwards compatibility)
@@ -50,6 +56,16 @@ beneath the main win_perf_counters entry, `[[inputs.win_perf_counters]]`. | |||
|
|||
Following this is 3 required key/value pairs and the three optional parameters and their usage. | |||
|
|||
#### Computer | |||
**Required** |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mark this as optional
- Adding a computer name is optional - Removed unused code - Fixed another few instances of uint32 with [4]byte
Can this be re-reviewed? |
Can you take another pass and remove all code that is no actively being used right now. If we need to bring back in this code we can grab it either from your tree or from https://github.com/lxn/win (who authored most of the Windows code). Also, it looks like #4036 conflicted with the changes here so we will need to sort that out. |
closes #1899 |
closes #2468 |
Are you in progress now? |
func UTF16PtrToString(s *uint16) string { | ||
if s == nil { | ||
return "" | ||
} | ||
return syscall.UTF16ToString((*[1 << 29]uint16)(unsafe.Pointer(s))[0:]) | ||
|
||
return syscall.UTF16ToString((*[20]uint16)(unsafe.Pointer(s))[:]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMHO, this is not right. You have to cast pointer to large enough array, otherwise you could get out of bounds.
There is 100% probability that strings with more than 20 characters go though this function .
Keeping original will not harm 32-bit support, as array size is type int, which is at least 32 bits, so [1<<29] is OK
@srclosson I believe the 32-bit path is fixed by #4189, but I believe you will need to base the remote computer and counter times fixes on the new code. I'm going to close this pull request since I think it will be easier to open a new one. |
@danielnelson, #4189 didn't contain fix for 386 issue. It doesn't work in 32bit build, I tested it.
Maybe it would be more convenient to keep 386 support separated from other features (collecting from remote computers and adding timestamp to metrics) |
Okay... So what's the status here? @danielnelson would you comment on how it's best to proceed? |
I'm not getting the crash anymore when I compile with GOARCH=386 and run on a 64-bit system, without #4189 it crashes immediately. That said I haven't tested anything more than running @srclosson Depends on how much time you have available, I'm hoping to release 1.7.0-rc1 at the end of next week, and it would be great if we could have the 32-bit fix in. If your schedule is full until then it might be best for @vlastahajek to adapt your work so we can get it in for 1.7. Ideally we would have separate pull requests for 32-bit fix and the new features. |
@danielnelson: Yes, it's not crashing, at least for several tens minutes, but it writes invalid values for all the stuff, measurement names, fieldnames, fieldvalues. As discussed in the original issue, there is problem with aligning data within structures with different field types between Go and Wind 32 API. Crash before occurred fast because there were, IMHO, wrong allocation of the buffer for PDH_FMT_COUNTERVALUE_ITEM_DOUBLE array. I've prepared separated fix for 386problem, cause I primarily needed to fix my code for 32bit support. @srclosson: Now it depends on you. Did you have a time to look at changes? I did quite a large refactor. |
@vlastahajek Can you go ahead and open a PR for the 32-bit fix? |
Ok, #4206 |
I merged @srclosson's branch with master. The question is how to push it. @srclosson, As those features are your work, I would prefer if you could add me as a contributor to your forked telegraf repo and will commit it there. We can then continue with this PR (rename it) and with comments. My merges is purely melding my refactor and fixes with @srclosson's enhacements, except one change, that I removed the I would also propose a few improvements:
Both should be easy to implement. Multiple remote host can be easily added as stacking performance counters for multiple hosts, although solution using more workers would be prettier. |
@vlastahajek In the meantime if you have time to prepare a branch with your changes that would be nice. I think the RemoteHost change is a good one, but for multiple hosts doesn't it work with a mix of local and remote hosts?
|
Hi, My config looks like [[inputs.win_perf_counters.object]] So it's work with mixed set of hosts |
@danielnelson, I already have it in branch in my fork: https://github.com/bonitoo-io/telegraf/tree/386fix. It's merged imported original branch of this PR
Yes, this should work as a workaround.
|
@vlastahajek My current fork: https://github.com/DmitryMaletin/telegraf/blob/dma/plugins/inputs/win_perf_counters/win_perf_counters.go |
Host per counter object is as it is now. I would vote the same as you, having global remote host(s) for all counters |
Would plugin level be reasonable? Remember you can always have multiple plugins defined. Obviously not as fine grained as per object but in many cases this will be easier since you don't have to continually redefine the same remote hosts:
|
yes. for me it sounds reasonable. |
So, how to proceed further? @srclosson, Are going to look at it? |
@danielnelson/@vlastahajek, I can certianly look at it. @danielnelson, could we get on the phone for 30 minutes? I've got a lot of stuff to contribute, and I think it would make sense to bring you up to speed on how to best go about merging some of it. There is a lot... Everything from a GPS Nmea plugin, to generic alarm parsing, windows event logs, webdav/sftp data transfer etc... I just don't want to hold up the line and would like to just show you through the ones I've developed, get your feedback, and some ideas on how I could/should go about merging them... |
Another idea I had for the remote nodes is to do something like, (pseudocode) for _, entry := range PerfCounterItems { This way, all of the requests get sent off at the same time. Probably more performant than doing each one individually... |
Yes, this would be a good thing, ideally network operations are done concurrently with a connection limit and with timeouts. Support for cancellation using a Context is also a nice touch but Telegraf doesn't have a way to do this yet. |
Required for all PRs: