-
Notifications
You must be signed in to change notification settings - Fork 30k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
util: add computeTime option to inspect #19994
Conversation
4be6fce
to
bdeb7a0
Compare
|
doc/api/util.md
Outdated
* `computeTime` {number} Specifies the maximum time in milliseconds that | ||
`util.inspect()` may take to compute the output. If it exceeds that time, a | ||
`INSPECTION_TIMEOUT` warning is emitted. If the computation takes more than | ||
100 ms above that time, it will limit all objects to the absolut minimum |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/absolut/absolute/
If we really want to go this |
Comment addressed. @mscdex I thought about that as well but the implementation becomes more difficult in that case. Right now I can not think of a implementation with a single variable when using New CI https://ci.nodejs.org/job/node-test-pull-request/14237/ |
hrtime is definitely the better choice |
/ping @nodejs/util |
Here is a snippet to conver to hrtime to millis. |
Is making |
if we're so worried about the time of this that we're adding timeouts to it, perhaps we should start looking at what takes up all this time. IMO if something needs a time limit like this it shouldn't exist how it currently is. |
IMHO the best solution is to apply an algorithm that removes circular dependencies, such as https://github.com/davidmarkclements/fast-safe-stringify/blob/master/index.js#L18-L43. |
doc/api/util.md
Outdated
@@ -404,6 +407,11 @@ changes: | |||
This is useful to minimize the inspection output for large complicated | |||
objects. To make it recurse indefinitely pass `null` or `Infinity`. | |||
**Default:** `Infinity`. | |||
* `computeTime` {number} Specifies the maximum time in milliseconds that | |||
`util.inspect()` may take to compute the output. If it exceeds that time, a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: a `INSPECTION_TIMEOUT`
-> an `INSPECTION_TIMEOUT`
lib/util.js
Outdated
process.emitWarning('util.inspect takes very long.', { | ||
code: 'INSPECTION_TIMEOUT', | ||
detail: 'util.inspect() received an object that takes very long to ' + | ||
'compute the output for. If it takes more than 100 more ' + |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: more than 100 more
-> more than 100
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That has a different meaning though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe something like "If it takes 100 milliseconds more"? I am not sure, but more than 100 more
sounds a bit heavy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that is not unambiguous enough. I see your point of improving the wording though.
About the performance bottleneck: it is not about circular references. It is
I totally understand that this is not ideal but I doubt that blocking the event loop is any better. There were multiple cases before where people complained about a blocked event loop. That was also the reason for the |
I switched to process.hrtime, addressed the doc nit and increased the default timeout to 300 + 100 ms. New CI https://ci.nodejs.org/job/node-test-pull-request/14247/ |
Then the logical thing to do is to concoct an Strawman: start with
Why is that? Performance or compatibility? |
lib/util.js
Outdated
@@ -394,6 +404,23 @@ function formatValue(ctx, value, recurseTimes, ln) { | |||
return ctx.stylize('null', 'null'); | |||
} | |||
|
|||
const now = getNow(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't we just pass the previous process.hrtime()
value to process.hrtime()
to get a time difference and then compare that value instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but that requires a second property on the ctx
object and I would like to prevent that.
Partially yes, partially no. To improve the performance of I also already spoke with @bmeurer about the problem with
Both. The performance highly depends on what object it is. For e.g., arrays it is worse to use |
We could add something like It's not exactly trivial and I'd have to check if properties of dictionary mode objects can be counted in better than O(n) time but I'm reasonably sure that for everything else it can be done in amortized O(1). |
That is exactly what I asked @bmeurer for. |
Just a heads up about
|
Seems like there was a bug in |
To give a better idea here, this definitely is not about circular references. The issue that triggered this is a failure in npm update after @BridgeAR changed the default depth to infinity. npm uses massive in memory objects that cause inspect to choke. At a depth of 8, one npmlog call generates a single string that is 6.6 M and over 81k lines long. These are deep objects that do not contain a lot of keys but there are many many of them. They also happen to contain the full readme.md content of every dependency in a modules graph. The current inspect impl is depth first and concats strings recursively upwards using a variety of string concats, template strings, and array joins. For this one object that results in an absolutely massive amount of memory thrashing. |
No matter the two perf optimizations that I opened: I still believe that this is probably a good idea. The reason is that this way there is a guarantee that the event loop is not blocked for to long. It should probably only be a "last resort" thing, so I update the warning time to 1000 ms and the bail out to 1250 ms. |
8f01bef
to
6a078ac
Compare
I updated the code to use |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some comments:
-
There is no opt-in or even an opt-out, it's an always-on feature. The best you can do is set it to a real high value. At least make it accept
Infinity
as an opt-out but I think it should be opt-in. -
Calling
process.hrtime()
so frequently is going to result in performance regressions.
This PR essentially assumes that process.hrtime()
is free when in fact it can take 100+ microseconds on some systems. It's going to dominate with some workloads.
doc/api/util.md
Outdated
@@ -423,6 +426,11 @@ changes: | |||
`object`. This is useful to minimize the inspection output for large | |||
complicated objects. To make it recurse indefinitely pass `null` or | |||
`Infinity`. **Default:** `Infinity`. | |||
* `maxClockTime` {number} Specifies the maximum time in milliseconds that |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just maxTime
? I'm sure everyone will understand that to be wall clock time, i.e., human time.
maxClockTime
inevitably raises the question "what clock?"
lib/util.js
Outdated
@@ -86,7 +86,8 @@ const inspectDefaultOptions = Object.seal({ | |||
showProxy: false, | |||
maxArrayLength: 100, | |||
breakLength: 60, | |||
compact: true | |||
compact: true, | |||
clockTime: 1000 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you misspelled this... you're using inspectDefaultOptions.maxClockTime
further down at any rate.
lib/util.js
Outdated
compact: inspectDefaultOptions.compact | ||
compact: inspectDefaultOptions.compact, | ||
maxClockTime: inspectDefaultOptions.maxClockTime, | ||
startTime: process.hrtime() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you add a comma, the diff will be less noisy next time.
lib/util.js
Outdated
@@ -433,6 +443,22 @@ function formatValue(ctx, value, recurseTimes, ln) { | |||
return ctx.stylize('null', 'null'); | |||
} | |||
|
|||
if (getClockTime(ctx.startTime) > ctx.maxClockTime) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cache the result of getClockTime()
. process.hrtime()
is not necessarily cheap.
6a078ac
to
97d92fc
Compare
I completely rewrote the implementation to actually now rely on a "complexity" first. It is counted together by checking the inspected properties and by normalizing these in a few places (error inspection and complex keys take quite some time). After reaching a complexity of 1e5 it will check each further 1e5 if a second has passed. If that is the case, it will immediately limit all further inspection to the bare minimum and trigger a warning. Using this complexity should a) address the concerns about the overhead measuring the time and b) also address the concerns about a non deterministic return value. In my tests where my CPU was randomly on fire from different other sources it returned in about 70% the same entry and in all other cases just up to two other results. So the range in which the value might differ is now also limited to a absolut minimum and in most cases it should just always return the same value. All of this does not apply to small objects as they would not reach the complexity of 1e5 or take more than one second to compute. The current numbers came out to be reliable in a bunch of tests that I wrote specifically to test for this but that do not seem to be useful to be added as general tests (I checked e.g., very different objects and the time it took to compute those, computation with a high heap memory usage and other things). If requested, I can also open a new PR since the implementation differs significantly from the original approach. CI https://ci.nodejs.org/job/node-test-pull-request/14914/ |
This can also land as a semver-minor in case the default is first set to Infinity. That way everything would behave exactly as before. As semver-major it could then be set to 1e5. |
Looks like related CI failures on Windows and Linux. |
I fixed the CI issues by skipping the test on 32bit systems. The maximum string size can otherwise easily be reached in one second. |
doc/api/util.md
Outdated
@@ -360,6 +360,12 @@ stream.write('With ES6'); | |||
<!-- YAML | |||
added: v0.3.0 | |||
changes: | |||
- version: REPLACEME | |||
pr-url: https://github.com/nodejs/node/pull/REPLACEME | |||
description: The `maxClockTime` option is supported now. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this obsolete? I can't find this option in the doc.
doc/api/util.md
Outdated
description: The `maxClockTime` option is supported now. | ||
- version: REPLACEME | ||
pr-url: https://github.com/nodejs/node/pull/17907 | ||
description: The `depth` default changed to `Infinity`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this reverted? The doc still states 2
.
It would be nice to get some reviews here :) |
1e08b92
to
a8d6f6b
Compare
Rebased due to conflicts. New CI https://ci.nodejs.org/job/node-test-pull-request/15002/ @addaleax @mscdex @jasnell @mcollina @Trott @bnoordhuis @joyeecheung PTAL |
a8d6f6b
to
8dba214
Compare
Rebased due to conflicts. CI https://ci.nodejs.org/job/node-test-pull-request/15078/ This solves a big issue with util.inspect and it would be great if it could get any more attention. Shall I close this PR and reopen it? @nodejs/tsc PTAL |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
lib/util.js
Outdated
@@ -302,24 +303,29 @@ function inspect(value, opts) { | |||
maxArrayLength: inspectDefaultOptions.maxArrayLength, | |||
breakLength: inspectDefaultOptions.breakLength, | |||
indentationLvl: 0, | |||
compact: inspectDefaultOptions.compact | |||
compact: inspectDefaultOptions.compact, | |||
// If minComplexity is a negative value it will just immediately triggers |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo: immediately trigger
doc/api/util.md
Outdated
complexity while inspecting, the inspection is continued up to one more | ||
second. If the inspection does not complete in that second, it will limit | ||
all further inspection to the absolute minimum and an `INSPECTION_ABORTED` | ||
warning is emitted. **Default:** `1e5`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I’d prefer it if we could split this into one option that sets a complexity maximum and one that sets an overall time limit
And ideally both with value Infinity
so that we can split the semver-majorness out into a separate PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am fine with setting this to Infinity first and then to change the default.
However, I would really prefer to have a single option for this. We already have quite a few options and IMO setting this super fine grained does not really help. I guess we could already call this maxComplexity
instead of min
while keeping the functionality as is and just rewording it a bit.
We could also bind the extra time to the requested complexity as in timeout after Math.floor(opts.complexity / 100)
ms.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the rename to maxComplexity
is a good idea.
I do think we should always provide some way to limit the complexity while still getting absolutely deterministic results (i.e. no timer). This PR optimizes for the common case of Node.js usage, but for some applications determinism is a paramount property and we should not disable those use cases.
I agree that we already a larger number of options here than what would be ideal, but I think making these two options would be the best approach atm.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought again about the necessity for determinism: if an application requires that, it would very likely require the whole object to be inspected. Therefore it would require setting the value to Infinity
. The reason is that the application asked for the whole inspection, nothing else. Having a limited output due to a limitation in complexity on Node.js side would therefore always be an issue.
So I still believe we should keep the functionality as I implemented it right now. I also thought about overloading the input type to accept numbers as budget and strings with ms
at the end as a time budget but I withdraw that after trying to look at it again out of the perspective of the described application.
I think it’s fine to ping @nodejs/collaborators for a change like this :) |
lib/util.js
Outdated
@@ -76,7 +76,8 @@ const inspectDefaultOptions = Object.seal({ | |||
showProxy: false, | |||
maxArrayLength: 100, | |||
breakLength: 60, | |||
compact: true | |||
compact: true, | |||
minComplexity: 1e5 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a trailing comma? Ditto on line 309.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would like to keep this as is currently. I agree that it would be better to change this but it does not seem like others agree. So I would only want to change this if eslint/eslint#10350 gets implemented and we use that rule.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, fair enough.
I've been surreptitiously sneaking in commas though. Don't tell.
} | ||
if (ctx.time === undefined) { | ||
ctx.time = process.hrtime(); | ||
} else if (getClockTime(ctx.time) > 1e3) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this necessary? Leaving out wall time tracking results in a simpler algorithm.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The point is: without this it is hard to tell how much time this will actually need on different CPUs. Using the wall time is future proof by always delivering the best result with a best effort mechanism.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What I mean is, is tracking any kind of time necessary?
What you call complexity, I would call budget. You start with a fixed budget and spend it until it's gone.
Since you have that kind of logic in place already, and since it's deterministic and clock agnostic, why also track time?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Budget is indeed a good description. I am just not sure if this would be understood by looking at a option named budget
.
Relying on the budget completely without checking the wall time as well means I can measure the time it takes on my machine and estimate a arbitrary time limit that should not be exceeded in the default case (e.g. 1000-1250 ms as currently done). However, we also have Raspberry PIs and similar and their CPUs are much much slower and the same budget would get spend slower and therefore block the event loop longer. Our CPUs in general get faster over time with new inventions though, so the current default case for a fast machine would soon drop and the budget is spend faster. Meaning the maximum time the event loop would be allowed to be blocked would drop to e.g., 250ms instead of the current limit.
Using both together makes sure that it will always be around the 1000-1250ms range. That is the reason why I would like to have the combination of both, even though it is not fully deterministic after spending the complete budget.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a fair point. Have you considered splitting it out into two separate options, one for budget/complexity and one for wall time? Or are you saying that you feel a single option is best?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel a single option is best for two reasons:
- We already have to many options in util inspect and the more we add, the more complicated it gets for the user. I believe we should just have very solid defaults while allowing the user to also change some things that are definitely necessary in some cases.
- I do not see a necessity for having a fine grained control over this. The option is meant to prevent the event loop from being blocked to long while being able to also opt out of that or to stricten the rule further (by setting the budget to <= 0 even though the minimum time is still set to 1 second). The question is: is it necessary for the user to allow util.inspect to bail out before one second passed? I personally doubt that.
'INSPECTION_ABORTED' | ||
); | ||
|
||
util.inspect(obj, { depth: Infinity }); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we add tests that test the limits here?
I think we need a test for the complexity parameter to make sure it works
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is tested implicitly here. Without the option being set the test would crash.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@BridgeAR I realize, but for example if we break the code that parses the option then tests wouldn't catch it - so if it's not a lot of trouble I think adding a second test that sets the configuration explicitly is a good idea.
Definitely not blocking on this though.
To limit the maximum computation time this adds a `minComplexity` option. Up to that complexity any object will be fully inspected. As soon as that limit is reached the time is going to be measured for the further inspection and the inspection is limited to the absolute minimum after one second has passed. That makes sure the event loop is not blocked for to long while still producing a good output in almost all cases. To make sure the output is almost deterministic even though a timeout is used, it will only measure the time each 1e5 complexity units. This also limits the performance overhead to the minimum.
8dba214
to
2de779a
Compare
Closing in favor of #22756 |
This adds a limit to the maximum computation time while using
util.inspect()
. That makes sure the event loop is not blocked forto long even though still having a good output in pretty much all
cases.
Fixes: #19405
Checklist
make -j4 test
(UNIX), orvcbuild test
(Windows) passes