Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

error() should flush stdout before emitting message #258

Closed
pbelbin opened this issue Jun 1, 2020 · 9 comments · Fixed by #261
Closed

error() should flush stdout before emitting message #258

pbelbin opened this issue Jun 1, 2020 · 9 comments · Fixed by #261
Assignees

Comments

@pbelbin
Copy link
Collaborator

pbelbin commented Jun 1, 2020

It appears that there are circumstances where the waterfall-cli.error(), which outputs to stderr and then terminates execution, may actually emit it's output before stdout has been emptied.

From a client perspective, putting the following code before calling waterfall-cli.error() appears to allow stdout to be completed before the error message is output (from a function with 'async' declared):

// Allow stdout to flush before proceeding
await new Promise<void>((resolve) => {
	process.stdout.write('', () => {
		resolve();
	});
});

Instead of putting this flushing of stdout before every waterfall-cli.error() usage within client code, it makes sense to me that this should be put within the error() directly, so that it can be put in one place once, rather than many other places in the client code.

There is documentation about how the mode of operation of stdout and stderr changes based on the usage scenario at: https://nodejs.org/api/process.html#process_a_note_on_process_i_o, which makes it clear that there is different behavior when the output is being piped (asynchronous) versus non-piped (synchronous) on POSIX (unix-like) usages.

@pbelbin
Copy link
Collaborator Author

pbelbin commented Jun 2, 2020

Ok. So, a lot of digging on this, indicates that this is a continuing sore-point in Node.js in general.

Here are some open issues related to this problem:

elm-explorations/test#138
nodejs/node#6456
nodejs/node#6379

In the current Node.js documentation regarding process.stdout, there is this specific note, which points out that the asynchronous vs synchronous nature of process.stdout handling varies depending on usage scenario.

https://nodejs.org/docs/latest-v14.x/api/process.html#process_a_note_on_process_i_o

When the output is asynchronous, we run into a sticky situation when process.exit() is used synchronously, because the output stream may not yet have been emptied when the exit() shuts everything down, leaving data undelivered to the consumer of the stdout stream. In order to allow the output stream to dequeue to the consumer, the main thread must be able to yield or get into a state of waiting for a callback before continuing, but this is not the normal Node.js JavaScript top-level code behavior. The two things are largely incompatible.

One model for dealing with this issue is to essentially turn everything into an asynchronous case, where the 'top level' code flow 'finishes' very quickly, and all program logic happens within code that is being waited upon. This does not mean that process.exit() can not be used, but, that it's use must be within an async block of code which has the opportunity to await the actual emptying of the stdout buffers before executing the exit().

I did find one other approach which appears to work, but, may have a slight slowing effect, and how much this effect has will vary somewhat depending on how much output is being done. In our case, I believe this is mostly limited to cases where we are emitting error details.

This approach is to set a flag which causes the output to become blocking.

ie:

[process.stdout, process.stderr].forEach((stream: any) => {
	if (stream._handle && typeof stream._handle.setBlocking === 'function') {
		stream._handle.setBlocking(true);
	}
});

It's not clear to me yet how this flag being set actually brings about the change that's observed, but it does appear to provide the desired result. Perhaps it avoids use of the buffer that would normally be used, and sends data instead to the output, and results in a system-level blocking function effectively putting the brakes on until the content is consumed by the other end of the pipe?

@pbelbin
Copy link
Collaborator Author

pbelbin commented Jun 2, 2020

Here is a fairly simple way to observe the effect of this issue:

node -p 'console.log(process.stdout.isTTY); process.stdout._handle.setBlocking(false); process.stdout.write("x".repeat(5e6)); process.exit()'|wc -c

For the above, I am observing this result: 65546

node -p 'console.log(process.stdout.isTTY); process.stdout._handle.setBlocking(true); process.stdout.write("x".repeat(5e6)); process.exit()'|wc -c

For the above command, I observe the expected result: 5000010

@WesCossick
Copy link
Member

Very interesting. Thanks for putting this detailed information together. Do you, then, propose that we set blocking to true?

@pbelbin
Copy link
Collaborator Author

pbelbin commented Jun 2, 2020

Yes, but, my testing so far shows that for this to work, it actually can not be done at the last minute. It has to be done before the buffering of the output stream becomes involved, from what I can see.

Since this is actually a Node.js problem rather than waterfall-cli issue, it might be best just to note it is a suggested thing, and leave it to the client to do.

@WesCossick
Copy link
Member

it might be best just to note it is a suggested thing, and leave it to the client to do.

Maybe the best middle ground would be to make this a default-enabled option, but allow people to disable it when initializing Waterfall CLI. Thoughts?

@pbelbin
Copy link
Collaborator Author

pbelbin commented Jun 2, 2020

I am not against that. I will do some testing to see if that's a viable option.

Meanwhile, unfortunately, we might need a combination approach. Even though I had set the setBlocking(true), in the application code, in one scenario, I'm seeing only 8 of 12 errors make it to the output before waterfall-cli.error() takes over. However, in this particular case, adding

// Allow stdout to flush before proceeding
await new Promise<void>((resolve) => {
	process.stdout.write('', () => {
		resolve();
	});
});

In the client code before the call to waterfall-cli.error() did cause the output to be flushed fully.

So, there appears to be more to the story than just setting setBlocking(true)

@WesCossick
Copy link
Member

Huh; this is proving much more challenging than we thought!

@pbelbin
Copy link
Collaborator Author

pbelbin commented Jun 2, 2020

I am now seeing that even with the .setBlocking(true) being set in the outer-most layer of the client code, that not all of the output is being flushed().

I wonder if the fix for this issue varies depending on whether the client code is synchronous or not?

If I make the waterfall-cli.error() 'async', and put the await new Promise() (see earlier comment) before the existing call to process.exit(), then, the client will have the ability, if being called from within an already asynchronous function, to do: await waterfall.error('text of message'), and this will enable the flushing to happen before the output of the error message and process.exit() is invoked.

If the exact same function is used from a synchronous context, the await new Promise() will start, but, code execution continue without truly waiting, and issue the error message and exit as it has been doing.

I think this might be a decent way to go, unless we want to have 2 flavors of exit(), one that is as it is now, expecting to be used synchronously, and a new function exitAsync() which has the flushing code and is expected to be used only in the form await exitAsync('message text').

For clarity:

The error function:

// A helper function provided to commands to keep error messages consistent
export async function error(message: string) {
	// Allow stdout to flush before proceeding
	await new Promise<void>((resolve) => {
		process.stdout.write('', () => {
			resolve();
		});
	});
	
	// Emit error message
	printPrettyError(message);
	
	// Exit
	process.exit(255);
}

and, in the client, within an async function, would use:

import * as waterfall from 'waterfall-cli';

(async () => {})(
.
.
await waterfall.error(`yeah, we failed');

);

// Note: no explicit exit call here.  Synchronous code execution falls through here, leaving the async code to continue

@WesCossick
Copy link
Member

I think that's a reasonable approach. If we release this library publicly, we'll want to document these things.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging a pull request may close this issue.

2 participants