Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

readAll api - Implementation for chunked outputStreamType and unit tests for transform, categories, consistentSnapshot and onBatchError. #630

Merged
merged 1 commit into from
Dec 21, 2021

Conversation

anu3990
Copy link
Contributor

@anu3990 anu3990 commented Dec 20, 2021

#620 - Implementation for chunked outputStreamType and unit tests for transform, categories, consistentSnapshot and onBatchError.

Copy link
Contributor

@ehennum ehennum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks very close with a good level of testing.

lib/documents.js Outdated
jobState.requesterCount++;
let firstRequestCompleted = onReadAllDocs(jobState,0);
if(firstRequestCompleted){
spinReaderThreads(jobState,1, maxRequesters);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Node.js is asynchronous -- waiting always has to execute in a callback.

That's different from Java, which blocks.

By definition, onReadAllDocs() will return before the response comes back from the server. So, when lines 2852 through 2855 execute, the timestamp will never be known.

Instead, line 2853 should be executed when either the result() callback (in object mode) or the first data event callback (in chunked mode) executes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Erik, this is one of the ways to make the code synchronous. The variable "firstRequestCompleted" will be true only after 2851 is executed, so line 2853 is always executed after the first request. I had put loggers in the function readDocumentsImpl for timestamp values in each request and only the first value came as null. Rest all requests had values in them.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, Anushree: onReadAllDocs() will certainly have executed.

But, executing onReadAllDocs() only starts the first request. The response from the first request won't come back until later when the object or chunked mode response callback executes.

Because the timestamp is in the response, the timestamp won't be available until the response comes back.

If onReadAllDocs() returns a value, the firstRequestCompleted variable will evaluate to true, but that doesn't guarantee the timestamp is available.

Am I missing something?

.on('error', function(err){
readAllDocumentsErrorHandle(jobState, batchdef, readBatchArray, readerId, val, err);
})
.on('data', function(item){
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In chunked mode, the timestamp would be available for the first time om this callback and thus other requesters could be spun up.

});
}
else {
readDocumentsImplRequest.result((output) => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In object mode, the timestamp would first become available in this callback and thus other requesters could be spun up.

}
else {
readDocumentsImplRequest.result((output) => {
jobState.docsReadSuccessfully+= readBatchArray.length;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see belatedly that, in chunked mode, the Node.js API cannot report the number of documents read successfully or failed.

Maybe that's something to document?

],
quality: 1
}
}));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I recall correctly, Mocha has a before task that could do the setup separately from the it test, which would be better than using a timeout to execute the test after the setup.

Or, the test could move into an on completion callback for the setup similar to some of the other tests to execute in a result callback.

(The concern is that the timeout could be fragile.)

}));
});

it('should readAll documents with consistentSnapshot option as true', function(done){
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A good test for a consistent snapshot would be to update a document after the first document is read but before the modified document is read.

The output should have the old content for the modified document because the timestamp of the modification is after the consistent timestamp.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will work on this.

…sts for transform, categories, consistentSnapshot and onBatchError.
@anu3990
Copy link
Contributor Author

anu3990 commented Dec 21, 2021

Combining all commits into one commit before merge.

@anu3990 anu3990 merged commit a5b4b59 into marklogic:develop Dec 21, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants