readAll api - Implementation for chunked outputStreamType and unit tests for transform, categories, consistentSnapshot and onBatchError. #630

anu3990 · 2021-12-20T21:08:49Z

#620 - Implementation for chunked outputStreamType and unit tests for transform, categories, consistentSnapshot and onBatchError.

ehennum

Looks very close with a good level of testing.

ehennum · 2021-12-21T16:55:39Z

lib/documents.js

+    jobState.requesterCount++;
+    let firstRequestCompleted = onReadAllDocs(jobState,0);
+    if(firstRequestCompleted){
+      spinReaderThreads(jobState,1, maxRequesters);


Node.js is asynchronous: code that depends on a response has to run in a callback.

That's different from Java, where the call blocks until the response arrives.

By definition, onReadAllDocs() will return before the response comes back from the server. So, when lines 2852 through 2855 execute, the timestamp will never be known.

Instead, line 2853 should be executed when either the result() callback (in object mode) or the first data event callback (in chunked mode) executes.

Hi Erik, this is one way to make the code synchronous. The variable "firstRequestCompleted" will be true only after line 2851 has executed, so line 2853 always executes after the first request. I added logging to the function readDocumentsImpl to print the timestamp for each request, and only the first value was null. All the other requests had a timestamp.

Hi, Anushree: onReadAllDocs() will certainly have executed.

But, executing onReadAllDocs() only starts the first request. The response from the first request won't come back until later when the object or chunked mode response callback executes.

Because the timestamp is in the response, the timestamp won't be available until the response comes back.

If onReadAllDocs() returns a value, the firstRequestCompleted variable will evaluate to true, but that doesn't guarantee the timestamp is available.

Am I missing something?

ehennum · 2021-12-21T16:59:49Z

lib/documents.js

+        .on('error', function(err){
+          readAllDocumentsErrorHandle(jobState, batchdef, readBatchArray, readerId, val, err);
+        })
+        .on('data', function(item){


In chunked mode, the timestamp would first become available in this callback and thus other requesters could be spun up.

ehennum · 2021-12-21T17:00:32Z

lib/documents.js

+        });
+  }
+  else {
+    readDocumentsImplRequest.result((output) => {


In object mode, the timestamp would first become available in this callback and thus other requesters could be spun up.

ehennum · 2021-12-21T17:04:29Z

lib/documents.js

+  }
+  else {
+    readDocumentsImplRequest.result((output) => {
+      jobState.docsReadSuccessfully+= readBatchArray.length;


I see belatedly that, in chunked mode, the Node.js API cannot report the number of documents read successfully or failed.

Maybe that's something to document?

ehennum · 2021-12-21T17:10:07Z

test-basic/documents-data-movement-readAll.js

+                ],
+                quality: 1
+            }
+        }));


If I recall correctly, Mocha has a before hook that could do the setup separately from the it test, which would be better than using a timeout to run the test after the setup.

Or the test could move into the setup's completion callback, similar to how some of the other tests run inside a result callback.

(The concern is that the timeout could be fragile.)

ehennum · 2021-12-21T17:22:33Z

test-basic/documents-data-movement-readAll.js

+        }));
+    });
+
+    it('should readAll documents with consistentSnapshot option as true', function(done){


A good test for a consistent snapshot would be to update a document after the first document is read but before the modified document is read.

The output should have the old content for the modified document because the timestamp of the modification is after the consistent timestamp.

I will work on this.

anu3990 · 2021-12-21T20:37:47Z

Combining all commits into one commit before merge.

anu3990 requested review from ehennum and georgeajit December 20, 2021 21:08

ehennum approved these changes Dec 21, 2021

readAll api - Implementation for chunked outputStreamType and unit te…

f78da40

…sts for transform, categories, consistentSnapshot and onBatchError.

anu3990 force-pushed the develop branch from 3075401 to f78da40 Compare December 21, 2021 20:38

anu3990 merged commit a5b4b59 into marklogic:develop Dec 21, 2021
