Rationale
Key reasons why you might want to use this library:
This library operates on native JavaScript types (synchronous and asynchronous iterables) and outputs the same. There is no integration commitment: you can use the library in any context without creating compatibility concerns.
- Every synchronous pipeline produces a native synchronous iterable
- Every asynchronous pipeline produces a native asynchronous iterable
This separation also has a profound impact on performance, as explained below.
If you look at the Benchmarks, synchronous iteration outperforms asynchronous iteration many times over. This tells us that mixing synchronous and asynchronous processing into one isn't a good idea. However, this is the path many frameworks are taking, sacrificing performance for the convenience of unified processing.
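The gap can be observed without any library. The following is a minimal, self-contained sketch, not the library's own benchmark: it sums the same array once through a synchronous generator and once through an asynchronous one. All names (`syncSource`, `asyncSource`, `compare`) are illustrative assumptions, and absolute timings will vary with runtime and hardware.

```typescript
// Illustrative micro-benchmark; every name here is hypothetical, not iter-ops API.

function* syncSource(data: number[]): Generator<number> {
    for (const a of data) {
        yield a;
    }
}

async function* asyncSource(data: number[]): AsyncGenerator<number> {
    for (const a of data) {
        yield a; // each value is delivered through a promise
    }
}

function sumSync(data: number[]): number {
    let sum = 0;
    for (const a of syncSource(data)) {
        sum += a;
    }
    return sum;
}

async function sumAsync(data: number[]): Promise<number> {
    let sum = 0;
    for await (const a of asyncSource(data)) {
        sum += a;
    }
    return sum;
}

// Runs both variants over the same data and reports sums and elapsed milliseconds.
async function compare(size: number): Promise<{
    syncSum: number;
    asyncSum: number;
    syncMs: number;
    asyncMs: number;
}> {
    const data = Array.from({length: size}, (_, i) => i);
    const t0 = Date.now();
    const syncSum = sumSync(data);
    const t1 = Date.now();
    const asyncSum = await sumAsync(data);
    const t2 = Date.now();
    return {syncSum, asyncSum, syncMs: t1 - t0, asyncMs: t2 - t1};
}

compare(1000000).then((r) => console.log(r));
```

Both variants compute the same sum; only the elapsed times differ, which is the cost of delivering every value through a promise.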
What makes matters worse is that in real-world applications the amount of asynchronous processing is significantly lower than synchronous, so a unified asynchronous pipeline penalizes the majority of the work.
To design a good product, you need a clear picture of your data flow so that you can improve performance and scalability efficiently, and that requires separating the synchronous and asynchronous layers of your data processing.
To illustrate this, let's start with a bad code example:
import {pipe, toAsync, filter, distinct, map, wait} from 'iter-ops';

const data = [12, 32, 357, ...]; // million items or so

const i = pipe(
    toAsync(data), // make asynchronous
    filter(a => a % 3 === 0), // take only numbers divisible by 3
    distinct(), // remove duplicates
    map(a => service.process(a)), // use async service, which returns Promise
    wait() // resolve each promise
); // inferred type = AsyncIterableExt

for await (const a of i) {
    console.log(a); // show resolved data
}
And here's what a good version of the same code should look like:
import {pipe, toAsync, filter, distinct, map, wait} from 'iter-ops';

const data = [12, 32, 357, ...]; // million items or so

// synchronous pipeline:
const i = pipe(
    data,
    filter(a => a % 3 === 0), // take only numbers divisible by 3
    distinct() // remove duplicates
); // inferred type = IterableExt

// asynchronous pipeline:
const k = pipe(
    toAsync(i), // enable async processing
    map(a => service.process(a)), // use async service, which returns Promise
    wait() // resolve each promise
); // inferred type = AsyncIterableExt

for await (const a of k) {
    console.log(a); // show resolved data
}
Just by separating the synchronous pipeline from the asynchronous one, in the above scenario of filtering a large amount of initial data before asynchronous processing, we can easily achieve a 10x performance increase.
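One way to see where the saving comes from is to count how many values cross into the asynchronous layer. The sketch below uses plain generators rather than iter-ops operators, and every helper name in it (`countingToAsync`, `filterSync`, `filterAsync`, `drain`, `mixed`, `separated`) is a hypothetical stand-in chosen for illustration.

```typescript
// Plain-generator sketch; helper names are hypothetical, not iter-ops API.

// Converts a sync iterable to async, counting each value that crosses over.
async function* countingToAsync<T>(
    source: Iterable<T>,
    counter: {hops: number}
): AsyncGenerator<T> {
    for (const value of source) {
        counter.hops++;
        yield value;
    }
}

function* filterSync<T>(source: Iterable<T>, test: (value: T) => boolean): Generator<T> {
    for (const value of source) {
        if (test(value)) yield value;
    }
}

async function* filterAsync<T>(
    source: AsyncIterable<T>,
    test: (value: T) => boolean
): AsyncGenerator<T> {
    for await (const value of source) {
        if (test(value)) yield value;
    }
}

// Collects all values of an async iterable into an array.
async function drain<T>(source: AsyncIterable<T>): Promise<T[]> {
    const out: T[] = [];
    for await (const value of source) out.push(value);
    return out;
}

const divisibleBy3 = (a: number) => a % 3 === 0;

// Everything asynchronous: all values cross the boundary before filtering.
async function mixed(data: number[]) {
    const counter = {hops: 0};
    const result = await drain(filterAsync(countingToAsync(data, counter), divisibleBy3));
    return {result, hops: counter.hops};
}

// Filter synchronously first: only the surviving values cross the boundary.
async function separated(data: number[]) {
    const counter = {hops: 0};
    const result = await drain(countingToAsync(filterSync(data, divisibleBy3), counter));
    return {result, hops: counter.hops};
}
```

For an input of the integers 0 to 999, `mixed` moves all 1000 values through the asynchronous layer, while `separated` moves only the 334 that pass the filter. Both return the same result.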
This library keeps the two separate, both through explicit type control and at run-time, so there is never any confusion about whether you're doing synchronous or asynchronous data processing at any given time.
Operators in this library support an iteration state/session (IterationState), which lets you persist additional data for the duration of an iteration session, enabling more complex processing logic.
In the example below, we use the iteration session of the filter operator to detect and remove consecutively repeated values (not to be confused with distinct, which removes all duplicates).
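As a rough illustration of what such a separation looks like with native types alone, the sketch below distinguishes the two protocols at compile time and at run-time. The `isAsyncIterable` helper is a hypothetical name introduced here; how the library performs its own checks is not shown in this document.

```typescript
// Hypothetical helper, not part of iter-ops: tells the two native protocols apart at run-time.
function isAsyncIterable<T>(
    value: Iterable<T> | AsyncIterable<T>
): value is AsyncIterable<T> {
    return typeof (value as AsyncIterable<T>)[Symbol.asyncIterator] === 'function';
}

function* syncNumbers(): Generator<number> {
    yield 1;
    yield 2;
}

async function* asyncNumbers(): AsyncGenerator<number> {
    yield 1;
    yield 2;
}

// Compile-time separation: each value is typed by the protocol it implements.
const a: Iterable<number> = syncNumbers();
const b: AsyncIterable<number> = asyncNumbers();

// const wrong: Iterable<number> = asyncNumbers(); // rejected by the compiler

console.log(isAsyncIterable(a)); // false
console.log(isAsyncIterable(b)); // true
console.log(Array.from(a)); // logs the array [1, 2]; Array.from accepts only sync iterables
```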
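import {pipe, filter} from 'iter-ops';

const i = pipe(
    iterable,
    filter((value, index, state) => {
        // state persists between calls for the duration of the iteration session
        if (value === state.previousValue) {
            return false; // same as the previous value, skip it
        }
        state.previousValue = value;
        return true;
    })
);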
Here's a more generic distinctUntilChanged implementation as a custom operator.
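For readers who only want the idea, here is a library-independent sketch written as a plain generator rather than as an iter-ops custom operator. The signature, including the optional `keySelector` parameter, is an assumption of this sketch and not the library's API.

```typescript
// Plain-generator sketch; the signature is hypothetical, not iter-ops API.
// Yields a value only when its key differs from the key of the previous value.
function* distinctUntilChanged<T, K = T>(
    source: Iterable<T>,
    keySelector?: (value: T) => K
): Generator<T> {
    let first = true; // explicit flag, so a leading `undefined` value is not dropped
    let previousKey: K | T | undefined;
    for (const value of source) {
        const key = keySelector ? keySelector(value) : value;
        if (first || key !== previousKey) {
            first = false;
            previousKey = key;
            yield value;
        }
    }
}

const result = Array.from(distinctUntilChanged([1, 1, 2, 2, 2, 1, 3, 3]));
// result: [1, 2, 1, 3]

const byCase = Array.from(distinctUntilChanged(['a', 'A', 'b'], (s) => s.toLowerCase()));
// byCase: ['a', 'b']
```

Because the result is a native synchronous iterable, it can be consumed with `for...of` or passed on as the input of a synchronous pipeline.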