Remove use of generator functions #314

Merged 5 commits into master on Oct 23, 2015

Conversation

arthurschreiber (Collaborator)

The switch to a generator-based parser in #285 brought better performance and decreased memory usage with large column values, but it significantly decreased performance when processing a large number of result rows containing small column values. See #303 for more information.


This PR replaces all use of generator functions with a callback-based approach. It is definitely less elegant, but it has a much better performance profile.
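
For illustration, here is a minimal sketch of the difference in shape (simplified, not the actual tedious parser code); both variants read a one-byte length followed by that many ASCII characters:

// Generator style (as in #285): the parser suspends at every read and is
// resumed by the driver; regenerator compiles this into a state machine.
function* genParseString(buffer) {
  const length = buffer.readUInt8(0);
  yield buffer.toString('ascii', 1, 1 + length);
}

// Callback style (this PR): each read helper invokes a continuation
// instead of suspending a generator.
function cbParseString(buffer, callback) {
  const length = buffer.readUInt8(0);
  callback(buffer.toString('ascii', 1, 1 + length));
}

const buf = Buffer.from('\x05hello', 'ascii');
console.log(genParseString(buf).next().value);      // => 'hello'
cbParseString(buf, (value) => console.log(value));  // => 'hello'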

Here are the benchmarks so far:

$ node benchmarks/
Many result rows x 7.19 ops/sec ±1.40% (38 runs sampled)
Memory: 25.03125 MiB
inserting nvarchar(max) with 5242880 chars x 15.00 ops/sec ±1.60% (69 runs sampled)
Memory: 28.88671875 MiB
inserting nvarchar(max) with 4 chars x 70.06 ops/sec ±0.38% (29 runs sampled)
Memory: 7.65234375 MiB
inserting varbinary(4) with 4 bytes x 70.06 ops/sec ±0.41% (29 runs sampled)
Memory: 8.015625 MiB
inserting varbinary(max) with 50 MiB x 4.03 ops/sec ±3.67% (24 runs sampled)
Memory: 90.84765625 MiB
inserting varbinary(max) with 5 MiB x 34.24 ops/sec ±2.63% (57 runs sampled)
Memory: 37.046875 MiB
inserting varbinary(max) with 4 bytes x 70.09 ops/sec ±0.35% (29 runs sampled)
Memory: 7.9765625 MiB

compared to tedious 1.12.3:

$ node benchmarks
Many result rows x 1.09 ops/sec ±1.35% (10 runs sampled)
Memory: 13.9296875 MiB
inserting nvarchar(max) with 5242880 chars x 10.33 ops/sec ±2.11% (51 runs sampled)
Memory: 29.625 MiB
inserting nvarchar(max) with 4 chars x 72.84 ops/sec ±0.52% (40 runs sampled)
Memory: 8.08984375 MiB
inserting varbinary(4) with 4 bytes x 71.34 ops/sec ±0.30% (34 runs sampled)
Memory: 8.1015625 MiB
inserting varbinary(max) with 50 MiB x 2.57 ops/sec ±3.22% (17 runs sampled)
Memory: 97.53515625 MiB
inserting varbinary(max) with 5 MiB x 21.09 ops/sec ±2.60% (51 runs sampled)
Memory: 69.35546875 MiB
inserting varbinary(max) with 4 bytes x 72.04 ops/sec ±0.43% (37 runs sampled)
Memory: 7.890625 MiB

and tedious 1.11.5:

$ node benchmarks/
Many result rows x 18.17 ops/sec ±1.48% (60 runs sampled)
Memory: 23.84375 MiB
inserting nvarchar(max) with 5242880 chars x 1.79 ops/sec ±1.09% (13 runs sampled)
Memory: 29.3515625 MiB
inserting nvarchar(max) with 4 chars x 67.65 ops/sec ±0.32% (19 runs sampled)
Memory: 5.5 MiB
inserting varbinary(4) with 4 bytes x 67.80 ops/sec ±0.28% (21 runs sampled)
Memory: 5.77734375 MiB
inserting varbinary(max) with 50 MiB x 0.08 ops/sec ±0.94% (5 runs sampled)
Memory: 208.00390625 MiB
inserting varbinary(max) with 5 MiB x 6.77 ops/sec ±1.67% (36 runs sampled)
Memory: 133.6640625 MiB
inserting varbinary(max) with 4 bytes x 67.77 ops/sec ±0.32% (21 runs sampled)
Memory: 5.796875 MiB

As you can see, we're still not back to the old performance, but the difference is not as big as before. There's a lot more that can be done to improve the performance even further. Basic profiling shows that a lot of time is currently spent recompiling the same functions over and over again, and there are also quite a few functions that either are never optimized or get deoptimized.

bretcope (Member) commented Sep 8, 2015

What functions in particular are performance bottlenecks?

arthurschreiber (Collaborator, Author)

Basically, it's every function that makes use of a generator. It's partially due to regenerator: the internal invoke function that is called on every generator step gets deoptimized pretty early on, and deoptimization tracing dumps a huge list of related functions that get deoptimized as a result.
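
To make that concrete: regenerator compiles each generator into roughly the following state machine (simplified from its actual output, and it needs the regenerator runtime to run). Every .next() call on the resulting iterator funnels through the runtime's step machinery, the invoke function mentioned above:

// Source generator:
function* tokenParser() {
  yield 1;
}

// Roughly what regenerator emits instead (regeneratorRuntime comes from
// the regenerator runtime package):
var marked = regeneratorRuntime.mark(compiledTokenParser);
function compiledTokenParser() {
  return regeneratorRuntime.wrap(function tokenParser$(context) {
    while (1) switch (context.prev = context.next) {
      case 0:
        context.next = 2;
        return 1;
      case 2:
      case 'end':
        return context.stop();
    }
  }, marked, this);
}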

The current callback-based approach is quite naive, because it goes through the awaitData function every time a read(Something) method is called on the parser. Token parsing can be made much more efficient by reading a bunch of data into a buffer in one swoop and reading from that buffer directly, instead of jumping through multiple layers of indirection.
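
A rough sketch of the difference (the awaitData name is from the description above; the rest is illustrative, not the actual tedious code):

// Current shape: every read goes through awaitData first, even when the
// bytes are already buffered.
function readUInt8(parser, callback) {
  awaitData(parser, 1, function() {
    callback(parser.buffer.readUInt8(parser.position++));
  });
}

function awaitData(parser, length, callback) {
  // The real parser suspends here until `length` bytes have arrived; in
  // this sketch the data is assumed to be buffered already.
  if (parser.position + length <= parser.buffer.length) {
    callback();
  }
}

// Proposed direction: make sure enough bytes are buffered up front, then
// read straight from the buffer with no per-read indirection.
function readUInt8Direct(parser) {
  return parser.buffer.readUInt8(parser.position++);
}

var parser = { buffer: Buffer.from([0x42, 0x43]), position: 0 };
readUInt8(parser, function(value) { console.log(value); }); // => 66
console.log(readUInt8Direct(parser));                       // => 67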

I've learnt from the mistake I made with #303 / #285 and will make sure to add benchmarks for many different parser cases and for each individual token type, so we can benchmark and profile every change we make and pinpoint bottlenecks.
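
For reference, the numbers in this thread are in the output format of the benchmark npm package; a per-token-type benchmark could look roughly like this (the token buffer setup is a placeholder, not part of the tedious API):

var Benchmark = require('benchmark');

// Placeholder input; a real benchmark would feed a buffer of serialized
// DONEPROC tokens through the token parser.
var tokenBuffer = Buffer.alloc(1024);

new Benchmark.Suite()
  .add('parsing `DONEPROC` tokens', function() {
    // parse tokenBuffer here
    tokenBuffer.readUInt8(0);
  })
  .on('cycle', function(event) {
    // prints e.g. "parsing `DONEPROC` tokens x N ops/sec ±x% (n runs sampled)"
    console.log(String(event.target));
  })
  .run();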

arthurschreiber (Collaborator, Author)

Node v4.2.1 / Tedious 1.12.3

$ node benchmarks
Many result rows x 1.38 ops/sec ±1.43% (11 runs sampled)
Memory: 33.6171875 MiB
inserting nvarchar(max) with 5242880 chars x 11.11 ops/sec ±1.76% (57 runs sampled)
Memory: 81.47265625 MiB
inserting nvarchar(max) with 4 chars x 523 ops/sec ±1.63% (86 runs sampled)
Memory: 19.55078125 MiB
inserting varbinary(4) with 4 bytes x 540 ops/sec ±1.41% (85 runs sampled)
Memory: 19.65234375 MiB
inserting varbinary(max) with 50 MiB x 2.53 ops/sec ±1.93% (17 runs sampled)
Memory: 106.3671875 MiB
inserting varbinary(max) with 5 MiB x 22.62 ops/sec ±3.96% (58 runs sampled)
Memory: 59.41015625 MiB
inserting varbinary(max) with 4 bytes x 516 ops/sec ±2.02% (86 runs sampled)
Memory: 19.890625 MiB
parsing `COLMETADATA` tokens x 126 ops/sec ±2.53% (87 runs sampled)
Memory: 14.71484375 MiB
parsing `DONEPROC` tokens x 56.52 ops/sec ±3.38% (70 runs sampled)
Memory: 44.9765625 MiB
parsing tokens for 100 rows x 75.63 ops/sec ±2.66% (75 runs sampled)
Memory: 15.0390625 MiB

Node v4.2.1 / Tedious 1.12.3 plus changes in this branch

$ node benchmarks
Many result rows x 3.93 ops/sec ±1.94% (24 runs sampled)
Memory: 72.91796875 MiB
inserting nvarchar(max) with 5242880 chars x 20.53 ops/sec ±3.04% (54 runs sampled)
Memory: 119.69921875 MiB
inserting nvarchar(max) with 4 chars x 531 ops/sec ±1.33% (83 runs sampled)
Memory: 21.94921875 MiB
inserting varbinary(4) with 4 bytes x 546 ops/sec ±1.61% (84 runs sampled)
Memory: 21.6015625 MiB
inserting varbinary(max) with 50 MiB x 4.64 ops/sec ±2.43% (27 runs sampled)
Memory: 131.46875 MiB
inserting varbinary(max) with 5 MiB x 43.46 ops/sec ±3.82% (72 runs sampled)
Memory: 85.38671875 MiB
inserting varbinary(max) with 4 bytes x 531 ops/sec ±1.37% (86 runs sampled)
Memory: 21.90625 MiB
parsing `COLMETADATA` tokens x 147 ops/sec ±2.17% (80 runs sampled)
Memory: 44.92578125 MiB
parsing `DONEPROC` tokens x 137 ops/sec ±2.07% (75 runs sampled)
Memory: 43.90234375 MiB
parsing tokens for 100 rows x 168 ops/sec ±2.47% (76 runs sampled)
Memory: 45.30078125 MiB

I have some more changes that bring performance further back toward the levels of 1.11.5 across the board, but I'll release the changes in this PR as 1.12.4 first.

arthurschreiber added a commit that referenced this pull request on Oct 23, 2015: Remove use of generator functions
arthurschreiber merged commit 68b6ace into master on Oct 23, 2015
chdh mentioned this pull request on Feb 23, 2017