Faster and simpler filter implementation #12

mourner · 2016-01-29T16:39:41Z

Closes #10, ref #8. Alternative approach, still using eval.

Filters have the same performance, but filter creation is 40% faster, and there is 30% less code.

It also makes 'in' filters with a lot of items faster by using indexOf, and it gives us a simpler basis for significant further improvements.

cc @jfirebaugh @dcervelli

Closes #6, closes #10, ref #8.

mourner · 2016-01-29T16:56:27Z

Benchmark for simple filters (this branch is much faster than master):

# master
create 5-item filter: 1.174ms
filter 1 million times: 89.382ms

# pr 9
create 5-item filter: 1.081ms
filter 1 million times: 89.812ms

# simple 
create 5-item filter: 0.796ms
filter 1 million times: 30.522ms

mourner · 2016-01-29T17:08:47Z

Benchmark for 2k:

# master 
create 2000-item filter: 10.961ms
filter 1 million times: 20915.215ms

# pr 9
create 2000-item filter: 6.585ms
filter 1 million times: 215.851ms

# simple
create 2000-item filter: 0.747ms
filter 1 million times: 1414.595ms

So when filtering against a huge number of items, this branch is much faster than master but much slower than pr/9. I'll now try to implement binary search, which should change things dramatically.

mourner · 2016-01-29T18:01:48Z

Added binary search! It kicks in for more than 200 items (a threshold picked by running benchmarks). Results are great:

# pr 9
create 100000-item filter: 428.668ms
filter 1 million times: 499.984ms

# simple
create 100000-item filter: 71.494ms
filter 1 million times: 383.677ms
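
For readers following along, the switch between indexOf and binary search described above could look roughly like the sketch below. This is only an illustration, not the code in this branch: the PR generates filter functions with eval, while the sketch uses plain closures. The names `makeInFilter`, `binarySearch` and `BINARY_SEARCH_THRESHOLD` are hypothetical, and it assumes all values are of a single primitive type (all numbers or all strings).

```javascript
'use strict';

// Hypothetical constant, taken from the "more than 200 items" figure above.
var BINARY_SEARCH_THRESHOLD = 200;

// Orders values with < and >; assumes a single primitive type.
function compare(a, b) {
    return a < b ? -1 : a > b ? 1 : 0;
}

// Binary search over a sorted array; returns true if value is present.
function binarySearch(sorted, value) {
    var lo = 0;
    var hi = sorted.length - 1;
    while (lo <= hi) {
        var mid = (lo + hi) >> 1;
        var v = sorted[mid];
        if (v === value) return true;
        if (v < value) lo = mid + 1;
        else hi = mid - 1;
    }
    return false;
}

// Hypothetical helper: builds a predicate for an 'in' filter on one property.
function makeInFilter(key, values) {
    if (values.length <= BINARY_SEARCH_THRESHOLD) {
        // Short lists: a linear scan with indexOf is cheap enough.
        return function (feature) {
            return values.indexOf(feature.properties[key]) !== -1;
        };
    }
    // Long lists: sort a copy once at creation time, then search in O(log n).
    var sorted = values.slice().sort(compare);
    return function (feature) {
        return binarySearch(sorted, feature.properties[key]);
    };
}
```

The sorting cost is paid once when the filter is created, which is why it only makes sense above some list size.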

dcervelli · 2016-01-29T18:10:34Z

@mourner Awesome! Way simpler. Let me quickly add a benchmark on randomly generated strings as the filter items instead of numbers.

dcervelli · 2016-01-29T18:22:57Z

For random string data (which is closer to our use case) I get these results:

# pr 9
create 64000-item filter: 240.739ms
filter 1 million times: 757.678ms

# simple
create 64000-item filter: 161.316ms
filter 1 million times: 1879.914ms

Given the simpler code, I'm totally OK with this until it proves to be a bottleneck (which I doubt it will).

And for calibration: on my box for bench/big.js I get:

# simple
create 64000-item filter: 38.474ms
filter 1 million times: 324.140ms

Here's my benchmark:

'use strict';

var filter = require('../');

var N = 64000;

// Build an 'in' filter on property foo with N random string values.
var arr = ['in', 'foo'];
for (var i = 0; i < N; i++) arr.push(i + ' ' + Math.random());

console.time('create ' + N + '-item filter');
var f = filter(arr);
console.timeEnd('create ' + N + '-item filter');

var feature = {properties: {foo: 0}};

// Test against values picked at random from the list (the + 2 skips 'in' and 'foo').
console.time('filter 1 million times');
for (var j = 0; j < 1000000; j++) {
    feature.properties.foo = arr[Math.floor(Math.random() * N) + 2];
    f(feature);
}
console.timeEnd('filter 1 million times');

Thanks!

jfirebaugh · 2016-01-29T18:36:31Z

test.js

@@ -294,7 +294,7 @@ test('!in, $type', function(t) {

 test('any', function(t) {
    var f1 = filter(['any']);
-    t.equal(f1({properties: {foo: 1}}), false);
+    t.equal(f1({properties: {foo: 1}}), true);


This shouldn't change; returning false for an empty any is standard mathematical / functional programming behavior.

mourner · 2016-01-29

Didn't know that, it's not very intuitive. Will fix.
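
To illustrate the convention discussed in this review thread: JavaScript's own Array.prototype.some returns false on an empty array, and Array.prototype.every returns true. The sketch below shows that with two hypothetical combinators, `any` and `all`. They are not the library's implementation, just plain closures built on the standard array methods.

```javascript
'use strict';

// Hypothetical combinators; each predicate takes a feature and returns a boolean.
function any(predicates) {
    return function (feature) {
        // An empty disjunction has no operand that could be true.
        return predicates.some(function (p) { return p(feature); });
    };
}

function all(predicates) {
    return function (feature) {
        // An empty conjunction has no operand that could be false.
        return predicates.every(function (p) { return p(feature); });
    };
}

var feature = {properties: {foo: 1}};
console.log(any([])(feature)); // false
console.log(all([])(feature)); // true
```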

mourner · 2016-01-29T20:41:31Z

@dcervelli great! I think it's still slower than it could be because the whole huge array has to be evaluated on every filter function call. As a future improvement, we could try creating a closure that holds all the big arrays referenced in filters, so that the function call itself stays lightweight.
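
A rough sketch of the closure idea, under stated assumptions: it supposes the compiled filter source inlines the array literal, and contrasts that with handing the array to the generated code as an argument. The helper names `compileInline` and `compileWithClosure` are hypothetical and the generated code is deliberately simplified. It is not what this branch does.

```javascript
'use strict';

// Inlines the whole array literal into the generated source, so the
// literal is evaluated again on every call of the resulting function.
function compileInline(key, values) {
    var src = 'return ' + JSON.stringify(values) +
        '.indexOf(f.properties[' + JSON.stringify(key) + ']) !== -1;';
    return new Function('f', src);
}

// Passes the array in as an argument to a generated factory, so the
// generated source stays small and the array is captured once in a closure.
function compileWithClosure(key, values) {
    var src = 'return function (f) {' +
        ' return values.indexOf(f.properties[' + JSON.stringify(key) + ']) !== -1;' +
        ' };';
    return new Function('values', src)(values);
}

var inline = compileInline('foo', ['a', 'b', 'c']);
var closed = compileWithClosure('foo', ['a', 'b', 'c']);
console.log(inline({properties: {foo: 'b'}}), closed({properties: {foo: 'b'}})); // true true
```

Both variants return the same results. The difference is only in how much work the generated function repeats per call.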

faster and simpler filter implementation

b4bc18d

Closes #6, closes #10, ref #8.

mourner force-pushed the simple branch from 55aecbb to b4bc18d on January 29, 2016 16:44

mourner mentioned this pull request Jan 29, 2016

Change 'in' (and '!in') filter to work with a hash for performance. #9

Closed

implement binary search for huge in filters, close #8

2a0e951

add benchmarks

41b7910

jfirebaugh reviewed Jan 29, 2016

mourner added 2 commits January 29, 2016 22:13

bring back empty any = false behavior

5739338

add gitignore, update package.json

69bb575

mourner merged commit 69bb575 into master Jan 29, 2016

mourner deleted the simple branch January 29, 2016 20:48

gmaclennan mentioned this pull request Oct 6, 2016

Support filtering tiles from geojson-vt #20

Closed

mourner mentioned this pull request Dec 14, 2016

Improve function performance for an increasing number of stops mapbox/mapbox-gl-function#31

Closed

This was referenced Dec 21, 2016

Improve function performance for an increasing number of stops mapbox/mapbox-gl-style-spec#628

Closed

Support using feature filters with tiles from geojson-vt mapbox/mapbox-gl-style-spec#640

Closed
