Mongoose session #3018

NelsonVides · 2021-01-29T19:54:22Z

Plenty of explanations and profiling and details and whatnot, in the commit messages 😁

Simply put, maps are faster than lists. Let's see some benchmarks:

* Microbenchmarks: Using Benchee, we can set up diverse inputs where two parameters are tweakable: the number of keys and their order. Lists are expected to be slower when they're big, but inserting or looking up the first value will always be a constant-time operation. However, how likely is it that we're looking up the first value? I generate the inputs using a function that takes a size and an ordering, and generates a list of that size with the element we're going to query placed at the beginning, at the end, or at a random position after shuffling the list. It returns this dataset as a tuple where the first element is the proplist and the second is the same proplist converted to a map. The results show that lists win only in the best-case scenario, when the lookup element is at the beginning of the list, and even there the differences deserve a closer look: for smaller lists (3 or 10 elements), maps are on average only 1.1~1.3 times slower, and the 99th percentiles are actually better, at 1.1 times faster. When the size gets bigger, the implementation of maps changes (they're two arrays for small sizes, but switch to hash array mapped tries with a large branching factor once they hold more than 32 keys), and then maps can be as much as 2.5 times slower on average, and 1.5 times slower on the 99th percentiles. In all other cases, when the lookup key is at the end regardless of the size, or when the keys are uniformly distributed across the input, maps beat lists by a wide margin, with differences of around 20x on average, and several hundred times better on the 99th percentiles.
* Storage sizes: An important thing to take into account is that this session info is stored in the mnesia session table, so the memory it takes when written to the ets tables is relevant: it indirectly affects how these tables are synchronised across nodes, the terms are copied into the calling process when querying the table, and in general they consume RAM on the host. The following checks in the shell give us some information (ets:info/2 reports memory in words):

  A single element of several kv-pairs takes less space for maps:

    > ListTid = ets:new(list_tab, []).   % table names are illustrative
    > MapTid = ets:new(map_tab, []).
    > BigList = [ {X, a} || X <- lists:seq(1,100)].
    > BigMap = maps:from_list(BigList).
    > ets:insert(ListTid, {1, BigList}).
    > ets:insert(MapTid, {1, BigMap}).
    > ets:info(ListTid,memory).
    818
    > ets:info(MapTid,memory).
    676

  Many elements of many kv-pairs take linearly less space for maps:

    > [ ets:insert(ListTid, {N, [ {N+X, N+X+1} || X <- lists:seq(1,10) ]}) || N <- lists:seq(1,100) ].
    > [ ets:insert(MapTid, {N, maps:from_list([ {N+X, N+X+1} || X <- lists:seq(1,10) ]) }) || N <- lists:seq(1,100) ].
    > ets:info(ListTid, size).
    -> 100
    > ets:info(MapTid, size).
    -> 100
    > ets:info(ListTid,memory).
    6011
    > ets:info(MapTid,memory).
    3411

* MongooseIM profiling with fprof: Using amoc's one_to_one scenario with default configuration and 12000 users with interarrival=10, I ran fprof on MIM for 10 seconds, on both master and this branch. These are some results:

  Showing a 1.2x improvement in querying:

  ** With lists

    {[{{ejabberd_gen_sm,get_sessions,3},    25135, 8089.379,  145.278}],
     { {ejabberd_sm_mnesia,get_sessions,2}, 25135, 8089.379,  145.278},     %
     [{{mnesia,dirty_index_read,3},         25135, 7944.101,  414.477}]}.

  ** With maps

    {[{{ejabberd_gen_sm,get_sessions,3},    20824, 5920.781,  110.654}],
     { {ejabberd_sm_mnesia,get_sessions,2}, 20824, 5920.781,  110.654},     %
     [{{mnesia,dirty_index_read,3},         20824, 5803.559,  301.085},
      {garbage_collect,                       681,    6.568,    6.568}]}.
  Showing a 1.23x improvement in inserting:

  ** With lists

    {[{{ejabberd_sm_mnesia,'-create_session/4-fun-0-',1}, 1001,  635.633,    5.730},
      {{ejabberd_sm_mnesia,'-create_session/4-fun-1-',1}, 1000,  559.658,    5.372}],
     { {mnesia,write,1},                                  2001, 1195.291,   11.102},     %
     [{{mnesia,write,3},                                  2001, 1184.189,   10.698}]}.

  ** With maps

    {[{{ejabberd_sm_mnesia,'-create_session/4-fun-0-',1}, 1000,  522.757,    4.918},
      {{ejabberd_sm_mnesia,'-create_session/4-fun-1-',1}, 1000,  444.673,    4.290}],
     { {mnesia,write,1},                                  2000,  967.430,    9.208},     %
     [{{mnesia,write,3},                                  2000,  958.222,    8.892}]}.

  Showing a 9x improvement in merging:

  ** With lists

    {[{{ejabberd_sm_mnesia,create_session,4}, 1000,  178.977,   28.998}],
     { {mongoose_session,merge_info,2},       1000,  178.977,   28.998},     %
     [{{orddict,merge,3},                     1000,   72.619,   26.883},
      {{orddict,from_list,1},                 2000,   68.903,   19.996},
      {{erlang,setelement,3},                 1000,    4.281,    4.281},
      {{orddict,to_list,1},                   1000,    4.176,    4.176}]}.

  ** With maps

    {[{{ejabberd_sm_mnesia,create_session,4}, 1000,   19.326,   11.556}],
     { {mongoose_session,merge_info,2},       1000,   19.326,   11.556},     %
     [{{maps,merge,2},                        1000,    4.192,    4.192},
      {{erlang,setelement,3},                 1000,    3.578,    3.578}]}.

========================================================================

Below, code and full given output:

- Data generation
```
-module(bench_erl).
-export([random_contents/2]).

%% Build a proplist of Size filler pairs plus the {z, z} pair we look up,
%% and return it together with the equivalent map.
random_contents(Order, Size) ->
    L = [{X, y} || X <- lists:seq(1, Size)],
    FinalList = order(Order, L),
    {FinalList, maps:from_list(FinalList)}.

%% Place the looked-up pair first, last, or at a random position.
order(first, L) -> [{z, z} | L];
order(last, L) -> L ++ [{z, z}];
order(random, L) -> shuffle([{z, z} | L]).

shuffle([]) -> [];
shuffle([Elem]) -> [Elem];
shuffle(List) -> shuffle(List, length(List), []).

shuffle([], 0, Result) -> Result;
shuffle(List, Len, Result) ->
    %% rand replaces the deprecated random module
    {Elem, Rest} = nth_rest(rand:uniform(Len), List),
    shuffle(Rest, Len - 1, [Elem|Result]).

%% Take the Nth element out of List; the order of the rest is irrelevant here.
nth_rest(N, List) -> nth_rest(N, List, []).

nth_rest(1, [E|List], Prefix) -> {E, Prefix ++ List};
nth_rest(N, [E|List], Prefix) -> nth_rest(N - 1, List, [E|Prefix]).
```

- Benchee benchmark
```
Benchee.run(
  %{
    "proplists" => fn {list, _} -> :proplists.get_value(:z, list) end,
    "keyfind" => fn {list, _} -> :lists.keyfind(:z, 1, list) end,
    "maps" => fn {_, map} -> :maps.get(:z, map) end
  },
  inputs: %{
    "tiny first" => :bench_erl.random_contents(:first, 3),
    "tiny last" => :bench_erl.random_contents(:last, 3),
    "tiny random" => :bench_erl.random_contents(:random, 3),
    "small first" => :bench_erl.random_contents(:first, 10),
    "small last" => :bench_erl.random_contents(:last, 10),
    "small random" => :bench_erl.random_contents(:random, 10),
    "medium first" => :bench_erl.random_contents(:first, 100),
    "medium last" => :bench_erl.random_contents(:last, 100),
    "medium random" => :bench_erl.random_contents(:random, 100),
    "large first" => :bench_erl.random_contents(:first, 1000),
    "large last" => :bench_erl.random_contents(:last, 1000),
    "large random" => :bench_erl.random_contents(:random, 1000)
  }
)
```

- Results

    ***** With input tiny first #####
    Name                ips        average  deviation         median         99th %
    keyfind         56.44 M       17.72 ns  ±2113.88%          15 ns          33 ns
    maps            51.36 M       19.47 ns  ±1655.88%          18 ns          25 ns
    proplists       39.91 M       25.06 ns  ±1270.28%          21 ns          44 ns

    Comparison:
    keyfind         56.44 M
    maps            51.36 M - 1.10x slower +1.75 ns
    proplists       39.91 M - 1.41x slower +7.34 ns

    ***** With input tiny last #####
    Name                ips        average  deviation         median         99th %
    keyfind         46.77 M       21.38 ns ±15573.91%          18 ns          35 ns
    maps            46.58 M       21.47 ns ±15841.53%          19 ns          28 ns
    proplists       14.68 M       68.13 ns  ±5534.98%          64 ns          87 ns

    Comparison:
    keyfind         46.77 M
    maps            46.58 M - 1.00x slower +0.0892 ns
    proplists       14.68 M - 3.19x slower +46.75 ns

    ***** With input tiny random #####
    Name                ips        average  deviation         median         99th %
    keyfind         53.06 M       18.85 ns  ±1802.36%          16 ns          33 ns
    maps            50.92 M       19.64 ns  ±1978.64%          18 ns          27 ns
    proplists       24.02 M       41.63 ns   ±914.27%          38 ns          58 ns

    Comparison:
    keyfind         53.06 M
    maps            50.92 M - 1.04x slower +0.79 ns
    proplists       24.02 M - 2.21x slower +22.78 ns

    ***** With input small first #####
    Name                ips        average  deviation         median         99th %
    keyfind         52.42 M       19.08 ns ±10211.76%          16 ns          33 ns
    maps            48.32 M       20.70 ns  ±8531.67%          18 ns          31 ns
    proplists       36.70 M       27.25 ns  ±6266.95%          24 ns          41 ns

    Comparison:
    keyfind         52.42 M
    maps            48.32 M - 1.08x slower +1.62 ns
    proplists       36.70 M - 1.43x slower +8.17 ns

    ***** With input small last #####
    Name                ips        average  deviation         median         99th %
    maps            46.98 M       21.29 ns  ±8269.86%          18 ns          40 ns
    keyfind         31.67 M       31.58 ns  ±5654.38%          29 ns          46 ns
    proplists        7.01 M      142.63 ns  ±1391.75%         139 ns         160 ns

    Comparison:
    maps            46.98 M
    keyfind         31.67 M - 1.48x slower +10.29 ns
    proplists        7.01 M - 6.70x slower +121.35 ns

    ***** With input small random #####
    Name                ips        average  deviation         median         99th %
    maps            47.88 M       20.89 ns  ±8337.35%          18 ns          31 ns
    keyfind         36.55 M       27.36 ns  ±6339.40%          25 ns          42 ns
    proplists        8.91 M      112.19 ns  ±1709.40%         109 ns         130 ns

    Comparison:
    maps            47.88 M
    keyfind         36.55 M - 1.31x slower +6.47 ns
    proplists        8.91 M - 5.37x slower +91.30 ns

    ***** With input medium first #####
    Name                ips        average  deviation         median         99th %
    keyfind         55.53 M       18.01 ns  ±2242.82%          16 ns          31 ns
    proplists       41.04 M       24.37 ns  ±1707.53%          21 ns          41 ns
    maps            24.35 M       41.07 ns  ±1082.37%          38 ns          58 ns

    Comparison:
    keyfind         55.53 M
    proplists       41.04 M - 1.35x slower +6.36 ns
    maps            24.35 M - 2.28x slower +23.06 ns

    ***** With input medium last #####
    Name                ips        average  deviation         median         99th %
    maps            25.35 M       39.44 ns  ±1081.63%          37 ns          52 ns
    keyfind          6.70 M      149.32 ns   ±330.74%         144 ns         192 ns
    proplists        0.95 M     1048.98 ns    ±98.44%        1028 ns        1193 ns

    Comparison:
    maps            25.35 M
    keyfind          6.70 M - 3.79x slower +109.88 ns
    proplists        0.95 M - 26.59x slower +1009.53 ns

    ***** With input medium random #####
    Name                ips        average  deviation         median         99th %
    maps            24.49 M       40.83 ns  ±1053.65%          37 ns          61 ns
    keyfind         11.58 M       86.38 ns   ±537.75%          82 ns         108 ns
    proplists        1.57 M      636.12 ns   ±130.81%         623 ns         847 ns

    Comparison:
    maps            24.49 M
    keyfind         11.58 M - 2.12x slower +45.54 ns
    proplists        1.57 M - 15.58x slower +595.29 ns

    ***** With input large first #####
    Name                ips        average  deviation         median         99th %
    keyfind         54.27 M       18.43 ns ±12429.51%          16 ns          31 ns
    proplists       38.30 M       26.11 ns  ±7029.54%          21 ns          42 ns
    maps            23.16 M       43.17 ns  ±4515.38%          40 ns          50 ns

    Comparison:
    keyfind         54.27 M
    proplists       38.30 M - 1.42x slower +7.68 ns
    maps            23.16 M - 2.34x slower +24.75 ns

    ***** With input large last #####
    Name                ips        average  deviation         median         99th %
    maps         27200.24 K      0.0368 μs  ±6462.81%      0.0340 μs      0.0440 μs
    keyfind        287.78 K        3.47 μs    ±16.22%        3.40 μs        4.62 μs
    proplists       97.58 K       10.25 μs    ±93.41%       10.03 μs       12.74 μs

    Comparison:
    maps         27200.24 K
    keyfind        287.78 K - 94.52x slower +3.44 μs
    proplists       97.58 K - 278.74x slower +10.21 μs

    ***** With input large random #####
    Name                ips        average  deviation         median         99th %
    maps            26.83 M       37.27 ns  ±6362.71%          34 ns          53 ns
    keyfind          1.41 M      708.50 ns    ±30.78%         701 ns         789 ns
    proplists        0.35 M     2886.97 ns    ±31.76%        2821 ns        3239 ns

    Comparison:
    maps            26.83 M
    keyfind          1.41 M - 19.01x slower +671.24 ns
    proplists        0.35 M - 77.47x slower +2849.71 ns
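
To make the merging numbers from the fprof traces easier to read, here is a minimal sketch of the two merge strategies they show: the list path goes through `orddict:from_list/1`, `orddict:merge/3` and `orddict:to_list/1`, while the map path is a single `maps:merge/2`. This is not the code of `mongoose_session:merge_info/2`; the module name `session_info_sketch`, the function names, and the choice that the new value wins on a key clash are all assumptions made for illustration.

```erlang
-module(session_info_sketch).
-export([merge_lists/2, merge_maps/2, lookup_list/2, lookup_map/2]).

%% List-based merge: both proplists are normalised to orddicts, merged key
%% by key, and converted back to a list.
%% Assumption: on a key clash the value coming from New wins.
-spec merge_lists([{term(), term()}], [{term(), term()}]) -> [{term(), term()}].
merge_lists(Old, New) ->
    PickNew = fun(_Key, _OldVal, NewVal) -> NewVal end,
    Merged = orddict:merge(PickNew, orddict:from_list(Old), orddict:from_list(New)),
    orddict:to_list(Merged).

%% Map-based merge: maps:merge/2 lets the second map win on key clashes,
%% so no conversion or merge fun is needed.
-spec merge_maps(map(), map()) -> map().
merge_maps(Old, New) ->
    maps:merge(Old, New).

%% Lookup in a proplist of 2-tuples: a linear scan with lists:keyfind/3.
-spec lookup_list(term(), [{term(), term()}]) -> {ok, term()} | error.
lookup_list(Key, List) ->
    case lists:keyfind(Key, 1, List) of
        {Key, Value} -> {ok, Value};
        false -> error
    end.

%% Lookup in a map: maps:find/2 already returns {ok, Value} | error.
-spec lookup_map(term(), map()) -> {ok, term()} | error.
lookup_map(Key, Map) ->
    maps:find(Key, Map).
```

The list version pays for two conversions plus the merge on every call, which matches the three orddict entries in the "With lists" trace; the map version has only one call to account for.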

Up to now, get_sessions/0,1 returned `[ses_tuple()]` while get_sessions/2,3 returned `[session()]`, which seems inconsistent to begin with; addressing that is perhaps the biggest benefit of this change. Profiling data from the mongooseim_one_to_one scenario didn't really show any difference: sometimes increases of 0.01%, sometimes decreases of 0.01%. This is sadly expected, as that scenario mostly doesn't cover the paths where this is used.

mongoose-im · 2021-01-29T23:00:58Z

9049.1 / Erlang 23.0.3 / small_tests / 8c2297f
Reports root / small

9049.2 / Erlang 23.0.3 / internal_mnesia / 8c2297f
Reports root/ big
OK: 1502 / Failed: 0 / User-skipped: 161 / Auto-skipped: 0

9049.3 / Erlang 23.0.3 / odbc_mssql_mnesia / 8c2297f
Reports root/ big
OK: 2770 / Failed: 0 / User-skipped: 229 / Auto-skipped: 0

9049.4 / Erlang 23.0.3 / mysql_redis / 8c2297f
Reports root/ big
OK: 2765 / Failed: 0 / User-skipped: 234 / Auto-skipped: 0

9049.6 / Erlang 23.0.3 / ldap_mnesia / 8c2297f
Reports root/ big
OK: 1404 / Failed: 0 / User-skipped: 259 / Auto-skipped: 0

9049.5 / Erlang 23.0.3 / riak_mnesia / 8c2297f
Reports root/ big
OK: 1628 / Failed: 0 / User-skipped: 181 / Auto-skipped: 0

9049.7 / Erlang 23.0.3 / elasticsearch_and_cassandra_mnesia / 8c2297f
Reports root/ big
OK: 331 / Failed: 0 / User-skipped: 38 / Auto-skipped: 0

9049.9 / Erlang 22.3 / pgsql_mnesia / 8c2297f
Reports root/ big / small
OK: 2783 / Failed: 0 / User-skipped: 216 / Auto-skipped: 0

chrzaszcz

Good changes! A few comments from me.

src/mod_carboncopy.erl

src/ejabberd_sm.erl

src/ejabberd_sm_redis.erl

chrzaszcz

Very good changes!

NelsonVides requested a review from chrzaszcz January 29, 2021 19:54

NelsonVides force-pushed the mongoose_session branch from 4d5d4c6 to dedb47a Compare January 29, 2021 19:59

chrzaszcz reviewed Feb 11, 2021

src/mod_carboncopy.erl (outdated)

src/ejabberd_sm.erl (outdated)

chrzaszcz reviewed Feb 11, 2021

src/ejabberd_sm_redis.erl

Apply PR review, improve tracing and remove redundant type

3217ef6

NelsonVides requested a review from chrzaszcz February 19, 2021 11:31

chrzaszcz approved these changes Mar 2, 2021

chrzaszcz merged commit 135fec2 into master Mar 2, 2021

chrzaszcz deleted the mongoose_session branch March 2, 2021 08:34

leszke added this to the 4.2.0 milestone Apr 16, 2021
