Mongoose session #3018
Simply put, maps are faster than lists. Let's see some benchmarks:

* Microbenchmarks:
  Using Benchee, we can set up diverse inputs where two parameters are tweakable: the number of keys, and their order. Lists are expected to be slower when they're big, but inserting or looking up the first value will always be a constant-time operation. However, how likely is it that we're looking up the first value?

  I generate the inputs using a function that takes a size and an ordering, and generates a list of that size with the element we're going to query placed at the beginning, at the end, or at a random position after shuffling the list. It then returns this dataset as a tuple where the first element is the proplist, and the second is the same proplist converted to a map.

  Results show that the best scenario for lists, when the lookup element is at the beginning of the list, is the only setting where lists win. Even there, the differences are worth a look: for smaller lists (3 or 10 elements), maps are on average only 1.1~1.3 times slower, and the 99th percentiles are actually even better, at 1.1 times faster. When the size gets bigger, the implementation of maps changes (they're two arrays for small sizes, but switch to a hash array mapped trie with a large branching factor when they get sufficiently big), and then maps can be as much as 2.5 times slower on average, and 1.5 times slower on the 99th percentiles.

  For all other cases, when the lookup key is at the end regardless of the size, or when the keys are uniformly distributed across the input, maps beat lists by a wide margin, with differences of around 20x on average, and several hundred times better on the 99th percentiles.
* Storage sizes:
  An important thing to take into account is that this session info is stored in the mnesia session table, so the memory it takes when written to the ets tables is relevant: it indirectly affects how these tables are synchronised across nodes, the data is copied into the calling process whenever the table is queried, and in general it consumes RAM on the host. The following checks in the shell give us some information.

  A single element of several kv-pairs takes less space for maps:

      > BigList = [ {X, a} || X <- lists:seq(1,100)].
      > BigMap = maps:from_list(BigList).
      > ets:insert(ListTid, {1, BigList}).
      > ets:insert(MapTid, {1, BigMap}).
      > ets:info(ListTid,memory).
      818
      > ets:info(MapTid,memory).
      676

  Many elements of many kv-pairs take linearly less space for maps:

      > [ ets:insert(ListTid, {N, [ {N+X, N+X+1} || X <- lists:seq(1,10) ]}) || N <- lists:seq(1,100) ].
      > [ ets:insert(MapTid, {N, maps:from_list([ {N+X, N+X+1} || X <- lists:seq(1,10) ]) }) || N <- lists:seq(1,100) ].
      > ets:info(ListTid, size).
      -> 100
      > ets:info(MapTid, size).
      -> 100
      > ets:info(ListTid,memory).
      6011
      > ets:info(MapTid,memory).
      3411

* MongooseIM profiling with fprof:
  Using amoc's one_to_one scenario with default configuration and 12000 users with interarrival=10, I ran fprof on MIM for 10 seconds, on both master and this branch. These are some results.

  Showing a 1.2x improvement in querying:

  ** With lists

      {[{{ejabberd_gen_sm,get_sessions,3},     25135, 8089.379, 145.278}],
       { {ejabberd_sm_mnesia,get_sessions,2},  25135, 8089.379, 145.278},  %
       [{{mnesia,dirty_index_read,3},          25135, 7944.101, 414.477}]}.

  ** With maps

      {[{{ejabberd_gen_sm,get_sessions,3},     20824, 5920.781, 110.654}],
       { {ejabberd_sm_mnesia,get_sessions,2},  20824, 5920.781, 110.654},  %
       [{{mnesia,dirty_index_read,3},          20824, 5803.559, 301.085},
        {garbage_collect,                        681,    6.568,   6.568}]}.
  Showing a 1.23x improvement in inserting:

  ** With lists

      {[{{ejabberd_sm_mnesia,'-create_session/4-fun-0-',1}, 1001,  635.633,  5.730},
        {{ejabberd_sm_mnesia,'-create_session/4-fun-1-',1}, 1000,  559.658,  5.372}],
       { {mnesia,write,1},                                  2001, 1195.291, 11.102},  %
       [{{mnesia,write,3},                                  2001, 1184.189, 10.698}]}.

  ** With maps

      {[{{ejabberd_sm_mnesia,'-create_session/4-fun-0-',1}, 1000,  522.757,  4.918},
        {{ejabberd_sm_mnesia,'-create_session/4-fun-1-',1}, 1000,  444.673,  4.290}],
       { {mnesia,write,1},                                  2000,  967.430,  9.208},  %
       [{{mnesia,write,3},                                  2000,  958.222,  8.892}]}.

  Showing a 9x improvement in merging:

  ** With lists

      {[{{ejabberd_sm_mnesia,create_session,4}, 1000, 178.977, 28.998}],
       { {mongoose_session,merge_info,2},       1000, 178.977, 28.998},  %
       [{{orddict,merge,3},                     1000,  72.619, 26.883},
        {{orddict,from_list,1},                 2000,  68.903, 19.996},
        {{erlang,setelement,3},                 1000,   4.281,  4.281},
        {{orddict,to_list,1},                   1000,   4.176,  4.176}]}.

  ** With maps

      {[{{ejabberd_sm_mnesia,create_session,4}, 1000,  19.326, 11.556}],
       { {mongoose_session,merge_info,2},       1000,  19.326, 11.556},  %
       [{{maps,merge,2},                        1000,   4.192,  4.192},
        {{erlang,setelement,3},                 1000,   3.578,  3.578}]}.

========================================================================

Below, code and full given output:

- Data generation
```
-module(bench_erl).
-export([random_contents/2]).

%% Build a proplist of Size filler pairs plus the queried {z, z} pair,
%% placed according to Order, and return it together with its map form.
random_contents(Order, Size) ->
    L = [{X, y} || X <- lists:seq(1, Size)],
    FinalList = order(Order, L),
    {FinalList, maps:from_list(FinalList)}.

order(first, L) -> [{z, z} | L];
order(last, L) -> L ++ [{z, z}];
order(random, L) -> shuffle([{z, z} | L]).

shuffle([]) -> [];
shuffle([Elem]) -> [Elem];
shuffle(List) -> shuffle(List, length(List), []).

shuffle([], 0, Result) ->
    Result;
shuffle(List, Len, Result) ->
    %% rand is used here in place of the deprecated random module
    {Elem, Rest} = nth_rest(rand:uniform(Len), List),
    shuffle(Rest, Len - 1, [Elem | Result]).

%% Take out the N-th element. Prefix is accumulated in reverse order,
%% which is harmless because the list is being shuffled anyway.
nth_rest(N, List) -> nth_rest(N, List, []).

nth_rest(1, [E | List], Prefix) -> {E, Prefix ++ List};
nth_rest(N, [E | List], Prefix) -> nth_rest(N - 1, List, [E | Prefix]).
```

- Benchee benchmark
```
Benchee.run(
  %{
    "proplists" => fn {list, _} -> :proplists.get_value(:z, list) end,
    "keyfind" => fn {list, _} -> :lists.keyfind(:z, 1, list) end,
    "maps" => fn {_, map} -> :maps.get(:z, map) end
  },
  inputs: %{
    "tiny first" => :bench_erl.random_contents(:first, 3),
    "tiny last" => :bench_erl.random_contents(:last, 3),
    "tiny random" => :bench_erl.random_contents(:random, 3),
    "small first" => :bench_erl.random_contents(:first, 10),
    "small last" => :bench_erl.random_contents(:last, 10),
    "small random" => :bench_erl.random_contents(:random, 10),
    "medium first" => :bench_erl.random_contents(:first, 100),
    "medium last" => :bench_erl.random_contents(:last, 100),
    "medium random" => :bench_erl.random_contents(:random, 100),
    "large first" => :bench_erl.random_contents(:first, 1000),
    "large last" => :bench_erl.random_contents(:last, 1000),
    "large random" => :bench_erl.random_contents(:random, 1000)
  }
)
```

- Results

    ***** With input tiny first #####
    Name                ips        average  deviation         median         99th %
    keyfind         56.44 M       17.72 ns  ±2113.88%          15 ns          33 ns
    maps            51.36 M       19.47 ns  ±1655.88%          18 ns          25 ns
    proplists       39.91 M       25.06 ns  ±1270.28%          21 ns          44 ns

    Comparison:
    keyfind         56.44 M
    maps            51.36 M - 1.10x slower +1.75 ns
    proplists       39.91 M - 1.41x slower +7.34 ns

    ***** With input tiny last #####
    Name                ips        average  deviation         median         99th %
    keyfind         46.77 M       21.38 ns ±15573.91%          18 ns          35 ns
    maps            46.58 M       21.47 ns ±15841.53%          19 ns          28 ns
    proplists       14.68 M       68.13 ns  ±5534.98%          64 ns          87 ns

    Comparison:
    keyfind         46.77 M
    maps            46.58 M - 1.00x slower +0.0892 ns
    proplists       14.68 M - 3.19x slower +46.75 ns

    ***** With input tiny random #####
    Name                ips        average  deviation         median         99th %
    keyfind         53.06 M       18.85 ns  ±1802.36%          16 ns          33 ns
    maps            50.92 M       19.64 ns  ±1978.64%          18 ns          27 ns
    proplists       24.02 M       41.63 ns   ±914.27%          38 ns          58 ns

    Comparison:
    keyfind         53.06 M
    maps            50.92 M - 1.04x slower +0.79 ns
    proplists       24.02 M - 2.21x slower +22.78 ns

    ***** With input small first #####
    Name                ips        average  deviation         median         99th %
    keyfind         52.42 M       19.08 ns ±10211.76%          16 ns          33 ns
    maps            48.32 M       20.70 ns  ±8531.67%          18 ns          31 ns
    proplists       36.70 M       27.25 ns  ±6266.95%          24 ns          41 ns

    Comparison:
    keyfind         52.42 M
    maps            48.32 M - 1.08x slower +1.62 ns
    proplists       36.70 M - 1.43x slower +8.17 ns

    ***** With input small last #####
    Name                ips        average  deviation         median         99th %
    maps            46.98 M       21.29 ns  ±8269.86%          18 ns          40 ns
    keyfind         31.67 M       31.58 ns  ±5654.38%          29 ns          46 ns
    proplists        7.01 M      142.63 ns  ±1391.75%         139 ns         160 ns

    Comparison:
    maps            46.98 M
    keyfind         31.67 M - 1.48x slower +10.29 ns
    proplists        7.01 M - 6.70x slower +121.35 ns

    ***** With input small random #####
    Name                ips        average  deviation         median         99th %
    maps            47.88 M       20.89 ns  ±8337.35%          18 ns          31 ns
    keyfind         36.55 M       27.36 ns  ±6339.40%          25 ns          42 ns
    proplists        8.91 M      112.19 ns  ±1709.40%         109 ns         130 ns

    Comparison:
    maps            47.88 M
    keyfind         36.55 M - 1.31x slower +6.47 ns
    proplists        8.91 M - 5.37x slower +91.30 ns

    ***** With input medium first #####
    Name                ips        average  deviation         median         99th %
    keyfind         55.53 M       18.01 ns  ±2242.82%          16 ns          31 ns
    proplists       41.04 M       24.37 ns  ±1707.53%          21 ns          41 ns
    maps            24.35 M       41.07 ns  ±1082.37%          38 ns          58 ns

    Comparison:
    keyfind         55.53 M
    proplists       41.04 M - 1.35x slower +6.36 ns
    maps            24.35 M - 2.28x slower +23.06 ns

    ***** With input medium last #####
    Name                ips        average  deviation         median         99th %
    maps            25.35 M       39.44 ns  ±1081.63%          37 ns          52 ns
    keyfind          6.70 M      149.32 ns   ±330.74%         144 ns         192 ns
    proplists        0.95 M     1048.98 ns    ±98.44%        1028 ns        1193 ns

    Comparison:
    maps            25.35 M
    keyfind          6.70 M - 3.79x slower +109.88 ns
    proplists        0.95 M - 26.59x slower +1009.53 ns

    ***** With input medium random #####
    Name                ips        average  deviation         median         99th %
    maps            24.49 M       40.83 ns  ±1053.65%          37 ns          61 ns
    keyfind         11.58 M       86.38 ns   ±537.75%          82 ns         108 ns
    proplists        1.57 M      636.12 ns   ±130.81%         623 ns         847 ns

    Comparison:
    maps            24.49 M
    keyfind         11.58 M - 2.12x slower +45.54 ns
    proplists        1.57 M - 15.58x slower +595.29 ns

    ***** With input large first #####
    Name                ips        average  deviation         median         99th %
    keyfind         54.27 M       18.43 ns ±12429.51%          16 ns          31 ns
    proplists       38.30 M       26.11 ns  ±7029.54%          21 ns          42 ns
    maps            23.16 M       43.17 ns  ±4515.38%          40 ns          50 ns

    Comparison:
    keyfind         54.27 M
    proplists       38.30 M - 1.42x slower +7.68 ns
    maps            23.16 M - 2.34x slower +24.75 ns

    ***** With input large last #####
    Name                ips        average  deviation         median         99th %
    maps         27200.24 K      0.0368 μs  ±6462.81%      0.0340 μs      0.0440 μs
    keyfind        287.78 K        3.47 μs    ±16.22%        3.40 μs        4.62 μs
    proplists       97.58 K       10.25 μs    ±93.41%       10.03 μs       12.74 μs

    Comparison:
    maps         27200.24 K
    keyfind        287.78 K - 94.52x slower +3.44 μs
    proplists       97.58 K - 278.74x slower +10.21 μs

    ***** With input large random #####
    Name                ips        average  deviation         median         99th %
    maps            26.83 M       37.27 ns  ±6362.71%          34 ns          53 ns
    keyfind          1.41 M      708.50 ns    ±30.78%         701 ns         789 ns
    proplists        0.35 M     2886.97 ns    ±31.76%        2821 ns        3239 ns

    Comparison:
    maps            26.83 M
    keyfind          1.41 M - 19.01x slower +671.24 ns
    proplists        0.35 M - 77.47x slower +2849.71 ns
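For anyone wanting to reproduce the merge and storage comparisons outside MongooseIM, here is a minimal sketch. The module name `info_sketch` and its three functions are hypothetical and are not part of the codebase; the list-based merge only mirrors the `orddict` calls visible in the fprof output, and the real `mongoose_session:merge_info/2` also updates a record field, which is left out here. `ets_words/1` creates its own table, since the shell session above assumes `ListTid` and `MapTid` already exist.

```erlang
-module(info_sketch).
-export([merge_lists/2, merge_maps/2, ets_words/1]).

%% Hypothetical list-based merge, following the orddict calls seen in
%% the fprof output. On a key conflict the value from New wins.
merge_lists(Old, New) ->
    Merged = orddict:merge(fun(_Key, _OldVal, NewVal) -> NewVal end,
                           orddict:from_list(Old),
                           orddict:from_list(New)),
    orddict:to_list(Merged).

%% Map-based merge: maps:merge/2 also keeps the value from its second
%% argument when a key is present in both maps.
merge_maps(Old, New) ->
    maps:merge(Old, New).

%% Words of memory reported by ets for a fresh table that holds a
%% single {1, Term} row. The table is deleted before returning.
ets_words(Term) ->
    Tid = ets:new(sketch_table, [set]),
    true = ets:insert(Tid, {1, Term}),
    Words = ets:info(Tid, memory),
    true = ets:delete(Tid),
    Words.
```

Calling `info_sketch:ets_words(BigList)` and `info_sketch:ets_words(BigMap)` with the terms from the shell session gives a comparable pair of figures, although the exact word counts depend on the OTP version and are not guaranteed to match the numbers quoted above.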
Until now, get_sessions/0,1 returned `[ses_tuple()]` while get_sessions/2,3 returned `[session()]`. That was inconsistent to begin with, and removing the inconsistency is perhaps the biggest benefit of this change. Profiling data from the mongooseim_one_to_one scenario didn't show any real difference: sometimes an increase of 0.01%, sometimes a decrease of 0.01%. This is sadly expected, since that scenario mostly doesn't cover the paths where this is used.
9049.1 / Erlang 23.0.3 / small_tests / 8c2297f
9049.2 / Erlang 23.0.3 / internal_mnesia / 8c2297f
9049.3 / Erlang 23.0.3 / odbc_mssql_mnesia / 8c2297f
9049.4 / Erlang 23.0.3 / mysql_redis / 8c2297f
9049.6 / Erlang 23.0.3 / ldap_mnesia / 8c2297f
9049.5 / Erlang 23.0.3 / riak_mnesia / 8c2297f
9049.7 / Erlang 23.0.3 / elasticsearch_and_cassandra_mnesia / 8c2297f
9049.9 / Erlang 22.3 / pgsql_mnesia / 8c2297f
Good changes! A few comments from me.
Very good changes!
Plenty of explanations and profiling and details and whatnot, in the commit messages 😁