Mongoose session #3018

Merged
merged 3 commits into from
Mar 2, 2021

NelsonVides
Collaborator

Plenty of explanations and profiling and details and whatnot, in the commit messages 😁

Simply put, maps are faster than lists. Let's see some benchmarks:

* Microbenchmarks:

Using Benchee, we can set up diverse inputs, where two parameters are
tweakable: number of keys, and their order. Lists are expected to be
slower when they're big, but inserting or looking up the first value
will always be a constant-time operation. However, how likely is it that
we're looking up the first value?

I generate the inputs using a function that takes a size and an
ordering, and generates a list of that size plus the element we're going
to query, placed at the beginning, at the end, or at a random position
after shuffling the list. It then returns this dataset as a tuple where
the first element is the proplist and the second is the same proplist
converted to a map.

Results show that lists win only in their best-case scenario, when the
lookup element is at the beginning of the list, and even there the
differences are small: for smaller lists (3 or 10 elements), maps are on
average only 1.1~1.3 times slower, and 99% percentiles are actually even
better at 1.1 times faster. When the size gets bigger, the
implementation of maps changes (they're two arrays for small sizes, but
switch to hash array mapped tries, HAMTs, once they hold more than 32
keys), and then maps can be as much as 2.5 times slower on average, and
1.5 times slower on 99% percentiles.

In all other cases, whether the lookup key is at the end (regardless of
the size) or the keys are uniformly distributed across the input, maps
beat lists by a wide margin, with differences of around 20x on average,
and several hundred times better on the 99% percentiles.
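
To make the comparison concrete, here is a minimal sketch of the three
lookups being measured, run against the same data in the worst case for
lists (target at the end). The module name `lookup_sketch` and the
function `lookups/1` are made up for illustration; only the three
library calls come from the benchmark. Note that `lists:keyfind/3`
returns the whole tuple, while the other two return only the value.

```erlang
-module(lookup_sketch).
-export([lookups/1]).

%% Builds Size filler pairs followed by the target {z, z}, then looks the
%% target up with each of the three functions compared in the benchmark.
lookups(Size) ->
    List = [{X, y} || X <- lists:seq(1, Size)] ++ [{z, z}],
    Map = maps:from_list(List),
    #{proplists => proplists:get_value(z, List), % linear scan, returns the value
      keyfind   => lists:keyfind(z, 1, List),    % linear scan, returns the tuple
      maps      => maps:get(z, Map)}.            % map lookup, returns the value
```

Calling `lookup_sketch:lookups(10)` gives `z` for the `proplists` and
`maps` entries and `{z, z}` for the `keyfind` entry.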

* Storage sizes:

An important thing to take into account is that this session info is
stored in the mnesia session table, so the memory it takes when written
to the ets tables is relevant: it indirectly affects how these tables
are synchronised across nodes, the terms are copied into the calling
process whenever the table is queried, and in general they consume RAM
on the host. The following checks in the shell give us some information:

A single element of several kv-pairs takes less space for maps:
> BigList = [ {X, a} || X <- lists:seq(1,100)].
> BigMap = maps:from_list(BigList).
> ets:insert(ListTid, {1, BigList}).
> ets:insert(MapTid, {1, BigMap}).
> ets:info(ListTid,memory).
818
> ets:info(MapTid,memory).
676

Many elements of many kv-pairs take linearly less space for maps:
> [ ets:insert(ListTid, {N, [ {N+X, N+X+1} || X <- lists:seq(1,10) ]}) || N <- lists:seq(1,100) ].
> [ ets:insert(MapTid, {N, maps:from_list([ {N+X, N+X+1} || X <- lists:seq(1,10) ]) }) || N <- lists:seq(1,100) ].
> ets:info(ListTid, size). -> 100
> ets:info(MapTid, size). -> 100
> ets:info(ListTid,memory).
6011
> ets:info(MapTid,memory).
3411
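
The shell session above does not show how the tables were created, so
the following is only a sketch of one way to repeat the first check. It
assumes plain `set` tables, and the module name `ets_size_sketch`, the
function `compare/1` and the table names are hypothetical. Keep in mind
that `ets:info(Tid, memory)` reports words, not bytes.

```erlang
-module(ets_size_sketch).
-export([compare/1]).

%% Stores the same N key-value pairs once as a proplist and once as a map,
%% each as a single ETS object, and reports the words each table uses
%% beyond the empty-table baseline.
compare(N) ->
    List = [{X, a} || X <- lists:seq(1, N)],
    Map = maps:from_list(List),
    ListTid = ets:new(list_tab, [set]),   % assumed table options
    MapTid = ets:new(map_tab, [set]),     % assumed table options
    Empty = ets:info(ListTid, memory),    % words used by an empty table
    true = ets:insert(ListTid, {1, List}),
    true = ets:insert(MapTid, {1, Map}),
    Result = #{list_words => ets:info(ListTid, memory) - Empty,
               map_words  => ets:info(MapTid, memory) - Empty},
    true = ets:delete(ListTid),
    true = ets:delete(MapTid),
    Result.
```

Exact numbers depend on the OTP release and word size, so they may not
match the figures quoted above.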

* MongooseIM profiling with fprof:

Using amoc's one_to_one scenario with default configuration and 12000
users with interarrival=10, I ran fprof on MIM for 10 seconds, on both
master and this branch. These are some results:

Showing a 1.2x improvement in querying:
** With lists
{[{{ejabberd_gen_sm,get_sessions,3},           25135, 8089.379,  145.278}],
 { {ejabberd_sm_mnesia,get_sessions,2},        25135, 8089.379,  145.278},     %
 [{{mnesia,dirty_index_read,3},                25135, 7944.101,  414.477}]}.
** With maps
{[{{ejabberd_gen_sm,get_sessions,3},           20824, 5920.781,  110.654}],
 { {ejabberd_sm_mnesia,get_sessions,2},        20824, 5920.781,  110.654},     %
 [{{mnesia,dirty_index_read,3},                20824, 5803.559,  301.085},
  {garbage_collect,                             681,    6.568,    6.568}]}.

Showing a 1.23x improvement in inserting:
** With lists
{[{{ejabberd_sm_mnesia,'-create_session/4-fun-0-',1},1001,  635.633,    5.730},
  {{ejabberd_sm_mnesia,'-create_session/4-fun-1-',1},1000,  559.658,    5.372}],
 { {mnesia,write,1},                           2001, 1195.291,   11.102},     %
 [{{mnesia,write,3},                           2001, 1184.189,   10.698}]}.
** With maps
{[{{ejabberd_sm_mnesia,'-create_session/4-fun-0-',1},1000,  522.757,    4.918},
  {{ejabberd_sm_mnesia,'-create_session/4-fun-1-',1},1000,  444.673,    4.290}],
 { {mnesia,write,1},                           2000,  967.430,    9.208},     %
 [{{mnesia,write,3},                           2000,  958.222,    8.892}]}.

Showing a 9x improvement in merging:
** With lists
{[{{ejabberd_sm_mnesia,create_session,4},      1000,  178.977,   28.998}],
 { {mongoose_session,merge_info,2},            1000,  178.977,   28.998},     %
 [{{orddict,merge,3},                          1000,   72.619,   26.883},
  {{orddict,from_list,1},                      2000,   68.903,   19.996},
  {{erlang,setelement,3},                      1000,    4.281,    4.281},
  {{orddict,to_list,1},                        1000,    4.176,    4.176}]}.
** With maps
{[{{ejabberd_sm_mnesia,create_session,4},      1000,   19.326,   11.556}],
 { {mongoose_session,merge_info,2},            1000,   19.326,   11.556},     %
 [{{maps,merge,2},                             1000,    4.192,    4.192},
  {{erlang,setelement,3},                      1000,    3.578,    3.578}]}.
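
The call trees above suggest why merging gets so much cheaper: the list
version has to normalise both inputs with `orddict:from_list/1` before
`orddict:merge/3`, while the map version is a single `maps:merge/2`.
The sketch below only illustrates that difference. It is not the actual
`mongoose_session:merge_info/2` code; the module and function names are
hypothetical, and letting the new value win on a key conflict is an
assumption.

```erlang
-module(merge_sketch).
-export([merge_lists/2, merge_maps/2]).

%% List version: sort both proplists into orddicts, merge them, and
%% convert back. On a key conflict the value from New is kept (assumed).
merge_lists(Old, New) ->
    orddict:to_list(
      orddict:merge(fun(_Key, _OldValue, NewValue) -> NewValue end,
                    orddict:from_list(Old),
                    orddict:from_list(New))).

%% Map version: maps:merge/2 already lets the second map win on conflicts.
merge_maps(Old, New) ->
    maps:merge(Old, New).
```

For example, `merge_sketch:merge_lists([{a, 1}, {b, 2}], [{b, 3}, {c, 4}])`
returns `[{a, 1}, {b, 3}, {c, 4}]`, and `merge_maps/2` on the
equivalent maps returns `#{a => 1, b => 3, c => 4}`.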

========================================================================

Below are the code and the full output:

- Data generation
```
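-module(bench_erl).
-export([random_contents/2]).

%% Returns {Proplist, Map} holding the same Size filler pairs plus the
%% lookup target {z, z}, placed according to Order.
random_contents(Order, Size) ->
    L = [{X, y} || X <- lists:seq(1, Size)],
    FinalList = order(Order, L),
    {FinalList, maps:from_list(FinalList)}.

order(first, L) -> [{z, z} | L];
order(last, L) -> L ++ [{z, z}];
order(random, L) -> shuffle([{z, z} | L]).

%% Shuffle by repeatedly removing a uniformly chosen element.
shuffle([])     -> [];
shuffle([Elem]) -> [Elem];
shuffle(List)   -> shuffle(List, length(List), []).

shuffle([], 0, Result) -> Result;
shuffle(List, Len, Result) ->
    %% rand replaces the deprecated random module
    {Elem, Rest} = nth_rest(rand:uniform(Len), List),
    shuffle(Rest, Len - 1, [Elem|Result]).

%% Returns {NthElement, Rest}; the order of Rest is not preserved.
nth_rest(N, List) -> nth_rest(N, List, []).

nth_rest(1, [E|List], Prefix) -> {E, Prefix ++ List};
nth_rest(N, [E|List], Prefix) -> nth_rest(N - 1, List, [E|Prefix]).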

- Benchee benchmark
```
Benchee.run(
  %{
    "proplists" => fn {list, _} -> :proplists.get_value(:z, list) end,
    "keyfind" => fn {list, _} -> :lists.keyfind(:z, 1, list) end,
    "maps" => fn {_, map} -> :maps.get(:z, map) end
  },
  inputs: %{
    "tiny first" => :bench_erl.random_contents(:first, 3),
    "tiny last" => :bench_erl.random_contents(:last, 3),
    "tiny random" => :bench_erl.random_contents(:random, 3),
    "small first" => :bench_erl.random_contents(:first, 10),
    "small last" => :bench_erl.random_contents(:last, 10),
    "small random" => :bench_erl.random_contents(:random, 10),
    "medium first" => :bench_erl.random_contents(:first, 100),
    "medium last" => :bench_erl.random_contents(:last, 100),
    "medium random" => :bench_erl.random_contents(:random, 100),
    "large first" => :bench_erl.random_contents(:first, 1000),
    "large last" => :bench_erl.random_contents(:last, 1000),
    "large random" => :bench_erl.random_contents(:random, 1000)
  }
)
```

- Results
***** With input tiny first #####
Name                ips        average  deviation         median         99th %
keyfind         56.44 M       17.72 ns  ±2113.88%          15 ns          33 ns
maps            51.36 M       19.47 ns  ±1655.88%          18 ns          25 ns
proplists       39.91 M       25.06 ns  ±1270.28%          21 ns          44 ns

Comparison:
keyfind         56.44 M
maps            51.36 M - 1.10x slower +1.75 ns
proplists       39.91 M - 1.41x slower +7.34 ns

***** With input tiny last #####
Name                ips        average  deviation         median         99th %
keyfind         46.77 M       21.38 ns ±15573.91%          18 ns          35 ns
maps            46.58 M       21.47 ns ±15841.53%          19 ns          28 ns
proplists       14.68 M       68.13 ns  ±5534.98%          64 ns          87 ns

Comparison:
keyfind         46.77 M
maps            46.58 M - 1.00x slower +0.0892 ns
proplists       14.68 M - 3.19x slower +46.75 ns

***** With input tiny random #####
Name                ips        average  deviation         median         99th %
keyfind         53.06 M       18.85 ns  ±1802.36%          16 ns          33 ns
maps            50.92 M       19.64 ns  ±1978.64%          18 ns          27 ns
proplists       24.02 M       41.63 ns   ±914.27%          38 ns          58 ns

Comparison:
keyfind         53.06 M
maps            50.92 M - 1.04x slower +0.79 ns
proplists       24.02 M - 2.21x slower +22.78 ns

***** With input small first #####
Name                ips        average  deviation         median         99th %
keyfind         52.42 M       19.08 ns ±10211.76%          16 ns          33 ns
maps            48.32 M       20.70 ns  ±8531.67%          18 ns          31 ns
proplists       36.70 M       27.25 ns  ±6266.95%          24 ns          41 ns

Comparison:
keyfind         52.42 M
maps            48.32 M - 1.08x slower +1.62 ns
proplists       36.70 M - 1.43x slower +8.17 ns

***** With input small last #####
Name                ips        average  deviation         median         99th %
maps            46.98 M       21.29 ns  ±8269.86%          18 ns          40 ns
keyfind         31.67 M       31.58 ns  ±5654.38%          29 ns          46 ns
proplists        7.01 M      142.63 ns  ±1391.75%         139 ns         160 ns

Comparison:
maps            46.98 M
keyfind         31.67 M - 1.48x slower +10.29 ns
proplists        7.01 M - 6.70x slower +121.35 ns

***** With input small random #####
Name                ips        average  deviation         median         99th %
maps            47.88 M       20.89 ns  ±8337.35%          18 ns          31 ns
keyfind         36.55 M       27.36 ns  ±6339.40%          25 ns          42 ns
proplists        8.91 M      112.19 ns  ±1709.40%         109 ns         130 ns

Comparison:
maps            47.88 M
keyfind         36.55 M - 1.31x slower +6.47 ns
proplists        8.91 M - 5.37x slower +91.30 ns

***** With input medium first #####
Name                ips        average  deviation         median         99th %
keyfind         55.53 M       18.01 ns  ±2242.82%          16 ns          31 ns
proplists       41.04 M       24.37 ns  ±1707.53%          21 ns          41 ns
maps            24.35 M       41.07 ns  ±1082.37%          38 ns          58 ns

Comparison:
keyfind         55.53 M
proplists       41.04 M - 1.35x slower +6.36 ns
maps            24.35 M - 2.28x slower +23.06 ns

***** With input medium last #####
Name                ips        average  deviation         median         99th %
maps            25.35 M       39.44 ns  ±1081.63%          37 ns          52 ns
keyfind          6.70 M      149.32 ns   ±330.74%         144 ns         192 ns
proplists        0.95 M     1048.98 ns    ±98.44%        1028 ns        1193 ns

Comparison:
maps            25.35 M
keyfind          6.70 M - 3.79x slower +109.88 ns
proplists        0.95 M - 26.59x slower +1009.53 ns

***** With input medium random #####
Name                ips        average  deviation         median         99th %
maps            24.49 M       40.83 ns  ±1053.65%          37 ns          61 ns
keyfind         11.58 M       86.38 ns   ±537.75%          82 ns         108 ns
proplists        1.57 M      636.12 ns   ±130.81%         623 ns         847 ns

Comparison:
maps            24.49 M
keyfind         11.58 M - 2.12x slower +45.54 ns
proplists        1.57 M - 15.58x slower +595.29 ns

***** With input large first #####
Name                ips        average  deviation         median         99th %
keyfind         54.27 M       18.43 ns ±12429.51%          16 ns          31 ns
proplists       38.30 M       26.11 ns  ±7029.54%          21 ns          42 ns
maps            23.16 M       43.17 ns  ±4515.38%          40 ns          50 ns

Comparison:
keyfind         54.27 M
proplists       38.30 M - 1.42x slower +7.68 ns
maps            23.16 M - 2.34x slower +24.75 ns

***** With input large last #####
Name                ips        average  deviation         median         99th %
maps         27200.24 K      0.0368 μs  ±6462.81%      0.0340 μs      0.0440 μs
keyfind        287.78 K        3.47 μs    ±16.22%        3.40 μs        4.62 μs
proplists       97.58 K       10.25 μs    ±93.41%       10.03 μs       12.74 μs

Comparison:
maps         27200.24 K
keyfind        287.78 K - 94.52x slower +3.44 μs
proplists       97.58 K - 278.74x slower +10.21 μs

***** With input large random #####
Name                ips        average  deviation         median         99th %
maps            26.83 M       37.27 ns  ±6362.71%          34 ns          53 ns
keyfind          1.41 M      708.50 ns    ±30.78%         701 ns         789 ns
proplists        0.35 M     2886.97 ns    ±31.76%        2821 ns        3239 ns

Comparison:
maps            26.83 M
keyfind          1.41 M - 19.01x slower +671.24 ns
proplists        0.35 M - 77.47x slower +2849.71 ns

Up to now, get_sessions/0,1 returned `[ses_tuple()]` while
get_sessions/2,3 returned `[session()]`. That is inconsistent to begin
with, and removing the inconsistency is perhaps the biggest benefit of
this change.

Profiling data from the mongooseim_one_to_one scenario didn't really
show any difference: sometimes increases of 0.01%, sometimes decreases
of 0.01%. This is sadly expected, as that scenario mostly doesn't cover
the paths where this is used.
Collaborator

mongoose-im commented Jan 29, 2021

9049.1 / Erlang 23.0.3 / small_tests / 8c2297f
Reports root / small


9049.2 / Erlang 23.0.3 / internal_mnesia / 8c2297f
Reports root/ big
OK: 1502 / Failed: 0 / User-skipped: 161 / Auto-skipped: 0


9049.3 / Erlang 23.0.3 / odbc_mssql_mnesia / 8c2297f
Reports root/ big
OK: 2770 / Failed: 0 / User-skipped: 229 / Auto-skipped: 0


9049.4 / Erlang 23.0.3 / mysql_redis / 8c2297f
Reports root/ big
OK: 2765 / Failed: 0 / User-skipped: 234 / Auto-skipped: 0


9049.6 / Erlang 23.0.3 / ldap_mnesia / 8c2297f
Reports root/ big
OK: 1404 / Failed: 0 / User-skipped: 259 / Auto-skipped: 0


9049.5 / Erlang 23.0.3 / riak_mnesia / 8c2297f
Reports root/ big
OK: 1628 / Failed: 0 / User-skipped: 181 / Auto-skipped: 0


9049.7 / Erlang 23.0.3 / elasticsearch_and_cassandra_mnesia / 8c2297f
Reports root/ big
OK: 331 / Failed: 0 / User-skipped: 38 / Auto-skipped: 0


9049.9 / Erlang 22.3 / pgsql_mnesia / 8c2297f
Reports root/ big / small
OK: 2783 / Failed: 0 / User-skipped: 216 / Auto-skipped: 0

Member

@chrzaszcz chrzaszcz left a comment

Good changes! A few comments from me.

src/mod_carboncopy.erl (outdated, resolved)
src/ejabberd_sm.erl (outdated, resolved)
Member

@chrzaszcz chrzaszcz left a comment

Very good changes!

@chrzaszcz chrzaszcz merged commit 135fec2 into master Mar 2, 2021
@chrzaszcz chrzaszcz deleted the mongoose_session branch March 2, 2021 08:34
@leszke leszke added this to the 4.2.0 milestone Apr 16, 2021