feat: report daily participants to public stats #133

bajtos · 2024-01-17T16:13:19Z

Create a new spark_stats table, daily_participants, to keep track of daily participants. The daily count can then be queried like this:

SELECT day::TEXT, COUNT(DISTINCT participant_id) as count
FROM daily_participants GROUP BY day;

This table allows us to correctly calculate monthly participants too:

SELECT
  date_trunc('month', day)::DATE::TEXT as month,
  COUNT(DISTINCT participant_id) as count
FROM daily_participants
GROUP BY month;

TODO:

add tests

Links:

More evaluation stats & visualisations & alerts #79

Signed-off-by: Miroslav Bajtoš <oss@bajtos.net>

juliangruber

..otherwise LGTM!

juliangruber · 2024-01-17T16:17:04Z

lib/public-stats.js

+ */
+const updateDailyParticipants = async (pgClient, participants) => {
+  debug('Updating daily participants (%s seen)', participants.length)
+  for (const participantAddress of participants) {


Performance-wise, are we OK with running 2 SQL queries for every participant, in series? And if it fails midway, will we have inconsistent data? Playing devil's advocate here.

bajtos · 2024-01-17

Great questions! 💯

I'll think about this tomorrow.

bajtos · 2024-01-18

Yeah, I was so focused on storage efficiency and the querying side that I completely neglected the writing side.

With ~2k participants per round, my current implementation would run 4k SQL queries. That would take forever to complete.

Thanks for flagging this early! 😍

bajtos · 2024-01-18

Cool, I found a neat trick: we can use SELECT UNNEST($1::TEXT[]) to build a query that accepts a JavaScript array parameter and converts the items into result rows.

Then I can change INSERT...VALUES to INSERT...SELECT to leverage this mechanism.

The changes were a bit more involved (see 7a2d14d), but I expect the performance to be excellent.
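
For readers who do not want to open 7a2d14d, here is a rough sketch of what the UNNEST-based batching could look like. It is not the code from that commit. The table and column names (`participants.participant_address`, `participants.id`, `daily_participants.day`, `daily_participants.participant_id`), the unique constraints that `ON CONFLICT DO NOTHING` relies on, and the helper names are all assumptions made for illustration. The transaction wrapper is also an addition here, prompted by the "fails midway" question above, and it assumes `pgClient` is a single connection rather than a pool.

```javascript
// Sketch only: schema and helper names are assumptions, not taken from 7a2d14d.

// Build the two batched statements. Kept pure so it can be inspected without a database.
const buildDailyParticipantQueries = (participants) => [
  {
    // One statement registers every address; UNNEST turns the array into rows.
    text: `
      INSERT INTO participants (participant_address)
      SELECT UNNEST($1::TEXT[]) AS participant_address
      ON CONFLICT DO NOTHING`,
    values: [participants]
  },
  {
    // One statement links all of those participants to today's date.
    text: `
      INSERT INTO daily_participants (day, participant_id)
      SELECT now()::DATE AS day, id AS participant_id
      FROM participants
      WHERE participant_address = ANY($1::TEXT[])
      ON CONFLICT DO NOTHING`,
    values: [participants]
  }
]

// Run both statements in one transaction so a failure leaves no partial update.
const updateDailyParticipantsBatched = async (pgClient, participants) => {
  await pgClient.query('BEGIN')
  try {
    for (const { text, values } of buildDailyParticipantQueries(participants)) {
      await pgClient.query(text, values)
    }
    await pgClient.query('COMMIT')
  } catch (err) {
    await pgClient.query('ROLLBACK')
    throw err
  }
}
```

With this shape, the number of round trips no longer grows with the number of participants: ~2k addresses cost two INSERT statements plus the transaction control statements, instead of ~4k queries.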

juliangruber

Great work 😍

feat: report daily participants to public stats

27b26cf

Signed-off-by: Miroslav Bajtoš <oss@bajtos.net>

bajtos requested a review from juliangruber January 17, 2024 16:13

juliangruber approved these changes Jan 17, 2024

bajtos added 2 commits January 18, 2024 08:40

Merge branch 'main' into feat-daily-participants

830c028

fixup! add tests + optimise performance

7a2d14d

Signed-off-by: Miroslav Bajtoš <oss@bajtos.net>

bajtos requested a review from juliangruber January 18, 2024 08:56

bajtos marked this pull request as ready for review January 18, 2024 08:59

bajtos mentioned this pull request Jan 18, 2024

feat: daily & monthly participants filecoin-station/spark-stats#10

Merged

juliangruber approved these changes Jan 18, 2024

bajtos merged commit 045ff21 into main Jan 18, 2024
5 checks passed

bajtos deleted the feat-daily-participants branch January 18, 2024 12:36

bajtos mentioned this pull request Apr 15, 2024

Feat: Created daily_node_metrics table and received/stored station_id #188

Merged
