feat_: use content-topic override for all community filters #5993

Open
wants to merge 1 commit into
base: feat/comm-content-topic-poc
from


chaitanyaprem
Contributor

@chaitanyaprem chaitanyaprem commented Oct 25, 2024

This is the second phase of the content-topic changes for communities, in which a single content-topic is used for both sending and receiving. It is a follow-up to #5864 and should be released only once all users have migrated to a version that includes #5864, because this change breaks compatibility for communities and users on any version before #5864.

Important changes:

  • Both send and receive in communities are done via a single content-topic
  • Basic validation of community messaging with phase-1 changes
  • Validation of non-community messaging with 2.31 and phase-1 changes, tested by kicking a user from a community and checking that all instances are updated.
  • Dogfooding by other team members
  • All community feature validation by QA
  • If possible, release phase-2 along with phase-1 changes for new communities that would be created, but roll out the 2 phases separately for existing communities, as being discussed here: [Deliverable] Review usage of content topics in Status Communities protocol waku-org/pm#268 (comment). @osmaczko would it be as simple as adding a communityVersion field in communityDescription? Wondering if all existing communities can be labelled as version 1 and communities created after this change as version 2, with the version used to determine how content-topics shall be used.
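
To make the communityVersion idea concrete, here is a minimal Go sketch. Everything in it is hypothetical: the `CommunityVersion` type, its two constants, and the `topicsToSubscribe` helper are illustrative names, not existing status-go APIs, and the topic strings are placeholders. It only shows how a version number could select between listening on every legacy content topic and listening on the single shared one.

```go
package main

import "fmt"

// CommunityVersion is a hypothetical field; the PR description only floats
// the idea of adding a version to the community description.
type CommunityVersion int

const (
	VersionLegacy CommunityVersion = 1 // existing communities
	VersionSingle CommunityVersion = 2 // communities created after this change
)

// topicsToSubscribe returns the content topics a client would listen on.
// sharedTopic and perChatTopics are illustrative inputs, not status-go types.
func topicsToSubscribe(v CommunityVersion, sharedTopic string, perChatTopics []string) []string {
	if v >= VersionSingle {
		// Version 2: a single content topic for both send and receive.
		return []string{sharedTopic}
	}
	// Version 1: keep listening on the per-chat topics as well.
	return append([]string{sharedTopic}, perChatTopics...)
}

func main() {
	perChat := []string{"/chat/1", "/chat/2"}
	fmt.Println(topicsToSubscribe(VersionLegacy, "/community/abc", perChat))
	fmt.Println(topicsToSubscribe(VersionSingle, "/community/abc", perChat))
}
```

Whether a version field alone is enough depends on how existing communities would be migrated, which is the open question above.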

@status-im-auto
Member

status-im-auto commented Oct 25, 2024

Jenkins Builds

Click to see older builds (128)
Commit #️⃣ Finished (UTC) Duration Platform Result
✔️ f3352d2 #1 2024-10-25 10:07:21 ~4 min ios 📦zip
✔️ f3352d2 #1 2024-10-25 10:08:07 ~5 min linux 📦zip
✖️ f3352d2 #1 2024-10-25 10:08:59 ~6 min tests-rpc 📄log
✔️ f3352d2 #1 2024-10-25 10:09:04 ~6 min android 📦aar
✖️ f3352d2 #1 2024-10-25 10:36:11 ~33 min tests 📄log
✔️ f803523 #2 2024-10-29 11:08:30 ~2 min android 📦aar
✔️ f803523 #2 2024-10-29 11:08:46 ~2 min linux 📦zip
✔️ f803523 #2 2024-10-29 11:09:13 ~3 min ios 📦zip
✖️ f803523 #2 2024-10-29 11:12:17 ~6 min tests-rpc 📄log
✖️ f803523 #3 2024-10-29 11:37:24 ~2 min tests-rpc 📄log
✖️ f803523 #2 2024-10-29 11:40:03 ~34 min tests 📄log
✔️ f028f84 #3 2024-10-29 12:22:42 ~3 min ios 📦zip
✔️ f028f84 #3 2024-10-29 12:24:53 ~5 min linux 📦zip
✔️ f028f84 #3 2024-10-29 12:25:35 ~5 min android 📦aar
✖️ f028f84 #4 2024-10-29 12:27:01 ~7 min tests-rpc 📄log
✖️ f028f84 #3 2024-10-29 12:59:16 ~39 min tests 📄log
✔️ cb59e06 #4 2024-10-30 04:28:07 ~3 min android 📦aar
✔️ cb59e06 #4 2024-10-30 04:28:22 ~3 min linux 📦zip
✔️ cb59e06 #5 2024-10-30 04:29:23 ~4 min tests-rpc 📄log
✔️ cb59e06 #1 2024-10-30 04:29:31 ~4 min macos 📦zip
✔️ cb59e06 #4 2024-10-30 04:30:25 ~5 min ios 📦zip
✔️ cb59e06 #1 2024-10-30 04:32:45 ~8 min macos 📦zip
✖️ cb59e06 #1 2024-10-30 04:35:56 ~11 min windows 📦zip
✖️ cb59e06 #2 2024-10-30 04:53:15 ~10 min windows 📦zip
✔️ cb59e06 #4 2024-10-30 04:58:23 ~33 min tests 📄log
✔️ 54b1cd7 #6 2024-11-01 08:27:44 ~2 min tests-rpc 📄log
✔️ 54b1cd7 #2 2024-11-01 08:30:41 ~5 min macos 📦zip
✔️ 54b1cd7 #5 2024-11-01 08:30:54 ~5 min linux 📦zip
✔️ 54b1cd7 #5 2024-11-01 08:31:16 ~5 min android 📦aar
✔️ 54b1cd7 #5 2024-11-01 08:31:38 ~6 min ios 📦zip
✔️ 54b1cd7 #2 2024-11-01 08:35:29 ~10 min macos 📦zip
✖️ 54b1cd7 #3 2024-11-01 08:36:51 ~11 min windows 📦zip
✔️ 54b1cd7 #5 2024-11-01 08:59:53 ~34 min tests 📄log
✔️ 7964e8e #6 2024-12-04 10:55:42 ~5 min linux 📦zip
✔️ 7964e8e #4 2024-12-04 10:56:49 ~6 min windows 📦zip
✔️ 7964e8e #3 2024-12-04 10:57:05 ~6 min macos 📦zip
✔️ 7964e8e #7 2024-12-04 10:57:22 ~7 min tests-rpc 📄log
✔️ 7964e8e #6 2024-12-04 10:57:32 ~7 min android 📦aar
✔️ 7964e8e #6 2024-12-04 10:57:54 ~7 min ios 📦zip
✔️ 7964e8e #3 2024-12-04 11:00:34 ~10 min macos 📦zip
✔️ 7964e8e #6 2024-12-04 11:21:38 ~31 min tests 📄log
✔️ 5c3e122 #5 2024-12-04 11:01:09 ~4 min windows 📦zip
✔️ 5c3e122 #7 2024-12-04 11:03:27 ~7 min linux 📦zip
✔️ 5c3e122 #7 2024-12-04 11:03:50 ~6 min android 📦aar
✔️ 5c3e122 #8 2024-12-04 11:03:52 ~6 min tests-rpc 📄log
✔️ 5c3e122 #4 2024-12-04 11:05:17 ~8 min macos 📦zip
✔️ 5c3e122 #7 2024-12-04 11:06:05 ~8 min ios 📦zip
✔️ 5c3e122 #4 2024-12-04 11:09:22 ~8 min macos 📦zip
✔️ 5c3e122 #7 2024-12-04 11:52:13 ~30 min tests 📄log
✔️ 6219ad7 #8 2024-12-09 13:38:01 ~5 min linux 📦zip
✔️ 6219ad7 #8 2024-12-09 13:38:15 ~5 min android 📦aar
✔️ 6219ad7 #6 2024-12-09 13:38:23 ~5 min windows 📦zip
✔️ 6219ad7 #9 2024-12-09 13:39:00 ~6 min tests-rpc 📄log
✔️ 6219ad7 #8 2024-12-09 13:39:08 ~6 min ios 📦zip
✔️ 6219ad7 #5 2024-12-09 13:42:50 ~10 min macos 📦zip
✔️ 6219ad7 #5 2024-12-09 13:45:16 ~12 min macos 📦zip
✖️ 6219ad7 #8 2024-12-09 14:02:06 ~29 min tests 📄log
✔️ c0ccc28 #7 2024-12-09 13:42:29 ~4 min windows 📦zip
✔️ c0ccc28 #9 2024-12-09 13:43:43 ~5 min linux 📦zip
✔️ c0ccc28 #9 2024-12-09 13:44:04 ~5 min android 📦aar
✔️ c0ccc28 #10 2024-12-09 13:45:09 ~5 min tests-rpc 📄log
✔️ c0ccc28 #9 2024-12-09 13:46:03 ~6 min ios 📦zip
✔️ c0ccc28 #6 2024-12-09 13:51:34 ~8 min macos 📦zip
✔️ c0ccc28 #6 2024-12-09 13:53:40 ~8 min macos 📦zip
✔️ c0ccc28 #9 2024-12-09 14:31:27 ~29 min tests 📄log
✔️ 00ea105 #8 2024-12-10 07:28:59 ~4 min windows 📦zip
✔️ 00ea105 #10 2024-12-10 07:29:51 ~5 min android 📦aar
✔️ 00ea105 #10 2024-12-10 07:30:14 ~6 min linux 📦zip
✔️ 00ea105 #10 2024-12-10 07:31:04 ~6 min ios 📦zip
✔️ 00ea105 #11 2024-12-10 07:31:19 ~7 min tests-rpc 📄log
✔️ 00ea105 #7 2024-12-10 07:33:42 ~9 min macos 📦zip
✔️ 00ea105 #8 2024-12-10 07:52:27 ~11 min macos 📦zip
✖️ 00ea105 #10 2024-12-10 07:54:57 ~30 min tests 📄log
✖️ eba057f #11 2024-12-11 07:10:08 ~3 min tests 📄log
✔️ eba057f #9 2024-12-11 07:11:01 ~4 min windows 📦zip
✔️ eba057f #11 2024-12-11 07:12:31 ~6 min linux 📦zip
✔️ eba057f #11 2024-12-11 07:12:33 ~6 min android 📦aar
✔️ eba057f #11 2024-12-11 07:13:12 ~6 min ios 📦zip
✔️ eba057f #12 2024-12-11 07:13:23 ~7 min tests-rpc 📄log
✔️ eba057f #9 2024-12-11 07:16:15 ~9 min macos 📦zip
✔️ eba057f #8 2024-12-11 07:16:37 ~10 min macos 📦zip
✖️ c28eb07 #12 2024-12-11 07:18:01 ~3 min tests 📄log
✔️ c28eb07 #10 2024-12-11 07:18:39 ~4 min windows 📦zip
✔️ c28eb07 #13 2024-12-11 07:20:35 ~6 min tests-rpc 📄log
✔️ c28eb07 #12 2024-12-11 07:20:39 ~6 min linux 📦zip
✔️ c28eb07 #12 2024-12-11 07:21:02 ~6 min android 📦aar
✔️ c28eb07 #12 2024-12-11 07:21:36 ~7 min ios 📦zip
✔️ c28eb07 #10 2024-12-11 07:24:51 ~8 min macos 📦zip
✔️ c28eb07 #9 2024-12-11 07:27:03 ~10 min macos 📦zip
✔️ 0447b52 #11 2024-12-11 07:30:58 ~4 min windows 📦zip
✔️ 0447b52 #13 2024-12-11 07:32:26 ~5 min linux 📦zip
✔️ 0447b52 #13 2024-12-11 07:33:07 ~6 min android 📦aar
✔️ 0447b52 #14 2024-12-11 07:33:16 ~6 min tests-rpc 📄log
✔️ 0447b52 #13 2024-12-11 07:33:46 ~7 min ios 📦zip
✔️ 0447b52 #11 2024-12-11 07:34:32 ~7 min macos 📦zip
✔️ 0447b52 #10 2024-12-11 07:37:16 ~10 min macos 📦zip
✔️ 0447b52 #13 2024-12-11 07:56:40 ~30 min tests 📄log
✔️ bc63806 #12 2024-12-11 07:45:04 ~4 min windows 📦zip
✔️ bc63806 #14 2024-12-11 07:46:36 ~5 min linux 📦zip
✔️ bc63806 #14 2024-12-11 07:47:06 ~6 min android 📦aar
✔️ bc63806 #15 2024-12-11 07:47:24 ~6 min tests-rpc 📄log
✔️ bc63806 #14 2024-12-11 07:47:39 ~6 min ios 📦zip
✔️ bc63806 #12 2024-12-11 07:49:39 ~8 min macos 📦zip
✔️ bc63806 #11 2024-12-11 07:51:15 ~10 min macos 📦zip
✔️ b247503 #13 2024-12-11 07:49:13 ~4 min windows 📦zip
✔️ b247503 #15 2024-12-11 07:52:19 ~5 min linux 📦zip
✔️ b247503 #15 2024-12-11 07:53:10 ~5 min android 📦aar
✔️ b247503 #16 2024-12-11 07:53:59 ~6 min tests-rpc 📄log
✔️ b247503 #15 2024-12-11 07:55:06 ~7 min ios 📦zip
✔️ b247503 #13 2024-12-11 07:56:38 ~6 min macos 📦zip
✖️ b247503 #14 2024-12-11 07:59:45 ~2 min tests 📄log
✔️ b247503 #12 2024-12-11 08:01:43 ~10 min macos 📦zip
✔️ 654cd34 #14 2024-12-11 08:42:57 ~4 min windows 📦zip
✔️ 654cd34 #13 2024-12-11 08:43:32 ~4 min macos 📦zip
✔️ 654cd34 #16 2024-12-11 08:45:05 ~6 min android 📦aar
✔️ 654cd34 #17 2024-12-11 08:45:25 ~6 min tests-rpc 📄log
✔️ 654cd34 #16 2024-12-11 08:45:26 ~6 min ios 📦zip
✔️ 654cd34 #16 2024-12-11 08:45:42 ~6 min linux 📦zip
✔️ 654cd34 #14 2024-12-11 08:45:47 ~7 min macos 📦zip
✔️ 654cd34 #15 2024-12-11 09:08:05 ~29 min tests 📄log
✔️ cbaee58 #15 2024-12-12 04:51:16 ~4 min windows 📦zip
✔️ cbaee58 #14 2024-12-12 04:51:25 ~4 min macos 📦zip
✔️ cbaee58 #17 2024-12-12 04:51:32 ~4 min ios 📦zip
✔️ cbaee58 #17 2024-12-12 04:52:46 ~5 min linux 📦zip
✔️ cbaee58 #17 2024-12-12 04:52:56 ~5 min android 📦aar
✔️ cbaee58 #18 2024-12-12 04:53:30 ~6 min tests-rpc 📄log
✔️ cbaee58 #15 2024-12-12 04:56:51 ~9 min macos 📦zip
✖️ cbaee58 #16 2024-12-12 05:15:56 ~28 min tests 📄log
Commit #️⃣ Finished (UTC) Duration Platform Result
✔️ 60a8ed8 #16 2024-12-12 04:55:24 ~4 min windows 📦zip
✔️ 60a8ed8 #15 2024-12-12 04:55:58 ~4 min macos 📦zip
✔️ 60a8ed8 #18 2024-12-12 04:56:12 ~4 min ios 📦zip
✔️ 60a8ed8 #18 2024-12-12 04:58:19 ~5 min linux 📦zip
✔️ 60a8ed8 #18 2024-12-12 04:58:47 ~5 min android 📦aar
✔️ 60a8ed8 #19 2024-12-12 04:59:55 ~6 min tests-rpc 📄log
✔️ 60a8ed8 #16 2024-12-12 05:05:21 ~8 min macos 📦zip
✔️ 60a8ed8 #17 2024-12-12 05:47:50 ~31 min tests 📄log
✔️ 0073156 #17 2024-12-12 06:37:50 ~4 min windows 📦zip
✔️ 0073156 #16 2024-12-12 06:38:08 ~4 min macos 📦zip
✔️ 0073156 #19 2024-12-12 06:38:14 ~4 min ios 📦zip
✔️ 0073156 #19 2024-12-12 06:39:05 ~5 min linux 📦zip
✔️ 0073156 #19 2024-12-12 06:39:12 ~5 min android 📦aar
✔️ 0073156 #20 2024-12-12 06:40:05 ~6 min tests-rpc 📄log
✔️ 0073156 #17 2024-12-12 06:46:34 ~12 min macos 📦zip
✔️ 0073156 #18 2024-12-12 07:04:18 ~30 min tests 📄log


codecov bot commented Oct 25, 2024

Codecov Report

Attention: Patch coverage is 0% with 4 lines in your changes missing coverage. Please review.

Project coverage is 61.24%. Comparing base (ae2cada) to head (0073156).

Files with missing lines Patch % Lines
protocol/messenger_filter_init.go 0.00% 4 Missing ⚠️
Additional details and impacted files
@@                       Coverage Diff                       @@
##           feat/comm-content-topic-poc    #5993      +/-   ##
===============================================================
- Coverage                        61.40%   61.24%   -0.17%     
===============================================================
  Files                              833      833              
  Lines                           109934   109885      -49     
===============================================================
- Hits                             67506    67294     -212     
- Misses                           34558    34678     +120     
- Partials                          7870     7913      +43     
Flag Coverage Δ
functional 19.57% <0.00%> (-0.01%) ⬇️
unit 59.93% <0.00%> (-0.18%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
protocol/messenger_communities.go 52.99% <ø> (-0.60%) ⬇️
protocol/messenger_filter_init.go 56.34% <0.00%> (+1.12%) ⬆️

... and 47 files with indirect coverage changes

Contributor

@osmaczko osmaczko left a comment


@chaitanyaprem can you please rebase, the diff seems wrong, thanks 🙏

@chaitanyaprem chaitanyaprem force-pushed the feat/comm-content-topic-ph2 branch from cb59e06 to 54b1cd7 Compare November 1, 2024 08:25
@chaitanyaprem chaitanyaprem force-pushed the feat/comm-content-topic-poc branch 2 times, most recently from b5315d5 to 103b415 Compare November 1, 2024 11:43
@chaitanyaprem chaitanyaprem force-pushed the feat/comm-content-topic-poc branch 2 times, most recently from b4b929f to e8877b1 Compare December 4, 2024 10:14
@chaitanyaprem chaitanyaprem force-pushed the feat/comm-content-topic-ph2 branch 4 times, most recently from 6219ad7 to c0ccc28 Compare December 9, 2024 13:33
@chaitanyaprem chaitanyaprem marked this pull request as ready for review December 9, 2024 13:33
@chaitanyaprem chaitanyaprem force-pushed the feat/comm-content-topic-poc branch from eedb0da to e49e585 Compare December 10, 2024 07:09
@chaitanyaprem chaitanyaprem force-pushed the feat/comm-content-topic-ph2 branch from c0ccc28 to 00ea105 Compare December 10, 2024 07:23
@chaitanyaprem chaitanyaprem force-pushed the feat/comm-content-topic-ph2 branch from 00ea105 to eba057f Compare December 11, 2024 07:05
@chaitanyaprem chaitanyaprem force-pushed the feat/comm-content-topic-ph2 branch 5 times, most recently from b247503 to 654cd34 Compare December 11, 2024 08:38
@chaitanyaprem chaitanyaprem force-pushed the feat/comm-content-topic-ph2 branch from 654cd34 to cbaee58 Compare December 12, 2024 04:46
@chaitanyaprem chaitanyaprem force-pushed the feat/comm-content-topic-ph2 branch from cbaee58 to 60a8ed8 Compare December 12, 2024 04:47
@chaitanyaprem chaitanyaprem force-pushed the feat/comm-content-topic-poc branch from 342a0da to ae2cada Compare December 12, 2024 06:09
@chaitanyaprem chaitanyaprem force-pushed the feat/comm-content-topic-ph2 branch from 60a8ed8 to 0073156 Compare December 12, 2024 06:33
@chaitanyaprem
Contributor Author

I have done basic testing of this PR locally. This needs to be QA tested.

Not sure which release timeline this has to be included in.

Since this breaks compatibility with any code before #5864 (probably planned for 2.33), I will leave it up to the desktop and mobile teams to decide when to include this, as community chats will not interop with any code before #5864.

@iurimatias and @ilmotta

@ilmotta
Contributor

ilmotta commented Dec 16, 2024

Since this breaks compatibility with any code before #5864 (probably planned for 2.33), I will leave it up to the desktop and mobile teams to decide when to include this.

This is follow-up change of #5864 and should be released only once all users have migrated to a version using #5864 as this change breaks compatibility for communities and users on any version before #5864.

Perhaps more than when to include the changes from this PR (2.33, 2.34, etc), I'm thinking about how to include these changes in a way that minimizes impact to users of Status.

@fryorcraken: in the forum you asked about the deprecation policy for Status. I've never heard of such a policy. In a discussion not so long ago (can't remember if it was about single content topic), I remember @osmaczko suggested the Status app could deprecate XYZ based on % of usage, not on arbitrary release versions/steps. This gradual rollout strategy is what I tend to prefer if feasible so that users can plan when they upgrade, not us CCs. Up to a point of course because at a certain low threshold of usage the maintenance cost of multiple implementations becomes hard to justify.

@chaitanyaprem in this forum post you shared a potential solution where a message in the community could be broadcast, asking users to upgrade. Another person suggests: "We can then warn users who still publish to old content topics for several versions before the compatibility break is effected." Both ideas seem simple and could work well 👍🏼 @osmaczko @iurimatias what do you think should be the migration plan for introducing the changes from phase 2 in Status?

@fryorcraken

fryorcraken commented Dec 16, 2024

@ilmotta

Status app could deprecate XYZ based on % of usage, not on arbitrary release versions/steps.

How would you determine the current % usage?

How are we supposed to improve protocol behaviour, i.e., reliability and scalability, if we are unable to properly plan breaking change upgrades?

I've never heard of such a policy.

All major software have a way to deprecate old versions.

https://nodejs.org/en/about/previous-releases: unsupported and LTS
https://wiki.ubuntu.com/Releases: end of standard support and LTS

For centralized mobile apps, the app will often force you to upgrade to continue using it. We are not talking about something like that; we are talking about breaking compatibility between 2 specific versions. Users can keep the old versions, as long as their friends and community do.

@ilmotta
Contributor

ilmotta commented Dec 17, 2024

How would you determine the current % usage?

@fryorcraken The first step is to determine whether gradual phasing out or rollout is something we want in Status. I haven't thought about the how, and I'm probably not the best person to answer this question in detail either.

For mobile apps, most Android users are automatically tracked by Google, and we can view the number of devices using any specific Status version as a time series in the developer portal. Since Android users dominate most of Status's metrics, this data can serve as a strong indicator for deciding when to roll out breaking changes.

For Desktop users, it's more challenging (for good reasons) as there's no entity automatically tracking usage. For iOS, approximately 25% of users of Status allow Apple to collect usage data, such as the app version. Additionally, some users opt in to help Status with analytics, which provides further insights into the usage of different versions.

Another way to understand usage is by leveraging the Status proxy. The proxy is a centralized piece of software, like many others, and we could use it to track the app version to precisely identify which versions are in use. I say precisely because, unfortunately, the proxy is currently mandatory. Hopefully, in the not so distant future, we'll give users the option to disable the proxy and choose their own RPCs, similar to what Metamask allows.

Therefore, as we can see, there are ways at the Status app level to analyze and track usage by version over time with a decent degree of certainty. To clarify, by % of usage I mean usage of a specific app version, not feature usage. We can analyse usage of versions prior to releasing to users the changes from PR #5864.

All major software have a way to deprecate old versions.

I was referring to your question in the forum @fryorcraken can please clarify the deprecation policy for Status apps? If not existing, can this please defined? Of course I understand all major software has deprecation strategies defined :) In Status, I believe this policy doesn't exist, so I think there's no clear definition yet. Your question in the forum was left unanswered, so I tried to bring it back here in the context of this PR, where the breakage will be introduced.

The gradual rollout is my preferred option when available and when meeting business and development criteria. If we want to urgently introduce breakage, then it may not be the best solution.

How are we supposed to improve protocol behaviour, i.e., reliability and scalability, if we are unable to properly plan breaking change upgrades?

I'm not sure I understand this question. Gradual rollouts allow changes in the protocol as well, at the expense of (maybe) added complexity and time, but more respectful to users. There are tradeoffs, but it's not an impediment to improvement. In Status mobile we know for sure the majority auto-upgrade the app. This means we might not have to wait too much to more safely introduce breakage.

I would be fine introducing breakage in 2.33 if the majority wants to go in that direction.

Seems like an interesting topic for one of our weekly calls or sync calls between teams (and including the @status-im/status-go-guild).

@fryorcraken

fryorcraken commented Dec 17, 2024

Indeed, the question of how to handle breaking upgrades in the forum has remained unanswered. Thank you for re-opening the discussion in the context of this PR.

I'm not sure I understand this question. Gradual rollouts allow changes in the protocol as well, at the expense of (maybe) added complexity and time, but more respectful to users.

I agree with gradual roll outs, as described in the forum post you quoted.

What I am worried about is undefined timelines, because they not only create complexity and debt and make roadmap/release planning harder, but also make managing user expectations more difficult.

Having a guideline such as:
"Backward compatibility guarantees are waived 6 months after a given release".

Makes it clear that:

  • A user with a version older than 6 months may not be able to interact with users that have the latest app.
  • When improving a protocol, we have a 6-month transition period at most.

In the context of this change, moving communities to a single content topic has 3 main benefits:

  1. Increase store performance by reducing complexity and number of store queries
  2. Increase filter performance by reducing complexity and number of subscriptions
  3. Decrease technical debt, allowing more flexibility and options around community upgrades.

The first phase implements sending messages on a unique content topic, but we keep expecting messages on all content topics.

Which means none of those benefits are delivered at present. Only once we can break backward compatibility can the users benefit from this change.

Waiting for (let's say) at least 90% of users to upgrade to a version that has phase 1 before we go for phase 2 means:

  • I cannot plan phase 2 work, until we reach 90%.
  • We cannot provide a general guideline to users, and have to rely on more complex communication such as "2.40 is not compatible with 2.33 and before" for each release, instead of a clear "6 months old -> no guarantees of compatibility".
  • I cannot plan any change that builds on this until we reach the agreed user %.

Finally, as you mentioned, on Android you have an auto-upgrade path, which means that Android usage numbers are likely to be more optimistic than desktop numbers. So we might surprise some desktop users who do not pay attention, because a specific version number breaks compatibility.

A consistent message is more likely to keep users better informed.

@ilmotta
Contributor

ilmotta commented Dec 17, 2024

Thanks @fryorcraken, you cleared up some things for me.

Which means none of those benefits are delivered at present. Only once we can break backward compatibility can the users benefit from this change.

My prior understanding was that we could support both solutions and still benefit the network because the majority would likely upgrade within a reasonable timeframe, and therefore we would be able to wait for at least a few months and likely break fewer users. If we can only realize the benefits all at once then there's little benefit in breaking based on usage.

I cannot plan phase 2 work, until we reach 90%.

I agree planning becomes less predictable, but the benefit is supposedly for users, not us. Breaking only with hard deadlines or hard release numbers is unquestionably simpler and cheaper for us.

A gradual roll out does not mean deadlines can't be established. There are different ways to define the threshold to break. We can roll out gradually but establish a hard 6 month break period. Meaning, Status will break as fast as possible based on usage at X%, and definitely break in 6 months or any other period we think is reasonable for the breakage in question. This is more flexible than only the deadline/version model, albeit with added planning complexity as you say.

We cannot provide a general guideline to users, and have to rely on more complex communication such as "2.40 is not compatible with 2.33 and before" for each release, instead of a clear "6 months old -> no guarantees of compatibility".

Depends on the user's preference a bit. A user may be fine with a slightly more convoluted explanation, in exchange for more time to adjust. A more technical user may even like this strategy because it may imply the software breaks closer to the theoretical best. But of course I don't know what the majority of users of Status prefer. That's one reason I lean on the side of caution when I read suggestions to break at Status version X without evidence to support the choice except our own cost and product requirements (which to be fair are plenty important).

So we might surprise some desktop users who do not pay attention, because a specific version number breaks compatibility.

Indeed, this would always be a risk. Maybe too much of a risk if the breakage is introduced too fast based on Mobile metrics. To some extent, we can make good guesses about Desktop versions because some users do enable analytics and we have the Status proxy to cross-reference data and guess opt-in rate. Even if we make the proxy optional, many users would likely enable the proxy in order to have a good out of the box experience, especially for the wallet features.


Having said all that @fryorcraken, I agree with you on your recommendation to go for a simpler approach for phase 2. I would still like to hear other folks' opinions about when to best merge this PR into mainline cc @iurimatias. If that's 2.33 then we and Comms should probably start to communicate in January because 2.33 may be released still in February and we want users to have time to plan their upgrades. Might be better to go for 2.34 instead.

@osmaczko
Contributor

Indeed, the question of how to handle breaking upgrades in the forum has remained unanswered. Thank you for re-opening the discussion in the context of this PR.

I believe @Samyoul provided insight into how it has been done so far, but under a different post: https://forum.vac.dev/t/breaking-changes-and-roll-out-strategies/338/3.

How would you determine the current % usage?

In the context of this breaking change, I believe it is technically feasible, though not entirely straightforward. We could monitor all public channel communication propagated through waku and aggregate it by user (public key) and content topic (whether legacy or new). However, this would only provide an estimate, as it would not account for users communicating exclusively in private channels.

A gradual roll out does not mean deadlines can't be established. There are different ways to define the threshold to break. We can roll out gradually but establish a hard 6 month break period. Meaning, Status will break as fast as possible based on usage at X%, and definitely break in 6 months or any other period we think is reasonable for the breakage in question. This is more flexible than only the deadline/version model, albeit with added planning complexity as you say.

+1 I'll try to formalize it below.

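| Step        | App Version | Read                 | Write        |
|-------------|-------------|----------------------|--------------|
| Initial     | N           | old protocol         | old protocol |
| Preparation | N+1         | old and new protocol | old protocol |
| Switch      | N+x         | old and new protocol | new protocol |
| Clean-up    | N+y         | new protocol         | new protocol |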

where y>x.

Switch to N+x when:
time(N+x)-time(N) >= A months OR
usage of N+1 >= B% OR
(x-1) >= C (version gap threshold)

Switch to N+y when:
time(N+y)-time(N+1) >= A months OR
usage of N+x >= B% OR
(y-x) >= C (version gap threshold)

Proposed parameters: A=6 months, B=90%, C=2 version gap threshold.

I added another condition: the version gap threshold. I’m not sure if this is something we want. It would allow us to bypass the A-month period when usage statistics cannot be gathered reliably, while still providing users with C intermediary releases for a safe migration. However, it adds complexity to the process, so it might be better to rely solely on the first two conditions and consider lowering the value of A.

@fryorcraken

+1 I'll try to formalize it below.

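| Step        | App Version | Read                 | Write        |
|-------------|-------------|----------------------|--------------|
| Initial     | N           | old protocol         | old protocol |
| Preparation | N+1         | old and new protocol | old protocol |
| Switch      | N+x         | old and new protocol | new protocol |
| Clean-up    | N+y         | new protocol         | new protocol |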

where y>x.

Switch to N+x when: time(N+x)-time(N) >= A months OR usage of N+1 >= B% OR (x-1) >= C (version gap threshold)

Switch to N+y when: time(N+y)-time(N+1) >= A months OR usage of N+x >= B% OR (y-x) >= C (version gap threshold)

Proposed parameters: A=6 months, B=90%, C=2 version gap threshold.

I added another condition: the version gap threshold. I’m not sure if this is something we want. It would allow us to bypass the A-month period when usage statistics cannot be gathered reliably, while still providing users with C intermediary releases for a safe migration. However, it adds complexity to the process, so it might be better to rely solely on the first two conditions and consider lowering the value of A.

Thanks @osmaczko, looks reasonable.

I suggest for Status lead to align on this and provide an official guideline/response. The forum thread you mentioned would be a good spot.

cc @sunleos

@ilmotta
Contributor

ilmotta commented Dec 18, 2024

I suggest for Status lead to align on this and provide an official guideline/response. The forum thread you mentioned would be a good spot.

@fryorcraken @sunleos We have two definitions to eventually align on, one is about how Waku breaking changes should be introduced into Status, and the other is about how Status handles deprecation of anything. Sometimes the line may be blurry because general recommendations may encompass Waku breaking changes, but not necessarily the opposite.

In the context of this breaking change, I believe it is technically feasible, though not entirely straightforward. We could monitor all public channel communication propagated through waku and aggregate it by user (public key) and content topic (whether legacy or new). However, this would only provide an estimate, as it would not account for users communicating exclusively in private channels.

👍🏼 @osmaczko The combination of other application-level metrics alongside data from Waku will allow us to minimize the impact on users. The app-level metrics are readily available already.

Since we are going for a more precise definition with a formula, we may as well include in the usage percentage a margin for error, and the margin would depend on the data source. Sometimes, for instance, if the breaking change isn't related to Waku and only affects iOS users, and we know only 25% of the user base opted in App Store Connect, we would need to be more careful with the parameters.

@fryorcraken

Waku breaking changes

So we are clear, there are no breaking changes coming from Waku any time soon.
We are talking about breaking changes in the Status chat protocols: Status Communities and Status 1:1 chats.

@ilmotta
Contributor

ilmotta commented Dec 20, 2024

Waku breaking changes

So we are clear, there are no breaking changes coming from Waku any time soon. We are talking about breaking changes in the Status chat protocols: Status Communities and Status 1:1 chats.

Yes, we are clear @fryorcraken. The conversation was never about Waku protocols since we are talking about community-level changes, but from now on I'll try to be more specific and avoid generically saying "Waku". Re-reading now I see how that can be confusing to you and other Waku CCs.


6 participants