[FIX] Corrected timefield comparison for message retention policy cron #11725

TheReal1604 · 2018-08-09T13:32:14Z

@vynmera I think it would be better to compare the room "ts" timefield not the _updatedAt field for various reasons:

If a users change the name or description of the channel the _updatedAt field is touched and the messages in this channel gets not cleared

I found this issue after enabling the retention policy for our system with 365 days retention time - but it doesnt clean any messages. If I take the manual wizard it is cleaning messages.

The ground zero for this is, we migrated our instance to LDAP and renamed ALL user accounts - this results in that the _updatedAt field of all channels got updated - unfortunately this timefield is now not older than the 365 days, but messages in it are older than that.

vynmera · 2018-08-09T15:02:22Z

Hmmm, I'm not so sure about this. Basically, the current logic skips all channels that have not been touched since the last prune, to save processing power. Changing the checked field to ts will make the system prune those channels exactly once, and then never prune them again until recreated.

Perhaps the lastUpdate check should be a configurable option that can be enabled or disabled, for those that do not worry about processing power and want a more precise prune. What do you think, @ggazzo ?
I suppose this system causes channels that aren't talked in to be delayed in their prune, which some might not want.

vynmera · 2018-08-13T01:40:36Z

Rereading everything, I understand your misconception more.

You are operating on the grounds that the algorithm skips channels which are younger than the prune timer. Which is logical, in fact - we could totally have added that by filtering to channels where ts is less than, or equal to newest. Had the error been made to change this to _lastUpdate, that would have been erroneous.

However, this algorithm instead compares a timestamp of the last prune (lastPrune) with the last time the channel was touched. If the channel is active, this is no issue. But if the channel hasn't been touched for 30 days, this will cause messages in the channel to possibly be 30 days too old. Once a new message is posted, the room is edited or the server is restarted, those over-date messages will be pruned as normal. This is undocumented behaviour that some people won't want, while others, who are merely using this feature to save disk space, will be OK with it.

I feel the best option here is a cleanly documented Higher Precision switch in Admin settings, to disable this check altogether and as such prune more precisely, costing a bit more processing power in return. I will be moving forward to create a PR to address this in the coming week.

TheReal1604 · 2018-08-13T06:17:03Z

Thanks for your explanation @vynmera I was not aware that changing the field to ts leads to cleaning once - but with your explanation and viewing again on the code this makes absolutely sense. For me the Higher Precision switch would be ok - but maybe there is something more elegant and not an additional layer of complexity. 😁

However, this algorithm instead compares a timestamp of the last prune (lastPrune) with the last time the channel was touched. If the channel is active, this is no issue. But if the channel hasn't been touched for 30 days, this will cause messages in the channel to possibly be 30 days too old. Once a new message is posted, the room is edited or the server is restarted, those over-date messages will be pruned as normal. This is undocumented behaviour that some people won't want, while others, who are merely using this feature to save disk space, will be OK with it.

But are you sure the updatedAt field is touched when the server restarts? We definitely have channels which doesnt get this update at a server restart.

Just curios - I tested it now - the channels get cleaned fine and the old messages are not visible anymore. But my server statistics reports the same amount of direct / private group / channel messages as before the cleanup. I take a look in the code, but maybe you can explain what happens here in detail? Maybe the messages get flagged and not deleted directly?

vynmera · 2018-08-13T09:06:41Z

@TheReal1604 The reason the prune goes through when the server restarts is that after a restart, the lastPrune field gets set back to 0 (as it's only saved in memory, not stored). Thus the first run will try all channels.

As for the statistics, I believe those are mainly just tallying counters. They're one-way, I believe. If you want an exact count of the amount of messages, connect to the mongo and run db.rocketchat_message.count({});. I'll look into this more when I get the chance though, to see if normal deletions decrease the counter. I'm not sure if they do.

changed timefield for prune comparison from _updatedAt to ts

4485e19

vynmera mentioned this pull request Aug 12, 2018

[NEW] Message retention policy and pruning #11236

Merged

42 tasks

TheReal1604 closed this Aug 13, 2018

vynmera mentioned this pull request Sep 26, 2018

Retention policy not working without any new activity #12093

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[FIX] Corrected timefield comparison for message retention policy cron #11725

[FIX] Corrected timefield comparison for message retention policy cron #11725

TheReal1604 commented Aug 9, 2018

vynmera commented Aug 9, 2018

vynmera commented Aug 13, 2018

TheReal1604 commented Aug 13, 2018

vynmera commented Aug 13, 2018

[FIX] Corrected timefield comparison for message retention policy cron #11725

[FIX] Corrected timefield comparison for message retention policy cron #11725

Conversation

TheReal1604 commented Aug 9, 2018

vynmera commented Aug 9, 2018

vynmera commented Aug 13, 2018

TheReal1604 commented Aug 13, 2018

vynmera commented Aug 13, 2018