Unexpected number of sessions regarding nrcpt and avg values #604
Comments
Hi ikedas. I'm not sure whether you need more information from us. If so, please tell me. Bests, Frédéric.
Sorry for the delayed response. Reading the code, I found that recipients are sorted by their email addresses, not by their domains. A packet is then filled with recipients one after another. When
the packet is saved in the spool and a new packet is prepared. That's why each packet does not always contain the number of recipients specified by
Hi Soji, Thank you for looking at the code! It appears to me that sorting email addresses by domain before filling a packet would help reduce the number of packets and Postfix tasks, and so improve storage efficiency. Do you see any issues with changing the sort order? Can you provide a quick hack, or point me to the line of code I should change to have email addresses sorted by domain? Bests,
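To illustrate the difference between the two sort orders discussed here, the following is a minimal Python sketch. It is not Sympa's code (Sympa is written in Perl); the helper names `domain_key` and `count_domain_runs` and the sample addresses are hypothetical. It only shows that sorting by full address can interleave domains, while sorting by domain first keeps each domain contiguous.

```python
from itertools import groupby

def domain_key(address):
    """Sort key: domain first, then local part (both case-folded)."""
    local, _, domain = address.rpartition("@")
    return (domain.lower(), local.lower())

def count_domain_runs(addresses):
    """Count consecutive runs of the same domain in an ordered list."""
    domains = (a.rpartition("@")[2].lower() for a in addresses)
    return sum(1 for _ in groupby(domains))

# Hypothetical sample addresses, for illustration only.
addresses = ["zoe@a.example", "adam@b.example", "bob@a.example", "carol@b.example"]

by_address = sorted(addresses)                 # order by the full address string
by_domain = sorted(addresses, key=domain_key)  # order by domain, then local part

print(count_domain_runs(by_address))  # domains alternate between neighbours
print(count_domain_runs(by_domain))   # one run per domain
```

With a domain-first order, a packet builder that walks the list sequentially sees each domain exactly once, which is the precondition for grouping recipients per destination.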
Hello, I'm not sure how the above reference has anything to do with this issue. @ikedas, is there a chance that we can make some progress on this? Please tell me what I should try. Not sorting by domain first seems wrong, as it leads to the creation of too many packets and too many emails for the same destination domain, which reduces deduplication efficiency and increases the disk space used by a factor of 20 to 30 for a few thousand recipients. Regards,
I also think it would be more logical to group by domain, as this would allow more optimizations and special-case treatment for specific domains.
This morning, a staff member sent a 2.48 MB email with a picture to sell... a trash can! This resulted in 45 packets, and even with dedup in place this led to 45 packets x 2.48 MB x 8 stores = 892.8 MB of data written! Can someone please help with this issue? It's about having Sympa sort by destination domain before creating packets.
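The arithmetic in the comment above can be checked with a short Python sketch. The function name `data_written_mb` is hypothetical, and the model is the simple one the comment uses: every packet is stored once in full on every store.

```python
def data_written_mb(packets, message_mb, stores):
    """Total data written when each packet is stored once on each store."""
    return packets * message_mb * stores

# Figures reported in the comment above.
observed = data_written_mb(packets=45, message_mb=2.48, stores=8)
print(round(observed, 1))
```

Under this model the data written grows linearly with the number of packets, so any reduction in packet count translates directly into less storage.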
Hi @FredNass, Could you please check this patch?
Hi @ikedas, Thanks a lot for providing this patch. We'll try it when my colleague is back on Monday and we'll get back to you.
Hi @ikedas,
Make storage into outgoing spool more efficient (#604)
Version
6.2.32
Installation method
From @sympa-ja.org repository
Expected behavior
We have a list of 6152 members from 44 different domains: 72 members are spread across 43 domains and 6080 belong to a single domain. We set nrcpt=1000 and avg=100 globally (not in nrcpt_by_domain.conf).
We would expect Sympa to create 50 sessions with recipients grouped by destination domain: 7 sessions for the 6080 recipients of the single large domain, and 43 sessions, one for each of the other domains.
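The expected count of 50 sessions can be reproduced with a small Python sketch. This is an illustration of the expectation described above, not Sympa's actual algorithm: the function `packets_by_domain` is hypothetical, the split of the 72 members across the 43 small domains is invented (only the totals come from this report), and the `avg` setting is deliberately ignored.

```python
from collections import defaultdict

def packets_by_domain(recipients, nrcpt):
    """Group addresses by domain, then split each group into packets of at most nrcpt."""
    by_domain = defaultdict(list)
    for address in recipients:
        domain = address.rsplit("@", 1)[1].lower()
        by_domain[domain].append(address)
    packets = []
    for domain in sorted(by_domain):
        group = sorted(by_domain[domain])
        for start in range(0, len(group), nrcpt):
            packets.append(group[start:start + nrcpt])
    return packets

# 6080 members in one domain (placeholder domain name).
recipients = [f"user{i}@big.example" for i in range(6080)]

# 72 members over 43 domains; this particular split is an assumption.
small_sizes = [2] * 29 + [1] * 14
for d, size in enumerate(small_sizes):
    recipients += [f"user{i}@small{d}.example" for i in range(size)]

packets = packets_by_domain(recipients, nrcpt=1000)
print(len(recipients), len(packets))
```

The large domain needs 7 packets (six of 1000 recipients and one of 80), and each of the 43 small domains fits in a single packet, which gives the 50 sessions stated above.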
Actual behavior
Sympa creates 44 sessions with a number of recipients between 100 and 280 and does not respect the global nrcpt=1000.
Additional information
Setting nrcpt in nrcpt_by_domain.conf has no effect.
The major consequence of this issue is that a single message is turned into many more messages than necessary, which reduces deduplication efficiency on the same mail store. Our mail stores receive and store 44 different messages with only 20 to 30 mailboxes referencing each of them, when they should have received and stored only 7 messages with 1000 mailboxes referencing each of them.
Please tell us how we can help troubleshoot this issue.
Regards,
Frédéric.