SMTPClient stuck (race condition?) #315
Comments
can you provide a few more details? specifically:
|
This is what our email manager object looks like:
Stack trace for the two errors:
Note that these errors happen several times as described above and our retry system works fine. However, on the last time it logged the Apparently the SMTP server is accessed through a VPN tunnel - there is no authentication so it relies on having a known static IP through the tunnel. It has been suggested when the flakey mobile connection goes down the |
thanks for the details. they help a lot 😄
our connections use node's built-in socket classes, so if that were the case i doubt we could do much. we're kind of at the mercy of the stack in that arena. at any rate, it's still hitting timeout, so the internal state should reset properly. it's very odd to me that it isn't, and suggests to me the internal state is getting corrupted. except...
...if i understand this correctly, it means that emails are still being sent even after a timeout, and it's only the final request that gets snagged (please correct me if i'm wrong). what's more, you have your queue's concurrency set to 1, so you shouldn't be having a race condition between multiple invocations of one test i would suggest is creating a new |
Correct, you'll see that if We dug around in the emailjs source for a while and it's clear that We have enabled the logging in emailjs now so we'll see if that sheds any more light on the situation. However, there doesn't appear to be any logging around the timeout and error handling paths - it would be really helpful if this could be added so we can see what's happening! Thanks. |
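The kind of logging asked for here could look roughly like the sketch below. It is not emailjs code: `logSocketLifecycle` and `SocketLike` are hypothetical names, and the helper only assumes an object with an `on` method, which a Node `net.Socket` satisfies. One thing worth seeing in such a log is whether a `timeout` event is ever followed by `close`, since Node's `timeout` event only reports idleness and does not close the socket by itself.

```typescript
// Hypothetical helper, not part of emailjs: log every socket lifecycle event
// that matters when diagnosing a hang.
type LogFn = (line: string) => void;

// Minimal shape we rely on; a Node net.Socket or tls.TLSSocket fits it.
interface SocketLike {
  on(event: string, listener: (...args: any[]) => void): unknown;
}

function logSocketLifecycle(
  socket: SocketLike,
  label: string,
  log: LogFn = console.log,
): void {
  const stamp = (): string => new Date().toISOString();

  socket.on("connect", () => log(`${stamp()} [${label}] connect`));
  // 'timeout' means the socket was idle; Node leaves it open until
  // someone calls end() or destroy().
  socket.on("timeout", () => log(`${stamp()} [${label}] timeout (idle)`));
  socket.on("end", () => log(`${stamp()} [${label}] end (peer sent FIN)`));
  socket.on("error", (err: Error) =>
    log(`${stamp()} [${label}] error: ${err.message}`),
  );
  socket.on("close", (hadError: boolean) =>
    log(`${stamp()} [${label}] close hadError=${hadError}`),
  );
}
```

If the last line in the log before a hang is `timeout` with no `close` after it, that would point at a cleanup path that never ran; if there is no event at all, the problem is more likely below the application.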
very odd behavior.
the
i agree; the instrumentation is very bare bones; i'll see what i can do. do you need this on the
thanks for the report! |
We haven't managed to migrate over to ESM yet so we'll need this on Thanks for the help :) |
i've managed to repro something that looks like the error you're describing using our test suite. to reproduce, checkout the the error is triggered specifically by the client's configuration:
const client = new SMTPClient({ port, tls: true });
const server = new SMTPServer({ secure: true });
can you confirm that your configuration isn't similar? |
That looks very similar to our configuration! We're only using |
cool, i think we have a target now 😃 i don't know what the fix would be, unfortunately, so it may take some time to figure out. in the meantime, using |
i've been digging through the code on a separate branch & i think this might be a configuration issue. specifically, i set up connection tests (see |
Thanks for looking into this. Unfortunately I think we may have drifted away from the real issue, which is that the promise sometimes never resolves/rejects - it just gets left hanging. This should never happen, no matter what the client or server configuration is. It should either be resolved successfully, rejected with an error, or rejected due to timeout. |
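Until the root cause is found, a caller can enforce that invariant from the outside. The following is a minimal sketch, not an emailjs feature: `withWatchdog` and the `WatchdogTimeoutError` name are hypothetical, and the wrapper only guarantees that the returned promise settles. It does not cancel the underlying send or clean up the client.

```typescript
// Hypothetical caller-side guard: the returned promise always settles
// within `ms`, even if `operation` is left pending forever.
function withWatchdog<T>(operation: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => {
      const err = new Error(`operation did not settle within ${ms} ms`);
      err.name = "WatchdogTimeoutError";
      reject(err);
    }, ms);

    operation.then(
      (value) => {
        clearTimeout(timer); // settled in time, so disarm the watchdog
        resolve(value);
      },
      (error) => {
        clearTimeout(timer);
        reject(error);
      },
    );
  });
}
```

A retry queue would wrap each send attempt in `withWatchdog` with a limit comfortably above the client's own timeout, and treat a `WatchdogTimeoutError` as a reason to discard the stuck client before retrying, since its internal state can no longer be trusted.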
correct, that's still an issue, but i think it may be a symptom of the configuration issue. i'm still looking into why the promise is stuck pending. |
i managed to create a simplified version of your queue code at https://github.com/eleith/emailjs/blob/%C3%B8/test/queue.ts to test for this exact scenario and... it works as it should, rejecting after five tries with a properly closed-out client. the timeouts even came from the client. i'm at a bit of a loss as to how to continue. |
The latest update is that this ran fine for just over a week, including handling periods of bad connections and errors as above, but eventually something triggered it again (during a period of poor connectivity). So it seems that it only happens when there are connectivity issues, and even then it's pretty intermittent - most of the time it handles it fine. With emailjs logging enabled these are the last messages we received from emailjs before it just went silent on us:
If we're ever going to get a test to reproduce this I think you're going to need some sort of test socket that randomly drops packets, hangs or disconnects. Then have a test that retries continuously forever and just leave it running until it hangs. I don't think you need any of our queuing system at all, it's not really relevant. |
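One way to approximate such a test socket is a small TCP proxy placed between the client and a real or test SMTP server, built on Node's `net` module. This is only a sketch under assumptions: `createFlakyProxy`, `pickFault`, and the three fault types are made up for illustration, and dropping a chunk at this level corrupts the byte stream rather than mimicking true packet loss, which TCP would normally retransmit.

```typescript
import * as net from "net";

type Fault = "forward" | "drop" | "stall" | "reset";

// Decide what happens to one chunk. `roll` is a number in [0, 1);
// the fault budget is split evenly between drop, stall and reset.
function pickFault(roll: number, faultRate: number): Fault {
  if (roll >= faultRate) return "forward";
  const third = faultRate / 3;
  if (roll < third) return "drop";
  if (roll < 2 * third) return "stall";
  return "reset";
}

// Hypothetical test helper: relay traffic to targetHost:targetPort while
// injecting faults. `random` is injectable so runs can be made repeatable.
function createFlakyProxy(
  targetPort: number,
  targetHost: string,
  faultRate: number,
  random: () => number = Math.random,
): net.Server {
  return net.createServer((clientSocket) => {
    const upstream = net.connect(targetPort, targetHost);
    let stalled = false;

    const relay = (from: net.Socket, to: net.Socket): void => {
      from.on("data", (chunk) => {
        if (stalled) return; // connection stays open but goes silent
        switch (pickFault(random(), faultRate)) {
          case "forward":
            to.write(chunk);
            break;
          case "drop":
            break; // this chunk is silently lost
          case "stall":
            stalled = true;
            break;
          case "reset":
            clientSocket.destroy();
            upstream.destroy();
            break;
        }
      });
      from.on("error", () => to.destroy());
      from.on("close", () => to.destroy());
    };

    relay(clientSocket, upstream);
    relay(upstream, clientSocket);
  });
}
```

The proxy would be started with `createFlakyProxy(serverPort, "127.0.0.1", 0.05).listen(proxyPort)`, the client pointed at `proxyPort`, and a loop left sending until a send neither resolves nor rejects. The `stall` case is probably the most interesting one here, because it leaves a connection that looks alive but never delivers another byte.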
the best kind of bug!
i'll have to do some research on how to set that up. given the randomness, though, i'm worried the issue is at the os level. i think looking into logging would be a good next step now to try and clear up what exactly is happening when you get your hang. |
We're connecting to an ISP mail server over a flakey mobile connection. We've implemented a retry system but it relies on emailjs resolving.
We recently got the following sequence of error responses:
On the next retry it seems that emailjs never resolved the promise so our retry queue is now blocked and no email is being sent. This was 4 days ago.
client.sending is true, client.ready is true, and client.smtp._state is 2 (CONNECTED). I have checked netstat and there is definitely no active connection.