Multiple resetting connection after RPCError #2036

marcinh · 2022-03-02T14:49:17Z

Description

When master-worker communication fails (an RPCError is caught), the connection is reset (a new RPC server is created) repeatedly, once per heartbeat interval, until a new message arrives from a worker. So with only a single worker, once an RPCError is caught the connection is reset indefinitely.

Expected behavior

After an RPCError is caught, the connection should be reset once.

Actual behavior

The connection is reset every second (the heartbeat interval) until a new message arrives from a worker or a new worker appears.

Steps to reproduce

Set up a Locust master with a single worker, then trigger an RPCError communication error somehow.

Possible solution

Currently the reset_connection() method in MasterRunner is called in the heartbeat_worker thread when the connection_broken flag is set. Once this flag is set to True, the connection reset is repeated in a loop until the flag is set back to False (which is done in the client listener thread). So a possible solution is to move setting connection_broken to False into reset_connection() itself:

    def reset_connection(self):
        logger.info("Reset connection to worker")
        try:
            self.server.close()
            self.server = rpc.Server(self.master_bind_host, self.master_bind_port)
            self.connection_broken = False
        except RPCError as e:
            logger.error(f"Temporary failure when resetting connection: {e}, will retry later.")
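
To make the difference concrete, here is a minimal, self-contained sketch of the flag logic described above. It does not use Locust at all: FakeMaster, heartbeat_tick and simulate are hypothetical names invented for illustration, and the heartbeat thread is replaced by a plain loop. It only assumes the behavior reported in this issue, namely that the heartbeat check calls reset_connection() whenever connection_broken is True.

```python
class FakeMaster:
    """Hypothetical stand-in for MasterRunner; not Locust's real class."""

    def __init__(self, clear_flag_in_reset):
        self.connection_broken = False
        self.reset_count = 0
        # True models the proposed fix, False models the reported behavior.
        self.clear_flag_in_reset = clear_flag_in_reset

    def reset_connection(self):
        # The real method recreates the RPC server; here we only count calls.
        self.reset_count += 1
        if self.clear_flag_in_reset:
            self.connection_broken = False

    def heartbeat_tick(self):
        # One iteration of the heartbeat loop as described in the issue.
        if self.connection_broken:
            self.reset_connection()


def simulate(clear_flag_in_reset, ticks=5):
    """Break the connection once, then run `ticks` heartbeat intervals
    during which no worker message arrives to clear the flag."""
    master = FakeMaster(clear_flag_in_reset)
    master.connection_broken = True  # an RPCError was caught
    for _ in range(ticks):
        master.heartbeat_tick()
    return master.reset_count


print("resets without fix:", simulate(clear_flag_in_reset=False))
print("resets with fix:", simulate(clear_flag_in_reset=True))
```

Without the fix the reset count grows with every heartbeat tick. With the flag cleared inside reset_connection(), a single reset happens per caught error.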

Environment

OS: CentOS
Python version: 3.9
Locust version: 2.8.2


github-actions · 2022-05-02T02:17:27Z

This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 10 days.

github-actions · 2022-05-13T02:17:03Z

This issue was closed because it has been stalled for 10 days with no activity. This does not necessarily mean that the issue is bad, but it most likely means that nobody is willing to take the time to fix it. If you have found Locust useful, then consider contributing a fix yourself!

marcinh added the bug label Mar 2, 2022

github-actions bot added the stale label May 2, 2022

Nosibb mentioned this issue May 12, 2022

Fix multiple resetting connection after RPCError #2096

Merged

github-actions bot closed this as completed May 13, 2022

cyberw reopened this May 13, 2022

cyberw removed the stale label May 13, 2022

cyberw added the stale label Jul 7, 2022

cyberw closed this as completed Jul 7, 2022
