Client has no connection to Kafka anymore after idling for some time #1487

Closed
weeco opened this issue Sep 10, 2019 · 7 comments
Labels: stale (issues and pull requests without any recent activity)

Comments

@weeco
Contributor

weeco commented Sep 10, 2019

Versions

Please specify real version numbers or git SHAs, not just "Latest" since that changes fairly regularly.

| Sarama  | Kafka | Go   |
|---------|-------|------|
| v1.23.1 | v2.3  | 1.12 |
Configuration

What configuration values are you using for Sarama and Kafka?

// NewSaramaAdminConfig creates a new sarama config which can be used for the admin client
func NewSaramaAdminConfig(cfg *Config) (*sarama.Config, error) {
	sConfig := sarama.NewConfig()
	sConfig.ClientID = cfg.ClientID
	sConfig.Net.KeepAlive = time.Second * 45

	version, err := sarama.ParseKafkaVersion(cfg.KafkaVersion)
	if err != nil {
		return nil, err
	}
	sConfig.Version = version

	if cfg.SASLEnabled {
		sConfig.Net.SASL.Enable = true
		sConfig.Net.SASL.User = cfg.SASLUsername
		sConfig.Net.SASL.Password = cfg.SASLPassword
	}

	err = sConfig.Validate()
	if err != nil {
		return nil, err
	}

	return sConfig, nil
}
Logs
{"level":"info","ts":"2019-09-10T22:47:59.272+0200","msg":"Server listening on address","address":"[::]:9090","port":9090}
{"level":"error","ts":"2019-09-10T23:24:34.138+0200","msg":"Failed to list topics from kafka cluster","error":"write tcp 192.168.178.66:58796->redacted:19092: wsasend: An established connection was aborted by the software in your host machine."}
Problem Description

I have an admin client which is supposed to return a list of all Kafka topics when a REST endpoint is invoked. This works as desired. However, after the client has been idle (i.e. no requests were made with the admin client) for 30 minutes or so, I cannot get any topics anymore because of the error shown in the logs above.

I thought the .Net.KeepAlive config option (as shown in the Go code) would prevent this, but apparently it does not help. What is the problem?

@FrancoisPoinsot
Contributor

You can get debug logs by setting the library logger: https://github.com/Shopify/sarama/blob/master/tools/kafka-console-producer/kafka-console-producer.go#L45

Can you share the output?

@rikimaru0345

@FrancoisPoinsot

Hi, I'm working on the same project as weeco, so I've got a log for you 😄

[
   // ...

    {
        "level": "debug",
        "ts": "2019-10-04T15:57:27.316+0200",
        "msg": "Successful SASL handshake. Available mechanisms: [PLAIN]",
        "source": "sarama"
    },
    {
        "level": "debug",
        "ts": "2019-10-04T15:57:27.337+0200",
        "msg": "SASL authentication successful with broker XXXXXXXXX - [0 0 0 0]",
        "source": "sarama"
    },
    {
        "level": "debug",
        "ts": "2019-10-04T15:57:27.337+0200",
        "msg": "Connected to broker at XXXXXXXXX (registered as #1)",
        "source": "sarama"
    },
    {
        "level": "debug",
        "ts": "2019-10-04T16:04:26.781+0200",
        "msg": "client/metadata fetching metadata for all topics from broker XXXXXXXXX",
        "source": "sarama"
    },
    {
        "level": "debug",
        "ts": "2019-10-04T16:14:26.781+0200",
        "msg": "client/metadata fetching metadata for all topics from broker XXXXXXXXX",
        "source": "sarama"
    },
    {
        "level": "error",
        "ts": "2019-10-04T16:19:25.782+0200",
        "msg": "Sending REST error",
        "route": "XXXXXXXXX",
        "method": "GET",
        "error": "one of the brokers failed to return a list of consumer groups: EOF"
    },
    {
        "level": "warn",
        "ts": "2019-10-04T16:22:00.266+0200",
        "msg": "Keep alive has errored",
        "error": "write tcp XXXXXXXXX->XXXXXXXXX: wsasend: Eine bestehende Verbindung wurde softwaregesteuert\r\ndurch den Hostcomputer abgebrochen."
    },
    {
        "level": "debug",
        "ts": "2019-10-04T16:24:26.781+0200",
        "msg": "client/metadata fetching metadata for all topics from broker XXXXXXXXX",
        "source": "sarama"
    },
    {
        "level": "warn",
        "ts": "2019-10-04T16:24:30.472+0200",
        "msg": "Keep alive has errored",
        "error": "write tcp XXXXXXXXX->XXXXXXXXX: wsasend: Eine bestehende Verbindung wurde softwaregesteuert\r\ndurch den Hostcomputer abgebrochen."
    },
    {
        "level": "warn",
        "ts": "2019-10-04T16:25:30.522+0200",
        "msg": "Keep alive has errored",
        "error": "write tcp XXXXXXXXX->XXXXXXXXX: wsasend: Eine bestehende Verbindung wurde softwaregesteuert\r\ndurch den Hostcomputer abgebrochen."
    },

   // ...
]

Some things to note here:

The message "Eine bestehende Verbindung wurde softwaregesteuert..." (in English: "An established connection was aborted by the software in your host machine.") is the Windows description for WSAECONNABORTED (socket error code 10053).

The "keep alive" is referring to this code: https://github.com/kafka-owl/kafka-owl/blob/master/backend/cmd/api/main.go#L50-L59
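
Until the root cause is known, one possible workaround is to rebuild the client when a call fails and retry once. The sketch below shows that idea with stand-in types. `topicLister`, `reconnectingLister` and `fakeClient` are hypothetical names, not part of sarama or kafka-owl, and the retry fires on any error, whereas real code would probably want to check for connection errors specifically.

```go
package main

import (
	"errors"
	"fmt"
)

// topicLister is a hypothetical stand-in for the part of an admin
// client that the REST handler needs.
type topicLister interface {
	ListTopics() ([]string, error)
	Close() error
}

// reconnectingLister creates its client lazily and rebuilds it once
// when a call fails, e.g. because the idle connection was dropped.
type reconnectingLister struct {
	dial   func() (topicLister, error)
	client topicLister
}

func (r *reconnectingLister) ListTopics() ([]string, error) {
	if r.client == nil {
		c, err := r.dial()
		if err != nil {
			return nil, err
		}
		r.client = c
	}

	topics, err := r.client.ListTopics()
	if err == nil {
		return topics, nil
	}

	// The first attempt failed: drop the old client and retry exactly once
	// on a fresh one.
	_ = r.client.Close()
	r.client = nil
	c, dialErr := r.dial()
	if dialErr != nil {
		return nil, fmt.Errorf("reconnect after %v failed: %w", err, dialErr)
	}
	r.client = c
	return r.client.ListTopics()
}

// fakeClient simulates a client whose connection may already be dead.
type fakeClient struct{ broken bool }

func (f *fakeClient) ListTopics() ([]string, error) {
	if f.broken {
		return nil, errors.New("EOF")
	}
	return []string{"orders", "payments"}, nil
}

func (f *fakeClient) Close() error { return nil }

func main() {
	dials := 0
	r := &reconnectingLister{dial: func() (topicLister, error) {
		dials++
		// Only the first client behaves as if its connection went stale.
		return &fakeClient{broken: dials == 1}, nil
	}}
	topics, err := r.ListTopics()
	fmt.Println(topics, err, "dials:", dials)
}
```

This hides the symptom rather than explaining why the connection goes away, so it is a stopgap at best.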

@weeco changed the title from "Admin Client has no connection to Kafka anymore after idling for some time" to "Client has no connection to Kafka anymore after idling for some time" on Oct 5, 2019
@weeco
Contributor Author

weeco commented Oct 5, 2019

I have updated the title because this problem is not exclusive to the admin client. The same happens if you use the sarama client directly rather than the AdminClient.

@rikimaru0345

rikimaru0345 commented Oct 16, 2019

It seems the problem lies with the Dialer from Go's net package. Sarama uses it here:
https://github.com/Shopify/sarama/blob/d1948414ad5ca980ff1506c4ca7cb99de2bdeaff/broker.go#L156

Corresponding issue:
golang/go#31490

@ghost

ghost commented Feb 21, 2020

Thank you for taking the time to raise this issue. However, it has not had any activity on it in the past 90 days and will be closed in 30 days if no updates occur.
Please check if the master branch has already resolved the issue since it was raised. If you believe the issue is still valid and you would like input from the maintainers then please comment to ask for it to be reviewed.

@ghost added the stale label on Feb 21, 2020
@weeco
Contributor Author

weeco commented Feb 21, 2020

Still valid

@ghost removed the stale label on Feb 21, 2020
@ghost

ghost commented May 21, 2020

Thank you for taking the time to raise this issue. However, it has not had any activity on it in the past 90 days and will be closed in 30 days if no updates occur.
Please check if the master branch has already resolved the issue since it was raised. If you believe the issue is still valid and you would like input from the maintainers then please comment to ask for it to be reviewed.

@ghost added the stale label on May 21, 2020
@ghost closed this as completed on Mar 17, 2021

3 participants