
kafka.common.KafkaTimeoutError: ('Failed to update metadata after %s secs.', 60.0) #607

Closed
archiechen opened this issue Mar 18, 2016 · 52 comments


@archiechen

Kafka version: 0.8.2.0-1.kafka1.3.2.p0.15 (Cloudera release)

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/archimonde/lib/python2.6/site-packages/kafka/producer/kafka.py", line 357, in send
    self._wait_on_metadata(topic, self.config['max_block_ms'] / 1000.0)
  File "/opt/archimonde/lib/python2.6/site-packages/kafka/producer/kafka.py", line 465, in _wait_on_metadata
    "Failed to update metadata after %s secs.", max_wait)
kafka.common.KafkaTimeoutError: ('Failed to update metadata after %s secs.', 60.0)

But it's ok on 2.0.0-1.kafka2.0.0.p0.12.

@archiechen
Author

Resolved: you must set all brokers in bootstrap_servers.
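
For reference, a minimal sketch of what that looks like (the broker hostnames here are placeholders):

from kafka import KafkaProducer

# Sketch only: list every broker so the client can still bootstrap
# when one of them is down or unreachable.
producer = KafkaProducer(
    bootstrap_servers=['broker1:9092', 'broker2:9092', 'broker3:9092'],
)
producer.send('my-topic', b'hello')
producer.flush()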

@khangnguyen

@archiechen I am getting the same error too. What did you do?

@jasonrhaas

So what is the fix here?

@nicholasserra

nicholasserra commented Jan 25, 2017

Also wondering about the fix. I see this error when running Kafka via Docker but not when installing from a binary.

EDIT: So I'm wondering what the root cause may be, to help me pinpoint differences between my environments.

@lukewendling

Anyone get an answer to this?

@softwarevamp

softwarevamp commented Sep 1, 2017

Docker by default binds to IPv6 only, and it's easy to miss configuring its listeners accordingly.
Related to moby/moby#2174.

@jeffwidman
Collaborator

jeffwidman commented Dec 2, 2017

I am re-opening: I also saw the same issue in production with kafka-python 1.3.5 and Kafka broker 0.10.0.1. The producer was a VM, the broker was bare metal--no Docker here.

The error showed up after rolling the kafka cluster, and the KafkaProducer instance never recovered, just continually threw this error once a minute. The brokers were rolled by stopping/restarting the broker processes--the underlying machines did not change and DNS was not affected.

I watched the TCP stream and the producer never even tried to talk to Kafka, so it wasn't timing out cluster-side.

When I restarted the service, it immediately started working.

@geoff-va

I too saw this exact issue with kafka-python 1.3.5 and kafka broker 1.0.0. I tried restarting one of the brokers, and now I continually get the following message:

kafka.errors.KafkaTimeoutError: KafkaTimeoutError: Failed to update metadata after 60.0 secs.

My producer is running inside a Docker container, but both Kafka brokers are running on Linux via supervisor. Both of my brokers are included in bootstrap_servers.

@griff122

Another cause of this issue is using a bytes object as the topic name instead of a string. In Python 3, a string can come in as a bytes object (a = b'test_string'). If this happens, you can just decode the topic name to a UTF-8 string, and it might start working. It worked for me.

from kafka import KafkaProducer

# The topic name must be a str; decode it if it arrived as bytes.
if isinstance(kafka_topic, bytes):
    kafka_topic = kafka_topic.decode('utf-8')

kafka_producer = KafkaProducer(bootstrap_servers=kafka_brokers)
kafka_producer.send(kafka_topic, payload)
kafka_producer.flush()

@brennerm

Steps to reproduce the issue:

  1. Clone the kafka-docker repo: git clone https://github.com/wurstmeister/kafka-docker
  2. Clone the kafka-python repo: git clone https://github.com/dpkp/kafka-python
  3. Start Kafka: cd kafka-docker; docker-compose up -d
  4. Find the Kafka broker port: docker ps
  5. Set the correct broker port in example.py: cd kafka-python; vi example.py (lines 17 and 35)
  6. Run the example: python example.py

@dustinfarris

Here are the logs right before the timeout:

DEBUG 2018-02-24 05:38:58,420 kafka 46849 Requesting metadata update for topic ('example',)
DEBUG 2018-02-24 05:38:58,520 client_async 46849 Sending metadata request MetadataRequest_v1(topics=[('example',)]) to node 1011
DEBUG 2018-02-24 05:38:58,520 parser 46849 Sending request MetadataRequest_v1(topics=[('example',)])
DEBUG 2018-02-24 05:38:58,520 conn 46849 <BrokerConnection node_id=1011 host=localhost/::1 port=9092> Request 562: MetadataRequest_v1(topics=[('example',)])
DEBUG 2018-02-24 05:38:58,523 parser 46849 Received correlation id: 562
DEBUG 2018-02-24 05:38:58,523 parser 46849 Processing response MetadataResponse_v1
DEBUG 2018-02-24 05:38:58,523 conn 46849 <BrokerConnection node_id=1011 host=localhost/::1 port=9092> Response 562 (3.287076950073242 ms): MetadataResponse_v1(brokers=[(node_id=1011, host='localhost', port=9092, rack=None)], controller_id=1011, topics=[(error_code=17, topic="('example',)", is_internal=False, partitions=[])])
DEBUG 2018-02-24 05:38:58,523 kafka 46849 _wait_on_metadata woke after 59.96455192565918 secs.
DEBUG 2018-02-24 05:38:58,524 kafka 46849 Requesting metadata update for topic ('example',)
Traceback (most recent call last):
  File "/Users/dustin/.virtualenvs/oda-data-pipe/bin/oda", line 11, in <module>
    load_entry_point('oda-data-pipe', 'console_scripts', 'oda')()
  File "/Users/dustin/Work/cu/oda-data-pipe/oda_data_pipe/main.py", line 150, in main
    args.func(args)
  File "/Users/dustin/Work/cu/oda-data-pipe/oda_data_pipe/main.py", line 28, in produce
    job_module.extract()
  File "/Users/dustin/Work/cu/oda-data-pipe/oda_data_pipe/oda_producer.py", line 53, in wrapper
    kafka_producer.send(topic_name, item)
  File "/Users/dustin/.virtualenvs/oda-data-pipe/lib/python3.6/site-packages/kafka/producer/kafka.py", line 546, in send
    self._wait_on_metadata(topic, self.config['max_block_ms'] / 1000.0)
  File "/Users/dustin/.virtualenvs/oda-data-pipe/lib/python3.6/site-packages/kafka/producer/kafka.py", line 664, in _wait_on_metadata
    "Failed to update metadata after %.1f secs." % max_wait)
kafka.errors.KafkaTimeoutError: KafkaTimeoutError: Failed to update metadata after 60.0 secs.
INFO 2018-02-24 05:38:58,561 kafka 46849 Closing the Kafka producer with 0 secs timeout.
INFO 2018-02-24 05:38:58,561 kafka 46849 Proceeding to force close the producer since pending requests could not be completed within timeout 0.
DEBUG 2018-02-24 05:38:58,562 sender 46849 Beginning shutdown of Kafka producer I/O thread, sending remaining records.
INFO 2018-02-24 05:38:58,562 conn 46849 <BrokerConnection node_id=1011 host=localhost/::1 port=9092>: Closing connection.
DEBUG 2018-02-24 05:38:58,563 conn 46849 <BrokerConnection node_id=1011 host=localhost/::1 port=9092>: reconnect backoff 0.04849831291093153 after 1 failures
DEBUG 2018-02-24 05:38:58,563 sender 46849 Shutdown of Kafka producer I/O thread has completed.
DEBUG 2018-02-24 05:38:58,563 kafka 46849 The Kafka producer has closed.
INFO 2018-02-24 05:38:58,572 kafka 46849 Kafka producer closed

@dustinfarris

I've also noticed that using kafka-python in the interpreter works just fine, with the exact same bootstrap_servers and everything else. The bug only surfaces when executed as part of a Python program.

@dpkp dpkp self-assigned this Mar 9, 2018
@dpkp
Owner

dpkp commented Mar 9, 2018

If this happens before any message is able to be sent then it typically indicates a low-level connection error. Take a look at the kafka.conn.BrokerConnection debug logs. I've made several changes to the network connection code on master to handle various edge cases that you may be hitting. If you are able to test with master, can you check whether this issue is fixed now?

@benkibejs

benkibejs commented Mar 19, 2018

I am having the same issue with 1.4.2 and 1.4.3-dev0. I don't understand where I can find the debug logs you are talking about. Would they be somewhere in the Docker volume?

Traceback (most recent call last):
File "producer.py", line 41, in <module>
    producer.send('answer_topic', value=message)
File "/usr/local/lib/python3.5/dist-packages/kafka/producer/kafka.py", line 543, in send
    self._wait_on_metadata(topic, self.config['max_block_ms'] / 1000.0)  
File "/usr/local/lib/python3.5/dist-packages/kafka/producer/kafka.py", line 664, in _wait_on_metadata
    "Failed to update metadata after %.1f secs." % max_wait)
kafka.errors.KafkaTimeoutError: KafkaTimeoutError: Failed to update metadata after 60.0 secs.

@dpkp
Owner

dpkp commented Mar 23, 2018

To enable Python logging in its simplest form:

import logging
logging.basicConfig(level=logging.DEBUG)
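
Building on that, a sketch that keeps the root logger quieter while surfacing just the connection-level messages mentioned earlier (this assumes kafka-python's module-based logger names):

import logging

logging.basicConfig(level=logging.INFO)
# kafka-python names its loggers after its modules, so the
# BrokerConnection debug output lives under 'kafka.conn'.
logging.getLogger('kafka.conn').setLevel(logging.DEBUG)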

@dpkp dpkp removed their assignment Mar 23, 2018
@tkaymak

tkaymak commented Apr 16, 2018

I am having the same issue while using minikube and https://github.com/Yolean/kubernetes-kafka (Kubernetes version 1.9.0, Kafka 1.0)

@GagandeepS

What's the fix for this issue, guys?

@madss

madss commented Jul 5, 2018

If you are running the Kafka brokers inside a container, make sure they advertise hostnames that are accessible to the clients. If not specified, Kafka uses the canonical hostname of the container, which may be an internal name that cannot be resolved outside the container.

You can set the advertised hosts with advertised.listeners in the server properties, or, if you just want to test with an unmodified Kafka Docker image, you can override the default setting with:

bin/kafka-server-start.sh config/server.properties --override advertised.listeners=PLAINTEXT://<accessible-hostname>:9092

@mikekeda

I had the same problem.
To solve it, I increased request_timeout_ms (I had been using 10 ms before, which is far too low).
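
For anyone trying the same thing, a sketch of where those knobs live on the producer (the values shown are just the library defaults):

from kafka import KafkaProducer

producer = KafkaProducer(
    bootstrap_servers='localhost:9092',
    request_timeout_ms=30000,  # default is 30 s; 10 ms is far too low
    max_block_ms=60000,        # how long send() blocks waiting on metadata
)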

@egodigitus

Had the same issue.
I was producing against a Kafka cluster that was not available; changing the cluster address to an available one fixed the problem for me.

@worms

worms commented Sep 18, 2018

You will also see this if you disallow topic creation on publish on the broker (auto.create.topics.enable=false) and then try to produce to a topic that hasn't been created yet.
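
In that case, creating the topic up front avoids the timeout. A minimal sketch using kafka-python's admin client (the topic name and sizing here are illustrative):

from kafka.admin import KafkaAdminClient, NewTopic

admin = KafkaAdminClient(bootstrap_servers='localhost:9092')
# Pre-create the topic, since the broker won't auto-create it.
admin.create_topics([NewTopic(name='my-topic', num_partitions=1, replication_factor=1)])
admin.close()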

@chunzhenzyd

Hi,

How do I fix this problem?

@bolianlai

I encountered this problem when connecting to Kafka deployed with Docker. How can I solve it?

@tvoinarovskyi
Collaborator

tvoinarovskyi commented Feb 22, 2019

@jeffwidman
Given how Kubernetes uses DNS, maybe the root cause here is that we either cache DNS records or that the Kubernetes setup broadcasts IP addresses rather than DNS names in advertised.listeners.
Most likely, after rolling a new instance the client just continued to use the old IP. We need to make sure the DNS is refreshed if a node times out.

@tvoinarovskyi
Collaborator

For others in the thread: this error means that the client could not connect to a node, most likely because it is not reachable. Please make sure the advertised.listeners config is valid. For Kubernetes users, please make sure you advertise the same DNS names that you bootstrap from.

@jeffwidman
Collaborator

Right, but I saw it in a non-Kubernetes environment.

I got this message when trying to publish to a topic that doesn't exist.

I haven't examined the actual code path, but it would be nice if we threw a more obvious error for that scenario.

@christian-quisbert

Hi, I have the same problem with Kafka inside Docker.
Here is the exception log:

Service    | Exception in thread Thread-2:
Service    | Traceback (most recent call last):
Service    |   File "/usr/local/lib/python2.7/threading.py", line 801, in __bootstrap_inner
Service    |     self.run()
Service    |   File "/usr/local/lib/python2.7/threading.py", line 754, in run
Service    |     self.__target(*self.__args, **self.__kwargs)
Service    |   File "<file>.py", line XX, in worker
Service    |     self.producer.send(self.kafka_topic, item['data'])
Service    |   File "/usr/local/lib/python2.7/site-packages/kafka/producer/kafka.py", line 543, in send
Service    |     self._wait_on_metadata(topic, self.config['max_block_ms'] / 1000.0)
Service    |   File "/usr/local/lib/python2.7/site-packages/kafka/producer/kafka.py", line 664, in _wait_on_metadata
Service    |     "Failed to update metadata after %.1f secs." % max_wait)
Service    | KafkaTimeoutError: KafkaTimeoutError: Failed to update metadata after 60.0 secs.

Versions: "Python" 2.7 | "kafka-python" 1.4.3 | and "spotify/kafka" Docker Image.
I appreciate you help.

@kerol

kerol commented Mar 13, 2019

I solved this problem after starting ZooKeeper on my macOS:

brew services start kafka
brew services start zookeeper

By the way, you can enable debug logging for more information:

import logging
logging.basicConfig(level=logging.DEBUG)

The producer was complaining like this:
DEBUG:kafka.client:Give up sending metadata request since no node is available

@jeffwidman
Collaborator

The 1.4.5 release includes some commits that may fix some unconfirmed edge cases related to this. So if you're following this ticket, please retry with the latest release and let us know if you're still seeing this.

@ghost

ghost commented Apr 8, 2019

Has anyone found a concrete fix for this issue?
I am using the latest version, 1.4.5.
The issue is that when I try to send a JSON message from within a Django-based web application via the producer, I get an error stating KafkaTimeoutError: Failed to update metadata after 60.0 secs.
However, if I run the same code as a standalone Python script, it runs fine.
The Kafka broker and the Django server are running on the same local machine without any Docker.

kafka-error.zip

@jeffwidman
Collaborator

1.4.6 was released earlier this week--there is a chance the fix for #1760 may also help with this, but I'm not sure.

@parasjain can you try that and see if it helps?

@GeorgeXiaojie

I also encountered this problem in version 1.4.5, because my topic name contained a comma (,).

@stevanmilic

I solved the problem by setting KAFKA_ADVERTISED_HOST_NAME: 127.0.0.1 in docker-compose.yml for the wurstmeister/kafka image.

@0mars

0mars commented Jun 19, 2019

Update: I found that using the container name everywhere instead of 192.168.99.100 works pretty well.

@jdolitsky

Ran into this issue in a Kubernetes environment (incubator/kafka Helm chart). The issue was that the number of replicas was less than the value of default.replication.factor.

@apigban

apigban commented Aug 20, 2019

I got this message when trying to publish to a topic that doesn't exist.

THANK YOU!

@mrzhangboss

I hit this problem too. My environment:
Docker Kafka image: wurstmeister/kafka:2.12-2.2.1
I fixed it by changing the image version to wurstmeister/kafka:2.12-2.1.1 and adding the environment variable
KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka1:9092 (change kafka1 to your hostname)

@mujina93

mujina93 commented Nov 13, 2019

I had this issue when porting the code from python 2 to 3.

Solved by changing

kafkaValue = json.dumps(someDict, separators=(',', ':'))
producer.send(
    topic=bytes(topicNameAsString), 
    key=bytes(someString), 
    value=kafkaValue)

to

kafkaValue = json.dumps(someDict, separators=(',', ':'))
producer.send(
    topic=topicNameAsString, # <-- just string
    key=bytes(someString, encoding='utf8'), # <-- encoding='utf8'
    value=bytes(kafkaValue, encoding='utf8')) # <-- bytes
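
Alternatively, a sketch that pushes the encoding into the producer config via kafka-python's serializer hooks, so call sites keep passing plain strings and dicts (names reused from the snippet above; the bootstrap address is a placeholder):

import json
from kafka import KafkaProducer

producer = KafkaProducer(
    bootstrap_servers='localhost:9092',  # placeholder address
    key_serializer=str.encode,           # str -> UTF-8 bytes
    value_serializer=lambda v: json.dumps(v, separators=(',', ':')).encode('utf8'),
)
producer.send(topic=topicNameAsString, key=someString, value=someDict)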

@avloss

avloss commented Dec 21, 2019

Can't believe this is still open: either the topic is automatically created in both cases, or in neither. This is puzzling!

@dpkp
Owner

dpkp commented Dec 29, 2019

I don't think this issue is terribly useful to keep open. There can be several causes for a timeout when fetching metadata. It could be networking, topic creation problems, topic naming problems, etc -- note all of the different "me too" explanations so far. I'm going to close this because it does not appear to be a specific bug in kafka-python that could be fixed with a PR.

@davidzhx

I ran into the same error message, but the root cause in my case was an outdated, invalid intermediate certificate in the CA bundle; it was fixed once I updated the intermediate cert.

@okazelnik

I'm still facing the problem in Kubernetes with kafka-python 1.4.7.
Has anyone found a solution for it?

@etos

etos commented Jul 8, 2020

If it's useful for anyone: I started seeing this error after auto.create.topics.enable was set to false. Creating the topic manually resolved the issue.

@kennyhe

kennyhe commented Jul 19, 2020

I used docker-compose, in which the Kafka advertised listeners are:
KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://broker:29092,PLAINTEXT_HOST://localhost:9092

Then I changed the port in the Python code to 29092, which solved the problem.
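
In other words, the bootstrap address has to match where the client runs relative to the compose network. A sketch based on the listeners above:

from kafka import KafkaProducer

# Inside the compose network, use the internal listener...
producer = KafkaProducer(bootstrap_servers='broker:29092')
# ...while a client on the host machine would use 'localhost:9092' instead.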

@iamanvesh

I was publishing to a None topic. Fixed after changing the topic name.

@nagi49000

nagi49000 commented May 1, 2022

Very confusing how everyone has had this problem for years with no fixes, even after moving to 2.0.2 (which I've been testing on).
Lots of the issues described above involve running in Docker, and I can confirm that there is an issue with kafka-python running in Docker and creating a topic.
Running some simple code that uses a kafka-python KafkaProducer to send messages to a Kafka topic fails when I run that code in a Docker image with Kafka on the host. I run the same Python code on the host and, hey presto, the topic is created and the messages are sent. I then re-run the same code in the Docker container (as in the first step, except that now the topic exists) and, hey presto, messages get sent to Kafka.

@wizeankitbhardwaj

I've also noticed that using kafka-python in the interpreter works just fine, with the exact same bootstrap_servers and everything else. The bug only surfaces when executed as part of a Python program.

Not for me. It's the same in cmd as well.

@joaoFTH7

I use Docker in an Ubuntu 20.04 LTS VM with a bridged network interface, and my host is Windows 10. When I tried to consume or produce topics from Windows, I always received timeout errors. To resolve kafka.common.KafkaTimeoutError: ('Failed to update metadata after %s secs.', 60.0), in docker-compose I set KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://192.168.3.7:9092,PLAINTEXT_INTERNAL://broker:29092. In your case, just change 192.168.3.7 to your VM's IP address; broker is the container name and is only used to communicate with other containers.

@Lucas12j

Lucas12j commented Sep 14, 2022

I had the same problem, using Kafka in a Docker container and kafka-python in a conda environment.
I resolved it by using a topic name that already existed and adding the parameter key_serializer=str.encode when creating the producer. I followed the documentation at https://kafka-python.readthedocs.io/en/master/ (see "Serialize string keys").

UPDATE: The command executed, but the message was not sent. I tested directly in the Docker container, and it worked.
It is clear that the problem is an incompatibility in the communication between the Kafka Docker container and the Docker host, not necessarily a problem with kafka-python.

@siavashoh

siavashoh commented Feb 17, 2023

I was facing the same issue. I was using docker-compose, and I realized the problem was with the Kafka listeners. So I updated the Kafka environment to this (I'm not sure which line fixed it, but it's working):

  • KAFKA_CFG_ZOOKEEPER_CONNECT=zookeeper:2181
  • KAFKA_CFG_LISTENERS=INTERNAL://:9093,OUTSIDE://:9092
  • KAFKA_CFG_ADVERTISED_LISTENERS=INTERNAL://kafka:9093,OUTSIDE://kafka:9092
  • KAFKA_CFG_LISTENER_SECURITY_PROTOCOL_MAP=INTERNAL:PLAINTEXT,OUTSIDE:PLAINTEXT
  • KAFKA_CFG_INTER_BROKER_LISTENER_NAME=INTERNAL
  • ALLOW_PLAINTEXT_LISTENER=yes

@Calemsy

Calemsy commented Jun 20, 2024

Removing security_protocol="SSL" worked for me.
