[QUARANTINE] test_send_tasks_to_celery_hang hangs on self-hosted runners #16168

Closed
potiuk opened this issue May 30, 2021 · 2 comments
Labels: kind:bug (this is clearly a bug), Quarantine (issues that are occasionally failing and are quarantined)

potiuk (Member) commented May 30, 2021

The test_send_tasks_to_celery_hang test hangs on self-hosted runners more often than not.

It was introduced in #15989. The test does not usually hang on regular GitHub runners or when run locally (I could not make it fail), but it hangs almost always on self-hosted runners.

Marking it as quarantined for now.

Example here:

https://github.com/apache/airflow/runs/2703420580?check_suite_focus=true#step:6:6118

  ________________________ test_send_tasks_to_celery_hang ________________________
  
  register_signals = None
  
      def test_send_tasks_to_celery_hang(register_signals):  # pylint: disable=unused-argument
          """
          Test that celery_executor does not hang after many runs.
          """
          executor = celery_executor.CeleryExecutor()
      
          task = MockTask()
          task_tuples_to_send = [(None, None, None, None, task) for _ in range(26)]
      
          for _ in range(500):
              # This loop can hang on Linux if celery_executor does something wrong with
              # multiprocessing.
  >           results = executor._send_tasks_to_celery(task_tuples_to_send)
  
  tests/executors/test_celery_executor.py:537: 
  _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
  airflow/executors/celery_executor.py:324: in _send_tasks_to_celery
      send_pool.map(send_task_to_executor, task_tuples_to_send, chunksize=chunksize)
  /usr/local/lib/python3.7/concurrent/futures/_base.py:623: in __exit__
      self.shutdown(wait=True)
  /usr/local/lib/python3.7/concurrent/futures/process.py:681: in shutdown
      self._queue_management_thread.join()
  /usr/local/lib/python3.7/threading.py:1044: in join
      self._wait_for_tstate_lock()
  _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
  
  self = <Thread(QueueManagerThread, stopped daemon 140457603389184)>
  block = True, timeout = -1
  
      def _wait_for_tstate_lock(self, block=True, timeout=-1):
          # Issue #18808: wait for the thread state to be gone.
          # At the end of the thread's life, after all knowledge of the thread
          # is removed from C data structures, C code releases our _tstate_lock.
          # This method passes its arguments to _tstate_lock.acquire().
          # If the lock is acquired, the C code is done, and self._stop() is
          # called.  That sets ._is_stopped to True, and ._tstate_lock to None.
          lock = self._tstate_lock
          if lock is None:  # already determined that the C code is done
              assert self._is_stopped
  >       elif lock.acquire(block, timeout):
  E       Failed: Timeout >60.0s
  
  /usr/local/lib/python3.7/threading.py:1060: Failed
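
Note on the traceback: the test is stuck in an unbounded Thread.join() reached through the executor's shutdown(wait=True), and the run only ends because a 60-second test timeout fires. The following is a minimal sketch of that idea using only the standard library, not the mechanism Airflow or its test runner actually uses. The names run_with_timeout, quick and stuck are hypothetical and exist only for illustration.

```python
import threading
import time


def run_with_timeout(func, timeout_seconds):
    """Run func in a daemon thread and return (finished, result).

    finished is False if func was still running after timeout_seconds.
    Hypothetical helper for illustration; not part of Airflow.
    """
    outcome = {}

    def target():
        outcome["result"] = func()

    worker = threading.Thread(target=target, daemon=True)
    worker.start()
    # A bare worker.join() would block indefinitely if func never returns,
    # which is the kind of wait shown in the traceback. Passing a timeout
    # bounds the wait so the caller can report a hang instead.
    worker.join(timeout=timeout_seconds)
    if worker.is_alive():
        return False, None
    return True, outcome.get("result")


def quick():
    return "done"


def stuck():
    # Stand-in for a call that does not return within the time budget.
    time.sleep(5)


print(run_with_timeout(quick, 1.0))   # (True, 'done')
print(run_with_timeout(stuck, 0.2))   # (False, None)
```

The worker is a daemon thread so that a stuck call does not keep the interpreter alive after the caller gives up. This only detects the hang; it does not stop the stuck work or explain why it happens on self-hosted runners.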
potiuk added the kind:bug and Quarantine labels May 30, 2021
potiuk (Member, Author) commented May 30, 2021

cc: @yuqian90

potiuk added a commit to potiuk/airflow that referenced this issue May 30, 2021
The test_send_tasks_to_celery_hang hangs on self-hosted runners more
often than not.

It's been introduced in apache#15989 and while the test does not usually hang
on regular GitHub runners, or in case of running it locally (I could not
make it fail), it does hang almost always when run on self-hosted
runners.

Marking it as quarantined for now.

Issue apache#16168 created to keep track of it.
potiuk added a commit that referenced this issue May 30, 2021
potiuk added this to the Airflow 2 clean-up milestone Dec 18, 2021
eladkal (Contributor) commented Dec 31, 2022

Closing, as the test isn't marked as quarantined anymore.

eladkal closed this as completed Dec 31, 2022