Version 2.3.3 breaks "Plugins as Python packages" feature #25271

Closed
2 tasks done
rino0601 opened this issue Jul 25, 2022 · 8 comments · Fixed by #25296
Labels
area:core kind:bug This is a clearly a bug

@rino0601
Contributor

Apache Airflow version

2.3.3 (latest released)

What happened

In 2.3.3

If I use the https://airflow.apache.org/docs/apache-airflow/stable/plugins.html#plugins-as-python-packages feature, I see this error:

short:
ValueError: The name 'airs' is already registered for this blueprint. Use 'name=' to provide a unique name.

long:

I'm trying to reproduce it...

If I don't use it (working around it with AIRFLOW__CORE__PLUGINS_FOLDER), the error doesn't occur.

It didn't happen in 2.3.2 and earlier.

What you think should happen instead

It looks like plugins are imported multiple times when they are defined as Python packages.

Perhaps Flask's major version change is the main cause.
Presumably, in Flask 1.0, duplicate registration of a blueprint was quietly filtered out, but in 2.0 it seems to have been changed to raise an error. (I am trying to find out whether this hypothesis is correct.)

Anyway, using the latest version of FAB is important, so we will have to adapt to this change: plugins will have to be imported only once, regardless of how they are defined.

How to reproduce

I reproduced it in the environment I use at work, but that setup is difficult to disclose or explain.
I'm working on reproducing it with the breeze command; I'm opening the issue first in the belief that it's not just me.

Operating System

CentOS Linux release 7.9.2009 (Core)

Versions of Apache Airflow Providers

$ SHIV_INTERPRETER=1 airsflow -m pip freeze | grep apache-
apache-airflow==2.3.3
apache-airflow-providers-apache-hive==3.1.0
apache-airflow-providers-apache-spark==2.1.0
apache-airflow-providers-celery==3.0.0
apache-airflow-providers-common-sql==1.0.0
apache-airflow-providers-ftp==3.1.0
apache-airflow-providers-http==3.0.0
apache-airflow-providers-imap==3.0.0
apache-airflow-providers-postgres==5.1.0
apache-airflow-providers-redis==3.0.0
apache-airflow-providers-sqlite==3.1.0

but I think these are irrelevant.

Deployment

Other 3rd-party Helm chart

Deployment details

docker image based on centos7, python 3.9.10 interpreter, self-written helm2 chart ....

... but I think these are irrelevant.

Anything else

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

@rino0601 rino0601 added area:core kind:bug This is a clearly a bug labels Jul 25, 2022
@boring-cyborg

boring-cyborg bot commented Jul 25, 2022

Thanks for opening your first issue here! Be sure to follow the issue template!

@potiuk potiuk added this to the Airflow 2.3.4 milestone Jul 25, 2022
@potiuk
Member

potiuk commented Jul 25, 2022

Yeah. It would be great to get to the bottom of this. Happy to help with investigation if you have some more findings.

@bmoon4

bmoon4 commented Jul 26, 2022

In Flask 2.x, registering the same blueprint more than once under the same name is prohibited:

https://github.com/pallets/flask/blob/2.1.2/src/flask/blueprints.py#L293-L296

The problem is that entry_points_with_dist('airflow.plugins') returns the plugin twice:
https://github.com/apache/airflow/blob/2.3.3/airflow/plugins_manager.py#L224-L225

We need a way to prevent double registration of plugins.

@uranusjr
Member

The most likely cause is that the plugin appears twice in sys.path and is thus loaded multiple times by importlib.metadata. We can add some sort of registry that keeps track of which distributions have been loaded (by distribution name) to avoid this.
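
A minimal sketch of such a registry, using only the standard library. It is an illustration of the idea, not the code that was merged for this issue. The function names `canonical_name` and `iter_unique_entry_points` are hypothetical, and keying on the normalized "Name" metadata field is an assumption about what "by distribution name" would mean in practice.

```python
import re
from importlib import metadata


def canonical_name(name):
    """Normalize a distribution name (lowercase, runs of -, _ and . become -)."""
    return re.sub(r"[-_.]+", "-", name).lower()


def iter_unique_entry_points(group, distributions=None):
    """Yield (entry_point, distribution) pairs for one entry point group.

    Each distribution name is visited only once, so a distribution that is
    discovered twice (for example through two sys.path entries) contributes
    its entry points a single time.
    """
    if distributions is None:
        # Default: scan every distribution visible on sys.path.
        distributions = metadata.distributions()
    loaded = set()  # registry of distribution names already processed
    for dist in distributions:
        name = dist.metadata.get("Name")
        if name is None:
            # No usable name to key on; skip it in this sketch.
            continue
        key = canonical_name(name)
        if key in loaded:
            continue  # same distribution seen again: ignore the duplicate
        loaded.add(key)
        for entry_point in dist.entry_points:
            if entry_point.group == group:
                yield entry_point, dist
```

Calling `iter_unique_entry_points("airflow.plugins")` with no second argument scans the real environment; passing an explicit iterable of distributions makes the behavior easy to check in isolation.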

@rgmz

rgmz commented Jul 26, 2022

The most likely cause is that the plugin appears twice in sys.path and is thus loaded multiple times by importlib.metadata.

Indeed. One reason this can happen is if there are separate lib and lib64 directories, as pip install seems to copy the package to both directories.

For example:

>>> import importlib_metadata
>>> for dist in importlib_metadata.distributions():
...     for e in dist.entry_points:
...         if e.group != "airflow.plugins":
...             continue
...         print(dist._path)
...
/opt/app-root/lib64/python3.9/site-packages/custom_plugin-1.3.16.dist-info
/opt/app-root/lib/python3.9/site-packages/custom_plugin-1.3.16.dist-info

@uranusjr
Member

as pip install seems to copy the package to both directories.

It’s probably not pip installing things to both directories; on some systems these two are symbolically linked (i.e. the same directory).
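
A quick way to check for this, sketched with the standard library only: group path entries by the real directory they resolve to and report any directory reached through more than one entry. The helper name `find_aliased_paths` is hypothetical.

```python
import os
import sys
from collections import defaultdict


def find_aliased_paths(paths):
    """Map each real directory to the path entries that resolve to it.

    Only directories reachable through more than one entry are returned,
    which covers symlinked pairs such as lib and lib64 as well as entries
    that are simply listed twice.
    """
    groups = defaultdict(list)
    for entry in paths:
        # realpath follows symlinks; an empty entry resolves to the
        # current working directory.
        groups[os.path.realpath(entry)].append(entry)
    return {real: entries for real, entries in groups.items() if len(entries) > 1}
```

Running `find_aliased_paths(sys.path)` in the affected interpreter shows whether two entries point at the same site-packages directory.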

@VladimirYushkevich
Contributor

I have a very similar issue with duplicate plugins on 2.6.0:

Python 3.9.13 (main, Sep 22 2022, 15:42:13) 
[Clang 14.0.0 (clang-1400.0.29.102)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import importlib_metadata
>>> for dist in importlib_metadata.distributions():
...     for e in dist.entry_points:
...         if e.group != "airflow.plugins":
...             continue
...         print(dist._path)
... 
my_package.egg-info
/Users/my_user/path_to_my_project/my_package.egg-info

Any ideas why I have 2 entries in importlib_metadata.distributions() for the airflow.plugins group? (It looks to me like they both point to the same egg-info.)

@VladimirYushkevich
Contributor

I have a very similar issue with duplicate plugins on 2.6.0:

Small update: it seems the above issue happens for me only when running airflow standalone (I also have a virtual Python environment). Maybe it is related to python/importlib_metadata#410.
