
Status of testing Providers that were prepared on December 23, 2023 #36384

Closed · 32 of 84 tasks
potiuk opened this issue Dec 23, 2023 · 23 comments
Labels: kind:meta (High-level information important to the community), testing status (Status of testing releases)

Comments

@potiuk (Member) commented Dec 23, 2023

Body

I have a kind request for all the contributors to the latest provider packages release.
Could you please help us test the RC versions of the providers?

The guidelines on how to test providers can be found in

Verify providers by contributors

Let us know in a comment whether the issue is addressed.
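
As an informal illustration (the linked guideline is the authoritative source), testing usually means installing the RC package into an existing Airflow environment and exercising the changed code paths, e.g. for the amazon provider candidate listed below:

pip install --upgrade "apache-airflow-providers-amazon==8.14.0rc1"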

These are the providers that require testing, as some substantial changes were introduced:

Provider airbyte: 3.5.1rc1

Provider amazon: 8.14.0rc1

Provider apache.beam: 5.5.0rc1

Provider apache.cassandra: 3.4.1rc1

Provider apache.hdfs: 4.3.1rc1

Provider apache.hive: 6.4.0rc1

Provider apache.kafka: 1.3.1rc1

Provider apache.spark: 4.6.0rc1

Provider celery: 3.5.1rc1

Provider cncf.kubernetes: 7.12.0rc1

Provider common.sql: 1.10.0rc1

Provider databricks: 6.0.0rc1

Provider docker: 3.9.1rc1

Provider google: 10.13.0rc3

Provider microsoft.azure: 8.5.0rc1

Provider odbc: 4.4.0rc1

Provider openlineage: 1.3.1rc1

Provider postgres: 5.10.0rc1

Provider presto: 5.4.0rc1

Provider sftp: 4.8.1rc1

Provider slack: 8.5.1rc1

Provider smtp: 1.6.0rc1

Provider sqlite: 3.7.0rc1

Provider ssh: 3.10.0rc1

Provider trino: 5.6.0rc1

Provider weaviate: 1.2.0rc1

Committer

  • I acknowledge that I am a maintainer/committer of the Apache Airflow project.
potiuk added the kind:meta (High-level information important to the community) label on Dec 23, 2023
airflow-oss-bot added a commit to astronomer/astronomer-providers that referenced this issue Dec 23, 2023
@Lee-W (Member) commented Dec 23, 2023

We've tested the following provider RCs with our example DAGs without encountering issues.

apache-airflow-providers-microsoft-azure==8.5.0rc1
apache-airflow-providers-sftp==4.8.1rc1
apache-airflow-providers-amazon==8.14.0rc1
apache-airflow-providers-apache-hive==6.4.0rc1
apache-airflow-providers-cncf-kubernetes==7.12.0rc1
apache-airflow-providers-databricks==6.0.0rc1
apache-airflow-providers-google==10.13.0rc3 (including #36202 )

@dirrao (Contributor) commented Dec 23, 2023 via email

@utkarsharma2 (Contributor) commented Dec 23, 2023 via email

@Joffreybvn (Contributor):

Databricks and ODBC providers work fine.
Tested #36205 and #36000

@potiuk (Member, Author) commented Dec 23, 2023

Would it be possible to have an RC2 to include these commits <75d74b1> and <ff3b8da> for the Weaviate provider?

I am not sure if there will be an RC2 for Weaviate, but we can release a new version right after this one gets released as well.

I've been slightly improving the release process to make it possible (and easy) to have a more-or-less continuous release process. I have one more small thing to fix, but as I explained to @eladkal in a separate discussion, we should basically be able to start a new release process right after we release the previous wave - and keep doing that all the time - even if some providers get removed from the previous wave. This should streamline our release process, so we should be able to make provider releases even more frequently - maybe even once a week - especially if we also introduce a more "rotating" release manager position, so maybe that's a good thing to try right away :).

@potiuk (Member, Author) commented Dec 23, 2023

As you can see - for example - in the past wave we had to withdraw google and databricks, so while we are releasing new changes in other providers, this wave also includes the google and databricks providers (google with RC3 and databricks with a new - breaking - 6.0.0 version RC1). All of this is going to be pretty much fully automated after one more small thing I need to add, and we should be able to simply carry such providers over to the new wave, which means that no matter what the result of the voting is, we will be able to start a new wave right after we release the previous one.

That will also make our release waves smaller.

@guillaumeblaquiere (Contributor):

Ok for #36133

@shohamy7 (Contributor):

I've tested #36241, all looks good

@spencertollefson (Contributor):

I tested #36273. All good.

@hussein-awala (Member):

I tested all my changes, and all work as expected.

@adam133 (Contributor) commented Dec 23, 2023

I won't get a chance to test the RC until late next week, but #36248 worked when I ran the changes locally.

@utkarsharma2 (Contributor) commented Dec 24, 2023 via email

@ankurbajaj9 (Contributor):

This probably also needs a change to the connection type in the default connections (#36145), but in itself it is OK.

@vchiapaikeo (Contributor) commented Dec 24, 2023

Has anyone noticed unusual behavior with the DagFileProcessor lately? I've observed a relatively simple DAG that keeps getting removed and added back. I reverted my change locally to see if I could still reproduce this bug, and I could, so I do not think it is related to my change. However, it is hard to test my change because this simple DAG keeps getting removed and re-added. Logs below -


Scheduler Logs

[2023-12-24T13:51:44.460+0000] {manager.py:523} INFO - DAG my_dag is missing and will be deactivated.
[2023-12-24T13:51:44.462+0000] {manager.py:535} INFO - Deactivated 1 DAGs which are no longer present in file.
[2023-12-24T13:51:44.463+0000] {manager.py:539} INFO - Deleted DAG my_dag in serialized_dag table
[2023-12-24T13:52:48.417+0000] {scheduler_job_runner.py:696} INFO - Received executor event with state success for task instance TaskInstanceKey(dag_id='my_dag', task_id='task', run_id='scheduled__2021-01-01T02:05:00+00:00', try_number=1, map_index=-1)
[2023-12-24T13:52:48.419+0000] {scheduler_job_runner.py:733} INFO - TaskInstance Finished: dag_id=my_dag, task_id=task, run_id=scheduled__2021-01-01T02:05:00+00:00, map_index=-1, run_start_date=2023-12-24 13:50:47.082990+00:00, run_end_date=2023-12-24 13:52:47.473307+00:00, run_duration=120.390317, state=success, executor_state=success, try_number=1, max_tries=2, job_id=2080, pool=default_pool, queue=default, priority_weight=1, operator=BashOperator, queued_dttm=2023-12-24 13:50:46.878463+00:00, queued_by_job_id=43, pid=4500
[2023-12-24T13:53:06.037+0000] {processor.py:277} WARNING - Killing DAGFileProcessorProcess (PID=4696)

Dag File Processor Logs

[2023-12-24T13:47:13.820+0000] {manager.py:879} INFO -
================================================================================
DAG File Processing Stats

File Path                                      PID  Runtime      # DAGs    # Errors  Last Runtime    Last Run
-------------------------------------------  -----  ---------  --------  ----------  --------------  -------------------
/files/dags/my_dag.py                         1549  40.92s            0           1  100.14s         2023-12-24T13:46:02
/files/dags/example_dynamic_task_mapping.py                           1           0  0.63s           2023-12-24T13:46:48
================================================================================
[2023-12-24T13:47:44.032+0000] {manager.py:879} INFO -
================================================================================
DAG File Processing Stats

File Path                                      PID  Runtime      # DAGs    # Errors  Last Runtime    Last Run
-------------------------------------------  -----  ---------  --------  ----------  --------------  -------------------
/files/dags/my_dag.py                         1549  71.13s            0           1  100.14s         2023-12-24T13:46:02
/files/dags/example_dynamic_task_mapping.py                           1           0  0.59s           2023-12-24T13:47:19
================================================================================
[2023-12-24T13:48:13.391+0000] {manager.py:1201} ERROR - Processor for /files/dags/my_dag.py with PID 1549 started at 2023-12-24T13:46:32.902112+00:00 has timed out, killing it.
[2023-12-24T13:48:14.430+0000] {manager.py:879} INFO -
================================================================================
DAG File Processing Stats

File Path                                    PID    Runtime      # DAGs    # Errors  Last Runtime    Last Run
-------------------------------------------  -----  ---------  --------  ----------  --------------  -------------------
/files/dags/my_dag.py                                                 0           1  100.49s         2023-12-24T13:48:13
/files/dags/example_dynamic_task_mapping.py                           1           0  0.47s           2023-12-24T13:47:50
================================================================================

my_dag.py dag file

import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator
# from airflow.providers.smtp.notifications.smtp import SmtpNotifier
from datetime import timedelta

EMAIL = "xyz@gmail.com"


with DAG(
    dag_id="my_dag",
    start_date=datetime.datetime(2021, 1, 1),
    schedule="*/5 * * * *",
    max_active_tasks=1,
    max_active_runs=1,
    # sla_miss_callback=SmtpNotifier(from_email=EMAIL, to=EMAIL),
):
    BashOperator(
        task_id="task",
        bash_command="sleep 120",
        retries=2,
        sla=timedelta(seconds=30),
        # on_success_callback=SmtpNotifier(from_email=EMAIL, to=EMAIL),
        # on_failure_callback=SmtpNotifier(from_email=EMAIL, to=EMAIL),
        # on_retry_callback=SmtpNotifier(from_email=EMAIL, to=EMAIL),
    )

@potiuk (Member, Author) commented Dec 24, 2023

I think there are some changes in main that influence that - I already saw one while I was testing the Python Client (#36377) - but it works fine with 2.8.0, so maybe you can test it with 2.8.0?

@potiuk (Member, Author) commented Dec 24, 2023

BTW, Happy Holidays everyone! This one will continue after the 26th of December 🎄

@josh-fell (Contributor):

#36198 and #36262 verified in packaged files.

@potiuk (Member, Author) commented Dec 26, 2023

Back after holidays :)

@nathadfield (Collaborator):

All good on #36072

@potiuk (Member, Author) commented Dec 28, 2023

Thank you everyone.

Providers are released.

I invite everyone to help improve providers for the next release; a list of open issues can be found here.

potiuk closed this as completed on Dec 28, 2023
potiuk added the testing status (Status of testing releases) label on Dec 28, 2023
@ginolegigot (Contributor):

Hello!
A small question related to this thread: do you know when this file https://raw.githubusercontent.com/apache/airflow/constraints-2.8.0/constraints-3.11.txt (or the equivalent for any other Python version) will be updated to take into account the latest provider versions? The next apache-airflow release (2.8.1)?

@potiuk (Member, Author) commented Jan 2, 2024

https://raw.githubusercontent.com/apache/airflow/constraints-2.8.0/constraints-3.11.txt or any python versions will be updated to take into account the latest providers versions ? The next apache-airflow release (2.8.1) ?

Released constraints are almost never updated (the only case in which they are updated is when, for some reason, Airflow is not installable with them). This is explained in https://airflow.apache.org/docs/apache-airflow/stable/installation/installing-from-pypi.html#constraints-files: constraints are there to provide a reproducible installation of Airflow - with THE SAME packages it was originally released with (including providers).

If you want to upgrade to the latest providers, you should install them without constraints. This is explained in https://airflow.apache.org/docs/apache-airflow/stable/installation/installing-from-pypi.html#installation-and-upgrade-scenarios and in this talk from Airflow Summit 2023: https://airflowsummit.org/sessions/2023/mastering-dependencies-the-airflow-way/ (I recommend reading the docs / watching the talk).
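
To illustrate the pattern from the linked documentation (a minimal sketch; the version numbers are only examples): install Airflow itself with the constraints file, then upgrade individual providers without constraints:

pip install "apache-airflow==2.8.0" --constraint "https://raw.githubusercontent.com/apache/airflow/constraints-2.8.0/constraints-3.11.txt"
pip install --upgrade "apache-airflow-providers-google==10.13.0"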

The constraints are updated (automatically, when all unit and integration tests pass in the related branch):

In the case of https://github.com/apache/airflow/blob/constraints-v2-8-test/constraints-3.10.txt - when we release 2.8.1, those constraints will become the 2.8.1 constraints and (as well as the 2.8.0 ones) will be frozen and not updated any more.

@ginolegigot (Contributor):

Many thanks!
