Kubernetes Executor Config Volumes Break Airflow UI #9860
Thanks for opening your first issue here! Be sure to follow the issue template!
It should probably work if you pass in
@kaxil, hope this flexibility will be available in 1.10.12 as well :)
Fixed :) #10084
Hey, I am using airflow 1.10.12. This was completely broken in the scheduler in 1.10.10; that part is fixed, but my DAG is now failing with this error
And when I go to the Graph view I get this error as well
@appunni-dishq can you please post the DAG you're using?
This is roughly what it looks like. I tried to include as much detail as possible; let me know if you need more, @dimberman
Hi @appunni-dishq I think you defined your executor_config incorrectly. The entire thing is a dict, so you need to write it like this:
instead of
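The exact snippets from this comment were not preserved here. As a rough sketch of the nesting being described (the DAG name and resource values are illustrative assumptions, not the original snippet), the whole executor_config is a single dict, with the KubernetesExecutor options in a nested dict under the "KubernetesExecutor" key:

from datetime import datetime

from airflow import DAG
from airflow.operators.bash_operator import BashOperator

with DAG(
    'executor_config_sketch',          # hypothetical DAG name, for illustration
    start_date=datetime(2020, 8, 1),
    schedule_interval=None,
) as dag:
    task = BashOperator(
        task_id='print_date',
        bash_command='date',
        # One dict for the whole executor_config; the KubernetesExecutor
        # options live in a nested dict under the "KubernetesExecutor" key.
        executor_config={
            "KubernetesExecutor": {
                "request_cpu": "200m",
                "limit_cpu": "200m",
                "request_memory": "128Mi",
                "limit_memory": "128Mi",
            }
        },
    )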
Sorry, I wrote it wrong here; actually it was like this
It's definitely not a syntax issue, @dimberman
@appunni-dishq I started airflow with the following DAG using the 1.10.12 release:
I see it rendered in the UI with no issue. Is there a difference between my DAG and yours? Can you try changing the DAG name and re-rendering? Can you also show what the error looks like in the UI?
Enable the DAG and try again
I had the DAG previously, before I changed versions. I will change the DAG name and try again, since there is a chance the old dag_id might be the issue
I renamed the DAG. This is how it looks; I started the DAG after enabling it and then refreshed, and that's when it happens, @dimberman
@appunni-dishq can you try the DAG I posted and confirm that it causes the same issue? (sorry for the back-and-forth, I'm genuinely confused why I can't replicate this)
Hey, I enabled it and ran the task, and the error only appeared after the first run began. I am using Kubernetes to deploy Airflow, and this is the Docker image I am using; maybe that is the difference, and you can tell me what I am doing wrong
Maybe it's serialization that causes this?
Hi @dimberman, the initial error I talked about
This was caused by multiple instances of the DAG, because of the way I hosted my volume. The same DAGs repeating put more load on the executor, which needs more CPU and then times out; I have also seen kubernetes_executor using more CPU since the upgrade from 1.10.10 to 1.10.12, so that performance regression may be worth checking out. The serialization error, I suspect, comes from some leftover code in the Airflow webserver from 1.10.11 that you may not be aware of, which somehow tries to serialize again even when a serialized DAG is already found. It was happening once the DAG starts running and a task starts, so I guess it has something to do with those specifics. I will let you know more as soon as I know. Please ping me if you need more help to debug this. I have the Helm chart I use, which I can share.
Hello, I am also facing the issue described above (UI breaks): Airflow 1.10.12 with KubernetesExecutor. Here is my DAG:
# -*- coding: utf-8 -*-
"""
### Tutorial Documentation
Documentation that goes along with the Airflow tutorial located
[here](https://airflow.apache.org/tutorial.html)
"""
# [START tutorial]
# [START import_module]
from datetime import timedelta
from helpers.k8s_executor import get_config as executor_config
from airflow import DAG
from airflow.operators.bash_operator import BashOperator
from airflow.utils.dates import days_ago
# [END import_module]
# [START default_args]
# These args will get passed on to each operator
# You can override them on a per-task basis during operator initialization
default_args = {
    'owner': 'airflow',
    'depends_on_past': False,
    'start_date': days_ago(2),
    'email': ['airflow@example.com'],
    'email_on_failure': False,
    'email_on_retry': False,
    'retries': 1,
    'retry_delay': timedelta(minutes=5),
    # 'queue': 'bash_queue',
    # 'pool': 'backfill',
    # 'priority_weight': 10,
    # 'end_date': datetime(2016, 1, 1),
    # 'wait_for_downstream': False,
    # 'dag': dag,
    # 'sla': timedelta(hours=2),
    # 'execution_timeout': timedelta(seconds=300),
    # 'on_failure_callback': some_function,
    # 'on_success_callback': some_other_function,
    # 'on_retry_callback': another_function,
    # 'sla_miss_callback': yet_another_function,
    # 'trigger_rule': 'all_success'
}
# [END default_args]
# [START instantiate_dag]
with DAG(
    'k8s_tutorial',
    default_args=default_args,
    description='A simple tutorial DAG on K8S',
    schedule_interval=timedelta(days=1),
) as dag:
    # [END instantiate_dag]
    # t1, t2 and t3 are examples of tasks created by instantiating operators
    # [START basic_task]
    t1 = BashOperator(
        task_id='print_date',
        bash_command='date',
        executor_config=executor_config()
    )
    t2 = BashOperator(
        task_id='sleep',
        depends_on_past=False,
        bash_command='sleep 300',
        retries=3,
        executor_config=executor_config()
    )
    # [END basic_task]
    # [START documentation]
    dag.doc_md = __doc__
    t1.doc_md = """\
    #### Task Documentation
    You can document your task using the attributes `doc_md` (markdown),
    `doc` (plain text), `doc_rst`, `doc_json`, `doc_yaml` which gets
    rendered in the UI's Task Instance Details page.
    ![img](http://montcs.bloomu.edu/~bobmon/Semesters/2012-01/491/import%20soul.png)
    """
    # [END documentation]
    t1 >> t2
# [END tutorial]
Here is the helper included that triggers the UI crash:
def get_config():
"""
Override Executor Configuration for K8S
:return:
"""
from kubernetes.client import models as k8s
return {
"KubernetesExecutor": {
"request_cpu": "200m",
"limit_cpu": "200m",
"request_memory": "128Mi",
"limit_memory": "128Mi",
"labels": {
"app": "airflow",
"airflow-postgresql-client": "true"
},
"security_context": k8s.V1PodSecurityContext(
run_as_user=1000,
run_as_group=1000
),
"init_containers": [
k8s.V1Container(
name="git-sync-clone",
image="my_private_registry/k8s/images/git:latest",
command=[
"/scripts/git-clone.sh"
],
args=[
"MY_GIT_REPO",
"master",
"/git",
"MY_GIT_HOST",
"443",
"id_rsa"
],
resources=k8s.V1ResourceRequirements(
requests={"cpu": "100m", "memory": "64Mi"},
limits={"cpu": "100m", "memory": "64Mi"}
),
security_context=k8s.V1PodSecurityContext(
run_as_user=1000,
run_as_group=1000
),
volume_mounts=[
k8s.V1VolumeMount(
mount_path="/git",
name="airflow-dags",
read_only=False
),
k8s.V1VolumeMount(
mount_path="/scripts",
name="git-clone",
read_only=True
)
]
)
],
"volumes": [
k8s.V1Volume(
name="git-clone",
config_map=k8s.V1ConfigMapVolumeSource(
default_mode=0o777,
name="sandbox-airflow-scripts-git"
)
),
k8s.V1Volume(
name="airflow-dags",
empty_dir=k8s.V1EmptyDirVolumeSource()
),
k8s.V1Volume(
name="airflow-logs",
empty_dir=k8s.V1EmptyDirVolumeSource()
)
]
}
} Crash traceback :
We're hitting this same error. As @appunni-dishq pointed out above, it only happens in the graph UI when the DAG is enabled and has at least one DAG run. Any suggestions on a workaround, @dimberman?
Apache Airflow version: 1.10.11
Kubernetes version (if you are using kubernetes) (use kubectl version): 1.16.9
Environment:
uname -a: 5.3.0-1019-aws #21~18.04.1-Ubuntu SMP x86_64 GNU/Linux
pip installed via pipenv
What happened:
After adding a specification for Volume and VolumeMount objects to the executor_config for KubernetesExecutor, the Airflow UI explodes with the following message when I try to clear the task or do any other action on it:
Once this error is thrown after clearing a task with the volume-containing executor_config, that DAG no longer loads at all in the Airflow UI.
What you expected to happen:
The KubernetesExecutor dynamically adds my specific volumes to the Pod spec.
How to reproduce it:
In an Airflow environment configured to use the KubernetesExecutor, pass a spec of the following form to the executor_config:
Once such a task is defined, click the Clear button from the UI and it will throw the TypeError above.
Anything else we need to know:
Obviously the volume mounts specified via airflow.cfg work, but we'd like to be able to dynamically add volumes to tasks that need them without having all of our jobs mount volumes unnecessarily.
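The reproduction spec referenced under "How to reproduce it" was not preserved in this copy of the issue. As an illustration only, assuming the 1.10.x airflow.kubernetes.volume helpers and the volumes / volume_mounts keys of the dict-style KubernetesExecutor config (the volume name, claim, and mount path below are made up, not the reporter's actual spec), a spec of the general shape being described would look roughly like this:

from airflow.kubernetes.volume import Volume
from airflow.kubernetes.volume_mount import VolumeMount

# Hypothetical volume and mount, for illustration only.
data_volume = Volume(
    name="example-data-volume",
    configs={"persistentVolumeClaim": {"claimName": "example-data-claim"}},
)
data_mount = VolumeMount(
    name="example-data-volume",
    mount_path="/data",
    sub_path=None,
    read_only=False,
)

# Per the report, attaching objects like these to a task via executor_config
# is what leads to the TypeError in the UI.
executor_config = {
    "KubernetesExecutor": {
        "volumes": [data_volume],
        "volume_mounts": [data_mount],
    }
}

Attaching a config of this shape to a task (executor_config=executor_config) and then clicking Clear in the UI is the sequence the report says triggers the TypeError.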