Speed up podReconciliation using parallel goroutine #1286
The calls to `woc.checkAndCompress()` were removed (I think accidentally). This will break the compression feature.
    if err != nil {
        woc.log.Warnf("Failed to apply execution control to pod %s", pod.Name)
    }
    err = woc.checkAndCompress()
@jessesuen Not yet. I think the call `err = woc.checkAndCompress()` here is unnecessary. The function `performAssessment` only syncs each pod's state to nodeState; it does not need to compress anything. Every workflow is passed to the `persistUpdates` function, and that function calls `checkAndCompress`, so I think this line can be removed. I have tested the compression feature in my production environment, and it works well without this line. WDYT?
Adding @sarabala1979 to double-check this.
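A minimal sketch of the flow this comment describes, assuming each pod is assessed in its own goroutine and the results are merged before a single persist step. The names `pod`, `nodeStatus`, `assessPod`, and `reconcilePods`, as well as the semaphore used to bound concurrency, are hypothetical and are not taken from the controller code.

```go
package main

import (
	"fmt"
	"sync"
)

// pod and nodeStatus are simplified, hypothetical stand-ins for the real types.
type pod struct {
	Name  string
	Phase string
}

type nodeStatus struct {
	Phase string
}

// assessPod is a hypothetical per-pod assessment: it maps a pod to a node status.
func assessPod(p pod) nodeStatus {
	return nodeStatus{Phase: p.Phase}
}

// reconcilePods assesses pods concurrently, running at most `parallelism`
// assessments at a time, and collects the results in a mutex-guarded map.
func reconcilePods(pods []pod, parallelism int) map[string]nodeStatus {
	if parallelism < 1 {
		parallelism = 1 // a zero-capacity semaphore would block forever
	}
	nodes := make(map[string]nodeStatus, len(pods))
	var mu sync.Mutex
	var wg sync.WaitGroup
	sem := make(chan struct{}, parallelism)

	for _, p := range pods {
		wg.Add(1)
		sem <- struct{}{} // acquire a slot before starting the goroutine
		go func(p pod) {
			defer wg.Done()
			defer func() { <-sem }() // release the slot
			st := assessPod(p)
			mu.Lock()
			nodes[p.Name] = st // map writes must be serialized
			mu.Unlock()
		}(p)
	}
	wg.Wait()
	return nodes
}

func main() {
	pods := []pod{{Name: "a", Phase: "Succeeded"}, {Name: "b", Phase: "Failed"}}
	nodes := reconcilePods(pods, 2)
	fmt.Println(len(nodes), nodes["a"].Phase, nodes["b"].Phase)
}
```

In this sketch any size check or compression would run once, after `reconcilePods` returns, which is the arrangement the comment argues for; whether that is sufficient is exactly what the rest of the thread debates.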
`err = woc.checkAndCompress()` is required in podReconciliation. In the existing call flow, every node updates its node status and outputs in the podReconciliation function, and the node status compression feature needs to check every node's update. Until the size limit is reached, `err = woc.checkAndCompress()` only performs the size check.
Cons:
- This will fail only the node that crosses the limit.
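The behaviour described above (a size check on every call, compression only once the limit is exceeded) can be sketched roughly as follows. This is an illustration under assumptions, not the controller's actual `checkAndCompress`: the types, the field names, and the gzip plus base64 encoding are all hypothetical choices made for the example.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"encoding/base64"
	"encoding/json"
	"fmt"
)

// nodeStatus is a simplified, hypothetical stand-in for a per-node status.
type nodeStatus struct {
	Phase   string `json:"phase"`
	Message string `json:"message,omitempty"`
}

// workflowStatus holds either plain nodes or a compressed blob (hypothetical layout).
type workflowStatus struct {
	Nodes           map[string]nodeStatus `json:"nodes,omitempty"`
	CompressedNodes string                `json:"compressedNodes,omitempty"`
}

// checkAndCompress marshals the nodes to measure their size and compresses
// them only when that size exceeds limit. It reports whether it compressed.
func checkAndCompress(st *workflowStatus, limit int) (bool, error) {
	raw, err := json.Marshal(st.Nodes)
	if err != nil {
		return false, err
	}
	if len(raw) <= limit {
		// Below the limit nothing is compressed, but the marshal cost
		// has still been paid on this call.
		return false, nil
	}
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	if _, err := zw.Write(raw); err != nil {
		return false, err
	}
	if err := zw.Close(); err != nil {
		return false, err
	}
	st.CompressedNodes = base64.StdEncoding.EncodeToString(buf.Bytes())
	st.Nodes = nil
	return true, nil
}

func main() {
	st := &workflowStatus{Nodes: map[string]nodeStatus{"a": {Phase: "Succeeded"}}}
	compressed, err := checkAndCompress(st, 1024*1024)
	fmt.Println(compressed, err)
}
```

Note that even the "size check only" path marshals every node, so calling it once per pod update grows more expensive as the workflow grows.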
Yep, I have already put `checkAndCompress` back.
@jessesuen I have left comments on the code lines; please help review again. And please ask @sarabala1979 to help double-check the removed line. Thank you.
Instead of compressing the entire node status on every pod update, we can estimate each pod's node status size and add it to the existing size to check whether the node status will fit. This approach will speed up podReconciliation and will also allow pods whose status fits within the size limit to succeed. I have already discussed this with @jessesuen.
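A rough sketch of the estimate idea, assuming a running byte count is kept and only the single updated node is marshaled on each pod update. The `sizeEstimator` type and its `fits` method are hypothetical names for illustration; a real implementation would also need to account for a node replacing its own earlier status, which this sketch ignores.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// nodeStatus is a simplified, hypothetical stand-in for a per-node status.
type nodeStatus struct {
	Phase   string `json:"phase"`
	Outputs string `json:"outputs,omitempty"`
}

// sizeEstimator tracks an estimated total size against a limit (hypothetical helper).
type sizeEstimator struct {
	current int // estimated bytes already used by accepted nodes
	limit   int // maximum bytes allowed
}

// fits marshals only the given node and reports whether adding it keeps the
// estimated total within the limit. The total grows only when the node fits.
func (e *sizeEstimator) fits(node nodeStatus) (bool, error) {
	raw, err := json.Marshal(node)
	if err != nil {
		return false, err
	}
	if e.current+len(raw) > e.limit {
		return false, nil // only this node is rejected; earlier ones stay accepted
	}
	e.current += len(raw)
	return true, nil
}

func main() {
	est := &sizeEstimator{limit: 1024 * 1024}
	ok, err := est.fits(nodeStatus{Phase: "Succeeded"})
	fmt.Println(ok, err, est.current)
}
```

The cost per update then depends on the size of one node rather than on the size of the whole workflow, and a node that would overflow the limit can be failed on its own while nodes that fit still succeed.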
Force-pushed from 60b084b to 7127af2.
LGTM
Force-pushed from 7127af2 to 0256b0e.
@jessesuen Thank you.
Maybe for most normal processing, we shouldn't marshal all the node statuses. That would speed up processing even more.
@jessesuen Any update on this? Thank you.
Approved
* Updated ARTIFACT_REPO.md (argoproj#1049) * Updated examples/README.md (argoproj#1051) * Support for K8s API based Executor (argoproj#1010) * Submodules are dirty after checkout -- need to update (argoproj#1052) * Parameter and Argument names should support snake case (argoproj#1048) * Add namespace explicitly to pod metadata (argoproj#1059) * Update dependencies to K8s v1.12 and client-go 9.0 * Adding SAP Hybris in Who uses Argo (argoproj#1064) * Add Cratejoy to list of users (argoproj#1063) * Raise not implemented error when artifact saving is unsupported (argoproj#1062) * Adding native GCS support for artifact storage and retrieval * Support nested steps workflow parallelism (argoproj#1046) * Auto-complete workflow names (argoproj#1061) * Auto-complete workflow names * Use cobra revision at fe5e611709b0c57fa4a89136deaa8e1d4004d053 * Fix string format arguments in workflow utilities. (argoproj#1070) * fix argoproj#1078 Azure AKS authentication issues (argoproj#1079) * Issue argoproj#740 - System level workflow parallelism limits & priorities (argoproj#1065) * Issue argoproj#740 - System level workflow parallelism limits & priorities * Apply reviewer notes * Add new article and minor edits. (argoproj#1083) * Update docs to outline bare minimum set of privileges for a workflow * Use relative links on README file (argoproj#1087) * Fix typo in demo.md (argoproj#1089) Fix a small typo in demo.md that I encounted when reading through the getting started guide. * Drop reference to removed `argo install` command. (argoproj#1074) * Initialize child node before marking phase. Fixes panic on invalid `When` (argoproj#1075) * argoproj#1081 added retry logic to s3 load and save function (argoproj#1082) * adding logo to be used by the OS Site (argoproj#1099) * Update ROADMAP.md * Update docs with examples using the K8s REST API * Issue argoproj#1114 - Set FORCE_NAMESPACE_ISOLATION env variable in namespace install manifests (argoproj#1116) * Fix examples docs of parameters. 
(argoproj#1110) * Remove docker_lib mount volume which is not needed anymore (argoproj#1115) * Remove docker_lib mount volume which is not needed anymore * Remove unused hostPathDir * add support for ppc64le and s390x (argoproj#1102) * Install mime-support in argoexec to set proper mime types for S3 artifacts (resolves argoproj#1119) * Adding Quantibio in Who uses Argo (argoproj#1111) * Adding Quantibio in Who uses Argo * fix spelling mistake * Fix output artifact and parameter conflict (argoproj#1125) `SaveArtifacts` deletes the files that `SaveParameters` might still need, so we're calling `SaveParameters` first. Fixes argoproj#1124 * Update generated swagger to fix verify-codegen (argoproj#1131) * Allow owner reference to be set in submit util (argoproj#1120) * Issue argoproj#1104 - Remove container wait timeout from 'argo logs --follow' (argoproj#1142) * Issue argoproj#1132 - Fix panic in ttl controller (argoproj#1143) * Issue argoproj#1040 - Kill daemoned step if workflow consist of single daemoned step (argoproj#1144) * Fix global artifact overwriting in nested workflow (argoproj#1086) * Fix issue where steps with exhausted retires would not complete (argoproj#1148) * add support for other archs (argoproj#1137) * Reflect minio chart changes in documentation (argoproj#1147) * Issue argoproj#1136 - Fix metadata for DAG with loops (argoproj#1149) * Issue argoproj#1136 - Fix metadata for DAG with loops * Add slack badge to README (argoproj#1164) * Fix failing TestAddGlobalArtifactToScope unit test * Fix tests compilation error (argoproj#1157) * Replace exponential retry with poll (argoproj#1166) * add support for hostNetwork & dnsPolicy config (argoproj#1161) * Support HDFS Artifact (argoproj#1159) Support HDFS Artifact (argoproj#1159) * Update codegen for network config (argoproj#1168) * Add GitHub to users in README.md (argoproj#1151) * Add Preferred Networks to users in README.md (argoproj#1172) * Add missing patch in namespace kustomization.yaml 
(argoproj#1170) * Validate ArchiveLocation artifacts (argoproj#1167) * Update README and preview notice in CLA. * Update README. (argoproj#1173) (argoproj#1176) * Argo users: Equinor (argoproj#1175) * Do not mount unnecessary docker socket (argoproj#1178) * Issue argoproj#1113 - Wait for daemon pods completion to handle annotations (argoproj#1177) * Issue argoproj#1113 - Wait for daemon pods completion to handle annotations * Add output artifacts to influxdb-ci example * Increased S3 artifact retry time and added log (argoproj#1138) * Issue argoproj#1123 - Fix 'kubectl get' failure if resource namespace is different from workflow namespace (argoproj#1171) * Refactor Makefile/Dockerfile to remove volume binding in favor of build stages (argoproj#1189) * Add Docker Hub build hooks * Add documentation how to use parameter-file's (argoproj#1191) * Issue argoproj#988 - Submit should not print logs to stdout unless output is 'wide' (argoproj#1192) * Fix missing docker binary in argoexec image. Improve reuse of image layers * Fischerjulian adds ruby to rest docs (argoproj#1196) * Adds link to ruby kubernetes library. * Links to a ruby example on how to start a workflow * Updated OWNERS (argoproj#1198) * Update community/README (argoproj#1197) * Issue argoproj#1128 - Use polling instead of fs notify to get annotation changes (argoproj#1194) * Minor spelling, formatting, and style updates. 
(argoproj#1193) * Dockerfile: argoexec base image correction (fixes argoproj#1209) (argoproj#1213) * Set executor image pull policy for resource template (argoproj#1174) * Add schedulerName to workflow and template spec (argoproj#1184) * Issue argoproj#1190 - Fix incorrect retry node handling (argoproj#1208) * fix dag retries (argoproj#1221) * Executor can access the k8s apiserver with a out-of-cluster config file (argoproj#1134) Executor can access the k8s apiserver with a out-of-cluster config file * Update README with typo fixes (argoproj#1220) * Update README.md (argoproj#1236) * Remove extra quotes around output parameter value (argoproj#1232) Ensure we do not insert extra single quotes when using valueFrom: jsonPath to set the value of an output parameter for resource templates. Signed-off-by: Ilias Katsakioris <elikatsis@arrikto.com> * Update README.md (argoproj#1224) * Include stderr when retrieving docker logs (argoproj#1225) * Add Gardener to "Who uses Argo" (argoproj#1228) * Add feature to continue workflow on failed/error steps/tasks (argoproj#1205) * Fix the Prometheus address references (argoproj#1237) * Fixed Issue#1223 Kubernetes Resource action: patch is not supported (argoproj#1245) * Fixed Issue#1223 Kubernetes Resource action: patch is not supported This PR is fixed the Issue#1223 reported by @shanesiebken . Argo kubernetes resource workflow failed on patch action. --patch or -p option is required for kubectl patch action. This PR is including the manifest yaml as patch argument for kubectl. This Fix will support the Patch action in Argo kubernetes resource workflow. This Fix will support only JSON merge strategic in patch action * udpated formating * typo, executo -> executor (argoproj#1243) * Issue#1165 fake outputs don't notify and task completes successfully (argoproj#1247) * Issue#1165 fake outputs don't notify and task completes successfully This PR is addressing the Issue#1165 reported by @alexfrieden. 
Issue/Bug: Argo is finishing the task successfully even artifact /file does exist. Fix: Validate the created gzip contains artifact or file. if file/artifact doesn't exist, Current step/stage/task will be failed with log message . Sample Log: ''' INFO[0029] Updating node artifact-passing-lkvj8[0].generate-artifact (artifact-passing-lkvj8-1949982165) status Running -> Error INFO[0029] Updating node artifact-passing-lkvj8[0].generate-artifact (artifact-passing-lkvj8-1949982165) message: failed to save outputs: File or Artifact does not exist. /tmp/hello_world.txt INFO[0029] Step group node artifact-passing-lkvj8[0] (artifact-passing-lkvj8-1067333159) deemed failed: child 'artifact-passing-lkvj8-1949982165' failed namespace=default workflow=artifact-passing-lkvj8 INFO[0029] node artifact-passing-lkvj8[0] (artifact-passing-lkvj8-1067333159) phase Running -> Failed namespace=default workflow=artifact-passing-lkvj8 ''' * fixed gometalinter errcheck issue * Git cloning via SSH was not verifying host public key (argoproj#1261) * Update versions (argoproj#1218) * Proxy Priority and PriorityClassName to pods (argoproj#1179) * Error running 1000s of tasks: "etcdserver: request is too large" argoproj#1186 (argoproj#1264) * Error running 1000s of tasks: "etcdserver: request is too large" argoproj#1186 This PR is addressing the feature request argoproj#1186. Issue: Nodestatus element keeps growing for big workflow. Workflow will fail once the workflow total size reachs 1 MB (maz size limit in ETCD) . Solution: Compressing the Nodestatus once size reachs the 1 MB which increasing 60% to 80% more steps to execute in compress mode. Latest: Argo cli and Argo UI will able to decode and print nodestatus from compressednoode. 
Limitation: Kubectl willl not decode the compressedNode element * added Operator.go * revert the testing yaml * Fixed the lint issue * fixed * fixed lint * Fixed Testcase * incorporated the review comments * Reverted the change * incorporated review comments * fixing gometalinter checks * incorporated review comments * Update pod-limits.yaml * updated few comments * updated error message format * reverted unwanted files * Reduce redundancy pod label action (argoproj#1271) * Add the `mergeStrategy` option to resource patching (argoproj#1269) * This adds the ability to pass a mergeStrategy to a patch resource. this is valuable because the default merge strategy for kubernetes is 'strategic', which does not work with Custom Resources. * This also updates the resource example to demonstrate how it is used * Fix bug with DockerExecutor's CopyFile (argoproj#1275) The check to see if the source path was in the tgz archive was wrong when source path was a folder, the arguments to strings.Contains were inverted. * Add workflow labels and annotations global vars (argoproj#1280) * Argo CI is current inactive (argoproj#1285) * Issue#896 Workflow steps with non-existant output artifact path will succeed (argoproj#1277) * Issue#896 Workflow steps with non-existant output artifact path will succeed Issue: argoproj#897 Solution: Added new element "optional" in Artifact. The default is false. This flag will make artifact as optional and existence check will be ignored if input/output artifact has optional=true. 
Output Artifact ( optional=true ): Artifact existence check will be ignored during the save artifact in destination and continued workflow Input Artifact ( optional=true ): Artifact exist check will be ignored during load artifact from source and continued workflow * added end of line * removed unwanted whitespace * Deleted test code * go formatted * added formatting directives * updated Codegen * Fixed format on merge conflict * format fix * updated comments * improved error case * Fix for Resource creation where template has same parameter templating (argoproj#1283) * Fix for Resource creation where template has same parameter templating This PR will enable to support the custom template variable reference. Soulltion: Workflow variable reference resolve will check the Workflow variable prefix. * added test * fixed gofmt issue * fixed format * fixed gofmt on common.go * fixed testcase * fixed gofmt * Added unit testcase and documented * fixed Gofmt format * updated comments * Admiralty: add link to blog post, add user (argoproj#1295) * Add dns config support (argoproj#1301) * Speed up podReconciliation using parallel goroutine (argoproj#1286) * Speed up podReconciliation using parallel goroutine * Fix make lint issue * put checkandcompress back * Add community meeting notes link (argoproj#1304) * Add Karius to users in README.md (argoproj#1305) * Added support for artifact path references (argoproj#1300) * Added support for artifact path references Adds new `{{inputs.artifacts.<NAME>.path}}` and `{{outputs.artifacts.<NAME>.path}}` placeholders. 
* Add support for init containers (argoproj#1183) * Secrets should be passed to pods using volumes instead of API calls (argoproj#1302) * Secrets should be passed to pods using downward API instead of API calls * Fixed Gogfmt format * fixed file close Gofmt * updated review comments * fixed gofmt * updated review comments * CheckandEstimate implementation to optimize podReconciliation (argoproj#1308) * CheckandEstimate implementation * fixed variable rename * fixed gofmt * fixed feedbacks * Update operator.go * Update operator.go * Add alibaba cloud to officially using argo list (argoproj#1313) * Refactor checkandEstimate to optimize podReconciliation (argoproj#1311) * Refactor checkandEstimate to optimize podReconciliation * Move compress function to persistUpdates * Fix formatting issues in examples documentation (argoproj#1310) * Fix nil pointer dereference with secret volumes (argoproj#1314) * Archive location should conditionally be added to template only when needed * Fix SIGSEGV in watch/CheckAndDecompress. Consolidate duplicate code (resolves argoproj#1315) * Implement support for PNS (Process Namespace Sharing) executor (argoproj#1214) * Implements PNS (Process Namespace Sharing) executor * Adds limited support for Kubelet/K8s API artifact collection by mirroring volume mounts to wait sidecar * Adds validation to detect when output artifacts are not supported by the executor * Adds ability to customize executor from workflow-controller-configmap (e.g. 
add environment variables, append command line args such as loglevel) * Fixes an issue where daemon steps were not getting terminated properly * Reorganize manifests to kustomize 2 and update version to v2.3.0-rc1 * Update v2.3.0 CHANGELOG.md * Export the methods of `KubernetesClientInterface` (argoproj#1294) All calls to these methods previously generated a panic at runtime because the calls resolved to the default, panic-always implementation, not to the overrides provided by `k8sAPIClient` and `kubeletClient`. Embedding an exported interface with unexported methods into a struct is the only way to implement that interface in another package. When doing this, the compiler generates default, panic-always implementations for all methods from the interface. Implementors can override exported methods, but it's not possible to override an unexported method from the interface. All invocations that go through the interface will come to the default implementation, even if the struct tries to provide an override. * Update README.md (argoproj#1321) * Issue1316 Pod creation with secret volumemount (argoproj#1318) * CheckandEstimate implementation * fixed variable rename * fixed gofmt * fixed feedbacks * Fixed the duplicate mountpath issue * Support parameter substitution in the volumes attribute (argoproj#1238) * `argo list` was not displaying non-zero priorities correctly * Fix regression where argoexec wait would not return when podname was too long * wait will conditionally become privileged if main/sidecar privileged (resolves argoproj#1323) * Update version to v2.3.0-rc2. Update changelog * Add documentation on releasing * use a secret selector for getting credentials * fixing build issues * linter issues * fixing jenkinsfile(?) * jenkins * jenkins * jenkins * jenkins * jenkins? 
* jenkins :( * jenkins :( * jenkins * jenkins * jenkins * jenkins * gopkg * use GetSecretFromVolMount instead of GetSecrets * actually build argoexec * Fix argoproj#1340 parameter substitution bug Signed-off-by: Ilias Katsakioris <elikatsis@arrikto.com> * fixing gcs upload method
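The note on exporting the methods of `KubernetesClientInterface` (argoproj#1294) describes calls resolving to a "default, panic-always implementation" when a struct embeds an interface and does not override a method. A minimal single-file sketch of that behaviour follows. It is not the Argo code: `Client`, `wrapper`, `overriding` and `callSafely` are hypothetical names. Because everything lives in one package, it cannot show the cross-package restriction on unexported methods. It only shows that a method promoted from a nil embedded interface panics, while an exported override is used when one exists.

```go
package main

import "fmt"

// Client stands in for an interface such as KubernetesClientInterface.
// The name and its single method are hypothetical, chosen for this sketch.
type Client interface {
	GetLogs() string
}

// wrapper embeds the interface but never assigns it and never overrides
// GetLogs, so the promoted method forwards to a nil interface value.
type wrapper struct {
	Client
}

// overriding embeds the same interface but declares its own exported method,
// which takes precedence over the promoted one.
type overriding struct {
	Client
}

func (overriding) GetLogs() string { return "logs from override" }

// callSafely invokes GetLogs through the interface and reports whether the
// call panicked instead of returning.
func callSafely(c Client) (out string, panicked bool) {
	defer func() {
		if r := recover(); r != nil {
			panicked = true
		}
	}()
	return c.GetLogs(), false
}

func main() {
	_, p1 := callSafely(wrapper{})
	out, p2 := callSafely(overriding{})
	fmt.Println("wrapper panicked:", p1)
	fmt.Println("overriding returned:", out, "panicked:", p2)
}
```

With unexported interface methods, a type in another package cannot declare the override at all, which is why the fix was to export them.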
* wip * wip * initial working version of gcs artifact storage * addressing pr feedback * updating codegen * wip * fixing issue with workflow saving * check to see if stat result is nil * adding a jenkinsfile * a small change which will hopefully speed up jenkins builds a lot * cleanup of docker push logic * cleanup of docker push logic * cleanup of docker push logic * cleanup of docker push logic * cleanup of docker push logic * changing the import path * preserving original link in a readme * use semantic version tagging (#9) * [CSE-11] adding config file loader (#10) * adding configmap loader * PR #10 should have been a minor version not a patch (#11) * adding autodeploy to jenkinsfile (#12) * fixing autodeployments (#13) * [CSE-13] extended error handling for workflows (#16) * wip * mixed case imports cause all sorts of problems, switch to lowercase * fixing build issu * fixing error deserialization * fixing error deserialization * unmatched string logic * make workflows fail on error trigger * properly evaluate workflow failures * dev version bump * serialize errors and warnings into wf crd * ugh go types * rewriting error handling to support file sources * temporarily commenting out a test * fixing warning handler * fixing error handling fixing executor fixing executor Fixing executor fixing executor fixing executor fixing executor asdf fixing executor fixing operator operator fixing executor cleanup * cleaning up types * updating codegen * fixing version * updating codegen * add podname and stage name to error result * version 2.5.0->2.4.0 * ErrorCondition->ExceptionCondition * codegen * [CS-14] merging UI update into rc (#18) * update node * UI tweaks * fixing a comment * versionbump * fixing build errors * [CSE-57] Upgrade argo (#24) * Updated ARTIFACT_REPO.md (argoproj#1049) * Updated examples/README.md (argoproj#1051) * Support for K8s API based Executor (argoproj#1010) * Submodules are dirty after checkout -- need to update (argoproj#1052) * Parameter 
and Argument names should support snake case (argoproj#1048) * Add namespace explicitly to pod metadata (argoproj#1059) * Update dependencies to K8s v1.12 and client-go 9.0 * Adding SAP Hybris in Who uses Argo (argoproj#1064) * Add Cratejoy to list of users (argoproj#1063) * Raise not implemented error when artifact saving is unsupported (argoproj#1062) * Adding native GCS support for artifact storage and retrieval * Support nested steps workflow parallelism (argoproj#1046) * Auto-complete workflow names (argoproj#1061) * Auto-complete workflow names * Use cobra revision at fe5e611709b0c57fa4a89136deaa8e1d4004d053 * Fix string format arguments in workflow utilities. (argoproj#1070) * fix argoproj#1078 Azure AKS authentication issues (argoproj#1079) * Issue argoproj#740 - System level workflow parallelism limits & priorities (argoproj#1065) * Issue argoproj#740 - System level workflow parallelism limits & priorities * Apply reviewer notes * Add new article and minor edits. (argoproj#1083) * Update docs to outline bare minimum set of privileges for a workflow * Use relative links on README file (argoproj#1087) * Fix typo in demo.md (argoproj#1089) Fix a small typo in demo.md that I encounted when reading through the getting started guide. * Drop reference to removed `argo install` command. (argoproj#1074) * Initialize child node before marking phase. Fixes panic on invalid `When` (argoproj#1075) * argoproj#1081 added retry logic to s3 load and save function (argoproj#1082) * adding logo to be used by the OS Site (argoproj#1099) * Update ROADMAP.md * Update docs with examples using the K8s REST API * Issue argoproj#1114 - Set FORCE_NAMESPACE_ISOLATION env variable in namespace install manifests (argoproj#1116) * Fix examples docs of parameters. 
(argoproj#1110) * Remove docker_lib mount volume which is not needed anymore (argoproj#1115) * Remove docker_lib mount volume which is not needed anymore * Remove unused hostPathDir * add support for ppc64le and s390x (argoproj#1102) * Install mime-support in argoexec to set proper mime types for S3 artifacts (resolves argoproj#1119) * Adding Quantibio in Who uses Argo (argoproj#1111) * Adding Quantibio in Who uses Argo * fix spelling mistake * Fix output artifact and parameter conflict (argoproj#1125) `SaveArtifacts` deletes the files that `SaveParameters` might still need, so we're calling `SaveParameters` first. Fixes argoproj#1124 * Update generated swagger to fix verify-codegen (argoproj#1131) * Allow owner reference to be set in submit util (argoproj#1120) * Issue argoproj#1104 - Remove container wait timeout from 'argo logs --follow' (argoproj#1142) * Issue argoproj#1132 - Fix panic in ttl controller (argoproj#1143) * Issue argoproj#1040 - Kill daemoned step if workflow consist of single daemoned step (argoproj#1144) * Fix global artifact overwriting in nested workflow (argoproj#1086) * Fix issue where steps with exhausted retires would not complete (argoproj#1148) * add support for other archs (argoproj#1137) * Reflect minio chart changes in documentation (argoproj#1147) * Issue argoproj#1136 - Fix metadata for DAG with loops (argoproj#1149) * Issue argoproj#1136 - Fix metadata for DAG with loops * Add slack badge to README (argoproj#1164) * Fix failing TestAddGlobalArtifactToScope unit test * Fix tests compilation error (argoproj#1157) * Replace exponential retry with poll (argoproj#1166) * add support for hostNetwork & dnsPolicy config (argoproj#1161) * Support HDFS Artifact (argoproj#1159) Support HDFS Artifact (argoproj#1159) * Update codegen for network config (argoproj#1168) * Add GitHub to users in README.md (argoproj#1151) * Add Preferred Networks to users in README.md (argoproj#1172) * Add missing patch in namespace kustomization.yaml 
(argoproj#1170) * Validate ArchiveLocation artifacts (argoproj#1167) * Update README and preview notice in CLA. * Update README. (argoproj#1173) (argoproj#1176) * Argo users: Equinor (argoproj#1175) * Do not mount unnecessary docker socket (argoproj#1178) * Issue argoproj#1113 - Wait for daemon pods completion to handle annotations (argoproj#1177) * Issue argoproj#1113 - Wait for daemon pods completion to handle annotations * Add output artifacts to influxdb-ci example * Increased S3 artifact retry time and added log (argoproj#1138) * Issue argoproj#1123 - Fix 'kubectl get' failure if resource namespace is different from workflow namespace (argoproj#1171) * Refactor Makefile/Dockerfile to remove volume binding in favor of build stages (argoproj#1189) * Add Docker Hub build hooks * Add documentation how to use parameter-file's (argoproj#1191) * Issue argoproj#988 - Submit should not print logs to stdout unless output is 'wide' (argoproj#1192) * Fix missing docker binary in argoexec image. Improve reuse of image layers * Fischerjulian adds ruby to rest docs (argoproj#1196) * Adds link to ruby kubernetes library. * Links to a ruby example on how to start a workflow * Updated OWNERS (argoproj#1198) * Update community/README (argoproj#1197) * Issue argoproj#1128 - Use polling instead of fs notify to get annotation changes (argoproj#1194) * Minor spelling, formatting, and style updates. 
(argoproj#1193) * Dockerfile: argoexec base image correction (fixes argoproj#1209) (argoproj#1213) * Set executor image pull policy for resource template (argoproj#1174) * Add schedulerName to workflow and template spec (argoproj#1184) * Issue argoproj#1190 - Fix incorrect retry node handling (argoproj#1208) * fix dag retries (argoproj#1221) * Executor can access the k8s apiserver with a out-of-cluster config file (argoproj#1134) Executor can access the k8s apiserver with a out-of-cluster config file * Update README with typo fixes (argoproj#1220) * Update README.md (argoproj#1236) * Remove extra quotes around output parameter value (argoproj#1232) Ensure we do not insert extra single quotes when using valueFrom: jsonPath to set the value of an output parameter for resource templates. Signed-off-by: Ilias Katsakioris <elikatsis@arrikto.com> * Update README.md (argoproj#1224) * Include stderr when retrieving docker logs (argoproj#1225) * Add Gardener to "Who uses Argo" (argoproj#1228) * Add feature to continue workflow on failed/error steps/tasks (argoproj#1205) * Fix the Prometheus address references (argoproj#1237) * Fixed Issue#1223 Kubernetes Resource action: patch is not supported (argoproj#1245) * Fixed Issue#1223 Kubernetes Resource action: patch is not supported This PR is fixed the Issue#1223 reported by @shanesiebken . Argo kubernetes resource workflow failed on patch action. --patch or -p option is required for kubectl patch action. This PR is including the manifest yaml as patch argument for kubectl. This Fix will support the Patch action in Argo kubernetes resource workflow. This Fix will support only JSON merge strategic in patch action * udpated formating * typo, executo -> executor (argoproj#1243) * Issue#1165 fake outputs don't notify and task completes successfully (argoproj#1247) * Issue#1165 fake outputs don't notify and task completes successfully This PR is addressing the Issue#1165 reported by @alexfrieden. 
Issue/Bug: Argo is finishing the task successfully even artifact /file does exist. Fix: Validate the created gzip contains artifact or file. if file/artifact doesn't exist, Current step/stage/task will be failed with log message . Sample Log: ''' INFO[0029] Updating node artifact-passing-lkvj8[0].generate-artifact (artifact-passing-lkvj8-1949982165) status Running -> Error INFO[0029] Updating node artifact-passing-lkvj8[0].generate-artifact (artifact-passing-lkvj8-1949982165) message: failed to save outputs: File or Artifact does not exist. /tmp/hello_world.txt INFO[0029] Step group node artifact-passing-lkvj8[0] (artifact-passing-lkvj8-1067333159) deemed failed: child 'artifact-passing-lkvj8-1949982165' failed namespace=default workflow=artifact-passing-lkvj8 INFO[0029] node artifact-passing-lkvj8[0] (artifact-passing-lkvj8-1067333159) phase Running -> Failed namespace=default workflow=artifact-passing-lkvj8 ''' * fixed gometalinter errcheck issue * Git cloning via SSH was not verifying host public key (argoproj#1261) * Update versions (argoproj#1218) * Proxy Priority and PriorityClassName to pods (argoproj#1179) * Error running 1000s of tasks: "etcdserver: request is too large" argoproj#1186 (argoproj#1264) * Error running 1000s of tasks: "etcdserver: request is too large" argoproj#1186 This PR is addressing the feature request argoproj#1186. Issue: Nodestatus element keeps growing for big workflow. Workflow will fail once the workflow total size reachs 1 MB (maz size limit in ETCD) . Solution: Compressing the Nodestatus once size reachs the 1 MB which increasing 60% to 80% more steps to execute in compress mode. Latest: Argo cli and Argo UI will able to decode and print nodestatus from compressednoode. 
Limitation: Kubectl willl not decode the compressedNode element * added Operator.go * revert the testing yaml * Fixed the lint issue * fixed * fixed lint * Fixed Testcase * incorporated the review comments * Reverted the change * incorporated review comments * fixing gometalinter checks * incorporated review comments * Update pod-limits.yaml * updated few comments * updated error message format * reverted unwanted files * Reduce redundancy pod label action (argoproj#1271) * Add the `mergeStrategy` option to resource patching (argoproj#1269) * This adds the ability to pass a mergeStrategy to a patch resource. this is valuable because the default merge strategy for kubernetes is 'strategic', which does not work with Custom Resources. * This also updates the resource example to demonstrate how it is used * Fix bug with DockerExecutor's CopyFile (argoproj#1275) The check to see if the source path was in the tgz archive was wrong when source path was a folder, the arguments to strings.Contains were inverted. * Add workflow labels and annotations global vars (argoproj#1280) * Argo CI is current inactive (argoproj#1285) * Issue#896 Workflow steps with non-existant output artifact path will succeed (argoproj#1277) * Issue#896 Workflow steps with non-existant output artifact path will succeed Issue: argoproj#897 Solution: Added new element "optional" in Artifact. The default is false. This flag will make artifact as optional and existence check will be ignored if input/output artifact has optional=true. 
Output Artifact ( optional=true ): Artifact existence check will be ignored during the save artifact in destination and continued workflow Input Artifact ( optional=true ): Artifact exist check will be ignored during load artifact from source and continued workflow * added end of line * removed unwanted whitespace * Deleted test code * go formatted * added formatting directives * updated Codegen * Fixed format on merge conflict * format fix * updated comments * improved error case * Fix for Resource creation where template has same parameter templating (argoproj#1283) * Fix for Resource creation where template has same parameter templating This PR will enable to support the custom template variable reference. Soulltion: Workflow variable reference resolve will check the Workflow variable prefix. * added test * fixed gofmt issue * fixed format * fixed gofmt on common.go * fixed testcase * fixed gofmt * Added unit testcase and documented * fixed Gofmt format * updated comments * Admiralty: add link to blog post, add user (argoproj#1295) * Add dns config support (argoproj#1301) * Speed up podReconciliation using parallel goroutine (argoproj#1286) * Speed up podReconciliation using parallel goroutine * Fix make lint issue * put checkandcompress back * Add community meeting notes link (argoproj#1304) * Add Karius to users in README.md (argoproj#1305) * Added support for artifact path references (argoproj#1300) * Added support for artifact path references Adds new `{{inputs.artifacts.<NAME>.path}}` and `{{outputs.artifacts.<NAME>.path}}` placeholders. 
* Add support for init containers (argoproj#1183) * Secrets should be passed to pods using volumes instead of API calls (argoproj#1302) * Secrets should be passed to pods using downward API instead of API calls * Fixed Gogfmt format * fixed file close Gofmt * updated review comments * fixed gofmt * updated review comments * CheckandEstimate implementation to optimize podReconciliation (argoproj#1308) * CheckandEstimate implementation * fixed variable rename * fixed gofmt * fixed feedbacks * Update operator.go * Update operator.go * Add alibaba cloud to officially using argo list (argoproj#1313) * Refactor checkandEstimate to optimize podReconciliation (argoproj#1311) * Refactor checkandEstimate to optimize podReconciliation * Move compress function to persistUpdates * Fix formatting issues in examples documentation (argoproj#1310) * Fix nil pointer dereference with secret volumes (argoproj#1314) * Archive location should conditionally be added to template only when needed * Fix SIGSEGV in watch/CheckAndDecompress. Consolidate duplicate code (resolves argoproj#1315) * Implement support for PNS (Process Namespace Sharing) executor (argoproj#1214) * Implements PNS (Process Namespace Sharing) executor * Adds limited support for Kubelet/K8s API artifact collection by mirroring volume mounts to wait sidecar * Adds validation to detect when output artifacts are not supported by the executor * Adds ability to customize executor from workflow-controller-configmap (e.g. 
add environment variables, append command line args such as loglevel) * Fixes an issue where daemon steps were not getting terminated properly * Reorganize manifests to kustomize 2 and update version to v2.3.0-rc1 * Update v2.3.0 CHANGELOG.md * Export the methods of `KubernetesClientInterface` (argoproj#1294) All calls to these methods previously generated a panic at runtime because the calls resolved to the default, panic-always implementation, not to the overrides provided by `k8sAPIClient` and `kubeletClient`. Embedding an exported interface with unexported methods into a struct is the only way to implement that interface in another package. When doing this, the compiler generates default, panic-always implementations for all methods from the interface. Implementors can override exported methods, but it's not possible to override an unexported method from the interface. All invocations that go through the interface will come to the default implementation, even if the struct tries to provide an override. * Update README.md (argoproj#1321) * Issue1316 Pod creation with secret volumemount (argoproj#1318) * CheckandEstimate implementation * fixed variable rename * fixed gofmt * fixed feedbacks * Fixed the duplicate mountpath issue * Support parameter substitution in the volumes attribute (argoproj#1238) * `argo list` was not displaying non-zero priorities correctly * Fix regression where argoexec wait would not return when podname was too long * wait will conditionally become privileged if main/sidecar privileged (resolves argoproj#1323) * Update version to v2.3.0-rc2. Update changelog * Add documentation on releasing * use a secret selector for getting credentials * fixing build issues * linter issues * fixing jenkinsfile(?) * jenkins * jenkins * jenkins * jenkins * jenkins? 
* jenkins :( * jenkins :( * jenkins * jenkins * jenkins * jenkins * gopkg * use GetSecretFromVolMount instead of GetSecrets * actually build argoexec * Fix argoproj#1340 parameter substitution bug Signed-off-by: Ilias Katsakioris <elikatsis@arrikto.com> * fixing gcs upload method * disable autodeploy
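The DockerExecutor `CopyFile` fix listed above (argoproj#1275) came down to the argument order of `strings.Contains`. The sketch below only illustrates that class of bug. The helper names and the format of the archive listing are assumptions, not taken from the Argo source.

```go
package main

import (
	"fmt"
	"strings"
)

// listingHasPath reports whether an archive listing mentions sourcePath.
// strings.Contains(s, substr) asks "does s contain substr?", so the
// listing goes first and the path being searched for goes second.
func listingHasPath(listing, sourcePath string) bool {
	return strings.Contains(listing, sourcePath)
}

// invertedCheck swaps the arguments, asking whether the short path
// contains the whole listing. It is shown only to illustrate the bug.
func invertedCheck(listing, sourcePath string) bool {
	return strings.Contains(sourcePath, listing)
}

func main() {
	// Hypothetical listing for a folder artifact with one file in it.
	listing := "outputs/\noutputs/result.txt\n"

	fmt.Println("correct order: ", listingHasPath(listing, "outputs"))
	fmt.Println("inverted order:", invertedCheck(listing, "outputs"))
}
```

With the inverted order, the check can only succeed when the listing is itself a substring of the path, so a folder whose listing has several entries is never found.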
* Update version to v2.3.0-rc2.
Update changelog * Add documentation on releasing * Fix missing template local volumes, Handle volumes only used in init containers (argoproj#1342) * Fix argoproj#1340 parameter substitution bug (argoproj#1345) Also create podParams map in substitutePodParams Signed-off-by: Ilias Katsakioris <elikatsis@arrikto.com> * add / test (argoproj#1240) * Fix input artifacts with multiple ssh keys (argoproj#1338) * Fixed : Validate the secret credentials name and key (argoproj#1358) * CheckandEstimate implementation * fixed variable rename * fixed gofmt * fixed feedbacks * Fixed Issue1355 * fixed style * Delete e2e_temp.tmp * Fix: # 1328 argo submit --wait and argo wait quits while workflow is running (argoproj#1347) * CheckandEstimate implementation * fixed variable rename * fixed gofmt * fixed feedbacks * Fixed argo submit --wait and argo wait quits while workflow is running * fixed Style * Update version to v2.3.0-rc3 * Update release instructions * Fix issue where a DAG with exhausted retries would get stuck Running (argoproj#1364) * Update VERSION to v2.3.0, changelog, and manifests * reverting unintentional change to dockerfile * fixing vendor problems
* Validate ArchiveLocation artifacts (argoproj#1167) * Update README and preview notice in CLA. * Update README. (argoproj#1173) (argoproj#1176) * Argo users: Equinor (argoproj#1175) * Do not mount unnecessary docker socket (argoproj#1178) * Issue argoproj#1113 - Wait for daemon pods completion to handle annotations (argoproj#1177) * Issue argoproj#1113 - Wait for daemon pods completion to handle annotations * Add output artifacts to influxdb-ci example * Increased S3 artifact retry time and added log (argoproj#1138) * Issue argoproj#1123 - Fix 'kubectl get' failure if resource namespace is different from workflow namespace (argoproj#1171) * Refactor Makefile/Dockerfile to remove volume binding in favor of build stages (argoproj#1189) * Add Docker Hub build hooks * Add documentation how to use parameter-file's (argoproj#1191) * Issue argoproj#988 - Submit should not print logs to stdout unless output is 'wide' (argoproj#1192) * Fix missing docker binary in argoexec image. Improve reuse of image layers * Fischerjulian adds ruby to rest docs (argoproj#1196) * Adds link to ruby kubernetes library. * Links to a ruby example on how to start a workflow * Updated OWNERS (argoproj#1198) * Update community/README (argoproj#1197) * Issue argoproj#1128 - Use polling instead of fs notify to get annotation changes (argoproj#1194) * Minor spelling, formatting, and style updates. 
(argoproj#1193) * Dockerfile: argoexec base image correction (fixes argoproj#1209) (argoproj#1213) * Set executor image pull policy for resource template (argoproj#1174) * Add schedulerName to workflow and template spec (argoproj#1184) * Issue argoproj#1190 - Fix incorrect retry node handling (argoproj#1208) * fix dag retries (argoproj#1221) * Executor can access the k8s apiserver with a out-of-cluster config file (argoproj#1134) Executor can access the k8s apiserver with a out-of-cluster config file * Update README with typo fixes (argoproj#1220) * Update README.md (argoproj#1236) * Remove extra quotes around output parameter value (argoproj#1232) Ensure we do not insert extra single quotes when using valueFrom: jsonPath to set the value of an output parameter for resource templates. Signed-off-by: Ilias Katsakioris <elikatsis@arrikto.com> * Update README.md (argoproj#1224) * Include stderr when retrieving docker logs (argoproj#1225) * Add Gardener to "Who uses Argo" (argoproj#1228) * Add feature to continue workflow on failed/error steps/tasks (argoproj#1205) * Fix the Prometheus address references (argoproj#1237) * Fixed Issue#1223 Kubernetes Resource action: patch is not supported (argoproj#1245) * Fixed Issue#1223 Kubernetes Resource action: patch is not supported This PR is fixed the Issue#1223 reported by @shanesiebken . Argo kubernetes resource workflow failed on patch action. --patch or -p option is required for kubectl patch action. This PR is including the manifest yaml as patch argument for kubectl. This Fix will support the Patch action in Argo kubernetes resource workflow. This Fix will support only JSON merge strategic in patch action * udpated formating * typo, executo -> executor (argoproj#1243) * Issue#1165 fake outputs don't notify and task completes successfully (argoproj#1247) * Issue#1165 fake outputs don't notify and task completes successfully This PR is addressing the Issue#1165 reported by @alexfrieden. 
Issue/Bug: Argo is finishing the task successfully even artifact /file does exist. Fix: Validate the created gzip contains artifact or file. if file/artifact doesn't exist, Current step/stage/task will be failed with log message . Sample Log: ''' INFO[0029] Updating node artifact-passing-lkvj8[0].generate-artifact (artifact-passing-lkvj8-1949982165) status Running -> Error INFO[0029] Updating node artifact-passing-lkvj8[0].generate-artifact (artifact-passing-lkvj8-1949982165) message: failed to save outputs: File or Artifact does not exist. /tmp/hello_world.txt INFO[0029] Step group node artifact-passing-lkvj8[0] (artifact-passing-lkvj8-1067333159) deemed failed: child 'artifact-passing-lkvj8-1949982165' failed namespace=default workflow=artifact-passing-lkvj8 INFO[0029] node artifact-passing-lkvj8[0] (artifact-passing-lkvj8-1067333159) phase Running -> Failed namespace=default workflow=artifact-passing-lkvj8 ''' * fixed gometalinter errcheck issue * Git cloning via SSH was not verifying host public key (argoproj#1261) * Update versions (argoproj#1218) * Proxy Priority and PriorityClassName to pods (argoproj#1179) * Error running 1000s of tasks: "etcdserver: request is too large" argoproj#1186 (argoproj#1264) * Error running 1000s of tasks: "etcdserver: request is too large" argoproj#1186 This PR is addressing the feature request argoproj#1186. Issue: Nodestatus element keeps growing for big workflow. Workflow will fail once the workflow total size reachs 1 MB (maz size limit in ETCD) . Solution: Compressing the Nodestatus once size reachs the 1 MB which increasing 60% to 80% more steps to execute in compress mode. Latest: Argo cli and Argo UI will able to decode and print nodestatus from compressednoode. 
Limitation: Kubectl willl not decode the compressedNode element * added Operator.go * revert the testing yaml * Fixed the lint issue * fixed * fixed lint * Fixed Testcase * incorporated the review comments * Reverted the change * incorporated review comments * fixing gometalinter checks * incorporated review comments * Update pod-limits.yaml * updated few comments * updated error message format * reverted unwanted files * Reduce redundancy pod label action (argoproj#1271) * Add the `mergeStrategy` option to resource patching (argoproj#1269) * This adds the ability to pass a mergeStrategy to a patch resource. this is valuable because the default merge strategy for kubernetes is 'strategic', which does not work with Custom Resources. * This also updates the resource example to demonstrate how it is used * Fix bug with DockerExecutor's CopyFile (argoproj#1275) The check to see if the source path was in the tgz archive was wrong when source path was a folder, the arguments to strings.Contains were inverted. * Add workflow labels and annotations global vars (argoproj#1280) * Argo CI is current inactive (argoproj#1285) * Issue#896 Workflow steps with non-existant output artifact path will succeed (argoproj#1277) * Issue#896 Workflow steps with non-existant output artifact path will succeed Issue: argoproj#897 Solution: Added new element "optional" in Artifact. The default is false. This flag will make artifact as optional and existence check will be ignored if input/output artifact has optional=true. 
Output Artifact ( optional=true ): Artifact existence check will be ignored during the save artifact in destination and continued workflow Input Artifact ( optional=true ): Artifact exist check will be ignored during load artifact from source and continued workflow * added end of line * removed unwanted whitespace * Deleted test code * go formatted * added formatting directives * updated Codegen * Fixed format on merge conflict * format fix * updated comments * improved error case * Fix for Resource creation where template has same parameter templating (argoproj#1283) * Fix for Resource creation where template has same parameter templating This PR will enable to support the custom template variable reference. Soulltion: Workflow variable reference resolve will check the Workflow variable prefix. * added test * fixed gofmt issue * fixed format * fixed gofmt on common.go * fixed testcase * fixed gofmt * Added unit testcase and documented * fixed Gofmt format * updated comments * Admiralty: add link to blog post, add user (argoproj#1295) * Add dns config support (argoproj#1301) * Speed up podReconciliation using parallel goroutine (argoproj#1286) * Speed up podReconciliation using parallel goroutine * Fix make lint issue * put checkandcompress back * Add community meeting notes link (argoproj#1304) * Add Karius to users in README.md (argoproj#1305) * Added support for artifact path references (argoproj#1300) * Added support for artifact path references Adds new `{{inputs.artifacts.<NAME>.path}}` and `{{outputs.artifacts.<NAME>.path}}` placeholders. 
* Add support for init containers (argoproj#1183) * Secrets should be passed to pods using volumes instead of API calls (argoproj#1302) * Secrets should be passed to pods using downward API instead of API calls * Fixed Gogfmt format * fixed file close Gofmt * updated review comments * fixed gofmt * updated review comments * CheckandEstimate implementation to optimize podReconciliation (argoproj#1308) * CheckandEstimate implementation * fixed variable rename * fixed gofmt * fixed feedbacks * Update operator.go * Update operator.go * Add alibaba cloud to officially using argo list (argoproj#1313) * Refactor checkandEstimate to optimize podReconciliation (argoproj#1311) * Refactor checkandEstimate to optimize podReconciliation * Move compress function to persistUpdates * Fix formatting issues in examples documentation (argoproj#1310) * Fix nil pointer dereference with secret volumes (argoproj#1314) * Archive location should conditionally be added to template only when needed * Fix SIGSEGV in watch/CheckAndDecompress. Consolidate duplicate code (resolves argoproj#1315) * Implement support for PNS (Process Namespace Sharing) executor (argoproj#1214) * Implements PNS (Process Namespace Sharing) executor * Adds limited support for Kubelet/K8s API artifact collection by mirroring volume mounts to wait sidecar * Adds validation to detect when output artifacts are not supported by the executor * Adds ability to customize executor from workflow-controller-configmap (e.g. 
add environment variables, append command line args such as loglevel) * Fixes an issue where daemon steps were not getting terminated properly * Reorganize manifests to kustomize 2 and update version to v2.3.0-rc1 * Update v2.3.0 CHANGELOG.md * Export the methods of `KubernetesClientInterface` (argoproj#1294) All calls to these methods previously generated a panic at runtime because the calls resolved to the default, panic-always implementation, not to the overrides provided by `k8sAPIClient` and `kubeletClient`. Embedding an exported interface with unexported methods into a struct is the only way to implement that interface in another package. When doing this, the compiler generates default, panic-always implementations for all methods from the interface. Implementors can override exported methods, but it's not possible to override an unexported method from the interface. All invocations that go through the interface will come to the default implementation, even if the struct tries to provide an override. * Update README.md (argoproj#1321) * Issue1316 Pod creation with secret volumemount (argoproj#1318) * CheckandEstimate implementation * fixed variable rename * fixed gofmt * fixed feedbacks * Fixed the duplicate mountpath issue * Support parameter substitution in the volumes attribute (argoproj#1238) * `argo list` was not displaying non-zero priorities correctly * Fix regression where argoexec wait would not return when podname was too long * wait will conditionally become privileged if main/sidecar privileged (resolves argoproj#1323) * Update version to v2.3.0-rc2. 
Update changelog * Add documentation on releasing * Fix missing template local volumes, Handle volumes only used in init containers (argoproj#1342) * Fix argoproj#1340 parameter substitution bug (argoproj#1345) Also create podParams map in substitutePodParams Signed-off-by: Ilias Katsakioris <elikatsis@arrikto.com> * add / test (argoproj#1240) * Fix input artifacts with multiple ssh keys (argoproj#1338) * Fixed : Validate the secret credentials name and key (argoproj#1358) * CheckandEstimate implementation * fixed variable rename * fixed gofmt * fixed feedbacks * Fixed Issue1355 * fixed style * Delete e2e_temp.tmp * Fix: # 1328 argo submit --wait and argo wait quits while workflow is running (argoproj#1347) * CheckandEstimate implementation * fixed variable rename * fixed gofmt * fixed feedbacks * Fixed argo submit --wait and argo wait quits while workflow is running * fixed Style * Update version to v2.3.0-rc3 * Update release instructions * Add --status filter for get command (argoproj#1325) * Support an easy way to set owner reference (argoproj#1333) * Use golangci-lint instead of deprecated gometalinter (argoproj#1335) * [Fix argoproj#1242] Failed DAG nodes are now kept and set to running on RetryWorkflow. (argoproj#1250) * Fixed : CLI Does Not Honor metadata.namespace argoproj#1288 (argoproj#1352) * Validate action for resource templates (argoproj#1346) * Fix issue where a DAG with exhausted retries would get stuck Running (argoproj#1364) * Update README.md (argoproj#1372) Add Adevinta. https://www.adevinta.com/ * Update docs for the v2.3.0 release and to use the stable tag * Add Max Kelsen to USERS in README.md (argoproj#1374) Max Kelsen us utilising Argo throughout the organisation to manage data processing and machine learning pipelines. Incredibly thankful to the great community! 
* Fixed: Support hostAliases in WorkflowSpec argoproj#1265 (argoproj#1365) * Fixed : Support hostAliases in WorkflowSpec argoproj#1265 * Fixed: failed to save outputs: verify serviceaccount default:default has necessary privileges (argoproj#1362) Fixed: failed to save outputs: verify serviceaccount default:default has necessary privileges (argoproj#1362) * Fixed: make verify-codegen is failing on the master branch (argoproj#1399) (argoproj#1400) * Fixed: withParam parsing of JSON/YAML lists argoproj#1389 (argoproj#1397) * Make locating kubeconfig in example os independent (argoproj#1393) * Added Argo Rollouts to README (argoproj#1388) * Add Mirantis as an official user (argoproj#1401) * Update README.md (argoproj#1402) * Update README.md (argoproj#1404) Includes SAP Fieldglass in users section. * Fiixed: persistentvolumeclaims already exists argoproj#1130 (argoproj#1363) * Fixed: persistentvolumeclaims already exists argoproj#1130 * chore: add IBM to official users section in README.md (argoproj#1409) * Orders uses alphabetically (argoproj#1411) * Update OWNERS (argoproj#1429) * Typo fix in ARTIFACT_REPO.md (argoproj#1425) In the non-default artifact repo section, when showing the gcs example the bucket name said 'my-aws-bucket-name'. I've updated this to say 'my-gcs-bucket-name'. Super minor change but I've been banging my head against artifact repo outputs all day and this was bothering me. * Add OVH as official user (argoproj#1417) Add OVH as official user * Update demo.md (argoproj#1396) Step 2 instructs the user to create the namespace `argo`, and the coin-flip (at least) uses the service account `argo`, so it makes sense to provide `--serviceaccount=argo:argo` so that the initial experience works, "out of the box". 
* Fix typo (argoproj#1431) * PNS executor intermitently failed to capture entire log of script templates (argoproj#1406) * Terminate all containers within pod after main container completes (argoproj#1423) Resolves argoproj#1422 * Ability to configure hostPath mount for `/var/run/docker.sock` (argoproj#1419) * CheckandEstimate implementation * fixed variable rename * fixed gofmt * fixed feedbacks * implement the configurable Docker sock path * Update workflowpod.go * Style updated * Fixed: Implemented Template level service account (argoproj#1354) Fixed: Implemented Template level service account (argoproj#1354) * Add paging function for list command (argoproj#1420) * Add paging function for list command (argoproj#1420) * Revert "Update demo.md (argoproj#1396)" (argoproj#1433) This reverts commit 5635c33. * Update documentation for workflow.outputs.artifacts (argoproj#1439) * Improve bash completion (argoproj#1437) * Add threekit to user list (argoproj#1444) * Fix demo's doc issue of install minio chart (argoproj#1450) Signed-off-by: Aisuko <urakiny@gmail.com> * mention sidecar in failure message for sidecar containers (argoproj#1430) * Centralized Longterm workflow persistence storage (argoproj#1344) * Centralized Longterm workflow persistence storage implementaion * New Feature: provide failFast flag, allow a DAG to run all branches of the DAG (either success or failure) (argoproj#1443) * Fix bug: dag will missing some nodes when another branch node fails * Add test file * New Feature: provide failFast flag, allow a DAG to run all branches of the DAG (either success or failure) * Move failFast flag to DAG template spec * * Move test case file to test/e2e/expectedfailures since it is expected to fail * Remove unused check code * issue-1445: changing temp directory for output artifacts from root to tmp (argoproj#1458) * Support PodSecurityContext (argoproj#1463) * Add doc about failFast feature (argoproj#1453) * Added Codec to the Argo community list 
(argoproj#1477) * fix typo: symboloic > symbolic (argoproj#1478) * Add --no-color flag to logs (argoproj#1479) * Fix failFast bug: When a node in the middle fails, the entire workflow will hang (argoproj#1468) * Document the insecureIgnoreHostKey git flag (argoproj#1483) * Fix: 1008 `argo wait` and `argo submit --wait` should exit 1 if workflow fails (argoproj#1467) Fix: 1008 `argo wait` and `argo submit --wait` should exit 1 if workflow fails (argoproj#1467) * Update OWNERS (argoproj#1485) * Add Commodus Tech as official user (argoproj#1484) * Fix: Argo CLI should show warning if there is no workflow definition in file argoproj#1486 Fix: Argo CLI should show warning if there is no workflow definition in file argoproj#1486 * Exposed workflow priority as a variable (argoproj#1476) * Fix argoproj#1366 unpredictable global artifact behavior (argoproj#1461) * Fix: Support the List within List type in withParam argoproj#1471 (argoproj#1473) Fix: Support the List within List type in withParam argoproj#1471 (argoproj#1473) * Fix a compiler error (argoproj#1500) Fix a compiler error (argoproj#1500) * Readme update to add argo and airflow comparison (argoproj#1502) * Added argo vs airflow presentation * Update README.md * change 'continue-on-fail' example to better reflect its description (argoproj#1494) * Implemented Conditionally annotate outputs of script template only when consumed argoproj#1359 (argoproj#1462) * Fixed argoproj#1359 Implemented Conditionally annotate outputs of script template only when consumed * Allow output parameters with .value, not only .valueFrom (argoproj#1336) Fixed argoproj#1329 Allow output parameters with .value, not only .valueFrom (argoproj#1336) * Fix the lint target (argoproj#1505) Fix the lint target This fixes an issue with the `make lint` target, where if a developer has golangci-lint installed and also has linter errors, the linter fails with an error, causing the next case to fall through (and the old linter is run). 
This also fixes all of the linter errors that had somehow cropped up in the repo. * Fix a compiler error in a unit test (argoproj#1514) * Allow Makefile variables to be set from the command line (argoproj#1501) This changes the assignment operator of various Makefile variables from the recursive expansion operator (=) to the conditional assignment operator (?=), such that a developer can define their own values for those variables. This is highly valuable for a dev who wants to do local development with a local docker image * Fixed argoproj#1287 Executor kubectl version is obsolete (argoproj#1513) Fixed argoproj#1287 Executor kubectl version is obsolete (argoproj#1513) * Fix issue [Documentation] kubectl get service argo-artifacts -o wide (argoproj#1516) * Allow overriding workflow labels in 'argo submit' (argoproj#1475) * Support git shallow clones and additional ref fetches (argoproj#1521) Implemented a `depth` field for git artifact configuration that, when specified, will result in a shallow clone (and fetch) of the given number of commits from the branch tip. Implemented a `fetch` field for git artifact configuration that fetches the given refspecs prior to checkout. This is necessary when one wants to retrieve git revisions that exist in non-branch/-tag refs. The motivation for these features is to support retrieval of patchset refs from Gerrit code review (`refs/changes/[n]/[change]/[patch]`) but these new fields should provide more flexibility to anyone integrating with other git-based systems. * Add --dry-run option to `argo submit` (argoproj#1506) * Fix validation (argoproj#1508) * Implemented support for WorkflowSpec.ArtifactRepositoryRef (argoproj#1350) This change allows the workflow to specify the reference the configMap holding the artifact repository configuration. 
* Fix argo logs empty content when workflow run in virtual kubelet env (argoproj#1201) * Expose all input parameters to template as JSON (argoproj#1488) * WorkflowTemplate CRD (argoproj#1312) * Added Architecture doc (argoproj#1515) Fixed argoproj#894 Added Architecture doc (argoproj#1515) * Format sources and order imports with the help of goimports (argoproj#1504) * Update ISSUE_TEMPLATE.md (argoproj#1528) edit to follow to current README.md installation guides. * Introduce podGC strategy for deleting completed/successful pods (argoproj#1234) * Update CHANGELOG for v2.4 (argoproj#1531) * Update README.md (argoproj#1533) * Use cache to retrieve WorkflowTemplates (argoproj#1534) * Update argo dependencies to kubernetes v1.14 (argoproj#1530) * Update argo dependencies to kubernetes v1.14 * Update version to v2.4.0-rc1 * Update main.go (argoproj#1536) * Update main.go (argoproj#1536) * Remove GLog config from argo executor (argoproj#1537) * Remove GLog config from argo executor (argoproj#1537) * Initialize the wfClientset before using it (argoproj#1548) * docs(readme): fix workflow types link (argoproj#1560) * Optimize argo binary install documentation (argoproj#1563) * Document workflow controller dockerSockPath config (argoproj#1555) * Add coverage make target (argoproj#1557) * Fix issue saving outputs which overlap paths with inputs (argoproj#1567) * Support AutomountServiceAccountToken and executor specific service account(argoproj#1480) * added DataStax as an organization that uses Argo (argoproj#1576) * Fix inputs and arguments during template resolution (argoproj#1545) * Add entrypoint label to workflow default labels (argoproj#1550) * remove redundant codes (argoproj#1582) Signed-off-by: xiechengsheng <xie1995@whut.edu.cn> * Fix workflow template in namespaced controller (argoproj#1580) * Add workflow template permissions to namespaced deployment manifests * Use filtered shared informer factory for namespaced deployment * Regard resource templates as leaf 
nodes (argoproj#1593) This enables retryStrategy to be respected on resource templates. This closes argoproj#1370 * Update from github.com/ghodss/yaml to sigs.k8s.io/yaml (argoproj#1572) * Update Gopkg.toml and Gopkg.lock (argoproj#1596) * Issue1571 Support ability to assume IAM roles in S3 Artifacts (argoproj#1587) * Fixed: Ability to interface with S3 using assumed roles (session tokens) This PR fixes argoproj#1571 * Added retry around RuntimeExecutor.Wait call when waiting for main container completion (argoproj#1597) * Do not relocate the mounted docker.sock (argoproj#1607) The mount path of the docker.sock should not depend on the host path of the docker.sock * Fix DAG enable failFast will hang in some case (argoproj#1595) * Fix failFast will hang in some case * Increased Lint timeout (argoproj#1612) * Add merge keys to Workflow objects to allow for StrategicMergePatches (argoproj#1611) * Small code cleanup and add tests (argoproj#1562) * Added WorkflowStatus and NodeStatus types to the Open API Spec. 
(argoproj#1614) * Prevent controller from crashing due to glog writing to /tmp (argoproj#1613) * Updated the API Rule Violations list (argoproj#1618) * updated invite link (argoproj#1621) * Increase timeout of golangci-lint (argoproj#1623) * Store resolved templates (argoproj#1552) * Store resolved templates in node status * Update operator.go (argoproj#1630) * Update operator.go * update API * Fix retry workflow state (argoproj#1632) * Save stored template ID in nodes (argoproj#1631) * Grant get secret role to controller to support persistence (argoproj#1615) * Regenerate installation manifests (argoproj#1638) * Update CHANGELOG for v2.4.0 (argoproj#1636) * Update version to v2.4.0 * Add back SetGlogLevel calls * Fix regression where parallelism could cause workflow to fail (argoproj#1639) * Fix regression where global outputs were unresolveable in DAGs (argoproj#1640) * Fix global lint issue (argoproj#1641) * pin colinmarc/hdfs to the next commit, which no longer has vendored deps (argoproj#1622) * Delay killing sidecars until artifacts are saved (argoproj#1645) * fixed example wrong comment (argoproj#1643) * Fix missing merged changes in validate.go (argoproj#1647) * Fix DAG output aggregation (argoproj#1648) * Fix dag output aggregation correctly (argoproj#1649) * Use stored templates to raggregate step outputs (argoproj#1651) * Fix child node template handling (argoproj#1654) * Stop failing if artifact file exists, but empty (argoproj#1653) * Resolve WorkflowTemplate lazily (argoproj#1655) * Don't provision VM for empty artifacts (argoproj#1660) * Update version to v2.4.1 * Fix typo (argoproj#1679) * Handle sidecar killing properly (argoproj#1675) * Update README.md Argo Ansible role: Provisioning Argo Workflows on Kubernetes/OpenShift (argoproj#1673) * Handle retried node properly (argoproj#1669) * Store locally referenced template properly (argoproj#1670) * Update version to v2.4.2 * Fix issue that workflow.priority substitution didn't pass validation 
(argoproj#1690) * Added status of previous steps as variables (argoproj#1681) * Print multiple workflows in one command (argoproj#1650) * Fix retry node processing (argoproj#1694) * Apply Strategic merge patch against the pod spec (argoproj#1687) * fixed broke metrics endpoint per argoproj#1634 (argoproj#1695) * Fixed incorrect `pod.name` in retry pods (argoproj#1699) * Added ability to auto-resume from suspended state (argoproj#1715) * Filter workflows in list based on name prefix (argoproj#1721) * Support no-headers flag (argoproj#1760) * Refactoring Template Resolution Logic (argoproj#1744) * Fix retry node name issue on error (argoproj#1732) * Do not resolve remote templates in lint (argoproj#1787) * Handle operation level errors PVC in Retry (argoproj#1762) * Added hint when using certain tokens in when expressions (argoproj#1810) * Added hint when using certain tokens in when expressions * Minor * SSL enabled database connection for workflow repository (argoproj#1712) (argoproj#1756) * Error occurred on pod watch should result in an error on the wait container (argoproj#1776) * Update version to v2.4.3 * Update version to v2.4.3 * rename * fixing jenkins, committing extra changes * jenkins Co-authored-by: Daisuke Taniwaki <daisuketaniwaki@gmail.com> Co-authored-by: Ed Lee <edlee2121@users.noreply.github.com> Co-authored-by: Erik Parmann <eparmann@gmail.com> Co-authored-by: Alexander Matyushentsev <AMatyushentsev@gmail.com> Co-authored-by: kshamajain99 <kshamajain99@gmail.com> Co-authored-by: Jesse Suen <jessesuen@users.noreply.github.com> Co-authored-by: Marcin Karkocha <marcin.karkocha@outlook.com> Co-authored-by: Julian Fischer <ich@julianfischer.name> Co-authored-by: Anna Winkler <3526523+annawinkler@users.noreply.github.com> Co-authored-by: Ilias Katsakioris <elikatsis@arrikto.com> Co-authored-by: jdfalko <43558452+jdfalko@users.noreply.github.com> Co-authored-by: Greg Roodt <groodt@gmail.com> Co-authored-by: Naoto Migita <migggy@users.noreply.github.com> 
Co-authored-by: shahin <shahin@users.noreply.github.com> Co-authored-by: Tim Schrodi <tschrodi96@googlemail.com> Co-authored-by: Matthew Coleman <matthew.e.coleman@gmail.com> Co-authored-by: Saravanan Balasubramanian <33908564+sarabala1979@users.noreply.github.com> Co-authored-by: Nick Stott <nick@nickstott.com> Co-authored-by: Ismail Alidzhikov <i.alidjikov@gmail.com> Co-authored-by: Xianlu Bird <xianlubird@gmail.com> Co-authored-by: Ian Howell <ian.howell0@gmail.com> Co-authored-by: Fred Dubois <169247+duboisf@users.noreply.github.com> Co-authored-by: Johannes 'fish' Ziemke <github@freigeist.org> Co-authored-by: Adrien Trouillaud <adrienjt@users.noreply.github.com> Co-authored-by: xubofei1983 <39540637+xubofei1983@users.noreply.github.com> Co-authored-by: Alexey Volkov <alexey.volkov@ark-kun.com> Co-authored-by: Clemens Lange <clemens.lange@cern.ch> Co-authored-by: Chris Chambers <chris-chambers@users.noreply.github.com> Co-authored-by: Hideto Inamura <h.inamura0710@gmail.com> Co-authored-by: almariah <abdullahalmariah@gmail.com> Co-authored-by: Cristian Pop <cristian.pop3009@gmail.com> Co-authored-by: Jaime <yauma21@gmail.com> Co-authored-by: Jacob O'Farrell <jacob@maxkelsen.com> Co-authored-by: Ben Wells <b.v.wells@gmail.com> Co-authored-by: Paul Brit <paulbrit44@gmail.com> Co-authored-by: Brandon Steinman <brandon.steinman@sap.com> Co-authored-by: alex weidner <shimmerjs@us.ibm.com> Co-authored-by: Alex Collins <alexec@users.noreply.github.com> Co-authored-by: ianCambrio <50969109+ianCambrio@users.noreply.github.com> Co-authored-by: Jean-Louis Queguiner <jean-louis.queguiner@gadz.org> Co-authored-by: Stephen Steiner <ssteiner@juniper.net> Co-authored-by: Jonathon Belotti <jonathon@canva.com> Co-authored-by: Semjon Kopp <semjon.kopp@sap.com> Co-authored-by: Orion Delwaterman <delwaterman@gmail.com> Co-authored-by: Edwin Jacques <31151721+edwinpjacques@users.noreply.github.com> Co-authored-by: Ziyang Wang <wangziyang507@gmail.com> Co-authored-by: Aisuko 
<urakiny@gmail.com> Co-authored-by: tralexa <39952205+tralexa@users.noreply.github.com> Co-authored-by: Alex Capras <alexcapras@gmail.com> Co-authored-by: mark9white <mark@markwhite.com> Co-authored-by: Mostapha Sadeghipour Roudsari <sadeghipour@gmail.com> Co-authored-by: commodus-sebastien <37178563+commodus-sebastien@users.noreply.github.com> Co-authored-by: Mukulikak <mukulikak@gmail.com> Co-authored-by: Daniel Duvall <dan@mutual.io> Co-authored-by: Anes Benmerzoug <Anes.Benmerzoug@gmail.com> Co-authored-by: Christian Muehlhaeuser <muesli@gmail.com> Co-authored-by: hidekuro <hidekuro@users.noreply.github.com> Co-authored-by: jacky <jacky.wucheng@gmail.com> Co-authored-by: Brian Mericle <bpmericle@users.noreply.github.com> Co-authored-by: Takayuki Kasai <unblee@users.noreply.github.com> Co-authored-by: Xie.CS <xie1995@whut.edu.cn> Co-authored-by: John Wass <jwass3@gmail.com> Co-authored-by: Premkumar Masilamani <smileprem@users.noreply.github.com> Co-authored-by: Pablo Osinaga <paguos@gmail.com> Co-authored-by: David Seapy <ddseapy@ccri.com> Co-authored-by: Anastasia Satonina <56155326+darthnastya@users.noreply.github.com> Co-authored-by: Simon Behar <simbeh7@gmail.com> Co-authored-by: Tobias Bradtke <webwurst@gmail.com> Co-authored-by: Marek Čermák <prace.mcermak@gmail.com> Co-authored-by: Rick Avendaño <Avendano.Richard@gmail.com> Co-authored-by: sang <sanooj.m@gmail.com> Co-authored-by: Antoine Dao <antoinedao1@gmail.com> Co-authored-by: gerdos82 <37865635+gerdos82@users.noreply.github.com>
Signed-off-by: Derek Wang <whynowy@gmail.com>
I tested workflows of many sizes and found a problem with medium and large workflows. As a workflow grows, the pod state sync gets slower and slower. Eventually, the pod state recorded in the workflow can lag behind the actual state stored in the API server by more than 20 minutes. I used the test case below.
test case
I finally found that the root cause of this behavior is this code: https://github.com/argoproj/argo/blob/master/workflow/controller/operator.go#L475.
Each pod takes about 0.07s to process. For a workflow with 5000+ pods, that adds up to about 350s (0.07s × 5000, or 5.83min). Every workflow operation then takes 5.83min, which causes all the workflows to fail. I logged some info here:
We can see that most of the time, about 5min, is spent at the pod list line.
So I raised this PR, which uses goroutines to process pod changes in parallel. The results are shown in the log info:
You can see that with this PR the time drops from about 5min to 1s.
This is very useful for medium and large workflows, and it also speeds up processing for normal workflows.
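To illustrate the idea (this is a minimal sketch, not the actual diff in this PR), the per-pod work can be fanned out to goroutines with a bounded number running at once. The names `pod`, `assessPod`, `reconcilePods`, and the `limit` parameter are hypothetical stand-ins; the real code works on Kubernetes pod objects and the workflow's node status map.

```go
package main

import (
	"fmt"
	"sync"
)

// pod is a hypothetical stand-in for a Kubernetes pod object.
type pod struct {
	Name  string
	Phase string
}

// assessPod is a hypothetical placeholder for the per-pod work
// (the step measured at roughly 0.07s per pod above).
func assessPod(p pod) string {
	return p.Phase
}

// reconcilePods assesses all pods concurrently, running at most
// limit goroutines at a time (limit must be at least 1), and
// returns a map from pod name to assessed phase.
func reconcilePods(pods []pod, limit int) map[string]string {
	var (
		mu sync.Mutex     // guards nodes, since Go maps are not safe for concurrent writes
		wg sync.WaitGroup // lets the caller wait for every goroutine to finish
	)
	nodes := make(map[string]string, len(pods))
	sem := make(chan struct{}, limit) // buffered channel used as a counting semaphore

	for _, p := range pods {
		wg.Add(1)
		sem <- struct{}{} // blocks while limit goroutines are already running
		go func(p pod) {
			defer wg.Done()
			defer func() { <-sem }() // release the semaphore slot
			phase := assessPod(p)
			mu.Lock()
			nodes[p.Name] = phase
			mu.Unlock()
		}(p)
	}
	wg.Wait()
	return nodes
}

func main() {
	pods := []pod{
		{Name: "a", Phase: "Running"},
		{Name: "b", Phase: "Succeeded"},
		{Name: "c", Phase: "Failed"},
	}
	nodes := reconcilePods(pods, 2)
	fmt.Println(len(nodes), nodes["b"])
}
```

The mutex matters here: without it, concurrent writes to the shared map would be a data race. The bounded semaphore keeps a workflow with thousands of pods from starting thousands of goroutines at once.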
@jessesuen pls help review. Thanks.