Skip to content

Commit

Permalink
[ci] Kill hanged docker build process to avoid build timeout issue. (s…
Browse files Browse the repository at this point in the history
…onic-net#13726) (sonic-net#13597)

Why I did it
Docker build has a low rate of hanging up.
It hangs on different steps. So, it looks like a bug in docker daemon.

How I did it
Start a daemon process to scan running time more than 1 hours, and kill the process.

How to verify it
  • Loading branch information
liushilongbuaa committed Feb 22, 2023
1 parent 0798518 commit 9d20dda
Show file tree
Hide file tree
Showing 4 changed files with 34 additions and 0 deletions.
1 change: 1 addition & 0 deletions .azure-pipelines/azure-pipelines-build.yml
Original file line number Diff line number Diff line change
Expand Up @@ -99,6 +99,7 @@ jobs:

buildSteps:
- template: template-skipvstest.yml
- template: template-daemon.yml
- bash: |
set -ex
if [ $(GROUP_NAME) == vs ]; then
Expand Down
7 changes: 7 additions & 0 deletions .azure-pipelines/cleanup.yml
Original file line number Diff line number Diff line change
@@ -1,5 +1,11 @@
steps:
- script: |
set -x
# kill daemon process
ps $(cat /tmp/azp_daemon_kill_docker_pid)
sudo kill $(cat /tmp/azp_daemon_kill_docker_pid)
rm /tmp/azp_daemon_kill_docker_pid
if sudo [ -f /var/run/march/docker.pid ] ; then
pid=`sudo cat /var/run/march/docker.pid` ; sudo kill $pid
fi
Expand All @@ -11,4 +17,5 @@ steps:
pid=`sudo cat dockerfs/var/run/docker.pid` ; sudo kill $pid
fi
sudo rm -rf $(ls -A1)
condition: always()
displayName: "Clean Workspace"
24 changes: 24 additions & 0 deletions .azure-pipelines/template-daemon.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,24 @@
steps:
- bash: |
(
while true
do
sleep 120
now=$(date +%s)
pids=$(ps -C docker -o pid,etime,args | grep "docker build" | cut -d" " -f1)
for pid in $pids
do
start=$(date --date="$(ls -dl /proc/$pid --time-style full-iso | awk '{print$6,$7}')" +%s)
time_s=$(($now-$start))
if [[ $time_s -gt $(DOCKER_BUILD_TIMEOUT) ]]; then
echo =========== $(date +%F%T) $time_s &>> target/daemon.log
ps $pid &>> target/daemon.log
sudo kill $pid
fi
done
done
) &
daemon_pid=$!
ps $daemon_pid
echo $daemon_pid >> /tmp/azp_daemon_kill_docker_pid
displayName: start daemon to kill hang docker
2 changes: 2 additions & 0 deletions .azure-pipelines/template-variables.yml
Original file line number Diff line number Diff line change
Expand Up @@ -4,3 +4,5 @@ variables:
SONIC_SLAVE_DOCKER_DRIVER: 'overlay2'
SONIC_BUILD_RETRY_COUNT: 3
SONIC_BUILD_RETRY_INTERVAL: 600
DOCKER_BUILDKIT: 0
DOCKER_BUILD_TIMEOUT: 3600

0 comments on commit 9d20dda

Please sign in to comment.