Use self update ready entrypoint #99

Merged: 3 commits, Oct 4, 2020

ytsarev
Contributor

@ytsarev ytsarev commented Sep 16, 2020

Switch the entrypoint as suggested in actions/runner#484 (comment).

Testing showed very good results when updating from 2.273.1 to 2.273.2:

bcp-runner-xn565-25wr7 runner Downloading 2.273.2 runner
bcp-runner-xn565-cns67 runner Downloading 2.273.2 runner
bcp-runner-xn565-28k6t runner Waiting for current job finish running.
bcp-runner-xn565-28k6t runner Generate and execute update script.
bcp-runner-xn565-28k6t runner Runner will exit shortly for update, should back online within 10 seconds.
bcp-runner-xn565-lpvkf runner Waiting for current job finish running.
bcp-runner-xn565-lpvkf runner Generate and execute update script.
bcp-runner-xn565-lpvkf runner Runner will exit shortly for update, should back online within 10 seconds.
bcp-runner-xn565-28k6t runner Runner listener exited with error code 3
bcp-runner-xn565-28k6t runner Runner listener exit because of updating, re-launch runner in 5 seconds.
bcp-runner-xn565-28k6t runner Starting Runner listener with startup type: service
bcp-runner-xn565-28k6t runner Started listener process
bcp-runner-xn565-28k6t runner
bcp-runner-xn565-28k6t runner √ Connected to GitHub
bcp-runner-xn565-28k6t runner
bcp-runner-xn565-28k6t runner 2020-09-16 16:03:21Z: Listening for Jobs
bcp-runner-xn565-28k6t runner 2020-09-16 16:03:25Z: Running job: custom-runner

The workflow queue is picked up properly after the update.

@ytsarev ytsarev mentioned this pull request Sep 16, 2020
@stackdumper

Yay, thank you!

@onelapahead onelapahead mentioned this pull request Sep 16, 2020
@igorbrigadir

Yep, this is the workaround I've been using myself. That said, I think I have a few gaps in my understanding, and I'm still not totally clear whether it solves some of the other issues detailed in #40 (comment). I didn't dig further because I just wanted to get my CI back up and running.

onelapahead added a commit to onelapahead/actions-runner-controller that referenced this pull request Sep 18, 2020
Collaborator

@mumoshu mumoshu left a comment

Thanks for the PR!

I'm now reviewing this. The only question I have right now is: how can we replicate the behavior of --once in this mode?

We've been using --once so that the runner pod stops after each job run and is recreated by the controller, so that each job run gets a clean environment.

@mumoshu
Collaborator

mumoshu commented Sep 21, 2020

So I was able to modify bin/runsvc.sh and bin/RunnerService.js a bit so that we can add --once support for it:

Without my change:

runner@39a92ae2a66f:/runner$ bin/runsvc.sh
.path=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
Starting Runner listener with startup type: service
Started listener process
Started running service

√ Connected to GitHub

2020-09-21 04:05:46Z: Listening for Jobs
2020-09-21 04:06:27Z: Running job: Build
2020-09-21 04:06:38Z: Job Build completed with result: Succeeded

and the runner keeps running.

With my change, it's like:

runner@39a92ae2a66f:/runner$ bin/runsvc.sh --once
.path=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
Starting Runner listener with startup type: service
Started listener process
Started running service

√ Connected to GitHub

2020-09-21 03:57:39Z: Listening for Jobs
2020-09-21 04:04:50Z: Running job: Build
2020-09-21 04:05:23Z: Job Build completed with result: Succeeded
Runner listener exited with error code 0
Runner listener exit with 0 return code, stop the service, no retry needed.

Here's the modified bin/runsvc.sh, which just passes $* through to RunnerService.js.

#!/bin/bash

# convert SIGTERM signal to SIGINT
# for more info on how to propagate SIGTERM to a child process see: http://veithen.github.io/2014/11/16/sigterm-propagation.html
trap 'kill -INT $PID' TERM INT

if [ -f ".path" ]; then
    # configure
    export PATH=`cat .path`
    echo ".path=${PATH}"
fi

# insert anything to setup env when running as a service

# run the host process which keep the listener alive
./externals/node12/bin/node ./bin/RunnerService.js $* &
PID=$!
wait $PID
trap - TERM INT
wait $PID

Here's the modified version of RunnerService.js, which just appends either process.argv.slice(2) or process.argv.slice(3) to the listener arguments, depending on whether it is running in interactive mode.

#!/usr/bin/env node
// Copyright (c) GitHub. All rights reserved.
// Licensed under the MIT license. See LICENSE file in the project root for full license information.

var childProcess = require("child_process");
var path = require("path")

var supported = ['linux', 'darwin']

if (supported.indexOf(process.platform) == -1) {
    console.log('Unsupported platform: ' + process.platform);
    console.log('Supported platforms are: ' + supported.toString());
    process.exit(1);
}

var stopping = false;
var listener = null;

var runService = function() {
    var listenerExePath = path.join(__dirname, '../bin/Runner.Listener');
    var interactive = process.argv[2] === "interactive";

    if(!stopping) {
        try {
            if (interactive) {
                console.log('Starting Runner listener interactively');
                listener = childProcess.spawn(listenerExePath, ['run'].concat(process.argv.slice(3)), { env: process.env });
            } else {
                console.log('Starting Runner listener with startup type: service');
                listener = childProcess.spawn(listenerExePath, ['run', '--startuptype', 'service'].concat(process.argv.slice(2)), { env: process.env });
            }

            console.log('Started listener process');

            listener.stdout.on('data', (data) => {
                process.stdout.write(data.toString('utf8'));
            });

            listener.stderr.on('data', (data) => {
                process.stdout.write(data.toString('utf8'));
            });

            listener.on('close', (code) => {
                console.log(`Runner listener exited with error code ${code}`);

                if (code === 0) {
                    console.log('Runner listener exit with 0 return code, stop the service, no retry needed.');
                    stopping = true;
                } else if (code === 1) {
                    console.log('Runner listener exit with terminated error, stop the service, no retry needed.');
                    stopping = true;
                } else if (code === 2) {
                    console.log('Runner listener exit with retryable error, re-launch runner in 5 seconds.');
                } else if (code === 3) {
                    console.log('Runner listener exit because of updating, re-launch runner in 5 seconds.');
                } else {
                    console.log('Runner listener exit with undefined return code, re-launch runner in 5 seconds.');
                }

                if(!stopping) {
                    setTimeout(runService, 5000);
                }
            });

        } catch(ex) {
            console.log(ex);
        }
    }
}

runService();
console.log('Started running service');

var gracefulShutdown = function(code) {
    console.log('Shutting down runner listener');
    stopping = true;
    if (listener) {
        console.log('Sending SIGINT to runner listener to stop');
        listener.kill('SIGINT');

        // TODO wait for 30 seconds and send a SIGKILL
    }
}

process.on('SIGINT', () => {
    gracefulShutdown(0);
});

process.on('SIGTERM', () => {
    gracefulShutdown(0);
});
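
To make the argument forwarding concrete, here is a minimal sketch of how the two slice calls above map the service's command line onto the listener's arguments. `listenerArgs` is a hypothetical helper written only for illustration; it is not part of the runner and simply mirrors the argument lists passed to `childProcess.spawn` in the modified script.

```javascript
// Hypothetical helper mirroring the argument lists passed to childProcess.spawn
// in the modified RunnerService.js. argv has the same shape as process.argv:
// [node binary, script path, ...user arguments].
function listenerArgs(argv) {
    const interactive = argv[2] === "interactive";
    if (interactive) {
        // Skip the "interactive" keyword itself and forward everything after it.
        return ["run"].concat(argv.slice(3));
    }
    // Service mode: forward every user argument that follows the script path.
    return ["run", "--startuptype", "service"].concat(argv.slice(2));
}

// Models `bin/runsvc.sh --once`, which runs `node ./bin/RunnerService.js --once`.
console.log(listenerArgs(["node", "./bin/RunnerService.js", "--once"]));
// Models the interactive variant with the same extra flag.
console.log(listenerArgs(["node", "./bin/RunnerService.js", "interactive", "--once"]));
```

In service mode the extra flag ends up after `run --startuptype service`, which is how --once reaches Runner.Listener without any other change to the script.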

@mumoshu
Collaborator

mumoshu commented Sep 21, 2020

Are there any other ways to add support for --once to runsvc.sh? Would there be any predefined envvar to inject additional args?

@mumoshu
Collaborator

mumoshu commented Sep 21, 2020

In case anyone is interested, here are the steps to try the modification yourself: https://github.com/mumoshu/actions-runner-the-hard-way

@onelapahead
Contributor

onelapahead commented Sep 21, 2020

Looks like, long term, --once might be replaced with an ephemeral setting soon? See actions/runner#660,

which is meant to solve some issues/misunderstandings with --once: actions/runner#510, actions/runner#559

I'll have to try out your patch, but any advice for baking it into an image aside from maintaining a patched copy of runsvc.sh and RunnerService.js?

@mumoshu
Collaborator

mumoshu commented Sep 21, 2020

@hfuss Hey! Thanks for #97 and for sharing the work towards ephemeral. I didn't know about ephemeral until now, but it does look so. Glad to see that ephemeral seems to be designed to deregister itself automatically, so this controller won't need to do it anymore.

@mumoshu
Collaborator

mumoshu commented Sep 21, 2020

@hfuss Re patching, I was going to bake runsvc.sh.patched and RunnerService.js.patched into the image, then run diff and cp them over runsvc.sh and RunnerService.js before running runsvc.sh in our entrypoint script: https://github.com/summerwind/actions-runner-controller/blob/master/runner/entrypoint.sh#L30

This way, we can at least see from the logs if there's any unexpected diff between the original and the patched scripts.
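
The diff-then-copy idea could look roughly like the sketch below. It is written in Node only for consistency with the script above; the actual entrypoint is a shell script and would presumably call `diff` and `cp` directly. The function name `applyPatched` and the example paths are hypothetical.

```javascript
const fs = require("fs");

// Hypothetical helper: report whether the shipped file differs from the patched
// copy, then overwrite the shipped file with the patched one.
// Returns true when the two files differed before copying.
function applyPatched(originalPath, patchedPath, log = console.log) {
    const original = fs.readFileSync(originalPath, "utf8");
    const patched = fs.readFileSync(patchedPath, "utf8");
    const differed = original !== patched;
    if (differed) {
        // A real entrypoint would print the full diff here, so that unexpected
        // upstream changes to the original script show up in the pod logs.
        log(`${originalPath} differs from ${patchedPath}; applying patched copy`);
    }
    fs.copyFileSync(patchedPath, originalPath);
    return differed;
}

// Example call (paths are assumptions, not the image's real layout):
// applyPatched("bin/runsvc.sh", "runsvc.sh.patched");
```

Logging before overwriting is the point of the design: if a runner release changes the original script in ways the patch does not expect, the difference is visible in the logs rather than silently replaced.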

@mumoshu
Collaborator

mumoshu commented Sep 21, 2020

Run `cd runner; NAME=$DOCKER_USER/actions-runner TAG=dev make docker-build docker-push`,
`kubectl apply -f release/actions-runner-controller.yaml`,
then update the runner image(not the controller image) by updating e.g. `Runner.Spec.Image` to `$DOCKER_USER/actions-runner:$TAG`, for testing.
@mumoshu
Collaborator

mumoshu commented Sep 21, 2020

Seems to be working as expected. It (1) waited for the auto-update to finish before running the build, (2) successfully restarted without stopping the pod, and (3) stopped after the first build (which indicates --once is working).

❯ k logs example-runner -c runner -f

--------------------------------------------------------------------------------
|        ____ _ _   _   _       _          _        _   _                      |
|       / ___(_) |_| | | |_   _| |__      / \   ___| |_(_) ___  _ __  ___      |
|      | |  _| | __| |_| | | | | '_ \    / _ \ / __| __| |/ _ \| '_ \/ __|     |
|      | |_| | | |_|  _  | |_| | |_) |  / ___ \ (__| |_| | (_) | | | \__ \     |
|       \____|_|\__|_| |_|\__,_|_.__/  /_/   \_\___|\__|_|\___/|_| |_|___/     |
|                                                                              |
|                       Self-hosted runner registration                        |
|                                                                              |
--------------------------------------------------------------------------------

# Authentication


√ Connected to GitHub

# Runner Registration



A runner exists with the same name
√ Successfully replaced the runner
√ Runner connection is good

# Runner settings


√ Settings Saved.

16c16
< ./externals/node12/bin/node ./bin/RunnerService.js &
---
> ./externals/node12/bin/node ./bin/RunnerService.js $* &
20c20
< wait $PID
---
> wait $PID
\ No newline at end of file
27c27
<                 listener = childProcess.spawn(listenerExePath, ['run'], { env: process.env });
---
>                 listener = childProcess.spawn(listenerExePath, ['run'].concat(process.argv.slice(3)), { env: process.env });
30c30
<                 listener = childProcess.spawn(listenerExePath, ['run', '--startuptype', 'service'], { env: process.env });
---
>                 listener = childProcess.spawn(listenerExePath, ['run', '--startuptype', 'service'].concat(process.argv.slice(2)), { env: process.env });
34c34
<         
---
> 
59c59
<                 
---
> 
91c91
< });
---
> });
\ No newline at end of file
.path=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
Starting Runner listener with startup type: service
Started listener process
Started running service

√ Connected to GitHub

2020-09-21 07:11:09Z: Listening for Jobs
Runner update in progress, do not shutdown runner.
Downloading 2.273.2 runner
Waiting for current job finish running.
Generate and execute update script.
Runner will exit shortly for update, should back online within 10 seconds.
Runner listener exited with error code 4
Runner listener exit with undefined return code, re-launch runner in 5 seconds.
Starting Runner listener with startup type: service
Started listener process

√ Connected to GitHub

2020-09-21 07:15:27Z: Listening for Jobs
2020-09-21 07:15:31Z: Running job: Build
2020-09-21 07:15:58Z: Job Build completed with result: Succeeded
Runner listener exited with error code 0
Runner listener exit with 0 return code, stop the service, no retry needed.
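
Note that the listener exited with code 4 during the update in this log, whereas the earlier test in this thread logged code 3. Both lead to a relaunch, because the close handler in RunnerService.js only stops on 0 and 1. A minimal sketch of that decision follows; `shouldRelaunch` is a hypothetical helper distilled from the handler, and it ignores the separate shutdown path that sets the stopping flag on SIGINT/SIGTERM.

```javascript
// Hypothetical distillation of the listener 'close' handler in RunnerService.js:
// exit codes 0 and 1 stop the service; every other code triggers a relaunch.
function shouldRelaunch(code) {
    return code !== 0 && code !== 1;
}

for (const code of [0, 1, 2, 3, 4]) {
    const action = shouldRelaunch(code) ? "relaunch in 5 seconds" : "stop";
    console.log(`exit code ${code}: ${action}`);
}
```

This is why --once and self-update coexist in the log above: the update exit is relaunched, while the exit after the single job returns 0 and stops the service.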

@onelapahead
Contributor

onelapahead commented Sep 21, 2020

@mumoshu I didn't say so the first time, but thank you for the help with this particular issue, as well as for all the efforts on this project; they are awesome and help a lot of people!

Will test out that patch shortly.

@onelapahead
Contributor

Tested the patch out, and it works for cleaning up the runners and successfully auto-updating them; see https://github.com/AbsaOSS/actions-runner-controller/pull/1

Folks can use brix4dayz/actions-runner:v2.273.4 with the patch if need be: https://hub.docker.com/r/brix4dayz/actions-runner/tags

@mumoshu mumoshu mentioned this pull request Sep 23, 2020
@mumoshu
Collaborator

mumoshu commented Sep 28, 2020

@summerwind Would you mind taking a look at this and https://github.com/AbsaOSS/actions-runner-controller/pull/1? I just want to be sure that I won't break anything badly 😃

@mumoshu
Collaborator

mumoshu commented Sep 28, 2020

@summerwind You'll also be interested in actions/runner#660 for a long-term solution.

@mumoshu mumoshu merged commit b79ea98 into actions:master Oct 4, 2020
stackdumper pushed a commit to stackdumper/actions-runner-controller that referenced this pull request Oct 6, 2020
* Use self update ready entrypoint

* Add --once support for runsvc.sh

Run `cd runner; NAME=$DOCKER_USER/actions-runner TAG=dev make docker-build docker-push`,
`kubectl apply -f release/actions-runner-controller.yaml`,
then update the runner image(not the controller image) by updating e.g. `Runner.Spec.Image` to `$DOCKER_USER/actions-runner:$TAG`, for testing.

Co-authored-by: Yusuke Kuoka <ykuoka@gmail.com>