
Partial service update #1045

Open
cpuguy83 opened this issue Jun 22, 2016 · 9 comments

Comments

@cpuguy83
Member

It would be cool to be able to specify an update policy such that I only want to deploy an updated version of my service to a subset of the swarm (e.g. for manual verification that the service is OK).

ping @duglin

@duglin

duglin commented Jun 23, 2016

A little more info....

Let's say I have 10 instances of my service running at v1 and I want to upgrade to v2.

I ask for a rolling upgrade but only want to upgrade one instance so I can do some testing before I do the rest (this assumes that the normal health checking done by swarmkit isn't sufficient to know whether things are really ok). Right now there's no way to ask for 1 to be upgraded and then wait before doing the rest.

One option that @stevvooe mentioned was to allow for us to specify the number (or %) of instances to upgrade and then the user could increase that amount over time until they're all done.
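
A rough sketch of how that option might look on the CLI, using a hypothetical `--update-limit` flag to cap how many replicas receive the new image (this flag does not exist; it is only an illustration of the proposal, with placeholder names):

```sh
# Hypothetical sketch only: --update-limit is not an existing flag.
# Upgrade at most 1 of the 10 replicas to v2, then stop and wait.
docker service update --image myapp:v2 --update-limit 1 myapp

# Once manual verification passes, raise the cap so the rolling
# update continues across the remaining replicas.
docker service update --update-limit 10 myapp
```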

@kelseyhightower

This is where it might make sense to decouple a task from a service. It would be nice to have a service map to one or more tasks based on something like labels to identify a set of containers that should receive traffic for a service.

If services and tasks were decoupled, then you could use a canary pattern to achieve what you want. Run two "jobs" with however many tasks are needed to reach your target ratio, then have a single service that points to both sets of tasks.

Another possible way to do this is to add the ability to pause a rolling update at a certain % and resume it later. Pause and resume might provide the right clues to help a user reason about the current state of the task.
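
For reference, a minimal sketch of the two-service canary pattern described above, assuming a 9:1 ratio, placeholder service/image/network names, and some load balancer in front that routes traffic to both services:

```sh
# Run the two versions as two separately sized services on the same network.
docker service create --name myapp-stable --replicas 9 --network mynet myapp:v1
docker service create --name myapp-canary --replicas 1 --network mynet myapp:v2

# Shift the ratio by moving replicas between the two services.
docker service update --replicas 5 myapp-canary
docker service update --replicas 5 myapp-stable
```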

@cpuguy83
Member Author

@kelseyhightower I think pause/resume (with an automated pause) would achieve this quite well.

@resouer

resouer commented Jun 24, 2016

I think it's better to call this blue-green deployment, which is a standard
feature in any PaaS (CF, etc.), but I'm really not sure it should be included
in swarmkit. It's more like a function built on top of swarmkit.

@sukrit007

Really looking forward to a blue-green kind of deploy. Currently, the only way I can think of doing this with swarm mode is to bring up another service (which binds to a different port) and use something like a cloud load balancer on top to route traffic to v1 or v2. But I would love to see this as part of SwarmKit.

@aluzzardi
Member

Agreed with @resouer

@kelseyhightower I think just the service concept is enough. You can implement blue/green, canary, session stickiness, etc. by using services.

For instance, you could deploy service-1 and service-2, then have an haproxy service where you can explicitly split traffic (e.g. "send 1% of traffic to service-canary, the rest to service-prod").

We could have some abstraction in SwarmKit itself, but there are gazillions of use cases (A/B testing?).

The simple service concept allows you to mix and match so that your own code can decide how to split traffic.
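
A rough sketch of that kind of split, assuming service-prod and service-canary are already running on an overlay network called mynet and are reachable by service name; the haproxy.cfg below is only a minimal illustration of weighted routing, not a production config:

```sh
cat > haproxy.cfg <<'EOF'
defaults
    mode http
    timeout connect 5s
    timeout client  30s
    timeout server  30s

frontend web
    bind *:80
    default_backend myapp

backend myapp
    # ~99% of requests go to prod, ~1% to the canary.
    server prod   service-prod:8080   weight 99
    server canary service-canary:8080 weight 1
EOF

# Run haproxy itself as a swarm service on the same network.
# Note: a bind-mounted config has to exist on every node the task can land on.
docker service create --name lb --network mynet --publish 80:80 \
  --mount type=bind,source=$(pwd)/haproxy.cfg,target=/usr/local/etc/haproxy/haproxy.cfg \
  haproxy:1.6
```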

@stevvooe
Contributor

stevvooe commented Sep 1, 2016

Here is an example of blue-green deployment with swarm services: https://github.com/stevvooe/sillyproxy.

@pnickolov

pnickolov commented Sep 3, 2016

I propose that these be looked at as two separate issues:

  1. per @cpuguy83 - ability to manually control the update process and effectively control when the replacement occurs (also see Proposal: support manual control of service update process / canary moby#26160, which seems to be a duplicate, opened by me in the wrong project)
  2. per @aluzzardi and @stevvooe - ability to change the distribution of traffic between the old (blue) and new (green) instances

Assuming random/round-robin load balancing, #1 also controls the traffic distribution, but it is quite heavyweight (still, it seems needed; see also moby/moby#26160 and moby/moby#26159).

#2 can provide a lightweight mechanism to control the traffic distribution (per @aluzzardi) while not touching the number of instances, especially not destroying old instances that we may end up needing if we have to roll back.

Question: I am about to move moby/moby#26160 (manual control) and moby/moby#26159 (pause/unpause) here. If this issue remains about instances, it will make sense to merge at least moby/moby#26160 here. Agree?

cc/ @aaronlehmann
Thanks @stevvooe for connecting the issues

@DoomTaper

Any updates on this? When will this feature be available?
