Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes Issue # 23418: Race condition between device deferred removal and resume device. #23497

Merged
merged 1 commit into from
Aug 3, 2016

Conversation

shishir-a412ed
Copy link
Contributor

Fixes #23418

Signed-off-by: Shishir Mahajan shishir.mahajan@redhat.com

@thaJeztah
Copy link
Member

ping @LK4D4 @cpuguy83 @rhvgoyal ptal

if err := devices.poolHasFreeSpace(); err != nil {
return err
}

CancelledRemoval, err := devices.cancelDeferredRemoval(baseInfo)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lower-cased please

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

@shishir-a412ed shishir-a412ed force-pushed the dm_task_run_failed branch 3 times, most recently from eaba1af to 0b9a466 Compare June 14, 2016 15:39
@@ -858,7 +889,8 @@ func (devices *DeviceSet) createRegisterSnapDevice(hash string, baseInfo *devInf
}

for {
if err := devicemapper.CreateSnapDevice(devices.getPoolDevName(), deviceID, baseInfo.Name(), baseInfo.DeviceID); err != nil {

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't add new blank line after for starting block

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

@rhvgoyal
Copy link
Contributor

LGTM


if err := CreateSnapDeviceRaw(poolName, deviceID, baseName, baseDeviceID); err != nil {
if doSuspend {
ResumeDevice(baseName)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

check err?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So me and @rhvgoyal discussed this. If you check the original code, e.g https://github.com/docker/docker/blob/master/pkg/devicemapper/devmapper.go#L781

There was no check for error in ResumeDevice(baseName). One reason can be, if I check for error in ResumeDevice(baseName) and if err != nil and I return the error. the error from CreateSnapDeviceRaw will be ignored and not returned (Unless we concat both the errors and return them as one). So we decided to keep it as original code.

Shishir

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be good to log these errors, or probably even more something like:

if err2 := ResumeDevice(baseName); err2 != nil {
  if err == nil {
    err = err2
  }
  return err
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We will only hit ResumeDevice(baseName) here if err != nil. err will not be equal to nil (err==nil) in this case.
You mean something like:

        if err := CreateSnapDeviceRaw(poolName, deviceID, baseName, baseDeviceID); err != nil {
                if doSuspend {
                        if err2 := ResumeDevice(baseName); err2 != nil{
                                return fmt.Errorf("ResumeDevice Error: (%v); CreateSnapDeviceRaw Error (%v)", err2, err)
                        }
                }
                return err
        }

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

@runcom
Copy link
Member

runcom commented Jun 22, 2016

19:56:14 ----------------------------------------------------------------------
19:56:14 FAIL: docker_api_swarm_test.go:556: DockerSwarmSuite.TestApiSwarmLeaveOnPendingJoin
19:56:14 
19:56:14 [d40976264] waiting for daemon to start
19:56:14 [d40976264] daemon started
19:56:14 [d64227023] waiting for daemon to start
19:56:14 [d64227023] exiting daemon
19:56:14 docker_api_swarm_test.go:558:
19:56:14     d2 := s.AddDaemon(c, false, false)
19:56:14 check_test.go:212:
19:56:14     c.Assert(err, check.IsNil)
19:56:14 ... value *errors.errorString = &errors.errorString{s:"[d64227023] Daemon exited during startup"} ("[d64227023] Daemon exited during startup")
19:56:14 
19:56:14 [d40976264] exiting daemon

could this be related? (failing in experimental)

@shishir-a412ed shishir-a412ed force-pushed the dm_task_run_failed branch 2 times, most recently from 9ddcaa3 to b9cc647 Compare June 27, 2016 15:17
@shishir-a412ed
Copy link
Contributor Author

@thaJeztah Any updates on this one ?

@thaJeztah
Copy link
Member

ping @cpuguy83 @runcom PTAL

@@ -2126,9 +2157,11 @@ func (devices *DeviceSet) removeDevice(devname string) error {
return err
}

func (devices *DeviceSet) cancelDeferredRemoval(info *devInfo) error {
func (devices *DeviceSet) cancelDeferredRemoval(info *devInfo) (bool, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like we don't really need this bool return.
If there is an error it wasn't cancelled, if there is no error it was cancelled.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are few conditions where err is nil still deferred remove was not cancelled.

  • If device was not scheduled for deferred removal.

  • If device is not active to being with.

    if devinfo != nil && devinfo.DeferredRemove == 0 {
            return nil
    }
    

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't these be error cases?
We are saying "go cancel this thing" and then essentially returning back "ok nothing is wrong".
We can always check the error type if need be.

I think it may be a little more clear handling this logic in the caller... this is similar to how we tend to ignore IsNotExist type errors when we are removing the file anyway.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And I realize this is probably totally subjective, but LMK what you think.

Not a big fan of adding this bool return.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand what you are saying, but devicemapper (cancelDeferredRemoval) coming back with err==nil doesn't necessarily means that the deferred removal was cancelled.
It means "Nothing is wrong" but we still need additional information to know if the deferred removal was actually cancelled, and do we need to reschedule it?

err == nil can mean
a) deferred removal option is not even supported. FALSE
b) the device exists but it is not marked for deferred removal. FALSE
c) the deferred removal was scheduled on the device, and now it is cancelled successfully. TRUE
d) The device was already gone, by the time we reached to cancel the deferred removal. So now it's not needed anymore. FALSE.

Only in case (c) we really need to reschedule the deferred removal, since it was actually cancelled.
I know what you are saying with os.IsNotExist(err) however that is a type of error and can be checked. We don't have an error here, and we need more information to make an informed decision.
Unless, you rewrite the entire functionality on how we implement cancelDeferredRemoval, I am afraid we would have to go with a boolean variable.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you don't like bool, may be we can provide another helper function which only does the actual removal. Rest of the conditions can be checked by caller.

So say provide another funciton __cancelDeferredRmoval(info devInfo), which does not do various check and just calls devicemapper.CancelDeferredRemove(). Now this function will return success if deferred removal as actually cancelled or error.

Caller can check for rest of the conditions and call this only if.

  • Deferred removal is enabled.
  • There is an active device.
  • Device is in deferred remove state.

That way we will get rid of bool return parameter.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And this function will probably return ErrEnxio while it is retrying in busy loop.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Basically we need two functions.

  • One which just ensures that when the function finishes, device should not be in deferred remove state. Other states like device not being there any more etc are fine. This is what current function is implementing.
  • Another function which returns success only if deferred removal was cancelled otherwise appropriate error message.

Shishir, lets implement this second variant of function and any other required bits let open code in the caller. And that should take care of concern raised by @cpuguy83

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

@shishir-a412ed shishir-a412ed force-pushed the dm_task_run_failed branch 2 times, most recently from 8f33eaa to 11b873b Compare July 15, 2016 14:28
@shishir-a412ed
Copy link
Contributor Author

@thaJeztah Please check now.

@thaJeztah
Copy link
Member

Thank you!

@shishir-a412ed
Copy link
Contributor Author

@thaJeztah Any updates on this one ?

@rhvgoyal
Copy link
Contributor

@thaJeztah can we merge this one now. Its a race condition users have reported to us and is an issue gets visible if deferred removal is enabled.

@thaJeztah
Copy link
Member

@rhvgoyal let me check

@shishir-a412ed shishir-a412ed force-pushed the dm_task_run_failed branch 3 times, most recently from 432ea4c to 9b33f0f Compare July 25, 2016 16:35
@shishir-a412ed
Copy link
Contributor Author

@thaJeztah Any updates :)

@cpuguy83
Copy link
Member

@shishir-a412ed We aren't doing any non-1.12 related merges until after the release.
Since this has been a problem since the feature was added and really hasn't had enough time to get some people actually using it and the potential impact of these changes, we decided not to include in 1.12.

@shishir-a412ed shishir-a412ed force-pushed the dm_task_run_failed branch 3 times, most recently from e0fa6c3 to 8df5663 Compare August 1, 2016 12:54
…nd resume device.

Problem Description:

An example scenario that involves deferred removal
1. A new base image gets created (e.g. 'docker load -i'). The base device is activated and
mounted at some point in time during image creation.
2. While image creation is in progress, a privileged container is started
from another image and the host's mount name space is shared with this
container ('docker run --privileged -v /:/host').
3. Image creation completes and the base device gets unmounted. However,
as the privileged container still holds a reference on the base image
mount point, the base device cannot be removed right away. So it gets
flagged for deferred removal.
4. Next, the privileged container terminates and thus its reference to the
base image mount point gets released. The base device (which is flagged
for deferred removal) may now be cleaned up by the device-mapper. This
opens up an opportunity for a race between a 'kworker' thread (executing
the do_deferred_remove() function) and the Docker daemon (executing the
CreateSnapDevice() function).

This PR cancel the deferred removal, if the device is marked for it. And reschedule the
deferred removal later after the device is resumed successfully.

Signed-off-by: Shishir Mahajan <shishir.mahajan@redhat.com>
@thaJeztah thaJeztah added this to the 1.13.0 milestone Aug 3, 2016
@thaJeztah
Copy link
Member

1.12 was released, and all green, so merging

@thaJeztah thaJeztah merged commit 87e48ec into moby:master Aug 3, 2016
shishir-a412ed pushed a commit to shishir-a412ed/docker that referenced this pull request Sep 1, 2016
… device deferred removal and resume device.

Upstream PR: moby#23497
Problem Description:

An example scenario that involves deferred removal
1. A new base image gets created (e.g. 'docker load -i'). The base device is activated and
mounted at some point in time during image creation.
2. While image creation is in progress, a privileged container is started
from another image and the host's mount name space is shared with this
container ('docker run --privileged -v /:/host').
3. Image creation completes and the base device gets unmounted. However,
as the privileged container still holds a reference on the base image
mount point, the base device cannot be removed right away. So it gets
flagged for deferred removal.
4. Next, the privileged container terminates and thus its reference to the
base image mount point gets released. The base device (which is flagged
for deferred removal) may now be cleaned up by the device-mapper. This
opens up an opportunity for a race between a 'kworker' thread (executing
the do_deferred_remove() function) and the Docker daemon (executing the
CreateSnapDevice() function).

This PR cancel the deferred removal, if the device is marked for it. And reschedule the
deferred removal later after the device is resumed successfully.

Signed-off-by: Shishir Mahajan <shishir.mahajan@redhat.com>
runcom pushed a commit to projectatomic/docker that referenced this pull request Sep 1, 2016
… device deferred removal and resume device.

Upstream PR: moby#23497
Problem Description:

An example scenario that involves deferred removal
1. A new base image gets created (e.g. 'docker load -i'). The base device is activated and
mounted at some point in time during image creation.
2. While image creation is in progress, a privileged container is started
from another image and the host's mount name space is shared with this
container ('docker run --privileged -v /:/host').
3. Image creation completes and the base device gets unmounted. However,
as the privileged container still holds a reference on the base image
mount point, the base device cannot be removed right away. So it gets
flagged for deferred removal.
4. Next, the privileged container terminates and thus its reference to the
base image mount point gets released. The base device (which is flagged
for deferred removal) may now be cleaned up by the device-mapper. This
opens up an opportunity for a race between a 'kworker' thread (executing
the do_deferred_remove() function) and the Docker daemon (executing the
CreateSnapDevice() function).

This PR cancel the deferred removal, if the device is marked for it. And reschedule the
deferred removal later after the device is resumed successfully.

Signed-off-by: Shishir Mahajan <shishir.mahajan@redhat.com>
liusdu pushed a commit to liusdu/moby that referenced this pull request Oct 30, 2017
…nd resume device.

Problem Description:

An example scenario that involves deferred removal
1. A new base image gets created (e.g. 'docker load -i'). The base device is activated and
mounted at some point in time during image creation.
2. While image creation is in progress, a privileged container is started
from another image and the host's mount name space is shared with this
container ('docker run --privileged -v /:/host').
3. Image creation completes and the base device gets unmounted. However,
as the privileged container still holds a reference on the base image
mount point, the base device cannot be removed right away. So it gets
flagged for deferred removal.
4. Next, the privileged container terminates and thus its reference to the
base image mount point gets released. The base device (which is flagged
for deferred removal) may now be cleaned up by the device-mapper. This
opens up an opportunity for a race between a 'kworker' thread (executing
the do_deferred_remove() function) and the Docker daemon (executing the
CreateSnapDevice() function).

This PR cancel the deferred removal, if the device is marked for it. And reschedule the
deferred removal later after the device is resumed successfully.

cherry-picked from docker upstream:
	moby#23497

Signed-off-by: Shishir Mahajan <shishir.mahajan@redhat.com>
Signed-off-by: Deng Guangxing <dengguangxing@huawei.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants