virtio-queue: Error out early on invalid available ring index #196
Conversation
/cc @rbradford @sboeuf
Force-pushed from d1a8982 to 0195667
crates/virtio-queue/src/queue.rs (Outdated)
// be smaller or equal to the queue size, as the driver should
// never ask the VMM to process a available ring entry more than
// once. Checking and reporting such incorrect driver behavior
// can avoid potential hanging and Denial-of-Service from VMMs.
Do you have a reproduction for this DoS? When does this happen?
When you call next() on the returned object from this function, we already check that it is less than the size because we're doing a modulo:
vm-virtio/crates/virtio-queue/src/queue.rs, line 750 in 7e203db:
let elem_off =
According to the spec the driver is allowed to pass a number that is larger than QUEUE_SIZE: https://docs.oasis-open.org/virtio/virtio/v1.1/csprd01/virtio-v1.1-csprd01.html#x1-380006
"idx field indicates where the driver would put the next descriptor entry in the ring (modulo the queue size)."
Not sure the logic in the if block is correct (because of wrapping), but the goal is to avoid iterating over and processing the same descriptors over and over again.
E.g., if the queue size was n and the guest set the next descriptor index to current + 1000 * n, it would force the virtio device to consider each descriptor in the array 1000 times. This would effectively DoS the thread processing the queue, especially if the computation on the descriptors is costly. This was found with fuzzing.
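To make the scenario concrete, here is a hedged sketch (hypothetical names, not the crate's code) of how an inflated avail index inflates the amount of work:

```rust
use std::num::Wrapping;

// The device keeps returning descriptor chains until next_avail catches up
// with the index published by the driver, so the number of chains processed
// is the wrapping difference between the two.
fn chains_to_process(next_avail: Wrapping<u16>, driver_idx: Wrapping<u16>) -> u16 {
    (driver_idx - next_avail).0
}

fn main() {
    let queue_size: u16 = 256;
    let current = Wrapping(0u16);
    // A misbehaving guest publishes current + 1000 * queue_size (mod 2^16).
    let bogus_idx = current + Wrapping(1000u16.wrapping_mul(queue_size));
    println!(
        "device would walk {} entries of a {}-entry ring",
        chains_to_process(current, bogus_idx),
        queue_size
    );
}
```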
This manifests itself by .next() continuing to return descriptor chains until next_avail and last_index are equal: https://github.com/rust-vmm/vm-virtio/blob/main/crates/virtio-queue/src/queue.rs#L743
@rbradford @likebreath do you think you can work on a unit test that exemplifies this scenario? We can use it as a regression test for the fix. I am trying to do that myself, but I am still not very sure how this can happen. The guest can flood the VMM with requests in any case, so some sort of rate limiter needs to be implemented on the VMM side regardless. If we can improve the situation at the device level as well, that would be great.
The guest can flood the VMM with requests in any case, so some sort of rate limiter needs to be implemented on the VMM side regardless. If we can improve the situation at the device level as well, that would be great.
A well-behaved guest would never do this because it wouldn't want the same descriptors handled multiple times. I don't know whether we could also use the used ring to identify this troublesome behaviour.
Looks like QEMU has the same check in place: https://github.com/qemu/qemu/blob/master/hw/virtio/virtio.c#L966
Looks like QEMU has the same check in place: https://github.com/qemu/qemu/blob/master/hw/virtio/virtio.c#L966
Cool. Do you think this is the right place or should it be in the implementation of .next()?
We can't directly do idx - self.next_avail because that operation might underflow and cause a panic.
Both of these are Wrapping<u16>, so I think it is safe.
Right, that is safe because the subtraction will wrap around to a large value, which is going to be greater than the queue size irrespective of the queue size set by the driver.
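A small stand-alone example of the wrapping behavior being discussed (plain std, not the crate's code):

```rust
use std::num::Wrapping;

fn main() {
    // Subtraction on Wrapping<u16> never panics; it wraps modulo 2^16.
    let idx = Wrapping(3u16);         // index published by the driver
    let next_avail = Wrapping(10u16); // device's next_avail
    let diff = idx - next_avail;      // plain u16 subtraction would panic here in debug builds
    assert_eq!(diff.0, 65529);        // large, so it exceeds any possible queue size
}
```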
Cool. Do you think this is the right place or should it be in the implementation of .next()?
I am checking to see if there is any other code path that could trigger this behavior outside of iter. It might make sense to have it in the constructor of AvailIter.
So @likebreath, it looks like the actions you need to take on this PR are: address the comment I made about the commit message/comment, and add a unit test to ensure that invalid and only invalid ranges generate an error.
crates/virtio-queue/src/queue.rs (Outdated)
.map(move |idx| AvailIter::new(mem, idx, self))
let idx = self.avail_idx(mem.deref(), Ordering::Acquire)?;

// The number of descriptor chain heads to process should always
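For context, a hedged sketch of the kind of early validity check being discussed in this thread; the error and field names are hypothetical and may differ from what was actually merged:

```rust
use std::num::Wrapping;

// Hypothetical error variant, for illustration only.
#[derive(Debug, PartialEq)]
enum QueueError {
    InvalidAvailRingIndex,
}

// The number of new descriptor chain heads (idx - next_avail, computed with
// wrapping arithmetic) must not exceed the queue size; otherwise error out
// before constructing the iterator.
fn check_avail_idx(
    idx: Wrapping<u16>,
    next_avail: Wrapping<u16>,
    queue_size: u16,
) -> Result<(), QueueError> {
    if (idx - next_avail).0 > queue_size {
        return Err(QueueError::InvalidAvailRingIndex);
    }
    Ok(())
}

fn main() {
    assert_eq!(check_avail_idx(Wrapping(8), Wrapping(0), 16), Ok(()));
    assert_eq!(
        check_avail_idx(Wrapping(1000), Wrapping(0), 16),
        Err(QueueError::InvalidAvailRingIndex)
    );
}
```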
I don't think this comment, or the one in the commit message, is quite right. The VMM isn't the part that can cause the DoS; it's the guest.
Thank you for letting me know the message was misleading. That was not what I intended to say.
Both the comments and commit message are updated. PTAL.
Force-pushed from 0195667 to d37d46d
Thank you all for the feedback. All comments should be addressed. PTAL. A summary of changes:
Force-pushed from d37d46d to 7391d16
The number of descriptor chain heads to process should always be smaller than or equal to the queue size, as the driver should never ask the VMM to process an available ring entry more than once. Checking and reporting such incorrect driver behavior can prevent potential hanging and Denial-of-Service from happening on the VMM side. Signed-off-by: Bo Chen <chen.bo@intel.com>
This test ensures that constructing a descriptor chain iterator succeeds with valid available ring indexes while producing an error with invalid indexes. Signed-off-by: Bo Chen <chen.bo@intel.com>
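A hedged sketch of what such a regression test might look like; it exercises a stand-in check rather than the crate's real Queue/AvailIter API, whose exact test setup is not shown here:

```rust
use std::num::Wrapping;

// Stand-in for the validity check: the wrapping difference between the
// driver-published index and next_avail must not exceed the queue size.
fn is_valid_avail_idx(idx: Wrapping<u16>, next_avail: Wrapping<u16>, queue_size: u16) -> bool {
    (idx - next_avail).0 <= queue_size
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_invalid_avail_ring_index() {
        let queue_size = 16u16;
        // Valid: the driver published at most queue_size new entries.
        assert!(is_valid_avail_idx(Wrapping(16), Wrapping(0), queue_size));
        // Invalid: more new entries than the ring can hold.
        assert!(!is_valid_avail_idx(Wrapping(17), Wrapping(0), queue_size));
        // An index that wraps past u16::MAX is still handled correctly.
        assert!(is_valid_avail_idx(Wrapping(5), Wrapping(65_530), queue_size));
    }
}
```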
The number of descriptor chain heads to process should always be smaller than or equal to the queue size, as the driver should never ask the VMM to process an available ring entry more than once. Checking and reporting such incorrect driver behavior can prevent potential hanging and Denial-of-Service from happening on the VMM side. Issue reported in rust-vmm/vm-virtio and fixed in rust-vmm/vm-virtio#196. Signed-off-by: Diana Popa <dpopa@amazon.com> Co-authored-by: Bo Chen <chen.bo@intel.com>
The number of descriptor chain heads to process should always be smaller than or equal to the queue size, as the driver should never ask the VMM to process an available ring entry more than once. Checking and reporting such incorrect driver behavior can prevent potential hanging and Denial-of-Service from happening on the VMM side.
Signed-off-by: Bo Chen <chen.bo@intel.com>
Summary of the PR
Please summarize here why the changes in this PR are needed.
Requirements
Before submitting your PR, please make sure you addressed the following requirements:
- All commits in this PR are signed (git commit -s), and the commit message has max 60 characters for the summary and max 75 characters for each description line.
- All added/changed functionality is covered by a unit/integration test.
- Any newly added unsafe code is properly documented.