Merge pull request #2 from torvalds/master #87

Closed
Conversation

@fbigun fbigun commented Apr 2, 2014

Merge the author's updates

@fbigun fbigun closed this Apr 2, 2014
liubogithub pushed a commit to liubogithub/btrfs-work that referenced this pull request Apr 2, 2014
Isolated balloon pages can wrongly end up in LRU lists when
migrate_pages() finishes its round without draining the whole isolated
page list.

The same issue can happen when reclaim_clean_pages_from_list() tries to
reclaim pages from an isolated page list, before migration, in the CMA
path.  Such a balloon page leak opens a race window against the LRU list
shrinkers, which leads to the following kernel panic:

  BUG: unable to handle kernel NULL pointer dereference at 0000000000000028
  IP: [<ffffffff810c2625>] shrink_page_list+0x24e/0x897
  PGD 3cda2067 PUD 3d713067 PMD 0
  Oops: 0000 [#1] SMP
  CPU: 0 PID: 340 Comm: kswapd0 Not tainted 3.12.0-rc1-22626-g4367597 #87
  Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
  RIP: shrink_page_list+0x24e/0x897
  RSP: 0000:ffff88003da499b8  EFLAGS: 00010286
  RAX: 0000000000000000 RBX: ffff88003e82bd60 RCX: 00000000000657d5
  RDX: 0000000000000000 RSI: 000000000000031f RDI: ffff88003e82bd40
  RBP: ffff88003da49ab0 R08: 0000000000000001 R09: 0000000081121a45
  R10: ffffffff81121a45 R11: ffff88003c4a9a28 R12: ffff88003e82bd40
  R13: ffff88003da0e800 R14: 0000000000000001 R15: ffff88003da49d58
  FS:  0000000000000000(0000) GS:ffff88003fc00000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00000000067d9000 CR3: 000000003ace5000 CR4: 00000000000407b0
  Call Trace:
    shrink_inactive_list+0x240/0x3de
    shrink_lruvec+0x3e0/0x566
    __shrink_zone+0x94/0x178
    shrink_zone+0x3a/0x82
    balance_pgdat+0x32a/0x4c2
    kswapd+0x2f0/0x372
    kthread+0xa2/0xaa
    ret_from_fork+0x7c/0xb0
  Code: 80 7d 8f 01 48 83 95 68 ff ff ff 00 4c 89 e7 e8 5a 7b 00 00 48 85 c0 49 89 c5 75 08 80 7d 8f 00 74 3e eb 31 48 8b 80 18 01 00 00 <48> 8b 74 0d 48 8b 78 30 be 02 00 00 00 ff d2 eb
  RIP  [<ffffffff810c2625>] shrink_page_list+0x24e/0x897
   RSP <ffff88003da499b8>
  CR2: 0000000000000028
  ---[ end trace 703d2451af6ffbfd ]---
  Kernel panic - not syncing: Fatal exception

This patch fixes the issue by ensuring the proper tests are made in
putback_movable_pages() and reclaim_clean_pages_from_list(), so that
isolated balloon pages are not wrongly reinserted into LRU lists.

[akpm@linux-foundation.org: clarify awkward comment text]
Signed-off-by: Rafael Aquini <aquini@redhat.com>
Reported-by: Luiz Capitulino <lcapitulino@redhat.com>
Tested-by: Luiz Capitulino <lcapitulino@redhat.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Rik van Riel <riel@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
liubogithub pushed a commit to liubogithub/btrfs-work that referenced this pull request Apr 2, 2014
Turn it into (for example):

[    0.073380] x86: Booting SMP configuration:
[    0.074005] .... node   #0, CPUs:          #1   #2   #3   #4   #5   #6   #7
[    0.603005] .... node   #1, CPUs:     #8   #9  #10  #11  #12  #13  #14  #15
[    1.200005] .... node   #2, CPUs:    #16  #17  #18  #19  #20  #21  #22  #23
[    1.796005] .... node   #3, CPUs:    #24  #25  #26  #27  #28  #29  #30  #31
[    2.393005] .... node   #4, CPUs:    #32  #33  #34  #35  #36  #37  #38  #39
[    2.996005] .... node   #5, CPUs:    #40  #41  #42  #43  #44  #45  #46  #47
[    3.600005] .... node   #6, CPUs:    #48  #49  #50  #51  #52  #53  #54  #55
[    4.202005] .... node   #7, CPUs:    #56  #57  #58  #59  #60  #61  #62  #63
[    4.811005] .... node   #8, CPUs:    #64  #65  #66  #67  #68  #69  #70  #71
[    5.421006] .... node   #9, CPUs:    #72  #73  #74  #75  #76  #77  #78  #79
[    6.032005] .... node  #10, CPUs:    #80  #81  #82  #83  #84  #85  #86  #87
[    6.648006] .... node  #11, CPUs:    #88  #89  #90  #91  #92  #93  #94  #95
[    7.262005] .... node  #12, CPUs:    #96  #97  #98  #99 #100 #101 #102 #103
[    7.865005] .... node  #13, CPUs:   #104 #105 #106 #107 #108 #109 #110 #111
[    8.466005] .... node  #14, CPUs:   #112 #113 #114 #115 #116 #117 #118 #119
[    9.073006] .... node  #15, CPUs:   #120 #121 #122 #123 #124 #125 #126 #127
[    9.679901] x86: Booted up 16 nodes, 128 CPUs

and drop useless elements.

Change num_digits() to hpa's division-avoiding, cell-phone-typed
version, which he went to great lengths and pains to submit on a
Saturday evening.
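
For illustration, a division-avoiding digit count might look like the sketch below. It is a Python model written for this note, not the kernel's C code: the name mirrors num_digits() from the commit message, and counting the minus sign as one extra column is an assumption.

```python
def num_digits(val: int) -> int:
    """Count the columns needed to print val in decimal, without dividing."""
    digits = 1
    if val < 0:
        # Assumption: a leading minus sign occupies one column.
        digits += 1
        val = -val
    limit = 10
    # Grow the power of ten by multiplication until it exceeds val,
    # instead of repeatedly dividing val by ten.
    while val >= limit:
        limit *= 10
        digits += 1
    return digits
```

A caller could use such a helper to pick a column width, for instance `num_digits(127)` for the highest CPU number in the listing above.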

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: huawei.libin@huawei.com
Cc: wangyijing@huawei.com
Cc: fenghua.yu@intel.com
Cc: guohanjun@huawei.com
Cc: paul.gortmaker@windriver.com
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20130930095624.GB16383@pd.tnic
Signed-off-by: Ingo Molnar <mingo@kernel.org>
gregnietsky pushed a commit to Distrotech/linux that referenced this pull request Apr 9, 2014
commit 117aad1 upstream.

Isolated balloon pages can wrongly end up in LRU lists when
migrate_pages() finishes its round without draining the whole isolated
page list.

The same issue can happen when reclaim_clean_pages_from_list() tries to
reclaim pages from an isolated page list, before migration, in the CMA
path.  Such a balloon page leak opens a race window against the LRU list
shrinkers, which leads to the following kernel panic:

  BUG: unable to handle kernel NULL pointer dereference at 0000000000000028
  IP: [<ffffffff810c2625>] shrink_page_list+0x24e/0x897
  PGD 3cda2067 PUD 3d713067 PMD 0
  Oops: 0000 [#1] SMP
  CPU: 0 PID: 340 Comm: kswapd0 Not tainted 3.12.0-rc1-22626-g4367597 #87
  Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
  RIP: shrink_page_list+0x24e/0x897
  RSP: 0000:ffff88003da499b8  EFLAGS: 00010286
  RAX: 0000000000000000 RBX: ffff88003e82bd60 RCX: 00000000000657d5
  RDX: 0000000000000000 RSI: 000000000000031f RDI: ffff88003e82bd40
  RBP: ffff88003da49ab0 R08: 0000000000000001 R09: 0000000081121a45
  R10: ffffffff81121a45 R11: ffff88003c4a9a28 R12: ffff88003e82bd40
  R13: ffff88003da0e800 R14: 0000000000000001 R15: ffff88003da49d58
  FS:  0000000000000000(0000) GS:ffff88003fc00000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00000000067d9000 CR3: 000000003ace5000 CR4: 00000000000407b0
  Call Trace:
    shrink_inactive_list+0x240/0x3de
    shrink_lruvec+0x3e0/0x566
    __shrink_zone+0x94/0x178
    shrink_zone+0x3a/0x82
    balance_pgdat+0x32a/0x4c2
    kswapd+0x2f0/0x372
    kthread+0xa2/0xaa
    ret_from_fork+0x7c/0xb0
  Code: 80 7d 8f 01 48 83 95 68 ff ff ff 00 4c 89 e7 e8 5a 7b 00 00 48 85 c0 49 89 c5 75 08 80 7d 8f 00 74 3e eb 31 48 8b 80 18 01 00 00 <48> 8b 74 0d 48 8b 78 30 be 02 00 00 00 ff d2 eb
  RIP  [<ffffffff810c2625>] shrink_page_list+0x24e/0x897
   RSP <ffff88003da499b8>
  CR2: 0000000000000028
  ---[ end trace 703d2451af6ffbfd ]---
  Kernel panic - not syncing: Fatal exception

This patch fixes the issue by ensuring the proper tests are made in
putback_movable_pages() and reclaim_clean_pages_from_list(), so that
isolated balloon pages are not wrongly reinserted into LRU lists.

[akpm@linux-foundation.org: clarify awkward comment text]
Signed-off-by: Rafael Aquini <aquini@redhat.com>
Reported-by: Luiz Capitulino <lcapitulino@redhat.com>
Tested-by: Luiz Capitulino <lcapitulino@redhat.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Rik van Riel <riel@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
gregnietsky pushed a commit to Distrotech/linux that referenced this pull request Apr 9, 2014
dongsupark pushed a commit to dongsupark/linux that referenced this pull request Dec 24, 2014
Replace bio_for_each_segment_all() with bio_for_each_page_all()
in __bio_unmap_user() as well. Follow-up to commit 160e570ce.
Without this fix, page_cache_release() sometimes causes the kernel to
crash like this:

BUG: Bad page state in process kworker/1:7  pfn:74322 page:ffffea0001d0c880
count:0 mapcount:1 mapping:          (null) index:0x7f0108bfb
flags: page dumped because: nonzero mapcount
CPU: 1 PID: 3795 Comm: kworker/1:7 Tainted: G W 3.19.0-rc1+ #87
Workqueue: events bio_dirty_fn
 ffffffff81c53428 ffff880077e5bbf8 ffffffff818004a6 ffff88007d1cf278
 ffffea0001d0c880 ffff880077e5bc28 ffffffff811b114c ffff880077e5bc28
 ffffea0001d0c880 ffffea0001d0c880 0000000000000000 ffff880077e5bc88
Call Trace:
 [<ffffffff818004a6>] dump_stack+0x4c/0x65
 [<ffffffff811b114c>] bad_page.part.57+0xbc/0x110
 [<ffffffff811b1d11>] free_pages_prepare+0x231/0x400
 [<ffffffff811b47f5>] free_hot_cold_page+0x35/0x1b0
 [<ffffffff811bc4b6>] ? __page_cache_release+0xf6/0x170
 [<ffffffff811bd1ef>] put_page+0x3f/0x70
 [<ffffffff814e5056>] bio_dirty_fn+0x76/0xa0
 [<ffffffff8109bd88>] process_one_work+0x1e8/0x7f0
 [<ffffffff8109bcfd>] ? process_one_work+0x15d/0x7f0
 [<ffffffff8109c48b>] ? worker_thread+0xfb/0x4b0
 [<ffffffff8109c3fb>] worker_thread+0x6b/0x4b0
 [<ffffffff8109c390>] ? process_one_work+0x7f0/0x7f0
 [<ffffffff810a232d>] kthread+0x10d/0x130
 [<ffffffff810d509d>] ? trace_hardirqs_on+0xd/0x10
 [<ffffffff810a2220>] ? kthread_create_on_node+0x230/0x230
 [<ffffffff8180a1ec>] ret_from_fork+0x7c/0xb0
 [<ffffffff810a2220>] ? kthread_create_on_node+0x230/0x230

Signed-off-by: Dongsu Park <dongsu.park@profitbricks.com>
dongsupark pushed a commit to dongsupark/linux that referenced this pull request Dec 29, 2014
dongsupark pushed a commit to dongsupark/linux that referenced this pull request Jan 12, 2015
hzhuang1 pushed a commit to hzhuang1/linux that referenced this pull request Jul 6, 2015
Revert "enbable vblank event and reserve 128MB memory for graphic usage"
0day-ci pushed a commit to 0day-ci/linux that referenced this pull request Apr 15, 2016
This adds test cases mostly around ARG_PTR_TO_RAW_STACK to check the
verifier behaviour.

  [...]
  #84 raw_stack: no skb_load_bytes OK
  #85 raw_stack: skb_load_bytes, no init OK
  #86 raw_stack: skb_load_bytes, init OK
  #87 raw_stack: skb_load_bytes, spilled regs around bounds OK
  #88 raw_stack: skb_load_bytes, spilled regs corruption OK
  #89 raw_stack: skb_load_bytes, spilled regs corruption 2 OK
  #90 raw_stack: skb_load_bytes, spilled regs + data OK
  #91 raw_stack: skb_load_bytes, invalid access 1 OK
  #92 raw_stack: skb_load_bytes, invalid access 2 OK
  #93 raw_stack: skb_load_bytes, invalid access 3 OK
  #94 raw_stack: skb_load_bytes, invalid access 4 OK
  #95 raw_stack: skb_load_bytes, invalid access 5 OK
  #96 raw_stack: skb_load_bytes, invalid access 6 OK
  #97 raw_stack: skb_load_bytes, large access OK
  Summary: 98 PASSED, 0 FAILED

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
0day-ci pushed a commit to 0day-ci/linux that referenced this pull request Jun 8, 2016
The police action uses its own code to initialize the tcf hash
info, which made us forget to initialize a->hinfo correctly.
Fix this by calling the helper function tcf_hash_create() directly.

This patch fixed the following crash:

 BUG: unable to handle kernel NULL pointer dereference at 0000000000000028
 IP: [<ffffffff810c099f>] __lock_acquire+0xd3/0xf91
 PGD d3c34067 PUD d3e18067 PMD 0
 Oops: 0000 [#1] SMP
 CPU: 2 PID: 853 Comm: tc Not tainted 4.6.0+ #87
 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
 task: ffff8800d3e28040 ti: ffff8800d3f6c000 task.ti: ffff8800d3f6c000
 RIP: 0010:[<ffffffff810c099f>]  [<ffffffff810c099f>] __lock_acquire+0xd3/0xf91
 RSP: 0000:ffff88011b203c80  EFLAGS: 00010002
 RAX: 0000000000000046 RBX: 0000000000000000 RCX: 0000000000000000
 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000028
 RBP: ffff88011b203d40 R08: 0000000000000001 R09: 0000000000000000
 R10: ffff88011b203d58 R11: ffff88011b208000 R12: 0000000000000001
 R13: ffff8800d3e28040 R14: 0000000000000028 R15: 0000000000000000
 FS:  0000000000000000(0000) GS:ffff88011b200000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 0000000000000028 CR3: 00000000d4be1000 CR4: 00000000000006e0
 Stack:
  ffff8800d3e289c0 0000000000000046 000000001b203d60 ffffffff00000000
  0000000000000000 ffff880000000000 0000000000000000 ffffffff00000000
  ffffffff8187142c ffff88011b203ce8 ffff88011b203ce8 ffffffff8101dbfc
 Call Trace:
  <IRQ>
  [<ffffffff8187142c>] ? __tcf_hash_release+0x77/0xd1
  [<ffffffff8101dbfc>] ? native_sched_clock+0x1a/0x35
  [<ffffffff8101dbfc>] ? native_sched_clock+0x1a/0x35
  [<ffffffff810a9604>] ? sched_clock_local+0x11/0x78
  [<ffffffff810bf6a1>] ? mark_lock+0x24/0x201
  [<ffffffff810c1dbd>] lock_acquire+0x120/0x1b4
  [<ffffffff810c1dbd>] ? lock_acquire+0x120/0x1b4
  [<ffffffff8187142c>] ? __tcf_hash_release+0x77/0xd1
  [<ffffffff81aad89f>] _raw_spin_lock_bh+0x3c/0x72
  [<ffffffff8187142c>] ? __tcf_hash_release+0x77/0xd1
  [<ffffffff8187142c>] __tcf_hash_release+0x77/0xd1
  [<ffffffff81871a27>] tcf_action_destroy+0x49/0x7c
  [<ffffffff81870b1c>] tcf_exts_destroy+0x20/0x2d
  [<ffffffff8189273b>] u32_destroy_key+0x1b/0x4d
  [<ffffffff81892788>] u32_delete_key_freepf_rcu+0x1b/0x1d
  [<ffffffff810de3b8>] rcu_process_callbacks+0x610/0x82e
  [<ffffffff8189276d>] ? u32_destroy_key+0x4d/0x4d
  [<ffffffff81ab0bc1>] __do_softirq+0x191/0x3f4

Fixes: ddf97cc ("net_sched: add network namespace support for tc actions")
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
0day-ci pushed a commit to 0day-ci/linux that referenced this pull request Aug 12, 2016
I got this:

    ================================================================================
    UBSAN: Undefined behaviour in ./include/linux/log2.h:63:13
    shift exponent 64 is too large for 64-bit type 'long unsigned int'
    CPU: 1 PID: 721 Comm: kworker/1:1 Not tainted 4.8.0-rc1+ #87
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2014
    Workqueue: events rht_deferred_worker
     0000000000000000 ffff88011661f8d8 ffffffff82344f50 0000000041b58ab3
     ffffffff84f98000 ffffffff82344ea4 ffff88011661f900 ffff88011661f8b0
     0000000000000001 ffff88011661f6b8 dffffc0000000000 ffffffff867f7640
    Call Trace:
     [<ffffffff82344f50>] dump_stack+0xac/0xfc
     [<ffffffff82344ea4>] ? _atomic_dec_and_lock+0xc4/0xc4
     [<ffffffff8242f5b8>] ubsan_epilogue+0xd/0x8a
     [<ffffffff82430c41>] __ubsan_handle_shift_out_of_bounds+0x255/0x29a
     [<ffffffff824309ec>] ? __ubsan_handle_out_of_bounds+0x180/0x180
     [<ffffffff84003436>] ? nl80211_req_set_reg+0x256/0x2f0
     [<ffffffff812112ba>] ? print_context_stack+0x8a/0x160
     [<ffffffff81200031>] ? amd_pmu_reset+0x341/0x380
     [<ffffffff823af808>] rht_deferred_worker+0x1618/0x1790
     [<ffffffff823af808>] ? rht_deferred_worker+0x1618/0x1790
     [<ffffffff823ae1f0>] ? rhashtable_jhash2+0x370/0x370
     [<ffffffff8134c12d>] ? process_one_work+0x6fd/0x1970
     [<ffffffff8134c1cf>] process_one_work+0x79f/0x1970
     [<ffffffff8134c12d>] ? process_one_work+0x6fd/0x1970
     [<ffffffff8134ba30>] ? try_to_grab_pending+0x4c0/0x4c0
     [<ffffffff8134d564>] ? worker_thread+0x1c4/0x1340
     [<ffffffff8134d8ff>] worker_thread+0x55f/0x1340
     [<ffffffff845e904f>] ? __schedule+0x4df/0x1d40
     [<ffffffff8134d3a0>] ? process_one_work+0x1970/0x1970
     [<ffffffff8134d3a0>] ? process_one_work+0x1970/0x1970
     [<ffffffff813642f7>] kthread+0x237/0x390
     [<ffffffff813640c0>] ? __kthread_parkme+0x280/0x280
     [<ffffffff845f8c93>] ? _raw_spin_unlock_irq+0x33/0x50
     [<ffffffff845f95df>] ret_from_fork+0x1f/0x40
     [<ffffffff813640c0>] ? __kthread_parkme+0x280/0x280
    ================================================================================

roundup_pow_of_two() is undefined when called with an argument of 0, so
let's avoid the call and just fall back to ht->p.min_size (which should
never be smaller than HASH_MIN_SIZE).

Cc: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
0day-ci pushed a commit to 0day-ci/linux that referenced this pull request Aug 16, 2016
I got this:

    ================================================================================
    UBSAN: Undefined behaviour in ./include/linux/log2.h:63:13
    shift exponent 64 is too large for 64-bit type 'long unsigned int'
    CPU: 1 PID: 721 Comm: kworker/1:1 Not tainted 4.8.0-rc1+ #87
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2014
    Workqueue: events rht_deferred_worker
     0000000000000000 ffff88011661f8d8 ffffffff82344f50 0000000041b58ab3
     ffffffff84f98000 ffffffff82344ea4 ffff88011661f900 ffff88011661f8b0
     0000000000000001 ffff88011661f6b8 dffffc0000000000 ffffffff867f7640
    Call Trace:
     [<ffffffff82344f50>] dump_stack+0xac/0xfc
     [<ffffffff82344ea4>] ? _atomic_dec_and_lock+0xc4/0xc4
     [<ffffffff8242f5b8>] ubsan_epilogue+0xd/0x8a
     [<ffffffff82430c41>] __ubsan_handle_shift_out_of_bounds+0x255/0x29a
     [<ffffffff824309ec>] ? __ubsan_handle_out_of_bounds+0x180/0x180
     [<ffffffff84003436>] ? nl80211_req_set_reg+0x256/0x2f0
     [<ffffffff812112ba>] ? print_context_stack+0x8a/0x160
     [<ffffffff81200031>] ? amd_pmu_reset+0x341/0x380
     [<ffffffff823af808>] rht_deferred_worker+0x1618/0x1790
     [<ffffffff823af808>] ? rht_deferred_worker+0x1618/0x1790
     [<ffffffff823ae1f0>] ? rhashtable_jhash2+0x370/0x370
     [<ffffffff8134c12d>] ? process_one_work+0x6fd/0x1970
     [<ffffffff8134c1cf>] process_one_work+0x79f/0x1970
     [<ffffffff8134c12d>] ? process_one_work+0x6fd/0x1970
     [<ffffffff8134ba30>] ? try_to_grab_pending+0x4c0/0x4c0
     [<ffffffff8134d564>] ? worker_thread+0x1c4/0x1340
     [<ffffffff8134d8ff>] worker_thread+0x55f/0x1340
     [<ffffffff845e904f>] ? __schedule+0x4df/0x1d40
     [<ffffffff8134d3a0>] ? process_one_work+0x1970/0x1970
     [<ffffffff8134d3a0>] ? process_one_work+0x1970/0x1970
     [<ffffffff813642f7>] kthread+0x237/0x390
     [<ffffffff813640c0>] ? __kthread_parkme+0x280/0x280
     [<ffffffff845f8c93>] ? _raw_spin_unlock_irq+0x33/0x50
     [<ffffffff845f95df>] ret_from_fork+0x1f/0x40
     [<ffffffff813640c0>] ? __kthread_parkme+0x280/0x280
    ================================================================================

roundup_pow_of_two() is undefined when called with an argument of 0, so
let's avoid the call and just fall back to ht->p.min_size (which should
never be smaller than HASH_MIN_SIZE).

Cc: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
0day-ci pushed a commit to 0day-ci/linux that referenced this pull request Aug 17, 2016
GIT bbcbd8247afc239ed98f4e3453cbce6d5ba11e5f

commit 9012e89f6e669d52e68dbaf6aa4b7aa3daeec72a
Author: Guenter Roeck <groeck@chromium.org>
Date:   Mon Aug 15 06:15:35 2016 -0700

    extcon: Introduce EXTCON_PROP_USB_SUPERSPEED property
    
    EXTCON_PROP_USB_SUPERSPEED[1] is necessary to distinguish between USB/USB2
    and USB3 connections on USB Type-C cables.
    
    [1] https://en.wikipedia.org/wiki/USB#Overview
    
    Cc: Chris Zhong <zyw@rock-chips.com>
    Signed-off-by: Guenter Roeck <groeck@chromium.org>
    Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>

commit 45d339309f491058f5a3f974a17aa893f5a35329
Author: Wei Yongjun <weiyj.lk@gmail.com>
Date:   Mon Aug 15 22:51:48 2016 +0000

    net: mediatek: remove unnecessary platform_set_drvdata()
    
    The driver core clears the driver data to NULL after device_release
    or on probe failure. Thus, it is not needed to manually clear the
    device driver data to NULL.
    
    Signed-off-by: Wei Yongjun <weiyj.lk@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

commit 5288b6fff17369386e435b6780a00e3e3fd633de
Author: Wei Yongjun <weiyj.lk@gmail.com>
Date:   Mon Aug 15 22:51:29 2016 +0000

    net: thunderx: Remove unnecessary pci_set_drvdata()
    
    The driver core clears the driver data to NULL after device_release
    or on probe failure. Thus, it is not needed to manually clear the
    device driver data to NULL.
    
    Signed-off-by: Wei Yongjun <weiyj.lk@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

commit 6e22066fd02b675260b980b3e42b7d616a9839c5
Author: Wei Yongjun <weiyj.lk@gmail.com>
Date:   Mon Aug 15 22:51:04 2016 +0000

    net: ena: Fix error return code in ena_device_init()
    
    Fix to return a negative error code from the invalid dma width
    error handling case instead of 0.
    
    Signed-off-by: Wei Yongjun <weiyj.lk@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

commit 557bc7d44d52d52374bc72e9cc3b0beb41026886
Author: Wei Yongjun <weiyj.lk@gmail.com>
Date:   Mon Aug 15 22:50:34 2016 +0000

    net: ena: Remove unnecessary pci_set_drvdata()
    
    The driver core clears the driver data to NULL after device_release
    or on probe failure. Thus, it is not needed to manually clear the
    device driver data to NULL.
    
    Signed-off-by: Wei Yongjun <weiyj.lk@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

commit 64721094b9a966c8e02d2b03729e5093471f9d1c
Author: Wei Yongjun <weiyj.lk@gmail.com>
Date:   Mon Aug 15 22:34:57 2016 +0000

    net: phy: Fix return value check in xgmiitorgmii_probe()
    
    In case of error, the function of_parse_phandle() returns NULL
    pointer not ERR_PTR(). The IS_ERR() test in the return value check
    should be replaced with NULL test.
    
    Signed-off-by: Wei Yongjun <weiyj.lk@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
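
    Why an IS_ERR() test cannot catch a NULL return can be shown with a
    small model. This is a Python sketch under stated assumptions, not
    kernel code: pointers are modeled as 64-bit integers, error pointers
    as the top 4095 values of the address space, and `is_err` and
    `err_ptr` are illustrative stand-ins for IS_ERR() and ERR_PTR().

```python
PTR_BITS = 64      # assumption: 64-bit pointers
MAX_ERRNO = 4095   # largest errno value encodable in an error pointer
NULL = 0
ENODEV = 19


def err_ptr(errno: int) -> int:
    """Encode a positive errno as a pointer-sized value near the top of memory."""
    return (-errno) & ((1 << PTR_BITS) - 1)


def is_err(ptr: int) -> bool:
    """True only for values in the last MAX_ERRNO addresses."""
    return ptr >= (1 << PTR_BITS) - MAX_ERRNO


def lookup_failed(ptr: int) -> bool:
    """Correct check for a function that reports failure by returning NULL."""
    return ptr == NULL
```

    In this model `is_err(NULL)` is false, so a failed lookup that
    returns NULL slips past an IS_ERR()-style check, while
    `lookup_failed(NULL)` catches it.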

commit d18a33c67119ccbc1963b2c8da19e4ee0c9904f3
Author: Kevin Hilman <khilman@baylibre.com>
Date:   Thu Jun 23 12:04:11 2016 -0700

    MMC: meson: initial support for GXBB platforms
    
    Initial support for the SD/eMMC controller in the Amlogic S905/GXBB
    family of SoCs.
    
    Currently working for the SD and eMMC interfaces, but not yet tested
    for SDIO.
    
    Signed-off-by: Kevin Hilman <khilman@baylibre.com>

commit fa91f6910dc6809bd0f68de119a43293ddd15330
Author: Kevin Hilman <khilman@baylibre.com>
Date:   Thu Jun 23 12:01:23 2016 -0700

    ARM64: dts: meson-gxbb: add MMC support
    
    Add binding and basic support for the SD/eMMC controller on Amlogic
    S905/GXBB devices.
    
    Signed-off-by: Kevin Hilman <khilman@baylibre.com>

commit 7e6a3a1d79fe9bb9b9daacb59e9b5c35f71570f5
Author: Stephen Boyd <sboyd@codeaurora.org>
Date:   Mon Aug 15 16:09:04 2016 -0700

    clk: qcom: Sort Makefile alphabetically
    
    We've started getting out of order, fix it.
    
    Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>

commit 3e99c7ab16af52be826cfd4a8be5b525dd38ccc9
Author: Neil Armstrong <narmstrong@baylibre.com>
Date:   Thu Aug 11 14:48:05 2016 +0200

    dt-bindings: clock: Update bindings for MDM9615 GCC and LCC
    
    Acked-by: Rob Herring <robh@kernel.org>
    Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
    Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>

commit b732ace40a1d5ea643ee9c28116e829ae950fe8f
Author: Bhaktipriya Shridhar <bhaktipriya96@gmail.com>
Date:   Sat Aug 13 22:06:33 2016 +0530

    power: ds2760_battery: Remove deprecated create_singlethread_workqueue
    
    alloc_ordered_workqueue() with WQ_MEM_RECLAIM set replaces
    deprecated create_singlethread_workqueue(). This is the identity
    conversion.
    
    The workqueue "monitor_wqueue" is used to monitor the battery
    status. It has been identity converted.
    
    It queues multiple work items viz &di->monitor_work,
    &di->set_charged_work, which require execution ordering.
    Hence, alloc_workqueue has been used to replace the
    deprecated create_singlethread_workqueue instance.
    
    WQ_MEM_RECLAIM flag has been set to ensure forward progress under
    memory pressure.
    
    Signed-off-by: Bhaktipriya Shridhar <bhaktipriya96@gmail.com>
    Signed-off-by: Sebastian Reichel <sre@kernel.org>

commit 1c53f3709cbc0da9fbf83bb10b2e3633ade05875
Author: Bhaktipriya Shridhar <bhaktipriya96@gmail.com>
Date:   Sat Aug 13 22:05:00 2016 +0530

    power: ab8500_fg: Remove deprecated create_singlethread_workqueue
    
    alloc_ordered_workqueue() with WQ_MEM_RECLAIM set replaces
    deprecated create_singlethread_workqueue(). This is the identity
    conversion.
    
    The workqueue "fg_wq" is used for running the FG algorithm periodically.
    It has been identity converted.
    
    It has multiple work items viz fg_periodic_work, fg_low_bat_work,
    fg_reinit_work, fg_work, fg_acc_cur_work and fg_check_hw_failure_work,
    which require execution ordering. Hence, a dedicated ordered workqueue
    has been used here.
    
    The WQ_MEM_RECLAIM flag has been set to guarantee forward progress under
    memory pressure.
    
    Signed-off-by: Bhaktipriya Shridhar <bhaktipriya96@gmail.com>
    Signed-off-by: Sebastian Reichel <sre@kernel.org>

commit 829f0e97cc03b0c834cc5f86ff72b2ec389020ef
Author: Bhaktipriya Shridhar <bhaktipriya96@gmail.com>
Date:   Sat Aug 13 22:03:53 2016 +0530

    power: ipaq_micro_battery: Remove deprecated create_singlethread_workqueue
    
    The workqueue "wq" is used for handling battery related tasks.
    
    It has a single work item viz &mb->update and hence it doesn't require
    execution ordering. Hence, alloc_workqueue has been used to replace the
    deprecated create_singlethread_workqueue instance.
    
    The WQ_MEM_RECLAIM flag has been set to ensure forward progress under
    memory pressure.
    
    Since there is a single work item, explicit concurrency
    limit is unnecessary here.
    
    Signed-off-by: Bhaktipriya Shridhar <bhaktipriya96@gmail.com>
    Signed-off-by: Sebastian Reichel <sre@kernel.org>

commit 9df82628265857e1c491c1ca0ace353e658457ea
Author: Bhaktipriya Shridhar <bhaktipriya96@gmail.com>
Date:   Sat Aug 13 21:51:23 2016 +0530

    power: ab8500_charger: Remove deprecated create_singlethread_workqueue
    
    alloc_ordered_workqueue() with WQ_MEM_RECLAIM set replaces
    deprecated create_singlethread_workqueue(). This is the identity
    conversion.
    
    The workqueue "charger_wq" is used for the IRQs and checking HW state of
    the charger. It has been identity converted.
    
    It has multiple work items viz usb_charger_attached_work, kick_wd_work,
    check_vbat_work, check_hw_failure_work, usb_charger_attached_work,
    ac_work, ac_charger_attached_work, attach_work and check_usbchgnotok_work,
    which require execution ordering. Hence, a dedicated ordered workqueue
    has been used here.
    
    The WQ_MEM_RECLAIM flag has also been set to ensure
    forward progress under memory pressure.
    
    Signed-off-by: Bhaktipriya Shridhar <bhaktipriya96@gmail.com>
    Signed-off-by: Sebastian Reichel <sre@kernel.org>

commit 87f818b35c3007d1014541c07688ab29443a22c2
Author: Bhaktipriya Shridhar <bhaktipriya96@gmail.com>
Date:   Sat Aug 13 21:50:11 2016 +0530

    power: intel_mid_battery: Remove deprecated create_singlethread_workqueue
    
    The workqueue "monitor_wqueue" is used to monitor the PMIC battery status.
    It queues a single work item (pbi->monitor_battery) and hence doesn't
    require ordering. Hence, alloc_workqueue has been used to replace the
    deprecated create_singlethread_workqueue instance.
    
    Since PMIC battery status needs to be monitored for any change, the
    WQ_MEM_RECLAIM flag has been set to ensure forward progress under memory
    pressure.
    
    Since there is a single work item, explicit concurrency
    limit is unnecessary here.
    
    Signed-off-by: Bhaktipriya Shridhar <bhaktipriya96@gmail.com>
    Signed-off-by: Sebastian Reichel <sre@kernel.org>

commit d8a69251fb58a756e2dd00cc5c7f1b54d383e203
Author: Bhaktipriya Shridhar <bhaktipriya96@gmail.com>
Date:   Sat Aug 13 21:48:43 2016 +0530

    power: pm2301_charger: Remove deprecated create_singlethread_workqueue
    
    alloc_ordered_workqueue() with WQ_MEM_RECLAIM set replaces
    deprecated create_singlethread_workqueue(). This is the identity
    conversion.
    
    The workqueue "charger_wq" is used for running all the charger related
    tasks. This involves charger detection, checking for HW failure and HW
    status. This workqueue has been identity converted.
    
    It queues multiple workitems viz &pm2->check_main_thermal_prot_work,
    &pm2->check_hw_failure_work, &pm2->ac_work. Hence, the deprecated
    create_singlethread_workqueue() instance has been replaced with a
    dedicated ordered workqueue.
    
    The WQ_MEM_RECLAIM flag has been set to ensure forward progress under
    memory pressure.
    
    Signed-off-by: Bhaktipriya Shridhar <bhaktipriya96@gmail.com>
    Signed-off-by: Sebastian Reichel <sre@kernel.org>

commit a8dd5b6868dd8ecb79f741dd94ac25a46ac19ba4
Author: Bhaktipriya Shridhar <bhaktipriya96@gmail.com>
Date:   Sat Aug 13 21:47:35 2016 +0530

    power: ab8500_btemp: Remove deprecated create_singlethread_workqueue
    
    The workqueue "btemp_wq" is used for measuring the temperature
    periodically. It queues a single workitem (btemp_periodic_work) and
    hence doesn't require ordering. Thus, the deprecated
    create_singlethread_workqueue() instance has been replaced with
    alloc_workqueue().
    
    The WQ_MEM_RECLAIM flag has been set to ensure forward progress under
    memory pressure.
    
    Since there is a single work item, explicit concurrency
    limit is unnecessary here.
    
    Signed-off-by: Bhaktipriya Shridhar <bhaktipriya96@gmail.com>
    Signed-off-by: Sebastian Reichel <sre@kernel.org>

commit 0b9992f76f65532be8727977bd6997aa55e1340e
Author: Bhaktipriya Shridhar <bhaktipriya96@gmail.com>
Date:   Sat Aug 13 21:46:10 2016 +0530

    power: abx500_chargalg: Remove deprecated create_singlethread_workqueue
    
    alloc_ordered_workqueue() with WQ_MEM_RECLAIM set replaces
    deprecated create_singlethread_workqueue(). This is the identity
    conversion.
    
    The workqueue "chargalg_wq" is used for running the charging algorithm.
    It has multiple workitems viz &di->chargalg_periodic_work,
    &di->chargalg_wd_work, &di->chargalg_work per abx500_chargalg, which
    require ordering. It has been identity converted.
    
    Also, WQ_MEM_RECLAIM has been set to ensure forward progress under
    memory pressure.
    
    Signed-off-by: Bhaktipriya Shridhar <bhaktipriya96@gmail.com>
    Signed-off-by: Sebastian Reichel <sre@kernel.org>

commit 7792a8d6713c33758636c252bd6ff7c8c001de12
Author: Neil Armstrong <narmstrong@baylibre.com>
Date:   Thu Aug 11 14:48:04 2016 +0200

    clk: mdm9615: Add support for MDM9615 Clock Controllers
    
    In order to support the Qualcomm MDM9615 SoC, add support for
    the Global and LPASS Clock Controllers.
    
    Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
    Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>

commit f7508fedd85e14a9d3c6d4e457231e691618b208
Author: Neil Armstrong <narmstrong@baylibre.com>
Date:   Thu Aug 11 14:48:03 2016 +0200

    dt-bindings: Add MDM9615 DT bindings include files for GCC and LCC
    
    Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
    Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>

commit 33608dcd01d0c0eb3f2442d88c8a97f1195bd2d5
Author: Kevin Hilman <khilman@baylibre.com>
Date:   Tue Aug 2 14:40:11 2016 -0700

    clk: gxbb: add MMC gate clocks, and expose for DT
    
    Add the SD/eMMC gate clocks and expose them for use by DT.
    
    While at it, also expose FCLK_DIV2 since this is one of the input
    clocks to the mux internal to each of the SD/eMMC blocks.
    
    Signed-off-by: Kevin Hilman <khilman@baylibre.com>
    Tested-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
    Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>

commit 5d87f493ddb1b86a0569fa3c4037fa9efc0c7183
Author: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Date:   Sun Aug 14 04:07:32 2016 +0200

    x86/power/64: Use __pa() for physical address computation
    
    The value of temp_level4_pgt is the physical address of the
    top-level page directory, so use __pa() to compute it.
    
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    Acked-by: Ingo Molnar <mingo@kernel.org>

commit 5a227cd1ab3693d36ac7a6f1fc4e21a7129f62f0
Author: Laxman Dewangan <ldewangan@nvidia.com>
Date:   Fri Jun 17 16:21:07 2016 +0530

    clk: max77686: Add support for MAX77620 clocks
    
    The Maxim MAX77620 has one 32 kHz clock output, and the clock HW
    IP used on this PMIC is the same as the one in the MAX77686.
    
    Add clock driver support for MAX77620 on the MAX77686 driver.
    
    CC: Krzysztof Kozlowski <k.kozlowski@samsung.com>
    CC: Javier Martinez Canillas <javier@dowhile0.org>
    Signed-off-by: Laxman Dewangan <ldewangan@nvidia.com>
    Tested-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
    Reviewed-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
    Reviewed-by: Javier Martinez Canillas <javier@osg.samsung.com>
    Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>

commit 2ee565c934b7aa3ad84dcc3735fb2359026866a0
Author: Wei Yongjun <weiyj.lk@gmail.com>
Date:   Sat Aug 13 09:07:07 2016 +0000

    power: axp288_charger: remove duplicated include from axp288_charger.c
    
    Remove duplicated include.
    
    Signed-off-by: Wei Yongjun <weiyj.lk@gmail.com>
    Acked-by: Chen-Yu Tsai <wens@csie.org>
    Signed-off-by: Sebastian Reichel <sre@kernel.org>

commit ad7656c75faebb43cf1102756c503668309d666f
Author: Wei Yongjun <weiyj.lk@gmail.com>
Date:   Sat Aug 13 09:06:47 2016 +0000

    power: axp288_fuel_gauge: remove duplicated include from axp288_fuel_gauge.c
    
    Remove duplicated include.
    
    Signed-off-by: Wei Yongjun <weiyj.lk@gmail.com>
    Acked-by: Chen-Yu Tsai <wens@csie.org>
    Signed-off-by: Sebastian Reichel <sre@kernel.org>

commit 1bbd3d282557cf5e544cc749d577bd7cefe929f4
Author: Wei Yongjun <weiyj.lk@gmail.com>
Date:   Sat Aug 13 09:06:22 2016 +0000

    power: z2_battery: remove .owner field for driver
    
    Remove .owner field if calls are used which set it automatically.
    
    Generated by: scripts/coccinelle/api/platform_no_drv_owner.cocci
    
    Signed-off-by: Wei Yongjun <weiyj.lk@gmail.com>
    Signed-off-by: Sebastian Reichel <sre@kernel.org>

commit e581245d8aaf897870afacd4aaf2bbce77cf1f1e
Author: Laxman Dewangan <ldewangan@nvidia.com>
Date:   Fri Jun 17 16:21:06 2016 +0530

    clk: max77686: Add DT binding details for PMIC MAX77620
    
    Maxim has used the same clock IP on multiple PMICs, such as the
    MAX77686, MAX77802 and MAX77620. The only difference is the number
    of clock outputs: the MAX77686 has three, the MAX77802 has two and
    the MAX77620 has one.
    
    Add clock binding details and DT example for the MAX77620.
    
    Signed-off-by: Laxman Dewangan <ldewangan@nvidia.com>
    CC: Krzysztof Kozlowski <k.kozlowski@samsung.com>
    CC: Javier Martinez Canillas <javier@dowhile0.org>
    Reviewed-by: Javier Martinez Canillas <javier@osg.samsung.com>
    Reviewed-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
    Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>

commit 24f668debcd9fe774668920c4b8152fab0b5732a
Author: Laxman Dewangan <ldewangan@nvidia.com>
Date:   Fri Jun 17 16:21:05 2016 +0530

    clk: Combine DT binding doc for max77686 and max77802
    
    The clock IP used on the Maxim PMICs max77686 and max77802 is the
    same. The configuration of the clock registers is also the same,
    except for the number of clocks.
    
    Define the common DT binding file for the clocks of Maxim PMICs
    MAX77686 and MAX77802. For this, remove the separate DT binding
    document file for maxim,max77802 and move all information to
    maxim,max77686 DT binding document.
    
    Signed-off-by: Laxman Dewangan <ldewangan@nvidia.com>
    CC: Krzysztof Kozlowski <k.kozlowski@samsung.com>
    CC: Javier Martinez Canillas <javier@dowhile0.org>
    Acked-by: Rob Herring <robh@kernel.org>
    Reviewed-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
    Reviewed-by: Javier Martinez Canillas <javier@osg.samsung.com>
    Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>

commit 8ad313fe4e0016bac5dc6a7fbb323b8551977bd9
Author: Laxman Dewangan <ldewangan@nvidia.com>
Date:   Fri Jun 17 16:21:04 2016 +0530

    clk: max77686: Combine Maxim max77686 and max77802 driver
    
    The clock IP used on the Maxim PMICs max77686 and max77802 is the
    same. The configuration of the clock registers is also the same,
    except for the number of clocks.
    
    As part of common code utilisation, there are 3 files for these chips'
    clock drivers: one for common code and two for driver registration.
    
    Combine both drivers into a single file and move the common code into
    that file. This removes 2 files and puts the max77686 and max77802
    clock drivers in a single file. This driver does not depend on the
    parent driver structure. The regmap handle is acquired through
    regmap APIs for the register access.
    
    Combining the drivers helps with adding clock drivers for other
    Maxim PMICs which have a similar clock IP, such as the MAX77620 and
    MAX20024.
    
    Signed-off-by: Laxman Dewangan <ldewangan@nvidia.com>
    CC: Krzysztof Kozlowski <k.kozlowski@samsung.com>
    CC: Javier Martinez Canillas <javier@dowhile0.org>
    Reviewed-by: Javier Martinez Canillas <javier@osg.samsung.com>
    Tested-by: Javier Martinez Canillas <javier@osg.samsung.com>
    Reviewed-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
    Tested-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
    Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>

commit 8dfdd2a8426b71bf85b400d5226c4c6e0fa21781
Author: Bjorn Andersson <bjorn.andersson@linaro.org>
Date:   Wed Aug 3 22:04:06 2016 -0700

    power: reset: syscon-reboot-mode: Use managed resource API
    
    Use the managed resource version of reboot_mode_register().
    
    Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
    Tested-by: John Stultz <john.stultz@linaro.org>
    Signed-off-by: Sebastian Reichel <sre@kernel.org>

commit c1a9634f1aaf5e10c23f8890ea2e64c61d48cb44
Author: Bjorn Andersson <bjorn.andersson@linaro.org>
Date:   Wed Aug 3 22:04:05 2016 -0700

    power: reset: reboot-mode: Add managed resource API
    
    Provide managed resource version of reboot_mode_register() and
    reboot_mode_unregister() to simplify implementations.
    
    Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
    Tested-by: John Stultz <john.stultz@linaro.org>
    Signed-off-by: Sebastian Reichel <sre@kernel.org>

commit d336e9a71eedb1970b81bc8c042334b70fd4ddf7
Author: Stephen Boyd <sboyd@codeaurora.org>
Date:   Fri Aug 12 18:50:23 2016 -0700

    clk: fixed-rate: Remove export symbol on setup function
    
    This function is only called by builtin code, but we always
    exported it and had marked it as __init before commit
    e4eda8e0654c (clk: remove exported function from __init section,
    2013-01-06) removed that marking. Given that it isn't used by
    modules, let's unexport it and add back __init.
    
    Cc: Denis Efremov <yefremov.denis@gmail.com>
    Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>

commit 1caadde436d230bc37b3caaaef850580c25010e6
Author: Stephen Boyd <sboyd@codeaurora.org>
Date:   Fri Aug 12 18:50:22 2016 -0700

    clk: fixed-factor: Remove export symbol on setup function
    
    This function is marked __init, so it can't possibly need to be
    exported to modules. Remove the export.
    
    Cc: Gregory CLEMENT <gregory.clement@free-electrons.com>
    Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>

commit f155d15b64e36b45ca89e3521fe0c1ccad5e5ff0
Author: Stephen Boyd <sboyd@codeaurora.org>
Date:   Mon Aug 15 14:32:23 2016 -0700

    clk: Return errors from clk providers in __of_clk_get_from_provider()
    
    Before commit 0861e5b8cf80 (clk: Add clk_hw OF clk providers,
    2016-02-05) __of_clk_get_from_provider() would return an error
    pointer of the provider's choosing if there was a provider
    registered and EPROBE_DEFER otherwise. After that commit, it
    would return EPROBE_DEFER regardless of whether or not the
    provider returned an error. This is odd and can lead to behavior
    where clk consumers keep probe deferring when they should be
    seeing some other error.
    
    Let's restore the previous behavior where we only return
    EPROBE_DEFER when there isn't a provider in our of_clk_providers
    list. Otherwise, return the error from the last provider we find
    that matches the node.
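    
    The restored lookup rule can be sketched as a small userspace model.
    This is not the kernel code: struct model_provider, model_clk_get()
    and the MODEL_* error values are hypothetical names chosen for
    illustration. It only shows which error ends up being returned.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative error codes; stand-ins for the kernel's own definitions. */
#define MODEL_EPROBE_DEFER 517
#define MODEL_EINVAL 22

/* Hypothetical stand-in for one registered clock provider. */
struct model_provider {
	int node;    /* device-tree node this provider serves */
	int result;  /* >= 0: clock handle, < 0: negative error code */
};

/*
 * Return the handle from the first matching provider that succeeds.
 * If providers match but all of them fail, return the error from the
 * last matching provider. Only when no provider matches the node at
 * all is -MODEL_EPROBE_DEFER returned.
 */
int model_clk_get(const struct model_provider *list, size_t n, int node)
{
	int ret = -MODEL_EPROBE_DEFER;
	size_t i;

	assert(list != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (list[i].node != node)
			continue;
		ret = list[i].result;
		if (ret >= 0)
			break;
	}
	return ret;
}
```

    In this model a consumer of a node whose provider fails sees that
    provider's error, while a consumer of a node with no provider yet
    is still asked to defer.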
    
    Reported-by: Masahiro Yamada <yamada.masahiro@socionext.com>
    Fixes: 0861e5b8cf80 ("clk: Add clk_hw OF clk providers")
    Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>

commit 00746f104420e79efb2dd30467a3546c790b30eb
Author: Wei Yongjun <weiyj.lk@gmail.com>
Date:   Mon Aug 8 13:55:20 2016 +0000

    clk: gxbb: use builtin_platform_driver to simplify the code
    
    Use the builtin_platform_driver() macro to make the code simpler.
    
    Signed-off-by: Wei Yongjun <weiyj.lk@gmail.com>
    Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>

commit 957cb720518341abf19009044f240a6342bdd039
Author: Joshua Clayton <stillcompiling@gmail.com>
Date:   Thu Aug 11 09:59:12 2016 -0700

    sbs-battery: add ability to get battery capacity
    
    Battery capacity level is a standard feature of SBS batteries
    that can be used to tell what the remaining battery capacity is. It
    can also tell if the battery has not been calibrated/initialized, which
    makes the capacity and charging/discharging percentages invalid.
    
    Signed-off-by: Joshua Clayton <stillcompiling@gmail.com>
    Signed-off-by: Sebastian Reichel <sre@kernel.org>

commit 33e7664a0af6e9a516f01014f39737aaa119b6d9
Author: Wei Yongjun <weiyj.lk@gmail.com>
Date:   Tue Jul 26 14:49:04 2016 +0000

    power_supply: tps65217-charger: fix missing platform_set_drvdata()
    
    Add missing platform_set_drvdata() in tps65217_charger_probe(), otherwise
    calling platform_get_drvdata() in remove returns NULL.
    
    This is detected by Coccinelle semantic patch.
    
    Fixes: 3636859b280c ("power_supply: Add support for tps65217-charger")
    Signed-off-by: Wei Yongjun <weiyj.lk@gmail.com>
    Signed-off-by: Sebastian Reichel <sre@kernel.org>

commit 8ca4746a78abc39cc0496654068eaaadb0f3c4d0
Author: Gregory CLEMENT <gregory.clement@free-electrons.com>
Date:   Tue Jul 19 15:42:22 2016 +0200

    clk: mvebu: Add the peripheral clock driver for Armada 3700
    
    These clocks are the ones which will be used as sources for the
    peripherals of the Armada 3700 SoC. On this SoC there are two blocks of
    clocks: the North bridge one and the South bridge one.
    
    Most of them are gatable. Most of the time their rate is their parent
    rate divided by a ratio that depends on two registers. For most of them,
    the parent can be chosen among the TBG clocks.
    
    However, some of them can't choose their parent or depend directly on the
    xtal clocks. Others do not use exactly the same pattern to find the
    ratio between their parent rate and their rate.
    
    For these reasons each clock is a composite clock, and the operations
    they use differ depending on the clock.
    
    According to the datasheet it would be possible to select the parent
    clock and the ratio, however currently the driver does not support it.
    
    Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
    Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>

commit c6d591c14e45e7ca8e158f2e0e5449371e23055c
Author: Gregory CLEMENT <gregory.clement@free-electrons.com>
Date:   Tue Jul 19 15:42:21 2016 +0200

    dt-bindings: clock: add DT binding for the peripheral clocks on Armada 3700
    
    This commit adds the DT binding documentation for the peripheral clocks
    used in the Marvell Armada 3700 SoCs.
    
    Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
    Acked-by: Rob Herring <robh@kernel.org>
    Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>

commit 96265523ace51b71dc1b2fd445a93c4b2f9bedb8
Author: Gregory CLEMENT <gregory.clement@free-electrons.com>
Date:   Tue Jul 19 15:42:20 2016 +0200

    clk: mvebu Add the time base generator clocks for Armada 3700
    
    These clocks are children of the xtal clock and each one can be selected
    as a source for the peripheral clocks.
    
    According to the datasheet it should be possible to modify their rate,
    but currently it is not supported.
    
    Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
    Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>

commit c302588e17d084408210a45a0fa6243978a0f785
Author: Gregory CLEMENT <gregory.clement@free-electrons.com>
Date:   Tue Jul 19 15:42:19 2016 +0200

    dt-bindings: clock: add DT binding for the TBG clocks on Armada 3700
    
    This commit adds the DT binding documentation for the Time Base Generator
    clock used in the Marvell Armada 3700 SoCs.
    
    Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
    Acked-by: Rob Herring <robh@kernel.org>
    Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>

commit 7ea8250406a6abe2f057c2096249c63b788b728f
Author: Gregory CLEMENT <gregory.clement@free-electrons.com>
Date:   Tue Jul 19 15:42:18 2016 +0200

    clk: mvebu: Add the xtal clock for Armada 3700 SoC
    
    This clock is the parent of all the Armada 3700 clocks. It is a fixed
    rate clock which depends on the gpio configuration read when resetting
    the SoC.
    
    Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
    Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>

commit cedfbc309d62f8f029dc0c3b926452cc84796023
Author: Gregory CLEMENT <gregory.clement@free-electrons.com>
Date:   Tue Jul 19 15:42:17 2016 +0200

    dt-bindings: clock: add DT binding for the Xtal clock on Armada 3700
    
    This commit adds the DT binding documentation for the Xtal clock
    used in the Marvell Armada 3700 SoCs.
    
    Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
    Acked-by: Rob Herring <robh@kernel.org>
    Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>

commit d2fbdf76b85bcdfe57b8ef2ba09d20e8ada79abd
Author: Vegard Nossum <vegard.nossum@oracle.com>
Date:   Sat Jul 23 08:15:04 2016 +0200

    tipc: fix NULL pointer dereference in shutdown()
    
    tipc_msg_create() can return a NULL skb and if so, we shouldn't try to
    call tipc_node_xmit_skb() on it.
    
        general protection fault: 0000 [#1] PREEMPT SMP KASAN
        CPU: 3 PID: 30298 Comm: trinity-c0 Not tainted 4.7.0-rc7+ #19
        Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
        task: ffff8800baf09980 ti: ffff8800595b8000 task.ti: ffff8800595b8000
        RIP: 0010:[<ffffffff830bb46b>]  [<ffffffff830bb46b>] tipc_node_xmit_skb+0x6b/0x140
        RSP: 0018:ffff8800595bfce8  EFLAGS: 00010246
        RAX: 0000000000000000 RBX: 0000000000000000 RCX: 000000003023b0e0
        RDX: 0000000000000000 RSI: dffffc0000000000 RDI: ffffffff83d12580
        RBP: ffff8800595bfd78 R08: ffffed000b2b7f32 R09: 0000000000000000
        R10: fffffbfff0759725 R11: 0000000000000000 R12: 1ffff1000b2b7f9f
        R13: ffff8800595bfd58 R14: ffffffff83d12580 R15: dffffc0000000000
        FS:  00007fcdde242700(0000) GS:ffff88011af80000(0000) knlGS:0000000000000000
        CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        CR2: 00007fcddde1db10 CR3: 000000006874b000 CR4: 00000000000006e0
        DR0: 00007fcdde248000 DR1: 00007fcddd73d000 DR2: 00007fcdde248000
        DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000090602
        Stack:
         0000000000000018 0000000000000018 0000000041b58ab3 ffffffff83954208
         ffffffff830bb400 ffff8800595bfd30 ffffffff8309d767 0000000000000018
         0000000000000018 ffff8800595bfd78 ffffffff8309da1a 00000000810ee611
        Call Trace:
         [<ffffffff830c84a3>] tipc_shutdown+0x553/0x880
         [<ffffffff825b4a3b>] SyS_shutdown+0x14b/0x170
         [<ffffffff8100334c>] do_syscall_64+0x19c/0x410
         [<ffffffff83295ca5>] entry_SYSCALL64_slow_path+0x25/0x25
        Code: 90 00 b4 0b 83 c7 00 f1 f1 f1 f1 4c 8d 6d e0 c7 40 04 00 00 00 f4 c7 40 08 f3 f3 f3 f3 48 89 d8 48 c1 e8 03 c7 45 b4 00 00 00 00 <80> 3c 30 00 75 78 48 8d 7b 08 49 8d 75 c0 48 b8 00 00 00 00 00
        RIP  [<ffffffff830bb46b>] tipc_node_xmit_skb+0x6b/0x140
         RSP <ffff8800595bfce8>
        ---[ end trace 57b0484e351e71f1 ]---
    
    I feel like we should maybe return -ENOMEM or -ENOBUFS, but I'm not sure
    userspace is equipped to handle that. Anyway, this is better than a GPF
    and looks somewhat consistent with other tipc_msg_create() callers.
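    
    The guard described above can be sketched in userspace as follows.
    struct model_skb, model_xmit() and model_shutdown_notify() are
    hypothetical names, not the TIPC API. The sketch only shows the
    shape of the check: transmit when creation succeeded, skip it
    otherwise, and do not report an error to the caller.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a socket buffer. */
struct model_skb {
	int len;
};

/* Hypothetical transmit helper; it requires a valid buffer. */
static int model_xmit(const struct model_skb *skb)
{
	assert(skb != NULL);
	return skb->len;
}

/*
 * Transmit only if message creation succeeded.
 * Returns 1 if a message was sent, 0 if it was skipped.
 */
int model_shutdown_notify(const struct model_skb *skb)
{
	if (!skb)
		return 0;
	(void)model_xmit(skb);
	return 1;
}
```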
    
    Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
    Acked-by: Ying Xue <ying.xue@windriver.com>
    Acked-by: Jon Maloy <jon.maloy@ericsson.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

commit 0dbff144a1e7310e2f8b7a957352c4be9aeb38e4
Author: Vitaly Kuznetsov <vkuznets@redhat.com>
Date:   Mon Aug 15 17:48:43 2016 +0200

    hv_netvsc: fix bonding devices check in netvsc_netdev_event()
    
    Bonding driver sets IFF_BONDING on both master (the bonding device) and
    slave (the real NIC) devices and in netvsc_netdev_event() we want to skip
    master devices only. Currently, there is an uncertainty when a slave
    interface is removed: if bonding module comes first in netdev_chain it
    clears IFF_BONDING flag on the netdev and netvsc_netdev_event() correctly
    handles NETDEV_UNREGISTER event, but in case netvsc comes first on the
    chain it sees the device with IFF_BONDING still attached and skips it. As
    we still hold vf_netdev pointer to the device we crash on the next inject.
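    
    One way to express "skip master devices only" is to test a master
    flag in addition to the bonding flag. The sketch below is a
    userspace model under that assumption. The MODEL_* flag values,
    struct model_netdev and model_skip_event() are illustrative names
    and are not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative flag bits, not the kernel's real values. */
#define MODEL_IFF_MASTER  (1u << 0)
#define MODEL_IFF_BONDING (1u << 1)

/* Hypothetical stand-in for a network device. */
struct model_netdev {
	unsigned int flags;       /* may carry MODEL_IFF_MASTER */
	unsigned int priv_flags;  /* may carry MODEL_IFF_BONDING */
};

/*
 * Return 1 if the notifier should skip this device. A bonding slave
 * still has the bonding flag set but is not a master, so it is not
 * skipped and its unregister event is handled.
 */
int model_skip_event(const struct model_netdev *dev)
{
	assert(dev != NULL);
	return (dev->priv_flags & MODEL_IFF_BONDING) &&
	       (dev->flags & MODEL_IFF_MASTER);
}
```

    With this check the outcome no longer depends on whether the
    bonding module has already cleared its flag on the slave.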
    
    Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
    Acked-by: Haiyang Zhang <haiyangz@microsoft.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

commit 0f20d795f78d182c4b743d880a5e8dc4d39892fe
Author: Vitaly Kuznetsov <vkuznets@redhat.com>
Date:   Mon Aug 15 17:48:42 2016 +0200

    hv_netvsc: protect module refcount by checking net_device_ctx->vf_netdev
    
    We're not guaranteed to see NETDEV_REGISTER/NETDEV_UNREGISTER notifications
    only once per VF but we increase/decrease module refcount unconditionally.
    Check vf_netdev to make sure we don't take/release it twice. We presume
    that only one VF per netvsc device may exist.
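    
    The idea of guarding the refcount with the vf_netdev pointer can be
    modelled in userspace like this. struct model_ctx and the two
    helpers are hypothetical, and module_refs stands in for the module
    reference count. The sketch assumes, as the text does, at most one
    VF per device.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-device context. */
struct model_ctx {
	const void *vf_netdev;  /* NULL while no VF is registered */
	int module_refs;        /* models the module reference count */
};

/* Take a reference only on the first register notification. */
int model_register_vf(struct model_ctx *ctx, const void *vf)
{
	assert(ctx != NULL && vf != NULL);
	if (ctx->vf_netdev)
		return 0;  /* repeated notification: nothing to do */
	ctx->vf_netdev = vf;
	ctx->module_refs++;
	return 1;
}

/* Drop the reference only if a VF is actually registered. */
int model_unregister_vf(struct model_ctx *ctx)
{
	assert(ctx != NULL);
	if (!ctx->vf_netdev)
		return 0;  /* repeated notification: nothing to do */
	ctx->vf_netdev = NULL;
	ctx->module_refs--;
	return 1;
}
```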
    
    Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
    Acked-by: Haiyang Zhang <haiyangz@microsoft.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

commit 57c1826b991244d2144eb6e3d5d1b13a53cbea63
Author: Vitaly Kuznetsov <vkuznets@redhat.com>
Date:   Mon Aug 15 17:48:41 2016 +0200

    hv_netvsc: reset vf_inject on VF removal
    
    We reset vf_inject on VF going down (netvsc_vf_down()) but we don't on
    VF removal (netvsc_unregister_vf()) so vf_inject stays 'true' while
    vf_netdev is already NULL and we're trying to inject packets into NULL
    net device in netvsc_recv_callback() causing kernel to crash.
    
    Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
    Acked-by: Haiyang Zhang <haiyangz@microsoft.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

commit d072218f214929194db06069564495b6b9fff34a
Author: Vitaly Kuznetsov <vkuznets@redhat.com>
Date:   Mon Aug 15 17:48:40 2016 +0200

    hv_netvsc: avoid deadlocks between rtnl lock and vf_use_cnt wait
    
    Here is a deadlock scenario:
    - netvsc_vf_up() schedules netvsc_notify_peers() work and quits.
    - netvsc_vf_down() runs before netvsc_notify_peers() gets executed. As it
      is being executed from netdev notifier chain we hold rtnl lock when we
      get here.
    - we enter while (atomic_read(&net_device_ctx->vf_use_cnt) != 0) loop and
      wait till netvsc_notify_peers() drops vf_use_cnt.
    - netvsc_notify_peers() starts on some other CPU but netdev_notify_peers()
      will hang on rtnl_lock().
    - deadlock!
    
    Instead of introducing additional synchronization I suggest we drop
    gwrk.dwrk completely and call NETDEV_NOTIFY_PEERS directly. As we're
    acting under rtnl lock this is legitimate.
    
    Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
    Acked-by: Haiyang Zhang <haiyangz@microsoft.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

commit f9a7da9130ef0143eb900794c7863dc5c9051fbc
Author: Vitaly Kuznetsov <vkuznets@redhat.com>
Date:   Mon Aug 15 17:48:39 2016 +0200

    hv_netvsc: don't lose VF information
    
    struct netvsc_device is not suitable for storing VF information as this
    structure is being destroyed on MTU change / set channel operation (see
    rndis_filter_device_remove()). Move all VF related stuff to struct
    net_device_context which is persistent.
    
    Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
    Acked-by: Haiyang Zhang <haiyangz@microsoft.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

commit cfaace269d0ce5a4ae26bfe442f1c4df1a9558de
Author: Colin Ian King <colin.king@canonical.com>
Date:   Mon Aug 15 13:55:17 2016 +0100

    net: hns: mdio->irq is an array, so no need to check if it is null
    
    The null check on mdio->irq is redundant since mdio->irq is an array
    of PHY_MAX_ADDR ints and hence can never be null. Remove the redundant
    check.
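    
    The reasoning can be illustrated with a small standalone example.
    struct model_bus, MODEL_PHY_MAX_ADDR and model_irq_is_null() are
    hypothetical names, and the array size is an assumed value used
    only for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_PHY_MAX_ADDR 32  /* assumed size, for illustration only */

/* Hypothetical stand-in for a bus structure with an embedded IRQ array. */
struct model_bus {
	int irq[MODEL_PHY_MAX_ADDR];
};

/*
 * bus->irq decays to the address of the first array element, which
 * lies inside *bus. For any valid bus pointer that address is
 * non-null, so a check such as "if (!bus->irq)" can never be true.
 */
int model_irq_is_null(struct model_bus *bus)
{
	int *first;

	assert(bus != NULL);
	first = bus->irq;
	return first == NULL;
}
```

    This differs from a pointer member such as "int *irq", which can
    be NULL until it is allocated.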
    
    Signed-off-by: Colin Ian King <colin.king@canonical.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

commit 2eb03e6c4e305b71bdd2d0ce4250b9c9099d9128
Author: Or Gerlitz <ogerlitz@mellanox.com>
Date:   Mon Aug 15 14:51:54 2016 +0300

    switchdev: Put export declaration in the right place
    
    Move exporting of switchdev_port_same_parent_id to be right
    below it and not elsewhere.
    
    Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
    Reported-by: Ido Schimmel <idosch@mellanox.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

commit f759b640c7219c99e19f0a42c6e086a19d2e2a17
Author: Neil Armstrong <narmstrong@baylibre.com>
Date:   Sun Jul 10 11:11:06 2016 +0200

    ARM64: dts: amlogic: meson-gxbb: Add watchdog node
    
    Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
    Reviewed-by: Guenter Roeck <linux@roeck-us.net>
    Signed-off-by: Kevin Hilman <khilman@baylibre.com>

commit 3d7b33209201cbfa090d614db993571ca3c6b090
Author: Simon Horman <simon.horman@netronome.com>
Date:   Mon Aug 15 13:06:24 2016 +0200

    gre: set inner_protocol on xmit
    
    Ensure that the inner_protocol is set on transmit so that GSO segmentation,
    which relies on that field, works correctly.
    
    This is achieved by setting the inner_protocol in gre_build_header rather
    than each caller of that function. It ensures that the inner_protocol is
    set when gre_fb_xmit() is used to transmit GRE which was not previously the
    case.
    
    I have observed this is not the case when OvS transmits GRE using
    lwtunnel metadata (which it always does).
    
    Fixes: 38720352412a ("gre: Use inner_proto to obtain inner header protocol")
    Cc: Pravin Shelar <pshelar@ovn.org>
    Acked-by: Alexander Duyck <alexander.h.duyck@intel.com>
    Signed-off-by: Simon Horman <simon.horman@netronome.com>
    Acked-by: Pravin B Shelar <pshelar@ovn.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>

commit 525ef5c07f187bf0918fdf3bbc76ad18ce1d1cf9
Author: Yuval Mintz <Yuval.Mintz@qlogic.com>
Date:   Mon Aug 15 10:42:45 2016 +0300

    qed*: Add and modify some prints
    
    This patch touches various prints in the driver - it reduces the
    verbosity of some prints [which were previously logged by default]
    while adding several new debug prints and modifying others.
    
    Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

commit 83aeb9339f4859c587d0ad3d80d225b520db047e
Author: Yuval Mintz <Yuval.Mintz@qlogic.com>
Date:   Mon Aug 15 10:42:44 2016 +0300

    qed*: Trivial modifications
    
    Change qed* code in a trivial manner. This isn't necessarily
    semantic-only, but the end result is the same, i.e., no change
    should occur from user perspective. Changes include:
      - Using temporary variables to better fit 80-character restrictions.
      - Removal of unused variables & code with no effect.
    [plus some additional minor modifications].
    
    Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

commit 1a635e488ecf6fcae00bffda61707b63bc1aacbe
Author: Yuval Mintz <Yuval.Mintz@qlogic.com>
Date:   Mon Aug 15 10:42:43 2016 +0300

    qed*: Semantic changes
    
    Make semantic-only adjustments to qed* drivers, such as:
      - Changes in code indentation.
      - Usage of BIT() macro.
      - Renaming of variables.
      - Reordering of variable declarations.
      - Removal of (== 0) and (!= 0) in conditions.
    
    Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

commit 6ff9ea0da61fb436166635ff9ae3e8efc4c8b0ae
Author: Kevin Hilman <khilman@baylibre.com>
Date:   Mon Aug 15 13:08:29 2016 -0700

    arm64: defconfig: enable meson WDT as modules
    
    Signed-off-by: Kevin Hilman <khilman@baylibre.com>

commit d11708925d9ce5d0113ff8f7ae62cf00bcd30a74
Author: Kevin Hilman <khilman@baylibre.com>
Date:   Mon Aug 15 13:05:36 2016 -0700

    arm64: defconfig: enable HW random as module
    
    drivers/char/hw_random/Kconfig has 'default m', so
    simply removing this entry from the defconfig will
    enable building HW random drivers as modules.
    
    Signed-off-by: Kevin Hilman <khilman@baylibre.com>

commit 492ff9d8f5fa6ad44288050238b7961d457a239d
Author: Phil Reid <preid@electromag.com.au>
Date:   Mon Jul 25 10:42:59 2016 +0800

    power: sbs-battery: Use devm_power_supply_register
    
    Use devm_power_supply_register instead of power_supply_register.
    Remove call to power_supply_unregister.
    
    Signed-off-by: Phil Reid <preid@electromag.com.au>
    Signed-off-by: Sebastian Reichel <sre@kernel.org>

commit d2cec82c28802da31596b395ad292cb8f132fd63
Author: Phil Reid <preid@electromag.com.au>
Date:   Mon Jul 25 10:42:58 2016 +0800

    power: sbs-battery: Request threaded irq and fix dev callback cookie
    
    Currently the battery detect gpio can not be used with a chained interrupt
    controller that requires threaded irq handlers. Use threaded irq instead.
    In addition, the callback could not have worked as written, because
    chip->power_supply is assigned only after the request irq call.
    
    Signed-off-by: Phil Reid <preid@electromag.com.au>
    Signed-off-by: Sebastian Reichel <sre@kernel.org>

commit 9239a86f0976b58d3da7a2261ed659ac9eba0f25
Author: Phil Reid <preid@electromag.com.au>
Date:   Mon Jul 25 10:42:57 2016 +0800

    power: sbs-battery: Use devm_kzalloc to alloc data
    
    Use devm_kzalloc to allow memory to be freed automatically on
    driver probe failure or removal.
    
    Signed-off-by: Phil Reid <preid@electromag.com.au>
    Signed-off-by: Sebastian Reichel <sre@kernel.org>

commit e4a404a081df1abe95a06ab24b7c76d8cf02402f
Author: H. Nikolaus Schaller <hns@goldelico.com>
Date:   Mon Jul 18 18:12:09 2016 +0200

    power:bq27xxx: 27000/10 read FLAGS register as single
    
    The bq27000 and bq27010 have a single byte FLAGS register.
    Other gauges have 16 bit FLAGS registers.
    
    For reading the FLAGS register it is sufficient to read the single
    register instead of reading RSOC at the next higher address as
    well and then ignore the high byte.
    
    This does not change functionality but optimizes i2c and hdq
    traffic.
    
    Signed-off-by: H. Nikolaus Schaller <hns@goldelico.com>
    Acked-by: Pali Rohár <pali.rohar@gmail.com>
    Acked-by: Andrew F. Davis <afd@ti.com>
    Signed-off-by: Sebastian Reichel <sre@kernel.org>
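
The equivalence claimed in the bq27xxx commit above (a single-byte read of FLAGS gives the same value as a two-byte read with the high byte ignored) can be sketched in user space. This is only an illustration: the `demo_*` names and the flat register array are assumptions made here, not the driver's API, and the low byte is assumed to sit at the lower address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical single-register read from a flat register image. */
static uint8_t demo_read_byte(const uint8_t *regs, size_t len, uint8_t addr)
{
    assert(addr < len);
    return regs[addr];
}

/* Hypothetical two-register read: low byte at addr, high byte at addr + 1. */
static uint16_t demo_read_word(const uint8_t *regs, size_t len, uint8_t addr)
{
    assert((size_t)addr + 1 < len);
    return (uint16_t)(regs[addr] | ((uint16_t)regs[addr + 1] << 8));
}

/* Old approach: read FLAGS plus the next register, then drop the high byte. */
static uint8_t demo_flags_word_read(const uint8_t *regs, size_t len, uint8_t addr)
{
    return (uint8_t)(demo_read_word(regs, len, addr) & 0xff);
}

/* New approach: read only the FLAGS register itself. */
static uint8_t demo_flags_byte_read(const uint8_t *regs, size_t len, uint8_t addr)
{
    return demo_read_byte(regs, len, addr);
}
```

Both helpers return the same FLAGS value for any register image; the second simply transfers one byte less, which is the bus-traffic saving the commit describes.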

commit 47d7d5ed68d877269003a392b7008905d65650bb
Author: Marcin Niestroj <m.niestroj@grinn-global.com>
Date:   Mon Jun 20 12:50:54 2016 +0200

    power_supply: tps65217-charger: Add support for IRQs
    
    Make use of IRQ resources defined in tps65217 mfd code. If they are valid,
    we use them instead of a separate poll task to determine the AC power state.
    
    Signed-off-by: Marcin Niestroj <m.niestroj@grinn-global.com>
    Signed-off-by: Sebastian Reichel <sre@kernel.org>

commit 5e457896986e16c440c97bb94b9ccd95dd157292
Author: Lorenzo Colitti <lorenzo@google.com>
Date:   Sat Aug 13 01:13:38 2016 +0900

    net: ipv6: Fix ping to link-local addresses.
    
    ping_v6_sendmsg does not set flowi6_oif in response to
    sin6_scope_id or sk_bound_dev_if, so it is not possible to use
    these APIs to ping an IPv6 address on a different interface.
    Instead, it sets flowi6_iif, which is incorrect but harmless.
    
    Stop setting flowi6_iif, and support various ways of setting oif
    in the same priority order used by udpv6_sendmsg.
    
    Tested: https://android-review.googlesource.com/#/c/254470/
    Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

commit 7e37deb7fae8437c0487d9fc3f13c7415770efd7
Author: Peter Ujfalusi <peter.ujfalusi@ti.com>
Date:   Mon May 30 11:55:11 2016 +0300

    clk: twl6040: Rename the driver and use consistent names in the code
    
    The driver provides the functional clock to OMAP4/5 McPDM. The clock is
    named pdmclk in the documentation, so change the function names,
    structure names and variables to align with this.
    At the same time rename the driver from "twl6040-clk" to "twl6040-pdmclk".
    This can be done w/o regression since the clock driver is not in use at
    the moment, the MFD core driver is not even registering the device for it.
    
    Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
    Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>

commit 994deaae37a05bfe59aded7bb176092fb849c5b4
Author: Peter Ujfalusi <peter.ujfalusi@ti.com>
Date:   Mon May 30 11:55:10 2016 +0300

    clk: twl6040: Register the clock as of_clk_provider
    
    In order to be able to use the pdmclk clock via DT, it needs to be
    registered as an of_clk_provider.
    Since the twl6040 clock driver does not have its own DT node, use the
    parent's node for registering.
    
    Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
    Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>

commit 225ff4e87ab2ceca4d4db05a5930a8c7ad16d754
Author: Peter Ujfalusi <peter.ujfalusi@ti.com>
Date:   Mon May 30 11:55:09 2016 +0300

    clk: twl6040: Correct clk_ops
    
    Since the driver only supports prepare callbacks, the use of is_enabled is
    not correct, it should be handling is_prepared.
    
    Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
    Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>

commit 54be3d985e7ec834dc2512d8a1345329d246ccd5
Author: Markus Elfring <elfring@users.sourceforge.net>
Date:   Mon Aug 15 08:34:56 2016 +0200

    fjes: Delete owner assignment
    
    The field "owner" is set by core. Thus delete an extra initialisation.
    
    Generated by: scripts/coccinelle/api/platform_no_drv_owner.cocci
    Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
    Signed-off-by: David S. Miller <davem@davemloft.net>

commit 12311959ecf8a3a64676c01b62ce67a0c5f0fd49
Author: Vegard Nossum <vegard.nossum@oracle.com>
Date:   Fri Aug 12 20:10:44 2016 +0200

    rhashtable: fix shift by 64 when shrinking
    
    I got this:
    
        ================================================================================
        UBSAN: Undefined behaviour in ./include/linux/log2.h:63:13
        shift exponent 64 is too large for 64-bit type 'long unsigned int'
        CPU: 1 PID: 721 Comm: kworker/1:1 Not tainted 4.8.0-rc1+ #87
        Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2014
        Workqueue: events rht_deferred_worker
         0000000000000000 ffff88011661f8d8 ffffffff82344f50 0000000041b58ab3
         ffffffff84f98000 ffffffff82344ea4 ffff88011661f900 ffff88011661f8b0
         0000000000000001 ffff88011661f6b8 dffffc0000000000 ffffffff867f7640
        Call Trace:
         [<ffffffff82344f50>] dump_stack+0xac/0xfc
         [<ffffffff82344ea4>] ? _atomic_dec_and_lock+0xc4/0xc4
         [<ffffffff8242f5b8>] ubsan_epilogue+0xd/0x8a
         [<ffffffff82430c41>] __ubsan_handle_shift_out_of_bounds+0x255/0x29a
         [<ffffffff824309ec>] ? __ubsan_handle_out_of_bounds+0x180/0x180
         [<ffffffff84003436>] ? nl80211_req_set_reg+0x256/0x2f0
         [<ffffffff812112ba>] ? print_context_stack+0x8a/0x160
         [<ffffffff81200031>] ? amd_pmu_reset+0x341/0x380
         [<ffffffff823af808>] rht_deferred_worker+0x1618/0x1790
         [<ffffffff823af808>] ? rht_deferred_worker+0x1618/0x1790
         [<ffffffff823ae1f0>] ? rhashtable_jhash2+0x370/0x370
         [<ffffffff8134c12d>] ? process_one_work+0x6fd/0x1970
         [<ffffffff8134c1cf>] process_one_work+0x79f/0x1970
         [<ffffffff8134c12d>] ? process_one_work+0x6fd/0x1970
         [<ffffffff8134ba30>] ? try_to_grab_pending+0x4c0/0x4c0
         [<ffffffff8134d564>] ? worker_thread+0x1c4/0x1340
         [<ffffffff8134d8ff>] worker_thread+0x55f/0x1340
         [<ffffffff845e904f>] ? __schedule+0x4df/0x1d40
         [<ffffffff8134d3a0>] ? process_one_work+0x1970/0x1970
         [<ffffffff8134d3a0>] ? process_one_work+0x1970/0x1970
         [<ffffffff813642f7>] kthread+0x237/0x390
         [<ffffffff813640c0>] ? __kthread_parkme+0x280/0x280
         [<ffffffff845f8c93>] ? _raw_spin_unlock_irq+0x33/0x50
         [<ffffffff845f95df>] ret_from_fork+0x1f/0x40
         [<ffffffff813640c0>] ? __kthread_parkme+0x280/0x280
        ================================================================================
    
    roundup_pow_of_two() is undefined when called with an argument of 0, so
    let's avoid the call and just fall back to ht->p.min_size (which should
    never be smaller than HASH_MIN_SIZE).
    
    Cc: Herbert Xu <herbert@gondor.apana.org.au>
    Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
    Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
    Signed-off-by: David S. Miller <davem@davemloft.net>
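
A minimal user-space sketch of the guard described in the rhashtable commit above: skip the power-of-two rounding when the element count is 0 and fall back to the minimum size. The helper names are hypothetical, and the 3/2 headroom factor is an assumption made for illustration rather than something stated in the commit message.

```c
#include <assert.h>
#include <limits.h>

/*
 * User-space stand-in for roundup_pow_of_two(). Like the kernel macro,
 * it is not meant to be called with 0, so that case is rejected here.
 */
static unsigned long demo_roundup_pow_of_two(unsigned long n)
{
    unsigned long p = 1;

    assert(n != 0);                 /* 0 is the undefined case */
    assert(n <= ULONG_MAX / 2 + 1); /* keep the shift below from overflowing */
    while (p < n)
        p <<= 1;
    return p;
}

/* Hypothetical helper: pick a table size when shrinking. */
static unsigned long demo_shrink_size(unsigned long nelems,
                                      unsigned long min_size)
{
    unsigned long size = 0;

    /* Only round up when there is something to round. */
    if (nelems)
        size = demo_roundup_pow_of_two(nelems * 3 / 2);
    /* Never go below the configured minimum. */
    if (size < min_size)
        size = min_size;
    return size;
}
```

With an empty table the rounding helper is never reached, so `demo_shrink_size(0, 4)` simply yields the minimum size.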

commit 03459345bc00da70a35fa39bcfcf13d779097074
Author: Gao Feng <fgao@ikuai8.com>
Date:   Sat Aug 13 00:30:48 2016 +0800

    pptp: Refactor the struct and macros of PPTP codes
    
    1. Use struct gre_base_hdr directly in pptp_gre_header instead of
    duplicated members;
    2. Use existing macros like GRE_KEY, GRE_SEQ, and so on instead of
    duplicated macros defined by PPTP;
    3. Add new macros like GRE_IS_ACK/SEQ and so on instead of
    PPTP_GRE_IS_A/S and so on;
    
    Signed-off-by: Gao Feng <fgao@ikuai8.com>
    Reviewed-by: Philip Prindeville <philipp@redfish-solutions.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

commit 92bb8d5d55f7fe1a7a1201c42120c1611840807c
Author: Jisheng Zhang <jszhang@marvell.com>
Date:   Mon Aug 15 09:20:21 2016 +0100

    ARM: 8597/1: VDSO: put RO and RO after init objects into proper sections
    
    vdso_data_mapping is never modified, so mark it as const.
    
    vdso_total_pages, vdso_data_page, vdso_text_mapping and cntvct_ok are
    initialized by vdso_init(), thereafter are read only.
    
    The fact that they are read only after init makes them candidates for
    __ro_after_init declarations.
    
    Signed-off-by: Jisheng Zhang <jszhang@marvell.com>
    Reviewed-by: Kees Cook <keescook@chromium.org>
    Acked-by: Nathan Lynch <nathan_lynch@mentor.com>
    Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

commit 1c9690e5b1345c34ccc62259904380db60a36e3a
Author: Kevin Hilman <khilman@baylibre.com>
Date:   Thu Jun 23 13:38:24 2016 -0700

    arm64: defconfig: enable MMC for meson-gxbb
    
    Signed-off-by: Kevin Hilman <khilman@baylibre.com>

commit 08ef98069718cfa8dc9acab46fdc52810511636d
Author: Javier Martinez Canillas <javier@osg.samsung.com>
Date:   Mon Aug 1 12:47:04 2016 -0400

    ARM: dts: omap3/4/5/dra7: remove unneeded unit name for gpio-leds nodes
    
    This patch fixes the following DTC warnings for many boards:
    
    "Node /leds/led@1 has a unit name, but no reg property"
    
    Signed-off-by: Javier Martinez Canillas <javier@osg.samsung.com>
    Signed-off-by: Tony Lindgren <tony@atomide.com>

commit c731abd99121cae0b9a1735c062c96c56e2b72fc
Author: Javier Martinez Canillas <javier@osg.samsung.com>
Date:   Mon Aug 1 12:47:03 2016 -0400

    ARM: dts: am335x/437x/57xx: remove unneeded unit name for gpio-leds nodes
    
    This patch fixes the following DTC warnings for many boards:
    
    "Node /leds/led@1 has a unit name, but no reg property"
    
    Signed-off-by: Javier Martinez Canillas <javier@osg.samsung.com>
    Signed-off-by: Tony Lindgren <tony@atomide.com>

commit 45ed37f7b92d20541c0c651884815a9e504246fa
Author: Javier Martinez Canillas <javier@osg.samsung.com>
Date:   Mon Aug 1 12:47:02 2016 -0400

    ARM: dts: omap3/4: remove unneeded unit name for gpio-keys nodes
    
    This patch fixes the following DTC warnings for many boards:
    
    "Node /gpio_keys/button0@10 has a unit name, but no reg property"
    
    Signed-off-by: Javier Martinez Canillas <javier@osg.samsung.com>
    Signed-off-by: Tony Lindgren <tony@atomide.com>

commit 57a78a8a6f6e07253c6ff276847923e523a90a59
Author: Javier Martinez Canillas <javier@osg.samsung.com>
Date:   Mon Aug 1 12:47:01 2016 -0400

    ARM: dts: am335x/am437x: remove unneeded unit name for gpio-keys nodes
    
    This patch fixes the following DTC warnings for many boards:
    
    "Node /gpio_keys/button0@10 has a unit name, but no reg property"
    
    Signed-off-by: Javier Martinez Canillas <javier@osg.samsung.com>
    Signed-off-by: Tony Lindgren <tony@atomide.com>

commit 909b0ebde95d24af3cdecbe22863830413f3fe81
Author: Javier Martinez Canillas <javier@osg.samsung.com>
Date:   Mon Aug 1 12:47:00 2016 -0400

    ARM: dts: omap3/dra62x: remove unneeded unit name for fixed regulators
    
    This patch fixes the following DTC warnings for many boards:
    
    "Node /fixedregulator@0 has a unit name, but no reg property"
    
    Signed-off-by: Javier Martinez Canillas <javier@osg.samsung.com>
    Signed-off-by: Tony Lindgren <tony@atomide.com>

commit 0b0d912ab516782beeb3449149d2a126882d550a
Author: Javier Martinez Canillas <javier@osg.samsung.com>
Date:   Mon Aug 1 12:46:59 2016 -0400

    ARM: dts: da850/dm81x: remove unneeded unit name for fixed regulators
    
    This patch fixes the following DTC warnings for many boards:
    
    "Node /fixedregulator@0 has a unit name, but no reg property"
    
    Signed-off-by: Javier Martinez Canillas <javier@osg.samsung.com>
    Signed-off-by: Tony Lindgren <tony@atomide.com>

commit 4c049a5b7c89012872184c1fdaefe04ea3dc7dc0
Author: Javier Martinez Canillas <javier@osg.samsung.com>
Date:   Mon Aug 1 12:46:58 2016 -0400

    ARM: dts: am335x/am437x: remove unneeded unit name for fixed regulators
    
    This patch fixes the following DTC warnings for many boards:
    
    "Node /fixedregulator@0 has a unit name, but no reg property"
    
    Signed-off-by: Javier Martinez Canillas <javier@osg.samsung.com>
    Signed-off-by: Tony Lindgren <tony@atomide.com>

commit 18ad99d4c2d0c02f7811ecde37a79fd993258621
Author: Javier Martinez Canillas <javier@osg.samsung.com>
Date:   Mon Aug 1 12:46:57 2016 -0400

    ARM: dts: am335x/am437x: remove unneeded unit name for gpio-matrix-keypad
    
    This patch fixes the following DTC warnings for many boards:
    
    "Node /matrix_keypad@0 has a unit name, but no reg property"
    
    Signed-off-by: Javier Martinez Canillas <javier@osg.samsung.com>
    Signed-off-by: Tony Lindgren <tony@atomide.com>

commit 0b965a13ad81fa895e534d1f50b355ff8b0b3ed3
Author: Javier Martinez Canillas <javier@osg.samsung.com>
Date:   Mon Aug 1 12:46:56 2016 -0400

    ARM: dts: omap3: overo: add missing unit name for lcd35 display
    
    Commit b8d368caa8dc ("ARM: dts: omap3: overo: remove unneded unit names
    in display nodes") removed the unit names for all Overo display nodes
    that didn't have a reg property.
    
    But the display in arch/arm/boot/dts/omap3-overo-common-lcd35.dtsi does
    have a reg property so the correct fix was to make the unit name match
    the value of the reg property, instead of removing it.
    
    This patch fixes the following DTC warning for boards using this dtsi:
    
    "ocp/spi@48098000/display has a reg or ranges property, but no unit name"
    
    Fixes: b8d368caa8dc ("ARM: dts: omap3: overo: remove unneded unit names in display nodes")
    Signed-off-by: Javier Martinez Canillas <javier@osg.samsung.com>
    Signed-off-by: Tony Lindgren <tony@atomide.com>

commit f515f81424cbd75734197ac4723c4c379ab60add
Author: Javier Martinez Canillas <javier@osg.samsung.com>
Date:   Mon Aug 1 12:46:55 2016 -0400

    ARM: dts: omap3/am4372: add missing unit name to ocp node
    
    This patch fixes the following DTC warnings for many boards:
    
    "Node /ocp has a reg or ranges property, but no unit name"
    
    Signed-off-by: Javier Martinez Canillas <javier@osg.samsung.com>
    Signed-off-by: Tony Lindgren <tony@atomide.com>

commit 99f1c013194e64d4b67d5d318148303b0e1585e1
Author: Oleg Drokin <green@linuxhacker.ru>
Date:   Thu Jul 14 23:40:21 2016 -0400

    staging/lustre/llite: Close atomic_open race with several openers
    
    Right now, if it's an open of a negative dentry, a race is possible
    with several openers who all try to instantiate/rehash the same
    dentry and would hit a BUG_ON in d_add.
    But in fact if we got a negative dentry in atomic_open, that means
    we just revalidated it so no point in talking to MDS at all,
    just return ENOENT and make the race go away completely.
    
    Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
    Cc: stable <stable@vger.kernel.org> # 4.7+
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 34276bb062b8449b3b0a208c9b848a1a27920075
Author: Lucas Stach <l.stach@pengutronix.de>
Date:   Mon Aug 15 14:58:43 2016 +0200

    of: fix reference counting in of_graph_get_endpoint_by_regs
    
    The called of_graph_get_next_endpoint() already decrements the refcount
    of the prev node, so it is wrong to do it again in the calling function.
    
    Use the for_each_endpoint_of_node() helper to iterate through the
    endpoint OF nodes, which already does the right thing and simplifies
    the code a bit.
    
    Fixes: 8ccd0d0ca041
    (of: add helper for getting endpoint node of specific identifiers)
    Cc: stable@vger.kernel.org
    Reported-by: David Jander <david@protonic.nl>
    Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
    Acked-by: Philipp Zabel <p.zabel@pengutronix.de>
    Signed-off-by: Rob Herring <robh@kernel.org>
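
The reference-counting rule behind the of_graph fix above can be modelled with a toy iterator in user space. Everything here (`demo_node`, `demo_get_next`, `demo_find_by_reg`) is hypothetical; the only property taken from the commit message is that the "get next" call already drops the reference on the previous node, so the caller must not drop it again.

```c
#include <assert.h>
#include <stddef.h>

/* Toy refcounted node standing in for a device tree node. */
struct demo_node {
    int refcount;
    int reg;
    struct demo_node *next;
};

static struct demo_node *demo_get(struct demo_node *n)
{
    if (n)
        n->refcount++;
    return n;
}

static void demo_put(struct demo_node *n)
{
    if (n) {
        assert(n->refcount > 0); /* a double put would trip this */
        n->refcount--;
    }
}

/* Returns the next node with a reference held and releases prev. */
static struct demo_node *demo_get_next(struct demo_node *first,
                                       struct demo_node *prev)
{
    struct demo_node *next = prev ? prev->next : first;

    demo_get(next);
    demo_put(prev);
    return next;
}

/* Correct caller: no extra put on nodes that do not match. */
static struct demo_node *demo_find_by_reg(struct demo_node *first, int reg)
{
    struct demo_node *n = NULL;

    while ((n = demo_get_next(first, n)) != NULL) {
        if (n->reg == reg)
            return n; /* the caller now owns this reference */
    }
    return NULL;
}
```

After a lookup, every node that was merely visited is back at its original count and only the returned node carries one extra reference. Adding a second `demo_put()` inside the loop would unbalance the counts, which is the bug the commit describes.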

commit 4875b8fcf68d8133713dd5c5df5bc79431be8be7
Author: Adam Ford <aford173@gmail.com>
Date:   Sat Aug 13 10:21:00 2016 -0500

    ARM: dts: logicpd-somlv: Fix NAND device nodes
    
    This fix was applied to a bunch of omap3 devices including LogicPD
    Torpedo, but this board got missed since it was new around the time
    the patches were applied.  This makes the GPMC parameters match the
    Torpedo since they have the same processor PoP memory.
    
    Signed-off-by: Adam Ford <aford173@gmail.com>
    Signed-off-by: Tony Lindgren <tony@atomide.com>

commit a8771a6a64226c24f4baf30b8d13a2116795487f
Author: Adam Ford <aford173@gmail.com>
Date:   Sat Aug 13 10:13:04 2016 -0500

    ARM: dts: logicpd-torpedo-som: Provide NAND ready pin
    
    This was applied to a variety of omap3 boards, so it should
    probably be applied here.  I did not test NAND performance, but
    I tested this with UBI to confirm read/write didn't break.
    
    Signed-off-by: Adam Ford <aford173@gmail.com>
    Signed-off-by: Tony Lindgren <tony@atomide.com>

commit 153b58ea932b2d0642fa5cd41c93bb0555f3f09b
Author: Johan Hovold <johan@kernel.org>
Date:   Mon Aug 15 09:10:49 2016 -0700

    ARM: dts: overo: fix gpmc nand on boards with ethernet
    
    The gpmc ranges property for NAND at CS0 was being overridden by later
    includes that defined gpmc ethernet nodes, effectively breaking NAND on
    these systems:
    
    	omap-gpmc 6e000000.gpmc: /ocp/gpmc@6e000000/nand@0,0 has
    	malformed 'reg' property
    
    Instead of redefining the NAND range in every such dtsi, define all
    currently used ranges in omap3-overo-base.dtsi.
    
    Fixes: 98ce6007efb4 ("ARM: dts: overo: Support PoP NAND")
    Cc: stable <stable@vger.kernel.org> # 4.3
    Signed-off-by: Johan Hovold <johan@kernel.org>
    Signed-off-by: Tony Lindgren <tony@atomide.com>

commit 5e0568dfbfb8c13cdb69c9fd06d600593ad4b430
Author: Johan Hovold <johan@kernel.org>
Date:   Mon Aug 15 09:10:45 2016 -0700

    ARM: dts: overo: fix gpmc nand cs0 range
    
    The gpmc ranges property for NAND at CS0 has been broken since it was
    first added.
    
    This currently prevents the nand gpmc child node from being probed:
    
    	omap-gpmc 6e000000.gpmc: /ocp/gpmc@6e000000/nand@0,0 has
    	malformed 'reg' property
    
    and consequently the NAND device from being registered.
    
    Fixes: 98ce6007efb4 ("ARM: dts: overo: Support PoP NAND")
    Cc: stable <stable@vger.kernel.org>	# 4.3
    Signed-off-by: Johan Hovold <johan@kernel.org>
    Signed-off-by: Tony Lindgren <tony@atomide.com>

commit 42647f947210cb9fd8a7737c0fd2a60002a81188
Author: Teresa Remmet <t.remmet@phytec.de>
Date:   Mon Aug 15 09:10:39 2016 -0700

    ARM: dts: am335x: Update elm phandle binding
    
    The check for the "elm_id" binding had been removed.
    This causes nand boot to fail on boards still using
    the old binding. Update the bindings on those boards.
    
    Signed-off-by: Teresa Remmet <t.remmet@phytec.de>
    Acked-by: Brian Norris <computersforpeace@gmail.com>
    Acked-by: Roger Quadros <rogerq@ti.com>
    Signed-off-by: Tony Lindgren <tony@atomide.com>

commit 3cd0126dca82ecba8b2a6bf5aca91454da0a0776
Author: Eric Sandeen <sandeen@redhat.com>
Date:   Fri Aug 12 17:40:09 2016 -0500

    quota: fill in Q_XGETQSTAT inode information for inactive quotas
    
    The manpage for quotactl says that the Q_XGETQSTAT command is
    "useful in finding out how much space is spent to store quota
    information," but the current implementation does not report this
    info if the inode is allocated, but its quota type is not enabled.
    
    This is a change from the earlier XFS implementation, which
    reported information about allocated quota inodes even if their
    quota type was not currently active.
    
    Change quota_getstate() and quota_getstatev() to copy out the inode
    information if the filesystem has provided it, even if the quota
    type for that inode is not currently active.
    
    Signed-off-by: Eric Sandeen <sandeen@redhat.com>
    Reviewed-by: Bill O'Donnell <billodo@redhat.com>
    Signed-off-by: Jan Kara <jack@suse.cz>

commit 6c6aba9e898582f289b4b93ecc0b991ab3caab31
Author: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Date:   Mon Aug 15 17:04:42 2016 +0200

    rtc: rx6110: remove owner assignment
    
    .owner is already set by the spi core.
    
    Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>

commit 7b142d8fd0bd4c9bf06ccb72ac4daedb503f0124
Author: Jann Horn <jannh@google.com>
Date:   Thu Jun 16 00:45:33 2016 +0200

    android: binder: fix dangling pointer comparison
    
    If /dev/binder is opened and the opener process then e.g. calls execve,
    proc->vma_vm_mm will still point to the location of the now-freed
    mm_struct. If the process then calls ioctl(binder_fd, ...), the dangling
    proc->vma_vm_mm pointer will be compared to current->mm.
    
    Let the binder take a reference to the mm_struct to avoid this.
    
    v2: use the right refcounter
    
    Fixes: a906d6931f3c ("android: binder: Sanity check at binder ioctl")
    Signed-off-by: Jann Horn <jannh@google.com>
    Reviewed-by: Chen Feng <puck.chen@hisilicon.com>
    Cc: stable <stable@vger.kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 4c1cc206ab2420be663de6aab729815d9c081c9d
Author: Markus Elfring <elfring@users.sourceforge.net>
Date:   Mon Aug 15 10:52:47 2016 +0200

    rtc: pic32: Delete owner assignment
    
    The field "owner" is set by core. Thus delete an extra initialisation.
    
    Generated by: scripts/coccinelle/api/platform_no_drv_owner.cocci
    Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
    Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>

commit 19140488337f3327bf90a4c794c2d2fb4ec43637
Author: Jan Östlund <jao@hms.se>
Date:   Thu Aug 11 13:31:44 2016 +0200

    rtc: bq32k: Fix handling of oscillator failure flag
    
    While the oscillator failure flag is set, the RTC registers
    should be considered invalid. bq32k_rtc_read_time() now
    returns an error instead of an invalid time.
    
    The failure flag is cleared the next time the clock is set.
    
    Signed-off-by: Jan Östlund <jao@hms.se>
    Signed-off-by: Daniel Romell <daro@hms.se>
    Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
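
The behaviour described in the bq32k commit above (return an error instead of a time while the oscillator failure flag is set) might look roughly like this in a user-space sketch. The bit position of the flag, the mask value and all `demo_*` names are assumptions for illustration, not taken from the driver.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_OF_FLAG      0x80 /* assumed: failure flag in the top bit */
#define DEMO_MINUTES_MASK 0x7f /* assumed: BCD minutes in the low bits */

/* Convert a packed BCD byte to its binary value. */
static int demo_bcd2bin(uint8_t v)
{
    return (v & 0x0f) + (v >> 4) * 10;
}

/*
 * Hypothetical read helper: refuse to decode the register while the
 * oscillator failure flag is set, because the contents are not trusted.
 */
static int demo_read_minutes(uint8_t minutes_reg, int *minutes)
{
    assert(minutes != NULL);

    if (minutes_reg & DEMO_OF_FLAG)
        return -EINVAL;

    *minutes = demo_bcd2bin(minutes_reg & DEMO_MINUTES_MASK);
    return 0;
}
```

On failure the output argument is left untouched, so a caller cannot mistake stale data for a valid time.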

commit f92430c27bbb97aac234a861deaed85a1dd961e8
Author: Jan Östlund <jao@hms.se>
Date:   Thu Aug 11 13:31:43 2016 +0200

    rtc: bq32k: Use correct mask name for 'minutes' register.
    
    The BQ32K_SECONDS_MASK and BQ32K_MINUTES_MASK both have the same
    value. This is no functional change.
    
    Signed-off-by: Jan Östlund <jao@hms.se>
    Signed-off-by: Daniel Romell <daro@hms.se>
    Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>

commit add125054b8727103631dce116361668436ef6a7
Author: Gavin Li <git@thegavinli.com>
Date:   Fri Aug 12 00:52:56 2016 -0700

    cdc-acm: fix wrong pipe type on rx interrupt xfers
    
    This fixes the "BOGUS urb xfer" warning logged by usb_submit_urb().
    
    Signed-off-by: Gavin Li <git@thegavinli.com>
    Acked-by: Oliver Neukum <oneukum@suse.com>
    Cc: stable <stable@vger.kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 73577d61799e8d8bb7d69a9acdc54923e5998138
Author: Icenowy Zheng <icenowy@aosc.xyz>
Date:   Fri Aug 12 11:06:22 2016 +0800

    ehci-platform: add the max clock number to 4
    
    Allwinner A64 EHCI requires 4 clocks to be enabled.
    
    Signed-off-by: Icenowy Zheng <icenowy@aosc.xyz>
    Acked-by: Alan Stern <stern@rowland.harvard.edu>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit d6b76c4ddb124dd22c6e910ca9332e472e7b3273
Author: Rafał Miłecki <rafal@milecki.pl>
Date:   Wed Aug 10 11:56:46 2016 +0200

    USB: bcma: support old USB 2.0 controller on Northstar devices
    
    Currently bcma-hcd driver handles 3 different bcma cores:
    1) BCMA_CORE_USB20_HOST (0x819)
    2) BCMA_CORE_NS_USB20 (0x504)
    3) BCMA_CORE_NS_USB30 (0x505)
    
    The first one was introduced years ago and so far was used on MIPS
    devices only. All Northstar (ARM) devices were using other two cores
    which allowed easy implementation of separated initialization paths.
    
    It seems however Broadcom decided to reuse this old USB 2.0 controller
    on some recently introduced cheaper Northstar BCM53573 SoCs. I noticed
    this on Tenda AC9 (based on BCM47189B0 belonging to BCM53573 family).
    
    There is no difference in this old controller core identification
    between MIPS and ARM devices: they share the same id and revision.
    However, we need a different controller initialization procedure.
    To handle this, add a check for the architecture and implement the
    required initialization for the ARM case.
    
    Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 6e958051cb0742dd54bb61528c130bd6e…
0day-ci pushed a commit to 0day-ci/linux that referenced this pull request Aug 19, 2016
GIT 184ca823481c99dadd7d946e5afd4bb921eab30d

commit b5ac851885accffe0485aea2805df8f2d49c95a8
Author: Roman Mashak <mrv@mojatatu.com>
Date:   Sat Aug 13 22:35:02 2016 -0700

    net_sched: allow flushing tc police actions
    
    The act_police uses its own code to walk the
    action hashtable, so standalone tc police actions
    could not be flushed. Switch to
    tcf_generic_walker() like other actions.
    
    (Joint work from Roman and Cong.)
    
    Signed-off-by: Roman Mashak <mrv@mojatatu.com>
    Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
    Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

commit 0852e455238f8550fa92b1e40355eb2c6805787e
Author: WANG Cong <xiyou.wangcong@gmail.com>
Date:   Sat Aug 13 22:35:01 2016 -0700

    net_sched: unify the init logic for act_police
    
    Jamal reported a crash when we create a police action
    with a specific index. This is because the init logic
    is not correct: we should always create one for this
    case. Just unify the logic with other tc actions.
    
    Fixes: a03e6fe56971 ("act_police: fix a crash during removal")
    Reported-by: Jamal Hadi Salim <jhs@mojatatu.com>
    Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
    Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

commit 22dc13c837c33207548c8ee5116b64e2930a6e23
Author: WANG Cong <xiyou.wangcong@gmail.com>
Date:   Sat Aug 13 22:35:00 2016 -0700

    net_sched: convert tcf_exts from list to pointer array
    
    As pointed out by Jamal, an action could be shared by
    multiple filters, so we can't use list to chain them
    any more after we get rid of the original tc_action.
    Instead, we could just save pointers to these actions
    in tcf_exts, since they are refcount'ed, so convert
    the list to an array of pointers.
    
    The "ugly" part is that the action API still accepts a list
    as a parameter, so I just introduce a helper function to
    convert the array of pointers to a list, instead of
    relying on the C99 feature to iterate the array.
    
    Fixes: a85a970af265 ("net_sched: move tc_action into tcf_common")
    Reported-by: Jamal Hadi Salim <jhs@mojatatu.com>
    Cc: Jamal Hadi Salim <jhs@mojatatu.com>
    Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
    Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
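
The helper mentioned in the tcf_exts commit above, which turns the array of action pointers back into a list for an API that still expects one, can be sketched in user space as follows. The types, the array bound and the function name are hypothetical stand-ins, and a plain singly linked `next` pointer replaces the kernel's list type.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_ACTIONS 32 /* assumed bound, for illustration only */

struct demo_action {
    int id;
    struct demo_action *next; /* used only for temporary chaining */
};

struct demo_exts {
    int nr_actions;
    struct demo_action *actions[DEMO_MAX_ACTIONS];
};

/*
 * Chain the actions referenced by the array into a temporary list,
 * preserving array order, and return the head (NULL when empty).
 */
static struct demo_action *demo_exts_to_list(struct demo_exts *e)
{
    struct demo_action *head = NULL;
    int i;

    assert(e->nr_actions >= 0 && e->nr_actions <= DEMO_MAX_ACTIONS);

    /* Walk backwards so the head ends up being actions[0]. */
    for (i = e->nr_actions - 1; i >= 0; i--) {
        e->actions[i]->next = head;
        head = e->actions[i];
    }
    return head;
}
```

The array stays the long-lived owner of the pointers; the chain is rebuilt on each call and is only meaningful for the duration of one bulk operation.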

commit 2734437ef3c2943090d0914bf91caa6b30451615
Author: WANG Cong <xiyou.wangcong@gmail.com>
Date:   Sat Aug 13 22:34:59 2016 -0700

    net_sched: move tc offload macros to pkt_cls.h
    
    struct tcf_exts belongs to filters and should not be visible
    to plain tc actions.
    
    Cc: Ido Schimmel <idosch@mellanox.com>
    Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
    Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

commit 0c23c3e705691cfb99c94f2760df2b456fe45194
Author: WANG Cong <xiyou.wangcong@gmail.com>
Date:   Sat Aug 13 22:34:58 2016 -0700

    net_sched: fix a typo in tc_for_each_action()
    
    It is harmless because all users pass 'a' to this macro.
    
    Fixes: 00175aec941e ("net/sched: Macro instead of CONFIG_NET_CLS_ACT ifdef")
    Cc: Amir Vadai <amir@vadai.me>
    Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
    Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

commit 824a7e8863b3eb283343f891b11a782b4ec0d0de
Author: WANG Cong <xiyou.wangcong@gmail.com>
Date:   Sat Aug 13 22:34:57 2016 -0700

    net_sched: remove an unnecessary list_del()
    
    This list_del() for tc action is not actually needed,
    because we only use this list to chain bulk operations,
    so it should not be carried over to later operations.
    
    Fixes: ec0595cc4495 ("net_sched: get rid of struct tcf_common")
    Cc: Jamal Hadi Salim <jhs@mojatatu.com>
    Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
    Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

commit f07fed82ad7994cc4d779ee79bdf7a46848c4b8f
Author: WANG Cong <xiyou.wangcong@gmail.com>
Date:   Sat Aug 13 22:34:56 2016 -0700

    net_sched: remove the leftover cleanup_a()
    
    After refactoring tc_action into tcf_common, we no
    longer need to cleanup temporary "actions" in list,
    they are permanently stored in the hashtable.
    
    Fixes: a85a970af265 ("net_sched: move tc_action into tcf_common")
    Reported-by: Jamal Hadi Salim <jhs@mojatatu.com>
    Cc: Jamal Hadi Salim <jhs@mojatatu.com>
    Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
    Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

commit 9ffcc3725f096e9f0d985f738b0e44214cd72d93
Author: Ido Schimmel <idosch@mellanox.com>
Date:   Wed Aug 17 16:39:37 2016 +0200

    mlxsw: spectrum: Allow packets to be trapped from any PG
    
    When packets enter the device they are classified to a priority group
    (PG) buffer based on their PCP value. After their egress port and
    traffic class are determined they are moved to the switch's shared
    buffer and await transmission, if:
    
    (Ingress{Port}.Usage < Thres && Ingress{Port,PG}.Usage < Thres &&
     Egress{Port}.Usage < Thres && Egress{Port,TC}.Usage < Thres)
    ||
    (Ingress{Port}.Usage < Min || Ingress{Port,PG} < Min ||
     Egress{Port}.Usage < Min || Egress{Port,TC}.Usage < Min)
    
    Packets scheduled to transmission through CPU port (trapped to CPU) use
    traffic class 7, which has zero maximum and minimum quotas. However,
    when such packets arrive from PG 0 they are admitted to the shared
    buffer as PG 0 has a non-zero minimum quota.
    
    Allow all packets to be trapped to the CPU - regardless of the PG they
    were classified to - by assigning a 10KB minimum quota for CPU port and
    TC7.
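    
    The admission condition above can be sketched in plain C. This is a
    minimal userspace model, not the device or driver logic: struct quota,
    admitted() and the helper names are hypothetical and only restate the
    formula from this message.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one usage counter and its two quotas. */
struct quota {
	unsigned int usage; /* bytes currently accounted to this counter */
	unsigned int thres; /* maximum quota ("Thres" in the formula) */
	unsigned int min;   /* minimum quota ("Min" in the formula) */
};

static bool below_thres(const struct quota *q) { return q->usage < q->thres; }
static bool below_min(const struct quota *q)   { return q->usage < q->min; }

/*
 * A packet is admitted to the shared buffer if every counter is below
 * its threshold, or if at least one counter is below its minimum.
 */
static bool admitted(const struct quota *ing_port, const struct quota *ing_pg,
		     const struct quota *egr_port, const struct quota *egr_tc)
{
	assert(ing_port && ing_pg && egr_port && egr_tc);

	bool all_below_thres = below_thres(ing_port) && below_thres(ing_pg) &&
			       below_thres(egr_port) && below_thres(egr_tc);
	bool any_below_min = below_min(ing_port) || below_min(ing_pg) ||
			     below_min(egr_port) || below_min(egr_tc);

	return all_below_thres || any_below_min;
}
```

    With zero quotas on the egress TC, admission depends entirely on
    whether some other counter has a non-zero minimum, which is why only
    packets from a PG with a minimum quota got through. Giving the egress
    side a non-zero minimum makes the second clause true for any PG.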
    
    Fixes: 8e8dfe9fdf06 ("mlxsw: spectrum: Add IEEE 802.1Qaz ETS support")
    Reported-by: Tamir Winetroub <tamirw@mellanox.com>
    Tested-by: Tamir Winetroub <tamirw@mellanox.com>
    Signed-off-by: Ido Schimmel <idosch@mellanox.com>
    Signed-off-by: Jiri Pirko <jiri@mellanox.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

commit 8168287b5dfac9227a549ed87f5e111b7005e8a4
Author: Ido Schimmel <idosch@mellanox.com>
Date:   Wed Aug 17 16:39:36 2016 +0200

    mlxsw: spectrum: Unmap 802.1Q FID before destroying it
    
    Before destroying the 802.1Q FID we should first remove the VID-to-FID
    mapping. This makes mlxsw_sp_fid_destroy() symmetric with regards to
    mlxsw_sp_fid_create().
    
    Fixes: 14d39461b3f4 ("mlxsw: spectrum: Use per-FID struct for the VLAN-aware bridge")
    Signed-off-by: Ido Schimmel <idosch@mellanox.com>
    Signed-off-by: Jiri Pirko <jiri@mellanox.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

commit 0583272d91f0f4e21f1eb666786286863185be7e
Author: Ido Schimmel <idosch@mellanox.com>
Date:   Wed Aug 17 16:39:35 2016 +0200

    mlxsw: spectrum: Add missing rollbacks in error path
    
    While going over the code I noticed we are missing two rollbacks in the
    port's creation error path. Add them and adjust the place of one of them
    in the port's removal sequence so that both are symmetric.
    
    Fixes: 56ade8fe3fe1 ("mlxsw: spectrum: Add initial support for Spectrum ASIC")
    Signed-off-by: Ido Schimmel <idosch@mellanox.com>
    Signed-off-by: Jiri Pirko <jiri@mellanox.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

commit 0e7df1a290abbcf3ecf697bbbbd4549c9a113db0
Author: Jiri Pirko <jiri@mellanox.com>
Date:   Wed Aug 17 16:39:34 2016 +0200

    mlxsw: reg: Fix missing op field fill-up
    
    The RALUE pack function needs to set op; otherwise it is always 0,
    which means add.
    
    Fixes: d5a1c749d22 ("mlxsw: reg: Add Router Algorithmic LPM Unicast Entry Register definition")
    Signed-off-by: Jiri Pirko <jiri@mellanox.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

commit a94a614fa2bd32848a67f8261228e193beb826ca
Author: Ido Schimmel <idosch@mellanox.com>
Date:   Wed Aug 17 16:39:33 2016 +0200

    mlxsw: spectrum: Trap loop-backed packets
    
    One of the conditions to generate an ICMP Redirect Message is that "the
    packet is being forwarded out the same physical interface that it was
    received from" (RFC 1812).
    
    Therefore, we need to be able to trap such packets and let the kernel
    decide what to do with them.
    
    For each RIF, enable the loop-back filter, which will raise the LBERROR
    trap whenever the ingress RIF equals the egress RIF.
    
    Fixes: 99724c18fc66 ("mlxsw: spectrum: Introduce support for router interfaces")
    Reported-by: Ilan Tayari <ilant@mellanox.com>
    Signed-off-by: Ido Schimmel <idosch@mellanox.com>
    Signed-off-by: Jiri Pirko <jiri@mellanox.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

commit c20b80187a93b4fcc1c5c46fc8a436df1f17636d
Author: Elad Raz <eladr@mellanox.com>
Date:   Wed Aug 17 16:39:32 2016 +0200

    mlxsw: spectrum: Add missing packet traps
    
    Add the following traps:
    
    1) MTU Error: Trap packets whose size is bigger than the egress RIF's
    MTU. If DF bit isn't set, traffic will continue to be routed in slow
    path.
    
    2) TTL Error: Trap packets whose TTL expired. This allows traceroute to
    work properly.
    
    3) OSPF packets.
    
    Fixes: 7b27ce7bb9cd ("mlxsw: spectrum: Add traps needed for router implementation")
    Signed-off-by: Elad Raz <eladr@mellanox.com>
    Signed-off-by: Ido Schimmel <idosch@mellanox.com>
    Signed-off-by: Jiri Pirko <jiri@mellanox.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

commit 2f25844c233650b2abb92b66b3d0af7d73b5f88f
Author: Ido Schimmel <idosch@mellanox.com>
Date:   Wed Aug 17 16:39:31 2016 +0200

    mlxsw: spectrum: Mark port as active before registering it
    
    Commit bbf2a4757b30 ("mlxsw: spectrum: Initialize ports at the end of
    init sequence") moved ports initialization to the end of the init
    sequence, which means ports are the first to be removed during fini.
    
    Since the FDB delayed work is still active when ports are removed it's
    possible for it to process FDB notifications of inactive ports,
    resulting in a warning message.
    
    Fix that by marking ports as inactive only after unregistering them. The
    NETDEV_UNREGISTER event will invoke bridge's driver port removal
    sequence that will cause the FDB (and FDB notifications) to be flushed.
    
    Fixes: bbf2a4757b30 ("mlxsw: spectrum: Initialize ports at the end of init sequence")
    Signed-off-by: Ido Schimmel <idosch@mellanox.com>
    Signed-off-by: Jiri Pirko <jiri@mellanox.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

commit 05978481e77e47b0bcb1767d3783fa0e5a18f399
Author: Ido Schimmel <idosch@mellanox.com>
Date:   Wed Aug 17 16:39:30 2016 +0200

    mlxsw: spectrum: Create PVID vPort before registering netdevice
    
    After registering a netdevice it's possible for user space applications
    to configure an IP address on it. From the driver's perspective, this
    means a router interface (RIF) should be created for the PVID vPort.
    
    Therefore, we must create the PVID vPort before registering the
    netdevice.
    
    Fixes: 99724c18fc66 ("mlxsw: spectrum: Introduce support for router interfaces")
    Signed-off-by: Ido Schimmel <idosch@mellanox.com>
    Signed-off-by: Jiri Pirko <jiri@mellanox.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

commit fa66d7e3fea7504e241e9004998af2c71814da18
Author: Ido Schimmel <idosch@mellanox.com>
Date:   Wed Aug 17 16:39:29 2016 +0200

    mlxsw: spectrum: Remove redundant errors from the code
    
    Currently, when device configuration fails we emit errors to the kernel
    log despite the fact we already get these from the EMAD transaction
    layer, so remove them.
    
    In addition to being unnecessary, removing these error messages will
    allow us to reuse mlxsw_sp_port_add_vid() to create the PVID vPort
    before registering the netdevice.
    
    Fixes: 99724c18fc66 ("mlxsw: spectrum: Introduce support for router interfaces")
    Signed-off-by: Ido Schimmel <idosch@mellanox.com>
    Signed-off-by: Jiri Pirko <jiri@mellanox.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

commit 7a35583ec5b64f17559c9de8d7c47f7360e40362
Author: Ido Schimmel <idosch@mellanox.com>
Date:   Wed Aug 17 16:39:28 2016 +0200

    mlxsw: spectrum: Don't return upon error in removal path
    
    When removing a VLAN filter from the device we shouldn't return upon the
    first error we encounter, as otherwise we'll have resources that will
    never be freed nor used.
    
    Instead, we should keep trying to free as many resources as possible in
    a best-effort mode.
    
    Remove the error message as well, since we already get these from the
    EMAD transaction code.
    
    Fixes: 99724c18fc66 ("mlxsw: spectrum: Introduce support for router interfaces")
    Signed-off-by: Ido Schimmel <idosch@mellanox.com>
    Signed-off-by: Jiri Pirko <jiri@mellanox.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

commit fbfe12c64f9650aa22f434dd9dd22df7ddf63221
Author: Dave Ertman <david.m.ertman@intel.com>
Date:   Fri Aug 12 09:56:32 2016 -0700

    i40e: check for and deal with non-contiguous TCs
    
    The i40e driver was causing a kernel panic when
    non-contiguous Traffic Classes, or Traffic Classes not
    starting with TC0, were configured on a link partner switch.
    i40e does not support non-contiguous TCs.
    
    To fix this, the patch changes the logic when determining
    the total number of TCs enabled.  Before, this would use the
    highest TC number enabled and assume that all TCs below it were
    also enabled.  Now, we create a bitmask of enabled TCs and scan
    it to determine not only the number of TCs, but also if the set
    of enabled TCs starts at zero and is contiguous.  If not, then
    DCB is disabled by only returning one TC.
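    
    A minimal sketch of the bitmask scan described above, assuming at most
    eight traffic classes. num_tc_from_mask() and MAX_TC are hypothetical
    names for illustration; this is not the i40e implementation.

```c
#include <assert.h>

#define MAX_TC 8 /* assumed number of traffic classes */

/*
 * Return the number of enabled TCs when they form the contiguous set
 * TC0..TC(n-1); otherwise return 1 so that DCB is effectively disabled.
 * Bit i of 'enabled' is set when TCi is enabled.
 */
static unsigned int num_tc_from_mask(unsigned int enabled)
{
	unsigned int count = 0;
	unsigned int i;

	assert((enabled >> MAX_TC) == 0); /* only MAX_TC bits are valid */

	/* Count the run of enabled TCs starting at TC0. */
	for (i = 0; i < MAX_TC && (enabled & (1u << i)); i++)
		count++;

	/*
	 * No run at TC0, or bits set above the run, means the set does
	 * not start at zero or has a gap.
	 */
	if (count == 0 || (enabled >> count) != 0)
		return 1;

	return count;
}
```

    For example, a mask of 0x07 (TC0..TC2) yields 3, while 0x05 (TC0 and
    TC2) and 0x06 (TC1 and TC2) both fall back to a single TC.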
    
    Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
    Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
    Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

commit 3d951822be216d8c6fcfc8abf75e5ed307eeb646
Author: Alexander Duyck <alexander.h.duyck@intel.com>
Date:   Fri Aug 12 09:53:39 2016 -0700

    ixgbe: Re-enable ability to toggle VLAN filtering
    
    Back when I submitted the GSO code I messed up and dropped the support for
    disabling the VLAN tag filtering via the feature bit.  This patch
    re-enables the use of the NETIF_F_HW_VLAN_CTAG_FILTER to enable/disable the
    VLAN filtering independent of toggling promiscuous mode.
    
    Fixes: b83e30104b ("ixgbe/ixgbevf: Add support for GSO partial")
    Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
    Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
    Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

commit f60439bc21e3337429838e477903214f5bd8277f
Author: Alexander Duyck <alexander.h.duyck@intel.com>
Date:   Thu Aug 11 14:51:56 2016 -0700

    ixgbe: Force VLNCTRL.VFE to be set in all VMDq paths
    
    When I was adding the code for enabling VLAN promiscuous mode with SR-IOV
    enabled I had inadvertently left the VLNCTRL.VFE bit unchanged as I had
    assumed there was code in another path that was setting it when we enabled
    SR-IOV.  This wasn't the case and as a result we were just disabling VLAN
    filtering for all the VFs apparently.
    
    Also the previous patches were always clearing CFIEN which was always set
    to 0 by the hardware anyway so I am dropping the redundant bit clearing.
    
    Fixes: 16369564915a ("ixgbe: Add support for VLAN promiscuous with SR-IOV")
    Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
    Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
    Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

commit 8037dd60f45264c3fbbea4cc0cea5f2f0a774b5e
Author: Jarod Wilson <jarod@redhat.com>
Date:   Tue Jul 26 14:25:35 2016 -0400

    e1000e: fix PTP on e1000_pch_lpt variants
    
    I've got reports that the Intel I-218V NIC in Intel NUC5i5RYH systems used
    as a PTP slave experiences random ~10 hour clock jumps, which are resolved
    if the same workaround for the 82574 and 82583 is employed, so set the
    appropriate flag2 in e1000_pch_lpt_info too.
    
    Reported-by: Rupesh Patel <rupatel@redhat.com>
    Signed-off-by: Jarod Wilson <jarod@redhat.com>
    Tested-by: Aaron Brown <aaron.f.brown@intel.com>
    Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

commit 0be5b96cd8400aeb4bf3f8c5e7f5efaa38ae5055
Author: Jarod Wilson <jarod@redhat.com>
Date:   Tue Jul 26 14:25:34 2016 -0400

    e1000e: factor out systim sanitization
    
    This is preparatory work for an expanding list of adapter families that have
    occasional ~10 hour clock jumps when being used for PTP. Factor out the
    sanitization function and convert to using a feature (bug) flag, per
    suggestion from Jesse Brandeburg.
    
    Littering functional code with device-specific checks is much messier than
    simply checking a flag, and having device-specific init set flags as needed.
    There are probably a number of other cases in the e1000e code that
    could/should be converted similarly.
    
    Suggested-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
    Signed-off-by: Jarod Wilson <jarod@redhat.com>
    Tested-by: Aaron Brown <aaron.f.brown@intel.com>
    Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

commit 0066c8b6f4050d7c57f6379d6fd4535e2f267f17
Author: Kshitiz Gupta <kshitiz.gupta@ni.com>
Date:   Sat Jul 16 02:23:45 2016 -0500

    igb: fix adjusting PTP timestamps for Tx/Rx latency
    
    Fix PHY delay compensation math in igb_ptp_tx_hwtstamp() and
    igb_ptp_rx_rgtstamp(). Add PHY delay compensation in
    igb_ptp_rx_pktstamp().
    
    In the IGB driver, there are two functions that retrieve timestamps
    received by the PHY - igb_ptp_rx_rgtstamp() and igb_ptp_rx_pktstamp().
    The previous commit only changed igb_ptp_rx_rgtstamp(), and the change
    was incorrect.
    
    There are two instances in which PHY delay compensations should be
    made:
    
    - Before the packet is transmitted over the PHY, the latency between
      timestamping and transmission of the packet should be added to the
      timestamp, but it is currently subtracted.
    
    - After the packet is received from the PHY, the latency between
      reception and timestamping of the packet should be subtracted from
      the timestamp, but it is currently added.
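    
    The direction of the two corrections can be sketched as follows. The
    function names and the nanosecond units are assumptions made for this
    illustration; they are not the igb helpers.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Tx: the hardware stamps the packet before it crosses the PHY, so the
 * packet reaches the wire later than the raw timestamp. Add the latency.
 */
static uint64_t tx_wire_time_ns(uint64_t hw_stamp_ns, uint64_t tx_latency_ns)
{
	return hw_stamp_ns + tx_latency_ns;
}

/*
 * Rx: the hardware stamps the packet after it has crossed the PHY, so
 * the packet was on the wire earlier than the raw timestamp. Subtract
 * the latency.
 */
static uint64_t rx_wire_time_ns(uint64_t hw_stamp_ns, uint64_t rx_latency_ns)
{
	assert(hw_stamp_ns >= rx_latency_ns); /* avoid unsigned wrap-around */
	return hw_stamp_ns - rx_latency_ns;
}
```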
    
    Signed-off-by: Kshitiz Gupta <kshitiz.gupta@ni.com>
    Fixes: 3f544d2 (igb: adjust ptp timestamps for tx/rx latency)
    Tested-by: Aaron Brown <aaron.f.brown@intel.com>
    Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

commit 55a4e778191cfcf675aa1f9716edb71a3014d5fb
Author: sean.wang@mediatek.com <sean.wang@mediatek.com>
Date:   Tue Aug 16 13:55:15 2016 +0800

    net: ethernet: mediatek: fix runtime warning raised by inconsistent struct device pointers passed to DMA API
    
    When the DMA-API debug feature is enabled, a runtime warning is raised
    because inconsistent struct device pointers are passed to the DMA API.
    This patch uses the same struct device across DMA operations such as
    dma_map_*() and dma_unmap_*() to eliminate the warning.
    
    Signed-off-by: Sean Wang <sean.wang@mediatek.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

commit b2025c7cc92d5bfc8c5ce756c8d8a6f57c776fbd
Author: sean.wang@mediatek.com <sean.wang@mediatek.com>
Date:   Tue Aug 16 13:55:14 2016 +0800

    net: ethernet: mediatek: fix flow control settings on GMAC0 is not being enabled properly
    
    Commit 08ef55c6f257acf3bdc6940813f80e8f0f5d90ec
    ("net-next: mediatek: fix gigabit and flow control advertisement")
    added proper flow control settings for GMAC1. For GMAC0, however:
    
    1. GMAC0 shares the common logic with GMAC1 inside mtk_phy_link_adjust()
    to adapt various settings for the target phy.
    
    2. GMAC0 uses fixed-phy to connect to a builtin gigabit switch with
    fixed link speed, as commit 0c72c50f6f93b0c3daa9ea35d89ab3a933c7b5a0
    ("net-next: mediatek: add fixed-phy support") describes.
    
    3. However, fixed-phy doesn't set the SUPPORTED_Pause and
    SUPPORTED_Asym_Pause flags by default, so mtk_phy_link_adjust() does
    not enable flow control on GMAC0 properly and packets are dropped
    under high traffic.
    
    For these reasons, the patch adds the SUPPORTED_Pause and
    SUPPORTED_Asym_Pause flags to the fixed-phy used by the driver so that
    both GMACs are handled properly by the shared common logic.
    
    Signed-off-by: Sean Wang <sean.wang@mediatek.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

commit 8ca7f4fe0733342c862b8585dd6eb6521b9bf533
Author: sean.wang@mediatek.com <sean.wang@mediatek.com>
Date:   Tue Aug 16 13:55:13 2016 +0800

    net: ethernet: mediatek: fix RMII mode and add REVMII supported by GMAC
    
    The patch fixes the incorrect setup of reduced MII (RMII) on GMAC,
    adds the setup of reverse MII (REVMII) on GMAC, and rearranges the
    error handling for an invalid PHY argument.
    
    Signed-off-by: Sean Wang <sean.wang@mediatek.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

commit 33e7664a0af6e9a516f01014f39737aaa119b6d9
Author: Wei Yongjun <weiyj.lk@gmail.com>
Date:   Tue Jul 26 14:49:04 2016 +0000

    power_supply: tps65217-charger: fix missing platform_set_drvdata()
    
    Add missing platform_set_drvdata() in tps65217_charger_probe(), otherwise
    calling platform_get_drvdata() in remove returns NULL.
    
    This is detected by Coccinelle semantic patch.
    
    Fixes: 3636859b280c ("power_supply: Add support for tps65217-charger")
    Signed-off-by: Wei Yongjun <weiyj.lk@gmail.com>
    Signed-off-by: Sebastian Reichel <sre@kernel.org>

commit d2fbdf76b85bcdfe57b8ef2ba09d20e8ada79abd
Author: Vegard Nossum <vegard.nossum@oracle.com>
Date:   Sat Jul 23 08:15:04 2016 +0200

    tipc: fix NULL pointer dereference in shutdown()
    
    tipc_msg_create() can return a NULL skb and if so, we shouldn't try to
    call tipc_node_xmit_skb() on it.
    
        general protection fault: 0000 [#1] PREEMPT SMP KASAN
        CPU: 3 PID: 30298 Comm: trinity-c0 Not tainted 4.7.0-rc7+ #19
        Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
        task: ffff8800baf09980 ti: ffff8800595b8000 task.ti: ffff8800595b8000
        RIP: 0010:[<ffffffff830bb46b>]  [<ffffffff830bb46b>] tipc_node_xmit_skb+0x6b/0x140
        RSP: 0018:ffff8800595bfce8  EFLAGS: 00010246
        RAX: 0000000000000000 RBX: 0000000000000000 RCX: 000000003023b0e0
        RDX: 0000000000000000 RSI: dffffc0000000000 RDI: ffffffff83d12580
        RBP: ffff8800595bfd78 R08: ffffed000b2b7f32 R09: 0000000000000000
        R10: fffffbfff0759725 R11: 0000000000000000 R12: 1ffff1000b2b7f9f
        R13: ffff8800595bfd58 R14: ffffffff83d12580 R15: dffffc0000000000
        FS:  00007fcdde242700(0000) GS:ffff88011af80000(0000) knlGS:0000000000000000
        CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        CR2: 00007fcddde1db10 CR3: 000000006874b000 CR4: 00000000000006e0
        DR0: 00007fcdde248000 DR1: 00007fcddd73d000 DR2: 00007fcdde248000
        DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000090602
        Stack:
         0000000000000018 0000000000000018 0000000041b58ab3 ffffffff83954208
         ffffffff830bb400 ffff8800595bfd30 ffffffff8309d767 0000000000000018
         0000000000000018 ffff8800595bfd78 ffffffff8309da1a 00000000810ee611
        Call Trace:
         [<ffffffff830c84a3>] tipc_shutdown+0x553/0x880
         [<ffffffff825b4a3b>] SyS_shutdown+0x14b/0x170
         [<ffffffff8100334c>] do_syscall_64+0x19c/0x410
         [<ffffffff83295ca5>] entry_SYSCALL64_slow_path+0x25/0x25
        Code: 90 00 b4 0b 83 c7 00 f1 f1 f1 f1 4c 8d 6d e0 c7 40 04 00 00 00 f4 c7 40 08 f3 f3 f3 f3 48 89 d8 48 c1 e8 03 c7 45 b4 00 00 00 00 <80> 3c 30 00 75 78 48 8d 7b 08 49 8d 75 c0 48 b8 00 00 00 00 00
        RIP  [<ffffffff830bb46b>] tipc_node_xmit_skb+0x6b/0x140
         RSP <ffff8800595bfce8>
        ---[ end trace 57b0484e351e71f1 ]---
    
    I feel like we should maybe return -ENOMEM or -ENOBUFS, but I'm not sure
    userspace is equipped to handle that. Anyway, this is better than a GPF
    and looks somewhat consistent with other tipc_msg_create() callers.
    
    Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
    Acked-by: Ying Xue <ying.xue@windriver.com>
    Acked-by: Jon Maloy <jon.maloy@ericsson.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

commit 0dbff144a1e7310e2f8b7a957352c4be9aeb38e4
Author: Vitaly Kuznetsov <vkuznets@redhat.com>
Date:   Mon Aug 15 17:48:43 2016 +0200

    hv_netvsc: fix bonding devices check in netvsc_netdev_event()
    
    Bonding driver sets IFF_BONDING on both master (the bonding device) and
    slave (the real NIC) devices and in netvsc_netdev_event() we want to skip
    master devices only. Currently, there is an uncertainty when a slave
    interface is removed: if bonding module comes first in netdev_chain it
    clears IFF_BONDING flag on the netdev and netvsc_netdev_event() correctly
    handles NETDEV_UNREGISTER event, but in case netvsc comes first on the
    chain it sees the device with IFF_BONDING still attached and skips it. As
    we still hold vf_netdev pointer to the device we crash on the next inject.
    
    Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
    Acked-by: Haiyang Zhang <haiyangz@microsoft.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

commit 0f20d795f78d182c4b743d880a5e8dc4d39892fe
Author: Vitaly Kuznetsov <vkuznets@redhat.com>
Date:   Mon Aug 15 17:48:42 2016 +0200

    hv_netvsc: protect module refcount by checking net_device_ctx->vf_netdev
    
    We're not guaranteed to see NETDEV_REGISTER/NETDEV_UNREGISTER notifications
    only once per VF but we increase/decrease module refcount unconditionally.
    Check vf_netdev to make sure we don't take/release it twice. We presume
    that only one VF per netvsc device may exist.
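    
    A minimal userspace sketch of this guard, assuming one VF per context.
    struct vf_ctx, vf_register() and vf_unregister() are hypothetical
    stand-ins, and module_refs models the module refcount with a plain
    integer; this is not the netvsc code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-device context: one VF pointer and a reference count. */
struct vf_ctx {
	const void *vf_netdev; /* NULL while no VF is attached */
	int module_refs;       /* stands in for the module refcount */
};

/* Take a reference only on the first REGISTER notification. */
static void vf_register(struct vf_ctx *ctx, const void *vf)
{
	assert(ctx && vf);
	if (ctx->vf_netdev)
		return; /* already registered: do not take a second reference */
	ctx->module_refs++;
	ctx->vf_netdev = vf;
}

/* Drop the reference only if a VF is actually attached. */
static void vf_unregister(struct vf_ctx *ctx)
{
	assert(ctx);
	if (!ctx->vf_netdev)
		return; /* nothing registered: do not drop a reference twice */
	ctx->vf_netdev = NULL;
	ctx->module_refs--;
}
```

    Repeated notifications then leave the count balanced: two registers
    followed by two unregisters end with module_refs back at zero.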
    
    Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
    Acked-by: Haiyang Zhang <haiyangz@microsoft.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

commit 57c1826b991244d2144eb6e3d5d1b13a53cbea63
Author: Vitaly Kuznetsov <vkuznets@redhat.com>
Date:   Mon Aug 15 17:48:41 2016 +0200

    hv_netvsc: reset vf_inject on VF removal
    
    We reset vf_inject on VF going down (netvsc_vf_down()) but we don't on
    VF removal (netvsc_unregister_vf()) so vf_inject stays 'true' while
    vf_netdev is already NULL and we're trying to inject packets into NULL
    net device in netvsc_recv_callback() causing kernel to crash.
    
    Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
    Acked-by: Haiyang Zhang <haiyangz@microsoft.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

commit d072218f214929194db06069564495b6b9fff34a
Author: Vitaly Kuznetsov <vkuznets@redhat.com>
Date:   Mon Aug 15 17:48:40 2016 +0200

    hv_netvsc: avoid deadlocks between rtnl lock and vf_use_cnt wait
    
    Here is a deadlock scenario:
    - netvsc_vf_up() schedules netvsc_notify_peers() work and quits.
    - netvsc_vf_down() runs before netvsc_notify_peers() gets executed. As it
      is being executed from netdev notifier chain we hold rtnl lock when we
      get here.
    - we enter while (atomic_read(&net_device_ctx->vf_use_cnt) != 0) loop and
      wait till netvsc_notify_peers() drops vf_use_cnt.
    - netvsc_notify_peers() starts on some other CPU but netdev_notify_peers()
      will hang on rtnl_lock().
    - deadlock!
    
    Instead of introducing additional synchronization I suggest we drop
    gwrk.dwrk completely and call NETDEV_NOTIFY_PEERS directly. As we're
    acting under rtnl lock this is legitimate.
    
    Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
    Acked-by: Haiyang Zhang <haiyangz@microsoft.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

commit f9a7da9130ef0143eb900794c7863dc5c9051fbc
Author: Vitaly Kuznetsov <vkuznets@redhat.com>
Date:   Mon Aug 15 17:48:39 2016 +0200

    hv_netvsc: don't lose VF information
    
    struct netvsc_device is not suitable for storing VF information as this
    structure is being destroyed on MTU change / set channel operation (see
    rndis_filter_device_remove()). Move all VF related stuff to struct
    net_device_context which is persistent.
    
    Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
    Acked-by: Haiyang Zhang <haiyangz@microsoft.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

commit 3d7b33209201cbfa090d614db993571ca3c6b090
Author: Simon Horman <simon.horman@netronome.com>
Date:   Mon Aug 15 13:06:24 2016 +0200

    gre: set inner_protocol on xmit
    
    Ensure that the inner_protocol is set on transmit so that GSO segmentation,
    which relies on that field, works correctly.
    
    This is achieved by setting the inner_protocol in gre_build_header rather
    than each caller of that function. It ensures that the inner_protocol is
    set when gre_fb_xmit() is used to transmit GRE which was not previously the
    case.
    
    I have observed this is not the case when OvS transmits GRE using
    lwtunnel metadata (which it always does).
    
    Fixes: 38720352412a ("gre: Use inner_proto to obtain inner header protocol")
    Cc: Pravin Shelar <pshelar@ovn.org>
    Acked-by: Alexander Duyck <alexander.h.duyck@intel.com>
    Signed-off-by: Simon Horman <simon.horman@netronome.com>
    Acked-by: Pravin B Shelar <pshelar@ovn.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>

commit 5e457896986e16c440c97bb94b9ccd95dd157292
Author: Lorenzo Colitti <lorenzo@google.com>
Date:   Sat Aug 13 01:13:38 2016 +0900

    net: ipv6: Fix ping to link-local addresses.
    
    ping_v6_sendmsg does not set flowi6_oif in response to
    sin6_scope_id or sk_bound_dev_if, so it is not possible to use
    these APIs to ping an IPv6 address on a different interface.
    Instead, it sets flowi6_iif, which is incorrect but harmless.
    
    Stop setting flowi6_iif, and support various ways of setting oif
    in the same priority order used by udpv6_sendmsg.
    
    Tested: https://android-review.googlesource.com/#/c/254470/
    Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

commit 12311959ecf8a3a64676c01b62ce67a0c5f0fd49
Author: Vegard Nossum <vegard.nossum@oracle.com>
Date:   Fri Aug 12 20:10:44 2016 +0200

    rhashtable: fix shift by 64 when shrinking
    
    I got this:
    
        ================================================================================
        UBSAN: Undefined behaviour in ./include/linux/log2.h:63:13
        shift exponent 64 is too large for 64-bit type 'long unsigned int'
        CPU: 1 PID: 721 Comm: kworker/1:1 Not tainted 4.8.0-rc1+ #87
        Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2014
        Workqueue: events rht_deferred_worker
         0000000000000000 ffff88011661f8d8 ffffffff82344f50 0000000041b58ab3
         ffffffff84f98000 ffffffff82344ea4 ffff88011661f900 ffff88011661f8b0
         0000000000000001 ffff88011661f6b8 dffffc0000000000 ffffffff867f7640
        Call Trace:
         [<ffffffff82344f50>] dump_stack+0xac/0xfc
         [<ffffffff82344ea4>] ? _atomic_dec_and_lock+0xc4/0xc4
         [<ffffffff8242f5b8>] ubsan_epilogue+0xd/0x8a
         [<ffffffff82430c41>] __ubsan_handle_shift_out_of_bounds+0x255/0x29a
         [<ffffffff824309ec>] ? __ubsan_handle_out_of_bounds+0x180/0x180
         [<ffffffff84003436>] ? nl80211_req_set_reg+0x256/0x2f0
         [<ffffffff812112ba>] ? print_context_stack+0x8a/0x160
         [<ffffffff81200031>] ? amd_pmu_reset+0x341/0x380
         [<ffffffff823af808>] rht_deferred_worker+0x1618/0x1790
         [<ffffffff823af808>] ? rht_deferred_worker+0x1618/0x1790
         [<ffffffff823ae1f0>] ? rhashtable_jhash2+0x370/0x370
         [<ffffffff8134c12d>] ? process_one_work+0x6fd/0x1970
         [<ffffffff8134c1cf>] process_one_work+0x79f/0x1970
         [<ffffffff8134c12d>] ? process_one_work+0x6fd/0x1970
         [<ffffffff8134ba30>] ? try_to_grab_pending+0x4c0/0x4c0
         [<ffffffff8134d564>] ? worker_thread+0x1c4/0x1340
         [<ffffffff8134d8ff>] worker_thread+0x55f/0x1340
         [<ffffffff845e904f>] ? __schedule+0x4df/0x1d40
         [<ffffffff8134d3a0>] ? process_one_work+0x1970/0x1970
         [<ffffffff8134d3a0>] ? process_one_work+0x1970/0x1970
         [<ffffffff813642f7>] kthread+0x237/0x390
         [<ffffffff813640c0>] ? __kthread_parkme+0x280/0x280
         [<ffffffff845f8c93>] ? _raw_spin_unlock_irq+0x33/0x50
         [<ffffffff845f95df>] ret_from_fork+0x1f/0x40
         [<ffffffff813640c0>] ? __kthread_parkme+0x280/0x280
        ================================================================================
    
    roundup_pow_of_two() is undefined when called with an argument of 0, so
    let's avoid the call and just fall back to ht->p.min_size (which should
    never be smaller than HASH_MIN_SIZE).
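    
    The guard can be sketched in userspace as below. roundup_pow2() is a
    simple stand-in for the kernel macro, shrink_target() is a hypothetical
    name, and the 3/2 headroom factor is an assumption of this sketch.

```c
#include <assert.h>
#include <limits.h>

/* Smallest power of two >= n. Like the kernel macro, undefined for 0. */
static unsigned long roundup_pow2(unsigned long n)
{
	unsigned long p = 1;

	assert(n != 0 && n <= (ULONG_MAX >> 1) + 1);
	while (p < n)
		p <<= 1;
	return p;
}

/* Pick the table size to shrink to, never calling roundup_pow2(0). */
static unsigned long shrink_target(unsigned long nelems,
				   unsigned long min_size)
{
	unsigned long size = 0;

	if (nelems)
		size = roundup_pow2(nelems * 3 / 2);
	if (size < min_size)
		size = min_size; /* an empty table falls back to the minimum */

	return size;
}
```

    An empty table therefore shrinks straight to min_size, and the
    rounding helper is only ever called with a non-zero argument.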
    
    Cc: Herbert Xu <herbert@gondor.apana.org.au>
    Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
    Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
    Signed-off-by: David S. Miller <davem@davemloft.net>

commit eb8fc32354aa77678dc6e7950a8f0c79cace204f
Author: Vincent <vincent.stehle@laposte.net>
Date:   Sun Aug 14 15:38:29 2016 +0200

    mlxsw: spectrum_router: Fix use after free
    
    In mlxsw_sp_router_fib4_add_info_destroy(), the fib_entry pointer is used
    after it has been freed by mlxsw_sp_fib_entry_destroy(). Use a temporary
    variable to fix this.
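    
    The pattern looks roughly like this in a self-contained form. All
    types and functions here (struct vr, struct fib_entry, vr_put() and so
    on) are hypothetical simplifications, not the mlxsw definitions.

```c
#include <assert.h>
#include <stdlib.h>

struct vr { int refs; };            /* hypothetical refcounted object */
struct fib_entry { struct vr *vr; }; /* entry holding a reference to it */

static struct fib_entry *fib_entry_create(struct vr *vr)
{
	struct fib_entry *entry = malloc(sizeof(*entry));

	if (!entry)
		return NULL;
	entry->vr = vr;
	vr->refs++;
	return entry;
}

static void fib_entry_destroy(struct fib_entry *entry)
{
	free(entry);
}

static void vr_put(struct vr *vr)
{
	assert(vr->refs > 0);
	vr->refs--;
}

static void add_info_destroy(struct fib_entry *entry)
{
	/* Read entry->vr before the entry is freed ... */
	struct vr *vr = entry->vr;

	fib_entry_destroy(entry);
	/* ... so that this call does not touch freed memory. */
	vr_put(vr);
}
```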
    
    Fixes: 61c503f976b5449e ("mlxsw: spectrum_router: Implement fib4 add/del switchdev obj ops")
    Signed-off-by: Vincent Stehlé <vincent.stehle@laposte.net>
    Cc: Jiri Pirko <jiri@mellanox.com>
    Acked-by: Ido Schimmel <idosch@mellanox.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

commit 4cf0b354d92ee2c642532ee39e330f8f580fd985
Author: Florian Westphal <fw@strlen.de>
Date:   Fri Aug 12 12:03:52 2016 +0200

    rhashtable: avoid large lock-array allocations
    
    Sander reports following splat after netfilter nat bysrc table got
    converted to rhashtable:
    
    swapper/0: page allocation failure: order:3, mode:0x2084020(GFP_ATOMIC|__GFP_COMP)
     CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.8.0-rc1 [..]
     [<ffffffff811633ed>] warn_alloc_failed+0xdd/0x140
     [<ffffffff811638b1>] __alloc_pages_nodemask+0x3e1/0xcf0
     [<ffffffff811a72ed>] alloc_pages_current+0x8d/0x110
     [<ffffffff8117cb7f>] kmalloc_order+0x1f/0x70
     [<ffffffff811aec19>] __kmalloc+0x129/0x140
     [<ffffffff8146d561>] bucket_table_alloc+0xc1/0x1d0
     [<ffffffff8146da1d>] rhashtable_insert_rehash+0x5d/0xe0
     [<ffffffff819fcfff>] nf_nat_setup_info+0x2ef/0x400
    
    The failure happens when allocating the spinlock array.
    Even with GFP_KERNEL it's unlikely for such a large allocation
    to succeed.
    
    Thomas Graf pointed me at inet_ehash_locks_alloc(), so in addition
    to adding NOWARN for atomic allocations this also makes the bucket-array
    sizing more conservative.
    
    In commit 095dc8e0c3686 ("tcp: fix/cleanup inet_ehash_locks_alloc()"),
    Eric Dumazet says: "Budget 2 cache lines per cpu worth of 'spinlocks'".
    IOW, consider size needed by a single spinlock when determining
    number of locks per cpu.  So with 64 byte per cacheline and 4 byte per
    spinlock this gives 32 locks per cpu.
    
    Resulting size of the lock-array (sizeof(spinlock) == 4):
    
    cpus:    1   2   4   8   16   32   64
    old:    1k  1k  4k  8k  16k  16k  16k
    new:   128 256 512  1k   2k   4k   8k
    
    8k allocation should have decent chance of success even
    with GFP_ATOMIC, and should not fail with GFP_KERNEL.
    
    With 72-byte spinlock (LOCKDEP):
    cpus :   1   2
    old:    9k 18k
    new:   ~2k ~4k
    
    Reported-by: Sander Eikelenboom <linux@eikelenboom.it>
    Suggested-by: Thomas Graf <tgraf@suug.ch>
    Signed-off-by: Florian Westphal <fw@strlen.de>
    Signed-off-by: David S. Miller <davem@davemloft.net>
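
The sizing arithmetic in this commit message can be checked with a small sketch. The helper names are hypothetical. The sketch assumes the "new" rows are simply cpus × 32 locks × sizeof(spinlock), with the per-CPU count fixed at 32 even for the 72-byte LOCKDEP case, which matches the quoted ~2k and ~4k figures.

```c
#include <stddef.h>

#define DEMO_CACHE_LINE_BYTES		64u	/* cache line size quoted in the commit */
#define DEMO_CACHE_LINES_PER_CPU	2u	/* "2 cache lines per cpu" budget */

/* Locks per CPU for a given spinlock size: 2 * 64 / 4 = 32. */
static unsigned int demo_locks_per_cpu(size_t lock_bytes)
{
	return (unsigned int)((DEMO_CACHE_LINES_PER_CPU * DEMO_CACHE_LINE_BYTES) /
			      lock_bytes);
}

/* Bytes in the lock array when every CPU gets locks_per_cpu locks. */
static size_t demo_lock_array_bytes(unsigned int cpus,
				    unsigned int locks_per_cpu,
				    size_t lock_bytes)
{
	return (size_t)cpus * locks_per_cpu * lock_bytes;
}
```

With 4-byte spinlocks this gives 128 bytes for one CPU and 8192 bytes for 64 CPUs; with 72-byte spinlocks and 32 locks per CPU it gives 2304 and 4608 bytes for one and two CPUs.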

commit 6be3ffaa0e15c64f560904b025f5c50bef5886f9
Author: Michael S. Tsirkin <mst@redhat.com>
Date:   Mon Aug 15 04:50:55 2016 +0300

    tools/virtio: add dma stubs
    
    Fixes build after recent IOMMU-related changes,
    mostly by adding more stubs.
    
    Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

commit 446374d7c7f89603d8151b56824e2cac85ed8e0d
Author: Michael S. Tsirkin <mst@redhat.com>
Date:   Mon Aug 15 04:28:12 2016 +0300

    vhost/test: fix after swiotlb changes
    
    Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

commit 21bc54fc0cdc31de72b57d2b3c79cf9c2b83cf39
Author: Gerard Garcia <ggarcia@deic.uab.cat>
Date:   Wed Aug 10 17:24:34 2016 +0200

    vhost/vsock: drop space available check for TX vq
    
    Remove unnecessary use of enable/disable callback notifications
    and the incorrect more space available check.
    
    The virtio_transport_tx_work handles when the TX virtqueue
    has more buffers available.
    
    Signed-off-by: Gerard Garcia <ggarcia@deic.uab.cat>
    Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
    Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

commit 52012619e5a2ca0491426c3712fb9054692d4a3c
Author: Michael S. Tsirkin <mst@redhat.com>
Date:   Sun Aug 14 23:44:21 2016 +0300

    ringtest: test build fix
    
    Recent changes to ptr_ring broke the ringtest
    which lacks a likely() stub. Fix it up.
    
    Fixes: 982fb490c298896d15e9323a882f34a57c11ff56
    	("ptr_ring: support zero length ring")
    Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

commit 952fcfd08c8109951622579d0ae7b9cd6cafd688
Author: Sabrina Dubroca <sd@queasysnail.net>
Date:   Fri Aug 12 16:10:33 2016 +0200

    net: remove type_check from dev_get_nest_level()
    
    The idea for type_check in dev_get_nest_level() was to count the number
    of nested devices of the same type (currently, only macvlan or vlan
    devices).
    This prevented the false positive lockdep warning on configurations such
    as:
    
    eth0 <--- macvlan0 <--- vlan0 <--- macvlan1
    
    However, this doesn't prevent a warning on a configuration such as:
    
    eth0 <--- macvlan0 <--- vlan0
    eth1 <--- vlan1 <--- macvlan1
    
    In this case, all the locks end up with a nesting subclass of 1, so
    lockdep thinks that there is still a deadlock:
    
    - in the first case we have (macvlan_netdev_addr_lock_key, 1) and then
      take (vlan_netdev_xmit_lock_key, 1)
    - in the second case, we have (vlan_netdev_xmit_lock_key, 1) and then
      take (macvlan_netdev_addr_lock_key, 1)
    
    By removing the linktype check in dev_get_nest_level() and always
    incrementing the nesting depth, lockdep considers this configuration
    valid.
    
    Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
    Signed-off-by: David S. Miller <davem@davemloft.net>
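
The difference between the two counting schemes can be seen in a simplified model. The sketch below assumes each device has at most one lower device and uses hypothetical names (`demo_dev`, `demo_nest_level_typed`, `demo_nest_level`); it is not the kernel's dev_get_nest_level().

```c
#include <stddef.h>

enum demo_type { DEMO_ETH, DEMO_VLAN, DEMO_MACVLAN };

struct demo_dev {
	enum demo_type type;
	const struct demo_dev *lower;	/* single lower device; NULL for a real NIC */
};

/* Old behaviour: count only devices of the given type in the chain. */
static int demo_nest_level_typed(const struct demo_dev *dev, enum demo_type type)
{
	int level = 0;

	for (; dev; dev = dev->lower)
		if (dev->type == type)
			level++;
	return level;
}

/* New behaviour: every stacked device adds one level, whatever its type. */
static int demo_nest_level(const struct demo_dev *dev)
{
	int level = 0;

	for (; dev && dev->lower; dev = dev->lower)
		level++;
	return level;
}
```

For the two stacks in the commit message, the typed count yields 1 for all four virtual devices, while the untyped depth yields 1 and 2 within each stack, so the two lock orders no longer share the same (class, subclass) pairs.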

commit e20038724552cd05e351cd7d7526d646953d26b7
Author: Sabrina Dubroca <sd@queasysnail.net>
Date:   Fri Aug 12 16:10:32 2016 +0200

    macsec: fix lockdep splats when nesting devices
    
    Currently, trying to setup a vlan over a macsec device, or other
    combinations of devices, triggers a lockdep warning.
    
    Use netdev_lockdep_set_classes and ndo_get_lock_subclass, similar to
    what macvlan does.
    
    Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
    Signed-off-by: David S. Miller <davem@davemloft.net>

commit bc561632dddd5af0c4444d919f01cbf6d553aa0a
Author: Mike Manning <mmanning@brocade.com>
Date:   Fri Aug 12 12:02:38 2016 +0100

    net: ipv6: Do not keep IPv6 addresses when IPv6 is disabled
    
    If IPv6 is disabled when the option is set to keep IPv6
    addresses on link down, userspace is unaware of this as
    there is no such indication via netlink. The solution is to
    remove the IPv6 addresses in this case, which results in
    netlink messages indicating removal of addresses in the
    usual manner. This fix also makes the behavior consistent
    with the case of having IPv6 disabled first, which stops
    IPv6 addresses from being added.
    
    Fixes: f1705ec197e7 ("net: ipv6: Make address flushing on ifdown optional")
    Signed-off-by: Mike Manning <mmanning@brocade.com>
    Acked-by: David Ahern <dsa@cumulusnetworks.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

commit 54236ab09e9696a27baaae693c288920a26e8588
Author: Vegard Nossum <vegard.nossum@oracle.com>
Date:   Fri Aug 12 09:50:51 2016 +0200

    net/sctp: always initialise sctp_ht_iter::start_fail
    
    sctp_transport_seq_start() does not currently clear iter->start_fail on
    success, but relies on it being zero when it is allocated (by
    seq_open_net()).
    
    This can be a problem in the following sequence:
    
        open() // allocates iter (and implicitly sets iter->start_fail = 0)
        read()
         - iter->start() // fails and sets iter->start_fail = 1
         - iter->stop() // doesn't call sctp_transport_walk_stop() (correct)
        read() again
         - iter->start() // succeeds, but doesn't change iter->start_fail
         - iter->stop() // doesn't call sctp_transport_walk_stop() (wrong)
    
    We should initialize sctp_ht_iter::start_fail to zero if ->start()
    succeeds, otherwise it's possible that we leave an old value of 1 there,
    which will cause ->stop() to not call sctp_transport_walk_stop(), which
    causes all sorts of problems like not calling rcu_read_unlock() (and
    preempt_enable()), eventually leading to more warnings like this:
    
        BUG: sleeping function called from invalid context at mm/slab.h:388
        in_atomic(): 0, irqs_disabled(): 0, pid: 16551, name: trinity-c2
        Preemption disabled at:[<ffffffff819bceb6>] rhashtable_walk_start+0x46/0x150
    
         [<ffffffff81149abb>] preempt_count_add+0x1fb/0x280
         [<ffffffff83295892>] _raw_spin_lock+0x12/0x40
         [<ffffffff819bceb6>] rhashtable_walk_start+0x46/0x150
         [<ffffffff82ec665f>] sctp_transport_walk_start+0x2f/0x60
         [<ffffffff82edda1d>] sctp_transport_seq_start+0x4d/0x150
         [<ffffffff81439e50>] traverse+0x170/0x850
         [<ffffffff8143aeec>] seq_read+0x7cc/0x1180
         [<ffffffff814f996c>] proc_reg_read+0xbc/0x180
         [<ffffffff813d0384>] do_loop_readv_writev+0x134/0x210
         [<ffffffff813d2a95>] do_readv_writev+0x565/0x660
         [<ffffffff813d6857>] vfs_readv+0x67/0xa0
         [<ffffffff813d6c16>] do_preadv+0x126/0x170
         [<ffffffff813d710c>] SyS_preadv+0xc/0x10
         [<ffffffff8100334c>] do_syscall_64+0x19c/0x410
         [<ffffffff83296225>] return_from_SYSCALL_64+0x0/0x6a
         [<ffffffffffffffff>] 0xffffffffffffffff
    
    Notice that this is a subtly different stacktrace from the one in commit
    5fc382d875 ("net/sctp: terminate rhashtable walk correctly").
    
    Cc: Xin Long <lucien.xin@gmail.com>
    Cc: Herbert Xu <herbert@gondor.apana.org.au>
    Cc: Eric W. Biederman <ebiederm@xmission.com>
    Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
    Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
    Acked-By: Neil Horman <nhorman@tuxdriver.com>
    Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
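
The open/read/read sequence in this commit message can be modelled with a two-field state machine. This is a minimal sketch with hypothetical names; `walk_active` stands in for the state that sctp_transport_walk_stop() is responsible for releasing, and the failure is injected through a parameter rather than coming from a real hash-table walk.

```c
struct demo_iter {
	int start_fail;		/* set when ->start() failed */
	int walk_active;	/* stands in for the lock/walk state */
};

/* ->start(): walk_fails simulates the underlying walk failing. */
static int demo_seq_start(struct demo_iter *iter, int walk_fails)
{
	if (walk_fails) {
		iter->start_fail = 1;
		return -1;
	}
	iter->walk_active = 1;
	iter->start_fail = 0;	/* the fix: clear any stale value on success */
	return 0;
}

/* ->stop(): only ends the walk if ->start() succeeded. */
static void demo_seq_stop(struct demo_iter *iter)
{
	if (iter->start_fail)
		return;
	iter->walk_active = 0;
}
```

Without the assignment on the success path, a failed first read leaves `start_fail` at 1, and the second read's `demo_seq_stop()` returns early with `walk_active` still set.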

commit 5ba092efc7ddff040777ae7162f1d195f513571b
Author: Vegard Nossum <vegard.nossum@oracle.com>
Date:   Fri Aug 12 10:29:13 2016 +0200

    net/irda: handle iriap_register_lsap() allocation failure
    
    If iriap_register_lsap() fails to allocate memory, self->lsap is
    set to NULL. However, none of the callers handle the failure and
    irlmp_connect_request() will happily dereference it:
    
        iriap_register_lsap: Unable to allocated LSAP!
        ================================================================================
        UBSAN: Undefined behaviour in net/irda/irlmp.c:378:2
        member access within null pointer of type 'struct lsap_cb'
        CPU: 1 PID: 15403 Comm: trinity-c0 Not tainted 4.8.0-rc1+ #81
        Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org
        04/01/2014
         0000000000000000 ffff88010c7e78a8 ffffffff82344f40 0000000041b58ab3
         ffffffff84f98000 ffffffff82344e94 ffff88010c7e78d0 ffff88010c7e7880
         ffff88010630ad00 ffffffff84a5fae0 ffffffff84d3f5c0 000000000000017a
        Call Trace:
         [<ffffffff82344f40>] dump_stack+0xac/0xfc
         [<ffffffff8242f5a8>] ubsan_epilogue+0xd/0x8a
         [<ffffffff824302bf>] __ubsan_handle_type_mismatch+0x157/0x411
         [<ffffffff83b7bdbc>] irlmp_connect_request+0x7ac/0x970
         [<ffffffff83b77cc0>] iriap_connect_request+0xa0/0x160
         [<ffffffff83b77f48>] state_s_disconnect+0x88/0xd0
         [<ffffffff83b78904>] iriap_do_client_event+0x94/0x120
         [<ffffffff83b77710>] iriap_getvaluebyclass_request+0x3e0/0x6d0
         [<ffffffff83ba6ebb>] irda_find_lsap_sel+0x1eb/0x630
         [<ffffffff83ba90c8>] irda_connect+0x828/0x12d0
         [<ffffffff833c0dfb>] SYSC_connect+0x22b/0x340
         [<ffffffff833c7e09>] SyS_connect+0x9/0x10
         [<ffffffff81007bd3>] do_syscall_64+0x1b3/0x4b0
         [<ffffffff845f946a>] entry_SYSCALL64_slow_path+0x25/0x25
        ================================================================================
    
    The bug seems to have been around since forever.
    
    There's more problems with missing error checks in iriap_init() (and
    indeed all of irda_init()), but that's a bigger problem that needs
    very careful review and testing. This patch will fix the most serious
    bug (as it's easily reached from unprivileged userspace).
    
    I have tested my patch with a reproducer.
    
    Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

commit c15c0ab12fd62f2b19181d05c62d24bc9fa55a42
Author: Johannes Berg <johannes.berg@intel.com>
Date:   Fri Aug 12 07:48:21 2016 +0200

    ipv6: suppress sparse warnings in IP6_ECN_set_ce()
    
    Pass the correct type __wsum to csum_sub() and csum_add(). This doesn't
    really change anything since __wsum really *is* __be32, but removes the
    address space warnings from sparse.
    
    Cc: Eric Dumazet <edumazet@google.com>
    Fixes: 34ae6a1aa054 ("ipv6: update skb->csum when CE mark is propagated")
    Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    Acked-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

commit 0ed661d5a48fa6df0b50ae64d27fe759a3ce42cf
Author: Daniel Borkmann <daniel@iogearbox.net>
Date:   Thu Aug 11 21:38:37 2016 +0200

    bpf: fix write helpers with regards to non-linear parts
    
    Fix the bpf_try_make_writable() helper and all call sites we have in BPF,
    it's currently defective with regards to skbs when the write_len spans into
    non-linear parts, no matter if cloned or not.
    
    There are multiple issues at once. First, using skb_store_bits() is not
    correct since even if we have a cloned skb, page frags can still be shared.
    To really make them private, we need to pull them in via __pskb_pull_tail()
    first, which also gets us a private head via pskb_expand_head() implicitly.
    
    This is for helpers like bpf_skb_store_bytes(), bpf_l3_csum_replace(),
    bpf_l4_csum_replace(). Really, the only thing reasonable and working here
    is to call skb_ensure_writable() before any write operation. Meaning, via
    pskb_may_pull() it makes sure that parts we want to access are pulled in and
    if not does so plus unclones the skb implicitly. If our write_len still fits
    the headlen and we're cloned and our header of the clone is not writable,
    then we need to make a private copy via pskb_expand_head(). skb_store_bits()
    is a bit misleading and only safe to store into non-linear data in different
    contexts such as 357b40a18b04 ("[IPV6]: IPV6_CHECKSUM socket option can
    corrupt kernel memory").
    
    For above BPF helper functions, it means after fixed bpf_try_make_writable(),
    we've pulled in enough, so that we operate always based on skb->data. Thus,
    the call to skb_header_pointer() and skb_store_bits() becomes superfluous.
    In bpf_skb_store_bytes(), the len check is unnecessary too since it can
    only pass in maximum of BPF stack size, so adding offset is guaranteed to
    never overflow. Also bpf_l3/4_csum_replace() helpers must test for proper
    offset alignment since they use __sum16 pointer for writing resulting csum.
    
    The remaining helpers that change skb data not discussed here yet are
    bpf_skb_vlan_push(), bpf_skb_vlan_pop() and bpf_skb_change_proto(). The
    vlan helpers internally call skb_ensure_writable() (pop case) and
    skb_cow_head() (push case, for head expansion), respectively. Similarly,
    bpf_skb_proto_xlat() takes care to not mangle page frags.
    
    Fixes: 608cd71a9c7c ("tc: bpf: generalize pedit action")
    Fixes: 91bc4822c3d6 ("tc: bpf: add checksum helpers")
    Fixes: 3697649ff29e ("bpf: try harder on clones when writing into skb")
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Acked-by: Alexei Starovoitov <ast@kernel.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>

commit e8c2993a4c9fdb0c9e6fc983edd5b52716ce7442
Author: sean.wang@mediatek.com <sean.wang@mediatek.com>
Date:   Sat Aug 13 19:16:19 2016 +0800

    net: ethernet: mediatek: add the missing of_node_put() after node is used done
    
    This patch adds the missing of_node_put() after finishing the usage
    of of_parse_phandle() or of_node_get() used by fixed_phy.
    
    Signed-off-by: Sean Wang <sean.wang@mediatek.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

commit d7005652cd31dfc5660e1e32bf7e53538ef14987
Author: sean.wang@mediatek.com <sean.wang@mediatek.com>
Date:   Sat Aug 13 19:16:18 2016 +0800

    net: ethernet: mediatek: fixed that initializing u64_stats_sync is missing
    
    Fix a runtime warning seen when lockdep is enabled, caused by
    u64_stats_sync not being initialized, by adding the missing initialization.
    
    Signed-off-by: Sean Wang <sean.wang@mediatek.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

commit b4c0e0c61f81dedc82dda35c287ea149ff98b434
Author: Colin Ian King <colin.king@canonical.com>
Date:   Thu Aug 11 18:17:22 2016 +0100

    calipso: fix resource leak on calipso_genopt failure
    
    Currently, if calipso_genopt fails then the error exit path
    does not free the ipv6_opt_hdr new causing a memory leak. Fix
    this by kfree'ing new on the error exit path.
    
    Signed-off-by: Colin Ian King <colin.king@canonical.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
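
The leak fixed in this commit is an instance of a missing free on an error exit path. The sketch below shows the corrected shape with hypothetical helpers (`demo_alloc`, `demo_genopt`, `demo_opt_insert`) and an allocation counter added purely so the behaviour can be checked; it is not the calipso code.

```c
#include <stdlib.h>
#include <string.h>

static int demo_live_allocs;	/* number of buffers currently allocated */

static void *demo_alloc(size_t len)
{
	void *p = malloc(len);

	if (p)
		demo_live_allocs++;
	return p;
}

static void demo_free(void *p)
{
	if (p) {
		demo_live_allocs--;
		free(p);
	}
}

/* Stand-in for the option generator; should_fail injects an error. */
static int demo_genopt(unsigned char *buf, size_t len, int should_fail)
{
	if (should_fail)
		return -1;
	memset(buf, 0, len);
	return 0;
}

/* Returns a new buffer, or NULL on failure without leaking it. */
static unsigned char *demo_opt_insert(size_t len, int should_fail)
{
	unsigned char *opt = demo_alloc(len);

	if (!opt)
		return NULL;
	if (demo_genopt(opt, len, should_fail) < 0) {
		demo_free(opt);	/* the fix: release the buffer on the error path */
		return NULL;
	}
	return opt;
}
```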

commit 747ea55e4f78fd980350c39570a986b8c1c3e4aa
Author: Daniel Borkmann <daniel@iogearbox.net>
Date:   Fri Aug 12 22:17:17 2016 +0200

    bpf: fix bpf_skb_in_cgroup helper naming
    
    While hashing out BPF's current_task_under_cgroup helper bits, it came
    to discussion that the skb_in_cgroup helper name was suboptimally chosen.
    
    Tejun says:
    
      So, I think in_cgroup should mean that the object is in that
      particular cgroup while under_cgroup in the subhierarchy of that
      cgroup. Let's rename the other subhierarchy test to under too. I
      think that'd be a lot less confusing going forward.
    
      [...]
    
      It's more intuitive and gives us the room to implement the real
      "in" test if ever necessary in the future.
    
    Since this touches uapi bits, we need to change this as long as v4.8
    is not yet officially released. Thus, change the helper enum and rename
    related bits.
    
    Fixes: 4a482f34afcc ("cgroup: bpf: Add bpf_skb_in_cgroup_proto")
    Reference: http://patchwork.ozlabs.org/patch/658500/
    Suggested-by: Sargun Dhillon <sargun@sargun.me>
    Suggested-by: Tejun Heo <tj@kernel.org>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Acked-by: Alexei Starovoitov <ast@kernel.org>

commit 601bbae0bc10d4306857b93d84240b039b3d9a6c
Author: Arnd Bergmann <arnd@arndb.de>
Date:   Wed Aug 10 23:54:08 2016 +0200

    dsa: mv88e6xxx: hide unused functions
    
    When CONFIG_NET_DSA_HWMON is disabled, we get warnings about two unused
    functions whose only callers are all inside of an #ifdef:
    
    drivers/net/dsa/mv88e6xxx.c:3257:12: 'mv88e6xxx_mdio_page_write' defined but not used [-Werror=unused-function]
    drivers/net/dsa/mv88e6xxx.c:3244:12: 'mv88e6xxx_mdio_page_read' defined but not used [-Werror=unused-function]
    
    This adds another ifdef around the function definitions. The warnings
    appeared after the functions were marked 'static', but the problem
    was already there before that.
    
    Signed-off-by: Arnd Bergmann <arnd@arndb.de>
    Fixes: 57d3231057e9 ("net: dsa: mv88e6xxx: fix style issues")
    Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
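
A minimal sketch of the guard pattern this commit describes: a static helper whose only callers sit inside an #ifdef is placed under the same #ifdef. `DEMO_HWMON` is a hypothetical macro standing in for the Kconfig symbol and is left undefined here; the function names and values are illustrative only.

```c
/* DEMO_HWMON stands in for the config option; it is not defined in this sketch. */

#ifdef DEMO_HWMON
/* Only the hwmon code calls this helper, so it lives under the same guard. */
static int demo_page_read(int page, int reg)
{
	return page * 256 + reg;
}
#endif

int demo_temp_read(void)
{
#ifdef DEMO_HWMON
	return demo_page_read(6, 26);
#else
	return -1;	/* feature compiled out */
#endif
}
```

If the helper were defined unconditionally, a build without `DEMO_HWMON` would contain a static function with no callers, which is what triggers the unused-function warning.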

commit bae170efd6c42bf116f513a1dd07639d68fa71b9
Author: Arvind Yadav <arvind.yadav.cs@gmail.com>
Date:   Fri Aug 12 20:49:18 2016 +0530

    power: reset: hisi-reboot: Unmap region obtained by of_iomap
    
    Free memory mapping, if probe is not successful.
    
    Fixes: 4a9b37371822 ("power: reset: move hisilicon reboot code")
    Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
    Signed-off-by: Sebastian Reichel <sre@kernel.org>

commit 7a4947cf6f26b7d0a5db260d212f6df41a563d23
Author: Andy Yan <andy.yan@rock-chips.com>
Date:   Fri Aug 12 18:01:50 2016 +0800

    power: reset: reboot-mode: fix build error of missing ioremap/iounmap on UM
    
    commit 4fcd504edbf7 ("power: reset: add reboot mode driver") uses api from
    syscon, and syscon uses ioremap/iounmap which depends on HAS_IOMEM, so
    let's depend on MFD_SYSCON instead of selecting it directly to avoid the
    um-allyesconfig-like build error on archs without iomem:
    
    drivers/mfd/syscon.c: In function 'of_syscon_register':
    drivers/mfd/syscon.c:67:9: error: implicit declaration of function 'ioremap' [-Werror=implicit-function-declaration]
      base = ioremap(res.start, resource_size(&res));
             ^
    drivers/mfd/syscon.c:67:7: warning: assignment makes pointer from integer without a cast [-Wint-conversion]
      base = ioremap(res.start, resource_size(&res));
           ^
    drivers/mfd/syscon.c:109:2: error: implicit declaration of function 'iounmap' [-Werror=implicit-function-declaration]
      iounmap(base);
      ^
    
    Reported-by: Kbuild test robot <fengguang.wu@intel.com>
    Fixes: 4fcd504edbf7 ("power: reset: add reboot mode driver")
    Signed-off-by: Andy Yan <andy.yan@rock-chips.com>
    Reviewed-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
    Signed-off-by: Sebastian Reichel <sre@kernel.org>

commit 5381cfb6f0422da24cfa9da35b0433c0415830e0
Author: Sven Van Asbroeck <thesven73@gmail.com>
Date:   Fri Aug 12 09:10:27 2016 -0400

    power: supply: max17042_battery: fix model download bug.
    
    The device's model download function returns the model data as
    an array of u32s, which is later compared to the reference
    model data. However, since the latter is an array of u16s,
    the comparison does not happen correctly, and model verification
    fails. This in turn breaks the POR initialization sequence.
    
    Fixes: 39e7213edc4f3 ("max17042_battery: Support regmap to access device's registers")
    Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
    Signed-off-by: Sven Van Asbroeck <TheSven73@googlemail.com>
    Reviewed-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
    Signed-off-by: Sebastian Reichel <sre@kernel.org>
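
One way an element-width mismatch like the one in this commit breaks a comparison is sketched below. The commit does not say how the driver compares the arrays; the byte-wise `memcmp()` and all names here are assumptions chosen for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Mismatched widths: read-back data is u32, reference data is u16. */
static int demo_verify_mismatched(const uint16_t *ref, const uint32_t *readback,
				  size_t n)
{
	/* Compares n u16-sized slots, but readback uses 4-byte elements. */
	return memcmp(ref, readback, n * sizeof(*ref)) == 0;
}

/* Matching widths: both arrays hold u16 elements. */
static int demo_verify_matched(const uint16_t *ref, const uint16_t *readback,
			       size_t n)
{
	return memcmp(ref, readback, n * sizeof(*ref)) == 0;
}
```

With reference data {0x1234, 0x5678} and read-back values that are numerically identical, the mismatched version reports a difference because the padding bytes of each 32-bit element are compared against real reference data; the matched version reports equality.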

commit bbe11fab0b6c1d113776b2898e085bf4d1fdc607
Author: Sabrina Dubroca <sd@queasysnail.net>
Date:   Thu Aug 11 15:24:27 2016 +0200

    macsec: use after free when deleting the underlying device
    
    macsec_notify() loops over the list of macsec devices configured on the
    underlying device when this device is being removed.  This list is part
    of the rx_handler data.
    
    However, macsec_dellink unregisters the rx_handler and frees the
    rx_handler data when the last macsec device is removed from the
    underlying device.
    
    Add macsec_common_dellink() to delete macsec devices without
    unregistering the rx_handler and freeing the associated data.
    
    Fixes: 960d5848dbf1 ("macsec: fix memory leaks around rx_handler (un)registration")
    Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
    Signed-off-by: David S. Miller <davem@davemloft.net>

commit 104a493390940e85fb7c840a9fd5214aba5cb3bd
Author: Jason Wang <jasowang@redhat.com>
Date:   Thu Aug 11 18:15:56 2016 +0800

    macvtap: fix use after free for skb_array during release
    
    We've cleaned up skb_array in macvtap_put_queue() but still try to pop from
    it during macvtap_sock_destruct(). Fix this use after free by moving
    the skb array cleanup to macvtap_sock_destruct() instead.
    
    Fixes: 362899b8725b ("macvtap: switch to use skb array")
    Reported-by: Cornelia Huck <cornelia.huck@de.ibm.com>
    Tested-by: Cornelia Huck <cornelia.huck@de.ibm.com>
    Signed-off-by: Jason Wang <jasowang@redhat.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

commit af7752106e4f12b4ee47b4eca3e7ba4bcec6e7e5
Author: Stefan Haberland <sth@linux.vnet.ibm.com>
Date:   Tue Aug 9 15:58:48 2016 +0200

    s390/dasd: fix failing CUIR assignment under LPAR
    
    On LPAR the read message buffer command should be executed on the path
    it was received on otherwise there is a chance that the CUIR assignment
    might be faulty and the wrong channel path is set online/offline.
    
    Fix by setting the path mask accordingly.
    On z/VM we might not be able to do I/O on this path but there it does
    not matter on which path the read message buffer command is executed.
    Therefore implement a retry with an open path mask.
    
    Signed-off-by: Stefan Haberland <sth@linux.vnet.ibm.com>

commit 4b5b9ba553f9aa5f484ab972fc9b58061885ceca
Author: Martynas Pumputis <martynas@weave.works>
Date:   Tue Aug 9 16:24:50 2016 +0100

    openvswitch: do not ignore netdev errors when creating tunnel vports
    
    The creation of a tunnel vport (geneve, gre, vxlan) brings up a
    corresponding netdev, a multi-step operation which can fail.
    
    For example, changing a vxlan vport's netdev state to 'up' binds the
    vport's socket to a UDP port - if the binding fails (e.g. due to the
    port being in use), the error is currently ignored giving the
    appearance that the tunnel vport creation completed successfully.
    
    Signed-off-by: Martynas Pumputis <martynas@weave.works>
    Acked-by: Pravin B Shelar <pshelar@ovn.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>

commit dafa6b0db2d62164c5ef81a40312d5ba514126b9
Author: Fabian Frederick <fabf@skynet.be>
Date:   Wed Aug 10 17:48:36 2016 +0200

    net: hns: fix typo in g_gmac_stats_string[]
    
    s/gamc/gmac/
    
    Signed-off-by: Fabian Frederick <fabf@skynet.be>
    Signed-off-by: David S. Miller <davem@davemloft.net>

commit 672ca65d9aa8578f382784fe73578cd499664828
Author: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com>
Date:   Wed Aug 10 14:07:34 2016 +0200

    tipc: fix variable dereference before NULL check
    
    In commit cf6f7e1d5109 ("tipc: dump monitor attributes"),
    I dereferenced a pointer before checking if it's valid.
    This is reported by static check Smatch as:
    net/tipc/monitor.c:733 tipc_nl_add_monitor_peer()
         warn: variable dereferenced before check 'mon' (see line 731)
    
    In this commit, we check for a valid monitor before proceeding
    with any other operation.
    
    Fixes: cf6f7e1d5109 ("tipc: dump monitor attributes")
    Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
    Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
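
The ordering this commit restores (validate, then dereference) looks roughly like the sketch below. `demo_monitor`, `demo_add_monitor_peer`, and the `-EINVAL` return value are assumptions for illustration, not the tipc code.

```c
#include <errno.h>
#include <stddef.h>

struct demo_monitor {
	int peer_cnt;
};

static int demo_add_monitor_peer(const struct demo_monitor *mon, int *peers_out)
{
	if (!mon)			/* validate first ... */
		return -EINVAL;

	*peers_out = mon->peer_cnt;	/* ... dereference only afterwards */
	return 0;
}
```

Reading `mon->peer_cnt` in a declaration above the NULL check is the shape a static checker flags as "dereferenced before check".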

commit e95d0dfb229fffe96dc4c29054f6c7a7302e111e
Author: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Date:   Tue Aug 2 18:18:33 2016 +0300

    pinctrl: intel: merrifield: Add missed header
    
    On x86 builds the absence of <linux/io.h> makes the static analyzer and
    compiler unhappy, and the driver fails to build.
    
    CHECK   drivers/pinctrl/intel/pinctrl-merrifield.c
    drivers/pinctrl/intel/pinctrl-merrifield.c:518:17:
      error: undefined identifier 'readl'
    drivers/pinctrl/intel/pinctrl-merrifield.c:570:17:
      error: undefined identifier 'readl'
    drivers/pinctrl/intel/pinctrl-merrifield.c:575:9:
      error: undefined identifier 'writel'
    drivers/pinctrl/intel/pinctrl-merrifield.c:645:17:
      error: undefined identifier 'readl'
      CC      drivers/pinctrl/intel/pinctrl-merrifield.o
    drivers/pinctrl/intel/pinctrl-merrifield.c: In function ‘mrfld_pin_dbg_show’:
    drivers/pinctrl/intel/pinctrl-merrifield.c:518:10:
      error: implicit declaration of function ‘readl’
      [-Werror=implicit-function-declaration]
      value = readl(bufcfg);
                ^
    drivers/pinctrl/intel/pinctrl-merrifield.c: In function ‘mrfld_update_bufcfg’:
    drivers/pinctrl/intel/pinctrl-merrifield.c:575:2:
      error: implicit declaration of function ‘writel’
      [-Werror=implicit-function-declaration]
      writel(value, bufcfg);
        ^
    cc1: some warnings being treated as errors
    
    Add header to the top of the module.
    
    Fixes: 4e80c8f50574 ("pinctrl: intel: Add Intel Merrifield pin controller support")
    Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
    Signed-off-by: Linus Walleij <linus.walleij@linaro.org>

commit 8cf4345575a416e6856a6856ac6eaa31ad883126
Author: Agrawal, Nitesh-kumar <Nitesh-kumar.Agrawal@amd.com>
Date:   Tue Jul 26 08:28:19 2016 +0000

    pinctrl/amd: Remove the default de-bounce time
    
    In the function amd_gpio_irq_enable() and
    amd_gpio_direction_input(), remove the code which is setting
    the default de-bounce time to 2.75ms.
    
    The driver code shall use the same settings as specified in
    BIOS. Any default assignment impacts TouchPad behaviour when
    the LevelTrig is set to EDGE FALLING.
    
    Cc: stable@vger.kernel.org
    Reviewed-by:  Ken Xue <Ken.Xue@amd.com>
    Signed-off-by: Nitesh Kumar Agrawal <Nitesh-kumar.Agrawal@amd.com>
    Signed-off-by: Linus Walleij <linus.walleij@linaro.org>

commit b120a3c286520ca465c54e8afa442be10560053b
Author: Wei Yongjun <weiyj.lk@gmail.com>
Date:   Tue Jul 26 14:52:57 2016 +0000

    pinctrl: pistachio: Drop pinctrl_unregister for devm_ registered device
    
    It's not necessary to unregister pin controller device registered
    with devm_pinctrl_register() and using pinctrl_unregister() leads
    to a double free.
    
    This is detected by Coccinelle semantic patch.
    
    Signed-off-by: Wei Yongjun <weiyj.lk@gmail.com>
    Signed-off-by: Linus Walleij <linus.walleij@linaro.org>

commit 5b236d0fde21d88351420ef0b9a6cb7aeeea0c54
Author: Wei Yongjun <weiyj.lk@gmail.com>
Date:   Tue Jul 26 14:51:58 2016 +0000

    pinctrl: meson: Drop pinctrl_unregister for devm_ registered device
    
    It's not necessary to unregister pin controller device registered
    with devm_pinctrl_register() and using pinctrl_unregister() leads
    to a double free.
    
    This is detected by Coccinelle semantic patch.
    
    Fixes: e649f7ec8c5f ("pinctrl: meson: Use devm_pinctrl_register() for pinctrl registration")
    Signed-off-by: Wei Yongjun <weiyj.lk@gmail.com>
    Reviewed-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
    Acked-by: Kevin Hilman <khilman@baylibre.com>
    Signed-off-by: Linus Walleij <linus.walleij@linaro.org>

commit 4da449ae1df9cfeb167e78f250b250eff64bc65e
Author: Laura Garcia Liebana <nevola@gmail.com>
Date:   Tue Aug 9 20:46:16 2016 +0200

    netfilter: nft_exthdr: Add size check on u8 nft_exthdr attributes
    
    Fix the direct assignment of offset and length attributes included in
    nft_exthdr structure from u32 data to u8.
    
    Signed-off-by: Laura Garcia Liebana <nevola@gmail.com>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
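
A range check before narrowing a 32-bit attribute into an 8-bit field can be sketched as follows. The helper name and the `-ERANGE` error code are assumptions; the commit message does not state which error the kernel returns.

```c
#include <errno.h>
#include <stdint.h>

/* Store a u32 attribute into a u8 field only if it fits. */
static int demo_parse_u8(uint32_t value, uint8_t *out)
{
	if (value > UINT8_MAX)
		return -ERANGE;	/* would be silently truncated otherwise */

	*out = (uint8_t)value;
	return 0;
}
```

A direct assignment would turn 256 into 0 without any error, which is the kind of silent truncation the size check prevents.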

commit c987ff0d3cb37d7fe1ddaa370811dfd9f73643fa
Author: Robin Murphy <robin.murphy@arm.com>
Date:   Tue Aug 9 17:31:35 2016 +0100

    iommu/dma: Respect IOMMU aperture when allocating
    
    Where a device driver has set a 64-bit DMA mask to indicate the absence
    of addressing limitations, we still need to ensure that we don't
    allocate IOVAs beyond the actual input size of the IOMMU. The reported
    aperture is the most reliable way we have of inferring that input
    address size, so use that to enforce a hard upper limit where available.
    
    Fixes: 0db2e5d18f76 ("iommu: Implem…
0day-ci pushed a commit to 0day-ci/linux that referenced this pull request Sep 11, 2016
…tch-fixes

ERROR: Please use git commit description style 'commit <12+ chars of sha1> ("<title line>")' - ie: 'commit 0123456789ab ("commit description")'
torvalds#12: 
	commit 7c9cb38

WARNING: line over 80 characters
torvalds#87: FILE: kernel/relay.c:337:
+	struct rchan_buf *buf = container_of(work, struct rchan_buf, wakeup_work);

WARNING: waitqueue_active without comment
torvalds#119: FILE: kernel/relay.c:772:
+		if (waitqueue_active(&buf->read_wait)) {

total: 1 errors, 2 warnings, 70 lines checked

NOTE: For some of the reported defects, checkpatch may be able to
      mechanically convert to the typical style using --fix or --fix-inplace.

./patches/relay-use-irq_work-instead-of-plain-timer-for-deferred-wakeup.patch has style problems, please review.

NOTE: If any of the errors are false positives, please report
      them to the maintainer, see CHECKPATCH in MAINTAINERS.

Please run checkpatch prior to sending patches

Cc: Akash Goel <akash.goel@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
0day-ci pushed a commit to 0day-ci/linux that referenced this pull request Sep 12, 2016
0day-ci pushed a commit to 0day-ci/linux that referenced this pull request Sep 15, 2016
…tch-fixes

ERROR: Please use git commit description style 'commit <12+ chars of sha1> ("<title line>")' - ie: 'commit 0123456789ab ("commit description")'
#12:
	commit 7c9cb38

WARNING: line over 80 characters
#87: FILE: kernel/relay.c:337:
+	struct rchan_buf *buf = container_of(work, struct rchan_buf, wakeup_work);

WARNING: waitqueue_active without comment
#119: FILE: kernel/relay.c:772:
+		if (waitqueue_active(&buf->read_wait)) {

total: 1 errors, 2 warnings, 70 lines checked

NOTE: For some of the reported defects, checkpatch may be able to
      mechanically convert to the typical style using --fix or --fix-inplace.

./patches/relay-use-irq_work-instead-of-plain-timer-for-deferred-wakeup.patch has style problems, please review.

NOTE: If any of the errors are false positives, please report
      them to the maintainer, see CHECKPATCH in MAINTAINERS.

Please run checkpatch prior to sending patches

Cc: Akash Goel <akash.goel@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
0day-ci pushed a commit to 0day-ci/linux that referenced this pull request Sep 16, 2016
…tch-fixes

ERROR: Please use git commit description style 'commit <12+ chars of sha1> ("<title line>")' - ie: 'commit 0123456789ab ("commit description")'
#12:
	commit 7c9cb38

WARNING: line over 80 characters
#87: FILE: kernel/relay.c:337:
+	struct rchan_buf *buf = container_of(work, struct rchan_buf, wakeup_work);

WARNING: waitqueue_active without comment
#119: FILE: kernel/relay.c:772:
+		if (waitqueue_active(&buf->read_wait)) {

total: 1 errors, 2 warnings, 70 lines checked

NOTE: For some of the reported defects, checkpatch may be able to
      mechanically convert to the typical style using --fix or --fix-inplace.

./patches/relay-use-irq_work-instead-of-plain-timer-for-deferred-wakeup.patch has style problems, please review.

NOTE: If any of the errors are false positives, please report
      them to the maintainer, see CHECKPATCH in MAINTAINERS.

Please run checkpatch prior to sending patches

Cc: Akash Goel <akash.goel@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
0day-ci pushed a commit to 0day-ci/linux that referenced this pull request Sep 20, 2016
…tch-fixes

ERROR: Please use git commit description style 'commit <12+ chars of sha1> ("<title line>")' - ie: 'commit 0123456789ab ("commit description")'
#12:
	commit 7c9cb38

WARNING: line over 80 characters
#87: FILE: kernel/relay.c:337:
+	struct rchan_buf *buf = container_of(work, struct rchan_buf, wakeup_work);

WARNING: waitqueue_active without comment
#119: FILE: kernel/relay.c:772:
+		if (waitqueue_active(&buf->read_wait)) {

total: 1 errors, 2 warnings, 70 lines checked

NOTE: For some of the reported defects, checkpatch may be able to
      mechanically convert to the typical style using --fix or --fix-inplace.

./patches/relay-use-irq_work-instead-of-plain-timer-for-deferred-wakeup.patch has style problems, please review.

NOTE: If any of the errors are false positives, please report
      them to the maintainer, see CHECKPATCH in MAINTAINERS.

Please run checkpatch prior to sending patches

Cc: Akash Goel <akash.goel@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
0day-ci pushed a commit to 0day-ci/linux that referenced this pull request Sep 20, 2016
…tch-fixes

ERROR: Please use git commit description style 'commit <12+ chars of sha1> ("<title line>")' - ie: 'commit 0123456789ab ("commit description")'
#12:
	commit 7c9cb38

WARNING: line over 80 characters
#87: FILE: kernel/relay.c:337:
+	struct rchan_buf *buf = container_of(work, struct rchan_buf, wakeup_work);

WARNING: waitqueue_active without comment
#119: FILE: kernel/relay.c:772:
+		if (waitqueue_active(&buf->read_wait)) {

total: 1 errors, 2 warnings, 70 lines checked

NOTE: For some of the reported defects, checkpatch may be able to
      mechanically convert to the typical style using --fix or --fix-inplace.

./patches/relay-use-irq_work-instead-of-plain-timer-for-deferred-wakeup.patch has style problems, please review.

NOTE: If any of the errors are false positives, please report
      them to the maintainer, see CHECKPATCH in MAINTAINERS.

Please run checkpatch prior to sending patches

Cc: Akash Goel <akash.goel@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
0day-ci pushed a commit to 0day-ci/linux that referenced this pull request Sep 21, 2016
…tch-fixes

ERROR: Please use git commit description style 'commit <12+ chars of sha1> ("<title line>")' - ie: 'commit 0123456789ab ("commit description")'
#12:
	commit 7c9cb38

WARNING: line over 80 characters
#87: FILE: kernel/relay.c:337:
+	struct rchan_buf *buf = container_of(work, struct rchan_buf, wakeup_work);

WARNING: waitqueue_active without comment
#119: FILE: kernel/relay.c:772:
+		if (waitqueue_active(&buf->read_wait)) {

total: 1 errors, 2 warnings, 70 lines checked

NOTE: For some of the reported defects, checkpatch may be able to
      mechanically convert to the typical style using --fix or --fix-inplace.

./patches/relay-use-irq_work-instead-of-plain-timer-for-deferred-wakeup.patch has style problems, please review.

NOTE: If any of the errors are false positives, please report
      them to the maintainer, see CHECKPATCH in MAINTAINERS.

Please run checkpatch prior to sending patches

Cc: Akash Goel <akash.goel@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
logic10492 pushed a commit to logic10492/linux-amd-zen2 that referenced this pull request Jan 18, 2024
gyroninja added a commit to gyroninja/linux that referenced this pull request Jan 28, 2024
KMSAN calls into RCU code, which then triggers a write that reenters KMSAN,
leaving the system stuck in infinite recursion.

#0  kmsan_get_context () at mm/kmsan/kmsan.h:106
#1  __msan_get_context_state () at mm/kmsan/instrumentation.c:331
#2  0xffffffff81495671 in get_current () at ./arch/x86/include/asm/current.h:42
#3  rcu_preempt_read_enter () at kernel/rcu/tree_plugin.h:379
#4  __rcu_read_lock () at kernel/rcu/tree_plugin.h:402
#5  0xffffffff81b2054b in rcu_read_lock () at ./include/linux/rcupdate.h:748
#6  pfn_valid (pfn=<optimized out>) at ./include/linux/mmzone.h:2016
#7  kmsan_virt_addr_valid (addr=addr@entry=0xffffffff8620d974 <init_task+1012>) at ./arch/x86/include/asm/kmsan.h:82
#8  virt_to_page_or_null (vaddr=vaddr@entry=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/shadow.c:75
#9  0xffffffff81b2023c in kmsan_get_metadata (address=0xffffffff8620d974 <init_task+1012>, is_origin=false) at mm/kmsan/shadow.c:143
#10 kmsan_get_shadow_origin_ptr (address=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/shadow.c:97
#11 0xffffffff81b1dbd2 in get_shadow_origin_ptr (addr=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/instrumentation.c:36
#12 __msan_metadata_ptr_for_load_4 (addr=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/instrumentation.c:91
#13 0xffffffff8149568f in rcu_preempt_read_enter () at kernel/rcu/tree_plugin.h:379
#14 __rcu_read_lock () at kernel/rcu/tree_plugin.h:402
#15 0xffffffff81b2054b in rcu_read_lock () at ./include/linux/rcupdate.h:748
#16 pfn_valid (pfn=<optimized out>) at ./include/linux/mmzone.h:2016
#17 kmsan_virt_addr_valid (addr=addr@entry=0xffffffff8620d974 <init_task+1012>) at ./arch/x86/include/asm/kmsan.h:82
#18 virt_to_page_or_null (vaddr=vaddr@entry=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/shadow.c:75
#19 0xffffffff81b2023c in kmsan_get_metadata (address=0xffffffff8620d974 <init_task+1012>, is_origin=false) at mm/kmsan/shadow.c:143
#20 kmsan_get_shadow_origin_ptr (address=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/shadow.c:97
#21 0xffffffff81b1dbd2 in get_shadow_origin_ptr (addr=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/instrumentation.c:36
#22 __msan_metadata_ptr_for_load_4 (addr=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/instrumentation.c:91
#23 0xffffffff8149568f in rcu_preempt_read_enter () at kernel/rcu/tree_plugin.h:379
#24 __rcu_read_lock () at kernel/rcu/tree_plugin.h:402
#25 0xffffffff81b2054b in rcu_read_lock () at ./include/linux/rcupdate.h:748
#26 pfn_valid (pfn=<optimized out>) at ./include/linux/mmzone.h:2016
#27 kmsan_virt_addr_valid (addr=addr@entry=0xffffffff8620d974 <init_task+1012>) at ./arch/x86/include/asm/kmsan.h:82
#28 virt_to_page_or_null (vaddr=vaddr@entry=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/shadow.c:75
#29 0xffffffff81b2023c in kmsan_get_metadata (address=0xffffffff8620d974 <init_task+1012>, is_origin=false) at mm/kmsan/shadow.c:143
#30 kmsan_get_shadow_origin_ptr (address=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/shadow.c:97
#31 0xffffffff81b1dbd2 in get_shadow_origin_ptr (addr=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/instrumentation.c:36
#32 __msan_metadata_ptr_for_load_4 (addr=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/instrumentation.c:91
#33 0xffffffff8149568f in rcu_preempt_read_enter () at kernel/rcu/tree_plugin.h:379
#34 __rcu_read_lock () at kernel/rcu/tree_plugin.h:402
#35 0xffffffff81b2054b in rcu_read_lock () at ./include/linux/rcupdate.h:748
#36 pfn_valid (pfn=<optimized out>) at ./include/linux/mmzone.h:2016
#37 kmsan_virt_addr_valid (addr=addr@entry=0xffffffff8620d974 <init_task+1012>) at ./arch/x86/include/asm/kmsan.h:82
#38 virt_to_page_or_null (vaddr=vaddr@entry=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/shadow.c:75
#39 0xffffffff81b2023c in kmsan_get_metadata (address=0xffffffff8620d974 <init_task+1012>, is_origin=false) at mm/kmsan/shadow.c:143
#40 kmsan_get_shadow_origin_ptr (address=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/shadow.c:97
#41 0xffffffff81b1dbd2 in get_shadow_origin_ptr (addr=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/instrumentation.c:36
#42 __msan_metadata_ptr_for_load_4 (addr=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/instrumentation.c:91
#43 0xffffffff8149568f in rcu_preempt_read_enter () at kernel/rcu/tree_plugin.h:379
#44 __rcu_read_lock () at kernel/rcu/tree_plugin.h:402
#45 0xffffffff81b2054b in rcu_read_lock () at ./include/linux/rcupdate.h:748
#46 pfn_valid (pfn=<optimized out>) at ./include/linux/mmzone.h:2016
#47 kmsan_virt_addr_valid (addr=addr@entry=0xffffffff8620d974 <init_task+1012>) at ./arch/x86/include/asm/kmsan.h:82
#48 virt_to_page_or_null (vaddr=vaddr@entry=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/shadow.c:75
#49 0xffffffff81b2023c in kmsan_get_metadata (address=0xffffffff8620d974 <init_task+1012>, is_origin=false) at mm/kmsan/shadow.c:143
#50 kmsan_get_shadow_origin_ptr (address=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/shadow.c:97
#51 0xffffffff81b1dbd2 in get_shadow_origin_ptr (addr=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/instrumentation.c:36
#52 __msan_metadata_ptr_for_load_4 (addr=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/instrumentation.c:91
#53 0xffffffff8149568f in rcu_preempt_read_enter () at kernel/rcu/tree_plugin.h:379
#54 __rcu_read_lock () at kernel/rcu/tree_plugin.h:402
#55 0xffffffff81b2054b in rcu_read_lock () at ./include/linux/rcupdate.h:748
#56 pfn_valid (pfn=<optimized out>) at ./include/linux/mmzone.h:2016
#57 kmsan_virt_addr_valid (addr=addr@entry=0xffffffff8620d974 <init_task+1012>) at ./arch/x86/include/asm/kmsan.h:82
#58 virt_to_page_or_null (vaddr=vaddr@entry=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/shadow.c:75
#59 0xffffffff81b2023c in kmsan_get_metadata (address=0xffffffff8620d974 <init_task+1012>, is_origin=false) at mm/kmsan/shadow.c:143
#60 kmsan_get_shadow_origin_ptr (address=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/shadow.c:97
#61 0xffffffff81b1dbd2 in get_shadow_origin_ptr (addr=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/instrumentation.c:36
#62 __msan_metadata_ptr_for_load_4 (addr=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/instrumentation.c:91
#63 0xffffffff8149568f in rcu_preempt_read_enter () at kernel/rcu/tree_plugin.h:379
#64 __rcu_read_lock () at kernel/rcu/tree_plugin.h:402
#65 0xffffffff81b2054b in rcu_read_lock () at ./include/linux/rcupdate.h:748
#66 pfn_valid (pfn=<optimized out>) at ./include/linux/mmzone.h:2016
#67 kmsan_virt_addr_valid (addr=addr@entry=0xffffffff8620d974 <init_task+1012>) at ./arch/x86/include/asm/kmsan.h:82
#68 virt_to_page_or_null (vaddr=vaddr@entry=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/shadow.c:75
#69 0xffffffff81b2023c in kmsan_get_metadata (address=0xffffffff8620d974 <init_task+1012>, is_origin=false) at mm/kmsan/shadow.c:143
#70 kmsan_get_shadow_origin_ptr (address=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/shadow.c:97
#71 0xffffffff81b1dbd2 in get_shadow_origin_ptr (addr=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/instrumentation.c:36
#72 __msan_metadata_ptr_for_load_4 (addr=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/instrumentation.c:91
#73 0xffffffff8149568f in rcu_preempt_read_enter () at kernel/rcu/tree_plugin.h:379
#74 __rcu_read_lock () at kernel/rcu/tree_plugin.h:402
#75 0xffffffff81b2054b in rcu_read_lock () at ./include/linux/rcupdate.h:748
#76 pfn_valid (pfn=<optimized out>) at ./include/linux/mmzone.h:2016
#77 kmsan_virt_addr_valid (addr=addr@entry=0xffffffff86203c90) at ./arch/x86/include/asm/kmsan.h:82
#78 virt_to_page_or_null (vaddr=vaddr@entry=0xffffffff86203c90) at mm/kmsan/shadow.c:75
#79 0xffffffff81b2023c in kmsan_get_metadata (address=0xffffffff86203c90, is_origin=false) at mm/kmsan/shadow.c:143
#80 kmsan_get_shadow_origin_ptr (address=0xffffffff86203c90, size=8, store=false) at mm/kmsan/shadow.c:97
#81 0xffffffff81b1dc72 in get_shadow_origin_ptr (addr=0xffffffff8620d974 <init_task+1012>, size=8, store=false) at mm/kmsan/instrumentation.c:36
#82 __msan_metadata_ptr_for_load_8 (addr=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/instrumentation.c:92
#83 0xffffffff814fdb9e in filter_irq_stacks (entries=<optimized out>, nr_entries=4) at kernel/stacktrace.c:397
#84 0xffffffff829520e8 in stack_depot_save_flags (entries=0xffffffff8620d974 <init_task+1012>, nr_entries=4, alloc_flags=0, depot_flags=0) at lib/stackdepot.c:500
#85 0xffffffff81b1e560 in __msan_poison_alloca (address=0xffffffff86203da0, size=24, descr=<optimized out>) at mm/kmsan/instrumentation.c:285
#86 0xffffffff8562821c in _printk (fmt=0xffffffff85f191a5 "\0016Attempting lock1") at kernel/printk/printk.c:2324
#87 0xffffffff81942aa2 in kmem_cache_create_usercopy (name=0xffffffff85f18903 "mm_struct", size=1296, align=0, flags=270336, useroffset=<optimized out>, usersize=<optimized out>, ctor=0x0 <fixed_percpu_data>) at mm/slab_common.c:296
#88 0xffffffff86f337a0 in mm_cache_init () at kernel/fork.c:3262
#89 0xffffffff86eacb8e in start_kernel () at init/main.c:932
#90 0xffffffff86ecdf94 in x86_64_start_reservations (real_mode_data=0x140e0 <exception_stacks+28896> <error: Cannot access memory at address 0x140e0>) at arch/x86/kernel/head64.c:555
#91 0xffffffff86ecde9b in x86_64_start_kernel (real_mode_data=0x140e0 <exception_stacks+28896> <error: Cannot access memory at address 0x140e0>) at arch/x86/kernel/head64.c:536
#92 0xffffffff810001d3 in secondary_startup_64 () at /pool/workspace/linux/arch/x86/kernel/head_64.S:461
#93 0x0000000000000000 in ??
gyroninja added a commit to gyroninja/linux that referenced this pull request Jan 28, 2024
As of 5ec8e8e ("mm/sparsemem: fix race in accessing memory_section->usage"),
KMSAN calls into RCU tree code during kmsan_get_metadata(). This triggers a
write that reenters KMSAN, leaving the system stuck in infinite recursion.

#0  kmsan_get_context () at mm/kmsan/kmsan.h:106
#1  __msan_get_context_state () at mm/kmsan/instrumentation.c:331
#2  0xffffffff81495671 in get_current () at ./arch/x86/include/asm/current.h:42
#3  rcu_preempt_read_enter () at kernel/rcu/tree_plugin.h:379
#4  __rcu_read_lock () at kernel/rcu/tree_plugin.h:402
#5  0xffffffff81b2054b in rcu_read_lock () at ./include/linux/rcupdate.h:748
#6  pfn_valid (pfn=<optimized out>) at ./include/linux/mmzone.h:2016
#7  kmsan_virt_addr_valid (addr=addr@entry=0xffffffff8620d974 <init_task+1012>) at ./arch/x86/include/asm/kmsan.h:82
#8  virt_to_page_or_null (vaddr=vaddr@entry=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/shadow.c:75
#9  0xffffffff81b2023c in kmsan_get_metadata (address=0xffffffff8620d974 <init_task+1012>, is_origin=false) at mm/kmsan/shadow.c:143
#10 kmsan_get_shadow_origin_ptr (address=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/shadow.c:97
#11 0xffffffff81b1dbd2 in get_shadow_origin_ptr (addr=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/instrumentation.c:36
#12 __msan_metadata_ptr_for_load_4 (addr=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/instrumentation.c:91
#13 0xffffffff8149568f in rcu_preempt_read_enter () at kernel/rcu/tree_plugin.h:379
#14 __rcu_read_lock () at kernel/rcu/tree_plugin.h:402
#15 0xffffffff81b2054b in rcu_read_lock () at ./include/linux/rcupdate.h:748
#16 pfn_valid (pfn=<optimized out>) at ./include/linux/mmzone.h:2016
#17 kmsan_virt_addr_valid (addr=addr@entry=0xffffffff8620d974 <init_task+1012>) at ./arch/x86/include/asm/kmsan.h:82
#18 virt_to_page_or_null (vaddr=vaddr@entry=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/shadow.c:75
#19 0xffffffff81b2023c in kmsan_get_metadata (address=0xffffffff8620d974 <init_task+1012>, is_origin=false) at mm/kmsan/shadow.c:143
#20 kmsan_get_shadow_origin_ptr (address=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/shadow.c:97
#21 0xffffffff81b1dbd2 in get_shadow_origin_ptr (addr=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/instrumentation.c:36
#22 __msan_metadata_ptr_for_load_4 (addr=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/instrumentation.c:91
#23 0xffffffff8149568f in rcu_preempt_read_enter () at kernel/rcu/tree_plugin.h:379
#24 __rcu_read_lock () at kernel/rcu/tree_plugin.h:402
#25 0xffffffff81b2054b in rcu_read_lock () at ./include/linux/rcupdate.h:748
#26 pfn_valid (pfn=<optimized out>) at ./include/linux/mmzone.h:2016
#27 kmsan_virt_addr_valid (addr=addr@entry=0xffffffff8620d974 <init_task+1012>) at ./arch/x86/include/asm/kmsan.h:82
#28 virt_to_page_or_null (vaddr=vaddr@entry=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/shadow.c:75
#29 0xffffffff81b2023c in kmsan_get_metadata (address=0xffffffff8620d974 <init_task+1012>, is_origin=false) at mm/kmsan/shadow.c:143
#30 kmsan_get_shadow_origin_ptr (address=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/shadow.c:97
#31 0xffffffff81b1dbd2 in get_shadow_origin_ptr (addr=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/instrumentation.c:36
#32 __msan_metadata_ptr_for_load_4 (addr=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/instrumentation.c:91
#33 0xffffffff8149568f in rcu_preempt_read_enter () at kernel/rcu/tree_plugin.h:379
#34 __rcu_read_lock () at kernel/rcu/tree_plugin.h:402
#35 0xffffffff81b2054b in rcu_read_lock () at ./include/linux/rcupdate.h:748
#36 pfn_valid (pfn=<optimized out>) at ./include/linux/mmzone.h:2016
#37 kmsan_virt_addr_valid (addr=addr@entry=0xffffffff8620d974 <init_task+1012>) at ./arch/x86/include/asm/kmsan.h:82
#38 virt_to_page_or_null (vaddr=vaddr@entry=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/shadow.c:75
#39 0xffffffff81b2023c in kmsan_get_metadata (address=0xffffffff8620d974 <init_task+1012>, is_origin=false) at mm/kmsan/shadow.c:143
#40 kmsan_get_shadow_origin_ptr (address=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/shadow.c:97
#41 0xffffffff81b1dbd2 in get_shadow_origin_ptr (addr=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/instrumentation.c:36
#42 __msan_metadata_ptr_for_load_4 (addr=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/instrumentation.c:91
#43 0xffffffff8149568f in rcu_preempt_read_enter () at kernel/rcu/tree_plugin.h:379
#44 __rcu_read_lock () at kernel/rcu/tree_plugin.h:402
#45 0xffffffff81b2054b in rcu_read_lock () at ./include/linux/rcupdate.h:748
#46 pfn_valid (pfn=<optimized out>) at ./include/linux/mmzone.h:2016
#47 kmsan_virt_addr_valid (addr=addr@entry=0xffffffff8620d974 <init_task+1012>) at ./arch/x86/include/asm/kmsan.h:82
#48 virt_to_page_or_null (vaddr=vaddr@entry=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/shadow.c:75
#49 0xffffffff81b2023c in kmsan_get_metadata (address=0xffffffff8620d974 <init_task+1012>, is_origin=false) at mm/kmsan/shadow.c:143
#50 kmsan_get_shadow_origin_ptr (address=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/shadow.c:97
#51 0xffffffff81b1dbd2 in get_shadow_origin_ptr (addr=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/instrumentation.c:36
#52 __msan_metadata_ptr_for_load_4 (addr=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/instrumentation.c:91
#53 0xffffffff8149568f in rcu_preempt_read_enter () at kernel/rcu/tree_plugin.h:379
#54 __rcu_read_lock () at kernel/rcu/tree_plugin.h:402
#55 0xffffffff81b2054b in rcu_read_lock () at ./include/linux/rcupdate.h:748
#56 pfn_valid (pfn=<optimized out>) at ./include/linux/mmzone.h:2016
#57 kmsan_virt_addr_valid (addr=addr@entry=0xffffffff8620d974 <init_task+1012>) at ./arch/x86/include/asm/kmsan.h:82
#58 virt_to_page_or_null (vaddr=vaddr@entry=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/shadow.c:75
#59 0xffffffff81b2023c in kmsan_get_metadata (address=0xffffffff8620d974 <init_task+1012>, is_origin=false) at mm/kmsan/shadow.c:143
#60 kmsan_get_shadow_origin_ptr (address=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/shadow.c:97
#61 0xffffffff81b1dbd2 in get_shadow_origin_ptr (addr=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/instrumentation.c:36
#62 __msan_metadata_ptr_for_load_4 (addr=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/instrumentation.c:91
#63 0xffffffff8149568f in rcu_preempt_read_enter () at kernel/rcu/tree_plugin.h:379
#64 __rcu_read_lock () at kernel/rcu/tree_plugin.h:402
#65 0xffffffff81b2054b in rcu_read_lock () at ./include/linux/rcupdate.h:748
#66 pfn_valid (pfn=<optimized out>) at ./include/linux/mmzone.h:2016
#67 kmsan_virt_addr_valid (addr=addr@entry=0xffffffff8620d974 <init_task+1012>) at ./arch/x86/include/asm/kmsan.h:82
#68 virt_to_page_or_null (vaddr=vaddr@entry=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/shadow.c:75
#69 0xffffffff81b2023c in kmsan_get_metadata (address=0xffffffff8620d974 <init_task+1012>, is_origin=false) at mm/kmsan/shadow.c:143
#70 kmsan_get_shadow_origin_ptr (address=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/shadow.c:97
#71 0xffffffff81b1dbd2 in get_shadow_origin_ptr (addr=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/instrumentation.c:36
#72 __msan_metadata_ptr_for_load_4 (addr=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/instrumentation.c:91
#73 0xffffffff8149568f in rcu_preempt_read_enter () at kernel/rcu/tree_plugin.h:379
#74 __rcu_read_lock () at kernel/rcu/tree_plugin.h:402
#75 0xffffffff81b2054b in rcu_read_lock () at ./include/linux/rcupdate.h:748
#76 pfn_valid (pfn=<optimized out>) at ./include/linux/mmzone.h:2016
#77 kmsan_virt_addr_valid (addr=addr@entry=0xffffffff86203c90) at ./arch/x86/include/asm/kmsan.h:82
#78 virt_to_page_or_null (vaddr=vaddr@entry=0xffffffff86203c90) at mm/kmsan/shadow.c:75
#79 0xffffffff81b2023c in kmsan_get_metadata (address=0xffffffff86203c90, is_origin=false) at mm/kmsan/shadow.c:143
#80 kmsan_get_shadow_origin_ptr (address=0xffffffff86203c90, size=8, store=false) at mm/kmsan/shadow.c:97
#81 0xffffffff81b1dc72 in get_shadow_origin_ptr (addr=0xffffffff8620d974 <init_task+1012>, size=8, store=false) at mm/kmsan/instrumentation.c:36
#82 __msan_metadata_ptr_for_load_8 (addr=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/instrumentation.c:92
#83 0xffffffff814fdb9e in filter_irq_stacks (entries=<optimized out>, nr_entries=4) at kernel/stacktrace.c:397
#84 0xffffffff829520e8 in stack_depot_save_flags (entries=0xffffffff8620d974 <init_task+1012>, nr_entries=4, alloc_flags=0, depot_flags=0) at lib/stackdepot.c:500
#85 0xffffffff81b1e560 in __msan_poison_alloca (address=0xffffffff86203da0, size=24, descr=<optimized out>) at mm/kmsan/instrumentation.c:285
#86 0xffffffff8562821c in _printk (fmt=0xffffffff85f191a5 "\0016Attempting lock1") at kernel/printk/printk.c:2324
#87 0xffffffff81942aa2 in kmem_cache_create_usercopy (name=0xffffffff85f18903 "mm_struct", size=1296, align=0, flags=270336, useroffset=<optimized out>, usersize=<optimized out>, ctor=0x0 <fixed_percpu_data>) at mm/slab_common.c:296
#88 0xffffffff86f337a0 in mm_cache_init () at kernel/fork.c:3262
#89 0xffffffff86eacb8e in start_kernel () at init/main.c:932
#90 0xffffffff86ecdf94 in x86_64_start_reservations (real_mode_data=0x140e0 <exception_stacks+28896> <error: Cannot access memory at address 0x140e0>) at arch/x86/kernel/head64.c:555
#91 0xffffffff86ecde9b in x86_64_start_kernel (real_mode_data=0x140e0 <exception_stacks+28896> <error: Cannot access memory at address 0x140e0>) at arch/x86/kernel/head64.c:536
#92 0xffffffff810001d3 in secondary_startup_64 () at /pool/workspace/linux/arch/x86/kernel/head_64.S:461
#93 0x0000000000000000 in ??
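
The trace above shows a cycle: an instrumentation hook calls a helper that is itself instrumented, so every metadata lookup triggers another one. One generic way to break such a cycle is a reentrancy guard. The following is a minimal userspace sketch of that idea only. All names (in_runtime, metadata_hook, instrumented_helper) are hypothetical, and this is not claimed to be the change these commits make or how KMSAN itself handles the problem.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-thread flag: set while inside the runtime. */
static _Thread_local bool in_runtime;
static int lookups;	/* counts metadata lookups actually performed */

static void instrumented_helper(void);

/* Stand-in for an instrumentation hook invoked on a memory access. */
static void metadata_hook(void)
{
	if (in_runtime)
		return;	/* reentered from our own helper: skip */

	in_runtime = true;
	lookups++;
	instrumented_helper();	/* would recurse forever without the guard */
	in_runtime = false;
}

/* Stand-in for a helper that is itself instrumented. */
static void instrumented_helper(void)
{
	metadata_hook();
}

/* Returns how many lookups ran for a single top-level access. */
int demo_guarded_lookups(void)
{
	lookups = 0;
	metadata_hook();
	assert(!in_runtime);
	return lookups;
}
```

With the guard in place, a single top-level access performs exactly one lookup; removing the early return reproduces the unbounded recursion in miniature.
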
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this pull request Nov 8, 2024
If open_cached_dir() encounters an error parsing the lease from the
server, the error handling may race with receiving a lease break,
resulting in open_cached_dir() freeing the cfid while the queued work is
pending.

Update open_cached_dir() to drop refs rather than directly freeing the
cfid.

Have cached_dir_lease_break(), cfids_laundromat_worker(), and
invalidate_all_cached_dirs() clear has_lease immediately while still
holding cfids->cfid_list_lock, and then use this to also simplify the
reference counting in cfids_laundromat_worker() and
invalidate_all_cached_dirs().
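
The lifetime rule described above (drop a reference instead of freeing directly, so an object with queued work survives until its last holder lets go) can be sketched in userspace as follows. This is a single-threaded illustration under stated assumptions: the names demo_cfid, demo_get, and demo_put are hypothetical, and the plain int counter stands in for a proper atomic reference count such as the kernel's kref.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical object shared by an opener and a queued work item. */
struct demo_cfid {
	int refcount;
	bool *freed;	/* lets the demo observe when the free happens */
};

static struct demo_cfid *demo_alloc(bool *freed)
{
	struct demo_cfid *c = malloc(sizeof(*c));

	if (!c)
		return NULL;
	c->refcount = 1;	/* the creator's reference */
	c->freed = freed;
	*freed = false;
	return c;
}

static void demo_get(struct demo_cfid *c)
{
	c->refcount++;
}

/* Drop one reference; free only when the last one goes away. */
static void demo_put(struct demo_cfid *c)
{
	assert(c->refcount > 0);
	if (--c->refcount == 0) {
		*c->freed = true;
		free(c);
	}
}

/* Returns true if the object outlived the error path until work ran. */
bool demo_error_path_keeps_object(void)
{
	bool freed;
	bool alive_during_work;
	struct demo_cfid *c = demo_alloc(&freed);

	if (!c)
		return false;
	demo_get(c);		/* reference held by the queued work */
	demo_put(c);		/* error path: drop ref instead of free() */
	alive_during_work = !freed;
	demo_put(c);		/* work finishes and drops its reference */
	return alive_during_work && freed;
}
```

Had the error path called free() directly, the second demo_put() would touch freed memory, which is the same class of bug as the use-after-free in the splat.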

Fixes this KASAN splat (which manually injects an error and lease break
in open_cached_dir()):

==================================================================
BUG: KASAN: slab-use-after-free in smb2_cached_lease_break+0x27/0xb0
Read of size 8 at addr ffff88811cc24c10 by task kworker/3:1/65

CPU: 3 UID: 0 PID: 65 Comm: kworker/3:1 Not tainted 6.12.0-rc6-g255cf264e6e5-dirty #87
Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020
Workqueue: cifsiod smb2_cached_lease_break
Call Trace:
 <TASK>
 dump_stack_lvl+0x77/0xb0
 print_report+0xce/0x660
 kasan_report+0xd3/0x110
 smb2_cached_lease_break+0x27/0xb0
 process_one_work+0x50a/0xc50
 worker_thread+0x2ba/0x530
 kthread+0x17c/0x1c0
 ret_from_fork+0x34/0x60
 ret_from_fork_asm+0x1a/0x30
 </TASK>

Allocated by task 2464:
 kasan_save_stack+0x33/0x60
 kasan_save_track+0x14/0x30
 __kasan_kmalloc+0xaa/0xb0
 open_cached_dir+0xa7d/0x1fb0
 smb2_query_path_info+0x43c/0x6e0
 cifs_get_fattr+0x346/0xf10
 cifs_get_inode_info+0x157/0x210
 cifs_revalidate_dentry_attr+0x2d1/0x460
 cifs_getattr+0x173/0x470
 vfs_statx_path+0x10f/0x160
 vfs_statx+0xe9/0x150
 vfs_fstatat+0x5e/0xc0
 __do_sys_newfstatat+0x91/0xf0
 do_syscall_64+0x95/0x1a0
 entry_SYSCALL_64_after_hwframe+0x76/0x7e

Freed by task 2464:
 kasan_save_stack+0x33/0x60
 kasan_save_track+0x14/0x30
 kasan_save_free_info+0x3b/0x60
 __kasan_slab_free+0x51/0x70
 kfree+0x174/0x520
 open_cached_dir+0x97f/0x1fb0
 smb2_query_path_info+0x43c/0x6e0
 cifs_get_fattr+0x346/0xf10
 cifs_get_inode_info+0x157/0x210
 cifs_revalidate_dentry_attr+0x2d1/0x460
 cifs_getattr+0x173/0x470
 vfs_statx_path+0x10f/0x160
 vfs_statx+0xe9/0x150
 vfs_fstatat+0x5e/0xc0
 __do_sys_newfstatat+0x91/0xf0
 do_syscall_64+0x95/0x1a0
 entry_SYSCALL_64_after_hwframe+0x76/0x7e

Last potentially related work creation:
 kasan_save_stack+0x33/0x60
 __kasan_record_aux_stack+0xad/0xc0
 insert_work+0x32/0x100
 __queue_work+0x5c9/0x870
 queue_work_on+0x82/0x90
 open_cached_dir+0x1369/0x1fb0
 smb2_query_path_info+0x43c/0x6e0
 cifs_get_fattr+0x346/0xf10
 cifs_get_inode_info+0x157/0x210
 cifs_revalidate_dentry_attr+0x2d1/0x460
 cifs_getattr+0x173/0x470
 vfs_statx_path+0x10f/0x160
 vfs_statx+0xe9/0x150
 vfs_fstatat+0x5e/0xc0
 __do_sys_newfstatat+0x91/0xf0
 do_syscall_64+0x95/0x1a0
 entry_SYSCALL_64_after_hwframe+0x76/0x7e

The buggy address belongs to the object at ffff88811cc24c00
 which belongs to the cache kmalloc-1k of size 1024
The buggy address is located 16 bytes inside of
 freed 1024-byte region [ffff88811cc24c00, ffff88811cc25000)

Cc: stable@vger.kernel.org
Signed-off-by: Paul Aurich <paul@darkrain42.org>
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this pull request Nov 11, 2024
If open_cached_dir() encounters an error parsing the lease from the
server, the error handling may race with receiving a lease break,
resulting in open_cached_dir() freeing the cfid while the queued work is
pending.

Update open_cached_dir() to drop refs rather than directly freeing the
cfid.

Have cached_dir_lease_break(), cfids_laundromat_worker(), and
invalidate_all_cached_dirs() clear has_lease immediately while still
holding cfids->cfid_list_lock, and then use this to also simplify the
reference counting in cfids_laundromat_worker() and
invalidate_all_cached_dirs().

Fixes this KASAN splat (which manually injects an error and lease break
in open_cached_dir()):

==================================================================
BUG: KASAN: slab-use-after-free in smb2_cached_lease_break+0x27/0xb0
Read of size 8 at addr ffff88811cc24c10 by task kworker/3:1/65

CPU: 3 UID: 0 PID: 65 Comm: kworker/3:1 Not tainted 6.12.0-rc6-g255cf264e6e5-dirty #87
Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020
Workqueue: cifsiod smb2_cached_lease_break
Call Trace:
 <TASK>
 dump_stack_lvl+0x77/0xb0
 print_report+0xce/0x660
 kasan_report+0xd3/0x110
 smb2_cached_lease_break+0x27/0xb0
 process_one_work+0x50a/0xc50
 worker_thread+0x2ba/0x530
 kthread+0x17c/0x1c0
 ret_from_fork+0x34/0x60
 ret_from_fork_asm+0x1a/0x30
 </TASK>

Allocated by task 2464:
 kasan_save_stack+0x33/0x60
 kasan_save_track+0x14/0x30
 __kasan_kmalloc+0xaa/0xb0
 open_cached_dir+0xa7d/0x1fb0
 smb2_query_path_info+0x43c/0x6e0
 cifs_get_fattr+0x346/0xf10
 cifs_get_inode_info+0x157/0x210
 cifs_revalidate_dentry_attr+0x2d1/0x460
 cifs_getattr+0x173/0x470
 vfs_statx_path+0x10f/0x160
 vfs_statx+0xe9/0x150
 vfs_fstatat+0x5e/0xc0
 __do_sys_newfstatat+0x91/0xf0
 do_syscall_64+0x95/0x1a0
 entry_SYSCALL_64_after_hwframe+0x76/0x7e

Freed by task 2464:
 kasan_save_stack+0x33/0x60
 kasan_save_track+0x14/0x30
 kasan_save_free_info+0x3b/0x60
 __kasan_slab_free+0x51/0x70
 kfree+0x174/0x520
 open_cached_dir+0x97f/0x1fb0
 smb2_query_path_info+0x43c/0x6e0
 cifs_get_fattr+0x346/0xf10
 cifs_get_inode_info+0x157/0x210
 cifs_revalidate_dentry_attr+0x2d1/0x460
 cifs_getattr+0x173/0x470
 vfs_statx_path+0x10f/0x160
 vfs_statx+0xe9/0x150
 vfs_fstatat+0x5e/0xc0
 __do_sys_newfstatat+0x91/0xf0
 do_syscall_64+0x95/0x1a0
 entry_SYSCALL_64_after_hwframe+0x76/0x7e

Last potentially related work creation:
 kasan_save_stack+0x33/0x60
 __kasan_record_aux_stack+0xad/0xc0
 insert_work+0x32/0x100
 __queue_work+0x5c9/0x870
 queue_work_on+0x82/0x90
 open_cached_dir+0x1369/0x1fb0
 smb2_query_path_info+0x43c/0x6e0
 cifs_get_fattr+0x346/0xf10
 cifs_get_inode_info+0x157/0x210
 cifs_revalidate_dentry_attr+0x2d1/0x460
 cifs_getattr+0x173/0x470
 vfs_statx_path+0x10f/0x160
 vfs_statx+0xe9/0x150
 vfs_fstatat+0x5e/0xc0
 __do_sys_newfstatat+0x91/0xf0
 do_syscall_64+0x95/0x1a0
 entry_SYSCALL_64_after_hwframe+0x76/0x7e

The buggy address belongs to the object at ffff88811cc24c00
 which belongs to the cache kmalloc-1k of size 1024
The buggy address is located 16 bytes inside of
 freed 1024-byte region [ffff88811cc24c00, ffff88811cc25000)

Cc: stable@vger.kernel.org
Signed-off-by: Paul Aurich <paul@darkrain42.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
staging-kernelci-org pushed a commit to kernelci/linux that referenced this pull request Nov 14, 2024
If open_cached_dir() encounters an error parsing the lease from the
server, the error handling may race with receiving a lease break,
resulting in open_cached_dir() freeing the cfid while the queued work is
pending.

Update open_cached_dir() to drop refs rather than directly freeing the
cfid.

Have cached_dir_lease_break(), cfids_laundromat_worker(), and
invalidate_all_cached_dirs() clear has_lease immediately while still
holding cfids->cfid_list_lock, and then use this to also simplify the
reference counting in cfids_laundromat_worker() and
invalidate_all_cached_dirs().

Fixes this KASAN splat (which manually injects an error and lease break
in open_cached_dir()):

==================================================================
BUG: KASAN: slab-use-after-free in smb2_cached_lease_break+0x27/0xb0
Read of size 8 at addr ffff88811cc24c10 by task kworker/3:1/65

CPU: 3 UID: 0 PID: 65 Comm: kworker/3:1 Not tainted 6.12.0-rc6-g255cf264e6e5-dirty #87
Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020
Workqueue: cifsiod smb2_cached_lease_break
Call Trace:
 <TASK>
 dump_stack_lvl+0x77/0xb0
 print_report+0xce/0x660
 kasan_report+0xd3/0x110
 smb2_cached_lease_break+0x27/0xb0
 process_one_work+0x50a/0xc50
 worker_thread+0x2ba/0x530
 kthread+0x17c/0x1c0
 ret_from_fork+0x34/0x60
 ret_from_fork_asm+0x1a/0x30
 </TASK>

Allocated by task 2464:
 kasan_save_stack+0x33/0x60
 kasan_save_track+0x14/0x30
 __kasan_kmalloc+0xaa/0xb0
 open_cached_dir+0xa7d/0x1fb0
 smb2_query_path_info+0x43c/0x6e0
 cifs_get_fattr+0x346/0xf10
 cifs_get_inode_info+0x157/0x210
 cifs_revalidate_dentry_attr+0x2d1/0x460
 cifs_getattr+0x173/0x470
 vfs_statx_path+0x10f/0x160
 vfs_statx+0xe9/0x150
 vfs_fstatat+0x5e/0xc0
 __do_sys_newfstatat+0x91/0xf0
 do_syscall_64+0x95/0x1a0
 entry_SYSCALL_64_after_hwframe+0x76/0x7e

Freed by task 2464:
 kasan_save_stack+0x33/0x60
 kasan_save_track+0x14/0x30
 kasan_save_free_info+0x3b/0x60
 __kasan_slab_free+0x51/0x70
 kfree+0x174/0x520
 open_cached_dir+0x97f/0x1fb0
 smb2_query_path_info+0x43c/0x6e0
 cifs_get_fattr+0x346/0xf10
 cifs_get_inode_info+0x157/0x210
 cifs_revalidate_dentry_attr+0x2d1/0x460
 cifs_getattr+0x173/0x470
 vfs_statx_path+0x10f/0x160
 vfs_statx+0xe9/0x150
 vfs_fstatat+0x5e/0xc0
 __do_sys_newfstatat+0x91/0xf0
 do_syscall_64+0x95/0x1a0
 entry_SYSCALL_64_after_hwframe+0x76/0x7e

Last potentially related work creation:
 kasan_save_stack+0x33/0x60
 __kasan_record_aux_stack+0xad/0xc0
 insert_work+0x32/0x100
 __queue_work+0x5c9/0x870
 queue_work_on+0x82/0x90
 open_cached_dir+0x1369/0x1fb0
 smb2_query_path_info+0x43c/0x6e0
 cifs_get_fattr+0x346/0xf10
 cifs_get_inode_info+0x157/0x210
 cifs_revalidate_dentry_attr+0x2d1/0x460
 cifs_getattr+0x173/0x470
 vfs_statx_path+0x10f/0x160
 vfs_statx+0xe9/0x150
 vfs_fstatat+0x5e/0xc0
 __do_sys_newfstatat+0x91/0xf0
 do_syscall_64+0x95/0x1a0
 entry_SYSCALL_64_after_hwframe+0x76/0x7e

The buggy address belongs to the object at ffff88811cc24c00
 which belongs to the cache kmalloc-1k of size 1024
The buggy address is located 16 bytes inside of
 freed 1024-byte region [ffff88811cc24c00, ffff88811cc25000)

Cc: stable@vger.kernel.org
Signed-off-by: Paul Aurich <paul@darkrain42.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
mj22226 pushed a commit to mj22226/linux that referenced this pull request Nov 20, 2024
If open_cached_dir() encounters an error parsing the lease from the
server, the error handling may race with receiving a lease break,
resulting in open_cached_dir() freeing the cfid while the queued work is
pending.

Update open_cached_dir() to drop refs rather than directly freeing the
cfid.

Have cached_dir_lease_break(), cfids_laundromat_worker(), and
invalidate_all_cached_dirs() clear has_lease immediately while still
holding cfids->cfid_list_lock, and then use this to also simplify the
reference counting in cfids_laundromat_worker() and
invalidate_all_cached_dirs().

Fixes this KASAN splat (which manually injects an error and lease break
in open_cached_dir()):

==================================================================
BUG: KASAN: slab-use-after-free in smb2_cached_lease_break+0x27/0xb0
Read of size 8 at addr ffff88811cc24c10 by task kworker/3:1/65

CPU: 3 UID: 0 PID: 65 Comm: kworker/3:1 Not tainted 6.12.0-rc6-g255cf264e6e5-dirty #87
Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020
Workqueue: cifsiod smb2_cached_lease_break
Call Trace:
 <TASK>
 dump_stack_lvl+0x77/0xb0
 print_report+0xce/0x660
 kasan_report+0xd3/0x110
 smb2_cached_lease_break+0x27/0xb0
 process_one_work+0x50a/0xc50
 worker_thread+0x2ba/0x530
 kthread+0x17c/0x1c0
 ret_from_fork+0x34/0x60
 ret_from_fork_asm+0x1a/0x30
 </TASK>

Allocated by task 2464:
 kasan_save_stack+0x33/0x60
 kasan_save_track+0x14/0x30
 __kasan_kmalloc+0xaa/0xb0
 open_cached_dir+0xa7d/0x1fb0
 smb2_query_path_info+0x43c/0x6e0
 cifs_get_fattr+0x346/0xf10
 cifs_get_inode_info+0x157/0x210
 cifs_revalidate_dentry_attr+0x2d1/0x460
 cifs_getattr+0x173/0x470
 vfs_statx_path+0x10f/0x160
 vfs_statx+0xe9/0x150
 vfs_fstatat+0x5e/0xc0
 __do_sys_newfstatat+0x91/0xf0
 do_syscall_64+0x95/0x1a0
 entry_SYSCALL_64_after_hwframe+0x76/0x7e

Freed by task 2464:
 kasan_save_stack+0x33/0x60
 kasan_save_track+0x14/0x30
 kasan_save_free_info+0x3b/0x60
 __kasan_slab_free+0x51/0x70
 kfree+0x174/0x520
 open_cached_dir+0x97f/0x1fb0
 smb2_query_path_info+0x43c/0x6e0
 cifs_get_fattr+0x346/0xf10
 cifs_get_inode_info+0x157/0x210
 cifs_revalidate_dentry_attr+0x2d1/0x460
 cifs_getattr+0x173/0x470
 vfs_statx_path+0x10f/0x160
 vfs_statx+0xe9/0x150
 vfs_fstatat+0x5e/0xc0
 __do_sys_newfstatat+0x91/0xf0
 do_syscall_64+0x95/0x1a0
 entry_SYSCALL_64_after_hwframe+0x76/0x7e

Last potentially related work creation:
 kasan_save_stack+0x33/0x60
 __kasan_record_aux_stack+0xad/0xc0
 insert_work+0x32/0x100
 __queue_work+0x5c9/0x870
 queue_work_on+0x82/0x90
 open_cached_dir+0x1369/0x1fb0
 smb2_query_path_info+0x43c/0x6e0
 cifs_get_fattr+0x346/0xf10
 cifs_get_inode_info+0x157/0x210
 cifs_revalidate_dentry_attr+0x2d1/0x460
 cifs_getattr+0x173/0x470
 vfs_statx_path+0x10f/0x160
 vfs_statx+0xe9/0x150
 vfs_fstatat+0x5e/0xc0
 __do_sys_newfstatat+0x91/0xf0
 do_syscall_64+0x95/0x1a0
 entry_SYSCALL_64_after_hwframe+0x76/0x7e

The buggy address belongs to the object at ffff88811cc24c00
 which belongs to the cache kmalloc-1k of size 1024
The buggy address is located 16 bytes inside of
 freed 1024-byte region [ffff88811cc24c00, ffff88811cc25000)

Cc: stable@vger.kernel.org
Signed-off-by: Paul Aurich <paul@darkrain42.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this pull request Nov 21, 2024
If open_cached_dir() encounters an error parsing the lease from the
server, the error handling may race with receiving a lease break,
resulting in open_cached_dir() freeing the cfid while the queued work is
pending.

Update open_cached_dir() to drop refs rather than directly freeing the
cfid.

Have cached_dir_lease_break(), cfids_laundromat_worker(), and
invalidate_all_cached_dirs() clear has_lease immediately while still
holding cfids->cfid_list_lock, and then use this to also simplify the
reference counting in cfids_laundromat_worker() and
invalidate_all_cached_dirs().

Fixes this KASAN splat (which manually injects an error and lease break
in open_cached_dir()):

==================================================================
BUG: KASAN: slab-use-after-free in smb2_cached_lease_break+0x27/0xb0
Read of size 8 at addr ffff88811cc24c10 by task kworker/3:1/65

CPU: 3 UID: 0 PID: 65 Comm: kworker/3:1 Not tainted 6.12.0-rc6-g255cf264e6e5-dirty #87
Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020
Workqueue: cifsiod smb2_cached_lease_break
Call Trace:
 <TASK>
 dump_stack_lvl+0x77/0xb0
 print_report+0xce/0x660
 kasan_report+0xd3/0x110
 smb2_cached_lease_break+0x27/0xb0
 process_one_work+0x50a/0xc50
 worker_thread+0x2ba/0x530
 kthread+0x17c/0x1c0
 ret_from_fork+0x34/0x60
 ret_from_fork_asm+0x1a/0x30
 </TASK>

Allocated by task 2464:
 kasan_save_stack+0x33/0x60
 kasan_save_track+0x14/0x30
 __kasan_kmalloc+0xaa/0xb0
 open_cached_dir+0xa7d/0x1fb0
 smb2_query_path_info+0x43c/0x6e0
 cifs_get_fattr+0x346/0xf10
 cifs_get_inode_info+0x157/0x210
 cifs_revalidate_dentry_attr+0x2d1/0x460
 cifs_getattr+0x173/0x470
 vfs_statx_path+0x10f/0x160
 vfs_statx+0xe9/0x150
 vfs_fstatat+0x5e/0xc0
 __do_sys_newfstatat+0x91/0xf0
 do_syscall_64+0x95/0x1a0
 entry_SYSCALL_64_after_hwframe+0x76/0x7e

Freed by task 2464:
 kasan_save_stack+0x33/0x60
 kasan_save_track+0x14/0x30
 kasan_save_free_info+0x3b/0x60
 __kasan_slab_free+0x51/0x70
 kfree+0x174/0x520
 open_cached_dir+0x97f/0x1fb0
 smb2_query_path_info+0x43c/0x6e0
 cifs_get_fattr+0x346/0xf10
 cifs_get_inode_info+0x157/0x210
 cifs_revalidate_dentry_attr+0x2d1/0x460
 cifs_getattr+0x173/0x470
 vfs_statx_path+0x10f/0x160
 vfs_statx+0xe9/0x150
 vfs_fstatat+0x5e/0xc0
 __do_sys_newfstatat+0x91/0xf0
 do_syscall_64+0x95/0x1a0
 entry_SYSCALL_64_after_hwframe+0x76/0x7e

Last potentially related work creation:
 kasan_save_stack+0x33/0x60
 __kasan_record_aux_stack+0xad/0xc0
 insert_work+0x32/0x100
 __queue_work+0x5c9/0x870
 queue_work_on+0x82/0x90
 open_cached_dir+0x1369/0x1fb0
 smb2_query_path_info+0x43c/0x6e0
 cifs_get_fattr+0x346/0xf10
 cifs_get_inode_info+0x157/0x210
 cifs_revalidate_dentry_attr+0x2d1/0x460
 cifs_getattr+0x173/0x470
 vfs_statx_path+0x10f/0x160
 vfs_statx+0xe9/0x150
 vfs_fstatat+0x5e/0xc0
 __do_sys_newfstatat+0x91/0xf0
 do_syscall_64+0x95/0x1a0
 entry_SYSCALL_64_after_hwframe+0x76/0x7e

The buggy address belongs to the object at ffff88811cc24c00
 which belongs to the cache kmalloc-1k of size 1024
The buggy address is located 16 bytes inside of
 freed 1024-byte region [ffff88811cc24c00, ffff88811cc25000)

Cc: stable@vger.kernel.org
Signed-off-by: Paul Aurich <paul@darkrain42.org>
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this pull request Nov 21, 2024
If open_cached_dir() encounters an error parsing the lease from the
server, the error handling may race with receiving a lease break,
resulting in open_cached_dir() freeing the cfid while the queued work is
pending.

Update open_cached_dir() to drop refs rather than directly freeing the
cfid.

Have cached_dir_lease_break(), cfids_laundromat_worker(), and
invalidate_all_cached_dirs() clear has_lease immediately while still
holding cfids->cfid_list_lock, and then use this to also simplify the
reference counting in cfids_laundromat_worker() and
invalidate_all_cached_dirs().

Fixes this KASAN splat (which manually injects an error and lease break
in open_cached_dir()):

==================================================================
BUG: KASAN: slab-use-after-free in smb2_cached_lease_break+0x27/0xb0
Read of size 8 at addr ffff88811cc24c10 by task kworker/3:1/65

CPU: 3 UID: 0 PID: 65 Comm: kworker/3:1 Not tainted 6.12.0-rc6-g255cf264e6e5-dirty #87
Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020
Workqueue: cifsiod smb2_cached_lease_break
Call Trace:
 <TASK>
 dump_stack_lvl+0x77/0xb0
 print_report+0xce/0x660
 kasan_report+0xd3/0x110
 smb2_cached_lease_break+0x27/0xb0
 process_one_work+0x50a/0xc50
 worker_thread+0x2ba/0x530
 kthread+0x17c/0x1c0
 ret_from_fork+0x34/0x60
 ret_from_fork_asm+0x1a/0x30
 </TASK>

Allocated by task 2464:
 kasan_save_stack+0x33/0x60
 kasan_save_track+0x14/0x30
 __kasan_kmalloc+0xaa/0xb0
 open_cached_dir+0xa7d/0x1fb0
 smb2_query_path_info+0x43c/0x6e0
 cifs_get_fattr+0x346/0xf10
 cifs_get_inode_info+0x157/0x210
 cifs_revalidate_dentry_attr+0x2d1/0x460
 cifs_getattr+0x173/0x470
 vfs_statx_path+0x10f/0x160
 vfs_statx+0xe9/0x150
 vfs_fstatat+0x5e/0xc0
 __do_sys_newfstatat+0x91/0xf0
 do_syscall_64+0x95/0x1a0
 entry_SYSCALL_64_after_hwframe+0x76/0x7e

Freed by task 2464:
 kasan_save_stack+0x33/0x60
 kasan_save_track+0x14/0x30
 kasan_save_free_info+0x3b/0x60
 __kasan_slab_free+0x51/0x70
 kfree+0x174/0x520
 open_cached_dir+0x97f/0x1fb0
 smb2_query_path_info+0x43c/0x6e0
 cifs_get_fattr+0x346/0xf10
 cifs_get_inode_info+0x157/0x210
 cifs_revalidate_dentry_attr+0x2d1/0x460
 cifs_getattr+0x173/0x470
 vfs_statx_path+0x10f/0x160
 vfs_statx+0xe9/0x150
 vfs_fstatat+0x5e/0xc0
 __do_sys_newfstatat+0x91/0xf0
 do_syscall_64+0x95/0x1a0
 entry_SYSCALL_64_after_hwframe+0x76/0x7e

Last potentially related work creation:
 kasan_save_stack+0x33/0x60
 __kasan_record_aux_stack+0xad/0xc0
 insert_work+0x32/0x100
 __queue_work+0x5c9/0x870
 queue_work_on+0x82/0x90
 open_cached_dir+0x1369/0x1fb0
 smb2_query_path_info+0x43c/0x6e0
 cifs_get_fattr+0x346/0xf10
 cifs_get_inode_info+0x157/0x210
 cifs_revalidate_dentry_attr+0x2d1/0x460
 cifs_getattr+0x173/0x470
 vfs_statx_path+0x10f/0x160
 vfs_statx+0xe9/0x150
 vfs_fstatat+0x5e/0xc0
 __do_sys_newfstatat+0x91/0xf0
 do_syscall_64+0x95/0x1a0
 entry_SYSCALL_64_after_hwframe+0x76/0x7e

The buggy address belongs to the object at ffff88811cc24c00
 which belongs to the cache kmalloc-1k of size 1024
The buggy address is located 16 bytes inside of
 freed 1024-byte region [ffff88811cc24c00, ffff88811cc25000)

Cc: stable@vger.kernel.org
Signed-off-by: Paul Aurich <paul@darkrain42.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
mj22226 pushed a commit to mj22226/linux that referenced this pull request Nov 22, 2024
If open_cached_dir() encounters an error parsing the lease from the
server, the error handling may race with receiving a lease break,
resulting in open_cached_dir() freeing the cfid while the queued work is
pending.

Update open_cached_dir() to drop refs rather than directly freeing the
cfid.

Have cached_dir_lease_break(), cfids_laundromat_worker(), and
invalidate_all_cached_dirs() clear has_lease immediately while still
holding cfids->cfid_list_lock, and then use this to also simplify the
reference counting in cfids_laundromat_worker() and
invalidate_all_cached_dirs().

Fixes this KASAN splat (which manually injects an error and lease break
in open_cached_dir()):

==================================================================
BUG: KASAN: slab-use-after-free in smb2_cached_lease_break+0x27/0xb0
Read of size 8 at addr ffff88811cc24c10 by task kworker/3:1/65

CPU: 3 UID: 0 PID: 65 Comm: kworker/3:1 Not tainted 6.12.0-rc6-g255cf264e6e5-dirty #87
Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020
Workqueue: cifsiod smb2_cached_lease_break
Call Trace:
 <TASK>
 dump_stack_lvl+0x77/0xb0
 print_report+0xce/0x660
 kasan_report+0xd3/0x110
 smb2_cached_lease_break+0x27/0xb0
 process_one_work+0x50a/0xc50
 worker_thread+0x2ba/0x530
 kthread+0x17c/0x1c0
 ret_from_fork+0x34/0x60
 ret_from_fork_asm+0x1a/0x30
 </TASK>

Allocated by task 2464:
 kasan_save_stack+0x33/0x60
 kasan_save_track+0x14/0x30
 __kasan_kmalloc+0xaa/0xb0
 open_cached_dir+0xa7d/0x1fb0
 smb2_query_path_info+0x43c/0x6e0
 cifs_get_fattr+0x346/0xf10
 cifs_get_inode_info+0x157/0x210
 cifs_revalidate_dentry_attr+0x2d1/0x460
 cifs_getattr+0x173/0x470
 vfs_statx_path+0x10f/0x160
 vfs_statx+0xe9/0x150
 vfs_fstatat+0x5e/0xc0
 __do_sys_newfstatat+0x91/0xf0
 do_syscall_64+0x95/0x1a0
 entry_SYSCALL_64_after_hwframe+0x76/0x7e

Freed by task 2464:
 kasan_save_stack+0x33/0x60
 kasan_save_track+0x14/0x30
 kasan_save_free_info+0x3b/0x60
 __kasan_slab_free+0x51/0x70
 kfree+0x174/0x520
 open_cached_dir+0x97f/0x1fb0
 smb2_query_path_info+0x43c/0x6e0
 cifs_get_fattr+0x346/0xf10
 cifs_get_inode_info+0x157/0x210
 cifs_revalidate_dentry_attr+0x2d1/0x460
 cifs_getattr+0x173/0x470
 vfs_statx_path+0x10f/0x160
 vfs_statx+0xe9/0x150
 vfs_fstatat+0x5e/0xc0
 __do_sys_newfstatat+0x91/0xf0
 do_syscall_64+0x95/0x1a0
 entry_SYSCALL_64_after_hwframe+0x76/0x7e

Last potentially related work creation:
 kasan_save_stack+0x33/0x60
 __kasan_record_aux_stack+0xad/0xc0
 insert_work+0x32/0x100
 __queue_work+0x5c9/0x870
 queue_work_on+0x82/0x90
 open_cached_dir+0x1369/0x1fb0
 smb2_query_path_info+0x43c/0x6e0
 cifs_get_fattr+0x346/0xf10
 cifs_get_inode_info+0x157/0x210
 cifs_revalidate_dentry_attr+0x2d1/0x460
 cifs_getattr+0x173/0x470
 vfs_statx_path+0x10f/0x160
 vfs_statx+0xe9/0x150
 vfs_fstatat+0x5e/0xc0
 __do_sys_newfstatat+0x91/0xf0
 do_syscall_64+0x95/0x1a0
 entry_SYSCALL_64_after_hwframe+0x76/0x7e

The buggy address belongs to the object at ffff88811cc24c00
 which belongs to the cache kmalloc-1k of size 1024
The buggy address is located 16 bytes inside of
 freed 1024-byte region [ffff88811cc24c00, ffff88811cc25000)

Cc: stable@vger.kernel.org
Signed-off-by: Paul Aurich <paul@darkrain42.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
mj22226 pushed a commit to mj22226/linux that referenced this pull request Dec 4, 2024
commit a9685b4 upstream.

If open_cached_dir() encounters an error parsing the lease from the
server, the error handling may race with receiving a lease break,
resulting in open_cached_dir() freeing the cfid while the queued work is
pending.

Update open_cached_dir() to drop refs rather than directly freeing the
cfid.

Have cached_dir_lease_break(), cfids_laundromat_worker(), and
invalidate_all_cached_dirs() clear has_lease immediately while still
holding cfids->cfid_list_lock, and then use this to also simplify the
reference counting in cfids_laundromat_worker() and
invalidate_all_cached_dirs().

Fixes this KASAN splat (which manually injects an error and lease break
in open_cached_dir()):

==================================================================
BUG: KASAN: slab-use-after-free in smb2_cached_lease_break+0x27/0xb0
Read of size 8 at addr ffff88811cc24c10 by task kworker/3:1/65

CPU: 3 UID: 0 PID: 65 Comm: kworker/3:1 Not tainted 6.12.0-rc6-g255cf264e6e5-dirty #87
Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020
Workqueue: cifsiod smb2_cached_lease_break
Call Trace:
 <TASK>
 dump_stack_lvl+0x77/0xb0
 print_report+0xce/0x660
 kasan_report+0xd3/0x110
 smb2_cached_lease_break+0x27/0xb0
 process_one_work+0x50a/0xc50
 worker_thread+0x2ba/0x530
 kthread+0x17c/0x1c0
 ret_from_fork+0x34/0x60
 ret_from_fork_asm+0x1a/0x30
 </TASK>

Allocated by task 2464:
 kasan_save_stack+0x33/0x60
 kasan_save_track+0x14/0x30
 __kasan_kmalloc+0xaa/0xb0
 open_cached_dir+0xa7d/0x1fb0
 smb2_query_path_info+0x43c/0x6e0
 cifs_get_fattr+0x346/0xf10
 cifs_get_inode_info+0x157/0x210
 cifs_revalidate_dentry_attr+0x2d1/0x460
 cifs_getattr+0x173/0x470
 vfs_statx_path+0x10f/0x160
 vfs_statx+0xe9/0x150
 vfs_fstatat+0x5e/0xc0
 __do_sys_newfstatat+0x91/0xf0
 do_syscall_64+0x95/0x1a0
 entry_SYSCALL_64_after_hwframe+0x76/0x7e

Freed by task 2464:
 kasan_save_stack+0x33/0x60
 kasan_save_track+0x14/0x30
 kasan_save_free_info+0x3b/0x60
 __kasan_slab_free+0x51/0x70
 kfree+0x174/0x520
 open_cached_dir+0x97f/0x1fb0
 smb2_query_path_info+0x43c/0x6e0
 cifs_get_fattr+0x346/0xf10
 cifs_get_inode_info+0x157/0x210
 cifs_revalidate_dentry_attr+0x2d1/0x460
 cifs_getattr+0x173/0x470
 vfs_statx_path+0x10f/0x160
 vfs_statx+0xe9/0x150
 vfs_fstatat+0x5e/0xc0
 __do_sys_newfstatat+0x91/0xf0
 do_syscall_64+0x95/0x1a0
 entry_SYSCALL_64_after_hwframe+0x76/0x7e

Last potentially related work creation:
 kasan_save_stack+0x33/0x60
 __kasan_record_aux_stack+0xad/0xc0
 insert_work+0x32/0x100
 __queue_work+0x5c9/0x870
 queue_work_on+0x82/0x90
 open_cached_dir+0x1369/0x1fb0
 smb2_query_path_info+0x43c/0x6e0
 cifs_get_fattr+0x346/0xf10
 cifs_get_inode_info+0x157/0x210
 cifs_revalidate_dentry_attr+0x2d1/0x460
 cifs_getattr+0x173/0x470
 vfs_statx_path+0x10f/0x160
 vfs_statx+0xe9/0x150
 vfs_fstatat+0x5e/0xc0
 __do_sys_newfstatat+0x91/0xf0
 do_syscall_64+0x95/0x1a0
 entry_SYSCALL_64_after_hwframe+0x76/0x7e

The buggy address belongs to the object at ffff88811cc24c00
 which belongs to the cache kmalloc-1k of size 1024
The buggy address is located 16 bytes inside of
 freed 1024-byte region [ffff88811cc24c00, ffff88811cc25000)

Cc: stable@vger.kernel.org
Signed-off-by: Paul Aurich <paul@darkrain42.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
mj22226 pushed a commit to mj22226/linux that referenced this pull request Dec 4, 2024
commit a9685b4 upstream.

If open_cached_dir() encounters an error parsing the lease from the
server, the error handling may race with receiving a lease break,
resulting in open_cached_dir() freeing the cfid while the queued work is
pending.

Update open_cached_dir() to drop refs rather than directly freeing the
cfid.

Have cached_dir_lease_break(), cfids_laundromat_worker(), and
invalidate_all_cached_dirs() clear has_lease immediately while still
holding cfids->cfid_list_lock, and then use this to also simplify the
reference counting in cfids_laundromat_worker() and
invalidate_all_cached_dirs().

Fixes this KASAN splat (which manually injects an error and lease break
in open_cached_dir()):

==================================================================
BUG: KASAN: slab-use-after-free in smb2_cached_lease_break+0x27/0xb0
Read of size 8 at addr ffff88811cc24c10 by task kworker/3:1/65

CPU: 3 UID: 0 PID: 65 Comm: kworker/3:1 Not tainted 6.12.0-rc6-g255cf264e6e5-dirty #87
Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020
Workqueue: cifsiod smb2_cached_lease_break
Call Trace:
 <TASK>
 dump_stack_lvl+0x77/0xb0
 print_report+0xce/0x660
 kasan_report+0xd3/0x110
 smb2_cached_lease_break+0x27/0xb0
 process_one_work+0x50a/0xc50
 worker_thread+0x2ba/0x530
 kthread+0x17c/0x1c0
 ret_from_fork+0x34/0x60
 ret_from_fork_asm+0x1a/0x30
 </TASK>

Allocated by task 2464:
 kasan_save_stack+0x33/0x60
 kasan_save_track+0x14/0x30
 __kasan_kmalloc+0xaa/0xb0
 open_cached_dir+0xa7d/0x1fb0
 smb2_query_path_info+0x43c/0x6e0
 cifs_get_fattr+0x346/0xf10
 cifs_get_inode_info+0x157/0x210
 cifs_revalidate_dentry_attr+0x2d1/0x460
 cifs_getattr+0x173/0x470
 vfs_statx_path+0x10f/0x160
 vfs_statx+0xe9/0x150
 vfs_fstatat+0x5e/0xc0
 __do_sys_newfstatat+0x91/0xf0
 do_syscall_64+0x95/0x1a0
 entry_SYSCALL_64_after_hwframe+0x76/0x7e

Freed by task 2464:
 kasan_save_stack+0x33/0x60
 kasan_save_track+0x14/0x30
 kasan_save_free_info+0x3b/0x60
 __kasan_slab_free+0x51/0x70
 kfree+0x174/0x520
 open_cached_dir+0x97f/0x1fb0
 smb2_query_path_info+0x43c/0x6e0
 cifs_get_fattr+0x346/0xf10
 cifs_get_inode_info+0x157/0x210
 cifs_revalidate_dentry_attr+0x2d1/0x460
 cifs_getattr+0x173/0x470
 vfs_statx_path+0x10f/0x160
 vfs_statx+0xe9/0x150
 vfs_fstatat+0x5e/0xc0
 __do_sys_newfstatat+0x91/0xf0
 do_syscall_64+0x95/0x1a0
 entry_SYSCALL_64_after_hwframe+0x76/0x7e

Last potentially related work creation:
 kasan_save_stack+0x33/0x60
 __kasan_record_aux_stack+0xad/0xc0
 insert_work+0x32/0x100
 __queue_work+0x5c9/0x870
 queue_work_on+0x82/0x90
 open_cached_dir+0x1369/0x1fb0
 smb2_query_path_info+0x43c/0x6e0
 cifs_get_fattr+0x346/0xf10
 cifs_get_inode_info+0x157/0x210
 cifs_revalidate_dentry_attr+0x2d1/0x460
 cifs_getattr+0x173/0x470
 vfs_statx_path+0x10f/0x160
 vfs_statx+0xe9/0x150
 vfs_fstatat+0x5e/0xc0
 __do_sys_newfstatat+0x91/0xf0
 do_syscall_64+0x95/0x1a0
 entry_SYSCALL_64_after_hwframe+0x76/0x7e

The buggy address belongs to the object at ffff88811cc24c00
 which belongs to the cache kmalloc-1k of size 1024
The buggy address is located 16 bytes inside of
 freed 1024-byte region [ffff88811cc24c00, ffff88811cc25000)

Cc: stable@vger.kernel.org
Signed-off-by: Paul Aurich <paul@darkrain42.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
ptr1337 pushed a commit to CachyOS/linux that referenced this pull request Dec 5, 2024
commit a9685b4 upstream.

If open_cached_dir() encounters an error parsing the lease from the
server, the error handling may race with receiving a lease break,
resulting in open_cached_dir() freeing the cfid while the queued work is
pending.

Update open_cached_dir() to drop refs rather than directly freeing the
cfid.

Have cached_dir_lease_break(), cfids_laundromat_worker(), and
invalidate_all_cached_dirs() clear has_lease immediately while still
holding cfids->cfid_list_lock, and then use this to also simplify the
reference counting in cfids_laundromat_worker() and
invalidate_all_cached_dirs().

Fixes this KASAN splat (which manually injects an error and lease break
in open_cached_dir()):

==================================================================
BUG: KASAN: slab-use-after-free in smb2_cached_lease_break+0x27/0xb0
Read of size 8 at addr ffff88811cc24c10 by task kworker/3:1/65

CPU: 3 UID: 0 PID: 65 Comm: kworker/3:1 Not tainted 6.12.0-rc6-g255cf264e6e5-dirty #87
Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020
Workqueue: cifsiod smb2_cached_lease_break
Call Trace:
 <TASK>
 dump_stack_lvl+0x77/0xb0
 print_report+0xce/0x660
 kasan_report+0xd3/0x110
 smb2_cached_lease_break+0x27/0xb0
 process_one_work+0x50a/0xc50
 worker_thread+0x2ba/0x530
 kthread+0x17c/0x1c0
 ret_from_fork+0x34/0x60
 ret_from_fork_asm+0x1a/0x30
 </TASK>

Allocated by task 2464:
 kasan_save_stack+0x33/0x60
 kasan_save_track+0x14/0x30
 __kasan_kmalloc+0xaa/0xb0
 open_cached_dir+0xa7d/0x1fb0
 smb2_query_path_info+0x43c/0x6e0
 cifs_get_fattr+0x346/0xf10
 cifs_get_inode_info+0x157/0x210
 cifs_revalidate_dentry_attr+0x2d1/0x460
 cifs_getattr+0x173/0x470
 vfs_statx_path+0x10f/0x160
 vfs_statx+0xe9/0x150
 vfs_fstatat+0x5e/0xc0
 __do_sys_newfstatat+0x91/0xf0
 do_syscall_64+0x95/0x1a0
 entry_SYSCALL_64_after_hwframe+0x76/0x7e

Freed by task 2464:
 kasan_save_stack+0x33/0x60
 kasan_save_track+0x14/0x30
 kasan_save_free_info+0x3b/0x60
 __kasan_slab_free+0x51/0x70
 kfree+0x174/0x520
 open_cached_dir+0x97f/0x1fb0
 smb2_query_path_info+0x43c/0x6e0
 cifs_get_fattr+0x346/0xf10
 cifs_get_inode_info+0x157/0x210
 cifs_revalidate_dentry_attr+0x2d1/0x460
 cifs_getattr+0x173/0x470
 vfs_statx_path+0x10f/0x160
 vfs_statx+0xe9/0x150
 vfs_fstatat+0x5e/0xc0
 __do_sys_newfstatat+0x91/0xf0
 do_syscall_64+0x95/0x1a0
 entry_SYSCALL_64_after_hwframe+0x76/0x7e

Last potentially related work creation:
 kasan_save_stack+0x33/0x60
 __kasan_record_aux_stack+0xad/0xc0
 insert_work+0x32/0x100
 __queue_work+0x5c9/0x870
 queue_work_on+0x82/0x90
 open_cached_dir+0x1369/0x1fb0
 smb2_query_path_info+0x43c/0x6e0
 cifs_get_fattr+0x346/0xf10
 cifs_get_inode_info+0x157/0x210
 cifs_revalidate_dentry_attr+0x2d1/0x460
 cifs_getattr+0x173/0x470
 vfs_statx_path+0x10f/0x160
 vfs_statx+0xe9/0x150
 vfs_fstatat+0x5e/0xc0
 __do_sys_newfstatat+0x91/0xf0
 do_syscall_64+0x95/0x1a0
 entry_SYSCALL_64_after_hwframe+0x76/0x7e

The buggy address belongs to the object at ffff88811cc24c00
 which belongs to the cache kmalloc-1k of size 1024
The buggy address is located 16 bytes inside of
 freed 1024-byte region [ffff88811cc24c00, ffff88811cc25000)

Cc: stable@vger.kernel.org
Signed-off-by: Paul Aurich <paul@darkrain42.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Kaz205 pushed a commit to Kaz205/linux that referenced this pull request Dec 5, 2024
commit a9685b4 upstream.

If open_cached_dir() encounters an error parsing the lease from the
server, the error handling may race with receiving a lease break,
resulting in open_cached_dir() freeing the cfid while the queued work is
pending.

Update open_cached_dir() to drop refs rather than directly freeing the
cfid.

Have cached_dir_lease_break(), cfids_laundromat_worker(), and
invalidate_all_cached_dirs() clear has_lease immediately while still
holding cfids->cfid_list_lock, and then use this to also simplify the
reference counting in cfids_laundromat_worker() and
invalidate_all_cached_dirs().

Fixes this KASAN splat (which manually injects an error and lease break
in open_cached_dir()):

==================================================================
BUG: KASAN: slab-use-after-free in smb2_cached_lease_break+0x27/0xb0
Read of size 8 at addr ffff88811cc24c10 by task kworker/3:1/65

CPU: 3 UID: 0 PID: 65 Comm: kworker/3:1 Not tainted 6.12.0-rc6-g255cf264e6e5-dirty #87
Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020
Workqueue: cifsiod smb2_cached_lease_break
Call Trace:
 <TASK>
 dump_stack_lvl+0x77/0xb0
 print_report+0xce/0x660
 kasan_report+0xd3/0x110
 smb2_cached_lease_break+0x27/0xb0
 process_one_work+0x50a/0xc50
 worker_thread+0x2ba/0x530
 kthread+0x17c/0x1c0
 ret_from_fork+0x34/0x60
 ret_from_fork_asm+0x1a/0x30
 </TASK>

Allocated by task 2464:
 kasan_save_stack+0x33/0x60
 kasan_save_track+0x14/0x30
 __kasan_kmalloc+0xaa/0xb0
 open_cached_dir+0xa7d/0x1fb0
 smb2_query_path_info+0x43c/0x6e0
 cifs_get_fattr+0x346/0xf10
 cifs_get_inode_info+0x157/0x210
 cifs_revalidate_dentry_attr+0x2d1/0x460
 cifs_getattr+0x173/0x470
 vfs_statx_path+0x10f/0x160
 vfs_statx+0xe9/0x150
 vfs_fstatat+0x5e/0xc0
 __do_sys_newfstatat+0x91/0xf0
 do_syscall_64+0x95/0x1a0
 entry_SYSCALL_64_after_hwframe+0x76/0x7e

Freed by task 2464:
 kasan_save_stack+0x33/0x60
 kasan_save_track+0x14/0x30
 kasan_save_free_info+0x3b/0x60
 __kasan_slab_free+0x51/0x70
 kfree+0x174/0x520
 open_cached_dir+0x97f/0x1fb0
 smb2_query_path_info+0x43c/0x6e0
 cifs_get_fattr+0x346/0xf10
 cifs_get_inode_info+0x157/0x210
 cifs_revalidate_dentry_attr+0x2d1/0x460
 cifs_getattr+0x173/0x470
 vfs_statx_path+0x10f/0x160
 vfs_statx+0xe9/0x150
 vfs_fstatat+0x5e/0xc0
 __do_sys_newfstatat+0x91/0xf0
 do_syscall_64+0x95/0x1a0
 entry_SYSCALL_64_after_hwframe+0x76/0x7e

Last potentially related work creation:
 kasan_save_stack+0x33/0x60
 __kasan_record_aux_stack+0xad/0xc0
 insert_work+0x32/0x100
 __queue_work+0x5c9/0x870
 queue_work_on+0x82/0x90
 open_cached_dir+0x1369/0x1fb0
 smb2_query_path_info+0x43c/0x6e0
 cifs_get_fattr+0x346/0xf10
 cifs_get_inode_info+0x157/0x210
 cifs_revalidate_dentry_attr+0x2d1/0x460
 cifs_getattr+0x173/0x470
 vfs_statx_path+0x10f/0x160
 vfs_statx+0xe9/0x150
 vfs_fstatat+0x5e/0xc0
 __do_sys_newfstatat+0x91/0xf0
 do_syscall_64+0x95/0x1a0
 entry_SYSCALL_64_after_hwframe+0x76/0x7e

The buggy address belongs to the object at ffff88811cc24c00
 which belongs to the cache kmalloc-1k of size 1024
The buggy address is located 16 bytes inside of
 freed 1024-byte region [ffff88811cc24c00, ffff88811cc25000)

Cc: stable@vger.kernel.org
Signed-off-by: Paul Aurich <paul@darkrain42.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
mj22226 pushed a commit to mj22226/linux that referenced this pull request Dec 6, 2024
commit a9685b4 upstream.

If open_cached_dir() encounters an error parsing the lease from the
server, the error handling may race with receiving a lease break,
resulting in open_cached_dir() freeing the cfid while the queued work is
pending.

Update open_cached_dir() to drop refs rather than directly freeing the
cfid.

Have cached_dir_lease_break(), cfids_laundromat_worker(), and
invalidate_all_cached_dirs() clear has_lease immediately while still
holding cfids->cfid_list_lock, and then use this to also simplify the
reference counting in cfids_laundromat_worker() and
invalidate_all_cached_dirs().

Fixes this KASAN splat (reproduced by manually injecting an error and a
lease break in open_cached_dir()):

==================================================================
BUG: KASAN: slab-use-after-free in smb2_cached_lease_break+0x27/0xb0
Read of size 8 at addr ffff88811cc24c10 by task kworker/3:1/65

CPU: 3 UID: 0 PID: 65 Comm: kworker/3:1 Not tainted 6.12.0-rc6-g255cf264e6e5-dirty #87
Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020
Workqueue: cifsiod smb2_cached_lease_break
Call Trace:
 <TASK>
 dump_stack_lvl+0x77/0xb0
 print_report+0xce/0x660
 kasan_report+0xd3/0x110
 smb2_cached_lease_break+0x27/0xb0
 process_one_work+0x50a/0xc50
 worker_thread+0x2ba/0x530
 kthread+0x17c/0x1c0
 ret_from_fork+0x34/0x60
 ret_from_fork_asm+0x1a/0x30
 </TASK>

Allocated by task 2464:
 kasan_save_stack+0x33/0x60
 kasan_save_track+0x14/0x30
 __kasan_kmalloc+0xaa/0xb0
 open_cached_dir+0xa7d/0x1fb0
 smb2_query_path_info+0x43c/0x6e0
 cifs_get_fattr+0x346/0xf10
 cifs_get_inode_info+0x157/0x210
 cifs_revalidate_dentry_attr+0x2d1/0x460
 cifs_getattr+0x173/0x470
 vfs_statx_path+0x10f/0x160
 vfs_statx+0xe9/0x150
 vfs_fstatat+0x5e/0xc0
 __do_sys_newfstatat+0x91/0xf0
 do_syscall_64+0x95/0x1a0
 entry_SYSCALL_64_after_hwframe+0x76/0x7e

Freed by task 2464:
 kasan_save_stack+0x33/0x60
 kasan_save_track+0x14/0x30
 kasan_save_free_info+0x3b/0x60
 __kasan_slab_free+0x51/0x70
 kfree+0x174/0x520
 open_cached_dir+0x97f/0x1fb0
 smb2_query_path_info+0x43c/0x6e0
 cifs_get_fattr+0x346/0xf10
 cifs_get_inode_info+0x157/0x210
 cifs_revalidate_dentry_attr+0x2d1/0x460
 cifs_getattr+0x173/0x470
 vfs_statx_path+0x10f/0x160
 vfs_statx+0xe9/0x150
 vfs_fstatat+0x5e/0xc0
 __do_sys_newfstatat+0x91/0xf0
 do_syscall_64+0x95/0x1a0
 entry_SYSCALL_64_after_hwframe+0x76/0x7e

Last potentially related work creation:
 kasan_save_stack+0x33/0x60
 __kasan_record_aux_stack+0xad/0xc0
 insert_work+0x32/0x100
 __queue_work+0x5c9/0x870
 queue_work_on+0x82/0x90
 open_cached_dir+0x1369/0x1fb0
 smb2_query_path_info+0x43c/0x6e0
 cifs_get_fattr+0x346/0xf10
 cifs_get_inode_info+0x157/0x210
 cifs_revalidate_dentry_attr+0x2d1/0x460
 cifs_getattr+0x173/0x470
 vfs_statx_path+0x10f/0x160
 vfs_statx+0xe9/0x150
 vfs_fstatat+0x5e/0xc0
 __do_sys_newfstatat+0x91/0xf0
 do_syscall_64+0x95/0x1a0
 entry_SYSCALL_64_after_hwframe+0x76/0x7e

The buggy address belongs to the object at ffff88811cc24c00
 which belongs to the cache kmalloc-1k of size 1024
The buggy address is located 16 bytes inside of
 freed 1024-byte region [ffff88811cc24c00, ffff88811cc25000)

Cc: stable@vger.kernel.org
Signed-off-by: Paul Aurich <paul@darkrain42.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this pull request Dec 8, 2024
[BUG]
With CONFIG_DEBUG_VM set, test case generic/476 has some chance to crash
with the following VM_BUG_ON_FOLIO():

 BTRFS error (device dm-3): cow_file_range failed, start 1146880 end 1253375 len 106496 ret -28
 BTRFS error (device dm-3): run_delalloc_nocow failed, start 1146880 end 1253375 len 106496 ret -28
 page: refcount:4 mapcount:0 mapping:00000000592787cc index:0x12 pfn:0x10664
 aops:btrfs_aops [btrfs] ino:101 dentry name(?):"f1774"
 flags: 0x2fffff80004028(uptodate|lru|private|node=0|zone=2|lastcpupid=0xfffff)
 page dumped because: VM_BUG_ON_FOLIO(!folio_test_locked(folio))
 ------------[ cut here ]------------
 kernel BUG at mm/page-writeback.c:2992!
 Internal error: Oops - BUG: 00000000f2000800 [#1] SMP
 CPU: 2 UID: 0 PID: 3943513 Comm: kworker/u24:15 Tainted: G           OE      6.12.0-rc7-custom+ #87
 Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
 Hardware name: QEMU KVM Virtual Machine, BIOS unknown 2/2/2022
 Workqueue: events_unbound btrfs_async_reclaim_data_space [btrfs]
 pc : folio_clear_dirty_for_io+0x128/0x258
 lr : folio_clear_dirty_for_io+0x128/0x258
 Call trace:
  folio_clear_dirty_for_io+0x128/0x258
  btrfs_folio_clamp_clear_dirty+0x80/0xd0 [btrfs]
  __process_folios_contig+0x154/0x268 [btrfs]
  extent_clear_unlock_delalloc+0x5c/0x80 [btrfs]
  run_delalloc_nocow+0x5f8/0x760 [btrfs]
  btrfs_run_delalloc_range+0xa8/0x220 [btrfs]
  writepage_delalloc+0x230/0x4c8 [btrfs]
  extent_writepage+0xb8/0x358 [btrfs]
  extent_write_cache_pages+0x21c/0x4e8 [btrfs]
  btrfs_writepages+0x94/0x150 [btrfs]
  do_writepages+0x74/0x190
  filemap_fdatawrite_wbc+0x88/0xc8
  start_delalloc_inodes+0x178/0x3a8 [btrfs]
  btrfs_start_delalloc_roots+0x174/0x280 [btrfs]
  shrink_delalloc+0x114/0x280 [btrfs]
  flush_space+0x250/0x2f8 [btrfs]
  btrfs_async_reclaim_data_space+0x180/0x228 [btrfs]
  process_one_work+0x164/0x408
  worker_thread+0x25c/0x388
  kthread+0x100/0x118
  ret_from_fork+0x10/0x20
 Code: 910a8021 a90363f7 a9046bf9 94012379 (d4210000)
 ---[ end trace 0000000000000000 ]---

[CAUSE]
The first two lines of extra debug messages show the problem is caused
by the error handling of run_delalloc_nocow().

E.g. we have the following dirtied range (4K block size, 4K page size):

    0                 16K                  32K
    |//////////////////////////////////////|
    |  Pre-allocated  |

And the range [0, 16K) has a preallocated extent.

- Enter run_delalloc_nocow() for range [0, 16K)
  It finds that range [0, 16K) is preallocated, so it can do a proper
  NOCOW write.

- Enter fallback_to_cow() for range [16K, 32K)
  Since the range [16K, 32K) is not backed by a preallocated extent, we
  have to fall back to COW.

- cow_file_range() fails for range [16K, 32K)
  So cow_file_range() does the cleanup by clearing the folio dirty flags
  and unlocking the folios.

  Now the folios in range [16K, 32K) are unlocked.

- Enter extent_clear_unlock_delalloc() from run_delalloc_nocow()
  It is called with PAGE_START_WRITEBACK to start page writeback.
  But a folio can only be marked writeback while it is locked, so this
  triggers the VM_BUG_ON_FOLIO().

Furthermore, there is another hidden bug: run_delalloc_nocow() does not
clear the folio dirty flags in its error handling path.
This bug is shared between run_delalloc_nocow() and cow_file_range().

[FIX]
- Clear folio dirty for range [@start, @cur_offset)
  Introduce a helper, cleanup_dirty_folios(), which finds and locks
  each folio in the range, clears the dirty flag and starts/ends the
  writeback, with extra handling for @locked_folio.

- Introduce a helper to record the end of the last failed COW range
  This tracks which range we should skip, to avoid double unlocking.

- Skip the failed COW range for the error handling

Cc: stable@vger.kernel.org
Signed-off-by: Qu Wenruo <wqu@suse.com>
klarasm pushed a commit to klarasm/linux that referenced this pull request Dec 9, 2024
adam900710 added a commit to adam900710/linux that referenced this pull request Dec 11, 2024
adam900710 added a commit to adam900710/linux that referenced this pull request Dec 11, 2024
adam900710 added a commit to adam900710/linux that referenced this pull request Dec 11, 2024
adam900710 added a commit to adam900710/linux that referenced this pull request Dec 11, 2024
adam900710 added a commit to adam900710/linux that referenced this pull request Dec 12, 2024
adam900710 added a commit to adam900710/linux that referenced this pull request Dec 12, 2024
[BUG]
With CONFIG_DEBUG_VM set, test case generic/476 has some chance to crash
with the following VM_BUG_ON_FOLIO():

 BTRFS error (device dm-3): cow_file_range failed, start 1146880 end 1253375 len 106496 ret -28
 BTRFS error (device dm-3): run_delalloc_nocow failed, start 1146880 end 1253375 len 106496 ret -28
 page: refcount:4 mapcount:0 mapping:00000000592787cc index:0x12 pfn:0x10664
 aops:btrfs_aops [btrfs] ino:101 dentry name(?):"f1774"
 flags: 0x2fffff80004028(uptodate|lru|private|node=0|zone=2|lastcpupid=0xfffff)
 page dumped because: VM_BUG_ON_FOLIO(!folio_test_locked(folio))
 ------------[ cut here ]------------
 kernel BUG at mm/page-writeback.c:2992!
 Internal error: Oops - BUG: 00000000f2000800 [#1] SMP
 CPU: 2 UID: 0 PID: 3943513 Comm: kworker/u24:15 Tainted: G           OE      6.12.0-rc7-custom+ #87
 Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
 Hardware name: QEMU KVM Virtual Machine, BIOS unknown 2/2/2022
 Workqueue: events_unbound btrfs_async_reclaim_data_space [btrfs]
 pc : folio_clear_dirty_for_io+0x128/0x258
 lr : folio_clear_dirty_for_io+0x128/0x258
 Call trace:
  folio_clear_dirty_for_io+0x128/0x258
  btrfs_folio_clamp_clear_dirty+0x80/0xd0 [btrfs]
  __process_folios_contig+0x154/0x268 [btrfs]
  extent_clear_unlock_delalloc+0x5c/0x80 [btrfs]
  run_delalloc_nocow+0x5f8/0x760 [btrfs]
  btrfs_run_delalloc_range+0xa8/0x220 [btrfs]
  writepage_delalloc+0x230/0x4c8 [btrfs]
  extent_writepage+0xb8/0x358 [btrfs]
  extent_write_cache_pages+0x21c/0x4e8 [btrfs]
  btrfs_writepages+0x94/0x150 [btrfs]
  do_writepages+0x74/0x190
  filemap_fdatawrite_wbc+0x88/0xc8
  start_delalloc_inodes+0x178/0x3a8 [btrfs]
  btrfs_start_delalloc_roots+0x174/0x280 [btrfs]
  shrink_delalloc+0x114/0x280 [btrfs]
  flush_space+0x250/0x2f8 [btrfs]
  btrfs_async_reclaim_data_space+0x180/0x228 [btrfs]
  process_one_work+0x164/0x408
  worker_thread+0x25c/0x388
  kthread+0x100/0x118
  ret_from_fork+0x10/0x20
 Code: 910a8021 a90363f7 a9046bf9 94012379 (d4210000)
 ---[ end trace 0000000000000000 ]---

[CAUSE]
The first two lines of extra debug messages show the problem is caused
by the error handling of run_delalloc_nocow().

E.g. we have the following dirtied range (4K block size, 4K page size):

    0                 16K                  32K
    |//////////////////////////////////////|
    |  Pre-allocated  |

The range [0, 16K) has a preallocated extent.

- Enter run_delalloc_nocow() for range [0, 16K)
  It finds that range [0, 16K) is preallocated, so it can do a proper
  NOCOW write.

- Enter fallback_to_cow() for range [16K, 32K)
  Since the range [16K, 32K) is not backed by a preallocated extent, we
  have to go COW.

- cow_file_range() fails for range [16K, 32K)
  So cow_file_range() does the cleanup by clearing the folio dirty flags
  and unlocking the folios.

  Now the folios in range [16K, 32K) are unlocked.

- Enter extent_clear_unlock_delalloc() from run_delalloc_nocow()
  It is called with PAGE_START_WRITEBACK to start page writeback.
  But folios can only be marked writeback when they are properly locked,
  thus this triggers the VM_BUG_ON_FOLIO().
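
The sequence above can be replayed with a small userspace model. This is
only a sketch, not kernel code: struct toy_folio and the toy_*() helpers are
hypothetical names made up for illustration, and the model assumes the outer
error path walks a range that includes the folios the failed COW fallback
already unlocked.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a folio: only the flags relevant to this walk-through. */
struct toy_folio {
	bool locked;
	bool dirty;
};

/* Models a callee's error cleanup: clear dirty and unlock folios [first, last). */
static void toy_cleanup_and_unlock(struct toy_folio *f, size_t first, size_t last)
{
	assert(first <= last);
	for (size_t i = first; i < last; i++) {
		f[i].dirty = false;
		f[i].locked = false;
	}
}

/*
 * Models the caller's error path walking [first, last) while assuming every
 * folio is still locked. Returns the index of the first unlocked folio (the
 * point where a locked-folio assertion would fire), or -1 if there is none.
 */
static long toy_first_lock_violation(const struct toy_folio *f, size_t first, size_t last)
{
	assert(first <= last);
	for (size_t i = first; i < last; i++) {
		if (!f[i].locked)
			return (long)i;
	}
	return -1;
}

/* Replays the 32K example with 4K folios: 0..3 are NOCOW, 4..7 fall back to COW. */
static long toy_replay_example(void)
{
	struct toy_folio folios[8];

	for (size_t i = 0; i < 8; i++) {
		folios[i].locked = true;
		folios[i].dirty = true;
	}
	/* The COW fallback fails for [16K, 32K) and cleans up after itself. */
	toy_cleanup_and_unlock(folios, 4, 8);
	/* The outer error path then walks the whole range [0, 32K). */
	return toy_first_lock_violation(folios, 0, 8);
}
```

In this model the first violation is reported at the folio that starts at
16K, i.e. the first one the failed COW fallback had already unlocked.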

Furthermore, there is another hidden bug: run_delalloc_nocow() does not
clear the folio dirty flags in its error handling path.
This bug is shared between run_delalloc_nocow() and cow_file_range().

[FIX]
- Clear folio dirty for range [@start, @cur_offset)
  Introduce a helper, cleanup_dirty_folios(), which finds and locks
  the folios in the range, clears the dirty flag and starts/ends the
  writeback, with extra handling for @locked_folio.

- Introduce a helper to record the end of the last failed COW range
  This tracks which range we should skip, to avoid double unlocking.

- Skip the failed COW range in the error handling
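
One way to picture how the three points fit together is the range
arithmetic below. It is a userspace sketch under assumptions, not the code
from the patch: struct toy_range, struct toy_cleanup_plan and
toy_plan_cleanup() are hypothetical names, ranges are half-open byte ranges,
and a failed_cow_end of 0 is assumed to mean that no failed COW range was
recorded.

```c
#include <assert.h>
#include <stdint.h>

/* Half-open byte range [start, end); empty when start == end. */
struct toy_range {
	uint64_t start;
	uint64_t end;
};

struct toy_cleanup_plan {
	struct toy_range lock_and_clear; /* [start, cur_offset): find, lock, clear dirty */
	struct toy_range remaining;      /* untouched tail after any failed COW range */
};

/*
 * Split the error range [start, end) into the part that needs the
 * lock-then-clear treatment and the part that is still untouched, skipping
 * the failed COW range [cur_offset, failed_cow_end), which the COW path has
 * already cleaned up and unlocked.
 */
static struct toy_cleanup_plan toy_plan_cleanup(uint64_t start, uint64_t cur_offset,
						uint64_t end, uint64_t failed_cow_end)
{
	struct toy_cleanup_plan plan;
	uint64_t resume = cur_offset;

	assert(start <= cur_offset && cur_offset <= end);

	/* Resume after the failed COW range so its folios are not unlocked twice. */
	if (failed_cow_end > resume)
		resume = failed_cow_end;
	if (resume > end)
		resume = end;

	plan.lock_and_clear.start = start;
	plan.lock_and_clear.end = cur_offset;
	plan.remaining.start = resume;
	plan.remaining.end = end;
	return plan;
}
```

For the example in [CAUSE], toy_plan_cleanup(0, 16K, 32K, 32K) yields
[0, 16K) for the lock-then-clear step and an empty remaining range, so the
folios in [16K, 32K) are never touched a second time.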

Cc: stable@vger.kernel.org
Signed-off-by: Qu Wenruo <wqu@suse.com>
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this pull request Dec 12, 2024
kdave pushed a commit to kdave/btrfs-devel that referenced this pull request Dec 13, 2024
Cc: stable@vger.kernel.org
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
kdave pushed a commit to kdave/btrfs-devel that referenced this pull request Dec 13, 2024
kdave pushed a commit to kdave/btrfs-devel that referenced this pull request Dec 18, 2024
kdave pushed a commit to kdave/btrfs-devel that referenced this pull request Dec 23, 2024
kdave pushed a commit to kdave/btrfs-devel that referenced this pull request Dec 23, 2024
kdave pushed a commit to kdave/btrfs-devel that referenced this pull request Dec 23, 2024