Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Odroidxu4 v4.14 #2

Merged
merged 3 commits into from
Oct 15, 2017

Conversation

tobetter
Copy link

As per your request...

Signed-off-by: Dongjin Kim <tobetter@gmail.com>
This driver helps to register the device of GPIO based IR receiver, "gpio-ir-recv",
with the gpio number and pulse trigger when driver is loading. For example,

	# modprobe gpio-ir-recv
	# modprobe gpioplug-ir-recv gpio_nr=24 active_low=1

Change-Id: Ia0cf06140f51eb3f43d89dfe68f442d3e1f1a57e
Signed-off-by: Dongjin Kim <tobetter@gmail.com>
Change-Id: I65e9f38395dddfbe14daf7300a34a955453b5cb4
Signed-off-by: Dongjin Kim <tobetter@gmail.com>
@mihailescu2m mihailescu2m merged this pull request into mihailescu2m:odroidxu4-4.14.y Oct 15, 2017
mihailescu2m pushed a commit that referenced this pull request Oct 16, 2017
…p state

Driver calls request_firmware() whenever the device is opened for the
first time. As the device gets opened and closed, dev->num_inst == 1
is true several times. This is not necessary since the firmware is saved
in the fw_buf. s5p_mfc_load_firmware() copies the buffer returned by
the request_firmware() to dev->fw_buf.

fw_buf sticks around until it gets released from s5p_mfc_remove(), hence
there is no need to keep requesting firmware and copying it to fw_buf.

This might have been overlooked when changes are made to free fw_buf from
the device release interface s5p_mfc_release().

Fix s5p_mfc_load_firmware() to call request_firmware() once and keep state.
Change _probe() to load firmware once fw_buf has been allocated.

s5p_mfc_open() and it continues to call s5p_mfc_load_firmware() and init
hardware which is the step where firmware is written to the device.

This addresses the mfc_mutex contention due to repeated request_firmware()
calls from open() in the following circular locking warning:

[  552.194115] qtdemux0:sink/2710 is trying to acquire lock:
[  552.199488]  (&dev->mfc_mutex){+.+.}, at: [<bf145544>] s5p_mfc_mmap+0x28/0xd4 [s5p_mfc]
[  552.207459]
               but task is already holding lock:
[  552.213264]  (&mm->mmap_sem){++++}, at: [<c01df2e4>] vm_mmap_pgoff+0x44/0xb8
[  552.220284]
               which lock already depends on the new lock.

[  552.228429]
               the existing dependency chain (in reverse order) is:
[  552.235881]
               -> #2 (&mm->mmap_sem){++++}:
[  552.241259]        __might_fault+0x80/0xb0
[  552.245331]        filldir64+0xc0/0x2f8
[  552.249144]        call_filldir+0xb0/0x14c
[  552.253214]        ext4_readdir+0x768/0x90c
[  552.257374]        iterate_dir+0x74/0x168
[  552.261360]        SyS_getdents64+0x7c/0x1a0
[  552.265608]        ret_fast_syscall+0x0/0x28
[  552.269850]
               -> #1 (&type->i_mutex_dir_key#2){++++}:
[  552.276180]        down_read+0x48/0x90
[  552.279904]        lookup_slow+0x74/0x178
[  552.283889]        walk_component+0x1a4/0x2e4
[  552.288222]        link_path_walk+0x174/0x4a0
[  552.292555]        path_openat+0x68/0x944
[  552.296541]        do_filp_open+0x60/0xc4
[  552.300528]        file_open_name+0xe4/0x114
[  552.304772]        filp_open+0x28/0x48
[  552.308499]        kernel_read_file_from_path+0x30/0x78
[  552.313700]        _request_firmware+0x3ec/0x78c
[  552.318291]        request_firmware+0x3c/0x54
[  552.322642]        s5p_mfc_load_firmware+0x54/0x150 [s5p_mfc]
[  552.328358]        s5p_mfc_open+0x4e4/0x550 [s5p_mfc]
[  552.333394]        v4l2_open+0xa0/0x104 [videodev]
[  552.338137]        chrdev_open+0xa4/0x18c
[  552.342121]        do_dentry_open+0x208/0x310
[  552.346454]        path_openat+0x28c/0x944
[  552.350526]        do_filp_open+0x60/0xc4
[  552.354512]        do_sys_open+0x118/0x1c8
[  552.358586]        ret_fast_syscall+0x0/0x28
[  552.362830]
               -> #0 (&dev->mfc_mutex){+.+.}:
               -> #0 (&dev->mfc_mutex){+.+.}:
[  552.368379]        lock_acquire+0x6c/0x88
[  552.372364]        __mutex_lock+0x68/0xa34
[  552.376437]        mutex_lock_interruptible_nested+0x1c/0x24
[  552.382086]        s5p_mfc_mmap+0x28/0xd4 [s5p_mfc]
[  552.386939]        v4l2_mmap+0x54/0x88 [videodev]
[  552.391601]        mmap_region+0x3a8/0x638
[  552.395673]        do_mmap+0x330/0x3a4
[  552.399400]        vm_mmap_pgoff+0x90/0xb8
[  552.403472]        SyS_mmap_pgoff+0x90/0xc0
[  552.407632]        ret_fast_syscall+0x0/0x28
[  552.411876]
               other info that might help us debug this:

[  552.419848] Chain exists of:
                 &dev->mfc_mutex --> &type->i_mutex_dir_key#2 --> &mm->mmap_sem

[  552.431200]  Possible unsafe locking scenario:

[  552.437092]        CPU0                    CPU1
[  552.441598]        ----                    ----
[  552.446104]   lock(&mm->mmap_sem);
[  552.449484]                                lock(&type->i_mutex_dir_key#2);
[  552.456329]                                lock(&mm->mmap_sem);
[  552.462222]   lock(&dev->mfc_mutex);
[  552.465775]
                *** DEADLOCK ***

Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
Signed-off-by: memeka <mihailescu2m@gmail.com>
mihailescu2m pushed a commit that referenced this pull request Oct 24, 2017
Thomas reported that 'perf buildid-list' gets a SEGFAULT due to NULL
pointer deref when he ran it on a data with namespace events.  It was
because the buildid_id__mark_dso_hit_ops lacks the namespace event
handler and perf_too__fill_default() didn't set it.

  Program received signal SIGSEGV, Segmentation fault.
  0x0000000000000000 in ?? ()
  Missing separate debuginfos, use: dnf debuginfo-install audit-libs-2.7.7-1.fc25.s390x bzip2-libs-1.0.6-21.fc25.s390x elfutils-libelf-0.169-1.fc25.s390x
  +elfutils-libs-0.169-1.fc25.s390x libcap-ng-0.7.8-1.fc25.s390x numactl-libs-2.0.11-2.ibm.fc25.s390x openssl-libs-1.1.0e-1.1.ibm.fc25.s390x perl-libs-5.24.1-386.fc25.s390x
  +python-libs-2.7.13-2.fc25.s390x slang-2.3.0-7.fc25.s390x xz-libs-5.2.3-2.fc25.s390x zlib-1.2.8-10.fc25.s390x
  (gdb) where
  #0  0x0000000000000000 in ?? ()
  #1  0x00000000010fad6a in machines__deliver_event (machines=<optimized out>, machines@entry=0x2c6fd18,
      evlist=<optimized out>, event=event@entry=0x3fffdf00470, sample=0x3ffffffe880, sample@entry=0x3ffffffe888,
      tool=tool@entry=0x1312968 <build_id.mark_dso_hit_ops>, file_offset=1136) at util/session.c:1287
  #2  0x00000000010fbf4e in perf_session__deliver_event (file_offset=1136, tool=0x1312968 <build_id.mark_dso_hit_ops>,
      sample=0x3ffffffe888, event=0x3fffdf00470, session=0x2c6fc30) at util/session.c:1340
  #3  perf_session__process_event (session=0x2c6fc30, session@entry=0x0, event=event@entry=0x3fffdf00470,
      file_offset=file_offset@entry=1136) at util/session.c:1522
  #4  0x00000000010fddde in __perf_session__process_events (file_size=11880, data_size=<optimized out>,
      data_offset=<optimized out>, session=0x0) at util/session.c:1899
  #5  perf_session__process_events (session=0x0, session@entry=0x2c6fc30) at util/session.c:1953
  #6  0x000000000103b2ac in perf_session__list_build_ids (with_hits=<optimized out>, force=<optimized out>)
      at builtin-buildid-list.c:83
  #7  cmd_buildid_list (argc=<optimized out>, argv=<optimized out>) at builtin-buildid-list.c:115
  #8  0x00000000010a026c in run_builtin (p=0x1311f78 <commands+24>, argc=argc@entry=2, argv=argv@entry=0x3fffffff3c0)
      at perf.c:296
  #9  0x000000000102bc00 in handle_internal_command (argv=<optimized out>, argc=2) at perf.c:348
  hardkernel#10 run_argv (argcp=<synthetic pointer>, argv=<synthetic pointer>) at perf.c:392
  hardkernel#11 main (argc=<optimized out>, argv=0x3fffffff3c0) at perf.c:536
  (gdb)

Fix it by adding a stub event handler for namespace event.

Committer testing:

Further clarifying, plain using 'perf buildid-list' will not end up in a
SEGFAULT when processing a perf.data file with namespace info:

  # perf record -a --namespaces sleep 1
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 2.024 MB perf.data (1058 samples) ]
  # perf buildid-list | wc -l
  38
  # perf buildid-list | head -5
  e2a171c7b905826fc8494f0711ba76ab6abbd604 /lib/modules/4.14.0-rc3+/build/vmlinux
  874840a02d8f8a31cedd605d0b8653145472ced3 /lib/modules/4.14.0-rc3+/kernel/arch/x86/kvm/kvm-intel.ko
  ea7223776730cd8a22f320040aae4d54312984bc /lib/modules/4.14.0-rc3+/kernel/drivers/gpu/drm/i915/i915.ko
  5961535e6732a8edb7f22b3f148bb2fa2e0be4b9 /lib/modules/4.14.0-rc3+/kernel/drivers/gpu/drm/drm.ko
  f045f54aa78cf1931cc893f78b6cbc52c72a8cb1 /usr/lib64/libc-2.25.so
  #

It is only when one asks for checking what of those entries actually had
samples, i.e. when we use either -H or --with-hits, that we will process
all the PERF_RECORD_ events, and since tools/perf/builtin-buildid-list.c
neither explicitely set a perf_tool.namespaces() callback nor the
default stub was set that we end up, when processing a
PERF_RECORD_NAMESPACE record, causing a SEGFAULT:

  # perf buildid-list -H
  Segmentation fault (core dumped)
  ^C
  #

Reported-and-Tested-by: Thomas-Mich Richter <tmricht@linux.vnet.ibm.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Hari Bathini <hbathini@linux.vnet.ibm.com>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas-Mich Richter <tmricht@linux.vnet.ibm.com>
Fixes: f3b3614 ("perf tools: Add PERF_RECORD_NAMESPACES to include namespaces related info")
Link: http://lkml.kernel.org/r/20171017132900.11043-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
mihailescu2m pushed a commit that referenced this pull request Oct 24, 2017
When an EMAD is transmitted, a timeout work item is scheduled with a
delay of 200ms, so that another EMAD will be retried until a maximum of
five retries.

In certain situations, it's possible for the function waiting on the
EMAD to be associated with a work item that is queued on the same
workqueue (`mlxsw_core`) as the timeout work item. This results in
flushing a work item on the same workqueue.

According to commit e159489 ("workqueue: relax lockdep annotation
on flush_work()") the above may lead to a deadlock in case the workqueue
has only one worker active or if the system in under memory pressure and
the rescue worker is in use. The latter explains the very rare and
random nature of the lockdep splats we have been seeing:

[   52.730240] ============================================
[   52.736179] WARNING: possible recursive locking detected
[   52.742119] 4.14.0-rc3jiri+ #4 Not tainted
[   52.746697] --------------------------------------------
[   52.752635] kworker/1:3/599 is trying to acquire lock:
[   52.758378]  (mlxsw_core_driver_name){+.+.}, at: [<ffffffff811c4fa4>] flush_work+0x3a4/0x5e0
[   52.767837]
               but task is already holding lock:
[   52.774360]  (mlxsw_core_driver_name){+.+.}, at: [<ffffffff811c65c4>] process_one_work+0x7d4/0x12f0
[   52.784495]
               other info that might help us debug this:
[   52.791794]  Possible unsafe locking scenario:
[   52.798413]        CPU0
[   52.801144]        ----
[   52.803875]   lock(mlxsw_core_driver_name);
[   52.808556]   lock(mlxsw_core_driver_name);
[   52.813236]
                *** DEADLOCK ***
[   52.819857]  May be due to missing lock nesting notation
[   52.827450] 3 locks held by kworker/1:3/599:
[   52.832221]  #0:  (mlxsw_core_driver_name){+.+.}, at: [<ffffffff811c65c4>] process_one_work+0x7d4/0x12f0
[   52.842846]  #1:  ((&(&bridge->fdb_notify.dw)->work)){+.+.}, at: [<ffffffff811c65c4>] process_one_work+0x7d4/0x12f0
[   52.854537]  #2:  (rtnl_mutex){+.+.}, at: [<ffffffff822ad8e7>] rtnl_lock+0x17/0x20
[   52.863021]
               stack backtrace:
[   52.867890] CPU: 1 PID: 599 Comm: kworker/1:3 Not tainted 4.14.0-rc3jiri+ #4
[   52.875773] Hardware name: Mellanox Technologies Ltd. "MSN2100-CB2F"/"SA001017", BIOS 5.6.5 06/07/2016
[   52.886267] Workqueue: mlxsw_core mlxsw_sp_fdb_notify_work [mlxsw_spectrum]
[   52.894060] Call Trace:
[   52.909122]  __lock_acquire+0xf6f/0x2a10
[   53.025412]  lock_acquire+0x158/0x440
[   53.047557]  flush_work+0x3c4/0x5e0
[   53.087571]  __cancel_work_timer+0x3ca/0x5e0
[   53.177051]  cancel_delayed_work_sync+0x13/0x20
[   53.182142]  mlxsw_reg_trans_bulk_wait+0x12d/0x7a0 [mlxsw_core]
[   53.194571]  mlxsw_core_reg_access+0x586/0x990 [mlxsw_core]
[   53.225365]  mlxsw_reg_query+0x10/0x20 [mlxsw_core]
[   53.230882]  mlxsw_sp_fdb_notify_work+0x2a3/0x9d0 [mlxsw_spectrum]
[   53.237801]  process_one_work+0x8f1/0x12f0
[   53.321804]  worker_thread+0x1fd/0x10c0
[   53.435158]  kthread+0x28e/0x370
[   53.448703]  ret_from_fork+0x2a/0x40
[   53.453017] mlxsw_spectrum 0000:01:00.0: EMAD retries (2/5) (tid=bf4549b100000774)
[   53.453119] mlxsw_spectrum 0000:01:00.0: EMAD retries (5/5) (tid=bf4549b100000770)
[   53.453132] mlxsw_spectrum 0000:01:00.0: EMAD reg access failed (tid=bf4549b100000770,reg_id=200b(sfn),type=query,status=0(operation performed))
[   53.453143] mlxsw_spectrum 0000:01:00.0: Failed to get FDB notifications

Fix this by creating another workqueue for EMAD timeouts, thereby
preventing the situation of a work item trying to flush a work item
queued on the same workqueue.

Fixes: caf7297 ("mlxsw: core: Introduce support for asynchronous EMAD register access")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reported-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
mihailescu2m pushed a commit that referenced this pull request Oct 31, 2017
v4.10 commit 6f2ce1c ("scsi: zfcp: fix rport unblock race with LUN
recovery") extended accessing parent pointer fields of struct
zfcp_erp_action for tracing.  If an erp_action has never been enqueued
before, these parent pointer fields are uninitialized and NULL. Examples
are zfcp objects freshly added to the parent object's children list,
before enqueueing their first recovery subsequently. In
zfcp_erp_try_rport_unblock(), we iterate such list. Accessing erp_action
fields can cause a NULL pointer dereference.  Since the kernel can read
from lowcore on s390, it does not immediately cause a kernel page
fault. Instead it can cause hangs on trying to acquire the wrong
erp_action->adapter->dbf->rec_lock in zfcp_dbf_rec_action_lvl()
                      ^bogus^
while holding already other locks with IRQs disabled.

Real life example from attaching lots of LUNs in parallel on many CPUs:

crash> bt 17723
PID: 17723  TASK: ...               CPU: 25  COMMAND: "zfcperp0.0.1800"
 LOWCORE INFO:
  -psw      : 0x0404300180000000 0x000000000038e424
  -function : _raw_spin_lock_wait_flags at 38e424
...
 #0 [fdde8fc90] zfcp_dbf_rec_action_lvl at 3e0004e9862 [zfcp]
 #1 [fdde8fce8] zfcp_erp_try_rport_unblock at 3e0004dfddc [zfcp]
 #2 [fdde8fd38] zfcp_erp_strategy at 3e0004e0234 [zfcp]
 #3 [fdde8fda8] zfcp_erp_thread at 3e0004e0a12 [zfcp]
 #4 [fdde8fe60] kthread at 173550
 #5 [fdde8feb8] kernel_thread_starter at 10add2

zfcp_adapter
 zfcp_port
  zfcp_unit <address>, 0x404040d600000000
  scsi_device NULL, returning early!
zfcp_scsi_dev.status = 0x40000000
0x40000000 ZFCP_STATUS_COMMON_RUNNING

crash> zfcp_unit <address>
struct zfcp_unit {
  erp_action = {
    adapter = 0x0,
    port = 0x0,
    unit = 0x0,
  },
}

zfcp_erp_action is always fully embedded into its container object. Such
container object is never moved in its object tree (only add or delete).
Hence, erp_action parent pointers can never change.

To fix the issue, initialize the erp_action parent pointers before
adding the erp_action container to any list and thus before it becomes
accessible from outside of its initializing function.

In order to also close the time window between zfcp_erp_setup_act()
memsetting the entire erp_action to zero and setting the parent pointers
again, drop the memset and instead explicitly initialize individually
all erp_action fields except for parent pointers. To be extra careful
not to introduce any other unintended side effect, even keep zeroing the
erp_action fields for list and timer. Also double-check with
WARN_ON_ONCE that erp_action parent pointers never change, so we get to
know when we would deviate from previous behavior.

Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
Fixes: 6f2ce1c ("scsi: zfcp: fix rport unblock race with LUN recovery")
Cc: <stable@vger.kernel.org> #2.6.32+
Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
mihailescu2m pushed a commit that referenced this pull request Nov 17, 2017
syzkaller with KASAN reported an out-of-bounds read in
asn1_ber_decoder().  It can be reproduced by the following command,
assuming CONFIG_X509_CERTIFICATE_PARSER=y and CONFIG_KASAN=y:

    keyctl add asymmetric desc $'\x30\x30' @s

The bug is that the length of an ASN.1 data value isn't validated in the
case where it is encoded using the short form, causing the decoder to
read past the end of the input buffer.  Fix it by validating the length.

The bug report was:

    BUG: KASAN: slab-out-of-bounds in asn1_ber_decoder+0x10cb/0x1730 lib/asn1_decoder.c:233
    Read of size 1 at addr ffff88003cccfa02 by task syz-executor0/6818

    CPU: 1 PID: 6818 Comm: syz-executor0 Not tainted 4.14.0-rc7-00008-g5f479447d983 #2
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
    Call Trace:
     __dump_stack lib/dump_stack.c:16 [inline]
     dump_stack+0xb3/0x10b lib/dump_stack.c:52
     print_address_description+0x79/0x2a0 mm/kasan/report.c:252
     kasan_report_error mm/kasan/report.c:351 [inline]
     kasan_report+0x236/0x340 mm/kasan/report.c:409
     __asan_report_load1_noabort+0x14/0x20 mm/kasan/report.c:427
     asn1_ber_decoder+0x10cb/0x1730 lib/asn1_decoder.c:233
     x509_cert_parse+0x1db/0x650 crypto/asymmetric_keys/x509_cert_parser.c:89
     x509_key_preparse+0x64/0x7a0 crypto/asymmetric_keys/x509_public_key.c:174
     asymmetric_key_preparse+0xcb/0x1a0 crypto/asymmetric_keys/asymmetric_type.c:388
     key_create_or_update+0x347/0xb20 security/keys/key.c:855
     SYSC_add_key security/keys/keyctl.c:122 [inline]
     SyS_add_key+0x1cd/0x340 security/keys/keyctl.c:62
     entry_SYSCALL_64_fastpath+0x1f/0xbe
    RIP: 0033:0x447c89
    RSP: 002b:00007fca7a5d3bd8 EFLAGS: 00000246 ORIG_RAX: 00000000000000f8
    RAX: ffffffffffffffda RBX: 00007fca7a5d46cc RCX: 0000000000447c89
    RDX: 0000000020006f4a RSI: 0000000020006000 RDI: 0000000020001ff5
    RBP: 0000000000000046 R08: fffffffffffffffd R09: 0000000000000000
    R10: 0000000000000002 R11: 0000000000000246 R12: 0000000000000000
    R13: 0000000000000000 R14: 00007fca7a5d49c0 R15: 00007fca7a5d4700

Fixes: 42d5ec2 ("X.509: Add an ASN.1 decoder")
Cc: <stable@vger.kernel.org> # v3.7+
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
mihailescu2m pushed a commit that referenced this pull request Nov 17, 2017
syzkaller reported a NULL pointer dereference in asn1_ber_decoder().  It
can be reproduced by the following command, assuming
CONFIG_PKCS7_TEST_KEY=y:

        keyctl add pkcs7_test desc '' @s

The bug is that if the data buffer is empty, an integer underflow occurs
in the following check:

        if (unlikely(dp >= datalen - 1))
                goto data_overrun_error;

This results in the NULL data pointer being dereferenced.

Fix it by checking for 'datalen - dp < 2' instead.

Also fix the similar check for 'dp >= datalen - n' later in the same
function.  That one possibly could result in a buffer overread.

The NULL pointer dereference was reproducible using the "pkcs7_test" key
type but not the "asymmetric" key type because the "asymmetric" key type
checks for a 0-length payload before calling into the ASN.1 decoder but
the "pkcs7_test" key type does not.

The bug report was:

    BUG: unable to handle kernel NULL pointer dereference at           (null)
    IP: asn1_ber_decoder+0x17f/0xe60 lib/asn1_decoder.c:233
    PGD 7b708067 P4D 7b708067 PUD 7b6ee067 PMD 0
    Oops: 0000 [#1] SMP
    Modules linked in:
    CPU: 0 PID: 522 Comm: syz-executor1 Not tainted 4.14.0-rc8 #7
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.3-20171021_125229-anatol 04/01/2014
    task: ffff9b6b3798c040 task.stack: ffff9b6b37970000
    RIP: 0010:asn1_ber_decoder+0x17f/0xe60 lib/asn1_decoder.c:233
    RSP: 0018:ffff9b6b37973c78 EFLAGS: 00010216
    RAX: 0000000000000000 RBX: 0000000000000000 RCX: 000000000000021c
    RDX: ffffffff814a04ed RSI: ffffb1524066e000 RDI: ffffffff910759e0
    RBP: ffff9b6b37973d60 R08: 0000000000000001 R09: ffff9b6b3caa4180
    R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000002
    R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
    FS:  00007f10ed1f2700(0000) GS:ffff9b6b3ea00000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 0000000000000000 CR3: 000000007b6f3000 CR4: 00000000000006f0
    Call Trace:
     pkcs7_parse_message+0xee/0x240 crypto/asymmetric_keys/pkcs7_parser.c:139
     verify_pkcs7_signature+0x33/0x180 certs/system_keyring.c:216
     pkcs7_preparse+0x41/0x70 crypto/asymmetric_keys/pkcs7_key_type.c:63
     key_create_or_update+0x180/0x530 security/keys/key.c:855
     SYSC_add_key security/keys/keyctl.c:122 [inline]
     SyS_add_key+0xbf/0x250 security/keys/keyctl.c:62
     entry_SYSCALL_64_fastpath+0x1f/0xbe
    RIP: 0033:0x4585c9
    RSP: 002b:00007f10ed1f1bd8 EFLAGS: 00000216 ORIG_RAX: 00000000000000f8
    RAX: ffffffffffffffda RBX: 00007f10ed1f2700 RCX: 00000000004585c9
    RDX: 0000000020000000 RSI: 0000000020008ffb RDI: 0000000020008000
    RBP: 0000000000000000 R08: ffffffffffffffff R09: 0000000000000000
    R10: 0000000000000000 R11: 0000000000000216 R12: 00007fff1b2260ae
    R13: 00007fff1b2260af R14: 00007f10ed1f2700 R15: 0000000000000000
    Code: dd ca ff 48 8b 45 88 48 83 e8 01 4c 39 f0 0f 86 a8 07 00 00 e8 53 dd ca ff 49 8d 46 01 48 89 85 58 ff ff ff 48 8b 85 60 ff ff ff <42> 0f b6 0c 30 89 c8 88 8d 75 ff ff ff 83 e0 1f 89 8d 28 ff ff
    RIP: asn1_ber_decoder+0x17f/0xe60 lib/asn1_decoder.c:233 RSP: ffff9b6b37973c78
    CR2: 0000000000000000

Fixes: 42d5ec2 ("X.509: Add an ASN.1 decoder")
Reported-by: syzbot <syzkaller@googlegroups.com>
Cc: <stable@vger.kernel.org> # v3.7+
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
mihailescu2m pushed a commit that referenced this pull request Nov 17, 2017
…kernel/git/jmorris/linux-security

Pull key handling fix from James Morris:
 "Fix by Eric Biggers for the keys subsystem"

* 'fixes-v4.14-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
  KEYS: fix NULL pointer dereference during ASN.1 parsing [ver #2]
mihailescu2m pushed a commit that referenced this pull request Jan 21, 2019
This patch modified test_btf pretty print test to cover
the bitfield with struct member equal to or greater 256.

Without the previous kernel patch fix, the modified test will fail:

  $ test_btf -p
  ......
  BTF pretty print array(#1)......unexpected pprint output
  expected: 0: {0,0,0,0x3,0x0,0x3,{0|[0,0,0,0,0,0,0,0]},ENUM_ZERO,4,0x1}
      read: 0: {0,0,0,0x3,0x0,0x3,{0|[0,0,0,0,0,0,0,0]},ENUM_ZERO,4,0x0}

  BTF pretty print array(#2)......unexpected pprint output
  expected: 0: {0,0,0,0x3,0x0,0x3,{0|[0,0,0,0,0,0,0,0]},ENUM_ZERO,4,0x1}
      read: 0: {0,0,0,0x3,0x0,0x3,{0|[0,0,0,0,0,0,0,0]},ENUM_ZERO,4,0x0}

  PASS:6 SKIP:0 FAIL:2

With the kernel fix, the modified test will succeed:
  $ test_btf -p
  ......
  BTF pretty print array(#1)......OK
  BTF pretty print array(#2)......OK
  PASS:8 SKIP:0 FAIL:0

Fixes: 9d5f9f7 ("bpf: btf: fix struct/union/fwd types with kind_flag")
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
mihailescu2m pushed a commit that referenced this pull request Jan 21, 2019
Yonghong Song says:

====================
The previous BTF kind_flag support patch set introduced a bug
for kernel bpffs pretty printing and another bug for bpftool
map pretty printing. If a bitfield struct member offset is
greater than 256 bits, printed value for that struct
member will be incorrect.

- Patch #1 fixed the bug in kernel bpffs pretty printing.
- Patch #2 enhanced the test_btf test case to cover the
           issue exposed by patch #1.
- Patch #3 fixed the bug in bpftool map pretty printing.
====================

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
mihailescu2m pushed a commit that referenced this pull request Jan 21, 2019
Taehee Yoo says:

====================
net: bpfilter: fix two bugs in bpfilter

This patches fix two bugs in the bpfilter_umh which are related in
iptables command.

The first patch adds an exit code for UMH process.
This provides an opportunity to cleanup members of the umh_info
to modules which use the UMH.
In order to identify UMH processes, a new flag PF_UMH is added.

The second patch makes the bpfilter_umh use UMH cleanup callback.

The third patch adds re-start routine for the bpfilter_umh.
The bpfilter_umh does not re-start after error occurred.
because there is no re-start routine in the module.

The fourth patch ensures that the bpfilter.ko module will not removed while
it's being used.
The bpfilter.ko is not protected by locks or module reference counter.
Therefore that can be removed while module is being used.
In order to protect that, mutex is used.

The first and second patch are preparation patches for the third and
fourth patch.

TEST #1
   while :
   do
	modprobe bpfilter
	kill -9 <pid of the bpfilter_umh>
	iptables -vnL
   done

TEST #2
   while :
   do
	iptables -I FORWARD -m string --string ap --algo kmp &
	iptables -F &
	modprobe -rv bpfilter &
   done

TEST #3
   while :
   do
	modprobe bpfilter &
	modprobe -rv bpfilter &
   done

The TEST1 makes a failure of iptables command.
This is fixed by the third patch.

The TEST2 makes a panic because of a race condition in the bpfilter_umh
module.
This is fixed by the fourth patch.

The TEST3 makes a double-create UMH process.
This is fixed by the third and fourth patch.

v4 :
 - declare the exit_umh() as static inline
 - check stop flag in the load_umh() to avoid a double-create UMH
v3 :
 - Avoid unnecessary list lookup for non-UMH processes
 - Add a new PF_UMH flag
v2 : add the first and second patch
v1 : Initial patch
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
mihailescu2m pushed a commit that referenced this pull request Jan 28, 2019
syzbot reports following splat:

BUG: KMSAN: uninit-value in strlen+0x3b/0xa0 lib/string.c:486
CPU: 1 PID: 11057 Comm: syz-executor0 Not tainted 4.20.0-rc7+ #2
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x173/0x1d0 lib/dump_stack.c:113
 kmsan_report+0x12e/0x2a0 mm/kmsan/kmsan.c:613
 __msan_warning+0x82/0xf0 mm/kmsan/kmsan_instr.c:295
 strlen+0x3b/0xa0 lib/string.c:486
 nla_put_string include/net/netlink.h:1154 [inline]
 tipc_nl_compat_link_reset_stats+0x1f0/0x360 net/tipc/netlink_compat.c:760
 __tipc_nl_compat_doit net/tipc/netlink_compat.c:311 [inline]
 tipc_nl_compat_doit+0x3aa/0xaf0 net/tipc/netlink_compat.c:344
 tipc_nl_compat_handle net/tipc/netlink_compat.c:1107 [inline]
 tipc_nl_compat_recv+0x14d7/0x2760 net/tipc/netlink_compat.c:1210
 genl_family_rcv_msg net/netlink/genetlink.c:601 [inline]
 genl_rcv_msg+0x185f/0x1a60 net/netlink/genetlink.c:626
 netlink_rcv_skb+0x444/0x640 net/netlink/af_netlink.c:2477
 genl_rcv+0x63/0x80 net/netlink/genetlink.c:637
 netlink_unicast_kernel net/netlink/af_netlink.c:1310 [inline]
 netlink_unicast+0xf40/0x1020 net/netlink/af_netlink.c:1336
 netlink_sendmsg+0x127f/0x1300 net/netlink/af_netlink.c:1917
 sock_sendmsg_nosec net/socket.c:621 [inline]
 sock_sendmsg net/socket.c:631 [inline]
 ___sys_sendmsg+0xdb9/0x11b0 net/socket.c:2116
 __sys_sendmsg net/socket.c:2154 [inline]
 __do_sys_sendmsg net/socket.c:2163 [inline]
 __se_sys_sendmsg+0x305/0x460 net/socket.c:2161
 __x64_sys_sendmsg+0x4a/0x70 net/socket.c:2161
 do_syscall_64+0xbc/0xf0 arch/x86/entry/common.c:291
 entry_SYSCALL_64_after_hwframe+0x63/0xe7
RIP: 0033:0x457ec9
Code: 6d b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 3b b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007f2557338c78 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000457ec9
RDX: 0000000000000000 RSI: 00000000200001c0 RDI: 0000000000000003
RBP: 000000000073bf00 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007f25573396d4
R13: 00000000004cb478 R14: 00000000004d86c8 R15: 00000000ffffffff

Uninit was created at:
 kmsan_save_stack_with_flags mm/kmsan/kmsan.c:204 [inline]
 kmsan_internal_poison_shadow+0x92/0x150 mm/kmsan/kmsan.c:158
 kmsan_kmalloc+0xa6/0x130 mm/kmsan/kmsan_hooks.c:176
 kmsan_slab_alloc+0xe/0x10 mm/kmsan/kmsan_hooks.c:185
 slab_post_alloc_hook mm/slab.h:446 [inline]
 slab_alloc_node mm/slub.c:2759 [inline]
 __kmalloc_node_track_caller+0xe18/0x1030 mm/slub.c:4383
 __kmalloc_reserve net/core/skbuff.c:137 [inline]
 __alloc_skb+0x309/0xa20 net/core/skbuff.c:205
 alloc_skb include/linux/skbuff.h:998 [inline]
 netlink_alloc_large_skb net/netlink/af_netlink.c:1182 [inline]
 netlink_sendmsg+0xb82/0x1300 net/netlink/af_netlink.c:1892
 sock_sendmsg_nosec net/socket.c:621 [inline]
 sock_sendmsg net/socket.c:631 [inline]
 ___sys_sendmsg+0xdb9/0x11b0 net/socket.c:2116
 __sys_sendmsg net/socket.c:2154 [inline]
 __do_sys_sendmsg net/socket.c:2163 [inline]
 __se_sys_sendmsg+0x305/0x460 net/socket.c:2161
 __x64_sys_sendmsg+0x4a/0x70 net/socket.c:2161
 do_syscall_64+0xbc/0xf0 arch/x86/entry/common.c:291
 entry_SYSCALL_64_after_hwframe+0x63/0xe7

The uninitialised access happened in tipc_nl_compat_link_reset_stats:
    nla_put_string(skb, TIPC_NLA_LINK_NAME, name)

This is because name string is not validated before it's used.

Reported-by: syzbot+e01d94b5a4c266be6e4c@syzkaller.appspotmail.com
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
mihailescu2m pushed a commit that referenced this pull request Jan 28, 2019
syzbot reports following splat:

BUG: KMSAN: uninit-value in strlen+0x3b/0xa0 lib/string.c:486
CPU: 1 PID: 9306 Comm: syz-executor172 Not tainted 4.20.0-rc7+ #2
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Call Trace:
  __dump_stack lib/dump_stack.c:77 [inline]
  dump_stack+0x173/0x1d0 lib/dump_stack.c:113
  kmsan_report+0x12e/0x2a0 mm/kmsan/kmsan.c:613
  __msan_warning+0x82/0xf0 mm/kmsan/kmsan_instr.c:313
  strlen+0x3b/0xa0 lib/string.c:486
  nla_put_string include/net/netlink.h:1154 [inline]
  __tipc_nl_compat_link_set net/tipc/netlink_compat.c:708 [inline]
  tipc_nl_compat_link_set+0x929/0x1220 net/tipc/netlink_compat.c:744
  __tipc_nl_compat_doit net/tipc/netlink_compat.c:311 [inline]
  tipc_nl_compat_doit+0x3aa/0xaf0 net/tipc/netlink_compat.c:344
  tipc_nl_compat_handle net/tipc/netlink_compat.c:1107 [inline]
  tipc_nl_compat_recv+0x14d7/0x2760 net/tipc/netlink_compat.c:1210
  genl_family_rcv_msg net/netlink/genetlink.c:601 [inline]
  genl_rcv_msg+0x185f/0x1a60 net/netlink/genetlink.c:626
  netlink_rcv_skb+0x444/0x640 net/netlink/af_netlink.c:2477
  genl_rcv+0x63/0x80 net/netlink/genetlink.c:637
  netlink_unicast_kernel net/netlink/af_netlink.c:1310 [inline]
  netlink_unicast+0xf40/0x1020 net/netlink/af_netlink.c:1336
  netlink_sendmsg+0x127f/0x1300 net/netlink/af_netlink.c:1917
  sock_sendmsg_nosec net/socket.c:621 [inline]
  sock_sendmsg net/socket.c:631 [inline]
  ___sys_sendmsg+0xdb9/0x11b0 net/socket.c:2116
  __sys_sendmsg net/socket.c:2154 [inline]
  __do_sys_sendmsg net/socket.c:2163 [inline]
  __se_sys_sendmsg+0x305/0x460 net/socket.c:2161
  __x64_sys_sendmsg+0x4a/0x70 net/socket.c:2161
  do_syscall_64+0xbc/0xf0 arch/x86/entry/common.c:291
  entry_SYSCALL_64_after_hwframe+0x63/0xe7

The uninitialised access happened in
    nla_put_string(skb, TIPC_NLA_LINK_NAME, lc->name)

This is because lc->name string is not validated before it's used.

Reported-by: syzbot+d78b8a29241a195aefb8@syzkaller.appspotmail.com
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
mihailescu2m pushed a commit that referenced this pull request Jan 28, 2019
Ido Schimmel says:

====================
mlxsw: Various fixes

This patchset contains small fixes in mlxsw and one fix in the bridge
driver.

Patches #1-#4 perform small adjustments in PCI and FID code following
recent tests that were performed on the Spectrum-2 ASIC.

Patch #5 fixes the bridge driver to mark FDB entries that were added by
user as such. Otherwise, these entries will be ignored by underlying
switch drivers.

Patch #6 fixes a long standing issue in mlxsw where the driver
incorrectly programmed static FDB entries as both static and sticky.

Patches #7-#8 add test cases for above mentioned bugs.

Please consider patches #1, #2 and #4 for stable.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
mihailescu2m pushed a commit that referenced this pull request Feb 4, 2019
Song Liu reported crash in 'perf record':

  > #0  0x0000000000500055 in ordered_events(float, long double,...)(...) ()
  > #1  0x0000000000500196 in ordered_events.reinit ()
  > #2  0x00000000004fe413 in perf_session.process_events ()
  > #3  0x0000000000440431 in cmd_record ()
  > #4  0x00000000004a439f in run_builtin ()
  > #5  0x000000000042b3e5 in main ()"

This can happen when we get out of buffers during event processing.

The subsequent ordered_events__free() call assumes oe->buffer != NULL
and crashes. Add a check to prevent that.

Reported-by: Song Liu <liu.song.a23@gmail.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Reviewed-by: Song Liu <liu.song.a23@gmail.com>
Tested-by: Song Liu <liu.song.a23@gmail.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/20190117113017.12977-1-jolsa@kernel.org
Fixes: d5ceb62 ("perf ordered_events: Add 'struct ordered_events_buffer' layer")
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
mihailescu2m pushed a commit that referenced this pull request Feb 4, 2019
With the following commit:

  73d5e2b ("cpu/hotplug: detect SMT disabled by BIOS")

... the hotplug code attempted to detect when SMT was disabled by BIOS,
in which case it reported SMT as permanently disabled.  However, that
code broke a virt hotplug scenario, where the guest is booted with only
primary CPU threads, and a sibling is brought online later.

The problem is that there doesn't seem to be a way to reliably
distinguish between the HW "SMT disabled by BIOS" case and the virt
"sibling not yet brought online" case.  So the above-mentioned commit
was a bit misguided, as it permanently disabled SMT for both cases,
preventing future virt sibling hotplugs.

Going back and reviewing the original problems which were attempted to
be solved by that commit, when SMT was disabled in BIOS:

  1) /sys/devices/system/cpu/smt/control showed "on" instead of
     "notsupported"; and

  2) vmx_vm_init() was incorrectly showing the L1TF_MSG_SMT warning.

I'd propose that we instead consider #1 above to not actually be a
problem.  Because, at least in the virt case, it's possible that SMT
wasn't disabled by BIOS and a sibling thread could be brought online
later.  So it makes sense to just always default the smt control to "on"
to allow for that possibility (assuming cpuid indicates that the CPU
supports SMT).

The real problem is #2, which has a simple fix: change vmx_vm_init() to
query the actual current SMT state -- i.e., whether any siblings are
currently online -- instead of looking at the SMT "control" sysfs value.

So fix it by:

  a) reverting the original "fix" and its followup fix:

     73d5e2b ("cpu/hotplug: detect SMT disabled by BIOS")
     bc2d8d2 ("cpu/hotplug: Fix SMT supported evaluation")

     and

  b) changing vmx_vm_init() to query the actual current SMT state --
     instead of the sysfs control value -- to determine whether the L1TF
     warning is needed.  This also requires the 'sched_smt_present'
     variable to exported, instead of 'cpu_smt_control'.

Fixes: 73d5e2b ("cpu/hotplug: detect SMT disabled by BIOS")
Reported-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Joe Mario <jmario@redhat.com>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: kvm@vger.kernel.org
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/e3a85d585da28cc333ecbc1e78ee9216e6da9396.1548794349.git.jpoimboe@redhat.com
mihailescu2m pushed a commit that referenced this pull request Feb 4, 2019
When option CONFIG_KASAN is enabled toghether with ftrace, function
ftrace_graph_caller() gets in to a recursion, via functions
kasan_check_read() and kasan_check_write().

 Breakpoint 2, ftrace_graph_caller () at ../arch/arm64/kernel/entry-ftrace.S:179
 179             mcount_get_pc             x0    //     function's pc
 (gdb) bt
 #0  ftrace_graph_caller () at ../arch/arm64/kernel/entry-ftrace.S:179
 #1  0xffffff90101406c8 in ftrace_caller () at ../arch/arm64/kernel/entry-ftrace.S:151
 #2  0xffffff90106fd084 in kasan_check_write (p=0xffffffc06c170878, size=4) at ../mm/kasan/common.c:105
 #3  0xffffff90104a2464 in atomic_add_return (v=<optimized out>, i=<optimized out>) at ./include/generated/atomic-instrumented.h:71
 #4  atomic_inc_return (v=<optimized out>) at ./include/generated/atomic-fallback.h:284
 #5  trace_graph_entry (trace=0xffffffc03f5ff380) at ../kernel/trace/trace_functions_graph.c:441
 #6  0xffffff9010481774 in trace_graph_entry_watchdog (trace=<optimized out>) at ../kernel/trace/trace_selftest.c:741
 #7  0xffffff90104a185c in function_graph_enter (ret=<optimized out>, func=<optimized out>, frame_pointer=18446743799894897728, retp=<optimized out>) at ../kernel/trace/trace_functions_graph.c:196
 #8  0xffffff9010140628 in prepare_ftrace_return (self_addr=18446743592948977792, parent=0xffffffc03f5ff418, frame_pointer=18446743799894897728) at ../arch/arm64/kernel/ftrace.c:231
 #9  0xffffff90101406f4 in ftrace_graph_caller () at ../arch/arm64/kernel/entry-ftrace.S:182
 Backtrace stopped: previous frame identical to this frame (corrupt stack?)
 (gdb)

Rework so that the kasan implementation isn't traced.

Link: http://lkml.kernel.org/r/20181212183447.15890-1-anders.roxell@linaro.org
Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
Acked-by: Dmitry Vyukov <dvyukov@google.com>
Tested-by: Dmitry Vyukov <dvyukov@google.com>
Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
mihailescu2m pushed a commit that referenced this pull request Feb 14, 2019
Yonghong Song says:

====================
The current btf implementation disallows the typedef of
a func_proto type. This actually is allowed per C standard.
This patch fixed btf verification to permit such types.
Patch #1 fixed the kernel side and Patch #2 fixed
the tools test_btf test.
====================

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
mihailescu2m pushed a commit that referenced this pull request Feb 14, 2019
Lockdep found a potential deadlock between cpu_hotplug_lock, bpf_event_mutex, and cpuctx_mutex:
[   13.007000] WARNING: possible circular locking dependency detected
[   13.007587] 5.0.0-rc3-00018-g2fa53f892422-dirty torvalds#477 Not tainted
[   13.008124] ------------------------------------------------------
[   13.008624] test_progs/246 is trying to acquire lock:
[   13.009030] 0000000094160d1d (tracepoints_mutex){+.+.}, at: tracepoint_probe_register_prio+0x2d/0x300
[   13.009770]
[   13.009770] but task is already holding lock:
[   13.010239] 00000000d663ef86 (bpf_event_mutex){+.+.}, at: bpf_probe_register+0x1d/0x60
[   13.010877]
[   13.010877] which lock already depends on the new lock.
[   13.010877]
[   13.011532]
[   13.011532] the existing dependency chain (in reverse order) is:
[   13.012129]
[   13.012129] -> #4 (bpf_event_mutex){+.+.}:
[   13.012582]        perf_event_query_prog_array+0x9b/0x130
[   13.013016]        _perf_ioctl+0x3aa/0x830
[   13.013354]        perf_ioctl+0x2e/0x50
[   13.013668]        do_vfs_ioctl+0x8f/0x6a0
[   13.014003]        ksys_ioctl+0x70/0x80
[   13.014320]        __x64_sys_ioctl+0x16/0x20
[   13.014668]        do_syscall_64+0x4a/0x180
[   13.015007]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
[   13.015469]
[   13.015469] -> #3 (&cpuctx_mutex){+.+.}:
[   13.015910]        perf_event_init_cpu+0x5a/0x90
[   13.016291]        perf_event_init+0x1b2/0x1de
[   13.016654]        start_kernel+0x2b8/0x42a
[   13.016995]        secondary_startup_64+0xa4/0xb0
[   13.017382]
[   13.017382] -> #2 (pmus_lock){+.+.}:
[   13.017794]        perf_event_init_cpu+0x21/0x90
[   13.018172]        cpuhp_invoke_callback+0xb3/0x960
[   13.018573]        _cpu_up+0xa7/0x140
[   13.018871]        do_cpu_up+0xa4/0xc0
[   13.019178]        smp_init+0xcd/0xd2
[   13.019483]        kernel_init_freeable+0x123/0x24f
[   13.019878]        kernel_init+0xa/0x110
[   13.020201]        ret_from_fork+0x24/0x30
[   13.020541]
[   13.020541] -> #1 (cpu_hotplug_lock.rw_sem){++++}:
[   13.021051]        static_key_slow_inc+0xe/0x20
[   13.021424]        tracepoint_probe_register_prio+0x28c/0x300
[   13.021891]        perf_trace_event_init+0x11f/0x250
[   13.022297]        perf_trace_init+0x6b/0xa0
[   13.022644]        perf_tp_event_init+0x25/0x40
[   13.023011]        perf_try_init_event+0x6b/0x90
[   13.023386]        perf_event_alloc+0x9a8/0xc40
[   13.023754]        __do_sys_perf_event_open+0x1dd/0xd30
[   13.024173]        do_syscall_64+0x4a/0x180
[   13.024519]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
[   13.024968]
[   13.024968] -> #0 (tracepoints_mutex){+.+.}:
[   13.025434]        __mutex_lock+0x86/0x970
[   13.025764]        tracepoint_probe_register_prio+0x2d/0x300
[   13.026215]        bpf_probe_register+0x40/0x60
[   13.026584]        bpf_raw_tracepoint_open.isra.34+0xa4/0x130
[   13.027042]        __do_sys_bpf+0x94f/0x1a90
[   13.027389]        do_syscall_64+0x4a/0x180
[   13.027727]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
[   13.028171]
[   13.028171] other info that might help us debug this:
[   13.028171]
[   13.028807] Chain exists of:
[   13.028807]   tracepoints_mutex --> &cpuctx_mutex --> bpf_event_mutex
[   13.028807]
[   13.029666]  Possible unsafe locking scenario:
[   13.029666]
[   13.030140]        CPU0                    CPU1
[   13.030510]        ----                    ----
[   13.030875]   lock(bpf_event_mutex);
[   13.031166]                                lock(&cpuctx_mutex);
[   13.031645]                                lock(bpf_event_mutex);
[   13.032135]   lock(tracepoints_mutex);
[   13.032441]
[   13.032441]  *** DEADLOCK ***
[   13.032441]
[   13.032911] 1 lock held by test_progs/246:
[   13.033239]  #0: 00000000d663ef86 (bpf_event_mutex){+.+.}, at: bpf_probe_register+0x1d/0x60
[   13.033909]
[   13.033909] stack backtrace:
[   13.034258] CPU: 1 PID: 246 Comm: test_progs Not tainted 5.0.0-rc3-00018-g2fa53f892422-dirty torvalds#477
[   13.034964] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014
[   13.035657] Call Trace:
[   13.035859]  dump_stack+0x5f/0x8b
[   13.036130]  print_circular_bug.isra.37+0x1ce/0x1db
[   13.036526]  __lock_acquire+0x1158/0x1350
[   13.036852]  ? lock_acquire+0x98/0x190
[   13.037154]  lock_acquire+0x98/0x190
[   13.037447]  ? tracepoint_probe_register_prio+0x2d/0x300
[   13.037876]  __mutex_lock+0x86/0x970
[   13.038167]  ? tracepoint_probe_register_prio+0x2d/0x300
[   13.038600]  ? tracepoint_probe_register_prio+0x2d/0x300
[   13.039028]  ? __mutex_lock+0x86/0x970
[   13.039337]  ? __mutex_lock+0x24a/0x970
[   13.039649]  ? bpf_probe_register+0x1d/0x60
[   13.039992]  ? __bpf_trace_sched_wake_idle_without_ipi+0x10/0x10
[   13.040478]  ? tracepoint_probe_register_prio+0x2d/0x300
[   13.040906]  tracepoint_probe_register_prio+0x2d/0x300
[   13.041325]  bpf_probe_register+0x40/0x60
[   13.041649]  bpf_raw_tracepoint_open.isra.34+0xa4/0x130
[   13.042068]  ? __might_fault+0x3e/0x90
[   13.042374]  __do_sys_bpf+0x94f/0x1a90
[   13.042678]  do_syscall_64+0x4a/0x180
[   13.042975]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
[   13.043382] RIP: 0033:0x7f23b10a07f9
[   13.045155] RSP: 002b:00007ffdef42fdd8 EFLAGS: 00000202 ORIG_RAX: 0000000000000141
[   13.045759] RAX: ffffffffffffffda RBX: 00007ffdef42ff70 RCX: 00007f23b10a07f9
[   13.046326] RDX: 0000000000000070 RSI: 00007ffdef42fe10 RDI: 0000000000000011
[   13.046893] RBP: 00007ffdef42fdf0 R08: 0000000000000038 R09: 00007ffdef42fe10
[   13.047462] R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000000
[   13.048029] R13: 0000000000000016 R14: 00007f23b1db4690 R15: 0000000000000000

Since tracepoints_mutex will be taken in tracepoint_probe_register/unregister()
there is no need to take bpf_event_mutex too.
bpf_event_mutex is protecting modifications to prog array used in kprobe/perf bpf progs.
bpf_raw_tracepoints don't need to take this mutex.

Fixes: c4f6699 ("bpf: introduce BPF_RAW_TRACEPOINT")
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
mihailescu2m pushed a commit that referenced this pull request Feb 14, 2019
Presently when an error is encountered during probe of the cxlflash
adapter, a deadlock is seen with cpu thread stuck inside
cxlflash_remove(). Below is the trace of the deadlock as logged by
khungtaskd:

cxlflash 0006:00:00.0: cxlflash_probe: init_afu failed rc=-16
INFO: task kworker/80:1:890 blocked for more than 120 seconds.
       Not tainted 5.0.0-rc4-capi2-kexec+ #2
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kworker/80:1    D    0   890      2 0x00000808
Workqueue: events work_for_cpu_fn

Call Trace:
 0x4d72136320 (unreliable)
 __switch_to+0x2cc/0x460
 __schedule+0x2bc/0xac0
 schedule+0x40/0xb0
 cxlflash_remove+0xec/0x640 [cxlflash]
 cxlflash_probe+0x370/0x8f0 [cxlflash]
 local_pci_probe+0x6c/0x140
 work_for_cpu_fn+0x38/0x60
 process_one_work+0x260/0x530
 worker_thread+0x280/0x5d0
 kthread+0x1a8/0x1b0
 ret_from_kernel_thread+0x5c/0x80
INFO: task systemd-udevd:5160 blocked for more than 120 seconds.

The deadlock occurs as cxlflash_remove() is called from cxlflash_probe()
without setting 'cxlflash_cfg->state' to STATE_PROBED and the probe thread
starts to wait on 'cxlflash_cfg->reset_waitq'. Since the device was never
successfully probed the 'cxlflash_cfg->state' never changes from
STATE_PROBING hence the deadlock occurs.

We fix this deadlock by setting the variable 'cxlflash_cfg->state' to
STATE_PROBED in case an error occurs during cxlflash_probe() and just
before calling cxlflash_remove().

Cc: stable@vger.kernel.org
Fixes: c21e0bb("cxlflash: Base support for IBM CXL Flash Adapter")
Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
mihailescu2m pushed a commit that referenced this pull request Feb 14, 2019
This patch moves clk_get_rate() call from trigger() to hw_params()
callback to avoid calling sleeping clk API from atomic context
and prevent deadlock as indicated below.

Before this change clk_get_rate() was being called with same
spinlock held as the one passed to the clk API when registering
clocks exposed by the I2S driver.

[   82.109780] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:908
[   82.117009] in_atomic(): 1, irqs_disabled(): 128, pid: 1554, name: speaker-test
[   82.124235] 3 locks held by speaker-test/1554:
[   82.128653]  #0: cc8c5328 (snd_pcm_link_rwlock){...-}, at: snd_pcm_stream_lock_irq+0x20/0x38
[   82.137058]  #1: ec9eda17 (&(&substream->self_group.lock)->rlock){..-.}, at: snd_pcm_ioctl+0x900/0x1268
[   82.146417]  #2: 6ac279bf (&(&pri_dai->spinlock)->rlock){..-.}, at: i2s_trigger+0x64/0x6d4
[   82.154650] irq event stamp: 8144
[   82.157949] hardirqs last  enabled at (8143): [<c0a0f574>] _raw_read_unlock_irq+0x24/0x5c
[   82.166089] hardirqs last disabled at (8144): [<c0a0f6a8>] _raw_read_lock_irq+0x18/0x58
[   82.174063] softirqs last  enabled at (8004): [<c01024e4>] __do_softirq+0x3a4/0x66c
[   82.181688] softirqs last disabled at (7997): [<c012d730>] irq_exit+0x140/0x168
[   82.188964] Preemption disabled at:
[   82.188967] [<00000000>]   (null)
[   82.195728] CPU: 6 PID: 1554 Comm: speaker-test Not tainted 5.0.0-rc5-00192-ga6e6caca8f03 hardkernel#191
[   82.204302] Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
[   82.210376] [<c0111a54>] (unwind_backtrace) from [<c010d8f4>] (show_stack+0x10/0x14)
[   82.218084] [<c010d8f4>] (show_stack) from [<c09ef004>] (dump_stack+0x90/0xc8)
[   82.225278] [<c09ef004>] (dump_stack) from [<c0152980>] (___might_sleep+0x22c/0x2c8)
[   82.232990] [<c0152980>] (___might_sleep) from [<c0a0a2e4>] (__mutex_lock+0x28/0xa3c)
[   82.240788] [<c0a0a2e4>] (__mutex_lock) from [<c0a0ad80>] (mutex_lock_nested+0x1c/0x24)
[   82.248763] [<c0a0ad80>] (mutex_lock_nested) from [<c04923dc>] (clk_prepare_lock+0x78/0xec)
[   82.257079] [<c04923dc>] (clk_prepare_lock) from [<c049538c>] (clk_core_get_rate+0xc/0x5c)
[   82.265309] [<c049538c>] (clk_core_get_rate) from [<c0766b18>] (i2s_trigger+0x490/0x6d4)
[   82.273369] [<c0766b18>] (i2s_trigger) from [<c074fec4>] (soc_pcm_trigger+0x100/0x140)
[   82.281254] [<c074fec4>] (soc_pcm_trigger) from [<c07378a0>] (snd_pcm_do_start+0x2c/0x30)
[   82.289400] [<c07378a0>] (snd_pcm_do_start) from [<c07376cc>] (snd_pcm_action_single+0x38/0x78)
[   82.298065] [<c07376cc>] (snd_pcm_action_single) from [<c073a450>] (snd_pcm_ioctl+0x910/0x1268)
[   82.306734] [<c073a450>] (snd_pcm_ioctl) from [<c0292344>] (do_vfs_ioctl+0x90/0x9ec)
[   82.314443] [<c0292344>] (do_vfs_ioctl) from [<c0292cd4>] (ksys_ioctl+0x34/0x60)
[   82.321808] [<c0292cd4>] (ksys_ioctl) from [<c0101000>] (ret_fast_syscall+0x0/0x28)
[   82.329431] Exception stack(0xeb875fa8 to 0xeb875ff0)
[   82.334459] 5fa0:                   00033c18 b6e31000 00000004 00004142 00033d80 00033d80
[   82.342605] 5fc0: 00033c18 b6e31000 00008000 00000036 00008000 00000000 beea38a8 00008000
[   82.350748] 5fe0: b6e3142c beea384c b6da9a30 b6c9212c
[   82.355789]
[   82.357245] ======================================================
[   82.363397] WARNING: possible circular locking dependency detected
[   82.369551] 5.0.0-rc5-00192-ga6e6caca8f03 hardkernel#191 Tainted: G        W
[   82.376395] ------------------------------------------------------
[   82.382548] speaker-test/1554 is trying to acquire lock:
[   82.387834] 6d2007f4 (prepare_lock){+.+.}, at: clk_prepare_lock+0x78/0xec
[   82.394593]
[   82.394593] but task is already holding lock:
[   82.400398] 6ac279bf (&(&pri_dai->spinlock)->rlock){..-.}, at: i2s_trigger+0x64/0x6d4
[   82.408197]
[   82.408197] which lock already depends on the new lock.
[   82.416343]
[   82.416343] the existing dependency chain (in reverse order) is:
[   82.423795]
[   82.423795] -> #1 (&(&pri_dai->spinlock)->rlock){..-.}:
[   82.430472]        clk_mux_set_parent+0x34/0xb8
[   82.434975]        clk_core_set_parent_nolock+0x1c4/0x52c
[   82.440347]        clk_set_parent+0x38/0x6c
[   82.444509]        of_clk_set_defaults+0xc8/0x308
[   82.449186]        of_clk_add_provider+0x84/0xd0
[   82.453779]        samsung_i2s_probe+0x408/0x5f8
[   82.458376]        platform_drv_probe+0x48/0x98
[   82.462879]        really_probe+0x224/0x3f4
[   82.467037]        driver_probe_device+0x70/0x1c4
[   82.471716]        bus_for_each_drv+0x44/0x8c
[   82.476049]        __device_attach+0xa0/0x138
[   82.480382]        bus_probe_device+0x88/0x90
[   82.484715]        deferred_probe_work_func+0x6c/0xbc
[   82.489741]        process_one_work+0x200/0x740
[   82.494246]        worker_thread+0x2c/0x4c8
[   82.498408]        kthread+0x128/0x164
[   82.502131]        ret_from_fork+0x14/0x20
[   82.506204]          (null)
[   82.508976]
[   82.508976] -> #0 (prepare_lock){+.+.}:
[   82.514264]        __mutex_lock+0x60/0xa3c
[   82.518336]        mutex_lock_nested+0x1c/0x24
[   82.522756]        clk_prepare_lock+0x78/0xec
[   82.527088]        clk_core_get_rate+0xc/0x5c
[   82.531421]        i2s_trigger+0x490/0x6d4
[   82.535494]        soc_pcm_trigger+0x100/0x140
[   82.539913]        snd_pcm_do_start+0x2c/0x30
[   82.544246]        snd_pcm_action_single+0x38/0x78
[   82.549012]        snd_pcm_ioctl+0x910/0x1268
[   82.553345]        do_vfs_ioctl+0x90/0x9ec
[   82.557417]        ksys_ioctl+0x34/0x60
[   82.561229]        ret_fast_syscall+0x0/0x28
[   82.565477]        0xbeea384c
[   82.568421]
[   82.568421] other info that might help us debug this:
[   82.568421]
[   82.576394]  Possible unsafe locking scenario:
[   82.576394]
[   82.582285]        CPU0                    CPU1
[   82.586792]        ----                    ----
[   82.591297]   lock(&(&pri_dai->spinlock)->rlock);
[   82.595977]                                lock(prepare_lock);
[   82.601782]                                lock(&(&pri_dai->spinlock)->rlock);
[   82.608975]   lock(prepare_lock);
[   82.612268]
[   82.612268]  *** DEADLOCK ***

Fixes: 647d04f ("ASoC: samsung: i2s: Ensure the RCLK rate is properly determined")
Reported-by: Krzysztof Kozłowski <krzk@kernel.org>
Signed-off-by: Sylwester Nawrocki <s.nawrocki@samsung.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
mihailescu2m pushed a commit that referenced this pull request Feb 27, 2019
On ESP output, sk_wmem_alloc is incremented for the added padding if a
socket is associated to the skb. When replying with TCP SYNACKs over
IPsec, the associated sk is a casted request socket, only. Increasing
sk_wmem_alloc on a request socket results in a write at an arbitrary
struct offset. In the best case, this produces the following WARNING:

WARNING: CPU: 1 PID: 0 at lib/refcount.c:102 esp_output_head+0x2e4/0x308 [esp4]
refcount_t: addition on 0; use-after-free.
CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.0.0-rc3 #2
Hardware name: Marvell Armada 380/385 (Device Tree)
[...]
[<bf0ff354>] (esp_output_head [esp4]) from [<bf1006a4>] (esp_output+0xb8/0x180 [esp4])
[<bf1006a4>] (esp_output [esp4]) from [<c05dee64>] (xfrm_output_resume+0x558/0x664)
[<c05dee64>] (xfrm_output_resume) from [<c05d07b0>] (xfrm4_output+0x44/0xc4)
[<c05d07b0>] (xfrm4_output) from [<c05956bc>] (tcp_v4_send_synack+0xa8/0xe8)
[<c05956bc>] (tcp_v4_send_synack) from [<c0586ad8>] (tcp_conn_request+0x7f4/0x948)
[<c0586ad8>] (tcp_conn_request) from [<c058c404>] (tcp_rcv_state_process+0x2a0/0xe64)
[<c058c404>] (tcp_rcv_state_process) from [<c05958ac>] (tcp_v4_do_rcv+0xf0/0x1f4)
[<c05958ac>] (tcp_v4_do_rcv) from [<c0598a4c>] (tcp_v4_rcv+0xdb8/0xe20)
[<c0598a4c>] (tcp_v4_rcv) from [<c056eb74>] (ip_protocol_deliver_rcu+0x2c/0x2dc)
[<c056eb74>] (ip_protocol_deliver_rcu) from [<c056ee6c>] (ip_local_deliver_finish+0x48/0x54)
[<c056ee6c>] (ip_local_deliver_finish) from [<c056eecc>] (ip_local_deliver+0x54/0xec)
[<c056eecc>] (ip_local_deliver) from [<c056efac>] (ip_rcv+0x48/0xb8)
[<c056efac>] (ip_rcv) from [<c0519c2c>] (__netif_receive_skb_one_core+0x50/0x6c)
[...]

The issue triggers only when not using TCP syncookies, as for syncookies
no socket is associated.

Fixes: cac2661 ("esp4: Avoid skb_cow_data whenever possible")
Fixes: 03e2a30 ("esp6: Avoid skb_cow_data whenever possible")
Signed-off-by: Martin Willi <martin@strongswan.org>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
mihailescu2m pushed a commit that referenced this pull request Mar 7, 2019
It can be reproduced by following steps:
1. virtio_net NIC is configured with gso/tso on
2. configure nginx as http server with an index file bigger than 1M bytes
3. use tc netem to produce duplicate packets and delay:
   tc qdisc add dev eth0 root netem delay 100ms 10ms 30% duplicate 90%
4. continually curl the nginx http server to get index file on client
5. BUG_ON is seen quickly

[10258690.371129] kernel BUG at net/core/skbuff.c:4028!
[10258690.371748] invalid opcode: 0000 [#1] SMP PTI
[10258690.372094] CPU: 5 PID: 0 Comm: swapper/5 Tainted: G        W         5.0.0-rc6 #2
[10258690.372094] RSP: 0018:ffffa05797b43da0 EFLAGS: 00010202
[10258690.372094] RBP: 00000000000005ea R08: 0000000000000000 R09: 00000000000005ea
[10258690.372094] R10: ffffa0579334d800 R11: 00000000000002c0 R12: 0000000000000002
[10258690.372094] R13: 0000000000000000 R14: ffffa05793122900 R15: ffffa0578f7cb028
[10258690.372094] FS:  0000000000000000(0000) GS:ffffa05797b40000(0000) knlGS:0000000000000000
[10258690.372094] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[10258690.372094] CR2: 00007f1a6dc00868 CR3: 000000001000e000 CR4: 00000000000006e0
[10258690.372094] Call Trace:
[10258690.372094]  <IRQ>
[10258690.372094]  skb_to_sgvec+0x11/0x40
[10258690.372094]  start_xmit+0x38c/0x520 [virtio_net]
[10258690.372094]  dev_hard_start_xmit+0x9b/0x200
[10258690.372094]  sch_direct_xmit+0xff/0x260
[10258690.372094]  __qdisc_run+0x15e/0x4e0
[10258690.372094]  net_tx_action+0x137/0x210
[10258690.372094]  __do_softirq+0xd6/0x2a9
[10258690.372094]  irq_exit+0xde/0xf0
[10258690.372094]  smp_apic_timer_interrupt+0x74/0x140
[10258690.372094]  apic_timer_interrupt+0xf/0x20
[10258690.372094]  </IRQ>

In __skb_to_sgvec(), the skb->len is not equal to the sum of the skb's
linear data size and nonlinear data size, thus BUG_ON triggered.
Because the skb is cloned and a part of nonlinear data is split off.

Duplicate packet is cloned in netem_enqueue() and may be delayed
some time in qdisc. When qdisc len reached the limit and returns
NET_XMIT_DROP, the skb will be retransmit later in write queue.
the skb will be fragmented by tso_fragment(), the limit size
that depends on cwnd and mss decrease, the skb's nonlinear
data will be split off. The length of the skb cloned by netem
will not be updated. When we use virtio_net NIC and invoke skb_to_sgvec(),
the BUG_ON trigger.

To fix it, netem returns NET_XMIT_SUCCESS to upper stack
when it clones a duplicate packet.

Fixes: 35d889d ("sch_netem: fix skb leak in netem_enqueue()")
Signed-off-by: Sheng Lan <lansheng@huawei.com>
Reported-by: Qin Ji <jiqin.ji@huawei.com>
Suggested-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
mihailescu2m pushed a commit that referenced this pull request Mar 20, 2019
[ Upstream commit a843dc4 ]

In func check_6rd,tunnel->ip6rd.relay_prefixlen may equal to
32,so UBSAN complain about it.

UBSAN: Undefined behaviour in net/ipv6/sit.c:781:47
shift exponent 32 is too large for 32-bit type 'unsigned int'
CPU: 6 PID: 20036 Comm: syz-executor.0 Not tainted 4.19.27 #2
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1
04/01/2014
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0xca/0x13e lib/dump_stack.c:113
ubsan_epilogue+0xe/0x81 lib/ubsan.c:159
__ubsan_handle_shift_out_of_bounds+0x293/0x2e8 lib/ubsan.c:425
check_6rd.constprop.9+0x433/0x4e0 net/ipv6/sit.c:781
try_6rd net/ipv6/sit.c:806 [inline]
ipip6_tunnel_xmit net/ipv6/sit.c:866 [inline]
sit_tunnel_xmit+0x141c/0x2720 net/ipv6/sit.c:1033
__netdev_start_xmit include/linux/netdevice.h:4300 [inline]
netdev_start_xmit include/linux/netdevice.h:4309 [inline]
xmit_one net/core/dev.c:3243 [inline]
dev_hard_start_xmit+0x17c/0x780 net/core/dev.c:3259
__dev_queue_xmit+0x1656/0x2500 net/core/dev.c:3829
neigh_output include/net/neighbour.h:501 [inline]
ip6_finish_output2+0xa36/0x2290 net/ipv6/ip6_output.c:120
ip6_finish_output+0x3e7/0xa20 net/ipv6/ip6_output.c:154
NF_HOOK_COND include/linux/netfilter.h:278 [inline]
ip6_output+0x1e2/0x720 net/ipv6/ip6_output.c:171
dst_output include/net/dst.h:444 [inline]
ip6_local_out+0x99/0x170 net/ipv6/output_core.c:176
ip6_send_skb+0x9d/0x2f0 net/ipv6/ip6_output.c:1697
ip6_push_pending_frames+0xc0/0x100 net/ipv6/ip6_output.c:1717
rawv6_push_pending_frames net/ipv6/raw.c:616 [inline]
rawv6_sendmsg+0x2435/0x3530 net/ipv6/raw.c:946
inet_sendmsg+0xf8/0x5c0 net/ipv4/af_inet.c:798
sock_sendmsg_nosec net/socket.c:621 [inline]
sock_sendmsg+0xc8/0x110 net/socket.c:631
___sys_sendmsg+0x6cf/0x890 net/socket.c:2114
__sys_sendmsg+0xf0/0x1b0 net/socket.c:2152
do_syscall_64+0xc8/0x580 arch/x86/entry/common.c:290
entry_SYSCALL_64_after_hwframe+0x49/0xbe

Signed-off-by: linmiaohe <linmiaohe@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
mihailescu2m pushed a commit that referenced this pull request Nov 7, 2019
Original version of g920_get_config() contained two kind of actions:

    1. Device specific communication to query/set some parameters
       which requires active communication channel with the device,
       or, put in other way, for the call to be sandwiched between
       hid_device_io_start() and hid_device_io_stop().

    2. Input subsystem specific FF controller initialization which, in
       order to access a valid 'struct hid_input' via
       'hid->inputs.next', requires claimed hidinput which means be
       executed after the call to hid_hw_start() with connect_mask
       containing HID_CONNECT_HIDINPUT.

Location of g920_get_config() can only fulfill requirements for #1 and
not #2, which might result in following backtrace:

[   88.312258] logitech-hidpp-device 0003:046D:C262.0005: HID++ 4.2 device connected.
[   88.320298] BUG: kernel NULL pointer dereference, address: 0000000000000018
[   88.320304] #PF: supervisor read access in kernel mode
[   88.320307] #PF: error_code(0x0000) - not-present page
[   88.320309] PGD 0 P4D 0
[   88.320315] Oops: 0000 [#1] SMP PTI
[   88.320320] CPU: 1 PID: 3080 Comm: systemd-udevd Not tainted 5.4.0-rc1+ hardkernel#31
[   88.320322] Hardware name: Apple Inc. MacBookPro11,1/Mac-189A3D4F975D5FFC, BIOS 149.0.0.0.0 09/17/2018
[   88.320334] RIP: 0010:hidpp_probe+0x61f/0x948 [hid_logitech_hidpp]
[   88.320338] Code: 81 00 00 48 89 ef e8 f0 d6 ff ff 41 89 c6 85 c0 75 b5 0f b6 44 24 28 48 8b 5d 00 88 44 24 1e 89 44 24 0c 48 8b 83 18 1c 00 00 <48> 8b 48 18 48 8b 83 10 19 00 00 48 8b 40 40 48 89 0c 24 0f b7 80
[   88.320341] RSP: 0018:ffffb0a6824aba68 EFLAGS: 00010246
[   88.320345] RAX: 0000000000000000 RBX: ffff93a50756e000 RCX: 0000000000010408
[   88.320347] RDX: 0000000000000000 RSI: ffff93a51f0ad0a0 RDI: 000000000002d0a0
[   88.320350] RBP: ffff93a50416da28 R08: ffff93a50416da70 R09: ffff93a50416da70
[   88.320352] R10: 000000148ae9e60c R11: 00000000000f1525 R12: ffff93a50756e000
[   88.320354] R13: ffff93a50756f8d0 R14: 0000000000000000 R15: ffff93a50756fc38
[   88.320358] FS:  00007f8d8c1e0940(0000) GS:ffff93a51f080000(0000) knlGS:0000000000000000
[   88.320361] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   88.320363] CR2: 0000000000000018 CR3: 00000003996d8003 CR4: 00000000001606e0
[   88.320366] Call Trace:
[   88.320377]  ? _cond_resched+0x15/0x30
[   88.320387]  ? create_pinctrl+0x2f/0x3c0
[   88.320393]  ? kernfs_link_sibling+0x94/0xe0
[   88.320398]  ? _cond_resched+0x15/0x30
[   88.320402]  ? kernfs_activate+0x5f/0x80
[   88.320406]  ? kernfs_add_one+0xe2/0x130
[   88.320411]  hid_device_probe+0x106/0x170
[   88.320419]  really_probe+0x147/0x3c0
[   88.320424]  driver_probe_device+0xb6/0x100
[   88.320428]  device_driver_attach+0x53/0x60
[   88.320433]  __driver_attach+0x8a/0x150
[   88.320437]  ? device_driver_attach+0x60/0x60
[   88.320440]  bus_for_each_dev+0x78/0xc0
[   88.320445]  bus_add_driver+0x14d/0x1f0
[   88.320450]  driver_register+0x6c/0xc0
[   88.320453]  ? 0xffffffffc0d67000
[   88.320457]  __hid_register_driver+0x4c/0x80
[   88.320464]  do_one_initcall+0x46/0x1f4
[   88.320469]  ? _cond_resched+0x15/0x30
[   88.320474]  ? kmem_cache_alloc_trace+0x162/0x220
[   88.320481]  ? do_init_module+0x23/0x230
[   88.320486]  do_init_module+0x5c/0x230
[   88.320491]  load_module+0x26e1/0x2990
[   88.320502]  ? ima_post_read_file+0xf0/0x100
[   88.320508]  ? __do_sys_finit_module+0xaa/0x110
[   88.320512]  __do_sys_finit_module+0xaa/0x110
[   88.320520]  do_syscall_64+0x5b/0x180
[   88.320525]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[   88.320528] RIP: 0033:0x7f8d8d1f01fd
[   88.320532] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 5b 8c 0c 00 f7 d8 64 89 01 48
[   88.320535] RSP: 002b:00007ffefa3bb068 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
[   88.320539] RAX: ffffffffffffffda RBX: 000055922040cb40 RCX: 00007f8d8d1f01fd
[   88.320541] RDX: 0000000000000000 RSI: 00007f8d8ce4984d RDI: 0000000000000006
[   88.320543] RBP: 0000000000020000 R08: 0000000000000000 R09: 0000000000000007
[   88.320545] R10: 0000000000000006 R11: 0000000000000246 R12: 00007f8d8ce4984d
[   88.320547] R13: 0000000000000000 R14: 000055922040efc0 R15: 000055922040cb40
[   88.320551] Modules linked in: hid_logitech_hidpp(+) fuse rfcomm ccm xt_CHECKSUM xt_MASQUERADE bridge stp llc nf_nat_tftp nf_conntrack_tftp nf_conntrack_netbios_ns nf_conntrack_broadcast xt_CT ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ebtable_nat ip6table_nat ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_nat tun iptable_mangle iptable_raw iptable_security nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c ip_set nfnetlink ebtable_filter ebtables ip6table_filter ip6_tables cmac bnep sunrpc dm_crypt nls_utf8 hfsplus intel_rapl_msr intel_rapl_common ath9k_htc ath9k_common x86_pkg_temp_thermal intel_powerclamp b43 ath9k_hw coretemp snd_hda_codec_hdmi cordic kvm_intel snd_hda_codec_cirrus mac80211 snd_hda_codec_generic ledtrig_audio kvm snd_hda_intel snd_intel_nhlt irqbypass snd_hda_codec btusb btrtl snd_hda_core ath btbcm ssb snd_hwdep btintel snd_seq crct10dif_pclmul iTCO_wdt snd_seq_device crc32_pclmul bluetooth mmc_core iTCO_vendor_support joydev cfg80211
[   88.320602]  applesmc ghash_clmulni_intel ecdh_generic snd_pcm input_polldev intel_cstate ecc intel_uncore thunderbolt snd_timer i2c_i801 libarc4 rfkill intel_rapl_perf lpc_ich mei_me pcspkr bcm5974 snd bcma mei soundcore acpi_als sbs kfifo_buf sbshc industrialio apple_bl i915 i2c_algo_bit drm_kms_helper drm uas crc32c_intel usb_storage video hid_apple
[   88.320630] CR2: 0000000000000018
[   88.320633] ---[ end trace 933491c8a4fadeb7 ]---
[   88.320642] RIP: 0010:hidpp_probe+0x61f/0x948 [hid_logitech_hidpp]
[   88.320645] Code: 81 00 00 48 89 ef e8 f0 d6 ff ff 41 89 c6 85 c0 75 b5 0f b6 44 24 28 48 8b 5d 00 88 44 24 1e 89 44 24 0c 48 8b 83 18 1c 00 00 <48> 8b 48 18 48 8b 83 10 19 00 00 48 8b 40 40 48 89 0c 24 0f b7 80
[   88.320647] RSP: 0018:ffffb0a6824aba68 EFLAGS: 00010246
[   88.320650] RAX: 0000000000000000 RBX: ffff93a50756e000 RCX: 0000000000010408
[   88.320652] RDX: 0000000000000000 RSI: ffff93a51f0ad0a0 RDI: 000000000002d0a0
[   88.320655] RBP: ffff93a50416da28 R08: ffff93a50416da70 R09: ffff93a50416da70
[   88.320657] R10: 000000148ae9e60c R11: 00000000000f1525 R12: ffff93a50756e000
[   88.320659] R13: ffff93a50756f8d0 R14: 0000000000000000 R15: ffff93a50756fc38
[   88.320662] FS:  00007f8d8c1e0940(0000) GS:ffff93a51f080000(0000) knlGS:0000000000000000
[   88.320664] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   88.320667] CR2: 0000000000000018 CR3: 00000003996d8003 CR4: 00000000001606e0

To solve this issue:

   1. Split g920_get_config() such that all of the device specific
      communication remains a part of the function and input subsystem
      initialization bits go to hidpp_ff_init()

   2. Move call to hidpp_ff_init() from being a part of
      g920_get_config() to be the last step of .probe(), right after a
      call to hid_hw_start() with connect_mask containing
      HID_CONNECT_HIDINPUT.

Fixes: 91cf9a9 ("HID: logitech-hidpp: make .probe usbhid capable")
Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com>
Tested-by: Sam Bazley <sambazley@fastmail.com>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Cc: Henrik Rydberg <rydberg@bitmath.org>
Cc: Pierre-Loup A. Griffais <pgriffais@valvesoftware.com>
Cc: Austin Palmer <austinp@valvesoftware.com>
Cc: linux-input@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: stable@vger.kernel.org # 5.2+
Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
mihailescu2m pushed a commit that referenced this pull request Nov 7, 2019
All bonding device has same lockdep key and subclass is initialized with
nest_level.
But actual nest_level value can be changed when a lower device is attached.
And at this moment, the subclass should be updated but it seems to be
unsafe.
So this patch makes bonding use dynamic lockdep key instead of the
subclass.

Test commands:
    ip link add bond0 type bond

    for i in {1..5}
    do
	    let A=$i-1
	    ip link add bond$i type bond
	    ip link set bond$i master bond$A
    done
    ip link set bond5 master bond0

Splat looks like:
[  307.992912] WARNING: possible recursive locking detected
[  307.993656] 5.4.0-rc3+ hardkernel#96 Tainted: G        W
[  307.994367] --------------------------------------------
[  307.995092] ip/761 is trying to acquire lock:
[  307.995710] ffff8880513aac60 (&(&bond->stats_lock)->rlock#2/2){+.+.}, at: bond_get_stats+0xb8/0x500 [bonding]
[  307.997045]
	       but task is already holding lock:
[  307.997923] ffff88805fcbac60 (&(&bond->stats_lock)->rlock#2/2){+.+.}, at: bond_get_stats+0xb8/0x500 [bonding]
[  307.999215]
	       other info that might help us debug this:
[  308.000251]  Possible unsafe locking scenario:

[  308.001137]        CPU0
[  308.001533]        ----
[  308.001915]   lock(&(&bond->stats_lock)->rlock#2/2);
[  308.002609]   lock(&(&bond->stats_lock)->rlock#2/2);
[  308.003302]
		*** DEADLOCK ***

[  308.004310]  May be due to missing lock nesting notation

[  308.005319] 3 locks held by ip/761:
[  308.005830]  #0: ffffffff9fcc42b0 (rtnl_mutex){+.+.}, at: rtnetlink_rcv_msg+0x466/0x8a0
[  308.006894]  #1: ffff88805fcbac60 (&(&bond->stats_lock)->rlock#2/2){+.+.}, at: bond_get_stats+0xb8/0x500 [bonding]
[  308.008243]  #2: ffffffff9f9219c0 (rcu_read_lock){....}, at: bond_get_stats+0x9f/0x500 [bonding]
[  308.009422]
	       stack backtrace:
[  308.010124] CPU: 0 PID: 761 Comm: ip Tainted: G        W         5.4.0-rc3+ hardkernel#96
[  308.011097] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[  308.012179] Call Trace:
[  308.012601]  dump_stack+0x7c/0xbb
[  308.013089]  __lock_acquire+0x269d/0x3de0
[  308.013669]  ? register_lock_class+0x14d0/0x14d0
[  308.014318]  lock_acquire+0x164/0x3b0
[  308.014858]  ? bond_get_stats+0xb8/0x500 [bonding]
[  308.015520]  _raw_spin_lock_nested+0x2e/0x60
[  308.016129]  ? bond_get_stats+0xb8/0x500 [bonding]
[  308.017215]  bond_get_stats+0xb8/0x500 [bonding]
[  308.018454]  ? bond_arp_rcv+0xf10/0xf10 [bonding]
[  308.019710]  ? rcu_read_lock_held+0x90/0xa0
[  308.020605]  ? rcu_read_lock_sched_held+0xc0/0xc0
[  308.021286]  ? bond_get_stats+0x9f/0x500 [bonding]
[  308.021953]  dev_get_stats+0x1ec/0x270
[  308.022508]  bond_get_stats+0x1d1/0x500 [bonding]

Fixes: d3fff6c ("net: add netdev_lockdep_set_classes() helper")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
mihailescu2m pushed a commit that referenced this pull request Nov 14, 2019
When lockdep is enabled, plugging Thunderbolt dock on Dominik's laptop
triggers following splat:

  ======================================================
  WARNING: possible circular locking dependency detected
  5.3.0-rc6+ #1 Tainted: G                T
  ------------------------------------------------------
  pool-/usr/lib/b/1258 is trying to acquire lock:
  000000005ab0ad43 (pci_rescan_remove_lock){+.+.}, at: authorized_store+0xe8/0x210

  but task is already holding lock:
  00000000bfb796b5 (&tb->lock){+.+.}, at: authorized_store+0x7c/0x210

  which lock already depends on the new lock.

  the existing dependency chain (in reverse order) is:

  -> #1 (&tb->lock){+.+.}:
         __mutex_lock+0xac/0x9a0
         tb_domain_add+0x2d/0x130
         nhi_probe+0x1dd/0x330
         pci_device_probe+0xd2/0x150
         really_probe+0xee/0x280
         driver_probe_device+0x50/0xc0
         bus_for_each_drv+0x84/0xd0
         __device_attach+0xe4/0x150
         pci_bus_add_device+0x4e/0x70
         pci_bus_add_devices+0x2e/0x66
         pci_bus_add_devices+0x59/0x66
         pci_bus_add_devices+0x59/0x66
         enable_slot+0x344/0x450
         acpiphp_check_bridge.part.0+0x119/0x150
         acpiphp_hotplug_notify+0xaa/0x140
         acpi_device_hotplug+0xa2/0x3f0
         acpi_hotplug_work_fn+0x1a/0x30
         process_one_work+0x234/0x580
         worker_thread+0x50/0x3b0
         kthread+0x10a/0x140
         ret_from_fork+0x3a/0x50

  -> #0 (pci_rescan_remove_lock){+.+.}:
         __lock_acquire+0xe54/0x1ac0
         lock_acquire+0xb8/0x1b0
         __mutex_lock+0xac/0x9a0
         authorized_store+0xe8/0x210
         kernfs_fop_write+0x125/0x1b0
         vfs_write+0xc2/0x1d0
         ksys_write+0x6c/0xf0
         do_syscall_64+0x50/0x180
         entry_SYSCALL_64_after_hwframe+0x49/0xbe

  other info that might help us debug this:
   Possible unsafe locking scenario:
         CPU0                    CPU1
         ----                    ----
    lock(&tb->lock);
                                 lock(pci_rescan_remove_lock);
                                 lock(&tb->lock);
    lock(pci_rescan_remove_lock);

   *** DEADLOCK ***
  5 locks held by pool-/usr/lib/b/1258:
   #0: 000000003df1a1ad (&f->f_pos_lock){+.+.}, at: __fdget_pos+0x4d/0x60
   #1: 0000000095a40b02 (sb_writers#6){.+.+}, at: vfs_write+0x185/0x1d0
   #2: 0000000017a7d714 (&of->mutex){+.+.}, at: kernfs_fop_write+0xf2/0x1b0
   #3: 000000004f262981 (kn->count#208){.+.+}, at: kernfs_fop_write+0xfa/0x1b0
   #4: 00000000bfb796b5 (&tb->lock){+.+.}, at: authorized_store+0x7c/0x210

  stack backtrace:
  CPU: 0 PID: 1258 Comm: pool-/usr/lib/b Tainted: G                T 5.3.0-rc6+ #1

On an system using ACPI hotplug the host router gets hotplugged first and then
the firmware starts sending notifications about connected devices so the above
scenario should not happen in reality. However, after taking a second
look at commit a03e828 ("thunderbolt: Serialize PCIe tunnel
creation with PCI rescan") that introduced the locking, I don't think it
is actually correct. It may have cured the symptom but probably the real
root cause was somewhere closer to PCI stack and possibly is already
fixed with recent kernels. I also tried to reproduce the original issue
with the commit reverted but could not.

So to keep lockdep happy and the code bit less complex drop calls to
pci_lock_rescan_remove()/pci_unlock_rescan_remove() in
tb_switch_set_authorized() effectively reverting a03e828.

Link: https://lkml.org/lkml/2019/8/30/513
Fixes: a03e828 ("thunderbolt: Serialize PCIe tunnel creation with PCI rescan")
Reported-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
mihailescu2m pushed a commit that referenced this pull request Nov 14, 2019
When the extent tree is modified, it should be protected by inode
cluster lock and ip_alloc_sem.

The extent tree is accessed and modified in the
ocfs2_prepare_inode_for_write, but isn't protected by ip_alloc_sem.

The following is a case.  The function ocfs2_fiemap is accessing the
extent tree, which is modified at the same time.

  kernel BUG at fs/ocfs2/extent_map.c:475!
  invalid opcode: 0000 [#1] SMP
  Modules linked in: tun ocfs2 ocfs2_nodemanager configfs ocfs2_stackglue [...]
  CPU: 16 PID: 14047 Comm: o2info Not tainted 4.1.12-124.23.1.el6uek.x86_64 #2
  Hardware name: Oracle Corporation ORACLE SERVER X7-2L/ASM, MB MECH, X7-2L, BIOS 42040600 10/19/2018
  task: ffff88019487e200 ti: ffff88003daa4000 task.ti: ffff88003daa4000
  RIP: ocfs2_get_clusters_nocache.isra.11+0x390/0x550 [ocfs2]
  Call Trace:
    ocfs2_fiemap+0x1e3/0x430 [ocfs2]
    do_vfs_ioctl+0x155/0x510
    SyS_ioctl+0x81/0xa0
    system_call_fastpath+0x18/0xd8
  Code: 18 48 c7 c6 60 7f 65 a0 31 c0 bb e2 ff ff ff 48 8b 4a 40 48 8b 7a 28 48 c7 c2 78 2d 66 a0 e8 38 4f 05 00 e9 28 fe ff ff 0f 1f 00 <0f> 0b 66 0f 1f 44 00 00 bb 86 ff ff ff e9 13 fe ff ff 66 0f 1f
  RIP  ocfs2_get_clusters_nocache.isra.11+0x390/0x550 [ocfs2]
  ---[ end trace c8aa0c8180e869dc ]---
  Kernel panic - not syncing: Fatal exception
  Kernel Offset: disabled

This issue can be reproduced every week in a production environment.

This issue is related to the usage mode.  If others use ocfs2 in this
mode, the kernel will panic frequently.

[akpm@linux-foundation.org: coding style fixes]
[Fix new warning due to unused function by removing said function - Linus ]
Link: http://lkml.kernel.org/r/1568772175-2906-2-git-send-email-sunny.s.zhang@oracle.com
Signed-off-by: Shuning Zhang <sunny.s.zhang@oracle.com>
Reviewed-by: Junxiao Bi <junxiao.bi@oracle.com>
Reviewed-by: Gang He <ghe@suse.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Joseph Qi <jiangqi903@gmail.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Jun Piao <piaojun@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
mihailescu2m pushed a commit that referenced this pull request Nov 27, 2019
Call napi_disable() twice would cause dead lock. There are three situations
may result in the issue.

1. rtl8152_pre_reset() and set_carrier() are run at the same time.
2. Call rtl8152_set_tunable() after rtl8152_close().
3. Call rtl8152_set_ringparam() after rtl8152_close().

For #1, use the same solution as commit 8481141 ("r8152: Re-order
napi_disable in rtl8152_close"). For #2 and #3, add checking the flag
of IFF_UP and using napi_disable/napi_enable during mutex.

Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants