
Error: failed to get os threadid #490

Closed
baohongmin opened this issue May 6, 2022 · 9 comments

Comments

@baohongmin

py-spy top --native --pid 229875
Error: failed to get os threadid
py-spy 0.3.11

@Jongy
Contributor

Jongy commented May 7, 2022

This means py-spy failed to get the native thread ID. This can happen due to numerous reasons depending on the OS you are using. On which system are you running py-spy?

In any case, the direct trigger for this error is --native - if you remove this flag, this error shouldn't trigger; so you can try without it if you can go without native traces.

@baohongmin
Author

baohongmin commented May 8, 2022

Hi Jongy,
Thanks for your response. My OS information is as below:
Linux icx08 4.18.0-305.12.1.el8_4.x86_64 #1 SMP Wed Aug 11 01:59:55 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
gcc version 8.5.0 20210514 (Red Hat 8.5.0-10) (GCC)
Python 3.6.5
Linux distribution: CentOS Linux 8
libc version: glibc-2.28

I am profiling a process that runs inside a Docker container.
If I remove the --native flag it works fine, but I want to trace the native stack (C/C++ extensions).

@Jongy
Contributor

Jongy commented May 8, 2022

Ah, py-spy doesn't support getting the OS thread ID for dockerized processes. See the _get_os_thread_id implementation for Linux:

    #[cfg(all(target_os="linux", unwind))]
    fn _get_os_thread_id<I: InterpreterState>(&mut self, python_thread_id: u64, interp: &I) -> Result<Option<Tid>, Error> {
....
        // likewise this doesn't yet work for profiling processes running inside docker containers from the host os
        if self.dockerized {
            return Ok(None);
        }

I think that's the issue.
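To illustrate the condition behind that dockerized check, here is a minimal sketch (Linux-only, and not py-spy's actual implementation) of how a tool can tell whether a target PID lives in the same PID namespace as itself, by comparing the /proc/<pid>/ns/pid symlink targets:

```python
import os

def same_pid_namespace(pid: int) -> bool:
    """Return True if `pid` shares our PID namespace (Linux only).

    The /proc/<pid>/ns/pid symlink encodes the namespace identity,
    e.g. 'pid:[4026531836]'; two processes in the same PID namespace
    resolve to the same target.
    """
    return os.readlink(f"/proc/{pid}/ns/pid") == os.readlink("/proc/self/ns/pid")

# A process always shares a PID namespace with itself:
print(same_pid_namespace(os.getpid()))
```

A containerized target seen from the host would normally fail this comparison, which is the situation the py-spy code above bails out of.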

This is actually something we've been tackling but I don't have a solution ready yet.

Meanwhile, I can suggest that you run py-spy inside the container - that is, in the same PID namespace.

For example, if the host PID is 229875, the PID inside the container is 40, and the container is named my_app, then you can instead copy py-spy into the container (use the static musl build): docker cp ./py-spy my_app:/py-spy and then run it (note: --privileged is required): docker exec -it --privileged my_app /py-spy top --native --pid 40. I think that'll work (at the very least, it avoids the OS thread ID issue).

@baohongmin
Author

Thanks Jongy,
Yes, it works well when I run py-spy inside the container.

@Jongy
Contributor

Jongy commented May 9, 2022

Glad it helped :)

@benfred
Owner

benfred commented Oct 3, 2022

Fwiw, with Python 3.11 we can get the OS thread id directly from Python, and will be able to grab it for a dockerized process from the host. We still won't be able to do native profiling from the host into the container, though.
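For context: from inside the process itself, CPython has exposed the kernel thread id since 3.8 via threading.get_native_id(); as I understand it, what 3.11 adds is also storing that id in the interpreter's thread state, where an external profiler such as py-spy can read it from the target's memory. A minimal sketch of the two distinct identifiers (assuming CPython 3.8+):

```python
import threading

# Each Python thread carries two different identifiers:
ident = threading.get_ident()       # Python-level id (a pthread handle on Linux,
                                    # the hex addresses seen in py-spy dumps)
native = threading.get_native_id()  # kernel thread id (what `top -H` shows)

print(f"python ident: {ident:#x}, os tid: {native}")
```

Mapping the first to the second for another process is exactly what _get_os_thread_id has to do, which is why it needs OS-specific (and namespace-sensitive) machinery.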

@rkooo567

I also found the same error. ray-project/ray#30566

But in our case we run py-spy within the Docker container itself, so I am not sure how to debug this issue... any pointers on where to look?

@rkooo567

rkooo567 commented Nov 22, 2022

I found that when I don't specify --native, this is returned:

Thread 0x7FB1278F5740 (active): "MainThread"
    main_loop (ray/_private/worker.py:763)
    <module> (ray/_private/workers/default_worker.py:233)
Thread 860 (idle): "ray_import_thread"
    wait (threading.py:300)
    _wait_once (grpc/_common.py:106)
    wait (grpc/_common.py:148)
    result (grpc/_channel.py:735)
    _poll_locked (ray/_private/gcs_pubsub.py:255)
    poll (ray/_private/gcs_pubsub.py:391)
    _run (ray/_private/import_thread.py:69)
    run (threading.py:870)
    _bootstrap_inner (threading.py:926)
    _bootstrap (threading.py:890)
Thread 864 (idle): "AsyncIO Thread: default"
    run (threading.py:870)
    _bootstrap_inner (threading.py:926)
    _bootstrap (threading.py:890)
Thread 866 (idle): "Thread-2"
    run (threading.py:870)
    _bootstrap_inner (threading.py:926)
    _bootstrap (threading.py:890)
Thread 0x7F9F815EB700 (active)
Thread 39212 (idle): "Thread-19"
    channel_spin (grpc/_channel.py:1258)
    run (threading.py:870)
    _bootstrap_inner (threading.py:926)
    _bootstrap (threading.py:890)

Is this related to the fact that we have a thread 0x7F9F815EB700 that doesn't seem to be a Python thread?
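One way to probe that hypothesis from inside the container (a Linux-only sketch, assuming CPython 3.8+ for Thread.native_id) is to diff the kernel's view of the process's threads against the threads the interpreter knows about:

```python
import os
import threading

# Kernel-level thread ids for every thread in this process (Linux):
all_tids = {int(t) for t in os.listdir("/proc/self/task")}

# Native ids of the threads the Python interpreter knows about:
python_tids = {t.native_id for t in threading.enumerate()}

# Any tid present in the kernel's view but unknown to the interpreter
# belongs to a thread started outside Python (e.g. by a C extension
# such as grpc), which py-spy cannot match to a Python thread:
non_python = all_tids - python_tids
print(non_python)
```

An empty set means every OS thread is interpreter-managed; a non-empty one would be consistent with an extension-created thread like the one shown by raw address above.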

@benfred
Owner

benfred commented Nov 24, 2022

@rkooo567 that looks pretty odd to me - I'm unsure why py-spy managed to figure out the native thread id in some cases but not others. Is there a way I can run this myself to investigate (e.g. a Docker container with a Python script to run)?
