AttributeError: 'ParallelEnv' object has no attribute '_device_id' #46377

Closed
LucasMartinuzzo opened this issue Sep 21, 2022 · 13 comments
Labels: status/close (closed), type/build (build/installation issue)

Comments

@LucasMartinuzzo

Describe the Bug

I'm trying to train the recognition model on custom data, but I'm getting an error. This is how I call it:
CUDA_VISIBLE_DEVICES=0 python3 -m paddle.distributed.launch --log_dir=./debug/ --selected_gpus='0' tools/train.py -c configs/rec/prices/rec_price_1.yml -o Global.pretrained_model=en_PP-OCRv3_rec_train/best_accuracy

(I also tried with more than one GPU)

I'm getting the error:
Traceback (most recent call last):
File "/mnt/batch/tasks/shared/LS_root/mounts/clusters/computerlucasgpu-nc24/code/Users/lucas.martinuzzobatista-ext/OCR/PaddleOCR/tools/train.py", line 199, in
config, device, logger, vdl_writer = program.preprocess(is_train=True)
File "/mnt/batch/tasks/shared/LS_root/mounts/clusters/computerlucasgpu-nc24/code/Users/lucas.martinuzzobatista-ext/OCR/PaddleOCR/tools/program.py", line 652, in preprocess
device = 'gpu:{}'.format(dist.ParallelEnv()
File "/anaconda/envs/ocrTests/lib/python3.9/site-packages/paddle/fluid/dygraph/parallel.py", line 200, in device_id
return self._device_id
AttributeError: 'ParallelEnv' object has no attribute '_device_id'

I also tried the snippet given in the documentation:

.. code-block:: python
            # execute this command in terminal: export FLAGS_selected_gpus=1
            import paddle.distributed as dist
            env = dist.ParallelEnv()
            print("The device id are %d" % env.device_id)
            # The device id are 1

And it gives the same error:
Traceback (most recent call last):
File "", line 1, in
File "/anaconda/envs/ocrTests/lib/python3.9/site-packages/paddle/fluid/dygraph/parallel.py", line 200, in device_id
return self._device_id
AttributeError: 'ParallelEnv' object has no attribute '_device_id'

Additional Supplementary Information

No response

@paddle-bot

paddle-bot bot commented Sep 21, 2022

Hi! We've received your issue; please be patient while waiting for a response. We will arrange for technicians to answer your questions as soon as possible. Please check again that you have provided a clear problem description, reproduction code, environment and version details, and error messages. You may also look for answers in the official API documentation, the FAQ, past GitHub issues, and the AI community. Have a nice day!

@haohongxiang
Contributor

@LucasMartinuzzo Hi, I've received your issue. Could you provide more information about the versions of PaddlePaddle and PaddleOCR you are using?

@haohongxiang
Contributor

haohongxiang commented Sep 22, 2022

@LucasMartinuzzo When you tried the given demo, did you make sure that the installed PaddlePaddle was built with GPU support? Please check with the following code:

import paddle
paddle.utils.run_check()

If it prints results like:

PaddlePaddle works well on 8 GPUs.
PaddlePaddle is installed successfully! Let's start deep learning with PaddlePaddle now.

then you have installed the right version of paddlepaddle-gpu.
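
The same check can also be made programmatically. The following is a minimal sketch, assuming only `paddle.__version__` and `paddle.is_compiled_with_cuda()`; the helper name `describe_build` is hypothetical and not part of PaddlePaddle. It falls back gracefully when PaddlePaddle is not installed at all.

```python
def describe_build():
    """Return a short report on the installed PaddlePaddle build.

    `describe_build` is a hypothetical helper written for this example.
    """
    try:
        import paddle
    except ImportError:
        # PaddlePaddle is not installed in this environment.
        return {"installed": False, "version": None, "cuda_build": False}
    return {
        "installed": True,
        "version": paddle.__version__,
        # True only when the installed wheel was compiled with CUDA support.
        "cuda_build": bool(paddle.is_compiled_with_cuda()),
    }


report = describe_build()
if not report["installed"]:
    print("PaddlePaddle is not installed in this environment.")
elif report["cuda_build"]:
    print("GPU build detected, version", report["version"])
else:
    print("CPU-only build detected, version", report["version"])
```

If this reports a CPU-only build, installing paddlepaddle-gpu is likely the next step to try.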

@LucasMartinuzzo
Author

Hi, I ran the command and the output is:

Running verify PaddlePaddle program ... 
PaddlePaddle works well on 1 CPU.
W0922 12:29:10.119011 69143 fuse_all_reduce_op_pass.cc:76] Find all_reduce operators: 2. To make the speed faster, some all_reduce ops are fused during training, after fusion, the number of all_reduce ops is 2.
PaddlePaddle works well on 2 CPUs.
PaddlePaddle is installed successfully! Let's start deep learning with PaddlePaddle now.

I didn't know about paddlepaddle-gpu, so I installed it just now and the output is:

Running verify PaddlePaddle program ... 
W0922 12:52:31.481676 70122 gpu_resources.cc:61] Please NOTE: device: 0, GPU Compute Capability: 3.7, Driver API Version: 11.4, Runtime API Version: 10.2
W0922 12:52:32.592422 70122 gpu_resources.cc:91] device: 0, cuDNN Version: 8.2.
PaddlePaddle works well on 1 GPU.
W0922 12:52:49.524565 70122 parallel_executor.cc:642] Cannot enable P2P access from 0 to 1
W0922 12:52:49.524595 70122 parallel_executor.cc:642] Cannot enable P2P access from 0 to 2
W0922 12:52:49.524602 70122 parallel_executor.cc:642] Cannot enable P2P access from 0 to 3
W0922 12:52:49.524611 70122 parallel_executor.cc:642] Cannot enable P2P access from 1 to 0
W0922 12:52:49.524619 70122 parallel_executor.cc:642] Cannot enable P2P access from 1 to 2
W0922 12:52:49.524623 70122 parallel_executor.cc:642] Cannot enable P2P access from 1 to 3
W0922 12:52:49.524628 70122 parallel_executor.cc:642] Cannot enable P2P access from 2 to 0
W0922 12:52:49.524636 70122 parallel_executor.cc:642] Cannot enable P2P access from 2 to 1
W0922 12:52:49.524642 70122 parallel_executor.cc:642] Cannot enable P2P access from 2 to 3
W0922 12:52:49.524649 70122 parallel_executor.cc:642] Cannot enable P2P access from 3 to 0
W0922 12:52:49.524657 70122 parallel_executor.cc:642] Cannot enable P2P access from 3 to 1
W0922 12:52:49.524660 70122 parallel_executor.cc:642] Cannot enable P2P access from 3 to 2
W0922 12:52:53.712558 70122 fuse_all_reduce_op_pass.cc:76] Find all_reduce operators: 2. To make the speed faster, some all_reduce ops are fused during training, after fusion, the number of all_reduce ops is 2.
PaddlePaddle works well on 4 GPUs.
PaddlePaddle is installed successfully! Let's start deep learning with PaddlePaddle now.

I executed the program again and it stopped giving this error. Thank you.

paddle-bot added the status/close (closed) and type/build (build/installation issue) labels and removed the status/new-issue (new) and type/bug-report (bug report) labels on Sep 22, 2022

@moonriseny

Hi Lucas,

I have the same issue, did you resolve it?

Sheng

@LucasMartinuzzo
Author

Yes, installing paddlepaddle-gpu worked for me.

@moonriseny

Hi Lucas,
Thanks for the info. I did install paddlepaddle-gpu, but still got the same error. After digging a little more, I found that I needed to specify both the package version and the CUDA version for my system. Here is my command:
conda install paddlepaddle-gpu==2.3.2 cudatoolkit=11.6 -c https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/Paddle/ -c conda-forge
Now everything works.
Thanks.
Sheng

@wns-purushottamsah

I uninstalled paddlepaddle-gpu==2.4 and installed paddlepaddle-gpu==2.3.2.

@np-n

np-n commented Jan 23, 2023

How can I fix it with paddlepaddle-cpu? I don't have a GPU on my machine.

@np-n

np-n commented Jan 24, 2023

To use the CPU version of paddlepaddle, you need to set Global.use_gpu=False in your config file.
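
A rough sketch of why this setting helps, assuming (based on the truncated `'gpu:{}'.format(dist.ParallelEnv()` line in the traceback) that PaddleOCR only queries the device id when `Global.use_gpu` is true. The names `select_device` and `failing_lookup` are hypothetical stand-ins written for this illustration, not PaddleOCR or PaddlePaddle functions.

```python
def select_device(use_gpu, get_device_id):
    """Build a device string; get_device_id is only called when use_gpu is true."""
    if use_gpu:
        return "gpu:{}".format(get_device_id())
    return "cpu"


def failing_lookup():
    # Hypothetical stand-in for the device id lookup that fails on a CPU-only install.
    raise AttributeError("'ParallelEnv' object has no attribute '_device_id'")


# With use_gpu disabled, the failing lookup is never reached.
print(select_device(False, failing_lookup))  # prints: cpu
```

On the command line, the same setting can presumably be passed through the `-o` override used in the original command, for example `-o Global.use_gpu=False`.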

@shrutimary15

Installing paddlepaddle-gpu==2.3.2 worked for me

@tschijvenaars

Hi, I have the same issue. I ran the command: conda install paddlepaddle-gpu==2.3.2 cudatoolkit=11.6 -c https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/Paddle/ -c conda-forge but that did not help.
When running paddle.utils.run_check() I get the following:

Running verify PaddlePaddle program ...
PaddlePaddle works well on 1 CPU.
PaddlePaddle is installed successfully! Let's start deep learning with PaddlePaddle now.

I've also installed CUDA Toolkit 11.6 from NVIDIA...
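
One thing that may be worth ruling out in a case like this (an assumption, not something confirmed in this thread) is that a CPU-only paddlepaddle package is still installed next to paddlepaddle-gpu in the same environment. A minimal sketch using the standard-library `importlib.metadata` (Python 3.8+) to list which of the two distributions are present; `installed_paddle_packages` is a hypothetical helper name.

```python
from importlib import metadata


def installed_paddle_packages():
    """Return {distribution name: version} for the Paddle wheels found.

    Hypothetical helper; it only inspects installed package metadata
    and does not import paddle itself.
    """
    found = {}
    for name in ("paddlepaddle", "paddlepaddle-gpu"):
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # This distribution is not installed in the current environment.
            pass
    return found


packages = installed_paddle_packages()
print(packages)
if len(packages) == 2:
    print("Both CPU and GPU distributions are installed; they may conflict.")
```

If both names show up, uninstalling both and reinstalling only the GPU package is one option to try.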

@MathewsJosh

For me, I had to try installing the appropriate "paddlepaddle-gpu" package from the link: PaddlePaddle

So I tried running the command below (but got a network error):
python -m pip install paddlepaddle-gpu==2.6.0.post120 -f https://www.paddlepaddle.org.cn/whl/windows/mkl/avx/stable.html

So I had to download the wheels I needed and run the command: pip install wheel_name.whl.

This error seems to be related to No module named 'paddle.io'
