accelerate "could not configure accelerate" #1127

Closed
dicksensei69 opened this issue Jul 6, 2023 · 6 comments

dicksensei69 commented Jul 6, 2023

I'm on Linux Mint 21.1 with everything required installed. I get this error when I run setup.sh:

WARNING  Could not automatically configure accelerate. Please manually configure accelerate with the option in the menu or with: accelerate config.

(screenshot attached: Screenshot_2023-07-06_13-18-54)

I have looked into the script, and it appears none of the conditions in its if statements are met, so it falls through to the "could not auto configure" branch. I have also looked through the venv for where I should place the yaml from config_files, but I didn't see anywhere that looked right. Can someone point me to where I need to look?

I'm sure that this has been seen a few times. I just need a little more direction on what to do to fix it.

Edit: I went into the environment with source venv/bin/activate and then ran accelerate config myself. I gave it the values from the yaml file and it wrote its config to my /home/dick/.cache/huggingface/accelerate/ folder. If anyone else runs into this, this should fix it. I'll have it running a little later today, I think. Maybe there is an issue because of Linux Mint rather than Ubuntu or whatever. I know earlier, when I tried to install (a month ago), it was having an issue because it was finding the release name Vera, which I think is the Linux Mint name for its Ubuntu 22.04 based release. I don't know if that is what is happening right now, but it's another thing to note.
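
In case it helps, the manual workaround described above boils down to something like this, run from the kohya_ss folder (adjust paths to your install):

source venv/bin/activate
accelerate config

The answers get written to ~/.cache/huggingface/accelerate/default_config.yaml for the user who ran the command.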


RKelln commented Jul 12, 2023

Had the same issue on Ubuntu 22.04 today, and @dicksensei69's manual accelerate config workaround worked for me too.


IMarooz commented Aug 15, 2023

cd ~/kohya_ss/venv/bin

then write

./accelerate config

and it will ask you a few questions to set up your config.

@mrmeseeks23

Do you mean to say we should activate the venv first, or just be in the bin subdirectory when we run ./accelerate config? I am running Ubuntu 22.04 on RunPod. I also created a username, but I am not entirely sure this is necessary vs. using root? Usually I just type 'accelerate config' without the './'.

@harumeow88

./accelerate config

I tried it, but it didn't work:

ruanliyu@ruanliyu ~ % cd ~/kohya_ss/venv/bin
ruanliyu@ruanliyu bin % ./accelerate config
Traceback (most recent call last):
  File "/Users/ruanliyu/kohya_ss/venv/lib/python3.10/site-packages/tensorboard/compat/__init__.py", line 42, in tf
    from tensorboard.compat import notf  # noqa: F401
ImportError: cannot import name 'notf' from 'tensorboard.compat' (/Users/ruanliyu/kohya_ss/venv/lib/python3.10/site-packages/tensorboard/compat/__init__.py)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/ruanliyu/kohya_ss/venv/bin/./accelerate", line 5, in <module>
    from accelerate.commands.accelerate_cli import main
  File "/Users/ruanliyu/kohya_ss/venv/lib/python3.10/site-packages/accelerate/__init__.py", line 3, in <module>
    from .accelerator import Accelerator
  File "/Users/ruanliyu/kohya_ss/venv/lib/python3.10/site-packages/accelerate/accelerator.py", line 39, in <module>
    from .tracking import LOGGER_TYPE_TO_CLASS, GeneralTracker, filter_trackers
  File "/Users/ruanliyu/kohya_ss/venv/lib/python3.10/site-packages/accelerate/tracking.py", line 42, in <module>
    from torch.utils import tensorboard
  File "/Users/ruanliyu/kohya_ss/venv/lib/python3.10/site-packages/torch/utils/tensorboard/__init__.py", line 12, in <module>
    from .writer import FileWriter, SummaryWriter  # noqa: F401
  File "/Users/ruanliyu/kohya_ss/venv/lib/python3.10/site-packages/torch/utils/tensorboard/writer.py", line 16, in <module>
    from ._embedding import (
  File "/Users/ruanliyu/kohya_ss/venv/lib/python3.10/site-packages/torch/utils/tensorboard/_embedding.py", line 9, in <module>
    _HAS_GFILE_JOIN = hasattr(tf.io.gfile, "join")
  File "/Users/ruanliyu/kohya_ss/venv/lib/python3.10/site-packages/tensorboard/lazy.py", line 65, in __getattr__
    return getattr(load_once(self), attr_name)
  File "/Users/ruanliyu/kohya_ss/venv/lib/python3.10/site-packages/tensorboard/lazy.py", line 97, in wrapper
    cache[arg] = f(arg)
  File "/Users/ruanliyu/kohya_ss/venv/lib/python3.10/site-packages/tensorboard/lazy.py", line 50, in load_once
    module = load_fn()
  File "/Users/ruanliyu/kohya_ss/venv/lib/python3.10/site-packages/tensorboard/compat/__init__.py", line 45, in tf
    import tensorflow
  File "/Users/ruanliyu/kohya_ss/venv/lib/python3.10/site-packages/tensorflow/__init__.py", line 441, in <module>
    _ll.load_library(_plugin_dir)
  File "/Users/ruanliyu/kohya_ss/venv/lib/python3.10/site-packages/tensorflow/python/framework/load_library.py", line 151, in load_library
    py_tf.TF_LoadLibrary(lib)
tensorflow.python.framework.errors_impl.NotFoundError: dlopen(/Users/ruanliyu/kohya_ss/venv/lib/python3.10/site-packages/tensorflow-plugins/libmetal_plugin.dylib, 0x0006): Symbol not found: _OBJC_CLASS_$_MPSGraphRandomOpDescriptor
  Referenced from: /Users/ruanliyu/kohya_ss/venv/lib/python3.10/site-packages/tensorflow-plugins/libmetal_plugin.dylib
  Expected in: /System/Library/Frameworks/MetalPerformanceShadersGraph.framework/Versions/A/MetalPerformanceShadersGraph


kyeno commented Dec 10, 2023

I had exactly the same failure on Debian 10 with Python 3.10.13.

After this failure, here is what you should do:

source venv/bin/activate
cd venv
./bin/accelerate config

And then answer accordingly:

In which compute environment are you running?
This machine
Which type of machine are you using?
No distributed training
Do you want to run your training on CPU only (even if a GPU / Apple Silicon / Ascend NPU device is available)? [yes/NO]: NO
Do you wish to optimize your script with torch dynamo? [yes/NO]:
Do you want to use DeepSpeed? [yes/NO]:
What GPU(s) (by id) should be used for training on this machine as a comma-seperated list? [all]: all
Do you wish to use FP16 or BF16 (mixed precision)?
bf16
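
Those answers should leave a default_config.yaml under ~/.cache/huggingface/accelerate/ that looks roughly like this (the exact set of keys varies a bit between accelerate versions):

compute_environment: LOCAL_MACHINE
distributed_type: 'NO'
downcast_bf16: 'no'
gpu_ids: all
machine_rank: 0
main_training_function: main
mixed_precision: bf16
num_machines: 1
num_processes: 1
use_cpu: false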

NOTE: bf16 is for RTX 30XX series and higher. For anything lower you should use FP16.

Hope this helps!

@Eugeniusz-Gienek

If this issue happens with Python 3.11, here is my solution: huggingface/pytorch-image-models#1530 (comment)

Copy of the message: So, a summary of how to apply this change (step by step), using Kohya as the example. Specifically for those who are unfamiliar with Python :)
  1. Navigate to the Kohya folder (for example, /opt/kohya):
     cd /opt/kohya
  2. Navigate to the timm models inside the virtual env:
     cd venv/lib/python3.11/site-packages/timm/models
  3. Edit the maxxvit.py file (I prefer the nano editor):
     nano maxxvit.py
  4. Press Ctrl+W and type "dataclasses" (without quotes).
  5. Append ", field" to the end of the found line. It has to look like this afterwards:
     from dataclasses import dataclass, replace, field
  6. Press Ctrl+W again and type "class MaxxVitCfg" (without quotes).
  7. Comment out the lines starting with "conv_cfg" and "transformer_cfg" by adding a hash symbol at the beginning.
  8. Add the following two lines after them:
     conv_cfg: MaxxVitConvCfg = field(default_factory=MaxxVitConvCfg) # <--- we need field here
     transformer_cfg: MaxxVitTransformerCfg = field(default_factory=MaxxVitTransformerCfg) # <--- and here
  9. Note that there are 4 spaces at the beginning of each of these lines. It is important!
  10. Press Ctrl+O and then Ctrl+X to save the changes and exit. Done!
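
For context on why that edit is needed: Python 3.11 dataclasses refuse a mutable (unhashable) instance as a plain field default, which is what the original MaxxVitCfg does with its nested config dataclasses; wrapping the default in field(default_factory=...) avoids that. A minimal, self-contained sketch with stand-in names (not the real timm classes):

from dataclasses import dataclass, field

@dataclass
class ConvCfg:                      # stand-in for timm's MaxxVitConvCfg
    kernel_size: int = 3

@dataclass
class ModelCfg:                     # stand-in for MaxxVitCfg
    # Writing this as `conv_cfg: ConvCfg = ConvCfg()` fails on Python 3.11 with
    # "ValueError: mutable default ... for field conv_cfg is not allowed: use default_factory"
    conv_cfg: ConvCfg = field(default_factory=ConvCfg)

print(ModelCfg())                   # ModelCfg(conv_cfg=ConvCfg(kernel_size=3))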
