This page collects frequently asked questions and their answers.
When creating an experiment, nnictl uses the tmp folder as temporary storage for copies of the files under codeDir. If you see an error like the one below, try cleaning up the tmp folder first.
OSError: [Errno 28] No space left on device
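Before cleaning up, it can help to confirm that the temporary folder is actually out of space. The snippet below is a minimal sketch, assuming nnictl uses the system default temporary directory that Python's tempfile module reports. The helper name free_space_mb is illustrative and not part of NNI.

```python
import shutil
import tempfile


def free_space_mb(path):
    """Return the free space, in MiB, on the filesystem that holds `path`."""
    usage = shutil.disk_usage(path)
    return usage.free / (1024 * 1024)


if __name__ == "__main__":
    # Assumption: this is the same tmp folder that nnictl copies codeDir into.
    tmp_dir = tempfile.gettempdir()
    print(f"{tmp_dir}: {free_space_mb(tmp_dir):.0f} MiB free")
```

If the reported value is close to zero, removing stale files from that folder should clear the error.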
In OpenPAI training mode, NNI Manager starts a REST server that listens on port 51189 to receive metrics reported by trials running in the OpenPAI cluster. If you do not see any metrics in the WebUI in OpenPAI mode, check the machine where NNI Manager runs and make sure port 51189 is open in its firewall rules.
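One way to test whether the port is reachable is to attempt a TCP connection to it from another machine, ideally one inside the OpenPAI cluster. The following is a rough sketch rather than an NNI utility: port_is_open is a hypothetical helper, and the host address is a placeholder to replace with the IP of the machine running NNI Manager.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Placeholder address: replace with the IP of the machine running NNI Manager.
    nni_manager_host = "127.0.0.1"
    print(port_is_open(nni_manager_host, 51189))
```

A False result while an experiment is running suggests that a firewall or network rule is blocking the port. A True result only shows that something accepts connections on it.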
make: *** [install-XXX] Segmentation fault (core dumped)
Please try the following solutions in turn:
- Update or reinstall pip for your current Python, for example: python3 -m pip install -U pip
- Install NNI with the --no-cache-dir flag, for example: python3 -m pip install nni --no-cache-dir
Your machine does not have an eth0 device. Please set nniManagerIp manually in your config file.
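To see which network interfaces the machine does have, you can list them with Python's standard socket module. This is only a sketch for diagnosis: interface_names is a hypothetical helper, and it does not choose an IP address for you. The value of nniManagerIp should still be an address of this machine that the trial machines can reach.

```python
import socket


def interface_names():
    """Return the names of the network interfaces known to the operating system."""
    return [name for _index, name in socket.if_nameindex()]


if __name__ == "__main__":
    names = interface_names()
    print("interfaces:", names)
    if "eth0" not in names:
        print("no eth0 device found; set nniManagerIp in the config file")
```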
When an experiment reaches its maximum duration, nniManager stops creating new trials, but existing trials keep running unless the user stops the experiment manually.
If you upgrade NNI or delete some of its config files while an experiment is running, this kind of issue may occur because the config file is lost. You can use ps -ef | grep node to find the pid of your experiment, and then use kill -9 {pid} to kill it manually.
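The same two steps can be sketched in Python for Unix-like systems. The helpers find_pids and kill_pid are illustrative names, not NNI functions, and the parsing assumes the common eight-column ps -ef layout (UID PID PPID C STIME TTY TIME CMD). The script only lists candidate pids, since other node processes may be running on the machine.

```python
import os
import signal
import subprocess


def find_pids(ps_output, keyword):
    """Return PIDs from `ps -ef` output whose command column contains `keyword`."""
    pids = []
    for line in ps_output.splitlines()[1:]:  # skip the header row
        fields = line.split(None, 7)  # UID PID PPID C STIME TTY TIME CMD
        if len(fields) == 8 and keyword in fields[7]:
            pids.append(int(fields[1]))
    return pids


def kill_pid(pid):
    """Force-kill a process, the equivalent of `kill -9 {pid}`."""
    os.kill(pid, signal.SIGKILL)


if __name__ == "__main__":
    result = subprocess.run(["ps", "-ef"], capture_output=True, text=True, check=True)
    # Only list candidates here; inspect them before calling kill_pid on one.
    print(find_pids(result.stdout, "node"))
```

Once you have confirmed which pid belongs to the experiment, call kill_pid with that value.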
Configure the network mode as bridge mode, or another mode that makes the virtual machine's host accessible from external machines, and make sure the virtual machine's port is not blocked by a firewall.
Please search https://github.com/Microsoft/nni/issues to see whether someone has already reported the problem, and create a new issue if none exists.