This repository has been archived by the owner on Sep 18, 2024. It is now read-only.

[bug] The error message is empty when creating the kubeflow config fails. #1036

Closed
gaocegege opened this issue Apr 30, 2019 · 4 comments

@gaocegege

gaocegege commented Apr 30, 2019

Short summary about the issue/question:

nnictl prints an empty error message when creating the kubeflow config fails.

Brief what process you are following:

I used a kubeconfig to set up an experiment on the kubeflow training platform. The output is:

ERROR: Failed! Error is: {}

The root cause of the issue is that the response here is empty.

How to reproduce it:

Create a local k8s cluster, and use it to create kubeflow experiments.

nnictl create --config ./examples/trials/mnist/config_kubeflow.yml -p 8081

nni Environment:

  • nni version: master
  • nni mode(local|pai|remote): kubeflow
  • OS: Ubuntu
  • python version: 3.6
  • is conda or virtualenv used?: No
  • is running in docker?: No

need to update document(yes/no):

Anything else we need to know:

The request to http://localhost:8081/api/v1/nni/experiment/cluster-metadata returns status code 500, and I believe the response body is empty.

The request body is

{'kubeflow_config': {'operator': 'tf-operator', 'apiVersion': 'v1alpha2', 'storage': 'nfs', 'nfs': {'server': '10.10.10.10', 'path': '/var/nfs/general'}}}
@gaocegege
Author

Found the reason in manager log:

[4/30/2019, 9:58:32 AM] FATAL [ 'Mount NFS 10.10.10.10:/var/nfs/general to /home/gaocegege/nni/experiments/a2D9yNon/trials-nfs-tmp failed, error is ChildProcessError: Command failed: sudo mount 10.10.10.10:/var/nfs/general /home/gaocegege/nni/experiments/a2D9yNon/trials-nfs-tmp\nmount: /home/gaocegege/nni/experiments/a2D9yNon/trials-nfs-tmp: bad option; for several filesystems (e.g. nfs, cifs) you might need a /sbin/mount.<type> helper program.\n `sudo mount 10.10.10.10:/var/nfs/general /home/gaocegege/nni/experiments/a2D9yNon/trials-nfs-tmp` (exited with error code 32)' ]
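The mount output in this log hints that a /sbin/mount.<type> helper program may be missing. As a minimal sketch (not part of NNI), the check below looks for such a helper before attempting the mount. The function name `has_mount_helper` and the extra search directories are assumptions made for illustration.

```python
import os
import shutil


def has_mount_helper(fstype: str) -> bool:
    """Return True if a mount.<fstype> helper program can be found.

    Hypothetical helper: /sbin and /usr/sbin are searched in addition to
    PATH because they are often absent from a regular user's PATH.
    """
    search_path = os.pathsep.join(["/sbin", "/usr/sbin", os.environ.get("PATH", "")])
    return shutil.which(f"mount.{fstype}", path=search_path) is not None


if __name__ == "__main__":
    if not has_mount_helper("nfs"):
        print("mount.nfs not found; the NFS client tools may not be installed")
```

This only detects a missing helper. It does not tell you whether the NFS server address in the config file is reachable.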

However, none of this shows up in the nnictl output.

@SparkSnail
Contributor

Hi, this error occurs because the NFS server in your config file is not valid. You should change the nfs settings in the config file to point to your own NFS server instead of the 10.10.10.10 address used in the example.

@gaocegege
Author

Yeah, I know. But I think it would be better to show the error in the CLI output. "ERROR: Failed! Error is: {}" is not helpful for us.
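To illustrate the kind of message that would be more useful, here is a minimal sketch of how a CLI could describe a failed REST call even when the response body is empty. It is not nnictl's actual code. The helper name `describe_error_response` and the `error` field in the JSON body are assumptions.

```python
import json


def describe_error_response(status_code: int, body_text: str) -> str:
    """Build a readable message for a failed REST call (hypothetical helper)."""
    text = (body_text or "").strip()
    if text:
        try:
            payload = json.loads(text)
        except ValueError:
            # Not JSON: show the raw body as-is.
            return f"HTTP {status_code}: {text}"
        # Assumption: a JSON object may carry its message in an "error" field.
        if isinstance(payload, dict) and payload.get("error"):
            return f"HTTP {status_code}: {payload['error']}"
        if payload:
            return f"HTTP {status_code}: {json.dumps(payload)}"
    # Empty body or empty JSON value: fall back to the status code and a pointer.
    return (
        f"HTTP {status_code} with an empty response body; "
        "check the NNI manager log for details"
    )


if __name__ == "__main__":
    print("ERROR: Failed! " + describe_error_response(500, ""))
```

With a fallback like this, the user at least sees the status code and a pointer to the manager log instead of a bare {}.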

@SparkSnail
Contributor

I see, reasonable advice. I will check the code and fix this issue, thanks.
