fix remote bug (microsoft#523)
private designatedGpuIndices!: Set<number>;
private log: Logger;
private localTrailConfig?: TrialConfig;
private localConfig?: LocalConfig;
private isMultiPhase: boolean = false;
private jobStreamMap: Map<string, ts.Stream>;
private maxTrialNumPerGPU: number = 1;
private useActiveGPU: boolean = false;
how about init them in constructor?
fixed, moved to constructor.
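For readers following the thread, here is a minimal, hypothetical sketch of the pattern the review asks for: defaults assigned in the constructor rather than at the declaration site. The interfaces below are stand-ins for the Logger, TrialConfig, LocalConfig and ts.Stream types that localTrainingService.ts already imports; only the field names are taken from the diff, and the real constructor has a different signature.

```typescript
// Stand-in types so the sketch compiles on its own; the real class uses the
// project's own Logger, TrialConfig, LocalConfig and ts.Stream types.
interface Logger { info(msg: string): void; }
interface TrialConfig { command: string; }
interface LocalConfig { gpuIndices?: string; }
interface Stream { end(): void; }

class LocalTrainingServiceSketch {
    private readonly log: Logger;
    private designatedGpuIndices?: Set<number>;
    private localTrailConfig?: TrialConfig;   // field names as in the diff
    private localConfig?: LocalConfig;
    private isMultiPhase: boolean;
    private jobStreamMap: Map<string, Stream>;
    private maxTrialNumPerGPU: number;
    private useActiveGPU: boolean;

    constructor(log: Logger) {
        // Defaults live in one place, which is what the review asks for.
        this.log = log;
        this.isMultiPhase = false;
        this.jobStreamMap = new Map<string, Stream>();
        this.maxTrialNumPerGPU = 1;
        this.useActiveGPU = false;
    }
}
```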
if(trialJobDetail === undefined) {
    throw new Error(`could not get trialJobDetail by id ${trialJobId}`);
}
if (trialJobDetail.rmMeta !== undefined && trialJobDetail.rmMeta.occupiedGpuIndexMap !== undefined && trialJobDetail.gpuIndices !== undefined && trialJobDetail.gpuIndices.length > 0) {
better to split this line into four lines
fixed, split to four lines.
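For illustration, one way the suggested four-line split could look. The guard is copied from the hunk above; the surrounding function and stand-in types are hypothetical.

```typescript
// Stand-in types so the fragment type-checks; the real types come from the
// training service code base.
interface RemoteMachineMeta { occupiedGpuIndexMap?: Map<number, number>; }
interface TrialJobDetailSketch { rmMeta?: RemoteMachineMeta; gpuIndices?: number[]; }

function releaseGpuSlots(trialJobDetail: TrialJobDetailSketch): void {
    // The guard from the hunk above, with one clause per line.
    if (trialJobDetail.rmMeta !== undefined &&
        trialJobDetail.rmMeta.occupiedGpuIndexMap !== undefined &&
        trialJobDetail.gpuIndices !== undefined &&
        trialJobDetail.gpuIndices.length > 0) {
        // ...decrement the per-GPU occupancy counters here...
    }
}
```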
if(rmMeta.occupiedGpuIndexMap !== undefined) {
    let num = rmMeta.occupiedGpuIndexMap.get(gpuInfo.index);
    let maxTrialNumPerGPU: number = rmMeta.maxTrialNumPerGPU? rmMeta.maxTrialNumPerGPU: 1;
    if((num === undefined && (!rmMeta.useActiveGPU && gpuInfo.activeProcessNum === 0 || !rmMeta.useActiveGPU)) || (num !== undefined && num < maxTrialNumPerGPU)) {
same here, better to use multiple lines
suggest 2 lines for this case
fixed, split to 2 lines.
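Likewise, a sketch of the suggested two-line split, wrapping the condition at the top-level `||`. The boolean expression itself is copied from the hunk unchanged; the helper function and stand-in types are hypothetical.

```typescript
// Stand-in types; the real GPUInfo and machine metadata come from the code base.
interface GPUInfoSketch { index: number; activeProcessNum: number; }
interface RmMetaSketch {
    occupiedGpuIndexMap?: Map<number, number>;
    maxTrialNumPerGPU?: number;
    useActiveGPU?: boolean;
}

function canScheduleOnGpu(gpuInfo: GPUInfoSketch, rmMeta: RmMetaSketch): boolean {
    const occupied: Map<number, number> = rmMeta.occupiedGpuIndexMap ?? new Map<number, number>();
    const num: number | undefined = occupied.get(gpuInfo.index);
    const maxTrialNumPerGPU: number = rmMeta.maxTrialNumPerGPU ? rmMeta.maxTrialNumPerGPU : 1;

    // The condition from the hunk, unchanged, wrapped at the top-level "||"
    // so each alternative sits on its own line.
    return (num === undefined && (!rmMeta.useActiveGPU && gpuInfo.activeProcessNum === 0 || !rmMeta.useActiveGPU)) ||
           (num !== undefined && num < maxTrialNumPerGPU);
}
```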
    }
    rmMeta.occupiedGpuIndexMap.set(gpuInfo.index, num + 1);
} else {
    rmMeta.occupiedGpuIndexMap = new Map<number, number>();
also reconsider this logic
fixed, initialize in constructor.
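As an illustration of what "initialize in constructor" buys here: if occupiedGpuIndexMap is created when the machine metadata object is constructed, callers never see it undefined and the else branch above disappears. This is a hypothetical sketch, not the actual patch.

```typescript
// Hypothetical metadata type; the real one lives in the remote machine data model.
class RemoteMachineMetaSketch {
    public occupiedGpuIndexMap: Map<number, number>;

    constructor() {
        // Created once here, so callers never see it undefined.
        this.occupiedGpuIndexMap = new Map<number, number>();
    }
}

function markGpuOccupied(rmMeta: RemoteMachineMetaSketch, gpuIndex: number): void {
    // A single read-increment-write replaces both branches of the original code.
    const num: number = rmMeta.occupiedGpuIndexMap.get(gpuIndex) ?? 0;
    rmMeta.occupiedGpuIndexMap.set(gpuIndex, num + 1);
}
```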
tools/nni_cmd/launcher.py (outdated)
if request_data['local_config']:
    if request_data['local_config'].get('gpuIndices') and isinstance(request_data['local_config'].get('gpuIndices'), int):
        request_data['local_config']['gpuIndices'] = str(request_data['local_config'].get('gpuIndices'))
    if request_data['local_config'].get('maxTrialNumOnEachGPU'):
GPU?
fixed.
tools/nni_cmd/launcher.py (outdated)
if request_data['local_config'].get('maxTrialNumOnEachGPU'):
    request_data['local_config']['maxTrialNumOnEachGPU'] = request_data['local_config'].get('maxTrialNumOnEachGPU')
if request_data['local_config'].get('useActiveGPU'):
    request_data['local_config']['useActiveGPU'] = request_data['local_config'].get('useActiveGPU')
GPU? -> Gpu
fixed.
docs/en_US/ExperimentConfig.md (outdated)
* __useActiveGpu__

  __useActiveGpu__ specifies whether NNI may use a GPU that already has another process running on it. By default, NNI uses a GPU only when no other active process is running on it; if __useActiveGpu__ is set to true, NNI uses the GPU regardless of other processes.
need to state that this option is not applicable for NNI on Windows.
Fixed.
* __useActiveGpu__

  __useActiveGpu__ specifies whether NNI may use a GPU that already has another process running on it. By default, NNI uses a GPU only when no other active process is running on it; if __useActiveGpu__ is set to true, NNI uses the GPU regardless of other processes. This field is not applicable for NNI on Windows.
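To make the documented behaviour concrete, here is a small hypothetical helper that expresses the same rule the scheduling code earlier in this diff implements: a GPU is eligible when it has no other active process (or useActiveGpu is true) and it is running fewer than maxTrialNumOnEachGPU trials. None of these names come from the patch itself.

```typescript
// Hypothetical helper illustrating the documented semantics of useActiveGpu
// and maxTrialNumOnEachGPU; it is not part of the patch.
interface GpuStatusSketch {
    activeProcessNum: number;   // active processes reported on the GPU
    runningTrialNum: number;    // trials NNI currently has on this GPU
}

function gpuIsEligible(gpu: GpuStatusSketch,
                       useActiveGpu: boolean,
                       maxTrialNumOnEachGPU: number): boolean {
    const freeOfOtherProcesses: boolean = gpu.activeProcessNum === 0;
    const hasTrialCapacity: boolean = gpu.runningTrialNum < maxTrialNumOnEachGPU;

    // By default a busy GPU is skipped; useActiveGpu overrides that check,
    // while the per-GPU trial limit still applies.
    return (freeOfOtherProcesses || useActiveGpu) && hasTrialCapacity;
}
```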
Nice job!
I tested it on my local machine and it works successfully.
From a quick look at the code, it seems to work in remote mode as well.
I suggest putting one example config here:

localConfig:
  maxTrialNumPerGpu: 5
good suggestion, I've added a configuration in cifar10-pytorch-example