This repository has been archived by the owner on Sep 18, 2024. It is now read-only.
Contributor
SparkSnail
commented
Dec 3, 2018
- Move the nnictl config folder.
- Delete kubernetesServer from nnictl.
- Refactor the AKS document.
- Add a warning message when a relative path is expanded.
- Update the experiment status when the experiment crashes.
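The relative-path warning could look roughly like the sketch below. This is an illustration only, not the code in this PR: the helper name `expand_relative_path`, the message wording, and the choice to resolve paths against a root directory (for example, the directory holding the experiment config file) are all assumptions.

```python
import os
import warnings


def expand_relative_path(root_dir, config, key):
    """Expand config[key] to an absolute path, warning when it was relative.

    Hypothetical helper for illustration; not nnictl's actual implementation.
    """
    value = config.get(key)
    if value and not os.path.isabs(value):
        # Resolve the relative value against the assumed root directory.
        absolute = os.path.normpath(os.path.join(root_dir, value))
        # Tell the user that the path they wrote was rewritten.
        warnings.warn("expand %s: %s to %s" % (key, value, absolute))
        config[key] = absolute
    return config


if __name__ == "__main__":
    root = os.path.abspath("experiment")
    print(expand_relative_path(root, {"codeDir": "trial"}, "codeDir"))
```

Absolute paths pass through unchanged and produce no warning, so only users who wrote a relative path see the message.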
SparkSnail
changed the title
Move nnictl config folder
Move nnictl config folder && remove kubernetsServer
Dec 4, 2018
SparkSnail
changed the title
Move nnictl config folder && remove kubernetsServer
Update document
Dec 4, 2018
yds05
reviewed
Dec 5, 2018
chicm-ms
reviewed
Dec 5, 2018
yds05
approved these changes
Dec 5, 2018
chicm-ms
approved these changes
Dec 5, 2018
yds05
pushed a commit
that referenced
this pull request
Dec 5, 2018
- Move the nnictl config folder.
- Delete kubernetesServer from nnictl.
- Refactor the AKS document.
- Add a warning message when a relative path is expanded.
- Update the experiment status when the experiment crashes.
QuanluZhang
pushed a commit
that referenced
this pull request
Jan 30, 2019
* Lijiao (#1) * Set up CI with Azure Pipelines * Add idompotent support for get_parameters() in nni sdk (#216) * Updated based on comments * Fix bug, make get_parameters() idompotent * Add idompotent support for get_parameters() in LocalTrainingService * Add ip address cached to resolve network issue (#220) * Add ip address cached to resolve network issue * Fix bug of trial hypermeters (#222) * Format trial duration, rename button name (#209) * Refactor nnictl to support list multiple experiment (#207) 1.fix some bugs 2.support nnictl stop id, and add some regulars * Fix bug when trail duration is 0 (#223) * Fix broken issue comes from OpenPAI API upgrade (#227) Fix paiTrainingService broken issue comes from OpenPAI API upgrade * add document for nnictl (#230) * fix nnictl bug * fix install.sh * add desc for Dockerfile.build.base * update document for Dockerfile * update * refactor port detect * update * refactor NNICTLDOC.md * add document for pai and nnictl * add default value for port * Fix bug: some trial jobs hyper params not stored and error handling updates (#225) * Pull latest code (#2) * webui logpath and document (#135) * Add webui document and logpath as a href * fix tslint * fix comments by Chengmin * Pai training service bug fix and enhancement (#136) * Add NNI installation scripts * Update pai script, update NNI_out_dir * Update NNI dir in nni sdk local.py * Create .nni folder in nni sdk local.py * Add check before creating .nni folder * Fix typo for PAI_INSTALL_NNI_SHELL_FORMAT * Improve annotation (#138) * Improve annotation * Minor bugfix * Selectively install through pip (#139) Selectively install through pip * update setup.py * fix paiTrainingService bugs (#137) * fix nnictl bug * add hdfs host validation * fix bugs * fix dockerfile * fix install.sh * update install.sh * fix dockerfile * Set timeout for HDFSUtility exists function * remove unused TODO * fix sdk * add optional for outputDir and dataDir * refactor dockerfile.base * Remove unused 
import in hdfsclientUtility * Add documentation for NNI PAI mode experiment (#141) * Add documentation for NNI PAI mode * Fix typo based on PR comments * Exit with subprocess return code of trial keeper * Remove additional exit code * Fix typo based on PR comments * update doc for smac tuner (#140) * Revert "Selectively install through pip (#139)" due to potential pip install issue (#142) * Revert "Selectively install through pip (#139)" This reverts commit 1d174836d3146a0363e9c9c88094bf9cff865faa. * Add exit code of subprocess for trial_keeper * Update README, add link to PAImode doc * fix bug (#147) * Refactor nnictl and add config_pai.yml (#144) * fix nnictl bug * add hdfs host validation * fix bugs * fix dockerfile * fix install.sh * update install.sh * fix dockerfile * Set timeout for HDFSUtility exists function * remove unused TODO * fix sdk * add optional for outputDir and dataDir * refactor dockerfile.base * Remove unused import in hdfsclientUtility * add config_pai.yml * refactor nnictl create logic and add colorful print * fix nnictl stop logic * add annotation for config_pai.yml * add document for start experiment * fix config.yml * fix document * Fix trial keeper wrongly exit issue (#152) * Fix trial keeper bug, use actual exitcode to exit rather than 1 * Fix bug of table sort (#145) * Update doc for PAIMode and v0.2 release notes (#153) * Update v0.2 documentation regards to release note and PAI training service * Update document to describe NNI docker image * Bug fix for SQuAD example tuner. 
(#134) * Update Makefile (#151) * test * update setup.py * update Makefile and install.sh * rever setup.py * change color * update doc * update doc * fix auto-completion's extra space * update Makefile * update webui * Update doc image (#163) * update doc * trivial * trivial * trivial * trivial * trivial * trivial * update image * update image size * Update ga squad (#104) * update readme in ga_squad * update readme * fix typo * Update README.md * Update README.md * Update README.md * update readme * sklearn examples (#169) * fix nnictl bug * fix install.sh * add sklearn-regression example * add sklearn classification * update sklearn * update example * remove additional code * Update batch tuner (#158) * update readme in ga_squad * update readme * fix typo * Update README.md * Update README.md * Update README.md * update readme * update batch tuner * Quickly fix cascading search space bug in tuner (#156) * update readme in ga_squad * update readme * fix typo * Update README.md * Update README.md * Update README.md * update readme * quickly fix cascading searchspace bug in tuner * Add iterative search space example (#119) * update readme in ga_squad * update readme * fix typo * Update README.md * Update README.md * Update README.md * update readme * add iterative search space example * update * update readme * change name * Fix bug: some trial jobs hyper params not stored when fast finished * updates * updates * updates * updates * Add exception handler in trial_keeper (#235) * add exception handling in trial_keeper.py * Merge 0.2 into master (#237) * fix antd (#159) * quick fix config_pai.yml in examples (#171) * fix nnictl bug * add hdfs host validation * fix bugs * fix dockerfile * fix install.sh * update install.sh * fix dockerfile * Set timeout for HDFSUtility exists function * remove unused TODO * fix sdk * add optional for outputDir and dataDir * refactor dockerfile.base * Remove unused import in hdfsclientUtility * add config_pai.yml * refactor nnictl 
create logic and add colorful print * fix nnictl stop logic * add annotation for config_pai.yml * add document for start experiment * fix config.yml * fix document * fix dataDir and outputDir in config_pai.yml * fix config_pai.yml * Update slidebar icon (#173) * quick fix bug: assessor validation in nnictl (#200) * fix nnictl bug * add hdfs host validation * fix bugs * fix dockerfile * fix install.sh * update install.sh * fix dockerfile * Set timeout for HDFSUtility exists function * remove unused TODO * fix sdk * add optional for outputDir and dataDir * refactor dockerfile.base * Remove unused import in hdfsclientUtility * add config_pai.yml * refactor nnictl create logic and add colorful print * fix nnictl stop logic * add annotation for config_pai.yml * add document for start experiment * fix config.yml * fix document * fix dataDir and outputDir in config_pai.yml * fix config_pai.yml * fix assessor launcher * Disable the tensorboard button about pai experiment (#192) * Remove the gap in search space value array (#226) * Remove the gap in search space value array * Update duration * Fix comments of Chengmin * Quick fix bug: nnictl port value error (#245) * fix port bug * Dev exp stop more (#221) * Exp stop refactor (#161) * Update RemoteMachineMode.md (#63) * Remove unused classes for SQuAD QA example. * Remove more unused functions for SQuAD QA example. * Fix default dataset config. * Add Makefile README (#64) * update document (#92) * Edit readme.md * updated a word * Update GetStarted.md * Update GetStarted.md * refact readme, getstarted and write your trial md. 
* Update README.md * Update WriteYourTrial.md * Update WriteYourTrial.md * Update WriteYourTrial.md * Update WriteYourTrial.md * Fix nnictl bugs and add new feature (#75) * fix nnictl bug * fix nnictl create bug * add experiment status logic * add more information for nnictl * fix Evolution Tuner bug * refactor code * fix code in updater.py * fix nnictl --help * fix classArgs bug * update check response.status_code logic * remove Buffer warning (#100) * update readme in ga_squad * update readme * fix typo * Update README.md * Update README.md * Update README.md * Add support for debugging mode * fix setup.py (#115) * Add DAG model configuration format for SQuAD example. * Explain config format for SQuAD QA model. * Add more detailed introduction about the evolution algorithm. * Fix install.sh add add trial log path (#109) * fix nnictl bug * fix nnictl create bug * add experiment status logic * add more information for nnictl * fix Evolution Tuner bug * refactor code * fix code in updater.py * fix nnictl --help * fix classArgs bug * update check response.status_code logic * show trial log path * update document * fix install.sh * set default vallue for maxTrialNum and maxExecDuration * fix nnictl * Dev smac (#116) * support package install (#91) * fix nnictl bug * support package install * update * update package install logic * Fix package install issue (#95) * fix nnictl bug * fix pakcage install * support SMAC as a tuner on nni (#81) * update doc * update doc * update doc * update hyperopt installation * update doc * update doc * update description in setup.py * update setup.py * modify encoding * encoding * add encoding * remove pymc3 * update doc * update builtin tuner spec * support smac in sdk, fix logging issue * support smac tuner * add optimize_mode * update config in nnictl * add __init__.py * update smac * update import path * update setup.py: remove entry_point * update rest server validation * fix bug in nnictl launcher * support classArgs: 
optimize_mode * quick fix bug * test travis * add dependency * add dependency * add dependency * add dependency * create smac python package * fix trivial points * optimize import of tuners, modify nnictl accordingly * fix bug: incorrect algorithm_name * trivial refactor * for debug * support virtual * update doc of SMAC * update smac requirements * update requirements * change debug mode * update doc * update doc * refactor based on comments * fix comments * modify example config path to relative path and increase maxTrialNum (#94) * modify example config path to relative path and increase maxTrialNum * add document * support conda (#90) (#110) * support install from venv and travis CI * support install from venv and travis CI * support install from venv and travis CI * support conda * support conda * modify example config path to relative path and increase maxTrialNum * undo messy commit * undo messy commit * Support pip install as root (#77) * Typo on #58 (#122) * PAI Training Service implementation (#128) * PAI Training service implementation **1. Implement PAITrainingService **2. Add trial-keeper python module, and modify setup.py to install the module **3. Add PAItrainingService rest server to collect metrics from PAI container. 
* fix datastore for multiple final result (#129) * Update NNI v0.2 release notes (#132) Update NNI v0.2 release notes * Update setup.py Makefile and documents (#130) * update makefile and setup.py * update makefile and setup.py * update document * update document * Update Makefile no travis * update doc * update doc * fix convert from ss to pcs (#133) * Fix bugs about webui (#131) * Fix webui bugs * Fix tslint * webui logpath and document (#135) * Add webui document and logpath as a href * fix tslint * fix comments by Chengmin * Pai training service bug fix and enhancement (#136) * Add NNI installation scripts * Update pai script, update NNI_out_dir * Update NNI dir in nni sdk local.py * Create .nni folder in nni sdk local.py * Add check before creating .nni folder * Fix typo for PAI_INSTALL_NNI_SHELL_FORMAT * Improve annotation (#138) * Improve annotation * Minor bugfix * Selectively install through pip (#139) Selectively install through pip * update setup.py * fix paiTrainingService bugs (#137) * fix nnictl bug * add hdfs host validation * fix bugs * fix dockerfile * fix install.sh * update install.sh * fix dockerfile * Set timeout for HDFSUtility exists function * remove unused TODO * fix sdk * add optional for outputDir and dataDir * refactor dockerfile.base * Remove unused import in hdfsclientUtility * Add documentation for NNI PAI mode experiment (#141) * Add documentation for NNI PAI mode * Fix typo based on PR comments * Exit with subprocess return code of trial keeper * Remove additional exit code * Fix typo based on PR comments * update doc for smac tuner (#140) * Revert "Selectively install through pip (#139)" due to potential pip install issue (#142) * Revert "Selectively install through pip (#139)" This reverts commit 1d174836d3146a0363e9c9c88094bf9cff865faa. 
* Add exit code of subprocess for trial_keeper * Update README, add link to PAImode doc * Merge branch V0.2 to Master (#143) * webui logpath and document (#135) * Add webui document and logpath as a href * fix tslint * fix comments by Chengmin * Pai training service bug fix and enhancement (#136) * Add NNI installation scripts * Update pai script, update NNI_out_dir * Update NNI dir in nni sdk local.py * Create .nni folder in nni sdk local.py * Add check before creating .nni folder * Fix typo for PAI_INSTALL_NNI_SHELL_FORMAT * Improve annotation (#138) * Improve annotation * Minor bugfix * Selectively install through pip (#139) Selectively install through pip * update setup.py * fix paiTrainingService bugs (#137) * fix nnictl bug * add hdfs host validation * fix bugs * fix dockerfile * fix install.sh * update install.sh * fix dockerfile * Set timeout for HDFSUtility exists function * remove unused TODO * fix sdk * add optional for outputDir and dataDir * refactor dockerfile.base * Remove unused import in hdfsclientUtility * Add documentation for NNI PAI mode experiment (#141) * Add documentation for NNI PAI mode * Fix typo based on PR comments * Exit with subprocess return code of trial keeper * Remove additional exit code * Fix typo based on PR comments * update doc for smac tuner (#140) * Revert "Selectively install through pip (#139)" due to potential pip install issue (#142) * Revert "Selectively install through pip (#139)" This reverts commit 1d174836d3146a0363e9c9c88094bf9cff865faa. 
* Add exit code of subprocess for trial_keeper * Update README, add link to PAImode doc * fix bug (#147) * Refactor nnictl and add config_pai.yml (#144) * fix nnictl bug * add hdfs host validation * fix bugs * fix dockerfile * fix install.sh * update install.sh * fix dockerfile * Set timeout for HDFSUtility exists function * remove unused TODO * fix sdk * add optional for outputDir and dataDir * refactor dockerfile.base * Remove unused import in hdfsclientUtility * add config_pai.yml * refactor nnictl create logic and add colorful print * fix nnictl stop logic * add annotation for config_pai.yml * add document for start experiment * fix config.yml * fix document * Fix trial keeper wrongly exit issue (#152) * Fix trial keeper bug, use actual exitcode to exit rather than 1 * Fix bug of table sort (#145) * Update doc for PAIMode and v0.2 release notes (#153) * Update v0.2 documentation regards to release note and PAI training service * Update document to describe NNI docker image * fix antd (#159) * refactor experiment stopping logic * support change concurrency * remove trialJobs.ts * trivial changes * fix bugs * fix bug * support updating maxTrialNum * Modify IT scripts for supporting multiple experiments * Update ci (#175) * Update RemoteMachineMode.md (#63) * Remove unused classes for SQuAD QA example. * Remove more unused functions for SQuAD QA example. * Fix default dataset config. * Add Makefile README (#64) * update document (#92) * Edit readme.md * updated a word * Update GetStarted.md * Update GetStarted.md * refact readme, getstarted and write your trial md. 
* Update README.md * Update WriteYourTrial.md * Update WriteYourTrial.md * Update WriteYourTrial.md * Update WriteYourTrial.md * Fix nnictl bugs and add new feature (#75) * fix nnictl bug * fix nnictl create bug * add experiment status logic * add more information for nnictl * fix Evolution Tuner bug * refactor code * fix code in updater.py * fix nnictl --help * fix classArgs bug * update check response.status_code logic * remove Buffer warning (#100) * update readme in ga_squad * update readme * fix typo * Update README.md * Update README.md * Update README.md * Add support for debugging mode * modify CI cuz of refracting exp stop * update CI for expstop * update CI for expstop * update CI for expstop * update CI for expstop * update CI for expstop * update CI for expstop * update CI for expstop * update CI for expstop * update CI for expstop * file saving * fix issues from code merge * remove $(INSTALL_PREFIX)/nni/nni_manager before install * fix indent * fix merge issue * socket close * update port * fix merge error * modify ci logic in nnimanager * fix ci * fix bug * change suspended to done * update ci (#229) * update ci * update ci * update ci (#232) * update ci * update ci * update azure-pipelines * update azure-pipelines * update ci (#233) * update ci * update ci * update azure-pipelines * update azure-pipelines * update azure-pipelines * run.py (#238) * Nnupdate ci (#239) * run.py * test ci * Nnupdate ci (#240) * run.py * test ci * test ci * Udci (#241) * run.py * test ci * test ci * test ci * update ci (#242) * run.py * test ci * test ci * test ci * update ci * revert install.sh (#244) * run.py * test ci * test ci * test ci * update ci * revert install.sh * add comments * remove assert * trivial change * trivial change * update Makefile (#246) * update Makefile * update Makefile * quick fix for ci (#248) * add update trialNum and fix bugs (#261) * Add builtin tuner to CI (#247) * update Makefile * update Makefile * add builtin-tuner test * add 
builtin-tuner test * refractor ci * update azure.yml * add built-in tuner test * fix bugs * Doc refactor (#258) * doc refactor * image name refactor * Refactor nnictl to support listing stopped experiments. (#256) Refactor nnictl to support listing stopped experiments. * Show experiment parameters more beautifully (#262) * fix error on example of RemoteMachineMode (#269) * add pycharm project files to .gitignore list * update pylintrc to conform vscode settings * fix RemoteMachineMode for wrong trainingServicePlatform * Update docker file to use latest nni release (#263) * fix bug about execDuration and endTime (#270) * fix bug about execDuration and endTime * modify time interval to 30 seconds * refactor based on Gems's suggestion * for triggering ci * Refactor dockerfile (#264) * refactor Dockerfile * Support nnictl tensorboard (#268) support tensorboard * Sdk update (#272) * Rename get_parameters to get_next_parameter * annotations add get_next_parameter * updates * updates * updates * updates * updates * add experiment log path to experiment profile (#276) * Add sequenceId to TrialJobInfo (#283) * Show error information and fix paramiko installation (#282) * fix paramiko install * Refactor pip installation logic for supporting uninstall * Update documents due to new pip installation approach * Refactor Makefile for consistent with pip installation approach * Add README for building and uploading NNI package * Fix issues for pip installation * Minor fix on #41 (#280) * Typo on #12 (#281) * plus minor proposals * Quick fix resume logic (#285) * Tgs salt example (#286) * TGS salt example * updates * updates * Hide install via pip prompt, since 0.3 has not been published (#287) * move reward extraction logic to tuner (#274) * add pycharm project files to .gitignore list * update pylintrc to conform vscode settings * fix RemoteMachineMode for wrong trainingServicePlatform * add python cache files to gitignore list * move extract scalar reward logic from dispatcher 
to tuner * update tuner code corresponding to last commit * update doc for receive_trial_result api change * add numpy to package whitelist of pylint * distinguish param value from return reward for tuner.extract_scalar_reward * update pylintrc * add comments to dispatcher.handle_report_metric_data * refactor extract reward from dict by tuner * Update README.md (#288) added License badge * fix doc mistakes and broken links. (#271) * refactor doc * update with Mao's suggestions * Set theme jekyll-theme-dinky * updated the "Contribute" part (merged Gems' wiki in, updated ReadMe) * fix link * Update README.md * Fix misspelling in examples/trials/ga_squad/README.md * Update WebUI (#295) * Update README.md (#296) * Fix nnictl in master (#300) 1.Fix old version of config file 2.fix sklearn requirements 3.Fix resume log logic * revert master's installation doc to install v0.2 before v0.3 official release. (#298) * refactor doc * update with Mao's suggestions * Set theme jekyll-theme-dinky * update doc * fix links * fix links * fix links * merge * fix links and doc errors * merge * merge * merge * merge * Quick fix nnictl config logic (#289) * fix nnictl bug * fix install.sh * add desc for Dockerfile.build.base * update document for Dockerfile * update * refactor port detect * update * refactor NNICTLDOC.md * add document for pai and nnictl * add default value for port * add exception handling in trial_keeper.py * fix port bug * fix resume * fix nnictl resume and fix nnictl stop * fix document * update * refactor nnictl * update * update doc * update * update nnictl * fix comment * revert dockerfile * update * update * update * fix nnictl error hit * fix comments * fix bash-completion * fix paramiko install * quick fix resume logic * update * quick fix nnictl * merge * updated the "Contribute" part (merged Gems' wiki in, updated ReadMe) * fix link * revise the installation cmd to v0.2 * revise to install v0.2 * Update nnictl_utils.py * Update nnictl_utils.py * Update 
nnictl_utils.py * Rename the pypi package for nni * Merge 0.3 into master (#313) * Quick fix nnictl config logic (#289) * fix nnictl bug * fix install.sh * add desc for Dockerfile.build.base * update document for Dockerfile * update * refactor port detect * update * refactor NNICTLDOC.md * add document for pai and nnictl * add default value for port * add exception handling in trial_keeper.py * fix port bug * fix resume * fix nnictl resume and fix nnictl stop * fix document * update * refactor nnictl * update * update doc * update * update nnictl * fix comment * revert dockerfile * update * update * update * fix nnictl error hit * fix comments * fix bash-completion * fix paramiko install * quick fix resume logic * update * quick fix nnictl * PR merge to 0.3 (#297) * refactor doc * update with Mao's suggestions * Set theme jekyll-theme-dinky * update doc * fix links * fix links * fix links * merge * fix links and doc errors * merge * merge * merge * merge * Update README.md (#288) added License badge * merge * updated the "Contribute" part (merged Gems' wiki in, updated ReadMe) * fix link * fix doc mistakes and broken links. 
(#271) * refactor doc * update with Mao's suggestions * Set theme jekyll-theme-dinky * updated the "Contribute" part (merged Gems' wiki in, updated ReadMe) * fix link * Update README.md * Fix misspelling in examples/trials/ga_squad/README.md * revise the installation cmd to v0.2 * revise to install v0.2 * remove enas readme (#292) * Fix datastore performance issue (#301) * Fix nnictl in v0.3 (#299) Fix old version of config file fix sklearn requirements Fix resume log logic * remove paramiko in V0.3 (#306) remove paramiko in V0.3 * Release note 0.3 (#303) * v0.3 release notes * updates * updates * updates * updates * updates * updates * Inform users to set experiment id when id is empty (#310) * fix nnictl bug * fix install.sh * add desc for Dockerfile.build.base * update document for Dockerfile * update * refactor port detect * update * refactor NNICTLDOC.md * add document for pai and nnictl * add default value for port * add exception handling in trial_keeper.py * fix port bug * fix resume * fix nnictl resume and fix nnictl stop * fix document * update * refactor nnictl * update * update doc * update * update nnictl * fix comment * revert dockerfile * update * update * update * fix nnictl error hit * fix comments * fix bash-completion * fix paramiko install * quick fix resume logic * update * quick fix nnictl * fix nnictl crash bug * add requirement.txt for sklearn example * fix nnictl configuration bug * update * update * update * update * remove paramiko * refactor nnictl lfor log stdout * update * updaate * fix endtime when resume (#307) * fix endtime when resume * update * update * update * updates * Fix sequence id issue on resuming experiment (#316) * Fix bugs for v0.3 (#315) * Fix bugs * update * Refactor document of nnictl for v0.3 (#314) * fix nnictl document * Fix bug of default metric for v0.3 (#304) * Fix bug of Default Metric and modifiy trials detail style * update * Document updates for v0.3 (#318) * refactor doc * update with Mao's suggestions * 
Set theme jekyll-theme-dinky * update doc * fix links * fix links * fix links * merge * fix links and doc errors * merge * merge * merge * merge * Quick fix nnictl config logic (#289) * fix nnictl bug * fix install.sh * add desc for Dockerfile.build.base * update document for Dockerfile * update * refactor port detect * update * refactor NNICTLDOC.md * add document for pai and nnictl * add default value for port * add exception handling in trial_keeper.py * fix port bug * fix resume * fix nnictl resume and fix nnictl stop * fix document * update * refactor nnictl * update * update doc * update * update nnictl * fix comment * revert dockerfile * update * update * update * fix nnictl error hit * fix comments * fix bash-completion * fix paramiko install * quick fix resume logic * update * quick fix nnictl * merge * updated the "Contribute" part (merged Gems' wiki in, updated ReadMe) * fix link * revise the installation cmd to v0.2 * revise to install v0.2 * Update nnictl_utils.py * Update nnictl_utils.py * Update nnictl_utils.py * Update documentation for v0.3 * update (#320) * update * fix ga_squad config * Refactor close experiment implementation * Uniform the names of python modules * Update WebUI docs (#325) * add webui screenshot in README.md (#323) * update doc * update * update * update * add logo * logo * logo * update * update installation doc * Update v0.3.0 release note (#324) update v0.3.0 release note * update doc for v0.3.3 installation tag (#329) as the title * Merge v0.3 into master (#337) * Fix pypi package missing python module * Fix pypi package missing python module * fix bug in smartparam example (#322) * Fix nnictl update trialnum and document (#326) 1.Fix restful server of update 2.Update nnictl document of update 3.Add tensorboard in docement * Update the version numbers from 0.3.2 to 0.3.3 * Fix contributing doc problems (#335) broken link wrong step refine wording * Add download button (#332) * Merge v0.3 to master (#339) * Fix pypi package 
missing python module * Fix pypi package missing python module * fix bug in smartparam example (#322) * Fix nnictl update trialnum and document (#326) 1.Fix restful server of update 2.Update nnictl document of update 3.Add tensorboard in docement * Update the version numbers from 0.3.2 to 0.3.3 * Update examples (#331) * update mnist-annotation * fix mnist-annotation typo * update mnist example * update mnist-smartparam * update mnist-annotation * update mnist-smartparam * change learning rate * update mnist assessor maxTrialNum * update examples * update examples * update maxTrialNum * fix breaking path in config_assessor.yml * Add error message when experiment's status is error (#338) * Add error message when experiment error * delete unuseful code * Fix bug of bgcolor when experiment status is running * Change base image from devel to runtime, to reduce docker image size (#343) * Update version number since v0.3.4 has been released (#342) * Fixed the issue that pip install --user doesn't work in docker as root user * Fix localTrainingService cancel logic and nnictl logic (#334) Fix nnictl stop logic Fix localTrainingService cancelJob logic Show port information in "nnictl experiment list" cmd. Show more information when config file validate failed. 
Add nnictl detect adjacent port logic if the platform is pai * update doc about nni docker image (#345) * Merge v0.3 2 (#352) * Fix pypi package missing python module * Fix pypi package missing python module * fix bug in smartparam example (#322) * Fix nnictl update trialnum and document (#326) 1.Fix restful server of update 2.Update nnictl document of update 3.Add tensorboard in docement * Update the version numbers from 0.3.2 to 0.3.3 * Update examples (#331) * update mnist-annotation * fix mnist-annotation typo * update mnist example * update mnist-smartparam * update mnist-annotation * update mnist-smartparam * change learning rate * update mnist assessor maxTrialNum * update examples * update examples * update maxTrialNum * fix breaking path in config_assessor.yml * fix bug in nnimanager (#341) * Update nnictl.py (#347) * Update nnictl.py * modify help message for nnictl stop * update doc for docker image (#353) * update doc for docker image * update * [PAI training service] Support running multiple PAI experiment (#348) * Change base image from devel to runtime, to reduce docker image size * Support running multiple experiment for PAI * Fix a bug regarding to recuisively reference between paiRestServer and paiTrainingService * update makefile (#350) * update makefile * update launcher.py to fix the problem of finding main.js * remove duplicated lib * update local demo doc and configuration (#344) * update local demo doc and configuration * change folder name * Update tutorial_1_CR_exp_local_api.md no need to have a new training file * Delete mnist_gpu.py no need to have a new training file * Update config_gpu.yml no need to have a new training file * add PyTorch to Dockerfile (#362) * update local demo doc and configuration * change folder name * Update tutorial_1_CR_exp_local_api.md no need to have a new training file * Delete mnist_gpu.py no need to have a new training file * Update config_gpu.yml no need to have a new training file * add PyTorch to 
Dockerfile * Add Pytorch and set sklearn version in Dockerfile (#346) 1.Set scikit-learn==0.20.0 in Dockerfile 2.Update readme.md of dockerile 3.Add PyTorch 0.4.1 4.Add description for 'nnictl stop all' * Quick fix Docker (#363) Remove "RUN python3 -m pip --no-cache-dir install torch torchvision" * Updated document for "write a trial" related fixes. (#351) - Updated document for "write a trial" related fixes per Quanlu's feedback; - Fix wrong links in Get started per Meng's feedback. * Fix the issue#211: WebUI does not support search for a specific Trial (#355) * Fix the issue#211: WebUI does not support search for a specific Trial * delete unuseful code * Update * default 20 * add more details for remote mode docs (#366) * add more details for remote mode docs (#365) * update tutorial for remote machine as well (#367) * Support hyper-band (#358) * add gridsearch tuner (#364) * add gridsearch tuner * add gridsearchtuner * add gridsearchtuner * add gridsearchtuner * update gridsearch tuner * update gridsearch tuner * update gridsearch tuner * update gridsearch tuner * update gridsearch tuner * update gridsearch tuner * update gridsearch tuner * update gridsearch and pylint * Fix nni stop (#368) Fix "nnictl stop" * Add more tooltips in default metric graph (#370) * Add more tooltip in default metric graph and fix bug * update * Update README.md (#371) * [Kubeflow Training Service] V1, merge from kubeflow branch to master branch (#382) * Kubeflow TrainingService support, v1 (#373) 1. Create new Training Service: kubeflow trainning service, use 'kubectl' and kubeflow tfjobs CRD to submit and manage jobs 2. Update nni python SDK to support new kubeflow platform 3. Update nni python SDK's get_sequende_id() implementation, read NNI_TRIAL_SEQ_ID env variable, instead of reading .nni/sequence_id file 4. This version only supports Tensorflow operator. 
Will add more operators' support in future versions * Add Gitter badge (#376) * Update ci with new built-in tuner and assessor (#359) * fix sdk's unittest and add medianstop, batchtuner to ci * fix sdk's unittest and add medianstop, batchtuner to ci * remove debug info * update azure-pipelines * remove useless code * add some checks * fix pylint * update ci test * update ci * Show intermediate result (#384) * Asynchronous dispatcher (#372) * Asynchronous dispatcher * updates * updates * updates * updates * [Kubeflow training service] Update kubeflow exp job config schema to support distributed training (#387) * Support distributed training on tf-operator, for worker and ps * Update validation rule for kubeflow config * small code refactor adjustment for private methods * Use different output folder for ps and worker * add gpuNum check for local TS (#378) * add gpuNum check for local TS * set CUDA_VISIBLE_DEVICES to empty string when gpuNum is 0 * remove redundency code * [Kubeflow Training Service] Explicitly set cuda_visible_devices env var (#388) * Use different output folder for ps and worker * Add cuda_visible_devices env var if gpuNum is 0 * NNICTL set classArgs as optional (#374) In nnictl, classArgs is not required, now set it as optional for some kind of tuner and assessor may not require classArgs. * Move the call of experimentDoneCleanUp into stopExperiment() method (#390) * Adjust sleep position for sdk_test.py * Exit dispather process if receive Terminate command * Add comment for sleep change in sdk_test.py * Add nniManagerIp in nnictl and trainingService (#393) Add nniManager Ip in nnictl, pai TrainingService and kubeflow TrainingService. If users set nniManagerIp, pai and kubeflow will use this ip instead of using getIPV4() function. Web UI will also use this nniManagerIp. * Fix trialjobstate (#385) * add one more trial job status, EARLY_STOPPED * fix datastore/nnimanager/mockeddatastore. test/webui/metrics_reader not done. 
USER_TO_CANCEL * fix bug * modifications based on Deshui's comments * fix bug * fix bug in remote mode * add NO_MORE_TRIAL state in experiment (#389) * Multi final metrics (#377) * Rest retrieve multiple final results for multiphase job * updates * mac support with local, remote & pai mode (#386) * update Makefile for mac support, wait for aka.ms support * refix Makefile for colorful echo * update Makefile with shorturl * fix false fail on mac webui * fix cross os remote tmpdir issue * add readonly to RemoteMachineTrainingService.remoteOS * fix var name for PR 386 * Merge v0.2 branch back to master for PR #273 (#400) * fix bugs due to ts.tailstream (#273) * Fix bugs and update webui doc (#397) * Support Azure k8s (#383) Support aks of kuberflow training service Support nnictl set nniManagerIp * [PAI training service] Support virtualCluster configuration (#401) * [PAI training service] Support virtual cluster config * fix a small bug to convert virtualCluster to string * Correct typo (#402) * Fix bug of webui's table (#407) * Fix bug * fix lint * Fix trial start time (#408) * Fix trial job start time * updates * updates * Add codeDir file count validation for setClusterConfig (#409) * Add codeDir file count validation for setClusterConfig * fix a small bug if find command is not installed * Remove codeDir validation for local training service * Remove useless import * Remove intermediate result (#410) * Trial keeper refactor (#411) * [Trial keeper refactor] refactor trial keeper stdout output * Refactor nnictl error information (#412) 1.Refactor nnictl information when validateion error. 2.Set kubernetesServer as optional. 
* Kubeflow training service documentation, v1 (#419) * Kubeflow training service documentation, v1 * Fix typos based on comments * Fix for issue #414 (#415) * update doc for "write trial" * fix link * issue 414 * Support to show 2 logPath, add hdfsLogPath (#420) * Support to show 2 logPath * fix lint * Update trial status color * fix sdk for NoMoreTrial status (#394) * fix bug * add docs * [Kubeflow training service] fix bug that wrongly split kube delete cmd into 2 lines (#425) * [Kubeflow training service] fix bug that wrongly split kube delete cmd into 2 lines * Adjust white space * Add AKS document (#422) 1.Add kubeflow in experiment config document 2.Add AKS in kubeflow document * Add macOS environment to CI pipeline * Dev hyperband (#405) * support hyperband * add example for hyperband * register Hyperband in tuner * after debug * update doc * trivial change * update spec validation of yaml config * modify nnictl launcher * modify nnimanager and util to support advisor * Quick fix nnictl config logic (#289) * fix nnictl bug * fix install.sh * add desc for Dockerfile.build.base * update document for Dockerfile * update * refactor port detect * update * refactor NNICTLDOC.md * add document for pai and nnictl * add default value for port * add exception handling in trial_keeper.py * fix port bug * fix resume * fix nnictl resume and fix nnictl stop * fix document * update * refactor nnictl * update * update doc * update * update nnictl * fix comment * revert dockerfile * update * update * update * fix nnictl error hit * fix comments * fix bash-completion * fix paramiko install * quick fix resume logic * update * quick fix nnictl * refactor sdk main * update unit test accordingly * update example's config file * update restserver validation * PR merge to 0.3 (#297) * refactor doc * update with Mao's suggestions * Set theme jekyll-theme-dinky * update doc * fix links * fix links * fix links * merge * fix links and doc errors * merge * merge * merge * merge * Update 
README.md (#288) added License badge * merge * updated the "Contribute" part (merged Gems' wiki in, updated ReadMe) * fix link * fix doc mistakes and broken links. (#271) * refactor doc * update with Mao's suggestions * Set theme jekyll-theme-dinky * updated the "Contribute" part (merged Gems' wiki in, updated ReadMe) * fix link * Update README.md * Fix misspelling in examples/trials/ga_squad/README.md * revise the installation cmd to v0.2 * revise to install v0.2 * remove files * update * remove enas readme (#292) * support checkpoint directory * Fix datastore performance issue (#301) * fix pylint * Fix nnictl in v0.3 (#299) Fix old version of config file fix sklearn requirements Fix resume log logic * modify log * trivial changes * update example * update makefile * update launcher.py to fix the problem of finding main.js * debug * add hyperparameter info into trial_end api * fix bug and update example * fix error induced by merge * support initialize * add doc for hyperband * fix bugs and add config_pai * fix bugs and add config_pai * fix bugs and add config_pai * fix bugs and add config_pai * update doc * add doc for advisor * fit * modification based on hui's comments * update doc * modify loguniform and lognormal (#395) * modify loguniform and lognormal * fix bug * fix bug * update doc * update doc * fix * update tpe for loguniform * update tpe for loguniform * update for loguniform * update for loguniform * update loguniform and qloguniform * update doc * update * revert * revert * revert * revert * fix ci (#433) * fix ci * expand time range * expand time range * Update doc6 (#428) * test * tuners * refactor doc of tuners * update * update assessor doc * update * update * Update KubeflowMode.md (#436) * Delete unnecessary letter 'a' (#439) Fix a spelling bug that may cause confuse * Update doc (#438) * update readme in ga_squad * fix typo * Update README.md * Update README.md * Update README.md * fix path * update README reference * fix bug in config file 
about batch tuner * update loguniform for smac (#430) * modify loguniform and lognormal * fix bug * fix bug * update doc * update doc * fix * update tpe for loguniform * update tpe for loguniform * update for loguniform * update for loguniform * update loguniform and qloguniform * update doc * update * revert * revert * revert * revert * update loguniform for smac * update loguniform for smac * update loguniform for smac * update loguniform for smac * Add distributed mnist training example, to show how to perform distri… (#435) * Add distributed mnist training example, to show how to perform distributed training on kubeflow for NNI * rename folder name to mnist_distributed * Remove duplicated is_chief check * [Kubeflow training service] Add document for installing NFS client (#442) * Add document for installing NFS client * [V0.4 Release] Kubeflow training service: Remove unued kubernetesServer config entry (#444) * Remove unused kubernetesServer config entry in config file and schema validation * Fix multiphase error message in nnimanager.log (#445) * NNI V0.4 Release: Update version from v0.3.4 to v0.4 (#446) * Change version number from v0.3.4 to v0.4, for NNI v0.4 release * update makefile & doc for pypi & installation (#440) * update pypi/makefile for multiple platform support * update linux os spec * udpate doc for installation & pypi * update readme * Update webui document (#443) * Update webui document * Fix comments of Chengmin * Update document v0.4 (#437) move nnictl folder delete kubernetsServer in nnictl refactor aks document add warning information to expand relative path update experiment status when the experiment crashed. 
* Update document for PAI mode to turn on 8081 port (#448) * Update document for PAI mode port * Update document for PAI mode to turn on 8081 port * [V0.4 Release] Docoment update for kubeflow and release notes (#450) * Docoment update for kubeflow and release notes * Refactor examples' default image (#449) * update * remove kubernetsServer in nnictl * update document * update * add warning to expand relative path * update doc * update * update * fix doc * update * update * update * refactor image * update * update * update * update * backward compatibility for mac: job end timestamp (#451) * add pycharm project files to .gitignore list * update pylintrc to conform vscode settings * fix RemoteMachineMode for wrong trainingServicePlatform * add python cache files to gitignore list * move extract scalar reward logic from dispatcher to tuner * update tuner code corresponding to last commit * update doc for receive_trial_result api change * add numpy to package whitelist of pylint * distinguish param value from return reward for tuner.extract_scalar_reward * update pylintrc * add comments to dispatcher.handle_report_metric_data * update install for mac support * fix root mode bug on Makefile * Quick fix bug: nnictl port value error (#245) * fix port bug * Dev exp stop more (#221) * Exp stop refactor (#161) * Update RemoteMachineMode.md (#63) * Remove unused classes for SQuAD QA example. * Remove more unused functions for SQuAD QA example. * Fix default dataset config. * Add Makefile README (#64) * update document (#92) * Edit readme.md * updated a word * Update GetStarted.md * Update GetStarted.md * refact readme, getstarted and write your trial md. 
* Update README.md * Update WriteYourTrial.md * Update WriteYourTrial.md * Update WriteYourTrial.md * Update WriteYourTrial.md * Fix nnictl bugs and add new feature (#75) * fix nnictl bug * fix nnictl create bug * add experiment status logic * add more information for nnictl * fix Evolution Tuner bug * refactor code * fix code in updater.py * fix nnictl --help * fix classArgs bug * update check response.status_code logic * remove Buffer warning (#100) * update readme in ga_squad * update readme * fix typo * Update README.md * Update README.md * Update README.md * Add support for debugging mode * fix setup.py (#115) * Add DAG model configuration format for SQuAD example. * Explain config format for SQuAD QA model. * Add more detailed introduction about the evolution algorithm. * Fix install.sh add add trial log path (#109) * fix nnictl bug * fix nnictl create bug * add experiment status logic * add more information for nnictl * fix Evolution Tuner bug * refactor code * fix code in updater.py * fix nnictl --help * fix classArgs bug * update check response.status_code logic * show trial log path * update document * fix install.sh * set default vallue for maxTrialNum and maxExecDuration * fix nnictl * Dev smac (#116) * support package install (#91) * fix nnictl bug * support package install * update * update package install logic * Fix package install issue (#95) * fix nnictl bug * fix pakcage install * support SMAC as a tuner on nni (#81) * update doc * update doc * update doc * update hyperopt installation * update doc * update doc * update description in setup.py * update setup.py * modify encoding * encoding * add encoding * remove pymc3 * update doc * update builtin tuner spec * support smac in sdk, fix logging issue * support smac tuner * add optimize_mode * update config in nnictl * add __init__.py * update smac * update import path * update setup.py: remove entry_point * update rest server validation * fix bug in nnictl launcher * support classArgs: 
optimize_mode * quick fix bug * test travis * add dependency * add dependency * add dependency * add dependency * create smac python package * fix trivial points * optimize import of tuners, modify nnictl accordingly * fix bug: incorrect algorithm_name * trivial refactor * for debug * support virtual * update doc of SMAC * update smac requirements * update requirements * change debug mode * update doc * update doc * refactor based on comments * fix comments * modify example config path to relative path and increase maxTrialNum (#94) * modify example config path to relative path and increase maxTrialNum * add document * support conda (#90) (#110) * support install from venv and travis CI * support install from venv and travis CI * support install from venv and travis CI * support conda * support conda * modify example config path to relative path and increase maxTrialNum * undo messy commit * undo messy commit * Support pip install as root (#77) * Typo on #58 (#122) * PAI Training Service implementation (#128) * PAI Training service implementation **1. Implement PAITrainingService **2. Add trial-keeper python module, and modify setup.py to install the module **3. Add PAItrainingService rest server to collect metrics from PAI container. 
* fix datastore for multiple final result (#129) * Update NNI v0.2 release notes (#132) Update NNI v0.2 release notes * Update setup.py Makefile and documents (#130) * update makefile and setup.py * update makefile and setup.py * update document * update document * Update Makefile no travis * update doc * update doc * fix convert from ss to pcs (#133) * Fix bugs about webui (#131) * Fix webui bugs * Fix tslint * webui logpath and document (#135) * Add webui document and logpath as a href * fix tslint * fix comments by Chengmin * Pai training service bug fix and enhancement (#136) * Add NNI installation scripts * Update pai script, update NNI_out_dir * Update NNI dir in nni sdk local.py * Create .nni folder in nni sdk local.py * Add check before creating .nni folder * Fix typo for PAI_INSTALL_NNI_SHELL_FORMAT * Improve annotation (#138) * Improve annotation * Minor bugfix * Selectively install through pip (#139) Selectively install through pip * update setup.py * fix paiTrainingService bugs (#137) * fix nnictl bug * add hdfs host validation * fix bugs * fix dockerfile * fix install.sh * update install.sh * fix dockerfile * Set timeout for HDFSUtility exists function * remove unused TODO * fix sdk * add optional for outputDir and dataDir * refactor dockerfile.base * Remove unused import in hdfsclientUtility * Add documentation for NNI PAI mode experiment (#141) * Add documentation for NNI PAI mode * Fix typo based on PR comments * Exit with subprocess return code of trial keeper * Remove additional exit code * Fix typo based on PR comments * update doc for smac tuner (#140) * Revert "Selectively install through pip (#139)" due to potential pip install issue (#142) * Revert "Selectively install through pip (#139)" This reverts commit 1d174836d3146a0363e9c9c88094bf9cff865faa. 
* Add exit code of subprocess for trial_keeper * Update README, add link to PAImode doc * Merge branch V0.2 to Master (#143) * webui logpath and document (#135) * Add webui document and logpath as a href * fix tslint * fix comments by Chengmin * Pai training service bug fix and enhancement (#136) * Add NNI installation scripts * Update pai script, update NNI_out_dir * Update NNI dir in nni sdk local.py * Create .nni folder in nni sdk local.py * Add check before creating .nni folder * Fix typo for PAI_INSTALL_NNI_SHELL_FORMAT * Improve annotation (#138) * Improve annotation * Minor bugfix * Selectively install through pip (#139) Selectively install through pip * update setup.py * fix paiTrainingService bugs (#137) * fix nnictl bug * add hdfs host validation * fix bugs * fix dockerfile * fix install.sh * update install.sh * fix dockerfile * Set timeout for HDFSUtility exists function * remove unused TODO * fix sdk * add optional for outputDir and dataDir * refactor dockerfile.base * Remove unused import in hdfsclientUtility * Add documentation for NNI PAI mode experiment (#141) * Add documentation for NNI PAI mode * Fix typo based on PR comments * Exit with subprocess return code of trial keeper * Remove additional exit code * Fix typo based on PR comments * update doc for smac tuner (#140) * Revert "Selectively install through pip (#139)" due to potential pip install issue (#142) * Revert "Selectively install through pip (#139)" This reverts commit 1d174836d3146a0363e9c9c88094bf9cff865faa. 
* Add exit code of subprocess for trial_keeper * Update README, add link to PAImode doc * fix bug (#147) * Refactor nnictl and add config_pai.yml (#144) * fix nnictl bug * add hdfs host validation * fix bugs * fix dockerfile * fix install.sh * update install.sh * fix dockerfile * Set timeout for HDFSUtility exists function * remove unused TODO * fix sdk * add optional for outputDir and dataDir * refactor dockerfile.base * Remove unused import in hdfsclientUtility * add config_pai.yml * refactor nnictl create logic and add colorful print * fix nnictl stop logic * add annotation for config_pai.yml * add document for start experiment * fix config.yml * fix document * Fix trial keeper wrongly exit issue (#152) * Fix trial keeper bug, use actual exitcode to exit rather than 1 * Fix bug of table sort (#145) * Update doc for PAIMode and v0.2 release notes (#153) * Update v0.2 documentation regards to release note and PAI training service * Update document to describe NNI docker image * fix antd (#159) * refactor experiment stopping logic * support change concurrency * remove trialJobs.ts * trivial changes * fix bugs * fix bug * support updating maxTrialNum * Modify IT scripts for supporting multiple experiments * Update ci (#175) * Update RemoteMachineMode.md (#63) * Remove unused classes for SQuAD QA example. * Remove more unused functions for SQuAD QA example. * Fix default dataset config. * Add Makefile README (#64) * update document (#92) * Edit readme.md * updated a word * Update GetStarted.md * Update GetStarted.md * refact readme, getstarted and write your trial md. 
* Update README.md * Update WriteYourTrial.md * Update WriteYourTrial.md * Update WriteYourTrial.md * Update WriteYourTrial.md * Fix nnictl bugs and add new feature (#75) * fix nnictl bug * fix nnictl create bug * add experiment status logic * add more information for nnictl * fix Evolution Tuner bug * refactor code * fix code in updater.py * fix nnictl --help * fix classArgs bug * update check response.status_code logic * remove Buffer warning (#100) * update readme in ga_squad * update readme * fix typo * Update README.md * Update README.md * Update README.md * Add support for debugging mode * modify CI cuz of refracting exp stop * update CI for expstop * update CI for expstop * update CI for expstop * update CI for expstop * update CI for expstop * update CI for expstop * update CI for expstop * update CI for expstop * update CI for expstop * file saving * fix issues from code merge * remove $(INSTALL_PREFIX)/nni/nni_manager before install * fix indent * fix merge issue * socket close * update port * fix merge error * modify ci logic in nnimanager * fix ci * fix bug * change suspended to done * update ci (#229) * update ci * update ci * update ci (#232) * update ci * update ci * update azure-pipelines * update azure-pipelines * update ci (#233) * update ci * update ci * update azure-pipelines * update azure-pipelines * update azure-pipelines * run.py (#238) * Nnupdate ci (#239) * run.py * test ci * Nnupdate ci (#240) * run.py * test ci * test ci * Udci (#241) * run.py * test ci * test ci * test ci * update ci (#242) * run.py * test ci * test ci * test ci * update ci * revert install.sh (#244) * run.py * test ci * test ci * test ci * update ci * revert install.sh * add comments * remove assert * trivial change * trivial change * update Makefile (#246) * update Makefile * update Makefile * quick fix for ci (#248) * add update trialNum and fix bugs (#261) * Add builtin tuner to CI (#247) * update Makefile * update Makefile * add builtin-tuner test * add 
builtin-tuner test * refractor ci * update azure.yml * add built-in tuner test * fix bugs * Doc refactor (#258) * doc refactor * image name refactor * Refactor nnictl to support listing stopped experiments. (#256) Refactor nnictl to support listing stopped experiments. * Show experiment parameters more beautifully (#262) * fix error on example of RemoteMachineMode (#269) * add pycharm project files to .gitignore list * update pylintrc to conform vscode settings * fix RemoteMachineMode for wrong trainingServicePlatform * Update docker file to use latest nni release (#263) * fix bug about execDuration and endTime (#270) * fix bug about execDuration and endTime * modify time interval to 30 seconds * refactor based on Gems's suggestion * for triggering ci * Refactor dockerfile (#264) * refactor Dockerfile * Support nnictl tensorboard (#268) support tensorboard * Sdk update (#272) * Rename get_parameters to get_next_parameter * annotations add get_next_parameter * updates * updates * updates * updates * updates * add experiment log path to experiment profile (#276) * refactor extract reward from dict by tuner * update Makefile for mac support, wait for aka.ms support * refix Makefile for colorful echo * update Makefile with shorturl * fix false fail on mac webui * fix cross os remote tmpdir issue * add readonly to RemoteMachineTrainingService.remoteOS * fix var name for PR 386 * cross platform package * update pypi/makefile for multiple platform support * update linux os spec * udpate doc for installation & pypi * update readme * job timestamp compatibility for mac * Update nni arch overview diagram (#447) * refactor doc * update with Mao's suggestions * Set theme jekyll-theme-dinky * update doc * fix links * fix links * fix links * merge * fix links and doc errors * merge * merge * merge * merge * Quick fix nnictl config logic (#289) * fix nnictl bug * fix install.sh * add desc for Dockerfile.build.base * update document for Dockerfile * update * refactor port detect * 
update * refactor NNICTLDOC.md * add document for pai and nnictl * add default value for port * add exception handling in trial_keeper.py * fix port bug * fix resume * fix nnictl resume and fix nnictl stop * fix document * update * refactor nnictl * update * update doc * update * update nnictl * fix comment * revert dockerfile * update * update * update * fix nnictl error hit * fix comments * fix bash-completion * fix paramiko install * quick fix resume logic * update * quick fix nnictl * merge * updated the "Contribute" part (merged Gems' wiki in, updated ReadMe) * fix link * revise the installation cmd to v0.2 * revise to install v0.2 * Update nnictl_utils.py * Update nnictl_utils.py * Update nnictl_utils.py * Update documentation for v0.3 * update release note * update v0.3.0 release note +1 * update doc for installation tag v0.3.3 * fix contributing doc problems * update doc for "write trial" * fix link * issue 414 * update arch overview diagram in README * update image * fix broken link * Update README.md * Correct typo, macOS -> MacOS * update ga_squad example (#461) * update ga_squad experiment example on pai * Update config_pai.yml * Update README.md * Update config_pai.yml * Update README.md * Update README.md * Update README.md * Update pai token by time interval (#434) Update pai token every 2 hours. * Support kuberflow pytorch-operator (#406) 1.Support pytorch-operator 2.remove unsupported operator * added search trail by id function (#455) * correct assessor typo (#463) correct assessor typos in several files. 
* Quick fix paiTrainingService (#465) quick fix paiTrainingService, add deferred.resolve(); * Add system requirements for NNI Installation * Fix nnictl multiThread option (#467) * Dev networkmorphism (#413) * Quick fix nnictl config logic (#289) * fix nnictl bug * fix install.sh * add desc for Dockerfile.build.base * update document for Dockerfile * update * refactor port detect * update * refactor NNICTLDOC.md * add document for pai and nnictl * add default value for port * add exception handling in trial_keeper.py * fix port bug * fix resume * fix nnictl resume and fix nnictl stop * fix document * update * refactor nnictl * update * update doc * update * update nnictl * fix comment * revert dockerfile * update * update * update * fix nnictl error hit * fix comments * fix bash-completion * fix paramiko install * quick fix resume logic * update * quick fix nnictl * PR merge to 0.3 (#297) * refactor doc * update with Mao's suggestions * Set theme jekyll-theme-dinky * update doc * fix links * fix links * fix links * merge * fix links and doc errors * merge * merge * merge * merge * Update README.md (#288) added License badge * merge * updated the "Contribute" part (merged Gems' wiki in, updated ReadMe) * fix link * fix doc mistakes and broken links. (#271) * refactor doc * update with Mao's suggestions * Set theme jekyll-theme-dinky * updated the "Contribute" part (merged Gems' wiki in, updated ReadMe) * fix link * Update README.md * Fix misspelling in examples/trials/ga_squad/README.md * revise the installation cmd to v0.2 * revise to install v0.2 * remove enas readme (#292) * Fix datastore performance issue (#301) * Fix nnictl in v0.3 (#299) Fix old version of config file fix sklearn requirements Fix resume log logic * add basic tuner and trial for network morphism * Complete basic receive_trial_result() and generate_parameters(). 
Use onnx as the intermediate representation ( But it cannot convert to pytorch model ) * add tensorflow cifar10 for network morphism * add unit test for tuner and its function * use temporary torch_model * fix request bug and program can communicate nni * add basic pickle support for graph and train successful in pytorch * Update unittest for networkmorphism_tuner * Network Morphism add multi-gpu trial training support * Format code with black tool * change intermediate representation from pickle file to json we defined * successfully pass the unittest for test_graph_json_transform * add README for network morphism and it works fine in both Pytorch and Keras. * separate the original Readme.md in network-morphism into two parts (tuner and trial) * change the openpai image path * beautify the file structure of network_morphism and add a fashion_mnist keras example * pretty the source and add some docstring for funtion in order to pass the pylint. * remove unused module import and add some docstring * add some details for the application scenario Network Morphism Tuner * follow the advice and modify the doc file * add the config file for each task in the examples trial of network morphism * change default python interpreter from python to python3 * Support 'nnictl top' (#464) Add nnictl top command …
chicm-ms added a commit that referenced this pull request on Feb 18, 2019
* Pull code (#22) * Support distributed job for frameworkcontroller (#612) support distributed job for frameworkcontroller * Multiphase doc (#519) * multiPhase doc * updates * updates * Add time parser for 'nnictl update duration' (#632) Current nnictl update duration only support seconds unit, add a parser for this command to support {s, m, h, d} * fix experiment state bug (#629) * update top README.md (#622) * Update README.md * update (#634) * Integration tests refactoring (#625) * Integration test refactoring (#21) (#616) * Integration test refactoring (#21) * Refactoring integration tests * test metrics * update azure pipeline * updates * updates * updates * updates * updates * updates * updates * updates * updates * updates * updates * updates * updates * updates * updates * updates * updates * updates * updates * updates * update trigger * Integration test refactoring (#618) * updates * updates * update pipeline (#619) * update pipeline * updates * updates * updates * updates * updates * test pipeline (#623) * test pipeline * updates * updates * updates * Update integration test (#624) * Update integration test * updates * updates * updates * updates * updates * updates * Revert "Pull code (#22)" This reverts commit 62fc165ad7b2ba724eead3b99f010aa34491e2c7. 
* Fix broken pipe v0.5.1 (#679) * Fix broken pipe error * updates * Update version message (#682) * Lijiao (#1) * Set up CI with Azure Pipelines * Add idompotent support for get_parameters() in nni sdk (#216) * Updated based on comments * Fix bug, make get_parameters() idompotent * Add idompotent support for get_parameters() in LocalTrainingService * Add ip address cached to resolve network issue (#220) * Add ip address cached to resolve network issue * Fix bug of trial hypermeters (#222) * Format trial duration, rename button name (#209) * Refactor nnictl to support list multiple experiment (#207) 1.fix some bugs 2.support nnictl stop id, and add some regulars * Fix bug when trail duration is 0 (#223) * Fix broken issue comes from OpenPAI API upgrade (#227) Fix paiTrainingService broken issue comes from OpenPAI API upgrade * add document for nnictl (#230) * fix nnictl bug * fix install.sh * add desc for Dockerfile.build.base * update document for Dockerfile * update * refactor port detect * update * refactor NNICTLDOC.md * add document for pai and nnictl * add default value for port * Fix bug: some trial jobs hyper params not stored and error handling updates (#225) * Pull latest code (#2) * webui logpath and document (#135) * Add webui document and logpath as a href * fix tslint * fix comments by Chengmin * Pai training service bug fix and enhancement (#136) * Add NNI installation scripts * Update pai script, update NNI_out_dir * Update NNI dir in nni sdk local.py * Create .nni folder in nni sdk local.py * Add check before creating .nni folder * Fix typo for PAI_INSTALL_NNI_SHELL_FORMAT * Improve annotation (#138) * Improve annotation * Minor bugfix * Selectively install through pip (#139) Selectively install through pip * update setup.py * fix paiTrainingService bugs (#137) * fix nnictl bug * add hdfs host validation * fix bugs * fix dockerfile * fix install.sh * update install.sh * fix dockerfile * Set timeout for HDFSUtility exists function * remove unused 
TODO * fix sdk * add optional for outputDir and dataDir * refactor dockerfile.base * Remove unused import in hdfsclientUtility * Add documentation for NNI PAI mode experiment (#141) * Add documentation for NNI PAI mode * Fix typo based on PR comments * Exit with subprocess return code of trial keeper * Remove additional exit code * Fix typo based on PR comments * update doc for smac tuner (#140) * Revert "Selectively install through pip (#139)" due to potential pip install issue (#142) * Revert "Selectively install through pip (#139)" This reverts commit 1d174836d3146a0363e9c9c88094bf9cff865faa. * Add exit code of subprocess for trial_keeper * Update README, add link to PAImode doc * fix bug (#147) * Refactor nnictl and add config_pai.yml (#144) * fix nnictl bug * add hdfs host validation * fix bugs * fix dockerfile * fix install.sh * update install.sh * fix dockerfile * Set timeout for HDFSUtility exists function * remove unused TODO * fix sdk * add optional for outputDir and dataDir * refactor dockerfile.base * Remove unused import in hdfsclientUtility * add config_pai.yml * refactor nnictl create logic and add colorful print * fix nnictl stop logic * add annotation for config_pai.yml * add document for start experiment * fix config.yml * fix document * Fix trial keeper wrongly exit issue (#152) * Fix trial keeper bug, use actual exitcode to exit rather than 1 * Fix bug of table sort (#145) * Update doc for PAIMode and v0.2 release notes (#153) * Update v0.2 documentation regards to release note and PAI training service * Update document to describe NNI docker image * Bug fix for SQuAD example tuner. 
(#134) * Update Makefile (#151) * test * update setup.py * update Makefile and install.sh * rever setup.py * change color * update doc * update doc * fix auto-completion's extra space * update Makefile * update webui * Update doc image (#163) * update doc * trivial * trivial * trivial * trivial * trivial * trivial * update image * update image size * Update ga squad (#104) * update readme in ga_squad * update readme * fix typo * Update README.md * Update README.md * Update README.md * update readme * sklearn examples (#169) * fix nnictl bug * fix install.sh * add sklearn-regression example * add sklearn classification * update sklearn * update example * remove additional code * Update batch tuner (#158) * update readme in ga_squad * update readme * fix typo * Update README.md * Update README.md * Update README.md * update readme * update batch tuner * Quickly fix cascading search space bug in tuner (#156) * update readme in ga_squad * update readme * fix typo * Update README.md * Update README.md * Update README.md * update readme * quickly fix cascading searchspace bug in tuner * Add iterative search space example (#119) * update readme in ga_squad * update readme * fix typo * Update README.md * Update README.md * Update README.md * update readme * add iterative search space example * update * update readme * change name * Fix bug: some trial jobs hyper params not stored when fast finished * updates * updates * updates * updates * Add exception handler in trial_keeper (#235) * add exception handling in trial_keeper.py * Merge 0.2 into master (#237) * fix antd (#159) * quick fix config_pai.yml in examples (#171) * fix nnictl bug * add hdfs host validation * fix bugs * fix dockerfile * fix install.sh * update install.sh * fix dockerfile * Set timeout for HDFSUtility exists function * remove unused TODO * fix sdk * add optional for outputDir and dataDir * refactor dockerfile.base * Remove unused import in hdfsclientUtility * add config_pai.yml * refactor nnictl 
create logic and add colorful print * fix nnictl stop logic * add annotation for config_pai.yml * add document for start experiment * fix config.yml * fix document * fix dataDir and outputDir in config_pai.yml * fix config_pai.yml * Update slidebar icon (#173) * quick fix bug: assessor validation in nnictl (#200) * fix nnictl bug * add hdfs host validation * fix bugs * fix dockerfile * fix install.sh * update install.sh * fix dockerfile * Set timeout for HDFSUtility exists function * remove unused TODO * fix sdk * add optional for outputDir and dataDir * refactor dockerfile.base * Remove unused import in hdfsclientUtility * add config_pai.yml * refactor nnictl create logic and add colorful print * fix nnictl stop logic * add annotation for config_pai.yml * add document for start experiment * fix config.yml * fix document * fix dataDir and outputDir in config_pai.yml * fix config_pai.yml * fix assessor launcher * Disable the tensorboard button about pai experiment (#192) * Remove the gap in search space value array (#226) * Remove the gap in search space value array * Update duration * Fix comments of Chengmin * Quick fix bug: nnictl port value error (#245) * fix port bug * Dev exp stop more (#221) * Exp stop refactor (#161) * Update RemoteMachineMode.md (#63) * Remove unused classes for SQuAD QA example. * Remove more unused functions for SQuAD QA example. * Fix default dataset config. * Add Makefile README (#64) * update document (#92) * Edit readme.md * updated a word * Update GetStarted.md * Update GetStarted.md * refact readme, getstarted and write your trial md. 
* Update README.md * Update WriteYourTrial.md * Update WriteYourTrial.md * Update WriteYourTrial.md * Update WriteYourTrial.md * Fix nnictl bugs and add new feature (#75) * fix nnictl bug * fix nnictl create bug * add experiment status logic * add more information for nnictl * fix Evolution Tuner bug * refactor code * fix code in updater.py * fix nnictl --help * fix classArgs bug * update check response.status_code logic * remove Buffer warning (#100) * update readme in ga_squad * update readme * fix typo * Update README.md * Update README.md * Update README.md * Add support for debugging mode * fix setup.py (#115) * Add DAG model configuration format for SQuAD example. * Explain config format for SQuAD QA model. * Add more detailed introduction about the evolution algorithm. * Fix install.sh add add trial log path (#109) * fix nnictl bug * fix nnictl create bug * add experiment status logic * add more information for nnictl * fix Evolution Tuner bug * refactor code * fix code in updater.py * fix nnictl --help * fix classArgs bug * update check response.status_code logic * show trial log path * update document * fix install.sh * set default vallue for maxTrialNum and maxExecDuration * fix nnictl * Dev smac (#116) * support package install (#91) * fix nnictl bug * support package install * update * update package install logic * Fix package install issue (#95) * fix nnictl bug * fix pakcage install * support SMAC as a tuner on nni (#81) * update doc * update doc * update doc * update hyperopt installation * update doc * update doc * update description in setup.py * update setup.py * modify encoding * encoding * add encoding * remove pymc3 * update doc * update builtin tuner spec * support smac in sdk, fix logging issue * support smac tuner * add optimize_mode * update config in nnictl * add __init__.py * update smac * update import path * update setup.py: remove entry_point * update rest server validation * fix bug in nnictl launcher * support classArgs: 
optimize_mode * quick fix bug * test travis * add dependency * add dependency * add dependency * add dependency * create smac python package * fix trivial points * optimize import of tuners, modify nnictl accordingly * fix bug: incorrect algorithm_name * trivial refactor * for debug * support virtual * update doc of SMAC * update smac requirements * update requirements * change debug mode * update doc * update doc * refactor based on comments * fix comments * modify example config path to relative path and increase maxTrialNum (#94) * modify example config path to relative path and increase maxTrialNum * add document * support conda (#90) (#110) * support install from venv and travis CI * support install from venv and travis CI * support install from venv and travis CI * support conda * support conda * modify example config path to relative path and increase maxTrialNum * undo messy commit * undo messy commit * Support pip install as root (#77) * Typo on #58 (#122) * PAI Training Service implementation (#128) * PAI Training service implementation **1. Implement PAITrainingService **2. Add trial-keeper python module, and modify setup.py to install the module **3. Add PAItrainingService rest server to collect metrics from PAI container. 
* fix datastore for multiple final result (#129) * Update NNI v0.2 release notes (#132) Update NNI v0.2 release notes * Update setup.py Makefile and documents (#130) * update makefile and setup.py * update makefile and setup.py * update document * update document * Update Makefile no travis * update doc * update doc * fix convert from ss to pcs (#133) * Fix bugs about webui (#131) * Fix webui bugs * Fix tslint * webui logpath and document (#135) * Add webui document and logpath as a href * fix tslint * fix comments by Chengmin * Pai training service bug fix and enhancement (#136) * Add NNI installation scripts * Update pai script, update NNI_out_dir * Update NNI dir in nni sdk local.py * Create .nni folder in nni sdk local.py * Add check before creating .nni folder * Fix typo for PAI_INSTALL_NNI_SHELL_FORMAT * Improve annotation (#138) * Improve annotation * Minor bugfix * Selectively install through pip (#139) Selectively install through pip * update setup.py * fix paiTrainingService bugs (#137) * fix nnictl bug * add hdfs host validation * fix bugs * fix dockerfile * fix install.sh * update install.sh * fix dockerfile * Set timeout for HDFSUtility exists function * remove unused TODO * fix sdk * add optional for outputDir and dataDir * refactor dockerfile.base * Remove unused import in hdfsclientUtility * Add documentation for NNI PAI mode experiment (#141) * Add documentation for NNI PAI mode * Fix typo based on PR comments * Exit with subprocess return code of trial keeper * Remove additional exit code * Fix typo based on PR comments * update doc for smac tuner (#140) * Revert "Selectively install through pip (#139)" due to potential pip install issue (#142) * Revert "Selectively install through pip (#139)" This reverts commit 1d174836d3146a0363e9c9c88094bf9cff865faa. 
* Add exit code of subprocess for trial_keeper * Update README, add link to PAImode doc * Merge branch V0.2 to Master (#143) * webui logpath and document (#135) * Add webui document and logpath as a href * fix tslint * fix comments by Chengmin * Pai training service bug fix and enhancement (#136) * Add NNI installation scripts * Update pai script, update NNI_out_dir * Update NNI dir in nni sdk local.py * Create .nni folder in nni sdk local.py * Add check before creating .nni folder * Fix typo for PAI_INSTALL_NNI_SHELL_FORMAT * Improve annotation (#138) * Improve annotation * Minor bugfix * Selectively install through pip (#139) Selectively install through pip * update setup.py * fix paiTrainingService bugs (#137) * fix nnictl bug * add hdfs host validation * fix bugs * fix dockerfile * fix install.sh * update install.sh * fix dockerfile * Set timeout for HDFSUtility exists function * remove unused TODO * fix sdk * add optional for outputDir and dataDir * refactor dockerfile.base * Remove unused import in hdfsclientUtility * Add documentation for NNI PAI mode experiment (#141) * Add documentation for NNI PAI mode * Fix typo based on PR comments * Exit with subprocess return code of trial keeper * Remove additional exit code * Fix typo based on PR comments * update doc for smac tuner (#140) * Revert "Selectively install through pip (#139)" due to potential pip install issue (#142) * Revert "Selectively install through pip (#139)" This reverts commit 1d174836d3146a0363e9c9c88094bf9cff865faa. 
* Add exit code of subprocess for trial_keeper * Update README, add link to PAImode doc * fix bug (#147) * Refactor nnictl and add config_pai.yml (#144) * fix nnictl bug * add hdfs host validation * fix bugs * fix dockerfile * fix install.sh * update install.sh * fix dockerfile * Set timeout for HDFSUtility exists function * remove unused TODO * fix sdk * add optional for outputDir and dataDir * refactor dockerfile.base * Remove unused import in hdfsclientUtility * add config_pai.yml * refactor nnictl create logic and add colorful print * fix nnictl stop logic * add annotation for config_pai.yml * add document for start experiment * fix config.yml * fix document * Fix trial keeper wrongly exit issue (#152) * Fix trial keeper bug, use actual exitcode to exit rather than 1 * Fix bug of table sort (#145) * Update doc for PAIMode and v0.2 release notes (#153) * Update v0.2 documentation regards to release note and PAI training service * Update document to describe NNI docker image * fix antd (#159) * refactor experiment stopping logic * support change concurrency * remove trialJobs.ts * trivial changes * fix bugs * fix bug * support updating maxTrialNum * Modify IT scripts for supporting multiple experiments * Update ci (#175) * Update RemoteMachineMode.md (#63) * Remove unused classes for SQuAD QA example. * Remove more unused functions for SQuAD QA example. * Fix default dataset config. * Add Makefile README (#64) * update document (#92) * Edit readme.md * updated a word * Update GetStarted.md * Update GetStarted.md * refact readme, getstarted and write your trial md. 
* Update README.md * Update WriteYourTrial.md * Update WriteYourTrial.md * Update WriteYourTrial.md * Update WriteYourTrial.md * Fix nnictl bugs and add new feature (#75) * fix nnictl bug * fix nnictl create bug * add experiment status logic * add more information for nnictl * fix Evolution Tuner bug * refactor code * fix code in updater.py * fix nnictl --help * fix classArgs bug * update check response.status_code logic * remove Buffer warning (#100) * update readme in ga_squad * update readme * fix typo * Update README.md * Update README.md * Update README.md * Add support for debugging mode * modify CI cuz of refracting exp stop * update CI for expstop * update CI for expstop * update CI for expstop * update CI for expstop * update CI for expstop * update CI for expstop * update CI for expstop * update CI for expstop * update CI for expstop * file saving * fix issues from code merge * remove $(INSTALL_PREFIX)/nni/nni_manager before install * fix indent * fix merge issue * socket close * update port * fix merge error * modify ci logic in nnimanager * fix ci * fix bug * change suspended to done * update ci (#229) * update ci * update ci * update ci (#232) * update ci * update ci * update azure-pipelines * update azure-pipelines * update ci (#233) * update ci * update ci * update azure-pipelines * update azure-pipelines * update azure-pipelines * run.py (#238) * Nnupdate ci (#239) * run.py * test ci * Nnupdate ci (#240) * run.py * test ci * test ci * Udci (#241) * run.py * test ci * test ci * test ci * update ci (#242) * run.py * test ci * test ci * test ci * update ci * revert install.sh (#244) * run.py * test ci * test ci * test ci * update ci * revert install.sh * add comments * remove assert * trivial change * trivial change * update Makefile (#246) * update Makefile * update Makefile * quick fix for ci (#248) * add update trialNum and fix bugs (#261) * Add builtin tuner to CI (#247) * update Makefile * update Makefile * add builtin-tuner test * add 
builtin-tuner test * refractor ci * update azure.yml * add built-in tuner test * fix bugs * Doc refactor (#258) * doc refactor * image name refactor * Refactor nnictl to support listing stopped experiments. (#256) Refactor nnictl to support listing stopped experiments. * Show experiment parameters more beautifully (#262) * fix error on example of RemoteMachineMode (#269) * add pycharm project files to .gitignore list * update pylintrc to conform vscode settings * fix RemoteMachineMode for wrong trainingServicePlatform * Update docker file to use latest nni release (#263) * fix bug about execDuration and endTime (#270) * fix bug about execDuration and endTime * modify time interval to 30 seconds * refactor based on Gems's suggestion * for triggering ci * Refactor dockerfile (#264) * refactor Dockerfile * Support nnictl tensorboard (#268) support tensorboard * Sdk update (#272) * Rename get_parameters to get_next_parameter * annotations add get_next_parameter * updates * updates * updates * updates * updates * add experiment log path to experiment profile (#276) * Add sequenceId to TrialJobInfo (#283) * Show error information and fix paramiko installation (#282) * fix paramiko install * Refactor pip installation logic for supporting uninstall * Update documents due to new pip installation approach * Refactor Makefile for consistent with pip installation approach * Add README for building and uploading NNI package * Fix issues for pip installation * Minor fix on #41 (#280) * Typo on #12 (#281) * plus minor proposals * Quick fix resume logic (#285) * Tgs salt example (#286) * TGS salt example * updates * updates * Hide install via pip prompt, since 0.3 has not been published (#287) * move reward extraction logic to tuner (#274) * add pycharm project files to .gitignore list * update pylintrc to conform vscode settings * fix RemoteMachineMode for wrong trainingServicePlatform * add python cache files to gitignore list * move extract scalar reward logic from dispatcher 
to tuner * update tuner code corresponding to last commit * update doc for receive_trial_result api change * add numpy to package whitelist of pylint * distinguish param value from return reward for tuner.extract_scalar_reward * update pylintrc * add comments to dispatcher.handle_report_metric_data * refactor extract reward from dict by tuner * Update README.md (#288) added License badge * fix doc mistakes and broken links. (#271) * refactor doc * update with Mao's suggestions * Set theme jekyll-theme-dinky * updated the "Contribute" part (merged Gems' wiki in, updated ReadMe) * fix link * Update README.md * Fix misspelling in examples/trials/ga_squad/README.md * Update WebUI (#295) * Update README.md (#296) * Fix nnictl in master (#300) 1.Fix old version of config file 2.fix sklearn requirements 3.Fix resume log logic * revert master's installation doc to install v0.2 before v0.3 official release. (#298) * refactor doc * update with Mao's suggestions * Set theme jekyll-theme-dinky * update doc * fix links * fix links * fix links * merge * fix links and doc errors * merge * merge * merge * merge * Quick fix nnictl config logic (#289) * fix nnictl bug * fix install.sh * add desc for Dockerfile.build.base * update document for Dockerfile * update * refactor port detect * update * refactor NNICTLDOC.md * add document for pai and nnictl * add default value for port * add exception handling in trial_keeper.py * fix port bug * fix resume * fix nnictl resume and fix nnictl stop * fix document * update * refactor nnictl * update * update doc * update * update nnictl * fix comment * revert dockerfile * update * update * update * fix nnictl error hit * fix comments * fix bash-completion * fix paramiko install * quick fix resume logic * update * quick fix nnictl * merge * updated the "Contribute" part (merged Gems' wiki in, updated ReadMe) * fix link * revise the installation cmd to v0.2 * revise to install v0.2 * Update nnictl_utils.py * Update nnictl_utils.py * Update 
nnictl_utils.py * Rename the pypi package for nni * Merge 0.3 into master (#313) * Quick fix nnictl config logic (#289) * fix nnictl bug * fix install.sh * add desc for Dockerfile.build.base * update document for Dockerfile * update * refactor port detect * update * refactor NNICTLDOC.md * add document for pai and nnictl * add default value for port * add exception handling in trial_keeper.py * fix port bug * fix resume * fix nnictl resume and fix nnictl stop * fix document * update * refactor nnictl * update * update doc * update * update nnictl * fix comment * revert dockerfile * update * update * update * fix nnictl error hit * fix comments * fix bash-completion * fix paramiko install * quick fix resume logic * update * quick fix nnictl * PR merge to 0.3 (#297) * refactor doc * update with Mao's suggestions * Set theme jekyll-theme-dinky * update doc * fix links * fix links * fix links * merge * fix links and doc errors * merge * merge * merge * merge * Update README.md (#288) added License badge * merge * updated the "Contribute" part (merged Gems' wiki in, updated ReadMe) * fix link * fix doc mistakes and broken links. 
(#271) * refactor doc * update with Mao's suggestions * Set theme jekyll-theme-dinky * updated the "Contribute" part (merged Gems' wiki in, updated ReadMe) * fix link * Update README.md * Fix misspelling in examples/trials/ga_squad/README.md * revise the installation cmd to v0.2 * revise to install v0.2 * remove enas readme (#292) * Fix datastore performance issue (#301) * Fix nnictl in v0.3 (#299) Fix old version of config file fix sklearn requirements Fix resume log logic * remove paramiko in V0.3 (#306) remove paramiko in V0.3 * Release note 0.3 (#303) * v0.3 release notes * updates * updates * updates * updates * updates * updates * Inform users to set experiment id when id is empty (#310) * fix nnictl bug * fix install.sh * add desc for Dockerfile.build.base * update document for Dockerfile * update * refactor port detect * update * refactor NNICTLDOC.md * add document for pai and nnictl * add default value for port * add exception handling in trial_keeper.py * fix port bug * fix resume * fix nnictl resume and fix nnictl stop * fix document * update * refactor nnictl * update * update doc * update * update nnictl * fix comment * revert dockerfile * update * update * update * fix nnictl error hit * fix comments * fix bash-completion * fix paramiko install * quick fix resume logic * update * quick fix nnictl * fix nnictl crash bug * add requirement.txt for sklearn example * fix nnictl configuration bug * update * update * update * update * remove paramiko * refactor nnictl lfor log stdout * update * updaate * fix endtime when resume (#307) * fix endtime when resume * update * update * update * updates * Fix sequence id issue on resuming experiment (#316) * Fix bugs for v0.3 (#315) * Fix bugs * update * Refactor document of nnictl for v0.3 (#314) * fix nnictl document * Fix bug of default metric for v0.3 (#304) * Fix bug of Default Metric and modifiy trials detail style * update * Document updates for v0.3 (#318) * refactor doc * update with Mao's suggestions * 
Set theme jekyll-theme-dinky * update doc * fix links * fix links * fix links * merge * fix links and doc errors * merge * merge * merge * merge * Quick fix nnictl config logic (#289) * fix nnictl bug * fix install.sh * add desc for Dockerfile.build.base * update document for Dockerfile * update * refactor port detect * update * refactor NNICTLDOC.md * add document for pai and nnictl * add default value for port * add exception handling in trial_keeper.py * fix port bug * fix resume * fix nnictl resume and fix nnictl stop * fix document * update * refactor nnictl * update * update doc * update * update nnictl * fix comment * revert dockerfile * update * update * update * fix nnictl error hit * fix comments * fix bash-completion * fix paramiko install * quick fix resume logic * update * quick fix nnictl * merge * updated the "Contribute" part (merged Gems' wiki in, updated ReadMe) * fix link * revise the installation cmd to v0.2 * revise to install v0.2 * Update nnictl_utils.py * Update nnictl_utils.py * Update nnictl_utils.py * Update documentation for v0.3 * update (#320) * update * fix ga_squad config * Refactor close experiment implementation * Uniform the names of python modules * Update WebUI docs (#325) * add webui screenshot in README.md (#323) * update doc * update * update * update * add logo * logo * logo * update * update installation doc * Update v0.3.0 release note (#324) update v0.3.0 release note * update doc for v0.3.3 installation tag (#329) as the title * Merge v0.3 into master (#337) * Fix pypi package missing python module * Fix pypi package missing python module * fix bug in smartparam example (#322) * Fix nnictl update trialnum and document (#326) 1.Fix restful server of update 2.Update nnictl document of update 3.Add tensorboard in docement * Update the version numbers from 0.3.2 to 0.3.3 * Fix contributing doc problems (#335) broken link wrong step refine wording * Add download button (#332) * Merge v0.3 to master (#339) * Fix pypi package 
missing python module * Fix pypi package missing python module * fix bug in smartparam example (#322) * Fix nnictl update trialnum and document (#326) 1.Fix restful server of update 2.Update nnictl document of update 3.Add tensorboard in docement * Update the version numbers from 0.3.2 to 0.3.3 * Update examples (#331) * update mnist-annotation * fix mnist-annotation typo * update mnist example * update mnist-smartparam * update mnist-annotation * update mnist-smartparam * change learning rate * update mnist assessor maxTrialNum * update examples * update examples * update maxTrialNum * fix breaking path in config_assessor.yml * Add error message when experiment's status is error (#338) * Add error message when experiment error * delete unuseful code * Fix bug of bgcolor when experiment status is running * Change base image from devel to runtime, to reduce docker image size (#343) * Update version number since v0.3.4 has been released (#342) * Fixed the issue that pip install --user doesn't work in docker as root user * Fix localTrainingService cancel logic and nnictl logic (#334) Fix nnictl stop logic Fix localTrainingService cancelJob logic Show port information in "nnictl experiment list" cmd. Show more information when config file validate failed. 
Add nnictl detect adjacent port logic if the platform is pai * update doc about nni docker image (#345) * Merge v0.3 2 (#352) * Fix pypi package missing python module * Fix pypi package missing python module * fix bug in smartparam example (#322) * Fix nnictl update trialnum and document (#326) 1.Fix restful server of update 2.Update nnictl document of update 3.Add tensorboard in docement * Update the version numbers from 0.3.2 to 0.3.3 * Update examples (#331) * update mnist-annotation * fix mnist-annotation typo * update mnist example * update mnist-smartparam * update mnist-annotation * update mnist-smartparam * change learning rate * update mnist assessor maxTrialNum * update examples * update examples * update maxTrialNum * fix breaking path in config_assessor.yml * fix bug in nnimanager (#341) * Update nnictl.py (#347) * Update nnictl.py * modify help message for nnictl stop * update doc for docker image (#353) * update doc for docker image * update * [PAI training service] Support running multiple PAI experiment (#348) * Change base image from devel to runtime, to reduce docker image size * Support running multiple experiment for PAI * Fix a bug regarding to recuisively reference between paiRestServer and paiTrainingService * update makefile (#350) * update makefile * update launcher.py to fix the problem of finding main.js * remove duplicated lib * update local demo doc and configuration (#344) * update local demo doc and configuration * change folder name * Update tutorial_1_CR_exp_local_api.md no need to have a new training file * Delete mnist_gpu.py no need to have a new training file * Update config_gpu.yml no need to have a new training file * add PyTorch to Dockerfile (#362) * update local demo doc and configuration * change folder name * Update tutorial_1_CR_exp_local_api.md no need to have a new training file * Delete mnist_gpu.py no need to have a new training file * Update config_gpu.yml no need to have a new training file * add PyTorch to 
Dockerfile * Add Pytorch and set sklearn version in Dockerfile (#346) 1.Set scikit-learn==0.20.0 in Dockerfile 2.Update readme.md of dockerile 3.Add PyTorch 0.4.1 4.Add description for 'nnictl stop all' * Quick fix Docker (#363) Remove "RUN python3 -m pip --no-cache-dir install torch torchvision" * Updated document for "write a trial" related fixes. (#351) - Updated document for "write a trial" related fixes per Quanlu's feedback; - Fix wrong links in Get started per Meng's feedback. * Fix the issue#211: WebUI does not support search for a specific Trial (#355) * Fix the issue#211: WebUI does not support search for a specific Trial * delete unuseful code * Update * default 20 * add more details for remote mode docs (#366) * add more details for remote mode docs (#365) * update tutorial for remote machine as well (#367) * Support hyper-band (#358) * add gridsearch tuner (#364) * add gridsearch tuner * add gridsearchtuner * add gridsearchtuner * add gridsearchtuner * update gridsearch tuner * update gridsearch tuner * update gridsearch tuner * update gridsearch tuner * update gridsearch tuner * update gridsearch tuner * update gridsearch tuner * update gridsearch and pylint * Fix nni stop (#368) Fix "nnictl stop" * Add more tooltips in default metric graph (#370) * Add more tooltip in default metric graph and fix bug * update * Update README.md (#371) * [Kubeflow Training Service] V1, merge from kubeflow branch to master branch (#382) * Kubeflow TrainingService support, v1 (#373) 1. Create new Training Service: kubeflow trainning service, use 'kubectl' and kubeflow tfjobs CRD to submit and manage jobs 2. Update nni python SDK to support new kubeflow platform 3. Update nni python SDK's get_sequende_id() implementation, read NNI_TRIAL_SEQ_ID env variable, instead of reading .nni/sequence_id file 4. This version only supports Tensorflow operator. 
Will add more operators' support in future versions * Add Gitter badge (#376) * Update ci with new built-in tuner and assessor (#359) * fix sdk's unittest and add medianstop, batchtuner to ci * fix sdk's unittest and add medianstop, batchtuner to ci * remove debug info * update azure-pipelines * remove useless code * add some checks * fix pylint * update ci test * update ci * Show intermediate result (#384) * Asynchronous dispatcher (#372) * Asynchronous dispatcher * updates * updates * updates * updates * [Kubeflow training service] Update kubeflow exp job config schema to support distributed training (#387) * Support distributed training on tf-operator, for worker and ps * Update validation rule for kubeflow config * small code refactor adjustment for private methods * Use different output folder for ps and worker * add gpuNum check for local TS (#378) * add gpuNum check for local TS * set CUDA_VISIBLE_DEVICES to empty string when gpuNum is 0 * remove redundency code * [Kubeflow Training Service] Explicitly set cuda_visible_devices env var (#388) * Use different output folder for ps and worker * Add cuda_visible_devices env var if gpuNum is 0 * NNICTL set classArgs as optional (#374) In nnictl, classArgs is not required, now set it as optional for some kind of tuner and assessor may not require classArgs. * Move the call of experimentDoneCleanUp into stopExperiment() method (#390) * Adjust sleep position for sdk_test.py * Exit dispather process if receive Terminate command * Add comment for sleep change in sdk_test.py * Add nniManagerIp in nnictl and trainingService (#393) Add nniManager Ip in nnictl, pai TrainingService and kubeflow TrainingService. If users set nniManagerIp, pai and kubeflow will use this ip instead of using getIPV4() function. Web UI will also use this nniManagerIp. * Fix trialjobstate (#385) * add one more trial job status, EARLY_STOPPED * fix datastore/nnimanager/mockeddatastore. test/webui/metrics_reader not done. 
USER_TO_CANCEL * fix bug * modifications based on Deshui's comments * fix bug * fix bug in remote mode * add NO_MORE_TRIAL state in experiment (#389) * Multi final metrics (#377) * Rest retrieve multiple final results for multiphase job * updates * mac support with local, remote & pai mode (#386) * update Makefile for mac support, wait for aka.ms support * refix Makefile for colorful echo * update Makefile with shorturl * fix false fail on mac webui * fix cross os remote tmpdir issue * add readonly to RemoteMachineTrainingService.remoteOS * fix var name for PR 386 * Merge v0.2 branch back to master for PR #273 (#400) * fix bugs due to ts.tailstream (#273) * Fix bugs and update webui doc (#397) * Support Azure k8s (#383) Support aks of kuberflow training service Support nnictl set nniManagerIp * [PAI training service] Support virtualCluster configuration (#401) * [PAI training service] Support virtual cluster config * fix a small bug to convert virtualCluster to string * Correct typo (#402) * Fix bug of webui's table (#407) * Fix bug * fix lint * Fix trial start time (#408) * Fix trial job start time * updates * updates * Add codeDir file count validation for setClusterConfig (#409) * Add codeDir file count validation for setClusterConfig * fix a small bug if find command is not installed * Remove codeDir validation for local training service * Remove useless import * Remove intermediate result (#410) * Trial keeper refactor (#411) * [Trial keeper refactor] refactor trial keeper stdout output * Refactor nnictl error information (#412) 1.Refactor nnictl information when validateion error. 2.Set kubernetesServer as optional. 
* Kubeflow training service documentation, v1 (#419) * Kubeflow training service documentation, v1 * Fix typos based on comments * Fix for issue #414 (#415) * update doc for "write trial" * fix link * issue 414 * Support to show 2 logPath, add hdfsLogPath (#420) * Support to show 2 logPath * fix lint * Update trial status color * fix sdk for NoMoreTrial status (#394) * fix bug * add docs * [Kubeflow training service] fix bug that wrongly split kube delete cmd into 2 lines (#425) * [Kubeflow training service] fix bug that wrongly split kube delete cmd into 2 lines * Adjust white space * Add AKS document (#422) 1.Add kubeflow in experiment config document 2.Add AKS in kubeflow document * Add macOS environment to CI pipeline * Dev hyperband (#405) * support hyperband * add example for hyperband * register Hyperband in tuner * after debug * update doc * trivial change * update spec validation of yaml config * modify nnictl launcher * modify nnimanager and util to support advisor * Quick fix nnictl config logic (#289) * fix nnictl bug * fix install.sh * add desc for Dockerfile.build.base * update document for Dockerfile * update * refactor port detect * update * refactor NNICTLDOC.md * add document for pai and nnictl * add default value for port * add exception handling in trial_keeper.py * fix port bug * fix resume * fix nnictl resume and fix nnictl stop * fix document * update * refactor nnictl * update * update doc * update * update nnictl * fix comment * revert dockerfile * update * update * update * fix nnictl error hit * fix comments * fix bash-completion * fix paramiko install * quick fix resume logic * update * quick fix nnictl * refactor sdk main * update unit test accordingly * update example's config file * update restserver validation * PR merge to 0.3 (#297) * refactor doc * update with Mao's suggestions * Set theme jekyll-theme-dinky * update doc * fix links * fix links * fix links * merge * fix links and doc errors * merge * merge * merge * merge * Update 
README.md (#288) added License badge * merge * updated the "Contribute" part (merged Gems' wiki in, updated ReadMe) * fix link * fix doc mistakes and broken links. (#271) * refactor doc * update with Mao's suggestions * Set theme jekyll-theme-dinky * updated the "Contribute" part (merged Gems' wiki in, updated ReadMe) * fix link * Update README.md * Fix misspelling in examples/trials/ga_squad/README.md * revise the installation cmd to v0.2 * revise to install v0.2 * remove files * update * remove enas readme (#292) * support checkpoint directory * Fix datastore performance issue (#301) * fix pylint * Fix nnictl in v0.3 (#299) Fix old version of config file fix sklearn requirements Fix resume log logic * modify log * trivial changes * update example * update makefile * update launcher.py to fix the problem of finding main.js * debug * add hyperparameter info into trial_end api * fix bug and update example * fix error induced by merge * support initialize * add doc for hyperband * fix bugs and add config_pai * fix bugs and add config_pai * fix bugs and add config_pai * fix bugs and add config_pai * update doc * add doc for advisor * fit * modification based on hui's comments * update doc * modify loguniform and lognormal (#395) * modify loguniform and lognormal * fix bug * fix bug * update doc * update doc * fix * update tpe for loguniform * update tpe for loguniform * update for loguniform * update for loguniform * update loguniform and qloguniform * update doc * update * revert * revert * revert * revert * fix ci (#433) * fix ci * expand time range * expand time range * Update doc6 (#428) * test * tuners * refactor doc of tuners * update * update assessor doc * update * update * Update KubeflowMode.md (#436) * Delete unnecessary letter 'a' (#439) Fix a spelling bug that may cause confuse * Update doc (#438) * update readme in ga_squad * fix typo * Update README.md * Update README.md * Update README.md * fix path * update README reference * fix bug in config file 
about batch tuner * update loguniform for smac (#430) * modify loguniform and lognormal * fix bug * fix bug * update doc * update doc * fix * update tpe for loguniform * update tpe for loguniform * update for loguniform * update for loguniform * update loguniform and qloguniform * update doc * update * revert * revert * revert * revert * update loguniform for smac * update loguniform for smac * update loguniform for smac * update loguniform for smac * Add distributed mnist training example, to show how to perform distri… (#435) * Add distributed mnist training example, to show how to perform distributed training on kubeflow for NNI * rename folder name to mnist_distributed * Remove duplicated is_chief check * [Kubeflow training service] Add document for installing NFS client (#442) * Add document for installing NFS client * [V0.4 Release] Kubeflow training service: Remove unued kubernetesServer config entry (#444) * Remove unused kubernetesServer config entry in config file and schema validation * Fix multiphase error message in nnimanager.log (#445) * NNI V0.4 Release: Update version from v0.3.4 to v0.4 (#446) * Change version number from v0.3.4 to v0.4, for NNI v0.4 release * update makefile & doc for pypi & installation (#440) * update pypi/makefile for multiple platform support * update linux os spec * udpate doc for installation & pypi * update readme * Update webui document (#443) * Update webui document * Fix comments of Chengmin * Update document v0.4 (#437) move nnictl folder delete kubernetsServer in nnictl refactor aks document add warning information to expand relative path update experiment status when the experiment crashed. 
* Update document for PAI mode to turn on 8081 port (#448) * Update document for PAI mode port * Update document for PAI mode to turn on 8081 port * [V0.4 Release] Docoment update for kubeflow and release notes (#450) * Docoment update for kubeflow and release notes * Refactor examples' default image (#449) * update * remove kubernetsServer in nnictl * update document * update * add warning to expand relative path * update doc * update * update * fix doc * update * update * update * refactor image * update * update * update * update * backward compatibility for mac: job end timestamp (#451) * add pycharm project files to .gitignore list * update pylintrc to conform vscode settings * fix RemoteMachineMode for wrong trainingServicePlatform * add python cache files to gitignore list * move extract scalar reward logic from dispatcher to tuner * update tuner code corresponding to last commit * update doc for receive_trial_result api change * add numpy to package whitelist of pylint * distinguish param value from return reward for tuner.extract_scalar_reward * update pylintrc * add comments to dispatcher.handle_report_metric_data * update install for mac support * fix root mode bug on Makefile * Quick fix bug: nnictl port value error (#245) * fix port bug * Dev exp stop more (#221) * Exp stop refactor (#161) * Update RemoteMachineMode.md (#63) * Remove unused classes for SQuAD QA example. * Remove more unused functions for SQuAD QA example. * Fix default dataset config. * Add Makefile README (#64) * update document (#92) * Edit readme.md * updated a word * Update GetStarted.md * Update GetStarted.md * refact readme, getstarted and write your trial md. 
* Update README.md * Update WriteYourTrial.md * Update WriteYourTrial.md * Update WriteYourTrial.md * Update WriteYourTrial.md * Fix nnictl bugs and add new feature (#75) * fix nnictl bug * fix nnictl create bug * add experiment status logic * add more information for nnictl * fix Evolution Tuner bug * refactor code * fix code in updater.py * fix nnictl --help * fix classArgs bug * update check response.status_code logic * remove Buffer warning (#100) * update readme in ga_squad * update readme * fix typo * Update README.md * Update README.md * Update README.md * Add support for debugging mode * fix setup.py (#115) * Add DAG model configuration format for SQuAD example. * Explain config format for SQuAD QA model. * Add more detailed introduction about the evolution algorithm. * Fix install.sh add add trial log path (#109) * fix nnictl bug * fix nnictl create bug * add experiment status logic * add more information for nnictl * fix Evolution Tuner bug * refactor code * fix code in updater.py * fix nnictl --help * fix classArgs bug * update check response.status_code logic * show trial log path * update document * fix install.sh * set default vallue for maxTrialNum and maxExecDuration * fix nnictl * Dev smac (#116) * support package install (#91) * fix nnictl bug * support package install * update * update package install logic * Fix package install issue (#95) * fix nnictl bug * fix pakcage install * support SMAC as a tuner on nni (#81) * update doc * update doc * update doc * update hyperopt installation * update doc * update doc * update description in setup.py * update setup.py * modify encoding * encoding * add encoding * remove pymc3 * update doc * update builtin tuner spec * support smac in sdk, fix logging issue * support smac tuner * add optimize_mode * update config in nnictl * add __init__.py * update smac * update import path * update setup.py: remove entry_point * update rest server validation * fix bug in nnictl launcher * support classArgs: 
optimize_mode * quick fix bug * test travis * add dependency * add dependency * add dependency * add dependency * create smac python package * fix trivial points * optimize import of tuners, modify nnictl accordingly * fix bug: incorrect algorithm_name * trivial refactor * for debug * support virtual * update doc of SMAC * update smac requirements * update requirements * change debug mode * update doc * update doc * refactor based on comments * fix comments * modify example config path to relative path and increase maxTrialNum (#94) * modify example config path to relative path and increase maxTrialNum * add document * support conda (#90) (#110) * support install from venv and travis CI * support install from venv and travis CI * support install from venv and travis CI * support conda * support conda * modify example config path to relative path and increase maxTrialNum * undo messy commit * undo messy commit * Support pip install as root (#77) * Typo on #58 (#122) * PAI Training Service implementation (#128) * PAI Training service implementation **1. Implement PAITrainingService **2. Add trial-keeper python module, and modify setup.py to install the module **3. Add PAItrainingService rest server to collect metrics from PAI container. 
* fix datastore for multiple final result (#129) * Update NNI v0.2 release notes (#132) Update NNI v0.2 release notes * Update setup.py Makefile and documents (#130) * update makefile and setup.py * update makefile and setup.py * update document * update document * Update Makefile no travis * update doc * update doc * fix convert from ss to pcs (#133) * Fix bugs about webui (#131) * Fix webui bugs * Fix tslint * webui logpath and document (#135) * Add webui document and logpath as a href * fix tslint * fix comments by Chengmin * Pai training service bug fix and enhancement (#136) * Add NNI installation scripts * Update pai script, update NNI_out_dir * Update NNI dir in nni sdk local.py * Create .nni folder in nni sdk local.py * Add check before creating .nni folder * Fix typo for PAI_INSTALL_NNI_SHELL_FORMAT * Improve annotation (#138) * Improve annotation * Minor bugfix * Selectively install through pip (#139) Selectively install through pip * update setup.py * fix paiTrainingService bugs (#137) * fix nnictl bug * add hdfs host validation * fix bugs * fix dockerfile * fix install.sh * update install.sh * fix dockerfile * Set timeout for HDFSUtility exists function * remove unused TODO * fix sdk * add optional for outputDir and dataDir * refactor dockerfile.base * Remove unused import in hdfsclientUtility * Add documentation for NNI PAI mode experiment (#141) * Add documentation for NNI PAI mode * Fix typo based on PR comments * Exit with subprocess return code of trial keeper * Remove additional exit code * Fix typo based on PR comments * update doc for smac tuner (#140) * Revert "Selectively install through pip (#139)" due to potential pip install issue (#142) * Revert "Selectively install through pip (#139)" This reverts commit 1d174836d3146a0363e9c9c88094bf9cff865faa. 
* Add exit code of subprocess for trial_keeper * Update README, add link to PAImode doc * Merge branch V0.2 to Master (#143) * webui logpath and document (#135) * Add webui document and logpath as a href * fix tslint * fix comments by Chengmin * Pai training service bug fix and enhancement (#136) * Add NNI installation scripts * Update pai script, update NNI_out_dir * Update NNI dir in nni sdk local.py * Create .nni folder in nni sdk local.py * Add check before creating .nni folder * Fix typo for PAI_INSTALL_NNI_SHELL_FORMAT * Improve annotation (#138) * Improve annotation * Minor bugfix * Selectively install through pip (#139) Selectively install through pip * update setup.py * fix paiTrainingService bugs (#137) * fix nnictl bug * add hdfs host validation * fix bugs * fix dockerfile * fix install.sh * update install.sh * fix dockerfile * Set timeout for HDFSUtility exists function * remove unused TODO * fix sdk * add optional for outputDir and dataDir * refactor dockerfile.base * Remove unused import in hdfsclientUtility * Add documentation for NNI PAI mode experiment (#141) * Add documentation for NNI PAI mode * Fix typo based on PR comments * Exit with subprocess return code of trial keeper * Remove additional exit code * Fix typo based on PR comments * update doc for smac tuner (#140) * Revert "Selectively install through pip (#139)" due to potential pip install issue (#142) * Revert "Selectively install through pip (#139)" This reverts commit 1d174836d3146a0363e9c9c88094bf9cff865faa. 
* Add exit code of subprocess for trial_keeper * Update README, add link to PAImode doc * fix bug (#147) * Refactor nnictl and add config_pai.yml (#144) * fix nnictl bug * add hdfs host validation * fix bugs * fix dockerfile * fix install.sh * update install.sh * fix dockerfile * Set timeout for HDFSUtility exists function * remove unused TODO * fix sdk * add optional for outputDir and dataDir * refactor dockerfile.base * Remove unused import in hdfsclientUtility * add config_pai.yml * refactor nnictl create logic and add colorful print * fix nnictl stop logic * add annotation for config_pai.yml * add document for start experiment * fix config.yml * fix document * Fix trial keeper wrongly exit issue (#152) * Fix trial keeper bug, use actual exitcode to exit rather than 1 * Fix bug of table sort (#145) * Update doc for PAIMode and v0.2 release notes (#153) * Update v0.2 documentation regards to release note and PAI training service * Update document to describe NNI docker image * fix antd (#159) * refactor experiment stopping logic * support change concurrency * remove trialJobs.ts * trivial changes * fix bugs * fix bug * support updating maxTrialNum * Modify IT scripts for supporting multiple experiments * Update ci (#175) * Update RemoteMachineMode.md (#63) * Remove unused classes for SQuAD QA example. * Remove more unused functions for SQuAD QA example. * Fix default dataset config. * Add Makefile README (#64) * update document (#92) * Edit readme.md * updated a word * Update GetStarted.md * Update GetStarted.md * refact readme, getstarted and write your trial md. 
* Update README.md * Update WriteYourTrial.md * Update WriteYourTrial.md * Update WriteYourTrial.md * Update WriteYourTrial.md * Fix nnictl bugs and add new feature (#75) * fix nnictl bug * fix nnictl create bug * add experiment status logic * add more information for nnictl * fix Evolution Tuner bug * refactor code * fix code in updater.py * fix nnictl --help * fix classArgs bug * update check response.status_code logic * remove Buffer warning (#100) * update readme in ga_squad * update readme * fix typo * Update README.md * Update README.md * Update README.md * Add support for debugging mode * modify CI cuz of refracting exp stop * update CI for expstop * update CI for expstop * update CI for expstop * update CI for expstop * update CI for expstop * update CI for expstop * update CI for expstop * update CI for expstop * update CI for expstop * file saving * fix issues from code merge * remove $(INSTALL_PREFIX)/nni/nni_manager before install * fix indent * fix merge issue * socket close * update port * fix merge error * modify ci logic in nnimanager * fix ci * fix bug * change suspended to done * update ci (#229) * update ci * update ci * update ci (#232) * update ci * update ci * update azure-pipelines * update azure-pipelines * update ci (#233) * update ci * update ci * update azure-pipelines * update azure-pipelines * update azure-pipelines * run.py (#238) * Nnupdate ci (#239) * run.py * test ci * Nnupdate ci (#240) * run.py * test ci * test ci * Udci (#241) * run.py * test ci * test ci * test ci * update ci (#242) * run.py * test ci * test ci * test ci * update ci * revert install.sh (#244) * run.py * test ci * test ci * test ci * update ci * revert install.sh * add comments * remove assert * trivial change * trivial change * update Makefile (#246) * update Makefile * update Makefile * quick fix for ci (#248) * add update trialNum and fix bugs (#261) * Add builtin tuner to CI (#247) * update Makefile * update Makefile * add builtin-tuner test * add 
builtin-tuner test * refractor ci * update azure.yml * add built-in tuner test * fix bugs * Doc refactor (#258) * doc refactor * image name refactor * Refactor nnictl to support listing stopped experiments. (#256) Refactor nnictl to support listing stopped experiments. * Show experiment parameters more beautifully (#262) * fix error on example of RemoteMachineMode (#269) * add pycharm project files to .gitignore list * update pylintrc to conform vscode settings * fix RemoteMachineMode for wrong trainingServicePlatform * Update docker file to use latest nni release (#263) * fix bug about execDuration and endTime (#270) * fix bug about execDuration and endTime * modify time interval to 30 seconds * refactor based on Gems's suggestion * for triggering ci * Refactor dockerfile (#264) * refactor Dockerfile * Support nnictl tensorboard (#268) support tensorboard * Sdk update (#272) * Rename get_parameters to get_next_parameter * annotations add get_next_parameter * updates * updates * updates * updates * updates * add experiment log path to experiment profile (#276) * refactor extract reward from dict by tuner * update Makefile for mac support, wait for aka.ms support * refix Makefile for colorful echo * update Makefile with shorturl * fix false fail on mac webui * fix cross os remote tmpdir issue * add readonly to RemoteMachineTrainingService.remoteOS * fix var name for PR 386 * cross platform package * update pypi/makefile for multiple platform support * update linux os spec * udpate doc for installation & pypi * update readme * job timestamp compatibility for mac * Update nni arch overview diagram (#447) * refactor doc * update with Mao's suggestions * Set theme jekyll-theme-dinky * update doc * fix links * fix links * fix links * merge * fix links and doc errors * merge * merge * merge * merge * Quick fix nnictl config logic (#289) * fix nnictl bug * fix install.sh * add desc for Dockerfile.build.base * update document for Dockerfile * update * refactor port detect * 
update * refactor NNICTLDOC.md * add document for pai and nnictl * add default value for port * add exception handling in trial_keeper.py * fix port bug * fix resume * fix nnictl resume and fix nnictl stop * fix document * update * refactor nnictl * update * update doc * update * update nnictl * fix comment * revert dockerfile * update * update * update * fix nnictl error hit * fix comments * fix bash-completion * fix paramiko install * quick fix resume logic * update * quick fix nnictl * merge * updated the "Contribute" part (merged Gems' wiki in, updated ReadMe) * fix link * revise the installation cmd to v0.2 * revise to install v0.2 * Update nnictl_utils.py * Update nnictl_utils.py * Update nnictl_utils.py * Update documentation for v0.3 * update release note * update v0.3.0 release note +1 * update doc for installation tag v0.3.3 * fix contributing doc problems * update doc for "write trial" * fix link * issue 414 * update arch overview diagram in README * update image * fix broken link * Update README.md * Correct typo, macOS -> MacOS * update ga_squad example (#461) * update ga_squad experiment example on pai * Update config_pai.yml * Update README.md * Update config_pai.yml * Update README.md * Update README.md * Update README.md * Update pai token by time interval (#434) Update pai token every 2 hours. * Support kuberflow pytorch-operator (#406) 1.Support pytorch-operator 2.remove unsupported operator * added search trail by id function (#455) * correct assessor typo (#463) correct assessor typos in several files. 
* Quick fix paiTrainingService (#465) quick fix paiTrainingService, add deferred.resolve(); * Add system requirements for NNI Installation * Fix nnictl multiThread option (#467) * Dev networkmorphism (#413) * Quick fix nnictl config logic (#289) * fix nnictl bug * fix install.sh * add desc for Dockerfile.build.base * update document for Dockerfile * update * refactor port detect * update * refactor NNICTLDOC.md * add document for pai and nnictl * add default value for port * add exception handling in trial_keeper.py * fix port bug * fix resume * fix nnictl resume and fix nnictl stop * fix document * update * refactor nnictl * update * update doc * update * update nnictl * fix comment * revert dockerfile * update * update * update * fix nnictl error hit * fix comments * fix bash-completion * fix paramiko install * quick fix resume logic * update * quick fix nnictl * PR merge to 0.3 (#297) * refactor doc * update with Mao's suggestions * Set theme jekyll-theme-dinky * update doc * fix links * fix links * fix links * merge * fix links and doc errors * merge * merge * merge * merge * Update README.md (#288) added License badge * merge * updated the "Contribute" part (merged Gems' wiki in, updated ReadMe) * fix link * fix doc mistakes and broken links. (#271) * refactor doc * update with Mao's suggestions * Set theme jekyll-theme-dinky * updated the "Contribute" part (merged Gems' wiki in, updated ReadMe) * fix link * Update README.md * Fix misspelling in examples/trials/ga_squad/README.md * revise the installation cmd to v0.2 * revise to install v0.2 * remove enas readme (#292) * Fix datastore performance issue (#301) * Fix nnictl in v0.3 (…