examples/nas/spos/readme.md
Outdated
## Step 2. Evolution Search

To have a search space ready for NNI framework, first run
Please explain in more detail why evolution search is needed.
examples/nas/spos/readme.md
Outdated
To have a search space ready for NNI framework, first run

```
nnictl ss_gen -t "python tester.py"
Please briefly explain `tester.py` here.
Explained above.
examples/nas/spos/readme.md
Outdated
Block search only. Channel search is not supported yet.

TODO: Reproduction results.
It would be better to briefly introduce the code and directory structure of the example.
# Conflicts:
#   src/sdk/pynni/nni/nas/pytorch/callbacks.py
#   src/sdk/pynni/nni/nas/pytorch/classic_nas/mutator.py
#   tools/nni_cmd/nnictl_utils.py
@@ -0,0 +1,91 @@
import os

import nvidia.dali.ops as ops
Agreed, we should provide a `requirements.txt`.
examples/nas/spos/network.py
Outdated
super().__init__()

assert input_size % 32 == 0
with open(os.path.join(os.path.dirname(__file__), "./data/op_flops_dict.pkl"), "rb") as fp:
Could we avoid hard-coding the path here and make it a parameter instead?
Sure.
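For illustration, here is a minimal sketch of what passing the path as a parameter could look like. The helper name `load_op_flops` and the constant `DEFAULT_OP_FLOPS_PATH` are hypothetical and not part of the PR; only the file name `op_flops_dict.pkl` comes from the snippet under review.

```python
import os
import pickle

# Hypothetical default, mirroring the relative location in the reviewed snippet.
DEFAULT_OP_FLOPS_PATH = os.path.join("data", "op_flops_dict.pkl")


def load_op_flops(op_flops_path=DEFAULT_OP_FLOPS_PATH):
    """Load the per-operator FLOPs lookup table from a caller-supplied path."""
    if not os.path.isfile(op_flops_path):
        raise FileNotFoundError("FLOPs table not found: " + op_flops_path)
    with open(op_flops_path, "rb") as fp:
        return pickle.load(fp)
```

The network constructor could then accept `op_flops_path` as an argument and forward it to such a helper, so callers can point at a different table without editing the source.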
def get_candidate_flops(self, candidate):
    conv1_flops = self._op_flops_dict["conv1"][(3, self._first_conv_channels,
                                                self._input_size, self._input_size, 2)]
    # Should use `last_conv_channels` here, but megvii insists that it's `n_classes`. Keeping it.
???
I believe it's their mistake, but without their confirmation I have to keep it.
This should not appear in our repo.
So your suggestion is to silently fix it?
examples/nas/spos/readme.md
Outdated
Only GPU version is provided here.

TODO: Reproduction results.
Can you report the result for now?
The results do not match the paper yet, so no.
We need to know the gap. Otherwise we cannot merge the branch, because the gap may point to bugs.
examples/nas/spos/readme.md
Outdated
## Preparation

### Requirements
Could you add a `requirements.txt`?
See above.
examples/nas/spos/readme.md
Outdated
To have a search space ready for NNI framework, first run

```
nnictl ss_gen -t "python tester.py"
Why does the user need to run this command? Can it run automatically?
No. This is by design.
examples/nas/spos/readme.md
Outdated
│   └── op_flops_dict.pkl
├── dataloader.py
├── network.py
├── nni_auto_gen_search_space.json
Does this file already exist? If so, is there no need to run `nnictl ss_gen -t "python tester.py"`?
Sorry. Will fix.
examples/nas/spos/readme.md
Outdated
├── data
│   ├── imagenet
│   │   ├── train
│   │   └── val
Does ImageNet have test data?
Yes. But it's not used.
examples/nas/spos/readme.md
Outdated
nnictl create --config config_search.yml
```

The final architecture exported from every epoch of evolution can be found in `checkpoints` under the working directory of your tuner, which, by default, is `$HOME/nni/experiments/$EXP_ID/log`.
What is `$EXP_ID`? Where can I find it?
It's the experiment ID. Sorry for the confusion.
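To make the path in the quoted README sentence concrete, here is a small sketch that assembles the default checkpoint directory from an experiment ID. The function name `checkpoint_dir` and the ID `"abc123"` are hypothetical; the directory layout is taken only from the README text quoted above.

```python
import os


def checkpoint_dir(experiment_id, home=None):
    """Build the default tuner checkpoint directory for a given experiment ID."""
    # Assumption: the tuner's working directory is left at the default
    # described in the README, i.e. $HOME/nni/experiments/$EXP_ID/log.
    home = home or os.path.expanduser("~")
    return os.path.join(home, "nni", "experiments", experiment_id, "log", "checkpoints")


# "abc123" is a placeholder experiment ID, used for illustration only.
print(checkpoint_dir("abc123"))
```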
examples/nas/spos/readme.md
Outdated
NOTE: The data loading used in the official repo is [slightly different from usual](https://github.com/megvii-model/SinglePathOneShot/issues/5), as they use BGR tensor and keep the values between 0 and 255 intentionally to align with their own DL framework. The option `--spos-preprocessing` will simulate the behavior used originally and enable you to use the checkpoints pretrained.

## Step 2. Evolution Search
How can I collect the top 10 architectures found by evolution search?
It's implemented in the tuner as `export_results`.
examples/nas/spos/scratch.py
Outdated
from network import ShuffleNetV2OneShot
from utils import CrossEntropyLabelSmooth, accuracy

logger = logging.getLogger("nni")
Change the logger name?
examples/nas/spos/scratch.py
Outdated
apply_fixed_architecture(model, args.architecture)
if torch.cuda.device_count() > 1:  # exclude last gpu, saving for data preprocessing on gpu
    model = nn.DataParallel(model, device_ids=list(range(0, torch.cuda.device_count() - 1)))
criterion = CrossEntropyLabelSmooth(1000, 0.1)
Make these values parameters?
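One way to act on this suggestion is to expose the two hard-coded values as command-line options. This is only a sketch: the flag names `--num-classes` and `--label-smoothing` are hypothetical, and the defaults simply repeat the `1000` and `0.1` from the snippet.

```python
import argparse


def build_parser():
    """Create a parser exposing the values that the snippet hard-codes."""
    parser = argparse.ArgumentParser("SPOS training options (illustrative)")
    # Hypothetical flag names; defaults match the hard-coded values above.
    parser.add_argument("--num-classes", type=int, default=1000)
    parser.add_argument("--label-smoothing", type=float, default=0.1)
    return parser


# Parsing an empty list yields the defaults, which keeps the sketch runnable
# without real command-line input.
args = build_parser().parse_args([])
# The criterion could then be built from the parsed values, for example:
# criterion = CrossEntropyLabelSmooth(args.num_classes, args.label_smoothing)
```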
examples/nas/spos/supernet.py
Outdated
model = ShuffleNetV2OneShot()
if args.load_checkpoint:
    if not args.spos_preprocessing:
        print("You might want to use SPOS preprocessing if you are loading their checkpoints.")
Use the logger instead of `print`?
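A minimal sketch of the logging-based alternative, assuming the standard-library `logging` module is acceptable here. The helper name `warn_if_preprocessing_mismatch` is hypothetical; the message text is copied from the snippet.

```python
import logging

logger = logging.getLogger(__name__)


def warn_if_preprocessing_mismatch(load_checkpoint, spos_preprocessing):
    """Log a warning when a checkpoint is loaded without SPOS preprocessing.

    Returns True if a warning was emitted, False otherwise.
    """
    if load_checkpoint and not spos_preprocessing:
        logger.warning("You might want to use SPOS preprocessing "
                       "if you are loading their checkpoints.")
        return True
    return False
```

Routing the message through a logger lets users filter or redirect it with the usual logging configuration, which a bare `print` does not allow.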
examples/nas/spos/supernet.py
Outdated
    model = nn.DataParallel(model, device_ids=list(range(0, torch.cuda.device_count() - 1)))
mutator = SPOSSupernetTrainingMutator(model, flops_func=model.module.get_candidate_flops,
                                      flops_lb=290E6, flops_ub=360E6)
criterion = CrossEntropyLabelSmooth(1000, 0.1)
Same here.
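For readers unfamiliar with the `flops_lb` and `flops_ub` arguments in the snippet, one plausible reading is that candidates are resampled until their FLOPs fall inside the bounds. The sketch below illustrates that idea with toy data; `sample_within_flops`, `max_tries`, and the per-block costs are all hypothetical and are not taken from the mutator's actual implementation.

```python
import random


def sample_within_flops(sample_fn, flops_fn, flops_lb, flops_ub, max_tries=1000):
    """Draw candidates until one has flops_lb <= flops_fn(candidate) <= flops_ub."""
    for _ in range(max_tries):
        candidate = sample_fn()
        if flops_lb <= flops_fn(candidate) <= flops_ub:
            return candidate
    raise RuntimeError("No candidate within the FLOPs bounds after %d tries" % max_tries)


# Toy setup: a candidate is a list of four per-block costs (made-up numbers),
# and its total FLOPs is just their sum.
toy_costs = [60e6, 80e6, 100e6]


def toy_sample():
    return [random.choice(toy_costs) for _ in range(4)]


candidate = sample_within_flops(toy_sample, sum, flops_lb=290e6, flops_ub=360e6)
```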
examples/nas/spos/tester.py
Outdated
from network import ShuffleNetV2OneShot, load_and_parse_state_dict
from utils import CrossEntropyLabelSmooth, accuracy

logger = logging.getLogger("nni")
Change the logger name.
retrain_bn(model, criterion, args.train_iters, args.log_frequency, loader_train)
acc = test_acc(model, criterion, args.log_frequency, loader_test)
assert isinstance(acc, float)
nni.report_intermediate_result(acc)
Why report the intermediate result and the final result at the same time? What is the intermediate result used for?
For visualization. In the current NNI version, the final result is not displayed on the intermediate result page.
@@ -0,0 +1,91 @@
import os

import nvidia.dali.ops as ops
Can it be executed by a shell script?
def _random_candidate(self):
    chosen_arch = dict()
    for key, val in self._search_space.items():
        if val["_type"] == "layer_choice":
Same comments as in the other PR.
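Since the snippet is cut off after the `layer_choice` check, here is a standalone sketch of how uniform random sampling over such a search space might proceed. It is not the PR's code: the output format (`_value` and `_idx`), the `rng` parameter, and the toy search space are assumptions made for illustration.

```python
import random


def random_candidate(search_space, rng=random):
    """Pick one option uniformly at random for every layer choice."""
    chosen_arch = {}
    for key, val in search_space.items():
        if val["_type"] == "layer_choice":
            choices = val["_value"]
            index = rng.randrange(len(choices))
            # Assumed output format: the chosen value plus its index.
            chosen_arch[key] = {"_value": choices[index], "_idx": index}
        else:
            # Other mutable types are outside the scope of this sketch.
            raise NotImplementedError("Unsupported type: " + str(val["_type"]))
    return chosen_arch


# Hypothetical search space with two layer choices.
toy_space = {
    "block_0": {"_type": "layer_choice", "_value": ["op_a", "op_b", "op_c"]},
    "block_1": {"_type": "layer_choice", "_value": ["op_a", "op_b"]},
}
arch = random_candidate(toy_space)
```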
Conflicts resolved.