[BUG] Antares IR problem in AvgPool operator #172

Closed
xysmlx opened this issue Dec 16, 2020 · 1 comment · Fixed by #188 or #220
Labels
bug Something isn't working

Comments

Contributor

xysmlx commented Dec 16, 2020

Wrongly translated IR for AvgPool in frozen_nasnet_cifar_nchw_infer_bs1.const_folded.pb:

[ERROR] 2020-12-16T03:31:52z src/nnfusion/engine/pass/graph/kernel_tuning.cpp 59         - einstein_v2(" temp0[N, C, HO, WO] +=! input0[N, C, HO * 2 + KH , WO * 2 + KW] where HO in 16, WO in 16, KH in 2, KW in 2; output0[N, C, HO, WO] = temp0[N, C, HO, WO] * 2.500000000000000000000000e-01;", input_dict={ "input0" : { "dtype" : "float32", "shape" : [1, 64, 32, 32]} })
[ERROR] Traceback (most recent call last):
  File "./antares/antares_compiler.py", line 617, in get
    code = main_compute(code_only=True)
  File "./antares/antares_compiler.py", line 377, in main_compute
    task = autotvm.task.create("template_op", args=(), target=tvm_target)
  File "/opt/tvm/python/tvm/autotvm/task/task.py", line 457, in create
    sch, _ = ret.func(*args)
  File "/opt/tvm/python/tvm/autotvm/task/task.py", line 236, in __call__
    return self.fcustomized(*args, **kwargs)
  File "/antares/lang/generic.py", line 222, in get_template_op
    traverse_inline(sch, outputs[0].op, _callback)
  File "/antares/lang/generic.py", line 130, in traverse_inline
    callback(explicit_ops[-1], explicit_ops)
  File "/antares/lang/generic.py", line 220, in _callback
    do_native_scheduling(attrs)
  File "/antares/lang/generic.py", line 158, in do_native_scheduling
    return select_plan(plan)
  File "/antares/lang/generic.py", line 139, in select_plan
    schedule_lib.schedule(attrs)
  File "/antares/platforms/c-cuda/schedule/standard/default.py", line 37, in schedule
    ax_obj = output.op.axis[th_idx[i]]
  File "/opt/tvm/python/tvm/ir/container.py", line 36, in __getitem__
    return getitem_helper(self, _ffi_node_api.ArrayGetItem, len(self), idx)
  File "/opt/tvm/python/tvm/runtime/container.py", line 57, in getitem_helper
    raise IndexError("Index out of range. size: {}, got index {}".format(length, idx))
IndexError: Index out of range. size: 3, got index 3

[INFO] 2020-12-16T03:31:52z src/nnfusion/engine/pass/graph/kernel_tuning.cpp 240        AvgPool, ir:  - einstein_v2(" temp0[N, C, HO, WO] +=! input0[N, C, HO * 2 + KH , WO * 2 + KW] where HO in 16, WO in 16, KH in 2, KW in 2; output0[N, C, HO, WO] = temp0[N, C, HO, WO] * 2.500000000000000000000000e-01;", input_dict={ "input0" : { "dtype" : "float32", "shape" : [1, 64, 32, 32]} })
[ERROR] 2020-12-16T03:31:52z src/nnfusion/engine/pass/graph/kernel_tuning.cpp 59         - einstein_v2(" temp0[N, C, HO, WO] +=! input0[N, C, HO * 2 + KH , WO * 2 + KW] where HO in 8, WO in 8, KH in 2, KW in 2; output0[N, C, HO, WO] = temp0[N, C, HO, WO] * 2.500000000000000000000000e-01;", input_dict={ "input0" : { "dtype" : "float32", "shape" : [1, 128, 16, 16]} })
[ERROR] Traceback (most recent call last):
  File "./antares/antares_compiler.py", line 617, in get
    code = main_compute(code_only=True)
  File "./antares/antares_compiler.py", line 377, in main_compute
    task = autotvm.task.create("template_op", args=(), target=tvm_target)
  File "/opt/tvm/python/tvm/autotvm/task/task.py", line 457, in create
    sch, _ = ret.func(*args)
  File "/opt/tvm/python/tvm/autotvm/task/task.py", line 236, in __call__
    return self.fcustomized(*args, **kwargs)
  File "/antares/lang/generic.py", line 222, in get_template_op
    traverse_inline(sch, outputs[0].op, _callback)
  File "/antares/lang/generic.py", line 130, in traverse_inline
    callback(explicit_ops[-1], explicit_ops)
  File "/antares/lang/generic.py", line 220, in _callback
    do_native_scheduling(attrs)
  File "/antares/lang/generic.py", line 158, in do_native_scheduling
    return select_plan(plan)
  File "/antares/lang/generic.py", line 139, in select_plan
    schedule_lib.schedule(attrs)
  File "/antares/platforms/c-cuda/schedule/standard/default.py", line 37, in schedule
    ax_obj = output.op.axis[th_idx[i]]
  File "/opt/tvm/python/tvm/ir/container.py", line 36, in __getitem__
    return getitem_helper(self, _ffi_node_api.ArrayGetItem, len(self), idx)
  File "/opt/tvm/python/tvm/runtime/container.py", line 57, in getitem_helper
    raise IndexError("Index out of range. size: {}, got index {}".format(length, idx))
IndexError: Index out of range. size: 3, got index 3

[INFO] 2020-12-16T03:31:52z src/nnfusion/engine/pass/graph/kernel_tuning.cpp 240        AvgPool, ir:  - einstein_v2(" temp0[N, C, HO, WO] +=! input0[N, C, HO * 2 + KH , WO * 2 + KW] where HO in 8, WO in 8, KH in 2, KW in 2; output0[N, C, HO, WO] = temp0[N, C, HO, WO] * 2.500000000000000000000000e-01;", input_dict={ "input0" : { "dtype" : "float32", "shape" : [1, 128, 16, 16]} })
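
For reference, the two einstein_v2 expressions above read as a 2x2 average pool with stride 2 and no padding: a sum over KH and KW followed by a multiply by 0.25. The sketch below is only a plain-Python reading of that expression, useful for checking generated kernels against; `avgpool_reference` is a hypothetical helper name, not part of Antares or NNFusion, and it does not reproduce the scheduling IndexError shown in the tracebacks.

```python
def avgpool_reference(x, kernel=2, stride=2):
    """Average pool over a nested list shaped [N][C][H][W], no padding.

    Mirrors: temp0[N, C, HO, WO] +=! input0[N, C, HO * stride + KH, WO * stride + KW]
             output0 = temp0 * (1 / (kernel * kernel))
    """
    n_, c_ = len(x), len(x[0])
    h, w = len(x[0][0]), len(x[0][0][0])
    ho_n = (h - kernel) // stride + 1   # 32 -> 16 and 16 -> 8 for kernel=2, stride=2
    wo_n = (w - kernel) // stride + 1
    scale = 1.0 / (kernel * kernel)     # 0.25 in the logged IR
    out = [[[[0.0] * wo_n for _ in range(ho_n)] for _ in range(c_)] for _ in range(n_)]
    for n in range(n_):
        for c in range(c_):
            for ho in range(ho_n):
                for wo in range(wo_n):
                    acc = 0.0
                    for kh in range(kernel):
                        for kw in range(kernel):
                            acc += x[n][c][ho * stride + kh][wo * stride + kw]
                    out[n][c][ho][wo] = acc * scale
    return out


# Tiny 1x1x4x4 input holding the values 0..15 in row-major order.
x = [[[[float(r * 4 + col) for col in range(4)] for r in range(4)]]]
result = avgpool_reference(x)
```

With the 4x4 input above, each output element is the mean of one non-overlapping 2x2 window.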

Update: Antares hangs with the following error:

- einstein_v2(" temp0[N, C, HO, WO] +=! input0[N, C, HO * 1 + KH - 1 , WO * 1 + KW - 1].when([HO * 1 + KH - 1 >= 0, HO * 1 + KH - 1 < 32 , WO * 1 + KW - 1 >= 0, WO * 1 + KW - 1 < 32], 0.0) where HO in 32, WO in 32, KH in 3, KW in 3; output0[N, C, HO, WO] = temp0[N, C, HO, WO] * 1.111111119389533996582031e-01;", input_dict={ "input0" : { "dtype" : "float32", "shape" : [1, 96, 32, 32]} })
Traceback (most recent call last):
  File "./antares/antares_compiler.py", line 664, in <module>
    main_compute()
  File "./antares/antares_compiler.py", line 534, in main_compute
    tuner.tune(n_trial=num_trials, callbacks=callbacks, measure_option=None)
  File "/antares/tuner/Ansor/main.py", line 66, in tune
    auto_scheduler.auto_schedule(self.auto_task, tuning_options=auto_scheduler.TuningOptions(num_measure_trials=n_trial, runner=self.measure_ctx.runner, measure_callbacks=[]))
  File "/opt/tvm/python/tvm/auto_scheduler/auto_schedule.py", line 186, in auto_schedule
    sch, tensors = _ffi_api.AutoSchedule(search_policy, tuning_options)
  File "/opt/tvm/python/tvm/_ffi/_ctypes/packed_func.py", line 237, in __call__
    raise get_last_ffi_error()
KeyError: 'Traceback (most recent call last):\n  [bt] (5) /opt/tvm/build/libtvm.so(TVMFuncCall+0x65) [0x7fb65416f375]\n  [bt] (4) /opt/tvm/build/libtvm.so(+0x5ccafe) [0x7fb6535b8afe]\n  [bt] (3) /opt/tvm/build/libtvm.so(tvm::auto_scheduler::AutoSchedule(tvm::auto_scheduler::SearchPolicy, tvm::auto_scheduler::TuningOptions)+0x116) [0x7fb6535b7f06]\n  [bt] (2) /opt/tvm/build/libtvm.so(tvm::auto_scheduler::SketchPolicyNode::Search(int, int, int, tvm::auto_scheduler::ProgramMeasurer)+0x686) [0x7fb653641376]\n  [bt] (1) /opt/tvm/build/libtvm.so(tvm::auto_scheduler::PythonBasedModelNode::Update(tvm::runtime::Array<tvm::auto_scheduler::MeasureInput, void> const&, tvm::runtime::Array<tvm::auto_scheduler::MeasureResult, void> const&)+0x95) [0x7fb6535e3f15]\n  [bt] (0) /opt/tvm/build/libtvm.so(+0x117fe0b) [0x7fb65416be0b]\n  File "/opt/tvm/python/tvm/_ffi/_ctypes/packed_func.py", line 81, in cfun\n    rv = local_pyfunc(*pyargs)\n  File "/opt/tvm/python/tvm/auto_scheduler/cost_model/cost_model.py", line 93, in update_func\n    self.update(inputs, results)\n  File "/opt/tvm/python/tvm/auto_scheduler/cost_model/xgb_model.py", line 176, in update\n    verbose_eval=self.verbose_eval,\n  File "/usr/local/lib/python3.6/dist-packages/xgboost/training.py", line 212, in train\n    xgb_model=xgb_model, callbacks=callbacks)\n  File "/usr/local/lib/python3.6/dist-packages/xgboost/training.py", line 100, in _train_internal\n    evaluation_result_list=evaluation_result_list))\n  File "/opt/tvm/python/tvm/auto_scheduler/cost_model/xgb_model.py", line 618, in callback\n    best_msg = state["best_msg"]\nKeyError: \'best_msg\''
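
The expression in this update adds a `.when([...], 0.0)` guard, which reads as a 3x3, stride-1 average pool with one element of zero padding: out-of-range reads contribute 0.0 and the sum is always scaled by roughly 1/9. A minimal plain-Python sketch of that reading follows. `avgpool_3x3_pad1` is a hypothetical name, the sketch divides by 9.0 rather than multiplying by the float32 constant in the log, and it says nothing about the `best_msg` KeyError raised in the tuner.

```python
def avgpool_3x3_pad1(x):
    """3x3 average pool, stride 1, zero padding 1, over a nested list [N][C][H][W]."""
    n_, c_ = len(x), len(x[0])
    h, w = len(x[0][0]), len(x[0][0][0])
    out = [[[[0.0] * w for _ in range(h)] for _ in range(c_)] for _ in range(n_)]
    for n in range(n_):
        for c in range(c_):
            for ho in range(h):
                for wo in range(w):
                    acc = 0.0
                    for kh in range(3):
                        for kw in range(3):
                            ih, iw = ho + kh - 1, wo + kw - 1
                            # Mirrors .when([...], 0.0): skip reads outside the input.
                            if 0 <= ih < h and 0 <= iw < w:
                                acc += x[n][c][ih][iw]
                    # Padded positions are counted in the divisor, as in the logged IR.
                    out[n][c][ho][wo] = acc / 9.0
    return out


ones = [[[[1.0] * 3 for _ in range(3)]]]
padded = avgpool_3x3_pad1(ones)
```

On an all-ones 3x3 input, the centre sees nine valid elements, an edge sees six, and a corner sees four, so the outputs differ even though the input is constant.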

Update 2021-02-23:

[INFO] 2021-02-23T09:13:31z src/nnfusion/core/kernels/cuda_gpu/cuda_emitter.hpp 211     Translate for AvgPool
[INFO] 2021-02-23T09:13:31z src/nnfusion/core/kernels/antares_ke_imp.cpp 39     [Autogen]  - einstein_v2(" mediate0[N, C, HO, WO] +=! input0[N, C, HO * 1 + KH - 1 , WO * 1 + KW - 1] where HO in 16, WO in 16, KH in 3, KW in 3; output0[N, C, HO, WO] = mediate0[N, C, HO, WO] / 9;", input_dict={ "input0" : { "dtype" : "float32", "shape" : [1, 64, 16, 16]} })   (tuned = 1)
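
Compared with the 3x3 expression in the earlier update, this one keeps the index `HO * 1 + KH - 1` but has no `.when(...)` bounds guard. The issue does not say whether that is the defect being reported, so the following is only a sanity check of the index range under that reading. `index_range` is a hypothetical helper, not an Antares or NNFusion function.

```python
def index_range(out_extent, kernel, stride=1, pad=1):
    """Return (min, max) of o * stride + k - pad for o in [0, out_extent), k in [0, kernel)."""
    lo = -pad
    hi = (out_extent - 1) * stride + (kernel - 1) - pad
    return lo, hi


# Values taken from the logged expression: HO in 16, KH in 3, input height 16.
lo, hi = index_range(out_extent=16, kernel=3)
in_extent = 16
out_of_bounds = lo < 0 or hi >= in_extent
print(lo, hi, out_of_bounds)  # -1 16 True
```

Under these assumptions the unguarded read would touch rows -1 and 16 of a 16-row input; the same arithmetic applies to the WO/KW axis.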
@xysmlx xysmlx added the bug Something isn't working label Dec 16, 2020

nnfbot commented Dec 16, 2020

Thanks for the report @xysmlx! I will look into it ASAP! (I'm a bot).

@xiayuqing0622 xiayuqing0622 linked a pull request Dec 21, 2020 that will close this issue
@xysmlx xysmlx reopened this Feb 23, 2021
@xysmlx xysmlx linked a pull request Feb 23, 2021 that will close this issue