Hsigmoid op #11063
Conversation
Thanks!!
void Make() override {
  AddInput("X",
           "(Tensor, required) The input Tensor, which the shape is"
           "[N * D], which N is the size of mini-batch,"
Maybe [N, D] is better than [N * D].
Done
"[num_classes - 1, D]"); | ||
AddInput("Ids", | ||
"(Tensor, required), The labels of training data. It's a" | ||
"1-D tensor, which the shape is [1, N]"); |
Should Ids be a 2-D tensor with shape [N, 1]? And might Label be a better name than Ids?
Done
"D is the embded size"); | ||
AddInput("W", | ||
"(Tensor, required), The parameters of hierarchical " | ||
"sigmoid operator, each of them is s a 3-D tensor, the shape is" |
Should W be a 2-D tensor?
Done
"(Tensor, required), The labels of training data. It's a" | ||
"1-D tensor, which the shape is [1, N]"); | ||
AddInput("Bias", | ||
"(Tensor, optional), The bias is a 1-D tensor, " |
Maybe we can reformulate this as "The bias is a tensor with shape [1, num_classes - 1]" if we care about whether it is 1-D or 2-D.
Done
"1-D tensor, which the shape is [1, N]"); | ||
AddInput("Bias", | ||
"(Tensor, optional), The bias is a 1-D tensor, " | ||
"which is applied to the output, the shape is" |
"applied to the output" is confusing; actually, the bias is applied before the final output.
Done
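To illustrate the resolved wording, here is a hedged NumPy sketch (the shapes and names are hypothetical stand-ins, not the PR's actual implementation): the bias is added to the pre-activation term, so it takes effect before the sigmoid rather than on the final output.

```python
import numpy as np

# Hypothetical shapes, only to make the sketch runnable.
N, D, num_classes = 4, 8, 6
x = np.random.rand(N, D)                    # input, [N, D]
w = np.random.rand(num_classes - 1, D)      # parameters, [num_classes - 1, D]
bias = np.random.rand(1, num_classes - 1)   # bias, [1, num_classes - 1]

# The bias enters the pre-activation, before the sigmoid:
pre_output = x.dot(w.T) + bias                  # [N, num_classes - 1]
activated = 1.0 / (1.0 + np.exp(-pre_output))   # sigmoid applied after the bias
```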
    Args:
        input (Variable): (Tensor) The input Tensor, which the shape is
            [N * D], which N is the size of mini-batch,D is the embded size
        label (Variable): (Tensor), The labels of training data. It's a
The shape of label should be [N, 1].
Done
        program = Program()
        with program_guard(program):
            x = layers.data(name='x', shape=[2, 2], dtype='float32')
            y = layers.data(name='y', shape=[1, 2], dtype='int64')
Please refer to the data shape in test_softmax_with_cross_entropy. If defined as above, the actual shape of x will be [-1, 1, 2].
Done
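For readers following along, a minimal sketch of the point (assuming the paddle.fluid API used elsewhere in this PR): layers.data prepends an implicit -1 batch dimension unless told otherwise, so the shape argument should not include the batch size.

```python
import paddle.fluid.layers as layers
from paddle.fluid import Program, program_guard

program = Program()
with program_guard(program):
    # With the default append_batch_size=True, a -1 batch dimension is
    # prepended, so shape=[2, 2] compiles to an actual shape of (-1, 2, 2).
    x = layers.data(name='x', shape=[2, 2], dtype='float32')
    print(x.shape)
    # Passing append_batch_size=False keeps the shape exactly as written.
    y = layers.data(
        name='y', shape=[2, 1], dtype='int64', append_batch_size=False)
    print(y.shape)  # (2, 1)
```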
                sum += pre_output[i][j]
            out[i] = -1.0 * sum
        # soft relu
        np.clip(pre_output, -40.0, 40.0)
np.clip is not in-place, thus this has no effect on pre_output.
Done
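A quick plain-NumPy illustration of the point (independent of the PR's test code): np.clip returns a new array, so the result must be rebound or written back with out=.

```python
import numpy as np

pre_output = np.array([-100.0, 0.0, 100.0])

np.clip(pre_output, -40.0, 40.0)    # returns a clipped copy; result discarded
print(pre_output)                   # [-100.    0.  100.] -- unchanged

pre_output = np.clip(pre_output, -40.0, 40.0)   # rebind to the clipped copy
# Alternatively, clip in place without a new allocation:
np.clip(pre_output, -40.0, 40.0, out=pre_output)
print(pre_output)                   # [-40.   0.  40.]
```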
                    sum += w[idx][l] * x[j][l]
                pre_output[j][k] += sum
        # clip[-40.0, 40.0]
        np.clip(pre_output, -40.0, 40.0)
np.clip is not in-place, thus this has no effect on pre_output.
Done
        for k in range(length):
            idx = code_table.cal_index(k)
            sum = 0.0
            for l in range(x.shape[1]):
Maybe you can replace this with np.dot or something else.
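A hedged sketch of this suggestion (the names length, code_table, w, x, j, and pre_output come from the test code above; the stand-in definitions here are hypothetical, only to make it runnable): the inner l-loop is a dot product over the embedding dimension.

```python
import numpy as np

# Hypothetical stand-ins for the surrounding test fixtures.
length, D = 5, 8
x = np.random.rand(2, D)
w = np.random.rand(10, D)
pre_output = np.zeros((2, length))
j = 0


class CodeTable(object):
    """Placeholder for the code_table object used in the test."""

    def cal_index(self, k):
        return k


code_table = CodeTable()

for k in range(length):
    idx = code_table.cal_index(k)
    # np.dot collapses the explicit accumulation over l:
    pre_output[j][k] += np.dot(w[idx], x[j])
```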
Make hsigmoid right based on @Yancey1989's work.