add regression demo config, fixed #10 #34
request changes.
regression/regression.py
Outdated
with paddle.layer.mixed(size=word_emb_dim, bias_attr=False) as input2_emb:
    input2_emb += paddle.layer.table_projection(
        input=input2,
        param_attr=paddle.attr.Param(name='_emb_basic', is_static=True))
Lines 16 to 19: please use paddle.layer.embedding.
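For context, a table projection and an embedding layer both amount to looking up one row of a weight table per word id, and the sum pooling that follows adds those rows together. The sketch below illustrates that computation in plain Python rather than with Paddle itself; `embedding_lookup`, `sum_pooling`, and the toy table are hypothetical names and values chosen only for illustration.

```python
def embedding_lookup(table, word_ids):
    """Return one embedding row per word id (a plain table lookup)."""
    return [table[i] for i in word_ids]


def sum_pooling(vectors):
    """Sum a sequence of equal-length vectors element-wise."""
    return [sum(column) for column in zip(*vectors)]


# Toy embedding table: 4 words, embedding dimension 2 (illustrative values).
table = [
    [0.0, 1.0],
    [1.0, 0.0],
    [0.5, 0.5],
    [2.0, -1.0],
]

sentence = [1, 2, 3]  # word ids of one input sentence
sentence_vec = sum_pooling(embedding_lookup(table, sentence))
```

Because the lookup is the same operation in both cases, replacing the mixed layer with a single embedding layer should only shorten the configuration, not change what is computed.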
done
regression/regression.py
Outdated
input2_vec = paddle.layer.pooling(
    input=input2_emb,
    pooling_type=paddle.pooling.Sum(),
    #act=paddle.activation.Tanh(),
Please remove the commented-out line if it is not needed.
done
regression/regression.py
Outdated
    act=paddle.activation.Tanh(),
    param_attr=paddle.attr.Param(name='_hidden_input2.w', is_static=True),
    bias_attr=paddle.attr.ParameterAttribute(
        name='_hidden_input2.bias', is_static=True))
Please turn lines 5 ~ 34 into a function and make is_static a parameter, because this code duplicates lines 37 ~ 59.
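A rough sketch of the requested refactoring, written in plain Python so it runs without Paddle: one function describes the parameters of an encoder branch, and `is_static` is passed in instead of being hard-coded in a duplicated block. The function name `encoder_param_specs`, the parameter names it generates, and the use of dicts in place of Paddle parameter attributes are all assumptions made for illustration.

```python
def encoder_param_specs(branch, is_static=False):
    """Describe the parameters of one encoder branch.

    Hypothetical stand-in: plain dicts take the place of Paddle
    parameter attributes so the sketch runs without Paddle.
    """
    names = [
        "emb_" + branch,
        "hidden_" + branch + ".w",
        "hidden_" + branch + ".bias",
    ]
    return [{"name": name, "is_static": is_static} for name in names]


# The same function serves both branches; only the flag differs.
source_params = encoder_param_specs("input1")                  # trainable
target_params = encoder_param_specs("input2", is_static=True)  # frozen
```

With this shape, the two near-identical blocks collapse into two calls that differ only in their arguments.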
done
regression/regression.py
Outdated
input1_dict_dim = input2_dict_dim = dict_size

#train the network
if not is_generating:
This example does not need to generate anything, so is_generating is not appropriate here.
done
regression/regression.py
Outdated
trainer = paddle.trainer.SGD(
    cost=cost, parameters=parameters, update_equation=optimizer)

# define data reader
Using the WMT14 machine translation dataset is fine for checking whether the configuration runs, but it is inappropriate in an example for users, because such inputs may not be reasonable for this task. I suggest providing a few example data samples instead.
done
request changes.
regression/regression.py
Outdated
import sys


def regression_net(input1_dict_dim, input2_dict_dim, is_generating=False):
is_generating is not used in the configuration, so it can be removed.
done
regression/regression.py
Outdated
def regression_net(input1_dict_dim, input2_dict_dim, is_generating=False):
    ### Network Architecture
    word_vector_dim = 512
word_vector_dim is not used in the configuration, so it can be removed.
done
regression/regression.py
Outdated
with paddle.layer.mixed(size=word_emb_dim, bias_attr=False) as input1_emb:
    input1_emb += paddle.layer.table_projection(
        input=input1,
        param_attr=paddle.attr.Param(name='emb_input1', initial_std=0.02))
Lines 36 ~ 44: please use paddle.layer.embedding.
done
regression/regression.py
Outdated
input1_vec = paddle.layer.pooling(
    input=input1_emb,
    pooling_type=paddle.pooling.Sum(),
    #act=paddle.activation.Tanh(),
Please remove the unused commented-out line.
done
regression/regression.py
Outdated
    bias_attr=paddle.attr.ParameterAttribute(name='_hidden_input1.bias'))

cost = paddle.layer.mse_cost(input=hidden_input1, label=hidden_input2)
#cost = paddle.layer.huber_cost(input=hidden_input1, label=hidden_input2)
Remove line 62. Different costs should be selected through a user-defined parameter, or implemented in a separate function.
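One way to select the cost through a user-defined parameter is a small dispatch function, sketched here in plain Python. The losses use textbook definitions of mean squared error and Huber loss, which may differ in scaling from Paddle's `mse_cost` and `huber_cost` layers; the names `regression_cost`, `cost_type`, and `delta` are assumptions for illustration.

```python
def mse_cost(pred, label):
    """Mean squared error over paired values."""
    return sum((p - y) ** 2 for p, y in zip(pred, label)) / len(pred)


def huber_cost(pred, label, delta=1.0):
    """Mean Huber loss: quadratic near zero, linear beyond delta."""
    total = 0.0
    for p, y in zip(pred, label):
        err = abs(p - y)
        if err <= delta:
            total += 0.5 * err ** 2
        else:
            total += delta * (err - 0.5 * delta)
    return total / len(pred)


# Registry of available costs, keyed by the user-facing name.
COSTS = {"mse": mse_cost, "huber": huber_cost}


def regression_cost(pred, label, cost_type="mse"):
    """Dispatch to the cost named by the user-supplied cost_type."""
    if cost_type not in COSTS:
        raise ValueError("unknown cost_type: %r" % cost_type)
    return COSTS[cost_type](pred, label)
```

A caller would then write `regression_cost(pred, label, cost_type="huber")` instead of toggling a commented-out line in the configuration.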
done
To keep the repository history reasonably clean, please squash the commits; there are too many "update readme.md" commits.
Add new example data
The configuration currently has fairly serious problems.
regression/README.md
Outdated
@@ -1 +1,262 @@
TBD
Remove the extra whitespace on the first line.
done
regression/README.md
Outdated
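# Regression

Regression is a classic problem in machine learning whose main goal is to construct a function that relates input data to output data. This example fits two semantically similar sentences: it builds networks with the same structure to encode the source data and the target data. It then iteratively updates the parameters of the source network to fit the encoding of the target data, completing a simple regression task. Users can test this configuration with the [WMT-14](https://github.com/PaddlePaddle/book/tree/develop/08.machine_translation#数据介绍) dataset from machine translation. For classic linear regression, see [fit_a_line](https://github.com/PaddlePaddle/book/tree/develop/01.fit_a_line).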
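- "Regression is a classic problem in machine learning whose main goal is to construct a function that relates input data to output data." This definition is inaccurate; please revise it.
- "This example fits two semantically similar sentences: it builds networks with the same structure to encode the source data and the target data." --> This explanation is wrong. What does it have to do with regression? Could you introduce the concept of a static parameter here?
- "Users can test this configuration with the WMT-14 dataset from machine translation." --> Please do not mention machine translation. As explained earlier, this is not a logical example.
- Please expand the background section to cover at least the following:
  - Background and concepts related to regression.
  - Application scenarios for regression problems.
  - A revised explanation of the example in this section.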
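The static parameter concept mentioned in this comment can be illustrated with a minimal sketch, assuming "static" means that a parameter is excluded from gradient updates during training. The sketch is plain Python, not Paddle's optimizer; `sgd_step`, `static_names`, and the toy values are hypothetical.

```python
def sgd_step(params, grads, static_names, learning_rate=0.1):
    """Apply one SGD update, leaving static (frozen) parameters unchanged."""
    updated = {}
    for name, value in params.items():
        if name in static_names:
            # Static parameters keep their current value.
            updated[name] = value
        else:
            updated[name] = value - learning_rate * grads[name]
    return updated


# Toy scalar parameters: the target branch is frozen, the source is trained.
params = {"source.w": 1.0, "target.w": 2.0}
grads = {"source.w": 0.5, "target.w": 0.5}
new_params = sgd_step(params, grads, static_names={"target.w"})
```

In the reviewed configuration, the parameters marked `is_static=True` would play the role of the frozen branch, so only the other branch moves toward the fixed encoding.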
done
regression/README.md
Outdated
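## Data Preparation

This example models the idea that two semantically similar sentences have similar encodings. A concrete example is shown in the figure below: the sentences on the left and right are semantically similar and are separated by `\t`.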
- After the background section, there is no need to explain the application background again here; just describe the data format.
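For reference, the format described in the README (two sentences per line, separated by `\t`) could be read roughly as follows. This is a sketch under the assumption that words within a sentence are separated by whitespace; `parse_line` and the sample line are made up for illustration and are not taken from the PR's data.

```python
def parse_line(line):
    """Split one tab-separated line into (source_words, target_words)."""
    source, target = line.rstrip("\n").split("\t")
    return source.split(), target.split()


# Made-up sample line: two similar sentences separated by a tab.
sample = "how are you\thow do you do\n"
source_words, target_words = parse_line(sample)
```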
done
regression/regression.py
Outdated
def encode_net(input_dict_dim, word_emb_dim, hidden_dim, is_static=False):
    """
    the input data is encoded to obtain the same size encoded vector
    params:
Please keep the docstring style consistent with Paddle's existing style. You can use this as a reference:
https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/trainer_config_helpers/data_sources.py#L35
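As a sketch of what that might look like, assuming the referenced file uses Sphinx-style `:param:` / `:type:` field lists, the docstring of `encode_net` could be written as below. The parameter descriptions are guesses based on the names, and the function body is a placeholder dict so the sketch runs without Paddle; neither is the PR's actual code.

```python
def encode_net(input_dict_dim, word_emb_dim, hidden_dim, is_static=False):
    """
    Encode an input word-id sequence into a fixed-size vector.

    :param input_dict_dim: size of the input dictionary.
    :type input_dict_dim: int
    :param word_emb_dim: dimension of the word embedding.
    :type word_emb_dim: int
    :param hidden_dim: dimension of the encoded vector.
    :type hidden_dim: int
    :param is_static: whether the parameters stay fixed during training.
    :type is_static: bool
    :return: a description of the encoder (placeholder for the real layer).
    :rtype: dict
    """
    # Placeholder body: the real function would build Paddle layers here.
    return {
        "input_dict_dim": input_dict_dim,
        "word_emb_dim": word_emb_dim,
        "hidden_dim": hidden_dim,
        "is_static": is_static,
    }
```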
done
@@ -0,0 +1,75 @@
import tarfile
The configuration currently has fairly serious problems. Let's discuss it carefully first.
done; a small sample dataset is still missing.