add regression demo config, fixed #10 #34

chrisxu2016 · 2017-05-09T04:42:41Z

No description provided.

lcy-seso

request changes.

lcy-seso · 2017-05-09T06:02:46Z

regression/regression.py

+    with paddle.layer.mixed(size=word_emb_dim, bias_attr=False) as input2_emb:
+        input2_emb += paddle.layer.table_projection(
+            input=input2,
+            param_attr=paddle.attr.Param(name='_emb_basic', is_static=True))


Lines 16 to 19: please use paddle.layer.embedding.

lcy-seso · 2017-05-09T06:04:03Z

regression/regression.py

+    input2_vec = paddle.layer.pooling(
+        input=input2_emb,
+        pooling_type=paddle.pooling.Sum(),
+        #act=paddle.activation.Tanh(),


Please remove the commented-out code if it is not needed.

lcy-seso · 2017-05-09T06:06:00Z

regression/regression.py

+        act=paddle.activation.Tanh(),
+        param_attr=paddle.attr.Param(name='_hidden_input2.w', is_static=True),
+        bias_attr=paddle.attr.ParameterAttribute(
+            name='_hidden_input2.bias', is_static=True))


Please make lines 5 ~ 34 a function and make is_static a parameter, because the same code is repeated in lines 37 ~ 59.
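A minimal sketch of what this refactoring could look like, assuming the PaddlePaddle v2 Python API (`import paddle.v2 as paddle`). The arguments `input_name` and `prefix`, the data layer names, and the parameter names derived from `prefix` are hypothetical and only serve the illustration; they are not taken from the PR. Setting `is_static=True` on a parameter attribute keeps that parameter fixed during training.

```python
import paddle.v2 as paddle


def encode_net(input_name, dict_dim, emb_dim, hidden_dim, prefix, is_static=False):
    # Word-id sequence input; the layer name is chosen by the caller.
    words = paddle.layer.data(
        name=input_name,
        type=paddle.data_type.integer_value_sequence(dict_dim))
    # paddle.layer.embedding stands in for the mixed + table_projection pair.
    emb = paddle.layer.embedding(
        input=words,
        size=emb_dim,
        param_attr=paddle.attr.Param(name=prefix + "_emb", is_static=is_static))
    # Sum-pool the word embeddings into one fixed-size vector per sequence.
    vec = paddle.layer.pooling(input=emb, pooling_type=paddle.pooling.Sum())
    return paddle.layer.fc(
        input=vec,
        size=hidden_dim,
        act=paddle.activation.Tanh(),
        param_attr=paddle.attr.Param(
            name=prefix + "_hidden.w", is_static=is_static),
        bias_attr=paddle.attr.Param(
            name=prefix + "_hidden.bias", is_static=is_static))


def regression_net(input1_dict_dim, input2_dict_dim, emb_dim, hidden_dim):
    # Trainable encoder for the first input.
    hidden_input1 = encode_net(
        "input1", input1_dict_dim, emb_dim, hidden_dim, prefix="_input1")
    # Frozen encoder for the second input: its output is the regression target.
    hidden_input2 = encode_net(
        "input2", input2_dict_dim, emb_dim, hidden_dim,
        prefix="_input2", is_static=True)
    return paddle.layer.mse_cost(input=hidden_input1, label=hidden_input2)
```

With the two branches built by one function, the only difference between them is the `is_static` flag and the name prefix.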

lcy-seso · 2017-05-09T06:08:12Z

regression/regression.py

+    input1_dict_dim = input2_dict_dim = dict_size
+
+    #train the network
+    if not is_generating:


This example does not generate anything, so is_generating is not appropriate.

lcy-seso · 2017-05-09T07:48:30Z

regression/regression.py

+        trainer = paddle.trainer.SGD(
+            cost=cost, parameters=parameters, update_equation=optimizer)
+
+        # define data reader


For testing whether the configuration runs, using the WMT14 machine translation dataset is fine, but it is inappropriate in an example for users, because such inputs may not be reasonable. I suggest providing a few lines of example data.

lcy-seso

request changes.

lcy-seso · 2017-05-09T10:36:32Z

regression/regression.py

+import sys
+
+
+def regression_net(input1_dict_dim, input2_dict_dim, is_generating=False):


is_generating is not used in the configuration, so it can be removed.

lcy-seso · 2017-05-09T10:37:30Z

regression/regression.py

+
+def regression_net(input1_dict_dim, input2_dict_dim, is_generating=False):
+    ### Network Architecture
+    word_vector_dim = 512


word_vector_dim is not used in the configuration, so it can be removed.

lcy-seso · 2017-05-09T10:39:12Z

regression/regression.py

+    with paddle.layer.mixed(size=word_emb_dim, bias_attr=False) as input1_emb:
+        input1_emb += paddle.layer.table_projection(
+            input=input1,
+            param_attr=paddle.attr.Param(name='emb_input1', initial_std=0.02))


Lines 36 ~ 44: please use paddle.layer.embedding.

lcy-seso · 2017-05-09T10:39:36Z

regression/regression.py

+    input1_vec = paddle.layer.pooling(
+        input=input1_emb,
+        pooling_type=paddle.pooling.Sum(),
+        #act=paddle.activation.Tanh(),


please remove the useless comment.

lcy-seso · 2017-05-09T10:41:07Z

regression/regression.py

+        bias_attr=paddle.attr.ParameterAttribute(name='_hidden_input1.bias'))
+
+    cost = paddle.layer.mse_cost(input=hidden_input1, label=hidden_input2)
+    #cost = paddle.layer.huber_cost(input=hidden_input1, label=hidden_input2)


Remove line 62. Different costs should be chosen through a user-defined parameter, or by creating another function.
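One way to select the cost through a parameter, as a hedged sketch. The helper `build_cost` and the `cost_layers` registry are hypothetical names, not part of the PR or of PaddlePaddle; the registry is passed in so the helper itself does not depend on any particular framework.

```python
def build_cost(prediction, target, cost_type, cost_layers):
    """Return the cost built by the constructor registered under cost_type."""
    try:
        make_cost = cost_layers[cost_type]
    except KeyError:
        raise ValueError("unknown cost_type %r, expected one of %s" %
                         (cost_type, sorted(cost_layers)))
    return make_cost(input=prediction, label=target)


# With the layers used in this PR, the registry could look like this
# (an assumption, shown as a comment because it needs PaddlePaddle):
#
#   cost_layers = {"mse": paddle.layer.mse_cost,
#                  "huber": paddle.layer.huber_cost}
#   cost = build_cost(hidden_input1, hidden_input2, "mse", cost_layers)
```

This keeps the commented-out alternative out of the configuration while still letting the user switch costs with a single argument.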

luotao1 · 2017-05-11T02:56:51Z

To keep the repository history reasonably clean, please squash your commits; "update readme.md" has been committed too many times.

chrisxu2016 · 2017-05-15T08:16:53Z

Added new example data.

lcy-seso

The configuration currently has fairly serious problems.

lcy-seso · 2017-05-24T06:35:49Z

regression/README.md

@@ -1 +1,262 @@
-TBD
+


Please remove the superfluous whitespace on the first line.

lcy-seso · 2017-05-24T06:47:52Z

regression/README.md

+
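+# Regression
+
+Regression is a classic problem in machine learning; its main goal is to construct a function that relates input data to output data. This example fits two sentences with similar meanings: it builds networks with the same structure to encode the source data and the target data, then iteratively updates the parameters of the source network to fit the encoding of the target data, which completes a simple regression problem. Users can test this configuration with the [WMT-14](https://github.com/PaddlePaddle/book/tree/develop/08.machine_translation#数据介绍) dataset from machine translation. For classic linear regression, see [fit_a_line](https://github.com/PaddlePaddle/book/tree/develop/01.fit_a_line).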


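"Regression is a classic problem in machine learning; its main goal is to construct a function that relates input data to output data": this definition is not accurate. Please revise it.

"This example fits two sentences with similar meanings: it builds networks with the same structure to encode the source data and the target data." --> This explanation is wrong. What does it have to do with regression? Please introduce the concept of static parameters here.

"Users can test this configuration with the WMT-14 dataset from machine translation." --> Please do not mention machine translation. As explained earlier, it is not a logically sound example.

Please expand the background section to cover at least the following:

Background and concepts related to regression.

Application scenarios for regression.

A revised explanation of the example in this section.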

lcy-seso · 2017-05-24T07:53:30Z

regression/README.md

+
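+## Data Preparation
+
+This example models the idea that two sentences with similar meanings have similar encodings. As the figure below shows, the sentences on the left and right have similar meanings and are separated by `\t`.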


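Once the background section covers this, there is no need to explain the application background again here; just describe the data format.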
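For reference, a small sketch of how one line in this format could be parsed, assuming whitespace-tokenized sentences and a word-to-id dictionary. The function `parse_pair`, the `unk_id` default, and the sample dictionary are hypothetical and are not taken from the PR's data_process.py.

```python
def parse_pair(line, word_dict, unk_id=0):
    """Split one tab-separated line into two lists of word ids."""
    source, target = line.rstrip("\n").split("\t")

    def to_ids(sentence):
        # Words missing from the dictionary map to unk_id.
        return [word_dict.get(word, unk_id) for word in sentence.split()]

    return to_ids(source), to_ids(target)


# Hypothetical dictionary and sample line, for illustration only.
sample_dict = {"<unk>": 0, "good": 1, "morning": 2, "hello": 3}
source_ids, target_ids = parse_pair("good morning\thello there\n", sample_dict)
```

A line without exactly one `\t` raises `ValueError` on unpacking, which surfaces malformed data early.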

lcy-seso · 2017-05-24T07:58:04Z

regression/regression.py

+def encode_net(input_dict_dim, word_emb_dim, hidden_dim, is_static=False):
+    """
+        the input data is encoded to obtain the same size encoded vector
+        params:


Please keep the docstring style consistent with Paddle's existing style. You can refer to this:
https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/trainer_config_helpers/data_sources.py#L35
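As a hedged illustration, the docstring of `encode_net` might look as follows in the Sphinx-style field-list format (`:param:`, `:type:`, `:return:`) that Paddle's Python helpers appear to use. The parameter descriptions and the `LayerOutput` return type are assumptions inferred from the names, and the body is omitted on purpose.

```python
def encode_net(input_dict_dim, word_emb_dim, hidden_dim, is_static=False):
    """
    Encode the input data into a fixed-size vector.

    :param input_dict_dim: Size of the input dictionary.
    :type input_dict_dim: int
    :param word_emb_dim: Dimension of the word embedding.
    :type word_emb_dim: int
    :param hidden_dim: Dimension of the encoded vector.
    :type hidden_dim: int
    :param is_static: Whether the parameters stay fixed during training.
    :type is_static: bool
    :return: The encoded vector.
    :rtype: LayerOutput
    """
    raise NotImplementedError("layer construction is omitted in this sketch")
```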

lcy-seso · 2017-05-24T08:50:37Z

regression/data_process.py

@@ -0,0 +1,75 @@
+import tarfile


The configuration currently has fairly serious problems. Let's discuss it carefully first.

Done. A small sample dataset is still missing.

New model

chrisxu2016 added 2 commits May 9, 2017 12:38

add regression config

692ba92

add regression file

19ba12f

lcy-seso requested changes May 9, 2017


lcy-seso self-assigned this May 9, 2017

Update README.md

27eeeaa

chrisxu2016 added 6 commits May 11, 2017 15:13

modify readme

b54a56d

modify regression.py

6621738

modify readme

fec6c37

modify readme

3786025

add new dataset

77c250a

add new dataset

529029b

chrisxu2016 added 2 commits May 16, 2017 15:24

modify readme

69171f5

modify readme

ac5adfa

lcy-seso requested a review from xinghai-sun May 18, 2017 04:42

lcy-seso requested changes May 24, 2017


chrisxu2016 added 12 commits May 26, 2017 03:00

modify regression config script

43030fe

modify readme

03cd2de

add regression config

7bd8818

add regression file

59532f6

Update README.md

956302c

modify readme

2656669

modify regression.py

da1d7e4

modify readme

0e29a68

add new dataset

000bc3c

add new dataset

c503caa

add augmentation

f19eeb9

Merge pull request #2 from chrisxu2016/new_model

7a16729

New model

Update README.md

65a5c90

chrisxu2016 closed this Jun 12, 2017

chrisxu2016 mentioned this pull request Jun 12, 2017

add regression config file,fix #10 #84

Closed

