rewrite the text classification demo. #83
eeccb28 to f884a61
@@ -51,10 +37,10 @@ def main():
         learning_rate_schedule="discexp", )

     train_reader = paddle.batch(
-        paddle.reader.shuffle(reader.test_reader("train.list"), buf_size=1000),
+        paddle.reader.shuffle(reader.train_reader("train.list"), buf_size=1000),
Is train.list a text dataset? I could not find it in the directory.
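This change fixes a bug in the image classification example. It is true that the examples currently read data in inconsistent ways. A follow-up PR will revise the image classification example.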
         batch_size=BATCH_SIZE)
     test_reader = paddle.batch(
-        reader.train_reader("test.list"), batch_size=BATCH_SIZE)
+        reader.test_reader("test.list"), batch_size=BATCH_SIZE)
Is test.list a text dataset? I could not find it in the directory.
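This change fixes a bug in the image classification example. It is true that the examples currently read data in inconsistent ways. A follow-up PR will revise the image classification example.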
text_classification/index.html
Outdated
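├── run.sh # Run this script to start a training job with the default parameters
├── train.py # Training script
└── utils.py # Common helper functions, e.g. printing logs, parsing command-line arguments, building and loading the dictionary
```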
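A quick start section is missing. It should contain the directory description above, a training guide (explaining what each item in the training log output means), and a prediction guide. The prediction guide can be a code snippet (for reference, see https://www.oschina.net/p/jieba/?fromerr=btIKdxHH, under Features, item 1: the word segmentation code and its output), or it can be a GIF.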
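Next should come the "how to modify the parameters" section that was previously at the end. That way, once a newcomer has run the example and obtained a result, they can compare how changing different parameters changes the result. This naturally leads into how to choose between CNN and DNN.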
In fact, the rest of the document explains what these "parameters" mean, so it should follow the same order.
text_classification/index.html
Outdated
    cost = paddle.layer.classification_cost(input=output, label=lbl)

    return cost, output, lbl
```
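If the target audience is beginners, customizing the data format matters far more than the other information, because it may be the only thing they "need to think about". It should therefore be moved up to right after the quick start, and the detailed explanation at this position can get a corresponding anchor link.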
#!/bin/sh

python train.py \
  --nn_type="dnn" \
Why has training been changed to pass arguments through a shell script again? By convention, shouldn't they all be written in train.py?
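The shell script holds the arguments specified for train.py, so you can simply run the shell script. Otherwise you would have to type a long string of arguments on the command line.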
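The exchange above is about where training parameters live. One way to reconcile the two positions, sketched here under assumptions, is for train.py to define every flag with a default so that it runs with no arguments, while run.sh only records one particular set of overrides. Only `--nn_type` and its value `"dnn"` come from the diff; the function name `parse_train_args`, the `"cnn"` spelling, and the `--batch_size` flag are hypothetical.

```python
import argparse


def parse_train_args(argv=None):
    """Parse command-line flags for a training script (illustrative sketch)."""
    parser = argparse.ArgumentParser(
        description="Sketch of a text classification training command line.")
    parser.add_argument(
        "--nn_type",
        default="dnn",
        choices=["dnn", "cnn"],  # "cnn" spelling is an assumption
        help="Which model to train.")
    parser.add_argument(
        "--batch_size",  # hypothetical flag, not taken from the PR
        type=int,
        default=32,
        help="Number of samples per mini-batch.")
    # argv=None makes argparse read sys.argv[1:], as a script would.
    return parser.parse_args(argv)
```

Because every flag has a default, `python train.py` and a wrapper script that passes `--nn_type="dnn"` would behave the same in this sketch.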


def train(topology,
          train_data_dir=None,
lm_rnn.py has run_type=GRU #'or LSTM'. Could a similar parameter be added here to select the method, for example DNN or CNN?
This parameter already exists: nn_type specifies which model to use.
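As an illustration of how a flag like `nn_type` can select a model, the sketch below maps the flag value to a network-building function. The builder names and their placeholder return values are hypothetical; the actual topology functions in this PR are not shown in this thread.

```python
def build_dnn():
    """Hypothetical builder; a real one would return a DNN topology."""
    return "dnn topology placeholder"


def build_cnn():
    """Hypothetical builder; a real one would return a CNN topology."""
    return "cnn topology placeholder"


# Map each accepted nn_type value to the function that builds that model.
NETWORKS = {"dnn": build_dnn, "cnn": build_cnn}


def select_network(nn_type):
    """Return the builder for nn_type, or raise ValueError if it is unknown."""
    key = nn_type.lower()
    if key not in NETWORKS:
        raise ValueError(
            "unknown nn_type %r, expected one of %s" % (nn_type, sorted(NETWORKS)))
    return NETWORKS[key]
```

Keeping the mapping in one dictionary means adding another model type only requires a new entry rather than another branch in the training code.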
f884a61 to b1231a5
b1231a5 to 501ce21
follow comments.
6068a69 to a0529eb
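Does the section order of the README need to be adjusted? Should the model introduction and the detailed model explanation be moved to the very beginning?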
74980fe to ba69ba5
ba69ba5 to 136a60d
rewrite the text classification example.