hierarchical rnn document, add new config example #106

luotao1 · 2016-09-23T06:25:59Z

No description provided.

reyoung · 2016-09-23T08:15:58Z

doc_cn/algorithm/rnn/hierarchical-rnn.md

+def hook2(settings, dict_file, **kwargs):
+    settings.word_dict = dict_file
+    settings.input_types = [integer_value_sub_sequence(len(settings.word_dict)),
+                            integer_value_sub_sequence(3)]


Here, the label doesn't necessarily have to be a sequence, does it?

Changed this to integer_value_sequence, meaning one label per sub-sentence.
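To make the type question above concrete, here is a minimal sketch in plain Python (no PaddlePaddle import) of the nesting each of the three input types implies. The helper `nesting_depth` and the sample ids are illustrative assumptions, not code from this PR.

```python
def nesting_depth(sample):
    """Return how many list levels wrap the integer ids in a non-empty sample."""
    depth = 0
    while isinstance(sample, list):
        depth += 1
        sample = sample[0]
    return depth


# integer_value_sub_sequence: a list of sub-sentences, each a list of word ids.
words = [[3, 7, 1], [4, 2]]
# integer_value_sequence: one label per sub-sentence, so a flat list as long as `words`.
labels_per_sentence = [0, 2]
# integer_value: a single label for the whole sample.
label_for_sample = 1

print(nesting_depth(words), nesting_depth(labels_per_sentence), nesting_depth(label_for_sample))
```

With a per-sentence label, the label list must have the same length as the outer list of `words`.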

reyoung · 2016-09-23T08:16:06Z

doc_cn/algorithm/rnn/hierarchical-rnn.md

+            label = int(''.join(label.split()))
+            words = comment.split()
+            word_slot = [settings.word_dict[w] for w in words if w in settings.word_dict]
+            yield word_slot, [label]


Here, the label doesn't necessarily have to be a sequence, does it?

Already changed to the integer_value type.
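A minimal sketch of the loop in the hunk above once the label is a plain integer rather than a one-element list. The function name `iter_samples`, the tab-separated line layout, and the toy dictionary are assumptions made for illustration; the real data file and provider may differ.

```python
def iter_samples(lines, word_dict):
    """Yield (word_ids, label) pairs; the label is a plain int, not a list."""
    for line in lines:
        # Assumed layout: "<label>\t<comment>"; the real data file may differ.
        label, comment = line.rstrip("\n").split("\t", 1)
        label = int("".join(label.split()))
        # Words missing from the dictionary are dropped, as in the diff.
        word_ids = [word_dict[w] for w in comment.split() if w in word_dict]
        yield word_ids, label


word_dict = {"good": 0, "movie": 1, "bad": 2}
samples = list(iter_samples(["1\tgood movie\n", "0\tbad unknown movie\n"], word_dict))
print(samples)
```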

emailweixu · 2016-09-24T00:59:10Z

doc_cn/algorithm/rnn/hierarchical-rnn.md

+```
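+- Two-level seq where the memory is a single-level seq:
+  - Each outer time step returns one sub-sentence, and these sub-sentences usually differ in length. So when the outer level has a memory with is_seq=True, the inner level cannot use it directly; that is, the boot_layer of an inner memory cannot be linked to this outer memory.
+  - If an inner memory wants to use a single-level-seq outer memory, it must first turn it into a vector with `pooling_layer`, `last_seq`, or `first_seq`. In that case the outer memory must have a boot_layer; otherwise, at time step 0 the outer memory is all zeros and carries no seq information, which causes a segmentation fault.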


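"Causes a segmentation fault" needs to be fixed: every unsupported scenario should be checked and a corresponding error reported, so that users can troubleshoot.

Added checks in Average_layer and SequenceLastInstanceLayer that verify the seq information in the input.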
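The idea of the added checks can be sketched in plain Python: pool a sequence down to one value per sequence, and fail with a readable error when the input carries no sequence information. This is not the C++ code from the PR; the function names are hypothetical, and the offset layout of `start_positions` (sequence boundaries plus a final end offset) is an assumption.

```python
def check_has_sequence_info(start_positions, layer_name):
    """Fail with a readable message instead of crashing when sequence info is absent."""
    if not start_positions or len(start_positions) < 2:
        raise ValueError(layer_name + ": input carries no sequence information")


def last_instance(values, start_positions):
    """Return the last element of every sequence."""
    check_has_sequence_info(start_positions, "last_instance")
    return [values[end - 1] for end in start_positions[1:]]


def average(values, start_positions):
    """Return the mean of every sequence."""
    check_has_sequence_info(start_positions, "average")
    bounds = zip(start_positions[:-1], start_positions[1:])
    return [sum(values[b:e]) / float(e - b) for b, e in bounds]


values = [1.0, 2.0, 3.0, 4.0, 5.0]
starts = [0, 2, 5]  # two sequences: values[0:2] and values[2:5]
print(last_instance(values, starts), average(values, starts))
```

Calling either function with `None` as `start_positions` raises `ValueError`, which is the kind of explicit report asked for above.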

lcy-seso · 2016-09-26T06:55:49Z

doc_cn/algorithm/rnn/hierarchical-rnn.md

+
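+#### How to read two-level seq data
+
+First, let's look at the different data layouts used in the unit tests for single-level and two-level seq (original wording: 您也可以采用别的的组织形式, "you may also use another layout"):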


The phrase "您也可以采用别的的组织形式" ("you may also use another layout") contains one "的" too many.

lcy-seso · 2016-09-26T07:26:45Z

doc_cn/algorithm/rnn/hierarchical-rnn.md

@@ -0,0 +1,380 @@
+# Two-Level (Hierarchical) RNN Tutorial
+


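Overall, this document does not feel friendly enough for ordinary users.
(1) Consider adding a guide at the very beginning that explains, at an intuitive level, (a) why a two-level RNN is needed and (b) in which scenarios one can be used.
(2) Descriptions such as "single in, single out: both input and output are single-level seq" and "double in, double out: both input and output are two-level seq" already carry some technical detail; each example could be followed by a scenario and a concrete case to help users understand.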
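One way to convey the intuition requested here: a paragraph is a list of sentences, an inner RNN turns each sentence into one vector, and an outer RNN runs over those sentence vectors. The sketch below is a toy in plain Python; `run_rnn`, `decay_step`, and the scalar "features" are assumptions standing in for real layers.

```python
def run_rnn(inputs, step, init=0.0):
    """Apply `step(state, x)` along `inputs` and return the final state."""
    state = init
    for x in inputs:
        state = step(state, x)
    return state


def decay_step(state, x):
    # Toy recurrence standing in for a real RNN cell.
    return 0.5 * state + x


paragraph = [[1.0, 2.0], [4.0]]  # two sentences of scalar word features
# Inner level: one summary value per sentence.
sentence_vectors = [run_rnn(sentence, decay_step) for sentence in paragraph]
# Outer level: run over the sentence summaries.
paragraph_vector = run_rnn(sentence_vectors, decay_step)
print(sentence_vectors, paragraph_vector)
```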

luotao1 · 2016-09-26T08:02:07Z

paddle/gserver/gradientmachines/RecurrentGradientMachine.cpp

+        CHECK(outFrameLine.frames[i]->getOutput().sequenceStartPositions)
+          << "In hierachical RNN, all out links should be from sequences.";
+      }
+    }


This checks the case in sequence_nest_rnn.conf where the output is last but no error was reported. It now reports an error.

reyoung · 2016-09-26T10:14:52Z

DataProvider is fixed.

emailweixu · 2016-09-29T17:10:41Z

doc_cn/algorithm/rnn/hierarchical-rnn.md

+def outer_step(x):
+    # if memory in hierachical rnn is sequence (is_seq=True), it must be read-only 
+    # memory, since the sequence length of each timestep maybe different.
+    readonly_mem = memory(name="outer_rnn_readonly", size=hidden_dim, is_seq=True)


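It would be best to have an example where a sequence memory is actually useful; otherwise users will find it hard to see what it is for. Also, an unchanging memory like this should be implemented with StaticInput.

Removed this unit-test example; a demo where a sequence memory is actually useful will be provided later.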

emailweixu · 2016-09-29T17:11:55Z

doc_cn/algorithm/rnn/rnn-tutorial.md

+## 相关概念
+
+### 基本原理
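+`recurrent_group` is an arbitrarily complex RNN unit supported by PaddlePaddle. Users only need to focus on designing the computation the RNN performs within one time step; PaddlePaddle takes care of propagating information and errors along the time sequence.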


Replace "errors" (误差) with "gradients" (梯度).
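The division of labor described in this hunk can be sketched as a toy forward pass: the user writes only the per-step function, and a driver carries the memory across time steps. `toy_recurrent_group` is a hypothetical stand-in written for this note, not PaddlePaddle's `recurrent_group`, and it does not model the backward (gradient) pass.

```python
def toy_recurrent_group(step, inputs, boot=0.0):
    """Call `step(x, memory)` once per time step and collect the outputs."""
    memory = boot
    outputs = []
    for x in inputs:
        memory = step(x, memory)  # the step output becomes the next step's memory
        outputs.append(memory)
    return outputs


def step(x, memory):
    # Only the per-step computation is written by the user.
    return x + 0.5 * memory


print(toy_recurrent_group(step, [1.0, 2.0, 3.0]))
```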

emailweixu · 2016-09-29T17:32:31Z

doc_cn/algorithm/rnn/rnn-tutorial.md

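+In sequence generation tasks, the decoder RNN always takes the word embedding of the word predicted at the previous time step as its input at the current time step. `GeneratedInput` does this automatically.
+
+### Output
+The `step` function must return the output of some Layer; that Layer's output becomes the final output of the whole `recurrent_group`. While producing the output, `recurrent_group` concatenates the outputs of all time steps; this is also transparent to the user.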


The step function must return the output of one or more Layers.
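A small sketch of what "one or more outputs" means for the collection step: each value returned by the step function is gathered into its own per-time-step sequence. The names `collect_step_outputs` and `two_output_step` are hypothetical, and this is plain Python rather than the PaddlePaddle implementation.

```python
def collect_step_outputs(step, inputs):
    """Run `step` per time step; gather each returned output into its own sequence."""
    collected = None  # stays None when `inputs` is empty
    for x in inputs:
        result = step(x)
        if not isinstance(result, tuple):
            result = (result,)  # a single output is treated as a 1-tuple
        if collected is None:
            collected = [[] for _ in result]
        for seq, value in zip(collected, result):
            seq.append(value)
    return collected


def two_output_step(x):
    return x * 2, x + 1


print(collect_step_outputs(two_output_step, [1, 2, 3]))
```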

emailweixu · 2016-09-29T17:34:59Z

doc_cn/algorithm/rnn/hierarchical-layer.md

+
+## expand_layer
+
+An example of using expand_layer is shown below; see the configuration API for details.


"See the configuration API for details": please give a link.

Link added.
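For readers unfamiliar with the operation, the sketch below shows the expansion idea in plain Python: one value per sequence is repeated for every time step of that sequence. The function `expand` and the offset layout of `start_positions` are assumptions for illustration; the actual `expand_layer` arguments are documented in the configuration API.

```python
def expand(per_sequence_values, start_positions):
    """Repeat the i-th value once for every time step of the i-th sequence."""
    if len(per_sequence_values) != len(start_positions) - 1:
        raise ValueError("need exactly one value per sequence")
    expanded = []
    bounds = zip(start_positions[:-1], start_positions[1:])
    for value, (begin, end) in zip(per_sequence_values, bounds):
        expanded.extend([value] * (end - begin))
    return expanded


# Two sequences of lengths 3 and 2.
print(expand([10, 20], [0, 3, 5]))
```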

hierarchical rnn document, add new config example

7a69b55

luotao1 assigned reyoung, Zrachel, lcy-seso, qingqing01 and emailweixu Sep 23, 2016

reyoung requested changes Sep 23, 2016

luotao1 added 2 commits September 23, 2016 17:39

update inputs_type of label

dc02eb9

Merge branch 'master' into rnn

ffaa30b

emailweixu requested changes Sep 24, 2016

luotao1 added 2 commits September 26, 2016 10:57

Merge branch 'master' into rnn

e120dcc

Merge branch 'master' into rnn

42e6320

lcy-seso reviewed Sep 26, 2016

add check for unsupported config

0526254

luotao1 commented Sep 26, 2016

reyoung approved these changes Sep 26, 2016

luotao1 added 5 commits September 28, 2016 10:40

fix conflict with master

e07eb4c

Merge branch 'master' into rnn

60fb5fc

refine hierarchical document

22bc8e8

Merge branch 'master' into rnn

0bd0440

refine doc title

569e273

lcy-seso approved these changes Sep 29, 2016

luotao1 added 2 commits September 29, 2016 13:12

Merge branch 'master' into rnn

7d5cc7e

update docs, fix paddle to PaddlePaddle

ebf144c

emailweixu requested changes Sep 29, 2016

luotao1 added 3 commits October 8, 2016 09:54

Merge branch 'master' into rnn

aa53d3f

follow comments

bc3d269

Merge branch 'master' into rnn

009f3fa

emailweixu approved these changes Oct 14, 2016

emailweixu merged commit cebdb66 into PaddlePaddle:master Oct 14, 2016

luotao1 deleted the rnn branch October 14, 2016 02:23

tensor-tang mentioned this pull request Aug 4, 2017

remove global linker and exe from mkldnn iomp #3244

Merged

zhhsplendid pushed a commit to zhhsplendid/Paddle that referenced this pull request Sep 25, 2019

Make travis fail based on scripts (PaddlePaddle#106)

abf5ac2

* Make the travis fails if a script fails * Update jobs with names

DemoMoon mentioned this pull request Mar 24, 2021

oneDNN 如何能提升DeepSpeech的语音处理性能 #31838

Closed

thisjiang pushed a commit to thisjiang/Paddle that referenced this pull request Oct 28, 2021

Load store to relative indice (PaddlePaddle#106)

d01cfc8

gglin001 added a commit to graphcore/Paddle-fork that referenced this pull request Dec 8, 2021

clean code for Compiler & IpuBackend (PaddlePaddle#106)

7cb619a

* clean code * fix SaveModelProtoNoCheck * rm unneeded code * rm unneeded code 02

Thunderbrook pushed a commit to Thunderbrook/Paddle that referenced this pull request Sep 9, 2022

optimize begin pass and end pass (PaddlePaddle#106)

2bc6bf7

Co-authored-by: yangjunchao <yangjunchao@baidu.com>

lizexu123 pushed a commit to lizexu123/Paddle that referenced this pull request Feb 23, 2024

add early stop (PaddlePaddle#106)

7d1ec56


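		#### How to read two-level (nested) sequences

		First, let us look at how the unit tests organize single-level and two-level sequence data (you may also use a different layout):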
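
The difference between the two layouts can be sketched in plain Python. This is only an illustration under stated assumptions: the word ids, the variable names, and the helpers `flatten` and `sub_sequence_starts` are hypothetical and are not taken from the unit tests or from PaddlePaddle. In the data provider reviewed in this PR, the nested word slot is declared with `integer_value_sub_sequence`.

```python
# Hypothetical toy data: the word ids below are made up for illustration.

# Single-level sequence: one sample is a flat list of word ids.
single_level_sample = [2, 7, 4, 9, 5, 3]

# Two-level sequence: the same words grouped into sub-sequences
# (for example, sentences of a paragraph). Sub-sequences may differ in length.
two_level_sample = [[2, 7, 4, 9], [5, 3]]


def flatten(nested):
    """Concatenate all sub-sequences into one flat sequence."""
    return [word for sub in nested for word in sub]


def sub_sequence_starts(nested):
    """Return the start offset of each sub-sequence in the flattened data,
    followed by the total length."""
    starts = [0]
    for sub in nested:
        starts.append(starts[-1] + len(sub))
    return starts


flat = flatten(two_level_sample)
starts = sub_sequence_starts(two_level_sample)
```

Flattening the two-level sample recovers the single-level one, while the offsets keep the sentence boundaries that the single-level layout loses.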


		## expand_layer

		An example of using expand_layer follows; see the configuration API for details.
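
The original configuration example is not reproduced on this page. As a rough illustration of the idea behind an expand operation, the plain-Python sketch below repeats one vector per sample so that it lines up with every time step of a reference sequence. It is not the `expand_layer` API: the function name `expand_to_sequence` and the toy values are hypothetical, and the real signature should be taken from the configuration API.

```python
def expand_to_sequence(vectors, expand_as):
    """Repeat vectors[i] once for every element of expand_as[i].

    vectors:   one vector per sample (hypothetical toy input).
    expand_as: one reference sequence per sample; only its length is used.
    """
    if len(vectors) != len(expand_as):
        raise ValueError("vectors and expand_as must have the same number of samples")
    return [[list(vec) for _ in seq] for vec, seq in zip(vectors, expand_as)]


# Two samples: the first reference sequence has 3 steps, the second has 1.
sentence_vectors = [[0.1, 0.2], [0.3, 0.4]]
reference_sequences = [[11, 12, 13], [21]]
expanded = expand_to_sequence(sentence_vectors, reference_sequences)
```

Each sample's vector is copied rather than shared, so later changes to one time step do not affect the others.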
