Skip to content

Commit

Permalink
fix doc error (PaddlePaddle#675)
Browse files Browse the repository at this point in the history
  • Loading branch information
sneaxiy authored and shanyi15 committed Mar 9, 2019
1 parent 7f7c7da commit d9ae6d8
Showing 1 changed file with 8 additions and 11 deletions.
19 changes: 8 additions & 11 deletions doc/fluid/user_guides/howto/prepare_data/use_py_reader.rst
Original file line number Diff line number Diff line change
Expand Up @@ -89,18 +89,20 @@
设置PyReader对象的数据源
################################

PyReader对象通过 :code:`decorate_paddle_reader()` 或 :code:`decorate_tensor_provider()` 方法设置其数据源。 :code:`decorate_paddle_reader()` 和 :code:`decorate_tensor_provider()` 均接收Python生成器 :code:`generator` 作为参数, :code:`generator` 内部每次通过yield的方式生成一个batch的数据。

:code:`decorate_paddle_reader()` 和 :code:`decorate_tensor_provider()` 方法的区别在于:

- :code:`decorate_paddle_reader()` :code:`generator` 应返回Numpy Array类型的数据,:code:`decorate_tensor_provider()` :code:`generator` 应返回LoDTensor类型的数据
- :code:`decorate_paddle_reader()` 要求 :code:`generator` 返回的数据格式为[(img_1, label_1), (img_2, label_2), ..., (img_n, label_n)],其中img_i和label_i均为每个样本的Numpy Array类型数据,n为batch size。:code:`decorate_tensor_provider()` 要求 :code:`generator` 返回的数据的数据格式为[batched_imgs, batched_labels],其中batched_imgs和batched_labels为batch级的Numpy Array或LoDTensor类型数据

- :code:`decorate_tensor_provider()` 要求 :code:`generator` 返回的LoDTensor的数据类型、尺寸必须与配置py_reader时指定的dtypes、shapes参数相同,而 :code:`decorate_paddle_reader()` 不要求数据类型和尺寸的严格一致,其内部会完成数据类型和尺寸的转换。
- :code:`decorate_tensor_provider()` 要求 :code:`generator` 返回的数据类型、尺寸必须与配置py_reader时指定的dtypes、shapes参数相同,而 :code:`decorate_paddle_reader()` 不要求数据类型和尺寸的严格一致,其内部会完成数据类型和尺寸的转换。

具体方式为:

.. code-block:: python
import paddle.batch
import paddle.fluid as fluid
import numpy as np
Expand All @@ -109,8 +111,8 @@ PyReader对象通过 :code:`decorate_paddle_reader()` 或 :code:`decorate_tensor
# Case 1: Use decorate_paddle_reader() method to set the data source of py_reader
# The generator yields Numpy-typed batched data
def fake_random_numpy_reader():
image = np.random.random(size=(BATCH_SIZE, 784))
label = np.random.random_integers(size=(BATCH_SIZE, 1), low=0, high=9)
image = np.random.random(size=(784, ))
label = np.random.random_integers(size=(1, ), low=0, high=9)
yield image, label
py_reader1 = fluid.layers.py_reader(
Expand All @@ -120,19 +122,14 @@ PyReader对象通过 :code:`decorate_paddle_reader()` 或 :code:`decorate_tensor
name='py_reader1',
use_double_buffer=True)
py_reader1.decorate_paddle_reader(fake_random_reader)
py_reader1.decorate_paddle_reader(paddle.batch(fake_random_reader, batch_size=BATCH_SIZE))
# Case 2: Use decorate_tensor_provider() method to set the data source of py_reader
# The generator yields Tensor-typed batched data
def fake_random_tensor_provider():
image = np.random.random(size=(BATCH_SIZE, 784)).astype('float32')
label = np.random.random_integers(size=(BATCH_SIZE, 1), low=0, high=9).astype('int64')
image_tensor = fluid.LoDTensor()
image_tensor.set(image, fluid.CPUPlace())
label_tensor = fluid.LoDTensor()
label_tensor.set(label, fluid.CPUPlace())
yield image_tensor, label_tensor
py_reader2 = fluid.layers.py_reader(
Expand Down

0 comments on commit d9ae6d8

Please sign in to comment.