Add copy from tensor #34406

Merged: 29 commits, Aug 26, 2021

@shangzhizhou (Member) commented Jul 27, 2021

PR types

Others

PR changes

APIs

Describe

Add a copy_from_tensor API for inference tensors.

I. Python: add a synchronous copy_tensor API. Usage example:

from paddle.inference.contrib import utils

utils.copy_tensor(dst_tensor, src_tensor)

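II. C++: add the paddle_infer::Tensor::CopyToCpuAsync interface. Currently only asynchronous GPU-to-CPU copies are supported. The destination cannot be ordinary host-allocated memory; it must be CUDA pinned memory, which can be allocated and freed with the provided utility functions CudaMallocPinnedMemory()/CudaFreePinnedMemory(). Examples: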

  1. Calling style that returns the stream
  //...
  // predictor run code

  const auto &output_names = predictor->GetOutputNames();
  auto output_tensor = predictor->GetOutputHandle(output_names[0]);
  std::vector<int> output_shape = output_tensor->shape();
  int out_num = std::accumulate(output_shape.begin(), output_shape.end(), 1,
                                std::multiplies<int>());

  float *out_data = static_cast<float *>(
      contrib::TensorUtils::CudaMallocPinnedMemory(sizeof(float) * out_num));

  cudaStream_t stream;
  output_tensor->CopyToCpuAsync(out_data, static_cast<void *>(&stream));

  // Wait for the asynchronous copy on the returned stream to finish
  // before reading out_data.
  cudaStreamSynchronize(stream);

  contrib::TensorUtils::CudaFreePinnedMemory(static_cast<void *>(out_data));
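
A side note on the buffer size: the example computes out_num by multiplying the shape dimensions as int, which can overflow for large tensors. Below is a minimal sketch of the same computation done in 64 bits. NumElements and FloatBufferBytes are hypothetical helper names used only for illustration; they are not part of the Paddle API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <numeric>
#include <vector>

// Hypothetical helper (not part of Paddle): number of elements in a tensor
// with the given shape, accumulated in 64 bits to avoid int overflow.
inline std::int64_t NumElements(const std::vector<int>& shape) {
  return std::accumulate(shape.begin(), shape.end(), std::int64_t{1},
                         std::multiplies<std::int64_t>());
}

// Hypothetical helper: bytes needed for a float buffer of that many elements.
inline std::size_t FloatBufferBytes(const std::vector<int>& shape) {
  const std::int64_t n = NumElements(shape);
  assert(n >= 0 && "element count must be non-negative");
  return static_cast<std::size_t>(n) * sizeof(float);
}
```

The result of FloatBufferBytes(output_shape) could then be passed as the size argument when allocating the pinned buffer.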
  2. Calling style that uses a callback
  //...
  // predictor run code

  const auto &output_names = predictor->GetOutputNames();
  auto output_tensor = predictor->GetOutputHandle(output_names[0]);
  std::vector<int> output_shape = output_tensor->shape();
  int out_num = std::accumulate(output_shape.begin(), output_shape.end(), 1,
                                std::multiplies<int>());

  float *out_data = static_cast<float *>(
      contrib::TensorUtils::CudaMallocPinnedMemory(sizeof(float) * out_num));

  output_tensor->CopyToCpuAsync(
      out_data,
      [](void *cb_params) {
        float *data = static_cast<float *>(cb_params);
        // Prints the first 10 values; assumes the output has at least 10 elements.
        for (int i = 0; i < 10; i++) {
          std::cout << data[i] << std::endl;
        }
      },
      static_cast<void *>(out_data));

  // Wait for the copy and the callback to finish before freeing the pinned buffer.
  cudaDeviceSynchronize();
  contrib::TensorUtils::CudaFreePinnedMemory(static_cast<void *>(out_data));
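
To make the ordering in the two examples easier to see without a GPU, here is a rough CPU-only sketch of the pattern: work is queued, nothing is ready until the caller synchronizes, and the callback receives its state through an opaque void* parameter. FakeStream, FakeCopyToCpuAsync, CopyDone, and SumCallback are all hypothetical names, and the CallbackFunc alias is only assumed to match the callback shape used above. This is a model of the calling convention, not Paddle's implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <functional>
#include <utility>
#include <vector>

// Assumed callback shape: a plain function pointer taking one opaque argument.
using CallbackFunc = void (*)(void*);

// Hypothetical stand-in for a device stream: tasks are queued in order and
// only run when Synchronize() is called.
class FakeStream {
 public:
  void Enqueue(std::function<void()> task) { tasks_.push_back(std::move(task)); }
  void Synchronize() {
    for (auto& task : tasks_) task();
    tasks_.clear();
  }

 private:
  std::vector<std::function<void()>> tasks_;
};

// Hypothetical async copy: queues the copy, then queues the callback after it.
inline void FakeCopyToCpuAsync(FakeStream* stream, const float* src, float* dst,
                               std::size_t count, CallbackFunc cb,
                               void* cb_params) {
  assert(stream != nullptr && src != nullptr && dst != nullptr);
  stream->Enqueue([=]() { std::memcpy(dst, src, count * sizeof(float)); });
  if (cb != nullptr) stream->Enqueue([=]() { cb(cb_params); });
}

// State handed to the callback through the opaque pointer.
struct CopyDone {
  const float* data;
  std::size_t count;
  float sum;
};

// Callback: sums the copied values. It is queued after the copy, so the
// destination buffer is already filled when it runs.
inline void SumCallback(void* cb_params) {
  auto* done = static_cast<CopyDone*>(cb_params);
  done->sum = 0.0f;
  for (std::size_t i = 0; i < done->count; ++i) done->sum += done->data[i];
}
```

In this model, reading the destination buffer before Synchronize() sees stale data, which is the same reason the examples above synchronize before using or freeing out_data.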

III. C++: add tensor copy functions

  static void CopyTensor(Tensor* p_dst, const Tensor& src);
  static void CopyTensorAsync(Tensor* p_dst, const Tensor& src,
                              void* exec_stream);
  static void CopyTensorAsync(Tensor* p_dst, const Tensor& src, CallbackFunc cb,
                              void* cb_params);

For asynchronous usage, refer to Tensor.CopyToCpuAsync().
For test code, see paddle/fluid/inference/tests/api/paddle_infer_api_copy_tensor_tester.cc
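
For readers skimming the three declarations, one plausible way such overloads can share a single private implementation is sketched below. MockTensor and MockTensorUtils are hypothetical stand-ins, the CallbackFunc alias is an assumption, and the copy here runs synchronously on the CPU; the actual Paddle implementation may be structured differently.

```cpp
#include <cassert>
#include <vector>

// Assumed callback shape: a plain function pointer taking one opaque argument.
using CallbackFunc = void (*)(void*);

// Hypothetical minimal tensor, standing in for paddle_infer::Tensor.
struct MockTensor {
  std::vector<int> shape;
  std::vector<float> data;
};

// Hypothetical utility class: three public entry points, one private worker.
class MockTensorUtils {
 public:
  static void CopyTensor(MockTensor* p_dst, const MockTensor& src) {
    CopyTensorImpl(p_dst, src, nullptr, nullptr, nullptr);
  }
  static void CopyTensorAsync(MockTensor* p_dst, const MockTensor& src,
                              void* exec_stream) {
    CopyTensorImpl(p_dst, src, exec_stream, nullptr, nullptr);
  }
  static void CopyTensorAsync(MockTensor* p_dst, const MockTensor& src,
                              CallbackFunc cb, void* cb_params) {
    CopyTensorImpl(p_dst, src, nullptr, cb, cb_params);
  }

 private:
  static void CopyTensorImpl(MockTensor* p_dst, const MockTensor& src,
                             void* exec_stream, CallbackFunc cb,
                             void* cb_params) {
    assert(p_dst != nullptr);
    (void)exec_stream;  // A real implementation would use a device stream here.
    p_dst->shape = src.shape;
    p_dst->data = src.data;
    if (cb != nullptr) cb(cb_params);  // Invoked once the copy is complete.
  }
};
```

Keeping the stream and callback handling in one place means the public overloads differ only in which optional arguments they forward.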

@paddle-bot-old

Thanks for your contribution!
Please wait for the CI results first. See the Paddle CI Manual for details.

const std::string& name, PlaceType place, void* p_scope);

private:
static void CopyTensorImp(Tensor& dst, const Tensor& src, void* exec_stream,
Contributor

Imp -> Impl

Member Author

done

std::vector<float> input(in_num, 1.0);

auto input_names = predictor->GetInputNames();
auto input_t = predictor->GetInputHandle(input_names[0]);
Contributor

input_tensor

The _t suffix usually denotes a type, e.g. value_t.

Member Author

done

@@ -185,7 +187,8 @@ void Tensor::CopyFromCpu(const T *data) {
 }

 template <typename T>
-void Tensor::CopyToCpu(T *data) {
+void Tensor::CopyToCpuImp(T *data, void *exec_stream, CallbackFunc cb,
Contributor

Globally: Imp -> Impl

Member Author

done


using paddle::PaddleDType;

std::unique_ptr<Tensor> TensorUtils::CreateInferTensorForTest(
Contributor

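This is only used for unit tests; it would be better to split it out and guard it with the WITH_TESTING macro.

Otherwise this will end up in the final production library, won't it?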

Member Author

done, thanks
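
For context, guarding a test-only helper with a build macro could look roughly like the sketch below. The macro name WITH_TESTING comes from the review comment; AddOne, CreateValueForTest, and kBuiltWithTesting are hypothetical names, and how the macro is actually defined by the build system is not shown here.

```cpp
#include <cassert>

// Production function, always compiled into the library.
inline int AddOne(int x) { return x + 1; }

#ifdef WITH_TESTING
// Test-only helper: compiled only when the build defines WITH_TESTING,
// so it never ships in a production library.
inline int CreateValueForTest() { return 41; }
#endif

// Compile-time flag recording whether test helpers were built in.
constexpr bool kBuiltWithTesting =
#ifdef WITH_TESTING
    true;
#else
    false;
#endif
```

A test target would be compiled with the macro defined and could then call CreateValueForTest(), while a production build would not see the symbol at all.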

@@ -0,0 +1,325 @@
/* Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.

Contributor

File name: copy_tensor is two words.

Contributor

Is this a unit test?

If so, the file name also needs the _tester suffix.

Member Author

done

@Superjomn
Contributor

It would be best to also add the use cases and functionality to the PR description.

@shangzhizhou
Member Author

It would be best to also add the use cases and functionality to the PR description.

done

Contributor

@XieYunshen XieYunshen left a comment

LGTM for 'set_tests_properties(paddle_infer_api_copy_tensor_tester PROPERTIES TIMEOUT 30) '

@shangzhizhou shangzhizhou merged commit ac33c0c into PaddlePaddle:develop Aug 26, 2021
lelelelelez added a commit that referenced this pull request Aug 26, 2021
wanghuancoder pushed a commit that referenced this pull request Aug 27, 2021
shangzhizhou added a commit to shangzhizhou/Paddle that referenced this pull request Aug 30, 2021
shangzhizhou added a commit that referenced this pull request Aug 31, 2021
* Revert "Revert "Add copy from tensor (#34406)" (#35173)"

This reverts commit 32c1ec4.

* add template instantiation
@shangzhizhou shangzhizhou deleted the add_copy_from_tensor branch September 10, 2021 05:38

4 participants