[DO NOT MERGE] MKLDNN layouts #10291
Conversation
TODO: fc needs reorder implemented for backpropagation
@pzelazko-intel Sorry for the late reply. For example, in fluid_op_A->mkldnn_op_B->fluid_op_C, if mkldnn_op_B doesn't use LoDTensor, how should fluid_op_C get the LoD information?
@luotao1 Please let me know what you think.
I am not sure what the problem is here.
Data transform uses a plain tensor here because it only needs to use the data inside the tensor; most of the inputs are actually LoDTensor.
The problem is that DataTransform does not have information about the MKLDNN format (which would be in MKLDNNLoDTensor).
Can we use tensor->layout to judge whether a tensor is an MKLDNNLoDTensor and then cast it to MKLDNNLoDTensor?
We could, but in the mentioned DataTransform and in OperatorWithKernel::RunImpl we instantiate the Tensor class, not LoDTensor: https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/fluid/framework/operator.cc#L575. If we had to cast Tensor to MKLDNNLoDTensor, we would have to do it in many places, filling our code with if-else constructions.
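A minimal standalone sketch of the layout-check-and-downcast pattern this comment refers to; all types here are simplified stand-ins rather than the actual Paddle classes, and the point is only that this branch would have to be repeated at every call site that handles the base class:

```cpp
// Simplified stand-ins, not the actual Paddle classes.
enum class DataLayout { kNCHW, kMKLDNN };

struct LoDTensor {
  virtual ~LoDTensor() = default;
  DataLayout layout = DataLayout::kNCHW;
};

struct MKLDNNLoDTensor : LoDTensor {
  int mkldnn_format = 0;  // stand-in for mkldnn::memory::format
};

// This branch-and-cast would appear in DataTransform, RunImpl, and every
// other place that only knows about the base class.
int FormatOf(const LoDTensor& t) {
  if (t.layout == DataLayout::kMKLDNN) {
    return static_cast<const MKLDNNLoDTensor&>(t).mkldnn_format;
  }
  return 0;  // plain layouts carry no MKLDNN format
}

int main() {
  MKLDNNLoDTensor t;
  t.layout = DataLayout::kMKLDNN;
  t.mkldnn_format = 7;
  return FormatOf(t) == 7 ? 0 : 1;
}
```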
How about new files named
Where exactly is there a cast? Anyway, a cast won't solve my problem, because DataTransform operates on a Tensor instance: https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/fluid/framework/operator.cc#L575.
The relationship is
Yes, but the output of DataTransform is an instance of plain Tensor: https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/fluid/framework/operator.cc#L575. Moreover, please take a look at https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/fluid/framework/data_transform.cc#L33 - plain Tensor instances are used there again. How would you overcome this obstacle?
@pzelazko-intel I have another question: could we use the MKLDNN ops to do inference before the MKLDNN layout work is finished?
@luotao1 Yes, but it will have worse performance than the version with MKLDNN layouts.
Oh, could some PRs (running an inference model with the current MKLDNN ops on the C++ side) be created first to validate the whole process?
@luotao1 I'm not sure I understand - you mean a PR with an inference model test that has MKLDNN OPs enabled?
For MKLDNNTensor, what is needed in DataTransform? Because for the current LoDTensor, DataTransform only needs the fields in Tensor.
MKLDNNTensor additionally has information about the MKLDNN format used and the MKLDNN engine (the MKLDNNTensorData class), but the latter can be extracted from MKLDNNDeviceContext. So we have to know the MKLDNN format in which the data is organized. There are many such formats and MKLDNN can choose which one to use, so I would not recommend simply adding every one of them to DataLayout. Instead, we could add a field like mkldnn_format to Tensor in place of extended_data if you don't like the concept of extended data.
I notice that in the current implementation extended_data is one field of Tensor. Does that mean we can get all the information we need to do data transform from the Tensor itself and then construct an MKLDNNTensor with the data inside the transformed tensor, like: https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/fluid/framework/operator.cc#L578
It may work with some workarounds, because in DataTransform we reorder only from MKLDNN formats to common ones. So, ultimately, you don't like the extended_data concept and want me to reimplement it as MKLDNNLoDTensor, right? Could you please check whether the other changes are acceptable? That would save me from future reimplementations. Thank you.
Sorry, maybe my explanation was not clear. I think adding extended_data to Tensor is OK; what I want to make clear is what data_transform needs. If all we need is inside the tensor itself (such as layout, data, and extended_data), then deriving from Tensor or deriving from LoDTensor makes no difference, but we need the LoD information to be passed along as operators run.
If I modify Tensor, and Tensor is a base class of LoDTensor, then the LoD information still passes along between operators, right? Moreover, I assumed that when doing reorders I don't need to modify the LoD information. Am I right?
Yes, you are right. The LoD information is independent of Tensor.
@pzelazko-intel After data transform, we will copy the transformed data inside the tensor back to its original var type, and the LoD information will also be copied. (Paddle/paddle/fluid/framework/operator.cc, lines 574 to 578 in 18f527b)
(Paddle/paddle/fluid/framework/data_transform.cc, lines 65 to 81 in f09aed0)
If we add MKLDNNLoDTensor, we also need to do this, and the LoD information should be maintained in the MKLDNNLoDTensor. That is why we think MKLDNNLoDTensor should derive from LoDTensor: it needs to store the LoD information inside it.
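A standalone sketch of the copy-back step described in this comment: the data comes from the transformed tensor, while the LoD is carried over from the original input. The types and the SetTensorToVariable helper here are simplified stand-ins assumed for illustration, not the exact Paddle code at the lines referenced above:

```cpp
#include <cstddef>
#include <vector>

// Simplified stand-ins for the framework types.
using LoD = std::vector<std::vector<std::size_t>>;

struct Tensor {
  std::vector<float> data;  // stand-in for the real Placeholder buffer
};

struct LoDTensor : Tensor {
  LoD lod;
};

// Hypothetical equivalent of the copy-back step: the data comes from the
// transformed tensor, the LoD is taken unchanged from the original input.
void SetTensorToVariable(const LoDTensor& in, const Tensor& transformed,
                         LoDTensor* out) {
  out->data = transformed.data;  // the real code shares/copies the buffer
  out->lod = in.lod;             // reorders do not touch the LoD
}

int main() {
  LoDTensor in;
  in.data = {1.f, 2.f, 3.f};
  in.lod = {{0, 3}};

  Tensor transformed;
  transformed.data = {3.f, 2.f, 1.f};  // pretend the layout was reordered

  LoDTensor out;
  SetTensorToVariable(in, transformed, &out);
  return out.lod == in.lod ? 0 : 1;  // LoD survives the transform
}
```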
@jacquesqiao I agree, but please notice that in the proposed solution I don't create any class deriving from Tensor. MKLDNNTensor is only a decorator for the Tensor class. If we are on the same page and you have no other questions, then please let me know whether I should stay with the current design or reimplement it as MKLDNNLoDTensor deriving from LoDTensor. Thank you.
@pzelazko-intel After discussion, we think using a decorator is OK; we should move on now, thanks!
@pzelazko-intel I discussed with @jacquesqiao and @Superjomn: if you can ensure the LoD information is transferred among different ops, we think that you can stay with your current design.
DO NOT MERGE IT
Please take a look at this MKLDNN layout Proof of Concept and assess whether we can stay with this design. This code is not finished - there are debug VLOG prints, some parts are missing, and the code needs cleanup.
We need MKLDNN layouts so that MKLDNN kernels can run efficiently - especially the convolution and fully-connected OPs.
In #8305 it was recommended to create an MKLDNNLoDTensor class deriving from LoDTensor. But that approach was causing too many problems - in many places the Tensor class is used explicitly, and I would need to differentiate between LoDTensor and MKLDNNLoDTensor.

I came up with a different approach - I've added a Tensor::ExtendedData interface and a pointer to this interface in the Tensor class. It is a placeholder for additional data. I use it for the MKLDNN layout case - mkldnn::memory::format and mkldnn::engine are held in MKLDNNTensorData, which derives from Tensor::ExtendedData. If the Tensor layout is set to kMKLDNN, then I can treat this tensor as holding data in an MKLDNN format layout. To make use of such a tensor, I've created the decorator classes MKLDNNTensor and MutableMKLDNNTensor. The data is still kept in the Placeholder within Tensor.
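A self-contained sketch of the decorator design described above, with simplified stand-in types (the real classes hold mkldnn::memory::format and mkldnn::engine rather than plain ints, and the buffer management is omitted):

```cpp
#include <cassert>
#include <memory>

enum class DataLayout { kNCHW, kMKLDNN };

// Placeholder interface for extra, layout-specific data attached to a Tensor.
struct ExtendedData {
  virtual ~ExtendedData() = default;
};

struct Tensor {
  DataLayout layout = DataLayout::kNCHW;
  std::shared_ptr<ExtendedData> extended_data;  // the added pointer
  // ... the actual buffer stays in the usual Placeholder, omitted here
};

// Holds MKLDNN-specific metadata; derives from the ExtendedData interface.
struct MKLDNNTensorData : ExtendedData {
  int format = 0;  // stand-in for mkldnn::memory::format
  int engine = 0;  // stand-in for mkldnn::engine
};

// Read-only decorator: wraps an existing Tensor whose layout is kMKLDNN
// and exposes its MKLDNN metadata without deriving from Tensor.
class MKLDNNTensor {
 public:
  explicit MKLDNNTensor(const Tensor& t) : tensor_(t) {
    assert(t.layout == DataLayout::kMKLDNN && t.extended_data);
  }
  int format() const {
    return static_cast<const MKLDNNTensorData&>(*tensor_.extended_data).format;
  }

 private:
  const Tensor& tensor_;
};

// Mutable decorator: lets a kernel mark its output as MKLDNN-formatted.
class MutableMKLDNNTensor {
 public:
  explicit MutableMKLDNNTensor(Tensor* t) : tensor_(t) {}
  void SetFormat(int format) {
    auto data = std::make_shared<MKLDNNTensorData>();
    data->format = format;
    tensor_->extended_data = data;
    tensor_->layout = DataLayout::kMKLDNN;
  }

 private:
  Tensor* tensor_;
};

int main() {
  Tensor out;
  MutableMKLDNNTensor mutable_view(&out);
  mutable_view.SetFormat(5);        // e.g. what a conv kernel would do

  MKLDNNTensor read_view(out);
  assert(read_view.format() == 5);  // what the next MKLDNN kernel reads
  return 0;
}
```

A convolution kernel would set the format on its output through MutableMKLDNNTensor, and the next MKLDNN kernel would read it back through MKLDNNTensor; the Tensor type itself never changes, which is what avoids the casts discussed earlier in this thread.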
Reorders to MKLDNN format happen within convolution and fully-connected, because:

When the next OP's expected kernel is not an MKLDNN one, we transform the tensor to the NCHW layout.
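A standalone sketch of that transform path: when the next OP's expected kernel is not an MKLDNN one, the data-transform step reorders the kMKLDNN-layout tensor back to NCHW. The types and the reorder helper are simplified stand-ins, not the real MKLDNN/Paddle APIs:

```cpp
#include <iostream>
#include <vector>

enum class DataLayout { kNCHW, kMKLDNN };

struct Tensor {
  DataLayout layout = DataLayout::kNCHW;
  std::vector<float> data;
  int mkldnn_format = 0;  // stand-in for the metadata kept in MKLDNNTensorData
};

// Stand-in for an mkldnn reorder primitive from the source format to NCHW.
void ReorderToNCHW(const Tensor& in, Tensor* out) {
  out->data = in.data;  // the real code would repack the buffer
  out->layout = DataLayout::kNCHW;
  out->mkldnn_format = 0;
}

// Simplified data-transform decision: only reorder when the producer's output
// is in an MKLDNN layout and the consumer's kernel expects a plain layout.
Tensor DataTransform(const Tensor& in, DataLayout expected_by_next_kernel) {
  Tensor out = in;
  if (in.layout == DataLayout::kMKLDNN &&
      expected_by_next_kernel != DataLayout::kMKLDNN) {
    ReorderToNCHW(in, &out);
  }
  return out;
}

int main() {
  Tensor conv_out;
  conv_out.layout = DataLayout::kMKLDNN;
  conv_out.mkldnn_format = 7;
  conv_out.data = {1.f, 2.f, 3.f, 4.f};

  Tensor for_fluid_op = DataTransform(conv_out, DataLayout::kNCHW);
  std::cout << (for_fluid_op.layout == DataLayout::kNCHW) << "\n";  // prints 1
}
```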