
optimize __call__ to make dygraph faster #37713

Merged
merged 2 commits into PaddlePaddle:develop on Dec 1, 2021

Conversation

@JiabinYang (Contributor) commented Nov 30, 2021

PR types

Performance optimization

PR changes

Others

Describe

This PR optimizes __call__ in paddle.nn.Layer to improve dygraph performance without to_static.
We tested multi-layer linear performance with the code below:

import os
import paddle.fluid.core as core
from paddle import _C_ops
import paddle
import numpy as np
from time import time

num_runs = 100000

# Run the benchmark on CPU only.
os.environ["CUDA_VISIBLE_DEVICES"] = ""
paddle.set_device("cpu")

# A two-matmul layer that calls the C++ ops directly, so forward() itself is cheap.
class FluidMatmulx2(paddle.nn.Layer):
    def __init__(self):
        super(FluidMatmulx2, self).__init__()

        arrW1 = np.ones([4, 128]).astype('float32')
        self.W1 = paddle.to_tensor(arrW1, 'float32', core.CPUPlace())
        self.W1.stop_gradient = False
        
        arrW2 = np.ones([128, 2]).astype('float32')
        self.W2 = paddle.to_tensor(arrW2, 'float32', core.CPUPlace())
        self.W2.stop_gradient = False

    def forward(self, obs):
        Out1 = _C_ops.matmul_v2(obs, self.W1, 'trans_x', False, 'trans_y', False)
        Out = _C_ops.matmul_v2(Out1, self.W2, 'trans_x', False, 'trans_y', False)
        return Out

if __name__ == "__main__":
    input_data = np.ones([32, 4]).astype('float32')
    
    ###########
    # Warm Up #
    ###########
    data_paddle = paddle.to_tensor(input_data.astype(np.float32))
    fluid_matmul = FluidMatmulx2()
    for _ in range(num_runs):
        fluid_matmul.forward(data_paddle)
    
    ###############
    # Performance #
    ###############
    # Fluid Matmul Forward
    data_paddle = paddle.to_tensor(input_data.astype(np.float32))
    ts = time()
    for _ in range(num_runs):
        out = fluid_matmul.forward(data_paddle)
    te = time()
    print("Fluid Matmul Forward: ", 1e6*(te-ts))
    
    # Fluid Matmul Call
    data_paddle = paddle.to_tensor(input_data.astype(np.float32))
    ts = time()
    for _ in range(num_runs):
        out = fluid_matmul(data_paddle)
    te = time()
    print("Fluid Matmul Call: ", 1e6*(te-ts))

The results look like this:
(benchmark result image)
This indicates that over 20% of the time is spent in the Python __call__ path. After some profiling and testing, we found that the Python decorator context used there is quite slow.
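
To make that kind of overhead concrete, here is a small standalone microbenchmark (not part of the PR; all names are illustrative) that compares a do-nothing contextlib context manager on the hot path with a plain if check:

import contextlib
from time import time

HOOKS_ENABLED = False  # stand-in for "some hook is registered"

@contextlib.contextmanager
def maybe_hooks():
    # Even a do-nothing context manager pays for generator creation plus
    # __enter__/__exit__ on every single call.
    yield

def call_with_context(x):
    with maybe_hooks():
        return x + 1

def call_with_if(x):
    # The cheaper pattern: a plain flag/attribute check on the hot path.
    if HOOKS_ENABLED:
        pass  # hook handling would go here
    return x + 1

num_runs = 100000

ts = time()
for _ in range(num_runs):
    call_with_context(1)
print("with context manager:", 1e6 * (time() - ts))

ts = time()
for _ in range(num_runs):
    call_with_if(1)
print("with plain if check: ", 1e6 * (time() - ts))

On CPython the context-manager version typically runs noticeably slower, because each call creates a generator and invokes __enter__/__exit__, which matches the kind of per-call overhead observed above.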

So we simply replace it with an if statement, which gives the performance improvement shown below:
(benchmark result image)
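
For illustration only, here is a hypothetical sketch of the idea behind the change (not Paddle's actual source; the class and attribute names are made up): __call__ checks for registered hooks and, when there are none, takes a fast path straight to forward() instead of entering a context/decorator on every call:

# Hypothetical sketch, not Paddle's actual implementation.
class Layer:
    def __init__(self):
        self._forward_pre_hooks = {}
        self._forward_post_hooks = {}

    def forward(self, *inputs, **kwargs):
        raise NotImplementedError

    def __call__(self, *inputs, **kwargs):
        # Fast path: no hooks registered, so skip any context/decorator
        # machinery and call forward() directly.
        if not self._forward_pre_hooks and not self._forward_post_hooks:
            return self.forward(*inputs, **kwargs)
        # Slow path: run pre-hooks, forward, then post-hooks.
        for hook in self._forward_pre_hooks.values():
            result = hook(self, inputs)
            if result is not None:
                inputs = result
        outputs = self.forward(*inputs, **kwargs)
        for hook in self._forward_post_hooks.values():
            result = hook(self, inputs, outputs)
            if result is not None:
                outputs = result
        return outputs

class Double(Layer):
    def forward(self, x):
        return 2 * x

layer = Double()
print(layer(3))  # no hooks registered, fast path -> 6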

@paddle-bot-old

Thanks for your contribution!
Please wait for the CI results first. See the Paddle CI Manual for details.

@zhiqiu (Contributor) left a comment

LGTM

@JiabinYang merged commit 370864d into PaddlePaddle:develop on Dec 1, 2021
@JiabinYang changed the title from "optimizer __call__ to make dygraph faster" to "optimize __call__ to make dygraph faster" on Dec 2, 2021
Zjq9409 pushed a commit to Zjq9409/Paddle that referenced this pull request Dec 10, 2021
* optimizer __call__ to make dygraph faster

* fix return type
@woshilinsixu

Why can't I import name '_C_ops' from 'paddle'?

0x45f pushed a commit to 0x45f/Paddle that referenced this pull request Dec 24, 2021
* optimizer __call__ to make dygraph faster

* fix return type
lanxianghit pushed a commit that referenced this pull request Jan 5, 2022
… in dy2stat (#38418)

Fix error when calling sublayer's non-forward func in dy2stat
cherry-pick: #37713, #37759, #37296, #38540, #37888