
【Hackathon No. 91】Update rfcs for register_hook #491

Merged
3 commits merged into PaddlePaddle:master on Apr 4, 2023

Conversation

yangguohao (Contributor)

Hi @Aurelius84, the previous RFC was not complete as a whole, so I have updated the relevant content here. I have now implemented the feature, but some problems remain, and I would like to ask about them here.

I implemented the following in `register_hook` of the static-graph `Variable` class:

def register_hook(self, hook):
    """
    Construct the py_func op for TensorHook in static mode.
    """
    import paddle
    import numpy as np

    def backward_hook_wrapper(dy):
        """Call the user hook on the incoming gradient."""
        return hook(np.array(dy))

    def forward_hook_wrapper(x):
        """Do nothing but return the variable."""
        return x

    paddle.static.py_func(
        func=forward_hook_wrapper,
        x=self,
        out=self,
        backward_func=backward_hook_wrapper,
        skip_vars_in_backward_input=[self],
    )
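A hedged aside: the two wrappers can be sanity-checked standalone (plain NumPy, no Paddle), since the forward is the identity and the backward simply applies the user hook to the incoming gradient:

```python
import numpy as np

# Standalone model of the two wrappers wired into py_func above.
def forward_hook_wrapper(x):
    """Do nothing but return the variable."""
    return x

def backward_hook_wrapper_for(hook):
    """Build a backward wrapper that applies `hook` to the gradient."""
    def backward_hook_wrapper(dy):
        return hook(np.array(dy))
    return backward_hook_wrapper

double = lambda g: 2 * g              # a user hook that doubles the gradient
bwd = backward_hook_wrapper_for(double)

assert forward_hook_wrapper(7) == 7   # forward changes nothing
print(bwd([3.0]))                     # backward doubles: [6.]
```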

The example below runs successfully and matches the dynamic-graph behavior:

import paddle
from paddle.jit import to_static

@to_static
def ffff(x):
    def h(g):
        print('123Hook: ', g)
        return 2 * g
    x.register_hook(h)
    t = 3 * x
    return t

x = paddle.to_tensor([2.0])
x.stop_gradient = False
y = ffff(x)
y.backward()

print('Grad: ', x.grad)  # <--- 6.0
Output:

123Hook:  [3.]
Grad:  Tensor(shape=[1], dtype=float32, place=Place(cpu), stop_gradient=False,
       [6.])
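For reference, the 6.0 follows from the chain rule; a minimal NumPy sketch (standalone, not using Paddle) of the backward pass:

```python
import numpy as np

# Backward pass of t = 3 * x by hand, with a hook that doubles x's gradient.
dt = np.array([1.0])          # gradient seeded by y.backward()
dx_before_hook = 3.0 * dt     # scale op backward: multiply by 3 -> [3.]
double = lambda g: 2 * g      # the registered hook h
dx = double(dx_before_hook)   # hook doubles it -> [6.]
print('Hook saw:', dx_before_hook)   # Hook saw: [3.]
print('Grad:', dx)                   # Grad: [6.]
```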

However, I ran into a case like this: if I move the `register_hook` line, the hook no longer takes effect. For example:

@to_static
def ffff(x):
    def h(g):
        print('123Hook: ', g)
        return 2 * g

    t = 3 * x
    x.register_hook(h)  # <----- moved here
    return t

I suspect my py_func-based implementation is at fault. I do not fully understand how `append_op` works, and I would like to discuss this aspect.

Alternatively, the current plan could simply require users to call `register_hook` on a variable before performing any other operation on it, as in the working example above, where `register_hook` is applied to `x` before any other op consumes it. Is that acceptable? I am also open to exploring any other direction.

@paddle-bot commented Mar 30, 2023

Your PR has been submitted. Thanks for your contribution!
Please check its format and content. For this, you can refer to Template and Demo.

@Aurelius84 (Collaborator) left a comment:

LGTM

@Aurelius84 Aurelius84 merged commit c400f2b into PaddlePaddle:master Apr 4, 2023
@Aurelius84 (Collaborator)

> However, I ran into a case like this: if I move the `register_hook` line, the hook no longer takes effect. For example:

@yangguohao Could you paste the error message here?

@yangguohao (Contributor, Author)

No error is raised at all; the hook function just never runs.

@yangguohao (Contributor, Author)

Here is the content of the program when it runs correctly:

{ // block 0
    var generated_tensor_0 : LOD_TENSOR.shape(1,).dtype(float32).stop_gradient(False)
    var tmp_0 : LOD_TENSOR.shape(1,).dtype(float32).stop_gradient(False)
    var generated_tensor_0@GRAD : LOD_TENSOR.shape(1,).dtype(float32).stop_gradient(False)
    var tmp_0@GRAD : LOD_TENSOR.shape(1,).dtype(float32).stop_gradient(False)

    {Out=['generated_tensor_0']} = py_func(inputs={X=['generated_tensor_0']}, backward_callable_id = 1, backward_skip_vars = ['generated_tensor_0'], forward_callable_id = 0, op_device = , op_namescope = /, op_role = 0, op_role_var = [], with_quant_attr = False)
    {Out=['tmp_0']} = scale(inputs={ScaleTensor=[], X=['generated_tensor_0']}, bias = 0.0, bias_after_scale = True, op_device = , op_namescope = /, op_role = 0, op_role_var = [], scale = 3.0, with_quant_attr = False)
    {Out=['tmp_0@GRAD']} = fill_any_like(inputs={X=['tmp_0']}, dtype = 5, op_device = , op_namescope = , op_role = 1, op_role_var = [], value = 1.0, with_quant_attr = False)
    {Out=['generated_tensor_0@GRAD']} = scale(inputs={ScaleTensor=[], X=['tmp_0@GRAD']}, bias = 0.0, bias_after_scale = True, op_device = , op_namescope = , op_role = 1, op_role_var = [], scale = 3.0, with_quant_attr = False)
    {Out=['generated_tensor_0@GRAD']} = py_func(inputs={X=['generated_tensor_0@GRAD']}, backward_callable_id = -1, backward_skip_vars = [], forward_callable_id = 1, op_device = , op_namescope = , op_role = 1, op_role_var = [], with_quant_attr = False)
}

After moving the hook call:

{ // block 0
    var generated_tensor_0 : LOD_TENSOR.shape(1,).dtype(float32).stop_gradient(False)
    var tmp_0 : LOD_TENSOR.shape(1,).dtype(float32).stop_gradient(False)
    var generated_tensor_0@GRAD : LOD_TENSOR.shape(1,).dtype(float32).stop_gradient(False)
    var tmp_0@GRAD : LOD_TENSOR.shape(1,).dtype(float32).stop_gradient(False)

    {Out=['tmp_0']} = scale(inputs={ScaleTensor=[], X=['generated_tensor_0']}, bias = 0.0, bias_after_scale = True, op_device = , op_namescope = /, op_role = 0, op_role_var = [], scale = 3.0, with_quant_attr = False)
    {Out=['generated_tensor_0']} = py_func(inputs={X=['generated_tensor_0']}, backward_callable_id = 1, backward_skip_vars = ['generated_tensor_0'], forward_callable_id = 0, op_device = , op_namescope = /, op_role = 0, op_role_var = [], with_quant_attr = False)
    {Out=['tmp_0@GRAD']} = fill_any_like(inputs={X=['tmp_0']}, dtype = 5, op_device = , op_namescope = , op_role = 1, op_role_var = [], value = 1.0, with_quant_attr = False)
    {Out=['generated_tensor_0@GRAD']} = scale(inputs={ScaleTensor=[], X=['tmp_0@GRAD']}, bias = 0.0, bias_after_scale = True, op_device = , op_namescope = , op_role = 1, op_role_var = [], scale = 3.0, with_quant_attr = False)
}

My idea is to find a way to change the execution order of the ops. I made some simple attempts, but none were very successful. If possible, moving each registered py_func op to the position right after the op that produces the hooked variable would also solve this problem.
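That reordering idea can be sketched in plain Python (this is a model of an op list kept in execution order, not the actual Paddle Block API): remove the hook's py_func op and reinsert it right after the last op whose outputs include the hooked variable, or at the front of the block when the variable is a block input with no producer op:

```python
# Sketch: move a hook op to sit right after its input variable's producer,
# preserving the relative order of all other ops.
def move_hook_after_producer(ops, hook_idx, var_name):
    hook_op = ops[hook_idx]
    rest = ops[:hook_idx] + ops[hook_idx + 1:]
    # Find the last op whose outputs include var_name (-1 if none).
    producer = -1
    for i, op in enumerate(rest):
        if var_name in op.get('outputs', []):
            producer = i
    return rest[:producer + 1] + [hook_op] + rest[producer + 1:]

# Toy block mirroring the failing program dump above:
ops = [
    {'type': 'scale',   'outputs': ['tmp_0']},
    {'type': 'py_func', 'outputs': ['generated_tensor_0']},
]
# generated_tensor_0 is a block input (no producer op), so the hook op
# moves to the front of the block, matching the correct program order:
ops = move_hook_after_producer(ops, hook_idx=1, var_name='generated_tensor_0')
print([op['type'] for op in ops])   # ['py_func', 'scale']
```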
