
[Hackathon 5th No.42] Add API conversion rules for the Paddle code conversion tool (Group 1, Nos. 1-20) #318

Merged
merged 25 commits into from
Nov 8, 2023

Li-fAngyU
Contributor

@Li-fAngyU Li-fAngyU commented Oct 23, 2023

PR Docs

PR APIs

1.torch.nanquantile
5.torch.Tensor.quantile
10.torch.Tensor.is_pinned
11.torch.Tensor.tile
12.torch.Tensor.to_sparse

Of these, torch.Tensor.to_sparse directly reuses the cases from the PR.

@paddle-bot

paddle-bot bot commented Oct 23, 2023

Thanks for your contribution!

@paddle-bot paddle-bot bot added the contributor External developers label Oct 23, 2023
@Li-fAngyU
Contributor Author

Li-fAngyU commented Oct 23, 2023

torch.stft is not included in this PR because its precision cannot be aligned. How should this precision mismatch be resolved?

Example code:

import paddle
import numpy as np
import torch
np.random.seed(42)
input = np.random.randn(4410)

x = paddle.to_tensor(data=input, dtype='float32')
n_fft = 400
result1 = paddle.signal.stft(n_fft=n_fft, x=x)
result1 = paddle.as_real(result1)

x1 = torch.tensor(data=input, dtype=torch.float)
n_fft = 400
result = x1.stft(n_fft=n_fft)
print(np.array_equal(result1.numpy(), result.numpy()) )

"""
import torch
x = torch.tensor([0., 1., 2., 3.],dtype=torch.float64)
result = x.quantile(0.6)
Collaborator

Pass the first argument by keyword as well. Test cases for all four of these situations are required:

all arguments passed by keyword, no arguments passed by keyword, keyword arguments in a changed order, and no default arguments specified
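As a rough illustration of the four call styles requested above, the sketch below lists one candidate PyTorch snippet per style for torch.Tensor.quantile and uses the standard ast module to report how each call passes its arguments. The case names and argument values are assumptions made for illustration, not the test cases added in this PR. Because interpolation is keyword-only in PyTorch, the "no keywords" case stops at keepdim.

```python
import ast

# Hypothetical snippets, one per call style requested in the review.
# Assumed signature: Tensor.quantile(q, dim=None, keepdim=False, *, interpolation='linear')
CASES = {
    "all_keywords": "result = x.quantile(q=0.6, dim=0, keepdim=False, interpolation='linear')",
    "no_keywords": "result = x.quantile(0.6, 0, False)",
    "reordered_keywords": "result = x.quantile(keepdim=False, interpolation='linear', q=0.6, dim=0)",
    "defaults_omitted": "result = x.quantile(0.6)",
}


def call_shape(snippet):
    """Return (number of positional args, keyword names) for the call in snippet."""
    tree = ast.parse(snippet)
    call = next(node for node in ast.walk(tree) if isinstance(node, ast.Call))
    return len(call.args), [kw.arg for kw in call.keywords]


for name, snippet in CASES.items():
    positional, keywords = call_shape(snippet)
    print(f"{name}: {positional} positional, keywords={keywords}")
```

Each string could then serve as the body of one test case, so that a missing style is easy to spot in review.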

Collaborator

Are these comments not going to be addressed?

obj = APIBase("torch.Tensor.is_pinned")


def test_case_1():
Collaborator

Add a GPU unit test, guarded with if torch.cuda.is_available():

Collaborator

Is this comment not going to be addressed?

@zhwesky2010
Collaborator

You can relax the precision threshold by setting rtol and atol when calling obj.run.

@CLAassistant

CLAassistant commented Nov 3, 2023

CLA assistant check
All committers have signed the CLA.

@Li-fAngyU
Contributor Author

The stft precision error is quite large, so it cannot be made to pass by relaxing the precision threshold.

import paddle
import numpy as np
import torch
np.random.seed(42)
input = np.random.randn(4410)

x = paddle.to_tensor(data=input, dtype='float32')
n_fft = 400
result1 = paddle.signal.stft(n_fft=n_fft, x=x)
result1 = paddle.as_real(result1)

x1 = torch.tensor(data=input, dtype=torch.float)
n_fft = 400
result = x1.stft(n_fft=n_fft)
print(np.allclose(result1.numpy(), result.numpy(), rtol=1e-01, atol=1e-01))
# False
# Take the absolute value so the largest deviation is reported regardless of sign
max_diff = np.max(np.abs(result1.numpy() - result.numpy()))
print(max_diff)
# 85.4296
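
For context, np.allclose accepts a pair of values when |a - b| <= atol + rtol * |b|. The plain-Python sketch below applies that same rule to made-up numbers (they are not taken from the stft output) to show why a deviation of about 85 cannot be absorbed by rtol=atol=0.1 unless the reference values themselves are in the hundreds. The helper names are hypothetical.

```python
def is_close(a, b, rtol, atol):
    """Element-wise rule used by np.allclose: |a - b| <= atol + rtol * |b|."""
    return abs(a - b) <= atol + rtol * abs(b)


def min_reference_magnitude(diff, rtol, atol):
    """Smallest |b| for which a deviation of `diff` would still pass."""
    return max(0.0, (diff - atol) / rtol)


# Illustrative values only: a tiny rounding error and a large mismatch.
print(is_close(1.00001, 1.0, rtol=1e-5, atol=1e-5))
print(is_close(85.5, 0.07, rtol=1e-1, atol=1e-1))
print(min_reference_magnitude(85.4296, rtol=1e-1, atol=1e-1))
```

A mismatch of this size therefore points to differing results rather than to floating-point noise.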

@Li-fAngyU Li-fAngyU requested a review from zhwesky2010 November 3, 2023 07:47
@zhwesky2010
Collaborator

zhwesky2010 commented Nov 3, 2023

Then the API mapping itself is probably at fault. Or is some default parameter not aligned? Does this mapping actually hold? Split this API out and implement it separately.

Collaborator

@zhwesky2010 zhwesky2010 left a comment

Is the stft mapping simply wrong in the first place? Also, none of the other comments have been fixed.

- "torch.Tensor.quantile": {},
+ "torch.Tensor.quantile": {
+     "Matcher": "GenericMatcher",
+     "paddle_api": "paddle.Tensor.quantile",
Collaborator

Has "min_input_args" been configured?

obj = APIBase("torch.Tensor.is_pinned")


def test_case_1():
Collaborator

Is this comment not going to be addressed?

"""
import torch
x = torch.tensor([0., 1., 2., 3.],dtype=torch.float64)
result = x.quantile(0.6)
Collaborator

Are these comments not going to be addressed?

@Li-fAngyU
Contributor Author

The corresponding comments were addressed in the previous commit.

@Li-fAngyU
Contributor Author

Li-fAngyU commented Nov 6, 2023

After looking into stft, I found that this API currently only supports precision alignment against NumPy; it cannot yet be aligned with PyTorch.

References:
Paddle ISSUE: PaddlePaddle/Paddle#47750
Paddle PR: PaddlePaddle/Paddle#42270

@@ -1473,7 +1473,9 @@
  },
  "torch.Tensor.is_inference": {},
  "torch.Tensor.is_meta": {},
- "torch.Tensor.is_pinned": {},
+ "torch.Tensor.is_pinned": {
Collaborator

Every torch.Tensor API needs min_input_args configured.
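A small check along the following lines could flag entries that miss this field. It is only a sketch: the dictionary imitates the mapping entries visible in the diff, the helper name is hypothetical, and the value 1 for torch.Tensor.quantile is an assumption (q being the only required argument), not the value used in the repository.

```python
import copy

# Hypothetical fragment in the style of the mapping entries shown in the diff.
API_MAPPING = {
    "torch.Tensor.is_meta": {},  # empty entry: no conversion rule yet
    "torch.Tensor.quantile": {
        "Matcher": "GenericMatcher",
        "paddle_api": "paddle.Tensor.quantile",
    },
}


def missing_min_input_args(mapping):
    """List configured torch.Tensor.* entries that lack "min_input_args"."""
    return sorted(
        name
        for name, config in mapping.items()
        if name.startswith("torch.Tensor.") and config and "min_input_args" not in config
    )


print(missing_min_input_args(API_MAPPING))

FIXED = copy.deepcopy(API_MAPPING)
FIXED["torch.Tensor.quantile"]["min_input_args"] = 1  # assumed value
print(missing_min_input_args(FIXED))
```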

"""
import torch
if torch.cuda.is_available():
    x = torch.randn(4,4).cuda()
Collaborator

@zhwesky2010 zhwesky2010 Nov 6, 2023

Testing with .pin_memory() would be better.
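
One way the suggested case might look is sketched below. The PyTorch snippet is kept as a string, as the test files in this PR appear to do, and is only syntax-checked here so the sketch runs without torch installed. The variable name PYTORCH_CODE and the exact structure are assumptions, not the code that was merged.

```python
import textwrap

# Hypothetical PyTorch snippet for an is_pinned test case.
# pin_memory() needs CUDA, so it is guarded by torch.cuda.is_available().
PYTORCH_CODE = textwrap.dedent(
    """
    import torch
    x = torch.randn(4, 4)
    if torch.cuda.is_available():
        x = x.pin_memory()
    result = x.is_pinned()
    """
)

# Syntax-check the snippet without importing torch.
compile(PYTORCH_CODE, "<is_pinned_case>", "exec")
print(PYTORCH_CODE)
```

On a machine with CUDA this exercises the pinned path; elsewhere the tensor stays unpinned, so the case still runs.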

@Li-fAngyU
Contributor Author

Fixed.

@zhwesky2010 zhwesky2010 merged commit d173ee1 into PaddlePaddle:master Nov 8, 2023
@luotao1
Collaborator

luotao1 commented Nov 8, 2023

@Li-fAngyU Has task No. 42 been fully completed?

@Li-fAngyU
Contributor Author

torch.stft is still outstanding; I am not sure how to handle it.

@zrr1999
Member

zrr1999 commented Nov 14, 2023

https://github.com/PaddlePaddle/PaConvert/pull/301/files#diff-6bb398e612dfb5646bd931c9370571fbd52a8c9869da6c75b0d62da21213e6da
You can try whether the implementation there works as is; I have not had time to work on it recently.

@Li-fAngyU
Contributor Author

Thanks. I did consult that PR. The stft problem appears to be that the computed results differ. Could it be marked as missing functionality, the way torch.nn.CTCLoss is?

@zhwesky2010
Collaborator

@Li-fAngyU Leave this one as is; it may be a problem in Paddle's own API. A completion rate above 70% counts as finishing the task.

@Li-fAngyU
Contributor Author

OK!

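@luotao1 luotao1 changed the title from "[Hackathon 5th No.42] Add API conversion rules for the Paddle code conversion tool (Group 1, Nos. 1-20) -part2" to "[Hackathon 5th No.42] Add API conversion rules for the Paddle code conversion tool (Group 1, Nos. 1-20)" Nov 14, 2023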
@luotao1
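Collaborator

luotao1 commented Nov 14, 2023

"A completion rate above 70% counts as finishing the task."

@Li-fAngyU To clarify: a completion rate above 70% meets the condition for the hackathon bonus to be paid out; the exact amount depends on the completion rate.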

@Li-fAngyU
Contributor Author

ok

@GreatV
Contributor

GreatV commented Dec 5, 2023

I tried stft on both Paddle develop and Paddle 2.5.2 and found no problem.

In PyTorch, stft now has to be called with return_complex=True.

import paddle
import numpy as np
import torch
np.random.seed(42)
input = np.random.randn(4410)

x = paddle.to_tensor(data=input, dtype='float32')
n_fft = 400
result1 = paddle.signal.stft(n_fft=n_fft, x=x)
# result1 = paddle.as_real(result1)

x1 = torch.tensor(data=input, dtype=torch.float)
n_fft = 400
# result = torch.stft(input=x1, n_fft=n_fft, return_complex=True)
result = x1.stft(n_fft=n_fft, return_complex=True)
print(np.testing.assert_allclose(result1.numpy(), result.numpy(), rtol=1e-5, atol=1e-5))

stft requires the return_complex parameter be given for real inputs, and will further require that return_complex=True in a future PyTorch release.

@Li-fAngyU
Contributor Author

Thanks for the reminder. After re-checking, torch.stft and paddle.signal.stft do align when return_complex=True, that is, when the result is returned in complex form. When return_complex=False, however, the complex tensor has to be converted to a real tensor with paddle.as_real, and my tests suggest the problem is that paddle.as_real does not align with torch.view_as_real.

Example code:

import paddle
import numpy as np
import torch
np.random.seed(42)
input = np.random.randn(4410)

x = paddle.to_tensor(data=input, dtype='float32')
n_fft = 400
paddle_complex_result = paddle.signal.stft(n_fft=n_fft, x=x)
paddle_real_result = paddle.as_real(paddle_complex_result)

x1 = torch.tensor(data=input, dtype=torch.float)
n_fft = 400
torch_complex_result = x1.stft(n_fft=n_fft, return_complex=True)
torch_real_result = x1.stft(n_fft=n_fft, return_complex=False)
torch_transform_real_result = torch.view_as_real(torch_complex_result)

compare_complex = np.allclose(paddle_complex_result.numpy(), torch_complex_result.numpy(),rtol=1e-01, atol=1e-01)
compare_real = np.allclose(paddle_real_result.numpy(), torch_real_result.numpy(),rtol=1e-01, atol=1e-01)
compare_torch_real = np.allclose(torch_transform_real_result.numpy(), torch_real_result.numpy(),rtol=1e-01, atol=1e-01)
print(f'paddle vs. torch complex result:{compare_complex}')
print(f'paddle vs. torch real result:{compare_real}')
print(f'torch vs. torch real result:{compare_torch_real}')
# paddle vs. torch complex result:True
# paddle vs. torch real result:False
# torch vs. torch real result:True
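
For reference, the conversion that both paddle.as_real and torch.view_as_real are expected to perform is simple: each complex element becomes a [real, imag] pair on a new trailing axis of size 2. The plain-Python sketch below shows that expected behavior on made-up values; the function names mirror the torch ones, but these are stand-ins written for illustration, not the library implementations.

```python
def view_as_real(values):
    """Map each complex number to a [real, imag] pair (trailing axis of size 2)."""
    return [[z.real, z.imag] for z in values]


def view_as_complex(pairs):
    """Inverse mapping: rebuild complex numbers from [real, imag] pairs."""
    return [complex(re, im) for re, im in pairs]


# Made-up spectrum values, not taken from the stft output above.
spectrum = [complex(-16.5, 0.0), complex(22.0, 0.5), complex(-7.25, -1.5)]
pairs = view_as_real(spectrum)
print(pairs)
print(view_as_complex(pairs) == spectrum)
```

Any result whose real parts differ from those of the complex input, as in the failing comparison, is inconsistent with this mapping.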

@GreatV
Contributor

GreatV commented Dec 6, 2023

Used on their own, paddle.as_real and torch.view_as_real also agree.

import torch
import numpy as np
import paddle

np.random.seed(42)

x = np.random.rand(4000, ) + np.random.rand(4000, ) * 1j
x = x.astype(np.complex64)

x_t = torch.tensor(data=x)
y_t = torch.view_as_real(x_t)

x_p = paddle.to_tensor(data=x)
y_p = paddle.as_real(x_p)


print(np.testing.assert_allclose(y_t.numpy(), y_p.numpy(), rtol=1e-5, atol=1e-5))
None

After stft, Paddle's as_real result goes wrong, but if the stft output is first converted to NumPy and back, the result is correct again.

import paddle
import numpy as np
np.random.seed(42)

input = np.random.randn(4410)

x = paddle.to_tensor(data=input, dtype='float32')
n_fft = 400
res0 = paddle.signal.stft(n_fft=n_fft, x=x)
res_a = paddle.as_real(res0)

res_np = res0.numpy()
res_p = paddle.to_tensor(data=res_np)
res_b = paddle.as_real(res_p)

print(np.testing.assert_allclose(res_a.numpy(), res_b.numpy(), rtol=1e-5, atol=1e-5))
AssertionError: 
Not equal to tolerance rtol=1e-05, atol=1e-05

Mismatched elements: 18030 / 18090 (99.7%)
Max absolute difference: 85.429596
Max relative difference: 4.6649037e+08
 x: array([[[-1.644731e+01,  0.000000e+00],
        [ 2.198440e+01,  6.146729e-08],
        [-7.343600e+00, -1.937151e-06],...
 y: array([[[-1.644731e+01,  0.000000e+00],
        [-1.396130e+01,  0.000000e+00],
        [ 9.019444e+00,  0.000000e+00],...
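
The element count in this report can be checked by hand. Assuming the defaults that the calls above rely on (hop_length = n_fft // 4, center=True, one-sided output for a real input), the sketch below reproduces the total of 18090 from the signal length 4410 and n_fft = 400. The function name is hypothetical.

```python
def stft_real_element_count(signal_length, n_fft):
    """Return (frequency bins, frames, total real elements) under the assumed defaults."""
    hop_length = n_fft // 4                        # assumed default hop
    num_frames = 1 + signal_length // hop_length   # center=True pads n_fft // 2 on each side
    num_bins = n_fft // 2 + 1                      # one-sided spectrum of a real signal
    return num_bins, num_frames, num_bins * num_frames * 2  # 2 = (real, imag)


bins, frames, total = stft_real_element_count(4410, 400)
print(bins, frames, total)
```

So the compared arrays cover the full real-valued stft output, and nearly all of it mismatches.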

@Li-fAngyU
Contributor Author

Yes, this is a bit strange.
The NumPy values of res0 and res_p are identical, but the results res_a and res_b of applying as_real to each of them are not.

import paddle
import numpy as np
np.random.seed(42)

input = np.random.randn(4410)

x = paddle.to_tensor(data=input, dtype='float32')
n_fft = 400
res0 = paddle.signal.stft(n_fft=n_fft, x=x)
res_a = paddle.as_real(res0)

res_np = res0.numpy()
res_p = paddle.to_tensor(data=res_np)
res_b = paddle.as_real(res_p)

print(np.testing.assert_allclose(res0.numpy(), res_p.numpy(), rtol=1e-5, atol=1e-5))
print(np.testing.assert_allclose(res_a.numpy(), res_b.numpy(), rtol=1e-5, atol=1e-5))
None
AssertionError: 
...
        [ 2.198440e+01,  6.146729e-08],
        [-7.343600e+00, -1.937151e-06],...
 y: array([[[-1.644731e+01,  0.000000e+00],
        [-1.396130e+01,  0.000000e+00],
        [ 9.019444e+00,  0.000000e+00],...

Labels
contributor External developers

7 participants