Question about the use of ConcatDataset in data_process.py #3

Open
jt-dcw opened this issue Sep 24, 2024 · 1 comment

Comments

jt-dcw commented Sep 24, 2024

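First of all, many thanks to the team for the excellent work. While reproducing it, I ran into one point I do not fully understand:
Suppose the pre-training data is ['CAD4-1', 'NYC_TAXI'], and these two datasets have 621 and 263 features respectively. Given that the feature counts differ, how does ConcatDataset concatenate the data?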

LZH-YS1998 (Collaborator) commented

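Thank you for your interest! ConcatDataset can be used here because it maintains an internal index and chains the datasets passed to it in order. It does not depend on the content or the shape of the samples in each dataset.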

You can refer to the implementation in data_process.py, as well as the official PyTorch documentation for ConcatDataset.
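To illustrate the index mechanism described above, here is a minimal pure-Python sketch. It is a simplified stand-in, not the code in data_process.py and not PyTorch's exact implementation; the class name `SimpleConcat`, the toy lists, and the sample counts (3 and 2) are assumptions made for illustration. Only the feature counts 621 and 263 come from the question.

```python
import bisect
from itertools import accumulate


class SimpleConcat:
    """Simplified stand-in for torch.utils.data.ConcatDataset (illustration only)."""

    def __init__(self, datasets):
        self.datasets = list(datasets)
        # Running total of dataset lengths, e.g. [3, 5] for lengths 3 and 2.
        self.cumulative_sizes = list(accumulate(len(d) for d in self.datasets))

    def __len__(self):
        return self.cumulative_sizes[-1] if self.cumulative_sizes else 0

    def __getitem__(self, idx):
        if not 0 <= idx < len(self):
            raise IndexError(idx)
        # Find which dataset the global index falls into ...
        dataset_idx = bisect.bisect_right(self.cumulative_sizes, idx)
        # ... and convert it to an index local to that dataset.
        offset = self.cumulative_sizes[dataset_idx - 1] if dataset_idx > 0 else 0
        return self.datasets[dataset_idx][idx - offset]


# Hypothetical toy data: only the feature counts (621 and 263) come from the question.
cad = [[0.0] * 621 for _ in range(3)]
taxi = [[1.0] * 263 for _ in range(2)]

combined = SimpleConcat([cad, taxi])
print(len(combined))     # 5
print(len(combined[0]))  # 621
print(len(combined[3]))  # 263
```

Each lookup returns one sample from exactly one underlying dataset, so samples of different widths never have to be merged at this stage. Shapes only start to matter when samples are stacked into a batch: a batch that mixes 621-feature and 263-feature samples cannot be stacked as-is, so batching would need to keep the datasets apart or use a custom collate function. How the repository handles that step is not stated in this thread.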
