How to Replace backbone #70

Open
DragonBoyL opened this issue Jan 20, 2024 · 38 comments

Comments

@DragonBoyL

I want to change the backbone to ResNet. How should I modify the code?

KRK11 commented Jan 27, 2024

> I want to change backbone to ResNet, how should I modify the code

Modify backbone.py so that it returns four feature layers. Only the last three feature layers are actually used.

jone222 commented Feb 2, 2024

File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 2152, in load_state_dict
raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for VGG:
Missing key(s) in state_dict: "features.1.weight", "features.1.bias", "features.1.running_mean", "features.1.running_var", "features.3.weight", "features.3.bias", "features.4.weight", "features.4.bias", "features.4.running_mean", "features.4.running_var", "features.8.weight", "features.8.bias", "features.8.running_mean", "features.8.running_var", "features.11.weight", "features.11.bias", "features.11.running_mean", "features.11.running_var", "features.15.weight", "features.15.bias", "features.15.running_mean", "features.15.running_var", "features.18.weight", "features.18.bias", "features.18.running_mean", "features.18.running_var", "features.20.weight", "features.20.bias", "features.21.running_mean", "features.21.running_var", "features.25.weight", "features.25.bias", "features.25.running_mean", "features.25.running_var", "features.27.weight", "features.27.bias", "features.28.running_mean", "features.28.running_var", "features.30.weight", "features.30.bias", "features.31.weight", "features.31.bias", "features.31.running_mean", "features.31.running_var", "features.34.weight", "features.34.bias", "features.35.weight", "features.35.bias", "features.35.running_mean", "features.35.running_var", "features.37.weight", "features.37.bias", "features.38.weight", "features.38.bias", "features.38.running_mean", "features.38.running_var", "features.40.weight", "features.40.bias", "features.41.weight", "features.41.bias", "features.41.running_mean", "features.41.running_var".
Unexpected key(s) in state_dict: "features.2.weight", "features.2.bias", "features.5.weight", "features.5.bias", "features.12.weight", "features.12.bias", "features.19.weight", "features.19.bias", "features.26.weight", "features.26.bias".
size mismatch for features.7.weight: copying a param with shape torch.Size([128, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([128, 64, 3, 3]).
size mismatch for features.10.weight: copying a param with shape torch.Size([256, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([128, 128, 3, 3]).
size mismatch for features.10.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([128]).
size mismatch for features.14.weight: copying a param with shape torch.Size([256, 256, 3, 3]) from checkpoint, the shape in current model is torch.Size([256, 128, 3, 3]).
size mismatch for features.17.weight: copying a param with shape torch.Size([512, 256, 3, 3]) from checkpoint, the shape in current model is torch.Size([256, 256, 3, 3]).
size mismatch for features.17.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
size mismatch for features.21.weight: copying a param with shape torch.Size([512, 512, 3, 3]) from checkpoint, the shape in current model is torch.Size([256]).
size mismatch for features.21.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
size mismatch for features.24.weight: copying a param with shape torch.Size([512, 512, 3, 3]) from checkpoint, the shape in current model is torch.Size([512, 256, 3, 3]).
size mismatch for features.28.weight: copying a param with shape torch.Size([512, 512, 3, 3]) from checkpoint, the shape in current model is torch.Size([512]).

How can I solve this problem?

KRK11 commented Feb 2, 2024

After swapping the backbone, are you directly loading a checkpoint that was trained with the original model?
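One way to see where such a mismatch comes from is to compare key names and shapes before calling load_state_dict. The sketch below is a hypothetical helper (`shapes_of` and `diff_state_dicts` are not part of PyTorch or of this repository) that works on plain dictionaries of shapes. The toy shapes echo a few entries from the error above. Missing `running_mean`/`running_var` keys mean the model being built contains BatchNorm layers whose statistics are absent from the checkpoint, which is the pattern one would expect when weights saved from a VGG without BatchNorm are loaded into a BatchNorm variant.

```python
def shapes_of(state_dict):
    """Map each key to a plain tuple shape (works for tensors or arrays)."""
    return {k: tuple(v.shape) for k, v in state_dict.items()}


def diff_state_dicts(ckpt_shapes, model_shapes):
    """Return (missing, unexpected, mismatched), grouped like the error message."""
    missing = sorted(k for k in model_shapes if k not in ckpt_shapes)
    unexpected = sorted(k for k in ckpt_shapes if k not in model_shapes)
    mismatched = {
        k: (ckpt_shapes[k], model_shapes[k])
        for k in ckpt_shapes
        if k in model_shapes and ckpt_shapes[k] != model_shapes[k]
    }
    return missing, unexpected, mismatched


# Toy shapes echoing a few entries of the error above (not a full model).
ckpt = {
    "features.0.weight": (64, 3, 3, 3),
    "features.2.weight": (64, 64, 3, 3),
    "features.7.weight": (128, 128, 3, 3),
}
model = {
    "features.0.weight": (64, 3, 3, 3),
    "features.1.running_mean": (64,),
    "features.7.weight": (128, 64, 3, 3),
}
missing, unexpected, mismatched = diff_state_dicts(ckpt, model)
print("missing:", missing)
print("unexpected:", unexpected)
print("mismatched:", mismatched)
```

With real objects, `shapes_of(checkpoint)` and `shapes_of(model.state_dict())` would supply the two arguments, assuming the checkpoint file holds a plain state dict.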

jone222 commented Feb 2, 2024

No, I'm only just getting started. I'd like to ask: does much code need to change to use this model for wheat-ear counting?

KRK11 commented Feb 2, 2024

You may not need to change anything at all; just prepare the dataset properly.

jone222 commented Feb 2, 2024

OK. I suspect my dataset was not prepared properly, so I'll take another look. Thanks!

jone222 commented Feb 2, 2024

Would it be convenient to add me on WeChat? I may have more questions later. Many thanks. My WeChat ID is yjh1203-

KRK11 commented Feb 2, 2024

Please ask here, so that others can see your questions too.

jone222 commented Feb 3, 2024

OK, thank you very much.

jone222 commented Feb 3, 2024

Traceback (most recent call last):
  File "/content/drive/MyDrive/CrowdCounting-P2PNet-main/train.py", line 222, in <module>
    main(args)
  File "/content/drive/MyDrive/CrowdCounting-P2PNet-main/train.py", line 159, in main
    stat = train_one_epoch(
  File "/content/drive/MyDrive/CrowdCounting-P2PNet-main/engine.py", line 85, in train_one_epoch
    for samples, targets in data_loader:
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/dataloader.py", line 630, in __next__
    data = self._next_data()
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/dataloader.py", line 1345, in _next_data
    return self._process_data(data)
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/dataloader.py", line 1371, in _process_data
    data.reraise()
  File "/usr/local/lib/python3.10/dist-packages/torch/_utils.py", line 694, in reraise
    raise exception
FileNotFoundError: Caught FileNotFoundError in DataLoader worker process 0.

How can I fix this error? I have already imported the dataset and changed the dataset paths in train and shha.

KRK11 commented Feb 3, 2024

Could it be that train.list and test.list are written incorrectly, or that self.train_list and self.test_list in __init__ of SHHA.py do not match them?

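jone222 commented Feb 3, 2024

I am using a wheat dataset directly. This picture shows my dataset layout:
[image: Screenshot 2024-02-03 164932]
In shha.py, the original code is self.val = a list for the test set. Do I need to change this val to test?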

KRK11 commented Feb 3, 2024

self.train_lists = "train.list"
self.eval_list = "test.list"
That is enough. How is train.list written for your dataset? Also, each line of a gt file holds one pair of point coordinates.
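A small sketch of reading such a gt file, assuming each non-empty line holds an x and a y value separated by whitespace. The function name `read_points` is hypothetical and the exact separator expected by the repository's loader is an assumption.

```python
from pathlib import Path


def read_points(gt_path):
    """Read one 'x y' pair per non-empty line and return a list of (x, y) floats."""
    points = []
    lines = Path(gt_path).read_text().splitlines()
    for line_no, line in enumerate(lines, start=1):
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        if len(parts) < 2:
            raise ValueError(f"{gt_path}:{line_no}: expected 'x y', got {line!r}")
        points.append((float(parts[0]), float(parts[1])))
    return points
```

Running this over every gt file before training is a cheap way to catch malformed annotation lines early.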

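jone222 commented Feb 3, 2024

train.list is the part on the right:
[image]
My gt files are the txt files that correspond to the images.
In the test folder, however, I did not separate the images from the txt files; I put them all together.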

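KRK11 commented Feb 3, 2024

The gt path is wrong: it should be test/gt/XXXX.txt, not "text"/gt/XXXX.txt. The gt path also has to point to a real location, so your test folder must actually contain that gt file path. As long as the path you specify is correct, it works.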
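A quick way to find such path mistakes is to check every entry of a list file before training. This is a minimal sketch, assuming each line holds an image path and a gt path separated by whitespace, both relative to the data root; the function name `find_missing_files` is hypothetical.

```python
from pathlib import Path


def find_missing_files(data_root, list_name):
    """Return (line_no, relative_path) for every listed file that does not exist."""
    root = Path(data_root)
    missing = []
    lines = (root / list_name).read_text().splitlines()
    for line_no, line in enumerate(lines, start=1):
        for rel_path in line.split():
            if not (root / rel_path).is_file():
                missing.append((line_no, rel_path))
    return missing
```

For example, `find_missing_files("DATAROOT", "train.list")` would report each line whose image or gt path cannot be found, which covers both a misspelled folder name and a gt entry that points at the wrong split.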

jone222 commented Feb 3, 2024

OK, I'll fix it. Thanks a lot.

jone222 commented Feb 3, 2024

After changing text to test I still get the same error. What should I fill in for default in the image below?
[image: Screenshot 2024-02-03 182223]

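KRK11 commented Feb 3, 2024

Just keep the default 'SHHA'. The data loading part of the source code only considers SHHA, so data is only loaded with this default. Alternatively, delete the line if args.dataset_file == 'SHHA': in crowd_datasets/__init__.py, or change it to whatever you like.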

jone222 commented Feb 3, 2024

OK, I am using SHHA, but I still get the earlier problem:
Traceback (most recent call last):
  File "/content/drive/MyDrive/CrowdCounting-P2PNet-main/train.py", line 222, in <module>
    main(args)
  File "/content/drive/MyDrive/CrowdCounting-P2PNet-main/train.py", line 159, in main
    stat = train_one_epoch(
  File "/content/drive/MyDrive/CrowdCounting-P2PNet-main/engine.py", line 85, in train_one_epoch
    for samples, targets in data_loader:
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/dataloader.py", line 630, in __next__
    data = self._next_data()
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/dataloader.py", line 1345, in _next_data
    return self._process_data(data)
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/dataloader.py", line 1371, in _process_data
    data.reraise()
  File "/usr/local/lib/python3.10/dist-packages/torch/_utils.py", line 694, in reraise
    raise exception
FileNotFoundError: Caught FileNotFoundError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/_utils/worker.py", line 308, in _worker_loop
    data = fetcher.fetch(index)
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/_utils/fetch.py", line 51, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/_utils/fetch.py", line 51, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/content/drive/MyDrive/CrowdCounting-P2PNet-main/crowd_datasets/SHHA/SHHA.py", line 53, in __getitem__
    img, point = load_data((img_path, gt_path), self.train)
  File "/content/drive/MyDrive/CrowdCounting-P2PNet-main/crowd_datasets/SHHA/SHHA.py", line 102, in load_data
    with open(gt_path) as f_label:
FileNotFoundError: [Errno 2] No such file or directory: '/content/drive/MyDrive/CrowdCounting-P2PNet-main/DATAROOT/test/gt/0078.txt'

jone222 commented Feb 3, 2024

[image]
Based on this, do I need to rename the "gt" folder in my dataset to img_gt_path?

KRK11 commented Feb 3, 2024

No. You need to write the real location of each gt file correctly in the list file; in other words, the relative path has to be right.

KRK11 commented Feb 3, 2024

'/content/drive/MyDrive/CrowdCounting-P2PNet-main/DATAROOT/test/gt/0078.txt'
Are you sure this path exists?

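jone222 commented Feb 3, 2024

This file exists under train: /content/drive/MyDrive/CrowdCounting-P2PNet-main/DATAROOT/train/gt/0078.txt is there.
But I don't understand why it says the file cannot be found under test. test has no such path: my train set is images 1-700 and my test set is images 5134-5833 (the numbers are the image file names), so there is no 78 in test.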

KRK11 commented Feb 3, 2024

If it does not exist, then the test.list file of your dataset is written incorrectly. That file should contain the paths of the files in your test set.

jone222 commented Feb 3, 2024

[image]
My test.list does not contain 78 either; its entries match the images.

KRK11 commented Feb 3, 2024

This path may appear in your train.list. Does the error occur before training even starts, or during validation?

KRK11 commented Feb 3, 2024

In your train.list, remove the gt after test, because you laid the files out flat and there is no gt folder.

jone222 commented Feb 3, 2024

Yes, I found that the problem was in train.list: the gt paths should be under train, not under test. Thanks, and sorry for taking up your time. It seems to be running now:
Averaged stats: lr: 0.000100 loss: 0.0112 (0.0842) loss_ce: 0.0112 (0.0842) loss_ce_unscaled: 0.0112 (0.0842) loss_point_unscaled: 3.7103 (4.0512)
[ep 0][lr 0.0001000][88.10s]
Averaged stats: lr: 0.000100 loss: 0.0084 (0.0093) loss_ce: 0.0084 (0.0093) loss_ce_unscaled: 0.0084 (0.0093) loss_point_unscaled: 3.5111 (3.5377)
[ep 1][lr 0.0001000][63.98s]
Is this right? And where can I see the MSE?

KRK11 commented Feb 3, 2024

The evaluation results are printed every 5 epochs.
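For reference, a sketch of how MAE and MSE are commonly computed from per-image counts in crowd counting, where the value reported as "MSE" is usually the square root of the mean squared count error. Whether this repository's evaluation code computes exactly this is an assumption; the function name `mae_and_mse` and the sample counts are made up for illustration.

```python
import math


def mae_and_mse(pred_counts, gt_counts):
    """Return (MAE, root mean squared error) over per-image counts."""
    errors = [p - g for p, g in zip(pred_counts, gt_counts)]
    mae = sum(abs(e) for e in errors) / len(errors)
    mse = math.sqrt(sum(e * e for e in errors) / len(errors))
    return mae, mse


# Made-up predicted and ground-truth counts for three images.
mae, mse = mae_and_mse([10, 22, 35], [12, 20, 35])
print(f"mae={mae:.3f} mse={mse:.3f}")
```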

jone222 commented Feb 18, 2024

Are you still there? When running datasets such as ShanghaiTech or NWPU, how do I convert the mat files in the data to txt?

KRK11 commented Feb 18, 2024

from scipy.io import loadmat
mat = loadmat(mat_path)
Then save whatever information you need from mat.
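A minimal sketch of a standalone conversion script built on those two lines. The nested index `mat["image_info"][0, 0][0, 0][0]` is an assumption about the ShanghaiTech annotation layout; other datasets store their points under different keys, so inspect `mat.keys()` first. The function names are hypothetical.

```python
import numpy as np
from scipy.io import loadmat


def load_points(mat_path):
    """Return an (N, 2) array of x, y positions from one annotation file."""
    mat = loadmat(mat_path)
    # Assumed ShanghaiTech layout; print(mat.keys()) to inspect other datasets.
    points = mat["image_info"][0, 0][0, 0][0]
    return np.asarray(points, dtype=float).reshape(-1, 2)


def save_points_txt(points, txt_path):
    """Write one 'x y' pair per line."""
    array = np.asarray(points, dtype=float).reshape(-1, 2)
    np.savetxt(txt_path, array, fmt="%.4f")
```

This would run once as a preprocessing step, for example `save_points_txt(load_points(mat_path), txt_path)` for each annotation file, rather than being added to train.py.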

jone222 commented Feb 20, 2024

Do I just add these two lines of code to train?

@mpmmpmmmp

jone222 commented Feb 20, 2024

I'll take a look, thanks.

@willow-god

> I want to change backbone to ResNet, how should I modify the code
>
> Modify backbone.py to return four feature layers. Of course, only the last three feature layers are used.

Hello, I also tried modifying the backbone. The code now runs, but the results are not ideal. I used four ResNet feature levels and added one layer to the decoder to process them, yet performance actually dropped. I am not sure why. Do you have any ideas?

KRK11 commented Mar 2, 2024

> Hello, I also tried modifying the backbone. The code now runs, but the results are not ideal. I used four ResNet feature levels and added one layer to the decoder to process them, yet performance actually dropped. I am not sure why. Do you have any ideas?

How large is the performance gap?

@willow-god

> How large is the performance gap?

Not by much, but I feel it should not drop at all; switching from VGG to ResNet should surely bring some improvement 😂😂😂. I modified the model roughly in these places:
[image]
The image above shows the feature extraction of the baseline network.
[image]
The image above shows the extra layer added to the decoder.

nice98k commented Nov 16, 2024

P2PNet QQ discussion group: 790745540. Let's help each other and fix bugs.

6 participants