Loss does not converge and accuracy is wrong when training vgg16/vgg19 on a 5-class flower dataset; also, how do I specify the pretrained model? #811

Open
LegendSun0 opened this issue Sep 29, 2024 · 0 comments
Labels
bug Something isn't working

Comments

LegendSun0 commented Sep 29, 2024

If this is your first time, please read our contributor guidelines:
https://github.com/mindspore-lab/mindcv/blob/main/CONTRIBUTING.md

Describe the bug (Mandatory)
When training vgg16 and vgg19 on a 5-class flower dataset, on both GPU and NPU, the loss does not converge and the accuracy is wrong.

  • Hardware Environment (Ascend/GPU/CPU):

Please delete the backend not involved:
/device ascend/GPU

  • Software Environment (Mandatory):
    -- MindSpore version (e.g., 2.2.11) :
    -- Python version (e.g., Python 3.9.18) :
    -- OS platform and distribution (e.g., Linux Ubuntu 22.04):
    -- GCC/Compiler version (if compiled from source):

  • Execute Mode (Mandatory) (PyNative/Graph):

Please delete the mode not involved:
/mode pynative PYNATIVE_MODE(1)
/mode graph

To Reproduce (Mandatory)
Steps to reproduce the behavior:
Train with the YAML config file.
Command: python train.py --config ./configs/vgg/vgg16_ascend.yaml

Expected behavior (Mandatory)
A clear and concise description of what you expected to happen.

Screenshots / Logs (Mandatory)
If applicable, add screenshots to help explain your problem.
Contents of the YAML file:

# system
mode: 1
distribute: False
num_parallel_workers: 8
val_while_train: True

# dataset
dataset: 'imagenet'
data_dir: './imageNet'
shuffle: True
dataset_download: False
batch_size: 32
drop_remainder: True

# augmentation
image_resize: 224
scale: [0.08, 1.0]
ratio: [0.75, 1.333]
hflip: 0.5
interpolation: 'bilinear'
crop_pct: 0.875

# model
model: 'vgg16'
num_classes: 5
pretrained: True
ckpt_path: ''
keep_checkpoint_max: 1
ckpt_save_dir: './ckpt3'
epoch_size: 20
dataset_sink_mode: True
amp_level: 'O0'

# loss
loss: 'CE'
label_smoothing: 0.1

# lr scheduler
scheduler: 'warmup_cosine_decay'
lr: 0.01
min_lr: 0.0001
decay_epochs: 198
warmup_epochs: 2

# optimizer
opt: 'momentum'
momentum: 0.9
weight_decay: 0.00004
loss_scale: 1024
use_nesterov: False
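Note on the scheduler values above: warmup_epochs (2) plus decay_epochs (198) describe a 200-epoch schedule, while epoch_size is 20. The sketch below uses a generic textbook warmup-plus-cosine formula to show what that could mean for the learning rate during a 20-epoch run. It is an assumption for illustration only and may differ in detail from MindCV's own `warmup_cosine_decay` implementation; the function name `warmup_cosine_lr` is hypothetical.

```python
import math


def warmup_cosine_lr(epoch, base_lr=0.01, min_lr=0.0001,
                     warmup_epochs=2, decay_epochs=198):
    """Learning rate at a 0-based epoch index (generic formula, not MindCV's code)."""
    if epoch < warmup_epochs:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * epoch / warmup_epochs
    # Fraction of the cosine decay phase completed, clipped to [0, 1].
    progress = min((epoch - warmup_epochs) / decay_epochs, 1.0)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


# Learning rate seen by each of the 20 epochs in the run reported here.
for epoch in range(20):
    print(epoch, round(warmup_cosine_lr(epoch), 6))
```

Under this formula the learning rate stays close to 0.01 for the whole 20-epoch run, because only a small fraction of the 198-epoch decay has elapsed. Whether that contributes to the non-convergence is not established here; setting decay_epochs to match the actual run length, or trying a smaller lr for fine-tuning, would be one experiment to run.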

Training results:
Epoch TrainLoss Top_1_Accuracy Top_5_Accuracy TrainTime EvalTime TotalTime
1 1.659075 25.2044% 100.0000% 22.04 0.99 27.67
2 1.790772 19.0736% 100.0000% 6.21 0.84 10.10
3 1.747301 19.0736% 100.0000% 6.46 0.84 10.10
4 1.628069 19.0736% 100.0000% 6.18 0.78 9.68
5 1.661704 19.0736% 100.0000% 6.33 0.85 10.33
6 1.725484 19.0736% 100.0000% 6.19 0.85 10.06
7 1.674596 18.9373% 100.0000% 6.40 0.89 10.36
8 1.607921 19.0736% 100.0000% 6.25 0.75 10.25
9 1.670359 19.0736% 100.0000% 6.17 0.80 10.14
10 1.685464 19.0736% 100.0000% 6.22 0.87 10.75
11 1.688051 19.0736% 100.0000% 6.41 0.83 10.23
12 1.720397 19.0736% 100.0000% 6.22 0.78 10.54
13 1.750791 19.0736% 100.0000% 6.29 0.79 10.29
14 1.598438 19.0736% 100.0000% 6.18 0.83 9.85
15 1.609399 19.0736% 100.0000% 6.14 0.84 9.81
16 1.617299 19.0736% 100.0000% 6.17 0.95 10.13
17 1.744891 19.0736% 100.0000% 6.23 0.86 10.30
18 1.776682 19.0736% 100.0000% 6.18 0.83 9.81
19 1.670697 19.0736% 100.0000% 6.12 0.93 10.03
20 1.782085 19.0736% 100.0000% 6.36 0.83 10.14
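For reference when reading the log above: with 5 classes, Top-5 accuracy is 100% by construction, and a model that outputs a uniform distribution has a cross-entropy of ln(5). The sketch below computes that chance-level loss with label smoothing 0.1, assuming the common convention in which the smoothed target is (1 - smoothing) on the true class plus smoothing / K spread over all K classes. The helper `smoothed_cross_entropy` is hypothetical and is not taken from MindCV.

```python
import math


def smoothed_cross_entropy(probs, target, smoothing=0.1):
    """Cross-entropy between a label-smoothed one-hot target and predicted probabilities."""
    k = len(probs)
    loss = 0.0
    for i, p in enumerate(probs):
        # Smoothed target: smoothing / k on every class, plus (1 - smoothing) on the true class.
        q = smoothing / k + ((1.0 - smoothing) if i == target else 0.0)
        loss -= q * math.log(p)
    return loss


num_classes = 5
uniform = [1.0 / num_classes] * num_classes
chance_loss = smoothed_cross_entropy(uniform, target=0)
print(f"chance-level loss: {chance_loss:.4f}")  # 1.6094, i.e. ln(5)
```

The reported TrainLoss values (about 1.60 to 1.79) sit at or above this value, and a Top-1 accuracy stuck at 19.0736% is close to 1/5. That pattern is consistent with the network predicting essentially the same class for every validation image, although the log alone does not confirm it.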

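Additional context (Optional)
Add any other context about the problem here.
The loss does not converge and the accuracy is also wrong. Could you please take a look at what the problem is? In addition, I have already downloaded the pretrained model; how do I point the training run at that file? Currently, pretrained: True downloads the weights automatically to a fixed location, and I would like to know how to specify the path myself.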
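On the local-checkpoint question, one possible approach is sketched below, under stated assumptions. An ImageNet checkpoint has a 1000-class classifier, so with num_classes: 5 the final-layer tensors will not match the network. The helper keeps only checkpoint entries whose name and shape match the target network, so the backbone weights load and the classifier keeps its fresh initialization. The function `filter_by_shape`, the stand-in class `FakeParam`, and the parameter names are all hypothetical; the helper only relies on each value having a `.shape` attribute.

```python
def filter_by_shape(ckpt_params, net_params):
    """Split checkpoint entries into those that fit the network and those that do not."""
    kept, dropped = {}, []
    for name, value in ckpt_params.items():
        target = net_params.get(name)
        if target is not None and tuple(value.shape) == tuple(target.shape):
            kept[name] = value
        else:
            # Missing in the network, or shape differs (e.g. 1000-class vs 5-class head).
            dropped.append(name)
    return kept, dropped


class FakeParam:
    """Stand-in for a framework parameter; only carries a shape (illustration only)."""

    def __init__(self, shape):
        self.shape = shape


# Hypothetical parameter names, used only to demonstrate the filtering.
ckpt = {"features.0.weight": FakeParam((64, 3, 3, 3)),
        "classifier.weight": FakeParam((1000, 4096))}
net = {"features.0.weight": FakeParam((64, 3, 3, 3)),
       "classifier.weight": FakeParam((5, 4096))}

kept, dropped = filter_by_shape(ckpt, net)
print(sorted(kept), dropped)
```

With MindSpore, the two dictionaries could come from `mindspore.load_checkpoint(path)` and `network.parameters_dict()`, and the filtered result could be passed to `mindspore.load_param_into_net(network, kept)`. The `ckpt_path` field in the YAML above looks like the intended place for a local file, presumably together with pretrained: False, but how train.py handles the classifier-size mismatch in that case is not confirmed here.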

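LegendSun0 added the bug (Something isn't working) label Sep 29, 2024
LegendSun0 changed the title from "Loss does not converge and accuracy is wrong when training vgg16/vgg19 on a 5-class flower dataset." to "Loss does not converge and accuracy is wrong when training vgg16/vgg19 on a 5-class flower dataset; also, how do I specify the pretrained model?" Sep 29, 2024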