Colab: Training steps / epochs not appearing on log, only D_0.pth and G_0.pth being created #321

Closed
outhipped opened this issue Apr 13, 2023 · 16 comments · Fixed by #335
Labels
bug Something isn't working

Comments

@outhipped

Describe the bug
The training step does not work. The log shows:
2023-04-13 22:53:18.009147: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-04-13 22:53:19.311998: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
[22:53:20] INFO [22:53:20] NumExpr defaulting to 2 threads.

To Reproduce
From the "Automatic preprocessing" step onwards, the log stops at the "NumExpr defaulting to 2 threads." message.

Additional context
Stopped working 2 days ago.

Screenshot 2023-04-14 at 01 01 48

@outhipped outhipped added the bug Something isn't working label Apr 13, 2023
@34j
Collaborator

34j commented Apr 14, 2023

#317 made this bug

@roseyyy2022ai

roseyyy2022ai commented Apr 14, 2023

So how exactly do I fix it? It stopped working for me as well today.

  • It just stops running after about a minute, still nothing has been trained or done.

@roseyyy2022ai

#317 made this bug

How do I revert to like... a previous version or something (in colab)?

@34j
Collaborator

34j commented Apr 14, 2023

Probably it is working but just not logging, probably

@outhipped
Author

outhipped commented Apr 14, 2023 via email

@roseyyy2022ai

Probably it is working but just not logging, probably

Nope. It's not working.
It's crashing after around one minute of running, with a green checkmark too. No logs, no checkpoints. Nothing.

@estaesta

How do I revert to like... a previous version or something (in colab)?

You can install the 33180e9 branch to revert to v3.5.0.

@roseyyy2022ai

How do I revert to like... a previous version or something (in colab)?

You can install the 33180e9 branch to revert to v3.5.0.

Okay, thanks, that seems to kind of work, but it still doesn't save checkpoints when I need it to. Usually stopping the cell works just fine; 3.5 gives an error and starts over every time. The end step seems to be set to 9999. Is there a way I can set it to something like 2500 instead?

@Lordmau5
Collaborator

#317 made this bug

The logic is inverted.
callbacks=[pl.callbacks.RichProgressBar()] if is_notebook() else None,

It uses the RichProgressBar if it is a notebook, not if it isn't

@axelblaze88

#317 made this bug

The logic is inverted. callbacks=[pl.callbacks.RichProgressBar()] if is_notebook() else None,

It uses the RichProgressBar if it is a notebook, not if it isn't

What do you mean? Can you explain to me how to solve it, please?

@Lordmau5
Collaborator

What do you mean? Can you explain to me how to solve it, please?

Okay so, that code snippet checks if it's running in Colab. If it is, it will use the fancier progress bar.

However, this progress bar does not seem to be available inside Colab so 34j made this method to check.

The logic for it is the wrong way around though - instead of using the old / default progress bar in Colab when it detects it's running in Colab, it is using the fancy one that's not available.

The fix would be to do if not is_notebook instead.
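As a rough illustration of the inverted condition described above, here is a minimal sketch. The names select_callbacks and make_rich_bar are hypothetical and exist only for this example; in the project, the condition sits inline in the callbacks= argument quoted earlier.

```python
from typing import Callable, List, Optional


def select_callbacks(
    in_notebook: bool,
    make_rich_bar: Callable[[], object],
) -> Optional[List[object]]:
    """Pick the progress-bar callbacks for the trainer.

    Hypothetical helper for illustration only; it is not part of the project.
    """
    # Buggy condition from the thread (rich bar chosen *inside* notebooks):
    #     [make_rich_bar()] if in_notebook else None
    # Corrected condition: use the rich bar only *outside* notebooks and
    # fall back to the default progress bar (None) in Colab/Jupyter.
    return [make_rich_bar()] if not in_notebook else None


# Stand-in factory so the sketch runs without PyTorch Lightning installed.
print(select_callbacks(in_notebook=True, make_rich_bar=lambda: "rich bar"))
print(select_callbacks(in_notebook=False, make_rich_bar=lambda: "rich bar"))
```

With PyTorch Lightning, make_rich_bar would presumably be pl.callbacks.RichProgressBar and in_notebook the result of is_notebook(), with the returned value passed as the callbacks= argument.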


I noticed this issue locally when the fancy progress bar wasn't available after an update.

If 34j (or someone else) won't be getting to a pull request I can do one once I'm back at my computer in around an hour

@axelblaze88

Thank you!! ^^

@831Digital

I'm getting the same error @outhipped had right now when I try to train on colab. Switching to the 33180e9 branch works and allows me to train.

@Lordmau5
Collaborator

Lordmau5 commented Apr 18, 2023

I'm getting the same error @outhipped had right now when I try to train on colab. Switching to the 33180e9 branch works and allows me to train.

Make sure you are on the newest version by running the pip install -U so-vits-svc-fork command again in the Colab. (That's what fixed it for axel for example)
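To confirm that the upgrade actually took effect in the Colab runtime, one option is to print the installed version in a cell after running the pip command. This is a minimal sketch using the standard library's importlib.metadata; the helper name installed_version is made up for this example, and the thread does not say which release contains the fix.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str = "so-vits-svc-fork") -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed.

    Hypothetical helper for illustration; the package name is the one used in
    the pip command above.
    """
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Prints the version string, or None when the package is missing.
    print(installed_version())
```

If the printed version is unchanged after the upgrade, restarting the Colab runtime and re-running the install cell may be needed before the new version is picked up.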

Also, could you please open a new issue with the logs / errors for better visibility and handling of the repo? Cheers 🙏

@34j
Collaborator

34j commented Apr 20, 2023

@allcontributors add outhipped bug

@allcontributors
Contributor

@34j

I've put up a pull request to add @outhipped! 🎉

7 participants