AssertionError: Output shape does not match input shape. Data loss has occured. #140

Closed
R-N opened this issue Nov 25, 2023 · 4 comments

R-N commented Nov 25, 2023

Code:

dataset = importer.ImportYoloV5("labels", path_to_images="../images")
dataset.splitter.StratifiedGroupShuffleSplit(train_pct=.8, val_pct=.0, test_pct=.2, batch_size=1)

Environment:

  • Windows 11 64 bit
  • Python 3.10
  • Jupyter notebook

Error:

AssertionError Traceback (most recent call last)
Cell In[13], line 1
----> 1 dataset.splitter.StratifiedGroupShuffleSplit(train_pct=.8, val_pct=.0, test_pct=.2, batch_size=1)

File ~\AppData\Roaming\Python\Python310\site-packages\pylabel\splitter.py:223, in Split.StratifiedGroupShuffleSplit(self, train_pct, test_pct, val_pct, weight, group_col, cat_col, batch_size)
218 df_val["split"] = "val"
220 df = pd.concat([df_train, pd.concat([df_test, df_val])])
222 assert (
--> 223 df.shape == df_main.shape
224 ), "Output shape does not match input shape. Data loss has occured."
226 self.dataset.df = df
227 self.dataset.df = self.dataset.df.reset_index(drop=True)

AssertionError: Output shape does not match input shape. Data loss has occured.

I have no idea what that means and what I should (or shouldn't) do.
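For readers hitting the same message: the traceback shows the check comparing `df.shape` (the train, test, and val pieces concatenated back together) with `df_main.shape` (the frame before splitting). Below is a minimal sketch of that kind of check on toy data, using only pandas. It is not pylabel's implementation: the helper `recombine_and_check`, the toy rows, and the `img_filename` column are assumptions made for illustration, and the sketch does not claim to reproduce the actual cause of this issue.

```python
import pandas as pd


def recombine_and_check(df_main, parts):
    """Concatenate split parts and verify no rows or columns were lost.

    Hypothetical helper. `parts` maps a split name to a DataFrame.
    """
    labeled = []
    for name, part in parts.items():
        part = part.copy()
        part["split"] = name  # tag each row with the split it landed in
        labeled.append(part)
    df = pd.concat(labeled)
    # Same idea as the assertion in the traceback: shapes must match exactly.
    if df.shape != df_main.shape:
        missing = df_main.index.difference(df.index)
        raise AssertionError(
            f"expected shape {df_main.shape}, got {df.shape}; "
            f"{len(missing)} input rows are in no split"
        )
    return df


# Toy annotation table (assumed columns), one row per label.
df_main = pd.DataFrame(
    {
        "img_filename": ["a.jpg", "a.jpg", "b.jpg", "c.jpg", "d.jpg"],
        "cat_name": ["cat", "dog", "cat", "dog", "cat"],
        "split": [""] * 5,
    }
)

train = df_main[df_main["img_filename"].isin(["a.jpg", "b.jpg"])]
test = df_main[df_main["img_filename"].isin(["c.jpg"])]
val = df_main.iloc[0:0]  # empty validation split, as with val_pct=0

# d.jpg was assigned to no split, so the check fails.
try:
    recombine_and_check(df_main, {"train": train, "test": test, "val": val})
except AssertionError as err:
    print(err)
```

In other words, the message means that at least one input row did not end up in any of the three output splits (or a row was duplicated, or a column was added or dropped), so the splitter refuses to overwrite the dataset with the result.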

alexheat added a commit that referenced this issue Nov 26, 2023
Add tqdm progress bar and resolve #139, resolve #140
alexheat (Contributor) commented

@R-N I have resolved the issue in the latest version.

R-N commented Nov 27, 2023

@alexheat Thank you. The error went away, but ShowClassSplits is empty.


R-N commented Nov 27, 2023

I see the problem: dataset.df["cat_name"] is empty.

R-N commented Nov 27, 2023

Okay, it seems to be empty from the start, right after loading, so it's a different issue.
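A quick way to confirm this kind of finding is to count how many rows have a missing or whitespace-only class name. The sketch below runs on a toy frame standing in for `dataset.df`. The helper `blank_mask` and the toy values are hypothetical, the `cat_id` column is an assumption, and I have not checked how ShowClassSplits builds its table, so the last step only illustrates one plausible way a per-split class breakdown ends up empty.

```python
import pandas as pd


def blank_mask(series):
    """True where a value is missing or only whitespace (hypothetical helper)."""
    return series.isna() | series.astype(str).str.strip().eq("")


# Toy stand-in for dataset.df right after import (assumed columns and values).
df = pd.DataFrame(
    {
        "cat_id": [0, 1, 0, 2],
        "cat_name": ["", "", None, " "],
        "split": ["train", "train", "test", "test"],
    }
)

mask = blank_mask(df["cat_name"])
print(f"{mask.sum()} of {len(df)} rows have a blank cat_name")

# Once blank names are excluded there is nothing left to group on,
# so a per-split class count comes out empty.
counts = df.loc[~mask].groupby(["split", "cat_name"]).size()
print(counts.empty)
```

Running the same mask on the real `dataset.df["cat_name"]` immediately after the import call would show whether the names are already blank before any splitting happens.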
