MNIST dataset from torchvision versus keras.datasets #1

Open
dgcovell opened this issue Oct 8, 2024 · 0 comments

dgcovell commented Oct 8, 2024

I am trying to implement the code at

https://github.com/tschechlovdev/AutoEncoder_KMeans/blob/main/AutoEncoder_KMeans_MNIST.ipynb

but I would like to replace the MNIST dataset with my own data. From the above notebook, the steps for loading MNIST are:

from torchvision.datasets import MNIST
from torch.utils.data import ConcatDataset
from torchvision import transforms

transform = transforms.Compose([transforms.ToTensor(),
                                transforms.Normalize((0.5,), (0.5,))])

trainset = MNIST('./', download=True, train=True, transform=transform)
testset = MNIST('./', download=True, train=False, transform=transform)

Alternatively, the mnist dataset can be loaded via keras:
from keras.datasets import mnist

#loading the dataset
(train_X, train_y), (test_X, test_y) = mnist.load_data()

My question is how to get from the keras mnist arrays to the format produced by the torchvision MNIST dataset. I see that the latter applies conversion and normalization within the transform step; the transform apparently converts each 28x28 mnist image into a 784-vector.
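For concreteness, here is a rough sketch of what I imagine the conversion could look like, wrapping the keras arrays in a torch TensorDataset with the same scaling and normalization (the helper name to_torch_dataset is just something I made up, and I have not verified this against the notebook):

import torch
from torch.utils.data import TensorDataset
from keras.datasets import mnist

(train_X, train_y), (test_X, test_y) = mnist.load_data()

def to_torch_dataset(images, labels):
    # images: uint8 array of shape (N, 28, 28), values 0..255
    x = torch.from_numpy(images).float().div(255.0)   # scale to [0, 1], like ToTensor
    x = (x - 0.5) / 0.5                               # mimic Normalize((0.5,), (0.5,))
    x = x.unsqueeze(1)                                # (N, 1, 28, 28), matching torchvision
    y = torch.from_numpy(labels).long()
    return TensorDataset(x, y)

trainset = to_torch_dataset(train_X, train_y)
testset = to_torch_dataset(test_X, test_y)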
Ideally, being able to export/import the MNIST train/test sets to a CSV file would be helpful to me. However, the MNIST train/test sets cannot be passed directly to the DataFrame constructor (df_testset = pd.DataFrame(testset) raises ValueError: DataFrame constructor not properly called!).
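For the CSV part, this is a sketch of what I think might work once a dataset is in the torch format above, flattening each image to a 784-vector before building the DataFrame (the file name mnist_test.csv is just a placeholder):

import numpy as np
import pandas as pd

# pd.DataFrame(testset) fails because a torch Dataset is not an array-like
# that pandas understands; stacking the flattened tensors first should work.
images = np.stack([img.numpy().reshape(-1) for img, _ in testset])   # (N, 784)
labels = np.array([int(label) for _, label in testset])

df_testset = pd.DataFrame(images)
df_testset["label"] = labels
df_testset.to_csv("mnist_test.csv", index=False)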

If anyone could provide the steps connecting the MNIST and mnist datasets, that would be appreciated. My overall goal is to use the autoencoder utilities in this blog to process my own data.

Thanks
