Collection of functions and modules to help development in PyTorch.
pip install torchoutil
The only requirement is PyTorch.
To check if the package is installed and show the package version, you can use the following command:
torchoutil-info
import torch
from torchoutil import probs_to_name
probs = torch.as_tensor([[0.9, 0.1], [0.4, 0.6]])
names = probs_to_name(probs, idx_to_name={0: "Cat", 1: "Dog"})
# ["Cat", "Dog"]
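To make the mapping concrete, here is a minimal sketch in plain PyTorch that reproduces the output above. It assumes each row is mapped to the name of its highest-probability class; the helper name `argmax_to_names` is hypothetical and this is not torchoutil's implementation.

```python
import torch


def argmax_to_names(probs, idx_to_name):
    # Hypothetical helper: pick the most probable class index per row,
    # then look up its name in the mapping.
    indices = probs.argmax(dim=-1).tolist()
    return [idx_to_name[i] for i in indices]


probs = torch.as_tensor([[0.9, 0.1], [0.4, 0.6]])
names = argmax_to_names(probs, {0: "Cat", 1: "Dog"})
print(names)  # ['Cat', 'Dog']
```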
import torch
from torchoutil import multihot_to_indices
multihot = torch.as_tensor([[1, 0, 0], [0, 1, 1], [0, 0, 0]])
indices = multihot_to_indices(multihot)
# [[0], [1, 2], []]
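The conversion above can be sketched with `torch.nonzero` alone. The function name `multihot_rows_to_indices` is hypothetical; the sketch only illustrates the idea for a 2D input and is not the library's code.

```python
import torch


def multihot_rows_to_indices(multihot):
    # For each row, collect the positions of the non-zero entries.
    # The result is a list of lists because rows can have different lengths.
    return [torch.nonzero(row, as_tuple=True)[0].tolist() for row in multihot]


multihot = torch.as_tensor([[1, 0, 0], [0, 1, 1], [0, 0, 0]])
indices = multihot_rows_to_indices(multihot)
print(indices)  # [[0], [1, 2], []]
```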
import torch
from torchoutil import lengths_to_non_pad_mask
x = torch.as_tensor([3, 1, 2])
mask = lengths_to_non_pad_mask(x, max_len=4)
# Each row i contains x[i] True values for non-padding mask
# tensor([[True, True, True, False],
#         [True, False, False, False],
#         [True, True, False, False]])
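A mask like the one above can be built by comparing a row of positions against each length with broadcasting. This is a sketch under that assumption, with the hypothetical name `non_pad_mask_sketch`; it is not taken from torchoutil's source.

```python
import torch


def non_pad_mask_sketch(lengths, max_len):
    # positions has shape (1, max_len); lengths.unsqueeze(1) has shape (n, 1).
    # Broadcasting the comparison yields an (n, max_len) boolean mask
    # where entry [i, j] is True when j < lengths[i].
    positions = torch.arange(max_len).unsqueeze(0)
    return positions < lengths.unsqueeze(1)


lengths = torch.as_tensor([3, 1, 2])
mask = non_pad_mask_sketch(lengths, max_len=4)
print(mask)
```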
import torch
from torchoutil import masked_mean
x = torch.as_tensor([1, 2, 3, 4])
mask = torch.as_tensor([True, True, False, False])
result = masked_mean(x, mask)
# result contains the mean of the values marked as True: 1.5
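For reference, the same quantity can be computed by hand: sum the selected values and divide by the number of selected positions. The function `masked_mean_sketch` is a hypothetical name for illustration and may differ from how the library handles dimensions or empty masks.

```python
import torch


def masked_mean_sketch(x, mask):
    # Convert both tensors to float so that integer inputs average correctly.
    weights = mask.to(torch.float32)
    total = (x.to(torch.float32) * weights).sum()
    # Note: an all-False mask divides by zero and yields nan in this sketch.
    return total / weights.sum()


x = torch.as_tensor([1, 2, 3, 4])
mask = torch.as_tensor([True, True, False, False])
result = masked_mean_sketch(x, mask)
print(result)  # tensor(1.5000)
```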
Here is an example of pre-computing spectrograms of the torchaudio SPEECHCOMMANDS dataset, using the pack_to_custom function:
from torch import nn
from torchaudio.datasets import SPEECHCOMMANDS
from torchaudio.transforms import Spectrogram
from torchoutil.utils.pack import pack_to_custom
speech_commands_root = "path/to/speech_commands"
packed_root = "path/to/packed_dataset"
dataset = SPEECHCOMMANDS(speech_commands_root, download=True, subset="validation")
# dataset[0] is a tuple, contains waveform and other metadata
class MyTransform(nn.Module):
    def __init__(self) -> None:
        super().__init__()
        self.spectrogram_extractor = Spectrogram()

    def forward(self, item):
        waveform = item[0]
        spectrogram = self.spectrogram_extractor(waveform)
        return (spectrogram,) + item[1:]
pack_to_custom(dataset, packed_root, MyTransform())
Then you can load the pre-computed dataset using PackedDataset:
from torchoutil.utils.pack import PackedDataset
packed_root = "path/to/packed_dataset"
packed_dataset = PackedDataset(packed_root)
packed_dataset[0] # == first transformed item, i.e. transform(dataset[0])
import torch
from torchoutil import insert_at_indices
x = torch.as_tensor([1, 2, 3, 4])
result = insert_at_indices(x, indices=[0, 2], values=5)
# result contains tensor with inserted values: tensor([5, 1, 2, 5, 3, 4])
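The result above is consistent with each index referring to a position in the original tensor, with the value placed just before that element. Under that assumption, the operation can be sketched for 1D tensors as follows; `insert_values_sketch` is a hypothetical name, not the library's function.

```python
import torch


def insert_values_sketch(x, indices, value):
    indices = torch.as_tensor(sorted(indices))
    # Each earlier insertion shifts the later ones by one position in the output.
    out_positions = indices + torch.arange(len(indices))
    out = torch.empty(x.numel() + len(indices), dtype=x.dtype)
    # Mark the output slots that receive the inserted value.
    is_inserted = torch.zeros(out.numel(), dtype=torch.bool)
    is_inserted[out_positions] = True
    out[is_inserted] = value
    # The remaining slots are filled with the original values, in order.
    out[~is_inserted] = x
    return out


x = torch.as_tensor([1, 2, 3, 4])
result = insert_values_sketch(x, indices=[0, 2], value=5)
print(result)  # tensor([5, 1, 2, 5, 3, 4])
```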
import torch
from torchoutil import get_inverse_perm
perm = torch.randperm(10)
inv_perm = get_inverse_perm(perm)
x1 = torch.rand(10)
x2 = x1[perm]
x3 = x2[inv_perm]
# inv_perm are indices that allow us to get x3 from x2, i.e. x1 == x3 here
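An inverse permutation can also be built directly in PyTorch, which may help when checking results. The sketch below uses indexed assignment and compares it with `torch.argsort`, which gives the same answer for a valid permutation; `inverse_perm_sketch` is a hypothetical name.

```python
import torch


def inverse_perm_sketch(perm):
    # If perm[j] == i, then the inverse must satisfy inv[i] == j.
    inv = torch.empty_like(perm)
    inv[perm] = torch.arange(perm.numel())
    return inv


perm = torch.randperm(10)
inv_perm = inverse_perm_sketch(perm)

x1 = torch.rand(10)
x2 = x1[perm]
x3 = x2[inv_perm]

# Applying the inverse permutation restores the original order.
print(torch.equal(x1, x3))  # True
# Sorting the permutation's values yields the same indices.
print(torch.equal(inv_perm, torch.argsort(perm)))  # True
```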
torchoutil also provides additional modules when specific packages are already installed in your environment.
All extras can be installed with pip install torchoutil[extras]
- If tensorboard is installed, the function load_event_file can be used. It is useful for manually loading all the data contained in a tensorboard event file.
- If numpy is installed, the classes NumpyToTensor and ToNumpy and their related functions can be used. They are meant to be used to compose dynamic transforms into a Sequential module.
- If h5py is installed, the function pack_to_hdf and the class HDFDataset can be used. They can be used to pack datasets to HDF files and read them back, and they support variable-length sequences of data.
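To see which of these optional packages are present in your environment, a small check with the standard library is enough. This is a generic sketch using `importlib.util.find_spec`; the helper `available_extras` is hypothetical and says nothing about how torchoutil itself detects its extras.

```python
from importlib.util import find_spec

# The optional packages named in the list above.
OPTIONAL_PACKAGES = ("tensorboard", "numpy", "h5py")


def available_extras(packages=OPTIONAL_PACKAGES):
    # find_spec returns None when a top-level package cannot be found,
    # and it does not import the package itself.
    return {name: find_spec(name) is not None for name in packages}


print(available_extras())
```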
Maintainer:
- Étienne Labbé "Labbeti": labbeti.pub@gmail.com