Hi,
I have a few suggestions on using LF-MMI:
I noticed you are looping over the batch, creating the supervision and calculating the criterion for each item separately.
You can use the MergeSupervision function I added to pykaldi to create a single supervision for the whole batch and run the criterion only once.
Here is my collate function for the data loader:
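# stdlib ABCs (older PyTorch exposed these as torch._six.container_abcs)
import collections.abc as container_abcs

import kaldi.chain
import torch


def supervision_collate(batch):
    """Collate function for using chain supervisions with a DataLoader."""
    elem = batch[0]
    if isinstance(elem, container_abcs.Sequence):
        # Samples are tuples/lists of fields: regroup by field and recurse.
        transposed = zip(*batch)
        return [supervision_collate(samples) for samples in transposed]
    elif isinstance(elem, kaldi.chain._chain_supervision.Supervision):
        if len(batch) == 1:
            return batch[0]
        # Merge the per-sample supervisions into one for the whole batch.
        # NOTE: identifier spelled as posted; it may be `merge_supervision` in your pykaldi build.
        return kaldi.chain.merge_supervison(batch)
    elif elem is None:
        return batch
    # Everything else (tensors, numbers, ...) goes through PyTorch's default collate.
    return torch.utils.data.dataloader.default_collate(batch)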
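To make the dispatch above easier to follow, here is a dependency-free toy version that you can run without torch or pykaldi. `FakeSupervision`, `fake_merge` and `toy_collate` are hypothetical stand-ins invented for this illustration, not pykaldi APIs, and the equal-length check in `fake_merge` is only an assumption about what a real merge would expect. It is a rough sketch of the control flow, not the actual implementation.

```python
from collections.abc import Sequence
from dataclasses import dataclass


@dataclass
class FakeSupervision:
    """Hypothetical stand-in for kaldi.chain.Supervision (not the real class)."""
    num_sequences: int
    frames_per_sequence: int


def fake_merge(sups):
    """Hypothetical stand-in for the merge call: one object covering all sequences."""
    frames = {s.frames_per_sequence for s in sups}
    # Assumption for this toy: all chunks in a batch have the same length.
    assert len(frames) == 1, "toy merge assumes equal-length chunks"
    return FakeSupervision(sum(s.num_sequences for s in sups), frames.pop())


def toy_collate(batch):
    """Same dispatch order as supervision_collate, minus torch and kaldi."""
    elem = batch[0]
    if isinstance(elem, Sequence) and not isinstance(elem, str):
        # Each sample is a tuple of fields: regroup by field, then recurse.
        return [toy_collate(field) for field in zip(*batch)]
    if isinstance(elem, FakeSupervision):
        return batch[0] if len(batch) == 1 else fake_merge(batch)
    # Stand-in for default_collate: keep the per-sample values as a list.
    return list(batch)


samples = [("utt1", FakeSupervision(1, 150)),
           ("utt2", FakeSupervision(1, 150)),
           ("utt3", FakeSupervision(1, 150))]
utt_ids, merged = toy_collate(samples)
print(utt_ids)   # ['utt1', 'utt2', 'utt3']
print(merged)    # FakeSupervision(num_sequences=3, frames_per_sequence=150)
```

The point is that a batch of (id, supervision) pairs comes out as one list of ids plus a single merged supervision, so the criterion only has to be evaluated once per batch.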
Also, using phone_ali gives a small supervision; you should consider using lattices and phone_lattice_to_proto_supervision instead of alignment_to_proto_supervision.
Finally, you can save your supervisions as bytes and read them back later.
Here are the functions I used for doing this:
Hi @yotam319, thanks a lot for your advice and code sample. Yes, what you said definitely makes sense. Previously, I only did a vanilla version of LF-MMI in the toolbox, and planned to revisit it later to improve the efficiency. I noticed the code change in the pykaldi lib, but have not been able to find the time to work on it. Our internal tools are not built on Kaldi, so I have very limited time to work on this toolkit. I will try to integrate your dataloader into the code base soon. Thanks again!
To add MergeSupervision I made a pull request to pykaldi (pykaldi/pykaldi#182), but you can use my fork, which already has the change (https://github.com/yotam319/pykaldi).
hope this helps :)