Mixin refactor (#360)
mmcauliffe authored Dec 3, 2021
1 parent 686a1af commit 2d6f7e7
Showing 263 changed files with 19,956 additions and 19,226 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/main.yml
@@ -20,7 +20,7 @@ jobs:
- uses: "actions/checkout@v2"
- uses: "actions/setup-python@v2"
with:
python-version: "3.8"
python-version: "3.9"
- name: Install dependencies
run: |
python -m pip install --upgrade pip
4 changes: 2 additions & 2 deletions .gitignore
@@ -11,8 +11,7 @@ report.txt
.DS_Store


tests/data/generated
docs/source/generated
generated/

pretrained_models/

@@ -83,5 +82,6 @@ docs/source/api/

montreal_forced_aligner/_version.py
/docs/source/reference/generated/
<<<<<<< main

docs/source/reference/multiprocessing/generated/
6 changes: 3 additions & 3 deletions docs/source/_static/interrogate_badge.svg
34 changes: 0 additions & 34 deletions docs/source/_templates/class_b.rst

This file was deleted.

10 changes: 0 additions & 10 deletions docs/source/_templates/function_b.rst

This file was deleted.

16 changes: 14 additions & 2 deletions docs/source/changelog/changelog_2.0.rst
@@ -10,6 +10,18 @@
Beta releases
=============

2.0.0b8
-------

- Refactored internal organization to rely on mixins more than monolithic classes, and moved internal functions to be organized by what they're used for instead of the general type.

- For instance, there used to be a ``montreal_forced_aligner.multiprocessing`` module with ``alignment.py``, ``transcription.py``, etc., that all did multiprocessing for various workers. Now that functionality is located closer to where it's used, i.e. ``montreal_forced_aligner.transcription.multiprocessing``.
- Mixins should allow for easier extension to new use cases and for better configuration

- Updated documentation to reflect the refactoring and did a pass over the User Guide
- Added the ability to change the location of root MFA directory based on the ``MFA_ROOT_DIR`` environment variable
- Fixed an issue where the version was incorrectly reported as "2.0.0"
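
The mixin approach described in the 2.0.0b8 entry can be illustrated with a minimal sketch. Every class and parameter name below is hypothetical and chosen only for illustration; none of them is taken from MFA's code. The sketch shows the general pattern of composing small, cooperative classes instead of extending one monolithic base class.

```python
# Hypothetical names throughout; these classes are NOT part of MFA.
class TemporaryDirectoryMixin:
    """Adds a working-directory setting to any class that inherits it."""

    def __init__(self, temporary_directory: str = "/tmp/example", **kwargs):
        # Pass unrecognised keyword arguments along the method resolution order.
        super().__init__(**kwargs)
        self.temporary_directory = temporary_directory


class MultiprocessingMixin:
    """Adds a job-count setting to any class that inherits it."""

    def __init__(self, num_jobs: int = 1, **kwargs):
        super().__init__(**kwargs)
        self.num_jobs = num_jobs


class ExampleWorker(TemporaryDirectoryMixin, MultiprocessingMixin):
    """Composes both behaviours rather than inheriting from one large class."""

    def describe(self) -> str:
        return f"{self.num_jobs} job(s) in {self.temporary_directory}"


worker = ExampleWorker(temporary_directory="/tmp/demo", num_jobs=3)
print(worker.describe())
```

Because each mixin consumes only its own keyword arguments and forwards the rest, a new use case can be assembled by listing a different set of mixins as base classes.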

2.0.0b5
-------

@@ -23,8 +35,8 @@ Beta releases
- Massive refactor to a proper class-based API for interacting with MFA corpora

- Sorry, I really do hope this is the last big refactor of 2.0
- :class:`~montreal_forced_aligner.corpus.Speaker`, :class:`~montreal_forced_aligner.corpus.File`, and :class:`~montreal_forced_aligner.corpus.Utterance` have dedicated classes rather than having their information split across dictionaries mimicking Kaldi files, so they should be more useful for interacting with outside of MFA
- Added :class:`~montreal_forced_aligner.multiprocessing.Job` class as well to make it easier to generate and keep track of information about different processes
- :class:`~montreal_forced_aligner.corpus.classes.Speaker`, :class:`~montreal_forced_aligner.corpus.classes.File`, and :class:`~montreal_forced_aligner.corpus.classes.Utterance` have dedicated classes rather than having their information split across dictionaries mimicking Kaldi files, so they should be more useful for interacting with outside of MFA
- Added :class:`~montreal_forced_aligner.corpus.multiprocessing.Job` class as well to make it easier to generate and keep track of information about different processes
- Updated installation style to be more dependent on conda-forge packages

- Kaldi and MFA are now on conda-forge! |:tada:|
21 changes: 15 additions & 6 deletions docs/source/conf.py
@@ -61,7 +61,10 @@

xref_links = {
"mfa_mailing_list": ("MFA mailing list", "https://groups.google.com/g/mfa-users"),
"mfa_github": ("MFA GitHub Repo", "https://groups.google.com/g/mfa-users"),
"mfa_github": (
"MFA GitHub Repo",
"https://github.com/MontrealCorpusTools/Montreal-Forced-Aligner",
),
"mfa_github_issues": (
"MFA GitHub Issues",
"https://github.com/MontrealCorpusTools/Montreal-Forced-Aligner/issues",
@@ -79,6 +82,9 @@
"kaldi_github": ("Kaldi GitHub", "https://github.com/kaldi-asr/kaldi"),
"htk": ("HTK", "http://htk.eng.cam.ac.uk/"),
"phonetisaurus": ("Phonetisaurus", "https://github.com/AdolfVonKleist/Phonetisaurus"),
"opengrm_ngram": ("OpenGrm-NGram", "https://www.openfst.org/twiki/bin/view/GRM/NGramLibrary"),
"openfst": ("OpenFst", "https://www.openfst.org/twiki/bin/view/FST"),
"baumwelch": ("Baum-Welch", "https://www.opengrm.org/twiki/bin/view/GRM/BaumWelch"),
"pynini": ("Pynini", "https://www.openfst.org/twiki/bin/view/GRM/Pynini"),
"prosodylab_aligner": ("Prosodylab-aligner", "http://prosodylab.org/tools/aligner/"),
"p2fa": (
@@ -126,19 +132,22 @@
"Trainer": "montreal_forced_aligner.abc.Trainer",
"Aligner": "montreal_forced_aligner.abc.Aligner",
"DictionaryData": "montreal_forced_aligner.dictionary.DictionaryData",
"Utterance": "montreal_forced_aligner.corpus.Utterance",
"File": "montreal_forced_aligner.corpus.File",
"Utterance": "montreal_forced_aligner.corpus.classes.Utterance",
"File": "montreal_forced_aligner.corpus.classes.File",
"FeatureConfig": "montreal_forced_aligner.config.FeatureConfig",
"multiprocessing.context.Process": "multiprocessing.Process",
"mp.Process": "multiprocessing.Process",
"Speaker": "montreal_forced_aligner.corpus.Speaker",
"Speaker": "montreal_forced_aligner.corpus.classes.Speaker",
"Namespace": "argparse.Namespace",
"MetaDict": "dict[str, Any]",
}

napoleon_preprocess_types = False
napoleon_attr_annotations = False
napoleon_use_param = True
napoleon_use_ivar = True
napoleon_type_aliases = {
"Labels": "List[str]",
"Labels": "list[str]",
}
typehints_fully_qualified = False
# numpydoc_xref_param_type = True
@@ -222,13 +231,13 @@
nitpick_ignore = [
("py:class", "optional"),
("py:class", "callable"),
("py:class", "CtmType"),
("py:class", "ReversedMappingType"),
("py:class", "WordsType"),
("py:class", "MappingType"),
("py:class", "TextIO"),
("py:class", "SegmentationType"),
("py:class", "CtmErrorDict"),
("py:class", "kwargs"),
("py:class", "Labels"),
("py:class", "ScpType"),
("py:class", "multiprocessing.Value"),
52 changes: 26 additions & 26 deletions docs/source/external_links.py
@@ -17,7 +17,7 @@
:license: BSD, see LICENSE for details.
"""

from typing import Any, Dict, List, Tuple
from typing import Any

import sphinx
from docutils import nodes, utils
@@ -41,9 +41,9 @@ def model_role(
text: str,
lineno: int,
inliner: Inliner,
options: Dict = None,
content: List[str] = None,
) -> Tuple[List[Node], List[system_message]]:
options: dict = None,
content: list[str] = None,
) -> tuple[list[Node], list[system_message]]:
text = utils.unescape(text)
model_type, model_name = text.split("/")
full_url = f"https://github.com/MontrealCorpusTools/mfa-models/raw/main/{model_type}/{model_name.lower()}.zip"
@@ -58,9 +58,9 @@ def kaldi_steps_role(
text: str,
lineno: int,
inliner: Inliner,
options: Dict = None,
content: List[str] = None,
) -> Tuple[List[Node], List[system_message]]:
options: dict = None,
content: list[str] = None,
) -> tuple[list[Node], list[system_message]]:
text = utils.unescape(text)
full_url = f"https://github.com/kaldi-asr/kaldi/tree/master/egs/wsj/s5/steps/{text}.sh"
title = f"{text}.sh"
@@ -74,9 +74,9 @@ def kaldi_utils_role(
text: str,
lineno: int,
inliner: Inliner,
options: Dict = None,
content: List[str] = None,
) -> Tuple[List[Node], List[system_message]]:
options: dict = None,
content: list[str] = None,
) -> tuple[list[Node], list[system_message]]:
filename = utils.unescape(text)
full_url = f"https://github.com/kaldi-asr/kaldi/tree/master/egs/wsj/s5/utils/{filename}"
title = f"{text}"
@@ -90,9 +90,9 @@ def kaldi_steps_sid_role(
text: str,
lineno: int,
inliner: Inliner,
options: Dict = None,
content: List[str] = None,
) -> Tuple[List[Node], List[system_message]]:
options: dict = None,
content: list[str] = None,
) -> tuple[list[Node], list[system_message]]:
text = utils.unescape(text)
full_url = f"https://github.com/kaldi-asr/kaldi/tree/cbed4ff688a172a7f765493d24771c1bd57dcd20/egs/sre08/v1/sid/{text}.sh"
title = f"sid/{text}.sh"
@@ -106,9 +106,9 @@ def kaldi_docs_role(
text: str,
lineno: int,
inliner: Inliner,
options: Dict = None,
content: List[str] = None,
) -> Tuple[List[Node], List[system_message]]:
options: dict = None,
content: list[str] = None,
) -> tuple[list[Node], list[system_message]]:
text = utils.unescape(text)
t = text.split("#")
text = t[0]
@@ -129,9 +129,9 @@ def openfst_src_role(
text: str,
lineno: int,
inliner: Inliner,
options: Dict = None,
content: List[str] = None,
) -> Tuple[List[Node], List[system_message]]:
options: dict = None,
content: list[str] = None,
) -> tuple[list[Node], list[system_message]]:
text = utils.unescape(text)
full_url = f"https://www.openfst.org/doxygen/fst/html/{text}-main_8cc_source.html"
title = f"OpenFst {text} source"
@@ -145,9 +145,9 @@ def kaldi_src_role(
text: str,
lineno: int,
inliner: Inliner,
options: Dict = None,
content: List[str] = None,
) -> Tuple[List[Node], List[system_message]]:
options: dict = None,
content: list[str] = None,
) -> tuple[list[Node], list[system_message]]:
text = utils.unescape(text)
mapping = {
"bin": set(
@@ -378,9 +378,9 @@ def xref(
text: str,
lineno: int,
inliner: Inliner,
options: Dict = None,
content: List[str] = None,
) -> Tuple[List[Node], List[system_message]]:
options: dict = None,
content: list[str] = None,
) -> tuple[list[Node], list[system_message]]:

title = target = text
# look if explicit title and target are given with `foo <bar>` syntax
@@ -409,7 +409,7 @@ def get_refs(app):
xref.links = app.config.xref_links


def setup(app: Sphinx) -> Dict[str, Any]:
def setup(app: Sphinx) -> dict[str, Any]:
app.add_config_value("xref_links", {}, "env")
app.add_role("mfa_model", model_role)
app.add_role("kaldi_steps", kaldi_steps_role)
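
The annotation changes in this file replace `typing.Dict`, `typing.List`, and `typing.Tuple` with the built-in generics `dict`, `list`, and `tuple`. Subscripting the built-ins in annotations that are evaluated at definition time requires Python 3.9 or later (PEP 585), which is presumably related to the workflow's move from 3.8 to 3.9. The sketch below shows the style on a standalone function that mirrors the URL construction in `model_role`. The function name `build_model_url` and its return shape are assumptions made for illustration and are not part of `external_links.py`.

```python
# Illustrative helper; build_model_url is a hypothetical name, not MFA code.
def build_model_url(text: str) -> tuple[str, str]:
    """Split "<model_type>/<model_name>" and return (model_name, url).

    The built-in generic ``tuple[str, str]`` needs Python 3.9+ when the
    annotation is evaluated at definition time.
    """
    model_type, model_name = text.split("/")
    url = (
        "https://github.com/MontrealCorpusTools/mfa-models/raw/main/"
        f"{model_type}/{model_name.lower()}.zip"
    )
    return model_name, url


name, url = build_model_url("acoustic/English")
print(name, url)
```

As in the original role, the model name is lowercased only in the URL, so the display name keeps its capitalisation.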
2 changes: 1 addition & 1 deletion docs/source/first_steps/index.rst
@@ -17,7 +17,7 @@ There are several broad use cases that you might want to use MFA for. Take a lo

#. **Use case 1:** You have a speech corpus, the language involved is in the list of :ref:`pretrained_acoustic_models` and the list of :ref:`pretrained_dictionaries`.

#. Follow :ref:`first_steps_align_pretrained` to generate aligned TextGrids
#. Follow :ref:`first_steps_align_pretrained` to generate aligned TextGrids

#. **Use case 2:** You have a speech corpus, the language involved is in the list of :ref:`pretrained_acoustic_models` and the list of :ref:`pretrained_g2p`, but not on the list of :ref:`pretrained_dictionaries`.

20 changes: 20 additions & 0 deletions docs/source/installation.rst
@@ -34,6 +34,26 @@ In general, it's recommend to create a new environment. If you want to update,

Windows native install is not fully supported in 2.0. G2P functionality will be unavailable due to Pynini supporting only Linux and MacOS. To use G2P functionality on Windows, please set up the :xref:`wsl` and use the Bash console to continue the instructions.

Installing from source
======================

If the Conda installation above does not work or the binaries don't work on your system, you can try building Kaldi and OpenFst from source, along with MFA.

1. Download/clone the :xref:`kaldi_github` and follow the installation instructions
2. If you're on Mac or Linux and want G2P functionality, install :xref:`openfst`, :xref:`opengrm_ngram`, :xref:`baumwelch`, and :xref:`pynini`
3. Make sure all Kaldi and other third party executables are on the system path
4. Download/clone the :xref:`mfa_github` and install MFA via :code:`python setup.py install` or :code:`pip install -e .`
5. Double check everything's working on the console with :code:`mfa -h`

.. note::

You can also clone the conda forge feedstocks for `OpenFst <https://github.com/conda-forge/openfst-feedstock>`_, `SoX <https://github.com/conda-forge/sox-feedstock>`_, `Kaldi <https://github.com/conda-forge/kaldi-feedstock>`_, and `MFA <https://github.com/conda-forge/montreal-forced-aligner-feedstock>`_ and run them with `conda build <https://docs.conda.io/projects/conda-build/en/latest/>`_ to build for your specific system.
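
Step 3 of the source installation (having all executables on the system path) can be checked with a short script before running `mfa -h`. This is a minimal sketch using only the standard library. The list of executable names is an assumption for illustration: `mfa` comes from the instructions above, while `fstcompile` and `compute-mfcc-feats` are examples of OpenFst and Kaldi binaries, not a complete list of what MFA needs.

```python
import shutil


def find_missing(executables: list[str]) -> list[str]:
    """Return the names that cannot be found on the current PATH."""
    return [name for name in executables if shutil.which(name) is None]


# Example names only; adjust to the tools your installation actually requires.
candidates = ["mfa", "fstcompile", "compute-mfcc-feats"]
missing = find_missing(candidates)
print("Missing from PATH:", ", ".join(missing) if missing else "none")
```

Any name reported as missing points to a directory that still needs to be added to the path.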

MFA temporary files
===================

MFA uses a temporary directory for commands, which can be specified when running them with ``--temp_directory`` (or see :ref:`configuration`). It also uses a root directory to store global configuration settings and saved models. By default this root directory is ``~/Documents/MFA``, but if you would like to put it somewhere else, you can set the environment variable ``MFA_ROOT_DIR``. MFA will raise an error on load if it's unable to write to the specified root directory.
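
The root directory behaviour described above can be sketched as follows. This is not MFA's implementation. The function names `resolve_root_dir` and `ensure_writable`, and the choice of `PermissionError`, are assumptions made for illustration. Only the variable name `MFA_ROOT_DIR` and the default `~/Documents/MFA` come from the text.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_root_dir(environ: Optional[Mapping[str, str]] = None) -> Path:
    """Return the root directory, preferring MFA_ROOT_DIR when it is set."""
    env = os.environ if environ is None else environ
    configured = env.get("MFA_ROOT_DIR")
    if configured:
        return Path(configured).expanduser()
    # Default location stated in the documentation.
    return Path.home() / "Documents" / "MFA"


def ensure_writable(root: Path) -> Path:
    """Create the directory if needed and fail early if it is not writable."""
    root.mkdir(parents=True, exist_ok=True)
    if not os.access(root, os.W_OK):
        raise PermissionError(f"Cannot write to root directory: {root}")
    return root


print(resolve_root_dir({"MFA_ROOT_DIR": "/data/mfa"}))
print(resolve_root_dir({}))
```

Passing the environment as a mapping keeps the lookup easy to test without modifying the real process environment.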

Supported functionality
=======================

13 changes: 0 additions & 13 deletions docs/source/reference/abc.rst

This file was deleted.
