🐈‍⬛ Comprehensive NLP pipeline with insightful abstractions, seamlessly streamlining experiment setup and execution while providing a robust solution for a diverse array of natural language processing tasks.

tamohannes/tmynNLP


tmynNLP 🐈‍⬛

A unified framework for NLP tasks using machine learning models, inspired by AllenNLP.

Setup

It is recommended to use a conda environment for this project.

Follow the steps below to get started.

Create a conda environment

conda create -n tmynnlp_env python=3.8

Install the package with the requirements

pip install -e .

To run the CLI from any location, add this alias to your .zshrc or .bash_profile:

alias tmynnlp='[ABSOLUTE_PATH_TO_THE_DIR]/tmynnlp/__main__.py'

CLI usage

Example: running experiments with a specific configuration

The trainer

tmynnlp train runs/runs.json --include_package document_classification
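
The README does not say what `--include_package` does. In AllenNLP, which this project names as its inspiration, the similarly named option imports the given package so that the classes it registers can be selected by name from a configuration file. The sketch below illustrates that general pattern only. `REGISTRY`, `register`, `include_package`, `build`, and the `DropNan` class are hypothetical names chosen for illustration and are not taken from tmynNLP's code.

```python
import importlib

# Hypothetical registry: maps (component kind, type name) to a class.
REGISTRY = {}


def register(kind, name):
    """Class decorator that records a class under a kind and a type name."""
    def decorator(cls):
        REGISTRY[(kind, name)] = cls
        return cls
    return decorator


def include_package(package_name):
    """Import a package so that any @register decorators inside it run."""
    return importlib.import_module(package_name)


def build(kind, config):
    """Instantiate the class selected by config["type"] with the remaining keys."""
    kwargs = dict(config)  # copy so the caller's dict is not modified
    cls = REGISTRY[(kind, kwargs.pop("type"))]
    return cls(**kwargs)


# In a real project this class would live inside the included package,
# and include_package("document_classification") would trigger its registration.
@register("preprocessor", "drop_nan")
class DropNan:
    """Illustrative behaviour (an assumption): drop rows containing None."""

    def __call__(self, rows):
        return [row for row in rows if None not in row.values()]


preprocessor = build("preprocessor", {"type": "drop_nan"})
print(preprocessor([{"text": "hello"}, {"text": None}]))
```

With a design like this, adding a new component means writing a class in your own package and decorating it, with no change to the framework itself.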

Sample runs.json file

[
    {
        "type": "experiment1",
        "num_epochs": 10,
        "batch_size": 32,
        "dataset_reader": {
            "type": "ted_multi",
            "train_data_path": "train",
            "valid_data_path": "validation",
            "mock_samples_num": 500,
            "preprocessor": {
                "type": "drop_nan"
            }
        },
        "tokenizer": {
            "type": "huggingface_tokenizer",
            "pretrained_model": "bert-base-cased"
        },
        "model": {
            "type": "huggingface_sequence_classifier",
            "arch": "xlm-roberta-base",
            "num_labels": 60
        },
        "tracker": {
            "type": "aim",
            "repo_path": ".tmp_aim"
        },
        "metrics": [
            {
                "type": "accuracy"
            },
            {
                "type": "f1",
                "average": "weighted"
            }
        ],
        "criterion": {
            "type": "CrossEntropyLoss"
        },
        "optimizer": {
            "type": "SGD",
            "lr": 0.0005
        },
        "lr_scheduler": {
            "type": "StepLR",
            "step_size": 1.0,
            "gamma": 0.1
        }
    }
]
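
To get a feel for what the optimizer and scheduler settings in the sample imply, the following sketch parses a trimmed copy of the configuration and prints the learning rate per epoch. It assumes that `StepLR` follows the semantics of PyTorch's `torch.optim.lr_scheduler.StepLR` (the rate is multiplied by `gamma` every `step_size` epochs), which the README does not confirm. The `step_lr` helper is a hypothetical stand-in written in plain Python, not part of tmynNLP.

```python
import json

# Trimmed copy of the sample configuration; a real run would read runs/runs.json.
RUNS_JSON = """
[
    {
        "type": "experiment1",
        "num_epochs": 10,
        "batch_size": 32,
        "optimizer": {"type": "SGD", "lr": 0.0005},
        "lr_scheduler": {"type": "StepLR", "step_size": 1.0, "gamma": 0.1}
    }
]
"""


def step_lr(initial_lr, step_size, gamma, epoch):
    """Learning rate at a given epoch under a StepLR-style decay."""
    return initial_lr * gamma ** (epoch // step_size)


runs = json.loads(RUNS_JSON)  # the file holds a list, one entry per experiment
for run in runs:
    lr0 = run["optimizer"]["lr"]
    sched = run["lr_scheduler"]
    print(f'{run["type"]}: {run["num_epochs"]} epochs, batch size {run["batch_size"]}')
    for epoch in range(run["num_epochs"]):
        lr = step_lr(lr0, sched["step_size"], sched["gamma"], epoch)
        print(f"  epoch {epoch}: lr = {lr:.1e}")
```

Under that assumption, a `step_size` of 1 with a `gamma` of 0.1 divides the learning rate by ten after every epoch, so by the tenth epoch it is on the order of 5e-13. Because the file is a list, several experiments can be placed side by side in one runs.json.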
