Dual Intent and Entity Transformer (DIET), PyTorch version
It is implemented as a pytorch-lightning based module.
- Training
```python
import os

from DIET import trainer

trainer.train(
    file_path,
    # training args
    train_ratio=0.8,
    batch_size=32,
    optimizer="Adam",
    intent_optimizer_lr=1e-5,
    entity_optimizer_lr=2e-5,
    checkpoint_path=os.getcwd(),
    max_epochs=10,
    # model args
    num_encoder_layers=3,
    **kwargs
)
```
`file_path` indicates a markdown-format NLU dataset that follows the RASA NLU training data format.
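For reference, a minimal sketch of such a file in the legacy RASA markdown NLU style (the intent names, entity label, and example sentences here are illustrative only, not from this repository):

```md
## intent:ask_weather
- 오늘 [서울](location) 날씨 어때?
- 내일 날씨 알려줘

## intent:greet
- 안녕하세요
```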
All parameters passed to the trainer, including kwargs, are saved as model hparams.
Users can check these parameters in the checkpoint file and the tensorboard logs.
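For example, the saved hparams can be inspected directly from a checkpoint file (a sketch assuming a standard pytorch-lightning checkpoint; the file path is illustrative):

```python
import torch

# pytorch-lightning checkpoints store the saved hparams under "hyper_parameters"
ckpt = torch.load("lightning_logs/version_0/checkpoints/epoch=9.ckpt", map_location="cpu")
print(ckpt["hyper_parameters"])  # e.g. train_ratio, batch_size, num_encoder_layers, ...
```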
- Inference
```python
from DIET import Inferencer

inferencer = Inferencer(checkpoint_path)
result = inferencer.inference(text, intent_topk=5)  # text: str
```
As the model in this repository is implemented with pytorch-lightning, it generates checkpoint files automatically (the checkpoint path can be set in the training step).
After loading a checkpoint, query text through the inferencer. The result contains an intent ranking, and `intent_topk` controls how many top-ranked intent confidences are returned.
The inference result will look like the following:
{ "text": "오늘 서울 날씨 어때?", "intent": { "confidence": 0.6323, "name": "ask_weather" }, "intent_ranking": [ { "confidence": 0.6323, "name": "ask_weather" }, ... ], "entities": [ { "start": 3, "end": 4, "value": "서울", "entity": "location" }, ... ] }
The model in this repository is adapted from the Rasa DIET classifier.
This blog explains how it works in the Rasa framework.
However, for a simpler implementation and faster training and inference, there are several changes here.
- There is no CRF layer on top of the TransformerEncoder layer.
In real training runs, the CRF pipeline takes a lot of training time, but it is not clear that the CRF layer really learns token relations well or that it is really needed (the transformer's self-attention likely does something similar).
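As a rough illustration of this simplification (a sketch, not the actual module from this repository), entity tags can be predicted with a plain per-token linear head and cross-entropy loss on the TransformerEncoder output:

```python
import torch.nn as nn

class EntityTagger(nn.Module):
    """Per-token entity classifier without a CRF layer (illustrative sketch)."""

    def __init__(self, hidden_dim: int, num_entity_tags: int, num_encoder_layers: int = 3):
        super().__init__()
        encoder_layer = nn.TransformerEncoderLayer(d_model=hidden_dim, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=num_encoder_layers)
        self.entity_head = nn.Linear(hidden_dim, num_entity_tags)

    def forward(self, token_embeddings, entity_labels=None):
        hidden = self.encoder(token_embeddings)   # (batch, seq_len, hidden_dim)
        logits = self.entity_head(hidden)         # (batch, seq_len, num_entity_tags)
        loss = None
        if entity_labels is not None:
            # plain per-token cross-entropy; no CRF transition scores
            loss = nn.functional.cross_entropy(
                logits.view(-1, logits.size(-1)), entity_labels.view(-1)
            )
        return logits, loss
```

Dropping the CRF removes the forward-algorithm and Viterbi decoding steps from training and inference, at the cost of ignoring explicit tag-transition constraints.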
- It uses a character tokenizer to improve Korean language parsing.
Unlike English and other languages, Korean characters can be composed and decomposed within the characters themselves (each syllable block is built from smaller jamo units). Considering this feature, a character-based tokenizer is applied.
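A minimal sketch of the kind of character-level tokenization this implies (the class and its vocabulary handling are illustrative, not the repository's actual tokenizer):

```python
class CharTokenizer:
    """Illustrative character-level tokenizer: each character becomes one token."""

    def __init__(self, texts, unk_token="[UNK]", pad_token="[PAD]"):
        chars = sorted({ch for text in texts for ch in text})
        self.vocab = {pad_token: 0, unk_token: 1}
        self.vocab.update({ch: idx + 2 for idx, ch in enumerate(chars)})
        self.unk_id = self.vocab[unk_token]

    def tokenize(self, text):
        return list(text)  # "오늘 서울" -> ["오", "늘", " ", "서", "울"]

    def encode(self, text):
        return [self.vocab.get(ch, self.unk_id) for ch in self.tokenize(text)]
```

Tokenizing per character also keeps entity start/end offsets aligned with character positions in the input text, as in the inference result above.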
- There is no mask loss.
Related to the point above, the model does not use any pre-trained embeddings or tokenizer, so the masking technique is hard to apply.