[Implementation example] Attend and Diagnose: Clinical Time Series Analysis Using Attention Models

SAnD

AAAI 2018 Attend and Diagnose: Clinical Time Series Analysis Using Attention Models

Warning: This code is UNOFFICIAL.

Paper: Attend and Diagnose: Clinical Time Series Analysis Using Attention Models

To run this code, you need to download a dataset and write your own experiment code.

from comet_ml import Experiment
from SAnD.core.model import SAnD
from SAnD.utils.trainer import NeuralNetworkClassifier

model = SAnD( ... )
clf = NeuralNetworkClassifier( ... )
clf.fit( ... )

Installation

git clone https://github.com/khirotaka/SAnD.git

Requirements

  • Python 3.6
  • Comet.ml
  • PyTorch v1.1.0 or later

Simple Usage

Here's a brief overview of how to use this project to solve a classification task.

Download this project

First, create an empty directory; this example calls it "playground".
Then run git init and git submodule add to register the SAnD project as a submodule.

$ mkdir playground/
$ cd playground/
$ git init
$ git submodule add https://github.com/khirotaka/SAnD.git

Now you're ready to use SAnD in your project.

Preparing the Dataset

Prepare a dataset of your choice.
Remember that the SAnD model expects a three-dimensional input of shape [N, seq_len, features].

This example uses torch.randn() to generate a pseudo dataset.

from comet_ml import Experiment

import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import TensorDataset, DataLoader

from SAnD.core.model import SAnD
from SAnD.utils.trainer import NeuralNetworkClassifier


x_train = torch.randn(1024, 256, 23)    # [N, seq_len, features]
x_val = torch.randn(128, 256, 23)       # [N, seq_len, features]
x_test =  torch.randn(512, 256, 23)     # [N, seq_len, features]

# torch.randint excludes the upper bound, so this draws labels 0..9 (10 classes)
y_train = torch.randint(0, 10, (1024, ))
y_val = torch.randint(0, 10, (128, ))
y_test = torch.randint(0, 10, (512, ))


train_ds = TensorDataset(x_train, y_train)
val_ds = TensorDataset(x_val, y_val)
test_ds = TensorDataset(x_test, y_test)

train_loader = DataLoader(train_ds, batch_size=128)
val_loader = DataLoader(val_ds, batch_size=128)
test_loader = DataLoader(test_ds, batch_size=128)
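
Real recordings often arrive as one long [T, features] series rather than as ready-made [N, seq_len, features] batches. The following is a minimal sketch of one way to cut such a series into overlapping windows with torch.Tensor.unfold. The helper name make_windows and the step value are assumptions made for illustration; they are not part of SAnD.

```python
import torch


def make_windows(series: torch.Tensor, seq_len: int, step: int) -> torch.Tensor:
    """Slice a [T, features] series into windows of shape [N, seq_len, features]."""
    # unfold appends the window axis last: [N, features, seq_len]
    windows = series.unfold(0, seq_len, step)
    # move the window axis to the middle: [N, seq_len, features]
    return windows.permute(0, 2, 1).contiguous()


series = torch.randn(1000, 23)                    # one long recording: [T, features]
x = make_windows(series, seq_len=256, step=128)   # hypothetical window settings
print(x.shape)                                    # torch.Size([6, 256, 23])
```

With T = 1000, seq_len = 256 and step = 128, this yields (1000 - 256) // 128 + 1 = 6 windows; trailing samples that do not fill a whole window are dropped.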

Note:
In my experience, SAnD seems to perform better on problems with a large number of features.

Training SAnD model using Trainer

Finally, train the SAnD model using the included NeuralNetworkClassifier.
Of course, you can also use a well-known training tool such as PyTorch Lightning instead.
The included NeuralNetworkClassifier depends on comet.ml's logging service.

in_feature = 23
seq_len = 256
n_heads = 32
factor = 32
num_class = 10
num_layers = 6

clf = NeuralNetworkClassifier(
    SAnD(in_feature, seq_len, n_heads, factor, num_class, num_layers),
    nn.CrossEntropyLoss(),
    optim.Adam,
    optimizer_config={"lr": 1e-5, "betas": (0.9, 0.98), "eps": 4e-09, "weight_decay": 5e-4},
    experiment=Experiment()
)

# training network
clf.fit(
    {"train": train_loader,
     "val": val_loader},
    epochs=200
)

# evaluating
clf.evaluate(test_loader)

# save
clf.save_to_file("save_params/")

For the actual task, choose the appropriate hyperparameters for your model and optimizer.
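
If you would rather not depend on comet.ml, a plain PyTorch loop can stand in for the included trainer. The sketch below is an assumption-laden illustration, not the trainer shipped with this project: StandInModel is a hypothetical placeholder that only mimics the [N, seq_len, features] input and [N, num_class] output shapes, and train_one_epoch and accuracy are helper names invented here. Swap in the real model to use it.

```python
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import TensorDataset, DataLoader


class StandInModel(nn.Module):
    """Hypothetical placeholder: [N, seq_len, features] -> [N, num_class]."""

    def __init__(self, in_feature: int, seq_len: int, num_class: int):
        super().__init__()
        self.fc = nn.Linear(in_feature * seq_len, num_class)

    def forward(self, x):
        return self.fc(x.flatten(1))


def train_one_epoch(model, loader, criterion, optimizer):
    """Run one pass over the loader and return the mean per-sample loss."""
    model.train()
    total_loss, n = 0.0, 0
    for x, y in loader:
        optimizer.zero_grad()
        loss = criterion(model(x), y)
        loss.backward()
        optimizer.step()
        total_loss += loss.item() * x.size(0)
        n += x.size(0)
    return total_loss / n


@torch.no_grad()
def accuracy(model, loader):
    """Fraction of samples whose arg-max prediction matches the label."""
    model.eval()
    correct, total = 0, 0
    for x, y in loader:
        correct += (model(x).argmax(dim=1) == y).sum().item()
        total += y.size(0)
    return correct / total


torch.manual_seed(0)
x = torch.randn(64, 16, 4)              # small pseudo dataset: [N, seq_len, features]
y = torch.randint(0, 3, (64,))          # labels in 0..2
loader = DataLoader(TensorDataset(x, y), batch_size=16)

model = StandInModel(in_feature=4, seq_len=16, num_class=3)
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=1e-3)

losses = [train_one_epoch(model, loader, criterion, optimizer) for _ in range(5)]
acc = accuracy(model, loader)
print(losses[-1], acc)
```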

Regression Task

There are two ways to use SAnD in a regression task.

  1. Specify the number of output dimensions in num_class.
  2. Subclass SAnD and replace ClassificationModule with RegressionModule.

This section shows the second approach.

from SAnD.core.model import SAnD
from SAnD.core.modules import RegressionModule


class RegSAnD(SAnD):
    def __init__(self, *args, **kwargs):
        super(RegSAnD, self).__init__(*args, **kwargs)
        # These values are read from kwargs, so pass them as keyword arguments.
        d_model = kwargs.get("d_model")
        factor = kwargs.get("factor")
        output_size = kwargs.get("n_class")    # number of regression outputs

        # Replace the classification head with a regression head.
        self.clf = RegressionModule(d_model, factor, output_size)


model = RegSAnD(
    input_features=..., seq_len=..., n_heads=..., factor=...,
    n_class=..., n_layers=..., d_model=...
)

The contents of ClassificationModule and RegressionModule are almost the same, so the first approach is recommended.
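
For the first approach, the main change on the training side is the loss: the targets become float tensors with the same shape as the model output, and nn.CrossEntropyLoss gives way to a regression loss such as nn.MSELoss. The sketch below assumes the classification head returns raw, unnormalized outputs of shape [N, num_class]; a random tensor stands in for the model output, so none of this exercises SAnD itself.

```python
import torch
import torch.nn as nn

output_size = 1    # the value you would pass as num_class

# Stand-in for model(x); a real model output would have the same shape.
preds = torch.randn(128, output_size, requires_grad=True)   # [N, output_size]
targets = torch.randn(128, output_size)                     # float targets, same shape

criterion = nn.MSELoss()
loss = criterion(preds, targets)    # scalar mean squared error
loss.backward()                     # gradients flow back as in classification
print(loss.item())
```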

Please let me know if my code has been used in a product or in published research.
It's very encouraging :)

Author

Hirotaka Kawashima (川島 寛隆)

License

Copyright (c) 2019 Hirotaka Kawashima
Released under the MIT license
