cilab-ufersa/euthyroid_sick_syndrome

Euthyroid sick syndrome classification with machine learning approaches 🔬

In this project, we classify patients with euthyroid sick syndrome (ESS) using machine learning approaches. The dataset comes from the UCI Machine Learning Repository and contains 3772 instances and 25 attributes. It is imbalanced: 95% of the instances are labeled "negative" and 5% are labeled "positive". The goal of this project is to build a model that classifies patients with ESS with high accuracy.
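Because of this imbalance, accuracy alone can look strong even for a model that never detects ESS. The sketch below is only an illustration of that arithmetic, not part of the project code: it assumes the "5%" figure is approximate (the exact class counts are not stated here) and uses a hypothetical helper name, `majority_baseline`.

```python
def majority_baseline(n_total, positive_rate):
    """Score a classifier that always predicts "negative".

    positive_rate is an approximation taken from the README (5%),
    so the derived counts are illustrative, not exact dataset counts.
    """
    n_pos = round(n_total * positive_rate)
    n_neg = n_total - n_pos

    # An always-negative model has no true or false positives.
    tp, fp = 0, 0
    tn, fn = n_neg, n_pos

    accuracy = (tp + tn) / n_total
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {
        "positives": n_pos,
        "negatives": n_neg,
        "accuracy": accuracy,
        "recall": recall,
    }


baseline = majority_baseline(n_total=3772, positive_rate=0.05)
print(f"accuracy = {baseline['accuracy']:.4f}, recall = {baseline['recall']:.4f}")
```

A baseline like this reaches roughly 95% accuracy with zero recall, which is why precision, recall, and F1-Score are reported alongside accuracy in the results.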

Prerequisites

To run the project, you need:

  • Python 3.6+
  • pip 3+
  • VirtualEnvWrapper is recommended but not mandatory

Requirements

$ pip install -r requirements.txt

About

Euthyroid is a term used to describe normal thyroid function. The thyroid is a gland located in the neck that produces hormones that regulate the body's metabolism. These hormones, called triiodothyronine (T3) and thyroxine (T4), help to control the body's energy levels and metabolism, as well as heart rate and body temperature.

A euthyroid state means that the thyroid is functioning normally and producing the appropriate amount of hormones. The levels of T3 and T4 are within the normal range and the thyroid-stimulating hormone (TSH) produced by the pituitary gland is also within the normal range. This is the typical state for most people, and having euthyroid status is important for maintaining overall health and well-being.

However, if the thyroid gland is underactive (hypothyroidism) or overactive (hyperthyroidism), the levels of T3, T4, and TSH are affected, which can cause symptoms such as fatigue, weight gain or loss, changes in heart rate, and many others. Hypothyroidism is usually treated with hormone replacement therapy, whereas hyperthyroidism is typically treated with antithyroid medication, radioactive iodine, or surgery.

Some of the attributes in the dataset

  • Levothyroxine (T4/T4U)
  • Triiodothyronine (T3)
  • Total T4 (TT4)
  • Free T4 Index (FTI)
  • Thyroid Stimulating Hormone (TSH)

We used the attributes above to build the classification model. They were chosen because they are the most important attributes in the dataset and because all of them can be measured with a blood test.
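As a rough illustration of working with these attributes, the sketch below reads the five hormone measurements from CSV text using only the standard library. Everything specific in it is an assumption: the column names (`TSH`, `T3`, `TT4`, `T4U`, `FTI`, `label`), the use of `?` for missing measurements, and the three sample rows, which are invented for the example and are not taken from the dataset.

```python
import csv
import io

# Hypothetical column names for the five blood-test attributes.
HORMONE_COLUMNS = ["TSH", "T3", "TT4", "T4U", "FTI"]

# Invented rows for illustration only; "?" is assumed to mark a missing value.
SAMPLE_CSV = """age,sex,TSH,T3,TT4,T4U,FTI,label
41,F,1.3,2.5,125,1.14,109,negative
23,F,4.1,2,102,?,?,negative
70,M,0.7,?,80,0.7,115,positive
"""


def parse_value(raw):
    """Convert a raw cell to float, mapping blank or '?' cells to None."""
    raw = raw.strip()
    return None if raw in ("", "?") else float(raw)


def load_hormone_features(text):
    """Return a list of (features, is_positive) pairs from CSV text."""
    rows = []
    for record in csv.DictReader(io.StringIO(text)):
        features = {name: parse_value(record[name]) for name in HORMONE_COLUMNS}
        rows.append((features, record["label"] == "positive"))
    return rows


rows = load_hormone_features(SAMPLE_CSV)
for features, is_positive in rows:
    print(features, "positive" if is_positive else "negative")
```

Keeping missing values as `None` rather than dropping rows leaves the imputation choice to a later step, which matters when one class has few examples.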

Publications related to this project

Cavalcante, Caio, Vinicius Almeida, Marcos Barros, Nathalee Lima, and Rosana Rego. "Thyroid Syndrome Detection using Machine Learning Algorithms: A Comparative Analysis." In: Congresso Brasileiro de Inteligência Computacional, 2024. Anais do XVI Congresso Brasileiro de Inteligência Computacional. p. 1.

Rego, R. C. B., Vinicius A. Almeida, Caio M. V. Cavalcante, and Nathalee C. A. Lima. "Diagnostic Support System for Euthyroid Sick Syndrome based on Machine Learning Algorithms Approaches." In: International Conference on Intelligent Systems and New Applications, 2023, Liverpool. ICISNA 23 Proceedings Book. Liverpool, 2023. v. 1. pp. 259-264.

Almeida, Vinicius A., Caio M. V. Cavalcante, Nathalee C. A. Lima, and Rosana C. B. Rego. "Classificação da Síndrome do Doente Eutireoideo com Algoritmos de Machine Learning: Uma Aplicação de Suporte ao Diagnóstico." In: IV Congresso Brasileiro Interdisciplinar em Ciência e Tecnologia, 2023. Anais do Congresso Brasileiro Interdisciplinar em Ciência e Tecnologia, 2023.

Almeida, Vinicius A., and Rosana C. B. Rego. "Síndrome do Doente Eutireoidiano: Análise de Indicadores Importantes com Machine Learning." In: VI Encontro De Computação Do Oeste Potiguar (ECOP), 2023, Pau dos Ferros. Anais Do Encontro De Computação Do Oeste Potiguar, 2023. v. 1.

Part 1: Results: IVCobiCET

We used 4 different machine learning approaches to build a model that can classify patients with ESS with high accuracy. The approaches are:

  • Naive Bayes
  • Logistic Regression
  • Decision Tree
  • Random Forest

The results are shown in the table below:

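| Approach            | Accuracy | Precision | Recall | F1-Score |
|---------------------|----------|-----------|--------|----------|
| Naive Bayes         | 0.8493   | 0.7963    | 0.9285 | 0.8573   |
| Logistic Regression | 0.9198   | 0.9063    | 0.9321 | 0.9190   |
| Decision Tree       | 0.9817   | 0.9719    | 0.9911 | 0.9814   |
| Random Forest       | 0.9834   | 0.9839    | 0.9821 | 0.9830   |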
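The F1-Score is the harmonic mean of precision and recall, F1 = 2PR / (P + R). As a small worked check, the sketch below recomputes it from the precision and recall values reported for Part 1. The function name `f1_score` and the dictionary layout are choices made for this illustration and are not taken from the project code.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1-Score) copied from the Part 1 results.
PART1_RESULTS = {
    "Naive Bayes": (0.7963, 0.9285, 0.8573),
    "Logistic Regression": (0.9063, 0.9321, 0.9190),
    "Decision Tree": (0.9719, 0.9911, 0.9814),
    "Random Forest": (0.9839, 0.9821, 0.9830),
}

# Recompute F1 from the rounded precision and recall of each approach.
recomputed = {
    name: round(f1_score(p, r), 4) for name, (p, r, _) in PART1_RESULTS.items()
}

for name, value in recomputed.items():
    reported = PART1_RESULTS[name][2]
    print(f"{name}: recomputed F1 = {value:.4f}, reported F1 = {reported:.4f}")
```

Since the inputs are already rounded to four decimals, small differences in the last digit would not by themselves indicate an error.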

Access the detailed results

Part 2: Results: ICISNA 2023

We used 5 different machine learning approaches to build a model that can classify patients with ESS with high accuracy. The approaches are:

  • Logistic Regression
  • Random Forest
  • LightGBM
  • XGBoost
  • Stack Ensemble based on Random Forest and XGBoost

The results are shown in the table below:

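| Approach            | Accuracy | Recall | Precision | F1-Score |
|---------------------|----------|--------|-----------|----------|
| Logistic Regression | 91.98%   | 93.21% | 90.62%    | 91.90%   |
| Random Forest       | 98.34%   | 98.21% | 98.38%    | 98.30%   |
| LightGBM            | 97.64%   | 97.32% | 97.64%    | 97.58%   |
| XGBoost             | 98.60%   | 98.77% | 98.57%    | 98.57%   |
| Stack Ensemble      | 98.78%   | 98.75% | 98.75%    | 98.75%   |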

Scientific Developers

👤 Vinicius Almeida: vinicius45anacleto@gmail.com

👤 Caio Moisés: caio.cavalcante@alunos.ufersa.edu.br

Technical and scientific support

👤 Rosana Rego

Support by