danypark91/hd_log_reg

Logistic Regression using Heart Disease Dataset

This project applies logistic regression to a dataset of heart disease patients and builds a regression model in RStudio to predict which patients are likely to have heart disease.

Tech/framework used

  • RStudio
  • R Markdown
  • Excel

R Libraries Used

  • library(MASS)
  • library(caret)
  • library(Amelia)
  • library(caTools)
  • library(pROC)
  • library(ROCR)
  • library(plyr)
  • library(GGally)
  • library(ggsci)
  • library(cowplot)
  • library(ggpubr)

Installation of R packages

rpack <- c("MASS", "caret", "Amelia", "caTools", "pROC", "ROCR", "plyr", "GGally", "ggsci", "cowplot", "ggpubr")

install.packages(rpack)

Dataset

The original UCI dataset contains 76 attributes that describe a patient's condition. The dataset used in this project is from Kaggle - Heart Disease UCI. It is a subset of 14 attributes in which each instance (row) represents one patient.

Project Description

This project applies logistic regression to the dataset. It begins by importing the dataset from the local device and checking whether it requires cleaning. The cleaned data is then split into training and test sets at a ratio of 3 to 1. The best-fit logistic regression model is derived from train_df. The model then undergoes statistical tests to assess the significance of its fit. Finally, the model is applied to test_df to evaluate how well it predicts unseen patients.
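
The workflow above can be sketched in a few lines of base R. This is a minimal illustration, not the project's actual script: the data frame is simulated so the example runs without the CSV file, the column names `age`, `thalach`, and `target` and the coefficients used to simulate the outcome are assumptions, and a plain random split stands in for whatever splitting method the project uses.

```r
# Illustrative sketch only: simulated data stands in for the heart disease CSV.
set.seed(42)

n <- 400
heart_df <- data.frame(
  age     = round(rnorm(n, mean = 54, sd = 9)),    # assumed column name
  thalach = round(rnorm(n, mean = 150, sd = 23))   # assumed column name
)

# Simulated binary outcome; these coefficients are arbitrary, not estimates.
lin_pred <- -0.05 * (heart_df$age - 54) + 0.04 * (heart_df$thalach - 150)
heart_df$target <- rbinom(n, size = 1, prob = plogis(lin_pred))

# 1. Check whether cleaning is needed: count missing values per column.
na_counts <- colSums(is.na(heart_df))

# 2. Split 3:1 into training and test sets (simple random split).
train_idx <- sample(seq_len(n), size = floor(0.75 * n))
train_df  <- heart_df[train_idx, ]
test_df   <- heart_df[-train_idx, ]

# 3. Fit a logistic regression on the training set.
fit <- glm(target ~ age + thalach, data = train_df, family = binomial)

# 4. Inspect the Wald tests for each coefficient.
coef_table <- summary(fit)$coefficients
print(coef_table)

# 5. Predict on the test set and compute accuracy at a 0.5 cutoff.
test_prob <- predict(fit, newdata = test_df, type = "response")
test_pred <- as.integer(test_prob > 0.5)
accuracy  <- mean(test_pred == test_df$target)
print(accuracy)
```

With the listed packages installed, `caTools::sample.split()` could replace the random split in step 2; it preserves the class ratio of the outcome in both sets, which a plain `sample()` call does not guarantee.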
