This project analyzes and predicts heart attacks using an Artificial Neural Network (ANN). The goal is a predictive model that estimates the likelihood of a heart attack from health metrics and patient data.
The dataset used for this project is sourced from the Heart Attack Analysis & Prediction Dataset. It includes features such as age, sex, chest pain type, resting blood pressure, serum cholesterol, fasting blood sugar, resting electrocardiographic results, maximum heart rate achieved, exercise-induced angina, ST depression induced by exercise, the slope of the peak exercise ST segment, number of major vessels, and thalassemia.
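As a rough illustration of how the features listed above can be separated from the label, here is a minimal sketch using only the standard library. The target column name `output` and the helper name `load_features_and_target` are assumptions made for this example; check the header of your copy of the dataset before relying on them.

```python
import csv


def load_features_and_target(lines, target_column="output"):
    """Split CSV rows into feature dicts and integer labels.

    `lines` is any iterable of text lines, such as an open file.
    `target_column` is an assumed name for the label column.
    """
    reader = csv.DictReader(lines)
    features, targets = [], []
    for row in reader:
        # Remove the label from the row, then treat the rest as numeric features
        targets.append(int(row.pop(target_column)))
        features.append({name: float(value) for name, value in row.items()})
    return features, targets
```

With the dataset saved as a CSV file in the project directory (the file name is up to you), the function can be called on the open file handle, for example inside a `with open(...) as f:` block.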
To run this project locally, follow these steps:
- Clone the repository:
git clone https://github.com/Islam-hady9/Heart-Attack-Analysis-Prediction-using-ANN.git
- Navigate to the project directory:
cd Heart-Attack-Analysis-Prediction-using-ANN
- Install the required dependencies:
pip install -r requirements.txt
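After installing, a quick check that the main libraries are importable can save a failed notebook run. The sketch below assumes the notebook needs `numpy`, `sklearn`, and `tensorflow`, which is inferred from the code in this README rather than from `requirements.txt`; adjust the list to match that file.

```python
import importlib.util

# Assumed module names, inferred from the imports used in this README
REQUIRED_MODULES = ("numpy", "sklearn", "tensorflow")


def find_missing_modules(modules=REQUIRED_MODULES):
    """Return the names of modules that cannot be found in this environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing_modules()
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are installed.")
```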
To run the entire Jupyter Notebook for analysis and prediction, follow these steps:
- Ensure the dataset is in the correct format and available in the project directory.
- Start Jupyter Notebook:
jupyter notebook
- Open the `Heart Attack Analysis & Prediction using ANN.ipynb` file from the Jupyter Notebook dashboard.
- Run all cells in the notebook to perform the data analysis, model training, and prediction steps.
The ANN model is constructed using the following layers:
- Input Layer: Corresponding to the number of features in the dataset.
- Hidden Layers: Two fully connected hidden layers (32 and 16 units) with ReLU activation functions.
- Output Layer: A single neuron with a sigmoid activation function to output the probability of a heart attack.
The model is trained using the Adam optimizer and binary cross-entropy loss function.
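To make the output activation and the loss concrete, the following sketch computes a sigmoid and the mean binary cross-entropy by hand. It is a plain-Python illustration of the formulas, not the TensorFlow implementation; the clipping constant `eps` is an assumption chosen to avoid `log(0)`.

```python
import math


def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1), in a numerically stable way."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def binary_cross_entropy(y_true, y_prob, eps=1e-7):
    """Mean of -[y*log(p) + (1-y)*log(1-p)] over all samples."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # clip to keep log() finite
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)
```

A confident correct prediction (for example `p = 0.9` for a positive case) yields a small loss, while a confident wrong one yields a large loss, which is what pushes the network toward well-calibrated probabilities during training.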
Here is a simplified snippet that builds the model and evaluates it with 5-fold cross-validation:
import numpy as np
import tensorflow as tf
from sklearn.model_selection import KFold

# X_train and y_train are assumed to be NumPy arrays prepared earlier in the notebook
# Set the random seed for reproducibility
tf.random.set_seed(42)

# Define the number of folds for KFold cross-validation
n_splits = 5
kfold = KFold(n_splits=n_splits, shuffle=True, random_state=42)

# Prepare to collect scores and histories
accuracies = []
all_histories = []

# KFold Cross Validation
for train_index, val_index in kfold.split(X_train):
    # Split data into this fold's training and validation parts
    X_train_kfold, X_val_kfold = X_train[train_index], X_train[val_index]
    y_train_kfold, y_val_kfold = y_train[train_index], y_train[val_index]

    # Create a new instance of the model (to reinitialize weights)
    ANN_model = tf.keras.Sequential([
        tf.keras.layers.Dense(32, activation="relu", input_shape=(X_train.shape[1],)),
        tf.keras.layers.Dense(16, activation="relu"),
        tf.keras.layers.Dense(1, activation="sigmoid")
    ])

    # Compile the model
    ANN_model.compile(loss="binary_crossentropy",
                      optimizer=tf.keras.optimizers.Adam(),
                      metrics=["accuracy"])

    # Early stopping callback: stop when val_loss has not improved for 10 epochs
    early_stopping = tf.keras.callbacks.EarlyStopping(
        monitor="val_loss",
        patience=10,
        restore_best_weights=True
    )

    # Fit the model
    history = ANN_model.fit(X_train_kfold, y_train_kfold,
                            epochs=100,
                            validation_data=(X_val_kfold, y_val_kfold),
                            callbacks=[early_stopping],
                            verbose=0)  # Set verbose to 0 to reduce output

    # Collect the history from each fold
    all_histories.append(history)

    # Evaluate the model on the validation set
    scores = ANN_model.evaluate(X_val_kfold, y_val_kfold, verbose=0)
    accuracies.append(scores[1])  # evaluate() returns [loss, accuracy]

# Print the accuracy for each fold
print("Accuracy for each fold:", accuracies)

# Print the average accuracy
print("Average accuracy:", np.mean(accuracies))
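The early stopping callback used above can be hard to picture from its parameters alone. The sketch below mimics the patience logic on a list of validation losses; it illustrates the idea and is not Keras's actual implementation. The function name `early_stopping_epoch` is hypothetical.

```python
def early_stopping_epoch(val_losses, patience):
    """Return (stop_epoch, best_epoch) for a sequence of validation losses.

    Training stops once the loss has failed to improve for `patience`
    consecutive epochs. With restore_best_weights=True, the weights kept
    would be those from `best_epoch`. Epochs are 0-indexed.
    """
    best_loss = float("inf")
    best_epoch = -1
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            # Improvement: remember it and reset the patience counter
            best_loss = loss
            best_epoch = epoch
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    # Never ran out of patience: training ends at the last epoch
    return len(val_losses) - 1, best_epoch
```

With `patience=10` and `epochs=100`, as in the snippet above, a fold typically stops well before epoch 100 once the validation loss plateaus, and the model is rolled back to its best epoch.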
The model achieves an accuracy of 88.5% on the test set, with a precision of 87% and a recall of 87%.
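For reference, accuracy, precision, and recall can be derived from the predicted probabilities as sketched below. The 0.5 decision threshold and the helper name `classification_metrics` are assumptions for this example; the notebook may compute these metrics differently.

```python
def classification_metrics(y_true, y_prob, threshold=0.5):
    """Compute accuracy, precision, and recall for binary labels.

    `threshold` turns predicted probabilities into 0/1 predictions.
    """
    y_pred = [1 if p >= threshold else 0 for p in y_prob]
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }
```

In a medical setting, recall is often the metric to watch most closely, since a false negative means a patient at risk goes unflagged.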
- Islam Abd_Elhady Hassanein (Project Lead)
- Enas Ragab Abdel_Latif
- Mariam Tarek Saad
This project is licensed under the MIT License. See the LICENSE file for more details.