-
Notifications
You must be signed in to change notification settings - Fork 0
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Merge pull request #90 from UBC-MDS/final-fixes
Final Fixes
- Loading branch information
Showing
15 changed files
with
187 additions
and
62 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,111 @@ | ||
# Changelog | ||
*Modified from our original combined [issue](https://github.com/UBC-MDS/DSCI522-2425-25-heart_disease_predictor/issues/64)* | ||
## Feedback Sources | ||
|
||
### [Peer Review Feedback](https://github.com/UBC-MDS/data-analysis-review-2024/issues/12#issuecomment-2537520503) | ||
|
||
### [Milestone 1 Feedback](https://github.com/UBC-MDS/DSCI522-2425-25-heart_disease_predictor/issues/35) | ||
|
||
### Milestone 2 Feedback: | ||
|
||
- Feedback from Milestone 2 was to change the docker-compose file to not specify latest. This was addressed in more details in this [Issue](https://github.com/UBC-MDS/DSCI522-2425-25-heart_disease_predictor/issues/88) | ||
|
||
--- | ||
|
||
In general, all feedback from Milestones and most from peer review were addressed in fixes. A minority few from peer reviews were addressed by explaining our rationale for we didn't follow their suggestions. These specific explanations each have their own GitHub Issues and are linked to within their respective bullet point below. | ||
|
||
*Note: Some fixes may have been addressed together in one big PR* | ||
|
||
## Feedback Fixes, Commits, and Pull Requests | ||
|
||
### **Report** | ||
|
||
- Ensure EDA is conducted exclusively on the training dataset to prevent data leakage. | ||
- Addressed by: | ||
- [This commit](https://github.com/UBC-MDS/DSCI522-2425-25-heart_disease_predictor/pull/79/commits/53de6cb46fa01639ee9fe745a43c6669ea679050) | ||
- [This commit](https://github.com/UBC-MDS/DSCI522-2425-25-heart_disease_predictor/pull/79/commits/15cf4c5586995dd7154f4ee238a08d10b5813d93) | ||
|
||
- [Expand the "Methods & Results" section with more explanatory notes and keep the "Discussion" focused on providing additional details. Discuss the motivation behind your scoring metrics in the context of "Heart Disease Prediction".](https://github.com/UBC-MDS/DSCI522-2425-25-heart_disease_predictor/pull/89) | ||
|
||
- [Provide more detail in the "Methods" section on how the dataset was split for training and testing (e.g., 70-30 split or cross-validation)](https://github.com/UBC-MDS/DSCI522-2425-25-heart_disease_predictor/pull/89) | ||
|
||
- [The rendered reports in the repository are different from those created by make all. They are not up-to-date.](https://github.com/UBC-MDS/DSCI522-2425-25-heart_disease_predictor/pull/90) | ||
|
||
- [References: Please ensure all softwares used in generating the analysis and report are properly attributed.](https://github.com/UBC-MDS/DSCI522-2425-25-heart_disease_predictor/pull/89) | ||
|
||
--- | ||
|
||
### **Documentation** | ||
|
||
- [Pin `requests` version to `==2.24.0` instead of `>=2.24.0`.](https://github.com/UBC-MDS/DSCI522-2425-25-heart_disease_predictor/commit/92e9e74bc2b7b78f7fcb1cf1958d242d9187b836) | ||
|
||
- [In the dockerfile, the port is mapped to 34651. However, the message in terminal still shows 8888. This may confuse people who do not carefully read instructions. Is there any way to resolve it?](https://github.com/UBC-MDS/DSCI522-2425-25-heart_disease_predictor/issues/75) | ||
|
||
- [Split the quarto render commands into separate lines for clarity:](https://github.com/UBC-MDS/DSCI522-2425-25-heart_disease_predictor/issues/74) | ||
|
||
```bash | ||
quarto render heart_disease_predictor_report.qmd --to html | ||
quarto render heart_disease_predictor_report.qmd --to pdf | ||
``` | ||
|
||
--- | ||
|
||
### **Code** | ||
|
||
- [Ensure consistent script naming (e.g., some scripts use `_heart_disease_predictor`, others do not).](https://github.com/UBC-MDS/DSCI522-2425-25-heart_disease_predictor/commit/c1df4330520f8f65728530e5e9b12ed9b8465860) | ||
|
||
- [Implement more specific exceptions (e.g., `InvalidColumnTypeError`, `MissingValueError`) instead of generic ones like `ValueError`.](https://github.com/UBC-MDS/DSCI522-2425-25-heart_disease_predictor/commit/d1e3849dfffb0aa2e32b599f7ac7f58c42afb12d) | ||
|
||
- Move helper functions into separate modules for better organization. | ||
- Addressed by: | ||
- [This commit](https://github.com/UBC-MDS/DSCI522-2425-25-heart_disease_predictor/commit/fc240272756a1cde597eacd3924520ff1713c34f) | ||
- [This commit](https://github.com/UBC-MDS/DSCI522-2425-25-heart_disease_predictor/commit/be9ba5c8b969124a002e793d4f22840f44945650) | ||
|
||
- Modularize functions to improve code clarity and reusability & Abstract code from the `main()` function into separate functions to improve structure and readability. | ||
- Addressed by: | ||
- [This PR](https://github.com/UBC-MDS/DSCI522-2425-25-heart_disease_predictor/pull/66) | ||
- [This PR](https://github.com/UBC-MDS/DSCI522-2425-25-heart_disease_predictor/pull/79) | ||
- [This PR](https://github.com/UBC-MDS/DSCI522-2425-25-heart_disease_predictor/pull/84) | ||
- [This PR](https://github.com/UBC-MDS/DSCI522-2425-25-heart_disease_predictor/pull/83) | ||
|
||
- Unit Testing: Adding unit tests for each validation function would ensure the correctness of the logic when applied to different datasets. It can help catch edge cases and confirm that each function works as expected. | ||
- Addressed by: | ||
- [This PR](https://github.com/UBC-MDS/DSCI522-2425-25-heart_disease_predictor/pull/66) | ||
- [This PR](https://github.com/UBC-MDS/DSCI522-2425-25-heart_disease_predictor/pull/79) | ||
- [This PR](https://github.com/UBC-MDS/DSCI522-2425-25-heart_disease_predictor/pull/84) | ||
- [This PR](https://github.com/UBC-MDS/DSCI522-2425-25-heart_disease_predictor/pull/83) | ||
|
||
--- | ||
|
||
### Milestone 1 Specific Feedback: | ||
|
||
#### The below 6 are all addressed in [this PR](https://github.com/UBC-MDS/DSCI522-2425-25-heart_disease_predictor/pull/51) | ||
|
||
- Add proper email to code of conduct | ||
- Remove .ds_store file | ||
- Discuss importance/limitation of findings in Abstract/summary | ||
- Clearly identify question in introduction | ||
- Add all references to the dataset in Introduction | ||
- More clearly define target/response variable in Introduction | ||
|
||
#### The below 3 are all addressed in [this PR](https://github.com/UBC-MDS/DSCI522-2425-25-heart_disease_predictor/pull/89) | ||
|
||
- Important methodology descriptions missing (e.g., did not explain in narrative what metric was being used for model parameter optimization) | ||
- Some important results not displayed | ||
- Findings from project need to be linked back to application domain and questions | ||
|
||
--- | ||
|
||
### **Organization** | ||
|
||
- [Move all environment files into a separate folder to tidy up the repository. #72](https://github.com/UBC-MDS/DSCI522-2425-25-heart_disease_predictor/issues/72) | ||
- [Remove redundant lock files](https://github.com/UBC-MDS/DSCI522-2425-25-heart_disease_predictor/issues/71) | ||
- [Organize files in the `report` folder into subfolders for better structure.](https://github.com/UBC-MDS/DSCI522-2425-25-heart_disease_predictor/commit/704fb0f46221944bcf70efff16d43fae6b21e9d5) | ||
|
||
--- | ||
|
||
### **Warnings** | ||
|
||
- [Address and resolve the warning messages that appear when running `make all`. Follow the instructions provided in the messages to eliminate them.](https://github.com/UBC-MDS/DSCI522-2425-25-heart_disease_predictor/commit/ecf10b313713e36ddaed93f853acd98397680faa) | ||
|
||
--- |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Large diffs are not rendered by default.
Oops, something went wrong.
Large diffs are not rendered by default.
Oops, something went wrong.
Binary file not shown.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file not shown.
Binary file not shown.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
4 changes: 2 additions & 2 deletions
4
results/tables/logistic_regression/logistic_regression_cv_results.csv
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
50 changes: 25 additions & 25 deletions
50
results/tables/logistic_regression/logreg_coefficients.csv
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,26 +1,26 @@ | ||
Feature,Coefficient | ||
onehotencoder__chest_pain_type_asymptomatic,1.1838672562033263 | ||
onehotencoder__thalassemia_reversable defect,0.9657968582786678 | ||
onehotencoder__num_of_vessels_2.0,0.9142733069914617 | ||
onehotencoder__exercise_induced_angina_yes,0.6288076002703239 | ||
standardscaler__sex,0.6045607179889861 | ||
onehotencoder__num_of_vessels_1.0,0.5644600036470457 | ||
onehotencoder__slope_flat,0.3503515446664692 | ||
standardscaler__resting_blood_pressure,0.3404955465111282 | ||
standardscaler__cholesterol,0.32447332163941606 | ||
standardscaler__st_depression,0.3043895729937989 | ||
onehotencoder__slope_downsloping,0.24053223862408943 | ||
onehotencoder__rest_ecg_left ventricular hypertrophy,0.1960784505781805 | ||
onehotencoder__chest_pain_type_atypical angina,0.19048123124637503 | ||
onehotencoder__rest_ecg_ST-T wave abnormality,0.17303560432690512 | ||
standardscaler__age,0.05633345043781977 | ||
standardscaler__fasting_blood_sugar,-0.16843768707539558 | ||
standardscaler__max_heart_rate,-0.1864102268478972 | ||
onehotencoder__num_of_vessels_3.0,-0.22288813387665835 | ||
onehotencoder__rest_ecg_normal,-0.3690988378536457 | ||
onehotencoder__thalassemia_fixed defect,-0.38841108925053175 | ||
onehotencoder__chest_pain_type_typical angina,-0.4424102779839873 | ||
onehotencoder__thalassemia_normal,-0.5773705519766958 | ||
onehotencoder__slope_upsloping,-0.590868566239119 | ||
onehotencoder__chest_pain_type_non-anginal pain,-0.9319229924142742 | ||
onehotencoder__num_of_vessels_0.0,-1.2558299597104094 | ||
onehotencoder__chest_pain_type_asymptomatic,1.1838672562061838 | ||
onehotencoder__thalassemia_reversable defect,0.9657968582775411 | ||
onehotencoder__num_of_vessels_2.0,0.914273306990384 | ||
onehotencoder__exercise_induced_angina_yes,0.6288076002699492 | ||
standardscaler__sex,0.6045607179898266 | ||
onehotencoder__num_of_vessels_1.0,0.5644600036491417 | ||
onehotencoder__slope_flat,0.35035154466635776 | ||
standardscaler__resting_blood_pressure,0.340495546509579 | ||
standardscaler__cholesterol,0.3244733216396428 | ||
standardscaler__st_depression,0.304389572995754 | ||
onehotencoder__slope_downsloping,0.2405322386242731 | ||
onehotencoder__rest_ecg_left ventricular hypertrophy,0.19607845057825268 | ||
onehotencoder__chest_pain_type_atypical angina,0.1904812312471063 | ||
onehotencoder__rest_ecg_ST-T wave abnormality,0.17303560432627466 | ||
standardscaler__age,0.05633345043789224 | ||
standardscaler__fasting_blood_sugar,-0.16843768707612017 | ||
standardscaler__max_heart_rate,-0.18641022684835776 | ||
onehotencoder__num_of_vessels_3.0,-0.22288813387853396 | ||
onehotencoder__rest_ecg_normal,-0.3690988378526316 | ||
onehotencoder__thalassemia_fixed defect,-0.388411089249939 | ||
onehotencoder__chest_pain_type_typical angina,-0.442410277986553 | ||
onehotencoder__thalassemia_normal,-0.5773705519757063 | ||
onehotencoder__slope_upsloping,-0.590868566238734 | ||
onehotencoder__chest_pain_type_non-anginal pain,-0.9319229924148412 | ||
onehotencoder__num_of_vessels_0.0,-1.255829959709096 |
Binary file modified
BIN
-56 Bytes
(100%)
results/tables/logistic_regression/logreg_coefficients.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
File renamed without changes.
File renamed without changes.