This file documents improvements made to the project in response to feedback from the DSCI 522 teaching team and peer reviews. Each change is linked to specific feedback and provides evidence of how the feedback was addressed.
It was suggested to include example usage screenshots to improve user experience.
Embedded the screenshot of example usage in the README.md file.
Commit: https://github.com/UBC-MDS/dsci-522-group-23/commit/10da7023edaf5fb57dc2e71f304a14059522f937
It was suggested to define the test statistics.
Added definitions of MSE, RMSE, and MAE to the report.
Commit: https://github.com/UBC-MDS/dsci-522-group-23/commit/0d44281e06e953d1f0ca6bd2a40a111a1cbf7242
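For readers who want the three metrics in concrete form, the following is a minimal sketch of their standard definitions in plain Python. It is not taken from the report; the function names and the sample grades are hypothetical and used only for illustration.

```python
import math


def mse(y_true, y_pred):
    """Mean squared error: the average of the squared residuals."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error: square root of MSE, in the units of the target."""
    return math.sqrt(mse(y_true, y_pred))


def mae(y_true, y_pred):
    """Mean absolute error: the average of the absolute residuals."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical grades, for illustration only (not project data).
y_true = [10, 12, 15, 8]
y_pred = [11, 12, 13, 10]

print(mse(y_true, y_pred), rmse(y_true, y_pred), mae(y_true, y_pred))
```

Because MSE squares each residual, it penalizes large errors more heavily than MAE does, which is why the two are usually reported together.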
It was suggested to correct several spelling and grammatical mistakes throughout the report.
Corrected all identified spelling mistakes (e.g., "behaviour" to "behavior," "distirbution" to "distribution"). Revised sentences with grammatical issues for clarity and readability.
Git issue: #63 (comment)
4. M1 feedback from teaching team: Versions are missing from environment file(s) for all Python packages
Versions are missing from environment file(s) for all Python packages.
Specified versions for all Python packages in the environment files.
Check the environment.yml: https://github.com/UBC-MDS/dsci-522-group-23/blob/main/environment.yml
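One way to confirm that no dependency is left without a version is a small line-based check like the sketch below. It is an assumption-laden illustration, not part of the repository: `unpinned_dependencies` is a hypothetical helper, the sample file content is made up, and the parsing is deliberately simple rather than a full YAML parser.

```python
import re


def unpinned_dependencies(env_text):
    """Return dependency entries that carry no version constraint.

    Line-based sketch: reads '- name' entries under the top-level
    'dependencies:' key and skips nested markers such as 'pip:'.
    """
    unpinned = []
    in_deps = False
    for raw in env_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        # An unindented, non-list line is a top-level key such as 'name:'.
        if not raw[0].isspace() and not line.startswith("-"):
            in_deps = line.startswith("dependencies:")
            continue
        if not in_deps or not line.startswith("- "):
            continue
        spec = line[2:].split("#", 1)[0].strip()  # drop inline comments
        if spec.endswith(":"):  # e.g. 'pip:' introduces a nested list
            continue
        if not re.search(r"[=<>]", spec):
            unpinned.append(spec)
    return unpinned


# Made-up file content, for illustration only.
sample = """\
name: example-env
channels:
  - conda-forge
dependencies:
  - python=3.11
  - pandas
  - pip:
    - some-package==1.0.0
    - another-package
"""

print(unpinned_dependencies(sample))
```

An empty list from a check like this would indicate that every listed package has some version constraint.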
- Target/response variable needs to be more clearly defined.
- Did not clearly identify and describe the dataset that was used to answer the question.
The target variable has been clearly defined, and the dataset used to answer the question has been identified and described in detail within the Introduction section.
Commit: https://github.com/UBC-MDS/dsci-522-group-23/commit/d06bd75d9236f7fa5e59c271ccf59bde1c43bca6
6. M2 feedback from teaching team: The platform key and value are missing from the docker-compose.yml file
The platform key and value are missing from the docker-compose.yml file, causing issues when running on different chip architectures.
Replaced the service name jupyter-notebook: with student-performance-predictor-env: to ensure a more specific and appropriate service name.
Commit: https://github.com/UBC-MDS/dsci-522-group-23/commit/d06bd75d9236f7fa5e59c271ccf59bde1c43bca6
Could not reproducibly run the analysis because the computational environment cannot be recreated from the provided instructions and/or environment specification files.
We replaced the process of building the image locally with pulling the pre-built image directly from our DockerHub repository.
Commit: https://github.com/UBC-MDS/dsci-522-group-23/commit/d06bd75d9236f7fa5e59c271ccf59bde1c43bca6
The section lacked depth in explaining coefficients, model limitations, and the role of multicollinearity. Additionally, the justification for using Ridge Regression and discussion of results needed more clarity.
Expanded the discussion of the model coefficients, explained the linearity assumption with examples, clarified how Ridge handles multicollinearity through coefficient shrinkage, and explained why Ridge Regression was chosen.
Resolved Git issue: #60
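The shrinkage effect mentioned above can be illustrated with a small closed-form example. This is a sketch under stated assumptions, not the project's pipeline: the two features are synthetic and nearly collinear, there is no intercept, and `ridge_2d` and `sq_norm` are hypothetical helpers that solve the ridge normal equations (XᵀX + αI)β = Xᵀy for exactly two features.

```python
def ridge_2d(x1, x2, y, alpha):
    """Closed-form ridge fit for two features, no intercept.

    Solves (X^T X + alpha * I) beta = X^T y by inverting the 2x2 matrix.
    With alpha = 0 this reduces to ordinary least squares.
    """
    a = sum(u * u for u in x1) + alpha
    b = sum(u * v for u, v in zip(x1, x2))
    d = sum(v * v for v in x2) + alpha
    r1 = sum(u * t for u, t in zip(x1, y))
    r2 = sum(v * t for v, t in zip(x2, y))
    det = a * d - b * b
    return ((d * r1 - b * r2) / det, (a * r2 - b * r1) / det)


def sq_norm(beta):
    """Squared Euclidean norm of a coefficient pair."""
    return beta[0] ** 2 + beta[1] ** 2


# Synthetic, nearly collinear features (illustration only, not project data).
x1 = [1.0, 2.0, 3.0, 4.0]
x2 = [1.1, 1.9, 3.2, 3.8]
y = [2.0, 4.1, 6.2, 7.9]

ols = ridge_2d(x1, x2, y, alpha=0.0)     # unpenalized fit
ridge = ridge_2d(x1, x2, y, alpha=10.0)  # penalized fit

print("alpha=0: ", ols, sq_norm(ols))
print("alpha=10:", ridge, sq_norm(ridge))
```

On this data the penalized coefficients have a smaller squared norm and lie closer to each other than the unpenalized ones: the penalty spreads weight across the correlated features instead of letting one absorb most of it.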
The correlations between G1, G2, G3 are very high. It would be best if you had studied the effects of features on G1/G2/G3 without considering the others.
We removed G1 and G2 from our final model to address multicollinearity and focus on the predictive power of G3.
Check our final report: https://ubc-mds.github.io/dsci-522-group-23/