Contributor: Luka Bulić
Supervisor: Assoc. Prof. Mirjana Domazet-Lošo, PhD
Tumor Identification Based on Machine Learning Analysis of Microarray Data was an undergraduate machine learning project carried out at the Faculty of Electrical Engineering and Computing, University of Zagreb (courses "Bioinformatics 1" and "Undergraduate thesis"). The project aimed to identify tumors from large DNA microarray datasets using machine learning. The database used for model training and testing was made publicly available as part of the following research:
Feltes, B.C.; Chandelier, E.B.; Grisci, B.I.; Dorn, M. CuMiDa: An Extensively Curated Microarray Database for Benchmarking and Testing of Machine Learning Approaches in Cancer Research. Journal of Computational Biology, 2019.
The project was implemented in Python using common machine learning libraries such as pandas, XGBoost, and scikit-learn. The methods and results are described in the uploaded documentation.
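
As a rough illustration of the workflow described above, the following is a minimal sketch and not the project's actual code. It assumes a CuMiDa-style table with a `samples` identifier column, a `type` label column, and one column per gene probe; that layout, the synthetic data generator, the function names, and all hyperparameters are assumptions made for this example. XGBoost is used when it is installed; otherwise scikit-learn's gradient boosting classifier is swapped in as a stand-in.

```python
import numpy as np
import pandas as pd
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder

try:
    # Preferred model, as named in the project description.
    from xgboost import XGBClassifier as Model
except ImportError:
    # Stand-in when XGBoost is not installed (assumption for this sketch).
    from sklearn.ensemble import GradientBoostingClassifier as Model


def make_synthetic_microarray(n_samples=120, n_genes=200, seed=0):
    """Build a fake expression table in an assumed CuMiDa-like layout."""
    rng = np.random.default_rng(seed)
    labels = np.array(["normal", "tumor"])[rng.integers(0, 2, n_samples)]
    expression = rng.normal(size=(n_samples, n_genes))
    # Make the first five hypothetical genes informative for the tumor class.
    expression[labels == "tumor", :5] += 3.0
    df = pd.DataFrame(expression, columns=[f"gene_{i}" for i in range(n_genes)])
    df.insert(0, "type", labels)
    df.insert(0, "samples", np.arange(n_samples))
    return df


def train_and_score(df, seed=0):
    """Train a boosted-tree classifier and return held-out accuracy."""
    features = df.drop(columns=["samples", "type"])
    # Encode string labels as integers 0..k-1, which XGBoost expects.
    target = LabelEncoder().fit_transform(df["type"])
    X_train, X_test, y_train, y_test = train_test_split(
        features, target, test_size=0.25, stratify=target, random_state=seed
    )
    model = Model(n_estimators=50, max_depth=3, random_state=seed)
    model.fit(X_train, y_train)
    return accuracy_score(y_test, model.predict(X_test))


accuracy = train_and_score(make_synthetic_microarray())
print(f"held-out accuracy: {accuracy:.2f}")
```

To try this on a real dataset, the synthetic table could be replaced with `pd.read_csv(...)` on a downloaded CuMiDa file, provided its column names match the layout assumed here. The uploaded documentation remains the reference for the methods and results actually used in the project.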