Project proposal feedback #1

Open
swapnens opened this issue Apr 5, 2020 · 0 comments
swapnens commented Apr 5, 2020

The idea is nice, but it needs to be elaborated further to become a really challenging and interesting ML problem.

First, the questions could be articulated more precisely. E.g., to which planet does the DNA most likely belong? Which other planets seem to host similar beings (e.g., they could be allied because they share a similar species)?

How difficult the problem becomes depends on the size of the DNA as well as on the number of planets / species.
Score for this problem: 3.5
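
To make the two example questions above concrete, here is a minimal sketch that treats the first as a classification task and the second as a similarity task. Everything in it is an assumption made for illustration: the planet names, the idea that each planet is summarized by a nucleotide frequency profile, and the helper names `most_likely_planet` and `most_similar_planets` are all hypothetical and not part of the proposal.

```python
import math

# Hypothetical per-planet nucleotide frequencies (each profile sums to 1).
PROFILES = {
    "Zorg": {"A": 0.40, "C": 0.10, "G": 0.10, "T": 0.40},
    "Vega": {"A": 0.10, "C": 0.40, "G": 0.40, "T": 0.10},
    "Kron": {"A": 0.35, "C": 0.15, "G": 0.15, "T": 0.35},
}


def log_likelihood(dna, profile):
    """Log-probability of the sequence if every base is drawn independently
    from the planet's frequency profile."""
    return sum(math.log(profile[base]) for base in dna)


def most_likely_planet(dna, profiles):
    """Question 1: the planet whose profile explains the sequence best."""
    return max(profiles, key=lambda planet: log_likelihood(dna, profiles[planet]))


def profile_distance(p, q):
    """Euclidean distance between two frequency profiles."""
    return math.sqrt(sum((p[base] - q[base]) ** 2 for base in "ACGT"))


def most_similar_planets(planet, profiles):
    """Question 2: the other planets, ordered from most to least similar."""
    others = [other for other in profiles if other != planet]
    return sorted(
        others,
        key=lambda other: profile_distance(profiles[planet], profiles[other]),
    )


print(most_likely_planet("AATTAATA", PROFILES))
print(most_similar_planets("Zorg", PROFILES))
```

With longer sequences and more planets the same two questions stay well defined, but the profiles overlap more and the task gets harder, which is the scaling effect mentioned above.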

Details are missing on how you will generate the data: what priors will you generate the data from? How will you introduce anomalies, missing data, and noise?
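
As a rough sketch of what such a description could look like, the snippet below draws species labels from an explicit prior and then corrupts clean feature vectors with noise, missing values, and anomalies. The function names `sample_species` and `corrupt`, the parameter names, and all default rates are assumptions chosen for illustration only.

```python
import random


def sample_species(prior, n, seed=0):
    """Draw n species labels from a categorical prior given as {name: probability}."""
    rng = random.Random(seed)
    names = list(prior)
    return rng.choices(names, weights=[prior[name] for name in names], k=n)


def corrupt(samples, noise_std=0.1, missing_rate=0.05, anomaly_rate=0.02,
            anomaly_scale=10.0, seed=0):
    """Return a corrupted copy of `samples` (a list of lists of floats) together
    with the indices of the rows that were turned into anomalies."""
    rng = random.Random(seed)
    corrupted, anomaly_rows = [], []
    for index, row in enumerate(samples):
        is_anomaly = rng.random() < anomaly_rate
        if is_anomaly:
            anomaly_rows.append(index)
        new_row = []
        for value in row:
            if rng.random() < missing_rate:
                new_row.append(None)  # missing measurement
                continue
            value += rng.gauss(0.0, noise_std)  # additive measurement noise
            if is_anomaly:
                value *= anomaly_scale  # push the row far away from the bulk
            new_row.append(value)
        corrupted.append(new_row)
    return corrupted, anomaly_rows


labels = sample_species({"species_a": 0.7, "species_b": 0.2, "species_c": 0.1}, n=10)
noisy, anomalies = corrupt([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
print(labels, noisy, anomalies)
```

Stating the prior and the three corruption rates explicitly, as the parameters do here, is the kind of detail the proposal should spell out.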

The approach you describe sounds too easy: you could indeed use make_blobs() from sklearn.datasets, which does what you are expecting. However, you need to make things more "complicated". Clusters in the feature space shouldn't be spherical but should have complex shapes, with the cluster of one species "inside" the cluster of another species.
Score for this problem: 1.5
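
To illustrate the kind of non-spherical, nested structure meant here, the following minimal sketch generates one species as a compact blob and a second species as a noisy ring around it. It uses only the standard library; the function name `make_nested_species` and all parameter values are assumptions for illustration. sklearn.datasets also ships make_circles() and make_moons(), which produce similar shapes out of the box.

```python
import math
import random


def make_nested_species(n_per_species=200, inner_std=0.5, ring_radius=3.0,
                        ring_std=0.3, seed=0):
    """Return (points, labels): species 0 is a compact blob at the origin,
    species 1 is a noisy ring that surrounds it."""
    rng = random.Random(seed)
    points, labels = [], []
    for _ in range(n_per_species):
        # Species 0: isotropic Gaussian blob centred at the origin.
        points.append((rng.gauss(0.0, inner_std), rng.gauss(0.0, inner_std)))
        labels.append(0)
    for _ in range(n_per_species):
        # Species 1: uniform random angle, radius jittered around ring_radius.
        angle = rng.uniform(0.0, 2.0 * math.pi)
        radius = rng.gauss(ring_radius, ring_std)
        points.append((radius * math.cos(angle), radius * math.sin(angle)))
        labels.append(1)
    return points, labels


def centroid(points):
    """Mean position of a list of 2D points."""
    count = len(points)
    return (sum(p[0] for p in points) / count, sum(p[1] for p in points) / count)


points, labels = make_nested_species()
blob = [p for p, label in zip(points, labels) if label == 0]
ring = [p for p, label in zip(points, labels) if label == 1]
# Both centroids lie close to the origin even though the species do not overlap.
print(centroid(blob), centroid(ring))
```

Because both species share roughly the same centroid, methods that assume compact, spherical clusters cannot separate them, which is what makes this setting more interesting than make_blobs().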

The coronavirus wasn't enough? Now we also have to deal with nasty aliens? :-))

Total: 8/10
