Hello contributors, meet Vishesh, an undergraduate student at IIITA. During the technical round of an internship selection session, Vishesh is presented with several tasks that require Python and machine learning expertise. Vishesh, who completed his education online, is having difficulty solving these tasks. Help him clear this round and move a step further in the selection process.
Preprocessing data is one of the most important tasks of an ML engineer, and it takes about 50-60% of the time spent developing any project. Data is collected by scraping the web or by other means, and the dataset is often not in the form you would like it to be. The dataset provided to you has the same problem. State-of-the-art machine learning frameworks reduce preprocessing code to a few lines, but to understand the concepts better you need to build things from scratch. In this repo you are therefore only allowed to use numpy and pandas to complete the tasks.
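To give a feel for what "from scratch" means here, the following is a minimal sketch of three common preprocessing steps written with only numpy and pandas. The function names, the toy DataFrame, and its `age` and `city` columns are assumptions made for illustration; they are not taken from the actual tasks or dataset.

```python
import numpy as np
import pandas as pd


def impute_mean(df, column):
    """Return a copy of df with NaNs in `column` replaced by the column mean."""
    out = df.copy()
    out[column] = out[column].fillna(out[column].mean())
    return out


def min_max_scale(df, column):
    """Return a copy of df with `column` rescaled to the range [0, 1]."""
    out = df.copy()
    values = out[column].to_numpy(dtype=float)
    lo, hi = values.min(), values.max()
    if hi == lo:
        # A constant column has no spread; map every value to 0.
        out[column] = np.zeros_like(values)
    else:
        out[column] = (values - lo) / (hi - lo)
    return out


def one_hot(df, column):
    """Return a copy of df with `column` replaced by one 0/1 column per category."""
    out = df.copy()
    for category in sorted(out[column].dropna().unique()):
        out[f"{column}_{category}"] = (out[column] == category).astype(int)
    return out.drop(columns=[column])


# Hypothetical toy data, only for illustration.
raw = pd.DataFrame({
    "age": [20.0, np.nan, 40.0],
    "city": ["Delhi", "Pune", "Delhi"],
})

clean = one_hot(min_max_scale(impute_mean(raw, "age"), "age"), "city")
print(clean)
```

Each helper returns a new DataFrame instead of modifying its input, which makes it easier to compare intermediate results while working through a notebook.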
You can look at the Issues section to see the available tasks. Reply "claim" or "I want to solve this issue" on the issue you want to solve, and one of the mentors can then assign it to you. Once you are assigned the issue, make a copy of the notebook and edit the copy. DO NOT EDIT THE ORIGINAL NOTEBOOK.
You will be provided with three files in most of the tasks:

File | Purpose |
---|---|
A .csv file | The dataset you will be working on. |
A .ipynb notebook | Helps you complete the task. All the steps and rules are stated in the notebook. |
A .py file | Contains the functions that are used as checkpoints in the task. |
A utils.py file | For some tasks, a Python file containing functions that will be used to complete the task. |

- Issues are first come, first served: the person who comments first gets the issue. If they cannot solve it within two days, the issue will be assigned to someone else.
- Only the mentors listed below can assign issues:
  - pSN0W
  - parth1007
  - himanshu370
  - utkarsh1210-tech
- Points are awarded only if the issue is solved completely. No points are given for partial completion.
- You are given helper functions as checkpoints for your work. Passing these checkpoints does not mean that your solution is correct; a solution is considered correct only after verification by one of the mentors.
- You are only allowed to use numpy and pandas.
- No changes should be made to cells marked as UNIQUE in the notebook.
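
The rule about checkpoints can be illustrated with a small sketch. The function `checkpoint_no_missing` below is hypothetical and is not one of the helpers shipped with the tasks; it only shows how a structural check can pass even when the values themselves are wrong, which is why a mentor still verifies every solution.

```python
import numpy as np
import pandas as pd


def checkpoint_no_missing(df, expected_columns):
    """Hypothetical checkpoint: True if df has the expected columns and no NaNs."""
    has_columns = list(df.columns) == list(expected_columns)
    no_missing = not df.isna().any().any()
    return bool(has_columns and no_missing)


# Correctly scaled values.
good = pd.DataFrame({"age": [0.0, 0.5, 1.0]})
# Well-formed but incorrect values: the checkpoint cannot tell the difference.
wrong = pd.DataFrame({"age": [7.0, 7.0, 7.0]})
# Still contains a missing value, so the checkpoint fails.
incomplete = pd.DataFrame({"age": [0.0, np.nan, 1.0]})

for name, frame in [("good", good), ("wrong", wrong), ("incomplete", incomplete)]:
    print(name, checkpoint_no_missing(frame, ["age"]))
```

Treat a passing checkpoint as a sign that the work is on the right track, not as proof of correctness.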
Once you have solved the issue, open your notebook in Google Colab and generate a share link that grants access to the person named at the bottom of the notebook or listed below:

Name | Email |
---|---|
Pratyaksh Singh | iib2020015@iiita.ac.in |
Himanshu Bhawnani | iib2020035@iiita.ac.in |
Parth Soni | iec2020132@iiita.ac.in |
Utkarsh | iec2020060@iiita.ac.in |