First off, this is a work-in-progress repository. Each entry has the following bullet points:
Type
Competition overview
Winner blog
Winner notebook/code/kernel
Other notebook/code/kernel
Solution thread
Take home message
(Taken directly from the authors with some modifications)
The list of competitions was taken from this reference#1 and reference#2. Generally, at the end of every Kaggle competition the winners share their solutions. The goal of this repository is to offer a quick reference guide to what matters the most: their kernel and the take home message. Competitions where neither the code nor the approach was described were omitted.
- Competitions on Kaggle are classified into different types according to their reward: Knowledge, Jobs, and Money. Knowledge competitions are meant for beginners who are looking to get started. These are a good fit for a beginner, because you can find a lot of articles and sample solutions explaining how to get a good score.
- After getting familiar with the platform and how to approach a competition, you can join a live competition and participate.
- Kaggle has a ranking system which categorises users into Novice (for recently joined users), Contributor, Expert, Master, and Grandmaster for each of the four paradigms, with Grandmaster being the highest rank achievable.
- The Kaggle leaderboard has a public and private component to prevent participants from “overfitting” to the leaderboard. If your model is “overfit” to a dataset then it is not generalizable outside of the dataset you trained it on. This means that your model would have low accuracy on another sample of data taken from a similar dataset.
- Kaggle Kernels is a cloud computational environment that enables reproducible and collaborative analysis. Kernels supports scripts in R and Python, Jupyter Notebooks, and RMarkdown reports
- How I Made It To The Kaggle Leaderboard
- How to (almost) win Kaggle competitions
- How to “farm” Kaggle in the right way
- What does it take to win a Kaggle competition? Let’s hear it from the winner himself
- How We Placed in Kaggle Competition Top 4%
- Competing in a data science contest without reading the data
- Model Selection Tips From Competitive Machine Learning
- How to kick ass in competitive machine learning
- How to Select Your Final Models in a Kaggle Competition
- Bo | GitHub
- Chris Deotte
- DungNB | GitHub
- Jean-François Puget
- NguyenThanhNhan
- yelan
- SeungKee
- Tom Van de Wiele | GitHub
- Prashant Banerjee
- Firat Gonen
- Laura Fink
- Janio Martinez Bachmann
- With pip:
pip install kaggle
- How to resolve `kaggle.json not found`: go to Kaggle's homepage www.kaggle.com -> Your Account -> Create New API Token. This will download a ready-to-go JSON file to place in your `[user-home]/.kaggle` folder. If there is no `.kaggle` folder yet, please create it first; however, it is highly likely that the folder is already there, especially if you have already tried something like `kaggle competitions download -c competition_name`.
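Not official tooling, just a convenience check: a minimal Python sketch (assuming a Unix-like home directory layout) that verifies `kaggle.json` is where the CLI expects it and creates the `.kaggle` folder with restrictive permissions if it is missing. The token file itself still has to be downloaded from your Kaggle account page.

```python
# Minimal sketch: verify the Kaggle API credentials are where the CLI expects them.
# Assumes a Unix-like system; adjust the path manually on Windows.
import os
from pathlib import Path

kaggle_dir = Path.home() / ".kaggle"
token_path = kaggle_dir / "kaggle.json"

kaggle_dir.mkdir(exist_ok=True)      # create ~/.kaggle if it does not exist yet

if token_path.exists():
    os.chmod(token_path, 0o600)      # keep the API token readable only by you
    print(f"Found credentials at {token_path}")
else:
    print(f"Missing {token_path}: use 'Create New API Token' on your Kaggle account page")
```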
You have two options to send over your submission: 1) directly from a Kaggle kernel, or 2) via their API. The latter is the one I prefer: I like to do all the wrangling and modelling on my own setup and then send the submission file directly.
- How to submit from kaggle kernel
- Kaggle API wiki
- Step-by-step manual submission to Kaggle:
- How to download the datasets?
kaggle competitions download -c house-prices-advanced-regression-techniques
- Where do I get the exact name of the competition? Check the URL like this:
https://www.kaggle.com/c/house-prices-advanced-regression-techniques/submissions
- See your submissions history:
kaggle competitions submissions house-prices-advanced-regression-techniques
- How to submit your file via Kaggle API:
kaggle competitions submit house-prices-advanced-regression-techniques -f submission.csv -m "Submission_No_1"
- heamy A set of useful tools for competitive data science. Automatic caching (data preprocessing, predictions from models) Ensemble learning (stacking, blending, weighted average, etc.).
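For illustration, here is a minimal weighted-average blending sketch in plain scikit-learn/NumPy, the kind of step that libraries such as heamy automate (this is not heamy's own API; the models, weights and data are arbitrary choices):

```python
# Minimal sketch of weighted-average blending, the kind of step libraries like heamy automate.
# Models, weights and the synthetic data are arbitrary choices for illustration.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error

X, y = make_regression(n_samples=2000, n_features=20, noise=10.0, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

models = [RandomForestRegressor(n_estimators=100, random_state=0),
          GradientBoostingRegressor(random_state=0),
          Ridge(alpha=1.0)]
weights = np.array([0.4, 0.4, 0.2])   # fixed weights; in practice tuned via CV

preds = np.column_stack([m.fit(X_tr, y_tr).predict(X_te) for m in models])
blend = preds @ weights

print("MAE per model:", [round(mean_absolute_error(y_te, p), 2) for p in preds.T])
print("MAE blended  :", round(mean_absolute_error(y_te, blend), 2))
```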
This competition requires contestants to forecast the voting for this year's Eurovision Song Contest in Norway on May 25th, 27th and 29th.
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message: CV was paramount to avoid overfitting and provide an indication of which model would perform well on new data. It is stated that studying the voting patterns for all countries provided valuable insights.
This contest requires competitors to predict the likelihood that an HIV patient's infection will become less severe, given a small dataset and limited clinical information.
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message: Make sure that all areas of the dataset are randomly partitioned. In order to do machine learning correctly, it is important to have your training data closely match the test dataset. Furthermore, recursive feature elimination was mentioned as one of the factors that helped win the competition.
Part one requires competitors to predict 518 tourism-related time series. The winner of this competition will be invited to contribute a discussion paper to the International Journal of Forecasting.
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message: A weighted combination of three predictors turned out to be the best approach for forecasting.
Part two requires competitors to predict 793 tourism-related time series. The winner of this competition will be invited to contribute a discussion paper to the International Journal of Forecasting.
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message: The mindset was not to concentrate on overall accuracy, but on how to prevent worst-case events. This was achieved with an ensemble of algorithms.
The goal of this contest is to predict short term movements in stock prices. The winners of this contest will be honoured at the INFORMS Annual Meeting in Austin, Texas (November 7-10).
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message: NA
This competition aims to discover whether other approaches can predict the outcome of chess games more accurately than the workhorse Elo rating system.
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message: The winning algorithm, called Elo++, is characterised by an l2 regularization technique that avoids overfitting. This was paramount given the extremely small dataset. Overfitting is a big problem for rating systems. The regularization takes into account the number of games per player, the recency of these games and the ratings of the opponents of each player. The intuition is that any rating system should “trust” more the ratings of players who have played a lot of recent games versus the ratings of players who have played a few old games. The extent of regularization is controlled using a single parameter, which was optimised via CV.
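As a toy illustration of the take-home message (my own simplification, not the winner's Elo++ code), the sketch below fits one rating per player by gradient ascent on a logistic win model with a single global l2 penalty, which shrinks the ratings of players with little data toward the mean; the recency and opponent weighting described above are omitted.

```python
# Sketch of an l2-regularized logistic rating model in the spirit of Elo++ (simplified:
# a single global l2 penalty, no recency/opponent weighting). The game data is synthetic.
import numpy as np

rng = np.random.default_rng(0)
n_players, n_games = 50, 2000
true_rating = rng.normal(0.0, 1.0, n_players)

white = rng.integers(0, n_players, n_games)
black = rng.integers(0, n_players, n_games)
p_white_wins = 1.0 / (1.0 + np.exp(-(true_rating[white] - true_rating[black])))
outcome = (rng.random(n_games) < p_white_wins).astype(float)   # 1 = white wins

ratings, lam, lr = np.zeros(n_players), 1.0, 0.005
for _ in range(300):
    pred = 1.0 / (1.0 + np.exp(-(ratings[white] - ratings[black])))
    err = outcome - pred                       # per-game gradient of the log-likelihood
    grad = np.zeros(n_players)
    np.add.at(grad, white, err)
    np.add.at(grad, black, -err)
    ratings += lr * (grad - lam * ratings)     # ascent step with l2 shrinkage toward 0

print("correlation with true ratings:", round(float(np.corrcoef(ratings, true_rating)[0, 1]), 3))
```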
This competition requires participants to predict edges in an online social network. The winner will receive free registration and the opportunity to present their solution at IJCNN 2011.
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message: Apart from the technicality of the winning approach, the most interesting finding was that a large real-world online social network graph can be effectively de-anonymised. Releasing useful social network graph data that is resilient to de-anonymisation remains an open question.
The aim of this competition is to develop a recommendation engine for R libraries (or packages). (R is open-source statistics software.)
- Competition overview
- Winner (2nd) blog/article
- Winner (2nd) notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message: An ensemble of 4 different models.
This competition requires participants to predict travel time on Sydney's M4 freeway from past travel time observations.
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message: NA
This task requires participants to predict the outcome of grant applications for the University of Melbourne.
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message: Pre-processing of the Excel spreadsheet looking for groups which had high or low application success rates. The winning algorithm was a slight modification of the random forest algorithm.
Driving while not alert can be deadly. The objective is to design a classifier that will detect whether the driver is alert or not alert, employing data that are acquired while driving.
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message: Trials are not homogeneous, meaning the driver is either mainly alert or not alert. A model can achieve a high AUC on the training set (or a held-back portion of the training set) yet a poor AUC on the test set. This observation suggests that we are working in the world of extrapolation, i.e. the training and test set differ in some manner. If we're extrapolating then a simple model is usually required. The winning algorithm was based on a simple logistic regression.
This competition requires participants to develop an algorithm to identify who wrote which documents. The winner will be honored at a special session of the ICDAR 2011 conference.
- Competition overview
- Winner (3rd) blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message: The approach was based on SVMs with a diffusion kernel. A notable comment was that there was an apparent lack of correlation between CV results on the training data and the accuracy on the validation set.
This contest, sponsored by professional services firm Deloitte, will find the most accurate system to predict chess outcomes, and FIDE will also bring a top finisher to Athens to present their system.
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message: Yannis Sismanis, the winner of the first competition, used a logistic curve for this purpose and estimated the rating numbers by minimizing a regularized version of the model fit. Jeremy Howard, the runner-up, instead used the TrueSkill model, which uses a Gaussian cumulative density function and estimates the ratings using approximate Bayesian inference. These were the starting points, but the winning algorithm was based on some solid post-processing of the data. This includes:
- the predictions of the base model
- the ratings of the players
- the number of matches played by each player
- the ratings of the opponents encountered by each player
- the variation in the quality of the opponents encountered
- the average predicted win percentage over all matches in the same month for each player
- the predictions of a random forest using these variables
With nearly as many variables as training cases, what are the best techniques to avoid disaster?
- Competition overview
- Winner blog/article
- Winner (2nd) notebook/code/kernel
- Other notebook/code/kernel
- Take home message:
Measure the small distortion in galaxy images caused by dark matter
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message: Without the neural network, the winner's best entry would have ranked 8th.
This competition challenges data-mining experts to build a predictive model that predicts the number of edits an editor will make five months from the end date of the training dataset
- Competition overview
- Winner (3rd) blog/article
- Other notebook/code/kernel - NA
- Take home message: NA
Going grocery shopping, we all have to do it, some even enjoy it, but can you predict it? dunnhumby is looking to build a model to better predict when supermarket shoppers will next visit the store and how much they will spend.
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message: At first I tried to use simple heuristics to understand the ‘logic of the problem’. My main idea was to split the problem into two parts: the date prediction and the dollar spend prediction. For that task, I used a kernel density (Parzen) estimator. But it was necessary to take account of the fact that ‘fresh’ data is more useful than ‘old’ data, so I used a weighted Parzen scheme to give greater weight to more recent data points.
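A toy sketch of a recency-weighted Parzen (kernel density) estimate for the days-until-next-visit part of the problem; the Gaussian kernel, bandwidth and exponential age decay are arbitrary illustrative choices, not the winner's actual settings.

```python
# Toy sketch of a recency-weighted Parzen (kernel density) estimate for inter-visit gaps.
# Gaussian kernel, bandwidth and exponential age decay are illustrative choices.
import numpy as np

rng = np.random.default_rng(0)
gaps = rng.choice([3, 7, 7, 7, 14, 21], size=40).astype(float)   # observed days between visits
age_weeks = np.arange(len(gaps))[::-1]                           # 0 = most recent observation

weights = 0.9 ** age_weeks                                       # fresher data counts more
weights /= weights.sum()

def weighted_parzen(x, samples, w, bandwidth=2.0):
    """Weighted Gaussian KDE evaluated at the points in x."""
    z = (x[:, None] - samples[None, :]) / bandwidth
    kernel = np.exp(-0.5 * z ** 2) / (bandwidth * np.sqrt(2 * np.pi))
    return kernel @ w

grid = np.arange(1, 31, dtype=float)
density = weighted_parzen(grid, gaps, weights)
print("most likely gap (days):", int(grid[np.argmax(density)]))
```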
Given anonymized information on thousands of photo albums, predict whether a human evaluator would mark them as 'good'.
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message: My best result was a mix of random forest, GBM, and two forms of logistic regression. I put the raw data into a database and built many derived variables. It’s probably not worth it to spend too much time on external data, as chances are any especially useful data are already included. Time can be better spent on algorithms and included variables.
Improve on the state of the art in credit scoring by predicting the probability that somebody will experience financial distress in the next two years.
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message: We tried many different supervised learning methods, but we decided to keep our ensemble to only those things that we knew would improve our score through cross-validation evaluations. In the end we only used five supervised learning methods: a random forest of classification trees, a random forest of regression trees, a classification tree boosting algorithm, a regression tree boosting algorithm, and a neural network. This competition had a fairly simple data set and relatively few features which meant that the barrier to entry was low, competition would be very intense and everyone would eventually arrive at similar results and methods. Thus, I would have to work extra hard and be really innovative in my approach to solving this problem. I was surprised at how well neural networks performed. They certainly gave a good improvement over and above more modern approaches based on bagging and boosting. I have tried neural networks in other competitions where they did not perform as well.
Predict if a car purchased at auction is a lemon.
- Competition overview
- Winner (2nd) blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message: On the pre-processing: it was necessary to transfer textual values to the numerical format. I used Perl to do that task. Also, I created secondary synthetic variables by comparing different Prices/Costs. On the supervised learning methods: Neural Nets (CLOP, Matlab) and GBM in R. No other classifiers were used in order to produce my best result. Note that the NNs were used only for the calculation of the weighting coefficient in the blending model. Blending itself was conducted not around the different classifiers, but around the different training datasets with the same classifier. I derived this idea during the last few days of the Contest, and it produced very good improvement (in both public and private).
Develop new models to accurately predict the market response to large trades.
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Solution thread
- Take home message: I tried many techniques: (SVM, LR, GBM, RF). Finally, I chose to use a random forest. The training set was a nice example of how stock market conditions are extremely volatile. Different samples of the training set could fit very different models.
Develop an automated scoring algorithm for student-written essays.
- Competition overview
- Winner blog - NA
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel #1
- Other notebook/code/kernel #2
- Take home message - NA
Predict the click-through rate of ads given the query and user information.
- Competition overview
- Winner blog - NA
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel #1
- Take home message
Predict a biological response of molecules from their chemical properties.
- Competition overview
- Winner blog - NA
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel #1
- Take home message
Show them your talent, not just your resume.
- Competition overview
- Winner blog - NA
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel #1
- Take home message
Can you predict if a listener will love a new song?
- Competition overview
- Winner blog/article
- [Winner notebook/code/kernel](https://github.com/fancyspeed/codes_of_innovations_for_emi)
- Other notebook/code/kernel - NA
- Take home message - NA
Predict whether a comment posted during a public discussion is considered insulting to one of the participants.
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel
- Take home message - NA
Predict which new questions asked on Stack Overflow will be closed
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel
- Take home message - NA
Can you find the Dark Matter that dominates our Universe? Winton Capital offers you the chance to unlock the secrets of dark worlds.
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message - Although the description makes this sound like a physics problem, it is really one of statistics: given the noisy data (the elliptical galaxies) recover the model and parameters (position and mass of the dark matter) that generated them. Bayesian analysis provided the winning recipe for solving this problem. The 1.05 public score of my winning submission was only about average on the public leaderboard. All of this means I was very lucky indeed to win this competition.
Solve ye olde traveling salesman problem to help Santa Claus deliver his presents
- Competition overview
- Winner blog - NA
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel
- Take home message - NA
Predict what events our users will be interested in based on user actions, event metadata, and demographic information.
- Competition overview
- Winner blog - NA
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel
- Take home message - NA
Predict the salary of any UK job ad based on its contents.
- Competition overview
- Winner blog - NA
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel
- Take home message - NA
Predict which people are influential in a social network.
- Competition overview
- Winner blog - NA
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel
- Take home message - NA
Predict the auction sale price for a piece of heavy equipment to create a "blue book" for bulldozers.
- Competition overview
- Winner (20th place) blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message - Kaggle provided us with a machine appendix with the “real” value of each feature and for each machine, but it turned out that putting in the true value was not a good idea. Indeed, we think that each seller could declare the characteristics (or not) on the auction website and this had an impact on the price. As for the second point, we focused on the volatility of some models. We spent a lot of time trying to understand how a machine could be sold the same year, and, even with only a few days between two sales, at two completely different prices. It turned out not to be easily predictable. In financial theory, the model used to describe this kind of randomness is called a random walk. We tried a lot of things: we decomposed each option into new binary features, we added the age from the sale date and the year of manufacture, we added the day of the week and the number of the week in the year, we also tried to add the number of auctions of the current month to try to capture the market tendency, and we tried to learn our models on different periods, for example by removing the years 2009 and 2010, which were impacted by the economic crisis. In the end we built one model per category.
The multi-modal learning challenge.
- Competition overview
- Winner (20th place) blog/article
- Winner notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Competitors train a classifier on a dataset that is not human readable, without knowledge of what the data consists of.
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - Although almost all of the winner submissions were single classifiers, the actual winning entry was a small ensemble of three previous submissions.
Learn facial expressions from an image
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Identify which authors correspond to the same person
- Competition overview
- Winner blog/article
- Winner (2nd) notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Develop recognition solutions to detect and classify right whales for BIG data mining and exploration studies
- Competition overview
- Winner blog/article
- Winner (2nd) notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Determine whether an author has written a given paper
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Predict an employee's access needs, given his/her job role
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel
- [Other notebook/code/kernel #1](https://github.com/kaz-Anova/Competitive_Dai)
- [Other notebook/code/kernel #2](https://github.com/kaz-Anova/ensemble_amazon)
- Take home message - The general strategy was to produce 2 feature sets: one categorical, to be modeled with decision-tree-based approaches, and the second a sparse matrix of binary features, created by binarizing all categorical values and 2nd and 3rd order combinations of categorical values. The latter features could be modeled with Logistic Regression, SVMs, etc. The starting point of this latter set of code was provided on the forums by Miroslaw Horbal. The most critical modification I made to it was in merging the most rarely occurring binary features into a much smaller number of features holding these rare values.
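A minimal sketch of that feature-engineering strategy (illustrative column names and rarity threshold, not the winner's code): build 2nd-order category combinations, merge rarely occurring values into one bucket, and binarise everything for linear models.

```python
# Sketch: binarise categorical columns, add 2nd-order category combinations,
# and merge rarely occurring values into a single bucket before one-hot encoding.
import pandas as pd

df = pd.DataFrame({
    "role":     ["eng", "eng", "mgr", "mgr", "hr", "eng", "hr", "mgr"],
    "resource": ["r1",  "r2",  "r1",  "r3",  "r2", "r1",  "r4", "r3"],
})

# 2nd-order combination of two categorical columns
df["role_x_resource"] = df["role"] + "_" + df["resource"]

# merge rare values (appearing fewer than 2 times) into a single "RARE" bucket
for col in df.columns:
    counts = df[col].value_counts()
    df[col] = df[col].where(df[col].map(counts) >= 2, other="RARE")

# binarise everything into a one-hot matrix suitable for logistic regression / SVMs
X = pd.get_dummies(df)
print(X.head())
```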
Predict the set of bird species present in an audio recording, collected in field conditions.
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
RecSys Challenge 2013: Yelp business rating prediction
- Competition overview
- Winner blog/article
- Winner (7th) notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Predict short term movements in stock prices using news and sentiment data provided by RavenPack
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - A high-level description of my approach is: 1. Group securities into groups according to price movement correlation. 2. For each security group, use I146 to build a “decision stump” (a 1-split decision tree with 2 leaf nodes). 3. For each leaf node, build a model of the form Prediction = m * Last Observed Value. For each leaf node, find m that minimizes MAE. Rows that most-improved or most-hurt MAE with respect to m=1.0 were not included.
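A toy sketch of the per-leaf model described above, Prediction = m * Last Observed Value with m chosen to minimise MAE; a simple grid search on synthetic data stands in for the original fitting procedure.

```python
# Toy sketch of the per-leaf model: Prediction = m * LastObservedValue, with m chosen
# to minimise MAE via a simple grid search. Data is synthetic.
import numpy as np

rng = np.random.default_rng(0)
last_value = rng.uniform(10, 100, size=500)
target = 0.97 * last_value + rng.normal(0, 1.5, size=500)   # synthetic "next" prices

def best_multiplier(x, y, grid=np.linspace(0.8, 1.2, 401)):
    maes = [np.mean(np.abs(y - m * x)) for m in grid]
    return grid[int(np.argmin(maes))]

m = best_multiplier(last_value, target)
print("m minimising MAE:", round(m, 3))
print("MAE with best m :", round(np.mean(np.abs(target - m * last_value)), 3))
print("MAE with m = 1.0:", round(np.mean(np.abs(target - last_value)), 3))
```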
Disaggregate household energy consumption into individual appliances
- Competition overview
- Winner (discussion) blog/article
- Winner notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - The winner of the competition shown by the final standings was actually only ranked 6th on the public leaderboard. This suggests that many participants might have been overfitting their algorithms to the half of the test data for which the performance was disclosed, while the competition winner had not optimised their approach in such a way. An interesting forum thread seems to show that most successful participants used an approach based on only low-frequency data, despite the fact that high-frequency data was also provided. This seems to contradict most academic research, which generally shows that high-frequency based approaches will outperform low-frequency methods. A reason for this could be that, although high-frequency based approaches perform well in laboratory test environments, their features do not generalise well over time, and as a result algorithm training quickly becomes outdated. However, another reason could have been that the processing of the high-frequency features was simply too time consuming, and better performance could be achieved by concentrating on the low-frequency data given the deadline of the competition.
Build a classifier to categorize webpages as evergreen or non-evergreen
- Competition overview
- Winner blog/article
- Winner (2nd) notebook/code/kernel
- Other (4th) notebook/code/kernel
- Take home message - NA
Forecast daily solar energy with an ensemble of weather models
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel
- Other notebook/code/kernel
- Take home message - NA
Recognize users of mobile devices from accelerometer data
- Competition overview
- Winner (1st, 2nd, 3rd, 4th) blog/article
- Winner (3rd) notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - The main idea is constructing chains of consecutive sequences using the timestamp leak and determining the real device using Bayes' rule and the professed devices of the sequences in this chain. Some chains are “stuck” on their real devices.
Identify which of 87 classes of birds and amphibians are present in 1000 continuous wild sound recordings
- Competition overview
- Winner blog/article
- Winner (2nd) notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Predict which 311 issues are most important to citizens
- Competition overview
- Winner (2nd) blog/article
- Winner notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - Because this contest was temporal in nature, using time-series models to make future predictions, most competitors quickly realized that proper calibration of predictions was a major factor in reducing error. Even during the initial Hackathon portion of the contest, it became well known on the competition forum that one needed to apply scalars to predictions in order to optimize leaderboard scores. But while scaling was common knowledge, our most important insight came in applying our segmentation approach to the scalars. For example, rather than apply one optimized scalar to all predicted views for the entire test set, we applied optimized scalars for each distinct segment of the test set (the remote API sourced issues and the four cities). We then optimized the scalars using a combination of leaderboard feedback and cross-validation scores. What we found was that each segment responded differently to scaling, so trying to apply one scalar to all issues, as many of our competitors were doing, was not optimal.
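A toy sketch of the segment-specific scaling idea (my own simplification: one multiplicative scalar per segment fitted by least squares on held-out data, rather than the leaderboard-guided optimisation the team used; segment names and data are made up).

```python
# Toy sketch of per-segment prediction scaling: fit one multiplicative scalar per segment
# on held-out data instead of a single global scalar. Segments and data are synthetic.
import numpy as np

rng = np.random.default_rng(0)
segments = np.array(["remote_api", "city_1", "city_2", "city_3", "city_4"])
seg = rng.choice(segments, size=2000)
raw_pred = rng.gamma(2.0, 2.0, size=2000)                      # un-calibrated model output
true_scale = {"remote_api": 0.6, "city_1": 1.1, "city_2": 0.9, "city_3": 1.3, "city_4": 0.8}
y = raw_pred * np.array([true_scale[s] for s in seg]) + rng.normal(0, 0.3, size=2000)

scalars = {}
for s in segments:
    mask = seg == s
    # least-squares optimal multiplicative scalar for y ≈ m * prediction on this segment
    scalars[s] = float(raw_pred[mask] @ y[mask] / (raw_pred[mask] @ raw_pred[mask]))

calibrated = raw_pred * np.array([scalars[s] for s in seg])
print({s: round(m, 2) for s, m in scalars.items()})
```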
What can a #machine learn from tweets about the #weather?
- Competition overview
- Winner blog/article
- Winner (2nd) notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message - Regarding the ML model, one core observation, which I guess prevented many people from entering into the <.15 zone, is that the problem here is multi-output. While the Ridge does handle the multi-output case it actually treats each variable independently. You could easily verify this by training an individual model for each of the variables and compare the results. You would see the same performance. So, the core idea is how to go about taking into account the correlations between the output variables. The approach I took was simple stacking, where you feed the output of a first level model and use it as features to the 2nd level model (of course you do it in a CV fashion).
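A minimal sketch of the stacking idea described above: out-of-fold predictions from a first-level multi-output Ridge are fed as extra features to a second-level Ridge, which can then exploit correlations between the output variables (synthetic data, arbitrary hyper-parameters).

```python
# Sketch of 2-level stacking for a multi-output problem: out-of-fold predictions from a
# first-level Ridge become extra features for a second-level Ridge. Data is synthetic.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_predict, train_test_split
from sklearn.metrics import mean_squared_error

X, Y = make_regression(n_samples=3000, n_features=50, n_targets=5, noise=5.0, random_state=0)
X_tr, X_te, Y_tr, Y_te = train_test_split(X, Y, test_size=0.3, random_state=0)

level1 = Ridge(alpha=1.0)
oof = cross_val_predict(level1, X_tr, Y_tr, cv=5)   # out-of-fold predictions avoid leakage

level1.fit(X_tr, Y_tr)
test_l1 = level1.predict(X_te)

level2 = Ridge(alpha=1.0)
level2.fit(np.hstack([X_tr, oof]), Y_tr)
stacked = level2.predict(np.hstack([X_te, test_l1]))

print("RMSE level-1:", round(float(np.sqrt(mean_squared_error(Y_te, test_l1))), 3))
print("RMSE stacked:", round(float(np.sqrt(mean_squared_error(Y_te, stacked))), 3))
```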
Identify keywords and tags from millions of text questions
- Competition overview
- Winner blog/article
- Other relevant blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message - As noted by everyone, the bottleneck of this competition is RAM.
Re-rank web documents using personal preferences
- Competition overview
- Winner blog
- Winner notebook/code/kernel - NA
- [Other notebook/code/kernel](https://github.com/ykdojo/personalized_search_challenge)
- Take home message - NA
He's making a list, checking it twice; to fill up his sleigh, he needs your advice
- Competition overview
- Winners blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message - NA
Create an algorithm to distinguish dogs from cats
- Competition overview
- Winner (4th) blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message - Just like with most other image recognition/classification problems, I have completely relied on Deep Convolutional Neural Networks (DCNN). I have built a simple convolutional neural network (CNN) in Keras from scratch, but for the most part I've relied on out-of-the-box models: VGG16, VGG19, Inception V3, Xception, and various flavors of ResNets. My simple CNN managed to get the score in the 0.2x range on the public leaderboard (PL). My best models, built using features extracted by applying retrained DCNNs, got me into the 0.06x range on PL. Stacking of those models got me into the 0.05x range on PL. My single best fine-tuned DCNN got me to 0.042 on PL, and my final ensemble gave me the 0.35 score on PL.
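A minimal sketch of the feature-extraction step (not the author's pipeline): a frozen, pretrained VGG16 without its top layers turns each image into a feature vector, and a simple linear classifier is trained on top. Assumes TensorFlow/Keras is installed; random arrays stand in for real cat/dog images.

```python
# Sketch of out-of-the-box DCNN feature extraction: frozen VGG16 (ImageNet weights, no top)
# produces per-image feature vectors which a simple linear classifier learns from.
import numpy as np
from tensorflow.keras.applications import VGG16
from tensorflow.keras.applications.vgg16 import preprocess_input
from sklearn.linear_model import LogisticRegression

base = VGG16(weights="imagenet", include_top=False, pooling="avg",
             input_shape=(224, 224, 3))            # frozen feature extractor

images = np.random.rand(32, 224, 224, 3) * 255.0   # placeholder for real images
labels = np.random.randint(0, 2, size=32)          # toy labels: 0 = cat, 1 = dog

features = base.predict(preprocess_input(images), verbose=0)   # shape (32, 512)
clf = LogisticRegression(max_iter=1000).fit(features, labels)
print("train accuracy on toy data:", clf.score(features, labels))
```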
Reverse the arrow of time in the Game of Life
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message - Regarding better hardware, I had figured that some might have access to more computing power, but since probability convergence has exponentially decreasing returns over time, I felt I could do just fine provided I used the information as well as possible. I tuned the ~100 parameters using genetic algorithms, mostly because I was too lazy to come up with something more theoretically rigorous in terms of optimizing capabilities.
Constructing an optimal portfolio of loans
- Competition overview
- Winner blog/article
- Winner (2nd) notebook/code/kernel
- Other (12th) notebook/code/kernel
- Take home message - The training data is sorted by time, and the test data is randomly ordered, so in the validation process I first shuffle the training data randomly. Owing to the lack of a feature description, it is hard to use traditional methods to predict LGD. In my implementation, the operators +, -, *, / between two features, and the operator (a-b) * c among three features, were used; these features were selected by computing the Pearson correlation with the loss.
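A toy sketch of that feature-generation idea: build pairwise +, -, *, / combinations of anonymous features and keep those most correlated (Pearson) with the loss. The data and the number of features kept are illustrative.

```python
# Sketch: generate pairwise +, -, *, / feature combinations and rank them by the absolute
# Pearson correlation with the loss. Data is synthetic.
import numpy as np
import pandas as pd
from itertools import combinations

rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(1000, 5)), columns=[f"f{i}" for i in range(5)])
loss = 2 * df["f0"] * df["f3"] - df["f1"] + rng.normal(scale=0.5, size=1000)

candidates = {}
for a, b in combinations(df.columns, 2):
    candidates[f"{a}+{b}"] = df[a] + df[b]
    candidates[f"{a}-{b}"] = df[a] - df[b]
    candidates[f"{a}*{b}"] = df[a] * df[b]
    candidates[f"{a}/{b}"] = df[a] / df[b].replace(0, np.nan)

corr = {name: abs(np.corrcoef(col.fillna(0), loss)[0, 1]) for name, col in candidates.items()}
top = sorted(corr, key=corr.get, reverse=True)[:3]
print("strongest engineered features:", [(t, round(corr[t], 2)) for t in top])
```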
Classify the morphologies of distant galaxies in our Universe
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Tip off college basketball by predicting the 2014 NCAA Tournament
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message - Sometimes it's better to be lucky than good. The location data that I used had a coding error in it: South Carolina's Sweet Sixteen and Elite Eight games were coded as being in Greenville, SC instead of New York City. This led me to give them higher odds than most others, which helped me since they won. It is hard to say what the optimizer would have selected (and how it affected others' models), but there is a good chance I would have finished in 2nd place or worse if the correct locations had been used.
Classify Wikipedia documents into one of 325,056 categories
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message - Our winning submission consists mostly of an ensemble of sparse generative models extending Multinomial Naive Bayes. The base-classifiers consist of hierarchically smoothed models combining document, label, and hierarchy level Multinomials, with feature pre-processing using variants of TF-IDF and BM25. Additional diversification is introduced by different types of folds and random search optimization for different measures. The ensemble algorithm optimizes macroFscore by predicting the documents for each label, instead of the usual prediction of labels per document. Scores for documents are predicted by weighted voting of base-classifier outputs with a variant of Feature-Weighted Linear Stacking. The number of documents per label is chosen using label priors and thresholding of vote scores.
Use historical markdown data to predict store sales
- Competition overview
- Winner (2nd) blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message - I used SAS (for data prep/ARIMA/UCM) and R (for the remaining models) together. I used a weighted average and a trimmed mean of the following 6 methods. The goal from the beginning was to build a robust model that would be able to withstand uncertainty.
Learn what predicts happiness by using informal polling questions.
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message - NA
Reconstruct the wiring between neurons from fluorescence imaging of neural activity
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Predict a purchased policy based on transaction history
- Competition overview
- Winner blog/article
- Winner (2nd) notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Multi-label classification of printed media articles to topics
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message - NA
Predict funding requests that deserve an A+
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - We worked independently until we merged so we had two separate "models" (each "model" was itself an ensemble of a few models). A lot of our variables ended up being similar, though. Instead of listing out features one-by-one we will note some potentially interesting features that we used. The full feature list can be found in the code. Given private LB's sensitivity to discounting, and given public LB's (relative) lack of sensitivity to discounting (e.g. 1.0 to 0.5 linear decay gave ~0.003 improvements on the public LB), we were simply lucky.
Diagnose schizophrenia using multimodal features from MRI scans
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message - The solution is based on Gaussian process classification. The model is actually very simple (really a 'Solution draft'), but it did show promising performance using LOOCV. The score on the public leaderboard was, however, only 0.70536, discouraging any further tuning of the model. In the end, it turned out to perform well on the private leaderboard.
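A minimal sketch of the described setup, a Gaussian process classifier evaluated with leave-one-out cross-validation, which suits very small datasets like this one (synthetic features stand in for the MRI-derived ones, and the kernel choice is arbitrary).

```python
# Sketch: Gaussian process classification evaluated with leave-one-out cross-validation.
# Synthetic features stand in for MRI-derived ones; the RBF kernel is an arbitrary choice.
import numpy as np
from sklearn.gaussian_process import GaussianProcessClassifier
from sklearn.gaussian_process.kernels import RBF
from sklearn.model_selection import LeaveOneOut, cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 10))                      # 60 subjects, 10 features
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=1.0, size=60) > 0).astype(int)

clf = GaussianProcessClassifier(kernel=1.0 * RBF(length_scale=1.0), random_state=0)
scores = cross_val_score(clf, X, y, cv=LeaveOneOut())
print("LOOCV accuracy:", round(scores.mean(), 3))
```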
Predict visual stimuli from MEG recordings of human brain activity
- Competition overview
- Winner blog/article
- Winner (2nd) notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Detect seizures in intracranial EEG recordings
- Competition overview
- Winner (27th) blog/article
- Winner (27th) notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Predict which ads contain illicit content
- Competition overview
- Winner blog/article
- Winner (4th) notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Predict expected fire losses for insurance policies
- Competition overview
- Winner (3rd) blog/article
- Winner notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - There's one little trick I used, which I guess others have also done. Instead of predicting the losses directly, I took the logarithm, and predicted on that.
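A minimal sketch of that log-target trick: train on log1p of the losses and invert with expm1 at prediction time, which tames the heavy right tail of insurance losses (model choice and data are arbitrary).

```python
# Sketch of the log-target trick: fit on log1p(loss) and invert with expm1 at predict time.
# The model and the synthetic, right-skewed target are arbitrary choices.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(0)
X = rng.normal(size=(3000, 8))
loss = np.exp(1.5 + X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.3, size=3000))  # skewed target

X_tr, X_te, y_tr, y_te = train_test_split(X, loss, test_size=0.3, random_state=0)

raw = GradientBoostingRegressor(random_state=0).fit(X_tr, y_tr)
logged = GradientBoostingRegressor(random_state=0).fit(X_tr, np.log1p(y_tr))

print("MAE, raw target:", round(mean_absolute_error(y_te, raw.predict(X_te)), 2))
print("MAE, log target:", round(mean_absolute_error(y_te, np.expm1(logged.predict(X_te))), 2))
```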
Use the ATLAS experiment to identify the Higgs boson
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel
- Other (2nd) notebook/code/kernel
- Take home message - NA
Predict click-through rates on display ads
- Competition overview
- Winner blog/article
- Winner (4th) blog/article
- Winner (2nd) notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Identify the subject of 60,000 labeled images
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Predict physical and chemical properties of soil using spectral measurements
- Competition overview
- Winner (3rd) blog/article
- Winner notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Model friend memberships to multiple circles
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message - It’s well-known that people tend to over-fit the data in the Public leaderboard. In this case, there were a total of 110 data instances, of which ‘solutions’ were provided for 60. One third of the remaining 50 instances were used for the Public scoring, and two-thirds were used for the Private scoring. I got the sense from my work with the Test data that the Public set was a little bit strange, and so I tried to restrain myself from putting too much work into doing well on the Public leaderboard, and instead on understanding and doing well with the Test data. This seems to have worked well for me in the end.
Classify text blocks in documents
- Competition overview
- Winner blog - NA
- Winner notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Predict seizures in intracranial EEG recordings
- Competition overview
- Winner blog/article
- Winner (13th) notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Scikit-learn is an open-source machine learning library for Python. Give it a try here!
- Competition overview
- Winner blog - NA
- Winner (7th) notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Predict whether a mobile ad will be clicked
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - Instead of using the whole dataset, in this competition we find splitting data into small parts works better than directly using the entire dataset. For example, in one of our models we select instances whose site id is 85f751fd; and in another one we select instances whose app id is ecad2386
A spell on you if you cannot detect errors!
- Competition overview
- Winner blog - NA
- Winner notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Classify the sentiment of sentences from the Rotten Tomatoes dataset
- Competition overview
- Winner blog - NA
- Winner notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Use telematic data to identify a driver signature
- Competition overview
- Winner blog/article
- Winner (2nd) notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Predict ocean health, one plankton at a time
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Predict a chess player's FIDE Elo rating from one game
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Classify malware into families based on file content and characteristics
- Type - Classification
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - Cross-validation plays a critical role to overcome overfitting. Our parameter tuning and model selection was based on local cross-validation rather than the public leaderboard.
Find and impute missing words in the billion word corpus
- Type - Missing word (NLP)
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - 1) Guessing a word should be done more conservatively if you are not confident you are inserting at the right position. 2) Guessing long words should be done more conservatively than guessing short words.
Predict annual restaurant sales based on objective measurements
- Type - Regression
- Competition overview
- Winner blog/article
- Winner (13th) notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Predict probabilistic distribution of hourly rain given polarimetric radar measurements
- Type - Multiclass classification
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - I don't have exact CV scores because it took too much time to run CV so the following is based on 7% holdout as well as occasional two fold CV.
Classify products into the correct category
- Type - Multiclass classification
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - Definitely the best algorithms to solve this problem are: XGBoost, NN and KNN. t-SNE reduction also helped a lot. Other algorithms had only a minor contribution to performance. So we learnt not to discard low-performance algorithms, since they have enough predictive power to improve performance in a 2nd-level training.
Predict how sales of weather-sensitive products are affected by snow and rain
- Type - Regression
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Predict if an online bid is made by a machine or a human
- Type - Classification
- Competition overview
- Winner blog/article
- Winner (2nd) notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - Early on I saw that CV was going to be relatively inaccurate so I ended up choosing 500 resamples of different 2/3 – 1/3 splits. This took the standard error on my CV AUC calculation down to about 0.0007 so that I could reasonably have an idea of whether each feature I tested was making a positive or negative difference. The inaccuracy on a single CV fold and the helpful post by T. Scharf, https://www.kaggle.com/c/facebook-recruiting-iv-human-or-bot/forums/t/14394/a-basic-look-at-lb-scores suggested that public LB scores were not going to be particularly useful so I only made 3 submissions to avoid any temptation to overfit the public LB - I find this a very difficult thing to do…
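A minimal sketch of that resampling strategy: repeated random 2/3 - 1/3 splits with the mean AUC and its standard error, so that small feature changes can be judged against CV noise (50 resamples here instead of the 500 mentioned, purely to keep the example quick; model and data are arbitrary).

```python
# Sketch: many random 2/3 - 1/3 splits, reporting mean AUC and its standard error so that
# small feature changes can be judged reliably. 50 resamples keep the example quick.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import ShuffleSplit
from sklearn.metrics import roc_auc_score

X, y = make_classification(n_samples=2000, n_features=20, weights=[0.9, 0.1], random_state=0)

aucs = []
for tr, te in ShuffleSplit(n_splits=50, test_size=1/3, random_state=0).split(X):
    clf = GradientBoostingClassifier(random_state=0).fit(X[tr], y[tr])
    aucs.append(roc_auc_score(y[te], clf.predict_proba(X[te])[:, 1]))

aucs = np.array(aucs)
print("mean AUC:", round(aucs.mean(), 4),
      "+/- SE:", round(aucs.std(ddof=1) / np.sqrt(len(aucs)), 5))
```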
Predict West Nile virus in mosquitos across the city of Chicago
- Type - Classification
- Competition overview
- Winner blog - NA
- Winner (2nd) notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Use Google's Word2Vec for movie reviews
- Type - Sentiment Analysis
- Competition overview
- Winner blog - NA
- Winner (3rd) notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Predict the total travel time of taxi trips based on their initial partial trajectories
- Type -
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Predict the destination of taxi trips based on initial partial trajectories
- Type -
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Predict the relevance of search results from eCommerce sites
- Type - Ranking
- Competition overview
- Winner blog - NA
- Winner notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Identify signs of diabetic retinopathy in eye images
- Type - Classification
- Competition overview
- Winner (2nd) blog/article
- Winner notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Predict if context ads will earn a user's click
- Type - NA
- Competition overview
- Winner blog - NA
- Winner (2nd) notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Identify individual users across their digital devices
- Type -
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Quantify property hazards before time of inspection
- Type -
- Competition overview
- Winner blog/article
- Winner (16th) notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Identify hand motions from EEG recordings
- Type -
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Predict which coupons a customer will buy
- Type - Classification
- Competition overview
- Winner blog/article
- Winner (3rd) notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Identify a rare decay phenomenon
- Type -
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Predict which web pages served by StumbleUpon are sponsored
- Type -
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Determine whether to send a direct mail piece to a customer
- Type -
- Competition overview
- Winner (7th) blog/article
- Winner notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Predict hourly rainfall using data from polarimetric radars
- Type -
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Forecast sales using store, promotion, and competitor data
- Type -
- Competition overview
- Winner blog/article
- Winner (3rd) notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Use recipe ingredients to categorize the cuisine
- Type -
- Competition overview
- Winner blog - NA
- Winner (4th) notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Use market basket analysis to classify shopping trips
- Type -
- Competition overview
- Winner blog/article
- Winner (11th) notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Identify endangered right whales in aerial photographs
- Type -
- Competition overview
- Winner blog/article
- Winner (2nd) notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Join a multi-disciplinary team of research scientists
- Type -
- Competition overview
- Winner (2nd) blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message - NA
Help prevent cervical cancer by identifying at-risk populations
- Type -
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - The single most important piece of data was actually not in the dataset. It was what Wendy said very early in the competition: "We filtered out some records that are too relevant, records that is direct evidence that the person is a cervical cancer screener or not." Now, I do not argue the skills of those preparing the dataset, but it's really hard to remove records from a relational database without leaving bread crumbs behind you. So, I set myself to look for those crumbs while I was modelling other things I knew had predictive power.
Which customers will purchase a quoted insurance plan?
- Type -
- Competition overview
- Winner blog - NA
- Winner notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Where will a new guest book their first travel experience?
- Type -
- Competition overview
- Winner blog/article
- Winner (3rd) notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Is your model smarter than an 8th grader?
- Type -
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Can you make buying life insurance easier?
- Type -
- Competition overview
- Winner blog/article
- Winner (2nd) notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Predict service faults on Australia's largest telecommunications network
- Type -
- Competition overview
- Winner blog/article
- Winner (7th) notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Transforming How We Diagnose Heart Disease
- Type -
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Predict attribute labels for restaurants using user-submitted photos
- Type -
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Can you accelerate BNP Paribas Cardif's claims management process?
- Type -
- Competition overview
- Winner blog/article
- Winner (2nd) notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Predict the relevance of search results on homedepot.com
- Type -
- Competition overview
- Winner blog/article
- Winner (3rd) notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Which customers are happy customers?
- Type -
- Competition overview
- Winner blog/article
- Winner (3rd) notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Predict the category of crimes that occurred in the city by the bay
- Type -
- Competition overview
- Winner blog - NA
- Winner notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Which hotel type will an Expedia customer book?
- Type -
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel
- Take home message - NA
Which shots did Kobe sink?
- Type -
- Competition overview
- Winner blog - NA
- Winner (12th) notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Can you put order to space and time?
- Type -
- Competition overview
- Winner blog/article
- Winner (12th) notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Identify the correct place for check ins
- Type -
- Competition overview
- Winner (7th) blog/article
- Winner (2nd) notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Can you detect duplicitous duplicate ads?
- Type -
- Competition overview
- Winner (2nd) blog/article
- Winner (2nd) notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Can computer vision spot distracted drivers?
- Type -
- Competition overview
- Winner blog/article
- Winner (29th) notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Identify nerve structures in ultrasound images of the neck
- Type -
- Competition overview
- Winner blog/article
- Winner (25th) notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Maximize sales and minimize returns of bakery goods
- Type -
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Get to know millions of mobile device users
- Type -
- Competition overview
- Winner blog/article
- Winner (3rd) notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Classify customer potential
- Type -
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - First, a proper cross-validation set was very important; several tricks were needed to build a representative CV set.
1, 2, 3, 4, 5, 7?!
- Type -
- Competition overview
- Winner (17th) blog/article
- Winner (?) notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Does every painter leave a fingerprint?
- Type -
- Competition overview
- Winner blog - NA
- Winner notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Reduce manufacturing failures
- Type -
- Competition overview
- Winner blog/article
- Winner (57th) notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Predict seizures in long-term human intracranial EEG recordings
- Type -
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
How severe is an insurance claim?
- Type -
- Competition overview
- Winner blog/article
- Winner (2nd) notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Can you pair products with people?
- Type -
- Competition overview
- Winner (2nd) blog/article
- Winner (2nd) notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Detect the location of keypoints on face images
- Type -
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel
- Take home message - NA
Can you predict which recommended content each user will click?
- Type -
- Competition overview
- Winner blog - NA
- Winner (13th) notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Can you uncover predictive value in an uncertain world?
- Type -
- Competition overview
- Winner (7th) blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message - NA
Can you train an eye in the sky?
- Type -
- Competition overview
- Winner (2nd) blog/article
- Winner (3rd) notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Predict tags from models trained on unrelated topics
- Type -
- Competition overview
- Winner blog - NA
- Winner (4th) notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Can you improve lung cancer detection?
- Type -
- Competition overview
- Winner blog - NA
- Winner notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Can you detect and classify species of fish?
- Type - Classification
- Competition overview
- Winner blog/article
- Winner (?) notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
How much interest will a new rental listing on RentHop receive?
- Type -
- Competition overview
- Winner (2nd) blog/article
- Winner notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Can you produce the best video tag predictions?
- Type - Video Tagging
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Can you identify question pairs that have the same intent?
- Type -
- Competition overview
- Winner blog/article
- Winner (3rd) notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Which cancer treatment will be most effective?
- Type -
- Competition overview
- Winner blog - NA
- Winner (4th) notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
How many sea lions do you see?
- Type -
- Competition overview
- Winner blog - NA
- Winner notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Can you predict realty price fluctuations in Russia’s volatile economy?
- Type -
- Competition overview
- Winner blog/article
- Winner (15th) notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Fine-grained classification challenge spanning 5,000 species.
- Type - Classification
- Competition overview
- Winner blog - NA
- Winner (13th) notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Can you cut the time a Mercedes-Benz spends on the test bench?
- Type -
- Competition overview
- Winner blog/article
- Winner (11th) notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Use satellite data to track the human footprint in the Amazon rainforest
- Type -
- Competition overview
- Winner blog/article
- Winner (3rd) notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Which products will an Instacart consumer purchase again?
- Type -
- Competition overview
- Winner blog - NA
- Winner (2nd) notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Identify images of invasive hydrangea
- Type -
- Competition overview
- Winner blog - NA
- Winner notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Share code and data to improve ride time predictions
- Type -
- Competition overview
- Winner (4th) blog/article
- Winner notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Automatically identify the boundaries of the car in an image
- Type -
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Create an image classifier that is robust to adversarial attacks
- Type -
- Competition overview
- Winner blog - NA
- Winner (3rd) notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Develop an adversarial attack that causes image classifiers to predict a specific target class
- Type -
- Competition overview
- Winner blog - NA
- Winner notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Imperceptibly transform images in ways that fool classification models
- Type -
- Competition overview
- Winner blog - NA
- Winner notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Forecast future traffic to Wikipedia pages
- Type - Forecasting
- Competition overview
- Winner blog/article
- Winner (2nd) notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Convert Russian text from written expressions into spoken forms
- Type - NLP
- Competition overview
- Winner blog - NA
- Winner (3rd) notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Convert English text from written expressions into spoken forms
- Type - NLP
- Competition overview
- Winner (4th) blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message - NA
Predict if a driver will file an insurance claim next year.
- Type - Classification
- Competition overview
- Winner blog/article
- Winner (2nd) notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Categorize e-commerce photos
- Type - Classification
- Competition overview
- Winner blog/article
- Winner (2nd) notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Share code and discuss insights to identify horror authors from their writings
- Type - NLP
- Competition overview
- Winner blog - NA
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel
- Take home message - NA
Improve the accuracy of the Department of Homeland Security's threat recognition algorithms
- Type -
- Competition overview
- Winner blog/article
- Winner (7th) notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Can you build the best music recommendation system?
- Type - Recommender System
- Competition overview
- Winner blog/article
- Winner (3rd) notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Can you improve the algorithm that changed the world of real estate?
- Type - Regression
- Competition overview
- Winner (17th) blog/article
- Winner notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Down through the chimney with lots of toys...
- Type -
- Competition overview
- Winner (2nd) blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message - NA
Can you accurately predict sales for a large grocery chain?
- Type - Regression
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel - see winner blog
- Other notebook/code/kernel - NA
- Take home message - NA
Can you build an algorithm that understands simple speech commands?
- Type - Speech Recognition
- Competition overview
- Winner blog/article
- Winner (3rd) notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Ship or iceberg, can you decide from space?
- Type - Classification
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel
- Take home message - NA
Predict how many future visitors a restaurant will receive
- Type - Regression
- Competition overview
- Winner blog/article
- Winner (3rd) notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Identify from which camera an image was taken
- Type - Classification
- Competition overview
- Winner blog/article
- Winner (2nd) notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Predict the key properties of novel transparent semiconductors
- Type -
- Competition overview
- Winner (5th) blog/article
- Winner notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Can you automatically suggest product prices to online sellers?
- Type - Recommender System
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Identify and classify toxic online comments
- Type - Classification
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message - NA
Apply machine learning to NCAA® March Madness®
- Type - Forecasting
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message - NA
Apply Machine Learning to NCAA® March Madness®
- Type - Forecasting
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message - NA
Find the nuclei in divergent images to advance medical discovery
- Type -
- Competition overview
- Winner blog - NA
- Winner notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Predict whether teachers' project proposals are accepted
- Type - NLP
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message - NA
Can you detect fraudulent click traffic for mobile app ads?
- Type - Classification
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Given an image, can you find all of the same landmarks in a dataset?
- Type - Information Retrieval
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message - NA
Label famous (and not-so-famous) landmarks in images
- Type - Labelling
- Competition overview
- Winner blog/article
- Winner (19th) notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Image Classification of Furniture & Home Goods.
- Type - Classification
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Image classification of fashion products.
- Type - Image Classification
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message - NA
Can you segment each object within image frames captured by vehicles?
- Type - Video Segmentation
- Competition overview
- Winner blog - NA
- Winner (2nd) notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Predict demand for an online classified ad
- Type - Forecasting
- Competition overview
- Winner blog/article
- Winner (5th) notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Can you automatically recognize sounds from a wide range of real-world environments?
- Type - Audio Tagging
- Competition overview
- Winner (4th) blog/article
- Winner (4th) notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Can you create a constrained-size model to predict video labels?
- Type - Video Labelling
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
High Energy Physics particle tracking in CERN detectors
- Type -
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message - NA
Predict the value of transactions for potential customers.
- Type -
- Competition overview
- Winner blog/article
- Winner (21st) notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Can you predict how capable each applicant is of repaying a loan?
- Type -
- Competition overview
- Winner blog/article
- Winner (2nd) notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Detect objects in varied and complex images.
- Type - Object Detection
- Competition overview
- Winner (2nd) blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message - NA
Detect pairs of objects in particular relationships.
- Type - Object Detection
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message - NA
Segment salt deposits beneath the Earth's surface
- Type - Image Segmentation
- Competition overview
- Winner blog/article
- Winner (4th) notebook/code/kernel
- Other notebook/code/kernel
- Take home message - NA
Can you build an algorithm that automatically detects potential pneumonia cases?
- Type - Image Classification
- Competition overview
- Winner blog - NA
- Winner notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Stress test image classifiers across new geographic distributions
- Type -
- Competition overview
- Winner (2nd) blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message - NA
Find ships on satellite images as quickly as possible
- Type -
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message - NA
Thanksgiving Edition: Find the turkey in the sound bite
- Type -
- Competition overview
- Winner blog - NA
- Winner (2nd) notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
How accurately can you identify a doodle?
- Type -
- Competition overview
- Winner blog/article
- Winner (8th) notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Can you help make sense of the Universe?
- Type -
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
But does your code recall, the most efficient route of all?
- Type -
- Competition overview
- Winner (2nd) blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message - NA
Classify subcellular protein patterns in human cells
- Type - Classification
- Competition overview
- Winner blog/article
- Winner (3rd) notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
V8g{9827$A${?^?}$$v7�*.yig$w9.8}*
- Type -
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message - NA
Can you predict the battle royale finish of PUBG Players?
- Type -
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message - NA
Can you tell when a pilot is heading for trouble?
- Type -
- Competition overview
- Winner (8th) blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message - NA
Detect toxic content to improve online conversations
- Type -
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Predict how much GStore customers will spend
- Type -
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message - NA
Help understand customer loyalty
- Type - Category Recommendation
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message - NA
Can you identify a whale by its tail?
- Type - Image Classification
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Can you predict if a machine will soon be hit with malware?
- Type - Malware Prediction
- Competition overview
- Winner (2nd) blog/article
- Winner (2nd) notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Can you detect faults in above-ground electrical lines?
- Type - Faults Detection
- Competition overview
- Winner (2nd) blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message - NA
Identify metastatic tissue in histopathologic scans of lymph node sections
- Type - Image Classification
- Competition overview
- Winner (17th) blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message - NA
Apply Machine Learning to NCAA® March Madness®
- Type -
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message - NA
Apply Machine Learning to NCAA® March Madness®
- Type -
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message - NA
How cute is that doggy in the shelter?
- Type -
- Competition overview
- Winner (2nd) blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message - NA
Can you identify who will make a transaction?
- Type -
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message - NA
Compete to get your resume in front of our sponsors
- Type -
- Competition overview
- Winner (3rd) blog/article
- Winner (3rd) notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Pair pronouns to their correct entities
- Type - NLP
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message - NA
A Fistful of Samples
- Type -
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel
- Other notebook/code/kernel
- Take home message - The critical insight was realizing early that leaderboard probing was the key to winning (see the sketch below).
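Leaderboard probing is a general trick rather than this team's published code. As a rough illustration only (assuming a binary log-loss metric; the helper name is hypothetical), submitting a constant probability lets you back out the positive rate of the public test set from the reported score:

```python
import math

def positive_rate_from_logloss(reported_loss: float, p: float) -> float:
    """Invert the public log loss of a constant submission.

    If every row is predicted with the same probability p (p != 0.5), the
    log loss is -(r*log(p) + (1-r)*log(1-p)), where r is the unknown share
    of positives in the public test set. Solving for r gives the probe.
    """
    return (reported_loss + math.log(1 - p)) / (math.log(1 - p) - math.log(p))

# Example: a constant submission of 0.3 that scores ~0.526 on the public
# leaderboard implies roughly 20% positives among the public test rows.
print(positive_rate_from_logloss(0.526, 0.3))  # ~0.20
```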
Label famous (and not-so-famous) landmarks in images
- Type - Image Labelling
- Competition overview
- Winner (2nd) blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message - NA
Can you predict upcoming laboratory earthquakes?
- Type -
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message - NA
Given an image, can you find all of the same landmarks in a dataset?
- Type -
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Categorize animals in the wild
- Type - Image Classification
- Competition overview
- Winner blog/article - NA
- Winner notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Recognize artwork attributes from The Metropolitan Museum of Art
- Type - Image Classification
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message - NA
Fine-grained segmentation task for fashion and apparel
- Type - Image Segmentation
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Automatically recognize sounds and apply tags of varying natures
- Type - Audio Tagging
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
A synchronous Kernels-only competition
- Type -
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Detect toxicity across a diverse range of conversations
- Type - NLP
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message - NA
Can you determine if two individuals are related?
- Type -
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message - NA
Experiment with creating puppy pics
- Type - Generative Images
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Can you measure the magnetic interactions between a pair of atoms?
- Type -
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Identify Pneumothorax disease in chest x-rays
- Type - Vision
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Detect diabetic retinopathy to stop blindness before it's too late
- Type - Vision
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message - NA
CellSignal: Disentangling biological signal from experimental noise in cellular images
- Type - Vision
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Detect pairs of objects in particular relationships
- Type -
- Competition overview
- Winner (2nd) blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message - NA
Detect objects in varied and complex images
- Type - Vision
- Competition overview
- Winner (6th) blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message - NA
Outline segmentation masks of objects in images
- Type - Vision
- Competition overview
- Winner (7th) blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message - NA
Can you detect fraud from customer transactions?
- Type -
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message - NA
Temporal localization of topics within video
- Type - Vision
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message - NA
Opening the door to a thousand years of Japanese culture
- Type -
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message - NA
Can you detect and classify defects in steel?
- Type - Vision
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message - NA
Can you advance the state of the art in 3D object detection?
- Type - Vision
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message - NA
Identify acute intracranial hemorrhage and its subtypes
- Type - Vision
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Can you classify cloud structures from satellites?
- Type - Vision
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
The most comprehensive dataset available on the state of ML and data science
- Type -
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message - NA
Binary classification, with every feature a categorical
- Type -
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message - NA
Can you predict wait times at major city intersections?
- Type -
- Competition overview
- Winner blog/article - NA
- Winner notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
MNIST-like dataset for Kannada handwritten digits
- Type -
- [Competition overview](https://www.kaggle.com/c/Kannada-MNIST)
- Winner blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message - NA
How much energy will a building consume?
- Type - Forecasting
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message - NA
How many yards will an NFL player gain after receiving a handoff?
- Type -
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message - NA
In the notebook we can build a model, and pretend that it will optimize...
- Type -
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message - NA
Oh what fun it is to revise . . .
- Type -
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message - NA
Can you predict vehicle angle in different settings?
- Type - Vision
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message - NA
Uncover the factors to help measure how young children learn
- Type -
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message - NA
Identify the answers to real user questions about Wikipedia page content
- Type -
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Improving automated understanding of complex question answer content
- Type -
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Classify the components of handwritten Bengali
- Type - Vision | Classification | Handwriting
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message - NA
Exploring alternatives for emissions factor calculations
- Type -
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message - NA
Binary classification, with every feature a categorical (and interactions!)
- Type - Binary Classification
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message - NA
Identify videos with facial or voice manipulations
- Type - Video
- [Competition overview](https://www.kaggle.com/c/deepfake-detection-challenge)
- Winner blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message - NA
Uncover the madness of March Madness®
- Type -
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message - NA
Forecast daily COVID-19 spread in regions around the world
- Type - Forecasting
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message - NA
Identify the number of channels open at each time point
- Type -
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message - NA
Identify plant species from herbarium specimens. Data from New York Botanical Garden.
- Type -
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message - NA
Categorize animals in the wild
- Type - Classification
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message - NA
Fine-grained segmentation task for fashion and apparel
- Type - Classification | Image Segmentation
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message - NA
Identify the category of foliar diseases in apple trees
- Type - Vision | Classification
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message - NA
Create an AI capable of solving reasoning tasks it has never seen before
- Type -
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message - NA
Extract support phrases for sentiment labels
- Type - NLP | Sentiment Analysis
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message - NA
Use TPUs to identify toxicity comments across multiple languages
- Type - NLP | Classification | Sentiment Analysis
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message - NA
Multiscanner normative age and assessments prediction with brain function, structure, and connectivity
- Type -
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message - NA
Estimate the uncertainty distribution of Walmart unit sales.
- Type - Forecasting
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message - NA
Estimate the unit sales of Walmart retail goods
- Type - Forecasting
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message - NA
Detect secret data hidden within digital images
- Type - Vision
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message - NA
Prostate cancer diagnosis using the Gleason grading system
- Type - Medicine | Vision
- Competition overview
- Winner blog/article - NA
- Winner notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Optimizing a photo album from Hash Code 2019
- Type - Vision
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message - NA
Identify melanoma in lesion images
- Type - Vision | Medicine
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message - NA
Given an image, can you find all of the same landmarks in a dataset?
- Type - Vision
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message - NA
Can you help identify wheat heads using image analysis?
- Type - Vision
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message - NA
Build tools for bird population monitoring
- Type -
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message - NA
Collect the most halite during your match in space
- Type -
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Label famous (and not-so-famous) landmarks in images
- Type - Vision
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Urgent need to bring the COVID-19 vaccine to mass production
- Type -
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message - NA
Predict lung function decline
- Type - Medicine
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message - NA
Classify Pulmonary Embolism cases in chest CT scans
- Type - Medicine | Vision
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - Since the provided data are large and of high quality, cross-validation is not necessary; a single training/validation split is reliable enough (see the sketch below).
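A minimal sketch of the single hold-out split mentioned in the take-home message; the file name, target column, and model below are placeholders, not the winners' actual pipeline:

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score

df = pd.read_csv("train.csv")                     # hypothetical training file
X, y = df.drop(columns=["target"]), df["target"]  # hypothetical target column

# One stratified 90/10 split instead of k-fold cross-validation: with a
# large, high-quality dataset the hold-out estimate is already stable.
X_tr, X_val, y_tr, y_val = train_test_split(
    X, y, test_size=0.1, stratify=y, random_state=42
)

model = GradientBoostingClassifier().fit(X_tr, y_tr)
print("validation AUC:", roc_auc_score(y_val, model.predict_proba(X_val)[:, 1]))
```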
Build motion prediction models for self-driving vehicles
- Type - Vision | Autonomous Vehicles
- Competition overview
- Winner blog/article
- Winner (3rd) blog/article
- Winner (3rd) notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Reverse the arrow of time in the Game of Life
- Type - NA
- Competition overview
- Winner blog/article
- Winner (3rd) blog/article
- Winner (3rd) notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Can you improve the algorithm that classifies drugs based on their biological activity?
- Type - Classification
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel
- Other notebook/code/kernel - NA
- Take home message - NA
Train agents to master the world's most popular sport
- Type - Reinforcement Learning
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message - NA
Can you help coordinate the drone delivery supply chain?
- Type -
- Competition overview
- Winner blog/article
- Winner notebook/code/kernel - NA
- Other notebook/code/kernel - NA
- Take home message - NA