This repository contains our code for a Kaggle competition.
27th Place Solution for PetFinder Adoption Prediction
Team: Y.Nakama, currypurin, atfujita, copypaste
Public score: 0.484 (6th)
Private score: 0.43455 (27th)
- Features from the JSON files and text are almost the same as in public kernels
- Malaysia features: GDP, area, population, HDI (Human Development Index)
- Feature extraction from the first image with DenseNet121
- Variance aggregation grouped by RescuerID, to tell the model whether a rescuer's pets (the RescuerID group) are all treated with the same quality
- New health features counting how many of ['Health', 'Vaccinated', 'Dewormed', 'Sterilized'] are 1 (good) or 3 (Not Sure)
- New age feature, built from 'Age' and 'RescuerID_Age_var', that expresses whether the pet is younger or older than others in its RescuerID group or overall
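The RescuerID aggregation, health counts, and relative age features above could be sketched in pandas roughly as follows. This is a minimal illustration on made-up toy data, not the code from this repository: the column names `health_count_1`, `health_count_3`, `RescuerID_Age_mean`, `Age_diff_from_group`, and `Age_diff_from_overall` are hypothetical, and the way 'Age' is compared against the group (a simple difference from the mean) is an assumption.

```python
import pandas as pd

# Toy data using the competition's column names; all values are made up.
df = pd.DataFrame({
    "RescuerID": ["a", "a", "a", "b", "b", "c"],
    "Age": [2, 4, 12, 6, 6, 24],
    "Health": [1, 1, 2, 1, 3, 1],
    "Vaccinated": [1, 3, 2, 1, 3, 1],
    "Dewormed": [1, 3, 1, 2, 3, 1],
    "Sterilized": [2, 3, 1, 1, 3, 1],
})

health_cols = ["Health", "Vaccinated", "Dewormed", "Sterilized"]

# Health features: how many of the four columns equal 1, and how many equal 3.
df["health_count_1"] = (df[health_cols] == 1).sum(axis=1)
df["health_count_3"] = (df[health_cols] == 3).sum(axis=1)

# Variance aggregation per rescuer, broadcast back to every row of the group.
# pandas uses the sample variance (ddof=1), so a rescuer with a single pet gets NaN.
grouped_age = df.groupby("RescuerID")["Age"]
df["RescuerID_Age_var"] = grouped_age.transform("var")
df["RescuerID_Age_mean"] = grouped_age.transform("mean")

# Relative age (assumed form): negative means younger than the group or overall mean.
df["Age_diff_from_group"] = df["Age"] - df["RescuerID_Age_mean"]
df["Age_diff_from_overall"] = df["Age"] - df["Age"].mean()

print(df[["RescuerID", "Age", "RescuerID_Age_var", "Age_diff_from_group"]])
```

Using `transform` rather than `agg` keeps the result aligned with the original rows, so the group statistics can be assigned directly as new columns.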
The following features have high importance:
- Feature extraction from the first image with DenseNet121 and MobileNet
- Feature extraction from the second and later images with DenseNet121
- groupby RescuerID
- pure_breed(x)
- image feature extraction and SVD
- text data SVD
- groupby RescuerID
- text data SVD and NMF
- different tokenization, with/without stemming
- CountVectorizer / TfidfVectorizer
- image feature extraction with DenseNet121 and MobileNet
- image quality features based on blur and NIMA
- LightGBM
- XGBoost
- CatBoost
- We performed ridge regression stacking using 9 models (all GBDT).
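As a rough illustration of ridge regression stacking, the sketch below fits a scikit-learn `Ridge` meta-model on out-of-fold predictions from nine base models. Everything here is an assumption made for the example: the predictions are synthetic stand-ins rather than real LightGBM, XGBoost, or CatBoost outputs, and the fold count, `alpha`, and variable names are hypothetical rather than the settings we used.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import KFold

rng = np.random.default_rng(0)
n_train, n_test, n_models = 200, 50, 9

# Synthetic target in the 0..4 range (stand-in for AdoptionSpeed).
y_train = rng.integers(0, 5, size=n_train).astype(float)

# Stand-ins for the base models' predictions: one column per GBDT model.
# In practice these would be out-of-fold predictions on train and
# fold-averaged predictions on test.
oof_preds = y_train[:, None] + rng.normal(0.0, 1.0, size=(n_train, n_models))
test_preds = rng.uniform(0.0, 4.0, size=(n_test, n_models))

stack_oof = np.zeros(n_train)
stack_test = np.zeros(n_test)
kf = KFold(n_splits=5, shuffle=True, random_state=0)

for fit_idx, val_idx in kf.split(oof_preds):
    # The meta-model only sees base-model predictions, never the raw features.
    meta = Ridge(alpha=1.0)
    meta.fit(oof_preds[fit_idx], y_train[fit_idx])
    stack_oof[val_idx] = meta.predict(oof_preds[val_idx])
    # Average the meta-model's test predictions over the folds.
    stack_test += meta.predict(test_preds) / kf.get_n_splits()

rmse_single = np.sqrt(np.mean((oof_preds[:, 0] - y_train) ** 2))
rmse_stack = np.sqrt(np.mean((stack_oof - y_train) ** 2))
print(f"single model RMSE: {rmse_single:.3f}, stacked RMSE: {rmse_stack:.3f}")
```

The stacked output is continuous, so it would still need to be converted into discrete classes before submission; that step is not shown here.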
To read about some of the things we tried, see this blog post (written in Japanese).