How to get into the top 15 of a Kaggle competition using Python

Kaggle competitions are a fantastic way to learn data science and build your portfolio. I personally used Kaggle to learn many data science concepts. I started out with Kaggle a few months after learning to program, and later won several competitions.

Doing well in a Kaggle competition requires more than just knowing machine learning algorithms. It requires the right mindset, the willingness to learn, and a lot of data exploration. Tutorials on getting started with Kaggle typically don’t emphasize these aspects, though. In this post, I’ll cover how to get started with the Kaggle Expedia hotel recommendations competition, including establishing the right mindset, setting up testing infrastructure, exploring the data, creating features, and making predictions.

At the end, we’ll generate a submission file using the techniques in this post. As of this writing, the submission would rank in the top 15.

Figure: Where this submission would rank as of this writing.

The Expedia Kaggle competition

The Expedia competition challenges you with predicting what hotel a user will book based on some attributes about the search the user is conducting on…