Which customers will purchase a quoted insurance plan?

This is a past Kaggle competition in which participants were challenged to predict whether a customer we reach out to will accept or reject the quote from the telemarketer. Even though the dataset in this case was provided by an insurance company, this is a generic problem applicable to many industry domains. Basically, if we had a crystal ball and knew exactly which people were going to be our future customers, it would be very easy to assign marketing budget and strategies by making targeted offers. …

NLP is a domain that is changing rapidly. The advent of the transformer architecture has brought NLP tasks closer to human-level accuracy. I too wanted to explore this development, which is making machines parse data like a human, but just reading research papers was not enough. I tried to apply my newly gained skills to an old Kaggle competition.

The Transformer architecture was born out of the "Attention Is All You Need" paper from Google. A nice explanation of the Transformer can be found in this article.

Hugging Face’s Transformers library builds and maintains the different kinds of…

If machine learning had been an engineering subject during my undergraduate days, unsupervised learning would have been the chapter that many of my engineering friends kept as an optional read, focusing most of their attention on its more glamorous brother, supervised learning.

If you pick up any general machine learning textbook, you will very likely find unsupervised learning in the last chapter, but if you dig deeper, unsupervised learning can be applied to a number of important tasks such as manufacturing defect detection, labelling unlabeled samples, catching outliers in a dataset, and fraud detection in a…

As part of my continuing data analysis learning journey, I thought of trying out past Kaggle competitions to test my skills and knowledge so far. While going through the datasets, I came across the Mercedes-Benz Greener Manufacturing Kaggle competition, conducted sometime in 2017.

Coming from the automotive domain, I thought this could be a good dataset on which to apply my data analysis skills. On reading the competition description, I could relate to the problem even more closely.

As the coronavirus spread across the world, sporting events including the IPL were cancelled, rendering my Hotstar membership moot. So, as a budding data scientist, I decided to fill the IPL window with a data visualization project of my own. I found a dataset of all IPL games from 2008–2019 on Kaggle.

For readers unfamiliar with cricket, here is a YouTube video on the IPL and cricket.

The dataset contained two CSV files:

Matches.csv: Information on all matches played in the IPL from 2008–2019, providing the information below


When I moved to Canada a couple of years back, I was looking to buy a pre-owned/pre-certified car. However, I used to feel that the prices quoted by dealers were very high for the cars I liked. The quoted prices always beat my own estimate by $2000–$3000. Also, since I am a mechanical engineer who happens to work in the automotive domain, I always wondered what factors influence the price of a pre-certified car. With this dataset on Kaggle, I got a chance to predict just that.

