Predict the resale price of a BMW car using a neural network

SUMEET SAWANT
4 min read · Apr 18, 2020


When I moved to Canada a couple of years ago, I was looking to buy a pre-owned/pre-certified car. However, the prices quoted by dealers for the cars I liked always felt very high to me: they used to exceed my own estimate by $2,000 to $3,000. Since I am a mechanical engineer who also happens to work in the automotive domain, I always wondered which factors influence the price of a pre-certified car. With this data set on Kaggle, I got a chance to predict just that.

I am sure many more people face the same question. I hope this data analysis and subsequent deep learning prediction model can help them.

First, let's describe the data set variables:

Maker key: The brand of the car

Model key: The model of the car

Mileage: Total miles driven

Engine power: Engine capacity

Registration date: Date car was registered

Fuel: Type of fuel (diesel, petrol, ...)

Paint color: The color of the car

Car type: The type of car (sedan, SUV, ...)

Feature 1 to 8: Boolean features which the company wants to explore

Price: The price at which it was auctioned

Sold at: The date on which it was sold

I explored the data with matplotlib and seaborn, the data visualization packages for Python, and was able to get some insights into the data set. I am going to present a few of them in this post, with a link to my Kaggle workspace for more detailed graphs.

First, I see that in the resale market the paint color of the car rarely influences the final price. Below is the violin plot of paint color vs price.

Violin Plot for Paint color vs Price

We see that the median price and the interquartile range (IQR) are almost the same for all colors.

Second, among all the BMW models sold, SUVs and coupes command a higher price range even with more miles driven. This could be due in part to the higher original price of these vehicles and in part to higher demand for these models in the resale market.
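To check the same thing numerically rather than by eye, the median and IQR per color can be computed with a pandas group-by. This is only a minimal sketch: the column names `paint_color` and `price` are assumptions about how the data is loaded, and the rows below are made up for illustration, not taken from the Kaggle data.

```python
import pandas as pd

def price_spread_by_color(df, color_col="paint_color", price_col="price"):
    """Return median, quartiles and IQR of price for each paint color."""
    grouped = df.groupby(color_col)[price_col]
    summary = pd.DataFrame({
        "median": grouped.median(),
        "q1": grouped.quantile(0.25),
        "q3": grouped.quantile(0.75),
    })
    summary["iqr"] = summary["q3"] - summary["q1"]
    return summary

# Made-up rows, only to show the shape of the result
cars = pd.DataFrame({
    "paint_color": ["black", "black", "black", "white", "white", "white"],
    "price": [10000, 12000, 14000, 11000, 12000, 13000],
})
summary = price_spread_by_color(cars)
print(summary)
```

If the medians and IQRs in this table are close across colors, that supports what the violin plot suggests. The plot itself could be drawn with `seaborn.violinplot(data=cars, x="paint_color", y="price")`.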

Box plot showing how SUV and Coupe have a higher median price vs other models
SUV (red) and Coupe (pink) show a higher price

Third, as the mileage on the car increases, the price it commands decreases. The price also decreases with age, even if the mileage on the car is low.

The graph below shows price vs mileage, color coded by the registration date of the car. We can see that older cars with fewer miles command a lower price.
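One way to make the age effect explicit is to turn the two date columns into a single "age at sale" feature. The sketch below is an assumption on my part about how this could be done, not necessarily what the notebook does. The column names (`registration_date`, `sold_at`, `mileage`, `price`) are hypothetical and the rows are invented for illustration.

```python
import pandas as pd

# Invented rows, only to illustrate the transformation
cars = pd.DataFrame({
    "mileage": [150000, 20000],
    "registration_date": ["2012-02-01", "2016-04-01"],
    "sold_at": ["2018-01-01", "2018-02-01"],
    "price": [11000, 30000],
})

# Parse the date strings into real datetime values
for col in ("registration_date", "sold_at"):
    cars[col] = pd.to_datetime(cars[col])

# Age of the car at the time of sale, in years (365.25 days per year)
cars["age_years"] = (cars["sold_at"] - cars["registration_date"]).dt.days / 365.25

print(cars[["mileage", "age_years", "price"]])
```

With an `age_years` column in place, price can be plotted or modeled against mileage and age separately, which is roughly what the color coding in the graph shows.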

Finally, I fit a 3-layer neural network on the data to predict the price of a BMW car from the above features. After hyperparameter tuning I was able to get an R-squared value of about 0.82 (82%), which means that the model can explain about 82% of the variation in the price with the given features.
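For readers who want to see what the R-squared number measures, here is a minimal sketch of the standard definition, 1 minus the ratio of the residual sum of squares to the total sum of squares. The function name `r_squared` is my own, and the example values are made up.

```python
import numpy as np

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)         # variation the model fails to explain
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total variation around the mean price
    return 1.0 - ss_res / ss_tot

perfect = r_squared([1, 2, 3], [1, 2, 3])    # predictions match exactly
mean_only = r_squared([1, 2, 3], [2, 2, 2])  # always predicting the mean
print(perfect, mean_only)
```

A model that always predicts the mean price scores 0 and a perfect model scores 1, so a value of about 0.82 sits much closer to the perfect end.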

Summary of the Neural Network fitted

The snapshots alongside show the R-squared value I obtained from my deep neural network. Below are the training loss and validation loss plotted against iterations. I will avoid going into the initialization and other fine details of the neural net fitted.

All of that information can be found in my Kaggle notebook.

The one thing I was not able to find out is what the 8 boolean features represent. If anyone knows what those 8 boolean features are, please let me know.
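For readers who do not want to open the notebook, the sketch below shows the general idea of a small network with three weight layers that records training and validation loss at every iteration. It is a plain NumPy stand-in, not the model from the notebook: the layer sizes, learning rate, He-style initialization and synthetic data are all assumptions chosen for illustration.

```python
import numpy as np

def train_mlp(X_train, y_train, X_val, y_val,
              hidden=(16, 8), lr=0.005, epochs=300, seed=0):
    """Full-batch gradient descent on MSE for a ReLU network with a linear output."""
    rng = np.random.default_rng(seed)
    sizes = [X_train.shape[1], *hidden, 1]  # two hidden layers + output = 3 weight layers
    weights = [rng.normal(0.0, np.sqrt(2.0 / m), size=(m, n))
               for m, n in zip(sizes[:-1], sizes[1:])]
    biases = [np.zeros(n) for n in sizes[1:]]

    def forward(X):
        activations = [X]
        for i, (W, b) in enumerate(zip(weights, biases)):
            z = activations[-1] @ W + b
            is_output = i == len(weights) - 1
            activations.append(z if is_output else np.maximum(z, 0.0))
        return activations

    history = {"loss": [], "val_loss": []}
    for _ in range(epochs):
        acts = forward(X_train)
        pred = acts[-1][:, 0]
        val_pred = forward(X_val)[-1][:, 0]
        # Record both losses before this iteration's weight update
        history["loss"].append(float(np.mean((pred - y_train) ** 2)))
        history["val_loss"].append(float(np.mean((val_pred - y_val) ** 2)))

        # Backpropagation, starting from the gradient of MSE w.r.t. the output
        grad = (2.0 / len(y_train)) * (pred - y_train)[:, None]
        for i in reversed(range(len(weights))):
            grad_W = acts[i].T @ grad
            grad_b = grad.sum(axis=0)
            if i > 0:
                # Pass the gradient through the previous ReLU layer
                grad = (grad @ weights[i].T) * (acts[i] > 0)
            weights[i] -= lr * grad_W
            biases[i] -= lr * grad_b
    return history

# Synthetic stand-in data (not the BMW data set)
rng = np.random.default_rng(42)
X = rng.normal(size=(300, 3))
y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + 0.1 * rng.normal(size=300)
history = train_mlp(X[:240], y[:240], X[240:], y[240:])
print(history["loss"][0], history["loss"][-1])
```

Plotting `history["loss"]` and `history["val_loss"]` against the iteration number gives the kind of loss curve described above. If the validation curve starts rising while the training curve keeps falling, that is the usual sign of overfitting.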

For additional questions, please feel free to connect with me via LinkedIn here: https://www.linkedin.com/in/sawantsumeet/
