Car depreciation and regression splines….

You might know that terrible feeling when buying a new car. As soon as you pick it up and drive it out of the showroom, it loses value! The question is: how much? Of course, this depreciation will depend on the make and model of the car.

To get some idea of the depreciation, I extracted data from a Dutch used car sales site, so the amounts are in Euros. Several features can be scraped for every car, for example the make, brand, fuel type, transmission, energy label and age. To give an idea of the data, the figure below displays around 2000 Renault Clios extracted from the site. On the x axis we have mileage (in this case kilometers driven), and on the y axis the price in Euros (the price displayed in the ad, so not the price actually paid).


A simple linear regression model is fitted to these 2000 Renault Clios. The parameters are given in the following figure.


So on average, a new Clio (0 km) costs around 15,082 Euros, and Clios with an automatic transmission are 1,989 Euros more expensive. Every kilometer you drive in a Clio costs you 7.28 cents in lost value. The R-squared of this simple regression model is 0.66. Some other cars to compare the depreciation with are given in the table below.
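To make the fitted line concrete, here is a minimal Python sketch that plugs the quoted coefficients into a straight-line price formula. It assumes the automatic transmission effect is a simple additive premium, which is how I read the numbers above; the function name `linear_price` is mine and not part of the original analysis.

```python
# Coefficients quoted in the post for the simple linear model (Euros).
INTERCEPT_EUR = 15_082          # fitted price at 0 km, manual transmission
AUTOMATIC_PREMIUM_EUR = 1_989   # extra for an automatic (assumed additive)
LOSS_PER_KM_EUR = 0.0728        # 7.28 cents per kilometer driven


def linear_price(km: float, automatic: bool = False) -> float:
    """Predicted asking price under the straight-line model."""
    premium = AUTOMATIC_PREMIUM_EUR if automatic else 0
    return INTERCEPT_EUR + premium - LOSS_PER_KM_EUR * km


if __name__ == "__main__":
    for km in (0, 50_000, 100_000):
        manual = round(linear_price(km))
        auto = round(linear_price(km, automatic=True))
        print(f"{km:>7} km  manual: {manual}  automatic: {auto}")
```

Note that a straight line keeps subtracting the same amount per kilometer forever, so at a high enough mileage the predicted price becomes negative. That is one more hint that a single line is too rigid for this data.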


Looking at the plot above, you can already see that a straight line is probably not the best curve to fit the data points. Hmmm, so what other curves can we try? Splines!

Splines can be seen as piece-wise polynomials, glued together. So for example, from 0 to 25,000 km one polynomial is used to predict the price, and from 25,000 km to 75,000 km another polynomial is used. The points at which the polynomials are glued together are called knots. Splines are constructed in such a way that the curve is smooth at the knots. The term comes from the tool used by shipbuilders and drafters to construct smooth shapes having desired properties. Drafters have long made use of a bendable strip, fixed in position at a number of points, that relaxes to form a smooth curve passing through those points.
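One common way to write such a curve is the truncated power basis: a single cubic polynomial plus one extra cubic term per knot that is zero to the left of the knot. The sketch below uses the two knots from the example above (in thousands of km). The coefficients are hypothetical, picked only to draw a plausible curve and not fitted to the Clio data.

```python
KNOTS = (25.0, 75.0)  # knots from the example in the text, in thousands of km

# Hypothetical coefficients (NOT fitted to any data).
BASE = (16000.0, -250.0, 2.0, -0.01)  # b0 + b1*x + b2*x^2 + b3*x^3
KNOT_COEFS = (0.02, -0.005)           # one cubic correction per knot


def truncated_cubic(x: float, knot: float) -> float:
    """(x - knot)^3 to the right of the knot, 0 to the left of it."""
    return max(0.0, x - knot) ** 3


def spline_price(x: float) -> float:
    """Cubic spline in truncated power form; x is in thousands of km."""
    b0, b1, b2, b3 = BASE
    value = b0 + b1 * x + b2 * x**2 + b3 * x**3
    for coef, knot in zip(KNOT_COEFS, KNOTS):
        value += coef * truncated_cubic(x, knot)
    return value


def slope(x: float, h: float = 1e-4) -> float:
    """Numerical first derivative (central difference)."""
    return (spline_price(x + h) - spline_price(x - h)) / (2 * h)


if __name__ == "__main__":
    for knot in KNOTS:
        # Value and slope just left and just right of each knot nearly agree.
        print(knot, slope(knot - 0.001), slope(knot + 0.001))
```

Because each extra term starts at zero with zero slope and zero curvature at its knot, the pieces join without a visible kink, which is the "smooth at the knots" property described above.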

In SAS, the adaptivereg procedure can fit splines. It has some handy features: it constructs spline basis functions adaptively by automatically selecting appropriate knot values for different variables, and it obtains reduced models by applying model selection techniques. Let’s fit a spline model to the Renault Clios using this procedure.
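To give a feel for what "automatically selecting knot values" means, here is a much-simplified Python sketch. It is not the algorithm used by the SAS procedure. It fits price = a + b·x + c·max(0, x − knot) by least squares for a handful of candidate knots and keeps the knot with the smallest squared error. The data is synthetic and noise-free, and the 18,000 Euro starting price and all helper names are assumptions made up for the illustration.

```python
def hinge(x, knot):
    """Hinge basis function: 0 left of the knot, x - knot to the right."""
    return max(0.0, x - knot)


def solve(a, b):
    """Solve a small linear system by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_with_knot(xs, ys, knot):
    """Least-squares fit of a + b*x + c*hinge(x, knot); returns (coefs, SSE)."""
    rows = [[1.0, x, hinge(x, knot)] for x in xs]
    # Normal equations: (A^T A) coefs = A^T y
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    aty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(3)]
    coefs = solve(ata, aty)
    sse = sum(
        (y - sum(c * v for c, v in zip(coefs, r))) ** 2
        for r, y in zip(rows, ys)
    )
    return coefs, sse


def best_knot(xs, ys, candidates):
    """Pick the candidate knot with the smallest sum of squared errors."""
    return min(candidates, key=lambda k: fit_with_knot(xs, ys, k)[1])


# Synthetic data: x in thousands of km, a kink deliberately placed at 55.
XS = [5.0 * i for i in range(31)]                        # 0 .. 150
YS = [18000.0 - 170.0 * x + 130.0 * hinge(x, 55.0) for x in XS]
CANDIDATES = (25.0, 40.0, 55.0, 70.0, 85.0)

if __name__ == "__main__":
    print("selected knot:", best_knot(XS, YS, CANDIDATES))
```

A real adaptive procedure does far more than this (many variables, many knots, and pruning of terms afterwards), but the core idea of letting the data choose where the curve bends is the same.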


The spline model has an R-squared of 0.76, a big improvement over the 0.66 of the simple linear regression model. What does the car value prediction look like? Look at the figure below.


We see that new Clios with an automatic transmission are around 5,000 Euros more expensive than Clios with a manual transmission; however, these automatics lose value much faster. There is a turning point at around 55,000 km: up to that point a Clio loses around 17 cents/km, and after 55,000 km the rate drops to around 4 cents/km. Interested in the depreciation of other car makes? Look at my little Shiny app.
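As a rough back-of-the-envelope check, the two rates can be turned into a cumulative loss in value. The Python sketch below assumes the rate is exactly 17 cents/km up to 55,000 km and exactly 4 cents/km afterwards, which simplifies the fitted curve, and it ignores the difference between manual and automatic transmission. The function name `value_lost_eur` is my own.

```python
KNOT_KM = 55_000        # turning point mentioned in the text
EARLY_RATE_CENTS = 17   # approximate loss per km before the turning point
LATE_RATE_CENTS = 4     # approximate loss per km after the turning point


def value_lost_eur(km: int) -> float:
    """Approximate total depreciation in Euros after driving `km` kilometers."""
    early_km = min(km, KNOT_KM)
    late_km = max(0, km - KNOT_KM)
    total_cents = EARLY_RATE_CENTS * early_km + LATE_RATE_CENTS * late_km
    return total_cents / 100


if __name__ == "__main__":
    for km in (10_000, 55_000, 100_000):
        print(f"{km:>7} km: about {value_lost_eur(km):,.0f} Euros lost")
```

Under these assumptions most of the loss happens early: the first 55,000 km account for far more depreciation than the next 45,000 km.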


A happy Clio driver!


5 thoughts on “Car depreciation and regression splines….”

  1. Pingback: Deploying a car price model using R and AzureML | Longhow Lam's Blog

  2. Peter,

    I have used the R package rvest to scrape data from web sites. Example code can be found in my blog post on soap analytics. And there are many good tutorials out there.



  3. I got this already from the post, was hoping for a full code example with this data… I’m a rookie with R etc. thanks anyhow for responding.

