Polynomial Regression | Data Science | Machine Learning

Bayesian Information Criterion (BIC)

Determining the best degree of polynomial to choose in a polynomial regression.

Swapnil Kangralkar
3 min read · Apr 13, 2021


In this article, we will learn what the Bayesian Information Criterion (BIC) is and how it is used to choose the degree of the polynomial in a polynomial regression.

Sometimes R2 values differ only slightly between two polynomial degrees, for example an R2 score of 88.3% versus 88.4%. How do we know which one is better? And is R2 = 90% really better than R2 = 88%?

Let’s study this by creating some dummy data:
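The original data is not shown here, so the following is only a minimal sketch of what such dummy data could look like. The curve (a fifth-degree polynomial), the noise level, the sample size, and the seed are all assumptions chosen for illustration.

```python
import numpy as np

# Hypothetical dummy data (not the author's original):
# a noisy fifth-degree curve sampled on [-1, 1].
rng = np.random.default_rng(42)
x = np.linspace(-1, 1, 100)
y_true = 16 * x**5 - 20 * x**3 + 5 * x            # assumed underlying signal
y = y_true + rng.normal(scale=0.2, size=x.size)   # add Gaussian noise

print(x[:3], y[:3])
```

Keeping x in the range [-1, 1] is a deliberate choice: high powers of x stay small, which keeps the high-degree fits numerically stable.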

Let’s fit the model with Ordinary Least Squares (OLS), using a package that provides a detailed statistical summary including AIC and BIC.

A straight line definitely does not fit this data. Let’s fit polynomials of degree 2 to 17 instead.

Instead of doing it individually for each degree, let’s use a for loop:

I will paste a few plots here:

Let’s create a pandas dataframe to easily compare the R2 values.

If you look at the R2 values for degrees 5, 6, and 7, they are all very close. In such cases it is impossible to tell from the R2 values alone which order of polynomial is best.

Sure, after degree 10 the R2 value starts to drop. But how could you tell whether degree 8 or degree 10 is better? Conventionally, we consider a higher value of R2 to be better, which in this case would lead us to choose degree 11. (It’s easy to plot graphs with 1 or 2 independent variables, but with more features we cannot rely on plots either.)

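In such cases BIC helps us choose the optimal order of the polynomial equation. BIC rewards goodness of fit but penalizes every additional parameter, so the model with the lowest BIC is preferred.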

The plot above shows us that the optimal degree of the polynomial equation for this particular data is 5, so the 5th-order polynomial is the best choice.

Thank you for reading. Any feedback will be appreciated.

Get in touch if you have further questions via LinkedIn.
