Bias-Variance Tradeoff

Thiruthuvaraj Rajasekhar
2 min read · Mar 14, 2022

Every machine learning model is a mathematical model, and it makes assumptions about the data: for example, that the data follows a particular distribution, or that the data matrix is non-singular. Given these assumptions, the total error of any model built on the data can be written as the sum of squared bias, variance, and irreducible error.

Every supervised learning model tries to find the best approximation of the function f(x) that maps the input variable X to the output variable Y. We say the model is a good fit if it has a very low error.

As mentioned above, the part of this error that we can control can be attributed to error due to bias and error due to variance.

Error Due to Bias: it is the difference between the average prediction of our model and the actual value. Suppose you build n models, each on a different sample of the data; because of the randomness in the samples, you get a range of predictions. Bias measures how far the average of these predictions is from the actual value. A model that is too simple to capture complex patterns, for example a linear model fitted to non-linear data, has high bias: it pays too little attention to the training data, so its error is high on both the training data and the test data.
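To make the high-bias case concrete, here is a minimal sketch in Python with NumPy. Everything in it is an assumption chosen for illustration: the sine-shaped true function, the noise level of 0.1, and the sample size of 200. It fits a straight line to non-linear data and compares the train and test errors with the noise variance.

```python
import numpy as np

rng = np.random.default_rng(0)

def true_f(x):
    # Hypothetical non-linear ground truth, chosen only for illustration
    return np.sin(2 * np.pi * x)

def sample(n, noise_sd=0.1):
    # Draw n inputs uniformly from [0, 1] and add Gaussian noise to true_f
    x = rng.uniform(0.0, 1.0, n)
    y = true_f(x) + rng.normal(0.0, noise_sd, n)
    return x, y

x_train, y_train = sample(200)
x_test, y_test = sample(200)

# A straight line (degree-1 polynomial) is too simple for a sine wave
coeffs = np.polyfit(x_train, y_train, deg=1)

train_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
test_mse = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
noise_var = 0.1 ** 2  # the irreducible part of the error in this setup

print(f"train MSE: {train_mse:.3f}")
print(f"test MSE:  {test_mse:.3f}")
print(f"noise variance: {noise_var:.3f}")
```

Both errors should come out far above the noise variance, and close to each other. That is the signature of high bias: the model is wrong in the same way on data it has seen and on data it has not.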

Error Due to Variance: it is the variability of the model's prediction for a single data point. Suppose again that you build n models on different samples of the data; variance measures how much their predictions for a given point differ from one another. A high-variance model pays a lot of attention to the training data and does not generalize to data it has not seen, so its error is low on the training data but high on the test data.
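The "build n models" idea can be simulated directly. The sketch below is only an illustration under assumed settings: a sine-shaped true function, a fixed grid of 30 inputs, Gaussian noise with standard deviation 0.3, and 500 models per setting. The helper name predictions_at is hypothetical. Each model is trained on a freshly drawn noisy sample, and squared bias and variance are estimated at a single point x0 for a simple model (degree 1) and a flexible one (degree 9).

```python
import numpy as np
from numpy.polynomial import Polynomial

rng = np.random.default_rng(42)

def true_f(x):
    # Hypothetical ground truth, chosen only for illustration
    return np.sin(2 * np.pi * x)

def predictions_at(x0, degree, n_models=500, n_points=30, noise_sd=0.3):
    """Fit n_models polynomials, each on a fresh noisy sample,
    and return every model's prediction at the point x0."""
    x = np.linspace(0.0, 1.0, n_points)  # inputs stay fixed, only the noise changes
    preds = np.empty(n_models)
    for i in range(n_models):
        y = true_f(x) + rng.normal(0.0, noise_sd, n_points)
        model = Polynomial.fit(x, y, deg=degree)
        preds[i] = model(x0)
    return preds

x0 = 0.25
results = {}
for degree in (1, 9):
    preds = predictions_at(x0, degree)
    bias_sq = (preds.mean() - true_f(x0)) ** 2  # (average prediction - true value)^2
    variance = preds.var()                      # spread of predictions around their average
    results[degree] = (bias_sq, variance)
    print(f"degree {degree}: bias^2 = {bias_sq:.4f}, variance = {variance:.4f}")
```

In this setup the straight line should show the larger squared bias and the degree-9 polynomial the larger variance, which is the tradeoff in the title.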

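Now, consolidating the above points, suppose the data is generated as

y = f(x) + e

where e is random noise with mean zero and variance σ². The expected squared error of a prediction y_hat at a point x is

Err(x) = E[(y - y_hat)²]

which can be further split as

Err(x) = Bias² + Variance + Irreducible Error

Bias² = (E[y_hat] - f(x))² (the square of the difference between the average prediction and the true value f(x))
Variance = E[(y_hat - E[y_hat])²] (the average squared difference between each model's prediction and the average prediction)
Irreducible Error = σ² (the variance of the noise e, which no model can remove)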
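
For readers who want to see where the split comes from, here is a short derivation sketch. It takes the "actual value" in the bias term to be the noise-free f(x). It also assumes that the noise e has mean zero and variance σ² and is independent of the prediction y_hat. These assumptions are standard, but they are spelled out here rather than taken from the text above.

```latex
\begin{align*}
\mathrm{Err}(x) &= E\big[(y - \hat{y})^2\big] \\
  &= E\big[(f(x) + e - \hat{y})^2\big] \\
  &= E\big[(f(x) - \hat{y})^2\big] + 2\,E\big[e\,(f(x) - \hat{y})\big] + E[e^2] \\
  &= E\big[(f(x) - \hat{y})^2\big] + \sigma^2 \\
  &= \big(E[\hat{y}] - f(x)\big)^2 + E\big[(\hat{y} - E[\hat{y}])^2\big] + \sigma^2
\end{align*}
```

The cross term in the third line vanishes because e has mean zero and is independent of y_hat. The last step adds and subtracts E[y_hat] inside the square, and the new cross term vanishes because E[y_hat - E[y_hat]] = 0. The three terms that remain are Bias², Variance, and the irreducible error.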

If you like this content, please support for more content! :-)

Thiruthuvaraj Rajasekhar

Mining Data For Insights | Passionate to write | Data Scientist