Brief
The Adjusted R-Squared is an evaluation metric that addresses a key limitation of the R-Squared. Like R-Squared, it tells us how well our regression line fits the data and how much of the variation in the output is explained by the independent variables, but it also penalizes the model for the number of variables it uses. When you add a useful variable, the Adjusted R-Squared increases; when the variable adds no real information, it decreases. The result is a real number that can be positive or negative, and it is always less than or equal to the R-Squared value. Let's find out how it works using the following Adjusted R-Squared formula:
Formula
\begin{equation} R_{adj}^{2}=1-\left[\frac{\left(1-R^{2}\right)(n-1)}{n-k-1}\right] \end{equation}
Explanation
Here, \( R^{2} \) = the R-Squared value,
n = the number of records in your dataset,
k = the number of independent variables in the model (excluding the dependent variable).
• In the numerator of this formula, we subtract the R-Squared value from 1 and multiply the result by (n - 1), the number of records minus one.
• In the denominator, we subtract the number of variables k and 1 from the number of records n, giving (n - k - 1).
• Finally, we divide the numerator by the denominator and subtract the result from 1, which gives us the Adjusted R-Squared.
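As a minimal sketch of this formula in Python (the helper name adjusted_r2 is just for illustration), the whole computation is a single expression:

```python
def adjusted_r2(r2: float, n: int, k: int) -> float:
    """Adjusted R-Squared from the R-Squared value (r2), the number of
    records (n), and the number of independent variables (k)."""
    return 1 - ((1 - r2) * (n - 1)) / (n - k - 1)
```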
Let’s solve one example:
Example
Suppose we have a dataset with R-Squared \( R^{2} \) = 0.73, number of records n = 87652, and number of independent variables (excluding the dependent variable) k = 34.
If you want to know how to calculate the R-Squared \( R^{2} \), check it out here: R-Squared
Now, we will put this data into the Adjusted R-Squared formula:
\begin{equation} R_{adj}^{2}=1-\left[\frac{\left(1-R^{2}\right)(n-1)}{n-k-1}\right] \end{equation}
\begin{equation} R_{adj}^{2}=1-\left[\frac{\left(1 - 0.73\right)(87652-1)}{87652-34-1}\right] \end{equation}
\begin{equation} R_{adj}^{2}=1-\left[\frac{23665.77}{87617}\right] \end{equation}
\begin{equation} R_{adj}^{2}=1-0.2701 = 0.7299 \approx 0.73 \end{equation}
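The same arithmetic can be checked quickly in Python with the example values from above:

```python
r2, n, k = 0.73, 87652, 34
adj_r2 = 1 - ((1 - r2) * (n - 1)) / (n - k - 1)
print(round(adj_r2, 4))  # 0.7299, which rounds to 0.73
```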
Conclusion
Here, the Adjusted R-Squared of 0.73 means that, after accounting for the number of variables in the model, about 73% of the variation in the dependent variable is explained by the regression line. If you add or remove variables and the Adjusted R-Squared increases, that means your model is performing better.
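As an illustration of that behaviour, here is a small sketch using scikit-learn (the data, coefficients, and helper name are made up for the example): it fits a linear regression, then refits after appending a pure-noise column. The plain R-Squared never drops when a variable is added, but the Adjusted R-Squared typically does when the variable carries no information.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

def adjusted_r2(r2, n, k):
    return 1 - ((1 - r2) * (n - 1)) / (n - k - 1)

rng = np.random.default_rng(0)
n = 500
X = rng.normal(size=(n, 3))                       # 3 genuinely useful features
y = X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=1.0, size=n)

# Model 1: useful features only
r2_base = r2_score(y, LinearRegression().fit(X, y).predict(X))

# Model 2: same features plus one pure-noise column
X_noise = np.hstack([X, rng.normal(size=(n, 1))])
r2_noise = r2_score(y, LinearRegression().fit(X_noise, y).predict(X_noise))

print(adjusted_r2(r2_base, n, k=3))   # baseline
print(adjusted_r2(r2_noise, n, k=4))  # typically lower, despite a higher plain R-Squared
```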
Remember that the Adjusted R-Squared can also be negative, typically when the R-Squared is very low (close to 0) and the number of variables is large relative to the number of records.
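For instance, with made-up numbers describing a weak fit on a small dataset, the adjustment can push the value below zero:

```python
r2, n, k = 0.05, 20, 10   # hypothetical: weak fit, few records, many variables
print(1 - ((1 - r2) * (n - 1)) / (n - k - 1))  # ≈ -1.01, i.e. negative
```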
Check out other Evaluation Metrics.