The Sum of Squared Residuals (SSR) Calculator computes the total sum of squared differences between observed values and predicted values from a regression model. The SSR (also called the residual sum of squares or RSS) is the most fundamental measure of model fit in ordinary least squares regression. It quantifies the total amount of variability in the dependent variable that remains unexplained by the model.
In regression analysis, the OLS method finds the parameter estimates (intercept and slopes) that minimize the SSR. This optimization criterion defines the "best-fitting" line as the one that makes the total squared prediction errors as small as possible. The SSR is therefore not just a diagnostic measure — it is the objective function that drives the entire estimation process. A smaller SSR indicates a better fit, meaning the model's predictions are closer to the observed data.
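As a rough illustration of this minimization, the sketch below fits a one-predictor line with the closed-form OLS formulas and checks that nudging the intercept or slope away from the estimates raises the SSR. The data and the function names `ssr` and `ols_fit` are made up for this example and are not part of the calculator.

```python
def ssr(xs, ys, intercept, slope):
    """Sum of squared residuals for the line y = intercept + slope * x."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


def ols_fit(xs, ys):
    """Closed-form OLS estimates for simple (one-predictor) regression."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical data, chosen only for illustration.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = ols_fit(xs, ys)
best = ssr(xs, ys, b0, b1)

# Any small move away from the OLS estimates should increase the SSR.
for d0, d1 in [(0.1, 0.0), (-0.1, 0.0), (0.0, 0.05), (0.0, -0.05)]:
    perturbed = ssr(xs, ys, b0 + d0, b1 + d1)
    print(f"shift ({d0:+.2f}, {d1:+.2f}): SSR {perturbed:.4f} vs OLS {best:.4f}")
```

Checking four perturbations is not a proof, but it shows what "objective function" means in practice: the OLS estimates sit at the bottom of the SSR surface.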
This calculator also computes two related statistics derived from the SSR: the Mean Squared Error (MSE) and the Root Mean Squared Error (RMSE). The MSE is the SSR divided by the number of observations, representing the average squared prediction error. The RMSE is the square root of the MSE, bringing the error measure back to the original units of the dependent variable, which makes it directly interpretable as a typical prediction error.
The SSR plays a central role in nearly every aspect of regression inference. It appears as the numerator of the ratio in R² = 1 - SSR/SS_total, and it is used to calculate the standard errors of regression coefficients, to construct F-tests for overall model significance, and to perform analysis of variance (ANOVA) decompositions. Understanding the SSR and its relationship to other sums of squares is essential for anyone working with regression models.
In machine learning and predictive modeling, the MSE and RMSE derived from the SSR are the most common loss functions for evaluating regression models. They are used in model training (gradient descent minimizes MSE), model selection (cross-validated RMSE compares models), and model reporting (RMSE gives stakeholders an intuitive sense of prediction accuracy). This calculator accepts up to five data points with paired observed and predicted values, enabling quick computation of these essential error metrics.
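To make the "gradient descent minimizes MSE" remark concrete, here is a minimal sketch of plain gradient descent on the MSE of a one-predictor line. The data, the function name `gradient_descent_mse`, the learning rate, and the step count are all assumptions picked for this toy problem, not settings used by any particular library.

```python
def gradient_descent_mse(xs, ys, learning_rate=0.02, steps=5000):
    """Fit y = b0 + b1 * x by plain gradient descent on the MSE."""
    n = len(xs)
    b0, b1 = 0.0, 0.0  # start from an arbitrary initial guess
    for _ in range(steps):
        residuals = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]
        # Partial derivatives of MSE = (1/n) * sum(residual^2).
        grad_b0 = -2.0 / n * sum(residuals)
        grad_b1 = -2.0 / n * sum(r * x for r, x in zip(residuals, xs))
        b0 -= learning_rate * grad_b0
        b1 -= learning_rate * grad_b1
    return b0, b1


# Hypothetical data, chosen only for illustration.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

gd_intercept, gd_slope = gradient_descent_mse(xs, ys)
print(f"intercept = {gd_intercept:.4f}, slope = {gd_slope:.4f}")
```

For this data the iterates settle at the same intercept and slope that the closed-form OLS solution gives (about 0.05 and 1.99), since both minimize the same squared-error criterion. A learning rate that is too large for the scale of x would make the loop diverge, so the value here should not be reused blindly.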
The calculator computes the SSR by summing the squared residuals across all data points: $$SSR = \sum_{i=1}^{n}(y_i - \hat{y}_i)^2$$
Where y_i is the observed value and ŷ_i is the predicted value for each observation. Each residual e_i = y_i - ŷ_i is squared and accumulated.
The Mean Squared Error (MSE) normalizes by the number of observations: $$MSE = \frac{SSR}{n} = \frac{\sum_{i=1}^{n}(y_i - \hat{y}_i)^2}{n}$$
The Root Mean Squared Error (RMSE) takes the square root to return to the original units: $$RMSE = \sqrt{MSE} = \sqrt{\frac{\sum_{i=1}^{n}(y_i - \hat{y}_i)^2}{n}}$$
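The three formulas can be sketched in a few lines of Python. This is an illustrative implementation, not the calculator's own code; the function name `error_metrics` and the sample values are hypothetical.

```python
import math


def error_metrics(observed, predicted):
    """Return (SSR, MSE, RMSE) for paired observed and predicted values."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    if not observed:
        raise ValueError("at least one data point is required")
    residuals = [y - y_hat for y, y_hat in zip(observed, predicted)]
    ssr = sum(e ** 2 for e in residuals)   # sum of squared residuals
    mse = ssr / len(residuals)             # divide by n, as in the formula
    rmse = math.sqrt(mse)                  # back to the units of y
    return ssr, mse, rmse


# Hypothetical paired values; the residuals are 0.5, -0.5, 0.0, -1.0.
observed = [3.0, 5.0, 7.0, 9.0]
predicted = [2.5, 5.5, 7.0, 10.0]

ssr, mse, rmse = error_metrics(observed, predicted)
print(f"SSR = {ssr}, MSE = {mse}, RMSE = {rmse:.4f}")
# SSR = 1.5, MSE = 0.375, RMSE = 0.6124
```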
Note: In formal regression theory, the unbiased estimate of error variance uses n-k-1 in the denominator (where k is the number of predictors). This calculator uses n for simplicity, which gives the population MSE rather than the sample-adjusted version.
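The difference between the two denominators is easy to see numerically. The sketch below uses made-up values for SSR, n, and k; the function names are hypothetical.

```python
def mse_population(ssr, n):
    """MSE with n in the denominator (the convention this calculator uses)."""
    return ssr / n


def mse_unbiased(ssr, n, k):
    """Error-variance estimate with n - k - 1 residual degrees of freedom."""
    dof = n - k - 1
    if dof <= 0:
        raise ValueError("need more observations than estimated parameters")
    return ssr / dof


# Hypothetical values: SSR of 85 from 20 observations and 2 predictors.
ssr, n, k = 85.0, 20, 2
print(mse_population(ssr, n))   # 4.25
print(mse_unbiased(ssr, n, k))  # 5.0
```

The gap shrinks as n grows relative to k, so the choice matters most for small samples such as the five-point inputs this calculator accepts.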
A smaller SSR indicates better model fit: the predictions are closer to the observed values. An SSR of zero means the model predicts every observation perfectly. The MSE gives the average squared prediction error per observation, useful for comparing models applied to the same dataset. The RMSE is the most interpretable metric: it represents the typical prediction error in the same units as the dependent variable. For example, if predicting house prices in dollars, an RMSE of 15,000 means the model's predictions are typically off by about $15,000.
Residuals are small (0.5, 0.8, -1.1, 0.2, -0.5). SSR = 0.25 + 0.64 + 1.21 + 0.04 + 0.25 = 2.39, so MSE = 2.39/5 = 0.478 and RMSE ≈ 0.691, indicating predictions are typically off by less than 1 unit. This suggests a well-fitting model.
Large residuals (-5, 8, 5) produce SSR = 114 and RMSE = 6.16. The model's predictions are typically off by over 6 units, suggesting a poor fit.
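The arithmetic behind this example can be checked by hand, using n = 3 residuals and the calculator's convention of dividing by n:

$$SSR = (-5)^2 + 8^2 + 5^2 = 25 + 64 + 25 = 114$$

$$MSE = \frac{114}{3} = 38, \qquad RMSE = \sqrt{38} \approx 6.16$$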
The terminology can be confusing because different textbooks use different conventions. SSR (Sum of Squared Residuals) and SSE (Sum of Squared Errors) refer to the same quantity: $$\sum(y_i - \hat{y}_i)^2$$. Some authors use SSR to mean Sum of Squares due to Regression (the explained portion), so always check the context. In this calculator, SSR = Sum of Squared Residuals = unexplained variation.
If you simply sum the raw residuals, positive and negative errors cancel out, and for an OLS fit that includes an intercept the sum is always exactly zero. Squaring makes every term non-negative and prevents this cancellation. Squaring also has mathematical advantages: the square function is differentiable everywhere (enabling calculus-based optimization), and minimizing squared errors is equivalent to maximum likelihood estimation under normally distributed errors.
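A small numerical check of the cancellation, assuming a simple one-predictor OLS fit with an intercept. The data and the helper `ols_fit` are hypothetical and exist only for this demonstration.

```python
def ols_fit(xs, ys):
    """Closed-form OLS intercept and slope for one predictor."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Hypothetical data, chosen so the fitted line is y = 1 + 1 * x.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 2.5, 2.0, 4.5]

intercept, slope = ols_fit(xs, ys)
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]

raw_sum = sum(residuals)                     # positive and negative errors cancel
squared_sum = sum(e ** 2 for e in residuals)  # no cancellation after squaring

print(residuals)    # [0.0, 0.5, -1.0, 0.5]
print(raw_sum)      # 0.0
print(squared_sum)  # 1.5
```

The raw sum is zero even though three of the four predictions miss, which is why it cannot serve as a measure of fit.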
There is no universal 'good' RMSE because it depends on the scale and variability of the dependent variable. An RMSE of 5 is excellent if y values range from 0 to 10,000, but terrible if y values range from 0 to 10. Compare RMSE to the standard deviation of y: if RMSE < SD(y), the model predicts better than simply predicting the mean of y for every observation. You can also compute RMSE/mean(y) as a relative measure (the coefficient of variation of the RMSE).
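Both comparisons can be sketched with Python's standard library. The values are hypothetical, and the population standard deviation (`statistics.pstdev`, which divides by n) is used here on the assumption that it should match the n-denominator RMSE this calculator reports.

```python
import math
import statistics

# Hypothetical observed and predicted values.
observed = [10.0, 12.0, 15.0, 19.0, 24.0]
predicted = [11.0, 12.0, 14.0, 20.0, 23.0]

n = len(observed)
rmse = math.sqrt(sum((y - p) ** 2 for y, p in zip(observed, predicted)) / n)

# Population SD of y equals the RMSE of a model that always predicts mean(y).
sd_y = statistics.pstdev(observed)

# Relative error: RMSE as a fraction of the mean of y.
cv_rmse = rmse / statistics.mean(observed)

beats_mean = rmse < sd_y
print(f"RMSE = {rmse:.3f}, SD(y) = {sd_y:.3f}, CV(RMSE) = {cv_rmse:.3f}")
# RMSE = 0.894, SD(y) = 5.020, CV(RMSE) = 0.056
```

The relative measure only makes sense when mean(y) is well away from zero; for data centered near zero, the comparison against SD(y) is the safer benchmark.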
R² is defined as: $$R^2 = 1 - \frac{SSR}{SS_{\text{total}}}$$ where SS_total is the total sum of squares $$\sum(y_i - \bar{y})^2$$. A smaller SSR relative to SS_total yields a higher R². When SSR = 0 (perfect fit), R² = 1. When SSR = SS_total (model is no better than predicting the mean), R² = 0.
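The definition translates directly into code. The sketch below is illustrative only; the function name `r_squared` and the data are hypothetical.

```python
def r_squared(observed, predicted):
    """R^2 = 1 - SSR / SS_total for paired observed and predicted values."""
    y_bar = sum(observed) / len(observed)
    ssr = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    ss_total = sum((y - y_bar) ** 2 for y in observed)
    if ss_total == 0:
        raise ValueError("R^2 is undefined when all observed values are equal")
    return 1 - ssr / ss_total


# Hypothetical data: SSR = 1.0 and SS_total = 20.0.
observed = [2.0, 4.0, 6.0, 8.0]
predicted = [2.5, 3.5, 6.5, 7.5]

r2 = r_squared(observed, predicted)
print(round(r2, 4))                     # 0.95
print(r_squared(observed, observed))    # 1.0, perfect fit
print(r_squared(observed, [5.0] * 4))   # 0.0, always predicting the mean
```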
RMSE is generally preferred for interpretation because it is in the same units as the dependent variable. MSE is preferred for mathematical operations because it avoids the square root (which complicates derivatives and decompositions). In machine learning, MSE is the standard loss function for gradient descent optimization. In reporting results to non-technical audiences, RMSE is more intuitive.
The F-test for overall regression significance compares the explained and unexplained variance: $$F = \frac{SS_{\text{reg}} / k}{SSR / (n-k-1)} = \frac{MS_{\text{reg}}}{MS_{\text{res}}}$$ Here SS_reg is the regression (explained) sum of squares, k is the number of predictors, and n is the number of observations. A large F-value (small SSR relative to SS_reg) indicates the model explains significantly more variance than expected by chance. The F-statistic follows an F-distribution with k and n-k-1 degrees of freedom under the null hypothesis.
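A minimal sketch of the F-statistic computation, using made-up sums of squares. The function name `f_statistic` and the input values are hypothetical, and the sketch stops at the statistic itself rather than computing a p-value.

```python
def f_statistic(ss_reg, ssr, n, k):
    """Overall F-statistic: (SS_reg / k) / (SSR / (n - k - 1))."""
    df_res = n - k - 1
    if k <= 0 or df_res <= 0:
        raise ValueError("need k >= 1 predictors and n > k + 1 observations")
    if ssr <= 0:
        raise ValueError("F is undefined when SSR is zero (perfect fit)")
    ms_reg = ss_reg / k       # mean square due to regression
    ms_res = ssr / df_res     # residual mean square
    return ms_reg / ms_res


# Hypothetical values: 23 observations, 2 predictors.
f_value = f_statistic(ss_reg=90.0, ssr=30.0, n=23, k=2)
print(f_value)  # 30.0
```

To decide significance, this value would be compared against an F-distribution with k = 2 and n - k - 1 = 20 degrees of freedom, using a statistical table or library.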
Roboculator Team
The Roboculator Team explains calculations, planning tools, and practical formulas in clear language for real-life situations.