The Standardized Residual Calculator converts a raw residual into a standardized (z-score) form by dividing it by the standard error of the residuals. Standardized residuals allow you to compare residuals across different models, datasets, and scales on a common metric. They are essential for identifying potential outliers, checking the normality assumption in regression analysis, and assessing whether individual observations are unusually far from the model's predictions.
Raw residuals (the differences between observed and predicted values) are measured in the original units of the dependent variable, and their variances differ with the leverage of each observation. This makes direct comparison of raw residuals misleading. Standardizing divides each residual by a measure of its expected variation, producing a dimensionless quantity that can be compared against standard normal benchmarks.
A standardized residual greater than 2 in absolute value is commonly flagged as a potential outlier, and values exceeding 3 are considered strong outliers. Under the normality assumption, approximately 95% of standardized residuals should fall between -2 and +2, and 99.7% should fall between -3 and +3. This calculator provides the standardized residual, its absolute value, and an outlier flag to help you quickly assess whether an observation warrants further investigation.
Standardized residuals are widely used in regression diagnostics, quality control, experimental analysis, and machine learning model evaluation. In clinical trials, a standardized residual might flag a patient whose response to treatment was unusually different from predictions. In manufacturing, it might identify a product batch that deviates significantly from expected quality parameters. In any setting where a predictive model is used, standardized residuals provide a principled, statistically grounded method for detecting anomalies.
It is important to distinguish between standardized residuals (dividing by the overall residual standard error) and studentized residuals (dividing by an observation-specific standard error that accounts for leverage). Studentized residuals are more precise for formal outlier tests, but standardized residuals are simpler to compute and widely used as a first-pass diagnostic. This calculator implements the simpler standardized version, which is appropriate for most practical purposes.
The standardized residual divides the raw residual by the standard error of the residuals: $$z_i = \frac{e_i}{s}$$
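Where z_i is the standardized residual of observation i, e_i is its raw residual (observed value minus predicted value), and s is the standard error of the residuals.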
The standard error s represents the typical magnitude of residuals in the model. By dividing by s, the standardized residual tells you how many standard deviations the observation is from the regression line.
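The division above can be sketched in a few lines of Python. This is a minimal illustration, not the calculator's actual implementation; the function name `standardized_residual` and its parameter names are assumptions made for this example.

```python
def standardized_residual(residual: float, std_error: float) -> float:
    """Return z = e / s for one observation.

    residual  : raw residual e (observed minus predicted)
    std_error : standard error of the residuals s (must be positive)
    """
    if std_error <= 0:
        raise ValueError("std_error must be positive")
    return residual / std_error


# The two examples used on this page.
print(standardized_residual(4.5, 2.0))   # 2.25
print(standardized_residual(-1.3, 2.0))  # -0.65
```

The guard against a non-positive standard error is a design choice for the sketch: s is a square root of a sum of squares, so a zero or negative value signals bad input rather than a meaningful result.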
The outlier flag uses common thresholds on the absolute value |z|: 1, 2, and 3.
A standardized residual near zero means the observation is close to the model's prediction. Values between -1 and +1 are typical. Values with 1 < |z| <= 2 are within the normal range but worth monitoring. Values beyond +/-2 suggest the observation may be an outlier or that the model fits it poorly. The outlier flag reports: 0 (normal, |z|<=1), 1 (mild, 1<|z|<=2), 2 (potential outlier, 2<|z|<=3), or 3 (strong outlier, |z|>3). Always investigate outliers rather than automatically removing them, because they may contain valuable information.
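The four flag levels described above map directly onto a chain of comparisons. The sketch below assumes the boundaries are inclusive on the lower category, as the inequalities in the text state; the name `outlier_flag` is hypothetical.

```python
def outlier_flag(z: float) -> int:
    """Map a standardized residual to the 0-3 flag described in the text.

    0: normal            (|z| <= 1)
    1: mild              (1 < |z| <= 2)
    2: potential outlier (2 < |z| <= 3)
    3: strong outlier    (|z| > 3)
    """
    magnitude = abs(z)
    if magnitude <= 1:
        return 0
    if magnitude <= 2:
        return 1
    if magnitude <= 3:
        return 2
    return 3


print(outlier_flag(2.25))   # 2
print(outlier_flag(-0.65))  # 0
```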
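Inputs: residual = 4.5, standard error = 2.0
Results: standardized residual = 2.25, absolute value = 2.25, outlier flag = 2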
A residual of 4.5 with standard error 2.0 gives z = 2.25, flagged as a potential outlier (|z| > 2). This observation is 2.25 standard deviations from the regression line.
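Inputs: residual = -1.3, standard error = 2.0
Results: standardized residual = -0.65, absolute value = 0.65, outlier flag = 0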
A residual of -1.3 with standard error 2.0 gives z = -0.65, which is well within normal range (|z| < 1). This observation is close to the model's prediction.
Standardized residuals divide the raw residual by the overall standard error s, treating all observations the same. Studentized residuals (also called internally studentized) divide by an observation-specific standard error that accounts for the observation's leverage (how far its predictor values are from the mean). Externally studentized (deleted) residuals go further by recomputing the standard error with that observation removed. Under the regression assumptions, externally studentized residuals follow a t-distribution, which makes them more appropriate for formal outlier tests.
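To make the three definitions concrete, here is a minimal sketch for simple linear regression (one predictor plus an intercept), written in plain Python. It assumes the closed-form leverage for that case, h_i = 1/n + (x_i - x̄)² / Sxx, and uses the standard identity that converts an internally studentized residual into an externally studentized one without refitting. The function name and the sample data are hypothetical.

```python
import math


def residual_diagnostics(x, y):
    """Return (standardized, internally studentized, externally studentized)
    residuals for each observation of a simple linear regression y ~ x."""
    n = len(x)
    p = 2  # fitted parameters: intercept and slope
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    # Ordinary least squares fit.
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

    # Residual standard error with n - p degrees of freedom.
    s = math.sqrt(sum(e * e for e in residuals) / (n - p))

    rows = []
    for xi, e in zip(x, residuals):
        leverage = 1 / n + (xi - x_bar) ** 2 / sxx
        standardized = e / s
        internal = e / (s * math.sqrt(1 - leverage))
        # Equivalent to refitting without observation i.
        external = internal * math.sqrt((n - p - 1) / (n - p - internal ** 2))
        rows.append((standardized, internal, external))
    return rows


# Hypothetical data: the last point sits well above the trend of the others.
x = [1, 2, 3, 4, 5, 6]
y = [1.1, 1.9, 3.2, 3.9, 5.1, 9.0]
for standardized, internal, external in residual_diagnostics(x, y):
    print(f"{standardized:7.3f} {internal:7.3f} {external:7.3f}")
```

Because 1 - h_i is always below 1, the internally studentized value is never smaller in magnitude than the plain standardized one, and the gap widens for high-leverage points.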
Under the assumption that residuals follow a normal distribution, approximately 95% of observations should have standardized residuals between -2 and +2. An observation with |z| > 2 falls in the extreme 5% of the distribution. While this does not prove the observation is erroneous, it is sufficiently unusual to warrant further investigation. The threshold is a convention, not a strict rule, and should be adjusted based on sample size and context.
No. Outliers should be investigated, not automatically removed. An outlier may indicate: (1) a data entry or measurement error (should be corrected), (2) an observation from a different population (may need to be modeled separately), (3) a genuine extreme value that is informative, or (4) model misspecification (the model, not the data, may be wrong). Removing valid observations biases results and reduces statistical power.
The standard error of residuals (also called the residual standard error or root MSE) is: $$s = \sqrt{\frac{\sum_{i=1}^{n} e_i^2}{n - k - 1}}$$ where n is the sample size and k is the number of predictors. Most statistical software reports this value in the regression output. It represents the typical size of prediction errors in the model's original units.
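If the regression output does not report s, it can be computed from the residuals with the formula above. The sketch below is a direct transcription; the function name and the residual values are hypothetical and serve only to show the arithmetic.

```python
import math


def residual_standard_error(residuals, k):
    """Return s = sqrt(sum(e_i^2) / (n - k - 1)).

    residuals : list of raw residuals e_i
    k         : number of predictors (the intercept is not counted)
    """
    n = len(residuals)
    degrees_of_freedom = n - k - 1
    if degrees_of_freedom <= 0:
        raise ValueError("need more observations than fitted parameters")
    sum_of_squares = sum(e * e for e in residuals)
    return math.sqrt(sum_of_squares / degrees_of_freedom)


# Hypothetical residuals from a model with one predictor (k = 1).
residuals = [2.0, -2.0, 2.0, -2.0]
s = residual_standard_error(residuals, k=1)
print(round(s, 3))  # sqrt(16 / 2), about 2.828
```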
Yes. Standardized residuals only follow a normal distribution if the underlying regression assumptions hold. If the true error distribution is skewed, heavy-tailed, or multimodal, the standardized residuals will reflect this. Checking the normality of standardized residuals (via histograms, Q-Q plots, or Shapiro-Wilk tests) is itself a key diagnostic step. Non-normal residuals may indicate the need for a generalized linear model or data transformation.
Under normality, you expect about 5% of observations to have |z| > 2 and about 0.3% to have |z| > 3, purely by chance. So in a sample of 100, about 5 observations with |z| > 2 are expected even if no true outliers exist. In a sample of 1000, about 3 observations with |z| > 3 are expected. This is why you should not panic over a few flagged values in a large dataset.
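These expected counts follow from the two-sided tail probability of the standard normal distribution, which the Python standard library exposes through `math.erfc`. The sketch below assumes exactly normal, independent standardized residuals; the helper names are hypothetical.

```python
import math


def two_sided_tail(threshold: float) -> float:
    """P(|Z| > threshold) for a standard normal Z."""
    return math.erfc(threshold / math.sqrt(2))


def expected_flags(n: int, threshold: float) -> float:
    """Expected number of observations with |z| > threshold in a sample of n."""
    return n * two_sided_tail(threshold)


print(round(expected_flags(100, 2), 2))   # 4.55
print(round(expected_flags(1000, 3), 2))  # 2.7
```

The exact tail probabilities are about 4.55% and 0.27%, which is where the rounded figures of 5% and 0.3% come from.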
Roboculator Team
The Roboculator Team explains calculations, planning tools, and practical formulas in clear language for real-life situations.