1.99
0.05
11.99
0.997305
0.998652
3
6.02
0.107
39.708
1
The Linear Regression Calculator fits a straight line to five data points using the ordinary least squares (OLS) method. Enter paired (x, y) values and a prediction point, and the calculator determines the best-fit line equation, coefficient of determination (R²), Pearson correlation coefficient (r), and predicted value.
Simple linear regression models the relationship between a dependent variable $$y$$ and independent variable $$x$$ as: $$y = a + bx + \epsilon$$ where $$a$$ is the y-intercept, $$b$$ is the slope, and $$\epsilon$$ represents random error. The OLS method minimizes the sum of squared residuals: $$\sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$.
The slope is computed as: $$b = \frac{n \sum x_i y_i - \sum x_i \sum y_i}{n \sum x_i^2 - (\sum x_i)^2}$$ and the intercept as: $$a = \bar{y} - b\bar{x}$$. These formulas derive from setting the partial derivatives of the sum of squared residuals to zero and solving the resulting normal equations.
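The step from the minimization to these formulas can be written out as follows. This is a sketch of the standard derivation; the shorthand $$S(a, b)$$ for the sum of squared residuals is introduced here for convenience and is not used elsewhere on this page.

$$S(a, b) = \sum_{i=1}^{n} (y_i - a - bx_i)^2$$

$$\frac{\partial S}{\partial a} = -2 \sum_{i=1}^{n} (y_i - a - bx_i) = 0 \quad\Rightarrow\quad \sum y_i = na + b \sum x_i$$

$$\frac{\partial S}{\partial b} = -2 \sum_{i=1}^{n} x_i (y_i - a - bx_i) = 0 \quad\Rightarrow\quad \sum x_i y_i = a \sum x_i + b \sum x_i^2$$

Dividing the first normal equation by $$n$$ gives $$a = \bar{y} - b\bar{x}$$. Substituting that into the second and multiplying through by $$n$$ gives $$n \sum x_i y_i - \sum x_i \sum y_i = b \left( n \sum x_i^2 - (\sum x_i)^2 \right)$$, which rearranges to the slope formula above.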
The coefficient of determination $$R^2 = 1 - \frac{SS_{res}}{SS_{tot}}$$ measures the proportion of variance in $$y$$ explained by the linear relationship with $$x$$. For a least-squares line fitted with an intercept, values range from 0 to 1, where 1 indicates a perfect linear fit. The Pearson correlation coefficient $$r = \pm\sqrt{R^2}$$ takes the sign of the slope $$b$$ and indicates both the strength and direction of the linear association.
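The relationship between $$r$$ and $$R^2$$ can be checked numerically. The sketch below computes Pearson's r directly from deviations about the means. The function name `pearson_r` and the sample values are assumptions made for illustration, not part of the calculator.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient computed from deviations about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical sample data (an assumption for illustration).
xs = [1, 2, 3, 4, 5]
ys_up = [2.1, 3.9, 6.2, 7.8, 10.1]   # y rises with x
ys_down = list(reversed(ys_up))      # same values, y falls with x

r_up = pearson_r(xs, ys_up)
r_down = pearson_r(xs, ys_down)

print(r_up, r_down)          # equal magnitude, opposite sign
print(r_up ** 2, r_down ** 2)  # the same R^2 in both cases
```

Reversing the y-values flips the sign of r but leaves r² unchanged, which is why R² alone cannot tell you the direction of the relationship.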
For prediction, the fitted model estimates: $$\hat{y} = a + bx_{predict}$$. Predictions are most reliable within the range of observed x-values (interpolation). Extrapolation beyond this range carries increasing uncertainty because the linear assumption may not hold outside the observed data.
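One way to make the interpolation versus extrapolation distinction concrete is to flag predictions that fall outside the observed x-range. This is a minimal sketch; the helper `predict` and the coefficient values are hypothetical and chosen only for illustration.

```python
def predict(a, b, x_new, xs):
    """Return (predicted y, True if x_new lies outside the observed x-range)."""
    y_hat = a + b * x_new
    is_extrapolation = x_new < min(xs) or x_new > max(xs)
    return y_hat, is_extrapolation

# Assumed example: a fitted line y = 0.05 + 1.99x with observed x from 1 to 5.
observed_xs = [1, 2, 3, 4, 5]
y_in, flag_in = predict(0.05, 1.99, 2.5, observed_xs)   # inside the range
y_out, flag_out = predict(0.05, 1.99, 6, observed_xs)   # outside the range

print(y_in, flag_in)
print(y_out, flag_out)
```

The flag does not quantify the extra uncertainty; it only marks predictions that deserve more caution.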
Assumptions of linear regression include linearity between x and y, independence of errors, homoscedasticity (constant variance of residuals), and normally distributed residuals. Diagnostic tools such as residual plots, the Durbin-Watson test for autocorrelation, and the Breusch-Pagan test for heteroscedasticity help validate these assumptions. When assumptions are violated, transformations, weighted least squares, or generalized linear models may be appropriate alternatives.
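As an illustration of one of these diagnostics, the Durbin-Watson statistic can be computed directly from residuals listed in observation order. The sketch below is a plain-Python version of the usual formula, the sum of squared successive differences divided by the sum of squared residuals. The residual values are hypothetical, and the function only produces the statistic, not a formal test decision.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic for residuals given in observation order."""
    numerator = sum(
        (residuals[i] - residuals[i - 1]) ** 2 for i in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator

# Hypothetical residuals from a five-point fit (an assumption for illustration).
residuals = [0.06, -0.13, 0.18, -0.21, 0.10]
dw = durbin_watson(residuals)
print(dw)
```

The statistic lies between 0 and 4. Values near 2 suggest little first-order autocorrelation, values toward 0 suggest positive autocorrelation, and values toward 4 suggest negative autocorrelation. With only five points the result is far too noisy to support a conclusion.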
The calculator first computes the sums Σx, Σy, Σxy, Σx², and Σy² for the 5 data points. It then evaluates the slope b = (nΣxy - ΣxΣy) / (nΣx² - (Σx)²) and the intercept a = (Σy - bΣx) / n. Next, R² = 1 - SS_res / SS_tot, where SS_res = Σ(yᵢ - a - bxᵢ)² and SS_tot = Σ(yᵢ - ȳ)². Finally, the predicted y = a + b × x_predict.
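The steps above can be sketched in a few lines of Python. This is not the calculator's actual source code. The function name `fit_line`, the returned dictionary keys, and the sample data are all assumptions made for illustration.

```python
def fit_line(xs, ys, x_predict):
    """Fit y = a + b*x by ordinary least squares and predict at x_predict."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)

    denom = n * sum_x2 - sum_x ** 2
    if denom == 0:
        raise ValueError("all x values are identical; the slope is undefined")

    b = (n * sum_xy - sum_x * sum_y) / denom   # slope
    a = (sum_y - b * sum_x) / n                # intercept

    mean_y = sum_y / n
    ss_res = sum((y - a - b * x) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    if ss_tot == 0:
        raise ValueError("all y values are identical; R^2 is undefined")
    r_squared = 1 - ss_res / ss_tot

    return {
        "slope": b,
        "intercept": a,
        "r_squared": r_squared,
        "prediction": a + b * x_predict,
    }

# Hypothetical sample data (an assumption for illustration).
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
result = fit_line(xs, ys, x_predict=6)
print(result)
```

For this sample the slope comes out at about 1.99 and the intercept at about 0.05, giving a prediction of about 11.99 at x = 6. The two guards cover the degenerate cases where the formulas would divide by zero.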
Slope b represents the average change in y per unit increase in x. Intercept a is the predicted y when x = 0. R² near 1 indicates a strong linear fit; near 0 indicates a weak fit. The correlation r indicates direction: positive r means y increases with x; negative r means y decreases with x. Predictions outside the data range should be interpreted with caution.
R² = 0.998 indicates an almost perfect linear fit. Slope ≈ 2: each unit increase in x increases y by about 2.
R² = 0.76 indicates a moderate fit with more scatter around the line.
R² (coefficient of determination) represents the proportion of variance in the dependent variable explained by the independent variable. R² = 0.85 means 85% of y's variability is explained by the linear relationship with x.
r (Pearson's correlation) measures the strength and direction of linear association, ranging from -1 to +1. R² = r² measures explained variance, ranging from 0 to 1. r indicates direction; R² indicates explanatory power.
Linear regression fits a straight line and is not suitable for curved relationships. For nonlinear data, consider polynomial regression, exponential regression, logarithmic transformation, or other nonlinear models.
Extrapolation means predicting y for x-values outside the observed range. The linear relationship may not hold beyond your data. Extrapolation error grows with distance from the data range and can produce misleading predictions.
Residuals are the differences between observed and predicted values: eᵢ = yᵢ - ŷᵢ. Analyzing residual plots helps verify regression assumptions. Patterns in residuals suggest model inadequacy or violated assumptions.
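A short sketch of computing residuals is shown below. The data and the coefficients are hypothetical: the values of `a` and `b` are the least-squares fit for this particular sample, not output taken from the calculator.

```python
# Hypothetical sample data and its least-squares line y = 0.05 + 1.99x.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
a, b = 0.05, 1.99

# Residual = observed y minus fitted y.
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]

# For an OLS fit with an intercept, both of these sums are zero
# up to floating-point rounding (they are the two normal equations).
sum_residuals = sum(residuals)
sum_x_times_residuals = sum(x * e for x, e in zip(xs, residuals))

for x, e in zip(xs, residuals):
    print(f"x = {x}: residual = {e:+.2f}")
```

Because the residuals of a least-squares line always sum to zero, a plot of residuals against x is centered on zero by construction. What matters when checking assumptions is whether the plot shows a curve, a funnel shape, or some other pattern.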
Ordinary least squares minimizes the sum of squared residuals. When the errors have zero mean, constant variance, and are uncorrelated (the Gauss-Markov conditions), the resulting estimates are the best linear unbiased estimators (BLUE). Squaring prevents positive and negative errors from canceling and penalizes larger deviations more heavily.
Roboculator Team
The Roboculator Team explains calculations, planning tools, and practical formulas in clear language for real-life situations.