Intuition
Given a scatter plot of two variables, simple linear regression draws the single best straight line through the points. “Best” means the line that minimizes the total squared vertical distance from each point to the line. One predictor, one response, one line. It’s the most elementary form of regression and the natural starting point before moving to the multiple-predictor case in Regression Fundamentals.
Definition
The simple linear regression model expresses a response $Y$ as a linear function of a single predictor $X$:

$$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i, \qquad i = 1, \dots, n$$

- $\beta_0$ - the true intercept (value of $Y$ when $X = 0$)
- $\beta_1$ - the true slope (change in $Y$ per unit change in $X$)
- $\varepsilon_i$ - independent random errors with mean zero and variance $\sigma^2$

The goal is to estimate $\beta_0$ and $\beta_1$ from observed data $(x_1, y_1), \dots, (x_n, y_n)$.
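As a concrete sketch, the generative model above can be simulated in a few lines of NumPy; the parameter values here are hypothetical, chosen only for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical true parameters (not from the note's example).
beta0, beta1, sigma = 2.0, 0.5, 1.0
n = 100

x = rng.uniform(0, 10, size=n)       # single predictor
eps = rng.normal(0, sigma, size=n)   # independent errors, mean 0, variance sigma^2
y = beta0 + beta1 * x + eps          # response generated by the model
```

Fitting a line to `(x, y)` then amounts to recovering `beta0` and `beta1` from the noisy sample.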
Key Formulas
Estimated regression equation:

$$\hat{y} = b_0 + b_1 x$$

where $b_0$ and $b_1$ are the least-squares estimates that minimize the residual sum of squares:

$$\text{SSE} = \sum_{i=1}^{n} (y_i - b_0 - b_1 x_i)^2$$

Slope estimate:

$$b_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}$$

Intercept estimate:

$$b_0 = \bar{y} - b_1 \bar{x}$$

Estimated error variance:

$$s^2 = \frac{\text{SSE}}{n - 2}$$

The denominator is $n - 2$ because two parameters ($\beta_0$, $\beta_1$) are estimated.
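These formulas translate directly into code. A minimal NumPy sketch (the function name `fit_simple_ols` is my own):

```python
import numpy as np

def fit_simple_ols(x, y):
    """Least-squares estimates for y = b0 + b1*x, plus the error variance s^2."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    xbar, ybar = x.mean(), y.mean()
    b1 = np.sum((x - xbar) * (y - ybar)) / np.sum((x - xbar) ** 2)  # slope
    b0 = ybar - b1 * xbar                                           # intercept
    resid = y - (b0 + b1 * x)
    s2 = np.sum(resid ** 2) / (len(x) - 2)                          # SSE / (n - 2)
    return b0, b1, s2

# Quick check on a noiseless line y = 1 + 2x: estimates are exact, s^2 is zero.
b0, b1, s2 = fit_simple_ols([0, 1, 2, 3], [1, 3, 5, 7])
```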
Note
For the full OLS derivation, multiple regression extension, residual diagnostics, and interpretation, see Regression Fundamentals.
Example
Predicting chemistry grades. A professor collects intelligence test scores ($x$) and chemistry grades ($y$) for 20 students and computes the sample means $\bar{x}$ and $\bar{y}$ along with the sums $\sum (x_i - \bar{x})(y_i - \bar{y})$ and $\sum (x_i - \bar{x})^2$.
Plugging these into the formulas above yields a fitted line $\hat{y} = b_0 + b_1 x$ with slope $b_1 = 0.30$. Interpretation: each additional point on the intelligence test is associated with a 0.30-point increase in chemistry grade, on average. For a student with test score $x_0$, the prediction is $\hat{y} = b_0 + 0.30\,x_0$; with this professor's estimates, the predicted chemistry grade comes to 78.
The same technique applies to predicting final animal weight from feed consumed - any scenario where one continuous variable drives another.
Why It Matters in CS
Simple linear regression is the “can a straight line explain this?” test. In ML, it’s the first model you fit before reaching for anything fancier, because if a line already gets you 90% of the way there, a neural net is probably overkill. It sets both a performance floor and an interpretability ceiling.
One underrated use: empirical complexity analysis. Plot execution time $T$ against input size $n$ and fit $T = b_0 + b_1 n$. If the fit is tight, your algorithm is linear. Fit $\log T = b_0 + b_1 \log n$ instead and $b_1$ estimates the polynomial exponent. This is often faster than deriving the complexity analytically, especially for messy real-world code.
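A minimal sketch of the log-log version, using a deterministic operation count in place of wall-clock time so the result is reproducible (the quadratic loop is a hypothetical stand-in for the code being measured):

```python
import numpy as np

def count_ops(n):
    """Stand-in for a timed run: inner-loop iterations of a quadratic algorithm."""
    ops = 0
    for i in range(n):
        for j in range(n):
            ops += 1
    return ops

sizes = np.array([50, 100, 200, 400])
counts = np.array([count_ops(n) for n in sizes])

# Fit log T = log c + b * log n; the slope b estimates the polynomial exponent.
b, log_c = np.polyfit(np.log(sizes), np.log(counts), 1)
# For this quadratic loop, b comes out at (essentially exactly) 2.
```

With real timings the slope will be noisier, but a slope near 1, 2, or 3 is usually unambiguous.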
Tip
When doing exploratory data analysis, fitting a simple regression for each feature individually is a quick way to see which predictors have any marginal relationship with the response. It’s not a substitute for multiple regression (confounding is real), but it’s a useful first pass.
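A sketch of such a per-feature screen on hypothetical synthetic data, recording each feature's marginal slope and R²; here only the first feature actually drives the response:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical dataset: 3 features, only feature 0 drives the response.
n = 200
X = rng.normal(size=(n, 3))
y = 4.0 * X[:, 0] + rng.normal(scale=0.5, size=n)

# One simple regression per feature: marginal slope and R^2.
screen = []
for j in range(X.shape[1]):
    x = X[:, j]
    b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
    b0 = y.mean() - b1 * x
    resid = y - (y.mean() - b1 * x.mean() + b1 * x)
    r2 = 1 - np.sum(resid ** 2) / np.sum((y - y.mean()) ** 2)
    screen.append((j, b1, r2))
# Feature 0 shows a large R^2; the noise features sit near zero.
```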
Related Notes
- Regression Fundamentals - extends to multiple predictors, OLS in matrix form, residual diagnostics, and $R^2$
- Maximum Likelihood Estimation - under normality, MLE of regression coefficients equals OLS
- Normal Distribution - the error distribution assumption
- Hypothesis Testing - $t$-tests on $b_1$ to assess whether the slope differs from zero