Simple Linear Regression
Fitting Linear Models
lm is used to fit linear models.
It can be used to carry out regression,
single stratum analysis of variance and
analysis of covariance (although
aov may provide a more
convenient interface for these).
lm(formula, data, subset, weights, na.action, method = "qr", model = TRUE, x = FALSE, y = FALSE, qr = TRUE, singular.ok = TRUE, contrasts = NULL, offset, ...)
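formula: an object of class "formula" (or one that can be coerced to that class): a symbolic description of the model to be fitted.
data: an optional data frame, list or environment (or object coercible by as.data.frame to a data frame) containing the variables in the model.
subset: an optional vector specifying a subset of observations to be used in the fitting process.
weights: an optional vector of weights to be used in the fitting process. Should be NULL or a numeric vector.
na.action: a function which indicates what should happen when the data contain NAs.
method: the method to be used; for fitting, currently only method = "qr" is supported.
model, x, y, qr: logicals. If TRUE the corresponding components of the fit (the model frame, the model matrix, the response, the QR decomposition) are returned.
singular.ok: logical. If FALSE a singular fit is an error.
contrasts: an optional list. See the contrasts.arg of model.matrix.default.
offset: this can be used to specify an a priori known component to be included in the linear predictor during fitting. This should be NULL or a numeric vector of length equal to the number of cases.
...: additional arguments to be passed to the low level regression fitting functions (see below).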
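As a rough illustration of the na.action argument, the sketch below fits the same model with na.omit and na.exclude. The data frame d and its values are made up for this example; the only difference shown is how the extractor pads the result for the row containing an NA.

```r
# Hypothetical data with one missing response value (row 2).
d <- data.frame(x = c(1, 2, 3, 4, 5),
                y = c(1.1, NA, 2.9, 4.2, 5.1))

# na.omit drops the incomplete row entirely.
fit_omit <- lm(y ~ x, data = d, na.action = na.omit)

# na.exclude also drops it for fitting, but residuals() pads with NA.
fit_excl <- lm(y ~ x, data = d, na.action = na.exclude)

n_omit <- length(residuals(fit_omit))
n_excl <- length(residuals(fit_excl))
print(c(n_omit, n_excl))
print(residuals(fit_excl))
```

Both fits use the same four complete rows, so their coefficients agree; only the length of the extracted residuals differs.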
Models for lm are specified symbolically. A typical model has the form response ~ terms where response is the (numeric) response vector and terms is a series of terms which specifies a linear predictor for response. A terms specification of the form first + second indicates all the terms in first together with all the terms in second with duplicates removed. A specification of the form first:second indicates the set of terms obtained by taking the interactions of all terms in first with all terms in second. The specification first*second indicates the cross of first and second. This is the same as first + second + first:second.
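The equivalence between the crossing operator and its expanded form can be checked without fitting anything, by comparing the term labels that terms() derives from each formula. The variable names y, a and b are placeholders for this sketch and need not exist.

```r
# Term labels produced by the crossed specification.
crossed <- attr(terms(y ~ a * b), "term.labels")

# Term labels produced by the explicit expansion.
expanded <- attr(terms(y ~ a + b + a:b), "term.labels")

print(crossed)
print(identical(crossed, expanded))
```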
If the formula includes an offset, this is evaluated and subtracted from the response.
If response is a matrix, a linear model is fitted separately by least-squares to each column of the matrix.
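Both statements can be tried on a small made-up data set. In this sketch the data frame d, its columns and its values are assumptions for illustration: the first part compares an offset term with an explicitly adjusted response, the second compares a matrix response with a single-column fit.

```r
# Hypothetical data: o is a known offset, y and z are two responses.
d <- data.frame(x = 1:8,
                o = c(0.2, 0.5, 0.1, 0.9, 0.4, 0.7, 0.3, 0.6),
                y = c(2.1, 3.9, 6.2, 8.1, 9.8, 12.1, 14.2, 15.9),
                z = c(5.0, 4.6, 4.1, 3.3, 3.1, 2.4, 2.2, 1.5))

# An offset term gives the same coefficients as fitting y - o directly.
fit_off <- lm(y ~ x + offset(o), data = d)
fit_adj <- lm(I(y - o) ~ x, data = d)
same_offset <- all.equal(coef(fit_off), coef(fit_adj))
print(same_offset)

# A matrix response fits one model per column.
fit_multi <- lm(cbind(y, z) ~ x, data = d)
fit_y <- lm(y ~ x, data = d)
print(class(fit_multi))
same_column <- all.equal(coef(fit_multi)[, "y"], coef(fit_y))
print(same_column)
```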
See model.matrix for some further details. The terms in the formula will be re-ordered so that main effects come first, followed by the interactions, all second-order, all third-order and so on: to avoid this pass a terms object as the formula (see aov and demo(glm.vr) for an example).
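The re-ordering can be observed directly on a terms object. This sketch uses the keep.order argument of terms() to build a terms object that preserves the written order; y, a and b are placeholder names, and no model is fitted.

```r
# A formula written with the interaction first.
f <- y ~ a:b + a + b

# Default behaviour: main effects are moved ahead of the interaction.
default_order <- attr(terms(f), "term.labels")

# keep.order = TRUE retains the order in which the terms were written.
kept_order <- attr(terms(f, keep.order = TRUE), "term.labels")

print(default_order)
print(kept_order)
```

A terms object built this way is the kind of object the text suggests passing as the formula.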
A formula has an implied intercept term. To remove this use either y ~ x - 1 or y ~ 0 + x. See formula for more details of allowed formulae.
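A quick check of the two ways to drop the intercept, using the built-in mtcars data purely as a convenient example (the choice of mpg and wt is arbitrary):

```r
# Default: intercept included.
with_int <- lm(mpg ~ wt, data = mtcars)

# Two equivalent ways of removing the intercept.
no_int_a <- lm(mpg ~ wt - 1, data = mtcars)
no_int_b <- lm(mpg ~ 0 + wt, data = mtcars)

print(names(coef(with_int)))
print(names(coef(no_int_a)))
same_fit <- all.equal(coef(no_int_a), coef(no_int_b))
print(same_fit)
```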
Non-NULL weights can be used to indicate that different observations have different variances (with the values in weights being inversely proportional to the variances); or equivalently, when the elements of weights are positive integers w_i, that each response y_i is the mean of w_i unit-weight observations (including the case that there are w_i observations equal to y_i and the data have been summarized).
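The "mean of w_i observations" reading can be illustrated by fitting once on raw data and once on group means weighted by group size. The data below are invented for the sketch. The coefficients agree, but the residual degrees of freedom do not, so quantities derived from the residuals should not be expected to match.

```r
# Hypothetical raw data with repeated x values.
raw <- data.frame(x = c(1, 1, 2, 2, 2, 3),
                  y = c(1.0, 1.4, 2.1, 1.9, 2.3, 3.2))

# Summarize: one row per x with the mean response and the group size.
agg <- aggregate(y ~ x, data = raw, FUN = mean)
agg$w <- as.vector(table(raw$x))

fit_raw <- lm(y ~ x, data = raw)
fit_agg <- lm(y ~ x, data = agg, weights = w)

same_coef <- all.equal(coef(fit_raw), coef(fit_agg))
print(same_coef)
print(c(df.residual(fit_raw), df.residual(fit_agg)))
```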
lm calls the lower level functions lm.fit, etc, see below, for the actual numerical computations. For programming only, you may consider doing likewise.
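A minimal sketch of calling the lower level fitter directly, assuming the design matrix is built by hand with model.matrix. The mtcars columns are used only as example data.

```r
# Build the design matrix (intercept plus wt) explicitly.
X <- model.matrix(~ wt, data = mtcars)

# Low-level fit: no formula handling, no "lm" class on the result.
low <- lm.fit(x = X, y = mtcars$mpg)

# High-level fit for comparison.
high <- lm(mpg ~ wt, data = mtcars)

same_coef <- all.equal(low$coefficients, coef(high))
print(same_coef)
print(class(low))
```

The result of lm.fit is a plain list, so methods such as summary for class "lm" do not apply to it.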
All of weights, subset and offset are evaluated in the same way as variables in formula, that is first in data and then in the environment of formula.
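The lookup order can be seen in a call where subset refers to a column of data while weights refers to a vector that exists only in the calling environment. The name w_env is hypothetical, and mtcars is used only as example data.

```r
# w_env is defined outside mtcars, in the environment of the formula.
w_env <- rep(c(1, 2), length.out = nrow(mtcars))

# cyl is found in mtcars; w_env is found in the enclosing environment.
fit <- lm(mpg ~ wt, data = mtcars, subset = cyl == 4, weights = w_env)

n_used <- nobs(fit)
print(n_used)
print(length(weights(fit)))
```

The weights are subsetted together with the data, so only the weights of the retained rows enter the fit.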
lm returns an object of class "lm" or for multiple responses of class c("mlm", "lm").
The functions summary and anova are used to obtain and print a summary and analysis of variance table of the results. The generic accessor functions coefficients, effects, fitted.values and residuals extract various useful features of the value returned by lm.
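A short sketch of the accessor functions on a fitted model, again using mtcars as arbitrary example data. It also checks that fitted values plus residuals reproduce the response.

```r
fit <- lm(mpg ~ wt, data = mtcars)

print(class(fit))
print(summary(fit))   # coefficient table, R-squared, etc.
print(anova(fit))     # analysis of variance table
print(coefficients(fit))

# Response = fitted values + residuals (names dropped for the comparison).
recon <- unname(fitted.values(fit) + residuals(fit))
matches <- all.equal(recon, mtcars$mpg)
print(matches)
```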
An object of class "lm" is a list containing at least the following components:
coefficients: a named vector of coefficients
residuals: the residuals, that is response minus fitted values.
fitted.values: the fitted mean values.
rank: the numeric rank of the fitted linear model.
weights: (only for weighted fits) the specified weights.
df.residual: the residual degrees of freedom.
call: the matched call.
terms: the terms object used.
contrasts: (only where relevant) the contrasts used.
xlevels: (only where relevant) a record of the levels of the factors used in fitting.
offset: the offset used (missing if none were used).
y: if requested, the response used.
x: if requested, the model matrix used.
model: if requested (the default), the model frame used.
na.action: (where relevant) information returned by model.frame on the special handling of NAs.
In addition, non-null fits will have components assign, effects and (unless not requested) qr relating to the linear fit, for use by extractor functions such as summary and effects.
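The components can be inspected on a fitted object. In this sketch x = TRUE and y = TRUE are set so that the optional model matrix and response are stored as well; mtcars is example data only.

```r
# Request the model matrix and response in addition to the defaults.
fit <- lm(mpg ~ wt, data = mtcars, x = TRUE, y = TRUE)

print(names(fit))
print(fit$rank)
print(fit$df.residual)
print(dim(fit$x))      # rows = observations, columns = coefficients
print(class(fit$qr))
```

Extractor functions such as coef() and residuals() are generally preferable to reaching into the list directly, since they also handle padding for missing values.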
Using time series
Considerable care is needed when using lm with time series.
Unless na.action = NULL, the time series attributes are stripped from the variables before the regression is done. (This is necessary as omitting NAs would invalidate the time series attributes, and if NAs are omitted in the middle of the series the result would no longer be a regular time series.)
Even if the time series attributes are retained, they are not used to line up series, so that the time shift of a lagged or differenced regressor would be ignored. It is good practice to prepare a data argument by ts.intersect(..., dframe = TRUE), then apply a suitable na.action to that data frame and call lm with na.action = NULL so that residuals and fitted values are time series.
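The recommended workflow might look like the following sketch. The two short annual series are invented for illustration, and x_lag is a hypothetical column name; ts.intersect aligns the response with the lagged regressor before lm sees the data.

```r
# Hypothetical annual series starting in 2001.
y <- ts(c(2.1, 2.9, 4.2, 4.8, 6.1, 7.2, 7.9, 9.1), start = 2001)
x <- ts(c(1, 2, 3, 4, 5, 6, 7, 8), start = 2001)

# Align y with x lagged by one period; only overlapping times are kept.
d <- ts.intersect(y, x_lag = stats::lag(x, -1), dframe = TRUE)
print(d)

# No NAs remain here, so na.action = NULL is safe and keeps ts attributes.
fit <- lm(y ~ x_lag, data = d, na.action = NULL)
print(coef(fit))
print(is.ts(residuals(fit)))
```

Had the lag been written directly inside the formula, the time shift would have been ignored, which is the pitfall described above.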
Offsets specified by offset will not be included in predictions by predict.lm, whereas those specified by an offset term in the formula will be.
The design was inspired by the S function of the same name described in Chambers (1992). The implementation of model formula by Ross Ihaka was based on Wilkinson & Rogers (1973).
Chambers, J. M. (1992) Linear models. Chapter 4 of Statistical Models in S eds J. M. Chambers and T. J. Hastie, Wadsworth & Brooks/Cole.
Wilkinson, G. N. and Rogers, C. E. (1973) Symbolic descriptions of factorial models for analysis of variance. Applied Statistics, 22, 392–9.
require(graphics)

## Annette Dobson (1990) "An Introduction to Generalized Linear Models".
## Page 9: Plant Weight Data.
ctl <- c(4.17,5.58,5.18,6.11,4.50,4.61,5.17,4.53,5.33,5.14)
trt <- c(4.81,4.17,4.41,3.59,5.87,3.83,6.03,4.89,4.32,4.69)
group <- gl(2, 10, 20, labels = c("Ctl","Trt"))
weight <- c(ctl, trt)
lm.D9 <- lm(weight ~ group)
lm.D90 <- lm(weight ~ group - 1) # omitting intercept

anova(lm.D9)
summary(lm.D90)

opar <- par(mfrow = c(2,2), oma = c(0, 0, 1.1, 0))
plot(lm.D9, las = 1)      # Residuals, Fitted, ...
par(opar)

### less simple examples in "See Also" above