Last edited: 2015-02-11/gs

Statistik und Datenauswertung mit R
Computational Statistics: An Introduction to R


Course participants are welcome to send their questions using this link. Please add a tag/keyword to the subject line. Answers will be posted in Questions and Answers.


Exercises 2015-02-10/11



(1) Write a function to plot an extended histogram. You can use these lines as a building block:
hist(x, probability = TRUE)
rug(x)
lines(density(x))

This is a basic solution, to be improved upon:


[Figure: histogram]
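
One possible way to wrap the three building-block lines of exercise (1) into a function is sketched below. The name exthist and its arguments are hypothetical and are not taken from the course material. The sketch assumes a numeric vector and extends the vertical axis so that the density curve is not clipped.

```r
# Sketch only: exthist() is a hypothetical name, not from the course material.
exthist <- function(x, main = "Extended histogram", ...) {
    # check for valid arguments
    stopifnot(is.numeric(x))
    x <- x[is.finite(x)]
    stopifnot(length(x) >= 2)

    h <- hist(x, plot = FALSE)   # compute the histogram without drawing it
    d <- density(x)              # kernel density estimate

    # make room for both the histogram bars and the density curve
    ylim <- c(0, max(h$density, d$y))

    plot(h, freq = FALSE, ylim = ylim, main = main, ...)
    rug(x)
    lines(d)

    # return the computed objects invisibly for further use
    invisible(list(hist = h, density = d))
}
```

A call such as exthist(rnorm(100)) draws the plot; the histogram and density objects are returned invisibly.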

(2) Write a function to plot a distribution function.
You can start using
plot(ecdf( x ))
Here is a slightly better example:

[Figure: ecdf]
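
For exercise (2), a minimal sketch built on plot(ecdf(x)) could look as follows. The function name plotecdf and the choice to draw vertical steps and a rug are assumptions made for illustration only.

```r
# Sketch only: plotecdf() is a hypothetical name, not from the course material.
plotecdf <- function(x, main = "Empirical distribution function", ...) {
    # check for valid arguments
    stopifnot(is.numeric(x))
    x <- x[is.finite(x)]
    stopifnot(length(x) >= 1)

    Fn <- ecdf(x)   # step function, returned to the caller below

    # draw as a connected step function without the point markers
    plot(Fn, verticals = TRUE, do.points = FALSE, main = main, ...)
    rug(x)          # show the individual observations

    invisible(Fn)
}
```

Returning the ecdf object invisibly allows follow-up calls such as Fn <- plotecdf(x); Fn(0).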


(3) Write a function to plot a residual plot analogous to plot 1 of
plot.lm, but plotting externally studentized residuals against ranked
fitted values. Start with
plotestud <- function( lmx )
{
    #outdated and replaced: plot(lmx, which=1)

    # check for valid arguments
    plot(rank(lmx$fitted.values), rstudent(lmx))
    # identification. See source code of plot.lm

    # smooth. Use panel.smooth()
}
These are the plots you get for free from plot.lm():

[Figure: regression diagnostics]
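
The skeleton of exercise (3) could be filled in along the following lines. This is a sketch, not the course solution: the argument id.n and the rule of labelling the points with the largest absolute residuals are assumptions, loosely modelled on what plot.lm() does.

```r
# Sketch only: one possible completion of the plotestud() skeleton.
# id.n (number of points to label) is an assumed argument.
plotestud <- function(lmx, id.n = 3, ...) {
    # check for valid arguments
    stopifnot(inherits(lmx, "lm"))

    rk <- rank(fitted(lmx))   # ranked fitted values
    rs <- rstudent(lmx)       # externally studentized residuals

    # set up the axes; panel.smooth() draws the points and the smooth
    plot(rk, rs, type = "n",
         xlab = "Rank of fitted values",
         ylab = "Externally studentized residuals", ...)
    abline(h = 0, lty = 3, col = "gray")
    panel.smooth(rk, rs)

    # identification: label the id.n largest absolute residuals
    if (id.n > 0) {
        labs <- names(rs)
        if (is.null(labs)) labs <- seq_along(rs)
        show <- order(abs(rs), decreasing = TRUE)[seq_len(min(id.n, length(rs)))]
        text(rk[show], rs[show], labels = labs[show], pos = 3, cex = 0.75)
    }

    invisible(data.frame(rank = rk, rstudent = rs))
}
```

With the example data of exercise (4), plotestud(lm(y ~ x)) gives the plot.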


(4) Write a function to give a scatterplot with added regression line
and confidence bands. Here is the sample code used in the lecture:

# example data
n <- 100
sigma <- 1
x <- (1:n)/n-0.5
err <- rnorm(n)
y <- 2.5 * x + sigma*err
lmxy <- lm(y ~ x)

plotlim <- function(x){
    xlim <- range(x)
    # check implementation of plot. is this needed?
    del <- xlim[2]-xlim[1]
    if (del>0)
        xlim <- xlim+c(-0.1*del, 0.1*del)
    else
        xlim <- xlim+c(-0.1, 0.1)
    return(xlim)
}
xlim <- plotlim(x)
ylim <- plotlim(y)

newx <- data.frame(x = seq(xlim[1], xlim[2], 1/(2*n)))

# calculation
pred.w.plim <- predict(lmxy, newdata = newx, interval = "prediction")
pred.w.clim <- predict(lmxy, newdata = newx, interval = "confidence")

# plotting
plot(x, y, xlim = xlim, ylim = ylim)
abline(lmxy)
matplot(newx$x, cbind(pred.w.clim[, -1], pred.w.plim[, -1]),
        lty = c(2, 2, 6, 6), col = c(2, 2, 4, 4), type = "l", add = TRUE)
title(main = "Simultaneous Confidence")
legend("topleft", lty = c(2, 6), legend = c("confidence", "prediction"),
       col = c(2, 4), inset = 0.05, bty = "n")
This is the plot used in the lecture:

[Figure: Scheffe bands]
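
As a starting point for exercise (4), the lecture code can be collected into a single function. The sketch below is one way to do this; the name plotlmbands and the arguments level and ngrid are hypothetical. It relies on predict(), whose "confidence" and "prediction" intervals are computed pointwise for each new x, so it does not attempt a simultaneous (Scheffe-type) band, which would need a wider multiplier.

```r
# Sketch only: plotlmbands() and its arguments are hypothetical names.
plotlmbands <- function(x, y, level = 0.95, ngrid = 200, ...) {
    # check for valid arguments
    stopifnot(is.numeric(x), is.numeric(y), length(x) == length(y))

    fit <- lm(y ~ x)

    # grid of x values on which the bands are evaluated
    newx <- data.frame(x = seq(min(x), max(x), length.out = ngrid))

    # calculation (pointwise intervals from predict.lm)
    clim <- predict(fit, newdata = newx, interval = "confidence", level = level)
    plim <- predict(fit, newdata = newx, interval = "prediction", level = level)

    # plotting; the y range must cover the data and the prediction band
    plot(x, y, ylim = range(y, plim), ...)
    abline(fit)
    matlines(newx$x, cbind(clim[, -1], plim[, -1]),
             lty = c(2, 2, 6, 6), col = c(2, 2, 4, 4))
    legend("topleft", lty = c(2, 6), legend = c("confidence", "prediction"),
           col = c(2, 4), inset = 0.05, bty = "n")

    invisible(list(fit = fit, newx = newx, confidence = clim, prediction = plim))
}
```

With the example data above, plotlmbands(x, y) draws the scatterplot with both bands and returns the fitted model and the interval matrices invisibly.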