Before running a regression on a real dataset, it helps to test the steps on at least one set of simulated data (code adapted from [1]):
# define the model with simulated data
n <- 100
x <- c(1:n)
error <- rnorm(n,0,10)
y <- 1+2*x+error
fit <- lm(y~x)

# plotting the values
plot(x, y, ylab="1+2*x+error")
lines(x, fit$fitted.values)

# using anova (analysis of variance)
anova(fit)
In the first step the data model is created, in the second the data are plotted together with the fitted line, and in the third the analysis of variance is run. For the y variable, any linear function of x can be used, i.e., any function that represents a line in the plane.
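Because the simulated data are built from a known line, one can check whether the fit recovers it. The following is a minimal sketch, assuming the same model as above (intercept 1, slope 2); the seed value is an arbitrary choice that is not part of the original code:

```r
# Assumption: a fixed seed (arbitrary value) so the run is repeatable
set.seed(42)

n <- 100
x <- seq_len(n)
error <- rnorm(n, mean = 0, sd = 10)
y <- 1 + 2 * x + error            # true intercept = 1, true slope = 2

fit <- lm(y ~ x)

# estimated intercept and slope; they should be close to 1 and 2
est <- coef(fit)
print(est)

# 95% confidence intervals for both coefficients
print(confint(fit))
```

With 100 points and this noise level the slope is estimated quite precisely, while the intercept is noticeably less stable.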
The rnorm() function generates normally distributed random values from the given parameters (number of values, mean, standard deviation), so the output will vary between runs of the above code unless a seed is fixed with set.seed(). The larger the third parameter (the standard deviation), the more dispersed the data.
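Both points can be tried out directly. This is a small sketch; the seed and the sample sizes are arbitrary values picked for illustration:

```r
# Same seed, same call: the random values are reproduced exactly
set.seed(1)
a <- rnorm(5, mean = 0, sd = 10)
set.seed(1)
b <- rnorm(5, mean = 0, sd = 10)
print(identical(a, b))

# A larger standard deviation gives more dispersed values
set.seed(1)
narrow <- rnorm(1000, mean = 0, sd = 1)
wide   <- rnorm(1000, mean = 0, sd = 10)
print(c(narrow = sd(narrow), wide = sd(wide)))
```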
To test the code on real data, one can use the Sleuth3 library with the data from [2] (see RPubs):
install.packages("Sleuth3")
library("Sleuth3")
Let's look at the data from the first case, which come from an experiment on the effects of intrinsic and extrinsic motivation on creativity, run by the psychologist Teresa Amabile (see [2]):
attach(case0101)
case0101
summary(case0101)
The regression can be applied to all the data:
# case 0101 (all data)
x <- c(1:47)
y <- case0101$Score
fit <- lm(y~x)
plot(x, y, ylab="Score")
lines(x, fit$fitted.values)
However, a more appropriate analysis looks at each questionnaire (treatment group) separately:
# case 0101 (extrinsic vs intrinsic treatments)
extrinsic <- subset(case0101, Treatment %in% "Extrinsic")
intrinsic <- subset(case0101, Treatment %in% "Intrinsic")

par(mfrow = c(1,2)) # 1x2 matrix display

x <- c(1:length(extrinsic$Score))
y <- extrinsic$Score
fit <- lm(y~x)
plot(x, y, ylab="Extrinsic Score")
lines(x, fit$fitted.values)

x <- c(1:length(intrinsic$Score))
y <- intrinsic$Score
fit <- lm(y~x)
plot(x, y, ylab="Intrinsic Score")
lines(x, fit$fitted.values)

title("Extrinsic vs. Intrinsic Motivation on Creativity", line = -2, outer = TRUE)
And here's the output:
Case 0101: Extrinsic vs. Intrinsic Motivation on Creativity
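The plots above regress the score on the row index within each group. Another way to compare the two questionnaires is to use the treatment itself as the predictor. The sketch below is only an illustration: so that it runs without the Sleuth3 package, it uses a simulated stand-in data frame named demo (a hypothetical name) with the same two columns, Score and Treatment; the group sizes, means and spread are made-up values, not the real case0101 data.

```r
# Stand-in data: simulated, NOT the real case0101 values.
# Assumptions: group sizes, means and sd are invented for illustration.
set.seed(123)
demo <- data.frame(
  Score     = c(rnorm(23, mean = 16, sd = 5), rnorm(24, mean = 20, sd = 5)),
  Treatment = factor(rep(c("Extrinsic", "Intrinsic"), times = c(23, 24)))
)

# mean score per treatment group
group_means <- tapply(demo$Score, demo$Treatment, mean)
print(group_means)

# regression with the treatment as a categorical predictor:
# the intercept is the mean of the first level ("Extrinsic"),
# the second coefficient is the difference between the group means
fit <- lm(Score ~ Treatment, data = demo)
print(coef(fit))

# analysis of variance for the treatment effect
print(anova(fit))
```

With the package loaded, the same call should work on the real data by replacing demo with case0101, i.e. lm(Score ~ Treatment, data = case0101).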
Happy coding!
References:
[1] DeWayne R Derryberry (2014) Basic Data Analysis for Time Series with R 1st Ed.
[2] Fred L Ramsey & Daniel W Schafer (2013) The Statistical Sleuth: A Course in Methods of Data Analysis 3rd Ed.