Learning goals

After completing these tasks, you will know how to run simple analyses in R. You will know how to test for normality of your data and know how to run relevant non-parametric tests if your data is not suitable for parametric tests. You will know how to construct formulae for regression analyses. You will have also tried running regression analyses and comparing regression model fits. New concepts: This session will not introduce new programming concepts. We will focus on translating what you already know about statistical analyses to R and implementing things we learned from the previous sessions. Useful packages for getting the analysis output in human readable/paper ready format include yarrr, apaTables, and questionr.

Exercises

4.1 Simple statistical tests

Demo 4: Statistical tests (talk)

Demo 4: Statistical tests (demo)

Start by loading in and inspecting your data as you have done in previous tasks. I recommend you do these tasks on a data set which you have already made tidy & you know that the data import works as it should.

Q4.1.1 Select a continuous variable and a factor with two levels, e.g. sex. Run a t-test (command t.test) on these variables.

Q4.1.2 From the output, find the t-statistic, degrees of freedom and p-value. Can you also find confidence interval and means in both groups?

Q4.1.3 Can you find out how to change the test from two-sided to one-sided? Run the same test with one-sided alternative hypothesis (you can pick which direction makes more sense).

Tip: this is something you can answer with the built-in help!

Q4.1.4 By default, R assumes unequal variances and runs Welch's t-test. Can you find out how to tell the t.test function that you assume equal variances in the two groups?

4.2 Testing assumptions

Demo 4: Testing assumptions (talk)

Demo 4: Testing assumptions (demo)

Q4.2.1 Are we justified in assuming equal variances in our t-test? Find out by testing for homogeneity of variance. If your data are normally distributed, you can use var.test for F-test or bartlett.test (for more than two groups). If your data are not normally distributed, you can use leveneTest from the car package (can also manage more than two groups) or fligner.test.

Tip: At least the Levene test requires that your explanatory variable (the one after the ~) is a factor. If you did not take the time to convert your factors into factors when importing data, do that now!

Q4.2.2 What does your result tell you? Is a low p-value a sign of having equal variances or not having equal variances?

Tip: You can try to formulate the null hypothesis and alternative hypothesis for these tests, and reason based on that.

Q4.2.3 Choose one continuous variable from your data and see if it is normally distributed. First, run visual inspection (hist, qqnorm and qqline).

Q4.2.4 Then, also run Shapiro-Wilk test of normality (shapiro.test)

4.3 Regression

Demo 4: (Linear) regression models (talk)

Demo 4: Regression (demo)

For the regression tasks, find a continuous dependent variable and a number of independent variables

Q4.3.1 Create a simple linear regression with lm(y ~ x, data=nameofdataframe) (where y is your dependent variable, x is one of your independent variables and nameofdataframe is the name of your dataframe).

Q4.3.2 Assign the result of the linear model to a variable and run summary on that variable. What can you tell about the result?

Q4.3.3 Fit another linear model. Add another independent variable to your model by adding + z to your lm() function call and assign the result to another variable. Again, see the result with summary

Q4.3.4 Compare these two models with anova. Does the more complex model fit your data better than the simpler one?

Bonus: logistic regression

Create a new binary variable (or use one you have already) to run logistic regression.

Demo 4: Recap

Course feedback

Now that you made it this far, would you please take a few minutes and fill in the course feedback form which you can find here. It would mean a lot to me if you did that and it shouldn't take too long. Thanks in advance!

 

Created by Juulia Suvilehto
juulia.suvilehto@iki.fi

Creative Commons License