Exercises: 10.33.346 Analyzing Linguistic Data with R - Details

General information

Course name Exercises: 10.33.346 Analyzing Linguistic Data with R
Subtitle
Course number 10.33.346
Semester SoSe2025
Current number of participants 12
Home institute Institute of Dutch Studies
Course type Exercises in category Teaching
Next date Monday, 28.04.2025 08:00 - 10:00, Room: A09 0-018
Type/Form Ü (Übung, i.e. exercise)
Pre-requisites In this practice (Übung) course, you will learn how to analyze linguistic data using R. Consider the following questions:

1. Do verbs or nouns have more irregular forms in English (e.g., "bought", not "buyed")?
2. Are prefixed words (e.g., replay) or suffixed words (e.g., playing) longer in terms of the number of letters?
3. Are high-frequency (familiar/common/easy) words pronounced relatively short or long?
4. Do some people speak faster than others? How do you take individual differences into account?

How would you address these questions? What should you be careful about when analyzing linguistic data? By the end of the course, you will have learned the basic techniques of linguistic data analysis, so that you will be able to choose and apply appropriate analyses to answer linguistic questions such as those above.

In this course, we will mainly follow this book:

Baayen, R. Harald. (2008). "Analyzing linguistic data: A practical introduction to statistics using R". Cambridge University Press.

Approximately every week, we will learn and practice several analysis techniques from one chapter of the book. Additional material will be provided and explained as needed. You are encouraged to read the relevant chapter before each session. In each session, we will run the code from the chapter and discuss selected problems and questions together. The course is organized to focus on interaction among participants, so you are strongly encouraged to bring your own questions and discussion topics and to take an active part in the discussion. After each session, you need to submit the script file you worked on during that week. You will also need to complete a web-based quiz with a few simple questions related to the topics of the week.
Learning organisation Below are some of the topics planned for the course (subject to change):
- Common data classes in R
- Random variables
- Distributions
- Data visualization
- t-test
- ANOVA
- Simple/multiple/generalized regression
- Mixed models
- Non-linear regression

[IMPORTANT] You need to bring your own laptop with an internet connection, on which R must be able to run. Below are some useful websites for installing and setting up R on common operating systems. Please install R and have it ready to use before the course begins.

Linux:
https://cran.r-project.org/bin/linux/ubuntu/fullREADME.html
https://dh-r.lincolnmullen.com/installing-r-and-packages.html

Mac OS:
https://dh-r.lincolnmullen.com/installing-r-and-packages.html
https://pages.cms.hu-berlin.de/EOL/gcg_quantitative-methods/HowTo_r-on-macos.html

Windows:
https://cran.r-project.org/bin/windows/base/rw-FAQ.html
https://www.dataquest.io/blog/installing-r-on-your-computer/
https://teacherscollege.screenstepslive.com/a/1108074-install-r-and-rstudio-for-windows
https://dh-r.lincolnmullen.com/installing-r-and-packages.html
Performance record The final grade will be determined based on attendance (30%), submission of the script after each session (40%), and performance on the weekly quizzes (30%). You will receive 3 CPs upon successful completion of the course.
You will have the option of pursuing 6 CPs instead of 3 CPs. For 6 CPs, you will receive an additional assignment: analyzing and summarizing linguistic data using R and a Jupyter notebook. You will need to work out which analysis is appropriate, apply it to the data, and summarize the results with text and figures in a Jupyter notebook. The submission deadline for the assignment is currently planned for the end of August.
You can also join the course without pursuing a grade and/or CPs. Even if you do not need a grade or CPs, you are still encouraged to read the chapter before each session, bring your own questions, join the discussion, submit the script of the week, and complete the web-based quiz, so that you can learn and practice as much and as effectively as possible.
Language of instruction English
ECTS points 3 / 6

Rooms and times

A09 0-018
Monday: 08:00 - 10:00, weekly (12x)

Module assignments

Comment/Description

In this practice (Übung) course, you will learn how to analyze linguistic data using R, following the book "Analyzing linguistic data: A practical introduction to statistics using R" (by R. Harald Baayen, 2008, Cambridge University Press). We will learn several analysis techniques every week. After each session, you need to submit your own script of the week and complete a web-based quiz. The final grade will be determined based on attendance (30%), the submitted scripts (40%), and the weekly quizzes (30%). You will receive 3 CPs upon successful completion of the course. You can also join the course without pursuing a grade or CPs, though you are still encouraged to join the discussion and complete the assignments. [IMPORTANT] You need to have your own laptop, on which R can run.