Materials
Class forms
Attendance reporting form; fill out once per class meeting.
Capstone project intake form; fill out by September 23.
Resources
Textbooks:
Modern Data Science with R by Baumer, Kaplan, and Horton.
Introduction to Statistical Learning with Applications in R by James et al.
Fundamentals of Data Visualization by Claus Wilke.
R for Data Science by Wickham and Grolemund.
Deep Learning by Goodfellow, Bengio, and Courville.
Documentation:
Tidyverse and tidymodels packages
Introductory module
Objectives: set expectations; explore the raison d’être of data science; introduce systems and design thinking; introduce software tools and collaborative coding; conduct exploratory/descriptive analysis of class background and interests.
Week 0
Thursday meeting: Course orientation [slides]
Assignments due by next class meeting:
install course software and create a GitHub account
fill out the intake form
read Peng and Parker (2022)
prepare a reading response
Week 1
Tuesday meeting: On projects in(volving) data science [slides]
Section meeting: software and technology overview [activity]
Assignments due by next class meeting:
read MDSR 9.1 and 9.2
prepare a reading response
Week 2
Tuesday meeting: Introducing class intake survey data [slides]
Section meeting: tidyverse basics [activity]
Thursday meeting: planning group work for analysis of survey data [slides]
Assignments:
first team assignment due Friday, October 14, 11:59 PM PT [accept via GH classroom]
Module 1: biomarker identification
Objectives: introduce variable selection, classification, and multiple testing problems; discuss classification accuracy metrics and data partitioning; fit logistic regression and random forest classifiers in R; learn to implement multiple testing corrections for FDR control (Benjamini-Hochberg and Benjamini-Yekutieli); discuss selection via penalized estimation. Data are from Hewitson et al. (2021).
Week 3
Tuesday meeting: introducing biomarker data; multiple testing [slides]
Section meeting: iteration strategies [activity]
Thursday meeting: correlation analysis; random forests [slides] [activity]
Assignments due by next class meeting:
read MDSR 10.1 and 10.2
read Hewitson et al. (2021)
prepare a reading response
Week 4
Tuesday meeting: random forests cont’d; logistic regression [slides]
Section meeting: logistic regression and classification metrics [activity]
Thursday meeting: LASSO regularization [slides]
Assignments:
second group assignment due Friday, October 28, 11:59 PM PT [accept via GH classroom] [group assignments]
Module 2: fraud claims
Objectives: introduce NLP techniques for converting text to data and web scraping tools in R; discuss dimension reduction techniques; introduce multiclass classification; learn to process text, fit multinomial logistic regression models, and train neural networks in R.
Week 5
Week 6
Tuesday meeting: feedforward neural networks [slides]
Section meeting: fitting neural nets with keras [activity]
Thursday meeting: assignment review and planning [slides]
Assignments:
midquarter assessments [form]
request winter add code [form]
read Emmert-Streib et al. (2020) (§1-5, §9) and prepare a reading response
third group assignment due Monday, November 14, 11:59 PM PST [accept via GH classroom] [group assignments]
Optional further reading:
Module 3: soil temperatures
Objectives: build a forecasting model; introduce concepts of spatial and temporal correlation; discuss function approximation and curve fitting with regression techniques; fit elementary time series models and regression with AR errors; spatial interpolation.
Week 7
Week 8
Module 4: vignettes
Objectives: learn independently about a method of choice and prepare a teaching vignette illustrating its use; create shared reference material potentially useful for project work.
Week 9
Tuesday meeting: discussion on results of claims module; vignette workshopping [slides]
Section meeting: NO SECTION MEETING (Thanksgiving)
Thursday meeting: NO CLASS (Thanksgiving)
Assignments: vignettes [guidelines]
drafts due in class Thursday, 12/1, 2:00 PM PST
final version due Thursday, 12/8, 11:59 PM PST
Week 10
Tuesday meeting: capstone project overviews [slides]
Section meeting: office hours for vignette help
Thursday meeting: vignette presentation/exchange/feedback [feedback form]
Assignments due by Friday, 12/2:
read project abstracts
fill out preference form (will be active end of day 11/29)