PSTAT197A/CMPSC190DD Fall 2022
UCSB
Confer with your table and choose a word of the day. Agree on spelling.
Please sign in using the attendance reporting form found here:
PSTAT197A/CMPSC190DD is the first course in UCSB’s year-long data science capstone sequence.
Audience: undergraduate students of any discipline with a basic background in data science and an interest in research
Aim: prepare for an independent research or project experience
Most students are preparing for capstone projects in winter and spring. Course foci were chosen with this in mind.
Read about past projects at https://centralcoastdatascience.org/projects
Continuation in PSTAT197B-C/CMPSC190DE-DF during winter and spring:
students admitted to this course in spring have a seat;
students admitted from the waitlist are on the waitlist.
I hope to support all of you in:
We are in an interactive classroom for a reason: to interact!
Let’s acknowledge:
Preparations and areas of expertise vary widely among the class
It’s okay not to know things
If you have a question, probably someone else does too
All course content is hosted on our website
The course is organized into modules, each defined by a dataset and a set of questions (much like a project).
A module typically comprises:
One session on data introduction (lecture/discussion)
Two sessions on problem patterns and related methodology (lecture)
Two labs with related examples (section meeting)
One session on sharing data analysis results (discussion)
The module datasets are currently as follows:
Class intake survey data (exploratory/descriptive analysis)
Biomarkers of autism (predictive modeling and variable selection)
Web fraud (text processing and deep learning)
Soil temperatures (correlated data)
For each module, you will be assigned to a working group.
Your group’s objective is to produce an analysis of the dataset:
Reproduce the analysis presented and discussed in class meetings
Extend the analysis by
applying an alternative method that addresses the same question(s)
or addressing a corollary question
At the end of the course, in place of a fifth module, you will create a vignette (a short demonstration) on a topic of interest. It should:
present a use case
explain methodology
demonstrate implementation with example code
Students are expected to:
prepare for class meetings as directed;
attend and actively participate in class and section meetings;
contribute meaningfully to group activities and assignments.
Students are assessed on:
attendance, preparation, and participation;
quality of submitted work;
individual contributions to group assignments;
oral interview/presentation.
We’ll discuss:
data science as a discipline;
the research landscape;
systems and design thinking for data science.
Complete all of the following before our next meeting.