Data Science Pathways
A research project predicting student outcomes in introductory data science courses
📌 Project Abstract
- Project: Research project asking psychological questions using the CourseKata platform.
- Timeframe: June 2022 - present
- My Role: Lead researcher and project manager.
- Team: The team consists of two principal investigators, a graduate student researcher, an undergraduate researcher, and myself.
- Methods: Agile method, , , &
- Tools: Miro, R, R Markdown, Keynote, Statistical Models, Adobe Illustrator, LaTeX, BibTeX, Git, & Zotero
- Deliverables: Conference presentation, presentation to CourseKata creators, & paper submission to the Association for Computing Machinery (ACM).
- Impact: Awarded best presentation at a conference.
Project Overview
🚀 Client kickoff
CourseKata is an online, interactive textbook for teaching introductory statistics and data science. The textbook integrates with popular learning management systems (e.g., Canvas, Blackboard, Moodle), which makes it more accessible to students throughout the course \cite{watson2007argument}. This setup not only lets students access the CourseKata material from their courses’ central platform but, critically, also allowed us to measure the various interactions students have with the textbook. CourseKata has 12 chapters that cover introductory statistics topics, including understanding data, examining distributions, sampling, explaining variation, modeling, quantifying error, and quantifying models. Each chapter includes interleaved activities to support learning, along with an end-of-chapter assessment.
🔎 The problem
Learning to reason and work with quantitative data is increasingly vital in many fields. Introductory data science courses at the high-school and collegiate levels have begun to provide new pathways for students from diverse backgrounds to acquire these foundational computational and statistical reasoning skills. However, the factors that predict student achievement in these courses are not well understood. The goal of our study was to investigate the relationship between the attitudes that students bring to these courses (e.g., anxiety towards math, confidence with computer programming), their ongoing engagement with the course material, and their comprehension of key concepts.
🛠️ The solution: methods, research process, and toolbox
We leveraged anonymous data from CourseKata, an interactive web-based textbook and learning platform, which automates the collection of key variables.
This longitudinal dataset includes 2,998 students across 110 classrooms (15% high school and 85% higher education) over a two-year period between 2021 and 2022.
How well do students’ initial attitudes & beliefs predict learning?
Initial attitudes were strong predictors of learning
How well does student engagement with a course predict learning?
Engagement with interleaved exercises predicts learning outcomes
How well do student attitudes & engagement predict learning?
Engagement improves our ability to predict learning, but far less than initial attitudes
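The three questions above amount to comparing nested predictive models: attitudes only, engagement only, and both together. The sketch below illustrates that comparison with ordinary least squares on synthetic data. It is not the study's actual analysis (which was done in R) and does not use CourseKata data; the variable names, effect sizes, and sample size are all assumptions chosen only to show the mechanics.

```python
import numpy as np


def r_squared(X, y):
    """Fit ordinary least squares with an intercept and return R^2."""
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coef
    ss_res = float(residuals @ residuals)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return 1.0 - ss_res / ss_tot


# Synthetic stand-in data, NOT the CourseKata dataset.
rng = np.random.default_rng(0)
n = 500  # hypothetical number of students

# Hypothetical standardized predictors.
attitudes = rng.normal(size=(n, 3))   # e.g., interest, math anxiety, fixed mindset
engagement = rng.normal(size=(n, 1))  # e.g., share of interleaved exercises completed

# Arbitrary effect sizes, chosen for illustration only.
score = (
    attitudes @ np.array([0.5, -0.4, -0.3])
    + 0.2 * engagement[:, 0]
    + rng.normal(size=n)
)

# Variance explained by each model.
r2_attitudes = r_squared(attitudes, score)
r2_engagement = r_squared(engagement, score)
r2_both = r_squared(np.column_stack([attitudes, engagement]), score)

# Incremental variance explained by adding engagement to attitudes.
gain_from_engagement = r2_both - r2_attitudes

print(f"R^2 attitudes only:  {r2_attitudes:.3f}")
print(f"R^2 engagement only: {r2_engagement:.3f}")
print(f"R^2 both:            {r2_both:.3f}")
print(f"Gain from adding engagement: {gain_from_engagement:.3f}")
```

Because the models are nested, the combined model can never explain less in-sample variance than either smaller model; the quantity of interest is how large the gain is. With classroom-level structure like the one described above, a multilevel model and out-of-sample evaluation would likely be more appropriate than this plain regression.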
💭 Project Takeaways
We found that several attitudinal measures (e.g., interest in the material, math anxiety, fixed mindset) were predictive of performance on embedded learning assessments, while individual differences in the degree of engagement with course material were only modestly related to performance.
We also discovered that a substantial number of students displayed a total loss of engagement with both the course material and assessments, posing challenges for obtaining accurate estimates of learning within this population.
Taken together, these results suggest that identifying the factors that moderate the effect of initial attitudes and predict sustained student engagement may be especially critical for improving retention of diverse learner populations in data science pathways.