Background on the Project

Our project will increase long-term data science capacity in the Pioneer Valley area of Western Massachusetts, while also providing students with valuable hands-on data science experience.

The two components of the proposed project are:

  1. Data Science WAV teams: specially-trained teams of four undergraduate students who are deployed to community-based organizations to Wrangle, Analyze, and Visualize their data.

  2. Summer Faculty Development Workshops designed to help new instructors, especially those at community colleges, teach data science at their institutions. The workshops will foster curricular innovations that bring experiential data science learning into the curriculum, leading to sustainable long-term impact at the partnering academic institutions and in the larger Pioneer Valley region.

Overview

A major goal of the faculty development workshops is to prepare Five College and community college faculty to teach data science. The emphasis on community college faculty is critical, since nearly 50% of all bachelor’s degree recipients come through a two-year college (National Student Clearinghouse 2017). Training in data science will help faculty engage with their students on relevant topics.

The workshops will be divided into two parts: during the first year (2021) approximately 20 participants will receive instruction in fundamental data science concepts, while in the second year (2022) participants will work on implementing a specific curricular innovation.

Dates

The 2021 workshop was postponed and will take place Monday through Friday, June 14th–18th, 2021, at Smith College. The workshop will begin at 9:30am each day and finish at 1:00pm on Friday. The dates for the 2022 workshop have not yet been finalized.

Funding

Stipends are available for faculty who attend the workshop and demonstrate that they have integrated material from the workshop into their teaching. Applications for the 2021 workshop are now available and will be evaluated on a rolling basis, beginning at the end of March (LINK TO APPLICATION).

Leaders

These workshops will be led by:

  • Benjamin Baumer, Smith College: PI on the DSC-WAV project, Ben has contributed to multiple curricular development efforts at the national level, including De Veaux et al. (2017) and B. Baumer (2015). He is a co-author of B. S. Baumer, Kaplan, and Horton (2021), one of the first comprehensive textbooks on data science.
  • Nicholas Horton, Amherst College: co-PI on the DSC-WAV project, Nick is a leader in the statistics and data science community (Horton, Baumer, and Wickham 2015; Hardin et al. 2015), and has worked to provide curriculum guidelines for data science at two-year colleges. He is a co-author of B. S. Baumer, Kaplan, and Horton (2021).
  • Ethan Meyers, Hampshire College/Yale University: co-PI on the DSC-WAV project, Ethan has extensive experience in data science practice and education through his work at Hampshire College and Yale University, and through his research affiliation with the Center for Brains, Minds and Machines at MIT.

Audience and Topics

The gathering is aimed at faculty from two-year colleges and the Five Colleges who have an interest in data science. It is intended to help them meet the challenges of integrating data science into their courses and so better prepare their students.

The aim of the workshop is to introduce a powerful suite of data science tools, including R/RStudio, the tidyverse suite of packages, Shiny for dynamic visualization, and GitHub for collaboration. The workshop will provide an introduction to using these tools to undertake the entire data science analysis cycle: posing a question, identifying data sources, ingesting data, wrangling data, undertaking exploratory data analysis, modeling, assessment, and communication. Visualization and data wrangling in Python will also be explored.

References

Baumer, Ben. 2015. “A Data Science Course for Undergraduates: Thinking with Data.” The American Statistician 69 (4): 334–42. http://amstat.tandfonline.com/doi/abs/10.1080/00031305.2015.1081105.
Baumer, Benjamin S., Daniel T. Kaplan, and Nicholas J. Horton. 2021. Modern Data Science with R (2e). Boca Raton: Chapman and Hall/CRC Press. https://mdsr-book.github.io/mdsr2e.
De Veaux, Richard D., Mahesh Agarwal, Maia Averett, Benjamin S. Baumer, Andrew Bray, Thomas C. Bressoud, Lance Bryant, et al. 2017. “Curriculum Guidelines for Undergraduate Programs in Data Science.” Annual Review of Statistics and Its Application 4 (1): 1–16. https://doi.org/10.1146/annurev-statistics-060116-053930.
Hardin, Johanna, Roger Hoerl, Nicholas J. Horton, Deborah Nolan, Benjamin Baumer, Olaf Hall-Holt, Paul Murrell, et al. 2015. “Data Science in Statistics Curricula: Preparing Students to ‘Think with Data’.” The American Statistician 69 (4): 343–53. http://www.tandfonline.com/doi/abs/10.1080/00031305.2015.1077729.
Horton, Nicholas J., Benjamin S. Baumer, and Hadley Wickham. 2015. “Setting the Stage for Data Science: Integration of Data Management Skills in Introductory and Second Courses in Statistics.” CHANCE 28 (3): 40–50. http://chance.amstat.org/2015/04/setting-the-stage/.
