Modules: Bringing Data to Every Classroom

February 20, 2018

Today, data ties together many pieces of our professional, academic, and everyday lives.  This interdisciplinary approach to data science is the cornerstone of the Data Science Modules Program, a collaborative effort between the Division of Data Sciences and D-Lab, which allows faculty across campus to explore unprecedented ways to incorporate data science into their classes.  

In a classroom integrated with a data science module, students work hands-on with datasets that are relevant to their course, whether it be through an in-class lesson, a guided lab or a class project. Modules teams, made up of experienced undergraduate and graduate students from the Division of Data Sciences and D-Lab, collaborate directly with faculty to develop and implement these innovative data science lessons into current or future classes, ensuring that the resulting module connects with the faculty’s teaching and research objectives.

In Fall 2017, over 1,300 undergraduates at Berkeley interacted with modules in 24 different classes.  A survey of these students showed that 40% hoped to eventually take Data 8, the innovative course on the Foundations of Data Science that anchors Berkeley’s Data Science Education Program, and 70% would like to see modules integrated into future classes that they would take.

By spreading data science education to all corners of campus, the Data Science Modules Program offers students the opportunity to discover how they can incorporate this powerful analytic tool into their future studies, life and work experiences.  With modules, learning data science has become not only more accessible, but also more applicable and useful to a wider range of studies, overcoming pre-existing barriers and student anxieties.

Students Collect Groundbreaking Data

Students have interacted with data science modules in fields such as Public Health, Legal Studies, Psychology, Linguistics and Ethnic Studies.  In many of these modules, students worked with data they had collected firsthand.

In Joanna Reed’s Sociology 130AC class, Social Inequalities, an upper-division American Cultures course, students completed a neighborhood assessment. Using a census tract map, they chose a block in a neighborhood, collected data there, and made observations, such as whether that section of the neighborhood could be classified as residential, commercial or industrial.  Then, using a guided program created by members of the modules team, the students visualized their data to tell a story and enhance the bigger picture of their assessment.

“What’s new from a teaching perspective is to be able to show students the power of what we can do [with data] without them having to first suffer through mastering the technical aspects of [statistical software],” sociology professor Reed said of her course’s modules experience.

Linguistics 110, Introduction to Phonetics and Phonology, taught by Susan Lin, also incorporated personal data into its curriculum.  The students in the class recorded their own speech data and then used a guided program created by the modules team to analyze and visualize their findings.

“It’s really cool to see how you can collect your own data, because normally this data comes prepackaged,” Geoff Bacon, a graduate student instructor for the course, said.

Bacon emphasized that the modern data science curriculum makes it easier to analyze data and gain further insights into the course content.  According to Bacon, integrating data science approaches into the Linguistics curriculum could lead to future projects such as studying certain speech sounds in order to trace human migration patterns.

With New Data Science Modules Come New Discoveries

Modules offer invaluable learning experiences both for the students working with the new curricular components and for the students who develop them. Creating these modules requires building on fundamental data science knowledge, which is taught on campus through fast-growing backbone courses such as Data 8 and Data 100.

“[Developing modules] gave me an opportunity to practice skills I had just learned,” said Sujude Dalieh, a current Modules lead who joined the modules development program while enrolled in Data 100, Berkeley’s junior-level class on Principles and Techniques of Data Science.  “Working on this module has been huge for me in terms of building my data science skills.”

Dalieh’s team helped develop a module for Rhetoric 1A, The Craft of Writing, taught by Amy Tick.  Within it, students in the course partnered with members of the modules team in order to analyze how politicians’ speeches changed over time and use the conclusions they drew to illustrate a social theory for the mutability of human moral reasoning.

One day, when a group of students presented their graph of the 2016 presidential candidates’ usage of empathetic words, Dalieh said she heard an audible gasp from the students in the class. “We followed Donald Trump and there was this huge outlier on this one day,” Dalieh said.  Made apparent through the visualization, the outlier pointed to a significant increase in President Trump’s empathetic word usage.

The students listening to the presentation had connected the date of the outlier to the release date of Trump’s controversial Access Hollywood tape. “We finally understood why that outlier happened,” Dalieh said.

Creating a Curriculum for Everyone

(Aditya Sheth teaching a module for Psychology 167AC, Stigma and Prejudice)

Part of the appeal of incorporating data science modules into a wide range of courses is that they offer students a welcoming introduction to programming and statistics.  Because the modules are written in the programming language Python and run in Jupyter notebooks (a cloud-based data science tool that is a staple within the Data Science Education Program), instructors can seamlessly share data, send assignments, and collect student responses without any complicated software installation.

Sociology professor Reed said that after she first mentioned coding in Python, many of her students were quick to express their apprehension. “You could sense the anxiety level rising of other students in the room,” Reed said. “‘What? Coding? Snakes?’  If you don’t have the exposure, you’re not going to know.”

But with the modules program, lessons in programming and statistics are carefully integrated into the course and made applicable to its content.  The program also tackles other challenges that can come up when data science is first introduced in classrooms.  According to Psychology Professor Rodolfo Mendoza-Denton, who also serves as Executive Associate Dean for Diversity in the College of Letters & Science, data can often be perceived by students as “really scary” and potentially threatening.

“It comes up in terms of math anxiety and a stereotype threat,” Mendoza-Denton said. Those concerns were a crucial barrier to overcome in his large class on Stigma and Prejudice, Psychology 167AC.

To create a more cohesive and friendly introduction, the modules learning experience is carefully designed with input not only from the students working on the modules teams but also from the course professors.  The resulting experience advances the potential and accessibility of data science education as a tool to enhance learning and understanding for students of any background.

“If we can work towards making sure that people don’t see data science and applied science as incompatible,” Mendoza-Denton added, “we will have taken a big step forward.”


More information about the Data Science Modules Program can be found in this overview. Faculty interested in adding a module to their class, or learning more about modules, should fill out this interest form, or email

A set of modules that have been incorporated into American Cultures classes will be highlighted at a workshop at the Academic Innovation Studio on March 21st, 2018.