Renata Barreto

Cathryn Carson

Rochelle Terman

Claudia von Vacano

Christopher Hench

David Harding

Benjamin Gebre-Medhin

November 15, 2021

While taking a statistics class as part of her joint JD/PhD program at UC Berkeley, Renata Barreto realized she needed to use Python programming for assignments. So she looked for help refreshing and deepening those skills.

She found support at D-Lab, a learning community that helps Berkeley scholars conduct data-intensive social science and humanities research. She’s stayed involved with the lab since, solidifying her programming interest, teaching Python, researching online hate speech using computational tools and preparing for potential industry jobs that weren’t on her radar before.

“I found that having an instructor that taught the ins and outs of programming was really helpful in terms of how it clarified my career path,” said Barreto. As she’s grown with the lab, she’s also been able to focus on publishing research that “wouldn't have been possible without having the values of D-Lab, things like equity and justice, at the forefront.”

Barreto’s story is common among the thousands of students, faculty and staff served by D-Lab since it opened in 2013. Its peer consulting, training and data access for scholars at Berkeley help individuals, especially graduate students, complete graduate programs, conduct computational social science research and identify careers in an inclusive environment.

The lab has been unique from the start. Graduate students were seen as leaders and key participants from its inception. It focused on both qualitative and quantitative research. These decisions defied norms at the time, and enabled women, underrepresented minorities and graduate students to shape the lab’s research and culture into an equity-driven, welcoming space. It then provided a scaffold for Berkeley’s data science landscape as other programs started on campus.

“The lab was really built around empowering graduate students and seeing them as forces of change [with] the understanding that meant giving power to groups that had suffered under status differentials,” said Cathryn Carson, D-Lab’s founding director who is now chair of Berkeley’s History Department. “We were able to talk explicitly about these racial, class, gender and status divides in our research and how we wanted to overcome them.”

Now “our graduate students come out of the lab with both the experience of doing [research and teaching] on their own and a social justice orientation in a field where that's not taken for granted,” Carson said.

‘It’s OK Not to Know’

Carla Hesse, who was the dean of Berkeley’s Division of Social Sciences, had the idea to create the lab in the wake of the recession of 2008 and 2009, Carson said. Hesse identified computational research as an area the division should invest in for future graduate students. The fastest, most effective way to do that: provide consulting and training outside of classes.

The envisioned lab would bridge the gap between those who wanted to learn these skills and the upper-level classes that use them. It would be a welcoming environment for beginners and also remove some financial barriers by offering workstations with expensive software and data for students to use for free.

“At the time I felt this attitude going around that if you don’t already know [how to use these tools], then you don’t deserve to be taught,” said Rochelle Terman, one of the first D-Lab graduate student leaders and now an assistant professor at the University of Chicago. “It ended up being people like women and underrepresented minorities or people from non-techie fields that were shut out.”

That common experience resulted in the lab’s initially implicit approach and eventually explicit slogan: “It’s okay not to know.” Today, graduate students say that slogan and atmosphere made them comfortable asking questions and gave them confidence to work through programming challenges.

D-Lab is focused on serving and hiring diverse participants, especially those from historically marginalized populations. The lab tracks demographic groups’ participation to understand whom it is and isn’t reaching. It also focuses on listening and empathizing.

“We're actively seeking diversity in everything that we do because we know that it increases the quality of the work and because it brings more perspectives and eyes to bear on the work,” said Claudia von Vacano, one of the lab’s founders who is now its executive director. “If we’re able to address our needs – and I’ll say ‘our’ because I am from a marginalized community – then we’re able to address needs universally.”

That eye toward diversity affects the research approach the lab takes, too, both in terms of the kinds of projects it takes on and the ethics behind how it handles those projects. Christopher Hench, who while pursuing his PhD in German literature was one of the first researchers on Dr. Chris Kennedy and Dr. von Vacano’s Measuring Hate Speech Project, which uses machine learning to study online hate speech, said he learned to think about how data should be used rather than only how it could be used.

“We always had in the back of our minds, ‘How can we make sure that we’re as inclusive as possible – that the data science we’re carrying out is not further marginalizing groups and, if anything, it’s advocating for them,’” said Hench, now a research scientist at Amazon. “We would empower everybody, make sure that everyone had the opportunity to use these really powerful tools, but also make sure that everybody was aware of how they can inherently contain bias and work to address that.”

Hench also took a leading role creating D-Lab's class on data science for social scientists for SAGE Campus, an online research-focused learning platform.

‘Where the Impact of Data Science Is’

As D-Lab matures, it’s able to support more graduate students, faculty and staff. Now, the lab serves about 6,000 scholars annually. 

“It’s also serving an increasing number of undergraduates, who not only attend workshops and use consulting services, but who also serve as undergraduate peer consultants, tech support alongside D-Lab’s staff and graduate consultants and instructors to assist them,” said Aaron Culich, the lab’s deputy director and a leader in cyberinfrastructure.

The lab offers hundreds of workshops on a growing range of tools and topics – from R to cloud computing – and is one of 31 Federal Statistical Research Data Centers nationwide, where researchers can access sensitive, restricted federal data from the U.S. Census Bureau and other agencies.

The impact of the lab is being felt across the Berkeley campus. Its free and frequent workshops have affected the kind of research scholars can accomplish, for example by helping them use more tools that change what data they can draw on to answer their research questions.

“It really has expanded the type of data that we can work with in the social sciences,” said David Harding, faculty director of D-Lab and a professor of sociology. “That's really where the impact of data science is, being able to work with large-scale data, being able to work with texts, data, geographical data, network data, and take advantage of those methodologies.”

The lab is also overseeing a $3 million National Science Foundation grant, which involves multiple campus stakeholders, aimed at ensuring widespread access to data science classes. And D-Lab participants, like Terman, use their training to create formal classes on campus that remain intact after they graduate. D-Lab has also launched several courses, including a digital humanities minor and certificate program.

The lab’s experience and resources supported and helped shape parts of what is now the Data Science Undergraduate Studies program and other data science-centered institutions on campus, too, said Carson, who co-chaired the Data Science Education Rapid Action Team that created the blueprint for the undergraduate program.

Relevant expertise helps external groups, too, from training state employees on data-centered tools to conducting research with nonprofits and companies. For example, D-Lab is now exploring with Kaiser Permanente how machine learning can be used to improve health outcomes during pregnancy.

These kinds of outside partnerships often bring in revenue to support the lab. But each of them must offer something more – unique research opportunities, access to data or other benefits – to the lab and its students, said Patty Frontiera, the lab’s senior data services manager.

“Providing consulting and workshop services to the campus community and hiring graduate students to lead that work so they, in turn, develop as professionals are the most important things we do,” said Frontiera. “Everybody loves a good research project, but it has to enhance these services and our D-Lab student experience.”

A Movement and a Learning Community

Moving forward, the lab will continue to evolve as new students bring their own ideas, energy and enthusiasm to the center's work. And as students graduate, they'll continue to leave their mark on the lab, whether it's through training they've developed, research they've led or students they've helped.

In many cases, D-Lab students take their empathy-forward teaching approach and social justice-oriented research bent to other higher education institutions where they reach even more students. That includes alumni like Laura Nelson at the University of British Columbia and Benjamin Gebre-Medhin at Mount Holyoke College, among others.

"We're really more of a movement or a learning community rather than just a static bureaucratic organization," said Dr. von Vacano. "We're constantly re-inventing ourselves and re-imagining how we could do our work."