Gateway Data Sciences Courses Reach Enrollment Milestones

February 9, 2018

The “backbone” courses within UC Berkeley’s Data Science Education Program continue to draw unprecedented numbers of students, as Foundations of Data Science (Data 8) and Principles and Techniques of Data Science (Data 100) reached record enrollments in Spring 2018. These courses draw students from over 70 majors on campus, welcoming anyone with or without programming experience, and giving them the tools to apply data science throughout their lives. By creating welcoming, collaborative classrooms and adopting open-source learning environments that are accessible to all, these courses are coding inclusivity into their curriculum and spreading it throughout the world.

Photo: First day of the Data 8 course in Spring 2018.

Data 8’s Continued Expansion

Data 8, the fastest-growing course in UC Berkeley history, has again enrolled over 1,000 students this spring. For the first time in the program’s six semesters, seats in Data 8 had to be limited by the availability of teaching staff and classroom space, a restriction that the faculty hope to remove in the fall.

Developed by professors Ani Adhikari and John DeNero with contributions from David Wagner and Henry Milner, the Data 8 curriculum allows students to apply statistical inference, program in Python, visualize distributions and test hypotheses using real-world datasets. Students investigate a rich variety of real-world topics, which range from parsing novels and analyzing public health epidemics to examining the ethnic breakdown of jury panelists in Alameda County.

Data 8’s accessibility to anyone with no prior coding experience has resulted in a near 50/50 gender ratio, a milestone that has been celebrated at the Grace Hopper Celebration of Women in Computing for its role in “proactively shaping the trajectory of the data science field.”

The Data Science Education Program also supports Connector courses across multiple campus departments, where students can apply data science to focused topics and real-world issues that interest them. Students can use data science to understand social networks, improve business strategies and analyze clinical trials of cancer immunotherapy. This semester, 550 students are enrolled in twelve Connector courses, including a new course on Sports Analytics and one on Data and Decisions for Undergraduate Business Administration.

Data 8 has also continued to grow this year with its first offering of Data 8X, an online version of Data 8 scheduled to launch this April. The initial developers of the Data 8 curriculum, statistics professor Ani Adhikari and computer science professors John DeNero and David Wagner, have teamed up again to create and teach Data 8X, which is offered through online course provider edX.

Rapid Growth of Data 100

Like Data 8, Data 100 has seen a sharp rise in popularity. Co-taught this spring by professors Joey Gonzalez and Fernando Perez, Data 100 explores the principles and techniques of data science. The intermediate-level class bridges Data 8 with upper-division statistics and computer science courses, diving into data analysis, algorithms for machine learning methods, classification and clustering, principles of data visualization, measurement of error and prediction, and techniques for scalable data processing.

Photo: First day of the Data 100 course in Spring 2018.

Now in its third semester, Data 100’s enrollment has grown 250% each semester since it began, with current enrollment at 600 students, said Interim Dean of Data Sciences David Culler.

In Data 100, students work with real-world data sets and computational tools that are used in both research and industry. “The questions we’re asking require non-trivial answers,” Perez said of the course. “Students learn to talk to databases, run things on the cloud, and run complex installations of software in computers. It’s a real world experience. We’re asking, ‘What are the problems in data science?’ versus purely hypothetical questions from the back of a book.”