November 2, 2016

Berkeley’s data science offerings continue to grow this spring, with a variety of new courses available in departments across the campus. Beyond the courses listed below, further courses are still under development. Note that the information below is subject to change; please check back often for updates.

Foundations of Data Science 

This spring, Berkeley’s four-unit Foundations of Data Science course will be taught by Computer Science Professor John DeNero. The course teaches core computational and statistical concepts while enabling students to work hands-on with real data. Accessible to students of any intended major, it is ideal for freshmen and sophomores. No prior experience with computer science or statistics is expected. The course satisfies requirements including the L&S Quantitative Reasoning requirement and the statistics requirement in most majors that require statistics.

Connector Courses

The list of available connector courses (subject to change) will include:

Data Science and the Mind (COGSCI 88)

Exploring Geospatial Data (ESPM 88A) (expected)

Time Series Analysis: Sea Level Rise and Coastal Flooding (CIV ENG 88B)

Immunotherapy of Cancer: Success and Failure (MCB 88)

Crime and Punishment: Taking the Measure of the US Justice System (Legal Studies 88)

Data and Ethics (INFO 88)

Computational Structures in Data Science (COMPSCI 88)

Probability and Mathematical Statistics in Data Science (STAT 88)

New advanced data science courses debuting this spring

Three new courses that take Foundations of Data Science (Data 8) as a prerequisite are being developed for Spring 2017. They are ideal for students looking to move further into data science and take their knowledge to the next level.

Principles and Techniques of Data Science (CS C100 / Stat C100): (subject to approval by the Academic Senate)
Combining data, computation, and inferential thinking, data science is redefining how people and organizations solve challenging problems and understand the world. This intermediate-level class bridges Data 8 and upper-division computer science and statistics courses, as well as methods courses in other fields. In this class, students will master the data science life cycle and learn many of the basic principles and techniques of data science, spanning algorithms, statistics, machine learning, visualization, and data systems. Skills and expertise developed in this class will enable students to pursue careers in data science or apply data science to research. More information is available on the course website at https://ds100.org/sp17/.

Statistical Methods for Data Science (Stat 28): Stat 28 is a new course for students in many disciplines who have taken Data 8 and want to learn more advanced techniques without the additional mathematics called on in upper-division statistics. Topics include group comparisons and ANOVA, standard parametric statistical models, multivariate data visualization, multiple linear regression and classification, classification and regression trees, and random forests. An important focus of the course is statistical computing and reproducible statistical analysis. Students are introduced to “R”, the widely used statistical language, and gain hands-on experience implementing a range of commonly used statistical methods on real-world datasets. Foundations of Data Science is the only prerequisite.

Probability for Data Science (Stat 140): This new course introduces students to probability theory using both mathematics and computation, the two main tools of the subject. The contents have been selected to be useful for data science, and include discrete and continuous families of distributions, bounds and approximations, dependence, conditioning, Bayes methods, random permutations, convergence, Markov chains and reversibility, maximum likelihood, and least squares prediction. Labs will cover a variety of topics including matches in random sampling, distance between distributions, PageRank, and Markov chain Monte Carlo methods. The prerequisites are Foundations of Data Science (Data 8) and one year of calculus. Data 8 gives students a practical understanding of randomness and sampling variability. Stat 140 will capitalize on this, with abstraction and computation complementing each other throughout. Students will develop multiple approaches to problem solving, understand the difference between theory and simulation, and appreciate the power of both.

Advanced Integrative Opportunities

A new category of classes is under development that will enable more advanced students to work hands-on with data in an interdisciplinary, project-based manner. Among them is Terrestrial Hydrology (Geog C136/ESPM C130), a new project-based, interdisciplinary course focused on the role that hydrology plays in malaria transmission in sub-Saharan Africa (prerequisites are Math 1A-1B and Physics 7A).

Stay up to date
This website will be updated as course information is finalized.