Fall 2017 Courses



Course Number

Days and Times


Instructor Units
Foundations of Data Science (Data 8)


CCN: 42193

MWF 10-11

Foundations of data science from three perspectives: inferential thinking, computational thinking, and real-world relevance. Given data arising from some real-world phenomenon, how does one analyze that data so as to understand that phenomenon? The course teaches critical concepts and skills in computer programming and statistical inference, in conjunction with hands-on analysis of real-world data sets.

John DeNero and David Wagner 4



Data Science for Smart Cities


CCN: 42535

Mon 12-2

Designing and operating smart, efficient, and resilient cities now requires data science skills. This course provides an introduction to working with data generated within transportation systems, power grids, and communication networks, as well as data collected via crowd-sensing and remote-sensing technologies.

Alexei Pozdnukhov 2
Immigration: What Do the Data Tell Us?

DEMOG 88


This course will cover the small but important part of the rich history of human migration that deals with the population of the United States, focusing on the period between 1850 and the present. Enrollment in Demog 88 will open shortly, probably early in August. Please contact the instructor at cmason@berkeley.edu for updates.

Carl Mason 2
Social Networks

L&S 88-1

CCN: 46819

Mon 12-2

Insights from the study of social networks are used in a wide range of real-world settings, ranging from predicting and preventing the spread of Ebola, to convincing people to vote for a political candidate, to connecting people across the globe through Facebook. Learn how to work with social network data and why it’s useful.

Dennis Feehan 2
Web Data Visualization

L&S 88-2

CCN: 46820


Learn how to generate effective visualizations for Web data (e.g., social networks). The course covers basic principles and tools for understanding and visualizing Web data. It focuses heavily on project work that aims to give students hands-on experience with handling Web data.

Yasmin AlNoamany 2
Data Science for Cognitive Neuroscience

L&S 88-3

CCN: 46821


The human brain is a complex information processing system and is currently the topic of multiple fascinating branches of research. Understanding how it works is a very challenging scientific task.

Samy Abdel-ghaffar
Behind the Curtain in Economic Development

L&S 88-4

M 2-4, 105 Cory

This class will look at methods in sampling, surveying, and data collection. The students will use their data analysis tools to start to look at questions about household behavior and how households are situated in local, regional and national contexts.

Eric VanDusen 2
Rediscovering Texts as Data

L&S 88-5

M 4-6, 458 Evans

In this course, we will help you find and explore newly available texts of interest to you and guide your understanding of textual phenomena obtained through computational methods, enriching your reading of an individual text. As a connector course to Data 8 (Foundations of Data Science), this class will give students experience in the Python programming language.

Chris Hench
Claudia von Vacano
Probability and Mathematical Statistics in Data Science

STAT 88 

CCN: 22288


In this connector course we will state precisely and prove results discovered in the foundational data science course through working with data.

Shobhana Murali Stoyanov 2

Courses with Data 8 as prerequisite


Course Number

Days and Times


Instructor Units
Principles and Techniques of Data Science

CS C100 / Stat C100

CCN: 45085

Tue-Th 11-12:30

Combining data, computation, and inferential thinking, data science is redefining how people and organizations solve challenging problems and understand the world.

Deborah Nolan and Joseph Gonzalez 4

Additional data-enabled courses 

These courses are taught in a way that permits students to build on Data 8. Please review prerequisites. To add a proposed course to this list, please contact DSEP.

Title
Course number
Days and Times
Description
Instructor
Units
Intro to Machine Learning in Computational Biology


CCN: 46422



This course will review the fundamentals of Data Science and data mining techniques. We will begin by reviewing Data Science across the disciplines, including guest lectures from data scientists on campus. As the semester progresses, we will focus increasingly on data science techniques in computational biology and bioinformatics, illustrating major methods and issues from these fields. Finally, we will discuss ethical issues related to data from biomedical research and genomics. 

Kimmen V Sjolander 4
Special Topics in Cognitive Science


CCN: 22849

F 1-3 

Cory 105

This course develops computational thinking and technical skills for senior undergraduates and graduate students who wish to pursue data-driven approaches to cognitive science and related disciplines. It introduces core ideas from probability theory, information theory, statistics, and machine learning, and explores the nature of human cognition from a computational perspective. The course emphasizes hands-on analysis of data and involves a combination of lectures, discussion, labs, and research-oriented project clinics.

Terry Regier 3
Demographic Methods: Introduction to Population Analysis


CCN: 13815



145 Moffitt

Measures and methods of Demography. Life tables, fertility and nuptiality measures, age pyramids, population projection, measures of fertility control.

Robert E. Chung 3
Data Science in Global Change Ecology

ESPM 157

CCN: 46582



Mulford 230

Many of the greatest challenges we face today come from understanding and interacting with the natural world: from global climate change to the sudden collapse of fisheries and forests, from the spread of disease and invasive species to the unknown wealth of medical, cultural, and technological value we derive from nature. Advances in satellites and micro-sensors, computation, informatics and the Internet have made available unprecedented amounts of data about the natural world, and with it, new challenges of sifting, processing and synthesizing large and diverse sources of information. In this course, students will learn and apply fundamental computing, statistics and modeling concepts to a series of real-world ecological and environment.

Carl Boettiger 3
Applied Data Science with Venture Applications


CCN: 47035



North Gate 105

This highly applied course surveys a variety of key concepts and tools that are useful for designing and building applications that process data signals of information. The course introduces modern open-source computer programming tools, libraries, and code samples that can be used to implement data applications. The mathematical concepts highlighted in this course include filtering, prediction, classification, decision-making, Markov chains, LTI systems, spectral analysis, and frameworks for learning from data. Each math concept is linked to an implementation in Python using libraries for math array functions (NumPy), manipulation of tables (Pandas), long-term storage (SQL, JSON, CSV files), natural language (NLTK), and ML frameworks.

Ikhlaq Sidhu 3
Introduction to Machine Learning and Data Analytics


CCN: 46498



Etcheverry 3106

This course introduces students to key techniques in machine learning and data analytics through a diverse set of examples using real data sets from domains such as e-commerce, healthcare, social media, sports, the Internet, and more. Through these examples, exercises in R, and a comprehensive team project, students will gain experience understanding and applying techniques such as linear regression, logistic regression, classification and regression trees, random forests, boosting, text mining, data cleaning and manipulation, data visualization, network analysis, time series modeling, clustering, principal component analysis, regularization, and large-scale learning.

Paul Grigas 3
Natural Language Processing

INFO 159

CCN: 46356



LeConte 4

This course introduces students to natural language processing and exposes them to the variety of methods available for reasoning about the text in computational systems. NLP is deeply interdisciplinary, drawing on both linguistics and computer science, and helps drive much contemporary work in text analysis (as used in computational social science, the digital humanities, and computational journalism). We will focus on major algorithms used in NLP for various applications (part-of-speech tagging, parsing, coreference resolution, machine translation) and on the linguistic phenomena those algorithms attempt to model. Students will implement algorithms and create linguistically annotated data on which those algorithms depend.

David Bamman 3
Introduction to Data Visualization

INFO 190-1

CCN: 43668



South Hall 210

This course introduces students to data visualization: the use of the visual channel for gaining insight with data, exploring data, and as a way to communicate insights, observations, and results with other people. The field of information visualization is flourishing today, with beautiful designs and applications ranging from journalism to marketing to data science. This course will introduce foundational principles and relevant perceptual properties to help students become discerning judges of data displayed visually. The course will also introduce key practical techniques and include extensive hands-on exercises to enable students to become skilled at telling stories with data using modern information visualization tools.

Marti A. Hearst 3
Introduction to Computational Techniques in Physics


CCN: 18921

M 2-4

Etcheverry 3016

Introductory scientific programming in Python with examples from physics. Topics include visualization, statistics and probability, regression, numerical integration, simulation, data modeling, function approximation, and algebraic systems. Recommended for freshman physics majors.

Kam-Biu Luk 3
Concepts in Computing with Data

STAT 133

CCN: 21287



Dwinelle 155

An introduction to computationally intensive applied statistics. Topics will include organization and use of databases, visualization and graphics, statistical learning and data mining, model validation procedures, and the presentation of results.

Gaston Sanchez Trujillo 3
Reproducible and Collaborative Statistical Data Science

STAT 159

CCN: 21089



Etcheverry 3106

A project-based introduction to statistical data analysis. Through case studies, computer laboratories, and a term project, students will learn practical techniques and tools for producing statistically sound and appropriate, reproducible, and verifiable computational answers to scientific questions. Course emphasizes version control, testing, process automation, code review, and collaborative programming. Software tools may include Bash, Git, Python, and LaTeX.

Fernando Perez 4

For additional guidance about data science courses at Berkeley, consult the Curriculum Overview page.