Spring 2020 Courses

Backbone 

Title
Course Number
Times & Locations
Description 
Instructor
Units
Foundations of Data Science (Data 8)

STAT C8

Class #: 22140

MWF

10-11am

Wheeler 150

Foundations of data science from three perspectives: inferential thinking, computational thinking, and real-world relevance. Given data arising from some real-world phenomenon, how does one analyze that data so as to understand that phenomenon? The course teaches critical concepts and skills in computer programming and statistical inference, in conjunction with hands-on analysis of real-world datasets, including economic data, document collections, geographical data, and social networks. It delves into social and legal issues surrounding data analysis, including issues of privacy and data ownership. 4
Principles & Techniques of Data Science (Data 100)

COMPSCI C100

Class #: 28679

TTh

9:30-11am

Wheeler 150

In this course, students will explore the data science lifecycle, including question formulation, data collection and cleaning, exploratory data analysis and visualization, statistical inference and prediction, and decision-making. This class will focus on quantitative critical thinking and key principles and techniques needed to carry out this cycle. These include languages for transforming, querying and analyzing data; algorithms for machine learning methods including regression, classification and clustering; principles behind creating informative data visualizations; statistical concepts of measurement error and prediction; and techniques for scalable data processing. Joseph Edgar Gonzalez, Ani Adhikari 4

Data, Inference, and Decisions (Data 102)

STAT 102

Class #: 31062

TTh

9:30-11am

Lewis 100

This course develops the probabilistic foundations of inference in data science, and builds a comprehensive view of the modeling and decision-making life cycle in data science including its human, social, and ethical implications. Topics include: frequentist and Bayesian decision-making, permutation testing, false discovery rate, probabilistic interpretations of models, Bayesian hierarchical models, basics of experimental design, confidence intervals, causal inference, Thompson sampling, optimal control, Q-learning, differential privacy, clustering algorithms, recommendation systems and an introduction to machine learning tools including decision trees, neural networks and ensemble methods. Jacob Noah Steinhardt 4
Probability for Data Science

STAT 140

Class #: 22613

TuTh

3:30-5pm

Li Ka Shing 245

An introduction to probability, emphasizing the combined use of mathematics and programming to solve problems. Random variables, discrete and continuous families of distributions. Bounds and approximations. Dependence, conditioning, Bayes methods. Convergence, Markov chains. Least squares prediction. Random permutations, symmetry, order statistics. Use of numerical computation, graphics, simulation, and computer algebra. Ani Adhikari 4

Connectors

Title
Course Number
Times & Location
Description
Instructor
Units
Computational Structures in Data Science

COMPSCI 88

Class #: 29260

TBD

Development of Computer Science topics appearing in Foundations of Data Science (C8); expands computational concepts and techniques of abstraction. Understanding the structures that underlie the programs, algorithms, and languages used in data science and elsewhere. Mastery of a particular programming language while studying general techniques for managing program complexity, e.g., functional, object-oriented, and declarative programming. Provides practical experience with composing larger systems through several significant programming projects. Gerald Friedland, Michael Ball 2
Sports Analytics

DATA 88

Class #: TBD

TBD 

THIS CLASS IS NOT YET SCHEDULED BUT WILL BE AVAILABLE FOR ENROLLMENT SOON. PLEASE CHECK THE SCHEDULE OF CLASSES FOR THE LATEST UPDATES.

Sports Data Analytics is a connector course to Data8 and will follow Data8's technical curriculum with specific examples from and applications to analyzing the rich world of sports data. We will primarily use publicly available data from sports such as Major League Baseball and NBA basketball, but students will be encouraged to explore and analyze data from other sports. We will address questions around data acquisition, performance measurement, and real-world uses of analytics. Prerequisites: Concurrent registration in Data8 or knowledge of equivalent material.

Mike Kintslick 2
Digital Humanities

DATA 88

Class #: TBD

TBD 

THIS CLASS IS NOT YET SCHEDULED BUT WILL BE AVAILABLE FOR ENROLLMENT SOON. PLEASE CHECK THE SCHEDULE OF CLASSES FOR THE LATEST UPDATES.

Adam Anderson 2
Economic Models

DATA 88

Class #: TBD

TBD

THIS CLASS IS NOT YET SCHEDULED BUT WILL BE AVAILABLE FOR ENROLLMENT SOON. PLEASE CHECK THE SCHEDULE OF CLASSES FOR THE LATEST UPDATES.

This Data Science connector course will motivate and illustrate key concepts in Economics with examples in Python Jupyter notebooks. The course will give data science students a pathway to apply Python programming and data science concepts within the discipline of economics. The course will also give economics students a pathway to apply programming to reinforce fundamental concepts and to advance the level of study in upper division coursework and possible thesis work.

Eric Van Dusen 2
Statistical Genomics

DATA 88

Class #: TBD

TBD

THIS CLASS IS NOT YET SCHEDULED BUT WILL BE AVAILABLE FOR ENROLLMENT SOON. PLEASE CHECK THE SCHEDULE OF CLASSES FOR THE LATEST UPDATES.

Recent years have witnessed a rapid expansion in the creation and utilization of genetic and genomic data across diverse domains such as business, biological research, and medicine. In this Data 8 connector course we will survey relevant questions of interest and employ the methods frequently relied upon by analysts to derive insights from genetic and genomic data. Topics will include the comparison of DNA sequences, dimension reduction, the characterization of transcriptomes, and genome-wide association studies, among others. In addition to hands-on work with data, we will also consider the history of the genetic and genomic sciences and their intersection with current events, ethics, and modern medicine. Students should exit with an understanding of the central role played by data in the fields and an appreciation for the remaining challenges in light of ever-increasing degrees of personalization of, and access to, these sciences. No biological background is required.

Jonathan Fischer 2
How does History Count?

HISTORY 88

Class #: 33101

M

2-4pm

Cory 105

In this connector course, we will explore how historical data becomes historical evidence and how recent technological advances affect long-established practices, such as close attention to historical context and contingency. Will the advent of fast computing and big data make history “count” more or lead to unprecedented insights into the study of change over time? During our weekly discussions, we will apply what we learn in lectures and labs to the analysis of selected historical sources and get an understanding of constructing historical datasets. We will also consider scholarly debates over quantitative evidence and historical argument.
Jameson Karns 2
Crime and Punishment: Taking the Measure of the US Justice System

LEGAL ST 88

Class #: 30431

Tu

8-10am

Latimer 102

We will explore how data are used in the criminal justice system by exploring the debates surrounding mass incarceration and evaluating a number of different data sources that bear on police practices, incarceration, and criminal justice reform. Students will be required to think critically about the debates regarding criminal justice in the US and to work with various public data sets to assess the extent to which these data confirm or deny specific policy narratives. Building on skills from Foundations of Data Science, students will be required to use basic data management skills working in Python: data cleaning, aggregation, merging and appending data sets, collapsing variables, summarizing findings, and presenting data visualizations. 2
Data Science Applications in Physics

PHYSICS 88

Class #: 23403

M

2-4pm

Cory 277

Introduction to data science with applications to physics. Topics include: statistics and probability in physics, modeling of physical systems and data, numerical integration and differentiation, function approximation. Connector course for Data Science 8, room-shared with Physics 77. Recommended for freshmen intending to major in physics or engineering with emphasis on data science.
2
Data Science for Social Impact

SOCIOL 88

Class #: 31004

Th

10-11am

Evans 4

This course explores the role of social research in policymaking and public decisions and develops skills for the communication of research findings and their implications in writing and through data visualization. Students will develop an understanding of various perspectives on the role that data and data analysts play in policymaking, learn how to write for a public audience about data, results, and implications, and learn how to create effective and engaging data.
David Harding 2
Probability and Mathematical Decisions in Data Science

STAT 88

Class #: 21888

MWF 

2-3pm

VLSB 2050

In this connector course we will state precisely and prove results discovered while exploring data in Data 8. Topics include: probability, conditioning, and independence; random variables; distributions and joint distributions; expectation, variance, tail bounds; Central Limit Theorem; symmetries in random permutations; prior and posterior distributions; probabilistic models; bias-variance tradeoff; testing hypotheses; correlation and the regression model. Adam Lucas 2
Data and Decisions

UGBA 88-1

Class #: 32625 

W

8-10am

Cheit C320

The goal of this connector course is to provide an understanding of how data and statistical analysis can improve managerial decision-making. We will explore statistical methods for gleaning insights from economic and social data, with an emphasis on approaches to identifying causal relationships. We will discuss how to design and analyze randomized experiments and introduce econometric methods for estimating causal effects in non-experimental data. The course draws on a variety of business and social science applications, including advertising, management, online marketplaces, labor markets, and education. This course, in combination with the Data 8 Foundations course, satisfies the statistics prerequisite for admission to Haas. Richard Huntsinger 2
Data and Decisions

UGBA 88-2

Class #: 32626

W

10am-12pm

Cheit C320

The goal of this connector course is to provide an understanding of how data and statistical analysis can improve managerial decision-making. We will explore statistical methods for gleaning insights from economic and social data, with an emphasis on approaches to identifying causal relationships. We will discuss how to design and analyze randomized experiments and introduce econometric methods for estimating causal effects in non-experimental data. The course draws on a variety of business and social science applications, including advertising, management, online marketplaces, labor markets, and education. This course, in combination with the Data 8 Foundations course, satisfies the statistics prerequisite for admission to Haas. Richard Huntsinger 2

Human Contexts & Ethics

Title
Course Number
Times & Location
Description
Instructor
Units
Ethics in Science and Engineering

BIOENG 100

Class #: 32366

TuTh

5-6:30pm

Cory 277

The goal of this semester course is to present the issues of professional conduct in the practice of engineering, research, publication, public and private disclosures, and in managing professional and financial conflicts. The method is through historical didactic presentations, case studies, presentations of methods for problem solving in ethical matters, and classroom debates on contemporary ethical issues. The faculty will be drawn from national experts and faculty from religious studies, journalism, and law from the UC Berkeley campus. 

Dorian Liepmann 3
Introduction to Urban Data Analytics

CYPLAN 101

Class #: 13300

TuTh

11am-12:30pm

Wurster 112

This course (1) provides a basic intro to census and economic data collection, processing, and analysis; (2) surveys forecasting and modeling techniques in planning; (3) demonstrates the uses of real-time urban data and analytics; and (4) provides a socio-economic-political context for the smart cities movement, focusing on data ethics and governance.

Karen Chapple 4
Environmental Health and Development

ESPM C167

Class #: 26122

PBHLTH C160

Class #: 10552

TuTh

9:30-11am

Hearst Field Annex A1

The health effects of environmental alterations caused by development programs and other human activities in both developing and developed areas. Case studies will contextualize methodological information and incorporate a global perspective on environmentally mediated diseases in diverse populations. Topics include water management; population change; toxics; energy development; air pollution; climate change; chemical use, etc.

Rachel A. Morello-Frosch 4
The Social Life of Computing

ISF 100J

Class #: 30345

TuTh

9:30-11am

Wheeler 212

In this class, we will look at computing as a social phenomenon: to see it not just as a technology that transforms but to see it as a technology that has evolved, and is being put to use, in very particular ways, by particular groups of people. We will be doing this by employing a variety of methods, primarily historical and ethnographic, oriented around a study of practices. We will pay attention to technical details but ground these technical details in social organization (a term whose meaning should become clearer and clearer as the class progresses). We will study the social organization of computing around different kinds of hardware, software, ideologies, and ideas.

Shreeharsh Kelkar 4
Human Contexts and Ethics of Data

HISTORY C184D / STS C104

Class #: 24434

MWF

3-4pm

Valley Life Sciences 2050

This course teaches you to use the tools of applied historical thinking and Science, Technology, and Society (STS) to recognize, analyze, and shape the human contexts and ethics of data. It addresses key topics such as doing ethical data science amid shifting definitions of human subjects, consent, and privacy; the changing relationship between data, democracy, and law; the role of data analytics in how corporations and governments provide public goods such as health and security to citizens; sensors, machine learning and artificial intelligence and changing landscapes of labor, industry, and city life.  It prepares you to engage as a knowledgeable and responsible citizen and professional in the varied arenas of our datafied world.

Margarita Boenig-Liptsin, Ari Edmundson 4

Data Enabled Courses

These courses are taught in a way that permits students to build on Data 8. Please review the prerequisites.

Title
Course Number
Times & Locations
Description
Instructor
Units
Astronomy Data Science Laboratory

ASTRON 128

Class #: 24268

M

4-7pm

Campbell Hall 131A

This course features 3 data-centric laboratory experiments that draw on a variety of tools used by professional astronomers. Students will learn to procure and clean data (drawn from a variety of world-class astronomical facilities), assess the fidelity/quality of data, build and apply models to describe data, learn statistical and computational techniques to analyze data (e.g., Bayesian inference, machine learning, parallel computing), and effectively communicate data and scientific results. There is a heavy emphasis on software development in the Python language, statistical techniques, and high-quality communication (e.g., written reports, oral presentations, and data visualization).

Daniel R. Weisz

4

Engineering Data Analysis

CIVENG 93

Class #: 28048

MW

9-10am

Davis 502

Application of the concepts and methods of probability theory and statistical inference to CEE problems and data; graphical data analysis and sampling; elements of set theory; elements of probability theory; random variables and expectation; simulation; statistical inference. Use of computer programming languages for analysis of CEE-related data and problems. The course also introduces the student to various domains of uncertainty analysis in CEE.

Michael Hansen

3

Introduction to Machine Learning

COMPSCI 189

Class #: 28347

MW

6:30-8pm

Wheeler 150

Theoretical foundations, algorithms, methodologies, and applications for machine learning. Topics may include supervised methods for regression and classification (linear models, trees, neural networks, ensemble methods, instance-based methods); generative and discriminative probabilistic models; Bayesian parametric learning; density estimation and clustering; Bayesian networks; time series models; dimensionality reduction; programming projects covering a variety of real-world applications. 

Jonathan Shewchuk

4

Computational Models of Cognition

COGSCI 131

Class #: 22773

TuTh

11am-12:30pm

Haas Faculty Wing F295

This course will provide advanced students in cognitive science and computer science with the skills to develop computational models of human cognition, giving insight into how people solve challenging computational problems, as well as how to bring computers closer to human performance. The course will explore three ways in which researchers have attempted to formalize cognition -- symbolic approaches, neural networks, and probability and statistics -- considering the strengths and weaknesses of each.  Steven Piantadosi

4

Applied Econometrics and Public Policy

ECON/PUBPOL C142

Class #: 22428, 30075

TuTh

5-6:30pm

Moffitt Library 102

This course focuses on the sensible application of econometric methods to empirical problems in economics and public policy analysis. It provides background on issues that arise when analyzing non-experimental social science data and a guide for tools that are useful for empirical research. By the end of the course, students will have an understanding of the types of research designs that can lead to convincing analysis and be comfortable working with large scale data sets. David Card 4
Introductory Applied Econometrics

ENVECON/IAS C118

Class #: 26344, 19977

TTh

9:30-11am

Mulford 159

Formulation of a research hypothesis and definition of an empirical strategy. Regression analysis with cross-sectional and time-series data; econometric methods for the analysis of qualitative information; hypothesis testing. The techniques of statistical and econometric analysis are developed through applications to a set of case studies and real data in the fields of environmental, resource, and international development economics. Students learn to use statistical software for economic data analysis. Sofia Villas-Boas 4
Terrestrial Hydrology

GEOG C136/ CIVENG C103N/ ESPM C130

Class #: 23730, 31100, 31095

TuTh

2-3:30pm

Wheeler 102

A quantitative introduction to the hydrology of the terrestrial environment including lower atmosphere, watersheds, lakes, and streams. All aspects of the hydrologic cycle, including precipitation, infiltration, evapotranspiration, overland flow, streamflow, and groundwater flow. Chemistry and dating of groundwater and surface water. Development of quantitative insights through problem solving and use of simple models. This course requires one field experiment and several group computer lab assignments. Paolo D'Odorico 4
Applied Data Science with Venture Applications

IND ENG 135

Class #: 28923

Tu

3:30-6:30pm

Evans 60

This highly applied course surveys a variety of key concepts and tools that are useful for designing and building applications that process data signals of information. The course introduces modern open source computer programming tools, libraries, and code samples that can be used to implement data applications. The mathematical concepts highlighted in this course include filtering, prediction, classification, decision-making, Markov chains, LTI systems, spectral analysis, and frameworks for learning from data. Each math concept is linked to an implementation in Python using libraries for math array functions (NumPy), manipulation of tables (Pandas), long term storage (SQL, JSON, CSV files), natural language (NLTK), and ML frameworks. Ikhlaq Sidhu 3
Data, Prediction and Law

LEGAL ST 123

Class #: 24069

MW

12-2pm

Barrows 110

Data, Prediction and Law allows students to explore different data sources that scholars and government officials use to make generalizations and predictions in the realm of law. The course will also introduce critiques of predictive techniques in law. Students will apply the statistical and Python programming skills from Foundations of Data Science to examine a traditional social science dataset, “big data” related to law, and legal text data. Jonathan Marshall 4
Introduction to Computational Techniques in Physics

PHYSICS 77

Class #: 21289

M

2-4pm

Cory 277

Introductory scientific programming in Python with examples from physics. Topics include: visualization, statistics and probability, regression, numerical integration, simulation, data modeling, function approximation, and algebraic systems. Recommended for freshman physics majors. 3
Research and Data Analysis in Psychology

PSYCH 101

Class #: 21528

F

3-6pm

Lewis 100

The course will concentrate on hypothesis formulation and testing, tests of significance, analysis of variance (one-way analysis), simple correlation, simple regression, and nonparametric statistics such as chi-square and Mann-Whitney U tests. Majors intending to be in the honors program must complete 101 by the end of their junior year.  Arman Daniel Catterson 4
Concepts in Computing with Data

STAT 133

Class #: 21894

MWF

9-10pm

Valley Life Sciences 2050

An introduction to computationally intensive applied statistics. Topics will include organization and use of databases, visualization and graphics, statistical learning and data mining, model validation procedures, and the presentation of results.  Gaston Sanchez Trujillo 3
Modern Statistical Prediction and Machine Learning

STAT 154

Class #: 21932

TuTh

12:30-2pm

Hearst Mining 390

Theory and practice of statistical prediction. Contemporary methods as extensions of classical methods. Topics: optimal prediction rules, the curse of dimensionality, empirical risk, linear regression and classification, basis expansions, regularization, splines, the bootstrap, model selection, classification and regression trees, boosting, support vector machines. Computational efficiency versus predictive performance. Emphasis on experience with real data and assessing statistical assumptions.  Gaston Sanchez Trujillo 4