Backbone Courses
Title |
Course Number |
Times & Locations |
Description |
Instructor |
Units |
---|---|---|---|---|---|
Foundations of Data Science (Data 8) |
STAT/COMPSCI C8 Class #: 32888 |
MWF 11-12pm Wheeler 150 |
Foundations of data science from three perspectives: inferential thinking, computational thinking, and real-world relevance. Given data arising from some real-world phenomenon, how does one analyze that data so as to understand that phenomenon? The course teaches critical concepts and skills in computer programming and statistical inference, in conjunction with hands-on analysis of real-world datasets, including economic data, document collections, geographical data, and social networks. It delves into social and legal issues surrounding data analysis, including issues of privacy and data ownership. |
Ramesh Sridharan |
4 |
Principles & Techniques of Data Science (Data 100) |
STAT/COMPSCI C100 Class #: 24961 |
TTh 9:30-11am Wheeler 150 |
In this course, students will explore the data science lifecycle, including question formulation, data collection and cleaning, exploratory data analysis and visualization, statistical inference and prediction, and decision-making. This class will focus on quantitative critical thinking and key principles and techniques needed to carry out this cycle. These include languages for transforming, querying and analyzing data; algorithms for machine learning methods including regression, classification and clustering; principles behind creating informative data visualizations; statistical concepts of measurement error and prediction; and techniques for scalable data processing. |
Joshua A. Hug |
4 |
Data, Inference, and Decisions (Data 102) |
STAT 102 Class #: 33141 |
TTh 2-3:30pm Soda 306 |
This course develops the probabilistic foundations of inference in data science, and builds a comprehensive view of the modeling and decision-making life cycle in data science including its human, social, and ethical implications. Topics include: frequentist and Bayesian decision-making, permutation testing, false discovery rate, probabilistic interpretations of models, Bayesian hierarchical models, basics of experimental design, confidence intervals, causal inference, Thompson sampling, optimal control, Q-learning, differential privacy, clustering algorithms, recommendation systems and an introduction to machine learning tools including decision trees, neural networks and ensemble methods. |
Fernando Perez, Michael Jordan |
4 |
Probability for Data Science |
STAT 140 Class #: 26032 |
TTh 6:30-8pm Valley Life Sciences 2050 |
An introduction to probability, emphasizing the combined use of mathematics and programming to solve problems. Random variables, discrete and continuous families of distributions. Bounds and approximations. Dependence, conditioning, Bayes methods. Convergence, Markov chains. Least squares prediction. Random permutations, symmetry, order statistics. Use of numerical computation, graphics, simulation, and computer algebra. |
Ani Adhikari |
4 |
Connectors
Title |
Course Number |
Times & Locations |
Description |
Instructor |
Units |
---|---|---|---|---|---|
Computational Structures in Data Science |
COMPSCI 88 Class #: 28629 |
M 2-3pm Stanley 105 |
Development of Computer Science topics appearing in Foundations of Data Science (C8); expands computational concepts and techniques of abstraction. Understanding the structures that underlie the programs, algorithms, and languages used in data science and elsewhere. Mastery of a particular programming language while studying general techniques for managing program complexity, e.g., functional, object-oriented, and declarative programming. Provides practical experience with composing larger systems through several significant programming projects. |
|
2 |
Economic Models |
DATA 88-1 Class #: 33469 |
T 2-4pm 105 Cory |
This Data Science connector course will motivate and illustrate key concepts in Economics with examples in Python Jupyter notebooks. The course will give data science students a pathway to apply Python programming and data science concepts within the discipline of economics. The course will also give economics students a pathway to apply programming to reinforce fundamental concepts and to advance the level of study in upper division coursework and possible thesis work. |
Eric Van Dusen |
2 |
Data Science in Genetics and Genomics |
DATA 88-2 Class #: 33483 |
T 12-2pm 105 Cory |
Recent years have witnessed a rapid expansion in the creation and utilization of genetic and genomic data across diverse domains such as business, biological research, and medicine. In this Data 8 connector course we will survey relevant questions of interest and employ the methods frequently relied upon by analysts to derive insights from genetic and genomic data. Topics will include the comparison of DNA sequences, dimension reduction, the characterization of transcriptomes, and genome-wide association studies, among others. In addition to hands-on work with data, we will also consider the history of the genetic and genomic sciences and their intersection with current events, ethics, and modern medicine. Students should exit with an understanding of the central role played by data in the fields and an appreciation for the remaining challenges in light of ever-increasing degrees of personalization of, and access to, these sciences. No biological background is required. |
Jonathan Fischer |
2 |
Immigration: What do the data tell us? |
DEMOG 88 Class #: 25347 |
M 2-4pm 2232 Piedmont 100 |
This course will cover the small but important part of the rich history of human migration that deals with the population of the United States -- focusing on the 20th and 21st centuries. We will use the tools of DS8 to answer specific questions that relate to the themes |
|
2 |
PyEarth: A Python Introduction to Earth Science |
EPS 88 Class #: 26406 |
T 12-2pm McCone 325 |
Earthquakes and El Niños are examples of natural hazards in California. The course uses Python/Jupyter Notebook and real-world observations to introduce students to these and other Earth phenomena and their underlying physics. Students will learn how to access and visualize the data, extract signals, and make probability forecasts. The final module is a project that synthesizes the course material to make a probabilistic forecast. The course will be co-taught by a team of EPS faculty, and the focus of each semester will depend on the expertise of the faculty in charge. |
Nicolas Swanson-Hysell |
2 |
Data Science Applications in Physics |
PHYSICS 88 Class #: 26399 |
M 2-4pm Barrows 166 |
Introduction to data science with applications to physics. Topics include: statistics and probability in physics, modeling of physical systems and data, numerical integration and differentiation, function approximation. Connector course for Data Science 8, room-shared with Physics 77. Recommended for freshmen intending to major in physics or engineering with emphasis on data science. |
|
2 |
Data and Decisions |
UGBA 88-1 Class #: 33263 |
M 2-4pm Cheit C210 |
The goal of this connector course is to provide an understanding of how data and statistical analysis can improve managerial decision-making. We will explore statistical methods for gleaning insights from economic and social data, with an emphasis on approaches to identifying causal relationships. We will discuss how to design and analyze randomized experiments and introduce econometric methods for estimating causal effects in non-experimental data. The course draws on a variety of business and social science applications, including advertising, management, online marketplaces, labor markets, and education. This course, in combination with the Data 8 Foundations course, satisfies the statistics prerequisite for admission to Haas. |
Conrad Miller |
2 |
Data and Decisions |
UGBA 88-2 Class #: 33264 |
M 4-6pm Cheit C210 |
The goal of this connector course is to provide an understanding of how data and statistical analysis can improve managerial decision-making. We will explore statistical methods for gleaning insights from economic and social data, with an emphasis on approaches to identifying causal relationships. We will discuss how to design and analyze randomized experiments and introduce econometric methods for estimating causal effects in non-experimental data. The course draws on a variety of business and social science applications, including advertising, management, online marketplaces, labor markets, and education. This course, in combination with the Data 8 Foundations course, satisfies the statistics prerequisite for admission to Haas. |
Conrad Miller |
2 |
How does History Count? |
HIST 88 Class #: 26125 |
T 4-4:30pm Cory 105 |
In this connector course, we will explore how historical data becomes historical evidence and how recent technological advances affect long-established practices, such as close attention to historical context and contingency. Will the advent of fast computing and big data make history “count” more or lead to unprecedented insights into the study of change over time? During our weekly discussions, we will apply what we learn in lectures and labs to the analysis of selected historical sources and get an understanding of constructing historical datasets. We will also consider scholarly debates over quantitative evidence and historical argument. |
|
2 |
Human Contexts & Ethics
Title |
Course Number |
Times & Locations |
Description |
Instructor |
Units |
---|---|---|---|---|---|
Ethics in Science and Engineering |
BIOENG 100 Class #: 27356 |
MWF 12-1pm Evans 10 |
The goal of this semester course is to present the issues of professional conduct in the practice of engineering, research, publication, public and private disclosures, and in managing professional and financial conflicts. The method is through historical didactic presentations, case studies, presentations of methods for problem solving in ethical matters, and classroom debates on contemporary ethical issues. The faculty will be drawn from national experts and faculty from religious studies, journalism, and law from the UC Berkeley campus. |
Dorian Liepmann |
3 |
Human Contexts and Ethics of Data |
HISTORY C184D / STS C104 Class #: 31650 |
MWF 4-5pm Li Ka Shing 245 |
This course teaches you to use the tools of applied historical thinking and Science, Technology, and Society (STS) to recognize, analyze, and shape the human contexts and ethics of data. It addresses key topics such as doing ethical data science amid shifting definitions of human subjects, consent, and privacy; the changing relationship between data, democracy, and law; the role of data analytics in how corporations and governments provide public goods such as health and security to citizens; sensors, machine learning and artificial intelligence and changing landscapes of labor, industry, and city life. It prepares you to engage as a knowledgeable and responsible citizen and professional in the varied arenas of our datafied world. |
Cathryn Carson |
4 |
Behind the Data: Humans and Values |
INFO 188 Class #: 29335 |
TTh 12:30-2pm Etcheverry 3108 |
This course blends social and historical perspectives on data with ethics, law, policy, and case examples to help students understand current ethical and legal issues in data science and machine learning. Legal, ethical, and policy-related concepts addressed include: research ethics; privacy and surveillance; bias and discrimination; and oversight and accountability. These issues will be addressed throughout the lifecycle of data--from collection to storage to analysis and application. The course emphasizes strategies, processes, and tools for attending to ethical and legal issues in data science work. Course assignments emphasize researcher and practitioner reflexivity, allowing students to explore their own social and ethical commitments. |
Deirdre Kathleen Mulligan |
3 |
Data Enabled Courses
These courses are taught in a way that permits students to build on Data 8. Please review the prerequisites.
Title |
Course Number |
Times & Locations |
Description |
Instructor |
Units |
---|---|---|---|---|---|
Engineering Data Analysis |
CIVENG 93 Class #: 27400 |
MW 9-10am Etcheverry 3106 |
Application of the concepts and methods of probability theory and statistical inference to CEE problems and data; graphical data analysis and sampling; elements of set theory; elements of probability theory; random variables and expectation; simulation; statistical inference. Use of computer programming languages for analysis of CEE-related data and problems. The course also introduces the student to various domains of uncertainty analysis in CEE. |
Joan L. Walker |
3 |
Introduction to Machine Learning |
COMPSCI 189 Class #: 27462 |
TTh 5-6:30pm Pimentel 1 |
Theoretical foundations, algorithms, methodologies, and applications for machine learning. Topics may include supervised methods for regression and classification (linear models, trees, neural networks, ensemble methods, instance-based methods); generative and discriminative probabilistic models; Bayesian parametric learning; density estimation and clustering; Bayesian networks; time series models; dimensionality reduction; programming projects covering a variety of real-world applications. |
Jennifer Listgarten, Stella Xingxing Yu |
4 |
Computational Models of Cognition |
COGSCI 131 Class #: 32577 |
MWF 6-7pm Latimer 120 |
This course will provide advanced students in cognitive science and computer science with the skills to develop computational models of human cognition, giving insight into how people solve challenging computational problems, as well as how to bring computers closer to human performance. The course will explore three ways in which researchers have attempted to formalize cognition -- symbolic approaches, neural networks, and probability and statistics -- considering the strengths and weaknesses of each. |
|
4 |
Introduction to Population Analysis |
DEMOG 110 Class #: 21585 |
TTh 3:30-5pm Barrows 20 |
Measures and methods of Demography. Life tables, fertility and nuptiality measures, age pyramids, population projection, measures of fertility control. |
|
3 |
Data, Environment and Society |
ENERES 131 Class #: 33105 |
TTh 9:30-11am Wheeler 202 |
Critical, data-driven analysis of specific issues or general problems of how people interact with environmental and resource systems. This course will teach students to build, estimate and interpret models that describe phenomena in the broad area of energy and environmental decision-making. More than one section may be given each semester on different topics depending on faculty and student interest. |
Duncan Calloway |
4 |
Basic Modeling and Simulation Tools for Industrial Research Applications |
ENGIN 150 Class #: 31486 |
TTh 2-3:30pm Jacobs Hall 310 |
The course emphasizes elementary modeling, numerical methods, and their implementation on physical problems motivated by phenomena that students are likely to encounter in their careers, involving biomechanics, heat transfer, structural analysis, control theory, fluid flow, electrical conduction, diffusion, etc. This will help students develop intuition about modeling physical systems and about the strengths and weaknesses of a variety of modeling and numerical methods, including: discretization of differential equations, methods for solving nonlinear systems, gradient-based methods and machine learning algorithms for optimization, stats & quantification |
Tarek Zohdi |
3 |
Introductory Applied Econometrics |
ENVECON/IAS C118 Class #: 26748 |
TTh 9:30-11am Mulford 159 |
Formulation of a research hypothesis and definition of an empirical strategy. Regression analysis with cross-sectional and time-series data; econometric methods for the analysis of qualitative information; hypothesis testing. The techniques of statistical and econometric analysis are developed through applications to a set of case studies and real data in the fields of environmental, resource, and international development economics. Students learn the use of statistical software for economic data analysis. |
Jeremy R. Magruder |
4 |
Data Science in Global Change Ecology |
ESPM 157 Class #: 27205 |
WF 10-12pm Barrows 110 |
Many of the greatest challenges we face today come from understanding and interacting with the natural world: from global climate change to the sudden collapse of fisheries and forests, from the spread of disease and invasive species to the unknown wealth of medical, cultural, and technological value we derive from nature. Advances in satellites and micro-sensors, computation, informatics and the Internet have made available unprecedented amounts of data about the natural world, and with it, new challenges of sifting, processing and synthesizing large and diverse sources of information. In this course, students will learn and apply fundamental computing, statistics and modeling concepts to a series of real-world ecological and environment |
Carl Boettiger |
4 |
Introduction to Machine Learning and Data Analytics |
INDENG 142 Class #: 28312 |
TTh 3:30-5pm Morgan 101 |
This course introduces students to key techniques in machine learning and data analytics through a diverse set of examples using real datasets from domains such as e-commerce, healthcare, social media, sports, the Internet, and more. Through these examples, exercises in R, and a comprehensive team project, students will gain experience understanding and applying techniques such as linear regression, logistic regression, classification and regression trees, random forests, boosting, text mining, data cleaning and manipulation, data visualization, network analysis, time series modeling, clustering, principal component analysis, regularization, and large-scale learning. |
Paul Grigas |
3 |
Introduction to Data Visualization |
INFO 190-1 Class #: 19471 |
MW 10:30-12pm W 12-1pm South Hall 202 |
This course introduces students to data visualization: the use of the visual channel for gaining insight with data, exploring data, and as a way to communicate insights, observations, and results with other people. |
Marti A. Hearst |
4 |
Artificial Intelligence in Medicine and Health Policy |
PBHLTH 196-8 Class #: 33297 |
W 3-6pm Hearst Field Annex B5 |
Over the coming decades, data and algorithms will transform medicine and our health care system. Whether you plan to be a doctor, an algorithm developer, or work elsewhere in the health sector, this course will help you understand the tremendous upside of artificial intelligence for health: what the tools of machine learning can do in this important sector, and where they can do harm. The course will focus on teaching concepts, not the mechanics of specific algorithms. But genuine conceptual understanding will require engagement with technical content (e.g., readings from computer science and statistics, problem sets requiring analysis of real datasets with statistical software). As a result, it is designed for students who are already comfortable with basic data analysis, thanks to coursework in data science/computer science, biostatistics/statistics, or economics (e.g., you should already know how to load and manipulate datasets in statistical software). |
Ziad Obermeyer |
1-4 |
Introduction to Computational Techniques in Physics |
PHYSICS 77 Class #: 23245 |
M 2-4pm Barrows 166 |
Introductory scientific programming in Python with examples from physics. Topics include: visualization, statistics and probability, regression, numerical integration, simulation, data modeling, function approximation, and algebraic systems. Recommended for freshman physics majors. |
|
3 |
Research and Data Analysis in Psychology |
PSYCH 101 Class #: 23558 |
W 4-7pm Evans 10 |
The course will concentrate on hypothesis formulation and testing, tests of significance, analysis of variance (one-way analysis), simple correlation, simple regression, and nonparametric statistics such as chi-square and Mann-Whitney U tests. Majors intending to be in the honors program must complete 101 by the end of their junior year. |
|
4 |
Data Science for Research Psychology |
PSYCH 101D Class #: 33033 |
MW 5-6:30pm Mulford 240 |
This Python-based course builds upon the inferential and computational thinking skills developed in the Foundations of Data Science course by tying them to the classical statistical and research approaches used in Psychology. Topics include experimental design, control variables, reproducibility in science, probability distributions, parametric vs. non-parametric statistics, hypothesis tests (t-tests, one- and two-way ANOVA, chi-squared and odds-ratio), linear regression and correlation. |
|
4 |
Concepts in Computing with Data |
STAT 133 Class #: 23999 |
MWF 9-10pm Li Ka Shing 245 |
An introduction to computationally intensive applied statistics. Topics will include organization and use of databases, visualization and graphics, statistical learning and data mining, model validation procedures, and the presentation of results. |
Gaston Sanchez |
3 |
Modern Statistical Prediction and Machine Learning |
STAT 154 Class #: 23882 |
TTh 3:30-5pm Hearst Mining 390 |
Theory and practice of statistical prediction. Contemporary methods as extensions of classical methods. Topics: optimal prediction rules, the curse of dimensionality, empirical risk, linear regression and classification, basis expansions, regularization, splines, the bootstrap, model selection, classification and regression trees, boosting, support vector machines. Computational efficiency versus predictive performance. Emphasis on experience with real data and assessing statistical assumptions. |
Gaston Sanchez |
4 |