Spring 2019 Courses
Backbone
Title 
Course Number 
Times & Locations 
Description 
Instructor 
Units 
Foundations of Data Science (Data 8) 
STAT/COMPSCI C8 CCN: 24996 
MWF 10-11am Wheeler 150 
Foundations of data science from three perspectives: inferential thinking, computational thinking, and real-world relevance. Given data arising from some real-world phenomenon, how does one analyze that data so as to understand that phenomenon? The course teaches critical concepts and skills in computer programming and statistical inference, in conjunction with hands-on analysis of real-world datasets, including economic data, document collections, geographical data, and social networks. It delves into social and legal issues surrounding data analysis, including issues of privacy and data ownership.  William Fithian & Ani Adhikari  4 
Principles & Techniques of Data Science (Data 100) 
STAT/COMPSCI C100 CCN: 28558 
TuTh 6:30-8pm Wheeler 150 
In this course, students will explore the data science lifecycle, including question formulation, data collection and cleaning, exploratory data analysis and visualization, statistical inference and prediction, and decision-making. This class will focus on quantitative critical thinking and key principles and techniques needed to carry out this cycle. These include languages for transforming, querying and analyzing data; algorithms for machine learning methods including regression, classification and clustering; principles behind creating informative data visualizations; statistical concepts of measurement error and prediction; and techniques for scalable data processing.  John S. DeNero & Sandrine Dudoit  4 
Probability for Data Science 
STAT 140 CCN: 25622 
TuTh 5-6:30pm Valley Life Sciences 2050 
An introduction to probability, emphasizing the combined use of mathematics and programming to solve problems. Random variables, discrete and continuous families of distributions. Bounds and approximations. Dependence, conditioning, Bayes methods. Convergence, Markov chains. Least squares prediction. Random permutations, symmetry, order statistics. Use of numerical computation, graphics, simulation, and computer algebra.  Ani Adhikari  4 
Connectors
Title 
Course Number 
Times & Location 
Description 
Instructor 
Units 
Data Science and the Mind 
COGSCI 88 CCN: TBD 
TBD  How does the human mind work? We explore this question by analyzing a range of data concerning such topics as human rationality and irrationality, human memory, how objects and events are represented in the mind, and the relation of language and cognition. This class provides young scientists with critical thinking and computing skills that will allow them to work with data in cognitive science and related disciplines.  Dmetri Hayes  2 
Reproducibility and Open Science 
L&S 88 CCN: TBD 
TBD  The purpose of this course is to introduce students to the issues of scientific reproducibility and highlight some recent examples in the scientific literature. The class will present and cover different tools and concepts used in open science. In this course, Data Science students across domains will be introduced to concepts in scientific reproducibility and transparency through readings, case studies, and hands-on lab activities. The course is intended as a survey of topics for students who have taken or are currently enrolled in Data 8. Students in the course will learn about the principles of doing open science, be able to create their own reproducible workflows, and gain an introduction to computational tools for reproducible data analysis.  TBD  2 
Data Science Applications in Physics 
PHYSICS 88 CCN: 27035 
M 2-4 Evans 60  Introduction to data science with applications to physics. Topics include: statistics and probability in physics, modeling of physical systems and data, numerical integration and differentiation, and function approximation. Connector course for Data Science 8, room-shared with Physics 77. Recommended for freshmen intending to major in physics or engineering with emphasis on data science.  2  
Probability and Mathematical Statistics in Data Science 
STAT 88 CCN: 24717 
WF 12-1 Birge 50  In this connector course we will state precisely and prove results discovered in the foundational data science course through working with data. Topics include: total variation distance between discrete distributions; the mean, standard deviation, and tail bounds; correlation, and the derivation of the regression equation; probabilities, random variables, and the Central Limit Theorem; probabilistic models; symmetries in random permutations; prior and posterior distributions, and Bayes' rule.  Fletcher H Ibser  2 
Linear Algebra for Data Science 
STAT 89A CCN: 24718 
TuTh 12:30-2 Evans 60  This connector will cover introductory topics in the mathematics of data science, focusing on discrete probability and linear algebra and the connections between them that are useful in modern theory and practice. We will focus on matrices and graphs as popular mathematical structures with which to model data. For example, they serve as models for term-document corpora, high-dimensional regression problems, ranking/classification of web data, adjacency properties of social network data, etc. 
4 

Data & Decisions 
UGBA 96-4 CCN: 17360 
M 2-4  The objective of the course is to provide an understanding of how data and statistical analysis can improve managerial decision-making. Students learn how to ask the right questions, find or collect relevant data, and apply appropriate statistical methods to solve problems and make better business decisions. We will explore statistical methods for gleaning insights from economic and social data, with an emphasis on approaches to identifying causal relationships. We will discuss how to design and analyze randomized experiments and introduce econometric methods for estimating causal effects in nonexperimental data. This course, in combination with the Foundations course, satisfies the statistics prerequisite for admission to Haas.  Conrad Miller 
2 
Data & Decisions 
UGBA 96-5 CCN: 17361 
M 4-6  (see above)  Conrad Miller  2 
Human Contexts & Ethics
Title 
Course Number 
Times & Location 
Description 
Instructor 
Units 
Introduction to Urban Data Analytics* 
CYPLAN 101 CCN: 32303 
MW 12:30-2 Wurster 112

This course (1) provides a basic intro to census and economic data collection, processing, and analysis; (2) surveys forecasting and modeling techniques in planning; (3) demonstrates the uses of real-time urban data and analytics; and (4) provides a socio-economic-political context for the smart cities movement, focusing on data ethics and governance. *This course is approved for the Human Contexts & Ethics requirement for students completing certain approved Domain Emphases only. Please see the Domain Emphasis page for more information. 
Karen Chapple 
4 
Environmental Health and Development* 
ESPM C167/PBHLTH C160 CCN: 27274, 29594 
TuTh 9:30am-11am
Genetics & Plant Bio 100

The health effects of environmental alterations caused by development programs and other human activities in both developing and developed areas. Case studies will contextualize methodological information and incorporate a global perspective on environmentally mediated diseases in diverse populations. Topics include water management; population change; toxics; energy development; air pollution; climate change; chemical use, etc. *This course is approved for the Human Contexts & Ethics requirement for students completing certain approved Domain Emphases only. Please see the Domain Emphasis page for more information. 
Rachel A. Morello-Frosch & Simona A Yibalan 
4 
Introduction to Science, Technology, and Society 
HISTORY C182C CCN: 26861 
MWF 12-1pm Dwinelle 145

This course explores how data science is entangled with diverse human contexts (histories, institutions, and material bases) and ethics (domains of value-laden choice). We will bring historically grounded perspectives as well as frameworks and methods from Science, Technology, and Society (STS) (such as cross-national comparison, co-production, and controversy studies) to bear on topics that include: doing ethical data science amid shifting definitions of human subjects, consent, and privacy; the changing relationship between data, democracy, and law; the role of data analytics in how corporations and governments provide public goods such as health and security to citizens; sensors, machine learning and artificial intelligence and changing landscapes of labor, industry, and city life; and the implications of data for how publics and varied scientific disciplines know the world. 
Cathryn Carson  4 
Moral Questions of Data Science 
PHILOS 121 CCN: 31934 
TuTh 8-9:30am Barrows 56

This course explores, from a philosophical perspective, ethical questions arising from collecting, drawing inferences from, and acting on data, especially when these activities are automated and on a large scale. Topics include: bias, fairness, discrimination, interpretability, privacy, paternalism, freedom of speech, and democracy. 
Nicholas G Kolodny  4 
Data Enabled Courses
These courses are taught in a way that permits students to build on Data 8. Please review the prerequisites.
Title  Course Number  Times & Locations  Description  Instructor  Units 
Astronomy Data Science Laboratory 
ASTRON 128 CCN: 32227 
M 4-7pm Campbell 131 
This course features 3 data-centric laboratory experiments that draw on a variety of tools used by professional astronomers. Students will learn to procure and clean data (drawn from a variety of world-class astronomical facilities), assess the fidelity/quality of data, build and apply models to describe data, learn statistical and computational techniques to analyze data (e.g., Bayesian inference, machine learning, parallel computing), and effectively communicate data and scientific results. There is a heavy emphasis on software development in the Python language, statistical techniques, and high-quality communication (e.g., written reports, oral presentations, and data visualization).  Daniel R. Weisz  4 
Engineering Data Analysis 
CIVENG 93 CCN: 27748 
MW 9-10 Davis 502 
Application of the concepts and methods of probability theory and statistical inference to CEE problems and data; graphical data analysis and sampling; elements of set theory; elements of probability theory; random variables and expectation; simulation; statistical inference. Use of computer programming languages for analysis of CEE-related data and problems. The course also introduces the student to various domains of uncertainty analysis in CEE.  Mark Hansen  3 
Computational Models of Cognition 
COGSCI 131 CCN: 25868 
TuTh 11-12:30 Haas F295 
This course will provide advanced students in cognitive science and computer science with the skills to develop computational models of human cognition, giving insight into how people solve challenging computational problems, as well as how to bring computers closer to human performance. The course will explore three ways in which researchers have attempted to formalize cognition (symbolic approaches, neural networks, and probability and statistics), considering the strengths and weaknesses of each.  Steven T. Piantadosi  4 
Sensemaking and Organizing 
COGSCI 190 CCN: 19849 
TuTh 12:30-2 Wheeler 104 
Sensemaking and Organizing: When something "makes sense" or "is organized" we are imposing or discovering order in the arrangement of concepts, events, or resources of some kind. Sensemaking and organizing are fundamental human activities that raise many multi- or transdisciplinary questions about perception, knowledge, decision making, interaction with things and with other people, values and value creation. We can analyze sensemaking and organizing from four interrelated perspectives. The most fundamental one is provided by language and culture, which shapes the perspectives one takes as an individual, in institutional contexts governed by business or legal processes, or in data-intensive or scientific contexts. CogSci 1 required.  Robert J. Glushko  3 
Introduction to Machine Learning 
COMPSCI 189 CCN: 28076 
MW 6:30-8pm Wheeler 150 
Theoretical foundations, algorithms, methodologies, and applications for machine learning. Topics may include supervised methods for regression and classification (linear models, trees, neural networks, ensemble methods, instance-based methods); generative and discriminative probabilistic models; Bayesian parametric learning; density estimation and clustering; Bayesian networks; time series models; dimensionality reduction; programming projects covering a variety of real-world applications.  Jonathan Shewchuk  4 
Social Networks 
DEMOG 180 CCN: 26510 
TuTh 9:30-11am McCone 141 
The science of social networks focuses on measuring, modeling, and understanding the different ways that people are connected to one another. We will use a broad toolkit of theories and methods drawn from the social, natural, and mathematical sciences to learn what a social network is, to understand how to work with social network data, and to illustrate some of the ways that social networks can be useful in theory and in practice. We will see that network ideas are powerful enough to be used everywhere from UNAIDS, where network models help epidemiologists prevent the spread of HIV, to Silicon Valley, where data scientists use network ideas to build products that enable people all across the globe to connect with one another.  Dennis Feehan  4 
Applied Econometrics and Public Policy 
ECON C142/POL SCI C131A/PUB POL C142 CCN: 25380 
TuTh 5-6:30 Moffitt Library 102 
This course focuses on the sensible application of econometric methods to empirical problems in economics and public policy analysis. It provides background on issues that arise when analyzing nonexperimental social science data and a guide for tools that are useful for empirical research. By the end of the course, students will have an understanding of the types of research designs that can lead to convincing analysis and be comfortable working with large-scale data sets.  David Card  4 
Introductory Applied Econometrics 
ENVECON/IAS C118 CCN: 27500 
TuTh 9:30-11am VLSB 2060 
Formulation of a research hypothesis and definition of an empirical strategy. Regression analysis with cross-sectional and time-series data; econometric methods for the analysis of qualitative information; hypothesis testing. The techniques of statistical and econometric analysis are developed through applications to a set of case studies and real data in the fields of environmental, resource, and international development economics. Students learn the use of statistical software for economic data analysis.  Sofia B. Villas-Boas  4 
Terrestrial Hydrology 
GEOG C136 CCN: 30492 
TuTh 9:30-11am Evans 9 
A quantitative introduction to the hydrology of the terrestrial environment including lower atmosphere, watersheds, lakes, and streams. All aspects of the hydrologic cycle, including precipitation, infiltration, evapotranspiration, overland flow, streamflow, and groundwater flow. Chemistry and dating of groundwater and surface water. Development of quantitative insights through problem solving and use of simple models. This course requires one field experiment and several group computer lab assignments.  Laurel Larsen  4 
Applied Data Science with Venture Applications 
INDENG 135 CCN: 28885 
TuTh 2-3:30 Dwinelle 145 
This highly applied course surveys a variety of key concepts and tools that are useful for designing and building applications that process data signals of information. The course introduces modern open-source computer programming tools, libraries, and code samples that can be used to implement data applications. The mathematical concepts highlighted in this course include filtering, prediction, classification, decision-making, Markov chains, LTI systems, spectral analysis, and frameworks for learning from data. Each math concept is linked to implementation in Python using libraries for math array functions (NumPy), manipulation of tables (Pandas), long term storage (SQL, JSON, CSV files), natural language (NLTK), and ML frameworks.  Ikhlaq Sidhu, Alexander Fred-Ojala  3 
Data, Prediction & Law 
LEGALST 123 CCN: 31757 
MW 10-12 Barrows 110 
Data, Prediction, and Law is a new Legal Studies seminar that allows students to explore different data sources that scholars and government officials use to make generalizations and predictions in the realm of law. The course will also introduce critiques of predictive techniques in law. Students will apply the statistical and Python programming skills from Foundations of Data Science to examine a traditional social science dataset, “big data” related to law, and legal text data.  Jonathan D. Marshall  4 
Introduction to Computational Techniques in Physics 
PHYSICS 77 CCN: 24060 
M 2-4 Evans 60 
Introductory scientific programming in Python with examples from physics. Topics include: visualization, statistics and probability, regression, numerical integration, simulation, data modeling, function approximation, and algebraic systems. Recommended for freshman physics majors.  3  
Research and Data Analysis in Psychology 
PSYCH 101 CCN: 24314 
TuTh 2-4 Lewis 100 
The course will concentrate on hypothesis formulation and testing, tests of significance, analysis of variance (one-way analysis), simple correlation, simple regression, and nonparametric statistics such as chi-square and Mann-Whitney U tests. Majors intending to be in the honors program must complete 101 by the end of their junior year.  Christopher J. Gade  4 
Concepts in Computing with Data 
STAT 133 CCN: 24723 
TuTh 2-4 Lewis 100 
The course will concentrate on hypothesis formulation and testing, tests of significance, analysis of variance (one-way analysis), simple correlation, simple regression, and nonparametric statistics such as chi-square and Mann-Whitney U tests. Majors intending to be in the honors program must complete 101 by the end of their junior year.  Gaston Sanchez Trujillo  3 
Modern Statistical Prediction and Machine Learning 
STAT 154 CCN: 24765 
TuTh 8-9:30am VLSB 2040 
Theory and practice of statistical prediction. Contemporary methods as extensions of classical methods. Topics: optimal prediction rules, the curse of dimensionality, empirical risk, linear regression and classification, basis expansions, regularization, splines, the bootstrap, model selection, classification and regression trees, boosting, support vector machines. Computational efficiency versus predictive performance. Emphasis on experience with real data and assessing statistical assumptions.  Bin Yu  4 