
This introductory course in data science is built on three interrelated perspectives: inferential thinking, computational thinking, and real-world relevance. Given data arising from some real-world phenomenon, how does one analyze that data so as to understand that phenomenon? How does one collect data to answer questions that one is interested in?

Inferential thinking refers to the ability to connect data to underlying phenomena and to think critically about the conclusions drawn from data analysis. Computational thinking refers to the ability to conceive of the abstractions and processes that allow inferential procedures to be embodied in computer programs, and to ensure that such programs are scalable, robust, and understandable.

In addition to teaching critical concepts and skills in computer programming and statistical inference, the course involves hands-on analysis of a variety of real-world datasets, including economic data, document collections, geographical data, and social networks. It also delves into social and legal issues surrounding data analysis, including privacy and data ownership.

Connector courses in various departments are paired with this foundations course. They typically either apply the concepts and skills of the Foundations of Data Science course to a particular domain of interest or further develop its mathematical, computational, or other formal foundations.

The labs, homework, and projects are interwoven with the lectures. Computational concepts and skills are integrated with statistical concepts in the context of working with real data and using visualizations and other summaries to report on the results of data analyses.

The labs provide immediate feedback, supporting students’ repeated efforts to master a skill or concept. The three projects bring computational thinking and inferential thinking together to tackle a larger question using real data. As each project is introduced, an “explorations” lecture is folded into the main lecture stream to examine contextual issues such as privacy, ownership, and bias in data, deepening students’ critical thinking about data.
