New Physics Data Science Course

June 18, 2018

A new UC Berkeley course, Data Science and Bayesian Statistics for Physical Sciences (Physics 151), is pioneering efforts to incorporate data analysis and computation into physics coursework. Offered for the first time in Spring 2018, the three-unit upper-division course is helping to bridge the gap between the physical sciences and the world of computing by exposing students to modern computational methods, data science, and Bayesian statistics.

Physics 151 students have the opportunity to explore technology and its applications as they complete homework assignments using the popular programming language Python and data science tools such as Jupyter Notebooks. “The problems,” states GSI Byeonghee Yu, “are simplified enough for students to take the numerical methods they learn in class and apply them to real data.” According to recent students, the most exciting and brag-worthy feature of the course is the ability to “work with real data” as they complete projects involving Nobel Prize-winning discoveries and research. “Working on actual datasets” is an informative and engaging experience, one that rarely arises in physical science courses. This data science-based approach to problem solving is valuable for physics students to develop as computation and data analysis become further ingrained in many fields and majors.


Examples of Spring 2018 Physics 151 Assignments and Projects

Above: Students take a compilation of supernova data to show that the expansion of the universe is accelerating and that the universe contains dark energy


Above: Students analyze the first LIGO event and show that LIGO detected gravitational waves

Delving into topics not traditionally included in physics courses, such as data analysis, data modeling, and machine learning, Physics 151 acquaints students with widely applicable computational tools and numerical methods. While the course has “a distinct astronomy/physics flavor,” it successfully exposes students to the intersection between physics and data science, an important connection in a world so centered around technology.

One of the unique aspects of the course is its emphasis on Bayesian statistics, which is not traditionally offered at the undergraduate level. “Given sufficient computational resources the Bayesian statistics is both easier to understand and produces better results than traditional frequentist statistics,” states Professor Seljak. “There are no theorems or methods to learn, everything follows from a single equation, and computers take care of the rest.”
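To give a flavor of what "a single equation" can look like in practice, here is a minimal sketch that is not taken from the course materials. It assumes the equation in question is Bayes' theorem, posterior ∝ likelihood × prior, and applies it to a made-up coin-flip example on a grid of candidate values. The function names `grid_posterior` and `posterior_mean`, the flat prior, and the data (7 heads in 10 flips) are all illustrative assumptions.

```python
def grid_posterior(heads, flips, n_grid=1001):
    """Posterior for a coin's heads probability on a uniform grid over [0, 1].

    Illustrative only: assumes a flat prior and a binomial likelihood.
    """
    thetas = [i / (n_grid - 1) for i in range(n_grid)]
    prior = [1.0] * n_grid  # flat prior: every value equally plausible
    # Likelihood of the observed flips for each candidate value of theta
    # (the binomial coefficient is constant in theta, so it is omitted).
    likelihood = [t ** heads * (1.0 - t) ** (flips - heads) for t in thetas]
    # Bayes' theorem: posterior is proportional to likelihood times prior.
    unnormalized = [l * p for l, p in zip(likelihood, prior)]
    evidence = sum(unnormalized)  # normalizing constant
    posterior = [u / evidence for u in unnormalized]
    return thetas, posterior


def posterior_mean(thetas, posterior):
    """Posterior expectation of theta, as a weighted sum over the grid."""
    return sum(t * p for t, p in zip(thetas, posterior))


# Hypothetical data: 7 heads observed in 10 flips.
thetas, posterior = grid_posterior(heads=7, flips=10)
mean = posterior_mean(thetas, posterior)
print(f"posterior mean of heads probability: {mean:.3f}")
```

With a flat prior, the exact posterior mean for this toy problem is (heads + 1) / (flips + 2), so the grid result can be checked by hand. The computer's only job is to evaluate the likelihood and prior over many candidate values, which is one way to read the remark that "computers take care of the rest."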


Professor Uros Seljak

GSI Byeonghee Yu

Course Syllabus: