Progress


Berkeley’s strength in the diverse areas of data science has been growing for decades, and that growth has accelerated along with the field in recent years. The Division has been innovating and building since its inception, in collaboration with partners across campus and beyond, to comprehensively develop data science and its connections with essentially all fields of inquiry at all levels of teaching and research to ensure Berkeley’s continued preeminence in this field.

Following are some of Berkeley’s highlights to date.

 

Data Science Timeline

Spring 2012

Presidential Big Data Initiative & AMP Lab

The Obama Administration launches the Big Data Research and Development Initiative to “develop Big Data technologies, demonstrate applications of Big Data, and train the next generation of data scientists.”

In one of Berkeley’s signal contributions, AMP Lab (Algorithms, Machines, and People) receives a $10M National Science Foundation inaugural “Expeditions in Computing” award, advancing the development of the Berkeley Data Analytics Stack.

Spring 2012

DataEDGE Conference

The I School launches the inaugural DataEDGE Conference to bring together senior industry and academic leaders for a conversation about the challenges and opportunities created by the rise of big data.

Spring 2013

Social Sciences Data Laboratory 

 

The Social Sciences Data Laboratory (D-Lab) opens its doors, redefining Berkeley’s profile in a new era of data-intensive social science research. It offers training, consulting, working groups, and access to datasets, serving students and researchers across Berkeley’s colleges and schools.

Summer 2013

MIDS

Berkeley’s School of Information announces its online Master of Information and Data Science. The multidisciplinary curriculum draws upon computer science, social sciences, statistics, management, and law. Students use the latest tools and analytical methods to work with data at scale and solve real-world problems. It joins several new and renovated on-campus master’s programs with a focus on data science, including the Master of Statistics and the Master of Engineering.

Fall 2013

Simons Institute for the Theory of Computing

The Simons Institute for the Theory of Computing, established with a grant of $60M from the Simons Foundation, initiates its programs with a semester devoted to theoretical foundations of big data analysis. Centered on the theme of a “computational lens” on the sciences, it swiftly becomes the world’s leading venue for collaborative research in theoretical computer science and its connections to other fields, including the foundations of data science.

December 2013

Berkeley Institute for Data Science 

The Berkeley Institute for Data Science (BIDS) launches with a major grant from the Gordon and Betty Moore Foundation and the Alfred P. Sloan Foundation. Nobel Laureate Saul Perlmutter, its founding director, spearheads an initiative drawing together faculty from throughout the campus. In collaboration with the University of Washington and New York University, Berkeley anchors the Moore and Sloan Foundations’ Data Science Environments program, transforming the university setting to accelerate data-intensive discovery.

Spring 2014 

Project Jupyter

Project Jupyter is launched by Fernando Perez, a researcher in Berkeley’s Helen Wills Neuroscience Institute, and a broad team of collaborators across institutions. The evolution of IPython, Jupyter is a non-profit, open-source project that supports interactive data science and scientific computing across all programming languages. Perez joins the Berkeley Department of Statistics in 2017, while Jupyter becomes an essential tool for data science across disciplines, winning the 2018 ACM Software System Award.

Summer 2014

Data Sciences Education Rapid Action Team

The Chancellor and Provost convene the Data Sciences Education Rapid Action Team (DSERAT) to design a comprehensive response to growing student and faculty interest in advancing data science education. A team of faculty is charged with “rethinking at a fundamental level what every educated person must know about quantitative reasoning: how to effectively understand, process, and interpret information to inform decisions in their professional and personal lives and as citizens of the world in the 21st century.” 

Members:

Cathryn Carson (co-chair), History
Bob Jacobsen (co-chair), Physics and Interim Dean, L&S Undergraduate Studies
David Culler, EECS
Michael Franklin, EECS
Michael Jordan, EECS and Statistics
AnnaLee Saxenian, Dean, School of Information
Jasjeet Sekhon, Political Science and Statistics
Bin Yu, Statistics and EECS

October 2014

Data-Driven Discovery 

Faculty in ecology, astronomy, and microscopy become inaugural investigators in the Gordon and Betty Moore Foundation’s Data-Driven Discovery initiative, created to foster collaboration across disciplines and enable new types of scientific breakthroughs.

January 2015

DSERAT Report

The Data Sciences Education Rapid Action Team issues its recommendations.

These include:

  • A foundational lower division course accessible to everyone, with broad applicability, further enriched by “connector” courses in a diverse range of areas,
  • A suite of courses that advance the use of data sciences within the broad swath of disciplines available to Berkeley undergraduates,
  • A high-quality minor for students who wish to couple data sciences with their major discipline, and
  • A data sciences major for students who want this to be their primary area of study.

Faculty across the campus join with staff and student collaborators in a grass-roots effort to begin designing new courses and programs.

Spring 2015

Discovery Program

The Berkeley Institute for Data Science, the emerging Data Science Education Program, and the Undergraduate Research Apprenticeship Program create the Discovery Program to provide undergraduates with more hands-on, team-based discovery opportunities leveraging data science.

March 2015 

West Big Data Innovation Hub

The National Science Foundation announces funding for Big Data Regional Innovation Hubs to cultivate multi-sector collaborations among academia, industry, and government. Part of the Big Data Research and Development Initiative, the regional hubs are created to accelerate progress towards addressing societal challenges, enable access to and use of important and valuable available data assets, and foster a national big data ecosystem. UC Berkeley becomes an active partner in the West Big Data Innovation Hub.

Fall 2015

Foundations of Data Science and Connectors

The pilot for Data 8: Foundations of Data Science launches with 94 students enrolled. The course, co-taught by Ani Adhikari of Statistics and John DeNero of EECS, is designed to be accessible to students of all backgrounds and does not require advanced math or computer programming experience. Data 8 focuses on building an understanding of computational and inferential thinking in the context of real-world data and questions.

Offered alongside Data 8 are six complementary “connector” courses that enable students to experience data science in the context and perspective of a variety of disciplines. Pilot offerings include: “Race, Policing, and Data,” “Health, Human Behavior, and Data,” and “How Does History Count?” Berkeley’s Data Science education program finds its first home in the Undergraduate Division in the College of Letters and Science.

August 2015  

DS421

DS421, a five-year NSF-funded Research Traineeship program, welcomes its first cohort of graduate students. With a focus on Environment and Society: Data Sciences for the 21st Century, the program draws on faculty from eight departments and offers foundational and advanced training, research experiences, and professional development.

September 2015

Statistics Strategic Plan

The Department of Statistics prepares its Academic Program Review. Its self-study observes, “The major driving trend in the field of statistics over the last decade has been the rise of ‘data science.’ … We have been taking a leading role in this campus-wide dialog. To continue to take such a role will require us to marshal our internal resources and to pursue partnerships on campus that both exploit our intellectual strength and expose us to critical intellectual challenges.” The department recommends that Berkeley create a new organizational structure bringing together statistics, computing, and their applications. 

November 2015

Faculty Advisory Board

Chancellor Dirks and Provost Steele convene a campus-wide Data Science Planning Initiative Faculty Advisory Board (FAB) to chart paths of institutional development and frame an integrated strategy for Berkeley’s global leadership in data science.  

Members:

Cathryn Carson, History, FAB Chair
Lisa García Bedolla, Graduate School of Education and Political Science
Francesco Borrelli, Mechanical Engineering
Ron Cohen, Chemistry and Earth & Planetary Sciences
David Culler, Electrical Engineering & Computer Sciences
Rosemary Gillespie, Environmental Science, Policy & Management
Sol Hsiang, Goldman School of Public Policy
Bob Jacobsen, Physics and L&S Undergraduate Division (Dean)
Michael Jordan, Statistics (Chair) and Electrical Engineering & Computer Sciences
Susan Marqusee, Molecular & Cell Biology and QB3 (Director)
Anno Saxenian, School of Information (Dean)
Jas Sekhon, Political Science and Statistics
Chris Shannon, Economics and Mathematics
Ion Stoica, Electrical Engineering & Computer Sciences
Bin Yu, Statistics and Electrical Engineering & Computer Science

January 2016

Data 8 and Connectors 

Data 8 is offered as a regular course; enrollment increases to 447, and connector courses grow to 217 students. Using Jupyter notebooks across the curriculum, faculty develop new ways to teach diverse students at scale. Twenty undergraduate majors revise their statistics requirements to accept Data 8, in some cases along with the Statistics 88 connector.

June 2016 

Pedagogy Workshop

More than 30 UC Berkeley faculty and instructors participate in a week-long course on the approach and curriculum of Data 8. The Pedagogy Workshop includes support for developing connector courses in participants’ disciplines that connect with and build on the Data 8 foundation. The workshop continues to be offered annually as more faculty and lecturers become interested in integrating data into their curricula.

June 2016 

BIDS XDs

BIDS hosts the inaugural ImageXD event, a workshop connecting researchers across domains to share knowledge about the use of image processing data, algorithms, and software. New XDs soon emerge for text analysis (TextXD) and graphs (GraphXD).

August 2016 

FAB Report

The Faculty Advisory Board issues its report detailing a vision for all aspects of data science at Berkeley, including cross-campus collaboration, faculty hiring, fundraising, and creating a Division to design a campus-scale approach that leverages the University’s academic and institutional strengths. The report envisions an approach that is:

  • Deep – “laying the foundations of the field and pushing its conceptual frontiers;”
  • Broad – “applying established or emerging technologies and techniques to the wide range of areas or domains;” and
  • Rich – “studying the implications of the explosion of data and analysis for ethics, policy, society, and human knowledge.”

Campus leadership opens the report for broad review and comment. Responses include the Divisional Council of the Academic Senate’s endorsement of the creation of a Division and support for “a rapid and aggressive move into the intellectual space of Data Science by the Berkeley campus.”

Fall 2016

Diversity and Data-enablement

The Division of Data Sciences and D-Lab introduce modules, short explorations into data science embedded in existing courses, from linguistics to sociology to ethnic studies. A series of American Cultures modules begins. Advanced courses in a variety of disciplines are introduced, such as ESPM 157 Data Science for Global Change Ecology, reflecting important research programs and growing out of earlier connector offerings.

September 2016

Data Scholars

The Data Science Education Program and Berkeley’s D-Lab partner with student organizers to initiate the Data Scholars program, a community to engage, support, and empower underrepresented students taking Data 8. Student teams supporting the curriculum grow as the Data Science Education Program expands under the aegis of the Undergraduate Division.

November 2016 

Faculty Support

An interdisciplinary faculty group submits a letter to Chancellor Christ indicating its support for the FAB recommendations, including the creation of a cross-departmental Division. The letter states: “A major strength of the proposal is the opportunity to integrate outstanding faculty beyond computer science and statistics in its intellectual vision.”

December 2016

Cross-Campus Initiative

 

In response to the FAB report and faculty feedback, campus leadership announces next steps in a cross-campus data science initiative, stating, “The campus should invest in a prominent and significant way in data science.” The institutional vehicle is a new Division of Data Sciences, placed at the level of Berkeley’s schools and colleges, to be formed in 2017.

Spring 2017

Advanced Offerings

Nearly 100 students enroll in the new Data 100 course, an intermediate-level class developed by a multidisciplinary team of faculty that bridges Data 8 and upper-level data science courses, addressing the entire data lifecycle. Its enrollment goes on to more than double every semester. Professor Ani Adhikari introduces Probability for Data Science, initially to 55 students; it grows exponentially in the following years. Enrollment in Data 8 climbs to 695.

May 2017

Interim Division and Interim Dean

Interim Executive Vice Chancellor & Provost Christ announces the creation of a Division of Data Sciences with the two-year appointment of Professor of Electrical Engineering and Computer Sciences David Culler as Interim Dean. His responsibilities include bringing together faculty, researchers and students from across campus to foster new initiatives in data-intensive discovery and education. The Data Science Education Program is a major anchor of the new Division.

August 2017 

Data Science Major/BA Committee

The L&S Data Science Major/BA Committee, which includes faculty from disciplines across campus, comes together to design a data science major for the College of Letters and Science. The Committee proposes a major that encompasses courses taught by faculty from a variety of departments, with the aim of preparing students for a wide range of professions involving data science and analytics and for graduate studies in related fields.

Committee Members:

Alexei Efros, EECS
Alexey Pozdnukhov, Civil and Environmental Engineering
Anca Dragan, EECS
Ani Adhikari, Statistics
Carl Boettiger, ESPM
Cathryn Carson, History and Faculty Lead, Data Science Education Program
Charis Thompson, Gender and Women’s Studies
Daniel Rokhsar, MCB, Physics
David Ackerly, Integrative Biology
Deborah Nolan, Statistics
Deirdre Mulligan, I School, Law
Haiyan Huang, Statistics
Heather Haveman, Sociology, Business
Jack Gallant, Psychology, Neuroscience
James Demmel, Math, EECS
Jasjeet Sekhon, Political Science, Statistics
John DeNero, EECS
Joseph Gonzalez, EECS
Joseph Hellerstein, EECS
Joshua Blumenstock, I School
Laurel Larsen, Geography
Laurent El Ghaoui, EECS
Lisa Barcellos, Public Health, Computational Biology
Lisa Garcia Bedolla, Education
Michael Jordan, Statistics, EECS, Computational Biology
Nicolaas Veldhuis, Near Eastern Studies
Paul Grigas, IEOR
Paul Waddell, City and Regional Planning
Perry de Valpine, ESPM
Philip Stark, Statistics
Philip Marcus, Mechanical Engineering
Ronald Cohen, Chemistry, Earth and Planetary Science
Uros Seljak, Physics
Zsolt Katona, Business

October 2017 

FODA

The National Science Foundation awards Berkeley a grant to create the Foundations of Data Analysis Institute (FODA) to bring together core research communities in theoretical statistics, applied mathematics, and theoretical computer science. It is supported by NSF’s TRIPODS Program (Transdisciplinary Research in Principles of Data Science), which was launched to address fundamental open questions in the theoretical underpinnings of data science. It aligns with the first of NSF’s 10 Big Ideas, Harnessing the Data Revolution, intended to frame the foundation’s investments for years to come.

January 2018

Faculty Advisory Council

Interim Dean Culler appoints the Data Science Division Faculty Advisory Council to bring perspectives from throughout campus, advise the Interim Dean, and provide a conduit for socializing important issues. The council meets roughly monthly, and members are expected to serve for the remainder of the interim process.

Members:

Bin Yu, Statistics
Cathryn Carson, History
Costas Spanos, EECS/EE, CITRIS
Dan Fletcher, Bioengineering
David Wagner, EECS/CS
John Chuang, Information
Josh Bloom, Astronomy
Josh Goldstein, Demography
Karen Chapple, City and Regional Planning
Kathy Yelick, EECS/CS, Lawrence Berkeley National Laboratory
Lisa García Bedolla, Graduate School of Education, Political Science
Louise Mozingo, Landscape Architecture and Environmental Planning, American Studies
Max Auffhammer, Agricultural and Resource Economics, International and Area Studies
Michael Jordan, EECS/Statistics
Pamela Samuelson, Law
Ron Cohen, Chemistry, Earth and Planetary Science
Susan Marqusee, Molecular and Cell Biology

January 2018 

Human Contexts and Ethics

New classes are piloted in the area of Human Contexts and Ethics of Data, a universal requirement of the proposed data science major. HCE courses explore the human and social structures, formations, and practices that shape data science activities, from data collection and analysis, to data privacy and security practices, to the use of data in justice contexts.

February 2018

RISELab

Berkeley’s RISELab, successor to AMPLab, receives NSF’s Expeditions in Computing Award, providing $10 million in funding over five years to enable game-changing advances in real-time decision-making technologies.

February 2018

Inter-departmental Task Force

The department chairs of Statistics, Industrial Engineering and Operations Research, and Electrical Engineering and Computer Sciences, along with leaders in science and technology studies, new media, cognitive science, and computational biology, assemble an Interdepartmental Task Force to propose basic principles and explore possible organizational structures for the Division.

The task force recommends the following goals:

  • Shared governance and transparency: “To build trust and a joint sense of mission” among the diverse units and disciplines.
  • Synergy and cohesion within the Division: “An inclusive culture that respects different disciplinary practices and approaches while emphasizing interdisciplinary collaboration.”
  • Tight integration and increased synergy with other units across campus: “...Driven by both shared educational missions and synergistic research goals.”
  • New undergraduate program(s) that are “responsive to student needs and the changing intellectual and practical landscape in Data Science.”
  • Breadth: Recognition that “faculty slots related to Data Science are spread throughout campus, not restricted to this Division” and that the Division “should nurture and oversee programs that foster the application of Data Science campus-wide and create an environment where all faculty engaged in Data Science can become part of a collaborative community.”

The Task Force proposes circulating its recommendations among a wide group of potentially interested faculty for feedback and convening a smaller task force to draft a more concrete proposal for moving forward that includes this feedback.

Contributors to the Task Force report:

Scott Shenker, EECS/CS, Chair
Murat Arcak, EECS/EE
Cathryn Carson, History and Science and Technology Studies
Sandrine Dudoit, Statistics
Jack Gallant, Psychology
Robert Glushko, Cognitive Science
Philip Stark, Statistics 

March 2018 

BA in Data Science

 

The Academic Senate approves a Bachelor of Arts in Data Science in the College of Letters and Science. Requirements include courses in human contexts and ethics to help students develop an understanding of the human and social structures, formations, and practices that shape data science activity; and a domain emphasis, or a specialization in one of more than two dozen fields that include data applications.

April 2018

Data 8x

Data 8 becomes available on edX in an online Foundations of Data Science sequence as Data 8X. More than 75,000 students enroll.

April 2018 

Ready for Launch

Campus leadership gives the go-ahead to launch the Data Science Major. In the Division of Data Sciences' Fall 2017 Data Science Survey, 13% of all undergraduates at Berkeley indicated they would be “very likely” to declare as a Data Science major (5/5 on a five-point scale). Departments and faculty across campus take further steps to integrate data science into their own teaching. The National Academies of Sciences, Engineering and Medicine release their study report, “Data Science for Undergraduates: Opportunities and Options,” drawing significantly from the Berkeley developments.

May 2018 

Programs Governance Committee

The Division of Data Science Degree Programs Governance Committee, a working body of faculty from multiple departments, is formed to oversee the data science major and future minor; review the integration of course offerings; curate course lists; liaise with other units, programs, and majors; and provide a focal point for addressing student and institutional issues.

Committee Members:

David Culler (Chair), Division of Data Sciences
Karen Chapple, City & Regional Planning
John DeNero (Chair Delegate), EECS
Sandrine Dudoit (Chair Delegate), Statistics
Ani Adhikari, Statistics
David Bamman, Information
Rasmus Nielsen, Statistics and Integrative Biology
Joe Hellerstein, EECS
Carl Boettiger, Environmental Science, Policy & Management (Domain Emphasis)
David Harding, Sociology (Domain Emphasis)
Cathryn Carson, History (Human Contexts and Ethics) and Faculty Lead, Data Science Education Program
Nasser Zakariya, Rhetoric (Human Contexts and Ethics)

 

June 2018

HCE Pedagogy Workshop

A pilot workshop on pedagogy for Human Contexts and Ethics is offered for Berkeley instructors and students, following the model of other summer workshops.

July 2018

Pedagogy Workshop for All

The Division of Data Sciences, in partnership with the National Science Foundation, the West Big Data Innovation Hub, and Microsoft, offers the first Data Science Pedagogy Workshop open to faculty from other universities and colleges. Three dozen participants from across the US and Canada attend, gaining experience with the Data 8 approach, best practices in curriculum design, and technological infrastructure.

July 2018 

Center for Connected Learning

UC Berkeley Library, the Division of Data Sciences, and Educational Technology Services create the vision for the Center for Connected Learning, an innovative collider space in Moffitt Library. The beta version of the Center provides data science peer advising and consulting, maker workspaces, and VR labs. This creative space, run by student organizations, is designed for students to support other students and encourages shared expertise and community building in the data sciences.

August 2018 

Data Collaboratives

The Division launches the Data Collaboratives program, a new form of collaboration supported by Schmidt Futures that links students, government, community, and business partners to leverage data to help address complex social problems.

Fall 2018

5th Year MIDS

The I School announces 5th Year MIDS, a pathway to the Master of Information and Data Science for graduating UC Berkeley undergrads, to launch in fall 2019.

September 2018

Major Open

With a new advising team in place, students can now declare a data science major. Nearly 1,000 students file pre-declarations in the first eight weeks of advising.

November 2018 

New Division

Berkeley announces plans for a new division, provisionally named the Division of Data Science and Information. The innovative Division is envisioned to encompass computation, information, and data science and their human, social, political, artistic, and scientific implications. It connects departments from the College of Engineering, the College of Letters and Science, and the School of Information; incorporates the Berkeley Institute for Data Science (BIDS); and establishes a new Data Science Commons that will bring together groups of faculty and students from across the University to open new research domains and develop new fields of study.

December 2018 

UC Berkeley Strategic Plan

Berkeley announces its ten-year Strategic Plan, investing heavily in data science and integrating it across the campus’ signature initiatives. The Strategic Plan provides the intellectual and organizational framework for Berkeley’s next capital campaign.

December 2018 

First Data Science BA Graduates

Nine students become Berkeley’s first data science BA graduates. Their domain emphases include cognition, computational biology methods, economics, robotics, social welfare, health and poverty, and applied mathematics and modeling.

 

April 2019

Certificate in Applied Data Science

The Academic Senate approves a new Graduate Certificate in Applied Data Science, to be offered by the School of Information to graduate students across the UC Berkeley campus. The certificate, an Option II Certificate program, introduces the tools, methods, and conceptual approaches used to support modern data analysis and decision-making in professional and applied research settings.

January 2019

Growing DS Enrollment

Data 8 crosses the threshold to become the largest class at Berkeley, with more than 1,500 students registering for Spring semester. Registration tops 1,000 for Data 100, 900 for Introduction to Machine Learning, 250 for Probability for Data Science, and 200 for Human Contexts and Ethics of Data.

 

May 2019

First Data Science Commencement

More than 100 students graduate with a Bachelor of Arts in Data Science at the first commencement. Kate Johnson, President of Microsoft US, delivers the commencement address at the ceremony in Wheeler Hall Auditorium.

August 2019

Division's first Associate Provost

Jennifer Tour Chayes, Microsoft technical fellow and managing director for Microsoft Research in New England, New York, and Montreal, is named the university’s first Associate Provost for the Division of Data Science and Information and Dean of the School of Information, effective January 2020. In the interim, Dean of Undergraduate Studies Bob Jacobsen is named interim dean of the Division of Data Sciences, and Associate Dean John Chuang is named head of the School of Information.

 
