The Berkeley Data Stack is a collection of open-source tools that enable large-scale data science research and education efforts across UC Berkeley. These tools include:

  • Jupyter - Interactive computing notebooks
  • Online Textbooks - Open-source textbooks used in classroom instruction
  • Interact Links - One-click access to notebook content
  • Autograding Tools - Ok.py, Nbgrader, and Otter Grader

All of these elements of the Berkeley Data Stack can be found on Berkeley's JupyterHub, known as DataHub. UC Berkeley currently has one of the largest collections of JupyterHub deployments in the world, capable of supporting tens of thousands of users across campus and beyond.

CloudBank

Berkeley is partnering with UC San Diego and the University of Washington on a grant from the National Science Foundation to develop CloudBank, a suite of managed services to remove barriers to public cloud access for data science and computer science research and education. Details and updates here.


Education Hubs

JupyterHubs designed for classroom use in relevant courses. 

Research JupyterHubs

Research-oriented JupyterHubs, though they are also used in course settings. Most deployments use SLURM.

JupyterHub Working Group Steering Committee

Shawna Dark - Chief Academic Technology Officer & Executive Director of Research, Teaching and Learning

Eric Fraser - Assistant Dean and Director of IT, College of Engineering

Yuvi Panda - Dev Ops Architect, Data Science Undergraduate Studies 

Anthony Suen - Director of Programs, Data Science Undergraduate Studies