Portfolio of all undergraduate research projects for Spring 2020. 

D-Lab, Digital Humanities, Near Eastern Studies - Ancient World Citation Analysis (AWCA)

The goal of this research project has been to build a generalizable workflow which enables both “close” and “distant reading” and language modeling of a digitized corpus in a field of study.

New Sun Road - Anomaly Detection in Solar Microgrids

New Sun Road's anomaly detection project is developing machine learning tools to highlight unusual electrical events at solar microgrids, with applications in predictive maintenance, theft detection, and beyond.

UC Berkeley School of Law - The Docket of the Supreme Court of the United States: How does the Supreme Court decide what cases to take?

The Supreme Court of the United States has essentially unconstrained discretion to set its own docket.

LBNL - Geometric and Manifold Learning on Graphs and Unstructured Data

While feature vectors and pixelated images are being used as standard proxies for presenting data in machine learning, many objects encountered in a scientific machine learning context can only be suitably modeled as discrete objects containing no

NASA Ames Research Center - NASA Data Visualization

This project investigates the flight behavior of airline pilots and the factors affecting that behavior in a series of challenging simulated flights. We need software that can “play back” the state of the simulator so events can be viewed for coding.

UCSF - Detecting heart disease with cardiac sensor data

Cardiovascular disease remains one of the leading causes of death worldwide. This project is based at the UCSF Medical Center and will use medical data from various sources to improve AI-enabled detection and prevention of cardiovascular disease.

Data for Social Good Foundation

Data for Social Good provides easy-to-use tools for non-profits and political campaigns that want to organize during and in between campaigns. We have created a data system designed to support relational organizing in communities of color.

SimpleWater Inc - Environmental Health Estimator

While genomics data has garnered enormous attention, the vital and indeed dominant role of environmental exposure (exposomics) has been gravely overlooked.

Kiwi Campus - Dispatcher algorithm

We are developing a dispatching algorithm that optimizes Kiwi Mate and robot assignment for the orders that we receive through our marketplace.

Kiwibot - Kiwibot delivery robots

Estimating the delivery time that satisfies clients. 

Clarity Movement Co. - Automated reports for city-wide air sensor network

At Clarity, we have real-time air quality data from our sensors deployed in cities around the world. This project will explore an incoming dataset and begin analyses to understand how exposure to air pollution can vary across a city.

Catalyst Off-Grid Advisors - Data visualization and analytics for off-grid solar in Africa

The team will deliver on two main workstreams. The first will be to develop a “dummy” dataset that would capture a simulated set of consumer receivable payments from thousands of customers.

Gauge.io - Data Visualization for User Experience Startup Projects

We seek to apply data visualization techniques to help analyze large document collections from topic modeling methodologies.

East Bay Community Energy - Forecasting Day Ahead Electric Load using a Machine Learning Application and Bottom-up Aggregation of Individual Meters

Using hourly data for a large set of individual meters, develop a machine learning algorithm to predict the day-ahead aggregated load. Relevant characteristics of each meter will be provided, along with some exogenous variables

East Bay Community Energy - Identifying Electricity Usage & Disconnection Patterns in Disadvantaged Communities

Using anonymized electricity usage data and disconnection data (for non-payment), research and identify patterns and propensity scores through machine learning applications.

Public Health Institute - Pathologists Cancer Alert

The idea for this project is supported by population-based cancer registry data for bladder cancer incidence.

Political Research Associates - Right Wing Sheriffs

Much has been made of the state and local government takeover strategy by the far right, but one area that is currently under-scrutinized is the presence of so-called “Constitutional Sheriffs” on ballots and in office across the country.

Drexel University - Community Air Monitoring Monthly Report

Real-time air monitoring in communities next to the Bay Area's five oil refineries measures various pollutants in the air once every minute.

Innovations for Youth - Examining the spaces of violence against youth experiencing homelessness

The SF-YEAH project is a multiphase study in the School of Public Health exploring the ways young people experiencing homelessness in San Francisco experience violence and find safety and resources.

Climate Policy Initiative - D3 Data Viz using climate finance data

Working with Climate Policy Initiative's dataset of global climate finance data, make interactive graphics with D3 to host on the Climate Policy Initiative website.

WAHVE

Analysis and projection of trends in work patterns of seniors. 

WITI@UC (CITRIS/COE) - Data visualization of State of Women in Tech

As part of understanding the state of women in tech, WITI@UC would like to visualize data from CalAnswers and UCOP to show percentages and changes over time in the participation of women in tech fields

Women in Tech Initiative (COE/CITRIS) - Innovation Resources Database

Collaborative database of resources for innovation and entrepreneurship - bit.ly/in-resources

LBNL - Streaming data analysis

We are studying prediction techniques for streaming data, with a couple of examples from financial applications.

UCSF - Meta-analysis of individual participant data from clinical trials of Crohn's Disease

Thanks to medical research we now have many treatments for Crohn's disease; unfortunately, we don't know which ones are better than others (comparative effectiveness) and in whom they're most likely to work (precision medicine).

UCSF - Knowledge Representation and Discovery from a large Clinical Text Corpus

Early-stage project to use representation learning methods to extract new insights from a large and high-quality clinical corpus. 

UCSF - Text Classification on Clinical Notes

Looking to perform supervised learning on endoscopy reports in order to enable clinical research on the comparative effectiveness of drugs for Inflammatory Bowel Disease (Crohn's Disease, Ulcerative Colitis).

Exploring the Achievement Gap in Berkeley Public Schools

Working with state and district data looking at achievement gaps in Berkeley Public Schools. 

Cal Alumni Association - Cal Alumni Association Analysis of Engaged Alumni

The UC Berkeley Division of Data Sciences and the Cal Alumni Association (CAA) will partner to conduct data analysis with the goal of determining the correlations between the participation in various alumni engagement activities and the propensity

Energy Resources Group - Thermal Unit Characterization Based on Data

The project focuses on generating realistic data sets for renewable generation integration. We use publicly available time-series data from generation companies to characterize real resource availability.

Department of Linguistics - Exploratory cross-linguistic analysis of acoustic data

This project aims to better understand the relationship between acoustics and language, and apply novel machine learning techniques to extract patterns in acoustic data.

Sociology & Haas School of Business - Pricing norms in cannabis markets

For this project, students will use Python to write code for scraping, munging, and classifying product data to better understand the dynamics of the United States cannabis industry.

Sociology - Computational Analysis of Charter School Identities and Stratification

Do charter schools' marketing strategies attract parents of specific race and class backgrounds, contributing to educational segregation by race and class?

Sociology - Computational Analysis of Social Science Research

Professor Heather Haveman and doctoral candidate Jaren Haber are analyzing about 70,000 research articles gathered from JSTOR, the leading online repository of journal articles for the social sciences.

D-Lab, Digital Humanities, Near Eastern Studies - Sumerian Network Analysis

Cuneiform tablets from the ancient Sumerian city of Puzriš-Dagan (currently in Iraq) document deliveries and expenditures of a state center with close connections to the cult and to diplomacy.

ESPM - Addressing Structural Inequality in U.S. Agricultural Higher Education: An Assessment of Pedagogical Practices and Food Systems Coursework at Land-grant Institutions

In the last decade, many U.S. “land-grant” agricultural universities (including UC Berkeley) have turned a reflexive lens on the fact that their institutions marginalize certain community members.

Assessing Research Usability for Development Practitioners in Southern Africa

Policymakers, NGOs, and businesses are overwhelmed with information from research on climate change and natural resource management.

Making Cyberspace Inclusive: Developing Inclusive Online Speech Detector

Many studies have focused on building a detection algorithm for hate speech, but no one has developed a similar algorithm for inclusion and belonging. The gap matters because inclusive cyberspace is not equal to the absence of online hate speech.

Goldman School of Public Policy - Environmental Justice Mapping Project

The Environmental Justice Mapping Project's goal is to create a hub for environmental justice data across the country.

Data-Enabled Donations

This project aims to find ways to optimize the supply and demand of physical donations that shelters receive following a disaster. 

Department of Chemistry - BEACO2N

The project focused on the analysis of a dense network of urban measurements - 1) Assessing strategies for converting raw voltages to concentrations 2) Collecting information on emissions from disparate sources and formatting them for comparison t

Automated Water Purification

Creating an app to remotely monitor and control arsenic purification systems.

College of Letters and Science - Student Success Analytics Platform

Creating machine learning algorithms to predict student success using anonymized data. 

Office of Undergraduate Research and Scholarship - Campus Discovery Opportunities Database

Help with the development of a campus database platform that will better showcase student experiential learning opportunities at Cal. Seeking students with database development and web design skills.