Aditya Parameswaran came to Berkeley in 2019 from the University of Illinois, Urbana-Champaign. He received the 2019 Very Large Data Bases Early Career Research Contributions Award and the 2017 Technical Committee on Data Engineering Rising Star Award.
Q: What is your research focus?
A: I build tools for simplifying data science, making it easy for end users and teams from a variety of disciplines, from journalism to public health, to leverage, make sense of, and manage their large and complex datasets. I draw on techniques from multiple disciplines, including data management (or databases), human-computer interaction, and data mining, to address the hairy problems in democratizing data science.
Q: How is this research used in the world?
A: Our tools have been used by domain experts in genomics, astrophysics, battery science, and ad analytics, among others. We built a scalable spreadsheet tool, DataSpread, that enables users to work interactively on spreadsheets thousands of times larger than those supported by Microsoft Excel or Google Sheets. We built ZenVisage, a visualization search tool (like Google Search, but for visualizations) that allows end users to sketch a pattern on an online canvas and find data that matches that pattern. This has been used to discover various novel scientific insights, including a star that was known to harbor a Jupiter-sized planet (astrophysics), a previously unknown relationship between solvent properties (battery science), and gene expression profiles that independently confirmed the results of a related study (genomics). Other tools include Helix, a human-in-the-loop machine learning tool that allows end users to train machine learning models easily, and Orpheus, a data versioning tool that allows users to store and keep track of datasets generated during data science. We have received funding from the NIH, the Siebel Energy Institute, and the Toyota Research Institute, which reflects the significance of these tools for genomics and battery science.
Q: Why is this area important to you?
A: Academic research in data management has been somewhat myopically focused on what I call the "1%"—the needs of the Amazons, Microsofts, and Googles of the world. But there are so many others—the "99%"—analysts, accountants, scientists, doctors, small business owners, journalists, among others—whose needs are not being addressed. This is a worthy cause, with the potential to have an untold impact in addressing many of our societal grand challenges. From a research standpoint, this work is highly interdisciplinary and challenging, and my students reflect that diversity—spanning a variety of backgrounds, countries, experiences, and skillsets.