Ion Stoica

Ion Stoica joined Berkeley in 2001 from Carnegie Mellon. He is a professor in the Department of Electrical Engineering and Computer Sciences.

"I believe that building this positive feedback loop between systems and ML will be key to the rapid evolution of our field over the next decade, and beyond."    

Research Focus: Cloud computing, big data, and machine learning (ML). Building large-scale systems to analyze massive amounts of data and develop artificial intelligence applications.

What are some examples of how this is applied in the world?

Personalized medical diagnosis and treatment. Increasing user engagement and satisfaction. Fraud and intrusion detection.

Why is this area important to you?

I am very excited about working at the intersection of systems and machine learning. In particular, I believe that the two can form a rapidly evolving positive feedback loop, in which systems accelerate ML algorithms and ML algorithms, in turn, optimize the systems and make them faster. In the first category, my group has already built several systems for ML, the most mature being Ray. Ray provides support for large-scale ML workloads, including reinforcement learning (RL) and hyperparameter tuning. In turn, we have already used Ray to optimize systems (e.g., database execution engines, compilers), algorithms (e.g., decision trees for network packet classification), and program synthesis. I believe that building this positive feedback loop between systems and ML will be key to the rapid evolution of our field over the next decade, and beyond.

Anything else you’d like to share?

I co-founded three companies: Conviva (2006), Databricks (2013), and Anyscale (2019). Of those, Databricks commercialized Apache Spark, an open source distributed framework we developed at Berkeley's AMPLab. Apache Spark is now the de facto standard for big data processing, used by tens of thousands of organizations and millions of people around the globe. Databricks now employs 1,000 people, has over 2,000 customers, and is one of the fastest growing enterprise companies. Its customers span virtually every industry, from finance to health care, manufacturing, media, and retail. Anyscale commercializes Ray, which we developed at RISELab. Ray already has more than 200 contributors, and it is used by dozens of organizations to develop new scalable ML applications for supply chain management, online recommendation, manufacturing, and money-laundering prevention, to name a few.