September 27, 2021

When Alan Liang applied for a job as an undergraduate student assistant at UC Berkeley in December 2017, the last thing on his mind was helping launch a nationwide revolution in teaching data science at colleges and universities. It just worked out that way.

Born in Oregon, Liang moved with his family to Shanghai when he was five and lived there for 13 years, during which he attended an international school and became interested in economics. His parents worked in the tech industry and, growing up, Liang saw how technology was becoming increasingly valuable and ubiquitous in everyday life.

When planning for his university studies, Liang decided to major in economics and then go on to earn a Ph.D. in the field. Once he chose Berkeley, he opted for a dual major with computer science. “I felt the opportunity cost not to do CS while at Berkeley was too high,” Liang said. He graduated with his bachelor’s degree in May 2020, then completed his master’s in August. 

Although he wasn’t yet familiar with data science as a discipline, Liang decided as a sophomore to take Data 8: The Foundations of Data Science -- Berkeley’s introductory data science class -- in the fall 2017 semester. He was inspired to do so after an internship at Microsoft, where he analyzed web traffic data to try to gain insight into why users who visited the company’s services quickly left. He drew up a hypothesis as to why this happened and conducted an A/B test from which he could robustly draw conclusions.

“I had barely heard of data science, but I realized that I should take Data 8 from my internship experience,” he said, speaking from New York City. “To this date, it’s still one of my favorite classes, and I was a TA in Data 8 for four semesters.”

He really enjoyed how the class was taught, starting with data manipulation, visualization and statistics and ending with how to make predictions based on data. Through course examples drawn from many different domains, Liang realized that data science could complement any field of study and help support more robust conclusions when solving problems.

As the course ended, a message was posted on the class forum about student opportunities to work on project management, though there weren’t a lot of details. An interview was set up to accommodate the two time zones in Berkeley and Shanghai, and Liang was hired.

Liang credits Anthony Suen, who led the data science student programs in what is now Data Science Undergraduate Studies, with creating a very green-field environment that allowed students to take potentially big or exciting ideas and run with them. “There were so many things that needed to be done. It’s what has kept me working with the division for the rest of my time at Berkeley,” Liang said.

As part of the Data 8 course, the instructors created a website on GitHub describing the course and listing materials and other information. Educators who wanted to learn more had submitted more than 100 responses through the Data 8 Instructor Interest Form, but those responses had gone unanswered. “That’s when we realized there was a huge amount of interest in Data 8 and other educators wanted to teach a similar course,” Liang said.

The founding dean of what is now the Division of Computing, Data Science, and Society decided to offer an in-person workshop to bring interested people to Berkeley and tasked Suen with running it. Liang was instrumental in providing logistical support for the inaugural four-day workshop. “It was clearly very successful,” Liang said. “Berkeley really started this revolution in teaching data science.”

By fall 2018, Liang led a three-person team called the External Pedagogy and Platform team. It was external because the team focused on schools other than Berkeley; pedagogy because the idea of teaching data science was still new and Berkeley’s approach was revolutionary; and platform because at the time there were a lot of technical challenges in deploying services like JupyterHub.

In June 2019, the workshop was again an in-person event, drawing more than 70 participants from around the world. When COVID-19 hit, the program went online and was renamed the National Workshop on Data Science Education. Without travel costs, hundreds of educators from the United States and other countries signed up. As participation has grown, the workshop has transitioned from focusing on Berkeley’s resources to becoming a forum for all institutions to present and discuss their experiences in teaching data science.

“Our mission is to truly make data science accessible to all,” Liang said. “We put a lot of effort into making that happen -- everyone who is interested can and should be able to participate.”

“As a new team lead, I excitedly told prospective new hires that they should join to ‘help shape the national landscape in data science education,’” Liang wrote in a farewell email. “Along the way, I learned so much – many times through trial and error – about managing projects, structuring problems and working with people.”

In September, Liang began his new job as an AI software engineer with Boston Consulting Group in New York City. In that role, he works with clients who typically have a great deal of data but lack the infrastructure or technical skills to build and deploy models that help achieve their business goals.

“I’ve had the pleasure to watch data science grow from something resembling a start-up to a multi-departmental division at Cal,” Liang said. “What I learned there and in my internships really contributed to how I ended up where I am now.”