
Olivia Lewke

Major: English and Data Science

Graduation year: 2020

Originally from Southern California, Olivia first came to Berkeley as an English major. She hadn’t been exposed to coding prior to coming to Berkeley. While working in the Student Learning Center on campus as a writing tutor during her sophomore year, she became interested in data science. She was able to take her teaching experience and training and apply it as an Undergraduate Student Instructor (UGSI) for Data 8. 

"Ultimately what drew me to data science was its ability to shed light on larger-scale questions-- as an individual person the scope of your analysis when you approach a book or a dataset is very fine-grained, but having access to computational methods can broaden the lens of your inquiry to an almost staggering extent."

Q: How did you first get interested in data science as an English major?

A: Originally I wanted to add Cognitive Science as a second major because I was drawn to the linguistics domain emphasis. Having never programmed before coming to Berkeley, I decided to take Data 8 to learn a bit of Python before diving into CS 61A. I remember sitting in the Data 8 lecture hall on the first day of the semester very clearly. After they covered the syllabus, there was time for a brief example. Professor DeNero loaded the full text of Little Women into a Jupyter Notebook and began to investigate character relationships by treating the book as a dataset. He plotted the frequency count of character names over the course of the novel, and even that simple visualization conveyed so much information. It became clear which parts of the book focused on which sisters, when love interests started to converge and diverge, when certain characters died, and so on. It was amazing to me that looking at something as basic as name usage, which would have been incredibly time-consuming to track by hand as a human reader, could change my perspective on a book. From that point on I was hooked!

I think “data science” can often be a vague blanket term, but it is incredible that computational skills can help you understand the world in a new way. I am very interested in natural language processing, but even just a few years ago I would have been skeptical of applying empirical methods to something as complicated and ambiguous as a book. By no means does data science give one the ability to “solve” a book, but it can unveil very exciting insights and patterns that might previously have been undetectable because our minds are restricted in the amount of “data” they can process at any given moment. I think the same holds true for many other problems. Simply throwing an algorithm or a machine learning model at a complicated problem will not necessarily “solve” that problem, but it can reveal dimensions of a solution, or a new understanding, that were previously unavailable.
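
The Little Women demonstration described above can be sketched in a few lines of Python. This is a minimal illustration, not the notebook shown in the Data 8 lecture: the function name `name_counts_by_segment`, the choice to split the text into equal runs of words, and the short sample passage are all assumptions made for the example. Only the four sisters' names come from the novel itself.

```python
import re

def name_counts_by_segment(text, names, n_segments=10):
    """Count how often each name appears in each of n_segments
    roughly equal runs of words (hypothetical helper for illustration)."""
    # Splitting on letters only means "Jo's" is tokenized as "Jo" and "s",
    # so possessives still count toward the name.
    words = re.findall(r"[A-Za-z]+", text)
    counts = {name: [0] * n_segments for name in names}
    if not words:
        return counts
    for i, word in enumerate(words):
        if word in counts:
            # Map the word's position to a segment index in 0..n_segments-1.
            segment = i * n_segments // len(words)
            counts[word][segment] += 1
    return counts

# A made-up passage standing in for the full text of the novel.
sample = (
    "Jo wrote all morning while Meg sewed. Jo laughed and Beth played. "
    "Later Amy sketched by the window, and Amy teased Jo about her hat."
)
result = name_counts_by_segment(sample, ["Jo", "Meg", "Beth", "Amy"], n_segments=2)
print(result)
```

With a full novel and a larger `n_segments`, each name's list of counts could be passed to a plotting library to draw one line per character across the book.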

Q: How are you applying data science or using it in your courses and/or research?

A: I spent the summer after my junior year interning in Washington, D.C. through a non-profit named Coding It Forward. I was placed at a government agency that was interested in updating its technology and making itself more accessible. My project ended up involving standardizing and automating the process of writing economic news releases at the Bureau of Labor Statistics. Even though I lacked an economics background coming into the internship, having a background in NLP made it much easier to figure out how to generate drafts of monthly reports!

I was also very lucky to have the opportunity to take part in the Undergraduate Research Apprentice Program (URAP) for a few semesters with Professor David Bamman, who works on NLP and cultural analytics. Reading his work on gender representation in literature and information propagation opened up a whole new world for me in applying what I learned in my data science courses to books! I was able to contribute to LitBank, his annotated dataset of English-language fiction.

Q: How do you envision using data science in the future?

A: After graduating this year, I’ll be working in the Bay Area as a data scientist, and I plan to return to school for a graduate degree at some point in the future. I think even just having the skills to pick apart a large dataset can fundamentally change how you come to understand the world around you. If you’re interested in a sport you can download and import a large dataset and see if you can find trends or biases, or if you enjoy a particular genre of movie you can download a large number of movie scripts and see what you uncover in them. The general set of skills you learn can be useful in a wide variety of domains, and even though it is up to you to understand the domain you apply them in, the possibilities are endless.

Q: Any advice for students curious about data science who do not have a traditional math or computer science background?

A: I would tell them not to be intimidated by those who might have that background! One of the wonderful things about being in college is that you get to try things out. There are introductory classes for a reason, and although learning the ropes of a new discipline might take a lot of effort, you come away with an entirely new way of thinking and approaching problems. It is never too late, and there is never a wrong time to try to learn something new.

It took me a while to realize (and I wish I had internalized this earlier on) that being confused is a reason to be excited rather than upset. Confusion means that you have encountered something new and that you have the opportunity to learn a new skill or to improve your existing understanding. Very few problems worth solving are straightforward, and being able to talk through the confusing or difficult aspects of something you’re trying to understand will only make you better at communicating and creating narratives with your data. It might feel risky to take a computer science or math class that has a reputation for being “hard,” but there is a massive reward in the knowledge you leave the class with, and also in the personal satisfaction of having expanded your skill set.