Speaking her Language: Using NLU to build conversational products for girls in developing countries
Project Description: Girl Effect builds products for girls in developing markets. We work in Rwanda, India, Ethiopia, South Africa and Tanzania. Our product portfolio contains magazines, television shows, music videos, websites, chatbots and social media channels, all designed to drive uptake of knowledge and change in attitude around key subject areas defined in Girl Effect’s theory of change.
We try to make products which put our users at the front and centre of the experience. In our digital products, this means allowing girls to share their feedback and guide their own user journeys, rather than be passive recipients of content. As a result, we have collected a lot of unstructured qualitative data from our users. In Big Sis, our chatbot product in South Africa, we now use this data to let girls write their sex and relationships questions in their own words. We run each question against a dictionary we have created from existing training data and provide relevant content suggestions.
The challenge we face is that the submissions we get are in multiple languages, use lots of slang, and often contain misspellings. Our dictionaries of intent have to reflect the actual language these girls use to describe specific subject matter areas, which also means that with each new language we have to begin the same process again. There are very few existing models that are able to cater for this type of engagement.
There are many use cases we would like to explore to build on the conversational experiences we are creating: continuing our subject-specific classification with a greater level of nuance and accuracy, looking at signs of change expressed by users as a result of their engagement, and providing for natural moments of 'smalltalk' conversation which help users feel comfortable and guide them through their experience.
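As an illustration only, the idea of running a question against a dictionary of intent while tolerating misspellings could be sketched as below. The topics, terms and the `suggest_topics` function are hypothetical and are not taken from the Big Sis product; fuzzy matching with Python's standard `difflib` module is an assumption about one possible way to handle spelling variants.

```python
import difflib
import re

# Hypothetical dictionary of intent: topic -> terms girls might actually use,
# including slang. A real dictionary would be built from real submissions.
INTENT_DICTIONARY = {
    "contraception": ["condom", "pill", "contraception", "protection"],
    "hiv_sti": ["hiv", "sti", "infection", "testing"],
    "relationships": ["boyfriend", "bae", "relationship", "breakup"],
}

# Reverse index so each known term points back to its topic.
TERM_TO_TOPIC = {
    term: topic
    for topic, terms in INTENT_DICTIONARY.items()
    for term in terms
}


def suggest_topics(message, cutoff=0.8):
    """Return topics matched by a message, most frequently matched first."""
    tokens = re.findall(r"[a-z]+", message.lower())
    scores = {}
    for token in tokens:
        # Fuzzy lookup: the closest known term above the similarity cutoff.
        match = difflib.get_close_matches(
            token, list(TERM_TO_TOPIC), n=1, cutoff=cutoff
        )
        if match:
            topic = TERM_TO_TOPIC[match[0]]
            scores[topic] = scores.get(topic, 0) + 1
    # Sort by descending score, then alphabetically for a stable order.
    return sorted(scores, key=lambda topic: (-scores[topic], topic))


print(suggest_topics("my boyfrend wont use a condon"))
```

A sketch like this also shows why each new language means starting again: the term lists are language-specific, and the similarity cutoff has to be tuned against the slang and spelling habits of each group of users.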
The interview is with Shravan Yadav, the project lead for Girl Effect.
Could you tell me a little bit about yourself and your professional background?
I have a total of 10 years of experience in the technology sector, where I’ve developed web-based and Windows-based applications along with building ETL and data capabilities in organisations. I joined Girl Effect in May 2020, and in my position I am responsible for data integration from various sources and for visualisation with analytics to track and showcase the impact of our programs - basically taking care of the data warehouse, processing and visualisation capabilities for operations. As part of our product team, my role is to help define our technology strategy, choosing the right technologies and channels to achieve our impact goals, so I get very excited about technology and what we can achieve.
What motivated you to initiate/propose this project?
At Girl Effect we started with the idea of providing a private space to get trusted and non-judgemental advice about sex and relationships - building girls’ knowledge and confidence around subjects like sex, STIs and HIV prevention. We have launched AI-powered chatbots available on Messenger, WhatsApp etc - channels that are already used by girls.
Girls are communicating a lot with our chatbot, and we’ve received a million submissions from users, ranging from user feedback and user questions to casual contextual conversations. There are many responses that our chatbot fails to understand, because we have trained it for basic chats but not for more advanced ones. This leads to unsatisfactory experiences for our users and, in turn, to a loss of trust.
Hence, we decided to propose a project that uses natural language processing to improve the chatbot’s understanding of user input so that it can respond correctly.
Since you are a multi-semester Discovery partner, could you describe your project in phases and what it has been building up to? Give us an idea of the impact it could potentially have.
The first step was to understand the context of the data we have, in light of what Girl Effect does as an organisation. We started working with the students on using different data science approaches to understand the data - there is so much of it that even the machines take a long time to read and process it.
Then they focused on cleaning the data extensively (removing certain special characters, emojis, or words not found in the English language) and then they started seeing patterns in the data - groups of words that can help us differentiate between certain categories of user input.
Currently they are identifying and building subject areas and categories that can then be fed into the machine learning model. That model will be integrated into our chatbot - thus, when a user submits a response, the chatbot can provide suggestions, answers and some service linkage for the user.
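The cleaning step described above might look roughly like the following sketch. It is not the students' actual pipeline: the `clean_message` function and the tiny `ENGLISH_WORDS` set are hypothetical stand-ins, and a real project would load a full word list instead.

```python
import re

# Hypothetical stand-in for a real English word list (assumption for this sketch).
ENGLISH_WORDS = {
    "i", "need", "help", "with", "my", "relationship", "is", "it", "safe",
}


def clean_message(text, vocabulary=ENGLISH_WORDS):
    """Lowercase a message, strip non-letter characters, keep known words."""
    text = text.lower()
    # Replace anything that is not a-z, whitespace or an apostrophe with a
    # space; this drops emojis, digits and punctuation in one pass.
    text = re.sub(r"[^a-z\s']", " ", text)
    tokens = [token.strip("'") for token in text.split()]
    # Keep only tokens found in the vocabulary.
    return [token for token in tokens if token in vocabulary]


print(clean_message("I need HELP!!! 😊 with my relationshp & relationship"))
```

Note that a strict vocabulary filter like this one also discards slang and misspelled words such as "relationshp", so how aggressively to filter is a trade-off against the language girls actually use.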
Could you elaborate on certain aspects of the project that you find to be the most challenging?
Because we have a lot of data, the most difficult part was understanding it and agreeing on a common approach to a solution, since all the students have different capabilities and ways of tackling our problem. It was also challenging for the students because they often got stuck on the processing part - their machines would run out of memory or compute power.
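One common way to work around memory limits of this kind, offered here as a general sketch rather than a description of what the students did, is to stream submissions one at a time instead of loading them all at once. The `count_terms` function and the `submissions.txt` file name in the usage comment are hypothetical.

```python
import re
from collections import Counter


def count_terms(lines):
    """Count word frequencies over any iterable of text lines.

    Only one line is held in memory at a time, so the input can be a
    file object for a file far larger than the available RAM.
    """
    totals = Counter()
    for line in lines:
        totals.update(re.findall(r"[a-z']+", line.lower()))
    return totals


# Usage with a hypothetical file of one submission per line:
# with open("submissions.txt", encoding="utf-8") as handle:
#     counts = count_terms(handle)
counts = count_terms(["Is it safe", "is it normal"])
print(counts.most_common(2))
```

Because the function accepts any iterable, the same code works on a small in-memory sample while exploring and on the full data set later.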
How has your experience been with Discovery and your student researchers so far (through all the semesters you have been a Discovery partner)?
It’s been great! We were surprised by the results shared by the students given the challenges they faced. Overall, it’s a great experience - the students have been amazing at data analytics; they are engaged, know what to do and understand how it fits into the bigger picture.
What, to you, is the most rewarding part of working with students from Discovery? What is the greatest challenge?
The most rewarding part is the journey itself; as Steve Jobs said, the journey is the reward. When we started working on this project, we discussed within our organisation that it was a lot of data, but the way the students have approached it and shared their results has been marvelous. In the process, they have given us insights about the data and shown us different approaches we can use to work with it. These insights are being shared organisation-wide and across various areas.
The greatest challenge was finding a common time across different time zones (India, US and UK).
How has Discovery helped accelerate your project? What is it like for you being a Discovery Partner?
It helped a lot because this project is not a one-person task. It would have required a lot of effort to build a team and start the work - the program saved us a lot of time because we got a team fast and they started work immediately. It is also great to interact with young minds and see how they engage with the data and use data science techniques to achieve a common goal.