
Celina Lee, CEO and Co-Founder of Zindi – Interview Series


Celina Lee is the CEO and co-founder of Zindi, the largest professional network for data scientists in Africa.

Celina has a passion for unleashing the power of data for social good. She has a proven track record of thought leadership at the intersection of data and development and has played central roles in the launches of global platforms including the Alliance for Financial Inclusion, insight2impact, and now Zindi. Celina’s work has bridged the private and public sectors and spanned various development areas including financial inclusion, micro and small enterprise development, market system development, gender, climate change, and public health. She has lived and worked in countries throughout Asia, Latin America, and Sub-Saharan Africa.

What initially attracted you to computer science and applied mathematics?

My entire life I have enjoyed math. When I learned about the applied mathematics program it just made sense to me because I appreciate how data and math translate into real-world applications. What I like about working with data is that data has a story to tell. Data can be tremendously impactful, but only if you get it into the right person’s hands. It is magic.

What are some of the unique challenges of implementing data science and machine learning solutions in Africa?

One challenge is that datasets can be sparse. For example, if you are working on natural language processing problems in local African languages, some languages have only thousands of native speakers, and some are not even written. You don’t have the plethora of data that you do for English. But the nature of the challenge is exactly what makes the solutions even more important and impactful.

When did you initially conceive of the concept behind crowdsourcing data solutions?

I learned about Kaggle many years ago when I was in San Francisco, when it was just a start-up. The concept of having the crowd build data solutions for organizations resonated with me. But I saw a gap in that the datasets and problems were clearly sourced from large, mostly American corporations, and the participants similarly were mostly from the “developed” world. I had worked for many years in data in the international development sector. I saw an opportunity for crowd-solving problems for, and by, other regions as well.

In the first few days of launching, the platform crashed because Zindi had so many sign-ups. Were you at all surprised by how quickly this was adopted by the community?

I was surprised, but not shocked. We had clearly not anticipated the amount of traffic we would get in the first few days or else it would not have crashed! But I knew that there was a demand in the market among young African data scientists and aspiring data scientists for this kind of platform. Young people on the continent are ambitious, energetic, and innovative. They will put the work in, and they will make anything possible. So I was not shocked that an online space like Zindi immediately resonated. On Zindi they are able to connect with other like-minded people from across Africa and around the world, they can build new skills, they grow their own profiles and portfolio, and they can get jobs. Additionally, I would note that people took a lot of pride in the fact that this was an African platform hosting African datasets and problems. As one data scientist told me, on Zindi she has found a home.

DeepMind launched a competition on the platform a bit over a year ago. What was this competition?

The DeepMind competition was to develop deep learning models to identify sea turtles using the unique patterns on their faces. The geometric patterns on sea turtles’ faces are like fingerprints. But there is not a large amount of close-up, out-of-water images of sea turtle faces. We worked with Local Ocean Conservation, a local non-profit organization in Kenya that had a collection of thousands of images gathered over 10 years of working in the field of sea turtle conservation.

The importance of these AI models is that they can eliminate the need for physical tags, which can be expensive, unreliable (because they fall off or get damaged), and dangerous to the sea turtles’ health. We had over 700 participants working on this problem. The solutions are open-source, and other non-profits are currently working to develop mobile-based applications using the resulting algorithms.

What are some examples of other challenges that have been launched on the platform?

We have run over 300 challenges on the Zindi platform. These challenges range across many different industries, technical areas, and levels of complexity! What is exciting is that they are all real-world applications of AI and data science, mostly in Africa.

To name a few: using machine learning to forecast air pollution levels in Kampala, predicting the energy consumption levels of 5G networks, identifying landslides using satellite imagery, correcting irregular and faulty GPS locations for a fitness app in Egypt, identifying agriculture-related words in Luganda (a local language in Uganda) on the radio, and measuring biomass in Ivory Coast using satellite data.

The list goes on! You can check them all out here.

On average how many data scientists work on a listed problem, and how successful are companies in solving the challenges that are listed?

Usually between 500 and 1,000 data scientists, or sometimes more, will work on any given problem on the platform. This depends on the complexity of the problem and the amount of prize money on offer. We have given out a total of over $500,000 to winning data scientists in the Zindi community.

We have had a number of success stories over the years. For example, Zimnat, the largest insurance company in Zimbabwe, sourced machine learning algorithms from their Zindi competition to predict which customers were most likely to churn (stop paying and leave the system). They incorporated these models into their customer service dashboard, which enabled them to reduce customer churn by 30% that year! Zimnat also ended up hiring one of the top data scientists in Zimbabwe.

Companies own the IP from the top three solutions. Aside from the models themselves, companies really value having hundreds of intelligent people working on their problems. It is a way to test new ideas and to outsource problems that their internal teams don’t have the time or the technical capability to work on. Often what is most valuable is just having an injection of new ideas and perspectives.

Can you discuss how Zindi then connects data scientists with companies after the competition is over?

There are a total of 70,000 users (data and AI practitioners) registered on Zindi from across 190 countries in the world, and 52 out of the 54 countries in Africa. Approximately 50% of our users are in university; 85% have a university degree or are working towards one, and 28% are women. Our goal is to make AI and data science accessible to everyone.

Every month approximately 6,000 users are active on the platform. That means they are either entering and working on competitions, reading learning blogs, messaging on the discussion forums, direct messaging with friends, or applying for jobs.

Every time a data scientist enters a competition, posts on the discussion forum, or joins a team, this activity gets added to their Zindi profile. The Zindi profile becomes their live resume and their proof of work.

We help companies hire data scientists and build their talent pipeline in multiple ways. We offer companies corporate memberships to Zindi, which give them access to benefits including running competitions on Zindi, where they own the IP of the top three solutions and can hire directly from the leaderboard of their competition. They also get an account on Zindi Talent Search, which allows potential employers to search the Zindi profiles and directly identify and hire candidates based on their actual performance on different types of real-world problems, i.e. the competitions.

What is your vision for the future of Zindi?

My vision for the future is for Zindi to be recognized as the single most important pipeline of millions of undiscovered and diverse data and AI talent from around the world. Every aspiring data and AI practitioner will know that they must come to Zindi. The Zindi platform is a place where, no matter their background, they know they can build their skills, connect with mentors and peers to help them on their journey, create a profile that showcases their capabilities, and find career opportunities.

And every company will need their Zindi membership in order to stay ahead of the competition because in a few years’ time, every company will be competing on the quality of their data science and AI capabilities.

We currently make a promise to all Zindians on the platform that we will change their lives if they let us. We have already seen many young people who started on Zindi struggling to even load their CSV file, and one to two years later, after entering multiple competitions on Zindi, engaging on the discussion forums, and teaming up with different people, they land incredible jobs because of the skills and reputation they built on Zindi.

Thank you for the great interview. Readers who wish to learn more should visit Zindi.
