Deborah D. Kanubala is tackling one of AI's most overlooked blind spots: treatment-level discrimination. Her research brings causal reasoning into algorithmic fairness, helping expose and fix how AI systems treat people unequally not just in their outcomes, but in the process that produces them.
1. What inspired you to focus your research on fairness and causality in AI, and what specific problem of algorithmic discrimination were you most determined to solve?
Deborah D. Kanubala: My first real experience with algorithmic discrimination happened during my second master’s research, which focused on financial inclusion. I was working with financial inclusion data from five East African countries. The idea of the project was to train a machine learning model to predict whether an individual was financially included; in our case, we defined financial inclusion by whether an individual had access to a bank account. The models performed quite well overall, but we decided to go a step further and evaluate their performance across different demographic groups, in this case gender. That’s when we noticed significant performance differences: the models’ performance varied greatly between groups. This finding raised important questions for me: Why was there such a disparity? Was it due to the data? The model? Or something deeper? That moment sparked my curiosity about algorithmic fairness and pushed me to explore not just what models predict, but how and for whom they work. Since then, I have been interested in better understanding and addressing the root causes of these disparities, especially in systems that affect people’s access to opportunities.
2. How has your research at Saarland University shaped your approach to moving beyond simple yes/no predictions to address complex, layered decisions?
Deborah D. Kanubala: In machine learning, we often work with historical data, which includes both features of the individuals and some past decisions. Based on that, we build models aimed at making accurate predictions to support future decisions. Traditionally, the main goal has been accuracy and performance. More recently, there’s also been a push to ensure these systems make fair predictions. But what we often overlook is the bigger picture. My research at Saarland University has helped me realize that it’s not always just about correcting unfairness at the final prediction stage, which is often binary, like a yes or no decision (e.g., whether or not to grant a loan). Instead, it’s just as important to step back and examine the entire decision-making pipeline. This is because historical data have been shaped by multiple, layered decisions over time. If the past decisions embedded in the data were biased, fixing just the last step won’t be enough to ensure fairness. Through my research at Saarland, I have learned that we need to understand and intervene at every point where a decision was made. That means thinking beyond prediction and considering causality, policy choices, and system design as part of the fairness conversation.
3. What are the biggest challenges in developing practical metrics to detect and reduce treatment discrimination in real-world AI systems?
Deborah D. Kanubala: Sometimes, developing fairness evaluation metrics is the easier part; the bigger challenge often lies in testing and validating them in real-world settings. One of the major bottlenecks I have faced is accessing high-quality, relevant data.
For example, in our recent studies of treatment decisions in loan approval, we reviewed many datasets, but in the end only four of them were suitable for our analysis. The rest were either incomplete or missing critical variables, such as interest rates or detailed demographic information, that are necessary to evaluate treatment discrimination properly. This lack of detailed, transparent, and well-documented datasets makes it hard to assess whether a metric truly captures unfair treatment. It also limits how generalizable the findings are. So for me, the biggest challenge isn’t just designing the right metric. It is finding or building the kind of rich, representative datasets needed to apply it meaningfully in practice.
4. Your latest scientific paper is available on arXiv. What were some of the key moments or findings in that research that you are most proud of?
Deborah D. Kanubala: One key moment in the research was realizing how treatment decisions, like the loan amount or duration offered, can significantly influence the final outcomes, such as whether someone is able to repay the loan or not. It sounds obvious, but in many fairness studies, we tend to focus only on the final decision (e.g., loan approval) and ignore how intermediate decisions shape people’s real-world outcomes. Another part I am proud of is proposing a method that could actually make all stakeholders better off, not just the bank but also the applicants. That was an exciting result because it showed that fairness doesn’t have to come at the cost of utility. We can design systems that align different objectives meaningfully. This work emphasized how important it is to account for stakeholder goals and downstream effects when designing decision-making systems. It pushed me to think more holistically about fairness, not just as an abstract metric, but as something that must reflect the complexity of real-world trade-offs.
5. Can you walk us through how causal frameworks can make hidden discrimination in AI systems visible, especially in cases of non-binary treatment?
Deborah D. Kanubala: Many people may have heard the phrase “correlation does not imply causation.” That’s exactly where causal frameworks come in. They help us move beyond surface-level patterns to understand the actual causes of outcomes. In the context of algorithmic discrimination, causal frameworks allow us to dig deeper and separate true discriminatory effects from spurious correlations or confounding variables. Instead of just observing that one group receives worse outcomes than another, we can ask: Would this outcome have changed if this person had been treated differently, say, if their gender or identity were different? This becomes especially important in non-binary treatment settings, where people receive varying levels of treatment, like different loan amounts or durations, not just a yes or no. In these cases, we can use counterfactual reasoning. We begin by modelling the data using a structural causal model, which represents how variables interact in the real world. This allows us to ask counterfactual questions like: What loan amount would this applicant have received if they had been a different gender, but everything else remained the same?
By comparing the actual treatment to these counterfactuals, we can uncover hidden or structural discrimination that wouldn’t be visible from just looking at predictions or average outcomes.
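To make the counterfactual reasoning she describes a little more concrete, here is a minimal, illustrative Python sketch of the abduction-action-prediction recipe on a hand-written toy structural causal model. The variables, linear equations, and coefficients are assumptions invented for illustration; they are not the model or data from Kanubala’s paper.

```python
# A minimal sketch of counterfactual reasoning in a toy structural causal
# model (SCM). All variable names, coefficients, and functional forms are
# illustrative assumptions, not the model from the paper.
import numpy as np

rng = np.random.default_rng(0)

# Assumed toy SCM:
#   gender   := exogenous (0 or 1)
#   income   := f_income(gender, u_income)
#   loan_amt := f_loan(gender, income, u_loan)   <- non-binary treatment
def f_income(gender, u_income):
    return 30_000 + 5_000 * gender + u_income

def f_loan(gender, income, u_loan):
    return 0.2 * income - 2_000 * gender + u_loan

# 1) Observe one individual in the factual world.
gender = 0
u_income = rng.normal(0, 2_000)   # exogenous noise terms
u_loan = rng.normal(0, 500)
income = f_income(gender, u_income)
loan_factual = f_loan(gender, income, u_loan)

# 2) Counterfactual: keep the same exogenous noise (abduction), flip the
#    sensitive attribute (action), re-run the structural equations (prediction).
gender_cf = 1
income_cf = f_income(gender_cf, u_income)
loan_cf = f_loan(gender_cf, income_cf, u_loan)

# 3) The gap between factual and counterfactual treatment is the kind of
#    quantity a counterfactual fairness analysis inspects.
print(f"factual loan amount:        {loan_factual:,.0f}")
print(f"counterfactual loan amount: {loan_cf:,.0f}")
print(f"treatment gap:              {loan_cf - loan_factual:,.0f}")
```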
6. What sets your approach to fairness apart from other bias-detection methods in machine learning that rely on purely statistical measures?
Deborah D. Kanubala: What sets our approach apart is that it goes beyond statistical correlations. We focus on identifying the actual causes of discrimination, not just differences in outcomes. Statistical methods might detect disparities, but they can’t tell whether those differences are due to unfair treatment or something else, like a confounding variable. Using our approach, we can trace whether sensitive attributes like gender or race actually influenced how people were treated, and whether that treatment led to worse outcomes. This helps us avoid misleading conclusions and design fairer systems based on true causal effects.
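Her point about confounders can be illustrated with a small simulation. In the assumed toy setup below, group membership has no direct effect on the approval outcome at all, yet a purely statistical comparison still shows a sizeable gap; the data-generating process, thresholds, and variable names are invented purely for illustration.

```python
# Toy simulation: a raw statistical disparity can appear even when the
# sensitive attribute has no direct effect on the outcome, because a third
# variable (here, income) drives both group membership and the outcome.
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

income = rng.normal(50_000, 10_000, n)
# Group membership correlates with income, but income is the only cause of
# the outcome in this toy setup.
group = (income + rng.normal(0, 5_000, n) > 52_000).astype(int)
approved = (income + rng.normal(0, 8_000, n) > 55_000).astype(int)

# Purely statistical view: a large approval-rate disparity between groups.
gap = approved[group == 1].mean() - approved[group == 0].mean()
print(f"raw approval gap: {gap:.2f}")

# Adjusting for the confounder: within narrow income bands the gap shrinks
# toward zero, because group membership has no direct causal effect here.
edges = np.quantile(income, np.linspace(0, 1, 11)[1:-1])
bins = np.digitize(income, edges)
within = []
for b in np.unique(bins):
    m = bins == b
    g1 = approved[m & (group == 1)]
    g0 = approved[m & (group == 0)]
    if g1.size and g0.size:
        within.append(g1.mean() - g0.mean())
print(f"mean within-income-band gap: {np.mean(within):.2f}")
```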
7. How can AI developers use your work to build fairer and more transparent tools for high-stakes decisions in banking, justice, and healthcare?
Deborah D. Kanubala: For AI developers and engineers, we have made our codebase publicly available here, along with clear instructions in the README on how to run the experiments. The paper itself is also accessible here. With the paper and code, I believe we provide everything needed to replicate our method or apply it in other domains. To extend it to another domain, the most important step is to identify the non-binary treatment decisions involved.
8. Can you share a specific example that illustrates how your metrics could identify and help mitigate bias in a process like credit scoring or hiring?
Deborah D. Kanubala: Sure. In loan approval, treatment decisions like loan amount or duration can deeply affect outcomes (whether a loan is paid back or not). If these treatment decisions were assigned unfairly in the past, then models trained on such data will likely reproduce those historical biases. In our work, we use a causal framework (a causal normalizing flow) to detect such disparities in treatment decisions and then propose a pre-processing method to correct them. For example, if women received consistently lower loan amounts, we simulate what would have happened if they had been treated like men with similar profiles. This gives us a new, treatment-fair dataset that can be used to train fairer models, ones that don’t just repeat past discrimination but aim for better, more equitable outcomes.
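As a rough illustration of this pre-processing idea, the sketch below replaces the observed loan amounts of the disadvantaged group with the amounts a comparable applicant from the advantaged group would have received. A plain linear regression stands in for the causal normalizing flow used in the paper, and the column names and simulated data are invented for illustration; it is a simplified sketch, not the authors’ implementation.

```python
# Simplified pre-processing sketch: impute a "treatment-fair" loan amount
# for the disadvantaged group using a model fit on the advantaged group.
# A linear regression is a stand-in for the learned causal model; a full
# causal approach would also preserve each individual's noise term.
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(2)
n = 5_000

# Toy historical data: women (gender=1) receive systematically lower loan
# amounts than men with the same income.
income = rng.normal(50_000, 10_000, n)
gender = rng.integers(0, 2, n)
loan_amount = 0.3 * income - 4_000 * gender + rng.normal(0, 1_000, n)
df = pd.DataFrame({"income": income, "gender": gender,
                   "loan_amount": loan_amount})

# Fit the treatment model on the advantaged group only, then use it to
# impute fair loan amounts for the disadvantaged group.
men = df[df.gender == 0]
model = LinearRegression().fit(men[["income"]], men["loan_amount"])

df["loan_amount_fair"] = df["loan_amount"]
women = df.gender == 1
df.loc[women, "loan_amount_fair"] = model.predict(df.loc[women, ["income"]])

# Group means converge after the correction; downstream models can then be
# trained on the treatment-fair column instead of the biased one.
print(df.groupby("gender")[["loan_amount", "loan_amount_fair"]].mean().round(0))
```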
9. What challenges have you faced in a field that requires both deep technical expertise and a nuanced understanding of societal issues, and how have you navigated them?
Deborah D. Kanubala: That’s a great question and very timely, as it is relevant to what I am currently working on. A challenge I recently encountered came while exploring multidimensional discrimination, that is, unfair disparities that arise when multiple sensitive attributes, like race and gender, interact in complex ways. Within this, we distinguish between multiple discrimination (additive), sequential discrimination, and intersectional discrimination, which captures non-additive effects. As I reviewed ML fairness work, I noticed a concerning misalignment: the way intersectional discrimination is defined and measured technically doesn’t always reflect how it’s understood in law or social theory. In legal contexts, intersectionality is about systemic harm and lived experience, and that harm is mostly not additive. This matters because the metrics we build today could eventually be used to argue cases in court or inform public policy. If those metrics are based on narrow or incomplete definitions, the consequences could be serious, risking undermining real claims of discrimination or misguiding interventions. This experience reminded me that technical fairness metrics can’t stand alone. They must be grounded in interdisciplinary thinking, especially when they’re meant to speak to questions of justice, rights, and structural inequality.
10. Looking back, are there any strategic decisions in your research that you would approach differently now?
Deborah D. Kanubala: Collaboration, collaboration, collaboration! Looking back, I wish I had prioritized it earlier in my research journey. I have seen firsthand how impactful quality collaboration can be. Recently, I have started actively seeking out collaborations, and the insights I have gained, especially from colleagues outside of computer science, have been incredibly valuable. For example, as a member of the AI Grid Network, which aims to connect and support emerging AI researchers in Germany and across Europe, I have the opportunity to work in a micro focus group where we learn from each other. It has helped me see my work from new angles and ask better questions. If I were to start over, I would make intentional, interdisciplinary collaboration a core part of my approach from the beginning.
11. What does success look like for you and your research in the next 5–10 years? What change do you hope to see in the industry?
Deborah D. Kanubala: Success, for me, means making a real impact not just through research, but by helping others navigate and grow in their careers. I want to use my journey to support and uplift others, especially those from underrepresented backgrounds. From a research perspective, success would mean we have reached a point where the benefits of ML/AI are fully realized, while the harms, especially toward marginalized groups, are meaningfully mitigated. I want to see fairness and accountability become core to how we design and deploy these systems.
In the next 5–10 years, I definitely hope to have completed my PhD and returned to the beautiful, sunny Ghana. My goal is to contribute to the country’s development by helping train the next generation of ML engineers and scientists, sharing both technical skills and ethical foundations. More broadly, I hope the ML community evolves to the point where everyone involved in model development, not just fairness researchers, deeply understands and takes responsibility for the societal impact of their work, and actively designs systems with that in mind.
12. What does a typical day in your life as a researcher look like, and how do you stay motivated?
Deborah D. Kanubala: I usually wake up between 6 and 7 a.m., and I like to start my mornings slowly. Depending on what I need to get done that day, I might spend the early hours reading a paper, learning German, or reading my Bible. How I use the morning really depends on my to-do list, but I always try to create space to ease into the day. I typically get to the office between 8:30 and 9:00 a.m. From then on, my day is divided between coding, reading research papers, and writing down ideas related to my research. I am also involved in teaching a seminar on Causality and Ethical Machine Learning, so Tuesdays, for example, include seminar time. Wednesday mornings are usually for meetings with my supervisor, followed by reading group presentations with my research group. As for staying motivated, I draw a lot of inspiration from self-help books. I enjoy reading about mindset, growth, and purpose. But honestly, a big part of my motivation comes from my support system. I have a strong circle of family and friends who believe in me deeply. Their encouragement means everything to me. I wouldn’t be where I am without them.
Editor’s Note
Deborah’s work bridges machine learning and social impact. At Saarland University, she is building tools that go beyond yes-or-no decisions to identify how subtle biases shape access to credit, healthcare, and opportunity. This interview sheds light on her vision for fairer, more transparent AI.