As ethical considerations in AI applications gain recognition not only from ethicists and researchers but also from industry leaders, AI ethics research is moving from general definitions of fairness and bias to more in-depth analysis. The research papers published in 2019 define comprehensive terminology for communicating about ML fairness, move from general AI principles to the specific tensions that arise when implementing AI in practice, explain the reasons behind frustrating decisions made by AI algorithms, and more.
To give you an overview of the important work done in this research area last year, we have summarized 12 research papers covering different aspects of AI ethics.
If you’d like to skip around, here are the papers we featured:
- Controlling Polarization in Personalization: An Algorithmic Framework
- Learning Existing Social Conventions via Observationally Augmented Self-Play
- Actionable Auditing: Investigating the Impact of Publicly Naming Biased Performance Results of Commercial AI Products
- The Role and Limits of Principles in AI Ethics: Towards a Focus on Tensions
- Problem Formulation and Fairness
- A Framework for Understanding Unintended Consequences of Machine Learning
- Fairwashing: the Risk of Rationalization
- What’s in a Name? Reducing Bias in Bios without Access to Protected Attributes
- Lipstick on a Pig: Debiasing Methods Cover up Systematic Gender Biases in Word Embeddings But Do Not Remove Them
- Street-Level Algorithms: A Theory at the Gaps Between Policy and Decisions
- Average Individual Fairness: Algorithms, Generalization and Experiments
- Energy and Policy Considerations for Deep Learning in NLP
12 Important AI Ethics Research Papers of 2019
1. Controlling Polarization in Personalization: An Algorithmic Framework, by L. Elisa Celis, Sayash Kapoor, Farnood Salehi, Nisheeth Vishnoi
Original Abstract
Personalization is pervasive in the online space as it leads to higher efficiency for the user and higher revenue for the platform by individualizing the most relevant content for each user. However, recent studies suggest that such personalization can learn and propagate systemic biases and polarize opinions; this has led to calls for regulatory mechanisms and algorithms that are constrained to combat bias and the resulting echo-chamber effect. We propose a versatile framework that allows for the possibility to reduce polarization in personalized systems by allowing the user to constrain the distribution from which content is selected. We then present a scalable algorithm with provable guarantees that satisfies the given constraints on the types of the content that can be displayed to a user, but – subject to these constraints – will continue to learn and personalize the content in order to maximize utility. We illustrate this framework on a curated dataset of online news articles that are conservative or liberal, show that it can control polarization, and examine the trade-off between decreasing polarization and the resulting loss to revenue. We further exhibit the flexibility and scalability of our approach by framing the problem in terms of the more general diverse content selection problem and test it empirically on both a News dataset and the MovieLens dataset.
Our Summary
Social media feeds, advertising, and search results are increasingly personalized based on user preferences, which increases user engagement and platform revenue. However, personalization reinforces people's existing biases and opinions and shields them from opposing views, creating an "echo chamber" or "filter bubble" effect that fuels divisive social fragmentation. The proposed algorithm tackles this problem by placing constraints on the content that can be sampled. The experiments confirm that the approach is flexible, scalable, and effective at controlling polarization.
What’s the core idea of this paper?
- Content is often classified into groups based on various attributes. Existing algorithms use a multi-armed bandit model to select content and receive rewards (e.g. clicks or purchases).
- This can lead to over-specialization, where results are narrowed to a small subset of groups.
- The proposed Constrained-ε-Greedy algorithm constrains the probability distribution from which content is sampled at each time step. These constraints cap the total weight that can be assigned to any single group, which prevents over-specialization.
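The paper's Constrained-ε-Greedy algorithm and its regret analysis are more involved than we can show here, but the general mechanism can be sketched in a few lines: run an ε-greedy bandit and, before sampling, cap the probability mass that any content group can receive. Everything below (the arm set, click-through rates, the cap, and the crude redistribution heuristic) is our own illustration, not the authors' implementation.

```python
"""Toy ε-greedy bandit with per-group probability caps.

This is NOT the authors' Constrained-ε-Greedy algorithm; it only illustrates
the idea of constraining the sampling distribution so that no content group
receives more than a fixed share of probability mass. All names, constants,
and the redistribution heuristic are our own.
"""
import numpy as np

rng = np.random.default_rng(0)

n_arms = 6
groups = np.array([0, 0, 0, 1, 1, 2])                 # each arm (content item) belongs to a group
true_ctr = np.array([0.9, 0.8, 0.7, 0.3, 0.2, 0.4])   # hidden click-through rates
max_group_weight = 0.5                                 # no group may get more than 50% of the mass
epsilon = 0.1

def cap_groups(p, groups, cap, iters=50):
    """Crudely project a distribution so each group's total mass is (approximately) <= cap."""
    p = p.copy()
    for _ in range(iters):
        for g in np.unique(groups):
            mask = groups == g
            total = p[mask].sum()
            if total > cap:
                p[mask] *= cap / total   # scale the offending group down
        p /= p.sum()                      # renormalize; iterating makes the caps hold approximately
    return p

counts = np.zeros(n_arms)
rewards = np.zeros(n_arms)

for t in range(5000):
    means = rewards / np.maximum(counts, 1)
    p = np.full(n_arms, epsilon / n_arms)   # exploration mass
    p[np.argmax(means)] += 1 - epsilon      # exploitation mass on the current best arm
    p = cap_groups(p, groups, max_group_weight)
    arm = rng.choice(n_arms, p=p)
    counts[arm] += 1
    rewards[arm] += rng.random() < true_ctr[arm]

for g in np.unique(groups):
    share = counts[groups == g].sum() / counts.sum()
    print(f"group {g}: share of impressions = {share:.2f}")
```

With the cap in place, the dominant group's share of impressions should hover around the cap rather than absorbing nearly all of the traffic, while the bandit still personalizes within the allowed mass.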
What’s the key achievement?
- The researchers’ modifications of the bandit algorithm improve the regret bound (how far the algorithm’s rewards fall short of the theoretical optimum) relative to the state of the art.
- The algorithm is scalable and converges quickly to the theoretical optimum, even for the tightest constraints on the arm values selected.
- This optimum is within a factor of 2 of the unconstrained version.
What does the AI community think?
- The paper received the Best Paper award at ACM FAT* 2019, one of the key conferences in AI ethics.
What are future research areas?
- Testing the algorithm in the field and measuring user satisfaction given diversified feeds.
- Applying the proposed approach to content which changes over time.
What are possible business applications?
- This approach could be used to satisfy corporate social responsibility requirements, by reducing bias and unfairness and dampening the socially divisive echo-chamber effect.
2. Learning Existing Social Conventions via Observationally Augmented Self-Play, by Adam Lerer and Alexander Peysakhovich
Original Abstract
In order for artificial agents to coordinate effectively with people, they must act consistently with existing conventions (e.g. how to navigate in traffic, which language to speak, or how to coordinate with teammates). A group’s conventions can be viewed as a choice of equilibrium in a coordination game. We consider the problem of an agent learning a policy for a coordination game in a simulated environment and then using this policy when it enters an existing group. When there are multiple possible conventions we show that learning a policy via multi-agent reinforcement learning (MARL) is likely to find policies which achieve high payoffs at training time but fail to coordinate with the real group into which the agent enters. We assume access to a small number of samples of behavior from the true convention and show that we can augment the MARL objective to help it find policies consistent with the real group’s convention. In three environments from the literature – traffic, communication, and team coordination – we observe that augmenting MARL with a small amount of imitation learning greatly increases the probability that the strategy found by MARL fits well with the existing social convention. We show that this works even in an environment where standard training methods very rarely find the true convention of the agent’s partners.
Our Summary
The Facebook AI Research team addresses the problem of AI agents acting in line with existing conventions. Learning a policy via multi-agent reinforcement learning (MARL) alone results in agents that achieve high payoffs at training time but fail to coordinate with the real group. The researchers suggest solving this problem by augmenting the MARL objective with a small sample of behavior observed from the group. Experiments in three test settings (traffic, communication, and team coordination) demonstrate that this approach greatly increases the probability that the agent finds a strategy that fits the existing group's conventions.
What’s the core idea of this paper?
- Without any input from an existing group, a new agent will learn policies that work in isolation but do not necessarily fit with the group’s conventions.
- To solve this problem, the authors propose a novel observationally augmented self-play (OSP) method, where the agent is trained with a joint MARL and behavioral cloning objective. In particular, the researchers suggest providing the agent with a small number of observations of existing social behavior (i.e., samples of (state, action) pairs from the test environment).
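The paper frames OSP as self-play augmented with an imitation term on a small sample of observed behavior. Below is a rough sketch of what such a joint objective can look like in PyTorch; the REINFORCE-style policy-gradient term, the cloning weight, and all data shapes are our assumptions rather than the authors' exact formulation.

```python
"""Sketch of a joint MARL + behavioral-cloning objective in the spirit of OSP.

Illustrative only: the policy-gradient term, the cloning weight `bc_weight`,
and the data shapes are assumptions, not the paper's exact implementation.
"""
import torch
import torch.nn as nn
import torch.nn.functional as F

obs_dim, n_actions = 8, 4
policy = nn.Sequential(nn.Linear(obs_dim, 64), nn.Tanh(), nn.Linear(64, n_actions))
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-3)
bc_weight = 0.5  # how strongly to pull the policy toward the observed convention

def joint_loss(rollout_obs, rollout_actions, rollout_returns, demo_obs, demo_actions):
    # Self-play term: REINFORCE-style policy gradient on the agent's own rollouts.
    logits = policy(rollout_obs)
    log_probs = F.log_softmax(logits, dim=-1)
    chosen = log_probs.gather(1, rollout_actions.unsqueeze(1)).squeeze(1)
    rl_loss = -(chosen * rollout_returns).mean()

    # Behavioral-cloning term: match the small sample of (state, action) pairs
    # observed from the existing group, nudging training toward their convention.
    bc_loss = F.cross_entropy(policy(demo_obs), demo_actions)
    return rl_loss + bc_weight * bc_loss

# Dummy batch just to show the call signature.
loss = joint_loss(
    rollout_obs=torch.randn(32, obs_dim),
    rollout_actions=torch.randint(0, n_actions, (32,)),
    rollout_returns=torch.randn(32),
    demo_obs=torch.randn(8, obs_dim),
    demo_actions=torch.randint(0, n_actions, (8,)),
)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```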
What’s the key achievement?
- The experiments on several multi-agent situations with multiple conventions (a traffic game, a particle environment combining navigation and communication, and a Stag Hunt game) show that OSP can learn relevant conventions with a small amount of observational data.
- Moreover, with this method, the agent can learn conventions that are very unlikely to be learned using MARL alone.
What does the AI community think?
- The paper was awarded the Best Paper Award at AAAI-AIES 2019, one of the leading conferences in the AI Ethics research area.
What are future research areas?
- Exploring alternative algorithms for constructing agents that can learn social conventions.
- Investigating the possibility of fine-tuning the OSP training strategies during test time.
- Considering problems where agents have incentives that are partly misaligned, and thus need to coordinate on a convention in addition to solving the social dilemma.
- Extending the work into more complex environments, including interaction with humans.
What are possible business applications?
- This work is a stepping-stone towards developing AI agents that can teach themselves to cooperate with humans. This has positive implications for chatbots, customer support agents and many other AI applications.
3. Actionable Auditing: Investigating the Impact of Publicly Naming Biased Performance Results of Commercial AI Products, by Inioluwa Deborah Raji and Joy Buolamwini
Original Abstract
Although algorithmic auditing has emerged as a key strategy to expose systematic biases embedded in software platforms, we struggle to understand the real-world impact of these audits, as scholarship on the impact of algorithmic audits on increasing algorithmic fairness and transparency in commercial systems is nascent. To analyze the impact of publicly naming and disclosing performance results of biased AI systems, we investigate the commercial impact of Gender Shades, the first algorithmic audit of gender and skin type performance disparities in commercial facial analysis models. This paper 1) outlines the audit design and structured disclosure procedure used in the Gender Shades study, 2) presents new performance metrics from targeted companies IBM, Microsoft and Megvii (Face++) on the Pilot Parliaments Benchmark (PPB) as of August 2018, 3) provides performance results on PPB by non-target companies Amazon and Kairos and, 4) explores differences in company responses as shared through corporate communications that contextualize differences in performance on PPB. Within 7 months of the original audit, we find that all three targets released new API versions. All targets reduced accuracy disparities between males and females and darker and lighter-skinned subgroups, with the most significant update occurring for the darker-skinned female subgroup, that underwent a 17.7% – 30.4% reduction in error between audit periods. Minimizing these disparities led to a 5.72% to 8.3% reduction in overall error on the Pilot Parliaments Benchmark (PPB) for target corporation APIs. The overall performance of non-targets Amazon and Kairos lags significantly behind that of the targets, with error rates of 8.66% and 6.60% overall, and error rates of 31.37% and 22.50% for the darker female subgroup, respectively.
Our Summary
In this paper, Raji and Buolamwini investigate how the publicly available performance evaluations for commercial AI products impact the performance of the respective machine learning systems in future releases. In particular, they review how the Gender Shades study (Buolamwini and Gebru, 2018) affected the performance of the targeted facial analysis systems (Face++, Microsoft, IBM) as well as systems not covered in the study (non-targeted systems: Amazon and Kairos). The researchers observed a significant reduction in overall error for the targeted systems, especially with regard to the darker-skinned female subgroup, which is the most challenging for existing face analysis systems. The results of this research demonstrate that, if prioritized, the disparities in performance between different subgroups can be significantly minimized in a reasonable amount of time.
What’s the core idea of this paper?
- Evaluating updated API releases of three companies targeted in the Gender Shades study by strictly following the methodology of that study.
- Investigating differences in the pre-audit and post-audit performance of the systems, both overall and across subgroups (a minimal sketch of this disaggregated evaluation follows the list).
- Evaluating the performance of two non-targeted face analysis systems to investigate if publicly available auditing results have an impact on similar AI systems not included in the audit.
- Exploring differences in company responses to the audit results.
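To make the disaggregated evaluation concrete, here is a minimal sketch of the kind of subgroup error-rate computation an audit like this relies on. The dataframe and column names are invented placeholders; the actual studies evaluate commercial APIs on the Pilot Parliaments Benchmark.

```python
"""Minimal sketch of a disaggregated audit: error rates per intersectional subgroup.

The dataframe and column names below are invented for illustration; Gender Shades
and Actionable Auditing evaluate commercial APIs on the Pilot Parliaments Benchmark.
"""
import pandas as pd

# Each row: one benchmark image with its subgroup labels and whether the API was correct.
results = pd.DataFrame({
    "skin_type": ["darker", "darker", "lighter", "lighter", "darker", "lighter"],
    "gender":    ["female", "male",   "female",  "male",    "female", "male"],
    "correct":   [False,     True,     True,      True,      True,     True],
})

overall_error = 1 - results["correct"].mean()
by_subgroup = 1 - results.groupby(["skin_type", "gender"])["correct"].mean()

print(f"overall error: {overall_error:.2%}")
print(by_subgroup.rename("error rate"))
# The audit's key quantity is the gap between the best- and worst-performing subgroups:
print(f"max subgroup gap: {by_subgroup.max() - by_subgroup.min():.2%}")
```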
What’s the key achievement?
- Demonstrating that, when reacting to publicly available performance evaluations, companies were able to significantly reduce the error rates of their models, especially for the most challenging intersectional subgroup of darker-skinned females.
- Revealing that API updates have been mostly data-driven, which implies that significant improvements have been achieved through training data diversification.
- Showing that the performance of the non-targeted companies is closer to the pre-audit performance of the targeted companies than to their post-audit performance, which suggests that systems not named in the audit have likely not been revised since the publication of the results.
What does the AI community think?
- The paper received the Best Student Paper award at AAAI-AIES 2019, one of the leading conferences in the AI Ethics research area.
What are future research areas?
- Considering the confidence scores of the face analysis systems, to get a complete view of their real-world performance.
- Evaluating these systems on another balanced dataset or using metrics such as balanced error to account for class imbalances in existing benchmarks.
What are possible business applications?
- This case study demonstrates how an external audit can be beneficial to the performance of commercial AI products: the targeted companies were able not only to significantly reduce the error gap between the best-performing and the worst-performing subgroups but also to improve the overall performance of the system with improvements observed across all subgroups.
4. The Role and Limits of Principles in AI Ethics: Towards a Focus on Tensions, by Jess Whittlestone, Rune Nyrup, Anna Alexandrova, and Stephen Cave
Original Abstract
The last few years have seen a proliferation of principles for AI ethics. There is substantial overlap between different sets of principles, with widespread agreement that AI should be used for the common good, should not be used to harm people or undermine their rights, and should respect widely held values such as fairness, privacy, and autonomy. While articulating and agreeing on principles is important, it is only a starting point. Drawing on comparisons with the field of bioethics, we highlight some of the limitations of principles: in particular, they are often too broad and high-level to guide ethics in practice. We suggest that an important next step for the field of AI ethics is to focus on exploring the tensions that inevitably arise as we try to implement principles in practice. By explicitly recognizing these tensions we can begin to make decisions about how they should be resolved in specific cases, and develop frameworks and guidelines for AI ethics that are rigorous and practically relevant. We discuss some different specific ways that tensions arise in AI ethics, and what processes might be needed to resolve them.
Our Summary
The research team from the University of Cambridge points out that AI ethics is currently based on principles that are quite broad and unspecific. It is widely agreed that AI should be used for the common good, shouldn't harm people, and should respect their privacy. But how do you implement this in practice? To answer this question, the researchers recommend focusing on the tensions that arise when applying AI in the real world and discussing how these tensions should be resolved in specific cases. The paper lists four key tensions and provides some general guidelines for resolving them.
What’s the core idea of this paper?
- There is broad agreement that AI technologies should follow certain ethical principles: AI should not be used to harm people or undermine their rights, and it should respect values such as fairness, privacy, and autonomy.
- However, these principles have a number of significant limitations:
- They are too general.
- They often come into conflict in practice.
- Different groups may understand the principles differently.
- The authors suggest that focusing on tensions instead of principles brings several important advantages:
- bridging the gap between principles and practice;
- acknowledging differences in values;
- highlighting areas where new solutions are needed;
- identifying ambiguities and knowledge gaps.
- Next, the research team introduces four tensions that they find central to the current applications of AI:
- Using data for service improvement and efficiency vs. respecting the privacy and autonomy of individuals.
- Increasing the accuracy of decisions and predictions vs. ensuring fairness and equal treatment.
- Enjoying the benefits of personalization in the digital world vs. enhancing solidarity and citizenship.
- Making people’s lives more convenient with automation vs. promoting self-actualization and dignity.
What’s the key achievement?
- Explaining why focusing on tensions is important for further development of the AI ethics area.
- Introducing four key tensions in applying AI.
- Providing general guidelines for resolving the tensions.
What does the AI community think?
- The paper was presented at AAAI-AIES 2019, one of the leading conferences in the AI Ethics research area.
What are future research areas?
- Identifying further tensions.
- Exploring ways to address the existing tensions between AI goals and values.
What are possible business applications?
- The suggested approach may help in guiding the ethical application of AI systems in the real world.
5. Problem Formulation and Fairness, by Samir Passi and Solon Barocas
Original Abstract
Formulating data science problems is an uncertain and difficult process. It requires various forms of discretionary work to translate high-level objectives or strategic goals into tractable problems, necessitating, among other things, the identification of appropriate target variables and proxies. While these choices are rarely self-evident, normative assessments of data science projects often take them for granted, even though different translations can raise profoundly different ethical concerns. Whether we consider a data science project fair often has as much to do with the formulation of the problem as any property of the resulting model. Building on six months of ethnographic fieldwork with a corporate data science team – and channeling ideas from sociology and history of science, critical data studies, and early writing on knowledge discovery in databases – we describe the complex set of actors and activities involved in problem formulation. Our research demonstrates that the specification and operationalization of the problem are always negotiated and elastic, and rarely worked out with explicit normative considerations in mind. In so doing, we show that careful accounts of everyday data science work can help us better understand how and why data science problems are posed in certain ways – and why specific formulations prevail in practice, even in the face of what might seem like normatively preferable alternatives. We conclude by discussing the implications of our findings, arguing that effective normative interventions will require attending to the practical work of problem formulation.
Our Summary
The researchers from Cornell University investigate the issue of problem formulation in data science and its implications for the fairness of data science projects. Specifically, they point out that translating a high-level business objective into a tractable problem defined by a target variable is an uncertain and challenging process. Problem formulation is driven by numerous factors, including the available data as well as financial and time constraints, while ethical considerations are rarely addressed. Thus, to ensure greater fairness in data science projects, it is important to investigate in depth the iterative work of problem formulation. The research team illustrates its claims with a case study from a multi-billion-dollar US-based e-commerce organization.
What’s the core idea of this paper?
- Problem formulation in data science projects is a negotiated translation. Specifically, translation between high-level goals and tractable machine learning problems does not have a given outcome – it is elastic.
- Different problem formulations give rise to different ethical concerns.
- Translating strategic goals into tractable problems is always imperfect, since it requires assumptions about the world being modeled. It is therefore important to consider the consequences of different translations.
- An in-depth analysis of the problem formulation process may help us understand why data science problems are posed in certain ways even when more ethical alternatives seem available.
What’s the key achievement?
- Demonstrating the elasticity of the problem formulation process and its importance for the fairness of data science projects.
- Illustrating the uncertainty and difficulty of the problem formulation process with a case study.
What does the AI community think?
- The paper was presented at ACM FAT* 2019, one of the key conferences in AI ethics.
What are future research areas?
- The authors of this paper suggest the following questions for investigation and intervention:
- Which goals are set and why?
- How are goals transformed into tractable problems?
- How and why do certain problem formulations succeed?
What are possible business applications?
- Following the findings of this paper, companies may avoid the implementation of data science projects with undesired consequences by discussing the ethical implications of their systems at the stage of problem formulation.
6. A Framework for Understanding Unintended Consequences of Machine Learning, by Harini Suresh and John V. Guttag
Original Abstract
As machine learning increasingly affects people and society, it is important that we strive for a comprehensive and unified understanding of how and why unwanted consequences arise. For instance, downstream harms to particular groups are often blamed on “biased data,” but this concept encompasses too many issues to be useful in developing solutions. In this paper, we provide a framework that partitions sources of downstream harm in machine learning into five distinct categories spanning the data generation and machine learning pipeline. We describe how these issues arise, how they are relevant to particular applications, and how they motivate different solutions. In doing so, we aim to facilitate the development of solutions that stem from an understanding of application-specific populations and data generation processes, rather than relying on general claims about what may or may not be “fair.”
Our Summary
Machine learning applications often produce unwanted consequences that people commonly attribute to "biased data". The MIT research team draws attention to the fact that this concept encompasses many distinct issues. Moreover, data is not the only source of unfair outcomes: the ML pipeline also involves choices and practices that can lead to unwanted effects. The researchers therefore introduce a framework that partitions sources of downstream harm into five distinct categories. This framework provides a comprehensive terminology for communicating about ML fairness and facilitates solutions that come from a clear understanding of the source problem instead of relying on general terms like "fair" or "biased".
What’s the core idea of this paper?
- There are five sources of bias in machine learning:
- Historical bias arises when the world, as it is, is biased (e.g., male-dominated image search results for the word "CEO" simply reflect that 95% of Fortune 500 CEOs are men).
- Representation bias occurs when some groups of the population are underrepresented in the training dataset. For example, models trained on ImageNet, where 45% of images come from the US and only 1% of images represent China, perform poorly on images depicting Asia.
- Measurement bias arises when there are issues with choosing or measuring the particular features of interest. The issues may come from varying granularity or quality of data across groups, or from oversimplification of the classification task. For example, student success is often measured by GPA, which ignores many other important indicators of success.
- Aggregation bias occurs when a one-size-fits-all model is used for groups that have different conditional distributions. For example, studies suggest that HbA1c levels, which are used for diagnosing diabetes, differ in a complex way across ethnicities and genders. Thus, a single model is not likely to be the best fit for predicting diabetes for every group in the population.
- Evaluation bias arises when evaluation and/or benchmark datasets are not representative of the target population. Such datasets encourage the development of models that only perform well on a subset of data. For example, facial recognition benchmarks used to have a very small fraction of images with dark-skinned female faces, which resulted in commercial facial recognition systems performing very badly on this subset of the population.
- Solutions for mitigating a bias need to be tailored to the specific source of the bias. For example, in the case of representation bias, we need to add more samples from the underrepresented group, while aggregation bias might be addressed with multi-task learning.
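To see why a one-size-fits-all model can fail (aggregation bias), the toy sketch below compares a single pooled classifier with per-group classifiers on synthetic data in which the feature-label relationship flips across groups. The data and models are fabricated for illustration and are not from the paper.

```python
"""Toy illustration of aggregation bias: one model fit to two groups whose
feature-label relationships differ, versus separate per-group models.
Synthetic data only; not from the paper."""
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 2000
group = rng.integers(0, 2, n)
x = rng.normal(size=(n, 1))
# The same feature predicts the label in opposite directions for the two groups.
y = ((x[:, 0] * np.where(group == 0, 1.0, -1.0)
      + rng.normal(scale=0.5, size=n)) > 0).astype(int)

X = np.column_stack([x, group])  # the pooled model sees group membership only as an extra feature

pooled = LogisticRegression().fit(X, y)
for g in (0, 1):
    mask = group == g
    pooled_acc = pooled.score(X[mask], y[mask])
    per_group_acc = LogisticRegression().fit(x[mask], y[mask]).score(x[mask], y[mask])
    print(f"group {g}: pooled model acc={pooled_acc:.2f}, per-group model acc={per_group_acc:.2f}")
```

The pooled linear model cannot represent the opposite relationships and performs near chance on both groups, whereas group-specific models (or a multi-task setup) capture each group's distribution.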
What’s the key achievement?
- Providing a consolidated and comprehensive terminology for understanding and communicating about ML fairness.
- Facilitating solutions that arise from a clear understanding of the source of downstream harm.
What are possible business applications?
- The introduced framework can serve as a guide for data scientists and ML engineers when designing fair ML systems.
7. Fairwashing: the Risk of Rationalization, by Ulrich Aïvodji, Hiromi Arai, Olivier Fortineau, Sébastien Gambs, Satoshi Hara, Alain Tapp
Original Abstract
Black-box explanation is the problem of explaining how a machine learning model – whose internal logic is hidden to the auditor and generally complex – produces its outcomes. Current approaches for solving this problem include model explanation, outcome explanation as well as model inspection. While these techniques can be beneficial by providing interpretability, they can be used in a negative manner to perform fairwashing, which we define as promoting the false perception that a machine learning model respects some ethical values. In particular, we demonstrate that it is possible to systematically rationalize decisions taken by an unfair black-box model using the model explanation as well as the outcome explanation approaches with a given fairness metric. Our solution, LaundryML, is based on a regularized rule list enumeration algorithm whose objective is to search for fair rule lists approximating an unfair black-box model. We empirically evaluate our rationalization technique on black-box models trained on real-world datasets and show that one can obtain rule lists with high fidelity to the black-box model while being considerably less unfair at the same time.
Our Summary
Society expects AI systems to be ethically aligned, which implies fair decisions and explainable results. In this study, the researchers point out a possible pitfall: the risk of fairwashing, where malicious decision-makers provide fake explanations to justify unfair decisions. To demonstrate that this risk is real, the authors introduce LaundryML, an algorithm that systematically generates such fake explanations. The experiments confirm that it can generate explanations that look faithful while rationalizing the unfair decisions of a black-box model.
What’s the core idea of this paper?
- There is a risk of malicious entities promoting the false perception that a machine learning model respects some ethical principles, while in reality its results are heavily biased.
- To show that this risk is not imaginary, the authors introduce LaundryML, an algorithm that systematically generates fake explanations for an unfair black-box model:
- In the first step, the algorithm generates many explanations using a model enumeration technique.
- Next, one of these explanations is selected based on a fairness metric such as demographic parity (i.e., the algorithm picks the explanation that is most faithful to the black-box model while keeping its demographic parity score within specified limits); a toy sketch of this selection step follows the list.
- The two versions of LaundryML introduced in the paper can rationalize both the model explanation and the outcome explanation.
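A toy sketch of the selection step referenced above: fit several interpretable surrogates to a black-box model's predictions, then keep the most faithful surrogate whose demographic-parity gap looks acceptable. This is not LaundryML itself (which enumerates regularized rule lists); the data, the surrogate models, and the threshold are invented for illustration.

```python
"""Toy illustration of the 'fairwashing' selection step: among several
interpretable surrogates of a black-box model, keep the most faithful one
whose demographic-parity gap looks acceptable. Not LaundryML itself;
data, models, and the threshold are invented."""
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
n = 3000
sensitive = rng.integers(0, 2, n)
X = np.column_stack([rng.normal(size=(n, 3)), sensitive])
y = (X[:, 0] + 1.5 * sensitive + rng.normal(scale=0.5, size=n) > 0).astype(int)  # unfair labels

black_box = RandomForestClassifier(random_state=0).fit(X, y)
bb_pred = black_box.predict(X)

def demographic_parity_gap(pred, sensitive):
    return abs(pred[sensitive == 1].mean() - pred[sensitive == 0].mean())

print("black-box DP gap:", round(demographic_parity_gap(bb_pred, sensitive), 3))

# Enumerate simple surrogate explanations of varying complexity,
# fit on the black box's predictions without the sensitive attribute.
candidates = [DecisionTreeClassifier(max_depth=d, random_state=0).fit(X[:, :3], bb_pred)
              for d in (1, 2, 3, 4, 5)]

best = None
for tree in candidates:
    surrogate_pred = tree.predict(X[:, :3])
    fidelity = (surrogate_pred == bb_pred).mean()             # agreement with the black box
    gap = demographic_parity_gap(surrogate_pred, sensitive)   # "fairness" of the explanation
    if gap < 0.05 and (best is None or fidelity > best[0]):   # looks fair, stays faithful
        best = (fidelity, gap, tree.get_depth())

print("selected surrogate (fidelity, DP gap, depth):", best)
```

The selected surrogate appears fair by the chosen metric while remaining reasonably faithful to a model whose actual decisions are heavily biased, which is exactly the fairwashing risk the paper warns about.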
What’s the key achievement?
- Pointing out the risk of fairwashing in machine learning.
- Providing concrete evidence for this risk by introducing an algorithm that can create faithful and yet fake explanations that hide the real unfairness of the black-box model.
What does the AI community think?
- The paper was presented at ICML 2019, one of the leading conferences in machine learning.
What are future research areas?
- Investigating the general social implications of fairwashing.
- Developing techniques that can detect fairwashing by estimating whether an explanation is likely to be a rationalization.
Where can you get implementation code?
- The implementation code of LaundryML is available on GitHub.
8. What’s in a Name? Reducing Bias in Bios without Access to Protected Attributes, by Alexey Romanov, Maria De-Arteaga, Hanna Wallach, Jennifer Chayes, Christian Borgs, Alexandra Chouldechova, Sahin Geyik, Krishnaram Kenthapadi, Anna Rumshisky, Adam Tauman Kalai
Original Abstract
There is a growing body of work that proposes methods for mitigating bias in machine learning systems. These methods typically rely on access to protected attributes such as race, gender, or age. However, this raises two significant challenges: (1) protected attributes may not be available or it may not be legal to use them, and (2) it is often desirable to simultaneously consider multiple protected attributes, as well as their intersections. In the context of mitigating bias in occupation classification, we propose a method for discouraging correlation between the predicted probability of an individual’s true occupation and a word embedding of their name. This method leverages the societal biases that are encoded in word embeddings, eliminating the need for access to protected attributes. Crucially, it only requires access to individuals’ names at training time and not at deployment time. We evaluate two variations of our proposed method using a large-scale dataset of online biographies. We find that both variations simultaneously reduce race and gender biases, with almost no reduction in the classifier’s overall true positive rate.
Our Summary
The authors introduce a novel approach to mitigating bias in online recruiting and automated hiring without access to protected attributes such as gender, age, and race. In particular, they suggest leveraging only the person’s name and then discouraging an occupation classifier from learning a correlation between the predicted probability of an individual’s occupation and a word embedding of their name. The experiments confirm the effectiveness of the proposed approach in reducing race and gender bias.
What’s the core idea of this paper?
- The traditional methods for mitigating bias in machine learning typically rely on access to protected attributes (e.g., race, age, gender). However, these attributes may not be available or may not be legal to use, even for mitigating the bias.
- To avoid reliance on protected attributes, the researchers suggest using only individuals’ names and leveraging the societal biases encoded in word embeddings.
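The sketch below shows one plausible way to express that idea as a training loss: a standard cross-entropy term plus a covariance-style penalty between the predicted probability of the true occupation and the name embedding. The exact penalty form, its weight, and all dimensions are our assumptions, not the authors' implementation.

```python
"""Sketch of the idea: penalize correlation between the predicted probability of
the true occupation and the word embedding of the person's name (used only at
training time). The exact formulation and weight are illustrative assumptions."""
import torch
import torch.nn as nn
import torch.nn.functional as F

n_occupations, bio_dim, name_dim = 10, 128, 50
classifier = nn.Linear(bio_dim, n_occupations)
penalty_weight = 1.0

def loss_fn(bio_features, true_occupation, name_embeddings):
    logits = classifier(bio_features)
    ce = F.cross_entropy(logits, true_occupation)

    # Probability assigned to the individual's true occupation.
    probs = F.softmax(logits, dim=-1)
    p_true = probs.gather(1, true_occupation.unsqueeze(1)).squeeze(1)

    # Covariance between that probability and each dimension of the name embedding;
    # driving it toward zero discourages the classifier from behaving differently
    # for names that encode gender or race information.
    p_centered = p_true - p_true.mean()
    e_centered = name_embeddings - name_embeddings.mean(dim=0, keepdim=True)
    cov = (p_centered.unsqueeze(1) * e_centered).mean(dim=0)   # one value per embedding dim
    penalty = cov.pow(2).sum()

    return ce + penalty_weight * penalty

loss = loss_fn(torch.randn(64, bio_dim),
               torch.randint(0, n_occupations, (64,)),
               torch.randn(64, name_dim))
loss.backward()
```

Note that the name embeddings enter only the training loss, which matches the paper's requirement that names are not needed at deployment time.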
What’s the key achievement?
- The proposed method is simple and powerful:
- it is applicable when protected attributes are not available;
- it eliminates the need to specify which biases are to be mitigated;
- it allows simultaneous mitigation of multiple biases, including those that relate to group intersections.
- The evaluation on several datasets demonstrates that this approach significantly reduces race and gender biases with almost no reduction in the classifier’s overall true positive rate.
What does the AI community think?
- The paper received the Best Thematic Paper award at NAACL-HLT 2019, one of the leading conferences in natural language processing.
What are future research areas?
- Experimenting with the proposed method in other languages (beyond English).
What are possible business applications?
- Even though the authors focus on mitigating biases in recruiting, the introduced approach can be applied in any domain where people's names are available at training time.
9. Lipstick on a Pig: Debiasing Methods Cover up Systematic Gender Biases in Word Embeddings But Do Not Remove Them, by Hila Gonen and Yoav Goldberg
Original Abstract
Word embeddings are widely used in NLP for a vast range of tasks. It was shown that word embeddings derived from text corpora reflect gender biases in society. This phenomenon is pervasive and consistent across different word embedding models, causing serious concern. Several recent works tackle this problem, and propose methods for significantly reducing this gender bias in word embeddings, demonstrating convincing results. However, we argue that this removal is superficial. While the bias is indeed substantially reduced according to the provided bias definition, the actual effect is mostly hiding the bias, not removing it. The gender bias information is still reflected in the distances between “gender-neutralized” words in the debiased embeddings, and can be recovered from them. We present a series of experiments to support this claim, for two debiasing methods. We conclude that existing bias removal techniques are insufficient, and should not be trusted for providing gender-neutral modeling.
Our Summary
It has been demonstrated many times that word embeddings used in NLP reflect gender biases in society. To address this problem, several research papers suggest reducing gender bias by zeroing the projection of all gender-neutral words onto a predefined gender direction. The authors of the current paper claim that such debiasing approaches only hide the bias rather than remove it. Specifically, even though word embeddings change with respect to the gender direction, they largely keep their previous similarities, so biased words are still grouped together. This claim is supported by a series of experiments.
What’s the core idea of this paper?
- Word embeddings reflect gender biases present in society.
- Existing debiasing methods rely on the same bias definition, where the gender bias of a particular word is defined by a projection of this word onto the “gender direction”. Thus, according to this definition, if a certain word embedding is equally close to male and female gender-specific words, it is not biased.
- The idea of this paper is that bias is much more profound and systematic and cannot be removed by simply zeroing the projection of a word embedding onto the “gender direction”.
- The authors demonstrate that even after debiasing word embeddings based on the above definition, most words that had a specific bias before debiasing are still grouped together, implying that the spatial geometry of word embeddings stays largely the same.
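The flavor of that clustering diagnostic is easy to reproduce: cluster the (debiased) vectors of the most male- and female-biased words and check how well the two clusters recover the original bias labels. In the placeholder sketch below, random vectors stand in for real debiased embeddings, so the alignment score is uninformative; on real debiased embeddings the paper reports high alignment, i.e., the bias is still encoded.

```python
"""Sketch of the clustering diagnostic used in the paper: cluster the vectors of
previously male- vs. female-biased words and measure how well the two clusters
recover the original bias labels. Random vectors stand in for real (debiased)
embeddings here, so the score below is meaningless."""
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Placeholder: vectors of the 500 most male-biased and 500 most female-biased
# words (bias measured in the ORIGINAL embedding space, before debiasing).
debiased_vectors = rng.normal(size=(1000, 300))
original_bias_label = np.array([0] * 500 + [1] * 500)   # 0 = male-biased, 1 = female-biased

clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(debiased_vectors)

# Cluster labels are arbitrary, so take the better of the two possible alignments.
agreement = (clusters == original_bias_label).mean()
alignment = max(agreement, 1 - agreement)
print(f"cluster/bias alignment: {alignment:.2f}  (0.5 = no residual bias signal)")
```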
What’s the key achievement?
- Demonstrating that popular debiasing methods don’t remove the gender bias from word embeddings:
- male- and female-biased words still cluster together;
- it is easy to predict the implicit gender of words based on their vectors alone.
What does the AI community think?
- The paper was presented at NAACL-HLT 2019, one of the leading conferences in natural language processing.
What are future research areas?
- Further exploring debiasing methods that would entirely eliminate the gender bias from word embeddings.
What are possible business applications?
- Teams that deploy potentially gender-biased AI systems can use the experiments from this paper to check whether their systems still encode this bias.
Where can you get implementation code?
- The code for the experiments described in the paper is available on GitHub.
10. Street-Level Algorithms: A Theory at the Gaps Between Policy and Decisions, by Ali Alkhatib and Michael Bernstein
Original Abstract
Errors and biases are earning algorithms increasingly malignant reputations in society. A central challenge is that algorithms must bridge the gap between high-level policy and on-the-ground decisions, making inferences in novel situations where the policy or training data do not readily apply. In this paper, we draw on the theory of street-level bureaucracies, how human bureaucrats such as police and judges interpret policy to make on-the-ground decisions. We present by analogy a theory of street-level algorithms, the algorithms that bridge the gaps between policy and decisions about people in a socio-technical system. We argue that unlike street-level bureaucrats, who reflexively refine their decision criteria as they reason through a novel situation, street-level algorithms at best refine their criteria only after the decision is made. This loop-and-a-half delay results in illogical decisions when handling new or extenuating circumstances. This theory suggests designs for street-level algorithms that draw on historical design patterns for street-level bureaucracies, including mechanisms for self–policing and recourse in the case of error.
Our Summary
Compared to humans, algorithmic systems seem more prone to errors that are very frustrating for the people affected. To understand why, it helps to recall that policies are usually implemented by street-level bureaucrats such as police officers and judges, who make important decisions by interpreting a given policy in both familiar and novel situations. By analogy, algorithms that directly interact with and make decisions about people can be called street-level algorithms. The Stanford research team claims that street-level algorithms make frustrating decisions more often than street-level bureaucrats because humans, when they encounter a new or marginal case, can refine their decision boundaries before making the decision, while algorithms can refine these boundaries only after the decision is made and the system has received feedback or additional training data.
What’s the core idea of this paper?
- Algorithms that directly interact with people and make decisions about them tend to be more error-prone in novel situations than humans in the same position.
- The reason is that:
- Street-level bureaucrats, like judges, teachers, or police officers, can reflexively refine their decision criteria when facing a novel or marginal case.
- Street-level algorithms are not that flexible and at best can refine decision criteria only after the decision is made and they have received some feedback or new training data.
- Thus, the designs of street-level algorithms should consider this problem and include the mechanisms for self-policing and recourse in case of a wrong decision.
What’s the key achievement?
- Introducing a valid explanation for algorithmic systems making frustrating decisions more often than humans.
- Suggesting design implications that can address the issue of algorithms being inadaptive to novel cases. These include:
- providing the user with information that helps them understand whether the system has made a mistake (e.g., if YouTube denies monetization for a video, it can show the user other videos the system considers similar to the denied one, helping the user judge whether the video was misclassified);
- self-policing (e.g., building oversight into algorithms);
- recourse and appeals in case of errors.
What does the AI community think?
- The paper received the Best Paper Award at CHI 2019, the premier conference in human-computer interaction.
What are future research areas?
- Progressing towards systems that can better consider the needs of stakeholders.
What are possible business applications?
- Designing machine learning systems that include oversight components and allow for appeals and recourse in case of wrong decisions.
11. Average Individual Fairness: Algorithms, Generalization and Experiments, by Michael Kearns, Aaron Roth, and Saeed Sharifi-Malvajerdi
Original Abstract
We propose a new family of fairness definitions for classification problems that combine some of the best properties of both statistical and individual notions of fairness. We posit not only a distribution over individuals, but also a distribution over (or collection of) classification tasks. We then ask that standard statistics (such as error or false positive/negative rates) be (approximately) equalized across individuals, where the rate is defined as an expectation over the classification tasks. Because we are no longer averaging over coarse groups (such as race or gender), this is a semantically meaningful individual-level constraint. Given a sample of individuals and classification problems, we design an oracle-efficient algorithm (i.e. one that is given access to any standard, fairness-free learning heuristic) for the fair empirical risk minimization task. We also show that given sufficiently many samples, the ERM solution generalizes in two directions: both to new individuals, and to new classification tasks, drawn from their corresponding distributions. Finally we implement our algorithm and empirically verify its effectiveness.
Our Summary
The researchers from the University of Pennsylvania combine statistical and individual notions of fairness to generate a new family of fairness definitions for classification problems. First, they assume that each individual is subject to decisions made by many classification systems. Then, they require that error rates (or false-positive or false-negative rates), defined for each individual as an average over the classification tasks, be approximately equal across all individuals. Finally, to satisfy this guarantee, they derive a new oracle-efficient algorithm for learning Average Individual Fairness, called AIF-Learn. The algorithm solves the fair empirical risk minimization task, with the solution generalizing both to new individuals and to new classification tasks. The empirical evaluation verifies the effectiveness of the introduced algorithm.
What’s the core idea of this paper?
- The authors show that existing fairness definitions can be divided into two groups:
- Statistical fairness definitions that can be easily enforced on arbitrary data distributions but do not have strong semantics.
- Individual fairness definitions that have very strong individual-level semantics but require strong realizability assumptions.
- The paper introduces an alternative definition of individual fairness that does not require assumptions to be imposed on the data generating process:
- In the suggested setting, individuals are subject to many classification tasks over a given period of time (e.g., users are shown multiple targeted ads when using a particular platform).
- This setting is modeled by assuming that in addition to the unknown distribution over individuals, there is also an unknown distribution over classification problems.
- The model aims to ensure that error rates or false-positive/negative rates, each defined as an expectation over the classification tasks, are equalized across all individuals (a minimal sketch of this quantity follows the list).
- This fairness definition is implemented with an oracle-efficient algorithm, called AIF-Learn.
- The algorithm assumes the existence of “oracles”, implemented with a heuristic that can solve weighted classification problems in the absence of fairness constraints.
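The fairness quantity itself is simple to compute once predictions for many tasks are available, as the minimal sketch below shows. Random errors stand in for real model outputs; AIF-Learn, the oracle-efficient algorithm that actually enforces the constraint, is not shown.

```python
"""Minimal sketch of the fairness quantity behind AIF-Learn: each individual's
error rate averaged over many classification tasks, and the spread of those
per-individual rates. Data is random and purely illustrative."""
import numpy as np

rng = np.random.default_rng(0)
n_individuals, n_tasks = 200, 50

# errors[i, t] = 1 if the classifier for task t errs on individual i.
errors = rng.random((n_individuals, n_tasks)) < 0.2

per_individual_error = errors.mean(axis=1)      # expectation over tasks, per individual
overall_error = per_individual_error.mean()
unfairness = per_individual_error.max() - per_individual_error.min()

print(f"overall error: {overall_error:.3f}")
print(f"per-individual error spread (max - min): {unfairness:.3f}")
# Average individual fairness asks that this spread be (approximately) zero.
```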
What’s the key achievement?
- The guarantees of the AIF-Learn algorithm hold both in-sample and out-of-sample, implying generalizability to new individuals and classification tasks.
- The empirical evaluation of the AIF-Learn algorithm demonstrates that it:
- shows strong convergence properties suggested by theory;
- outperforms random predictions in terms of both average error and individual error rates.
What does the AI community think?
- The paper was accepted for oral presentation at NeurIPS 2019, the leading conference in artificial intelligence.
What are possible business applications?
- The introduced approach can improve the fairness of AI classification systems across industries and applications.
12. Energy and Policy Considerations for Deep Learning in NLP, by Emma Strubell, Ananya Ganesh, Andrew McCallum
Original Abstract
Recent progress in hardware and methodology for training neural networks has ushered in a new generation of large networks trained on abundant data. These models have obtained notable gains in accuracy across many NLP tasks. However, these accuracy improvements depend on the availability of exceptionally large computational resources that necessitate similarly substantial energy consumption. As a result these models are costly to train and develop, both financially, due to the cost of hardware and electricity or cloud compute time, and environmentally, due to the carbon footprint required to fuel modern tensor processing hardware. In this paper we bring this issue to the attention of NLP researchers by quantifying the approximate financial and environmental costs of training a variety of recently successful neural network models for NLP. Based on these findings, we propose actionable recommendations to reduce costs and improve equity in NLP research and practice.
Our Summary
In this paper, the researchers from the University of Massachusetts Amherst draw the attention of the research community to the huge amount of energy consumed when training large neural networks. The authors focus specifically on recent NLP models, estimating the CO2 emissions from training such models as well as the corresponding cloud computing costs. For example, training one model on GPUs, with tuning and experimentation, results in CO2 emissions comparable to the two-year carbon footprint of an average American. Furthermore, the researchers use the case study of developing a state-of-the-art NLP model to show that the relevant cloud computing costs can reach $103K–$350K, amounts that are rarely available to academic researchers.
What’s the core idea of this paper?
- Modern language models achieve considerable gains in accuracy across NLP tasks, but this comes at the cost of huge computation and energy consumption.
- The energy used to power GPUs or TPUs during weeks or months of model training results in considerable carbon emissions (see the back-of-envelope sketch after this list).
- The huge cloud computing costs required for training state-of-the-art models are often unattainable for academic researchers.
- To overcome these challenges, the authors suggest:
- Reporting training time and sensitivity to hyperparameters in research papers to allow subsequent consumers to assess whether they have access to the required computational resources.
- Providing academic researchers with equitable access to computation resources by investing in shared computing centers.
- Prioritizing computationally efficient hardware and algorithms.
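The paper's accounting is essentially a back-of-envelope power calculation. A hedged sketch of that arithmetic is below: the PUE (~1.58) and the CO2 factor (~0.954 lbs CO2e per kWh) are approximately the US-average values the paper cites, the hardware draw and training time are placeholders, and unlike the paper we count only GPU power (the original also includes CPU and DRAM draw).

```python
"""Back-of-envelope estimate of training energy and CO2, in the spirit of the
paper's accounting. The PUE (~1.58) and CO2 factor (~0.954 lbs CO2e per kWh)
are approximately the US averages the paper cites; the hardware draw and
training time are placeholders, and only GPU power is counted here."""

def training_footprint(avg_power_watts, hours, n_gpus=1, pue=1.58, lbs_co2_per_kwh=0.954):
    # Total energy drawn from the grid, including data-center overhead (PUE).
    kwh = pue * hours * n_gpus * avg_power_watts / 1000.0
    return kwh, kwh * lbs_co2_per_kwh

# Example: 8 GPUs averaging 250 W each for two weeks of training.
kwh, lbs_co2 = training_footprint(avg_power_watts=250, hours=24 * 14, n_gpus=8)
print(f"energy: {kwh:,.0f} kWh, emissions: {lbs_co2:,.0f} lbs CO2e")
```

Multiplying such a single-run estimate by the number of hyperparameter trials and architecture-search runs is what drives the much larger figures reported in the paper.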
What’s the key achievement?
- Drawing attention to the substantial amount of energy consumption associated with training the latest NLP models.
- Suggesting actionable recommendations for reducing the financial costs and environmental impact of training machine learning models.
What does the AI community think?
- The paper was presented at ACL 2019, one of the leading conferences in natural language processing.
What are future research areas?
- Exploring ways to reduce energy consumption when developing state-of-the-art models (e.g., by using a Bayesian search instead of grid search).
What are possible business applications?
- Prioritizing energy-efficient hardware and algorithms when developing AI systems.
We want to give special thanks to Rachel Thomas, director at USF Center for Applied Data Ethics, and Timnit Gebru, research scientist at Google AI, for generously offering their expertise in curating the most important AI ethics research presented in 2019.
If you like these research summaries, you might also be interested in the following articles:
- Top AI & Machine Learning Research Papers From 2019
- What Are Major NLP Achievements & Papers From 2019?
- 10 Important Research Papers In Conversational AI From 2019
- 10 Cutting-Edge Research Papers In Computer Vision From 2019
- Breakthrough Research In Reinforcement Learning From 2019