UPDATE: We’ve also summarized the top 2019 AI Ethics research papers.
2018 was a breakthrough year for AI ethics. Researchers, practitioners, and ethicists have sounded the alarm for years over potential malicious applications of AI technologies as well as unintended consequences from flawed or biased systems. The increased attention these warnings have drawn has led to a proliferation of new and promising research highlighting the issues in existing AI approaches and proposing solutions to address them.
We summarized 10 research papers covering different aspects of AI ethics to give you a preliminary overview of important work done in the space last year. There are many more papers, tools, and contributions in ethical AI that we didn't cover in this article but that are also worth your time to study and learn from. This article is simply a useful starting point.
We’ve done our best to summarize these papers correctly, but if we’ve made any mistakes, please contact us to request a fix.
If these summaries of scientific AI research papers are useful for you, you can subscribe to our AI Research mailing list at the bottom of this article to be alerted when we release new summaries. We’re planning to release summaries of important papers in natural language processing (NLP), computer vision, reinforcement learning, and conversational AI in the next few weeks.
If you’d like to skip around, here are the papers we featured:
- A case study of algorithm-assisted decision making in child maltreatment hotline screening decisions
- Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification
- Delayed Impact of Fair Machine Learning
- Datasheets for Datasets
- Enhancing the Accuracy and Fairness of Human Decision Making
- Fairness and Abstraction in Sociotechnical Systems
- An AI Race for Strategic Advantage: Rhetoric and Risks
- Transparency and Explanation in Deep Reinforcement Learning Neural Networks
- Model Cards for Model Reporting
- 50 Years of Test (Un)fairness: Lessons for Machine Learning
Important AI Ethics Research Papers of 2018
1. A case study of algorithm-assisted decision making in child maltreatment hotline screening decisions, by Alexandra Chouldechova, Emily Putnam-Hornstein, Diana Benavides-Prado, Oleksandr Fialko, and Rhema Vaithianathan
Original Abstract
Every year there are more than 3.6 million referrals made to child protection agencies across the US. The practice of screening calls is left to each jurisdiction to follow local practices and policies, potentially leading to large variation in the way in which referrals are treated across the country. Whilst increasing access to linked administrative data is available, it is difficult for welfare workers to make systematic use of historical information about all the children and adults on a single referral call. Risk prediction models that use routinely collected administrative data can help call workers to better identify cases that are likely to result in adverse outcomes. However, the use of predictive analytics in the area of child welfare is contentious. There is a possibility that some communities—such as those in poverty or from particular racial and ethnic groups—will be disadvantaged by the reliance on government administrative data. On the other hand, these analytics tools can augment or replace human judgments, which themselves are biased and imperfect. In this paper we describe our work on developing, validating, fairness auditing, and deploying a risk prediction model in Allegheny County, Pennsylvania, USA. We discuss the results of our analysis to-date, and also highlight key problems and data bias issues that present challenges for model evaluation and deployment.
Our Summary
Child protection agencies across the US receive thousands of calls every day with allegations of child abuse. For each of these referrals, a welfare worker needs to decide whether a physical investigation is needed. However, it is usually not feasible for a caseworker to analyze all the data – from child protective services, mental health services, drug and alcohol services, and homeless services – related to all children and adults involved in the case. Thus, the researchers introduce the Allegheny Family Screening Tool (AFST), which produces a risk score based on all of this historical information but not on the content of the allegation itself. Unless the score falls in the high-risk zone, the caseworker does not have to take it into account.
What’s the core idea of this paper?
- It is difficult for welfare workers to make systematic use of historical information about all the children and adults on a single referral call.
- The AFST tool is not intended to replace human decision-making, but instead help to inform, train and improve the decisions made by welfare workers.
- The tool uses an ensemble of models, including Logistic Regression, Random Forest, XGBoost, and Support Vector Machines.
- The resulting risk score ranges from 1 to 20 and reflects the likelihood that the child will be removed from the home within 2 years if the referral is screened in (i.e., further investigated). A toy sketch of this kind of scoring setup follows this list.
- Predictive analytics should be used with great caution in the area of child welfare since some communities might be disadvantaged by the reliance on government administrative data.
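To make the idea of a call-screening risk score concrete, here is a toy sketch in Python. It is not the AFST: the synthetic features, model choice, and simple 1–20 binning below are assumptions made purely for illustration.

```python
# Toy illustration of a call-screening risk score (NOT the actual AFST).
# Synthetic features and the probability-to-score mapping are assumptions.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic administrative-history features for past referrals
n = 5000
X = rng.normal(size=(n, 4))          # e.g., prior referrals, services involvement, ...
logits = 1.2 * X[:, 0] - 0.8 * X[:, 2] + rng.normal(scale=0.5, size=n)
y = (logits > 1.0).astype(int)       # 1 = child removed from home within 2 years

model = LogisticRegression().fit(X, y)

def risk_score(features: np.ndarray) -> int:
    """Map the predicted probability of the adverse outcome onto a 1-20 scale
    (20 equal-width probability bins here, for simplicity)."""
    p = model.predict_proba(features.reshape(1, -1))[0, 1]
    return int(np.clip(np.ceil(p * 20), 1, 20))

new_referral = rng.normal(size=4)
print("Risk score:", risk_score(new_referral))  # a high score would flag the call for screen-in
```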
What’s the key achievement?
- Developing and validating a predictive model to assist welfare workers in the screening of incoming calls with allegations of child abuse.
- Discussing the challenges in incorporating algorithms into human decision making processes.
- Highlighting key problems and data bias issues affecting the evaluation and implementation of the AFST.
What does the AI community think?
- The paper won the Best Paper Award at ACM FAT* 2018, an important conference on Fairness, Accountability, and Transparency.
What are future research areas?
- Since the models were trained and tested on the screened-in population only, further research is needed to ensure that the model performs well on the entire set of referrals.
- Redesigning the tool based on feedback from staff members who use it in the decision-making process.
What are possible business applications?
- Even though the model discussed in the paper cannot be directly applied in a business setting, businesses can still follow the suggested approach when developing predictive models to assist humans in screening processes (e.g., hiring, fraud detection).
2. Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification, by Joy Buolamwini and Timnit Gebru
Original Abstract
Recent studies demonstrate that machine learning algorithms can discriminate based on classes like race and gender. In this work, we present an approach to evaluate bias present in automated facial analysis algorithms and datasets with respect to phenotypic subgroups. Using the dermatologist approved Fitzpatrick Skin Type classification system, we characterize the gender and skin type distribution of two facial analysis benchmarks, IJB-A and Adience. We find that these datasets are overwhelmingly composed of lighter-skinned subjects (79.6% for IJB-A and 86.2% for Adience) and introduce a new facial analysis dataset which is balanced by gender and skin type. We evaluate 3 commercial gender classification systems using our dataset and show that darker-skinned females are the most misclassified group (with error rates of up to 34.7%). The maximum error rate for lighter-skinned males is 0.8%. The substantial disparities in the accuracy of classifying darker females, lighter females, darker males, and lighter males in gender classification systems require urgent attention if commercial companies are to build genuinely fair, transparent and accountable facial analysis algorithms.
Our Summary
Buolamwini and Gebru argue that existing computer vision systems demonstrate bias against certain demographic groups (e.g., women and Black people) and particularly against certain intersectional groups (i.e., darker-skinned females). This is because existing datasets are unbalanced and dominated by lighter-skinned males. So, the researchers introduce a new facial analysis dataset that is balanced by gender and skin tone. They also evaluate the accuracy of several commercial gender classification systems and confirm the presence of a huge error rate gap (up to 34.4%) between darker-skinned females and lighter-skinned males.
An interactive walkthrough of the Gender Shades research project is available on the official website.
What’s the core idea of this paper?
- Computer vision systems are often used in high-stakes scenarios (e.g., surveillance, crime prevention, detecting melanoma from images), yet they still often discriminate based on attributes like race and gender.
- Algorithms are usually biased against the demographic groups that are underrepresented in the datasets used for training the corresponding algorithms.
- Existing datasets are overwhelmingly composed of lighter-skinned subjects, and thus, there is a need for new facial analysis datasets, which will be balanced by gender and skin type.
What’s the key achievement?
- Introducing a new facial analysis dataset, Pilot Parliaments Benchmark (PPB), which:
- includes 1270 parliamentarians from Rwanda, Senegal, South Africa, Iceland, Finland, and Sweden;
- is balanced by gender (45% females and 55% males) and skin type (46% darker-skinned and 54% lighter-skinned subjects).
- Evaluating how well IBM, Microsoft, and Face++ AI services guess the gender of a face across different demographic groups (i.e., females, males, darker-skinned, and lighter-skinned subjects) and intersectional groups (darker-skinned females, darker-skinned males, lighter-skinned females, lighter-skinned males).
- Demonstrating the huge gap in the accuracy of gender classifiers across different intersectional groups: darker-skinned females are misclassified in up to 34.7% of cases, while the maximum error rate for lighter-skinned males is only 0.8%. A minimal sketch of this kind of subgroup audit follows this list.
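To make the audit idea concrete, here is a minimal sketch of computing per-subgroup error rates. The tiny table of labels and predictions below is synthetic; the PPB images and the commercial API outputs used in the paper are not reproduced here.

```python
# Minimal sketch of an intersectional error-rate audit in the spirit of Gender Shades.
# The predictions and labels are synthetic placeholders.
import pandas as pd

audit = pd.DataFrame({
    "true_gender": ["female", "female", "male", "male", "female", "male"],
    "pred_gender": ["male",   "female", "male", "male", "male",   "male"],
    "skin_type":   ["darker", "lighter", "darker", "lighter", "darker", "darker"],
})

audit["error"] = audit["true_gender"] != audit["pred_gender"]

# Error rate for each intersectional subgroup (skin type x gender)
subgroup_error = (
    audit.groupby(["skin_type", "true_gender"])["error"]
         .mean()
         .rename("error_rate")
)
print(subgroup_error)
# The paper's finding: this table is far from uniform, with darker-skinned
# females misclassified up to 34.7% of the time by commercial systems.
```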
What does the AI community think?
- “The Gender Shades Project pilots an intersectional approach to inclusive product testing for AI.”, Benjamin Carpano, CEO at Reportlinker.
- IBM applied the findings of this paper to substantially increase the accuracy of its new Watson Visual Recognition service. In particular, they claim that the error rate for darker-skinned females dropped from 34.7% to 3.46%.
What are future research areas?
- Exploring gender classification on an inclusive benchmark composed of unconstrained images.
- Researching if the error rate gaps based on race and gender persist in other human-based computer vision tasks.
What are possible business applications?
- Before employing any human-centered computer vision system, businesses need to check:
- if the training dataset for the respective system correctly represents the population it will be applied to;
- the algorithmic performance for different demographic and phenotypic subgroups.
3. Delayed Impact of Fair Machine Learning, by Lydia T. Liu, Sarah Dean, Esther Rolf, Max Simchowitz, Moritz Hardt
Original Abstract
Fairness in machine learning has predominantly been studied in static classification settings without concern for how decisions change the underlying population over time. Conventional wisdom suggests that fairness criteria promote the long-term well-being of those groups they aim to protect.
We study how static fairness criteria interact with temporal indicators of well-being, such as long-term improvement, stagnation, and decline in a variable of interest. We demonstrate that even in a one-step feedback model, common fairness criteria in general do not promote improvement over time, and may in fact cause harm in cases where an unconstrained objective would not. We completely characterize the delayed impact of three standard criteria, contrasting the regimes in which these exhibit qualitatively different behavior. In addition, we find that a natural form of measurement error broadens the regime in which fairness criteria perform favorably.
Our results highlight the importance of measurement and temporal modeling in the evaluation of fairness criteria, suggesting a range of new challenges and trade-offs.
Our Summary
The goal is to ensure fair treatment across different demographic groups when using a score-based machine learning algorithm to decide who gets an opportunity (e.g., loan, scholarship, job) and who does not. Researchers from Berkeley's Artificial Intelligence Research lab show that imposing common fairness criteria may in fact harm underrepresented or disadvantaged groups because of delayed outcomes. Thus, they encourage looking at long-term outcomes when designing a "fair" machine learning system.
What’s the core idea of this paper?
- Considering the delayed outcomes of imposing fairness criteria reveals that these criteria may have an adverse impact on the long-term well-being of the groups they aim to protect (e.g., worsening the credit score of a borrower who defaults on a loan that would not have been granted under the unconstrained policy).
- Since fairness criteria may actively harm disadvantaged groups, a better alternative can be a decision rule that explicitly maximizes outcomes, i.e., an outcome model (see the simulation sketch after this list).
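The sketch below simulates a single step of this feedback loop under assumed numbers (score distributions, repayment curve, score changes, and thresholds). It is not the authors' code, but it illustrates how a parity-constrained policy can end up lowering the disadvantaged group's average score while a stricter unconstrained policy does not, depending on the assumed repayment curve.

```python
# A one-step feedback sketch in the spirit of the paper (not the authors' code).
# Score ranges, the repayment curve, score changes, and thresholds are all assumptions.
import numpy as np

rng = np.random.default_rng(1)

def repay_prob(score):
    # Assumed monotone repayment curve over a FICO-like 300-850 range
    return np.clip((score - 300) / 550, 0.05, 0.95)

def one_step(scores, accepted, gain=30, loss=-75):
    """Delayed impact: accepted applicants who repay gain score, accepted
    applicants who default lose score; rejected applicants are unchanged."""
    repaid = rng.random(scores.shape) < repay_prob(scores)
    delta = np.where(accepted & repaid, gain,
                     np.where(accepted & ~repaid, loss, 0))
    return scores + delta

group_a = rng.normal(640, 60, 100_000)   # advantaged group (higher scores on average)
group_b = rng.normal(570, 60, 100_000)   # disadvantaged group

# Unconstrained policy: a single (assumed) profit break-even threshold for everyone
threshold = 700
accept_a, accept_b = group_a > threshold, group_b > threshold

# Demographic parity: equalize acceptance *rates* by lowering group B's threshold
thr_b_dp = np.quantile(group_b, 1 - accept_a.mean())
accept_b_dp = group_b > thr_b_dp

print("Group B mean score change, unconstrained:     ",
      (one_step(group_b, accept_b) - group_b).mean())
print("Group B mean score change, demographic parity:",
      (one_step(group_b, accept_b_dp) - group_b).mean())
# With these assumed numbers, the parity-constrained policy tends to *lower*
# group B's mean score -- the "active harm" regime characterized in the paper.
```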
What’s the key achievement?
- Showing that fairness criteria such as demographic parity and equal opportunity can lead to any possible outcome for the disadvantaged group, including improvement, stagnation, and decline, while following the institution's optimal unconstrained selection policy (e.g., profit maximization) never leads to decline (active harm) for the disadvantaged group.
- Supporting theoretical predictions with experiments on FICO credit score data.
- Considering alternatives to hard fairness constraints.
What does the AI community think?
- The paper won the Best Paper Award at ICML 2018, one of the key machine learning conferences.
- The study reveals that positive discrimination can sometimes backfire.
What are future research areas?
- Considering the other characteristics of impact beyond the change in population mean (e.g., variance, individual-level outcomes).
- Researching the robustness of outcome optimization to modeling and measurement errors.
What are possible business applications?
- By switching from constraints imposed by fairness criteria to outcome modeling, companies might develop ML systems for lending or recruiting that will be more profitable and yet “fairer”.
Where can you get implementation code?
- The official GitHub repository containing the code required to reproduce the Delayed Impact experiments is published under first author Lydia Liu's account.
4. Datasheets for Datasets, by Timnit Gebru, Jamie Morgenstern, Briana Vecchione, Jennifer Wortman Vaughan, Hanna Wallach, Hal Daumé III, Kate Crawford
Original Abstract
The machine learning community has no standardized way to document how and why a dataset was created, what information it contains, what tasks it should and should not be used for, and whether it might raise any ethical or legal concerns. To address this gap, we propose the concept of datasheets for datasets. In the electronics industry, it is standard to accompany every component with a datasheet providing standard operating characteristics, test results, recommended usage, and other information. Similarly, we recommend that every dataset be accompanied with a datasheet documenting its creation, composition, intended uses, maintenance, and other properties. Datasheets for datasets will facilitate better communication between dataset creators and users, and encourage the machine learning community to prioritize transparency and accountability.
Our Summary
When you are searching for a dataset to train your model, how do you know if a particular dataset is appropriate for your task? If it is a dataset for facial recognition, does it contain a sufficient proportion of people from different demographic groups? To address this lack of information, Timnit Gebru and her colleagues suggest the concept of a datasheet for datasets. This document should accompany every dataset and answer the common questions that dataset users may have.
What’s the core idea of this paper?
- There is no standardized approach to documenting significant details about a dataset's characteristics, its recommended usage, any ethical or legal concerns, etc.
- The researchers propose that datasheets for datasets include the following sections:
- the motivation for dataset creation (e.g., What tasks can the dataset be used for?);
- dataset composition (e.g., What data does each instance consist of?);
- data collection process (e.g., How was the data collected?);
- data preprocessing (e.g., What preprocessing/cleaning was done?);
- dataset distribution (e.g., What license (if any) is the dataset distributed under?);
- dataset maintenance (e.g., Will the dataset be updated?);
- legal and ethical consideration (e.g., If the dataset relates to people, were they told what the dataset would be used for and did they consent?).
- A datasheet can significantly improve the utility of a dataset for its intended users and mitigate potential harms from misusing the dataset (a minimal template sketch follows this list).
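As a rough illustration of how a team might operationalize these sections, here is a minimal datasheet template in Python. The field names paraphrase the paper's questions, and the example answers are placeholders rather than a real datasheet.

```python
# A minimal, illustrative datasheet template following the sections proposed in the paper.
# Field names paraphrase the paper's questions; the example answers are placeholders.
from dataclasses import dataclass, field, asdict
import json

@dataclass
class Datasheet:
    motivation: str                  # Why was the dataset created? For which tasks?
    composition: str                 # What does each instance consist of?
    collection_process: str          # How was the data collected?
    preprocessing: str               # What cleaning/preprocessing was done?
    distribution: str                # Under what license is it distributed?
    maintenance: str                 # Will it be updated, and by whom?
    legal_ethical: str               # Consent, privacy, and other concerns
    known_limitations: list[str] = field(default_factory=list)

sheet = Datasheet(
    motivation="Benchmark gender classification across skin types.",
    composition="One face image per subject with gender and Fitzpatrick skin-type labels.",
    collection_process="Public photos of parliamentarians from six countries.",
    preprocessing="Faces cropped and resized; no other augmentation.",
    distribution="Released for research use only.",
    maintenance="No scheduled updates.",
    legal_ethical="Images are of public figures acting in a public capacity.",
    known_limitations=["Adults only", "Constrained, front-facing poses"],
)

print(json.dumps(asdict(sheet), indent=2))  # ship this JSON alongside the dataset
```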
What’s the key achievement?
- Explaining the importance of dataset documentation (i.e., datasheets) for increased transparency and accountability.
- Suggesting the structure for the datasheet.
- Providing prototype datasheets for two datasets.
What does the AI community think?
- “Datasheets for Datasets: new approach for transparency & standardization of datasets”, Rachel Thomas, fast.ai co-founder and assistant professor at the University of San Francisco.
- The paper was presented at FAT/ML 2018, an important event that brings together researchers and practitioners concerned with fairness, accountability, and transparency in machine learning.
What are future research areas?
- Standardizing format and content of the datasheets for datasets.
- Identifying incentives to encourage datasheets creation.
- Addressing ethical considerations that relate to data about people.
What are possible business applications?
- Businesses will benefit from datasheets accompanying datasets since such documentation:
- improves communication with dataset creators;
- enables more informed decisions when choosing a dataset for training the models.
5. Enhancing the Accuracy and Fairness of Human Decision Making, by Isabel Valera, Adish Singla, Manuel Gomez Rodriguez
Original Abstract
Societies often rely on human experts to take a wide variety of decisions affecting their members, from jail-or-release decisions taken by judges and stop-and-frisk decisions taken by police officers to accept-or-reject decisions taken by academics. In this context, each decision is taken by an expert who is typically chosen uniformly at random from a pool of experts. However, these decisions may be imperfect due to limited experience, implicit biases, or faulty probabilistic reasoning. Can we improve the accuracy and fairness of the overall decision making process by optimizing the assignment between experts and decisions?
In this paper, we address the above problem from the perspective of sequential decision making and show that, for different fairness notions from the literature, it reduces to a sequence of (constrained) weighted bipartite matchings, which can be solved efficiently using algorithms with approximation guarantees. Moreover, these algorithms also benefit from posterior sampling to actively trade off exploitation – selecting expert assignments which lead to accurate and fair decisions – and exploration – selecting expert assignments to learn about the experts’ preferences and biases. We demonstrate the effectiveness of our algorithms on both synthetic and real-world data and show that they can significantly improve both the accuracy and fairness of the decisions taken by pools of experts.
Our Summary
We rely on judges to make jail-or-release decisions and on academics to make accept-or-reject decisions. However, the decisions of these experts might suffer from implicit biases, lack of experience, or faulty probabilistic reasoning. A group of researchers from the Max Planck Institute suggests a way to improve the accuracy and fairness of the decision-making process by optimizing the assignment between experts and decisions. In particular, they approach this problem from the perspective of sequential decision making and introduce algorithms that prove effective on both synthetic and real-world data.
What’s the core idea of this paper?
- The goal is to find the optimal assignments between human decision makers and decisions which maximize the accuracy of the overall decision-making process while satisfying the popular notions of fairness.
- Human decision making is represented using threshold decision rules.
- If the thresholds used by each expert are known, the problem reduces to a sequence of matching problems, which can be solved efficiently with approximation guarantees (a toy matching sketch follows this list).
- If the thresholds used by each expert are unknown, they can be estimated using posterior sampling. In other words, we can learn about the experts’ preferences and biases based on their decisions.
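The matching step can be illustrated with a small sketch: given an assumed cost for pairing each expert with each decision, one round of assignment is a standard linear sum assignment problem. The cost model and thresholds below are illustrative assumptions; the fairness constraints and posterior sampling from the paper are omitted.

```python
# A minimal sketch of one round of expert-to-decision assignment as a weighted
# bipartite matching (the paper adds fairness constraints and posterior sampling on top).
# The cost model below is an illustrative assumption, not the authors' formulation.
import numpy as np
from scipy.optimize import linear_sum_assignment

rng = np.random.default_rng(2)

n_experts, n_cases = 4, 4
# Probability of the positive outcome for each case (e.g., risk of reoffending)
case_risk = rng.random(n_cases)
# Each expert decides with a (possibly biased) threshold on that risk
expert_threshold = np.array([0.3, 0.5, 0.5, 0.7])

# Expected error if expert i handles case j, assuming the "correct" decision
# uses a 0.5 threshold -- purely for illustration.
correct = case_risk > 0.5
decision = case_risk[None, :] > expert_threshold[:, None]
cost = (decision != correct[None, :]).astype(float)

rows, cols = linear_sum_assignment(cost)   # minimize total expected error
for i, j in zip(rows, cols):
    print(f"expert {i} (threshold {expert_threshold[i]:.1f}) -> case {j} (risk {case_risk[j]:.2f})")
```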
What’s the key achievement?
- Introducing algorithms for the optimal assignment of decisions among human decision makers.
- Showing that these algorithms:
- improve the accuracy and fairness of the overall human decision-making process compared to random assignment;
- ensure fairness more effectively if the pool of experts is diverse;
- ensure fairness even if a significant percentage of experts (e.g., 50%) are biased against a group of individuals with a certain sensitive attribute (e.g., race).
What does the AI community think?
- The paper was presented as an Invited Talk at the AI Ethics Workshop at NeurIPS 2018.
What are future research areas?
- Accounting for experts with different prediction abilities.
- Assuming that experts can learn from the decisions they take over time.
- Designing algorithms for a scenario where a decision is taken jointly by a group of experts.
What are possible business applications?
- The algorithms presented in the paper might be applicable in the business setting when a single decision maker is chosen from a pool of experts (e.g., granting loans).
Where can you get implementation code?
- Code and data for this research paper are available on Github.
6. Fairness and Abstraction in Sociotechnical Systems, by Andrew D. Selbst, Danah Boyd, Sorelle Friedler, Suresh Venkatasubramanian, and Janet Vertesi
Original Abstract
A key goal of the fair-ML community is to develop machine-learning based systems that, once introduced into a social context, can achieve social and legal outcomes such as fairness, justice, and due process. Bedrock concepts in computer science—such as abstraction and modular design—are used to define notions of fairness and discrimination, to produce fairness-aware learning algorithms, and to intervene at different stages of a decision-making pipeline to produce “fair” outcomes. In this paper, however, we contend that these concepts render technical interventions ineffective, inaccurate, and sometimes dangerously misguided when they enter the societal context that surrounds decision-making systems. We outline this mismatch with five “traps” that fair-ML work can fall into even as it attempts to be more context-aware in comparison to traditional data science. We draw on studies of sociotechnical systems in Science and Technology Studies to explain why such traps occur and how to avoid them. Finally, we suggest ways in which technical designers can mitigate the traps through a refocusing of design in terms of process rather than solutions, and by drawing abstraction boundaries to include social actors rather than purely technical ones.
Our Summary
The paper introduces five traps that fair-ML work can fall into: the Framing Trap, the Portability Trap, the Formalism Trap, the Ripple Effect Trap, and the Solutionism Trap. It also suggests ways to avoid these traps, drawing on studies of sociotechnical systems in Science and Technology Studies.
What’s the core idea of this paper?
- There are five traps that fair-ML work can fall into:
- The Framing Trap, or the failure to model the entire system over which fairness will be enforced.
- The Portability Trap, or the failure to understand that a solution designed for one context may be misleading, inaccurate, or harmful in a different context.
- The Formalism Trap, or the failure to account for different definitions of fairness (e.g., procedural, contextual, contestable) and trying to arbitrate between conflicting definitions using purely mathematical means.
- The Ripple Effect Trap, or the failure to fully understand the impact of technology on the pre-existing system.
- The Solutionism Trap, or the failure to recognize the possibility that the best solution to a problem may not involve technology.
- The traps can be avoided by looking at the problem from the sociotechnical perspective:
- The Framing Trap can be solved by adopting a “heterogeneous engineering” approach.
- The Portability Trap can be addressed by developing different user “scripts” for different contexts.
- The Formalism Trap can be avoided by following the Social Construction of Technology (SCOT) framework.
- The Ripple Effect Trap can be solved by avoiding reinforcement politics and reactivity behaviors.
- The Solutionism Trap can be addressed by considering in the first place if the problem can be indeed better solved with technology.
What’s the key achievement?
- Describing five traps in fair-ML work and showing how sociotechnical perspective points out the ways to understand and avoid these traps.
- Providing a checklist for consideration when designing a new fair-ML solution:
- Is a technical solution appropriate to the situation in the first place?
- Does it affect the social context in a predictable way?
- Can this technical solution appropriately handle robust understandings of fairness?
- Has it appropriately modeled the social and technical requirements of the actual context in which it will be deployed?
- Is a solution heterogeneously framed so as to include the data and social actors relevant to the localized question of fairness?
What does the AI community think?
- The paper will be presented at ACM FAT* 2019, an important conference on Fairness, Accountability, and Transparency.
What are future research areas?
- Incorporating the sociotechnical context more directly into the work of fair-ML researchers.
What are possible business applications?
- Domain experts from business usually have a better understanding of the social context than ML researchers, and this context is crucial for developing fair decision-making systems. Thus, businesses should collaborate closely with the ML community to design better and fairer technical solutions.
7. An AI Race for Strategic Advantage: Rhetoric and Risks, by Stephen Cave and Seán S. Ó hÉigeartaigh
Original Abstract
The rhetoric of the race for strategic advantage is increasingly being used with regard to the development of artificial intelligence (AI), sometimes in a military context, but also more broadly. This rhetoric also reflects real shifts in strategy, as industry research groups compete for a limited pool of talented researchers, and nation states such as China announce ambitious goals for global leadership in AI. This paper assesses the potential risks of the AI race narrative and of an actual competitive race to develop AI, such as incentivising corner-cutting on safety and governance, or increasing the risk of conflict. It explores the role of the research community in responding to these risks. And it briefly explores alternative ways in which the rush to develop powerful AI could be framed so as instead to foster collaboration and responsible progress.
Our Summary
Researchers from the University of Cambridge draw our attention to the increasing rhetoric of an AI race: companies compete for talented researchers, and nation states strive for global leadership in AI. The paper discusses the major risks arising from both AI race rhetoric and an actual AI race. It also explores the role of the research community with regard to these risks and suggests some promising alternatives to the race approach.
What’s the core idea of this paper?
- An AI race makes it more difficult to minimize the risks of AI technology and maximize its societal benefits.
- There are three sets of risks:
- risks posed by race rhetoric alone (i.e., insecure environment that hinders dialogue and collaboration);
- risks posed by a race emerging (i.e., not taking proper safety precautions);
- risks posed by race victory (i.e., the concentration of power in the hands of one group).
- AI researchers can make a positive impact by speaking out against an AI arms race and publicly drawing attention to the corresponding risks.
What’s the key achievement?
- Identifying the main risks of an AI race with regard to both rhetoric and reality.
- Suggesting several promising alternatives to a race approach, including:
- AI development as a shared priority for global good;
- cooperation on AI;
- responsible development of AI to improve public perception.
What does the AI community think?
- The paper won the Best Paper Award at AIES 2018, a key conference on artificial intelligence, ethics, and society.
What are future research areas?
- Understanding the dynamics of a possible AI race.
- Developing alternative framings for AI development.
What are possible business applications?
- Businesses can make a positive impact by avoiding AI race rhetoric, cooperating on AI, and following the principles of responsible AI development and implementation.
8. Transparency and Explanation in Deep Reinforcement Learning Neural Networks, by Rahul Iyer, Yuezhang Li, Huao Li, Michael Lewis, Ramitha Sundar, Katia Sycara
Original Abstract
Autonomous AI systems will be entering human society in the near future to provide services and work alongside humans. For those systems to be accepted and trusted, the users should be able to understand the reasoning process of the system, i.e. the system should be transparent. System transparency enables humans to form coherent explanations of the system’s decisions and actions. Transparency is important not only for user trust, but also for software debugging and certification. In recent years, Deep Neural Networks have made great advances in multiple application areas. However, deep neural networks are opaque. In this paper, we report on work in transparency in Deep Reinforcement Learning Networks (DRLN). Such networks have been extremely successful in accurately learning action control in image input domains, such as Atari games. In this paper, we propose a novel and general method that (a) incorporates explicit object recognition processing into deep reinforcement learning models, (b) forms the basis for the development of “object saliency maps”, to provide visualization of internal states of DRLNs, thus enabling the formation of explanations and (c) can be incorporated in any existing deep reinforcement learning framework. We present computational results and human experiments to evaluate our approach.
Our Summary
The researchers suggest an approach to increasing the transparency and explainability of deep reinforcement learning networks. In particular, they show how to explicitly incorporate object features into the model's architecture and how to visualize the internal states of the network using object saliency maps. Finally, they demonstrate through experiments that these visualizations are indeed understandable to humans.
What’s the core idea of this paper?
- Transparency of AI systems is important for user trust, software debugging, and certification.
- It is possible to increase transparency and explainability of deep reinforcement learning models by explicitly incorporating object features into the model’s architecture.
- Moreover, the researchers suggest a method for producing object saliency maps that reflect the reasoning behind the model's decisions; these visualizations are interpretable by humans (a minimal masking-based sketch follows this list).
- The suggested approach can be incorporated into any existing deep reinforcement learning framework.
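A rough sketch of the saliency idea: mask out each detected object and measure how much the Q-value of the chosen action drops. The tiny Q-network and hard-coded object boxes below are placeholders, not the paper's architecture or object recognition pipeline.

```python
# A minimal sketch of an object saliency score: mask each detected object out of the
# input frame and measure how much the chosen action's Q-value changes.
import torch
import torch.nn as nn

class TinyQNet(nn.Module):
    """Placeholder Q-network over an 84x84 single-channel frame."""
    def __init__(self, n_actions=4):
        super().__init__()
        self.net = nn.Sequential(nn.Flatten(), nn.Linear(84 * 84, 64), nn.ReLU(),
                                 nn.Linear(64, n_actions))
    def forward(self, x):
        return self.net(x)

def object_saliency(q_net, frame, boxes, action):
    """Return one saliency score per object box: the drop in Q(s, a) when the box is masked."""
    with torch.no_grad():
        base_q = q_net(frame.unsqueeze(0))[0, action].item()
        scores = []
        for (x0, y0, x1, y1) in boxes:
            masked = frame.clone()
            masked[:, y0:y1, x0:x1] = 0.0          # remove the object (assumed 0 = background)
            q = q_net(masked.unsqueeze(0))[0, action].item()
            scores.append(base_q - q)              # large drop => object was important
    return scores

q_net = TinyQNet()
frame = torch.rand(1, 84, 84)                      # single-channel game frame
boxes = [(10, 10, 20, 20), (50, 40, 60, 60)]       # object detections (placeholders)
print(object_saliency(q_net, frame, boxes, action=2))
```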
What’s the key achievement?
- Introducing an approach to incorporate object characteristics into the learning process of deep reinforcement learning.
- Suggesting a method to produce object saliency maps: visual explanations of why a particular action was taken by an agent.
- Demonstrating through the experiments that the object saliency map visualization can help humans understand the learned behavior of an agent.
What does the AI community think?
- The paper won the Best Paper Award at AIES 2018, a key conference on artificial intelligence, ethics, and society.
What are future research areas?
- Using object saliency maps as a basis to automatically produce human intelligible explanations in natural language.
- Testing the use of object features in more realistic settings (e.g., self-driving cars).
What are possible business applications?
- Businesses that have deep reinforcement learning networks at the core of their AI technologies may increase user trust in the agent's decisions by incorporating the suggested approach and providing interpretable visualizations of the model's behavior.
9. Model Cards for Model Reporting, by Margaret Mitchell, Simone Wu, Andrew Zaldivar, Parker Barnes, Lucy Vasserman, Ben Hutchinson, Elena Spitzer, Inioluwa Deborah Raji, Timnit Gebru
Original Abstract
Trained machine learning models are increasingly used to perform high-impact tasks in areas such as law enforcement, medicine, education, and employment. In order to clarify the intended use cases of machine learning models and minimize their usage in contexts for which they are not well suited, we recommend that released models be accompanied by documentation detailing their performance characteristics. In this paper, we propose a framework that we call model cards, to encourage such transparent model reporting. Model cards are short documents accompanying trained machine learning models that provide benchmarked evaluation in a variety of conditions, such as across different cultural, demographic, or phenotypic groups (e.g., race, geographic location, sex, Fitzpatrick skin type) and intersectional groups (e.g., age and race, or sex and Fitzpatrick skin type) that are relevant to the intended application domains. Model cards also disclose the context in which models are intended to be used, details of the performance evaluation procedures, and other relevant information. While we focus primarily on human-centered machine learning models in the application fields of computer vision and natural language processing, this framework can be used to document any trained machine learning model. To solidify the concept, we provide cards for two supervised models: One trained to detect smiling faces in images, and one trained to detect toxic comments in text. We propose model cards as a step towards the responsible democratization of machine learning and related AI technology, increasing transparency into how well AI technology works. We hope this work encourages those releasing trained machine learning models to accompany model releases with similar detailed evaluation numbers and other relevant documentation.
Our Summary
Machine learning models are increasingly used in high-stakes scenarios like law enforcement, employment, and medicine. But how do we know if these models are robust enough for the cases in which they are being used? Google researchers propose to accompany released models with a short piece of documentation called a model card. This 1-2 page document should detail the performance characteristics of the model, provide benchmarked evaluation across different cultural, demographic, and intersectional groups, and disclose the context in which the model is intended to be used. The paper provides two examples of such model cards.
What’s the core idea of this paper?
- Trained machine learning models are increasingly used to perform high-impact tasks. At the same time, there is a lack of transparency about how well the corresponding algorithms work, what their intended use cases are, and how these models perform across different demographic groups.
- All trained machine learning models should be accompanied by model cards: 1-2 page documents (a minimal skeleton is sketched after this list) that provide:
- model details (author, date, version, type etc.);
- intended uses and out-of-scope use cases;
- model performance across a variety of factors (i.e., groups, instrumentation, environments);
- appropriate metrics depending on the model type;
- information about evaluation and training data;
- quantitative analyses;
- ethical considerations;
- caveats and recommendations.
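For illustration, here is a minimal model card skeleton rendered to Markdown. The section names paraphrase the paper's proposal; the entries are placeholders rather than a real model's documentation.

```python
# An illustrative model card skeleton following the sections proposed in the paper.
# Section names paraphrase the paper; the example entries are placeholders.
model_card = {
    "Model Details": {"developers": "Example team", "date": "2018-12", "version": "1.0",
                      "type": "CNN smile classifier"},
    "Intended Use": {"primary": "Detect smiling faces in consented photos",
                     "out_of_scope": "Emotion recognition, surveillance"},
    "Factors": ["age group", "gender", "Fitzpatrick skin type"],
    "Metrics": ["false positive rate", "false negative rate", "per-subgroup accuracy"],
    "Evaluation Data": "Held-out set balanced across the listed factors",
    "Training Data": "Public celebrity face dataset (see its datasheet)",
    "Quantitative Analyses": "Report each metric per subgroup and intersectional group",
    "Ethical Considerations": "No sensitive attributes are predicted or stored",
    "Caveats and Recommendations": "Not evaluated on children; re-audit after retraining",
}

def render_markdown(card: dict) -> str:
    """Render the card as a Markdown document to ship alongside the model."""
    lines = ["# Model Card"]
    for section, content in card.items():
        lines.append(f"\n## {section}")
        if isinstance(content, dict):
            lines += [f"- **{k}**: {v}" for k, v in content.items()]
        elif isinstance(content, list):
            lines += [f"- {item}" for item in content]
        else:
            lines.append(content)
    return "\n".join(lines)

print(render_markdown(model_card))
```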
What’s the key achievement?
- Spotlighting the importance of documentation accompanying released machine learning models.
- Proposing the structure for such documentation.
- Providing exemplary model cards for two supervised models.
What does the AI community think?
- “This looks like a possible tool for the AI DevOps tool suite.”, Joanna J Bryson, Associate Professor at the University of Bath.
- The paper will be presented at ACM FAT* 2019, an important conference on Fairness, Accountability, and Transparency.
What are future research areas?
- Refining the methodology of preparing model cards by considering how model information is used by different stakeholders.
- Standardizing or formalizing model cards to prevent misleading representations of model performance.
- Considering other transparency tools, such as third-party algorithmic auditing, "adversarial testing", and more inclusive user feedback mechanisms.
What are possible business applications?
- With model cards accompanying all machine learning models, businesses will be able to make more informed decisions when adopting new machine learning technologies.
10. 50 Years of Test (Un)fairness: Lessons for Machine Learning, by Ben Hutchinson and Margaret Mitchell
Original Abstract
Quantitative definitions of what is unfair and what is fair have been introduced in multiple disciplines for well over 50 years, including in education, hiring, and machine learning. We trace how the notion of fairness has been defined within the testing communities of education and hiring over the past half century, exploring the cultural and social context in which different fairness definitions have emerged. In some cases, earlier definitions of fairness are similar or identical to definitions of fairness in current machine learning research, and foreshadow current formal work. In other cases, insights into what fairness means and how to measure it have largely gone overlooked. We compare past and current notions of fairness along several dimensions, including the fairness criteria, the focus of the criteria (e.g., a test, a model, or its use), the relationship of fairness to individuals, groups, and subgroups, and the mathematical method for measuring fairness (e.g., classification, regression). This work points the way towards future research and measurement of (un)fairness that builds from our modern understanding of fairness while incorporating insights from the past.
Our Summary
One of the challenges for current machine learning research is to be "fair". However, an important precondition is to agree on the notion of fairness along several dimensions, including the fairness criteria, the relationship of fairness to individuals and groups, and how fairness is measured. To achieve this goal, Hutchinson and Mitchell from Google trace the definition of fairness over the past 50 years and incorporate the revealed insights into the modern understanding of fairness in machine learning.
What’s the core idea of this paper?
- The last 50 years of fairness research provide useful lessons for future research in ML fairness.
- There were several important milestones in the development of approaches to fairness:
- In the 1960s, the research focused on mathematical measurement of unfair bias and discrimination within the educational and employment testing communities.
- In the 1970s, the perspective shifted from defining how a test may be unfair to how a test may be fair.
- The 1980s started with the new public debate about the existence of racial differences in general intelligence, and the implications for fair testing.
- Many of the fairness criteria developed over the last 50 years of fairness research are identical to modern-day fairness definitions. However, there are also some conceptual gaps between the earlier fairness approaches and current ML fairness; and these gaps need to be considered.
What’s the key achievement?
- Providing a comprehensive comparison of past and current notions of fairness along several dimensions, such as the fairness criteria, the focus of the criteria, the relationship of fairness to individuals, groups, and subgroups, and the mathematical measurement of fairness.
- The paper suggests some concrete steps for future research in ML fairness:
- developing methods to explain and reduce model unfairness by focusing on the causes of unfairness;
- expanding fairness criteria to include model context and use;
- incorporating quantitative factors for the balance between fairness goals and other goals;
- diving more deeply into the question of how subgroups are defined, for example, quantifying fairness along one dimension (e.g., age) conditioned on another dimension (e.g., skin tone), as in the sketch after this list.
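As a toy illustration of conditioning one fairness dimension on another, the sketch below computes false positive rates per age group within each skin-type group on a synthetic table; the data and groupings are placeholders.

```python
# A minimal sketch of "fairness along one dimension conditioned on another":
# compare false positive rates across age groups *within* each skin-type group.
# The data below is synthetic and purely illustrative.
import pandas as pd

df = pd.DataFrame({
    "age_group": ["18-30", "31-50", "18-30", "31-50"] * 3,
    "skin_type": ["darker"] * 6 + ["lighter"] * 6,
    "label":      [0, 0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 1],
    "prediction": [1, 0, 1, 1, 0, 1, 0, 0, 1, 0, 0, 1],
})

negatives = df[df["label"] == 0]
fpr = (negatives.groupby(["skin_type", "age_group"])["prediction"]
                .mean()                      # mean prediction on true negatives = FPR
                .rename("false_positive_rate"))
print(fpr)
# Large gaps between age groups *within* a skin-type group would signal
# conditional subgroup unfairness that a one-dimensional audit could miss.
```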
What does the AI community think?
- The paper will be presented at ACM FAT* 2019, an important conference on Fairness, Accountability, and Transparency.
What are future research areas?
- Exploring both technical and cultural causes of unfairness in machine learning.
- Researching how the context and use of ML models influence potential unfairness.
- Identifying the suitable variables to be used in fairness research for capturing systemic unfairness.
What are possible business applications?
- ML practitioners should take into account that in the coming years, courts may start ruling on the fairness of ML models. Thus, businesses need to make sure that the technical definitions of fairness used when developing ML models are close to the public perception of fairness.
Want Deeper Dives Into Specific AI Research Topics?
Due to popular demand, we’ve released several of these easy-to-read summaries and syntheses of major research papers for different subtopics within AI and machine learning.
- Top 10 machine learning & AI research papers of 2018
- Top 10 AI fairness, accountability, transparency, and ethics (FATE) papers of 2018
- Top 14 natural language processing (NLP) research papers of 2018
- Top 10 computer vision and image generation research papers of 2018
- Top 10 conversational AI and dialog systems research papers of 2018
- Top 10 deep reinforcement learning research papers of 2018
Update: 2019 Research Summaries Are Released
- Top 10 AI & machine learning research papers from 2019
- Top 11 NLP achievements & papers from 2019
- Top 10 research papers in conversational AI from 2019
- Top 10 computer vision research papers from 2019
- Top 12 AI ethics research papers introduced in 2019
- Top 10 reinforcement learning research papers from 2019
Enjoy this article? Sign up for more AI research updates.
We’ll let you know when we release more summary articles like this one.