
Trust in AI Requires Squelching Bias


Artificial intelligence (AI) is increasingly enabling personal, professional, and societal benefits by advancing key goals for efficiency and productivity; labor markets; economic growth and welfare; and health, environmental, and human rights protection. Indeed, when developed ethically and harnessed responsibly, AI can benefit humanity across many key domains. Further, this technology is a critical enabler for and integrator with other emergent technologies, from 5G to advanced big data analytics. In making its data-driven predictions, recommendations, or decisions, however, AI can also have negative effects that threaten privacy and security and present risks for potentially unfair or discriminatory outcomes. These harms can be unintentional or deliberate; they can perpetuate existing socioeconomic disparities, as well.

 

More and more, organizations use AI to derive insights from data so that they can make decisions with far-reaching effects for people. AI use especially affects sensitive areas such as hiring and recruitment, access to financial credit, eligibility for benefits, criminal justice, and health care. People have been asking questions about the ethics and fairness of computer systems for some time, so such issues are not new. Now, however, AI is attracting policy attention and intense public interest. The sections that follow explore key areas that affect the just development and operation of AI as well as solutions for building mainstream adoption and acceptance.

Bias, for Better or Worse

Definitions of bias range from "a tendency or inclination toward something" to unreasoned judgment, systematic error, disproportionate weighting, or prejudice. More than 180 human biases have been defined and classified, any one of which can affect how we make decisions. Indeed, not all biases are bad, unfair, or unhelpful: consider a bias toward healthy food or toward avoiding dangerous activities. Similarly, some algorithms are designed to target specific people who have an inclination toward something, such as for personalizing marketing offers or matching profiles on a dating site.

 

In AI, the concern is about systems making unfair decisions that reflect different forms of negative bias. One of the most common yet often invisible examples is confirmation bias, defined as a tendency to search for, interpret, and recall information in a way that confirms or strengthens one's preexisting beliefs. A wealth of meta-analytic studies has highlighted the pervasive nature of bias within recruitment and hiring, for example, from sexism, ageism, racism, and disability discrimination to judgments based on people's appearance. One notable example is Amazon's now-abandoned experimental recruiting tool, which was shown to favor men in its selection of applicants.

 

Other examples abound; in fact, according to the 2018 AI Now report, at least 1,200 incidents have occurred in which AI systems caused harm in the real world. The Apple Card reportedly offered men, on average, 10 times the credit limit offered to women with the same financial status. The issue was not solely the perception or possibility that something was wrong with the algorithm but also the inability to clearly explain how the decision had been made and what criteria it used. In the public sector, the Correctional Offender Management Profiling for Alternative Sanctions (COMPAS) algorithm, used to inform bail decisions by estimating the risk of recidivism, has demonstrated a systematically higher false-positive rate for Black defendants than for their white counterparts (all other factors being equal). Bias concerns have also been raised regarding issues as diverse as facial recognition, housing allocation, visa application fast-tracking or acceptance, and child welfare.
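To make the false-positive disparity concrete, here is a minimal sketch of how such a gap can be measured. The data and column names are hypothetical, not the actual COMPAS records; the point is only to show the group-wise false-positive-rate comparison that audits of such systems rely on.

```python
import pandas as pd

# Hypothetical records: each row holds a defendant's group label, the model's
# high-risk flag, and whether the person actually reoffended.
df = pd.DataFrame({
    "group":               ["A", "A", "A", "A", "B", "B", "B", "B"],
    "predicted_high_risk": [1,   0,   1,   0,   1,   1,   0,   1],
    "reoffended":          [0,   0,   1,   1,   0,   0,   0,   1],
})

def false_positive_rate(frame: pd.DataFrame) -> float:
    """Share of people who did NOT reoffend but were still flagged as high risk."""
    negatives = frame[frame["reoffended"] == 0]
    if negatives.empty:
        return float("nan")
    return (negatives["predicted_high_risk"] == 1).mean()

# A large gap between groups signals the kind of disparity reported for COMPAS.
print(df.groupby("group").apply(false_positive_rate))
```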

 

Ironically, given the scales of justice personified as Lady Justice, bad data most often adversely affects women. A specific and highly tangible example is the design of car safety features, such as airbags, headrests, and seatbelts. As a recent BBC Television discussion showed, these features were designed primarily on the basis of crash-test dummy data that reflected the physique and seating position of men; the dimensions of female and pregnant bodies are not reflected in the standard measurements. As a result, women are 47 percent more likely to be seriously injured and 17 percent more likely to die than men in a similar accident.

Why AI Issues Occur

AI offers us the potential to reduce the degree and impact of human cognitive bias, decision-making styles, attitudes, and subjective interpretation of data. Issues remain, however. In particular, machines cannot learn beyond the data with which they are trained. If we teach AI systems to imitate human preferences—for example, by feeding them historical data and biased human decisions—they will not just replicate such decisions but augment, exacerbate, and scale the baked-in human biases, too. A small element of bias in data can have a huge ripple effect and, unlike human bias, grow exponentially. As noted in the 2018 report from AI Now, “While individual human assessors may also suffer from bias or flawed logic, the impact of their case-by-case decisions has nowhere near the magnitude or scale that a single flawed automated decision-making system can have across an entire population.”

 

Simply put, garbage in, garbage out. But worse, bias embedded within the algorithms or data sets that AI systems use is reinforced through machine learning (ML). The leading factors here are insufficient training data; erroneous assumptions; and issues with data-selection criteria or data collection, such as oversampling of a specific population. There is no substitute for having an equal representation of groups in training data. There can also be hard-coded bias in the ML process, from dirty data affecting predictive policing systems to “word embeddings” trained on news stories that exhibit the gender stereotypes found in society.
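One simple, practical corollary of the data-selection point is to inspect group representation and outcome rates in the training data before any model is trained. The sketch below uses hypothetical hiring data and an arbitrary 30 percent threshold purely for illustration.

```python
import pandas as pd

# Hypothetical hiring data containing a sensitive attribute.
train = pd.DataFrame({
    "gender": ["F", "M", "M", "M", "M", "F", "M", "M", "M", "M"],
    "hired":  [0,   1,   1,   0,   1,   0,   1,   1,   0,   1],
})

# How is each group represented in the training set?
shares = train["gender"].value_counts(normalize=True)
print(shares)

# Positive-outcome rate per group: a large gap is a historical pattern the
# model will learn and likely amplify.
print(train.groupby("gender")["hired"].mean())

# Flag groups that fall below an (illustrative) 30 percent share.
underrepresented = shares[shares < 0.30]
if not underrepresented.empty:
    print("Warning: underrepresented groups:", list(underrepresented.index))
```

Such a check does not remove bias on its own, but it surfaces the skew early, when rebalancing the data or collecting more of it is still cheap.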

 

Whether historical data mirrors the current situation can create problems, as well. During the coronavirus (COVID-19) pandemic, retail inventory management, marketing, and fraud detection experienced problems because AI systems struggled with unexpected bulk orders and sudden shifts in consumer search and purchase patterns. If no historical data mirrors a situation, AI systems can falter or fail, which also brings into focus the continued need for human involvement to provide vital checks and balances.
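One way to operationalize those checks and balances is a simple drift test that compares live inputs against the historical distribution the model was trained on and escalates anomalies to a human. The sketch below is illustrative only, using synthetic order quantities and a two-sample Kolmogorov-Smirnov test.

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(1)

# Hypothetical daily order quantities: pre-pandemic history vs. live data
# that suddenly contains bulk orders.
historical_orders = rng.poisson(lam=3, size=1000)
live_orders = rng.poisson(lam=12, size=200)

# A tiny p-value means the live data no longer resembles the training data,
# so the system should route decisions to human review rather than act alone.
stat, p_value = ks_2samp(historical_orders, live_orders)
if p_value < 0.01:
    print(f"Data drift detected (KS statistic = {stat:.2f}); route to human review.")
```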

 

Similarly, consider what happens when the AI innovation curve is accelerated—for example, during the swift introduction of contact tracing, tracking apps, and exposure notification technology in the fight against COVID-19. In principle, this acceleration can be viewed as being for the collective good because it helps identify and contain infections and better understand COVID-19’s manner of spread; however, concerns arise for privacy protection, surveillance, security, safety, and future use, as exemplified by the different centralized and decentralized approaches to data storage adopted worldwide.

Trust and Understanding

Trust and understanding of AI systems are central to today's AI problems and provide hints for future solutions. Indeed, Edelman's 2019 benchmarking research shows that trust inequalities are now at record highs, with only one in five participants believing that "the system is working for them." This issue is not limited to AI; it extends to trust in other transformative technologies, such as 5G. In many respects, trust in each other, and trust in tech, has never mattered more, which is why recognition of AI's impact is growing in many sectors even as concerns remain. Taking education as an example, in a recent UK survey by the innovation foundation Nesta, 61 percent of parents with children aged 18 or younger believed that AI would have an important role in running school classrooms by 2035; however, the same proportion were concerned that the decisions AI systems make could be unfair.

 

Confusion about how AI systems make decisions, incidents of these systems' perceived or actual biased outcomes, and trust issues all combine to contribute to the frequent media and public tendency to view AI in either utopian or dystopian terms. This approach can create an unhelpful inevitability narrative that narrows the opportunity for focused discussion and debate on specific domains of impact, such as military, existential, social, or political effects. Additionally, the growing framing of "AI development as a 'race' between superpowers in the East and West" contributes to this narrative, and academics in particular have led calls for nations to work more closely together to better ensure that AI can benefit all of humanity.

Searching for Better

The road toward building AI systems that are free from bias requires that we come together, build understanding, remove barriers, and manage risk. This mandate is particularly complex when we consider an emerging duality: Can you govern algorithms while also governing by algorithms?

 

One need is to define fairness, which is more complicated than it may seem. Princeton professor Arvind Narayanan has identified at least 21 definitions of fairness while noting that even this list is not exhaustive. Indeed, no universal definition of fairness exists, because fairness depends on situational context. Different standards and metrics will likely be required, depending on the specific use case and circumstances in scope; one size does not fit all. Transparency of communication is therefore vital: research shows that trust rises when the public feels appropriately informed.
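As one concrete instance of those many competing definitions, demographic parity asks whether a model selects members of different groups at similar rates. The sketch below computes that single metric for hypothetical loan-approval predictions; other definitions (equalized odds, calibration, and so on) are computed differently and can conflict with this one.

```python
import numpy as np

def demographic_parity_difference(y_pred: np.ndarray, group: np.ndarray) -> float:
    """Gap in positive-prediction rates between groups (0.0 = equal selection rates)."""
    rates = {g: y_pred[group == g].mean() for g in np.unique(group)}
    return max(rates.values()) - min(rates.values())

# Hypothetical loan approvals for two groups of applicants.
y_pred = np.array([1, 0, 1, 1, 0, 0, 0, 1, 0, 0])
group  = np.array(["men", "men", "men", "men", "men",
                   "women", "women", "women", "women", "women"])

# A 0.6 approval rate for men vs. 0.2 for women gives a gap of 0.4.
print(demographic_parity_difference(y_pred, group))
```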

 

We must also focus on the fundamental principles of ethics. At least 22 different sets of ethical principles have been published since January 2017. For example, the Montreal Declaration for the Responsible Development of Artificial Intelligence proposes ethical principles based on 10 fundamental values:

  • Well-being
  • Respect for autonomy
  • Protection of privacy and intimacy
  • Solidarity
  • Democratic participation
  • Equity
  • Diversity inclusion
  • Prudence
  • Responsibility
  • Sustainable development

 

Similarly, the European Union High-Level Expert Group on Artificial Intelligence proposes that, in general, AI systems should be:

  • Lawful, respecting all applicable laws and regulations;
  • Ethical, respecting ethical principles and values; and
  • Robust from technical and social perspectives.

 

Beyond these principled foundations, what else must be done to actualize advancements? Strategy consultation at both the country and sector levels is useful, and establishing structures that facilitate the use of data for the common good is a key starting point for navigating issues of global and urgent need, such as the COVID-19 pandemic. This approach can enhance international cooperation to address policy, governance, regulatory, and technological challenges and would go some way toward building a new technology diplomacy. Similarly, regulation and legislation have parts to play, as illustrated by the General Data Protection Regulation's provisions on human review and transparency and by the Algorithmic Accountability Act proposed in the United States to require testing of whether technology is making biased, inaccurate, discriminatory, or otherwise unfair decisions.

 

Legislation is part of the answer, but we need to do more, starting with developing and embedding ethics and norms for decision-making into everyday organizational culture and practice. Transparent communication, accountability, and sharing are vital. They start with consulting the people most likely to be significantly affected by new AI data-driven processes before those processes are implemented, not after. They continue with embedding checks for potential unfair bias in the design phase, accompanied by scrutiny of, and transparency about, the underpinning rationale and the predictability of outcomes. This work also includes ensuring that a person is accountable for decision integrity and consequences, which means embedding a human-machine partnership by design. Further, sharing and learning from examples of where AI has gone wrong are imperative; such examples can highlight what could go wrong in the future, ideally catalyzing discussion, debate, and (where appropriate) intervention.

 

Addressing issues of skills underpins the way forward: Some of the biases found in AI systems can be at least partially attributed to the lack of diversity within the AI technology field. To negate the risk, we must ensure that the people building AI systems bring together diverse experiences, demographics, and perspectives. In addition, there should be a new focus on boosting the public's data literacy to facilitate understanding and provide a shared language so that people can demand accountability from companies, government, and research alike. We must make explainable AI models actionable.
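As a small illustration of making explainability actionable, a team can routinely report which inputs most influence a model's decisions and publish that alongside the system. The sketch below is not tied to any system discussed above; it uses synthetic data and scikit-learn's permutation importance purely to show the idea.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(0)

# Synthetic data: the label depends strongly on feature 0, weakly on feature 1,
# and not at all on feature 2.
X = rng.normal(size=(500, 3))
y = (X[:, 0] + 0.2 * X[:, 1] + 0.1 * rng.normal(size=500) > 0).astype(int)

model = RandomForestClassifier(random_state=0).fit(X, y)

# How much does accuracy drop when each feature is shuffled?
result = permutation_importance(model, X, y, n_repeats=10, random_state=0)
for name, score in zip(["feature_0", "feature_1", "feature_2"],
                       result.importances_mean):
    print(f"{name}: {score:.3f}")
```

Reported in plain language, such importances give affected people something concrete to question, supporting the accountability described above.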

 

Finally, regarding specific technical interventions, options include combining bias-mitigation methods and reducing AI bias through the use of synthetic data, in which developers generate artificial records to balance biased data sets and enhance the overall accuracy of models. The work by Gretel Synthetics is a good example. We can also work to overcome cultural bias and the unfair misrepresentation of marginalized groups in AI-based policy decisions; one approach is to teach machines who we are through our stories, in effect fostering machines' cultural IQ. AI incidents should also be evaluated and prioritized in the same way cybersecurity risks are, which reinforces the message that AI matters.
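The data-balancing idea can be approximated in a few lines: the sketch below simply resamples the under-represented group until the groups are the same size. This is naive duplication rather than true synthetic generation (tools such as Gretel Synthetics learn to create genuinely new records), and the column names are hypothetical.

```python
import pandas as pd

# Hypothetical imbalanced training set: far fewer records for group "B".
df = pd.DataFrame({
    "group":   ["A"] * 90 + ["B"] * 10,
    "feature": range(100),
    "label":   ([1, 0] * 45) + ([1, 0] * 5),
})

# Resample each group (with replacement) up to the size of the largest group.
target = df["group"].value_counts().max()
balanced = pd.concat(
    [g.sample(n=target, replace=True, random_state=0) for _, g in df.groupby("group")],
    ignore_index=True,
)
print(balanced["group"].value_counts())  # both groups now have 90 rows
```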

Conclusions

When developed ethically and harnessed responsibly, AI can benefit humanity across many key domains. Issues of bias in particular, whether introduced by humans or by AI systems, hinder current AI solutions and their adoption and acceptance. As we work to develop AI systems that we can trust, we must consider the whole development life cycle, from data-selection criteria, collection, and preparation through training, modeling, and analysis, to negate the risk of unfair bias and discriminatory outcomes. As we have seen, garbage in does mean garbage out.

 

It is also critical to address the lack of transparency in algorithmic decision-making and explainability. Underpinning this consideration is the need for adherence to ethical principles and implementation guidelines. A more collaborative, consultative, and open approach is vital for building an AI ecosystem that encourages innovation and digital transformation. The potential exists to make the most of this technology and gain rich insights from our increasingly data-rich world while ensuring fairness and human rights.

About the Author

Sally Eaves is a chief technology officer, practicing professor of fintech, and global strategic advisor consulting on the application of disruptive technologies. Globally recognized as a thought leader in the field, she has won multiple awards and is an international keynote speaker and accomplished author.