Trustworthy AI Requires More Than Just Good Ethics

The Most Dangerous Idea in AI Right Now

Imagine you are building a self-driving car. You hire the best ethicists. You write a code of conduct. You make sure the car won't run over pedestrians. You check all the boxes.

Then the car kills someone because its sensors failed in the rain.

That is the problem Natalia Díaz-Rodríguez and her colleagues at the University of Granada and several other institutions identified in their 2023 paper published in Information Fusion (Díaz-Rodríguez et al., 2023). The problem is not that the AI was unethical. The problem is that ethics alone does not make a system trustworthy. Trustworthiness requires something far more mundane and far more difficult: it requires the system to actually work, to be auditable, and to survive contact with the real world.

The authors reviewed hundreds of AI ethics frameworks, regulatory documents, and technical papers. They found that most discussions of trustworthy AI focus on one of three things: principles, ethics, or technical requirements. But these three things do not talk to each other. An AI system can be perfectly ethical in theory and catastrophically unsafe in practice. A system can be technically robust and ethically bankrupt. The authors argue that true trustworthiness requires all three pillars to be integrated from the start, not bolted on at the end.

The Seven Requirements Nobody Talks About Together

Díaz-Rodríguez and her team organize the sprawling field of AI trustworthiness around seven concrete requirements, the same seven set out in the European Commission's 2019 Ethics Guidelines for Trustworthy AI. Each one sounds obvious in isolation. Together, they reveal something uncomfortable about how we build AI today.

1. Human Agency and Oversight

This requirement means humans should always be able to override an AI system. But here is the catch: the authors found that most current AI systems are designed to be opaque. If a human cannot understand why the AI made a decision, they cannot meaningfully override it. The requirement is not just a kill switch. It is a demand for interpretability.

2. Robustness and Safety

This is the sensor failure problem. The authors define robustness as the ability of an AI system to handle unexpected inputs, adversarial attacks, and environmental changes without failing catastrophically. Safety means the system does not cause harm even when it fails. Most AI systems today are tested on clean, curated datasets. The authors found that real-world performance degrades sharply when conditions shift, and most organizations do not test for this systematically.
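To make "testing for shift" concrete, here is a minimal sketch that scores one model on clean inputs and on a drifted copy of the same inputs. It is not taken from the paper: the toy threshold model, the `shift_inputs` helper, the sample readings, and the 0.25 offset are all hypothetical, chosen only to show the shape of such a check.

```python
from typing import Callable, List, Sequence


def accuracy(model: Callable[[float], int],
             inputs: Sequence[float],
             labels: Sequence[int]) -> float:
    """Fraction of inputs the model labels correctly."""
    correct = sum(1 for x, y in zip(inputs, labels) if model(x) == y)
    return correct / len(labels)


def shift_inputs(inputs: Sequence[float], offset: float) -> List[float]:
    """Simulate a crude distribution shift: every reading drifts by `offset`."""
    return [x + offset for x in inputs]


def toy_model(x: float) -> int:
    # Hypothetical classifier: flags any reading at or above 0.5.
    return 1 if x >= 0.5 else 0


# Hypothetical sensor readings and their true labels.
clean_inputs = [0.1, 0.3, 0.45, 0.55, 0.7, 0.9]
labels = [0, 0, 0, 1, 1, 1]

clean_acc = accuracy(toy_model, clean_inputs, labels)
shifted_acc = accuracy(toy_model, shift_inputs(clean_inputs, 0.25), labels)

print(f"clean accuracy:   {clean_acc:.2f}")
print(f"shifted accuracy: {shifted_acc:.2f}")
```

On this toy data the model is perfect on the clean set and misclassifies two of the six drifted readings. A real robustness suite would use perturbations that match the deployment environment (rain, glare, sensor noise) rather than a single constant offset.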

3. Privacy and Data Governance

This requirement goes beyond GDPR compliance. The authors argue that privacy must be built into the architecture, not added as a policy. They point to differential privacy and federated learning as technical approaches, but note that most companies still rely on data collection models that are fundamentally incompatible with privacy guarantees.
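As an illustration of what "built into the architecture" can mean, the sketch below applies the standard Laplace mechanism from differential privacy to a simple counting query. It is a teaching example under stated assumptions, not the paper's method and not production-grade: the `private_count` function, the sample ages, and the epsilon value are hypothetical, and naive floating-point noise like this is known to be unsuitable for real deployments.

```python
import random
from typing import Callable, Iterable, Optional


def private_count(records: Iterable,
                  predicate: Callable[[object], bool],
                  epsilon: float,
                  rng: Optional[random.Random] = None) -> float:
    """Return a noisy count using the Laplace mechanism.

    A counting query has sensitivity 1: adding or removing one person
    changes the true count by at most 1, so the noise scale is 1 / epsilon.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    true_count = sum(1 for r in records if predicate(r))
    scale = 1.0 / epsilon
    # The difference of two independent exponential draws with mean `scale`
    # follows a Laplace distribution centred on 0 with that scale.
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return true_count + noise


# Hypothetical data: ages of people in a small dataset.
ages = [23, 35, 67, 71, 44, 68, 29, 80]
noisy_over_65 = private_count(ages, lambda age: age >= 65, epsilon=0.5)
print(f"noisy count of people aged 65+: {noisy_over_65:.1f}")
```

A smaller epsilon means more noise and stronger privacy; a larger one means a more accurate answer and weaker privacy. The point of the example is that the guarantee lives in the query path itself, not in a policy document.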

4. Transparency

Transparency has become a buzzword, but the authors give it teeth. They distinguish between three levels: traceability (can you track what data and decisions led to an outcome?), explainability (can a human understand why a decision was made?), and communication (can the system tell the user what it is doing in plain language?). Most AI systems achieve none of these.

5. Diversity, Non-Discrimination, and Fairness

This requirement is about bias, but the authors broaden it. They argue that fairness cannot be achieved by tweaking a model after training. It must be designed into the data collection, the feature selection, the model architecture, and the deployment context. A system that is fair in one country may be discriminatory in another because social categories differ.

6. Societal and Environmental Wellbeing

This is the requirement almost nobody talks about. The authors argue that AI systems should be evaluated not just on their direct effects but on their systemic effects. Does the system increase inequality? Does it consume massive amounts of energy? Does it concentrate power in the hands of a few companies? The authors found that most AI ethics frameworks ignore these questions entirely.

7. Accountability

This is the hardest requirement. Accountability means someone is responsible when the system fails. But who? The developer? The deployer? The user? The authors argue that accountability requires clear chains of responsibility, auditable processes, and legal frameworks that can assign blame. Most AI systems today have none of these.

Why Ethics Alone Is a Trap

The authors make a sharp distinction between ethics and trustworthiness. Ethics is about principles: fairness, justice, beneficence. Trustworthiness is about whether the system actually deserves your trust. A system can be built on perfect ethical principles and still be untrustworthy if it is unreliable, opaque, or unaccountable.

This distinction matters because many organizations treat ethics as a checkbox. They write a code of ethics, they hire an ethics board, they publish a principles document. Then they deploy a system that fails in predictable ways because they did not address the technical requirements. The authors call this "ethics washing": using ethical language to avoid real accountability.

The paper argues that ethics must be translated into technical requirements through a process they call "operationalization." This means asking not just "is this fair?" but "how do we measure fairness?" and "what do we do when fairness conflicts with accuracy?" These are engineering questions, not philosophy questions.
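Here is one way those two questions can look once they become engineering questions. The sketch measures fairness as the demographic parity gap (the difference in positive-decision rates between groups) and sets it beside accuracy. Demographic parity is only one of several competing fairness definitions and is not singled out by the paper; the function names, group labels, and data below are hypothetical.

```python
from typing import List, Sequence


def selection_rate(predictions: Sequence[int],
                   groups: Sequence[str],
                   group: str) -> float:
    """Share of positive decisions (1s) received by one group."""
    picked = [p for p, g in zip(predictions, groups) if g == group]
    return sum(picked) / len(picked)


def demographic_parity_gap(predictions: Sequence[int],
                           groups: Sequence[str]) -> float:
    """Largest difference in selection rate between any two groups."""
    rates = [selection_rate(predictions, groups, g) for g in set(groups)]
    return max(rates) - min(rates)


def accuracy(predictions: Sequence[int], labels: Sequence[int]) -> float:
    """Fraction of predictions that match the true labels."""
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


# Hypothetical data: two groups of four people with true outcomes.
groups: List[str] = ["a", "a", "a", "a", "b", "b", "b", "b"]
labels = [1, 1, 0, 0, 1, 0, 0, 0]

# Model A reproduces the labels exactly; Model B equalizes selection rates.
model_a = [1, 1, 0, 0, 1, 0, 0, 0]
model_b = [1, 1, 0, 0, 1, 1, 0, 0]

for name, preds in (("A", model_a), ("B", model_b)):
    print(name,
          "accuracy:", accuracy(preds, labels),
          "parity gap:", demographic_parity_gap(preds, groups))
```

On this toy data, Model A is fully accurate but selects group "a" twice as often as group "b", while Model B closes the gap at the cost of one wrong decision. Neither number says which model to ship. That choice is the trade-off the organization has to make explicitly.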

The Auditing Problem Nobody Wants to Talk About

The authors propose that the only way to make AI truly trustworthy is through auditing. But auditing AI is fundamentally different from auditing a financial system or a manufacturing process. An AI system is not static. It changes as it encounters new data. It can behave differently in different contexts. It can fail in ways that were not anticipated during development.

The authors introduce the concept of a "regulatory sandbox": a controlled environment where AI systems can be tested before deployment. But they acknowledge that sandboxes have limitations. They cannot simulate every real-world scenario. They can be gamed by developers who optimize for the sandbox rather than the real world. And they require regulatory bodies to have technical expertise that most currently lack.

This is where the paper gets uncomfortable. The authors found that most AI auditing today is performed by the companies that built the systems. This is like letting a student grade their own exam. Independent auditing is rare, expensive, and technically difficult. The authors argue that without independent auditing, no AI system can be considered truly trustworthy.

What the Research Does Not Prove

The paper is a review and synthesis, not an experimental study. It does not provide a simple checklist or a one-size-fits-all solution. The authors are clear that trustworthiness is context-dependent. A system that is trustworthy for recommending movies may not be trustworthy for diagnosing cancer. The requirements are the same, but the thresholds are different.

The paper also does not prove that any existing AI system is trustworthy. It provides a framework for evaluation, but applying that framework to real systems is a separate and difficult task. The authors acknowledge that most AI systems today fail on multiple requirements.

There is an open question that the paper does not fully resolve: can an AI system ever be truly trustworthy if it is owned and operated by a private company with profit incentives? The authors note that regulation is necessary, but they do not specify what form that regulation should take. This is a political question, not just a technical one.

The Regulation Paradox

The authors argue that regulation is the key to trustworthy AI. But they also identify a paradox: regulation can only work if it is informed by technical reality, and technical reality is constantly shifting. By the time a regulation is written, the technology may have moved beyond it.

They propose a solution: regulation should focus on processes rather than specific technologies. Instead of regulating what AI can do, regulate how it is built, tested, and audited. This is analogous to how the FDA regulates drug development: it does not dictate how a compound must be designed, but it does prescribe the trial process a drug must pass and the evidence that process must generate before approval.

But this approach has its own problems. Process-based regulation is harder to enforce. It requires regulators to have deep technical expertise. And it can be gamed by companies that follow the letter of the process while violating its spirit.

The Responsibility Gap

The paper introduces a concept that deserves more attention: the "responsibility gap." This is the space between who builds an AI system and who is harmed by it. In traditional engineering, responsibility is clear. If a bridge collapses, the engineer who designed it is responsible. But AI systems are built by teams, trained on data from unknown sources, deployed in contexts the developers did not anticipate, and modified by users after deployment. Who is responsible when something goes wrong?

The authors argue that closing this gap requires a new kind of legal entity: the "responsible AI system." This is not just a technical system but a sociotechnical one that includes the developers, the deployers, the auditors, and the regulatory framework. The system as a whole must be accountable, not just the algorithm.

What This Actually Means

▸ If you are building an AI system, start with the auditing process, not the ethics principles. Design your system so that every decision can be traced back to a specific input and a specific model state. If you cannot audit it, you cannot trust it.

▸ If you are buying an AI system, demand evidence of robustness testing. Ask to see the results of tests on edge cases, adversarial inputs, and distribution shifts. If the vendor cannot provide this, the system is not trustworthy.

▸ If you are regulating AI, focus on processes and outcomes, not specific technologies. Require independent auditing, clear chains of responsibility, and public reporting of failures. Do not let companies audit themselves.

▸ If you are using an AI system, assume it will fail. Ask what happens when it does. Is there a human in the loop? Can you override it? Is there a clear process for reporting problems? If the answer to any of these is no, the system is not ready for deployment.

▸ If you are evaluating an AI ethics framework, ask what happens when principles conflict. Most frameworks list values but do not prioritize them. A trustworthy system requires clear trade-offs: when fairness conflicts with accuracy, which one wins? If the framework does not answer this, it is not a framework, it is a wish list.
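For builders, the first recommendation above (tracing every decision back to a specific input and a specific model state) can be sketched in a few lines. This is an illustrative assumption about what such a record might contain, not a design from the paper: the `DecisionRecord` fields, the helper names, and the example model version string are all hypothetical.

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class DecisionRecord:
    """One immutable audit entry tying an output to its input and model."""
    model_version: str   # identifies the exact model state that decided
    input_digest: str    # SHA-256 of the canonicalized input features
    output: str
    timestamp: str       # supplied by the caller, e.g. ISO 8601 UTC


def digest(features: dict) -> str:
    """Hash the features in a key-order-independent canonical form."""
    canonical = json.dumps(features, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def record_decision(model_version: str, features: dict,
                    output: str, timestamp: str) -> DecisionRecord:
    """Create the audit entry at the moment the decision is made."""
    return DecisionRecord(model_version, digest(features), output, timestamp)


def matches(record: DecisionRecord, features: dict) -> bool:
    """Let an auditor check whether a claimed input produced this record."""
    return record.input_digest == digest(features)


# Hypothetical usage: a loan decision logged with its model version.
entry = record_decision(
    model_version="credit-model-1.4.2",
    features={"income": 42000, "age": 31},
    output="approve",
    timestamp="2024-01-01T00:00:00Z",
)
print(entry)
print(matches(entry, {"age": 31, "income": 42000}))
```

Storing a digest rather than the raw input keeps personal data out of the log while still letting an auditor confirm which input a decision was based on. A real system would also need tamper-evident storage and a way to retrieve the exact model referenced by `model_version`.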

References

[1] Natalia Díaz-Rodríguez, Javier Del Ser, Mark Coeckelbergh, Marcos López de Prado (2023). Connecting the dots in trustworthy Artificial Intelligence: From AI principles, ethics, and key requirements to responsible AI systems and regulation. Information Fusion.