With contributions from: Alejandro Guerrero (OECD); Alessandra Garbero (IFAD); Miki Tsukamoto (IFRC) and Rob Lloyd (Itad)
Abstract
Evaluators have a crucial role to play in fostering trust in AI. This blogpost argues that evaluators should be key players in ensuring the responsible adoption of AI and invites us, as evaluation practitioners, to explore the evolving nature of our role and the broader implications of our involvement in this rapidly changing context.
Disclaimer: The views expressed in this blogpost are those of the author and do not necessarily reflect those of the European Investment Bank.
- The AI Dilemma: harnessing opportunity while managing risks
Artificial intelligence has the potential to significantly reshape sectors and economies by driving innovation, improving efficiency, and enabling new capabilities. Businesses are increasingly using AI to automate tasks, support decision-making, and improve customer experiences, whether through personalised recommendations, predictive analytics, or advanced robotics. In healthcare, AI is enabling early disease detection and helping identify high-risk patients, while in finance, it helps detect fraud. In the public sector as well, organisations are exploring the potential analytical capabilities of AI to inform decision making and to support public service delivery.
These benefits come with risks. AI can perpetuate bias, produce misleading information (hallucinations), and introduce security vulnerabilities and privacy concerns, depending on how systems are designed and used. Over time, AI risks displacing jobs and raises ethical questions about autonomous decision-making and the reliance on complex, sometimes opaque algorithms. Over the longer term, risks related to the loss of human oversight and to the unintended consequences of increasingly sophisticated AI systems remain areas of active discussion.
In response to these uncertainties, governments and organisations are introducing regulatory measures, corporate AI governance frameworks, and restrictions on deployment to ensure responsible adoption.
At the regulatory level, countries are opting for different approaches to AI governance in an attempt to balance the pressure to leverage AI for competitive advantage against the concerns associated with its adoption. The United States relies on existing federal laws and guidelines to regulate AI, with no federal prohibition on AI applications and a focus instead on technical standards and policies to foster AI growth. The European Union, by contrast, introduced the EU AI Act in 2024 as part of a wider package of policy measures to support the development of trustworthy AI, with a strong focus on the risks posed by AI. The AI Act defines and prohibits AI practices deemed to pose an 'unacceptable' risk, and requires developers and implementers to register high-risk AI models and to maintain technical documentation of the model and its training results.
The United Kingdom, in turn, has adopted a sector-led, more principles-based approach, focusing on regulating applications of AI rather than the technology as a whole. China, through a state-centric approach prioritising national security, social stability and economic growth, has chosen not to adopt a single, overarching AI law, instead introducing a series of targeted regulations and guidelines on AI tools and applications such as deepfakes and generative AI.
The pace of AI advancement further complicates this dynamic between the pressure to adopt and concerns about potential risks. New models and applications are emerging rapidly, making it difficult for policymakers, businesses, the workforce and the public to keep up.
- Trust as a foundation for responsible AI adoption
The central question surrounding AI adoption is not just whether to use the technology but how to do so safely and responsibly. How can organisations, and societies at large, maximise the benefits of AI – its potential for efficiency, innovation, and problem-solving – without exposing themselves to risks, such as bias, misinformation, security threats, or loss of human oversight? Striking this balance is one of the most pressing challenges in AI governance today.
As the President of the EIB, Nadia Calviño, pointed out at the 2025 World Economic Forum, a key element in this equation is trust. As AI continues to evolve, the way we build trust in its capabilities and limitations will determine how confidently we can embrace its potential.
- Role of evaluators in ensuring the sound uptake of AI
To ensure that AI tools are adopted in a responsible and effective manner, there is a need to better understand the technology and to introduce reliable accountability mechanisms within organisations and institutions. Evaluators have a key role to play on both aspects.
Right now, we still know relatively little about the impact of AI tools on their users. Researchers are only beginning to explore the real-world impact of such tools, beyond the pure performance metrics (accuracy, efficiency, scalability) of the models themselves. As AI starts being used in policies, programmes and public services, evaluators may increasingly be called upon to assess the impact of those uses of AI. The recently published Guidance on the Impact Evaluation of AI Interventions by HM Treasury takes a first step in this direction by presenting key emerging best-practice principles for designing impact evaluations of AI interventions. Beyond impact, evaluators may also increasingly need to consider how the use of AI affects the relevance, effectiveness, efficiency and perhaps even sustainability of the interventions they evaluate.
Given their focus on upholding accountability, evaluators can play a key role in holding organisations and institutions to account on their use of AI. The OECD has taken important steps in promoting the design and uptake of trustworthy AI that respects human rights and democratic values through the definition of its AI Principles. Further work is now being considered to introduce accountability mechanisms within institutions and organisations to ensure adherence to these principles. These mechanisms would need to be tailored to the incremental approach used for AI development, with evaluators potentially playing a key role throughout the process, from initial design to deployment and implementation.
- Evaluating in an AI-driven world
This think piece aims to kick off a conversation among evaluators about the evolving nature of our role and the impact AI will have on various aspects of our work. Building on an initial conversation held at the 2024 European Evaluation Society Conference in Rimini, which brought together the OECD, IFAD, IFRC, Itad and the EIB, a few starting points for this broader conversation have been identified.
- Evaluate the effectiveness and relevance of existing evaluation tools
As evaluators are increasingly asked to look into the use of AI within organisations, there will be a need to assess whether existing evaluation frameworks and tools, such as Theories of Change, remain applicable in this highly technological and rapidly shifting context. While at first glance such tools may still be useful, particularly for considering long-term outcomes and impacts, the evaluator toolbox may also need to be updated or refined to address the unique challenges and complexities of evaluating AI. Tools will be needed to identify potential biases and unintended effects of AI, including societal ones.
- Shorten the feedback loop
Given the rapid pace of AI development, traditional, longer evaluation cycles may increasingly need to be supplemented by shorter, more frequent cycles aligned with the development and implementation stages of AI. In the realm of AI and beyond, evaluators will face mounting pressure to be agile, providing timely and actionable recommendations that support progress without creating bottlenecks. This is where AI itself could play a role, assisting in data collection and analysis to streamline the evaluation process. However, here as well, evaluators will need to carefully balance short-term gains with the potential risks of relying on AI.
- Develop new evaluator skills
Finally, evaluators may need to expand their skillsets to include knowledge of AI ethics, machine learning and data science if they are to be equipped to evaluate AI. New competencies and partnerships may be needed to assess AI systems effectively, with increased training or collaboration with technical experts and data scientists helping to build these skills and to fill competency gaps. Learning how to work with, and understand, each other across (new) disciplines will be essential.
To encourage further reflection on the topic within the evaluation community, the European Investment Bank’s evaluation division, Itad, the OECD, IFAD and the IFRC are launching a series of think pieces on how to evaluate AI, considering the broader implications for the field of evaluation. For more information or to join the discussion, reach out to us via email at evaluation@eib.org.
About the authors:
Introduction to the EIB:
The European Investment Bank is the lending arm of the European Union. It is one of the biggest multilateral financial institutions in the world and one of the largest providers of climate finance. The Evaluation function independently assesses the EIB Group’s activities to ensure accountability, support learning, and contribute to evidence-based decision-making.