2 Introduction

Artificial intelligence (AI) systems based on machine learning (ML) models are currently under rapid development, with many successful applications, so far predominantly in the private sector. Public sector entities are beginning to develop and implement ML algorithms in the provision of public services. The goal is a more efficient public administration offering improved, possibly personalised, services at lower cost.

The development and implementation of ML algorithms lead to new challenges, including the use of personal data versus privacy rights, inexplicable and therefore unjustifiable decisions, and potentially institutionalised discrimination through algorithmic bias. If an algorithm is not properly tailored to its objective and environment, it can result in a higher workload, delays and frustrated staff. The use of a carelessly developed algorithm in public services can thus lead to obscured inefficiency and damaged trust in the authorities, and be detrimental to a well-functioning public sector. Both internal control mechanisms and external audits are needed to ensure the proper use of ML and to prevent such harmful side effects.
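To make one of these risks concrete, the sketch below shows how an auditor might quantify a simple notion of algorithmic bias, the demographic parity difference between groups of affected persons. It is a minimal illustration only: the column names and data are hypothetical, and this single metric is not part of the audit model described in this paper.

```python
# Minimal, hypothetical sketch: measuring the demographic parity difference,
# i.e. the largest gap in positive-decision rates between groups. Column names
# and data are invented for illustration.
import pandas as pd

def demographic_parity_difference(df: pd.DataFrame,
                                  group_col: str,
                                  decision_col: str) -> float:
    """Return the largest gap in positive-decision rates between any two groups."""
    rates = df.groupby(group_col)[decision_col].mean()
    return float(rates.max() - rates.min())

# Hypothetical decisions produced by an ML model used in a public service.
decisions = pd.DataFrame({
    "group":   ["a", "a", "a", "a", "b", "b", "b", "b"],
    "granted": [1,   0,   1,   0,   1,   1,   1,   0],
})

print(demographic_parity_difference(decisions, "group", "granted"))  # 0.25 here
```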

The first principles and guidelines to address AI-related risks have been developed both internationally and at the national level in several countries, and are likely to result in relevant legislation in the near future.1 Independent third-party auditing is recommended not only in the context of the EU’s General Data Protection Regulation (GDPR) and subsequent interpretations,2 but also for all AI systems affecting fundamental rights: the EU’s Ethics Guidelines for Trustworthy Artificial Intelligence [7] point out the need for AI systems to be lawful, ethical and robust. They further list accountability, including auditability, as one requirement for trustworthy AI, and explicitly state the necessity of independent internal and external audits. The topics of fairness, transparency and accountability of AI are extensively discussed in the global research community.3 Although it is not obvious how to facilitate the auditability of ML algorithms, the necessity is widely acknowledged.4 The idea of specialised, licensed AI system auditors has been put forward [1].

This paper outlines potential audits of AI systems by Supreme Audit Institutions (SAIs), covering risks related to the use of ML models in government agencies as well as possible tests to gather audit evidence. It further includes an auditability checklist which summarises the minimum prerequisites an auditee organisation should retain from the ML implementation phase to enable any subsequent audit.
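As a purely hypothetical illustration of what ‘minimum prerequisites to retain’ could mean in practice (the authoritative list is the checklist accompanying this paper, not reproduced here), such material might be grouped roughly as follows; every category and item below is an assumption made for the sake of the example.

```python
# Hypothetical, non-exhaustive grouping of artefacts an auditee might be asked
# to retain from the ML implementation phase; the authoritative list is the
# auditability checklist accompanying the paper.
AUDITABILITY_ARTEFACTS = {
    "data":       ["snapshots of training/validation/test data",
                   "data provenance and preprocessing steps"],
    "model":      ["versioned model artefacts",
                   "training code, configuration and hyperparameters"],
    "evaluation": ["performance metrics per model version",
                   "analyses of bias and robustness"],
    "governance": ["documented objective and intended use",
                   "risk assessments, change logs and access logs"],
}
```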

An audit of ML algorithms can have components of both performance audit and compliance audit. It should always include a risk assessment of the related IT system, potentially leading to a wider IT audit that includes the AI system. ML algorithms are typically not used as stand-alone software but are embedded in a pipeline of procedures as part of a wider IT infrastructure. The focus of this paper lies on the audit of the ML component.
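A minimal sketch, assuming scikit-learn and a toy classification task, of how an ML component typically sits inside a larger processing pipeline rather than being deployed as stand-alone software; everything named here is illustrative and not drawn from the pilot audits.

```python
# Minimal sketch (assumptions: scikit-learn available; data and task are toy
# examples) of an ML component embedded in a processing pipeline.
import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression

pipeline = Pipeline(steps=[
    ("preprocess", StandardScaler()),   # data preparation step within the wider IT system
    ("model", LogisticRegression()),    # the ML component in scope for the audit
])

# Toy records standing in for data flowing from upstream registers.
X = np.array([[0.2, 1.0], [1.5, 0.3], [3.1, 2.2], [0.1, 0.4]])
y = np.array([0, 1, 1, 0])

pipeline.fit(X, y)
print(pipeline.predict(X))  # downstream case handling would act on this output
```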

The suggested audit model is based on the literature given in the bibliography, as well as on the experiences of the authoring SAIs with audits of IT systems in general and, in particular, pilot audits of ML applications. It is thus focused on the most commonly used AI systems and those encountered in the pilot audits, and should be updated as further audit experience is gained and new research results become available.

Chapter 3 is structured into five sections corresponding to five audit topics. It suggests an ‘audit catalogue’ that specifies, for each point, the relevant considerations together with audit questions and risks. A detailed list of practical audit tests and suggested contacts within the auditee organisation is given at the end of each section.

The ML audit helper tool (in Excel) is available as a separate file that accompanies this paper.

Bibliography

[1] M. Brundage et al. (2020): Toward Trustworthy AI Development: Mechanisms for Supporting Verifiable Claims, https://arxiv.org/abs/2004.07213.
[2] European Parliament Research Service (EPRS) (2019): EU guidelines on ethics in artificial intelligence: Context and implementation, https://www.europarl.europa.eu/RegData/etudes/BRIE/2019/640163/EPRS_BRI(2019)640163_EN.pdf.
[3] U. von der Leyen (2019): https://g8fip1kplyr33r3krz5b97d1-wpengine.netdna-ssl.com/wp-content/uploads/2019/07/190714-Letter-Candidate-RENEW-1.pdf.
[4] OECD (2019): Recommendation of the Council on Artificial Intelligence, https://legalinstruments.oecd.org/en/instruments/OECD-LEGAL-0449.
[5] J. Gesley et al. (2019): Regulation of Artificial Intelligence in Selected Jurisdictions, https://www.loc.gov/law/help/artificial-intelligence/regulation-artificial-intelligence.pdf.
[6] The Article 29 Working Party (2018): Guidelines on Automated individual decision-making and Profiling for the purpose of Regulation 2016/679, https://ec.europa.eu/newsroom/article29/item-detail.cfm?item_id=612053.
[7] High-Level Expert Group on AI (2019): Ethics Guidelines for Trustworthy Artificial Intelligence, https://ec.europa.eu/digital-single-market/en/news/ethics-guidelines-trustworthy-ai.
[8] ACM Conference on Fairness, Accountability, and Transparency (ACM FAccT), https://facctconference.org/.
[9] I. D. Raji et al. (2020): Closing the AI Accountability Gap: Defining an End-to-End Framework for Internal Algorithmic Auditing, doi: https://doi.org/10.1145/3351095.3372873.

  1. Several countries are in the process of producing standards for AI similar to the EU’s guidelines [2], the European Commission has announced a legislative proposal [3], and the Organisation for Economic Co-operation and Development has already adopted recommendations that were accepted by G20 ministers as guiding principles for trustworthy AI [4]. See [5] for an overview of existing AI laws and policies.

  2. For example, see the EU’s Guidelines on Automated individual decision-making and Profiling [6].

  3. For example, see the ACM FAccT conference series on fairness, accountability and transparency [8].

  4. For example, researchers at Google have proposed a framework for internal algorithmic auditing [9], composed of five stages that aim to mitigate related risks before the deployment of the ML system.