Classic IT audit components in ML/AI context

While ML systems are an emerging technology, and their broad usage by public entities is just beginning, they share key features with regular software.

Notably, their development cycle is similar, which is why well-established standards, such as the CRISP-DM cycle that was introduced in Chapter 3, may be applied to break down the lifecycle of such an application into several phases that can be audited all at once or selectively.

For some of these phases, auditors may therefore apply the same techniques and audit questions that are appropriate for regular IT performance and compliance audits. Risks that occur in regular software development projects are also relevant to ML algorithm development. However, auditors must also keep in mind the risks that are unique to ML applications (a selection of these risks is included in Sections 3.1 to 3.5).

Therefore, it might be suitable either to audit ML systems in teams composed of specialist auditors, IT auditors and data scientists, or to focus on one component (data science or classical IT audit) while keeping the other component in mind. Teams that lack the necessary experience in data science might be well-advised to rely on the expertise of their data science colleagues, while teams that lack experience in classical IT audits would be well-advised to delegate certain aspects of the audit to more experienced IT auditors. This well-balanced approach reduces the risk that aspects of the system are left unaudited due to a lack of knowledge or tools.

The first audit of a system with ML components that was performed by the Bundesrechnungshof, the German SAI, successfully applied the approach detailed above: the audit team was composed of a specialist auditor with good knowledge of the auditee organisation and two technical auditors (one with an IT background, one with a background in natural sciences). The auditors were recruited from their respective audit units to form the ML audit team for this specific audit.

The audit of the following phases might benefit from auditors with an IT or specialist background:

  1. Business understanding (see Section 3.1)
  2. Deployment and change management (see Section 3.1)
  3. Operation (see Section 3.4)

However, the following phases of the CRISP-DM cycle might be more suited to an audit by data scientists with a background in ML:

  1. Data preparation (see Section 3.2)
  2. Modelling (see Section 3.3)
  3. Evaluation (see Section 3.5)

The ‘data understanding’ phase (see Section 3.2) might benefit from a combined approach, as it requires a technical understanding of the data as well as an understanding of the business goal that the application of the ML algorithm to the data is supposed to achieve.
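
The allocation of phases described above can be written down as a simple lookup table, which may help when planning the composition of an audit team. The following is a minimal sketch only: the names Expertise, PHASE_ASSIGNMENT and uncovered_phases are hypothetical and are not part of the ML audit helper tool, and treating the ‘data understanding’ phase as requiring both kinds of expertise is an assumption made for illustration.

```python
from enum import Enum


class Expertise(Enum):
    """Kinds of auditor background distinguished in the text (hypothetical names)."""
    IT_OR_SPECIALIST = "IT or specialist auditor"
    DATA_SCIENTIST = "data scientist with ML background"
    COMBINED = "combined approach"


# Phase -> (section reference, suggested expertise), taken from the lists above.
PHASE_ASSIGNMENT = {
    "business understanding": ("3.1", Expertise.IT_OR_SPECIALIST),
    "data understanding": ("3.2", Expertise.COMBINED),
    "data preparation": ("3.2", Expertise.DATA_SCIENTIST),
    "modelling": ("3.3", Expertise.DATA_SCIENTIST),
    "evaluation": ("3.5", Expertise.DATA_SCIENTIST),
    "deployment and change management": ("3.1", Expertise.IT_OR_SPECIALIST),
    "operation": ("3.4", Expertise.IT_OR_SPECIALIST),
}


def uncovered_phases(available):
    """Return the phases for which the team lacks the suggested expertise.

    Assumption: a COMBINED phase is only covered when both an IT or
    specialist auditor and a data scientist are available.
    """
    available = set(available)
    both = {Expertise.IT_OR_SPECIALIST, Expertise.DATA_SCIENTIST}
    missing = []
    for phase, (_section, needed) in PHASE_ASSIGNMENT.items():
        if needed is Expertise.COMBINED:
            covered = both <= available
        else:
            covered = needed in available
        if not covered:
            missing.append(phase)
    return missing


if __name__ == "__main__":
    # A team without data scientists: list the phases to delegate.
    for phase in uncovered_phases({Expertise.IT_OR_SPECIALIST}):
        section, _ = PHASE_ASSIGNMENT[phase]
        print(f"{phase} (see Section {section})")
```

A team could use such a table to identify, before the audit starts, which phases it would need to delegate or for which it should seek support from colleagues.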

The ML audit helper tool that is included with this paper offers a host of questions covering all phases of the CRISP-DM cycle, addressed to specialist auditors and IT auditors as well as to data scientists.