The AI Verify testing framework and toolkit is available for voluntary adoption by companies, to validate the performance of their AI systems against internationally recognised AI governance principles through testing.
AI Verify consists of three parts, namely the: (1) testing framework; (2) software toolkit; and (3) testing report.
The testing framework comprises testable criteria and testing processes corresponding to the following 11 ethics principles that are aligned with internationally recognised principles:
The software toolkit enables companies to conduct technical tests and process checks on their AI models. The toolkit will generate a testing report for the AI model tested.
The draft catalogue of LLM evaluations aims to provide a baseline standard for the evaluation and testing of LLMs. To this end, it contains a taxonomy of LLM evaluations under various domains that provides an overview of currently available tests. It also recommends a minimum baseline set of safety evaluations that LLM developers should conduct prior to LLM release.
Companies that are interested to test the performance of their AI systems against recognised AI governance principles can access the AI Verify toolkit via the AI Verify Foundation website here. Companies can share the test reports of their AI systems with stakeholders to demonstrate transparency.
Companies that provide AI testing and advisory services can incorporate the use of AI Verify into their services if they wish.
AI Verify has been made open source and the government is encouraging interested parties to contribute to growth of the AI testing ecosystem by building plugins to AI Verify. Developers and researchers can contribute new AI testing algorithms to improve AI Verify for future use cases. The open source code can be accessed via the AI Verify Foundation website.
Companies can also integrate AI Verify into their products and systems or build on top of it. The AI Verify open-source code is governed by the Apache 2.0 permissive software licence.
Companies that are interested to develop LLMs can refer to the draft catalogue of LLM evaluations for guidance on available evaluation and testing approaches, and how their LLMs can be evaluated to ensure a minimum level of safety and trustworthiness.
As the AI Verify testing framework and toolkit is intended to be ongoingly developed with input from the global open source community, it remains to be seen what additional use cases and functionalities may be developed in future.
The government may also develop and issue further guidance on the governance of generative AI technologies, including LLMs, as this space continues to evolve.
*Information is accurate up to 27 November 2023