Guaranteed Safe AI in Practice

What is Guaranteed Safe AI?

The Guaranteed Safe AI (GSAI) framework was published on arXiv in May 2024 (arXiv:2405.06624) by a group of AI researchers including Yoshua Bengio, Stuart Russell, and Max Tegmark. It proposes that AI systems can be made provably safe rather than merely probably safe.

The key insight is architectural. Traditional AI safety relies on behavioural testing and red-teaming -- probing an AI system until it appears safe. GSAI instead requires three formal components: a World Model that formally describes the operating environment and the system's intended scope; a Safety Specification that encodes what the system must and must not do as machine-checkable rules; and a Verifier that continuously checks whether the system's outputs and actions satisfy the specification against the world model.
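The relationship between the three components can be sketched in a few lines of Python. This is a toy illustration only: the class names, the `verify` function, and the heater example are hypothetical and are taken neither from the GSAI paper nor from Vigilens. It is also a runtime check over a single state and action, which is far weaker than the formal proof the framework has in mind.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Hypothetical types for illustration; not from the GSAI paper or Vigilens.
State = Dict[str, float]


@dataclass(frozen=True)
class WorldModel:
    """Describes the intended operating scope as an allowed range per variable."""
    bounds: Dict[str, Tuple[float, float]]

    def in_scope(self, state: State) -> bool:
        return all(
            name in state and low <= state[name] <= high
            for name, (low, high) in self.bounds.items()
        )


@dataclass(frozen=True)
class SafetySpecification:
    """Named machine-checkable predicates over (state, action)."""
    rules: Dict[str, Callable[[State, float], bool]]


@dataclass(frozen=True)
class Verdict:
    ok: bool
    violations: List[str]


def verify(world: WorldModel, spec: SafetySpecification,
           state: State, action: float) -> Verdict:
    """Check one proposed action against the specification under the world model."""
    if not world.in_scope(state):
        # Outside the modelled scope nothing can be certified.
        return Verdict(False, ["out_of_scope"])
    violations = [name for name, rule in spec.rules.items()
                  if not rule(state, action)]
    return Verdict(not violations, violations)


world = WorldModel(bounds={"temperature_c": (0.0, 100.0)})
spec = SafetySpecification(rules={
    "heater_off_when_hot": lambda s, a: not (s["temperature_c"] > 80.0 and a > 0.0),
    "power_within_limit": lambda s, a: 0.0 <= a <= 1.0,
})

print(verify(world, spec, {"temperature_c": 90.0}, 0.5))
```

The design point the sketch tries to capture is that the verdict depends on all three inputs: the same action can pass under one world model and be uncertifiable under another.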

When a Verifier certifies that a system meets its Safety Specification under its World Model, you have a Proof Certificate -- a formal, auditable record that the system behaved safely under defined conditions. This is categorically stronger than a penetration test, a consultant's report, or a self-assessment questionnaire.

The EU AI Act (Regulation 2024/1689) requires high-risk AI providers to demonstrate conformity -- evidence that a system meets specified obligations. GSAI provides the formal architecture to generate that evidence systematically. Vigilens is built to make this architecture operational for AI development teams without requiring formal methods expertise.

How Vigilens maps to GSAI

Each of the four Vigilens product layers corresponds directly to a GSAI component:

GSAI Component	Description	Vigilens Layer	Implementation
Safety Specification	Machine-checkable rules encoding what the system must and must not do.	Controls	EU AI Act Articles 8-15 encoded as machine-executable Rules-as-Code. Each rule maps to a specific Article obligation, is version-controlled, and can be customised or extended. Supports BYO-LLM for enterprises with data-residency requirements.
Verifier	Continuous process that checks system outputs against the Safety Specification and World Model.	Evidence	Automated evidence collection from GitHub, GitLab, Confluence, Datadog, MLflow, Jira, and S3. Every CI/CD release triggers a compliance run. Evidence is timestamped, cryptographically signed, and mapped to the specific Control it satisfies.
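A minimal sketch of what "timestamped, cryptographically signed, and mapped to a Control" could look like as a record. The field names, the control identifier, and the use of HMAC-SHA256 are assumptions made for illustration; the source does not say which signature scheme Vigilens uses, and a production system would more likely use asymmetric signatures so that auditors can verify without holding the signing key.

```python
import hashlib
import hmac
import json
from datetime import datetime, timezone
from typing import Optional


def sign_evidence(control_id: str, source: str, payload: bytes, key: bytes,
                  collected_at: Optional[str] = None) -> dict:
    """Build an evidence record and sign its canonical JSON form (hypothetical schema)."""
    record = {
        "control_id": control_id,          # which Control this evidence supports
        "source": source,                  # e.g. "github" or "mlflow"
        "collected_at": collected_at or datetime.now(timezone.utc).isoformat(),
        "payload_sha256": hashlib.sha256(payload).hexdigest(),
    }
    message = json.dumps(record, sort_keys=True).encode("utf-8")
    signature = hmac.new(key, message, hashlib.sha256).hexdigest()
    return {**record, "signature": signature}


def verify_evidence(record: dict, key: bytes) -> bool:
    """Recompute the signature over every field except the signature itself."""
    unsigned = {k: v for k, v in record.items() if k != "signature"}
    message = json.dumps(unsigned, sort_keys=True).encode("utf-8")
    expected = hmac.new(key, message, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, record.get("signature", ""))


key = b"demo-key-not-for-production"
rec = sign_evidence("ART12-LOGGING", "github", b"logging: enabled", key)
print(verify_evidence(rec, key))
```

Because the timestamp and the control identifier are inside the signed message, changing either after the fact invalidates the signature.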

EU AI Act article mapping

Each Vigilens layer satisfies specific EU AI Act obligations under Regulation 2024/1689. The table below shows the primary article mapping for high-risk AI systems under Annex III:

EU AI Act Article(s)	Obligation	Vigilens Layer
Art. 6 + Annex III	Determine whether an AI system qualifies as high-risk	Classify -- six-question classifier produces a binding Annex III determination with article references
Art. 9	Risk management system: identify, analyse, evaluate, and mitigate risks throughout lifecycle	Classify produces the risk profile; Controls instantiates the risk management rules as executable checks
Arts. 8-15	Requirements for high-risk AI systems: data governance, technical documentation, record-keeping, transparency, human oversight, accuracy and robustness	Controls -- all Articles 8-15 obligations encoded as machine-executable Rules-as-Code, version-controlled and auditable
Arts. 11-12	Technical documentation (Annex IV) and automatic logging requirements	Evidence -- automated collection from engineering tools at every release generates the technical documentation record and audit log required by Arts. 11-12 and Annex IV
Art. 13	Transparency and provision of information to deployers	Controls + Audit Pack -- transparency obligations encoded as rules; Audit Pack includes user-facing transparency disclosure
Art. 14	Human oversight measures	Controls -- human-in-the-loop obligations encoded and verified at each Evidence run
Art. 43	Conformity assessment: self-assessment or third-party assessment required before market placement	Audit Pack -- one-click conformity pack containing all evidence, control mappings, and Art. 43 self-assessment declaration
Arts. 72-73 + Annex IV	Post-market monitoring (Art. 72) and serious incident reporting (Art. 73)	Evidence -- continuous monitoring triggers from Datadog and MLflow feed ongoing post-market surveillance

Why machine-executable rules matter

Most AI governance platforms encode compliance as policy documents, checklists, or workflow tasks. A human reads the checklist, decides whether the obligation is met, and records a yes or no. This approach has two failure modes: human error in interpretation, and point-in-time assessment that becomes stale between audit cycles.

Vigilens encodes EU AI Act obligations as Rules-as-Code -- machine-executable conditions that run automatically at every CI/CD release. If a system's logging configuration drifts away from Article 12 compliance between releases, the next Evidence run catches it automatically, without human review. The rule fails. The Evidence timestamp records the failure. The next Audit Pack reflects it.
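To make the logging-drift scenario concrete, here is a hedged sketch of a rule set that a CI job could run against a deployment configuration. The `Rule` class, the rule identifiers, the configuration keys, and the 180-day retention threshold are all hypothetical placeholders; they are not the Vigilens rule format and are not a legal interpretation of Article 12.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Placeholder threshold chosen for illustration only.
MIN_RETENTION_DAYS = 180


@dataclass(frozen=True)
class Rule:
    rule_id: str
    article: str
    description: str
    check: Callable[[Dict], bool]


RULES: List[Rule] = [
    Rule("LOG-001", "Art. 12", "Automatic event logging is enabled",
         lambda cfg: cfg.get("logging", {}).get("enabled") is True),
    Rule("LOG-002", "Art. 12", "Log retention meets the configured minimum",
         lambda cfg: cfg.get("logging", {}).get("retention_days", 0) >= MIN_RETENTION_DAYS),
]


def run_rules(config: Dict, rules: List[Rule] = RULES) -> List[Dict]:
    """Evaluate every rule and return one result row per rule."""
    return [
        {"rule_id": r.rule_id, "article": r.article, "passed": bool(r.check(config))}
        for r in rules
    ]


def exit_code(results: List[Dict]) -> int:
    """Non-zero if any rule failed, so a CI step can fail the release."""
    return 0 if all(r["passed"] for r in results) else 1


# A release where someone switched logging off: the drift is caught without review.
drifted = {"logging": {"enabled": False, "retention_days": 365}}
results = run_rules(drifted)
for row in results:
    print(row)
print("exit code:", exit_code(results))
```

Keeping each rule as data (identifier, article, predicate) rather than as ad hoc script logic is what allows results to be mapped back to a specific obligation.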

This is the practical meaning of the GSAI Verifier: not a consultant running a quarterly review, but a continuous automated check against a formal specification. The output is not an opinion -- it is a timestamped, cryptographically signed record that the system did or did not satisfy each obligation at each point in its lifecycle.

For EU AI Act compliance specifically, this matters because Article 9 requires risk management to be a continuous, iterative process run throughout the system's lifecycle, not a one-time assessment. Rules-as-Code running in CI/CD is designed to satisfy the regulation's ongoing nature without creating unsustainable manual overhead.

Why EU AI Act enforcement makes this urgent

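Regulation (EU) 2024/1689 entered into force on 1 August 2024. Obligations for general-purpose AI (GPAI) models apply from 2 August 2025. Obligations for high-risk AI systems under Annex III apply from 2 August 2026, and those for high-risk systems embedded in products covered by Annex I apply from 2 August 2027. Penalties reach €35 million or 7% of global annual turnover, whichever is higher, for the most serious infringements (prohibited practices); non-compliance with most other obligations, including those on high-risk systems, carries fines of up to €15 million or 3%.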
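The "whichever is higher" wording means the ceiling depends on company size. A small worked example, using the €35 million and 7% figures quoted above; the function name and the sample turnover values are illustrative only.

```python
def max_fine_eur(global_turnover_eur: int,
                 fixed_cap_eur: int = 35_000_000,
                 pct: int = 7) -> int:
    """Upper bound of the fine: the higher of a fixed amount and a share of turnover."""
    turnover_based = global_turnover_eur * pct // 100  # integer arithmetic, whole euros
    return max(fixed_cap_eur, turnover_based)


# Turnover of EUR 100 million: 7% is EUR 7 million, so the EUR 35 million figure applies.
print(max_fine_eur(100_000_000))    # 35000000
# Turnover of EUR 2 billion: 7% is EUR 140 million, which exceeds EUR 35 million.
print(max_fine_eur(2_000_000_000))  # 140000000
```

The crossover sits at a turnover of €500 million, where 7% equals €35 million.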

For providers of AI systems in the Annex III categories -- biometrics, critical infrastructure, education and vocational training, employment and HR, access to essential services, law enforcement, migration and border control, and administration of justice -- the compliance window is now under six months. Conformity assessment documentation must be in place before the system is placed on the market.

A manual, document-based compliance programme cannot scale to the pace of AI development. A Rules-as-Code platform running in CI/CD can. That is the practical case for building compliance into the engineering workflow from day one -- and the practical case for the GSAI architecture as the foundation.

References

Primary citation

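Towards Guaranteed Safe AI: A Framework for Ensuring Robust and Reliable AI Systems.
David "davidad" Dalrymple, Joar Skalse, Yoshua Bengio, Stuart Russell, Max Tegmark, Sanjit Seshia, Steve Omohundro, Christian Szegedy, Ben Goldhaber, Nora Ammann, Alessandro Abate, Joe Halpern, Clark Barrett, Ding Zhao, Tan Zhi-Xuan, Jeannette Wing, and Joshua Tenenbaum.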
arXiv preprint arXiv:2405.06624, May 2024.
Available at: https://arxiv.org/abs/2405.06624

EU AI Act

Regulation (EU) 2024/1689 of the European Parliament and of the Council of 13 June 2024 laying down harmonised rules on artificial intelligence (Artificial Intelligence Act).
Official Journal of the European Union, L series, 12 July 2024.
Available at: EUR-Lex, CELEX number 32024R1689
