Article 12 of the EU AI Act (Regulation (EU) 2024/1689) requires high-risk AI systems to support automatic recording of events (logs) over the lifetime of the system, at a level sufficient to ensure traceability. Article 72 requires providers to operate a post-market monitoring system. For ML engineering teams, these obligations map directly onto two tools already in most production stacks: MLflow for experiment and model tracking, and Datadog for operational monitoring and logging.
The question is not whether to use these tools for compliance -- most teams already use them. The question is how to configure them so that the data they generate constitutes usable EU AI Act evidence.
MLflow: Structuring Experiments and Models for Annex IV
MLflow tracks experiments, runs, models, and datasets. Configured correctly, an MLflow registry becomes a living record of the information required in Annex IV Sections 2 (system elements and development), 3 (monitoring and control), and 6 (changes over the lifecycle).
Experiment Naming Convention
Name MLflow experiments with enough structure that a compliance reviewer can understand what changed, why, and what the regulatory context is:
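One way to keep such a convention from drifting is to build names through a small helper instead of typing them by hand. The sketch below assumes a hypothetical four-part scheme (system, regulatory context, change type, change reference); the segment names, the allowed change types, and the example values are illustrations, not anything prescribed by MLflow or by the Regulation.

```python
import re

# Hypothetical convention: <system>/<regulatory-context>/<change-type>/<change-ref>
SEGMENT_PATTERN = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")
CHANGE_TYPES = {"baseline", "retrain", "feature-change", "architecture-change"}


def experiment_name(system: str, regulatory_context: str,
                    change_type: str, change_ref: str) -> str:
    """Build a structured experiment name, rejecting malformed segments."""
    if change_type not in CHANGE_TYPES:
        raise ValueError(f"unknown change type: {change_type!r}")
    segments = (system, regulatory_context, change_type, change_ref)
    for segment in segments:
        # Lowercase kebab-case keeps names sortable and easy to filter on.
        if not SEGMENT_PATTERN.fullmatch(segment):
            raise ValueError(f"segment must be lowercase kebab-case: {segment!r}")
    return "/".join(segments)


name = experiment_name("credit-scoring", "annex3-5b", "retrain", "chg-0142")
print(name)
# With MLflow installed, the result would be passed to mlflow.set_experiment(name).
```

A reviewer reading the resulting name can see which system was touched, which high-risk category it falls under, what kind of change was made, and which change ticket explains why.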
Required MLflow Run Tags
Add standard tags to every production-relevant MLflow run. These tags make it possible to filter runs by compliance context and to generate Annex IV Section 6 change records automatically:
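A minimal sketch of what such a tag set could look like follows. The tag keys are hypothetical (the `compliance.` prefix simply mirrors the log tag used later in this article), and the validation step is an assumption about how a team might enforce them. The only MLflow call involved is `mlflow.set_tags`, which is imported lazily so the validation logic runs without MLflow installed.

```python
# Hypothetical tag keys; adapt them to your own change-management vocabulary.
REQUIRED_TAGS = (
    "compliance.system_id",
    "compliance.change_type",
    "compliance.change_reason",
    "compliance.risk_assessment_ref",
    "compliance.reviewer",
)


def validate_compliance_tags(tags: dict) -> dict:
    """Return the tags as strings, or raise if a required tag is empty or absent."""
    missing = [key for key in REQUIRED_TAGS
               if not str(tags.get(key) or "").strip()]
    if missing:
        raise ValueError("missing compliance tags: " + ", ".join(missing))
    # MLflow stores tag values as strings, so normalise them up front.
    return {key: str(value) for key, value in tags.items()}


def tag_active_run(tags: dict) -> None:
    """Apply validated tags to the currently active MLflow run."""
    import mlflow  # lazy import: only needed when actually tagging a run
    mlflow.set_tags(validate_compliance_tags(tags))


tags = validate_compliance_tags({
    "compliance.system_id": "credit-scoring",
    "compliance.change_type": "retrain",
    "compliance.change_reason": "quarterly refresh of training data",
    "compliance.risk_assessment_ref": "RA-2024-017",
    "compliance.reviewer": "j.doe",
})
print(tags["compliance.change_type"])
```

Failing the training job when a tag is missing is a deliberate choice here: a run without a change reason cannot be turned into a change record later.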
Model Registry Stage Lifecycle as Conformity Evidence
Use MLflow's model registry staging lifecycle to enforce a compliance gate before production deployment:
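The gate itself can be sketched as a pure check that runs before any registry transition. Everything about the check is an assumption for illustration: the tag keys, the `accuracy` metric name, and the 0.90 threshold are hypothetical. The promotion step uses `MlflowClient.transition_model_version_stage`; recent MLflow releases deprecate stages in favour of aliases and tags, so that final call may need to change depending on your version.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GateResult:
    passed: bool
    failures: tuple


def check_compliance_gate(run_tags: dict, run_metrics: dict,
                          min_accuracy: float = 0.90) -> GateResult:
    """Check that the evidence for Articles 9, 14 and 15 exists on the run."""
    failures = []
    if not run_tags.get("compliance.risk_assessment_ref"):
        failures.append("Article 9: no risk assessment reference")
    if not run_tags.get("compliance.reviewer"):
        failures.append("Article 14: no human reviewer recorded")
    accuracy = run_metrics.get("accuracy")
    if accuracy is None or accuracy < min_accuracy:
        failures.append(f"Article 15: accuracy missing or below {min_accuracy}")
    return GateResult(passed=not failures, failures=tuple(failures))


def promote_if_compliant(model_name: str, version: str,
                         run_tags: dict, run_metrics: dict) -> None:
    """Move a model version to Production only when the gate passes."""
    result = check_compliance_gate(run_tags, run_metrics)
    if not result.passed:
        raise RuntimeError("promotion blocked: " + "; ".join(result.failures))
    from mlflow.tracking import MlflowClient  # lazy import, see lead-in
    MlflowClient().transition_model_version_stage(
        name=model_name, version=version, stage="Production")


ok = check_compliance_gate(
    {"compliance.risk_assessment_ref": "RA-2024-017",
     "compliance.reviewer": "j.doe"},
    {"accuracy": 0.94},
)
blocked = check_compliance_gate({}, {"accuracy": 0.80})
print(ok.passed, blocked.failures)
```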
The model registry transition timestamp, the approving user, and the run metrics form a verifiable record that Article 9 (risk), Article 14 (human oversight), and Article 15 (accuracy) obligations were satisfied before the model went to production.
Logging Dataset Lineage for Article 10
Article 10 requires data governance for training, validation, and test datasets. Log dataset metadata with every training run:
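A content hash is one simple way to tie a run to the exact dataset it used. The sketch below computes a SHA-256 fingerprint per file and assembles a lineage record; the record fields (`role`, `source`, `version`) and the artifact path are hypothetical. Attaching the record to a run relies on `mlflow.log_dict`, imported lazily.

```python
import hashlib
from pathlib import Path

DATASET_ROLES = {"training", "validation", "test"}


def file_sha256(path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large datasets do not need to fit in memory."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def lineage_record(role: str, path, source: str, version: str) -> dict:
    """Describe one dataset file: which split it is, where it came from, its hash."""
    if role not in DATASET_ROLES:
        raise ValueError(f"role must be one of {sorted(DATASET_ROLES)}")
    file_path = Path(path)
    return {
        "role": role,
        "file_name": file_path.name,
        "sha256": file_sha256(file_path),
        "size_bytes": file_path.stat().st_size,
        "source": source,      # hypothetical: upstream system or bucket
        "version": version,    # hypothetical: your own dataset version label
    }


def log_lineage(record: dict) -> None:
    """Store the record as a JSON artifact on the active MLflow run."""
    import mlflow  # lazy import: only needed when a run is active
    mlflow.log_dict(record, f"lineage/{record['role']}.json")
```

Calling `lineage_record("training", "data/train.csv", ...)` once per split and logging each result gives every run a verifiable pointer to its training, validation, and test data.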
Datadog: Configuring Monitoring for Article 12 and Article 72
Datadog is where operational Article 12 logging and Article 72 post-market monitoring live. The goal is not just to have logs -- it is to have logs structured in a way that can be exported as compliance evidence on demand.
Article 12 Log Schema
Article 12 requires automatic logging of events sufficient to ensure traceability. At minimum, each log event for a high-risk AI inference should contain:
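As an illustration, one possible event shape is built below as a JSON line. The field names are hypothetical and chosen to cover what this article discusses: model version, a hashed input reference that names its algorithm, the output, and the human oversight fields. The `ddtags` field assumes the event is shipped through Datadog's HTTP log intake, which reads tags from that attribute.

```python
import hashlib
import json
import uuid
from datetime import datetime, timezone


def canonical_sha256(payload: dict) -> str:
    """Hash a canonical JSON encoding so key order does not change the digest."""
    encoded = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(encoded.encode("utf-8")).hexdigest()


def build_inference_log(model_name, model_version, features, prediction,
                        confidence, *, oversight_required, human_override,
                        reviewer_id=None) -> dict:
    """Assemble one traceability event; raw features are never included."""
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "event_id": str(uuid.uuid4()),
        "event_type": "ai_inference",
        "model": {"name": model_name, "version": str(model_version)},
        "input_ref": {"algorithm": "sha256",
                      "digest": canonical_sha256(features)},
        "output": {"prediction": prediction, "confidence": confidence},
        "human_oversight": {"review_required": oversight_required,
                            "override": human_override,
                            "reviewer_id": reviewer_id},
        "ddtags": "compliance.art12:true",
    }


event = build_inference_log(
    "credit-scoring", 7, {"income": 52000, "age": 41}, "approve", 0.93,
    oversight_required=True, human_override=False,
)
print(json.dumps(event))
```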
Key design decisions:
- Hash inputs, do not log raw personal data -- GDPR's data minimisation principle (Article 5(1)(c)) applies to inference logs as much as to any other data store. A hash lets you confirm later that a given input was the one processed, without storing the input itself
- Log the human override and oversight flags -- these are Article 14 evidence fields showing that human oversight is operational
- Tag with compliance.art12 -- this makes it trivial to generate a compliance-filtered export for auditors
Datadog Monitor Naming for Post-Market Monitoring
Article 72 post-market monitoring requires tracking of performance metrics after deployment. Name Datadog monitors to make their compliance purpose explicit:
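A naming scheme of this kind could be generated rather than typed, so every monitor carries the same prefix. In the sketch below the name format, the `compliance.art72` tag, and the metric name `ml.credit_scoring.accuracy` are all hypothetical; the dictionary follows the general shape of a Datadog metric monitor definition (name, type, query, message, tags) and should be checked against the current API before use.

```python
def monitor_name(system: str, article: int, metric: str, condition: str) -> str:
    """Prefix every monitor with its regulatory purpose and the system it covers."""
    return f"[EU-AI-Act][Art-{article}][{system}] {metric}: {condition}"


def accuracy_monitor(system: str, metric_name: str, threshold: float,
                     window: str = "last_1h") -> dict:
    """Sketch of a metric monitor that alerts when accuracy drops below a floor."""
    return {
        "name": monitor_name(system, 72, metric_name, f"below {threshold}"),
        "type": "metric alert",
        # Doubled braces produce the literal {model:...} scope in the query.
        "query": f"avg({window}):avg:{metric_name}{{model:{system}}} < {threshold}",
        "message": "Post-market monitoring threshold breached; "
                   "record the event in the risk management log.",
        "tags": ["compliance.art72:true", f"model:{system}"],
    }


definition = accuracy_monitor("credit-scoring", "ml.credit_scoring.accuracy", 0.9)
print(definition["name"])
print(definition["query"])
```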
Log Retention Configuration
Article 18 requires technical documentation to be retained for 10 years after the system is placed on the market or put into service. Your compliance-tagged logs should match this retention policy. In Datadog:
- Create a dedicated Log Archive for compliance-tagged events (compliance.art12:true)
- Route this archive to long-term storage (S3, Azure Blob) with a 10-year retention policy
- Separate compliance logs from operational logs so that archived compliance records do not incur Datadog indexing and retention costs
- Document the archive configuration in your Annex IV Section 9 (post-market monitoring plan)
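The archive described in the list above could be defined programmatically. The sketch below builds a request body in the shape I understand Datadog's Logs Archives API to expect for an S3 destination; treat the field names as an assumption to verify against the current API reference. The archive name, bucket, path, account ID, and role name are placeholders. The 10-year retention itself is not part of this payload: it would be enforced on the bucket, for example through a lifecycle or object-lock policy.

```python
import json


def compliance_archive_payload(bucket: str, path: str,
                               aws_account_id: str, role_name: str) -> dict:
    """Request body for an archive that captures only Article 12 tagged logs."""
    return {
        "data": {
            "type": "archives",
            "attributes": {
                "name": "eu-ai-act-art12",          # placeholder archive name
                "query": "compliance.art12:true",   # only compliance-tagged events
                "include_tags": True,               # keep tags with archived events
                "destination": {
                    "type": "s3",
                    "bucket": bucket,
                    "path": path,
                    "integration": {
                        "account_id": aws_account_id,
                        "role_name": role_name,
                    },
                },
            },
        }
    }


payload = compliance_archive_payload(
    bucket="example-compliance-logs",    # placeholder
    path="/art12",                       # placeholder
    aws_account_id="123456789012",       # placeholder
    role_name="datadog-archive-role",    # placeholder
)
print(json.dumps(payload, indent=2))
```

Keeping the payload in version control alongside the monitoring plan gives you a dated record of how the archive was configured.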
Generating Compliance Exports
When a market surveillance authority requests evidence, you need to be able to produce a compliance log export quickly. Set up a saved Datadog query for Article 12 log exports:
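A saved query is easier to reproduce if it is generated from parameters. The sketch below builds a search string and a request body in the shape used by Datadog's v2 log search endpoint (`filter.query`, `filter.from`, `filter.to`, `sort`, `page.limit`). The attribute paths `@model.name` and `@model.version` are assumptions about your log schema, and only the tag filter comes from this article.

```python
from datetime import datetime, timezone


def art12_export_query(model_name: str, model_version=None) -> str:
    """Search string for Article 12 events, optionally pinned to one version."""
    parts = ["compliance.art12:true", f"@model.name:{model_name}"]
    if model_version is not None:
        parts.append(f"@model.version:{model_version}")
    return " ".join(parts)


def search_request(query: str, start: datetime, end: datetime,
                   page_limit: int = 1000) -> dict:
    """Request body for one page of results over an explicit UTC time window."""
    if start.tzinfo is None or end.tzinfo is None:
        raise ValueError("start and end must be timezone-aware")
    if start >= end:
        raise ValueError("start must be earlier than end")
    return {
        "filter": {"query": query,
                   "from": start.isoformat(),
                   "to": end.isoformat()},
        "sort": "timestamp",           # oldest first, for a readable audit trail
        "page": {"limit": page_limit},
    }


body = search_request(
    art12_export_query("credit-scoring", 7),
    datetime(2025, 1, 1, tzinfo=timezone.utc),
    datetime(2025, 4, 1, tzinfo=timezone.utc),
)
print(body["filter"]["query"])
```

Events that have already aged out of indexed storage would need to be read from the archive instead, so the export procedure should state which source covers which period.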
The Full Evidence Chain
When MLflow and Datadog are configured this way, every inference event produces a traceable chain:
- The model version in the Datadog log matches the registered model version in MLflow
- The MLflow run record shows the training data, risk assessment, and human reviewer who approved the model for production
- Datadog monitor alerts feed back into the Article 9 risk management records kept in Confluence
- The Annex IV Section 9 page in Confluence (post-market monitoring plan) links to the Datadog monitors that implement it
This chain is what an Article 43 conformity assessment reviewer or a market surveillance authority inspector expects to see. It is not a compliance artefact you produce for audits -- it is the normal operational record of a well-run AI system, configured to be auditable.
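The first two links of the chain can be checked mechanically. The sketch below assumes a hypothetical snapshot of the model registry, keyed by model name and version, and a log event that carries a nested `model` object; neither shape is prescribed by MLflow or Datadog. It resolves a single log event to the evidence recorded for the model version that produced it.

```python
REQUIRED_EVIDENCE = ("run_id", "training_data_sha256",
                     "risk_assessment_ref", "reviewer")


def resolve_evidence_chain(log_event: dict, registry: dict) -> dict:
    """Follow a log event back to the registry record for its model version."""
    key = (log_event["model"]["name"], str(log_event["model"]["version"]))
    record = registry.get(key)
    if record is None:
        raise LookupError(f"no registered model version for {key}")
    missing = [field for field in REQUIRED_EVIDENCE if not record.get(field)]
    if missing:
        raise ValueError(f"registry record for {key} lacks: {missing}")
    chain = {"event_id": log_event.get("event_id"), "model": key}
    chain.update({field: record[field] for field in REQUIRED_EVIDENCE})
    return chain


# Hypothetical registry snapshot, e.g. exported from MLflow run tags.
registry = {
    ("credit-scoring", "7"): {
        "run_id": "run-abc123",
        "training_data_sha256": "0" * 64,
        "risk_assessment_ref": "RA-2024-017",
        "reviewer": "j.doe",
    }
}
event = {"event_id": "evt-1", "model": {"name": "credit-scoring", "version": 7}}
chain = resolve_evidence_chain(event, registry)
print(chain["run_id"], chain["reviewer"])
```

Running a check like this over a sample of recent events is a cheap way to find broken links before an assessor does.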
Frequently Asked Questions
How long must Article 12 inference logs be retained?
Article 18 requires technical documentation to be retained for 10 years after the system is placed on the market or put into service. Article 19 sets a separate floor of at least six months for the automatically generated logs themselves, so keeping Article 12 logs for the full 10 years is a conservative choice: it keeps the evidence trail available for as long as the documentation it supports. For operational cost reasons, compliance-tagged logs can be archived to lower-cost long-term storage (S3 Glacier, Azure Archive) rather than retained in your primary Datadog account.
Is hashing inference inputs sufficient for Article 12, or must raw inputs be logged?
For most high-risk systems, hashing is both sufficient and preferable. Article 12 requires logging sufficient to ensure traceability -- it does not, in general, require storing raw personal data inputs, which would conflict with GDPR data minimisation requirements. A cryptographic hash lets you confirm later that a specific input was processed at a specific time without retaining the personal data. The exception is remote biometric identification (Annex III point 1(a)), where Article 12(3) requires logging the input data for which a search led to a match. The input reference field in the log should document the hashing algorithm used.
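A short sketch of how such an input reference could be produced and later verified follows. It swaps the plain hash for a keyed one (HMAC-SHA256), which is my assumption rather than something the text above prescribes: inputs with few possible values can be recovered from an unkeyed hash by brute force, and a secret key makes that impractical. The field names and the key handling are hypothetical; in practice the key would come from a secrets manager.

```python
import hashlib
import hmac
import json

ALGORITHM = "hmac-sha256"


def canonical_bytes(payload: dict) -> bytes:
    """Encode with sorted keys so logically equal inputs hash identically."""
    return json.dumps(payload, sort_keys=True,
                      separators=(",", ":")).encode("utf-8")


def input_reference(payload: dict, key: bytes) -> dict:
    """Build the logged reference: the algorithm name plus the keyed digest."""
    digest = hmac.new(key, canonical_bytes(payload), hashlib.sha256).hexdigest()
    return {"algorithm": ALGORITHM, "digest": digest}


def matches(payload: dict, key: bytes, reference: dict) -> bool:
    """Check whether a candidate input is the one a log entry refers to."""
    if reference.get("algorithm") != ALGORITHM:
        return False
    expected = input_reference(payload, key)["digest"]
    # Constant-time comparison avoids leaking digest prefixes through timing.
    return hmac.compare_digest(expected, reference["digest"])


key = b"demo-key-not-for-production"  # placeholder secret
reference = input_reference({"income": 52000, "age": 41}, key)
print(reference["algorithm"], len(reference["digest"]))
print(matches({"age": 41, "income": 52000}, key, reference))
```

When an authority or a data subject asks whether a particular input was processed, `matches` answers the question from the log alone, provided the key has been retained for as long as the logs.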