Most legal teams are treating the EU AI Act as an AI vendor problem. Procurement is waiting for software suppliers to send compliance notices. IT is studying up on model documentation requirements and conformity assessments.
Nobody is looking at the PIM.
That is a mistake that could cost seven figures.
August 2, 2026 is when the EU AI Act’s full provisions for high-risk AI systems take effect. According to Pearl Cohen’s analysis of Article 113, this date triggers comprehensive requirements across risk management, data governance, technical documentation, and human oversight. And embedded within those requirements - in Article 10, specifically - is a clause that most e-commerce and manufacturing companies haven’t thought through yet.
It says that high-risk AI systems which use training data must be developed using data sets that meet specific quality criteria: relevance, representativeness, completeness, and absence of errors.
Where does that data come from, for most product-selling companies using AI?
The PIM. The product data layer. The system most teams treat as a content management problem, not a compliance problem.
Honest take: the people building AI systems for product recommendations, automated pricing, assortment planning, and supplier selection are going to need to prove their training data was clean. That proof starts upstream - in the systems that created and managed the data in the first place.
What Article 10 Actually Says About Product Data
Article 10 of the EU AI Act is titled “Data and Data Governance.” It is the provision that most AI compliance guides summarize in a sentence and then move past quickly. That’s the wrong instinct.
The article requires that training, validation, and testing data sets for high-risk AI systems must:
- Be relevant and representative for the intended purpose
- Be collected using appropriate methods
- Be free of errors and complete “to the best extent possible”
- Have appropriate statistical properties, including, where applicable, as regards the persons or groups of persons on whom the high-risk AI system is intended to be used
- Take into account the specific characteristics of the use case
Now think about what that actually means for a company using AI to power product recommendations, automated category classification, or intelligent supplier matching.
Every one of those systems was trained on product data. Titles, attributes, category hierarchies, supplier codes, EAN numbers, material compositions, unit specifications. The data that lives - or should live - in your PIM.
If that data was incomplete, inconsistently structured, or riddled with mapping errors when the model trained on it, the model itself may fail Article 10’s criteria. And the deployer of that model - your company, not only the vendor that built it - carries compliance obligations of its own under the Act.
The fines are not symbolic. The Act’s top penalty tier - €35 million or 7% of global annual turnover, whichever is higher - applies to prohibited AI practices; non-compliance with high-risk requirements such as Article 10 can still reach €15 million or 3% of turnover. For context, GDPR maxes out at 4%. The AI Act’s ceiling is higher.
Is your PIM ready to be exhibit A in a conformity assessment?
Why the AI Act Flows Upstream Into Your Product Data
Here’s what most compliance checklists miss: the EU AI Act does not just regulate AI models. It regulates the entire system that produces an AI outcome, including the data that shaped that outcome.
Kenaz GmbH’s 2026 analysis puts it plainly: “Design your data preparation pipeline with erasure in mind from day one.” The implication is that by the time data reaches a model, it must already be auditable, structured, and attributable.
That is not a model architecture problem. That is a product data infrastructure problem.
In 70+ PIM implementations, the pattern is consistent. Product data arrives from suppliers as Excel files with inconsistent headers, missing mandatory attributes, wrong unit formats, and category assignments that nobody has reviewed in three years. It gets imported into the PIM as-is because the deadline is tomorrow. The PIM “completeness” bar shows 78% and the team moves on.
That 78%-complete data is now potentially part of an AI training set. And if your AI recommendation engine, demand forecasting system, or automated classification tool is a high-risk AI system under the EU AI Act, those quality gaps are your legal liability.
The real kicker: “high-risk” is broader than most assume.
What Counts as a High-Risk AI System in Your Stack
Annex III of the EU AI Act lists the use cases that make an AI system high-risk. The categories most likely to touch product-selling businesses include:
- Employment and workers’ management - AI used in recruitment, task allocation, or monitoring and evaluating worker performance
- Access to essential private services - including AI systems that assess creditworthiness or establish credit scores, which reaches AI-driven financing and eligibility decisions in commerce
- Insurance risk assessment and pricing - AI used for risk assessment and pricing in life and health insurance
There is a more nuanced area that is still evolving: AI-powered product recommendation engines that influence consumer purchasing decisions at scale. While not explicitly listed in Annex III today, legal advisors are watching how Member States’ supervisory authorities interpret the “significant harm to health, safety, or fundamental rights” threshold.
Even if your recommendation engine is not ultimately classified as high-risk, the Article 50 transparency obligations - requiring disclosure when AI generates content that might deceive consumers - take effect in August 2026 regardless of risk classification.
Point is: if you are using AI anywhere in your product data or commerce stack, you need to trace the data lineage. And that trace almost always leads back to the PIM.
What does your product data lineage actually look like right now?
The Compliance Gap Nobody Has Mapped Yet
Here’s the honest state of things in most PIM environments as of early 2026:
Product data quality is measured by completeness percentages. A product is “complete” when mandatory fields are filled. That is the standard. It is also, from an Article 10 perspective, nowhere near sufficient.
The EU AI Act’s quality criteria require data to be representative (not just present), free of systematic errors (not just filled), and appropriate for the intended use case (not just the channel it was originally created for).
We covered this in more detail in our analysis of what AI readiness actually requires from your product data, but the compliance dimension adds a new layer. It’s not just about whether your AI agents can work with the data. It’s about whether you can prove the data met quality standards at the time of training.
And proof requires audit trails.
Most PIM systems have version history. Few have structured quality audit trails that record when data was validated, by whom, against what schema, and with what outcome. Fewer still have supplier-side attribution - documentation showing that a specific attribute value came from a specific supplier document on a specific date.
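As a concrete illustration, a structured quality audit record with supplier-side attribution could look like the following Python sketch. The field names (`schema_version`, `source_document`, and so on) are illustrative assumptions, not the schema of any particular PIM:

```python
"""Sketch of a structured quality-audit record: who validated what, when,
against which schema, with what outcome, and from which supplier source.
All field names are illustrative assumptions."""
from dataclasses import dataclass, asdict
from datetime import datetime, timezone

@dataclass(frozen=True)
class QualityAuditRecord:
    sku: str
    attribute: str
    value: str
    schema_version: str   # schema the value was validated against
    outcome: str          # "pass" or "fail"
    validated_by: str     # person or pipeline that ran the check
    validated_at: str     # ISO 8601 timestamp of the validation
    source_document: str  # supplier file the value came from
    source_date: str      # date of that supplier document

# One record per validated attribute value - this is what "supplier-side
# attribution" looks like at the data level.
record = QualityAuditRecord(
    sku="SKU-1042",
    attribute="material_composition",
    value="100% cotton",
    schema_version="apparel-v3",
    outcome="pass",
    validated_by="import-pipeline",
    validated_at=datetime.now(timezone.utc).isoformat(),
    source_document="supplier_acme_2026-03-12.xlsx",
    source_date="2026-03-12",
)
```

Stored as append-only rows, records like this answer the audit questions directly: which supplier document a value came from, and whether it passed validation at the time.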
That gap is expensive. In our work across 70+ implementations, rebuilding data lineage after the fact typically costs €14,000 to €22,000 per 1,000 products - because you are not just cleaning data, you are reconstructing a history that should have been captured as a process.
Actually, scratch that - the cost figure understates the problem. The monetary cost is recoverable. The compliance exposure from a retroactive audit is not.
There’s also a subtler issue. The EU AI Act requires data to meet quality criteria not just for initial training but on an ongoing basis. Post-market monitoring obligations mean that if your product data quality degrades over time - because suppliers start sending lower-quality inputs, or internal teams skip validation steps under pressure - the AI system that depends on that data may fall out of compliance between your scheduled audits. Clean data is not a project. It is an operational capability.
If a national AI supervisory authority asks for your training data quality documentation and you cannot produce it, you are not just facing a fine. You may have to pull the AI system from deployment while you remediate - and the downstream cost of that on e-commerce operations is a different magnitude entirely.
What an AI-Act-Ready Product Data Pipeline Looks Like
The good news: the infrastructure requirements of Article 10 and an operationally excellent PIM setup overlap almost completely. This is not new work - it is existing work done properly.
An AI-Act-ready product data pipeline has five characteristics:
Structured intake with schema validation. Supplier data is received in a defined format (not freehand Excel), validated against a schema at ingestion, and rejected or flagged when it does not meet the standard. Every rejection is logged.
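A minimal sketch of that intake step, in Python - the schema, field names, and reject-log format are illustrative assumptions:

```python
"""Schema validation at ingestion: supplier rows are accepted or
rejected against a defined schema, and every rejection is logged.
The attribute names and types are illustrative assumptions."""

# Required attributes and the type each value must parse as.
SCHEMA = {
    "ean": str,
    "title": str,
    "net_weight_kg": float,
    "country_of_origin": str,
}

def validate_row(row: dict) -> list[str]:
    """Return the list of schema violations for one supplier row."""
    errors = []
    for attr, expected in SCHEMA.items():
        value = row.get(attr)
        if value in (None, ""):
            errors.append(f"missing:{attr}")
            continue
        try:
            expected(value)  # e.g. float("n/a") raises ValueError
        except (TypeError, ValueError):
            errors.append(f"bad_format:{attr}={value!r}")
    return errors

def ingest(rows: list[dict]) -> tuple[list[dict], list[dict]]:
    """Split rows into accepted and rejected; every rejection is logged."""
    accepted, reject_log = [], []
    for i, row in enumerate(rows):
        errors = validate_row(row)
        if errors:
            reject_log.append({"row": i, "errors": errors})
        else:
            accepted.append(row)
    return accepted, reject_log
```

The point is not the specific checks but that rejection is an explicit, logged event rather than a silent import of bad data.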
Attribute-level completeness tracking. Not “this product is 80% complete,” but “this product is missing material_composition, country_of_origin, and hazardous_material_flag - which are required for AI training data in the safety products category.”
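That attribute-level view can be sketched in a few lines - the category names and required-attribute lists below are assumptions for illustration:

```python
"""Attribute-level completeness: report which required attributes are
missing per product, per category - not a single percentage. The
category requirements here are illustrative assumptions."""

REQUIRED_BY_CATEGORY = {
    "safety_products": [
        "material_composition", "country_of_origin", "hazardous_material_flag",
    ],
    "apparel": [
        "material_composition", "size", "care_instructions",
    ],
}

def missing_attributes(product: dict) -> list[str]:
    """Return the required attributes this product is missing for its category."""
    required = REQUIRED_BY_CATEGORY.get(product.get("category"), [])
    return [a for a in required if product.get(a) in (None, "")]
```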
Data lineage from source to model. Each attribute value in the PIM should be traceable to its origin: which supplier document, which version, which import date, which review.
Explicit quality gates before AI consumption. Products that do not meet the quality threshold for a given AI use case are excluded from training and inference datasets until they do. This is documented.
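A hedged sketch of such a gate - the check names and product fields are invented for illustration:

```python
"""Explicit quality gate: only products that pass every check enter the
training set, and every exclusion is recorded with its reasons. The
checks themselves are illustrative assumptions."""

def quality_gate(products, checks):
    """checks: list of (name, predicate) pairs.
    Returns (passed, exclusion_log)."""
    passed, exclusion_log = [], []
    for p in products:
        failed = [name for name, ok in checks if not ok(p)]
        if failed:
            # Exclusions are documented, not silent - this log is the
            # evidence that the gate was actually enforced.
            exclusion_log.append({"sku": p.get("sku"), "failed_checks": failed})
        else:
            passed.append(p)
    return passed, exclusion_log
```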
Ongoing monitoring and drift detection. Data quality is not a one-time project. As supplier inputs change and product catalogs evolve, the quality of the training dataset shifts. Monitoring catches this before it becomes a compliance gap.
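A minimal drift check might compare each batch’s completeness rate against a recorded baseline - the five-point tolerance below is an illustrative assumption, not a regulatory threshold:

```python
"""Simple drift detection: flag when a batch's completeness rate falls
more than `tolerance` below the baseline measured at training time.
The tolerance value is an illustrative assumption."""

def completeness_rate(products: list[dict], required: list[str]) -> float:
    """Fraction of products with all required attributes filled."""
    if not products:
        return 0.0
    complete = sum(
        1 for p in products
        if all(p.get(a) not in (None, "") for a in required)
    )
    return complete / len(products)

def drift_alert(baseline: float, current: float, tolerance: float = 0.05) -> bool:
    """True when completeness has degraded beyond the tolerance."""
    return (baseline - current) > tolerance
```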
This is what OpenProd.io’s AI-native onboarding pipeline implements - structured intake, quality gates, completeness metrics across the five dimensions that matter, and audit trails at every step. We see up to 95% reduction in manual validation time when this pipeline replaces the traditional import-and-check workflow.
The business case was already strong from a pure operational efficiency perspective. Adding regulatory compliance to the calculation just makes it more CFO-defensible.
Start with a PIM ROI assessment if you need to quantify the investment. Then layer the compliance risk reduction on top. The payback period typically shortens significantly when you account for the cost of a regulatory incident that you did not prepare for.
Get Your Data House in Order Before August 2026
Four months is not a long time. The August 2 deadline will land before most companies have finished their internal AI system inventory - let alone their training data audit.
The practical checklist is short but demanding:
- Identify which AI systems in your stack could be classified as high-risk under Annex III or interpretations thereof. Get a legal opinion if you are unsure. Do not assume the vendor’s AI system absorbs your liability.
- Map the training data back to its origin. For each AI system, determine what data was used to train, validate, and test it. If that data came from a PIM or product catalog, assess its quality at the time of training.
- Document the quality validation process. Can you show that the training data was checked for completeness, consistency, and representativeness before it was used? If not, build that audit trail now for future training cycles.
- Establish ongoing quality monitoring. The AI Act requires post-market monitoring. That monitoring has to track whether data quality is maintained as the system continues to operate.
- Fix the onboarding process upstream. The cheapest way to maintain compliance is to have clean data entering the system from the start - not to clean it retroactively before each training run.
The companies that will move fastest on this are the ones that already have a structured, auditable product data pipeline. Everyone else is going to be rebuilding while the clock runs down.
Book a demo with OpenProd.io to see what a structured product data pipeline looks like in practice - built for the quality standards that AI-powered commerce now requires by law.
Sources and Further Reading
- EU AI Act Article 10 - Data and Data Governance - Full text of the data quality requirements for high-risk AI systems
- Pearl Cohen: New Guidance Under the EU AI Act Ahead of Its Next Enforcement Date - Analysis of August 2026 application timeline and what changes
- Baker McKenzie: EU Regulation on AI - Product Risk Radar - Legal framework overview and high-risk AI classification
- BARR Advisory: Everything You Need to Know About the EU AI Act in 2026 - Compliance requirements and fine structure
- Kenaz GmbH: What Your AI System Needs Before It Touches EU Data - Practical GDPR + AI Act compliance checklist for 2026
- OpenProd.io: Your PIM Says 95% Complete. AI Agents Disagree. - The five data quality dimensions that AI systems require


