Businesses buying AI tools will increasingly need to verify the licensing status of the data those tools were trained on, putting vendors that cannot provide that verification at a significant competitive disadvantage.
AI developers should be required to secure licenses for copyrighted content before using it to train models, according to a statement issued Thursday by a committee of the House of Lords, the UK Parliament’s upper chamber.
This approach, termed ‘licensing-first’ by the committee, stipulates that training on protected works requires explicit prior permission and appropriate payment, irrespective of the material’s origin.
The committee has formally requested that the government integrate the licensing-first principle into policy, establish statutory requirements for disclosing AI training data, and definitively dismiss a proposed copyright exception that would permit developers to train on protected works without explicit consent.
“Watering down the protections in our existing copyright regime to lure the biggest US tech companies is a race to the bottom that does not serve UK interests,” asserted Baroness Keeley, chair of the Communications and Digital Committee, in a statement. She added, “We should not sacrifice our creative industries for AI jam tomorrow.”
The recommendations appear in a report, “AI, Copyright and the Creative Industries,” published at the conclusion of an inquiry launched in November 2025.
The government is required to publish a policy response by March 18 under the Data (Use and Access) Act 2025.
Key Aspects of Licensing-First
The committee emphasized that a licensing-first framework is unworkable without robust transparency. Its report calls for a statutory requirement that AI developers disclose the data used to train their models, supported by open technical standards covering rights reservation, data provenance, and the labeling of AI-generated content.
“Without those foundations,” the report concluded, “rights holders have no reliable way to establish whether their work has been used.”
Karthi P, a senior analyst at Everest Group, noted that standards like C2PA (Coalition for Content Provenance and Authenticity) are progressively being adopted by device manufacturers, generative AI providers, and platforms. These standards aim to embed content credentials and enable traceability, the kind of provenance infrastructure a licensing-first regime depends on. “The challenge is scaling this across decades of legacy content and a highly fragmented creator economy,” he explained, adding, “That infrastructure exists in pockets, but it is not yet industrialised end to end.”
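The rights-reservation signals the report mentions already have a draft technical form in the W3C’s TDM Reservation Protocol (TDMRep), which publishers can use to declare that text-and-data-mining rights are reserved. As a minimal, illustrative sketch only (the meta-tag names follow the TDMRep draft; real crawlers would also check HTTP response headers and a site’s /.well-known/tdmrep.json file, which this omits), a crawler-side check might look like:

```python
# Illustrative sketch: check an HTML page for a TDM Reservation Protocol
# (TDMRep) meta tag before ingesting it for model training. Tag names
# follow the W3C TDMRep draft; this is not a complete implementation.
from html.parser import HTMLParser


class TDMMetaParser(HTMLParser):
    """Collects <meta name="tdm-..."> declarations from a page."""

    def __init__(self):
        super().__init__()
        self.tdm = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attrs = dict(attrs)
        name = attrs.get("name", "")
        if name.startswith("tdm-"):
            self.tdm[name] = attrs.get("content", "")


def training_allowed(html: str) -> bool:
    """Return False when the page reserves text-and-data-mining rights."""
    parser = TDMMetaParser()
    parser.feed(html)
    # Under TDMRep, "1" means rights are reserved; the absence of the
    # tag simply means no reservation signal was found in the markup.
    return parser.tdm.get("tdm-reservation") != "1"


page = (
    '<html><head>'
    '<meta name="tdm-reservation" content="1">'
    '<meta name="tdm-policy" content="https://example.com/licence">'
    '</head><body>...</body></html>'
)
print(training_allowed(page))  # False: the publisher reserved TDM rights
```

The `tdm-policy` tag, when present, points to the publisher’s licensing terms, which is where a licensing-first market would plug in: a reservation signal alone says “ask first,” and the policy URL says where to ask.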
The EU has already moved in this direction. Article 53 of the EU AI Act, in force since August 2025, requires providers of general-purpose AI models to publish a sufficiently detailed summary of their training content, with penalties for non-compliance of up to $17.3 million (€15 million) or 3% of global annual revenue. The committee said analogous requirements are essential in the UK.
The Undermining Exception
The committee argues the licensing-first framework cannot take hold unless the government definitively rejects a proposed text and data mining (TDM) exception: a legal provision that would let AI developers train models on copyrighted works for commercial use without prior authorization, subject only to an opt-out mechanism for rights holders.
“The Government should, in the next year, publish a final decision on its approach to AI and copyright. In the meantime, it should set out clearly that it will not introduce a new TDM exception with an opt-out mechanism, as initially proposed in its consultation on AI and copyright,” stated the Committee in its report.
The government initially backed such an exception but withdrew its support in 2025 under considerable pressure. The committee urged it to make that reversal permanent policy.
The report contended that an opt-out model unfairly burdens creators with the task of policing an industry where verifying the usage of their work is often impossible.
A global problem without a settled answer
The UK’s challenges with this issue are not unique. In the United States, over 50 copyright lawsuits have been filed in federal courts against prominent AI developers such as OpenAI, Anthropic, and Google, initiated by publishers, authors, and entertainment firms. The Copyright Alliance reports that judicial decisions regarding whether training on copyrighted material constitutes fair use have been inconsistent.
In May 2025, the US Copyright Office determined that a voluntary licensing market for AI training data is emerging, pinpointing lost licensing revenue and market dilution as the principal damages resulting from unlicensed training.
The committee further asserted that, owing to its extensive creative output, the UK is strategically poised to spearhead a licensed market for AI training data.
The vendor dependency question
To mitigate reliance on AI platforms lacking independently verifiable training data practices, the committee advised the government to prioritize the creation of sovereign AI models — systems developed domestically with integrated transparency and copyright compliance as core design principles.
Karthi noted that for enterprise technology buyers, none of this means performance will matter less in procurement. “In content-intensive applications like marketing, media production, and customer engagement, purchasers will increasingly evaluate a provider’s training data practices in conjunction with their model’s capabilities,” he said.
Enterprises scaling their AI initiatives will need to keep revising procurement strategies to balance rapid innovation against robust, defensible data practices, he added. The government’s forthcoming policy will set the operating terms for AI vendors and their enterprise customers alike. “Trusted data foundations will be just as important as model performance in determining sustainable AI adoption,” Karthi concluded.