I made a brief reference in my first AI post to:
the well-rehearsed issue of copyright infringement as LLMs hoover up any text found on the web
but it is too important an issue to be left at that: Kate Bush, Annie Lennox and Damon Albarn are among 1,000 artists on a silent ‘AI protest’ album launched to emphasise the impact on musicians of the UK’s plans to let AI train on their work without permission (see this Guardian article). Copyright is important to ALL creative publishers of music, poems, literature, scholarly articles, etc., as it protects their work from unauthorised use and ensures fair recompense for that use. The proposed UK government exemption would allow AI companies to train their algorithms on the work of such creative professionals without compensation.
The issue is explained more fully in another Guardian article from the same issue (Tuesday 25th February). Andrew Lloyd Webber and Alistair Webber’s clearly argued opinion piece, “It’s grand theft AI and UK ministers are behind it. Oppose this robbery of people’s creativity”, explains the problem in some detail and with some force, noting that the government’s consultation, which ended this week, “is not regulation, it is a free pass for AI to exploit creativity without consequence.”
Copyright ensures creators retain control and are fairly compensated. It underpins the creative economy. Put simply, it allows artists and creatives to make a living.
The point that both Robert Griffiths and I have made (see my first AI post) is made again here:
AI can replicate patterns, but it does not create. If left unregulated, it will not just be a creative crisis, but an economic failure in the making. AI will flood the market with machine-generated imitations, undercutting human creativity …
… and in replicating the patterns of your work or my work it is undermining our ability to make a living. Copyright protections are the foundation that allows creators to produce the high-quality work AI depends on. Without strong copyright laws, human creativity will be devalued and displaced by machines.
Both articles are essential reading if you are interested in understanding how AI is set to move forward, or indeed the stage it has already reached. We need to understand and deal with the problems as they arise. There needs to be more open debate and more understanding about ‘good AI’ and ‘bad AI’.
And, I repeat, the man and woman in the street need both to understand AI and to have a choice as to whether they use it (or are exposed to it).
Postscripts:
Number 1
The Authors’ Licensing and Collecting Society (ALCS) has just made public its 24-page response to the Government Consultation. It is introduced by CEO Barbara Hayes here, and the link to the full PDF document is at the foot of that page. It makes interesting reading (very!), but perhaps the most interesting issue highlighted is the number of legal challenges that are likely to ensue if the proposed exception-based approach is taken:
The central issue giving rise to this uncertainty is encapsulated well in a paper co-authored by US and German academics: ‘The training of generative AI models does not limit the use of the training data to a simple analysis of the semantic information contained in the works. It also extracts the syntactic information in the works, including the elements of copyright-protected expression. This comprehensive utilization results in a representation of the training data in the vector space of the AI models and thus in a copying and reproduction in the legal sense. Consequently, the training of generative AI models does not fall under the exceptions for text and data mining.’ (Dornis, Tim W. and Stober, Sebastian, Urheberrecht und Training generativer KI-Modelle – technologische und juristische Grundlagen, September 2024).
In the US there are already a significant number of lawsuits relating to the use of copyrighted material by AI systems.
If you have concerns over the use of supposedly copyright-protected material, this report is a ‘must-read’ document.
Number 2
14th March
Paul Taylor in the London Review of Books (“AI Wars”, 20th March 2025) discusses the capabilities of Large Language Models as used by chatbots such as OpenAI’s ChatGPT or Google’s Gemini, in the light of the Chinese company DeepSeek. He makes the point that:
By themselves, language models are merely machines for generating language. The basic idea of a large language model is that you enter a ‘prompt’… and it responds with a ‘completion’, an answer. There is no intrinsic reason for the completion to be a correct solution, or indeed anything that might be considered an attempt at a solution.
As I said in my previous post, the answer is based on word probabilities.
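To make that concrete, here is a minimal sketch in Python of what ‘the answer is based on word probabilities’ means. The vocabulary and probability table below are invented purely for illustration (a real model learns distributions over tens of thousands of tokens from its training text); the point is that the procedure only ever asks ‘which word is likely to come next?’, never ‘is this true?’.

```python
import random

# Toy next-word probabilities, invented purely for illustration.
# A real LLM learns something like this over a huge vocabulary,
# from the text it was trained on.
NEXT_WORD_PROBS = {
    "the":    {"cat": 0.5, "dog": 0.3, "answer": 0.2},
    "cat":    {"sat": 0.6, "ran": 0.4},
    "dog":    {"sat": 0.3, "ran": 0.7},
    "answer": {"is": 1.0},
    "is":     {"forty-two.": 1.0},
    "sat":    {"down.": 1.0},
    "ran":    {"away.": 1.0},
}

def complete(prompt_word, max_words=6):
    """Produce a 'completion' by repeatedly sampling a likely next word."""
    words = [prompt_word]
    for _ in range(max_words):
        choices = NEXT_WORD_PROBS.get(words[-1])
        if not choices:          # no known continuation: stop
            break
        next_word = random.choices(list(choices), list(choices.values()))[0]
        words.append(next_word)
    return " ".join(words)

print(complete("the"))  # e.g. "the cat sat down." -- fluent-looking,
                        # but nothing here checks whether it is correct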
Number 3
And another article. 18th March
In the very same issue of the London Review of Books, in a review (“I Am Genghis Khan”, 20th March 2025) of Lionel Barber’s book about Japan’s Masayoshi Son, Laleh Khalili wrote something that relates to my point about definition in the first of these three AI posts:
Artificial intelligence is a baggy term: as well as natural language chatbots and virtual assistants such as Apple’s Siri, it includes Google’s search functions, recommendation engines like Netflix’s film suggestions, image and voice recognition software and much else besides.
But she went on to say:
Although people had used most of these applications without fuss for years, AI puffery escalated when OpenAI released ChatGPT to the world in 2022. That ChatGPT hallucinated (i.e. made things up), that its responses were only as good as the material on which it had been trained, and that OpenAI had used copyrighted material without acknowledgement or recompense, seemed not to matter to most users.
Because they did not know! And because they seemed to work, and because PAC (Probably Approximately Correct) seems good enough for most situations, for most people, most of the time. But I would argue that it is dangerous. And even if it is only dangerous occasionally, to some people, some of the time, those people have a right to know what they are getting.
People are seeking information – some might even say that in some cases they are seeking knowledge – and all they are getting is words. Worse, they are getting a pattern of words masquerading as an answer to their question, masquerading as information. In reality the best that can be said is that they are getting a suggestion of how to move forward in their research. A pattern of words – a logical sequence – is not information, and it certainly does not offer knowledge.
As I said in the fifth issue (of six) in my original post (Good AI/Bad AI) on February 6th:
shouldn’t we all (have the opportunity to) understand [what we are using]?