Addressing misconceptions of Artificial Intelligence
Generative artificial intelligence is a data-intensive method of developing applications and services. The market around it has grown exponentially and created high expectations. As recital 7 of the GDPR explains, "Those developments require a strong and more coherent data protection framework in the Union, backed by strong enforcement, given the importance of creating the trust". The British data protection authority, the Information Commissioner's Office (ICO), has carried out a consultation on targeted areas of generative AI development and deployment and, based on it, has issued clarifications to help developers comply with data protection regulations. The evidence submitted surfaced a series of misunderstandings about the regulator's existing positions. These findings are presented in this article.

The Spanish Data Protection Agency (AEPD), jointly with the European Data Protection Supervisor (EDPS), published 10 Misunderstandings about Machine Learning in 2022, which clarified a series of misunderstandings about machine learning. In this article we echo the recent publication by the Information Commissioner's Office (ICO) of the United Kingdom on misunderstandings or misconceptions regarding artificial intelligence (AI) and its relationship to data protection (Tackling misconceptions | ICO), whose findings we consider worth sharing. This work is part of the ICO's broader response to its consultation series on generative AI (Information Commissioner’s Office response to the consultation series on generative AI | ICO).
The misunderstandings that the ICO collects are described below:
1. "Incidental" or "agnostic" processing of personal data still constitutes processing of personal data, and therefore data protection law applies.
This statement underlines that the processing of personal data remains subject to data protection regulations, even when it is considered incidental or unintentional. In the public consultation conducted by the ICO, many generative AI developers claimed they did not intend to process personal data and that their processing of that data was purely incidental. In this regard, generative AI developers must accurately assess whether their models handle personal data and, where appropriate, ensure compliance with applicable regulations.
2. Common practice does not equate to meeting people’s reasonable expectations.
Organisations should not assume that a certain way of processing will be within people’s reasonable expectations, just because it is seen as “common practice”, i.e., the fact that a practice is common does not necessarily imply that the data subjects consider it reasonable.
This applies particularly to the novel use of personal data to train generative AI in an invisible way (e.g., by collecting the data through web scraping without informing the data subjects) or years after someone provided it for a different purpose (when their expectations were, by default, different).
The principle of transparency requires that data controllers provide information in a concise, transparent, intelligible and easily accessible form, using clear and plain language on the use of personal data, which, in this case, would be on the training of models and the reuse of personal data for purposes other than those originally intended.
3. “Personally identifiable information” (PII) is different to the legal definition of “personal data”.
Many organisations focus their generative AI compliance efforts around "personally identifiable information" (PII). However, to ensure compliance with data protection regulation, they should consider the processing of any "personal data". The latter is a broader, legally defined concept in the GDPR: any information relating to an identified or identifiable natural person, which encompasses a wider spectrum than PII. This nuance is key to avoiding misinterpretations in the application of regulatory obligations.
4. Organisations should not assume that they can rely on the outcome of case law about search engine data protection compliance when considering generative AI compliance.
A few respondents sought to rely on case law about data protection compliance in search engines (i.e., crawling the web), arguing that, since the initial data collection is substantially the same, those decisions should also apply in the generative AI context.
However, there are key differences which mean that the logic of those decisions may not be applicable. For example, while a search engine intends to index, rank and prioritise information and make it available to the public, generative AI goes further by synthesising information and producing new content in its outputs. Traditional search engine operators also allow individuals to exercise their rights, particularly the right to erasure, which is not standard practice for generative AI developers. In the EU, this reinforces the need for a targeted analysis of GDPR compliance and of the risks to rights and freedoms associated with generating content from personal data.
In short, case law related to search engines cannot be automatically extrapolated to generative AI.
5. Generative AI models can themselves have data protection implications.
The ICO also notes that some developers argued that their models do not "store" personal data. However, the reality is that generative AI models can retain information with which they have been trained and, in some cases, this can be retrievable or disclosable. From a GDPR perspective, this has significant implications, particularly regarding the principle of data minimization and the right to erasure.
6. The scope of data protection and its relationship with other regulatory frameworks.
Some respondents to the ICO thought that the principle of lawfulness implied that the ICO could provide opinions or guidance on legality in regimes other than data protection. However, data protection cannot be used as a tool to interpret legality in other regulatory areas. While non-compliance with other regulations may also lead to a violation of the GDPR (e.g. in the case of unlawful processing of personal data), data protection authorities are not competent to determine lawfulness in other matters.
7. There is no “AI exemption” in data protection regulation.
Some developers argue that generative AI should benefit from differentiated treatment regarding compliance with data protection regulation, on the grounds that regulation should not complicate generative AI development.
Organisations should be aware that there are no carve-outs or sweeping exemptions for generative AI. If an organisation is processing personal data, in any context, all data protection regulations will apply. In addition, it is highly important that organisations adopt a "data protection by design" approach to ensure that data subjects' rights are respected.
Ultimately, the ICO publication offers a series of relevant clarifications on the application of data protection regulations to generative AI, which the AEPD considers useful to convey to controllers of personal data processing that develop or use generative AI.
This article is related to other materials published by the Innovation and Technology Division of the AEPD, such as:
- EDPS-AEPD Technical Note: 10 Misunderstandings about Machine Learning [Sep 2022]
- Audit Requirements for Personal Data Processing Activities involving AI [Jan 2021]
- GDPR compliance of processing that embed Artificial Intelligence. An introduction [Feb 2020]
- A Guide to Privacy by Design