New copyright challenges in the AI era: Will creation using AI cause infringement?
In the field of digital creativity, artificial intelligence (AI), and generative AI in particular, has opened a new era that challenges the boundaries of traditional copyright law.
Contrary to common perception, the machine learning process of AI does not resemble the reproduction or imitation typical of infringers. It is closer to the human learning process: through contact with works, the AI learns colors and textures, as well as the logic and presentation order of language and the arrangement of patterns, and then uses what it has learned to produce new works that are not substantially similar to the learning materials.
This article aims to delve into the complexities of AI machine learning while clarifying its legal implications and correcting common misunderstandings by citing important recent cases.
How does AI generate images and text?
The basic principle of generative AI is to create new content by learning from information and data. Unlike the typical forms of copyright infringement such as "distribution", "reproduction" and "adaptation", what it does is closer to "learning" from existing works and data.
Image generation
1. Data analysis and pattern learning: Generative AI for image creation starts by analyzing a large amount of image data.
This includes not only identifying objects in images, but also understanding deeper elements such as stroke texture, color gradients, lighting and spatial relationships. For example, when a mature generative AI learns landscape painting, it identifies different elements, such as brush strokes, color-mixing techniques, and the interaction of light and shadow, and then applies these elements in the works it produces.
2. Feature extraction: The convolutional neural network in the AI algorithm extracts specific features of image works, identifying and separating features such as edges, shapes and textures. "Feature extraction" is crucial for AI to understand the styles, brushstrokes and painting techniques of different works of art.
3. Generation of new works: Once AI learns specific techniques and artistic styles through feature extraction and data analysis, it can generate new images.
This is often done using Generative Adversarial Networks (GANs). A GAN consists of an image generator and an image discriminator. Through repeated interaction between the generator and the discriminator, generative AI eventually produces images that are close to the training data in style and characteristics (which is usually what prompts allegations of infringement), but that, on actual comparison, are not substantially similar to the images in the training data.
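To make the "feature extraction" in step 2 more concrete, here is a minimal sketch in plain Python. It is only an illustration under stated assumptions: the 5x5 "image", the hand-written edge filter and the function name `convolve2d` are all made up for this example, and a real convolutional neural network learns its filter weights from training data instead of having them written by hand.

```python
# Toy illustration of convolutional feature extraction (pure Python, no ML library).
# The image, the kernel and the function name are assumptions made for this sketch.

def convolve2d(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the feature map.

    As in most CNN libraries, this is technically cross-correlation:
    the kernel is not flipped before it is applied.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    feature_map = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            total = 0
            for i in range(kh):
                for j in range(kw):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        feature_map.append(row)
    return feature_map


# A 5x5 grayscale "image": dark on the left (0), bright on the right (9).
image = [
    [0, 0, 0, 9, 9],
    [0, 0, 0, 9, 9],
    [0, 0, 0, 9, 9],
    [0, 0, 0, 9, 9],
    [0, 0, 0, 9, 9],
]

# Hand-written vertical-edge filter (Sobel-style). In a real CNN these
# weights are learned from training data rather than written by hand.
vertical_edge_kernel = [
    [-1, 0, 1],
    [-2, 0, 2],
    [-1, 0, 1],
]

feature_map = convolve2d(image, vertical_edge_kernel)
for row in feature_map:
    print(row)  # each row is [0, 36, 36]: no edge on the far left, a strong edge further right
```

The resulting feature map records where a dark-to-bright edge occurs and how strong it is, which is the kind of information a later network layer works with.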
Text generation
1. Data acquisition and language model establishment: For text generation, AI models like ChatGPT absorb a large amount of text data, including but not limited to books, articles, website content, and even conversation records and other broad sources. AI uses text data to build a language model that can understand grammar and deduce context.
2. Language prediction: A classic language prediction model for text is the n-gram, which calculates the probability of a word following a "specific word or phrase" in order to reproduce idiomatic expressions and narrative structure and to keep subjects and objects consistent. However, although the n-gram model is commonly used in grammar and spelling checks, it has difficulty handling more complex text generation.
3. Encoding and text understanding: For the complex tasks of context extension and entire text generation, the n-gram model is inadequate because it can only predict from limited context information rather than understand the semantic meaning of the text.
By contrast, the Transformer model converts the text into vectors (Input Embedding), adds word-order information through positional encoding (Positional Encoding), and then uses the self-attention mechanism (Self-Attention) to achieve a comprehensive understanding of the context of the entire text.
4. Text generation: After in-depth understanding of the text through the Encoder, the Decoder is responsible for generating text based on the learned text features.
In this process, even long-distance dependencies within the text can be effectively captured. These characteristics allow the Transformer model to generate coherent and creative text. The generation process is based on a deep semantic understanding of the original text, and after deep learning it can also create texts that are logical in meaning and original in content.
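The contrast between the n-gram model in step 2 and the Transformer mechanisms in step 3 can be sketched in a few lines of plain Python. This is a simplified illustration, not how ChatGPT or any production model is implemented: the tiny corpus, the four-dimensional `toy_embeddings` and all function names are assumptions made for the example, and the attention function skips the learned query, key and value projections that a real Transformer applies.

```python
import math
from collections import Counter, defaultdict

# --- Step 2: bigram (n-gram with n = 2) next-word prediction ---
# The corpus is a made-up example; real models are trained on far more text.
corpus = "the cat sat on the mat . the cat ate the fish .".split()

follow_counts = defaultdict(Counter)
for prev_word, next_word in zip(corpus, corpus[1:]):
    follow_counts[prev_word][next_word] += 1


def next_word_probabilities(word):
    """Probability of each word that followed `word` in the corpus."""
    counts = follow_counts[word]
    total = sum(counts.values())
    return {w: n / total for w, n in counts.items()}


print(next_word_probabilities("the"))  # {'cat': 0.5, 'mat': 0.25, 'fish': 0.25}


# --- Step 3: positional encoding and self-attention ---
def positional_encoding(position, dim):
    """Sinusoidal encoding: even indices use sine, odd indices use cosine."""
    encoding = []
    for i in range(dim):
        angle = position / (10000 ** (2 * (i // 2) / dim))
        encoding.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
    return encoding


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(vectors):
    """Scaled dot-product self-attention with identity projections.

    A real Transformer first multiplies the inputs by learned query, key
    and value matrices; that step is omitted here for brevity.
    """
    dim = len(vectors[0])
    outputs, all_weights = [], []
    for query in vectors:
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in vectors
        ]
        weights = softmax(scores)
        all_weights.append(weights)
        outputs.append(
            [sum(w * v[d] for w, v in zip(weights, vectors)) for d in range(dim)]
        )
    return outputs, all_weights


# Hypothetical 4-dimensional embeddings; real embeddings are learned.
toy_embeddings = {
    "the": [0.1, 0.0, 0.2, 0.1],
    "cat": [0.9, 0.1, 0.0, 0.3],
    "sat": [0.0, 0.8, 0.1, 0.2],
}
sentence = ["the", "cat", "sat"]

# Input Embedding + Positional Encoding, added element by element.
inputs = [
    [e + p for e, p in zip(toy_embeddings[word], positional_encoding(pos, 4))]
    for pos, word in enumerate(sentence)
]

outputs, attention_weights = self_attention(inputs)
for word, weights in zip(sentence, attention_weights):
    print(word, [round(w, 2) for w in weights])
```

The bigram model only ever looks at the single preceding word, whereas each row of attention weights spreads over every word in the sentence, which is why self-attention can capture longer-range context.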
What is the difference between AI generation and copyright infringement?
From the principles of image and text generation described above, we can see that the way AI generates content is very different from the forms of infringement defined in copyright law. This is especially clear from the following points:
1. The nature of creativity in AI: Generative AI obviously does not simply "copy" or "reproduce" the data it learns (existing works).
On the contrary, it learns the underlying logic, article structure and style of texts from a large amount of data and materials, and then combines these elements to create novel and informative works. For example, in image generation, although AI may learn from existing works of art, the final image is not a copy of an existing work but a new creation that reorganizes and transforms the results of deep learning.
2. Legal interpretation: From a legal perspective, the difference between AI-generated content and human copying is significant. The basic concept of copyright law is that "only the expression of an idea is protected, not the idea, concept, or system itself."
From the above, it can be seen that AI-generated works learn ideas and concepts from the training data (original works), such as underlying logic, article structure, painting style and brush strokes, rather than "reproducing" or "adapting" the expression of the training data (original works).
The way AI generates works clearly challenges the boundaries of traditional copyright infringement. In a representative case such as "Andersen v. Stability AI Ltd", the dispute centers on whether using copyrighted images to train AI constitutes infringement even when the images generated by Stable Diffusion do not themselves constitute infringement.
3. Transformation and fair use: When discussing whether AI-generated works constitute infringement, the question is whether the generated works are sufficiently transformative. That is, only when generative AI adds additional expression to the original works, or even gives them new meaning, can we go on to discuss whether the use may constitute "fair use".
This depends on the AI's ability to create works that are significantly different from the original work. At present, DALL-E does not provide any function for using AI to modify original works, in order to avoid such legal disputes.
The controversy over using AI to adapt existing works has recently peaked again: the popular game "Palworld" is alleged to have used generative AI to adapt multiple Pokémon, or even to fuse several Pokémon together. Regarding text generation, during the review of the Thomson Reuters v. Ross Intelligence case, the two parties argued in depth over whether fair use applied to the legal questions generated by AI, and "fair use" was affirmed.
What impact does generative AI have on copyright law?
The process of AI-generated content, in the context of images and text, demonstrates a different form of creativity than direct copying or reproduction. This distinction is critical to understanding why AI learning and generation methods differ from copyright infringement.
As AI continues to develop, existing copyright law can only be applied through legal interpretation, and those interpretations must be revised and developed accordingly. As AI technology continues to spread and be updated, the types of creation that copyright law faces are constantly changing.
The distinctions among "AI's learning and generation model", "adaptation under copyright law" and "humans' learning of ideas and concepts" are not only a matter of definition; they also involve far-reaching questions of legislative logic and creative ethics.
With the advancement of AI technology, the existing legal framework will definitely be revised; however, the direction of those revisions depends on how legislators balance the two values of "AI innovation potential" and "protection of original works".
Therefore, the next time you encounter a debate about whether generative AI constitutes copyright infringement, please remember that this is a conflict of two values. Never make a hasty conclusion that "generative AI infringes the copyright of the original work" based on outdated concepts.