But the more these systems are used, the more writers and visual artists are noticing similarities between their work and these systems’ output. Many have called on generative AI companies to reveal their data sources, and—as with the Authors Guild—to compensate those whose works were used. Some of the pleas are open letters and social media posts, but an increasing number are lawsuits.
Like most machine learning software, they work by identifying and replicating patterns in data. But because these programs are used to generate code, text, music, and art, that data is itself created by humans, scraped from the web and copyright protected in one way or another. In other cases, creators’ interests could be protected through direct regulation of the development and use of generative AI models. For example, certain creators’ desire for consent, credit, and compensation when their works are included in training data sets for generative AI programs is an issue that could perhaps be addressed through regulation of AI models.
Now laws that have long protected people and their intellectual property are proving insufficient when AI is involved in the creation. Howell agreed with the Copyright Office and said human authorship is a “bedrock requirement of copyright” based on “centuries of settled understanding.” From a license compliance standpoint, it’s always wise to use a scanning tool (like FOSSA), which detects open source components in your code. This way, on the off chance a generative AI tool outputs an entire open source file — which would be copyrightable — you’ll know your compliance requirements.
“In that case, the fact that you worked with a machine would not exclude copyright protection,” Gervais said. So, how do we reconcile the rapidly evolving artificial intelligence industry with the knotty particulars of U.S. copyright law? That is something creatives, companies, courts and the United States government are trying to figure out. While artists draw obliquely from past works that have educated and inspired them in order to create, generative AI relies on training data to produce outputs. We’re part of a team of 14 experts across disciplines that just published a paper on generative AI in Science magazine.
Developers should also work on ways to maintain the provenance of AI-generated content, which would increase transparency about the works included in the training data. This would include recording the platform that was used to develop the content, details on the settings that were employed, tracking of the seed data’s metadata, and tags to facilitate AI reporting, including the generative seed and the specific prompt that was used to create the content. Artist Jason M. Allen utilized the generative AI system “Midjourney,” a text-to-image AI service, to create a science fiction-themed artwork, which made headlines when it won the 2022 Colorado State Fair art competition.
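As a sketch of what such a provenance record might contain, the fields below mirror the items listed above (platform, settings, seed metadata, generative seed, prompt, reporting tags). The schema and field names are illustrative assumptions, not an established standard:

```python
# Illustrative provenance record for a piece of AI-generated content.
# Field names are hypothetical; no standard metadata schema is implied.
import json

def make_provenance_record(platform, settings, seed, prompt, reporting_tags):
    """Bundle the generation details into a single serializable record."""
    return {
        "platform": platform,            # service used to develop the content
        "settings": settings,            # the generation settings employed
        "generative_seed": seed,         # seed that makes the output reproducible
        "prompt": prompt,                # the specific prompt used
        "reporting_tags": reporting_tags # tags to facilitate AI reporting
    }

record = make_provenance_record(
    platform="Midjourney",
    settings={"guidance": 7.5, "steps": 50},
    seed=12345,
    prompt="science fiction-themed artwork",
    reporting_tags=["ai-generated", "text-to-image"],
)
print(json.dumps(record, indent=2))
```

Attaching such a record to each output would let downstream users verify how, and from what, a work was generated.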
Registering one’s work is not, of course, a requirement for it to be copyrighted—that occurs automatically upon its creation (although the creator will not be eligible for statutory damages or attorney’s fees if the work is infringed upon). However, as Arle Lommel, director of data services at CSA Research in Massachusetts, points out, generative AI doesn’t really work in the way that many believe it does. Thaler attempted multiple times to copyright a piece of visual content, “A Recent Entrance to Paradise,” created using Creativity Machine—a computer system owned by Thaler. Copyright is at its core an economic regulation meant to provide incentives for creators to produce and disseminate new expressive works.
The Daily then issued a correction, stating that fanfiction writers had copyright in their original contributions. Even if the creation of the AI system is not infringing, an artist might not want her creations used to “train” the AI as a matter of principle. Although media reports frequently suggest that the artist is helpless to prevent the ingestion of her content, in fact the artist can employ a widely-used robot exclusion protocol to prevent her website from being crawled by AI bots. 2023 could well be remembered as the year when artificial intelligence (AI) came of age.
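The robot exclusion protocol mentioned above works through a plain-text robots.txt file at the site’s root. As a sketch, the following asks two widely documented AI training crawlers, GPTBot (OpenAI) and CCBot (Common Crawl), not to crawl the site; note that honoring robots.txt is voluntary on the crawler’s part:

```text
# robots.txt at the site root, e.g. https://example.com/robots.txt
User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /
```

Other crawlers can be excluded the same way by adding their published user-agent tokens.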
Thaler’s motion for summary judgment, which Howell denied in the Friday order, argued that permitting AI to be listed as an author on copyrighted works would incentivize more creation, which is in line with copyright law’s purpose of promoting useful art for the public. Judge Beryl A. Howell of the US District Court for the District of Columbia agreed with the US Copyright Office’s decision to deny a copyright registration to computer scientist Stephen Thaler, who argued a two-dimensional artwork created by his AI program “Creativity Machine” should be eligible for protection. On the one hand, commentators like Paul Keller consider that this approach has the potential to increase the bargaining power of rights holders and lead to licensing deals with (and remuneration from) AI providers. On the other hand, critics argue that this approach will lead to market concentration and exploitation of creators by big companies. Because creative labor markets are already heavily concentrated and dominant companies have significant bargaining power, those companies will be able to impose contractual terms on artists that require them to sign away their “training rights” for reduced compensation. The medium- to long-term result would be more concentration of power with large companies, leaving less control and remuneration for artists.
Consider that search engines have been providing search results based on copyrighted text for decades, including short summaries of the presented documents. This practice has been well-litigated in the United States and found to be copyright compliant, being declared by courts as “fair use” of the web content for many reasons. “Denying copyright to AI-created works would thus go against the well-worn principle that ‘[c]opyright protection extends to all ‘original works of authorship fixed in any tangible medium’ of expression,” Thaler argued in his motion.
The lack of clear copyright considerations in current regulatory frameworks for AI in the U.S. has led to gaps in addressing the use of copyrighted content to train generative AI models and the creation of new artworks. Therefore, it is essential to incorporate copyright considerations and engage the U.S. Copyright Office in discussions and debates with other stakeholders about developing and implementing AI governance frameworks. This process should evaluate whether amendments to existing copyright policies or the implementation of new protections may be necessary to guarantee both creators’ rights and the capacity of AI to enhance creativity. However, like any emerging technology, generative AI also raises challenges requiring consideration from policymakers.
Respected outlets such as the New York Times and the Washington Post have asserted that ChatGPT and other AI providers “steal” content from creators. While copying does occur in the process of developing AI systems, that copying typically does not constitute copyright infringement. And even if it did involve copyright infringement, infringement is the trespass on a government-granted exclusive right, nothing akin to stealing personal property. Either way, this may lead to an evolution of copyright laws that enable the recognition of emerging tech, like generative AI, as creative entities eligible for copyright safeguards, Sullivan added. Neither does the ruling clarify what happens when images that are already protected under copyright are used for creating content, as seen in the case where Getty Images sued the creators of Stable Diffusion for using its content.
While it may seem like these new AI tools can conjure new material from the ether, that’s not quite the case. Generative AI platforms are trained on data lakes and question snippets — billions of parameters constructed by software processing huge archives of images and text. The AI platforms recover patterns and relationships, which they then use to create rules and to make judgments and predictions when responding to a prompt. If generative AI output is considered a derivative work of the training materials, engineering teams that use it would be required to comply with the license(s) of the code upon which the tool is trained. This, of course, could come with requirements to disclose source code, generate attribution notices, and more.
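As a toy illustration of this pattern-learning loop (far simpler than any real generative model, and purely for intuition), a bigram text generator “trains” by recording which word follows which in human-written text, then “generates” from a prompt by replaying those recorded patterns:

```python
# Toy sketch: learn word-following patterns from training text, then generate.
# Real generative models are vastly larger, but the principle is similar:
# statistical patterns are extracted from human-created data, and every
# output is assembled from those learned patterns.
import random
from collections import defaultdict

training_text = "the cat sat on the mat and the cat ran".split()

# "Training": record which word follows each word in the training data
patterns = defaultdict(list)
for current_word, next_word in zip(training_text, training_text[1:]):
    patterns[current_word].append(next_word)

# "Generation": respond to a one-word prompt by following learned patterns
def generate(prompt, length=5, seed=0):
    random.seed(seed)  # fixed seed keeps each call reproducible
    words = [prompt]
    for _ in range(length):
        options = patterns.get(words[-1])
        if not options:
            break
        words.append(random.choice(options))
    return " ".join(words)

print(generate("the"))
```

Every word pair the generator emits was observed in its training data, which is exactly the relationship to source material that the derivative-work question above turns on.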
Fair use also considers “the amount and substantiality of the portion used in relation to the copyrighted work as a whole.” Copying the creative “heart” of a work, or its most expressive and creative components, weighs against fair use, especially when multiple complete works are copied. To generate quality Output Works, fair use minimalists would argue that GAIs must analyze and use as much as possible from the underlying Input Works, including the most expressive or creative components of the works. Fundamentally, this is about the relationship of works generated from generative AI models (Output Works) to works used to train generative AI models (Input Works) and how U.S. copyright law applies to that relationship. This article begins with an overview of generative AI and copyright law with a focus on fair use doctrine. It then examines four schools of thought that have emerged to address the novelty of generative AI under copyright law. It posits some of the implications of each of these approaches for innovation and the growth of the generative AI industry.