AI website content has serious implications concerning copyright laws, whether we are talking about AI-generated material receiving copyright protection or whether the AI-generated content you use for your website violates existing copyright laws.
The astronomical rise of artificial intelligence technologies over the last few years has not only disrupted many industries but has also threatened the very existence of some. There have been serious conversations in business and government about how best to regulate AI, but there’s no consensus on what that would even look like.
At Blue Seven, we’re deeply invested in the trajectory of AI laws. We are a company of trained, professional writer-researchers, but we are well aware of the impact of AI and large language models (LLMs) like ChatGPT on our industry. We (company founders Allen Watson and Victoria Lozano, Esq.) have kept a close eye on developments, particularly the ones that directly impact website content writing.
We’re at the one-year mark since the release of ChatGPT to the world, and it’s certainly been a year. This is a good time to review the still-evolving issues surrounding AI website content and copyright laws in the US.
Could AI Written Content on Your Website Violate Copyright Laws?
Before diving into the debate over whether or not AI-generated content violates copyright laws, we have to understand what can be copyrighted in the first place.
The US Constitution authorizes federal legislators to “secur[e] for limited Times to Authors . . . the exclusive Right to their…Writings.” As with every facet of federal law, a regulatory agency gets to interpret what the law (or Constitution) actually means (or what they think it means). Throughout history, regulatory agencies have been given significant leeway when it comes to these interpretations.
The Copyright Act was born out of the aforementioned language from the Constitution, and this Act extends copyright protection to “original works of authorship.”
One of the main issues of note as we dive into this subject is the failure to define what it means to be an “author,” which is something that didn’t need much clarity throughout history. However, history didn’t have to contend with artificial intelligence.
As with all pieces of legislation, the Copyright Act’s trajectory is determined by legal precedent, of which there is plenty. The Copyright Office only recognizes a copyright for works “created by a human being.” We can look to various court cases to narrow down what the Act, through the Office, defines as “human” in the copyrighted works.
- Courts have denied protection to non-human authors, holding that a monkey cannot receive copyright protections for photos it took because it lacked standing to sue (non-humans cannot bring a legal action in court).
- Courts have explicitly said that some human creativity is needed for a copyright when they decided on whether or not to issue a copyright to celestial beings (seriously).
- Courts have denied a copyright for a living garden because a garden does not have a human author (this could probably be argued otherwise, but, alas, the Courts have spoken for now).
More recently, Dr. Stephen Thaler’s application to register a piece of AI artwork with the Copyright Office was denied. The piece of art was authored “autonomously” by an AI technology called the Creativity Machine. Dr. Thaler argued, unsuccessfully, that the piece did not need the “human authorship” required under existing copyright laws. A federal district court disagreed, stating clearly that “human authorship is an essential part of a valid copyright claim.”
This decision will almost certainly be appealed.
Is There Any Hope For AI and Copyrights?
All hope is not lost for those who want to obtain copyrights for material that gets created using AI.
Works created with generative AI programs could still receive copyright protections, but whether or not they do will depend entirely on the level and type of human involvement in the creative process for the piece. One major preemptive blow for those seeking copyrights for AI-generated content came in the form of a copyright proceeding and copyright registration guidance. Both of these indicate that the Copyright Office is unlikely to recognize human authorship for content an AI program generates via text prompts.
Before the release of ChatGPT, the major discussions around AI and copyright protections centered on artwork. In October 2022, the Copyright Office began proceedings to cancel a copyright registration held by Kris Kashtanova.
Kashtanova had filed for copyright registration in 2022 for a graphic novel containing illustrations created by the AI tool Midjourney through text prompts. The Office said that Kashtanova failed to disclose that the images were made by AI. Kashtanova responded by arguing that the images were made through a “creative, iterative process,” but the Office disagreed. Guidance issued by the Office in March 2023 (four months after the release of ChatGPT) says that when AI “determines the expressive elements of its output, the generated material is not the product of human authorship.”
Counterarguments have certainly been made. Many believe that AI-created works should be eligible for copyright protection because AI has been used to create works that subsequently received copyright protection. Funnily enough, we can examine a case from 1884, Burrow-Giles Lithographic Co. v. Sarony, in which the Supreme Court held that photographs could receive copyright protections in situations where the creator makes decisions about the creative elements in the shot itself (lighting, arrangement, composition, etc.). The argument could be made (and has been) that new AI tools are basically the same when a human sets the parameters for the output.
Of course, it’s much more complicated and nuanced than that. In fact, that argument doesn’t hold much weight upon closer examination. The analogy between photography and new AI is a weak thread. For example, in the Kashtanova case, the Copyright Office says that Midjourney (the technology used to create the graphic novel) is not a tool that Kashtanova controls or guides to get the desired image because it “generates images in an unpredictable way.” Whereas the photographer claiming copyright protection can distinctly point to the elements under their control, someone using generative AI will struggle with that.
The Copyright Office offered a counter-analogy by saying that using AI to create a piece is similar to a person commissioning an artist. The person who commissions the artist can’t claim a copyright for the piece if they only give general directions for its completion. The Office, again in March 2023, determined that a user does not exercise enough creative control over generative AI outputs to reach the level of involvement required for a copyright.
Even though it seems like the Copyright Office is dead set against granting copyrights to AI-generated content, the issue certainly isn’t settled (are any laws ever settled?). The Office knows this and has left the door open to the idea of copyrights for works that contain AI, but, again, it’s complicated. A copyright likely wouldn’t be available for all of a piece of work containing AI-generated content – only for the human-generated portion of the piece.
The Copyright Office only allows copyright protection for a person’s own contributions to works that combine both AI and human-generated content. The Office says that a creator must “identify and disclaim AI-generated parts of the work if they apply for a copyright.”
Having said all of that, it’s important to understand that regulatory agencies, including the Copyright Office, cannot implement regulations that are considered unconstitutional. How do regulations created by regulatory agencies pass Constitutional muster?
Why, the courts, of course. That’s for discussion later in this article.
Who (or What) Owns the Copyright to Generative AI Outputs?
If we work on the assumption that some AI-generated works will be eligible for copyright protection, exactly who would own the copyright?
- Would it be the person who tells the AI technology what to do?
- Would it be the company or entity that created or leases the AI technology?
We could even go so far as to ask whether investors in AI technology could ultimately hold copyrights for works created by the AI. For example, Microsoft is OpenAI’s largest financial backer, and they even hired OpenAI’s CEO, Sam Altman, to run their new AI division the day after he was fired by OpenAI’s board of directors. That was, until a day later when Altman was hired back as CEO of the company after nearly every OpenAI employee threatened to leave and go to Microsoft with him.
Alas, this is an interesting story of internal politics for another time, but it does illustrate just how intertwined AI technology is with investors and other major companies, perhaps ones we don’t want obtaining all of the copyrights available.
Chapter Two of the Copyright Act says that ownership initially falls to the author(s) of the work in question. Since we don’t have much judicial or regulatory direction about AI-created works yet, there’s no clear rule about who counts as the “author or authors” of a piece of work (here we are again, debating authorship).
We would consider a photographer the author of their photographs, not the maker of the camera the photographer used. Drawing an analogy, it would seem this opens the door to copyrights for people who input the parameters for a piece of work into AI technology, not to the creators of said technology.
This particular view would treat the person who inputs the parameters for the work as the author and the initial copyright owner for the piece. However, this argument loses weight if we consider the AI creator’s claim to some form of authorship based on the coding behind the model and the training the AI has undergone to help it create the piece.
Regardless of whether OpenAI says users do or do not have the rights to their work, the Courts will be the ultimate decision-makers on the questions of copyrights.
Copyright Infringement – Does Your Website Content Already Violate Copyright Law?
Perhaps of more concern for our main industry at Blue Seven (law firm content marketing) is the question of copyright infringement by generative AI, especially LLMs like ChatGPT. If you’ve been in tune with legal marketing this year, you’ll know there’s an entire industry that’s sprung into existence offering rapid, scalable content for law firms (and every other industry).
Many websites quickly pivoted. Why, CEOs and CFOs reasoned, should they pay human writers to do something that can now be done for free? That’s a topic for another conversation, but suffice to say, the quality of AI-generated legal content has been less than stellar.
The issue here is that many people jumped to AI and are still jumping ship before knowing whether or not the content produced by AI would violate copyright laws. The debate over whether or not generative AI content infringes on copyrights is raging in public and in the courts. While we understand that website owners with a non-legal background wouldn’t necessarily know much about the potential copyright issues, we do fully expect law firms and legal marketers to anticipate these issues and proceed with caution.
Some have proceeded with caution. Others have bounded forward like an F5 tornado through a barn.
Do generative AI programs infringe on copyrights by making copies of existing content to train their LLMs or by creating outputs that closely resemble existing content? That’s the question we don’t have an answer to right now. But we can look at where the winds could take us.
Does the AI Training Process Infringe on the Copyrights of Other Works?
This is a question that, though it needs answering, may not affect a website owner but could certainly affect AI companies pioneering new technologies.
Every complex artificial intelligence model uses specific coding that directs it (the model) to learn. But how do these models learn?
AI models like the LLM behind ChatGPT are revolutionary. ChatGPT is great for what it is, at least. It is good because OpenAI has trained the LLM off of, well, everything. OpenAI has had many GPT models over the years, but the GPT-3.5 model behind the original ChatGPT is the one that enthralled the world. GPT-4 was released soon after, and it now has access to the web, whereas the previous version’s knowledge base ended in 2021.
OpenAI has never denied using the works of others to train its LLM. They’ve explicitly said their model learns from many sources, including copyrighted content. OpenAI says it created copies of works it has access to in order to use them (the copies) to train their models.
Is the act of creating these copies to use an infringement of the copyright holders’ rights? The answer to that depends on who you ask.
AI companies argue that the process of training their models constitutes valid fair use and, thus, does not infringe on others’ copyrights. We’ve introduced the term fair use, which needs to be defined in the context of 17 U.S.C. § 107, which outlines four factors for determining fair use:
- The purpose and character of the content’s use, including whether the use is for commercial purposes or non-profit, educational purposes;
- The nature of the copyrighted material;
- The amount and substantiality of the used portion of the copyrighted material in relation to the work as a whole;
- The effect of the use of the copyrighted material on the potential market for or the value of the work.
Did you know that OpenAI is a non-profit organization? Well, kind of. They transitioned to a for-profit company but are still majority controlled by the larger non-profit. It’s complicated, and probably meant to be, so that OpenAI can attempt to claim it isn’t using the information for commercial purposes and that its use of the content qualifies as fair use.
But the direction the conversation is heading, at least politically and in the courts, can be seen in how the US Patent and Trademark Office describes the AI training process, saying it “will almost by definition involve the reproduction of entire works or substantial portions thereof.” Given the way government regulators are framing this conversation, we can see they are leaning toward the side of copyright holders.
Do AI Outputs Infringe on the Copyrights of Other Works?
When we examine the fourth point in determining fair use mentioned above, we see where specific issues for website content come in. The major concern, and one we should all be aware of if we operate a business website, is that AI models allow for the production of content that’s similar to other works and competes with them.
There have indeed been multiple lawsuits filed by well-known individuals in the entertainment industry against AI companies and entities. These lawsuits dispute any claims of fair use by the AI companies, arguing that the products of these models can undermine the market (and value) of the original works.
In September 2023, a district court ruled that a jury trial would be necessary to determine whether an AI company copying case summaries from Westlaw constitutes fair use. Westlaw is a legal research platform, so this case will directly affect a company in our particular realm. The court already conceded that the AI company’s use of the content was “undoubtedly commercial.” The jury would be needed, however, to handle four factors:
- Resolve factual disputes about whether the use was transformative (as opposed to commercial);
- Determine to what extent the nature of Westlaw’s work favored fair use;
- Determine whether the AI company copied more content than they needed from Westlaw to train their models;
- And determine whether the AI program could constitute a market substitute for the plaintiff.
The output of AI that resembles existing works could constitute an infringement of copyright. If we look at case law, a copyright owner could make a case that AI outputs infringe on their copyrights if the AI model (1) had access to their content and (2) created “substantially similar” outputs.
Showing element one here won’t be the issue as these cases go through the court system. These companies have been fairly open about how they’ve trained their models. It’s element two, showing that the outputs are “substantially similar,” that presents the biggest legal hurdle.
Defining “substantially similar” is tough, and the definition varies across US court systems. In general, courts have assessed whether works are “substantially similar” by examining the overall concept and feel of the piece or its overall look and feel. Additionally, the courts have examined whether or not an ordinary person would “fail to differentiate between the two works” (here, a comparison between the original and the AI-generated output, which comes from a model trained on the original).
Other cases have examined both the “qualitative and quantitative significance” of the copied content compared to the content as a whole. It’s likely that comparisons like this will have to be presented in court so that a judge and/or jury can make a determination.
Two types of AI outputs raise concern. The first involves AI programs creating works that use existing fictional characters. Imagine Luke Skywalker showing up in a new book about Marco Polo’s adventures. AI could certainly do this, and it would be relatively easy to see this as copyright infringement.
The second area of concern focuses on prompts requesting that the AI output mimic the style of another author or artist. For example, you can attempt to have ChatGPT craft a criminal defense practice area page in the voice and style of Stephen King. While this would certainly make for an entertaining read, publishing it could constitute infringement, but that is admittedly a gray area right now.
This is a New Era of Digital and Copyright Law
AI companies are preemptively blaming their models’ users for any potential copyright infringement that occurs as a result of their given outputs. As the Copyright Office weighs new regulations for generative AI, it recently published a request for public comments on the potential new regulations (a standard procedure for all regulatory bodies weighing changes to existing regulations).
The public comments and replies thus far give us an understanding of how the AI companies are going to battle this in court. Notably and predictably, Microsoft, OpenAI, and Google all have something to say about this issue.
Microsoft (again, OpenAI’s largest backer with a 49% stake in the company) says that “users must take responsibility for using the tools responsibly and as designed.” The company says that AI developers have taken steps to mitigate the risk of AI tool misuse and copyright infringement.
Google quickly lays the blame on users of the technology by reiterating that generative AI can replicate content from its training but saying that this occurs through “prompt engineering.” Google’s public comments go on to say that the user who produces infringing output should be the party held responsible, not the company behind the technology.
OpenAI flatly says that infringement related to outputs from the technology “starts with the user.” They say that there would be no possible infringement on copyrights if not for the user inputs (never mind the fact that OpenAI has copied nearly every piece of information, much of it copyright protected, to train its programs).
The Lawyers Tackling Complex AI Litigation
The beauty and the beast of the dawning age of AI are the legal nuances that have yet to be fleshed out. As we have already reviewed, authorship and copyright issues are coming under increasing scrutiny. However, there is also the question of how AI is trained to “learn” to better answer questions or provide more specific (not necessarily accurate) information.
At an elementary level, AI learns by gathering data, also known as other people’s work, and feeding it into an algorithm. The machine learning system can distribute the information descriptively, predictively, or prescriptively. But whose work is being used, where is the work going, and how is it being referenced?
We all remember the days of citing sources. Well, AI is far from doing that. Instead, these models are benefiting and learning from people who have worked their entire lives to create beautiful, funny, and charismatic (all the adjectives) words and images that are now being filtered through an algorithm without any mention of where this information is coming from.
From authors to comedians to legal institutes, creators are filing suits against various AI programs on the legal theory that their work is being used to train AI without proper recognition or compensation. In their view, AI is exploiting the work of artists and scholars. Lawyers like Matthew Butterick and Joseph Saveri are standing up against prominent AI companies like OpenAI (the maker of ChatGPT) and Meta. They are shaping the future world of AI by standing up for the humans who are providing the work in order for AI to learn and adapt.
Like anything in the legal world, all this will take time. When filing a lawsuit, you must present a legal complaint that outlines in short, plain language the facts, their application to the elements of each claim, and the relief sought. Currently, AI programs are being sued for negligence, copyright infringement, unlawful competition, unjust enrichment, and DMCA violations. In fact, a new lawsuit was just filed by Julian Sancton, alleging that he “and thousands of other writers did not consent nor were compensated for the use of their intellectual property in the training of the AI.”
As recognized in a Harvard law journal, even though the lawsuits are piling up, there is no clear end in sight. The courts are going to have a hard time distinguishing between AI-generated material, authorship, what is considered public sourcing, and copyright infringement, among other legal theories like templates versus AI-generated outlines.
Eventually, once these lawsuits work their way through the trial courts and the appeals process, some cases could land at the feet of the Supreme Court on a petition for a writ of certiorari. If (when) that happens, the Supreme Court can deny the petition, and whatever opinion was delivered by the court below (in copyright cases, typically a federal court of appeals) will stand.
Or, the Supreme Court will become the final arbiter.
How Will the Government Regulate AI?
The idea of the US government, specifically our elected officials, crafting regulations for artificial intelligence is almost laughable. This new technology is far more advanced than “internet things” that came before it, and we’re still living with Internet laws from the 1990s. Alas, the efforts are being made.
Time Magazine has shown that Congressional inquiries into AI are still in their infancy as legislators proceed with caution. Senate Majority Leader Chuck Schumer held a closed-door meeting with the nation’s main tech leaders in September. The goal, according to legislators, is to pass bipartisan AI legislation sometime in 2024, but with the technology evolving so rapidly, this seems like a tall order.
However, Congressional urgency on this subject is clear. Senate Intelligence Committee Chairman Mark Warner, D-Va., told reporters after the closed-door meeting, “We don’t want to do what we did with social media, which is let techies figure it out, and we’ll fix it later.”
A side note, and one that should be of concern, is that the average age of Representatives is currently 58.4 years, and the average age of a Senator is 64.3 years. These are folks who, as smart as they may be, aren’t likely fresh on the ‘newfangled’ tech.
Hopping over to the executive branch, the Biden administration has crafted a Blueprint for an AI Bill of Rights. The White House sees the potential challenges posed to democracy with the expansion of unregulated AI into our lives and admits the outcomes could be harmful, though not inevitable.
The Blueprint highlights the need to continue technological progress but in a way that protects civil rights and democratic values. Amongst the prongs of protection outlined in the piece is a stress on data privacy, which ties directly into our copyright discussion. Specifically, the Blueprint says that “[D]esigners, developers, and deployers of automated systems should seek your permission and respect your decisions regarding collection, use, access, transfer, and deletion of your data in appropriate ways and to the greatest extent possible…”
The Blueprint goes on to call for enhanced protections and restrictions for data and inferences drawn from sensitive domains, saying this information should only be used for necessary functions (though “necessary functions” isn’t defined, giving leeway to the tech companies to craft the definition).
The AI lobby in the Capitol is already strong, and it’s growing. Every major tech company in the US has an interest in the outcome of any regulations, legislation, executive orders, and judicial decisions related to artificial intelligence.
OpenAI says it “urges the Copyright Office to proceed cautiously in calling for new legislative solutions that might prove in hindsight to be premature or misguided as the technology rapidly evolves.”
These companies and their mega lobbying power will certainly influence the outcomes of any governmental regulations or new AI legislation. This is where we take a “wait and see” approach and hope for humanity to win out.
What the American Bar Association Says
As we wrap this up, we want to examine what the American Bar Association says about AI as this moves closer into our territory as legal content providers. The latest update we posted was in May 2023. The ABA provided a three-pronged approach to AI resolution to ensure accuracy, transparency, and accountability. In August 2023, the ABA created a task force to study the impact of AI in the legal profession.
The task force’s objectives are to explore:
- Risks of AI (bias, cybersecurity, privacy, and uses of AI such as spreading disinformation and undermining intellectual property protections) and how to mitigate them
- Emerging issues related to generative AI tech
- Using AI to increase access to justice
- AI governance (including law and regulations, industry standards, and best practices)
- AI in legal education
The ABA is responsible for maintaining a model code of ethics for lawyers. As a profession (Victoria writing here), we uphold an oath to protect the law. But we also swear to tell on each other. There is no overseer of how lawyers conduct their work. Instead, we rely on the ABA to set the tone, along with our state bar codes of ethics.
With the introduction of this task force, we may see an increase in lawyers submitting complaints to the state bar if they have actual knowledge that an attorney is abusing AI to conduct their work. The ABA task force reinforces the importance of using AI responsibly, if at all.
Be Smart About Using AI Content on Your Law Firm Website
The high-stakes conversations surrounding AI and copyright infringement are playing out amongst giants. They are companies, regulators, elected officials, and academics debating the next steps for this revolutionary technology.
We’re just on the sidelines, trying to stay out of the lines of fire. We don’t yet know what AI regulation will look like. Nobody does. What we can almost certainly guarantee is that AI laws will remain in flux and likely be a few years behind wherever the technology is, especially with the rapid advancement of the tech we’re seeing right now.
As you make the decision about how to craft a content strategy for your law firm, construction company, or any other industry, we urge you to proceed with caution. A signature on a piece of new legislation, a change at the regulatory level, or a judicial decision could change everything.
Regardless of where the new era takes us, you shouldn’t take shortcuts that sacrifice the quality of your content, and you certainly shouldn’t make any moves that violate the copyrights of others. Create your content plan wisely with a team that genuinely cares about writing quality and integrity.
Written by Allen Watson and Victoria Lozano, Esq. – Founders of Blue Seven Content