
Many businesses are uploading internal content into AI tools without knowing what rights they’re giving away. Ownership boundaries remain unclear when these tools are used in the workplace.
This post examines who really owns your data once it enters a generative AI platform, and what your team should consider before hitting “submit.” Let’s take a look at the challenges that need to be addressed.
The Rise of AI Workflows and the New IP Risk
Generative AI tools like Gemini, ChatGPT and NotebookLM are becoming deeply embedded in day-to-day business workflows. Employees are using these platforms to speed up their work, simplify complex tasks and uncover insights buried in large sets of information.
The advantages are clear, but these evolving workflows also create new intellectual property and data privacy concerns, particularly when sensitive documents are uploaded without a full understanding of how the platform handles the material, or without tracking which documents were uploaded to which tools. These concerns make it essential for organizations to establish clear policies around when and how employees can use these tools.
Three Common Use Cases Driving AI Adoption
AI-driven platforms are changing the way teams create, share and organize information. The productivity gains are significant, but it is just as important to understand what is being uploaded and why. Here are some of the most common ways employees are using generative AI tools:
- Summarizing Long or Complex Documents: Employees regularly upload research reports, technical specifications and policy manuals to generate quick summaries or extract key takeaways.
- Drafting and Refining Written Communications: By feeding in older communications or existing templates and tweaking them, users can generate professional updates more efficiently than rewriting from scratch.
- Knowledge Management and Research Support: Several AI tools are designed to analyze and cross-reference uploaded knowledge bases instead of searching externally. Teams can upload collections of strategy documents, case studies, internal memos and project files to ask targeted questions and quickly surface relevant information. With tools like NotebookLM, teams can more confidently isolate one set of documents from another, in separate “notebooks” for different projects or customers.
But there is a downside: without safeguards or clear policies, uploading documents to AI platforms can expose sensitive information. Inputs may be stored, analyzed or used to improve AI models, breaking the confidentiality users expect.
"We’re seeing teams rely on AI to handle tasks that once consumed hours. That speed can give organizations a competitive edge as long as they remain mindful of what’s being shared." - Tara Swaminatha, Founder and Principal Attorney, ZeroDay Law
How Teams May Unknowingly Expose Sensitive Data
Generative AI’s versatility means that risks are not confined to a single department. Different teams often handle confidential materials without fully considering how those inputs are processed:
- Marketing teams may upload existing slide decks and launch strategies to draft press releases, exposing unannounced product features or pricing models.
- Legal teams frequently paste clauses from past agreements into AI tools when drafting new contracts, inadvertently combining one client’s confidential negotiation terms with another’s.
- HR teams often use AI to create updated performance review templates by referencing prior evaluations, which may include private employee data and internal metrics.
- Research and development teams may rely on AI to summarize lab reports and technical findings, risking disclosure of proprietary research or patent-relevant data or code.
While these scenarios improve efficiency, they also highlight a growing tension between productivity and protection. Without careful review of platform terms and internal data governance policies, organizations may be exposing valuable intellectual property without realizing the breadth and depth of that exposure.
Input vs. Output: What the Difference Means Legally
Understanding the distinction between inputs and outputs in generative AI platforms is key, as each is handled differently under most license agreements and terms of use:
- Inputs include emails, strategy documents, source code, policy manuals, other proprietary materials and, of course, the prompts themselves. Some AI vendors still reserve the right to store, analyze or even use inputs to improve their models unless enterprise controls or opt-out settings are enabled; even then, users sometimes cannot configure certain data to be ephemeral.
- Outputs are the drafts, summaries, analyses or recommendations generated by the AI tool. Many platforms grant users broad rights to reuse outputs, including for commercial purposes, but those grants typically come with disclaimers that limit the provider’s liability.
Because most agreements protect the provider, companies often bear the risk if AI-generated content infringes copyrights or creates compliance issues. Reviewing terms carefully is essential before uploading sensitive data or relying on outputs.
Learn more about how legal organizations can safeguard their AI efforts in our blog post, Legal Implications of AI Technologies: 6 Tips to Minimize Risk.
Interpreting Platform License Agreements: Key Ownership and Usage Clauses
Review license terms carefully before uploading documents, as platforms vary widely in how they handle data ownership and usage. But before reviewing the terms, make sure you know which license applies to the version of the AI tool your employees are actually using. Free and personal versions tend to grant the provider broad rights in users’ data. Some platform licenses take a middle-of-the-road approach and reserve the right to store and analyze your inputs to “improve their services,” which could expose confidential documents or proprietary data. Choosing the right license tier (e.g., business or enterprise) makes a real difference in how your information is handled.
Opt-Out vs. Opt-In Policies
The way a platform handles “opt-out” versus “opt-in” data policies can have a significant effect on your level of risk. With opt-out settings, uploaded information may be used to train models by default unless you actively change the permissions. Opt-in tools, on the other hand, require you to give explicit approval before your content is used beyond your current session. Again, check the license.
Who Owns the Output: You or the AI Vendor?
Ownership of AI-generated content is not always straightforward. Some platforms grant users broad rights to reuse outputs, while others include vague language that limits exclusivity or allows the same response to be provided to multiple users. Companies should review terms carefully and ask two key questions:
- Do you have full rights to reuse the content?
- Can the vendor resell or redistribute outputs?
These questions should be answered before AI-generated work is integrated into the workplace.
Real-World Legal Opinions and Precedents
Recent court cases and ongoing litigation are shedding light on the evolving legal landscape around AI, content ownership and copyright infringement. These rulings offer early guidance on how courts may treat both uploaded inputs and AI-generated outputs in enterprise contexts.
Cases on Copyrighted Materials Used to Train AI Models
- Thomson Reuters v. ROSS Intelligence (D. Del. 2025): Fair use defense denied. Bottom line? Don’t steal copyrighted works and use them to train your own AI model to compete with the copyright owner.
This case was the first federal court opinion on fair use of copyrighted works to train AI models. Thomson Reuters, Westlaw’s owner, sued Ross, a competitor, for using Westlaw content (purchased from an intermediary) to train its non-generative AI model. Ross asserted a fair use affirmative defense for its use of Westlaw headnotes. The judge didn’t buy the argument. The judge had reached the opposite conclusion in 2023, but this time the judge “slogged through 2,830 headnotes,” finding Ross infringed copyright on 2,243. The decision reinforced that Westlaw headnotes are protectable and that repurposing them for AI training constituted copyright infringement.
- Bartz v. Anthropic (N.D. Cal. 2025): Fair use defense upheld for using purchased works to train AI models. Bottom line? If you didn’t come by content honestly, don’t use it to train an AI model. Don’t download works from sites with pirated content.
A recent ruling agreed with Anthropic that its use of copyrighted works to train its Claude model was fair use where Anthropic had purchased copies of the works. Anthropic “tore off the bindings, scanned every page, and stored them in digitized, searchable files … to amass a central library of ‘all the books in the world’ to retain ‘forever.’” On the other hand, Anthropic’s use of the millions of pirated works it had downloaded for free was not fair use.
Case on Whether AI-Generated Content Is Copyrightable
- Thaler v. Perlmutter (D.C. Cir. March 2025): Only works authored by humans can be copyrighted. The court reaffirmed that purely AI-generated content cannot receive copyright protection under U.S. law, because such work lacks human authorship. The ruling underscores that businesses should not assume ownership of AI outputs without substantial human creative input. Bottom line? Your AI-written novel is not going to make you millions.
"These early rulings highlight that companies using copyrighted works to train AI models is not different from using copyrighted works for any other purpose protected by copyright law. And AI-written content cannot be copyrighted. Until more nuanced precedents emerge, the safest approach is to treat inputs and outputs as potential areas of risk." - Tara Swaminatha, Founder and Principal Attorney, ZeroDay Law
These cases collectively underscore growing judicial scrutiny of AI training practices and output ownership. Courts may further delineate fair use boundaries for internal vs. external content.
How to Protect Proprietary Content Before You Upload
Generative AI tools can improve efficiency, but safeguarding the sensitive data you want to use in them takes careful planning. Companies need to establish clear rules before employees upload internal documents or use AI-generated content.
A checklist can guide reviews of AI tool license agreements and privacy policies; a simple way to record the answers is sketched after the list. It should include:
- Whether inputs are stored, analyzed or used to train models.
- How outputs are (or aren’t) licensed and whether your rights to reuse are exclusive.
- Verification of platform opt-out and opt-in settings to ensure they align with your risk tolerance.
- A review of platform data retention policies, including whether data is partially retained, anonymized or transformed into derivative works. Understand exactly what “anonymization” means before agreeing to it and ensure any derivative works clauses are clearly defined.
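To keep those reviews consistent across vendors, a team might record the answers in a simple structured format. The sketch below is purely illustrative: the class, field names and example entry are assumptions, not terms pulled from any specific vendor agreement.

```python
from dataclasses import dataclass, field

@dataclass
class AIVendorReview:
    """Record of one AI tool's license and privacy review (illustrative only)."""
    vendor: str
    license_tier: str                  # e.g., "free", "business", "enterprise"
    inputs_used_for_training: bool     # are inputs stored, analyzed or used to train models?
    training_default: str              # "opt-out" or "opt-in"
    output_reuse_rights: str           # e.g., "broad, non-exclusive, commercial use permitted"
    retention_policy: str              # retention window, anonymization, derivative-works language
    open_questions: list[str] = field(default_factory=list)

# Hypothetical example entry a review team might keep alongside the signed terms.
example = AIVendorReview(
    vendor="ExampleAI (hypothetical)",
    license_tier="business",
    inputs_used_for_training=False,
    training_default="opt-in",
    output_reuse_rights="broad, non-exclusive; no warranty of non-infringement",
    retention_policy="30-day retention; definition of 'anonymization' still unclear",
    open_questions=["Can the vendor redistribute outputs to other customers?"],
)
```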
Suggested best practices for using enterprise versions of AI tools include:
- Prioritize enterprise-grade tools with more favorable license terms and customizable data protections.
- Use managed accounts rather than personal logins to enforce centralized settings.
- Disable data sharing and model training by default where possible.
- Put written agreements in place with vendors that clarify data handling.
- Train employees thoroughly on use policies and terms.
- Implement consistent data classification and labeling protocols to distinguish between public, internal, confidential and regulated data to support AI access control and DLP strategy (a minimal illustrative sketch follows this list).
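To make the last item concrete, here is a minimal sketch of how a pre-upload check might map classification labels to approved destinations. The labels, tool names and the check_upload() helper are hypothetical illustrations, not features of any real DLP product or AI platform API.

```python
# Hypothetical pre-upload gate: decide whether a document with a given
# classification label may be sent to a given AI tool. All labels and tool
# names below are illustrative assumptions, not real product identifiers.

ALLOWED_BY_LABEL = {
    "public": {"any_approved_tool"},
    "internal": {"enterprise_gemini", "enterprise_chatgpt"},
    "confidential": set(),   # requires manual review before any upload
    "regulated": set(),      # never uploaded to external AI tools
}

def check_upload(label: str, tool: str) -> str:
    """Return 'allow', 'review' or 'block' for a proposed upload."""
    allowed = ALLOWED_BY_LABEL.get(label)
    if allowed is None:
        return "review"   # unlabeled or unknown data defaults to human review
    if tool in allowed or "any_approved_tool" in allowed:
        return "allow"
    return "block" if label == "regulated" else "review"

print(check_upload("internal", "enterprise_gemini"))       # allow
print(check_upload("confidential", "enterprise_gemini"))   # review
print(check_upload("regulated", "personal_chatgpt"))       # block
```

In practice this kind of rule would live in a DLP gateway or endpoint policy rather than in application code, but the core of the control is the same: a label-to-destination mapping enforced before anything leaves the organization.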
Why Internal Policy Matters as Much as External Contracts
Even with strong vendor protections, companies need internal controls. Employees should know what types of documents can be uploaded, which tools are approved and how to handle proprietary data. Without consistent guidance, sensitive materials can be unintentionally exposed.
A company-wide AI-usage policy can set expectations and guidelines that everyone can understand. Clear policies and vetted license agreements work best together. Regular monitoring and periodic reviews of AI usage can further ensure that policies remain effective as technologies and business needs evolve. By setting expectations up front and enforcing safeguards, companies can capture the benefits of generative AI without compromising control over sensitive content.
AI platforms offer tremendous efficiency, but businesses need to be clear-eyed about data ownership. Before you upload, review the platform’s terms, consult legal and create guardrails that protect your proprietary work.
Looking to develop an internal AI usage policy or evaluate license agreements in light of compliance obligations?
Contact ZeroDay Law for a consultation to help protect your IP and safeguard your organization.