Explore the intersection of artificial intelligence and law. This hub offers insights into regulations, confidentiality, and copyright in the context of AI.
AI Model Provider Confidentiality
Is your data safe with AI providers?
Imagine you're a criminal defense lawyer. A client confesses they killed someone, and you need to write an urgent brief on a tight deadline. You input the facts—including the confession—into ChatGPT, explain what you need, and it delivers a reasonably good first draft that you can refine. Great, right?
Hopefully, by now, we're not that naive.
Confidentiality in generative artificial intelligence (GenAI) is a widespread concern among industries and legal professionals. Sending confidential information to GenAI services like OpenAI's ChatGPT, Anthropic's Claude, or Google's Gemini raises concerns about data leaks, privilege waivers, and exposure of confidential information. To our knowledge, no court has yet addressed whether sending information to a GenAI provider waives a privilege or destroys confidentiality protection.
Understanding GenAI Provider Policies
The likely answer, unsurprisingly, is "it depends." Mostly, it depends on the GenAI provider's terms and security, and on the user's settings in their GenAI accounts. If the provider uses your inputs to train future AI models, you run a serious risk of confidentiality leaks and privilege waivers. This must be avoided!
If the GenAI provider stores your inputs—but expressly says it won't use them to train AI—there is a reasonable argument against waiver. Providers of other cloud services like email, chat, and file storage often have similar terms, and using sufficiently secure cloud providers generally does not waive privilege. See, e.g., Harleysville Ins. Co. v. Holding Funeral Home, Inc., No. 1:15CV00057, 2017 WL 4368617 (W.D. Va. Oct. 2, 2017).
Making Informed Choices
For many organizations, using "zero-retention" AI services will be the correct answer. While you still need to trust the provider—and data breaches can happen even with highly secure companies—established providers like Microsoft Azure and Amazon Bedrock offer enterprise-grade security comparable to their other cloud services.
The information herein is based on publicly available documentation from providers. Grounds LLP has no affiliation with these providers other than being a user. For the most current and accurate information, please refer to the providers' official documentation and legal agreements.
We looked through a whole lot of provider terms to compile this resource, and providers update their terms frequently. Mistakes are possible! Let us know if anything needs updating, or if we should add another provider.
Provider Comparison
| Provider | Trains on Data | Data Storage | Indemnity |
|---|---|---|---|
| OpenAI | Varies: Consumer tier trains by default (can opt out); business offerings do not train by default | Storage for abuse monitoring and service improvement, with zero-retention option for enterprise (Data Usage FAQ →) | Available for API and Enterprise users with conditions, for claims that Customer's use or distribution of Output infringes a third party's intellectual property right (Business Terms →) |
| Azure OpenAI | No: Does not train on customer data | 30 days for abuse monitoring; enterprise can opt out (Data Privacy Documentation →) | Indemnity for allegedly infringing outputs, provided customer implements certain mitigations (Copyright Commitment →) |
| Anthropic | No: By default, does not train on inputs or outputs; exceptions for trust and safety reviews and explicit feedback | Data stored for up to 30 days for safety monitoring; can be deleted upon request (Consumer Terms of Service →) | Indemnity for claims that outputs infringe third-party intellectual property rights, subject to certain conditions (Commercial Terms of Service →) |
| Amazon Bedrock | No: Your content is not used to improve the base models and is not shared with any model providers | Data is encrypted in transit and at rest, with optional customer-key encryption (AWS Bedrock FAQs →) | Uncapped IP indemnity for copyright claims arising from generative output of Amazon Bedrock services (AWS Service Terms →) |
| Perplexity | Varies: Consumer tier may store data for AI improvements; Enterprise tier does not train on customer data | Inputs and responses stored for 7 days by default on the consumer tier; configurable retention for enterprise (Enterprise Terms of Service →) | Enterprise customers receive indemnification against third-party claims related to use of the services (Enterprise Terms of Service →) |
| Google Gemini | Varies: Google uses data submitted through its unpaid tiers to improve its services, but does not use user input to train models for paid tiers | Data retained for up to 3 years by default, with options to limit retention to 3 or 36 months (Gemini API Terms →) | Google offers indemnification for both training data and generated output (Protecting customers with generative AI indemnification →) |
| Grok | Varies: Free tier data may be used for training, with opt-out option; paid services do not train by default | X may use, store, and distribute input and output 'to maintain and provide the Service,' but '[e]xcept for anonymized and aggregated statistics, [Grok] will not use your Input or Output to develop or improve the Service' (Privacy Policy →) | Indemnification available for enterprise customers against third-party claims related to use of the services (Enterprise Terms of Service →) |