The (not so) Hidden Risks of Using Cloud‑Based LLMs with Confidential Data

Professionals are increasingly experimenting with cloud-based Large Language Models (LLMs) – like OpenAI’s ChatGPT – to streamline work, from drafting documents to analyzing data. However, alongside their benefits lies a serious challenge: protecting confidential and sensitive information. When business or legal teams feed client data, proprietary code, or personal identifiers into an LLM hosted on the cloud, they may inadvertently expose that information beyond the intended context. Recent incidents and expert analyses have underscored that safeguarding confidential data in the age of AI is more critical than ever.

This is especially true for lawyers. Relying on the promises of cloud service providers can be dangerously shortsighted. While vendors may offer terms and conditions designed to reassure, the burden of protecting client data and legal privilege ultimately rests with the professional—not the platform. If something goes wrong, it’s not the AI vendor whose license is at stake. It’s yours. Below, we explore why legal professionals, in particular, should not entrust confidential data to third-party LLMs running in the cloud.

Data Privacy and Confidentiality Risks

Entrusting sensitive data to a third-party AI service raises immediate privacy and confidentiality concerns. Unlike software running on your own machine, queries to cloud LLMs are sent over the internet and stored on external servers outside your organization’s control. In fact, the providers themselves may retain and review what you input. OpenAI’s own guidance has warned users not to share sensitive information in conversations, noting that chats may be reviewed by its staff and that specific prompts cannot be deleted from a user’s history. The UK’s National Cyber Security Centre (NCSC) similarly cautions that queries are visible to the LLM provider and likely used to improve the service, meaning the text you enter can be accessed by the AI company or its contractors. By design, many cloud LLM providers reserve the right to use user content to provide, maintain, and improve their services. Any confidential report or client data you input might therefore be stored indefinitely on someone else’s servers.
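
To make the mechanics concrete, here is a minimal sketch in Python of what a typical call to a cloud LLM involves. It uses OpenAI’s public chat completions endpoint; the API key, model name, and prompt are placeholders. The point is structural: everything you type into the prompt is serialized into a JSON payload and transmitted, verbatim, to servers you do not control.

```python
import requests

API_KEY = "sk-..."  # placeholder; a real key identifies your account to the provider

# The entire prompt, including any client names or case facts pasted into it,
# is serialized into this JSON payload before it leaves your machine.
payload = {
    "model": "gpt-4o",  # illustrative model name
    "messages": [
        {
            "role": "user",
            "content": "Summarize the settlement terms for Client X ...",  # placeholder
        },
    ],
}

# One HTTPS POST and the prompt is on the provider's servers, where retention
# and human review are governed by the provider's terms, not by yours.
response = requests.post(
    "https://api.openai.com/v1/chat/completions",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json=payload,
    timeout=30,
)
print(response.json()["choices"][0]["message"]["content"])
```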

The potential implications for confidentiality are severe. Once information is submitted to a cloud LLM, you lose exclusive control over it. Company secrets or personal data could be exposed through the AI’s future outputs or internal mishandling. This not only undermines trust but can also have serious legal and financial consequences.

Regulatory and Compliance Challenges

When employees feed confidential data into a cloud-based AI, they may also run afoul of data protection laws and compliance requirements. Privacy regulations like the EU’s GDPR impose strict rules on transferring personal data to third parties or across borders. In the case of ChatGPT, an investigation by Italy’s Data Protection Authority (the Garante) in 2023 highlighted exactly this issue: citing ChatGPT’s mass collection and processing of personal data to train its models, the regulator temporarily blocked the service in Italy until stronger privacy safeguards were in place.

Apart from privacy laws, industry-specific confidentiality duties demand caution. Attorneys and financial professionals are often bound by confidentiality agreements and ethical rules that forbid sharing client information with unauthorized parties. Uploading a client’s documents to an AI service without client consent could breach those duties. Notably, attorney-client privilege – the cornerstone of legal confidentiality – could be waived if a lawyer unintentionally discloses privileged information to an AI platform. Since OpenAI is not bound by any attorney confidentiality agreement and expressly disclaims any duty to treat user prompts as confidential, sharing client facts with ChatGPT is tantamount to disclosing them to an outside party.

Intellectual Property and Trade Secret Concerns

Beyond privacy, intellectual property (IP) risks emerge when confidential business information is fed into a public LLM. Companies invest heavily in trade secrets – proprietary algorithms, strategies, source code, etc. – whose value depends on controlled secrecy. But if a developer or analyst pastes secret source code or a patent draft into an AI like ChatGPT, that information is no longer truly secret. In many jurisdictions, trade secret protections can be lost if the secret is disclosed to someone outside the protected circle without a confidentiality agreement. Here, the “someone” is an AI provider.

There’s also the issue of who owns and controls the output generated from your confidential inputs. While OpenAI assures users they own the outputs their prompts generate, that guarantee may ring hollow if the output inadvertently contains fragments of someone else’s IP. IP professionals are rightly concerned about how secure proprietary information remains when used in AI and whether using AI-generated content in official work product could introduce legal ambiguities.

Security and Data Breach Threats

Any external data service can be a target for hackers or subject to technical glitches – and LLM platforms are no exception. A major security concern with cloud LLMs is that your confidential inputs might be exposed through a breach or error on the provider’s side. In March 2023, for example, OpenAI disclosed a data exposure bug – traced to a fault in an open-source library – that allowed some users to view parts of other users’ chat histories, including conversation titles and, for a subset of subscribers, payment-related information.

Even without malicious breaches, human error or misuse can lead to leaks. We’ve already noted how the AI itself might inadvertently incorporate sensitive data into responses, but sometimes the exposure is entirely on the user side. In April 2023, Samsung employees reportedly leaked confidential information by pasting source code and internal meeting notes into ChatGPT. The incident prompted Samsung to impose stricter internal guidelines and, soon after, to ban generative AI tools on company devices.
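
One practical mitigation, sketched below, is a client-side filter that screens prompts for obvious red flags before anything is uploaded. The patterns and the `is_safe_to_send` helper are illustrative assumptions, not a complete data-loss-prevention solution; a production guardrail would also cover client-matter lists, named entities, and document fingerprints.

```python
import re

# Illustrative patterns only; a real DLP layer would be considerably broader.
BLOCKED_PATTERNS = {
    "API key": re.compile(r"\b(?:sk|pk)-[A-Za-z0-9]{20,}\b"),
    "card number": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
    "confidentiality marker": re.compile(
        r"(?i)\b(privileged|attorney[- ]client|confidential)\b"
    ),
}

def is_safe_to_send(prompt: str) -> tuple[bool, list[str]]:
    """Return (ok, findings); ok is False if any blocked pattern matches."""
    findings = [label for label, rx in BLOCKED_PATTERNS.items() if rx.search(prompt)]
    return (not findings, findings)

ok, findings = is_safe_to_send(
    "Please review this attorney-client privileged memo for Acme v. Foo ..."
)
if not ok:
    print("Blocked before upload; matched:", ", ".join(findings))
```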

Ethical and Professional Duty Considerations

For professionals in fields like law, finance, and healthcare, using AI tools touches not only on laws and policies but also on ethical duties to clients and stakeholders. Take lawyers as an example: they are duty-bound to keep client communications confidential and secure. If a lawyer were to use ChatGPT to draft a client brief by including details of the case in the prompt, that lawyer might inadvertently violate ethical rules unless the client has given informed consent.

Bar associations and in-house general counsels are increasingly warning attorneys that using generative AI for client work must be treated as a disclosure to a third party, and thus either avoided or done only with client approval and extreme caution. The ethical imperative is clear: protecting stakeholder trust and confidentiality must take priority over convenience. The consequences of a breach are not theoretical—they are professional, reputational, and personal.

The ALPHALECT.ai Advantage

At ALPHALECT.ai, we understand that lawyers and legal professionals operate under stringent professional confidentiality requirements – because we are lawyers. Our solution is designed to give you full confidentiality and control over sensitive information. Unlike conventional cloud offerings, we don’t merely promise data security: our architecture ensures that confidential data never leaves your direct control and is never exposed to third-party training models. Legal professionals need not rely on vague assurances that their data is “somehow” protected in the cloud; instead, they work with an architecture and workflows explicitly designed to preserve professional secrecy, supporting both compliance and peace of mind.
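
As a simplified illustration of the general on-premises pattern (not a description of our production architecture), the sketch below sends a prompt to a model server running entirely inside your own network. The localhost endpoint and model name are placeholders for a locally hosted, OpenAI-compatible inference server.

```python
import requests

# The model server runs on infrastructure you control, so the prompt never
# crosses your network perimeter. "localhost:8000" and "local-model" are
# placeholders for a locally deployed, OpenAI-compatible inference server.
response = requests.post(
    "http://localhost:8000/v1/chat/completions",
    json={
        "model": "local-model",
        "messages": [
            {"role": "user", "content": "Summarize this privileged memo ..."},
        ],
    },
    timeout=60,
)
print(response.json()["choices"][0]["message"]["content"])
```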

Conclusion

While cloud-based LLMs bring transformative potential, legal professionals must be clear-eyed: the promises of AI vendors won’t protect your clients—or your license. If something goes wrong, it’s your responsibility. Don’t entrust sensitive data to platforms that cannot contractually guarantee confidentiality in line with your obligations. Choose solutions built with your duty of care in mind. Because at the end of the day, it’s your career—and your clients’ trust—on the line.

Businesses must balance technological innovation with robust data security frameworks to fully harness the power of AI safely and responsibly.

Don’t wait for a confidentiality breach to expose the risks you thought were under control. Review how your firm handles AI today, and rely only on solutions purpose-built for legal professionals, with verifiable, contractually enforceable confidentiality. Your career, your license, and your clients’ trust depend on it.

At ALPHALECT.ai, we explore the power of AI to revolutionize the European IP industry, building on decades of collective experience and a clear vision for the industry’s future. For answers to common questions, explore our detailed FAQ. If you require personalized assistance or wish to learn more about how legal AI can benefit innovators, SMEs, legal practitioners, and society as a whole, don’t hesitate to contact us at your convenience.
