Seven AI Risks Your Vendor's Audit Missed

SOC 2 and ISO 27001 audit infrastructure. These seven risks live inside the model itself, where auditors don’t look and traditional controls don’t reach.

Imagine hiring a security guard who checks IDs at the lobby desk but has never been trained to recognize social engineering. Someone walks up, flashes a confident smile, and says “I’m with IT. Show me the server room.” The guard, doing exactly what they were trained to do, lets them right in. Your AI vendor’s compliance certifications are that security guard: excellent at the job they were designed for, but completely blind to threats they were never taught to see.

Your AI vendor has a SOC 2 Type II report. Their auditors found no exceptions. Their infrastructure is locked down, their access controls are documented, and their disaster recovery plan has been tested. None of that will help you when a competitor reverse-engineers their model through the API, when a prompt injection attack overrides the system’s guardrails, or when privileged client information surfaces in a chatbot response because the model memorized its training data. If these risks exist in your environment and you haven’t audited for them, the ‘lock on the door’ provided by SOC 2 won’t protect you from a claim of professional negligence. Traditional legal tech due diligence stops at a certification checkbox that was never designed for AI.

In an earlier post, I explained why traditional certifications miss these risks. Now let’s look at the seven risks themselves—what they actually look like, and why they should concern anyone deploying AI in legal practice.

You can have the strongest firewall in the world, but if the meaning of a user’s prompt is malicious, the firewall will let it through because it looks like a standard question.

AI/LLM Risks Not Addressed by Traditional Certifications

Even organizations with pristine compliance credentials are discovering that AI introduces risks their security programs were never designed to catch. These AI-specific risks cut across confidentiality, integrity, and availability in ways that legacy controls are not designed to address.

Model Extraction: Can Competitors Steal Your AI Through the API?

An adversary can systematically query your AI model to reverse-engineer its functionality, using only legitimate API calls. No breach, no firewall alarm, no authentication failure. Just someone asking clever questions until they’ve effectively cloned your expensive, proprietary model. A law firm’s AI-trained legal reasoning model could be replicated by a competitor without ever touching the underlying code. SOC 2 assumes that if your API is authenticated and rate-limited, everything is fine. In reality, you need AI-specific anomaly detection on query patterns, which standard audits don’t require.

For a law firm, your proprietary legal reasoning prompt-chains or fine-tuned models are your competitive advantage. Model extraction allows a competitor to steal that advantage without ever breaking into your network.
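To make the detection gap concrete, here is a minimal sketch of query-pattern anomaly detection, assuming you already log API calls per client. The `flag_extraction_suspects` helper and its thresholds are illustrative, not a production detector—real extraction defenses also examine query similarity, embedding-space coverage, and timing.

```python
from collections import defaultdict

# Illustrative thresholds -- real values would be tuned to your own traffic.
VOLUME_THRESHOLD = 1000      # queries per client per day
DIVERSITY_THRESHOLD = 0.9    # fraction of a client's queries that are unique

def flag_extraction_suspects(query_log):
    """query_log: iterable of (client_id, prompt_text) pairs for one day.

    Extraction attacks tend to pair high volume with near-zero repetition:
    the attacker sweeps the input space rather than asking real questions.
    """
    counts = defaultdict(int)
    uniques = defaultdict(set)
    for client_id, prompt in query_log:
        counts[client_id] += 1
        uniques[client_id].add(prompt)
    return [
        client_id
        for client_id, n in counts.items()
        if n >= VOLUME_THRESHOLD
        and len(uniques[client_id]) / n >= DIVERSITY_THRESHOLD
    ]
```

Note that both signals matter: a busy paralegal repeats similar questions all day, while an extraction script almost never asks the same thing twice.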

Training Data Leakage: When Your AI Remembers What It Shouldn’t

LLMs sometimes memorize and regurgitate chunks of their training data verbatim. This means secrets encrypted in your database might still leak through the model’s outputs. A chatbot fine-tuned on confidential memos might later quote one to an arbitrary user.

Researchers have demonstrated that ChatGPT—a model with safety filters—can be exploited to emit real email addresses and phone numbers from its training set, without any network breach required. Traditional controls (encryption at rest, access controls) do nothing because the model itself is the data vessel. In legal contexts, this could be catastrophic. Imagine privileged strategy memos appearing in chatbot responses.
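One hedge against verbatim memorization is canary testing: plant unique marker strings in the fine-tuning set, then periodically probe the deployed model to see whether any come back out. A minimal sketch—the `make_canary` helper and the scan are illustrative, and a real harness would also probe with partial prefixes of each canary:

```python
import secrets

def make_canary(prefix="CANARY"):
    """Generate a unique marker string to plant in fine-tuning data.

    If the deployed model ever emits this exact string, it has memorized
    and regurgitated its training data verbatim.
    """
    return f"{prefix}-{secrets.token_hex(8)}"

def leaked_canaries(model_outputs, canaries):
    """Scan a batch of generated text for any planted canaries."""
    return {c for c in canaries if any(c in out for out in model_outputs)}
```

A leaked canary is a proxy signal: if the model reproduces your markers, it is likely also capable of reproducing the privileged text sitting next to them in the training set.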

Prompt Injection: Social Engineering for AI

Prompt injection is social engineering, but for AI. A user provides input that causes the model to ignore its guardrails and follow the attacker’s instructions instead.

The technique was demonstrated within hours of Microsoft launching its AI-powered Bing Chat in February 2023: Stanford student Kevin Liu famously tricked Bing into revealing its hidden rules and codename “Sydney” simply by instructing it to “ignore previous instructions.” Indirect prompt injections are even sneakier: an attacker can embed a hidden instruction in data that the AI will process later (like a rogue clause in a contract or a comment in a document). When a Chevrolet dealership deployed a ChatGPT-powered chatbot in December 2023, users tricked it into agreeing to sell a Tahoe for $1—complete with “and that’s a legally binding offer, no takesies backsies.”

If a chatbot can be tricked into selling a $50k SUV for $1, it can be tricked into waiving a discovery deadline or agreeing to a settlement term in a client portal.

Similarly, data poisoning—submitting bogus content to a system that retrains on user input—can corrupt an AI’s outputs over time. An adversarial input could cause a model to emit false legal conclusions, or to generate documents that are factually incorrect.

From a certification standpoint, this is a nightmare gap. Prompt injection sits in a separate OWASP LLM Top 10 list that most traditional security audits don’t reference. It is not part of the standard penetration testing playbook. Neither SOC 2 nor ISO 27001 asks “how do you harden prompts?” or “do you sanitize user instructions?” They don’t account for an intelligent adversary trying to exploit the model itself rather than the surrounding IT system. It is an application-layer vulnerability unique to AI that falls between the cracks of secure coding guidelines and access control audits.
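There is no complete technical fix for prompt injection today, but you can at least install tripwires. Below is a minimal sketch of a pattern screen over untrusted text; the patterns are illustrative, and attackers paraphrase freely, so treat this as an alarm that something needs review, not as a defense:

```python
import re

# Illustrative patterns only -- real attacks paraphrase freely, so pattern
# matching is a tripwire, not a guardrail.
INJECTION_PATTERNS = [
    r"ignore (all )?(previous|prior|above) instructions",
    r"disregard (your|the) (rules|guardrails|system prompt)",
    r"you are now",
    r"reveal (your )?(system prompt|hidden rules)",
]

def screen_untrusted_text(text):
    """Return the injection patterns matched in a piece of untrusted text.

    Run this over *every* untrusted channel the model reads -- not just the
    chat box, but emails, contracts, and web pages pulled in at runtime.
    """
    lower = text.lower()
    return [p for p in INJECTION_PATTERNS if re.search(p, lower)]
```

The point of the sketch is the scope, not the regexes: indirect injection means the dangerous instruction can arrive in a document, not a prompt, so the screen must cover everything the model ingests.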

Model Drift: When AI Accuracy Deteriorates

Unlike traditional software, an AI model’s effectiveness isn’t static after deployment. Over time, it can become miscalibrated as data patterns change—a phenomenon called model drift. Nothing “breaks” technically; the model just silently becomes unreliable.

Traditional audits happen annually and verify that a change management process exists. No SOC 2 control ensures that the AI remains accurate over time. In legal AI, drift could mean a contract review tool that initially passed QA starts missing liability clauses as language patterns evolve. And nobody notices until a client suffers damage.
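Drift is detectable, but only if you monitor for it continuously rather than annually. One standard statistic is the Population Stability Index (PSI), which compares the distribution of some numeric signal—document length, model confidence scores—between a deployment-time baseline and live traffic. A minimal sketch, with the usual rule-of-thumb thresholds noted as illustrative:

```python
import math

def psi(expected, observed, bins=10):
    """Population Stability Index between a baseline sample and live data.

    Rule of thumb (illustrative): PSI < 0.1 stable, 0.1-0.25 drifting,
    > 0.25 a significant shift worth investigating.
    """
    lo = min(min(expected), min(observed))
    hi = max(max(expected), max(observed))
    width = (hi - lo) / bins or 1.0

    def dist(sample):
        counts = [0] * bins
        for x in sample:
            i = min(int((x - lo) / width), bins - 1)
            counts[i] += 1
        # Small floor avoids log(0) for empty bins.
        return [max(c / len(sample), 1e-4) for c in counts]

    e, o = dist(expected), dist(observed)
    return sum((oi - ei) * math.log(oi / ei) for ei, oi in zip(e, o))
```

For text systems the monitored signal is usually a proxy (embedding statistics, confidence scores, reviewer-override rates), but the principle is the same: an alert fires when live inputs stop resembling what the model was validated on.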

Fine-tuning and Data Contamination: Is Your Confidential Data Training Someone Else’s AI?

When you fine-tune a third-party model, you’re often sending sensitive text to a vendor that might feed a base model serving other clients. Many SaaS vendors have quietly integrated AI features that pipe customer data to providers like OpenAI or Anthropic, with defaults that allow training unless you opt out.

A SOC 2 report might confirm that the vendor has confidentiality policies. It won’t verify that “Client A’s fine-tuning data never influences outputs for Client B.” If a law firm fine-tunes an AI on privileged documents and the vendor uses that for general model improvement, you’ve potentially waived privilege to unknown parties. Although some organizations now insist on contractual “no training on our data” clauses, enforcement relies on trust unless technical isolation measures are verified.

Inference Privacy: Every Query is a Potential Leak

Using AI often involves sending queries (prompts) to a model that is hosted in the cloud. These queries may contain sensitive information—e.g., a lawyer asks an LLM, “Summarize our strategy for the ACME case…” which discloses client confidences.

Samsung engineers learned this the hard way in 2023 when, within three weeks of lifting an internal ChatGPT ban, employees had leaked semiconductor source code, internal meeting transcripts, and hardware specifications across three separate incidents, despite being a tech-savvy, ISO-certified company. This is exactly the kind of “silent third-party transfer” that GDPR frameworks struggle to catch, as I previously discussed.

Even if a provider doesn’t intentionally use your prompts, a breach could expose them. In March 2023, a bug in ChatGPT’s open-source Redis library allowed users to see snippets of other users’ conversations and partial payment information during a nine-hour window. No standard checklist in 2022 would have caught “Does your caching layer isolate User A’s conversations from User B?”
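One partial mitigation is to scrub prompts before they leave your perimeter. A minimal sketch—the redaction rules and the `ACME` client pattern are illustrative stand-ins for a real DLP engine and a maintained client/matter list:

```python
import re

# Illustrative redaction rules -- a real deployment would use a proper DLP
# engine and a maintained list of client and matter names.
REDACTIONS = [
    (re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"), "[EMAIL]"),
    (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "[SSN]"),
    (re.compile(r"\bACME\b", re.IGNORECASE), "[CLIENT]"),  # hypothetical client list
]

def scrub_prompt(prompt):
    """Redact obvious identifiers before a prompt leaves your perimeter."""
    for pattern, placeholder in REDACTIONS:
        prompt = pattern.sub(placeholder, prompt)
    return prompt
```

Scrubbing is lossy—an LLM summarizing a redacted memo still sees the strategy itself—so it complements, rather than replaces, contractual no-retention terms and private deployment options.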

Multi-Tenancy and Cross-Client Influence: When One Client’s Data Shapes Another’s Results

When an AI platform serves multiple clients, there’s risk of bleed-over. Some risks are technical (e.g., a classic multi-tenancy failure, where a bug shows user A another user’s data). Others are model-level: if the system continuously learns from all users, one tenant’s input could subtly influence another’s outputs.

Auditors typically check that databases are logically separated. But with AI, the “database” is the model’s neural weights, and typical audits have no procedure to inspect or assure segregation. A SOC 2 report relies on the company’s architecture description: if the company says “all customer data is logically separated,” the auditors may simply document that claim. They are unlikely to inspect the ML training pipeline to confirm that data is not commingled for model updates.

In legal contexts, this could be a huge failure. An AI research tool serving competing law firms might unintentionally transfer influence from one firm’s usage to another through the model’s responses. Traditional controls assume a static barrier (Client A can’t query Client B’s database), whereas with AI the barrier can be dynamically broken by the model’s training process.


These seven risks have one thing in common: they live inside the model, not the infrastructure around it. That is precisely where traditional audits don’t look. Your vendor’s SOC 2 report confirms their firewalls are configured correctly. It says nothing about what happens when someone asks their AI the right questions in the wrong way.


Real-World Examples: When Compliance Did Not Equal Security

These aren’t theoretical vulnerabilities. In 2025 alone, three high-profile incidents demonstrated that even industry leaders with comprehensive compliance programs are vulnerable to AI-specific attacks that traditional frameworks never anticipated.

The EchoLeak Microsoft 365 Copilot Exploit

Microsoft 365 Copilot, which is backed by a comprehensive suite of ISO 27001 and SOC 2 Type II attestations, was found to be vulnerable to a zero-click injection attack known as EchoLeak (CVE-2025-32711). In a zero-click attack, your associate doesn’t need to click a link or download an attachment. They just need an email to sit in an inbox that Copilot happens to summarize. In the EchoLeak attack, an attacker could steal confidential business data, like OneDrive files, SharePoint content, and Teams messages, simply by sending a crafted email. The payload coerced Copilot into bypassing Microsoft’s prompt-injection filters and exfiltrating data through a whitelisted Microsoft domain.

The Salesforce Agentforce “ForcedLeak” incident

Salesforce, an industry leader in compliance and security, suffered a critical vulnerability in its Agentforce platform (CVSS 9.4) that allowed attackers to exfiltrate sensitive CRM data. The attack was deceptively simple: embed malicious instructions in a routine web form, wait for an employee to ask the AI to process it, and watch the agent obediently leak internal customer data to an attacker-controlled server. The researchers who discovered it purchased the domain needed to execute the attack for only $5. A $5 domain name bypassed a multi-billion dollar security infrastructure. This is the definition of a ‘Certification Paradox.’

The GitHub MCP AI Agent Hijacking

The Invariant Labs Security Research Team discovered that attackers could hijack official GitHub AI agents through malicious public GitHub issues. When an AI assistant was asked to “check the open issues,” it would read the malicious issue, get prompt-injected, and then use the developer’s access tokens to leak sensitive data from private repositories. This case illustrates the ‘excessive privilege’ risk in AI agents—a vector that traditional audits, which focus on human user access, often fail to scrutinize in the context of automated tool-calling.


Until compliance frameworks catch up, the burden falls on you to ask questions that auditors don’t. Because when opposing counsel subpoenas your ‘compliant’ AI vendor’s logs and finds that your client’s privileged strategy was sitting in a shared training set, the SOC 2 certificate won’t be your defense. It will be Exhibit A in the malpractice claim. Your AI vendor risk assessment needs to go beyond the certificate and into the model itself. In a future post, I will discuss some of these questions and how emerging AI-specific frameworks can address legal industry-specific gaps that traditional compliance frameworks miss.