Your LLM Stack Is Leaking More Than You Think

Why this matters this week
If you’re running LLMs against production data and you don’t have clear answers to:
- What data is sent to which model, stored where, and for how long?
- Who can reconstruct prompts, responses, and embeddings?
- How you’d prove to an auditor that you’re doing what your policy claims?
…then you don’t have AI governance; you have vibes.
This week matters for three concrete reasons:
1. Vendors are quietly changing defaults

   Major LLM providers have updated retention and training policies multiple times in the last few months. The press releases highlight “enterprise grade” and “no training,” but:
   - Some APIs still log prompts/responses for abuse detection.
   - Some have per-feature retention quirks (e.g., batch APIs vs streaming vs fine-tuning).
   - Some “no training” guarantees do not cover third-party integrations.
2. SOC2 / ISO auditors are now asking “show me,” not “tell me”

   Early audits waved through AI usage under generic “change management” and “access control” controls. That honeymoon is ending:
   - You’ll be asked for data flow diagrams that include AI services.
   - You’ll need evidence of prompt logging controls, retention policies, and DPIA/PIA (privacy impact assessment) for AI features.
   - “We trust $VENDOR” will not satisfy “how do you prevent accidental PHI/PII leakage?”
3. Policy lag is starting to hurt velocity

   Engineering teams are building faster than legal/security can reason about:
   - Who owns AI usage policy?
   - What “sensitive data” means in an embedding world.
   - How to approve a new model in under 30 days.

   If you don’t put a light governance layer in place now, you’ll end up with a hard freeze later when your first real incident or audit hits.
What’s actually changed (not the press release)
Three non-marketing shifts:
1. Model risk is now heterogeneous, not generic

   A year ago, “using LLMs” was one risk bucket. Now:
   - Different model families (closed API, self-hosted open weights, frontier vs small task-specific) have different:
     - Data retention behaviors
     - Re-identification risk
     - Prompt injection / jailbreak profiles
   - Governance needs to distinguish:
     - Inference-only, stateless models
     - Fine-tuned or RAG systems with your own data
     - Agentic / tool-using systems that can perform actions
2. Regulators are converging on “same rules, new surface area”

   There is no unified “AI law” that magically replaces existing frameworks:
   - GDPR/CCPA expectations (data minimization, purpose limitation, right to delete) now apply to:
     - Prompt logs
     - Embedding stores
     - Fine-tuned model artifacts
   - SOC2 / ISO 27001 are being interpreted to cover:
     - Third-party AI vendors as sub-processors
     - “Shadow AI” as shadow IT

   The bar isn’t new; it’s just being extended to LLM stacks, and your current controls probably don’t map cleanly.
3. Policy-as-code for AI is graduating from slideware to necessity

   Basic runtime enforcement is becoming table stakes:
   - Blocking PII from leaving the boundary (DLP/regex/classifiers in front of model APIs)
   - Routing sensitive workloads to specific models/environments
   - Enforcing retention and redaction on logs and traces

   The change: auditors and incident responders now expect machine-enforced policies, not just “we tell devs not to send SSNs to ChatGPT.”
How it works (simple mental model)
Think about AI governance along three orthogonal axes:
1. Data lifecycle

   For each AI-related artifact:
   - Prompts
   - Responses
   - Embeddings
   - Vector DB contents
   - Fine-tuned model weights

   Ask four questions:
   - Origin: Where did this data come from? (system of record, user input, third-party)
   - Sensitivity: What classification? (public, internal, confidential, regulated)
   - Boundary: Does it ever leave your VPC/tenant? If yes, to whom (and under what DPA)?
   - Timer: How long are you keeping it, and where is that enforced?
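The four questions translate directly into a per-artifact record you can store alongside the artifact itself. A minimal sketch in Python; all field names and the example values are illustrative, not a standard schema:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ArtifactRecord:
    """Answers to the four lifecycle questions for one AI artifact."""
    artifact: str               # "prompt", "response", "embedding", "vector_db", "ft_weights"
    origin: str                 # Origin: "system_of_record", "user_input", "third_party"
    sensitivity: str            # Sensitivity: "public", "internal", "confidential", "regulated"
    leaves_boundary: bool       # Boundary: does it exit your VPC/tenant?
    recipient_dpa: Optional[str]  # if it leaves: who receives it, under which DPA
    retention_days: int         # Timer: how long it is kept
    retention_enforced_by: str  # where the timer actually runs, e.g. a lifecycle rule

# Hypothetical example: prompts sent to a SaaS model under a DPA, purged after 30 days
prompt_record = ArtifactRecord(
    artifact="prompt",
    origin="user_input",
    sensitivity="confidential",
    leaves_boundary=True,
    recipient_dpa="vendor-dpa-2024",
    retention_days=30,
    retention_enforced_by="log_pipeline_ttl",
)
```

The point is less the exact shape than that every artifact gets all four answers recorded somewhere machine-readable.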
2. Model risk profile

   Treat each model like a third-party microservice with a risk vector:
   - Deployment: SaaS API vs private cloud vs on-prem
   - Training behavior: May it train on your data? For how long is it cached?
   - Capabilities: Can it write code, call tools, generate SQL? That’s execution risk.
   - Explainability/auditability: Can you reproduce a decision? Snapshot a version?

   Represent this as a model registry with metadata, not a spreadsheet in someone’s drive.
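A registry entry can be as simple as structured metadata keyed by use case, with the risk vector above as fields. A sketch; the model IDs, field names, and capability labels are made up for illustration:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ModelEntry:
    model_id: str            # pin a version ("vendor-large@2024-06"), not just a name
    deployment: str          # "saas_api" | "private_cloud" | "on_prem"
    may_train_on_inputs: bool
    vendor_cache_days: int   # how long the provider retains prompts/responses
    capabilities: frozenset  # e.g. {"code", "tools", "sql"} -> execution risk
    reproducible: bool       # can you snapshot/pin this version for audits?

REGISTRY = {
    "support-summarizer": ModelEntry(
        model_id="vendor-large@2024-06",   # hypothetical model
        deployment="saas_api",
        may_train_on_inputs=False,
        vendor_cache_days=30,
        capabilities=frozenset({"code"}),
        reproducible=True,
    ),
}

def execution_risk(entry: ModelEntry) -> bool:
    """Tool use, SQL, or code generation means the model can act, not just answer."""
    return bool(entry.capabilities & {"tools", "sql", "code"})
```

Because it lives in code, the registry can be reviewed in PRs and queried by the enforcement layer, which a spreadsheet cannot.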
3. Policy-as-code control points

   You need 3–4 enforcement layers:
   - At ingestion (before the model):
     - PII detection & redaction
     - Allowed-input rules (no secrets, API keys, tokens)
   - At routing:
     - “If data sensitivity = X, only allow models in environment Y”
     - “If tenant = EU, only route to EU-hosted models”
   - At output:
     - Safety filters (toxicity, hallucination-risk heuristics)
     - Data classification of outputs if they’re stored or re-used
   - At persistence:
     - Log retention and redaction
     - Embedding retention policies
     - Fine-tuned model versioning and rollbacks

   If you can’t point to code or configuration that implements policy at these points, you have a paper policy, not governance.
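The routing layer, for example, can be a deny-by-default lookup that the gateway consults before any call. A sketch with invented model and region names, assuming sensitivity tiers like the S0–S3 scheme later in this piece:

```python
# Policy-as-code at the routing control point: deny by default, then allow
# only models explicitly cleared for this sensitivity/region combination.
ALLOWED = {
    # (sensitivity, tenant_region) -> models cleared for that combination
    ("S0", "any"): {"saas-fast", "saas-large", "private-small"},
    ("S2", "us"):  {"private-small"},        # sensitive data: private env only
    ("S2", "eu"):  {"private-small-eu"},     # EU tenants: EU-hosted models only
}

def route(sensitivity: str, region: str, requested_model: str) -> str:
    allowed = ALLOWED.get((sensitivity, region)) or ALLOWED.get((sensitivity, "any"), set())
    if requested_model not in allowed:
        raise PermissionError(
            f"{requested_model} not approved for {sensitivity}/{region}")
    return requested_model
```

A rule you can raise an exception from is also a rule you can show an auditor; that is the whole difference from a paper policy.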
Where teams get burned (failure modes + anti-patterns)
1. “Prompt logs are just dev tools”
Pattern:
– Teams ship prompt observability for debugging and product analytics.
– Logs are stored in a general-purpose logging system with 90–365 day retention.
– Prompts routinely contain:
– Customer emails
– Internal URLs
– Database keys and IDs
Failure:
– Security runs a periodic log review and discovers regulated data in unapproved systems.
– Now you have:
– A potential data breach scope explosion.
– A remediation nightmare (how do you “delete” scattered prompts?).
Anti-patterns:
– Treating prompts as non-production data.
– Not applying the same retention/classification rules as application logs.
2. “We scrub PII, so we’re compliant”
Pattern:
– A DLP-like layer removes names, emails, SSNs from prompts.
– The team declares the system “de-identified” and reuses embeddings/responses widely.
Failure:
– Quasi-identifiers (role, department, timestamps, unique behaviors) allow re-identification.
– For some regulators, any data that could be re-attributed counts as personal data.
Anti-patterns:
– Conflating PII removal with anonymous data.
– Ignoring linkage attacks when the same user interacts often.
3. Shadow fine-tuning
Pattern:
– A small team exports historical tickets/chat logs.
– They fine-tune an open-weight model on internal infra “for better support answers.”
– There’s no DPIA, no DPA update, no data minimization.
Failure:
– Fine-tuned weights now embed customer issues and sometimes verbatim text.
– There is no documented way to:
– Remove a specific customer’s data from the model.
– Prove to an auditor that a deletion request was honored.
Anti-patterns:
– Treating fine-tuned models like stateless APIs.
– No registry or ownership for custom models.
4. “Legal said no AI” / “Security will figure it out later”
Pattern:
– Central policy is a blanket prohibition.
– Teams quietly use unmanaged tools and APIs to ship features anyway.
– Central functions are blind until an incident.
Failure:
– You end up with maximum risk:
– No central controls.
– No visibility.
– Harder cleanup after the fact.
Anti-patterns:
– Policy that doesn’t acknowledge productivity pressure.
– Governance as a gate, not as paved roads.
Practical playbook (what to do in the next 7 days)
This assumes you already have at least one LLM use case in production or close to it.
Day 1–2: Inventory and classify
1. Create an AI asset inventory

   Minimal columns (put this in a repo, not a slide deck):
   - Use case name
   - Owner (team + person)
   - Models used (name, provider, version)
   - Data sources touched (with sensitivity: public/internal/confidential/regulated)
   - Where prompts, responses, and embeddings are stored
   - Retention settings
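Kept in a repo, each row is just structured data a CI check can lint for completeness. A hypothetical entry with the columns above; every name and value here is illustrative:

```python
# One inventory row per use case; CI fails the build if required columns are missing.
REQUIRED = {"use_case", "owner", "models", "data_sources", "stores", "retention"}

INVENTORY = [
    {
        "use_case": "ticket-summarization",
        "owner": "support-platform / a.example",  # team + person (hypothetical)
        "models": [{"name": "vendor-large", "provider": "vendor", "version": "2024-06"}],
        "data_sources": [{"source": "zendesk_tickets", "sensitivity": "confidential"}],
        "stores": {"prompts": "log-pipeline", "responses": "log-pipeline", "embeddings": "none"},
        "retention": {"prompts_days": 30, "responses_days": 30},
    },
]

def lint(inventory):
    """Return the use cases that are missing required columns."""
    return [row.get("use_case", "<unnamed>")
            for row in inventory if not REQUIRED <= row.keys()]
```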
2. Tag sensitivity levels

   Use a simple, ruthless scheme:
   - S0: No customer or employee data
   - S1: Internal but non-sensitive (docs, marketing copy)
   - S2: Customer data, low risk (no special category)
   - S3: Regulated / special category (health, financial, minors, etc.)

   Map each use case + datastore to S0–S3.
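Making the levels ordered in code means rules like “S2 and above” become plain comparisons. A minimal sketch using Python’s `IntEnum`; the highest-sensitivity datastore a use case touches determines its effective level:

```python
from enum import IntEnum

class Sensitivity(IntEnum):
    S0 = 0  # no customer or employee data
    S1 = 1  # internal but non-sensitive
    S2 = 2  # customer data, low risk
    S3 = 3  # regulated / special category

def effective_level(datastore_levels):
    """A use case inherits the highest sensitivity of any datastore it touches."""
    return max(datastore_levels, default=Sensitivity.S0)
```

With `IntEnum`, a routing rule such as “S2 or higher needs an approved model” is just `if level >= Sensitivity.S2: ...`.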
Day 3–4: Establish policy-as-code guardrails
1. Front-door LLM client / gateway

   Introduce a single internal library or service that all LLM calls must go through (even if it’s thin at first). It should:
   - Attach model metadata (provider, region, retention guarantees).
   - Route based on sensitivity: if sensitivity >= S2 then allowed_models = {list}
   - Optionally log metadata only (who/when/which model) separately from content.
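A thin first version of that front door might look like the sketch below. `transport` stands in for whatever provider SDK you actually call, and the approved-model set is hypothetical:

```python
import datetime

APPROVED_FOR_SENSITIVE = {"private-small"}   # hypothetical model IDs

def llm_call(prompt, *, model, sensitivity, user,
             transport=lambda model, prompt: f"<response from {model}>"):
    """Single choke point for all LLM traffic.

    `transport` is a placeholder for the real provider SDK call.
    """
    # Route based on sensitivity: S2+ may only use approved models.
    if sensitivity >= 2 and model not in APPROVED_FOR_SENSITIVE:
        raise PermissionError(f"{model} not approved for S{sensitivity} data")
    # Audit log holds metadata only -- who/when/which model -- never prompt content.
    audit = {"user": user, "model": model,
             "ts": datetime.datetime.now(datetime.timezone.utc).isoformat()}
    return transport(model, prompt), audit
```

Even this thin a wrapper buys you the thing that matters: one place to add redaction, routing, and logging later without touching every call site.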
2. Implement minimal content controls

   For S2/S3 workloads, at ingestion:
   - Add PII detection (regex + ML-based if you have it).
   - Block obviously dangerous categories (API keys, tokens, secrets).

   At output, add basic filters:
   - Block responses containing secrets from known patterns.
   - Flag high-risk outputs for human review in early stages.

   These can be simple: a service wrapper and a few middleware steps. You can harden over time.
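The regex layer really can be a few lines. A sketch; these patterns are deliberately narrow examples, and a real deployment needs broader coverage plus an ML-based classifier behind them:

```python
import re

# Illustrative patterns only -- not an exhaustive PII/secret catalog.
PATTERNS = {
    "email":   re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "ssn":     re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "api_key": re.compile(r"\b(?:sk|pk)[-_][A-Za-z0-9]{16,}\b"),
}

def redact(text):
    """Replace matches with a typed placeholder; return (clean_text, hit_labels)."""
    hits = []
    for label, pattern in PATTERNS.items():
        if pattern.search(text):
            hits.append(label)
            text = pattern.sub(f"[{label.upper()}]", text)
    return text, hits

redact("contact jane@example.com, SSN 123-45-6789")
# -> ("contact [EMAIL], SSN [SSN]", ["email", "ssn"])
```

Run it at ingestion for S2/S3 workloads and again over outputs before they are stored; the `hits` list doubles as evidence of what the control actually caught.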
Day 5: Fix logging and retention
1. Align logs with your existing policies

   For each LLM-related log store:
   - Set explicit retention (e.g., 30 or 90 days, not “forever”).
   - Restrict access (prod-only, least privilege).

   For sensitive workloads, either:
   - Don’t log full prompts/responses by default, or
   - Redact before logging.
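Retention should be enforced by a job, not a document. A sketch of the purge side, assuming a per-store retention table and records carrying timestamps; store names and day counts are examples:

```python
from datetime import datetime, timedelta, timezone

# Explicit per-store retention; "forever" is simply not representable here.
RETENTION_DAYS = {"prompt_logs": 30, "trace_store": 90}

def expired(records, store, now=None):
    """Return IDs of records past the store's retention window.

    `records` is an iterable of (record_id, created_at) pairs
    with timezone-aware datetimes.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=RETENTION_DAYS[store])
    return [rid for rid, created in records if created < cutoff]
```

Run this from a scheduled job and delete what it returns. The value is that the timer lives in code you can show an auditor, alongside the retention table itself.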
2. **Decide on embedding
