How to Redact PII from OpenAI API Calls

April 3, 2026 · 5 min read

Every time your application calls the OpenAI API with user data, that data lands on OpenAI's servers, in its logs, and potentially in its training pipelines. Names, email addresses, phone numbers, Social Security numbers, credit card numbers: all of it.

With EU AI Act enforcement starting in August 2026, on top of existing regulations like GDPR, CCPA, and HIPAA, sending unredacted PII to a third-party model is becoming a legal liability, and redaction is no longer just a best practice.

This guide shows three approaches to redacting PII from OpenAI API calls, from manual to fully automated.

The Problem

Here's a typical customer support summarization call:

import openai

# ticket_text is the raw, unredacted support ticket
response = openai.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{
        "role": "user",
        "content": f"Summarize this ticket: {ticket_text}"
    }]
)

If ticket_text contains "Hi, my name is Sarah Johnson, email sarah.j@acme.com, SSN 078-05-1120", all of that goes to OpenAI. Even if OpenAI doesn't train on API data, the data still traverses their infrastructure, sits in logs, and is subject to their data processing terms.

Approach 1: Manual Regex (Fragile)

import re

def strip_pii(text):
    text = re.sub(r'\b[\w.-]+@[\w.-]+\.\w+\b', '[EMAIL]', text)
    text = re.sub(r'\b\d{3}-\d{2}-\d{4}\b', '[SSN]', text)
    text = re.sub(r'\b\d{3}[-.]?\d{3}[-.]?\d{4}\b', '[PHONE]', text)
    return text

Problems:
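
A few failure modes can be checked directly. The sketch below reuses strip_pii from above unchanged; the sample strings are made up for illustration. It shows three things: the name survives because no pattern targets names, a phone number written with parentheses slips through, and an ordinary 10-digit order number gets flagged as a phone number.

```python
import re

def strip_pii(text):
    text = re.sub(r'\b[\w.-]+@[\w.-]+\.\w+\b', '[EMAIL]', text)
    text = re.sub(r'\b\d{3}-\d{2}-\d{4}\b', '[SSN]', text)
    text = re.sub(r'\b\d{3}[-.]?\d{3}[-.]?\d{4}\b', '[PHONE]', text)
    return text

# 1. Names are never touched: there is no pattern for them.
ticket = "Hi, my name is Sarah Johnson, email sarah.j@acme.com, SSN 078-05-1120"
redacted = strip_pii(ticket)

# 2. A common phone format is missed entirely (false negative).
missed_phone = strip_pii("Call me at (555) 123-4567")

# 3. A non-PII number is redacted anyway (false positive).
order_number = strip_pii("Order 1234567890 has shipped")

print(redacted)
print(missed_phone)
print(order_number)
```

Each new format means another pattern, and each new pattern brings more false positives, which is why this approach tends to break down as inputs get more varied.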

Approach 2: Microsoft Presidio (Better, But Assembly Required)

from presidio_analyzer import AnalyzerEngine
from presidio_anonymizer import AnonymizerEngine

analyzer = AnalyzerEngine()
anonymizer = AnonymizerEngine()

results = analyzer.analyze(text=ticket_text, language="en")
anonymized = anonymizer.anonymize(text=ticket_text, analyzer_results=results)

# Now call OpenAI with anonymized.text
# But you still need to: build the proxy, handle streaming,
# map placeholders back, handle tool_calls, prevent evasion...

Problems:

Approach 3: Veil (One Line Change)

Veil is a drop-in proxy that handles all of this automatically. Point your client at Veil's base URL, pass your Veil and upstream keys as headers, and you're done:

import os
from openai import OpenAI

# Before
client = OpenAI(
    api_key=os.environ["OPENAI_API_KEY"],
)

# After: PII never reaches OpenAI
client = OpenAI(
    api_key=os.environ["OPENAI_API_KEY"],
    base_url="https://veil-api.com/v1",
    default_headers={
        "Authorization": f"Bearer {os.environ['VEIL_API_KEY']}",
        "x-upstream-key": os.environ["OPENAI_API_KEY"],
    }
)

What happens behind the scenes:

  1. Your app sends the request to Veil
  2. Veil detects and replaces 79+ types of PII with cryptographic tokens
  3. The sanitized request goes to OpenAI
  4. OpenAI's response comes back with tokens
  5. Veil restores the original values
  6. Your app gets a clean response with real names, emails, etc.

What OpenAI sees: <<VEIL_PERSON_a8f2c3d1e4f5>> instead of "Sarah Johnson".

What your app gets back: "Sarah Johnson" — fully restored.
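
To make the token round trip concrete, here is a minimal sketch of the general technique: replace a detected value with an opaque token, keep the mapping locally, and swap the value back into the response. This is not Veil's implementation. The tokenize and restore helpers, the vault dictionary, the <<TOKEN_EMAIL_...>> format, and the single email regex used as a detector are all assumptions made for illustration.

```python
import re
import secrets

# Stand-in detector: a single email pattern (an assumption for this sketch).
EMAIL_RE = re.compile(r"\b[\w.-]+@[\w.-]+\.\w+\b")

def tokenize(text, vault):
    """Replace each detected email with an opaque token and record it in vault."""
    def _swap(match):
        value = match.group(0)
        # Reuse the existing token so the same value maps to the same placeholder.
        for token, original in vault.items():
            if original == value:
                return token
        token = f"<<TOKEN_EMAIL_{secrets.token_hex(6)}>>"  # hypothetical format
        vault[token] = value
        return token
    return EMAIL_RE.sub(_swap, text)

def restore(text, vault):
    """Swap every known token in the model's reply back to its original value."""
    for token, original in vault.items():
        text = text.replace(token, original)
    return text

vault = {}
outbound = tokenize("Reply to sarah.j@acme.com and cc sarah.j@acme.com", vault)

# Simulated model reply that echoes the token it was given.
reply = f"I will email {next(iter(vault))} today."
restored = restore(reply, vault)
print(outbound)
print(restored)
```

Reusing one token per distinct value matters: the model can still tell that two mentions refer to the same person or address, even though it never sees the real value.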

What Veil Catches That Regex Doesn't

Category          Examples
Personal info     Names, emails, phones, SSNs, addresses across 18 countries
Financial         Credit cards, IBANs, bank account numbers, routing numbers
Government IDs    Passports, driver's licenses, national IDs (US, UK, DE, IT, IN, KR, etc.)
Secrets           AWS keys, GitHub tokens, Stripe keys, GCP keys, JWTs, private keys
Crypto            Ethereum, Bitcoin, Litecoin, Monero wallets
Evasion attempts  Zero-width chars, Cyrillic homoglyphs, accent-based evasion
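
The evasion row is worth a quick illustration. In the sketch below, a single zero-width space inside an SSN is enough to defeat the regex from Approach 1, and stripping zero-width characters before matching recovers it. The normalize helper and the list of characters it removes are assumptions for this example, not a description of how Veil works. Homoglyphs such as Cyrillic look-alikes are not handled here and would need a separate confusables mapping.

```python
import re
import unicodedata

# Zero-width characters to delete (illustrative, not exhaustive).
ZERO_WIDTH = dict.fromkeys(map(ord, "\u200b\u200c\u200d\u2060\ufeff"))

SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

def normalize(text):
    """Apply NFKC normalization, then drop zero-width characters."""
    text = unicodedata.normalize("NFKC", text)
    return text.translate(ZERO_WIDTH)

# Looks identical to "SSN 078-05-1120" on screen, but hides a zero-width space.
evasive = "SSN 078-05\u200b-1120"

before = SSN_RE.search(evasive)            # no match: the hidden character breaks the pattern
after = SSN_RE.search(normalize(evasive))  # matches once the text is cleaned

print(before)
print(after.group(0))
```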

Works with Any Provider

Veil isn't just for OpenAI. Set the x-upstream-provider header to route through Anthropic, Together, Groq, Mistral, DeepSeek, Fireworks, Perplexity, or any of 41 supported providers. Same code, same PII protection.

Compliance Coverage

Try Veil Free

100 requests/month on the free tier. No credit card required. Change one URL and your LLM calls are compliant.
