
Local anonymization

Detect then substitute, locally

Anonymization happens in two steps, entirely on your server, before anything goes to a language model:

  1. Detection. Deterministic rules run first: an IBAN, an amount, or an email address follows a format that a regular expression can recognize. Whatever remains ambiguous then goes to a named-entity recognition model that runs offline, so detection sends no outgoing network packet.
  2. Substitution. Each detected value is replaced with a typed, consistent token: [PERSON_1], [SALARY_1], [IBAN_1]. The same real name always receives the same token within a document. This lets the model keep the relational structure, which keeps the response useful.
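
The two steps above can be sketched in a few lines of Python. This is a minimal illustration, not the product's implementation: the regular expressions are deliberately simplified, the names `RULES`, `detect`, `substitute`, and `toy_ner` are hypothetical, `toy_ner` is only a stand-in for a real offline NER model, and the email address in the sample text is a placeholder.

```python
import re
from typing import Callable, Dict, List, Tuple

Span = Tuple[int, int, str]  # (start, end, entity type)

# Simplified, illustrative patterns (assumptions, not production-grade rules).
RULES = [
    ("IBAN", re.compile(r"\b[A-Z]{2}\d{2}(?: ?[A-Z0-9]{4}){2,7}(?: ?[A-Z0-9]{1,3})?\b")),
    ("EMAIL", re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")),
    ("AMOUNT", re.compile(r"\b\d{1,3}(?:,\d{3})*(?:\.\d+)? ?(?:EUR|USD)\b")),
]


def overlaps(start: int, end: int, spans: List[Span]) -> bool:
    return any(start < e and s < end for s, e, _ in spans)


def detect(text: str, ner: Callable[[str], List[Span]] = lambda t: []) -> List[Span]:
    """Step 1: deterministic rules first, then NER on what the rules did not claim."""
    spans: List[Span] = []
    for etype, pattern in RULES:
        for m in pattern.finditer(text):
            if not overlaps(m.start(), m.end(), spans):
                spans.append((m.start(), m.end(), etype))
    for start, end, etype in ner(text):
        if not overlaps(start, end, spans):
            spans.append((start, end, etype))
    return sorted(spans)


def substitute(text: str, spans: List[Span]) -> Tuple[str, Dict[str, str]]:
    """Step 2: replace each span with a typed token; same value, same token."""
    token_for: Dict[Tuple[str, str], str] = {}
    counters: Dict[str, int] = {}
    parts, cursor = [], 0
    for start, end, etype in spans:
        key = (etype, text[start:end])
        if key not in token_for:
            counters[etype] = counters.get(etype, 0) + 1
            token_for[key] = f"[{etype}_{counters[etype]}]"
        parts += [text[cursor:start], token_for[key]]
        cursor = end
    parts.append(text[cursor:])
    # The token -> real value table is what stays on the server.
    mapping = {token: value for (_, value), token in token_for.items()}
    return "".join(parts), mapping


def toy_ner(text: str) -> List[Span]:
    # Stand-in for an offline NER model: only knows one hard-coded name.
    return [(m.start(), m.end(), "PERSON") for m in re.finditer("Jean Dupont", text)]


text = (
    "Note from Jean Dupont (payroll manager): transfer the 4,500 EUR bonus "
    "to IBAN FR76 3000 6000 0112 3456 7890 189, contact someone@example.com. "
    "Jean Dupont approved."
)
masked, mapping = substitute(text, detect(text, toy_ner))
print(masked)
```

Both mentions of the name receive the same token, which is the consistency property described in step 2.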

A rendering example: before / after

Original text (stays on your server):

Note from Jean Dupont (payroll manager): transfer the 4,500 EUR bonus
to IBAN FR76 3000 6000 0112 3456 7890 189, contact [email protected].

Text sent to the model (anonymized locally):

Note from [PERSON_1] (payroll manager): transfer the [AMOUNT_1] bonus
to IBAN [IBAN_1], contact [EMAIL_1].

The tokens replace the sensitive values; the non-sensitive context (“payroll manager”, “transfer the bonus”) stays in clear text. The table that maps each token to its real value never leaves your server:

| Token | Type | Real value (local, never sent) |
| --- | --- | --- |
| [PERSON_1] | Person | Jean Dupont |
| [AMOUNT_1] | Amount | 4,500 EUR |
| [IBAN_1] | IBAN | FR76 3000 6000 0112 3456 7890 189 |
| [EMAIL_1] | Email | [email protected] |

When the model’s response comes back, the tokens are replaced with the real values on your server, before anything is displayed. The AI provider never sees a single real identifier.
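
The return path can be sketched as a single regular-expression pass over the model's response. This is an illustrative sketch under assumptions: the `restore` function, the `TOKEN` pattern, and the sample response are hypothetical, and the choice to leave unknown tokens untouched is ours, not something the page specifies.

```python
import re
from typing import Dict

# Matches typed tokens such as [PERSON_1] or [IBAN_12].
TOKEN = re.compile(r"\[[A-Z]+_\d+\]")


def restore(response: str, mapping: Dict[str, str]) -> str:
    """Swap tokens back to real values, locally, before display.

    One pass means a restored value is never re-scanned for tokens.
    Tokens absent from the mapping are left as-is rather than guessed.
    """
    return TOKEN.sub(lambda m: mapping.get(m.group(0), m.group(0)), response)


# Local mapping table (never sent to the provider).
mapping = {"[PERSON_1]": "Jean Dupont", "[AMOUNT_1]": "4,500 EUR"}
response = "Dear [PERSON_1], the [AMOUNT_1] bonus is scheduled. Regards, [PERSON_2]"
restored = restore(response, mapping)
print(restored)
```

Here `[PERSON_2]` has no entry in the table, so it survives unchanged and can be flagged for review instead of being silently replaced.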

Why typed tokens, not [XXX]

Masking with an opaque marker ([XXX]) destroys meaning. A typed and consistent token ([PERSON_1]) lets the model understand “who is talking to whom” without knowing the real identity. This is what makes anonymization useful and not just protective.

What the model is still able to do (and what it cannot)

| What the model still does well on tokens | What the model cannot / should not do |
| --- | --- |
| File summarization | Numeric computation on masked values |
| Drafting a response | Verifying a real IBAN |
| Classification | Reasoning over external knowledge tied to the real identity |
| Understanding relationships | Deduplicating on the real name |

Sensitive numeric computation (raising a salary by 5%, summing values) is not done on tokens: it is done in local code on the real value, which gives an exact and deterministic result.
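
As an illustration of such local computation, here is a minimal sketch using Python's `decimal` module. The `local_values` table, the `raise_by_percent` helper, the sample salaries, and the half-up rounding to two decimals are all assumptions made for the example.

```python
from decimal import Decimal, ROUND_HALF_UP
from typing import Dict

# Hypothetical local table: token -> real numeric value (never sent to the model).
local_values: Dict[str, Decimal] = {
    "[SALARY_1]": Decimal("4500.00"),
    "[SALARY_2]": Decimal("3120.50"),
}


def raise_by_percent(amount: Decimal, percent: Decimal) -> Decimal:
    """Apply a percentage raise with exact decimal arithmetic.

    Rounding half-up to two decimals is an assumption for this sketch.
    """
    raised = amount * (Decimal(100) + percent) / Decimal(100)
    return raised.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


new_salary = raise_by_percent(local_values["[SALARY_1]"], Decimal("5"))
total = sum(local_values.values(), Decimal("0"))
print(new_salary, total)
```

The model only ever sees `[SALARY_1]`; the arithmetic runs on the real value in local code, so the same inputs always produce the same result.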

The pitfall of over-masking

Masking too much has a real cost: an unreadable response, or a heavier burden of human review. Our doctrine assumes an asymmetry: a false positive (masking something harmless) costs little; a false negative (letting a leak through) costs a great deal. To stay useful while masking, we rely on consistent, typed tokens, deterministic rules before model-based detection, and non-sensitive context left in clear text.

To be honest: on very dense free text, over-masking degrades fluency. That cost is not zero, and it should be measured.
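
One possible way to start measuring that cost is a crude masking ratio: the share of whitespace-separated words that contain a token. This is only a hypothetical proxy, not a metric the page defines; the `masking_ratio` function and the sample text are assumptions, and a ratio alone says nothing about whether each mask was justified.

```python
import re

# Matches typed tokens such as [PERSON_1] or [AMOUNT_3].
TOKEN = re.compile(r"\[[A-Z]+_\d+\]")


def masking_ratio(masked_text: str) -> float:
    """Fraction of whitespace-separated words that contain at least one token."""
    words = masked_text.split()
    if not words:
        return 0.0
    masked_words = sum(1 for word in words if TOKEN.search(word))
    return masked_words / len(words)


sample = "Note from [PERSON_1] about [AMOUNT_1] today"
ratio = masking_ratio(sample)
print(f"{ratio:.2f}")
```

Tracking such a number across documents can point to the dense free-text cases where masking may be hurting readability and a human check is worthwhile.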
