Connect your AI

One thing changes: the API address

Your applications already call a cloud AI (Anthropic, OpenAI, Mistral…). Each message can carry a name, an amount, a contract. CLEVYA sits in front of those calls, on your server: you point your applications at the CLEVYA component instead of the provider’s API, and you keep your key and your provider. Your request formats do not change.


Today	Your applications call the provider’s API directly. The real data leaves in clear text, and nobody knows exactly what was sent.
With CLEVYA	Your applications call the CLEVYA component, which runs on your premises. It alone talks to the provider, with your key. The table that maps a token to its real value never leaves your server.
For your developers	One environment variable: the API address. No rewrite, no migration.
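As an illustration of the "one environment variable" change, here is a minimal sketch of an application resolving its API address at startup. The variable name `AI_API_BASE_URL`, the local address `http://localhost:8080`, and the `/v1/messages` path are assumptions made for the example, not documented CLEVYA or SDK settings; check which variable your own SDK or application actually reads.

```python
import os

# Assumption: the provider's public endpoint, used when no override is set.
PROVIDER_DEFAULT = "https://api.anthropic.com"


def resolve_base_url(env=os.environ, var="AI_API_BASE_URL"):
    """Return the API address the application should call.

    `AI_API_BASE_URL` is a hypothetical variable name chosen for this sketch.
    """
    return env.get(var, PROVIDER_DEFAULT).rstrip("/")


def messages_endpoint(env=os.environ):
    # The request path is unchanged; only the address in front of it differs.
    return resolve_base_url(env) + "/v1/messages"


if __name__ == "__main__":
    # Today: no override, the call goes straight to the provider.
    print(messages_endpoint({}))
    # With a local component (hypothetical address): same path, local host.
    print(messages_endpoint({"AI_API_BASE_URL": "http://localhost:8080"}))
```

The application code that builds and sends the request stays the same in both cases, which is the point of routing through an address rather than a new client library.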

The flow, in one diagram

What a call becomes when CLEVYA sits in front of it (firewall mode). Nothing inside the “On your premises” frame leaves your server; only tokens go out to the cloud.

  ┌──────────────────── On your premises (your server) ───────────────────────┐
  │                                                                           │
  │   Your request  ──►  Local anonymization  ──►  (tokens)  ───────────────┐ │
  │   (real data)        names/amounts/IBANs                                │ │
  │                      replaced with tokens                              │ │
  │                                                                        ▼ │
  │   Response  ◄──  Local detokenization  ◄──  (tokens)  ◄───────────  Cloud LLM
  │   (readable)     real values restored                               (Anthropic,
  │                  on your premises                                    OpenAI, ...)
  └───────────────────────────────────────────────────────────────────────────┘

  The token → real-value table stays inside the frame. The cloud only ever sees tokens.

In audit mode, the component does not anonymize: it observes and records what would leave, without blocking anything. In fully local mode, the “Cloud LLM” is replaced by a model running on your premises (Ollama): nothing leaves, not even anonymized data.

Two modes, one component

The same component runs on your premises. You start in observation, then turn on protection when you are ready.

Audit mode

CLEVYA observes without blocking anything. For each call, it records what left: how many identifiers were detected, what would have been masked, and what was kept, all without ever storing the real value. You get a compliance report to present to your data protection officer (DPO). This makes the risk tangible without breaking anything.
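To make the idea of "recording what left without storing the real value" concrete, here is a minimal sketch of an audit record that keeps only counts per identifier type. The two regular expressions, the record fields, and the `audit_record` function are hypothetical and deliberately simplistic; they are not CLEVYA's detection or report format.

```python
import re

# Assumption: two toy detectors standing in for real detection.
DETECTORS = {
    "IBAN": re.compile(r"\b[A-Z]{2}\d{2}[A-Z0-9]{11,30}\b"),
    "AMOUNT": re.compile(r"\b\d+(?:[.,]\d{2})?\s?(?:EUR|USD)\b"),
}


def audit_record(request_id, text):
    """Count detected identifiers per type; never copy the matched text."""
    detected = {
        kind: len(pattern.findall(text)) for kind, pattern in DETECTORS.items()
    }
    return {
        "request_id": request_id,
        "detected": detected,
        "total": sum(detected.values()),
    }


if __name__ == "__main__":
    message = "Pay 1200,50 EUR to FR7630006000011234567890189 today."
    print(audit_record("req-001", message))
```

The record holds counts and a request id only, so it can be shared in a report without re-exposing the data it describes.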

Firewall mode

CLEVYA replaces each sensitive identifier locally with a consistent token before sending: a name becomes [PERSON_1], an IBAN becomes [IBAN_1], the same everywhere. The model reasons over the tokens, its answer comes back, the real values are restored on your premises. The AI provider never sees a single real identifier.

See Local anonymization for the detail of detection and substitution.
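The substitution and restoration described for firewall mode can be sketched as follows. This is an illustration under stated assumptions, not CLEVYA's implementation: the `Tokenizer` class is hypothetical, names are matched from a supplied list rather than detected, and the IBAN pattern is a rough approximation. What it does show is the key property: the same value always maps to the same token, and the mapping lives only in local memory.

```python
import re

# Assumption: a rough IBAN shape (country code, check digits, account part).
IBAN_RE = re.compile(r"\b[A-Z]{2}\d{2}[A-Z0-9]{11,30}\b")


class Tokenizer:
    """Hypothetical local tokenizer: consistent tokens, reversible locally."""

    def __init__(self, known_names):
        self.known_names = known_names  # stand-in for real name detection
        self.value_to_token = {}
        self.token_to_value = {}  # the table that must stay on your server
        self.counters = {}

    def _token_for(self, kind, value):
        # Reuse the existing token so a value reads the same everywhere.
        if value not in self.value_to_token:
            self.counters[kind] = self.counters.get(kind, 0) + 1
            token = f"[{kind}_{self.counters[kind]}]"
            self.value_to_token[value] = token
            self.token_to_value[token] = value
        return self.value_to_token[value]

    def anonymize(self, text):
        text = IBAN_RE.sub(lambda m: self._token_for("IBAN", m.group()), text)
        for name in self.known_names:
            if name in text:
                text = text.replace(name, self._token_for("PERSON", name))
        return text

    def restore(self, text):
        for token, value in self.token_to_value.items():
            text = text.replace(token, value)
        return text


if __name__ == "__main__":
    t = Tokenizer(["Marie Dupont"])
    sent = t.anonymize(
        "Marie Dupont pays from FR7630006000011234567890189. "
        "Confirm with Marie Dupont."
    )
    print(sent)  # what would go to the cloud model
    print(t.restore("Transfer for [PERSON_1] via [IBAN_1] approved."))
```

Because the model's answer refers to the same tokens it received, restoring the real values is a plain lookup in the local table.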

The rule we never cross

The component runs on your premises, never on a CLEVYA server. If your traffic went through our infrastructure, your data would transit through us, and we would be exactly the third party this tool exists to avoid. The mapping between a token and its real value stays on your server, period.

You scale to your need

The component is lightweight: the model is the cloud you already use. You only move up a tier when you need to.

Tier	What runs on your premises	For whom
Audit	The lightweight component, observing. A PC or a small server is enough.	“I want to see what leaks first.”
Firewall	The component + local anonymization. Still lightweight.	“Anonymize before sending to the cloud.”
Fully local	The component + a model running on your premises (Ollama). A real server required.	“Nothing leaves, not even anonymized.”

You pay for AI usage (the model's tokens) directly to your provider, under your own key; see BYOAK. The CLEVYA subscription covers the component, its updates and support.

What is honest to say

We oversell nothing. Here is the exact boundary of this building block.

Proven in the lab. The full cycle was measured against a provider’s real API (Anthropic): a real message anonymized locally, tokens sent, response reconstructed on our side, compliance report with a real request id.
Detection recall is not 100%. On a corpus of synthetic multi-domain documents written for the test, detection masks around 94.2% of sensitive data; on an independent public benchmark, the rate is around 75%. The rest is caught by human review and a per-client blocklist. This is not zero risk.
Code cannot be masked. Anonymization covers typed identifiers (names, amounts, IBANs), not source code: tokenizing code makes it unusable by the model. For sensitive code, the only real protection is fully local mode.
The web leaves. If an agent fetches a web page, that request goes out to the Internet. CLEVYA controls what goes to the model, not a network call from a third-party tool you would authorize.
Two residual limits. Re-identification by combining details left in clear text, and sensitive data that is untyped and unknown (an in-house project name, a clause). The answer to both: fully local mode, where nothing leaves.
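To put the recall figures above in concrete terms, the short calculation below estimates how many identifiers automatic detection would miss at each stated rate. The volume of 1,000 identifiers is an assumption chosen purely for illustration, and the result is what remains before human review and the blocklist are applied.

```python
def expected_missed(identifiers, recall):
    """Identifiers left unmasked by automatic detection alone."""
    return round(identifiers * (1 - recall))


if __name__ == "__main__":
    volume = 1000  # hypothetical number of sensitive identifiers sent
    for label, recall in [("synthetic corpus", 0.942), ("public benchmark", 0.75)]:
        print(label, expected_missed(volume, recall))
```

At these rates, roughly 58 and 250 identifiers out of 1,000 would pass undetected, which is why review, the blocklist, and fully local mode remain part of the picture.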

Going further

Data sovereignty - the principle and what really leaves.
Local anonymization - how detection works.
Egress journal - the verifiable trace of what left.
BYOAK - your key, your provider, your bill.