LM Studio plugin that redacts personally identifiable information (PII) from text.
The model passes a block of text and (optionally) a list of personal names it has spotted; the plugin handles the format-bound stuff itself (cards, NIR, IBAN, phone, email — all checksum-validated, not just regex). Each detected item is replaced with a stable typed pseudonym so coreference is preserved within the document.
anonymize_text(text, names?, custom_terms?, include_mapping?): returns the redacted text plus a mapping from pseudonym to original value.

| Type | Detection | Pseudonym |
|---|---|---|
| Card | 13–19 digits, Luhn-validated | [CB_N] |
| NIR | French numéro de sécu, mod-97 checksum | [NIR_N] |
| IBAN | 2 letters + 2 digits + body, mod-97 | [IBAN_N] |
| Phone | French formats: 0X…, +33 X…, 0033 X… | [TEL_N] |
| Email | RFC-ish | [EMAIL_N] |
Format-only detection without a checksum would produce too many false positives, so each detector validates before redacting.
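To make the "validate before redacting" idea concrete, here is a minimal Python sketch of two of the checks named in the table: Luhn for card numbers and mod-97 for IBANs. It is an illustration only, not the plugin's code. The function names `luhn_valid` and `iban_valid` are hypothetical, and the plugin's own detectors may differ in the formats they accept.

```python
import re


def luhn_valid(candidate: str) -> bool:
    """Return True if the digits in `candidate` pass the Luhn checksum."""
    digits = [int(ch) for ch in candidate if ch.isdigit()]
    if not 13 <= len(digits) <= 19:
        return False
    total = 0
    # Walk from the rightmost digit, doubling every second one.
    for position, digit in enumerate(reversed(digits)):
        if position % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def iban_valid(candidate: str) -> bool:
    """Return True if `candidate` passes the IBAN mod-97 check."""
    compact = re.sub(r"\s+", "", candidate).upper()
    # Shape check: 2 letters, 2 check digits, then an alphanumeric body.
    if not re.fullmatch(r"[A-Z]{2}\d{2}[A-Z0-9]{11,30}", compact):
        return False
    # Move country code and check digits to the end, then map A..Z to 10..35.
    rearranged = compact[4:] + compact[:4]
    numeric = "".join(str(int(ch, 36)) for ch in rearranged)
    return int(numeric) % 97 == 1
```

A candidate that only looks like a card or IBAN (right length, wrong check digits) is rejected, which is what keeps arbitrary digit runs such as order numbers from being redacted.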
- names: personal names you've identified by reading the text. The tool does no NER. Pass full names like "Jean Dupont", not just "Jean", to avoid false matches against common words.
- custom_terms: anything else you want gone, such as company names, addresses, project codenames, etc.

There's also a per-chat config field, Always-redact terms (string array): things that should be redacted in every call regardless of model input, e.g. your own name, your home address, an employer name. Useful as a safety net.
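As a rough illustration of how literal terms could be turned into stable typed pseudonyms, the Python sketch below replaces every occurrence of a listed name or custom term and reuses the same tag for repeated values. The function name `pseudonymize` is hypothetical, and this is an assumption about the general technique rather than the plugin's actual implementation. In particular, it treats every distinct string as a separate value and does not try to link a short form of a name to its full form.

```python
import re


def pseudonymize(
    text: str, names: list[str], custom_terms: list[str]
) -> tuple[str, dict[str, str]]:
    """Replace listed terms with typed, numbered pseudonyms.

    Returns the redacted text and a mapping from pseudonym to original value.
    """
    tag_for = {term: "CUSTOM" for term in custom_terms if term}
    tag_for.update({name: "NOM" for name in names if name})
    if not tag_for:
        return text, {}

    # Longest alternatives first so a full name is preferred over a shorter term.
    alternatives = sorted(tag_for, key=len, reverse=True)
    pattern = re.compile(
        r"(?<!\w)(?:" + "|".join(map(re.escape, alternatives)) + r")(?!\w)"
    )

    pseudonym_for: dict[str, str] = {}  # original value -> pseudonym
    counters: dict[str, int] = {}       # tag -> last number issued

    def substitute(match: re.Match) -> str:
        value = match.group(0)
        if value not in pseudonym_for:
            tag = tag_for[value]
            counters[tag] = counters.get(tag, 0) + 1
            pseudonym_for[value] = f"[{tag}_{counters[tag]}]"
        return pseudonym_for[value]

    redacted = pattern.sub(substitute, text)
    mapping = {pseudonym: value for value, pseudonym in pseudonym_for.items()}
    return redacted, mapping
```

The word-boundary lookarounds are there so that a term only matches as a whole word, which is one way to limit the false matches that short names would otherwise cause.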
Same value → same pseudonym, throughout one call:

```
"Jean a appelé Jean Dupont au 06 12 34 56 78. Sa CB est 4111 1111 1111 1111."
  ↓
"[NOM_1] a appelé [NOM_1] au [TEL_1]. Sa CB est [CB_1]."
```

(The model is expected to pass names: ["Jean", "Jean Dupont"], but [NOM_1] is reused because the value matched in the text is the same once dedup runs.)

The response includes a mapping so you can build a "key" file alongside the redacted version if needed:

```json
{
  "anonymized": "...",
  "counts": { "NOM": 1, "TEL": 1, "CB": 1 },
  "mapping": {
    "[NOM_1]": "Jean Dupont",
    "[TEL_1]": "06 12 34 56 78",
    "[CB_1]": "4111 1111 1111 1111"
  }
}
```

Set include_mapping: false to omit it (e.g. if you're going to forward the redacted text somewhere and don't want the mapping in your context).

The intended workflow:

1. read_file({path: "contract.md"}): get the original.
2. anonymize_text({text, names: [...]}): the model identifies names, the plugin redacts.
3. write_file({path: "contract.redacted.md", content: anonymized}): save the clean version.

Per-chat configuration:

| Field | Type | Default | Notes |
|---|---|---|---|
| Always-redact terms | string array | [] | Strings always replaced with [CUSTOM_N]. Matched literally. |

One-click (macOS, Windows): Run in LM Studio

Linux (AppImage):

```bash
lms clone zexigh/anonymize
cd anonymize
lms dev -i
```

After install: in any chat, click the tools button and enable anonymize.

License: MIT