zexigh/anonymize • LM Studio Hub

anonymize

LM Studio plugin that redacts personally identifiable information (PII) from text.

The model passes a block of text and, optionally, a list of personal names it has spotted. The plugin handles the format-bound identifiers itself (cards, NIR, IBAN, phone, email), validating them with checksums rather than relying on regex alone. Each detected item is replaced with a stable typed pseudonym, so coreference is preserved within the document.
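To make "stable typed pseudonym" concrete, here is a minimal Python sketch of the idea, not the plugin's actual code. The `Pseudonymizer` class, the `redact_emails` helper, the simplified email regex, and the `[EMAIL_1]` placeholder format (modeled on the `[CUSTOM_N]` form used for custom terms) are all assumptions made for illustration.

```python
import re
from collections import defaultdict


class Pseudonymizer:
    """Hands out one placeholder per distinct (type, value) pair."""

    def __init__(self):
        self._counters = defaultdict(int)  # last index used, per type
        self._by_value = {}                # (type, value) -> placeholder

    def pseudonym(self, kind, value):
        key = (kind, value)
        if key not in self._by_value:
            self._counters[kind] += 1
            self._by_value[key] = f"[{kind}_{self._counters[kind]}]"
        return self._by_value[key]

    def mapping(self):
        # Placeholder -> original value, the shape described for the tool's mapping.
        return {p: value for (_, value), p in self._by_value.items()}


# Deliberately simplified pattern (assumption); real detection is stricter.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def redact_emails(text, pseudonymizer):
    """Replace each email with its placeholder, reusing it on repeats."""
    return EMAIL_RE.sub(
        lambda m: pseudonymizer.pseudonym("EMAIL", m.group(0)), text
    )


p = Pseudonymizer()
print(redact_emails("Write to a@example.com or b@example.org; a@example.com replies faster.", p))
print(p.mapping())
```

Because the same original value always maps to the same placeholder, a reader (or a downstream model) can still tell that the first and third addresses refer to the same mailbox.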

Tools

anonymize_text(text, names?, custom_terms?, include_mapping?) — returns the redacted text plus a mapping from pseudonym to original value.

What's detected automatically

Type	Detection	Pseudonym
Card	13–19 digits, Luhn-validated
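As an illustration of what "Luhn-validated" means for the card row, the following Python sketch applies the standard Luhn check to contiguous runs of 13 to 19 digits. The names `CARD_RE`, `luhn_valid` and `find_cards` are hypothetical, and the sketch assumes digits without spaces or dashes; how the plugin handles separators is not stated here.

```python
import re

# Candidate: a run of 13 to 19 digits not embedded in a longer digit run.
CARD_RE = re.compile(r"(?<!\d)\d{13,19}(?!\d)")


def luhn_valid(digits: str) -> bool:
    """Standard Luhn check: double every second digit from the right."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:      # every second digit, counting from the right
            d *= 2
            if d > 9:       # same as summing the two digits of the product
                d -= 9
        total += d
    return total % 10 == 0


def find_cards(text: str) -> list[str]:
    """Return digit runs that have a card-like length and pass Luhn."""
    return [m.group(0) for m in CARD_RE.finditer(text) if luhn_valid(m.group(0))]


# 4111111111111111 is a widely used test number; the second string fails Luhn.
print(find_cards("Paid with 4111111111111111, ref 1234567890123456."))
```

A random 16-digit string passes the Luhn check only about one time in ten, which is why checksum validation removes most accidental matches that a length-only regex would accept.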

Field	Type	Default	Notes
Always-redact terms	string array	`[]`	Strings always replaced with `[CUSTOM_N]`. Matched literally.
Detect international phone numbers	boolean	`false`	When on, also redacts `+CC…` international numbers (any country, not just France). Off by default because French-only detection is more precise.
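The "matched literally" behaviour of always-redact terms can be sketched as follows. This is an assumption-laden Python illustration, not the plugin's implementation: the function name `redact_custom_terms` is hypothetical, and numbering placeholders in order of first appearance is a guess, since the table only gives the `[CUSTOM_N]` form.

```python
import re


def redact_custom_terms(text, terms):
    """Replace each configured term with a [CUSTOM_N] placeholder.

    Terms are escaped, so characters such as '(' or '.' carry no regex meaning.
    Returns the redacted text and a placeholder -> term mapping.
    """
    # Longest first, so "Project Falcon" wins over "Falcon" at the same position.
    ordered = sorted({t for t in terms if t}, key=len, reverse=True)
    if not ordered:
        return text, {}

    pattern = re.compile("|".join(re.escape(t) for t in ordered))
    ids = {}  # term -> placeholder, numbered by first appearance (assumption)

    def replace(match):
        term = match.group(0)
        if term not in ids:
            ids[term] = f"[CUSTOM_{len(ids) + 1}]"
        return ids[term]

    redacted = pattern.sub(replace, text)
    return redacted, {placeholder: term for term, placeholder in ids.items()}


print(redact_custom_terms(
    "Project Falcon (v1.0) ships Monday. Falcon is late.",
    ["Project Falcon", "Falcon", "(v1.0)"],
))
```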

Type	Precision	Recall	F1	Notes
EMAIL	99.75 %	99.23 %	99.49 %	Solid across realistic prose.
TEL (FR-only, default)	70.36 %	9.50 %	16.73 %	Most "TELEPHONENUM" gold values in the dataset are not French-formatted (`+56…`, `010…`). The default config catches French phones only.
TEL (international flag on)	93.23 %	50.77 %	65.74 %	With `Detect international phone numbers` enabled, a `+CC…` regex runs alongside the FR detector. Higher recall, with precision boosted because the broader match also satisfies stricter boundaries.
CB	12.69 %	7.33 %	9.29 %	Restricted to BIN prefixes `[3-6]` (Visa/MC/Amex/Diners/Discover/JCB/UnionPay) and Luhn-validated. The dataset generates random 16-digit strings that are mostly not Luhn-valid; we reject them. Residual false positives are identifiers that happen to pass both gates by chance — irreducible without contextual signals.
NIR	58.33 %	0.48 %	0.95 %	Same story as CB: synthetic NIRs in the dataset do not satisfy the mod-97 checksum, so we reject them. Real NIRs (with valid keys, Corsica included) are caught — see the unit tests for fixtures derived from official validators.
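For readers unfamiliar with the NIR checksum mentioned in the table, the Python sketch below shows the usual mod-97 key rule, including the common convention of mapping the Corsican department codes `2A` and `2B` to `19` and `18` before computing the key. The names `NIR_RE` and `nir_valid` are hypothetical, the pattern only accepts a leading `1` or `2` (an assumption), and the sample numbers are synthetic values built to satisfy the checksum, not the fixtures from the plugin's unit tests.

```python
import re

# 13-character body (department may be 2A or 2B) followed by a 2-digit key.
NIR_RE = re.compile(r"[12]\d{4}(?:\d{2}|2[AB])\d{6}\d{2}")


def nir_valid(nir: str) -> bool:
    """Check the shape of a French NIR and its mod-97 control key."""
    s = re.sub(r"\s", "", nir).upper()
    if not NIR_RE.fullmatch(s):
        return False
    body, key = s[:13], int(s[13:])
    # Corsica: 2A is treated as 19 and 2B as 18 for the key computation.
    body = body.replace("2A", "19").replace("2B", "18")
    return 97 - int(body) % 97 == key


print(nir_valid("1 85 05 78 006 084 91"))   # synthetic, key matches
print(nir_valid("1 85 05 2A 006 084 35"))   # synthetic Corsican example
print(nir_valid("1 85 05 78 006 084 92"))   # same body, wrong key
```

A random 15-digit string has roughly a 1 in 97 chance of carrying a matching key, which is consistent with synthetic dataset values being rejected almost entirely.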
