OpenAI Privacy Filter
- OpenAI released Privacy Filter, an open-source, on-device model that detects and removes personal data before text is sent anywhere else.
- The tool uses a mixture-of-experts architecture and is published under an Apache 2.0 license for enterprise use.
- By shifting sanitization left, the filter pushes engineering teams to embed privacy controls in their workflows rather than bolt them on as downstream checks (venturebeat.com).
OpenAI on April 22 released Privacy Filter, a small open-weight model that strips personal data from text before that text leaves a device. (openai.com)

The model is built to detect personally identifiable information, or PII, in unstructured text such as logs, documents, and chat transcripts, and OpenAI said it can run locally in a single pass. OpenAI also published the code and weights for developers to run, inspect, and fine-tune in their own environments. (openai.com)

On GitHub, OpenAI said the model has 1.5 billion total parameters with 50 million active parameters, a 128,000-token context window, and an Apache 2.0 license for commercial use. The repository says it can run in a web browser or on a laptop and exposes controls that let teams trade off precision and recall. (github.com)

Privacy filtering is the step where software finds names, phone numbers, addresses, account numbers, and other identifying details and masks them before storage or analysis. OpenAI said older tools often rely on fixed patterns like email formats, while Privacy Filter uses language context to decide when a detail refers to a private person and when similar text should stay visible. (openai.com)

That distinction matters in enterprise systems that send support tickets, internal notes, code comments, and compliance records into search indexes or language models. OpenAI said developers can plug the filter into training, indexing, logging, and review pipelines instead of trying to catch sensitive data after it has already moved downstream. (openai.com)

The model card describes Privacy Filter as a bidirectional token classifier with span decoding, which means it labels each token in a sequence and then groups those labels into redaction spans. OpenAI evaluated it on PII-Masking-300k and on credential detection in codebases, and said it also ran stress tests for multilingual, adversarial, and reasoning-heavy cases. (cdn.openai.com)

OpenAI said it already uses a fine-tuned version of the model in its own privacy-preserving workflows. The company’s enterprise privacy page says business customer data in ChatGPT Business, ChatGPT Enterprise, and the API is not used for training by default. (openai.com 1) (openai.com 2)

The release also extends OpenAI’s recent push into open-weight tools. In 2025, the company published gpt-oss models and later gpt-oss-safeguard under the same Apache 2.0 license, framing them as models developers can run on infrastructure they control. (openai.com 1) (openai.com 2)

OpenAI’s pitch is straightforward: remove sensitive details before the text is sent anywhere else. For companies building with large language models, that moves privacy controls to the first step instead of the cleanup step. (openai.com)
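
For readers who want a concrete picture of "token classification followed by span decoding" and of the precision/recall controls the repository mentions, the toy sketch below may help. It is not OpenAI's code and does not call the model: the `NAME` and `PHONE` labels, the hand-written scores, the `threshold` parameter, and the `tokenize`, `decode_spans`, and `redact` helpers are all assumptions made for illustration, and the real model likely works on subword tokens with its own label set.

```python
import re
from dataclasses import dataclass


@dataclass
class Token:
    text: str
    start: int    # character offset where the token begins
    end: int      # character offset just past the token
    label: str    # hypothetical label; "O" means "not PII"
    score: float  # hypothetical classifier confidence for that label


def tokenize(text, labels):
    """Pair whitespace-delimited tokens with hand-written (label, score) tuples.

    A real classifier would produce the labels; here they are supplied by hand.
    """
    matches = list(re.finditer(r"\S+", text))
    if len(matches) != len(labels):
        raise ValueError("need exactly one (label, score) pair per token")
    return [
        Token(m.group(), m.start(), m.end(), label, score)
        for m, (label, score) in zip(matches, labels)
    ]


def decode_spans(tokens, threshold=0.5):
    """Group adjacent tokens with the same PII label into (start, end, label) spans.

    Tokens scoring below `threshold` are left visible. Raising the threshold
    favors precision; lowering it favors recall.
    """
    spans = []
    for tok in tokens:
        if tok.label == "O" or tok.score < threshold:
            continue
        if spans:
            prev_start, prev_end, prev_label = spans[-1]
            # Merge when the label matches and at most one character
            # (a single space) separates this token from the previous span.
            if prev_label == tok.label and tok.start - prev_end <= 1:
                spans[-1] = (prev_start, tok.end, tok.label)
                continue
        spans.append((tok.start, tok.end, tok.label))
    return spans


def redact(text, spans):
    """Replace each span with a bracketed placeholder naming its label."""
    pieces = []
    cursor = 0
    for start, end, label in spans:
        pieces.append(text[cursor:start])
        pieces.append(f"[{label}]")
        cursor = end
    pieces.append(text[cursor:])
    return "".join(pieces)


TEXT = "Call Ana Lopez at 555-0100 today"
# Hand-written stand-ins for model output (assumed values, not real scores).
LABELS = [
    ("O", 1.0), ("NAME", 0.98), ("NAME", 0.95),
    ("O", 1.0), ("PHONE", 0.62), ("O", 1.0),
]
TOKENS = tokenize(TEXT, LABELS)

print(redact(TEXT, decode_spans(TOKENS, threshold=0.5)))
# Call [NAME] at [PHONE] today
print(redact(TEXT, decode_spans(TOKENS, threshold=0.9)))
# Call [NAME] at 555-0100 today
```

The two print calls show the trade-off in miniature: at the lower threshold the uncertain phone number is masked, while at the stricter threshold it stays visible and only the high-confidence name is redacted.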