OpenAI Privacy Filter

- OpenAI released Privacy Filter, an open-source on-device model that sanitises personal information in enterprise datasets.
- The model uses a Mixture-of-Experts architecture and is available under an Apache 2.0 licence for edge redaction.
- On-device redaction is being presented as a way to minimise sensitive data before it reaches central services. (venturebeat.com)

OpenAI has released Privacy Filter, a small open-weight model that strips personal data from text on a user’s own device before that text is sent anywhere else. (openai.com) The company published the model on April 22, 2026, and made the code and weights available under the Apache 2.0 license on GitHub and Hugging Face. OpenAI said the model is built for detecting and redacting personally identifiable information (PII) in enterprise text workflows. (openai.com, github.com, huggingface.co)

Privacy Filter is a token-classification model, which means it reads text and labels each token (a word or word fragment) instead of generating a reply. OpenAI said it can detect spans such as names, bank account numbers, and other sensitive identifiers in unstructured documents. (openai.com, news.bloomberglaw.com)

The model is designed to run locally in a browser or on a laptop, so raw text can be cleaned before it reaches a cloud service. OpenAI said that setup lowers exposure risk for companies that handle customer records, support logs, or internal documents with sensitive fields. (openai.com)

OpenAI’s technical pitch is that the model is sparse, a “Mixture of Experts” design that keeps only a small slice of its parameters active for each token it processes. The repository says Privacy Filter has 1.5 billion total parameters but activates only about 50 million of them at inference time, which is how OpenAI says it stays light enough for edge deployment. (github.com)

The repository also lists a 128,000-token context window, which lets the model process long files without chopping them into smaller pieces first. OpenAI said users can tune operating points to trade off between catching more sensitive text and reducing false alarms. (github.com, openai.com)

In its model card, OpenAI described Privacy Filter as a high-throughput system for context-aware detection of personal information in text. Third-party coverage citing OpenAI’s benchmark results said the model scored 96% F1 on PII-Masking-300k, with 94.04% precision and 98.04% recall. (cdn.openai.com, helpnetsecurity.com)

The release lands as OpenAI has been pushing a broader open-model strategy alongside its closed commercial systems. OpenAI’s open-models page says its recent open-weight releases are meant for teams that want to run models on infrastructure they control, including on-premises and private-cloud setups. (openai.com, help.openai.com)

That makes Privacy Filter less a chatbot feature than a plumbing tool for companies trying to keep private data out of downstream systems. The closer redaction happens to the source text, the less unfiltered material has to move across networks or sit on third-party servers. (openai.com, venturebeat.com)
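
To make the span-and-threshold idea concrete, here is a minimal sketch of how detected spans could be turned into redacted text before anything leaves the device. It does not call Privacy Filter itself. The `Span` fields, the label names, the confidence scores, and the `redact` helper are all assumptions made for illustration, not the model's actual output schema or API. The `threshold` argument stands in for the kind of operating point OpenAI describes: lower it to catch more text, raise it to cut false alarms.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    """A detected span. Hypothetical schema, not the model's real output format."""
    start: int    # inclusive character offset
    end: int      # exclusive character offset
    label: str    # illustrative label name, e.g. "NAME"
    score: float  # illustrative confidence in [0, 1]


def redact(text: str, spans: list[Span], threshold: float = 0.5) -> str:
    """Replace each span scoring at or above `threshold` with a [LABEL] placeholder."""
    kept = sorted((s for s in spans if s.score >= threshold), key=lambda s: s.start)
    pieces = []
    cursor = 0
    for span in kept:
        if span.start < cursor:
            continue  # skip a span that overlaps one already redacted
        pieces.append(text[cursor:span.start])  # untouched text before the span
        pieces.append(f"[{span.label}]")        # placeholder instead of the raw value
        cursor = span.end
    pieces.append(text[cursor:])                # untouched tail
    return "".join(pieces)


# Hand-written example spans standing in for a detector's output.
text = "Contact Jane Doe about account 12345678."
spans = [
    Span(8, 16, "NAME", 0.97),
    Span(31, 39, "ACCOUNT_NUMBER", 0.62),
]

print(redact(text, spans, threshold=0.5))
print(redact(text, spans, threshold=0.9))
```

With the lower threshold both the name and the account number are replaced. With the higher one only the name is, which is the trade-off between recall and false alarms in miniature.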
