Hugging Face hosted fake OpenAI repo
- HiddenLayer said on May 7 it found a Hugging Face repo, Open-OSS/privacy-filter, impersonating OpenAI’s real Privacy Filter project and delivering infostealer malware.
- The fake repo copied OpenAI’s model card, hit Hugging Face’s trending list, and showed more than 200,000 downloads before removal by Hugging Face.
- This matters because AI repos now act like software supply chains: model cards, demo scripts, and loaders can all become attack paths.

AI model hubs are starting to look a lot like app stores and code registries, and that means they inherit the same supply-chain problems.

The news here is simple but ugly. A repository on Hugging Face pretended to be OpenAI’s real “Privacy Filter” project, climbed the platform’s trending list, and used a booby-trapped script to drop credential-stealing malware on Windows machines. HiddenLayer disclosed it on May 7, and Hugging Face removed the repo after it had already shown more than 200,000 downloads.

### What was the fake repo pretending to be?

OpenAI really does publish a Hugging Face model called `openai/privacy-filter`. It is a token-classification model for detecting and masking personally identifiable information, and OpenAI also hosts a demo Space for it on Hugging Face. The malicious repo used the name `Open-OSS/privacy-filter`, copied the legitimate model card almost word for word, and leaned on that similarity to look trustworthy at a glance. (hiddenlayer.com)

### Where did the malware actually sit?

Not in some exotic model weight trick. That is the important part. HiddenLayer said the lure was a normal-looking repository README plus instructions telling users to run `start.bat` on Windows or `python loader.py` directly. The `loader.py` file mixed in decoy AI-looking code, then reached out to a remote JSON endpoint, pulled a command, and executed a PowerShell chain that led to an infostealer payload. (huggingface.co)

### Why is that different from the usual “bad model” story?

Because this was closer to a classic developer compromise than a poisoned checkpoint. A lot of discussion around Hugging Face risk focuses on unsafe serialization formats like Pickle, where loading a model can execute code. Hugging Face has docs for both malware scanning and Pickle scanning, and researchers have already shown malicious model files on the platform before. But this case used the repo itself (the docs, the script, the social proof) as the delivery vehicle.
(hiddenlayer.com) Basically, the attack surface was the whole project page, not just the model artifact.

### Why did so many people plausibly trust it?

Because it looked like the kind of thing developers download every day. The fake page copied a real OpenAI release, pointed at the real OpenAI PDF, and apparently made it onto Hugging Face’s top trending list. Once something looks official and starts trending, people stop reading closely. They treat “clone repo and run script” as setup friction, not as a security boundary. (huggingface.co)

### What does the malware go after?

HiddenLayer’s guidance was blunt: if a Windows user ran the repo files, treat the host as fully compromised and reimage it. The payload was built to harvest browser-stored passwords, session cookies, OAuth tokens, SSH keys, FTP credentials, Discord tokens, and crypto-wallet data. That matters because stolen session cookies can let attackers bypass MFA without ever needing the password itself. (hiddenlayer.com)

### Why should healthcare vendors care?

Because this is exactly how “harmless experimentation” leaks into production. A team grabs a model repo to test de-identification, redaction, summarization, or note cleanup. Then someone runs the helper scripts on a workstation that also touches clinical systems, cloud consoles, or shared browsers. In healthcare, that can turn one bad download into a credential theft problem with PHI exposure wrapped around it. That last step is an inference, but it follows directly from what the malware was built to steal and from how these repos are commonly used. (hiddenlayer.com)

### So what changes now?

The practical lesson is boring and important. Treat model repos like untrusted software packages. Ban direct execution of helper scripts from public repos. Separate research sandboxes from anything that can reach PHI, production browsers, or cloud admin sessions. Prefer safer model formats where possible, but don’t confuse safer weights with a safe repository.
This incident showed the real problem: trust got attached to a familiar brand and a plausible README, and that was enough. (hiddenlayer.com)

### Bottom line?

Open model hubs are now part of the software supply chain. If your controls still assume the risk is only in the model file, you are defending the wrong layer. (hiddenlayer.com)
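
To make "treat model repos like untrusted software packages" a little more concrete, here is a minimal Python sketch of a pre-download check. It is an illustration under stated assumptions, not a tool from HiddenLayer or Hugging Face. The allowlist `TRUSTED_NAMESPACES`, the suffix list `RISKY_SUFFIXES`, and the function names are all hypothetical, and a real policy would be set by your own security team. The check compares the repo owner against an exact allowlist, so a lookalike such as `Open-OSS/privacy-filter` fails even though the project name matches, and it flags scripts and Pickle files that should not be run or loaded outside a sandbox.

```python
from pathlib import PurePosixPath

# Hypothetical policy values, chosen for illustration only.
TRUSTED_NAMESPACES = {"openai"}
RISKY_SUFFIXES = {".bat", ".cmd", ".ps1", ".sh", ".py", ".exe", ".pkl", ".pickle"}


def namespace_of(repo_id: str) -> str:
    """Return the owner part of a 'namespace/name' repo ID."""
    namespace, sep, name = repo_id.partition("/")
    if not sep or not namespace or not name or "/" in name:
        raise ValueError(f"expected 'namespace/name', got {repo_id!r}")
    return namespace


def is_trusted(repo_id: str) -> bool:
    """Exact namespace match (an assumption); the project name is ignored."""
    return namespace_of(repo_id) in TRUSTED_NAMESPACES


def risky_files(filenames):
    """Return the files whose suffix marks them as executable or Pickle."""
    return [
        f for f in filenames
        if PurePosixPath(f).suffix.lower() in RISKY_SUFFIXES
    ]


if __name__ == "__main__":
    # start.bat and loader.py come from HiddenLayer's report;
    # the other file names are made up for this example.
    listing = ["README.md", "start.bat", "loader.py", "model.safetensors"]
    for repo in ("openai/privacy-filter", "Open-OSS/privacy-filter"):
        print(repo, "trusted" if is_trusted(repo) else "NOT trusted")
    print("review before running:", risky_files(listing))
```

A check like this only narrows the problem. A trusted namespace can still be compromised, and a flagged file is a prompt for review rather than proof of malware, so the sandbox separation described above still applies.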