Massive Chinese data trove leaked
A cybercrime forum listing reportedly offers roughly 8 to 9 TB of Chinese databases, described as a 50-billion-record mega-collection spanning e-commerce, police, and citizen data (x.com). The post frames the leak as a consolidated archive of multiple datasets, though the forum-level claims have yet to be independently verified (x.com).
A cybercrime-forum post is advertising what it says is a giant archive of Chinese databases, but the seller’s headline numbers have not been independently verified. (darkwebinformer.com) (tornews.com)
The listing described in recent reporting claims roughly 8 to 9 terabytes of data and tens of billions of records drawn from multiple sources, not one newly discovered breach. The public evidence so far comes from screenshots and forum-level claims circulated by Dark Web Informer and repeated by TorNews. (darkwebinformer.com) (tornews.com)
That distinction matters because cybercrime forums often bundle old leaks, partial dumps, and repackaged databases into one “mega-collection” for resale. A record count can also overstate impact, since one person can appear many times across phone, address, account, and transaction tables. (justice.gov) (spycloud.com)
China has already seen several huge data exposures built from aggregated personal information. Cybernews reported on February 3, 2026 that researchers found an unsecured Elasticsearch database with 8.73 billion Chinese records across 163 indices, including national identity numbers, addresses, emails, and passwords. (cybernews.com)
SpyCloud said on April 9, 2026 that it obtained a copy of that January dataset and parsed 6.38 billion unique records from the original 8.7 billion, including 2.5 billion records with national identity numbers and 433 million with passwords. SpyCloud said the mix of formats and sources pointed to a long-running aggregation effort rather than a single incident. (spycloud.com)
Another benchmark came in 2022, when a seller offered about 23 terabytes of data tied to the Shanghai police database and claimed records on roughly 1 billion people. Dark Reading reported the exposed information included names, addresses, phone numbers, national identity numbers, and criminal-record data. (darkreading.com)
The mechanics are simple: Elasticsearch is a search database built to sort and retrieve huge volumes of information quickly, and an exposed server can act like a filing cabinet left unlocked on the internet. Cybernews said the 2026 Chinese dataset sat open for more than three weeks before it was closed. (cybernews.com)
The immediate risk is not only identity theft. SpyCloud said large Chinese personal-data collections can support phishing, account takeovers, fraud, and investigations into Chinese-speaking threat actors, because phone numbers, names, identity numbers, and passwords can be cross-matched at scale. (spycloud.com)
The market for that kind of material is well established. The United States Department of Justice said on March 4, 2026 that the seized LeakBase forum had more than 142,000 members and an archive of hacked databases, credentials, payment-card data, and other personal information for sale. (justice.gov)
What is new in this case is the sales pitch, not yet the proof: a seller is presenting a China-focused archive as a single massive trove. Until researchers obtain samples, test for duplicates, and map the source datasets, the safest description is a large alleged compilation being marketed in a mature underground trade. (tornews.com) (spycloud.com)
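To illustrate why a raw record count can overstate the number of people affected, here is a minimal Python sketch of the kind of duplicate test described above. It is not any researcher's actual method. The field name `national_id`, the helper names, and the sample rows are all hypothetical, and real datasets would need far more careful matching than this simple normalization.

```python
from collections import Counter


def normalize_id(value: str) -> str:
    """Drop spaces and punctuation and uppercase the rest (hypothetical rule)."""
    return "".join(ch for ch in value if ch.isalnum()).upper()


def dedup_summary(records, key="national_id"):
    """Return (total_rows, unique_keys, rows_without_key) for dict-like rows."""
    total = 0
    missing = 0
    counts = Counter()
    for row in records:
        total += 1
        raw = row.get(key)
        if not raw:
            # Rows with no identifier cannot be matched to a person here.
            missing += 1
            continue
        counts[normalize_id(raw)] += 1
    return total, len(counts), missing


# Made-up rows standing in for phone, address, account, and transaction tables.
sample = [
    {"table": "phone", "national_id": "ID-0001", "phone": "555-0100"},
    {"table": "address", "national_id": "id 0001", "city": "Example City"},
    {"table": "account", "national_id": "ID-0002", "email": "a@example.com"},
    {"table": "transaction", "national_id": "ID-0001", "amount": "12.50"},
    {"table": "transaction", "national_id": None, "amount": "3.00"},
]

total, unique, missing = dedup_summary(sample)
print(f"{total} rows, {unique} distinct identifiers, {missing} rows without one")
```

In this toy sample, five rows collapse to two distinct identifiers, which is the effect the reporting warns about. The same kind of ratio applied to the SpyCloud figures cited above gives 6.38 billion unique records out of 8.7 billion, or roughly 73 percent, although those are unique records, not unique people.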