Half‑million health records exposed
What happened
- Health records for 500,000 UK Biobank participants were posted for sale on a Chinese Alibaba site after a contract breach.
- The data were described as “de-identified” and included volunteers' medical information used for ageing and disease research.
- Security experts warned that de-identification isn't foolproof and called for stronger governance and vendor oversight across research datasets. (bloomberg.com)
Why it matters
Medical data tied to 500,000 UK Biobank volunteers was advertised for sale on Alibaba after a contract breach by researchers with access to the dataset. (ukbiobank.ac.uk) UK technology minister Ian Murray told Parliament on April 23 that the government learned of the listings on Monday and is investigating how the files appeared on Alibaba’s Chinese e-commerce platform. UK Biobank said the data had been provided to researchers at three academic institutions. (politico.eu)
UK Biobank said the listings were removed before any purchases were made, with help from the UK government, Chinese authorities and Alibaba. The charity said the institutions and individuals involved had their access suspended. (ukbiobank.ac.uk)
UK Biobank is one of Britain’s biggest medical research resources: it recruited 500,000 people aged 40 to 69 between 2006 and 2010 and links their samples, scans and health records for studies on diseases including cancer, dementia and Parkinson’s. Researchers have been using its de-identified data since 2012. (news.sky.com) (ukbiobank.ac.uk)
“De-identified” means names, addresses, full dates of birth and National Health Service numbers are removed before data is shared. Murray said the exported files still included details such as gender, age, month and year of birth, socioeconomic status, lifestyle habits and measures from biological samples. (news.sky.com) That matters because removing obvious identifiers does not make it impossible to trace records back to individuals. Murray told lawmakers he could not guarantee “100%” that nobody could be identified from the material, though he said doing so would require advanced methods. (news.sky.com)
The breach comes a month after a Guardian investigation found UK Biobank files had been exposed online dozens of times, often when researchers accidentally uploaded data to GitHub with their code. The report said UK Biobank issued 80 legal notices to GitHub between July and December 2025 to get data removed. (htworld.co.uk)
UK Biobank says it had already moved researchers onto a restricted cloud platform in the UK, but this incident showed files could still be taken out. It has now temporarily suspended access to that platform, added strict limits on export size and said exported files will be monitored daily for suspicious behavior. (ukbiobank.ac.uk)
The charity has referred itself to the Information Commissioner’s Office, and Murray said the government takes the case “extremely seriously.” The immediate question is no longer whether the listings were real, but whether research datasets built for public health can stay secure once outside the original system. (politico.eu)
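The re-identification concern raised in this section can be illustrated with a small sketch. It is not drawn from the reports and does not describe any method used by UK Biobank or discussed by Murray. It applies a standard privacy measure, k-anonymity, to a handful of made-up records: if a combination of seemingly harmless fields (here gender, birth year, birth month and a hypothetical socioeconomic band) appears only once, that record stands alone even without a name attached. The field choices, the `band_N` labels and the function names are all assumptions for illustration.

```python
from collections import Counter

# Synthetic, made-up records for illustration only; this is not UK Biobank data.
# Each tuple holds hypothetical quasi-identifiers:
# (gender, birth year, birth month, socioeconomic band).
records = [
    ("F", 1952, 3, "band_2"),
    ("F", 1952, 3, "band_2"),
    ("M", 1948, 11, "band_4"),
    ("M", 1961, 7, "band_1"),
    ("F", 1955, 1, "band_3"),
    ("F", 1955, 1, "band_3"),
]


def group_sizes(rows):
    """Count how many records share each combination of quasi-identifiers."""
    return Counter(rows)


def smallest_group(rows):
    """Return k, the size of the smallest group of identical combinations."""
    return min(group_sizes(rows).values())


def unique_records(rows):
    """Return the combinations that appear exactly once in the dataset."""
    return [combo for combo, n in group_sizes(rows).items() if n == 1]


if __name__ == "__main__":
    print("k =", smallest_group(records))
    print("combinations held by a single record:", unique_records(records))
```

In this toy data two of the six records have a combination nobody else shares, so k is 1. Anyone who already knew those few facts about a person could pick out that person's row. Real datasets carry far more fields per person, which generally makes unique combinations more likely, not less.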
Key numbers
- Health records for 500,000 UK Biobank participants were advertised for sale on a Chinese Alibaba site after a contract breach by researchers with access to the dataset. (bloomberg.com) (ukbiobank.ac.uk)
- UK Biobank said the data had been provided to researchers at three academic institutions. (politico.eu)
- UK technology minister Ian Murray told Parliament on April 23 that the government learned of the listings on Monday and is investigating how the files appeared on Alibaba’s Chinese e-commerce platform. (politico.eu)
- UK Biobank issued 80 legal notices to GitHub between July and December 2025 to get exposed data removed, according to a Guardian investigation. (htworld.co.uk)
- Researchers have been using UK Biobank's de-identified data since 2012. (news.sky.com) (ukbiobank.ac.uk)
What happens next
- Murray told lawmakers he could not guarantee “100%” that nobody could be identified from the material, though he said doing so would require advanced methods.
- UK Biobank says it had already moved researchers onto a restricted cloud platform in the UK, but this incident showed files could still be taken out.
- UK Biobank has now temporarily suspended access to that platform, added strict limits on export size and said exported files will be monitored daily for suspicious behavior. (ukbiobank.ac.uk)
- The charity has referred itself to the Information Commissioner’s Office. (politico.eu)