A study of 141 million files exposed in data breaches has found that 82% of incidents included HR data.
Lab 1, a data intelligence platform, has published what it claims to be the biggest ever content-level analysis of information, using AI agents to scrape breached datasets and analyse every file exposed.
These included unstructured files, such as PDFs, emails and spreadsheets, which it said are typically overlooked in data breach analyses. These file formats are commonly used by HR practitioners and may hold sensitive information that can not only be leveraged for sophisticated cyberattacks but also introduce the risk of identity fraud for employees.
Analysing 141 million files leaked in the public domain from nearly 1,300 ransomware and data breach incidents, Lab 1’s Anatomy of a breach report found that HR data appeared in 82% of breaches.
HR files typically contain personally identifiable information, including names, addresses, national IDs and health-related records. The report found that US social security numbers were present in 51% of all incidents analysed, posing a risk of identity theft and data protection violations.
The study also found that 58% of incidents included recruitment data such as names, addresses, and contact details of candidates from CVs and covering letters.
Robin Brattel, co-founder and CEO at Lab 1, said: “Rather than focus on mega data dumps of structured and primarily credential-based information, we’ve focused on the huge risks associated with unstructured files that often hold high-value information, such as cryptographic keys, customer account data, or sensitive commercial contracts.
“With cybercriminals now behaving like data scientists to unearth these valuable insights to fuel cyberattacks and fraud, unstructured data cannot be ignored. We’ve refined a scientific approach to analysing unstructured breach contents and today share our findings, which underline the need to move towards a content-aware approach to breach analysis.
“Ultimately, organisations must understand what information has been leaked, how it can be used, and who might be affected. And faster than it can be used against them.”
Lab 1 said that breaches rich in HR content and correspondence are particularly suited to “AI-enabled weaponisation”.
Eighty-six per cent of data breaches included leaked emails, it said. When cross-referenced with internal HR files, this information supports “hyper-targeted phishing and social engineering campaigns” and could be used to generate “synthetic identities”, which combine real and fabricated information to create new personas to open bank accounts, apply for credit cards or take out loans.