By Quentin Fottrell, MarketWatch
Aggregated or “anonymized” data may not be enough to protect your identity.
That’s the conclusion of a study published Tuesday in the peer-reviewed, open-access scientific journal Nature Communications. Data collected by technology companies is typically “anonymized” so it can be used by marketers and advertisers. But the researchers found that “99.98%” of people were correctly identified in anonymized datasets using 15 characteristics, including age, gender and marital status.
“While rich medical, behavioral, and socio-demographic data are key to modern data-driven research, their collection and use raise legitimate privacy concerns,” the study concluded. The researchers said these “anonymized datasets” are unlikely to satisfy the modern standards for anonymization set by the European Union’s General Data Protection Regulation (GDPR).
The GDPR is a European Union privacy law that went into effect in May 2018. The rules require all organizations, from local governments to Silicon Valley tech companies including Google, Twitter and Facebook, to take special precautions to protect the personal data and privacy of EU citizens.
Consumer advocates are increasingly concerned about how large datasets could be used by hackers in light of several recent data scandals. In 2017, Equifax revealed that hackers accessed the personal information of up to 147 million people; this week, the credit bureau announced a settlement, setting aside $700 million to provide cash payments for people who were affected.
Last year, Facebook announced that U.K.-based Cambridge Analytica improperly accessed 87 million Facebook users’ data. Facebook chief executive Mark Zuckerberg testified before Congress and vowed to do more to fix the problem and help make sure that nothing like that happens again. Cambridge Analytica closed down in the wake of the scandal, and the Federal Trade Commission fined Facebook $5 billion over it.
WhatsApp, the messaging and audio app owned by Facebook, announced last May that hackers were able to install spyware on Android smartphones and Apple iPhones. “The attack has all the hallmarks of a private company reportedly that works with governments to deliver spyware that takes over the functions of mobile phone operating systems,” it said.
More than 57 million Uber riders and drivers had their data exposed by a massive hack in October 2016. The breach exposed the email addresses of 50 million Uber riders around the world and also affected seven million drivers. The hack was revealed a year after the attack, and Uber fired its chief security officer Joe Sullivan and one of his deputies for concealing it.
Luc Rocher, a research fellow at Université Catholique de Louvain (UCLouvain) in Brussels and co-author of the latest study, wrote, “While there might be a lot of people who are in their thirties, male, and living in New York City, far fewer of them were also born on Jan. 5, are driving a red sports car, and live with two kids (both girls) and one dog.”
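The intuition in that quote can be sketched in a few lines of Python. This is only an illustration on a made-up five-row table, not the statistical model the researchers used; the records, field names and the `unique_fraction` helper are all hypothetical.

```python
from collections import Counter

# Hypothetical toy records, invented purely for illustration.
RECORDS = [
    {"age": "30s", "gender": "M", "city": "New York", "birthday": "01-05", "car": "red sports car"},
    {"age": "30s", "gender": "M", "city": "New York", "birthday": "03-17", "car": "sedan"},
    {"age": "30s", "gender": "M", "city": "New York", "birthday": "01-05", "car": "sedan"},
    {"age": "30s", "gender": "F", "city": "New York", "birthday": "01-05", "car": "red sports car"},
    {"age": "40s", "gender": "M", "city": "Boston", "birthday": "11-30", "car": "pickup"},
]

def unique_fraction(records, attributes):
    """Share of records whose combination of the given attributes appears exactly once."""
    keys = [tuple(record[a] for a in attributes) for record in records]
    counts = Counter(keys)
    return sum(1 for key in keys if counts[key] == 1) / len(records)

# Each extra attribute shrinks the crowd a person can hide in.
print(unique_fraction(RECORDS, ["age", "gender", "city"]))
print(unique_fraction(RECORDS, ["age", "gender", "city", "birthday"]))
print(unique_fraction(RECORDS, ["age", "gender", "city", "birthday", "car"]))
```

On this toy table, the share of people who are one of a kind rises from 0.4 to 0.6 to 1.0 as birthday and car are added, even though no name appears anywhere in the data.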
The researchers developed a tool that first asks you to enter the first part of your post code (U.K.) or ZIP Code (U.S.), gender, and date of birth, then gives you a probability that your profile could be re-identified in any “anonymized dataset.” It then asks for your marital status, number of vehicles, home ownership status, and employment status.
In May 2019, the New York Times reported on President Trump’s tax information from 1985 to 1994. The newspaper did not have the actual tax returns, but it obtained information contained in them and matched the figures against Internal Revenue Service data on the country’s top earners, a publicly available database that had identifying details removed.
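Matching known details against a table with the names stripped out is often called a linkage attack. A minimal sketch of the idea follows, with the caveat that it is not a description of the newspaper’s actual method: the rows, field names and the `link` helper are hypothetical, and the numbers are invented.

```python
# Hypothetical de-identified table: no names, only invented figures.
DEIDENTIFIED_ROWS = [
    {"row_id": 1, "year": 1990, "wages": 100, "interest": 5},
    {"row_id": 2, "year": 1990, "wages": 100, "interest": 9},
    {"row_id": 3, "year": 1991, "wages": 250, "interest": 5},
]

def link(known, rows):
    """Return the rows that agree with every attribute already known about a person."""
    return [row for row in rows
            if all(row.get(field) == value for field, value in known.items())]

# Two known details still leave two candidate rows.
candidates = link({"year": 1990, "wages": 100}, DEIDENTIFIED_ROWS)
print(len(candidates))

# A third known detail leaves a single row, which re-identifies it.
match = link({"year": 1990, "wages": 100, "interest": 9}, DEIDENTIFIED_ROWS)
print(match)
```

When only one row survives the filter, everything else recorded in that row can be tied back to the person, even though the table itself never named anyone.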
“We’re often assured that anonymization will keep our personal information safe. Our paper shows that de-identification is nowhere near enough to protect the privacy of people’s data,” wrote Rocher and Julien Hendrickx, a professor of mathematical engineering at UCLouvain and co-author of the latest study. They said standards for such data should be more robust.
Security experts generally recommend never reusing passwords and say people should use two-factor authentication, which requires a user to enter a code sent to a phone or email into an app or website in order to log in from a new device or to change a password. However, such precautions would not protect people against a data breach.