
July 25, 2019, 2:04 p.m. EDT

A worrying theory after Equifax and Facebook settlements — aggregated data is NOT enough to protect your privacy

A new study says it’s possible to ‘reverse engineer’ anonymous data to identify individuals


By Quentin Fottrell, MarketWatch



Aggregated or “anonymized” data may not be enough to protect your identity.

That’s the conclusion of a study published Tuesday in the peer-reviewed, open-access scientific journal Nature Communications. Data collected by technology companies is typically “anonymized” so that it can be used by marketers and advertisers. But the study found that 99.98% of people could be correctly identified in anonymized datasets using 15 characteristics, including age, gender and marital status.


“While rich medical, behavioral, and socio-demographic data are key to modern data-driven research, their collection and use raise legitimate privacy concerns,” the study concluded. The researchers said these “anonymized datasets” are unlikely to satisfy the modern standards for anonymization set by the European Union’s General Data Protection Regulation (GDPR).

The GDPR is a European Union privacy law that went into effect in May 2018. The rules require all organizations, from local governments to Silicon Valley tech companies including Google (GOOG, GOOGL), Twitter (TWTR) and Facebook (FB), to take special precautions to protect the personal data and privacy of EU citizens.

Consumer advocates are increasingly concerned about how large datasets could be used by hackers in light of several recent data scandals. In 2017, Equifax (EFX) revealed that hackers accessed the personal information of up to 147 million people; this week, the credit bureau announced a settlement, setting aside $700 million to provide cash payments for people who were affected.

Last year, Facebook announced that U.K.-based Cambridge Analytica improperly accessed 87 million Facebook users’ data. Facebook chief executive Mark Zuckerberg testified before Congress and vowed to do more to fix the problem and help make sure that nothing like that happens again. Cambridge Analytica closed down in the wake of the scandal, and the Federal Trade Commission fined Facebook $5 billion over it.

WhatsApp, the messaging and audio app owned by Facebook, announced in May 2019 that hackers were able to install spyware on Android smartphones and Apple (AAPL) iPhones. “The attack has all the hallmarks of a private company reportedly that works with governments to deliver spyware that takes over the functions of mobile phone operating systems,” it said.


A massive hack in October 2016 exposed the data of more than 57 million Uber (UBER) users: the email addresses of 50 million riders around the world, plus the data of seven million drivers. The company disclosed the attack a year after it happened, and fired its chief security officer, Joe Sullivan, and one of his deputies for concealing it.


Luc Rocher, a research fellow at Université Catholique de Louvain (UCLouvain) in Belgium and co-author of the latest study, wrote, “While there might be a lot of people who are in their thirties, male, and living in New York City, far fewer of them were also born on Jan. 5, are driving a red sports car, and live with two kids (both girls) and one dog.”

The researchers developed a tool that first asks you to enter the first part of your post code (U.K.) or ZIP Code (U.S.), your gender, and your date of birth, and then gives you a probability that your profile could be re-identified in any “anonymized dataset.” It then asks for your marital status, number of vehicles, house ownership status, and employment status.
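
The narrowing effect Rocher describes can be illustrated with a small sketch. This is not the researchers’ tool or their statistical model; it simply counts, on a made-up toy dataset, what share of records become unique as more attributes are combined. The records, the field names, and the `unique_fraction` helper are all hypothetical.

```python
from collections import Counter

# Hypothetical toy records; none of these values come from the study.
RECORDS = [
    {"zip3": "100", "gender": "M", "birth": "01-05", "marital": "married"},
    {"zip3": "100", "gender": "M", "birth": "01-05", "marital": "single"},
    {"zip3": "100", "gender": "M", "birth": "03-12", "marital": "single"},
    {"zip3": "100", "gender": "F", "birth": "03-12", "marital": "single"},
    {"zip3": "112", "gender": "F", "birth": "07-30", "marital": "married"},
    {"zip3": "112", "gender": "F", "birth": "07-30", "marital": "married"},
]


def unique_fraction(records, attributes):
    """Return the share of records whose values for `attributes` match no other record."""
    keys = [tuple(record[a] for a in attributes) for record in records]
    counts = Counter(keys)
    return sum(1 for key in keys if counts[key] == 1) / len(records)


# Add one attribute at a time and watch the unique share change.
ATTRIBUTE_SETS = [
    ["zip3"],
    ["zip3", "gender"],
    ["zip3", "gender", "birth"],
    ["zip3", "gender", "birth", "marital"],
]

for attrs in ATTRIBUTE_SETS:
    print(attrs, round(unique_fraction(RECORDS, attrs), 2))
```

On this toy data, no record is unique by ZIP prefix alone, while four of the six are unique once all four attributes are combined. That is only the intuition behind the finding, not a reproduction of the study’s 99.98% figure.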

In May 2019, the New York Times reported on President Trump’s tax figures from 1985 to 1994. The newspaper obtained information contained in Trump’s tax returns; although it did not have the actual returns, it matched the figures against Internal Revenue Service data on the country’s top earners, a publicly available database that had identifying details removed.

“We’re often assured that anonymization will keep our personal information safe. Our paper shows that de-identification is nowhere near enough to protect the privacy of people’s data,” wrote Rocher and Julien Hendrickx, a professor of mathematical engineering at UCLouvain and a co-author of the study. They said standards for such data should be more robust.

Security experts generally recommend never reusing passwords and say people should use two-factor authentication, which requires a user to enter a code sent to a phone or email address into an app or website in order to log in from a new device or to change a password. However, such precautions would not protect people against a data breach.


Quentin Fottrell is MarketWatch's personal-finance editor and The Moneyist columnist for MarketWatch. You can follow him on Twitter @quantanamo.
