Facebook gathers data for data mining operations used by data brokers

When it comes to mass spying, the best game in town is not the CIA or any of the alphabet soup agencies. Private companies and data brokers have been collecting data on a massive scale, and given their advanced statistical methods, this information can say a lot about a person. In fact, I’d say that what they have is better than what the alphabet soup has, and their data has a lot of implications.


This started when companies like Amazon realized that they could make a profit every step of the way: sell items to customers, then sell the customers’ data to data brokers. Data brokering has since become much bigger, and the data collection methods have become much more extensive along with it. There are many ways to gather mass data, and these are just the ones I can think of off the top of my head. First, many websites straight up sell their data to brokers. This includes many online vendors, all kinds of popular sites (not all of them, but some of them), adult entertainment sites, you name it. Where sites do not sell data themselves, dishonest brokers can and do embed tracking ads on any site that accepts them, which lets them follow a user across sites and piece together that user’s browsing history. Then of course there are companies like Google, which sell user search histories.

Then there are companies like Facebook (and, to a lesser extent, WhatsApp), which sell all of their data on all of their users. Obviously, this data includes everything users put up on Facebook. The data from Facebook helps put a name on an IP address or a browser fingerprint. In addition to that, the Facebook mobile app (which usually comes preinstalled on smartphones) also collects your phone’s entire contents, including phone numbers and emails. This means that even if a person does not have Facebook, Facebook can still collect data on that person as long as their friends or associates have it.

Then there are mobile apps. I’m not sure if Facebook does all of this (it may or may not), but I know that there are many, many free apps that do. They collect all the data that can be collected from a smartphone: names, addresses, emails, geolocation data, GSM location data, camera pictures, accelerometer data, whatever, you name it.

All of this data (and this is just what I can name off the top of my head) can be put together. The data will also be crosschecked: if they know any emails, user names or passwords, those can be used to find more data. For example, if you use the same password in several places, the system can link those accounts to you even if you use a different user name and email there (even if they don’t have your password, I believe they can compare password hashes). If that account yields any additional information, then they will obviously crosscheck that too. Facebook accounts and work emails can be used to put a name to all the accounts, IP addresses, MAC addresses and browser fingerprints. The system will also check for phone numbers, which are used in many places, ranging from two-step verification to Facebook to emails to various other media accounts that need one.
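
To make the crosschecking idea concrete, here is a minimal sketch in Python of how accounts might be linked through shared identifiers. Everything in it is hypothetical: the account records, the field names (email, phone, password_hash) and the helper functions are made up for illustration, and this is not a description of any real broker's system. It also assumes the password hashes are unsalted, since hashes that were properly salted per site could not be compared directly like this.

```python
import hashlib
from collections import defaultdict


def unsalted_hash(password: str) -> str:
    """Hypothetical helper: an unsalted SHA-256 hash, as a careless site might store it."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


def link_accounts(accounts):
    """Group account ids that share any identifier value (union-find)."""
    parent = {acc["id"]: acc["id"] for acc in accounts}

    def find(x):
        # Follow parent pointers to the group's root, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    first_seen = {}  # (field, value) -> first account id seen with that value
    for acc in accounts:
        for field in ("email", "phone", "password_hash"):
            value = acc.get(field)
            if not value:
                continue
            key = (field, value)
            if key in first_seen:
                union(first_seen[key], acc["id"])
            else:
                first_seen[key] = acc["id"]

    groups = defaultdict(set)
    for acc in accounts:
        groups[find(acc["id"])].add(acc["id"])
    return sorted(sorted(group) for group in groups.values())


# Made-up records: three accounts share no single identifier across all of them,
# but a reused password and a reused phone number chain them together.
accounts = [
    {"id": "forum:nightowl", "email": "owl@example.com",
     "password_hash": unsalted_hash("hunter2")},
    {"id": "shop:jsmith", "email": "j.smith@example.org", "phone": "+15550100",
     "password_hash": unsalted_hash("hunter2")},
    {"id": "social:john.smith", "phone": "+15550100"},
    {"id": "game:stranger", "email": "other@example.net",
     "password_hash": unsalted_hash("correct horse")},
]

profiles = link_accounts(accounts)
for profile in profiles:
    print(profile)
```

The point of the sketch is the chaining: the forum account and the social account have nothing in common directly, but both connect to the shop account, so all three end up in one profile. Once one account in the group carries a real name, the name attaches to the whole group.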

Analyzed together, this data can tell a great deal about a person: age, gender, race, income, family members, friends, associates, address, political leanings, personality type, mood, any possible health issues, any possible mental health issues (from depression to alcoholism, for example), sexual orientation, adult entertainment preferences, waking hours, work hours, who that person has met, you name it. There is more stuff than I can think of off the top of my head. Some companies have such accurate data models that they’ve actually been compiling lists of rape victims (I have no idea what use that serves). Big data has also been so accurate in predicting pregnancies that the model has in some cases predicted that a woman was pregnant before she herself knew it.

Where this gets really interesting is what kind of things can be done with all this data. The obvious thing is marketing. A data broker called Acxiom boasts about gathering data on 3000 different vulnerabilities that can be exploited to sell people stuff. Also, while mobile games can seem like a joke, some companies have put a truly impressive amount of research into how to make mobile games addictive. It sounds stupid, but they exploit all kinds of psychological mechanisms to make those games as addictive as they can (but that’s another topic).

Recently, there was a leak. Nobody knows for sure what company it came from (Acxiom being a prime suspect) and the leak is also impossible to verify. But the data sounds plausible. So apparently, data brokers have been using extremely intrusive methods of data gathering, such as tapping into a smartphone’s microphone. The leak said that this anonymous company has a sound recognition program that can identify 500 000 different non-speech sounds, work out what they say about a person, and add that to the person’s profile (the sounds could be anything from a barking dog to crying children). Also, according to the leak, they can identify what car you’re driving by the engine sounds. And while the leak didn’t say this, I imagine they’d run speech recognition too. In addition to that, smartphones can be used to scan for other smartphones in the vicinity. Phones can also do wifi mapping, charting your house layout.

The leak can be found here: https://imgur.com/a/rhFuj

Best part about all of this? It’s all 100% legal. If you’ve ever accepted an EULA, then you’ve probably agreed to all of the above. And if any of this sounds far-fetched, feel free to use Google for details. Combine words like “data broker” and “big data” with whatever particular topic you’re interested in. It was only a couple of years ago that people thought that government mass surveillance was pure tinfoil hat conspiracy lunacy. And lo and behold, all of it turned out to be true.

Obviously, this data collection has many issues, the biggest being privacy. But beyond that, there are leaks. The most famous is the Equifax leak, which exposed the financial information of what, half of all Americans? These companies have also been known to sell this data to the wrong people: one data broker didn’t do proper vetting and sold financial information to scammers and internet criminals.

I suspect that this data also has a lot of use in the field of politics. Presumably, intelligence agencies (and not just US intelligence agencies) have already taken an interest in all of this. Imagine, for example, buying all of that data and seeing what you could find on US politicians. Do you think that none of them have any compromising stuff that could be uncovered this way? And even if they don’t, do you think that their loved ones are all squeaky clean? I imagine that intelligence services are already building kompromat on important figures.

Furthermore, although I haven’t heard of this happening yet, I imagine that such data could be used to influence people. Imagine someone putting the kind of research into influencing a person’s political positions as they do in mobile games. I imagine they could make this kind of propaganda extremely effective and subtle (then again I am not an expert on this). For example, if you’d want to convert someone towards small government libertarianism, every now and then, you could introduce some subtle piece of news about some government fuck-up into their news-feed. Then you could monitor if it results in any kind of change in their views (in reality I imagine that this would be used for far more nefarious things though).

Furthermore, although it seems unlikely, imagine all the things that a totalitarian government could do with all of this mass data. You’d have a system that tells you everything you need to know. And given the pervasiveness of technology and social media, I suppose that if someone isn’t using them, then that alone would be enough to raise suspicions. The system gathers data on political leanings, so dissidents would be very easy to find. In addition to that, the system also tells you everything about their family and friends, so that you could crack down on networks. Or if that totalitarian government is targeting certain ethnicities, I imagine that those, too, could be found in this way, along with all of their family and friends. I know it seems unlikely, but the possibility is there.

So yeah, have fun with that thought. Nothing you’ve ever done on an electronic screen is private anymore. For many people, that includes huge portions of their lives, from phone to social media. And unless you’re an advanced user, there is nothing you can do about it.

