In Brief
There's a growing movement in computer science to use social media data in novel ways, including predicting a user's mental or even physical health. While this can certainly be beneficial, it also has its drawbacks, most notably the invasion of a user's privacy and the pigeonholing of an individual based on what might be considered pretty dubious data.

New research reveals how social network data can be used to predict users’ mental and physical health, adding to a growing body of work that uses social media to make startlingly accurate predictions from the most basic information. As many users reconsider what they choose to share online, the findings cast doubt on conventional ideas of “safe” surfing.

Researchers from Cambridge and Stanford universities have created a computer program that can use Facebook “likes” to predict personality traits like openness, conscientiousness, and neuroticism. Given enough “likes,” the program can judge a person's personality more accurately than that person's friends (70 “likes”), family (150 “likes”), and even spouse (300 “likes”). What’s more, the researchers found that the computer’s judgments had “higher external validity when predicting life outcomes such as substance use, political attitudes, and physical health.” The team is optimistic that this growing field could one day help people to make better decisions, perhaps acting as an electronic agony aunt: one who knows you better than your own mother.

A separate team at the University of Pennsylvania has been conducting its own research using social network data. The group, known as the World Well-Being Project (WWP), found that the language used in tweets can provide an incredibly accurate indication of heart disease mortality rates in a given region. The researchers explained that stress leads to an increased risk of coronary heart disease, and that this same stress is expressed in users’ tweets, through “language patterns reflecting negative social relationships, disengagement, and negative emotions, especially anger.”

Perhaps most interestingly, the team found that “the people tweeting angry words and topics are in general not the ones dying of heart disease. But that means if many of your neighbors are angry, you are more likely to die of heart disease.” Because heart disease is the leading cause of death worldwide, the team hopes the technique can be used to identify high-risk areas and to test the effectiveness of community public-health interventions. “Twitter seems to capture a lot of the same information that you get from health and demographic indicators,” says WWP’s Gregory Park, “but it also adds something extra. So predictions from Twitter can actually be more accurate than using a set of traditional variables.”

Both groups recognized that their findings could pose challenges when it comes to user privacy. “Knowledge of people’s personalities can…be used to manipulate and influence them,” cautioned the Facebook research team. “People might distrust or reject digital technologies after realizing that their government, internet provider, web browser, online social network, or search engine can infer their personal characteristics. We hope that…policy-makers will tackle those challenges by supporting privacy-protecting laws and technologies, and giving the users full control over their digital footprints,” they said.

Similar research caught the attention of the Electronic Frontier Foundation (EFF), a well-known digital-privacy advocacy group, back in March 2013. The study, conducted at the University of Cambridge, found that using Facebook “likes,” researchers could predict a staggering breadth of personal information, including “sexual orientation, ethnic origin, political views, religion, personality, intelligence, satisfaction with life, substance use, whether an individual’s parents stayed together until the individual was 21 years old,” as well as “basic demographic attributes such as age, gender, relationship status, and size and density of the friendship network.”

The team, whose analysis focused on the potential marketing and privacy implications of their research, explained that, although “people may choose not to reveal certain pieces of information about their lives, such as their sexual orientation or age…this information might be predicted in a statistical sense from other aspects of their lives that they do reveal.”

Adi Kamdar of the Electronic Frontier Foundation (EFF). Credit: Adi Kamdar/Flickr

Adi Kamdar, an EFF activist specializing in issues of consumer privacy, alerted users to this new threat to their digital privacy in an article entitled “You Won’t Like What Your Facebook ‘Likes’ Reveal.”

I decided to approach Adi Kamdar and find out what advice he had for Futurism readers.

Rowan Green: “Do you think this kind of personal data extraction is likely to be misused, now or in the future?”

Adi Kamdar: “While there exist some safeguards in US law around the misuse of data for certain purposes (credit, insurance, employment), there has also been an attempt to grab more private social network information for these uses. When it comes to public information, consumer protection arguments often run against free speech arguments — the data is publicly accessible, therefore companies can do with it what they want.”

RG: “So what can our readers do to protect themselves from this risk?”

AK: “Social network users can be careful about what they put online and whether they want to tie their name to it. The beauty of anonymity and pseudonymity is that it affords you at least some level of protection. But when it comes to networks like Facebook, where you are all but forced to use your real name, you should be careful about divulging personal information. You should also understand that seemingly innocuous actions (liking certain pages, for example) create a fairly unique, detailed picture about yourself.”

Adi and co-author Dave Maass gave this advice to readers in their article on the subject:

We suggest you practice good Facebook hygiene and go through your “Likes” right now to make sure you still actually like those things. After all, if liking “Harley Davidson” implies a low level of intelligence, you may want to keep your love of motorcycles hush-hush. Conversely, if you’re smart (or want to come across as smart), you might just go out and like “Curly Fries” or “Morgan Freeman’s Voice.”

Despite the challenges this new research poses for privacy and data protection advocates, it also offers undoubted benefits to groups like the World Well-Being Project that want to see the data put to good use. “Ultimately, we hope that our insights and analyses will help individuals, organizations, and governments choose actions and policies that are not just in the best economic interest of the people or companies, but which truly improve their well-being,” their website proclaims.

Fans of Isaac Asimov’s Foundation series may see parallels between the findings and Asimov’s “Psychohistory,” a fictional model which proposes that “the laws of statistics as applied to large groups of people could predict the general flow of future events” (Psychohistory, Wikipedia).

Do you think we could ever use our digital history to predict the future? Are you concerned about who could be using your social network data, or what they’re doing with it?