
Project Gaydar: Identifying Gay Students Using Data Mining Techniques Applied To Facebook Profiles

By Carolyn Y. Johnson
Globe Staff / September 20, 2009

At MIT, an experiment identifies which students are gay, raising new questions about online privacy

It started as a simple term project for an MIT class on ethics and law on the electronic frontier.

Two students partnered up to take on the latest Internet fad: the online social networks that were exploding into the mainstream. With people signing up in droves to reconnect with classmates and old crushes from high school, and even becoming online “friends” with their family members, the two wondered what the online masses were unknowingly telling the world about themselves. The pair weren’t interested in the embarrassing photos or overripe profiles that attract so much consternation from parents and potential employers. Instead, they wondered whether the basic currency of interactions on a social network - the simple act of “friending” someone online - might reveal something a person might rather keep hidden.

Using data from the social network Facebook, they made a striking discovery: just by looking at a person’s online friends, they could predict whether the person was gay. They did this with a software program that looked at the gender and sexuality of a person’s friends and, using statistical analysis, made a prediction. The two students had no way of checking all of their predictions, but based on their own knowledge outside the Facebook world, their computer program appeared quite accurate for men, they said. People may be effectively “outing” themselves just by the virtual company they keep.

“When they first did it, it was absolutely striking - we said, ‘Oh my God - you can actually put some computation behind that,’ ” said Hal Abelson, a computer science professor at MIT who co-taught the course. “That pulls the rug out from a whole policy and technology perspective that the point is to give you control over your information - because you don’t have control over your information.”

The work has not been published in a scientific journal, but it provides a provocative warning note about privacy. Discussions of privacy often focus on how to best keep things secret, whether it is making sure online financial transactions are secure from intruders, or telling people to think twice before opening their lives too widely on blogs or online profiles. But this work shows that people may reveal information about themselves in another way, and without knowing they are making it public. Who we are can be revealed by, and even defined by, who our friends are: if all your friends are over 45, you’re probably not a teenager; if they all belong to a particular religion, it’s a decent bet that you do, too. The ability to connect with other people who have something in common is part of the power of social networks, but also a possible pitfall. If our friends reveal who we are, that challenges a conception of privacy built on the notion that there are things we tell, and things we don’t.
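The age example above can be made concrete with a small sketch. This is not the students' program, whose details the article does not give; it is a hypothetical majority-vote guess over a toy friend list, and every name in it (`infer_attribute`, `friends_of`, `known_attr`, `min_friends`) is an assumption made for illustration.

```python
from collections import Counter


def infer_attribute(person, friends_of, known_attr, min_friends=3):
    """Guess a hidden attribute from the most common value among friends.

    Returns (guess, share), where share is the fraction of disclosing
    friends who hold the guessed value. Returns (None, 0.0) when fewer
    than min_friends friends disclose a value.
    """
    values = [known_attr[f] for f in friends_of.get(person, ()) if f in known_attr]
    if len(values) < min_friends:
        return None, 0.0
    guess, count = Counter(values).most_common(1)[0]
    return guess, count / len(values)


# Hypothetical toy data: "dana" lists no age bracket, but her friends do.
friends_of = {"dana": ["ann", "bob", "cy", "eve"]}
known_attr = {"ann": "45+", "bob": "45+", "cy": "45+", "eve": "18-24"}

guess, share = infer_attribute("dana", friends_of, known_attr)
print(guess, share)  # prints: 45+ 0.75
```

Even this crude rule shows the article's point: "dana" disclosed nothing herself, yet her friends' public profiles are enough to produce a guess. A real analysis would need labeled data to check how often such guesses are right.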



Comment by Jozo Kovac on February 23, 2010 at 2:06am
gravy - thumbs up! :)
the best predictor: person 'in relationship/engaged/married' with xxx where xxx.gender = person.gender
this can also be used as a control group: you can start testing your hypotheses and evaluate the results.
you can predict who is straight, married, divorced too ...
if privacy is important to you, simply don't post your data to the web!
Comment by Tejamoy Ghosh on February 23, 2010 at 1:27am
I agree with Ralph on "predict whether the person was gay ..., but having no way of checking all of their predictions is pretty off the wall". Having said that, if the results were verifiable with data/statistically, unlike the current scenario, I would find the idea interesting, if we look at it from the wider perspective of predicting behavior, preferences etc. using one's online public profile (and definitely not intruding into one's sexual orientation/preference). This could then be extended to targeted marketing too.
Comment by Gravy Jones on October 5, 2009 at 9:16am
My wife doesn't need statistical analysis to tell a gay man from a straight man. She can look at a Facebook profile and peg them. Honestly, this type of work is interesting, funny, and incendiary. I would hope that the authors only use it as a distraction to learn something new and then focus the "fruits" of their labor on another, more interesting study....
Comment by arup guha on October 5, 2009 at 12:19am
very interesting. wonder how much can be known about one just by looking at one's network.
Comment by Ralph Winters on October 4, 2009 at 8:07pm
I don't find this article interesting at all, and I find it totally offensive. This is what the students at MIT are coming up with? I doubt that the results of this article would ever be even close to being published in a scientific journal. Making a "striking discovery" that they could predict whether the person was gay ..., but having no way of checking all of their predictions is pretty off the wall to begin with, and let's not encourage this kind of "research".

-Ralph Winters
Comment by Arijit on October 4, 2009 at 12:56am
Hey Vincent,

This article is really interesting. Thanks for sharing it!
