"Researchers" released data on 70k OKCupid users without permission


25 replies to this topic

#1 Emily

Emily
  • Wonder Woman


  • 6508 posts



Posted 14 May 2016 - 11:19 AM

I was really avoiding making this topic because I don't even know where to start but here goes. As someone who has done research in the past and is now doing research as a master's student, this is an ethical nightmare. At the same time, I wanted to grab the popcorn and watch it unfold. Honestly, this excited me way more than it should have. This is my version of juicy gossip these days and that is sad. 

 

Without further ado...

 

I'll start off with a little intro from this blog post (which is actually a pretty good read so I would check it out if you want to read up more on this)

 

There is an excellent Tim Minchin song called If You Open Your Mind Too Much, Your Brain Will Fall Out. I'm sad to report that the same is also true of your data and your science.
 
At this point in the story I'd like to introduce you to Emil Kirkegaard, a self-described "polymath" at the University of Aarhus who has neatly managed to tie every single way to be irresponsible and unethical in academic publishing into a single research project. This is going to be a bit long, so here's a TL;DR: linguistics grad student with no identifiable background in sociology or social computing doxes 70,000 people so he can switch from publishing pseudoscientific racism to publishing pseudoscientific homophobia in the vanity journal that he runs.
 
Yeah, it's just as bad as it sounds.

 

So here are the facts:

- Danish "researchers" (if you can call them that) published a dataset containing information on 70,000 OKCupid users. 

- Information included things such as usernames, age, gender, sexual orientation, the kind of relationship the person is looking for and basically any question they may have answered on the website. 

 

What is the problem with this, you ask?

Well... they ignored any fucking ethical guideline put into place by:

    a) Not getting permission from OKCupid to take this information, at the very least.

    b) Not getting informed consent from the people whose information they were taking.

    c) Not anonymising any of this information so that the people included in it could not be identified. 

 

The blogger I mentioned above summarised this quite nicely: 

Having now spent some time exploring the data, and reading both public statements on the work and the associated paper: this is without a doubt one of the most grossly unprofessional, unethical and reprehensible data releases I have ever seen.
 
There are two reasons for that. The first is very simple; Kirkegaard never asked anyone. He didn't ask OKCupid, he didn't ask the users covered by the dataset - he simply said 'this is public so people should expect it's going to be released'.
 
This is bunkum. A fundamental underpinning of ethical and principled research - which is not just an ideal but a requirement in many nations and in many fields - is informed consent. The people you are studying or using as a source should know that you are doing so and why you are doing so.

 

The blogger then goes on to tear Kirkegaard a new one, which is quite entertaining. Really, worth the read. 

 

Can this get worse?

Yes. Why? Because Kirkegaard literally doesn't give a shit. 

Spoiler

 

But as of now, he's removed the data so at least there's that (even if it has already been downloaded hundreds if not thousands of times)

 

What's worse is that the paper itself is an actual joke but whatever. Juicy research gossip that I am enjoying way too much. 

 

Read up more here:

Motherboard article 

Vox article



#2 Mishelle

Mishelle
  • Bitch Of The Boards

  • 2245 posts



Posted 14 May 2016 - 11:24 AM

If they didn't get OkCupid's permission and they're publishing people's information that seems like something that could definitely lead to legal action. I don't even understand the point of them doing this? If the data was obtained and published unethically who in their right minds would even want to use it?



#3 Emily

Emily
  • Wonder Woman


  • 6508 posts



Posted 14 May 2016 - 11:28 AM

If they didn't get OkCupid's permission and they're publishing people's information that seems like something that could definitely lead to legal action. I don't even understand the point of them doing this? If the data was obtained and published unethically who in their right minds would even want to use it?

 

I don't think any rational researcher would ever use this information. Unfortunately, he's an absolute idiot and was doing this research on his own time, outside of university, for an open access journal that he is the editor of. 

 

He can kiss his future in research goodbye.



#4 Drakonid

Drakonid
  • 804 posts



Posted 14 May 2016 - 11:29 AM

How did he get all the information?
I know it's probably somewhere in those articles, but I don't know how to read.

#5 Bee

Bee
  • 1169 posts



Posted 14 May 2016 - 11:30 AM

That's an egregious breach of research etiquette. What on earth was he thinking?



#6 Emily

Emily
  • Wonder Woman


  • 6508 posts



Posted 14 May 2016 - 11:37 AM

How did he get all the information?
I know it's probably somewhere in those articles, but I don't know how to read.

 

He used a scraper. If I read correctly, it was just a bot posing as a real person that took all of the information from the matches it was getting. 


Ah, I just checked and they took the paper down for legal reasons (it was still up yesterday), which means he's most likely going to get sued. This is from the blog, now that I can't actually go look at the paper anymore:

 

His first research question was: what if gay men are basically just women? We have data on gender and we have data on sexuality; let's see if LGB people provide different answers from straight people for their given gender! Let's see if they're basically the opposite gender!

 
You'll be amazed to know he didn't end up incorporating this into the paper, presumably because the answer was "of course not, you bigot". But he did find time to evaluate whether religious people are just plain ol' idiots - the methodology for which is checking the user's response to various questions akin to IQ test entries. You know, the racist classist sexist thing.
 
As an aside, this kind of creepy superpositional homophobia is actually an improvement on much of the work I've found from Kirkegaard while digging into this, which is not superpositional at all: previous credits include such pseudoscience as arguing that letting low-IQ immigrants in will damage Danish society, and they should be subjected to brain implants and genetic engineering for entry, and (I wish this was a joke) checking whether immigrants commit more crime if they have tiny penises.
 
So... At least you can't really take him seriously in the first place. 


#7 Junjie

Junjie
  • Hi there!

  • 2267 posts



Posted 14 May 2016 - 11:41 AM

This is really incredibly dumb, especially when it's not even anonymised. He's just opening himself up to like, not just an OKCupid suit but also a collective suit from the users and perhaps any rights or interested group or something isn't he? :/
And all for something which is useless now, and might have been useless even if it was ethically usable...

#8 Frizzle

Frizzle
  • M'lord

  • 16889 posts



Posted 14 May 2016 - 02:46 PM

I very much doubt the US legal system has much power in Denmark.



#9 Emily

Emily
  • Wonder Woman


  • 6508 posts



Posted 14 May 2016 - 02:48 PM

I very much doubt the US legal system has much power in Denmark.


I'm sure most countries have ethical procedures in place for research like this :p

#10 Romy

Romy
  • Neocodex Elite Four Member


  • 4876 posts



Posted 14 May 2016 - 02:51 PM

Isn't this information publicly available? 

 

He utilized a bot in order to scrape public information from OKCupid. Despite not providing a measure of anonymity to the users whose information he used, I can't blame him for using readily available information.

 

That is unless I'm missing something? I'm not 100% acquainted with research methods or OKCupid.

 

Enlighten me @Emily?



#11 Coops

Coops
  • 🌧️🌩️🌧️


  • 4009 posts



Posted 14 May 2016 - 02:58 PM

Isn't this information publicly available? 

 

He utilized a bot in order to scrape public information from OKCupid. Despite not providing a measure of anonymity to the users whose information he used, I can't blame him for using readily available information.

 

That is unless I'm missing something? I'm not 100% acquainted with research methods or OKCupid.

 

Enlighten me @Emily?

Any research must always involve informed consent even if you don't think your research will affect participants in any manner. Collecting data and then using it to make any sort of statement is unethical without consent, regardless of whether or not the information is public. That doesn't even bring up the fact that he doxed 70k people with varying sexualities, sexual proclivities, etc -- all of which I am sure you can imagine can result in being fired from your job, stalked, harassed, etc.



#12 Frizzle

Frizzle
  • M'lord

  • 16889 posts



Posted 14 May 2016 - 02:59 PM

I'm sure most countries have ethical procedures in place for research like this :p


True, he's essentially blacklisted himself, but I'm sure he'll have no problems making a career out of this.

#13 Emily

Emily
  • Wonder Woman


  • 6508 posts



Posted 14 May 2016 - 03:00 PM

Isn't this information publicly available? 
 
He utilized a bot in order to scrape public information from OKCupid. Despite not providing a measure of anonymity to the users whose information he used, I can't blame him for using readily available information.
 
That is unless I'm missing something? I'm not 100% acquainted with research methods or OKCupid.
 
Enlighten me @Emily?


You have to follow certain ethical procedures when doing research and he didn't do that. I've been drinking so I can get into it more tomorrow.

#14 Frizzle

Frizzle
  • M'lord

  • 16889 posts



Posted 14 May 2016 - 03:02 PM

Isn't this information publicly available?

He utilized a bot in order to scrape public information from OKCupid. Despite not providing a measure of anonymity to the users whose information he used, I can't blame him for using readily available information.

That is unless I'm missing something? I'm not 100% acquainted with research methods or OKCupid.

Enlighten me @Emily?


If you were British I bet I could find your name, date of birth, financial details, house address, members of your family, car registration details, workplace employment and employer contact details whilst all being entirely legal.

Under UK law (sorry, I don't know much about US or Danish law), merely looking at this information isn't illegal. But the collection, storage and release of this information is illegal.

Plus there's the whole ethics/morality thing as well.

#15 Romy

Romy
  • Neocodex Elite Four Member


  • 4876 posts



Posted 14 May 2016 - 03:09 PM

If you were British I bet I could find your name, date of birth, financial details, house address, members of your family, car registration details, workplace employment and employer contact details whilst all being entirely legal.

Under UK law (sorry, I don't know much about US or Danish law), merely looking at this information isn't illegal. But the collection, storage and release of this information is illegal.

Plus there's the whole ethics/morality thing as well.

I think this gets to the heart of what my concerns are. 

 

 

Any research must always involve informed consent even if you don't think your research will affect participants in any manner. Collecting data and then using it to make any sort of statement is unethical without consent, regardless of whether or not the information is public. That doesn't even bring up the fact that he doxed 70k people with varying sexualities, sexual proclivities, etc -- all of which I am sure you can imagine can result in being fired from your job, stalked, harassed, etc.

Ehh....I feel like asking someone for "informed consent" when using something they posted on the dating equivalent of facebook is ridiculous. 
Always assume anything you put on the internet is the same as saying it in a crowded street corner.



#16 Coops

Coops
  • 🌧️🌩️🌧️


  • 4009 posts



Posted 14 May 2016 - 03:15 PM

I think this gets to the heart of what my concerns are. 

 

 

Ehh....I feel like asking someone for "informed consent" when using something they posted on the dating equivalent of facebook is ridiculous. 
Always assume anything you put on the internet is the same as saying it in a crowded street corner.

Yeah, I realize that you risk stuff when you put anything online. But that doesn't excuse the ethical duties of a researcher. As a researcher, regardless of field, you don't get to just go "oh well, they were too dumb to protect their information, too bad!" and then proceed to utilize the information for shoddy research, release the data and 'research' (which apparently promotes racist/homophobic bullshit and is probably not reliable/valid), and claim it's kosher. I don't know if it's legal in the US though. But it's a pretty shitty thing to do. 



#17 Emily

Emily
  • Wonder Woman


  • 6508 posts



Posted 14 May 2016 - 03:45 PM

I think this gets to the heart of what my concerns are. 
 
 

Ehh....I feel like asking someone for "informed consent" when using something they posted on the dating equivalent of facebook is ridiculous. 
Always assume anything you put on the internet is the same as saying it in a crowded street corner.


Yes, but as a researcher, if you're using human subjects, you have to do that. I'm using things from Facebook for my research and I HAVE to get informed consent. It doesn't matter if it's out there already. It's the procedure.

#18 WarezHaxor

WarezHaxor
  • 668 posts



Posted 14 May 2016 - 04:46 PM

It all really boils down to what Danish law on journalism and data privacy states, unfortunately. Morally and ethically, this is horribly wrong, but if Denmark protects its journalists at all costs or has nothing on the books about publicizing that information, there's nothing that can be done. Furthermore, I'm uncertain about Denmark's extradition laws, but if they're anything like Russia's then as long as the data wasn't taken from their own citizens they may not even care.

#19 Junjie

Junjie
  • Hi there!

  • 2267 posts



Posted 14 May 2016 - 05:17 PM

I think, regardless of the law, researchers can't use that. It's already been said of course, but universities, journals, research institutes etc don't just have procedures for these, but also like, ethics boards which can investigate and censure people for using information unethically obtained, for example, or other systems of accountability. And this (roughly) seems to be true around the world for professional non-shady organisations and researchers regardless of the different laws in different places.

And in any case, research can be pretty collaborative from what I know, so even if you don't get punitive measures handed to you, if you get a strong and deserved rep for unethical conduct people probably can't and won't research with you anyways. And that has consequences too even if you're like, I will just run my own "journal". I guess he can continue to do that forever but he won't have much headway beyond that if he keeps this shit up.

#20 Romy

Romy
  • Neocodex Elite Four Member


  • 4876 posts



Posted 14 May 2016 - 05:28 PM

Yeah, I realize that you risk stuff when you put anything online. But that doesn't excuse the ethical duties of a researcher. As a researcher, regardless of field, you don't get to just go "oh well, they were too dumb to protect their information, too bad!" and then proceed to utilize the information for shoddy research, release the data and 'research' (which apparently promotes racist/homophobic bullshit and is probably not reliable/valid), and claim it's kosher. I don't know if it's legal in the US though. But it's a pretty shitty thing to do. 

 

Yes, but as a researcher, if you're using human subjects, you have to do that. I'm using things from Facebook for my research and I HAVE to get informed consent. It doesn't matter if it's out there already. It's the procedure.

 

Fair enough. Researchers should be held to a higher standard as their body of work is more heavily scrutinized.



#21 WarezHaxor

WarezHaxor
  • 668 posts



Posted 14 May 2016 - 05:32 PM

Oh I totally agree that in every sense it's wrong and immoral. But from the sounds of this guy, we could equate him to the National Enquirer and the like with some of his ideas, so maybe he doesn't particularly care if what he's doing is wrong because he knows he's always going to be that crazy guy who researches his weird theories all alone.

My point with the law is if that's the kind of person he is, even if he found no one willing to do research like this without regard to the ramifications, as long as he knows he's not going to be put in jail and is fine being an isolated quack who thinks penis size directly correlates to criminal activity, what other journalists and people think won't bother him much. There's probably a handful of people who think like him...could be the next Manson family...

#22 Coops

Coops
  • 🌧️🌩️🌧️


  • 4009 posts



Posted 14 May 2016 - 05:36 PM

Fair enough. Researchers should be held to a higher standard as their body of work is more heavily scrutinized.

I think technology is opening up this realm of uncertainty for research, in the sense that now there is significantly more access to specific information, and there hasn't been enough time to really understand the ethical implications of something like this. I mean, hypothetically, imagine if say this guy outed gay people in a very anti-gay country like Russia? So, it's this weird grey area of well how much is it the responsibility of a person to protect their information versus the responsibility of the researcher. 

 

It's a fair question to ask why this is bad, or why we should be decrying this action.



#23 Romy

Romy
  • Neocodex Elite Four Member


  • 4876 posts



Posted 14 May 2016 - 05:49 PM

 I mean, hypothetically, imagine if say this guy outed gay people in a very anti-gay country like Russia? 

I actually hadn't even considered this. Fuck.



#24 Coops

Coops
  • 🌧️🌩️🌧️


  • 4009 posts



Posted 14 May 2016 - 06:17 PM

I actually hadn't even considered this. Fuck.

Yeah. Doxing is a seriously shitty thing to do, imo. Even places like the US aren't great about separating online/private life from work.

 

Anecdotal but a girl at my work got reprimanded formally for posting on FB about how she wasn't allowed to use the bathroom (childcare, where 2 certified providers had to be in the room at a time, so you only get to pee on breaks or if there is someone available to relieve you) at work. She didn't name names, she didn't post anything derogatory, just like "my work is terrible about letting me use the restroom in a timely manner". And of course, there are tons of incidents of teachers being reprimanded and suspended for shit on their online social profiles, which is irrelevant to their ability to teach.

 

Also digging the new avatar :p



#25 Emily

Emily
  • Wonder Woman


  • 6508 posts



Posted 14 May 2016 - 07:48 PM

What's sad is that he could be in a lot less trouble if he had at the very least anonymised the data so that users were not easily identifiable. Unfortunately, I don't think he really cared. The right to privacy doesn't just disappear because people are on certain social media networks - especially when it comes to research. 



