What traits enable crowd-workers to voluntarily disclose their identity?

Here at the World Well-Being Project, we have run many crowdsourcing experiments over the past years. Oftentimes, we are interested not only in what the workers annotate, but also in the workers themselves. For example, in a short paper [1] we showed that females are better and more confident than males at guessing gender from tweets, especially when it comes to guessing females.

Similarly, many surveys are done over a non-random selection of participants. This is a problem in exit polls, for example, where a non-random population agrees to share their voting preference, leading to the need for pollsters to perform corrections a posteriori. Moreover, in online studies where users are anonymous, not all will agree to disclose self-identifying or personal information. Using data from our studies on the Amazon Mechanical Turk crowdsourcing platform, we aimed to uncover which users are more likely to voluntarily disclose their identity.

We asked workers on our tasks to fill in a standard demographic questionnaire and a few psychological questionnaires before they were taken to their task. They were explicitly advised that they would receive a bonus after completing these questionnaires. At the end of the demographic questionnaire, we asked the workers to voluntarily give their Twitter handle to help with our study. In total, 32.1% of users (685 out of 2131) filled in the Twitter handle field, with 91.5% of these (a total of 627 users; analysis varies insignificantly between the two outcomes) being valid (profile existed and was public).

We were interested in the answer to the question:

Which users choose to voluntarily enter their Twitter handle (if they have one)?

In the chart below we show Pearson correlations between sharing a valid and public handle and various demographic and psychological traits (results are stable even if controlling for gender and age).

Pearson correlations between sharing a valid and public handle and various demographic and psychological traits.

Note that this outcome necessarily contains two parts: First that users have a Twitter account in the first place, and second that they are willing to share it to help us.
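To make the analysis above concrete, here is a minimal sketch of how a Pearson correlation between a binary "shared a valid handle" outcome and a trait score can be computed, and how it can be re-checked while controlling for age and gender. This is not our actual analysis code: the function names, the toy data and the choice of the recursive partial-correlation formula are all assumptions made for illustration.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

def partial_corr(x, y, controls):
    """Correlation of x and y after removing the linear effect of each
    control variable, using the recursive partial-correlation formula."""
    if not controls:
        return pearson(x, y)
    z, rest = controls[-1], controls[:-1]
    r_xy = partial_corr(x, y, rest)
    r_xz = partial_corr(x, z, rest)
    r_yz = partial_corr(y, z, rest)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Hypothetical toy data, NOT taken from the study.
shared = [1, 0, 1, 0, 1, 0, 0, 1]          # 1 = gave a valid, public handle
liberal = [6, 3, 5, 2, 7, 4, 3, 5]         # assumed 1-7 political leaning scale
age = [25, 40, 31, 52, 28, 36, 45, 30]
gender = [0, 1, 1, 0, 1, 0, 1, 0]          # arbitrary 0/1 coding

raw = pearson(shared, liberal)
controlled = partial_corr(shared, liberal, [age, gender])
print(f"raw r = {raw:.3f}, controlling for age and gender r = {controlled:.3f}")
```

With a binary outcome, the Pearson coefficient is the same quantity as the point-biserial correlation, which is why a plain Pearson formula is enough in this sketch.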

Essentially, two traits are identified to be related to our outcome (p<0.01) of voluntarily agreeing to share self-identifying information:

Politically liberal

It is well documented in previous literature on the demographics of the MTurk platform that the crowd leans quite heavily liberal compared to the US population [2], [3], [4]. Our results show that, in addition, liberals are also more likely to share this self-identifying information. This may simply be because liberals are more likely to have a Twitter account in the first place. However, liberals may also be more likely to share their handle because they feel more comfortable and at home with social media and online communication in general.

Lower perspective taking

Perspective-taking assesses the extent to which individuals spontaneously (try to) adopt others’ points of view. It is measured through 7 questions that are part of the larger Interpersonal Reactivity Index, which was designed to assess empathy, defined as “the reactions of one individual to the observed experiences of another” [5]. Sample items for Perspective Taking are:

  • I try to look at everybody’s side of a disagreement before I make a decision.
  • If I’m sure I’m right about something, I don’t waste much time listening to other people’s arguments. (reverse scored)

It is quite surprising that the lower you score on this trait, the more likely you are to disclose information about yourself. Strong perspective-takers may be less likely to have Twitter, since they may prefer less brief and limited forms of communication. Also, they may be cagey about spreading their personal communication out of an awareness that written words can powerfully imply their feelings and other private mental states. Intriguingly, higher perspective taking is significantly correlated with a liberal political leaning (r = 0.065, p < .01), which was itself associated with disclosing the Twitter handle.
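A correlation as small as r = 0.065 can still be significant with a sample of this size. The sketch below shows the standard t-test for a Pearson coefficient. It assumes the correlation was computed over all 2131 workers, which the text does not state explicitly, and it uses a normal approximation to the t distribution, which is reasonable for a sample this large. The function name is hypothetical.

```python
import math

def pearson_significance(r, n):
    """Return (t, two-sided p) for a Pearson correlation r over n samples.
    The p-value uses a normal approximation to the t distribution,
    which is only appropriate for large n."""
    t = r * math.sqrt((n - 2) / (1 - r ** 2))
    p = math.erfc(abs(t) / math.sqrt(2))
    return t, p

# n = 2131 is an assumption: the total number of workers reported in the post.
t_stat, p_value = pearson_significance(0.065, 2131)
print(f"t = {t_stat:.2f}, approximate two-sided p = {p_value:.4f}")
```

Under that assumption the approximate p-value falls below .01, in line with the reported significance level.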

To end, even if our study is limited to a single behavior, it is possible that these patterns will carry over to other cases of voluntarily sharing self-identifying or personal information.

This blogpost was written with the help of Jordan Carpenter.


  1. Flekova, Lucie and Preotiuc-Pietro, Daniel and Carpenter, Jordan and Giorgi, Salvatore and Ungar, Lyle – Analyzing Crowdsourced Assessment of User Traits through Twitter Posts, HCOMP 2015, Work-in-Progress track.

  2. Ipeirotis, Panos – Demographics of Mechanical Turk: Now Live.

  3. Huff, Connor and Tingley, Dustin (2015). Who are these people? Evaluating the demographic characteristics and political preferences of MTurk survey respondents, Research & Politics, 2 (3).

  4. Levay, Kevin E. and Freese, Jeremy and Druckman, James N. (2016). The Demographic and Political Composition of Mechanical Turk Samples, SAGE Open, 6(1).

  5. Davis, M. H. (1980). A Multidimensional Approach to Individual Differences in Empathy. JSAS Catalog of Selected Documents in Psychology, 10 (85).


About Daniel Preotiuc

Daniel is a Postdoctoral researcher at the University of Pennsylvania. His research is situated at the intersection of Natural Language Processing, Machine Learning and Social Science. His current interests include spatial and temporal learning models for text, user attribute prediction from text and Gaussian Processes, using large user-generated data coming from Social Media. Prior to joining UPenn, Daniel completed his PhD in Natural Language Processing and Machine Learning at the University of Sheffield, UK and was a researcher on the Trendminer EU FP7 project.
