Here at the World Well-Being Project, we have run many crowdsourcing experiments over the past years. Often, we are interested not only in what the workers annotate, but also in the workers themselves. For example, in a short paper we showed that females are better and more confident than males at guessing gender from tweets, especially when it comes to guessing females.
Similarly, many surveys are conducted over a non-random selection of participants. This is a problem in exit polls, for example, where a non-random subset of voters agrees to share their voting preference, so pollsters need to apply corrections a posteriori. Moreover, in online studies where users are anonymous, not all will agree to disclose self-identifying or personal information. Using data from our studies on the Amazon Mechanical Turk crowdsourcing platform, we aimed to uncover which users are more likely to voluntarily disclose their identity.
Continue reading What traits enable crowd-workers to voluntarily disclose their identity?
Owning a pet is very popular in the U.S. The most recent survey from the American Pet Products Association estimated that 65% of American households (79.9 million) include at least one pet. The most popular household pets are, unsurprisingly, dogs (44% of households) and cats (34.9% of households).
Psychologists have long been interested in whether individual differences drive pet ownership and preference. Most have focused on comparing so-called cat people and dog people. Perhaps driven by the natures of their respective favorite pets, cat people are stereotyped as quiet, sensitive, and unorthodox, while dog people are thought of as gregarious and energetic. One of the most comprehensive studies to date analysed 4,565 participants who took the Big Five Personality Inventory and self-identified as dog people, cat people, both, or neither. It found that dog people are higher in extraversion, agreeableness, and conscientiousness and lower in neuroticism and openness, even when controlling for gender differences. Contrary to these findings, other studies failed to uncover differences between the two types, or suggested that the labels do little more than offer a different way of saying masculine and feminine.
To shed new light on this debate, we analysed two different online behaviors using big data from social media:
- Mentioning animal names in Facebook posts;
- Using a profile picture featuring a cat or a dog on Twitter.
Continue reading Personality Profiles of ‘Cat’ and ‘Dog People’ in Social Media
Automatically highlighting the central words of the candidates’ debate rhetoric.
As the primary election season moves on to other important contests in the U.S., we continue our data-driven analysis of the presidential candidates. Previously, we looked at the most distinctive words used by voters of each candidate and at the distinctive words of each candidate in debate speeches. This time, we were interested in the core concepts of each candidate’s rhetoric. To uncover these, we used a different algorithm that highlights the most ‘central’ words and phrases of each candidate: those which appear over and over to bridge the distinct themes of each candidate (see Technical Section for more details).
Continue reading Do the presidential candidates have a plan or highlight problems?
The 2016 election has been a strange and surprising one, and the rise of two highly publicized candidates symbolizes this unexpectedness: Donald Trump and Bernie Sanders. Few experts would have predicted that a reality TV star and an avowed democratic socialist would inspire strong bases of support in a U.S. presidential election.
The two candidates are not mirror-images in most ways. Donald Trump is currently leading in national polls over his Republican rivals, while Sanders is trailing Clinton. Sanders has been a senator for a decade, while Trump has never held any political office. Trump’s fame and extreme personality make it practically unfair to compare him to any other human being. But both candidates share a narrative of serving as a potential spoiler to their respective party establishments, and pundits are frequently nonplussed by the success of each.
Continue reading Insights to the 2016 Election
Moral Foundations Theory
Although almost everyone agrees that some things are morally good and some things are morally bad, the specific form of these beliefs can differ throughout the population. What is egregious to one person (harming marginalized communities, banning sugary soft drinks, refusing to go to church, etc.) can be considered completely trivial, or even be endorsed, by someone else.
The Moral Foundations Theory [1,2,3] was developed to model and explain these differences. Under this theory, there is a finite number of basic moral values that people can intuitively support, but not necessarily to the same extent across the population. The five moral foundations are:
Care/Harm: The valuation of compassion, kindness, and warmth, and the derogation of malice, deliberate injury, and inflicting suffering.
Fairness/Cheating: The endorsement of equality, reciprocity, egalitarianism, and universal rights.
Loyalty/Betrayal: Valuing patriotism and special treatment for one's own ingroup.
Authority/Subversion: The valuation of extant or traditional hierarchies or social structures and leadership roles.
Sanctity/Degradation: Disapproval of dirtiness, unholiness, and impurity.
Under this theory, a person who strongly endorses the value of ‘Care/Harm’ will be appalled at an action that causes suffering, while someone who endorses ‘Authority’ will support an action that supports the social hierarchy. These responses would be immediate, emotional, and intuitive.
Continue reading Moral Foundations in Partisan News Sources
One of the most hyped applications of big data analysis to social media is sentiment analysis (a.k.a. opinion mining). Sentiment analysis is the area of Natural Language Processing that aims to identify and extract subjective information from text. This generally includes identifying if a piece of text is subjective or objective, what sentiment (a.k.a. valence) it expresses (positive or negative), what emotion it conveys and towards which entity or aspect of the text. Companies and marketers are mostly interested in automatically inferring public opinion about products, movies or actions.
Besides voicing attitudes towards other objects, people also express their own emotions online. We decided to analyze this less popular facet: learning about the emotions of the people who post subjective messages. In this post I’ll present variations in the sentiment and intensity of Facebook posts and how these vary with the attributes of the people who post them. I will investigate a number of user traits such as gender, age, and personality.
Continue reading Sentiment, intensity and user attributes
We study social media with the assumption that people reveal “who they are” when they post to Facebook, Twitter or Instagram: that men write more like men than women do, that extraverts look extraverted, depressed people depressed, and happy people happy. But do people present their “true selves” on Facebook and Twitter?
Of course not. Twelve-year-olds pretend to be thirteen; otherwise they are kicked off. And people don’t always share their embarrassing medical conditions or their illicit drug use, although they do share both surprisingly often.
Of course people want to look good. It is claimed that on dating sites like OKCupid, people on average inflate their height by two inches and their income by 20%. And, of course, they pick attractive (and sometimes out-of-date) photos of themselves. Narcissists, on average, post more photos of themselves to Facebook and edit them more often than the rest of us.
In fact, it’s not clear if people ever present their true selves, or even have true selves to present. You don’t need to be a psychologist to know that simple questions like “Are you a racist?”, “Do you think I’m fat?”, or even “How old are you?” do not always elicit honest answers. Sociologist Erving Goffman, in his book The Presentation of Self in Everyday Life, famously observed that we are always acting: when a waiter comes out of the door from the kitchen into the dining room, he puts on a persona for the diners, but when he goes back into the kitchen he doesn’t become ‘his true self’; he just shows a different persona for the kitchen staff. We all behave differently with our friends than with our colleagues or parents. We’re always presenting ourselves on some stage.
Continue reading Presentation of Self in Social Media
Many Americans find astrology quite convincing. In fact, approximately 25% of Americans believe in astrology, 55% of 18 to 24 year olds think astrology is at least “sort of scientific”, and a Huffington Post article on the Zodiac signs of world leaders, just released, has already accrued thousands of Facebook likes. My colleague Daniel Preotiuc-Pietro recently examined stereotypical words that accompany these beliefs by analyzing the content of tweets containing astrological sign hashtags. For example, my star sign, #leo, was most distinguished by words like “loyal”, “dynamic”, “stubborn”, “generous”, and “affectionate”. However, do Leos actually differ in this way? Do people differ by their star sign at all?
I started investigating this question back in 2012 while working on what eventually became our first PLOS ONE paper: Personality, Gender, and Age in the Language of Social Media. My collaborators and I were thinking of applications for differential language analysis (DLA), our method which finds language features (e.g. words or phrases) that distinguish psychological and other human attributes. The suggestion was made that DLA could be used for psychological construct validation: does the language emerging from DLA fit the theory of the construct, and how many language features, in total, emerge as significantly correlated? For example, do Leos use words indicating they are more stubborn or generous? How many words correlate with being a Leo?
Astrological (Zodiac) signs are a great way to prototype DLA for construct validation. Such signs seem to correspond with enduring traits that distinguish people. To believers, such signs are akin to a non-evidence-based Big 5 Personality Model, the most widely used model in Psychology. The Universal Psychic Guild represents this position:
The signs of the Zodiac can give us great insights into our day to day living as well as the many talents and special qualities we possess.
Daniel showed that differences clearly exist in descriptions of star signs. Here, we investigate whether differences clearly exist between people according to their star signs.
Continue reading Differential Language Analysis for Construct Validation: Do People Differ by Astrological Sign?
In experiments on word usage in Twitter, I’ve constantly noticed some very coherent groups of hashtags and words: those belonging to astrology. Apparently, many users post horoscope information, statements, or comments and tag them using the name of the zodiac sign. So I wondered (since I have pretty much tried to ignore astrology all my life) which traits people most distinctively use to describe each sign.
#Taurus is extremely kind and sweet..until you betray them; then death is better.
To uncover this, I planned to use a combination of Twitter data and one of my favourite statistical measures – Pointwise Mutual Information (PMI) [1,2].
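As a rough illustration of the measure, PMI compares how often a word and a sign hashtag appear together with how often they would co-occur if they were independent: PMI(sign, word) = log2( p(sign, word) / (p(sign) p(word)) ). The sketch below is not the code used for the post; the function name `pmi_scores`, the toy tweets, and the choice to count each word once per tweet are all assumptions made for the example.

```python
import math
from collections import Counter


def pmi_scores(tagged_tweets):
    """Return {(sign, word): PMI} from (sign, tokens) pairs.

    Assumption for this sketch: probabilities are estimated at the tweet
    level, so a word is counted at most once per tweet.
    """
    n = len(tagged_tweets)
    sign_counts = Counter()
    word_counts = Counter()
    pair_counts = Counter()
    for sign, tokens in tagged_tweets:
        sign_counts[sign] += 1
        for word in set(tokens):
            word_counts[word] += 1
            pair_counts[(sign, word)] += 1

    scores = {}
    for (sign, word), joint in pair_counts.items():
        # log2( p(sign, word) / (p(sign) * p(word)) ), with the counts
        # rearranged so that only one division is needed.
        ratio = joint * n / (sign_counts[sign] * word_counts[word])
        scores[(sign, word)] = math.log2(ratio)
    return scores


# Hypothetical toy data: each tweet is (sign hashtag, list of tokens).
toy_tweets = [
    ("#leo", ["loyal", "and", "stubborn"]),
    ("#leo", ["loyal", "friend"]),
    ("#taurus", ["kind", "and", "sweet"]),
    ("#taurus", ["kind", "friend"]),
]
scores = pmi_scores(toy_tweets)
```

In this toy data, words that appear equally often with both signs ("and", "friend") score 0, while words that appear only with one sign score above 0. A word seen just once can score as high as a word seen many times, so in practice a minimum frequency threshold is usually applied before ranking words by PMI.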
Continue reading Zodiac sign stereotypes in Twitter
What do our Facebook posts really say about us? Some dismiss them as just noise, but several research teams are seriously considering social media as a source of psychological data. A common goal of this work is to discover faster or cheaper ways to measure important but elusive variables, like personality, health, and happiness. At the World Well-Being Project, we focus on turning the language from social media into useful new measures.
For example, in a study published last year in PLOS ONE, we searched for traces of age, gender, and personality in a massive amount of social media language: 20 million status updates from 75,000 Facebook users. We found that users’ personality traits could be accurately predicted using only the words in their Facebook status updates. This is consistent with several recent studies [1-6] suggesting that statistical algorithms are surprisingly good at profiling our personalities, especially when they are fed psychologically rich information like the structure of our Facebook social network or our Facebook likes.
Does this mean that algorithms will replace personality questionnaires?
Continue reading Assessing the assessment: Measuring personality with Facebook status messages