Tag Archives: NLP

Personality Profiles of ‘Cat’ and ‘Dog People’ in Social Media

Owning a pet is very popular in the U.S. The most recent survey from the American Pet Products Association estimated that 65% of American households (79.9 million) include at least one pet. The most popular household pets are, unsurprisingly, dogs (44% of households) and cats (34.9% of households) [1].

Psychologists have long been fascinated by whether individual differences drive pet ownership and preference. Most have focused on comparing so-called cat people and dog people. Perhaps driven by the natures of their respective favorite pets, cat people are stereotyped as quiet, sensitive, and unorthodox, while dog people are thought of as gregarious and energetic. One of the most comprehensive studies to date [5] analysed 4,565 participants who took the Big Five Personality Inventory and self-identified as dog people, cat people, both or neither. It found that dog people are higher in extraversion, agreeableness and conscientiousness and lower in neuroticism and openness, even when controlling for gender differences. In contrast, other studies failed to uncover differences between the two types [9] or suggested that the labels do little more than offer a different way of saying masculine and feminine [11].

To shed new light on this debate, we analysed two different online behaviors using big data from social media:

  1. Mentioning animal names in Facebook posts;
  2. Using a profile picture featuring a cat or a dog on Twitter.

Do the presidential candidates have a plan or highlight problems?

Automatically highlighting the central words of the candidates’ debate rhetoric.

As the primary election season moves on to other important contests in the U.S., we continue our data-driven analysis of the presidential candidates. Previously, we looked at the most distinctive words used by voters of each candidate and at the distinctive words of each candidate in debate speeches. This time, we were interested in the core concepts of each candidate’s rhetoric. To uncover these, we used a different algorithm that highlights the most ‘central’ words and phrases of each candidate: those that appear over and over and bridge the candidate’s distinct themes (see Technical Section for more details).

Moral Foundations in Partisan News Sources

Moral Foundation Theory

Although almost everyone agrees that some things are morally good and some things are morally bad, the specific form of these beliefs can differ throughout the population. What is egregious to one person (harming marginalized communities, banning sugary soft drinks, refusing to go to church, etc.) can be considered completely trivial or even be endorsed by someone else.

Moral Foundations Theory [1,2,3] was developed to model and explain these differences. Under this theory, there is a finite number of basic moral values that people can intuitively support, though not necessarily to the same extent across the population. The five moral foundations are:

  1. Care/Harm:
    The valuation of compassion, kindness, and warmth, and the derogation of malice, deliberate injury, and inflicting suffering.
  2. Fairness/Cheating:
    The endorsement of equality, reciprocity, egalitarianism, and universal rights.
  3. Ingroup loyalty/Betrayal:
    Valuing patriotism and special treatment for one’s own ingroup.
  4. Authority/Subversion:
    The valuation of extant or traditional hierarchies or social structures and leadership roles.
  5. Purity/Degradation:
    Disapproval of dirtiness, unholiness, and impurity.

Under this theory, a person who strongly endorses the value of ‘Care/Harm’ will be appalled at an action that causes suffering, while someone who endorses ‘Authority’ will support an action that supports the social hierarchy. These responses would be immediate, emotional, and intuitive.

Sentiment, intensity and user attributes

One of the most hyped applications of big data analysis to social media is sentiment analysis (a.k.a. opinion mining). Sentiment analysis is the area of Natural Language Processing that aims to identify and extract subjective information from text. This generally includes identifying if a piece of text is subjective or objective, what sentiment (a.k.a. valence) it expresses (positive or negative), what emotion it conveys and towards which entity or aspect of the text. Companies and marketers are mostly interested in automatically inferring public opinion about products, movies or actions.

Beyond expressing attitudes towards other objects, people also express their own emotions online. We decided to analyze this less popular facet: learning about the emotions of people posting subjective messages. In this post I’ll present variations in the sentiment and intensity of Facebook posts and how these vary with the attributes of the people who post them. I will investigate a number of user traits such as gender, age and personality.

Zodiac sign stereotypes in Twitter

In experiments on word usage in Twitter, I’ve constantly noticed some very coherent groups of hashtags and words: those belonging to astrology. Apparently, many users post horoscope information, statements or comments and tag them using the name of the zodiac sign. So, I wondered (since I have pretty much ignored astrology all my life) what the most distinctive traits are that people use to describe each sign.

#Taurus is extremely kind and sweet..until you betray them; then death is better.

To uncover this, I planned to use a combination of Twitter data and one of my favourite statistical measures – Pointwise Mutual Information (PMI) [1,2].
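
As a rough illustration of the measure, PMI compares how often a word appears together with a sign against what would be expected if the two were independent: PMI(w, s) = log2( p(w, s) / (p(w) p(s)) ). The sketch below is a minimal Python version that assumes probabilities are estimated from tweet-level co-occurrence counts; the function name `pmi_scores`, the `min_count` parameter and the toy tweets are hypothetical and are not taken from the original analysis.

```python
import math
from collections import Counter


def pmi_scores(tagged_tweets, min_count=1):
    """Return {(sign, word): PMI} from (sign, words) pairs.

    Assumption: each tweet carries exactly one zodiac tag, and a word is
    counted at most once per tweet (presence, not frequency).
    """
    sign_counts = Counter()   # tweets tagged with each sign
    word_counts = Counter()   # tweets containing each word
    pair_counts = Counter()   # tweets with both the sign and the word
    total = 0

    for sign, words in tagged_tweets:
        total += 1
        sign_counts[sign] += 1
        for word in set(words):
            word_counts[word] += 1
            pair_counts[(sign, word)] += 1

    scores = {}
    for (sign, word), joint in pair_counts.items():
        if joint < min_count:
            continue  # rare pairs give unreliable PMI estimates
        # log2( p(w, s) / (p(w) * p(s)) ), with the 1/total factors simplified
        scores[(sign, word)] = math.log2(
            joint * total / (sign_counts[sign] * word_counts[word])
        )
    return scores


# Toy data, invented for illustration only.
tweets = [
    ("taurus", ["kind", "sweet"]),
    ("taurus", ["kind", "stubborn"]),
    ("gemini", ["talkative", "kind"]),
    ("gemini", ["talkative"]),
]
scores = pmi_scores(tweets)
for (sign, word), value in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"#{sign:<8} {word:<10} {value:+.2f}")
```

Words with a high positive PMI are much more common alongside a sign than chance would suggest, while values near zero or below mark words that are not specific to it. PMI is known to favour rare words, which is why a sketch like this usually needs a minimum count threshold on real data.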

Assessing the assessment: Measuring personality with Facebook status messages

What do our Facebook posts really say about us? Some dismiss them as just noise, but several research teams are seriously considering social media as a source of psychological data. A common goal of this work is to discover faster or cheaper ways to measure important but elusive variables, like personality, health, and happiness. At the World Well-Being Project, we focus on turning the language from social media into useful new measures.

For example, in a study published last year in PLoS ONE, we searched for traces of age, gender, and personality in a massive amount of social media language: 20 million status updates from 75,000 Facebook users. We found that users’ personality traits could be accurately predicted using only the words in their Facebook status updates. This is consistent with several recent studies [1-6] suggesting that statistical algorithms are surprisingly good at profiling our personalities, especially when they are fed psychologically rich information like the structure of our Facebook social network or our Facebook likes.

Does this mean that algorithms will replace personality questionnaires?