Social media information can predict a wide range of personality traits and attributes


Principal Investigator Haruno Masahiko and Dr. Mori Kazuma at the Center for Information and Neural Networks (CiNet), the National Institute of Information and Communications Technology, report the use of machine learning to analyze behavior on Twitter and predict a wide range of personality traits and attributes, such as intelligence and extraversion. Specifically, the study uses component-wise gradient boosting to demonstrate that network features (such as the number of Tweets and the number of likes) are predictive of social personality traits (e.g., extraversion), while word usage on Twitter is predictive of mental health traits (e.g., anxiety). This approach may provide a new route to mental health diagnostics and personalized nudges.

The study was published online in the Journal of Personality on Thursday, August 20, 2020.

Social networking services (SNS) have quickly become universal tools for communication. Previous research has shown that information about Facebook and Twitter use can reveal basic, coarse personality traits based on the Big 5. However, which types of SNS information can be used to pinpoint specific personality traits and attributes is unknown. There is growing interest in which personality traits and attributes can be predicted by analyzing SNS information and in how accurately that information reflects the user.

The study by Dr. Mori and Principal Investigator Haruno found that a wide range of personality traits and attributes can be predicted by analyzing four different types of user behavior on Twitter (i.e., network features, time, word statistics, and word usage).
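
For readers unfamiliar with component-wise gradient boosting, the idea can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' code: the squared-error loss, the one-feature linear base learners, the function names, and the synthetic data are all assumptions made for the example. In each round, every feature is fitted on its own to the current residuals, and only the single best feature receives a small update, so uninformative features tend to keep coefficients near zero.

```python
import random

def componentwise_l2_boost(X, y, n_rounds=300, nu=0.1):
    """Component-wise boosting with squared-error loss (illustrative).

    X is a list of rows, y a list of targets. Each round fits every
    single feature to the current residuals and updates only the best one.
    Returns (intercept, coefs) for the original, uncentered features.
    """
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    # Center each column so the one-feature learners need no intercept.
    cols = [[row[j] - means[j] for row in X] for j in range(p)]
    sq_norms = [sum(v * v for v in col) for col in cols]
    y_mean = sum(y) / n
    resid = [yi - y_mean for yi in y]
    coefs = [0.0] * p
    for _ in range(n_rounds):
        best_j, best_beta, best_gain = None, 0.0, 0.0
        for j in range(p):
            if sq_norms[j] == 0.0:
                continue  # a constant feature carries no signal
            dot = sum(c * r for c, r in zip(cols[j], resid))
            gain = dot * dot / sq_norms[j]  # squared-error drop for a full step
            if gain > best_gain:
                best_j, best_beta, best_gain = j, dot / sq_norms[j], gain
        if best_j is None:
            break  # no feature reduces the residuals any further
        step = nu * best_beta  # shrink the update by the learning rate
        coefs[best_j] += step
        resid = [r - step * c for r, c in zip(resid, cols[best_j])]
    intercept = y_mean - sum(b * m for b, m in zip(coefs, means))
    return intercept, coefs

def predict(intercept, coefs, row):
    """Predict one score from a row of feature values."""
    return intercept + sum(b * v for b, v in zip(coefs, row))

# Synthetic demo: five hypothetical features, only two of which matter.
rng = random.Random(0)
X = [[rng.gauss(0, 1) for _ in range(5)] for _ in range(200)]
y = [2.0 * row[0] - 1.0 * row[2] + rng.gauss(0, 0.1) for row in X]
intercept, coefs = componentwise_l2_boost(X, y)
print([round(b, 2) for b in coefs])
```

In this toy setting the two informative features should end up with by far the largest coefficients. Stopping after fewer rounds would leave more coefficients at exactly zero, which is one reason this family of methods is often used when there are many candidate features and few participants.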

A statistical analysis found significant correlations between the measured and predicted personality and attribute scores, with correlation coefficients of around 0.25. A correlation of this size is not sufficient for determining an individual’s personality traits precisely, but with a large enough population sample, this technology can provide informative results.

The study collected social media information from 239 participants (156 men, 83 women; average age 22.4 years) who also took personality tests that measured 24 personality traits and attributes (52 subscales). Twitter information reliably predicted 23 of the 52 subscales. Figure 2A shows a positive correlation (correlation coefficient = 0.44) between the measured and predicted Big 5 extraversion scores, based on a 10-fold cross-validation procedure repeated 10 times (Bonferroni-corrected p value of 0.05/52).
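
The evaluation described above (10-fold cross-validation repeated 10 times, with a Bonferroni threshold of 0.05/52) can be illustrated with a short Python sketch. It is a simplified stand-in rather than the study's pipeline: the single-feature linear model, the synthetic data, the helper names, and the choice to average the correlation over repeats are assumptions made for the example.

```python
import math
import random

def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)

def fit_line(xs, ys):
    """Least-squares slope and intercept for one predictor (stand-in model)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx

def repeated_kfold_r(xs, ys, k=10, repeats=10, seed=0):
    """Mean correlation between measured and out-of-fold predicted scores."""
    rng = random.Random(seed)
    n = len(xs)
    rs = []
    for _ in range(repeats):
        idx = list(range(n))
        rng.shuffle(idx)  # a fresh random split for every repeat
        preds = [0.0] * n
        for fold in range(k):
            test = idx[fold::k]
            held_out = set(test)
            train = [i for i in idx if i not in held_out]
            slope, icpt = fit_line([xs[i] for i in train],
                                   [ys[i] for i in train])
            for i in test:
                # Each score is predicted by a model that never saw it.
                preds[i] = slope * xs[i] + icpt
        rs.append(pearson_r(ys, preds))
    return sum(rs) / len(rs)

# Bonferroni correction: divide the significance level by the number of tests.
ALPHA = 0.05
N_SUBSCALES = 52
bonferroni_threshold = ALPHA / N_SUBSCALES

# Synthetic demo with 239 hypothetical participants and one feature.
rng = random.Random(1)
feature = [rng.gauss(0, 1) for _ in range(239)]
score = [0.5 * x + rng.gauss(0, 1) for x in feature]
print(round(repeated_kfold_r(feature, score), 2))
print(bonferroni_threshold)
```

Because every prediction comes from a model trained without that participant, the resulting correlation estimates how well the model generalizes rather than how well it fits the data it was trained on.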

The analysis revealed that several social personality traits, such as extraversion, empathy, and autism, could be predicted from network features (Figure 2B). Other traits and attributes, such as socioeconomic status, smoking/drinking, and even depression or schizophrenia, were predictable from the language usage features (Figure 2C and D). Time features were less predictive of the measured personality scores overall, but they did show a significant correlation with intelligence and social value orientation.

We are expanding the analysis to thousands of subjects. The method described in this study could be used for mental health diagnostics and for personalized nudges that act on people’s behaviors. It will also give insight into the neural mechanisms underlying individual differences in personality traits.