I've been studying Trump's tweets for over two years, and I've read every tweet ever posted by @realDonaldTrump. There are nearly 50,000.
But this post is not about Trump's bizarre Twitter activity. Just as interesting as Trump's tweets are the millions of users who tweet replies to @realDonaldTrump. How many of those replies come from #MAGA trolls, and how many from "triggered libs"?
To begin, I randomly sampled 1,000 replies to @realDonaldTrump, posted between 11/26/19 and 5/8/20, and set out on the arduous journey of hand-labeling them “0” for anti-Trump and “1” for pro-Trump. In the end, exactly 700 (70%) were anti-Trump and 300 (30%) were pro-Trump. These numbers weren't particularly surprising, as previous reports have indicated that Twitter users skew liberal.
And just like that, I had a workable training set. In R (Isaac, computer science? Who would've thought?), I trained a regression model to automatically classify a response tweet, based on the tweet’s words and associations between words, as 0 or 1. On a test set of 200 tweets (where I compared the model's predictions to a hand-labeled key), the model is ~91% accurate. Not bad!
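For anyone curious what "classify a tweet from its words" looks like in code, here is a minimal sketch. It is not the model from this post: that one was built in R and also used word embeddings, while this is plain Python, assumes the "regression model" is a logistic regression over word counts, and trains on four made-up tweets (`toy_texts`, `toy_labels`) that are purely illustrative. The function names and the `epochs`, `lr`, and `l2` settings are likewise placeholders.

```python
import math
from collections import Counter

def tokenize(text):
    """Lowercase and split on whitespace; a deliberately crude tokenizer."""
    return text.lower().split()

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(texts, labels, epochs=200, lr=0.5, l2=0.01):
    """Fit a bag-of-words logistic regression by batch gradient descent.

    l2 is a ridge-style penalty on the word weights (an assumed choice).
    """
    counts = [Counter(tokenize(t)) for t in texts]
    vocab = sorted({w for c in counts for w in c})
    weights = {w: 0.0 for w in vocab}
    bias = 0.0
    n = len(texts)
    for _ in range(epochs):
        grad_w = {w: 0.0 for w in vocab}
        grad_b = 0.0
        for c, y in zip(counts, labels):
            z = bias + sum(weights[w] * k for w, k in c.items())
            err = sigmoid(z) - y          # gradient of the log-loss w.r.t. z
            grad_b += err
            for w, k in c.items():
                grad_w[w] += err * k
        bias -= lr * grad_b / n
        for w in vocab:
            weights[w] -= lr * (grad_w[w] / n + l2 * weights[w])
    return weights, bias

def predict_proba(text, weights, bias):
    """Probability that a tweet is pro-Trump (label 1); unseen words count as 0."""
    c = Counter(tokenize(text))
    return sigmoid(bias + sum(weights.get(w, 0.0) * k for w, k in c.items()))

# Made-up training tweets, for illustration only (1 = pro-Trump, 0 = anti-Trump).
toy_texts = [
    "thank you sir we love you",
    "you are a liar",
    "god bless the usa",
    "worst president ever liar",
]
toy_labels = [1, 0, 1, 0]

weights, bias = train_logistic(toy_texts, toy_labels)
print(round(predict_proba("thank you sir", weights, bias), 3))
print(round(predict_proba("liar", weights, bias), 3))
```

With only four training tweets the probabilities mean nothing beyond showing the mechanics: each word gets a weight, and the weighted sum is squashed into a probability between 0 and 1.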
*Okay, the initial model was only around 75% accurate -- barely more accurate than a "null" model (equivalent to always guessing "anti-Trump," which is true 70% of the time). After a lot of time fiddling with the model's regularization parameters, up-sampling, under-sampling, and training word "embeddings" (matrices of associations between words) from a larger sample of millions of response tweets...I arrived at something workable!
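To make the "null" model and up-sampling concrete, here is a small Python sketch (again, not the R code behind this post). It rebuilds the 700/300 label split from the hand-labeled sample, computes the accuracy of always guessing the majority class, and balances the classes by resampling the minority with replacement. The function names and the integer `rows` standing in for tweets are hypothetical.

```python
import random
from collections import Counter

def null_accuracy(labels):
    """Accuracy of always predicting the most common label."""
    counts = Counter(labels)
    return counts.most_common(1)[0][1] / len(labels)

def upsample_minority(rows, labels, seed=0):
    """Resample minority-class rows with replacement until classes are balanced."""
    rng = random.Random(seed)
    counts = Counter(labels)
    majority, n_major = counts.most_common(1)[0]
    out_rows, out_labels = list(rows), list(labels)
    for label, n in counts.items():
        if label == majority:
            continue
        pool = [r for r, y in zip(rows, labels) if y == label]
        extra = [rng.choice(pool) for _ in range(n_major - n)]
        out_rows.extend(extra)
        out_labels.extend([label] * len(extra))
    return out_rows, out_labels

# 700 anti-Trump (0) and 300 pro-Trump (1) labels, as in the hand-labeled sample.
labels = [0] * 700 + [1] * 300
rows = list(range(len(labels)))  # placeholder ids standing in for tweets

print(null_accuracy(labels))  # 0.7

balanced_rows, balanced_labels = upsample_minority(rows, labels)
print(Counter(balanced_labels))
```

One caution that applies to any setup like this: up-sample only the training split. Duplicating rows before carving out a test set lets copies of the same tweet land on both sides and inflates the accuracy.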
Here are some examples of tweets automatically classified by the model:
[Table of example tweets not preserved. Columns: Text of response tweet | Probability_ProTrump | Pro_Trump?]
Here’s an example of one tweet that is incorrectly classified:
[Example tweet not preserved.]
(It should be labeled 1, as it’s pro-Trump, but as prob(proTrump) < 0.5, it’s classified as anti-Trump.)
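The cutoff rule in that parenthetical fits in a few lines of Python. This is a sketch: the function names and the probabilities below are invented for illustration, and sending a probability of exactly 0.5 to "pro-Trump" is an assumption, since the post only says what happens below 0.5.

```python
def classify(prob_pro_trump, threshold=0.5):
    """Return 1 (pro-Trump) at or above the threshold, else 0 (anti-Trump)."""
    return 1 if prob_pro_trump >= threshold else 0

def misclassified(probs, true_labels, threshold=0.5):
    """Indices where the thresholded prediction disagrees with the hand label."""
    return [
        i for i, (p, y) in enumerate(zip(probs, true_labels))
        if classify(p, threshold) != y
    ]

# Hypothetical model outputs and hand labels, for illustration only.
probs = [0.91, 0.12, 0.43]
true_labels = [1, 0, 1]

print([classify(p) for p in probs])       # [1, 0, 0]
print(misclassified(probs, true_labels))  # [2]
```

The third made-up tweet mirrors the miss described above: hand-labeled 1, but its probability falls under 0.5, so it gets classified as anti-Trump.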
I was pretty happy with having built a classification model, but of course any model is only as interesting as its applications. Here are some general findings from applying the model to a sample of 10,000 replies:
Most common words among pro-Trump replies:
They hate sleepy joe "biden," address Trump as "sir," thank "god" for Trump, and "love" the "usa." Who's surprised?
Most common words among anti-Trump replies:
They side with "@cnn" and "cuomo" over @realDonaldTrump, complain about Trump's handling of the "virus," call Trump a "liar," and draw comparisons between Trump and "obama."
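Word lists like these come from counting words separately within each predicted class. Here is a rough Python sketch of that idea. The replies, the stop-word list, and the punctuation stripping are all assumptions made for illustration, not the processing actually used on the 10,000 replies.

```python
from collections import Counter

# A tiny, hypothetical stop-word list; a real one would be much longer.
STOP_WORDS = {"the", "a", "is", "you", "for", "and", "to", "of", "we", "are"}

def top_words(texts, k=3):
    """Return the k most frequent non-stop-words across a list of texts."""
    counts = Counter()
    for text in texts:
        for word in text.lower().split():
            word = word.strip('.,!?"')
            if word and word not in STOP_WORDS:
                counts[word] += 1
    return [word for word, _ in counts.most_common(k)]

# Made-up (text, predicted label) pairs, for illustration only.
replies = [
    ("Thank you sir!", 1),
    ("God bless you sir", 1),
    ("You are a liar", 0),
    ("Liar, liar.", 0),
]

pro = [text for text, label in replies if label == 1]
anti = [text for text, label in replies if label == 0]

print(top_words(pro, k=1))   # ['sir']
print(top_words(anti, k=1))  # ['liar']
```

Raw counts favor words that are common everywhere, which is why the stop-word filter matters. Comparing each word's frequency between the two classes would be another reasonable way to surface distinctive words.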
That's all for this post. While the model and top words alone don't reveal any particularly novel information, there's much more to come. I need to lie down and rest my head after reading well over 1,000 angry political tweets -- more than a few of which introduced me to conspiracy theories I hadn't heard of before. If anyone has a lot of time on their hands, go digging through #QAnon tweets to remind yourself that, no matter how much quarantine has plagued your mind, there are people far crazier than you out there.
Isaac
Incredible work. I truly enjoy reading about our glorious leader. All hail the best president in the history of the world and may he rule over us forever!! #keepamericagreat
Amazing work!