A geographical, statistical, and orthographic study of all the fucks on twitter.
wait, why?
-
there are a lot of fucks on twitter
-
in fact, 2.3% of all tweets on twitter contain the word fuck
-
yes, really
-
...
-
a study of twitter is a study of its users
-
coincidentally, our head of state uses twitter
"Every time I speak of the haters and losers I do so with great love and affection. They cannot help the fact that they were born fucked up!"
Donald J. Trump (@realDonaldTrump), September 29, 2014
Methodology
-
Open up the twitter FIREHOSE (60000 tweets/hr)
-
Set a US bounding box and require location
-
Collect data for two weeks (~10^7 tweets)
-
Filter to remove matches in mentions (@'s) and links
https://github.com/thoppe/twitterf_cks
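The bounding-box and filtering steps above could be sketched in Python roughly as follows. This is a minimal illustration, not the code from the repository: the box coordinates, the two regexes, and the helper names `in_bbox` and `strip_mentions_and_links` are all assumptions made for the example.

```python
import re

# Assumed continental-US bounding box (west, south, east, north) in degrees.
# Illustrative values only; the slides do not give the box that was used.
US_BBOX = (-125.0, 24.0, -66.0, 50.0)

# Hypothetical patterns for @-mentions and links.
MENTION = re.compile(r"@\w+")
LINK = re.compile(r"https?://\S+")

def in_bbox(lon, lat, bbox=US_BBOX):
    """True if a (longitude, latitude) point falls inside the box."""
    west, south, east, north = bbox
    return west <= lon <= east and south <= lat <= north

def strip_mentions_and_links(text):
    """Blank out links first, then mentions, so matches inside them are ignored."""
    return MENTION.sub(" ", LINK.sub(" ", text))

cleaned = strip_mentions_and_links("@fuckyeah look at https://example.com/fuck")
print("fuck" in cleaned.lower())
```

Links are removed before mentions so that a URL containing an `@` is dropped whole rather than split into pieces.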
Geographical analysis, city level
Geographical analysis, state level
Least fucks given
State   total     # of fucks per 1000 tweets
=====================================================
MT      9976      10.4
AR      36957     11.2
DC      94142     11.7
NE      42636     13.6
MO      108180    13.7
Most fucks given
State   total     # of fucks per 1000 tweets
=====================================================
ND      7699      23.4
LA      216023    23.4
AZ      173604    24.8
NV      127481    25.9
CA      1377434   26.7
WY      5357      27.6
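The per-1000 rate is simple arithmetic, sketched below. This assumes the "total" column is the number of tweets collected for the state, which the slides do not state explicitly; the helper names are hypothetical.

```python
def per_thousand(matches, total):
    """Matching tweets per 1000 tweets."""
    return 1000.0 * matches / total

def implied_matches(total, rate_per_thousand):
    """Approximate number of matching tweets implied by a table row."""
    return round(total * rate_per_thousand / 1000.0)

# Rows taken from the tables above, read as (total tweets, rate per 1000).
print(implied_matches(9976, 10.4))      # MT
print(implied_matches(1377434, 26.7))   # CA
```

Under that reading, the small states rest on very few matching tweets (on the order of a hundred for MT), so their rates are noisier than the rates for large states such as CA.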
Orthography
Examine variations in the spelling, e.g. fuuuck, fvck, f$$$uck, ...
Fucking regex
"(f+)([aoyvu%s]+)(c+k+)" % "!@#$%^*+-"
Spelling Variants
word          count
==================================
fuck          248091
f*ck          1031    <---------- self-censorship?
fuckkkk       419
fuckkk        390
fvck          387
fuckk         383
fuckkkkk      337
fuuuuck       263     <---------- peak repeat?
fack          242
fuuuuuck      227
fuuuck        206
fuckkkkkk     157
fuuuuuuck     130
fock          125
fvckk         79
fuuuuuuuck    72
Number of repeated vowels peaks at 4
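The regex and the repeat tally can be sketched in Python as below. The pattern is the one from the slide; everything else is an assumption made for illustration: case-insensitive matching, the helper names `find_variants` and `repeat_histogram`, and the choice to tally only the u-only spellings listed in the table.

```python
import re
from collections import Counter

# Pattern from the slide: f's, then vowel-like letters or symbols, then c's and k's.
SYMBOLS = "!@#$%^*+-"
PATTERN = re.compile("(f+)([aoyvu%s]+)(c+k+)" % SYMBOLS, re.IGNORECASE)  # IGNORECASE is an assumption

def find_variants(text):
    """Return every lowercased match of the pattern in a tweet."""
    return [m.group(0).lower() for m in PATTERN.finditer(text)]

def repeat_histogram(word_counts):
    """Sum counts by the length of the middle (vowel/symbol) group."""
    hist = Counter()
    for word, count in word_counts.items():
        match = PATTERN.fullmatch(word)
        if match:
            hist[len(match.group(2))] += count
    return hist

# Counts of the u-only spellings from the table, keyed by the number of u's.
U_COUNTS = {1: 248091, 3: 206, 4: 263, 5: 227, 6: 130, 7: 72}
words = {"f" + "u" * n + "ck": c for n, c in U_COUNTS.items()}

hist = repeat_histogram(words)
elongated = {n: c for n, c in hist.items() if n > 1}  # ignore the plain spelling
peak = max(elongated, key=elongated.get)

print(find_variants("well fuuuck, what the f*ck"))  # ['fuuuck', 'f*ck']
print(peak)                                         # 4
```

The table shows no two-u row, so the peak here is taken only over the rows that are listed.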
Statistics
curse      fraction of curse words
==================================
shit       0.333857
fuck       0.331616
bitch      0.125879
damn       0.088843
dick       0.027923
piss       0.026053
pussy      0.017165
crap       0.015854
asshole    0.011103
cock       0.010084
douche     0.002595
bastard    0.002477
slut       0.002449
fag        0.002342
darn       0.001761
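A table like this can be produced by counting each word and dividing by the total count across the whole list. The sketch below assumes plain case-insensitive substring counting, which the slides do not specify; `curse_fractions` and the sample tweets are hypothetical.

```python
from collections import Counter

# The 15 words from the table above.
CURSES = ["shit", "fuck", "bitch", "damn", "dick", "piss", "pussy", "crap",
          "asshole", "cock", "douche", "bastard", "slut", "fag", "darn"]

def curse_fractions(tweets, curses=CURSES):
    """Fraction of all curse-word occurrences that each word accounts for."""
    counts = Counter()
    for text in tweets:
        lowered = text.lower()
        for word in curses:
            # Substring counting (an assumption): "fucking" counts as "fuck".
            counts[word] += lowered.count(word)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {word: counts[word] / total for word in curses if counts[word]}

sample = ["shit shit", "well FUCK", "damn it"]  # made-up tweets
print(curse_fractions(sample))
```

Substring counting also picks up unrelated words that happen to contain a listed word, so a real pipeline would probably want stricter tokenization.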
Curse word collocation
Sentiment analysis
Used VADER (Valence Aware Dictionary and sEntiment Reasoner), a sentiment analysis tool tuned for social media. Cat examples:
-
I hate cats.. just evil little fuckers (-0.9137)
-
I just want to go to fucking sleep these stupid ass cats are fighting right outside my window (-0.8542)
-
I just got a cat fucking drunk and he's abusive (-0.7841)
-
Let your cat be a fucking cat. (0.0)
-
honestly scaring cats is fucking hilarious (0.4754)
-
I FUCKING LOVE MY CATS SO MUCH LOOK AT THIS BEAUTIFUL GUY I SWEAR WHAT A SMART LOYAL LOVING ANIMAL GIFTED TO ME (0.9577)
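Scores like the ones in parentheses can be obtained from the `vaderSentiment` package, whose `polarity_scores` method returns a dict that includes a `compound` value between -1 and 1. The sketch below assumes that package is installed and that the numbers shown are compound scores; `make_vader_scorer` and `rank_by_sentiment` are hypothetical helper names, and exact scores may differ between VADER versions.

```python
def make_vader_scorer():
    """Build a text -> compound score function (needs: pip install vaderSentiment)."""
    from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer
    analyzer = SentimentIntensityAnalyzer()
    return lambda text: analyzer.polarity_scores(text)["compound"]

def rank_by_sentiment(tweets, scorer):
    """Return (score, tweet) pairs sorted from most negative to most positive."""
    return sorted((scorer(tweet), tweet) for tweet in tweets)

# Usage, once vaderSentiment is installed:
#   scorer = make_vader_scorer()
#   for score, tweet in rank_by_sentiment(cat_tweets, scorer):
#       print(tweet, score)
```

Passing the scorer in as an argument keeps the ranking step independent of VADER, so another sentiment tool could be swapped in for comparison.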