As Twitter adoption grows, so too does spam. It comes in many shapes and sizes, and we’ve all experienced it on Twitter. Watching the stream at the 140 Twitter Conference was an interesting case study. A tweet from Mashable entered the stream, followed by a massive influx of retweets that took over the #140tc hashtag. I was sitting next to Chris Pirillo at the time and he raised a great question: “I wonder how much of that is spam or bots?”
Let’s take a look at the data…
The velocity of tweets in the first hour was dramatic. It’s clear how this story was able to take over the #140tc stream.
Following/Followers Ratio. One way to identify “questionable” accounts is a high friends-to-followers ratio, meaning the account follows far more people than follow it back. Among the people retweeting the Mashable post on #140tc, 11% had a ratio above 1.5 and 4% had a ratio above 3.
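As a rough illustration of the ratio check, here is a minimal Python sketch. The field names (`friends`, `followers`) and the sample accounts are hypothetical and are not the data from the conference; the 1.5 and 3 thresholds are the ones quoted above. Treating an account with zero followers as an infinite ratio is an assumption of this sketch.

```python
def following_ratio(friends, followers):
    """Return friends / followers for one account.

    Assumption: an account that follows people but has no followers
    is treated as an infinite ratio; an empty account counts as 0.
    """
    if followers == 0:
        return float("inf") if friends > 0 else 0.0
    return friends / followers


def share_above(accounts, threshold):
    """Fraction of accounts whose ratio is strictly above the threshold."""
    if not accounts:
        return 0.0
    flagged = sum(
        1
        for a in accounts
        if following_ratio(a["friends"], a["followers"]) > threshold
    )
    return flagged / len(accounts)


# Hypothetical sample, for illustration only.
retweeters = [
    {"friends": 100, "followers": 200},  # ratio 0.5
    {"friends": 400, "followers": 200},  # ratio 2.0
    {"friends": 900, "followers": 200},  # ratio 4.5
    {"friends": 50, "followers": 0},     # treated as infinite
]

print(share_above(retweeters, 1.5))  # share with ratio > 1.5
print(share_above(retweeters, 3))    # share with ratio > 3
```

A high ratio only flags an account for a closer look; plenty of legitimate new users follow more people than follow them.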
Blank Profiles. A user with little or no profile data is another potential sign of a spammer or bot. In this sample, 12% had a blank bio and 14% had a blank location, while 9% had both fields blank.
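The blank-profile check can be sketched the same way. This is a minimal Python example under stated assumptions: the `bio` and `location` keys and the sample accounts are hypothetical, and a field counts as blank when it is missing, `None`, or only whitespace.

```python
def is_blank(value):
    """True when a profile field is missing, None, or only whitespace."""
    return value is None or value.strip() == ""


def blank_profile_shares(accounts):
    """Return the fraction of accounts with a blank bio, a blank
    location, and both fields blank."""
    total = len(accounts)
    if total == 0:
        return {"bio": 0.0, "location": 0.0, "both": 0.0}
    blank_bio = [is_blank(a.get("bio")) for a in accounts]
    blank_loc = [is_blank(a.get("location")) for a in accounts]
    both = [b and l for b, l in zip(blank_bio, blank_loc)]
    return {
        "bio": sum(blank_bio) / total,
        "location": sum(blank_loc) / total,
        "both": sum(both) / total,
    }


# Hypothetical sample, for illustration only.
sample = [
    {"bio": "Tech blogger", "location": "Seattle"},
    {"bio": "", "location": "Austin"},
    {"bio": "Coffee fan", "location": "   "},
    {"bio": None, "location": ""},
]

print(blank_profile_shares(sample))
```

Accounts with both fields blank are also counted in each single-field share, so the three numbers do not add up to a total.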
So What?
Neither of these measures is a perfect way to identify spam; both provide only a directional analysis. The spammer and bot issues on Twitter suggest that looking at who is responding can be important, particularly when content gets picked up by big publishers like Mashable.




