BlogIndiana 2012 Day One Twitter Data Analytics

#BIN2012 Day1 on Topsy

Today, I attended the first day of a two-day blogging conference here in Indianapolis called BlogIndiana, which brings together bloggers, writers, marketers, PR specialists, and social media and SEO professionals. It reminded me of the ScienceOnline conference but smaller, focused primarily on the Midwest, and held in a traditional conference format (that is, not in unconference format).

I captured (almost) all the tweets from the conference today, so I thought I would dig through the data and provide an analytical overview of BlogIndiana 2012 #BIN2012 Day 1. If you’re attending the BlogIndiana conference and you like the analytics below, please let me know tomorrow! It would be great to meet you and learn more about what you do.

Update: I also analyzed Twitter data from day two of the conference.

First, some statistics

The tweets were collected on Thursday, August 9th, from 9:19 am, just after the first session keynote started, until 5:50 pm that afternoon. All spam tweets have been filtered out. In total, there were 2,082 tweets consisting of 31,784 words from 342 people. The top tweeter was Jenn Lisak (Twitter: @jlisak) with 101 tweets! Here are the top 10 Twitter users and their number of tweets:

Rank  Twitter user      Number of tweets
1     @jlisak           101
2     @edeckers         59
3     @choosingchange   53
4     @kmullett         50
5     @robbyslaughter   50
6     @amyl_bishop      45
7     @lediamedia       44
8     @wjjessen         41
9     @awelfle          40
10    @stmkent          40

Most conference attendees using Twitter actually tweeted very little: 240 people tweeted just 1-4 times. Thirty-seven people tweeted 5-9 times and 42 people tweeted 10-19 times. Each of the other groups with 20 or more tweets consisted of fewer than 10 people.
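For anyone who wants to reproduce this kind of breakdown, here is a minimal sketch of how tweets per user could be counted and grouped into ranges. The sample handles are made up, and collapsing everything above 19 tweets into a single "20+" bucket is my simplification for illustration, not necessarily how the chart was built.

```python
from collections import Counter

# Hypothetical sample: one author handle per captured tweet.
authors = ["@a"] * 3 + ["@b"] * 7 + ["@c"] * 12 + ["@d"] * 25 + ["@e"]


def bucket_label(n):
    """Map a user's tweet count to one of the ranges used in the post."""
    if n <= 4:
        return "1-4"
    if n <= 9:
        return "5-9"
    if n <= 19:
        return "10-19"
    return "20+"  # assumption: all higher ranges collapsed into one bucket


def users_per_bucket(authors):
    """Count how many users fall into each tweet-count range."""
    tweets_per_user = Counter(authors)
    return Counter(bucket_label(n) for n in tweets_per_user.values())


buckets = users_per_bucket(authors)
top_users = Counter(authors).most_common(3)  # ranked list, like the top 10 table

for label in ("1-4", "5-9", "10-19", "20+"):
    print(label, buckets[label])
```

The same `Counter` gives both the ranked list of top tweeters and the per-range user counts, so one pass over the data covers both views.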

#BIN2012 Day1 Tweets per Number of Users

Let’s focus on retweets. Retweets are a way of saying “yeah, I like that” or “I agree!” and for the purposes of this analysis come in two forms: they can be a simple rebroadcast (e.g. RT: “original tweet”), which we’ll call a RT without conversation, or a retweet with comments back to the original tweeter (e.g. Agreed! RT: “original tweet”), which we’ll call a RT with conversation. Almost one-third (31%, or 645 of 2,082) of the #BIN2012 tweets today were retweets. Of those, 24% (158 of 645) were RTs with conversation. The top retweeter was Brooke Randolph (Twitter: @choosingchange) with 28 retweets.
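The two retweet forms described above can be told apart with a simple rule: find the "RT" marker and check whether any text comes before it. The sketch below assumes that rule; the sample tweets and the `@someone` handle are invented, and a real data set would likely need extra handling for other retweet styles.

```python
import re

# Assumption: a retweet is any tweet containing a standalone "RT" token.
RT_MARKER = re.compile(r"\bRT\b")


def classify(tweet):
    """Label a tweet as original, RT without conversation, or RT with conversation."""
    match = RT_MARKER.search(tweet)
    if match is None:
        return "original"
    comment = tweet[:match.start()].strip()  # text added before the RT marker
    return "rt_with_conversation" if comment else "rt_without_conversation"


def percent(part, whole):
    """Share of `part` in `whole`, rounded to a whole percentage."""
    return round(100 * part / whole)


# Hypothetical sample tweets.
sample = [
    "RT @someone: Write every day #BIN2012",
    "Agreed! RT @someone: Write every day #BIN2012",
    "Great keynote this morning #BIN2012",
    "Lunch time at #BIN2012",
]
labels = [classify(t) for t in sample]

# The shares reported in the post, recomputed from its raw counts.
print(percent(645, 2082))  # retweets as a share of all tweets
print(percent(158, 645))   # RTs with conversation as a share of retweets
```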

#BIN2012 Day1 Retweets


#BIN2012 Day 1 tag cloud of tweets

As I’ve written before, I’ve always had a thing for tag clouds. I find tag clouds a useful visual representation of metadata; if they’re designed right, tag clouds can also look really good.

Of the 31,784 words, 3,691 were unique. I calculated the frequency of all 3,691 words and then “cleaned” the data, removing numbers, words shorter than 4 characters, and words that were either common (such as “that”, “from”, “with” or “have”) or gibberish (consisting principally of URL strings from shared links). I also flattened out the top eight terms (bin2012, @allisonlcarter, @edeckers, @robbyslaughter, @ryanbrock, about, social and people), since their frequencies were so high that they distorted the tag cloud.
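As a rough illustration of the cleaning steps, here is a sketch that counts term frequencies while dropping URLs, numbers, short words and common words, and then caps the largest counts. The stopword set only contains the four examples named above, the sample tweets are invented, and capping at a fixed ceiling is just one guess at what "flattening" could look like.

```python
import re
from collections import Counter

# Only the four examples from the post; a real list would be much longer.
STOPWORDS = {"that", "from", "with", "have"}
URL_PATTERN = re.compile(r"https?://\S+")


def term_frequencies(tweets):
    """Count cleaned terms across all tweets."""
    counts = Counter()
    for tweet in tweets:
        text = URL_PATTERN.sub(" ", tweet.lower())  # drop shared links
        for token in text.split():
            token = token.strip(".,!?:;\"'()#")  # keep "@" so handles survive
            if len(token) < 4 or token.isdigit() or token in STOPWORDS:
                continue
            counts[token] += 1
    return counts


def flatten(counts, top_n, ceiling):
    """Cap the top_n most frequent terms at `ceiling` (hypothetical approach)."""
    flattened = Counter(counts)
    for term, freq in counts.most_common(top_n):
        flattened[term] = min(freq, ceiling)
    return flattened


# Hypothetical sample tweets.
tweets = [
    "Great session about social media at #BIN2012 http://t.co/abc123",
    "People think that blogging is easy. It is not! #BIN2012",
    "Great people, great talks from 2012 #BIN2012",
]
freqs = term_frequencies(tweets)
flat = flatten(freqs, top_n=2, ceiling=2)
print(freqs.most_common(3))
```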

I ran a quick analysis to see whether the terms used on Day 1 were positive or negative. I used the Subjectivity Lexicon from the Department of Computer Science at the University of Pittsburgh. The Lexicon contained 2,304 positive terms, 13.5% of which were included in at least one tweet today. The Lexicon also contained 4,152 negative terms (is it that much easier to be negative??), only 4.72% of which were included in at least one tweet today. Assuming that tweets were mutually exclusive with respect to being positive or negative, #BIN2012 tweets were 2.8 times more positive than negative.
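The comparison boils down to two coverage percentages and their ratio. Below is a minimal sketch of that calculation using tiny made-up word lists in place of the real lexicon; the function name and the sample terms are mine. For reference, dividing the rounded percentages above (13.5 / 4.72) gives roughly 2.86, so the small gap from the reported 2.8 may simply come from rounding.

```python
def lexicon_coverage(lexicon_terms, tweet_terms):
    """Percentage of lexicon terms that appear in at least one tweet."""
    lexicon = set(lexicon_terms)
    used = lexicon & set(tweet_terms)
    return 100 * len(used) / len(lexicon)


# Hypothetical stand-ins for the positive and negative lexicon lists.
positive = {"great", "love", "happy", "awesome"}
negative = {"hate", "awful", "boring", "terrible", "worst"}

# Hypothetical set of unique terms found in the day's tweets.
tweet_terms = {"great", "love", "boring", "session"}

pos_pct = lexicon_coverage(positive, tweet_terms)
neg_pct = lexicon_coverage(negative, tweet_terms)
ratio = pos_pct / neg_pct

print(f"positive: {pos_pct:.1f}%  negative: {neg_pct:.1f}%  ratio: {ratio:.1f}")
```

Because the two lexicon lists differ in size, comparing percentages rather than raw counts keeps the larger negative list from skewing the result.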

The top 200 terms were then imported into Wordle and a weighted tag cloud was generated. Feel free to download any of the files below and reshare.

#BIN2012 tag cloud

Here are the top 10 terms (prior to flattening) from the tag cloud above:

Rank  Term              Frequency
1     bin2012           2096
2     @allisonlcarter   245
3     @edeckers         214
4     @robbyslaughter   204
5     @ryanbrock        152
6     about             133
7     social            119
8     people            118
9     think             98
10    great             96

All data and images are available for download: low-resolution image, high-resolution image or raw data set of 200 terms with frequencies.

Walter Jessen is a digital strategist, writer, web developer and data scientist. You can typically find him behind the screen of something with an internet connection.

  • Leah

    Thanks for sharing! I was surprised that a majority of conference attendees didn’t tweet frequently. Live tweeting isn’t for everyone, though :)