Analysis Of A Reality TV Show's Human Network

We live in an interconnected world. As humans, we connect with others all the time. We have strong networks: family, friends and that clique at work. We also have weak connections to others. You know, Tshepo from HR (you nod at him every day but don’t really know him), or Mike the petrol attendant, who always asks for your thoughts on that upcoming Orlando Pirates football game.

For most of our history as the human race, these physical connections were the norm. A lot has been written about them and about how much we, as social beings, gain from them.

Then we started to communicate across distances, increasing our reach. From cave paintings and smoke signals to writing and the telephone. This type of communication allowed us to keep up relationships and connections that would normally become weak or die altogether.

Then, in the 21st century, came the acceleration of online social networks. Now even those connections you wish would die can survive (those mates from primary school, who you played with like once at the playground, are now always commenting on your Facebook statuses). There is another side to this that can be more insightful. Social networks such as Twitter are open by default and don’t require reciprocal following: the people you follow don’t have to follow you back. This allows spontaneous discussion and the creation of temporary strong networks that center on some topic and die off once the topic's popularity plunges.

Understanding these types of networks may give us a 21st century understanding of how we congregate online. Why not, then, look at how we congregate around shared public experiences? Interactive reality TV has all the perfect ingredients for public discussion: some tension between cast members, public voting, and loyal fan-bases.

So step in the 2015 Big Brother Mzansi (#BBMzansi). BBMzansi was a more popular search trend than Mmusi Maimane (during his ascension to DA leadership).

BBMzansi Network

So what can we learn from the #BBMzansi network then?

Maybe we should first see what questions can be asked when we have data from people tweeting about #BBMzansi. To do this, I collected 559,781 tweets about #BBMzansi while the show was still on TV.

Keep in mind that these half a million tweets are just a sample that Twitter allows us to see, so there were way more than that. That means we have millions of pieces of 140-character content spontaneously created by Twitter users.

What a time to be alive!

More numbers: the most mentioned account in the whole network is naturally the show's account, @BBMzansi, but what does the top 5 look like?

Twitter User        Mentions
@BBMzansi           112,062
@lazbeinfamous      14,839
@katlego_moloko     8,999
@kay_nkgudi         6,905

As we can see here, the user @lazbeinfamous commanded a lot of attention amongst other users.
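A mention count like the one in the table above could be produced roughly as follows. This is only a sketch, not the code used for the article: the function name `count_mentions`, the regular expression and the sample tweets are all assumptions made for illustration.

```python
import re
from collections import Counter

# Assumed handle pattern: "@" followed by 1 to 15 letters, digits or underscores.
MENTION_RE = re.compile(r"@(\w{1,15})")

def count_mentions(tweets):
    """Count how often each account is mentioned across an iterable of tweet texts."""
    counts = Counter()
    for text in tweets:
        # Lower-case handles so "@BBMzansi" and "@bbmzansi" count as one account.
        counts.update(handle.lower() for handle in MENTION_RE.findall(text))
    return counts

# Hypothetical sample tweets, for illustration only.
sample_tweets = [
    "@BBMzansi what a night! #BBMzansi",
    "Loving the commentary from @user_a on @BBMzansi",
    "@user_a @user_b did you see that? #BBMzansi",
]

mention_counts = count_mentions(sample_tweets)
top_accounts = mention_counts.most_common(2)  # the two most mentioned accounts
```

On the real data set, the same idea would simply run over all the collected tweets instead of the three sample strings.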

We construct a directed network (graph) by treating users as nodes and a mention of one user by another as a link (edge) between those nodes. If we were to visualise a part of that network, let's say around @lazbeinfamous, it would look something like this.

Part of the #BBMzansi Reciprocal network
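As a rough sketch of how such a directed mention network could be built, assume the tweets have already been reduced to (author, mentioned user) pairs. The helper names `build_mention_graph` and `ego_edges` and the sample pairs are hypothetical. The number stored on each link is the mention count, which is what the thickness of a link represents in the picture.

```python
from collections import defaultdict

def build_mention_graph(mentions):
    """Build a weighted directed graph from (author, mentioned_user) pairs.

    graph[author][mentioned_user] holds how many times author mentioned that user.
    """
    graph = defaultdict(lambda: defaultdict(int))
    for author, mentioned in mentions:
        if author != mentioned:  # ignore self-mentions
            graph[author][mentioned] += 1
    return graph

def ego_edges(graph, user):
    """Return every weighted edge that starts or ends at `user`, sorted."""
    edges = []
    for source, targets in graph.items():
        for target, weight in targets.items():
            if user in (source, target):
                edges.append((source, target, weight))
    return sorted(edges)

# Hypothetical (author, mentioned) pairs, for illustration only.
sample_mentions = [
    ("fan_1", "star"),
    ("fan_1", "star"),
    ("star", "fan_1"),
    ("fan_2", "star"),
    ("fan_2", "fan_3"),
]

mention_graph = build_mention_graph(sample_mentions)
star_edges = ego_edges(mention_graph, "star")
```

Here `star_edges` holds only the links touching one user, which is the kind of partial view you would hand to a plotting tool.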

The thickness of a link indicates how many times a user mentioned another user: the thicker the link, the more messages shared between those users.

We can also create a network in which a link between two users exists only when they have mentioned each other. This is a reciprocal network: no link exists if only one of the two users has mentioned the other. With this type of network, we can count the subgroups within the large network. In graph theory, these are known as connected components, that is, groups in which every user is connected to every other user through a chain of reciprocal mentions, so each user has mentioned, and been mentioned by, at least one other user in the same group. For this #BBMzansi reciprocal network, the largest such group has 5,283 users. All the group sizes are shown in the next table.

Connected Component Size    Occurrences
2 users                     169 groups
3 users                     15 groups
4 users                     7 groups
5 users                     1 group
8 users                     1 group
5,283 users                 1 group
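The reciprocal network and the component-size table can be sketched in a few lines of plain Python. This is a minimal illustration under assumptions, not the original analysis code: the function names and the sample (author, mentioned user) pairs are made up, and a graph library would do the same job.

```python
from collections import Counter, defaultdict, deque

def reciprocal_network(mentions):
    """Keep an undirected link between two users only if each mentioned the other."""
    directed = {(a, b) for a, b in mentions if a != b}
    neighbours = defaultdict(set)
    for a, b in directed:
        if (b, a) in directed:
            neighbours[a].add(b)
            neighbours[b].add(a)
    return dict(neighbours)

def component_sizes(neighbours):
    """Return the size of every connected component, using breadth-first search."""
    seen = set()
    sizes = []
    for start in neighbours:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:
            user = queue.popleft()
            size += 1
            for other in neighbours[user]:
                if other not in seen:
                    seen.add(other)
                    queue.append(other)
        sizes.append(size)
    return sizes

# Hypothetical (author, mentioned) pairs, for illustration only.
sample_mentions = [
    ("a", "b"), ("b", "a"),
    ("b", "c"), ("c", "b"),
    ("d", "e"), ("e", "d"),
    ("f", "a"),  # never reciprocated, so "f" drops out of the network
]

network = reciprocal_network(sample_mentions)
size_histogram = Counter(component_sizes(network))  # component size -> number of groups
```

`size_histogram` has the same shape as the table above: each key is a group size and each value is how many groups of that size exist.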


So what valuable insights can we get from such a network?

Simply put, if you were in advertising, you would want to know who commands a lot of attention in such an ad-hoc network. That is, which people, not necessarily involved with the TV show, become drivers of discussion.

In this network, it was clear that user @lazbeinfamous was such a person.

A big omission from this network analysis is that it is made up of only Twitter data. There were other interactions on Facebook that may have been more directed and also controlled by different users, especially fan pages and groups.

The Twitter network, by contrast, would be defined as an ad-hoc group where fans found each other by using different hashtags, e.g. #k2blue, #ntombace and #mbalsea.

Can we predict who would be voted out of #BBMzansi each week?

I don't like the word predict, but we could see if we can find correlations between the popularity of teams on Twitter and the voting patterns of fans.

Coincidentally, I did track this during the show's run. Twitter popularity correlated very well with a team's chances of surviving the vote: the lower a team's popularity on Twitter, the lower its chances of getting real votes from fans, and the higher its chances of being voted out. The example below is the tracked popularity in the last week of the show. The voting patterns from the show actually tracked this graph. The winners of #BBMzansi were ntombace, second was k2Blue and third was mbalsea (the 17th was the finale).

BBMzansi Analytics
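One simple way to track this kind of popularity is to measure each team hashtag's share of the tweets in a given period. The sketch below assumes that approach. The function name `popularity_share`, the placeholder hashtags `#team_a`, `#team_b`, `#team_c` and the sample tweets are all hypothetical; in practice the real team hashtags such as #k2blue would be passed in.

```python
import re
from collections import Counter

HASHTAG_RE = re.compile(r"#\w+")

def popularity_share(tweets, hashtags):
    """Return each tracked hashtag's share of all tracked-hashtag mentions.

    A tweet counts at most once per hashtag, however often it repeats the tag.
    """
    tracked = [tag.lower() for tag in hashtags]
    counts = Counter()
    for text in tweets:
        found = set(HASHTAG_RE.findall(text.lower()))
        for tag in tracked:
            if tag in found:
                counts[tag] += 1
    total = sum(counts.values())
    return {tag: (counts[tag] / total if total else 0.0) for tag in tracked}

# Hypothetical tweets and placeholder team hashtags, for illustration only.
sample_tweets = [
    "Go #team_a!",
    "#Team_A all the way #BBMzansi",
    "#team_b for the win",
    "voting #team_c tonight",
    "#team_a and #team_b both did well",
]

shares = popularity_share(sample_tweets, ["#team_a", "#team_b", "#team_c"])
ranking = sorted(shares, key=shares.get, reverse=True)  # most to least popular
```

Computing `shares` per day and plotting the values over time would give a popularity curve for each team, which could then be compared with the weekly voting results.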

A topic I have not covered in this article is the concept of cascades, that is, how information spreads through the network. For example, when an event happens, like a BBMzansi housemate doing something "newsworthy", how does that information spread among the people who are talking about the show? How does the same information even reach people who were not part of that network?

By analysing such a cascade, we can also identify people who can increase the reach of the event. You can then target those individuals in order to spread some information. For example, if you were an edutainment show looking to educate more people about reckless behaviour, finding people who amplify cascades would be a great goal. As luck would have it, I did something like this a few years ago for the show Intersexions.
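As a very rough illustration of finding amplifiers, suppose each share of a piece of information is recorded as a (sharer, source) pair, for example a retweet and the account it was retweeted from. That data format, the function name `downstream_reach` and the sample pairs are assumptions; this is not the method used for Intersexions. Under those assumptions, a user's reach can be counted as the number of distinct users who received the information through them.

```python
from collections import defaultdict

def downstream_reach(shares):
    """For each source user, count the distinct users reached through them.

    `shares` is an iterable of (sharer, source) pairs, meaning `sharer`
    passed on something they received from `source`.
    """
    children = defaultdict(set)
    for sharer, source in shares:
        if sharer != source:
            children[source].add(sharer)

    def reach(user):
        # Depth-first walk over everyone downstream of `user`.
        seen = set()
        stack = [user]
        while stack:
            current = stack.pop()
            for child in children.get(current, ()):
                if child not in seen and child != user:
                    seen.add(child)
                    stack.append(child)
        return len(seen)

    return {user: reach(user) for user in list(children)}

# Hypothetical cascade, for illustration only: b and c shared from a,
# d shared from b, and e shared from d.
sample_shares = [("b", "a"), ("c", "a"), ("d", "b"), ("e", "d")]

reach_by_user = downstream_reach(sample_shares)
top_amplifier = max(reach_by_user, key=reach_by_user.get)
```

Users with a high downstream reach are the ones worth targeting when the goal is to spread a message further.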


A word cloud of words that were connected to the keywords BBMzansi and BB2TV.

Word Cloud


Vukosi Marivate

About Vukosi Marivate

Dr. Vukosi Marivate is a Data Hoarder. Writes in his personal capacity on interesting things we can learn from data. Works for CSIR, ex-intern at Google Inc. PhD, CompSci, Rutgers University (USA)