Using D3 to Visualize Twitter Followers

Hi guys, this time I’m gonna post a tutorial on creating a visualization of the BPS Twitter followers using D3. What is the benefit of creating this visualization? Well, it gives us a better understanding of our own network. This tutorial is actually an adaptation of an assignment that I got during my studies at ODU. So, without further ado, let’s dive into the code. But before that, make sure that you have Python installed on your machine. All the code below is written in Python 3.

Now, you are all set. Ready? Go…

I. Crawling the Twitter Data

1. Install the Required Python Libraries

There are various Python libraries available for accessing the Twitter API. In this case, I will use Tweepy. Type pip freeze to check the installed Python libraries on your computer. If Tweepy is not there, install it using the command pip install tweepy. Don’t forget to also install the unicodecsv library for storing the crawling result in a csv file. Then, import all the libraries needed.

2. Create the Twitter API Object

It’s worth mentioning that Twitter has 2 types of API:

  1. REST API –> for reading and writing Twitter data
  2. Streaming API –> for monitoring tweets in real time

To be able to use Tweepy, we need the Tweepy API object. Creating this object requires a Twitter app’s key and token, which can be obtained from here. After logging in successfully, create a new app, find the key and token on the Keys and Access Tokens tab, and embed them in the code (replace the xxx with your own key and token). The main_user is the screen_name of the Twitter handle you want to explore. In this case, I use the BPS Twitter handle: bps_statistics.
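The setup described above might look roughly like the sketch below. It assumes the Tweepy 3.x style names tweepy.OAuthHandler and tweepy.API. The variable names for the credentials and the make_api helper are my own placeholders for illustration, not something Tweepy provides.

```python
# Placeholders: replace the xxx with your own key and token.
consumer_key = "xxx"
consumer_secret = "xxx"
access_token = "xxx"
access_token_secret = "xxx"

# screen_name of the Twitter handle we want to explore
main_user = "bps_statistics"


def make_api():
    """Build an authenticated Tweepy API object (hypothetical helper)."""
    # Imported here so the settings above can be loaded without Tweepy.
    import tweepy

    auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
    auth.set_access_token(access_token, access_token_secret)
    return tweepy.API(auth)
```

Calling api = make_api() once at the start of the script gives the object used in the following steps.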

3. Get the Followers List

Now that we have the API object, we are free to use any of its methods or attributes. First, we create the user object, then get all of its followers and save them in a csv file. Tweepy provides useful methods to do this, namely API.followers_ids() and API.get_user(id). The information that can be obtained from API.get_user(id) can be seen in the figure below.

Our final visualization will be a force-directed graph with the follower’s avatar as the node’s label. Therefore, for every follower, get its id, Twitter name, and avatar URL. Feel free to explore the Tweepy documentation to get a feel for its methods and attributes.

Please be aware that Twitter applies a rate-limiting policy per user access token. It only allows 180 calls every 15 minutes, hence the try-except in def get_user_by_id(id).
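Here is a minimal sketch of this step, not the exact code from the assignment. It assumes the Tweepy 3.x method names mentioned above (followers_ids and get_user; as far as I know, Tweepy 4 renamed followers_ids to get_follower_ids and expects keyword arguments for get_user). It also swaps unicodecsv for the standard csv module, which handles Unicode fine in Python 3. The rate_limit_error, wait_seconds, max_retries and sleep parameters are my own additions so the retry logic is easy to see; with Tweepy 3.x you would pass tweepy.RateLimitError as rate_limit_error. Using screen_name as the "Twitter name" is also an assumption.

```python
import csv
import time


def get_user_by_id(api, user_id, rate_limit_error, wait_seconds=15 * 60,
                   max_retries=3, sleep=time.sleep):
    """Fetch one user, pausing and retrying when the rate limit is hit."""
    for attempt in range(max_retries + 1):
        try:
            return api.get_user(user_id)
        except rate_limit_error:
            if attempt == max_retries:
                raise  # give up after the last allowed retry
            sleep(wait_seconds)  # wait for the rate-limit window to reset


def save_followers(api, screen_name, path, rate_limit_error):
    """Write id, name, follower count and avatar URL of each follower."""
    # Note: a single call returns only the first page of follower ids.
    follower_ids = api.followers_ids(screen_name=screen_name)
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        for follower_id in follower_ids:
            user = get_user_by_id(api, follower_id, rate_limit_error)
            writer.writerow([user.id, user.screen_name,
                             user.followers_count, user.profile_image_url])
    return len(follower_ids)
```

A call such as save_followers(api, main_user, "followers.csv", tweepy.RateLimitError) would then produce the csv file used in part II. For accounts with many followers, the follower ids would have to be fetched page by page, which this sketch leaves out.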

We are done with part I.

II. Analyzing the Followers-Followings Relationships

1. Reading the Followers

This step begins with reading the followers.csv generated in the previous step. Each row in the csv file is read iteratively and treated as an array of 4 elements: id, name, number of followers, and avatar URL. Remember that we need nodes and links to build a force-directed graph. Nodes are the people in the circle, while links are the connections between 2 people. Thus, we’re gonna build 2 arrays (people and links) while reading the csv file. In this step, we will only generate the links between me and my followers.

See the figure below for an illustration of what the links array and people array look like.

links Array
people array
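As a rough sketch of this reading step: the function below assumes the 4-column followers.csv described above, with no header row. Storing people as a dictionary keyed by id and each link as a (follower_id, main_user_id) tuple meaning "follower follows main user" are my own choices for illustration; the original layout may differ.

```python
import csv


def read_followers(path, main_user_id):
    """Build the people and links structures from followers.csv."""
    people = {}
    links = []
    with open(path, newline="", encoding="utf-8") as f:
        for row in csv.reader(f):
            if not row:
                continue  # skip blank lines
            follower_id, name, followers_count, avatar_url = row
            people[follower_id] = {
                "name": name,
                "followers": int(followers_count),
                "avatar": avatar_url,
            }
            # Every follower is linked to the main user.
            links.append((follower_id, main_user_id))
    return people, links
```

Note that the ids come back as strings here because they are read from a csv file.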

2. Exploring the Friendship between Everyone in People Array

The next step is to explore the friendships among my followers to see if any of them follow each other. Tweepy has been really kind to provide us with the API.show_friendship() method. It returns the friendship object, which contains detailed information about the relationship between two Twitter users. See the figure below for an illustration of friendship.

We’re gonna iterate through each possible pair of people and determine if there is a link between them. These newly identified links will be added to the links array that we created in the previous step. So we will do something like this:
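The pairwise check can be sketched as below. It assumes the Tweepy 3.x behaviour where show_friendship(source_id=..., target_id=...) returns a pair of friendship objects (source, target) and the source object has following and followed_by flags. The function name and the (source, target) tuple format, meaning "source follows target", are hypothetical choices of mine.

```python
from itertools import combinations


def find_follower_links(api, people_ids):
    """Return (source, target) tuples meaning 'source follows target'."""
    new_links = []
    # combinations() yields every unordered pair exactly once.
    for a, b in combinations(people_ids, 2):
        source, _target = api.show_friendship(source_id=a, target_id=b)
        if source.following:     # a follows b
            new_links.append((a, b))
        if source.followed_by:   # b follows a
            new_links.append((b, a))
    return new_links
```

One call covers both directions of a pair, but the number of pairs still grows quickly (n followers means n(n-1)/2 calls), so expect to run into the rate limit for anything but a small network.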

3. Generate JSON Serializable Data for NetworkX Graphs

Now that we have the people and links, the next thing to do is to convert them into a format that is suitable for a force-directed graph, similar to the example from https://d3js.org/. But don’t worry, the NetworkX library is here to help us with this. Install it using the command pip install networkx. According to the website, there are 3 types of JSON formats that we can generate using NetworkX: node-link, tree, and adjacency. For our graph, we will generate the JSON data in the node-link format. This JSON has nodes and links as its keys, where nodes and links are arrays of objects. In short, we take the links and people arrays from the previous step, build a graph from them, then convert the graph into node-link JSON data.
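A minimal sketch of the conversion follows, under a few assumptions of mine: people is a dictionary keyed by id with name and avatar entries, links is a list of (source, target) tuples, and the graph is directed because "follows" has a direction. The helper names are hypothetical. json_graph.node_link_data is the NetworkX function for the node-link format; newer NetworkX releases may name the edge list "edges" instead of "links", so the sketch renames it back to be safe.

```python
import json


def node_attributes(people):
    """Pick the attributes stored on each graph node."""
    return {pid: {"name": p["name"], "avatar": p["avatar"]}
            for pid, p in people.items()}


def build_node_link(people, links):
    """Build a directed graph and convert it to node-link data."""
    import networkx as nx
    from networkx.readwrite import json_graph

    graph = nx.DiGraph()
    for pid, attrs in node_attributes(people).items():
        graph.add_node(pid, **attrs)
    # Nodes that only appear in links (e.g. the main user) are added here.
    graph.add_edges_from(links)
    data = json_graph.node_link_data(graph)
    # Assumption: keep the "links" key regardless of NetworkX version.
    if "edges" in data and "links" not in data:
        data["links"] = data.pop("edges")
    return data


def save_node_link(data, path="twitter_my_connections.json"):
    """Write the node-link data to a JSON file for D3 to load."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(data, f, indent=2)
```

Each entry in nodes then carries an id plus the name and avatar attributes, and each entry in links carries a source and a target.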

Notice that while doing the conversion, we also add the Twitter name and avatar URL as attributes of each entry under the nodes key. This is done so that our JSON data will look like this:

node data

III. Draw the D3 Graph

The next step is to draw the Twitter followers graph using D3. I used the Force-Directed Graph example by Mike Bostock. However, I found that the resulting graph was too crowded and not really eye-catching. So, I combined the graph that I made with the D3 Process Map example by James Nylen. These are the modifications that I made to both examples:

  1. Use ‘twitter_my_connections.json’ as the input for the graph.

The graph above looks ‘seeable’ and not really crowded. Well, it looks like I need more followers on Twitter :). So, please follow my Twitter account @erikaris.

Thank you very much for reading this and see you soon in the next post.

Erika (@erikaris).

 

References:

  1. https://stackoverflow.com/questions/3162909/method-to-save-networkx-graph-to-json-graph/8681020
  2. https://networkx.github.io/documentation/stable/reference/readwrite/json_graph.html
  3. https://stackoverflow.com/questions/31693150/tweepy-rate-limit-exceeded-when-retrieving-user-timeline
