Before I start, I want to take a moment to reflect on what we’ve achieved in this series of articles. We went from having nothing – no data and no code – to producing insightful analytics on a large network and advanced visualizations of how that network developed. Along the way we learned many useful skills. First, we learned how to create data from nothing by scraping the web. Then we learned how to transform that data into a usable shape using iterative functional programming. Then we learned how to use the igraph package in R to create, analyze and plot networks and to do community detection. Finally, here we will learn simple and advanced ways to visualize these networks and communities and how they change over time. This has been an amazing journey through the possibilities of data science techniques. I hope you agree!
Using R to create basic visualizations
In the last article we wrote code to take a network edgelist, divide it into distinct communities, work out various statistics like the degree centrality of each node, and output two visualizations – one in a spherical form (pretty, but not great for seeing communities), and one in force-directed form, which is much better for seeing communities.
Now that we have written this code, we can easily turn it into a function that receives any season from the Friends edgelist and outputs one or both of the visualizations above for that season. Here is the function that I wrote to do this:
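The sketch below is not the original function, just a minimal illustration of what such a function might look like. It assumes the edgelist is a data frame with columns named from, to, weight and season (these names are assumptions), and the function name plot_friends_network is hypothetical. Louvain community detection is used here as one reasonable choice; the previous article may have used a different method.

```r
library(igraph)

# Hypothetical helper: builds the network for the given seasons and plots it.
# Assumes `edgelist` has columns from, to, weight and season (assumed names).
plot_friends_network <- function(edgelist, seasons, type = c("force", "sphere")) {
  type <- match.arg(type)

  # Keep only the requested seasons and build an undirected graph
  edges <- edgelist[edgelist$season %in% seasons, c("from", "to", "weight")]
  g <- graph_from_data_frame(edges, directed = FALSE)

  # Merge repeated character pairs into one edge, summing their weights
  g <- simplify(g, edge.attr.comb = "sum")

  # Community membership and degree centrality for each node
  communities <- cluster_louvain(g)
  V(g)$community <- as.integer(membership(communities))
  V(g)$degree <- degree(g)

  # Force-directed (Fruchterman-Reingold) layout, or a sphere projected to 2D
  coords <- if (type == "force") {
    layout_with_fr(g)
  } else {
    layout_on_sphere(g)[, 1:2, drop = FALSE]
  }

  # One colour per community; node size grows with degree
  community_colours <- rainbow(max(V(g)$community))
  plot(g,
       layout = coords,
       vertex.color = community_colours[V(g)$community],
       vertex.size = 3 + sqrt(V(g)$degree),
       vertex.label = NA,
       edge.width = 0.5,
       main = paste("Friends network, seasons",
                    paste(range(seasons), collapse = " to ")))

  # Return the graph invisibly so callers can inspect it
  invisible(g)
}
```

Calling plot_friends_network(friends_edgelist, seasons = 1, type = "sphere") would then draw the spherical version for the first season, where friends_edgelist stands in for whatever your edgelist object is called.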
Now that we have this function, we can write some simple code to go through the edgelist season by season and visualize the force-directed network for all seasons up to and including that season. This code generates 10 images by running our function 10 times, once per season.
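As a rough sketch of that loop (again, not the original code), the helper below takes any plotting function and writes one PNG per season, each covering all seasons up to and including that one. The name render_cumulative_frames, the frames output directory and the season column name are all assumptions made for illustration.

```r
# Hypothetical helper: writes one PNG per season, each showing the
# cumulative network up to and including that season.
# `plot_fn` is any function that draws a plot from an edgelist data frame.
# Assumes `edgelist` has a numeric `season` column (assumed name).
render_cumulative_frames <- function(edgelist, plot_fn,
                                     out_dir = "frames", seasons = 1:10) {
  dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)
  paths <- character(0)

  for (s in seasons) {
    # Zero-padded names keep the frames in order for later processing
    path <- file.path(out_dir, sprintf("season_%02d.png", s))
    png(path, width = 800, height = 800)

    # Always close the graphics device, even if plotting fails
    tryCatch(
      plot_fn(edgelist[edgelist$season <= s, ]),
      finally = dev.off()
    )
    paths <- c(paths, path)
  }

  paths
}

# Example usage, assuming an edgelist called friends_edgelist and a
# plotting function of your own called my_plot_function:
# render_cumulative_frames(friends_edgelist, my_plot_function)
```

Passing the plotting function as an argument keeps the loop independent of how each network is drawn, so the same loop works for the spherical or the force-directed layout.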
Finally, if we take those ten images and feed them into a tool like ffmpeg, we can generate a gif or video that cycles through the networks from season 1 to season 10, showing how the network grows and develops over time. Alternatively, you can upload the images to an online processor like EZgif to create something like this:
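If you want to stay inside R for this step, one option is to call ffmpeg through system2. This is only a sketch: it assumes ffmpeg is installed and on your PATH, and that the frames are named season_01.png to season_10.png inside a frames directory (both the names and the helper functions ffmpeg_gif_args and frames_to_gif are hypothetical).

```r
# Hypothetical helper: builds the ffmpeg arguments that turn a numbered
# image sequence into an animated gif.
# -y overwrites an existing output file, -framerate sets how many input
# images are shown per second, -i gives the input filename pattern.
ffmpeg_gif_args <- function(pattern, output, fps = 1) {
  c("-y", "-framerate", as.character(fps), "-i", pattern, output)
}

# Hypothetical helper: runs ffmpeg and returns its exit status (0 = success).
# Assumes ffmpeg is installed and available on the PATH.
frames_to_gif <- function(pattern = "frames/season_%02d.png",
                          output = "friends_network.gif", fps = 1) {
  if (!nzchar(Sys.which("ffmpeg"))) {
    stop("ffmpeg was not found on the PATH")
  }
  args <- ffmpeg_gif_args(pattern, output, fps)
  # Quote the two file arguments so paths with spaces survive the shell
  args[c(5, 6)] <- shQuote(args[c(5, 6)])
  system2("ffmpeg", args)
}

# Example usage: one frame every two seconds
# frames_to_gif(fps = 0.5)
```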
Using D3 to generate advanced visualizations
To visualize these networks in D3, we first need to convert them from igraph into a JSON form. The networkD3 package in R makes this transformation very easy. Similar to the above, we can write functions to translate our network edgelists into JSON graph representations:
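A minimal sketch of such a function is shown here; it is not the original code. It uses igraph_to_networkD3 from networkD3 to get zero-indexed nodes and links, and write_json from jsonlite to save them. The function name edgelist_to_d3_json and the from, to and weight column names are assumptions, and Louvain is again just one possible community detection choice.

```r
library(igraph)
library(networkD3)
library(jsonlite)

# Hypothetical helper: converts an edgelist into a D3-style JSON file
# with a "nodes" array and a "links" array.
# Assumes `edgelist` has columns from, to and weight (assumed names).
edgelist_to_d3_json <- function(edgelist, path) {
  g <- graph_from_data_frame(edgelist[, c("from", "to", "weight")],
                             directed = FALSE)

  # Merge repeated character pairs into one edge, summing their weights
  g <- simplify(g, edge.attr.comb = "sum")

  # Community of each node, used as the "group" field in the JSON
  groups <- as.integer(membership(cluster_louvain(g)))

  # networkD3 returns zero-indexed source/target ids, as D3 expects;
  # the single edge attribute (weight) becomes the "value" column
  d3 <- igraph_to_networkD3(g, group = groups)

  # Data frames are written as arrays of row objects
  write_json(list(nodes = d3$nodes, links = d3$links), path)
  invisible(d3)
}

# Example usage, assuming an edgelist called friends_edgelist:
# edgelist_to_d3_json(friends_edgelist, "friends_network.json")
```

Each node in the resulting file has a name and a group, and each link has a source, a target and a value, which is the shape most D3 force-layout examples consume.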
I downloaded JPEG images of all the main characters and also the main characters in each of their communities, as discovered in the previous article, and wrote functions using D3 to generate force-directed network visualizations based on a JSON file input. These functions looked specifically for main characters and, if they found them, superimposed their image onto their node. I then used this to create two different interactive visualizations:
- A visualization which looped through each season and displayed the cumulative network. This was done by creating a counter from 1 to 10, refreshing that counter every 4 seconds to wipe the previous visualization and bring in the next dataset, and attaching this entire process to a play and stop button to give the user control over the visualization. I also attached a handy scroller to allow the user to determine the number of scenes needed to define a ‘connection’ between the characters. You can play with this visualization here or at the bottom of this page and the full code is in the repo here.
- For those interested in exploring each season as its own self-contained network, I created a separate JS script which linked the refresh of the network to a pull-down menu that the user could control. Each option pulls in only the data from that particular season. You can play with this visualization here or at the top of this page and the full code is in the same repo here.
So that just about wraps up this series on network analytics using Friends as an example. I hope you enjoy playing with these resources and feel free to post any interesting observations you have as comments.