Visualizing How Networks Change Over Time

Watching phenomena change over time is a big component of modern data science and is the basis for time series methodologies. However, when it comes to networks, whether of people or something else, I don’t see a lot of work being done on understanding how they change over time. In this article, the last in my project based on the Friends TV series, I look at ways to create visualizations of changing networks using R (for more basic methods) and the JavaScript D3 library (for more advanced methods).

Community Detection in R Using Communities of Friends Characters

In this article I will use the community detection capabilities of the igraph package in R to show how to detect communities in a network. By the end of the article we will be able to see how the Louvain community detection algorithm breaks up the Friends characters into distinct communities (ignoring the obvious community of the six main characters), and if you are a fan of the show you can decide whether this analysis makes sense to you.

Simple Iterative Programming and Error Handling in R

As you develop as a programmer, there are common situations you will find yourself in. One of those situations is where you need to run your code over a number of iterations of one or more loops, and where you know that your code may fail for at least one iteration. You don’t want your code to stop completely, but you do want to know that it failed and log where it happened. I am going to show a simple example of how to do this here.
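
A minimal sketch of this pattern, assuming base R's `tryCatch()` is the error-handling tool; `risky_step()` and its deliberate failure on the third iteration are hypothetical stand-ins for whatever your real code does:

```r
# Hypothetical task: fails on purpose for one input to illustrate the pattern.
risky_step <- function(i) {
  if (i == 3) stop("simulated failure")
  i^2
}

n <- 5
results <- rep(NA_real_, n)   # one slot per iteration; NA marks a failure
failed_at <- integer(0)       # iterations that raised an error
failed_msg <- character(0)    # the matching error messages

for (i in seq_len(n)) {
  results[i] <- tryCatch(
    risky_step(i),
    error = function(e) {
      # Record where and why it failed, then return NA so the loop carries on.
      failed_at <<- c(failed_at, i)
      failed_msg <<- c(failed_msg, conditionMessage(e))
      NA_real_
    }
  )
}

# A small log of the failures, ready to print or write to disk.
failure_log <- data.frame(iteration = failed_at, message = failed_msg)
```

The loop runs all five iterations, and `failure_log` holds one row for each iteration that raised an error, so you can inspect or rerun only those cases afterwards.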

Scraping Structured Data From Semi-Structured Documents

One of the most powerful capabilities that data science tools bring to the table is the capacity to deal with unstructured data and to turn it into something that can be structured and analyzed. Any data scientist worth their salt should be able to ‘scrape’ data from documents, whether from the web, locally or any other type of text-based asset.

What Does It Take To Be a Qualified Data Scientist?

Everyone is now calling themselves a Data Scientist. No matter what position I am hiring for, that term is on over 80% of the resumes I look at. It has actually made me start to ignore the term because it is not a differentiator of talent any more.

Clustering Time Series Data in R

Increasingly, there is a desire to cluster observations based on how they change over time. Do they increase, decrease, stay the same? Are they consistently high, consistently low, or do they go up and down? Are some more complex in their changes than others?