Scraping Structured Data From Semi-Structured Documents

One of the most powerful capabilities that data science tools bring to the table is the capacity to take unstructured data and turn it into something that can be structured and analyzed. Any data scientist worth their salt should be able to ‘scrape’ data from documents, whether they sit on the web, on a local machine, or in any other type of text-based asset.
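
To give a flavor of what ‘scraping’ can look like in practice, here is a minimal sketch in Python using only the standard library's `re` module. The sample text, the field names (`invoice_id`, `date`, `total`) and the helper `scrape_records` are all hypothetical and invented for illustration; real documents will need their own patterns, and this is one possible approach rather than the method used in the post.

```python
import re

# Hypothetical semi-structured text, e.g. lines copied from a report or web page.
raw_text = """
Invoice 1001 | Date: 2021-03-04 | Total: $250.00
Some free-form note that does not match the pattern.
Invoice 1002 | Date: 2021-03-09 | Total: $1,480.50
"""

# Named groups describe the fields we expect to find in each record.
RECORD = re.compile(
    r"Invoice\s+(?P<invoice_id>\d+)\s*\|\s*"
    r"Date:\s*(?P<date>\d{4}-\d{2}-\d{2})\s*\|\s*"
    r"Total:\s*\$(?P<total>[\d,]+\.\d{2})"
)


def scrape_records(text):
    """Return one dict per pattern match; text that does not match is skipped."""
    records = []
    for match in RECORD.finditer(text):
        row = match.groupdict()
        # Convert the captured strings into analysis-friendly types.
        row["invoice_id"] = int(row["invoice_id"])
        row["total"] = float(row["total"].replace(",", ""))
        records.append(row)
    return records


records = scrape_records(raw_text)
```

Each entry in `records` is a plain dictionary, so the result can be handed straight to whatever tabular tool you prefer for analysis.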

Clustering Time Series Data in R

Increasingly, there is a desire to cluster observations based on how they change over time. Do they increase, decrease, stay the same? Are they consistently high, consistently low, or do they go up and down? Are some more complex in their changes than others?

What Matters in Speed Dating?

I was interested in finding out what it was about a person during that short interaction that determined whether or not the other person viewed them as a match. This is a great opportunity to practice simple logistic regression if you’ve never done it before.

Making Sense of the Game of Thrones Universe Using Community Detection Algorithms

Community detection can help identify structure in a set of interactions, which has applications to organizational design, but can also be useful in other fields such as digital communications and crime investigation.