drunksandlampposts has written 13 posts for Drunks&Lampposts

Why are pirates called pirates?

In homage to International Talk Like a Pirate Day… I recently stumbled across a series of blog posts from the folks at IDV that visualised the archive of recorded pirate attacks which has been collected by the US National Geospatial-Intelligence Agency. It’s a dataset of 6000+ pirate attacks which have been recorded over the last 30 … Continue reading

R: Dealing with package updates

Here’s a very short post to highlight one of the “highlights” of my week that I thought was worth sharing with the wider community. One of the things I find great about R is the rapidly evolving ecosystem where new packages are being constantly created and others are being updated. Up until now, I’ve found … Continue reading

R: Creating a shortcut to run a gWidgets GUI

I’ve been playing around with using gWidgets on Windows over the last few weeks as a way of creating front ends for various functions and sets of functions that I’ve created, so that non-R users can have the benefit of these without having to write a single line of code. The likes of 4Dpiecharts … Continue reading

D3 – another acronym to learn

D3 (which stands for Data-Driven Documents) has been getting a lot of traction over the last few months, with more and more interactive and animated visualisations using this JavaScript library. The author of the library, Mike Bostock, is very active in both developing the library and also in providing a constant stream of … Continue reading

Another visualisation of 118 Years of US Weather Data

I posted yesterday about weather data sourced from NOAA to look at how hot this March was compared to previous years, and used a couple of heat maps in R to look at how temperatures compared, based on the rank of each year for each state (so if, say, this March in Florida was the … Continue reading

118 years of US State Weather Data

A recent post on the Junkcharts blog looked at US weather data and the importance of explaining scales (which in this case went up to 118). Ultimately, it turns out that 118 is the rank of the data compared to the previous 117 years of data (in ascending order, so that 118 is the highest). At … Continue reading

EC2 Tutorials: rJava – annoying enough to have its own blog post

One of the most frustrating items that I’ve been trying to install on my EC2 instance is rJava. It’s an R package that lots of other packages have as a dependency, including glmulti and MongoDB. I’ve spent a fair few hours trying to get this installed, constantly receiving the error message: configure: error: Java Development … Continue reading

EC2 Tutorial: NumPy and SciPy

Another quick note for getting set up on your EC2 instance. To install SciPy, you first need to install ATLAS and LAPACK. The following few lines of code, run as root (sudo bash), should sort you out:

    yum -y install atlas-devel
    yum install lapack
    pip install scipy
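Once those commands have finished, one way to confirm the install is to ask Python whether it can find the packages. This is only a minimal sketch, not part of the original instructions: the helper name missing_modules is made up for illustration, and it assumes you run it with the same Python interpreter that pip installed into.

```python
from importlib.util import find_spec


def missing_modules(names):
    """Return the module names that Python cannot locate on its path."""
    # find_spec returns None when a top-level module is not installed
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(["numpy", "scipy"])
    if missing:
        print("Not installed:", ", ".join(missing))
    else:
        print("NumPy and SciPy are both importable")
```

If anything is reported as not installed, it is worth checking that pip and python point at the same installation before re-running the commands.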

EC2 Tutorials: Scheduling tasks on EC2 using Crontab

One of my main reasons for wanting an EC2 instance was to be able to automatically run scripts at certain times, normally to collect data and save it to a database. As my EC2 instance is always running, I can forget about it for a month and have a month’s worth of data ready and … Continue reading

Google Refine: One of The Best Tools You’ve Probably Never Heard About

Lots of data that’s available online tends to not be the cleanest thing in the world, particularly if you’ve had to scrape it in the first place. At the same time, lots of internal data sets can be just as messy, with columns having different names in what should be identical spreadsheet templates, or with … Continue reading
