archives

simonraper

I'm a statistician at Mindshare UK. My interests are in computational statistics, machine learning, Bayesian modelling, data visualisation, market mix modelling and R.
simonraper has written 10 posts for Drunks&Lampposts

Expected switching for the Dirichlet distribution

A valuable tool in choice modelling is the Dirichlet-multinomial distribution. It’s a compound of the multinomial and Dirichlet distributions and it works like this: A choice between N options is modelled as a multinomial distribution with parameters θ1, θ2, θ3 … θN, where the thetas also represent the probabilities of each option being chosen. For … Continue reading »
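As a rough illustration of the compound described above, here is a minimal Python sketch that first draws the thetas from a Dirichlet distribution and then draws choices from a multinomial with those thetas. The concentration parameters (alphas), the function names and the number of choices are assumptions made for the example; they are not taken from the post.

```python
import random


def sample_dirichlet(alphas, rng):
    """Draw one probability vector from Dirichlet(alphas) by normalising Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]


def sample_dirichlet_multinomial(alphas, n_choices, rng):
    """Draw thetas from the Dirichlet, then make n_choices multinomial picks with them."""
    thetas = sample_dirichlet(alphas, rng)
    picks = rng.choices(range(len(alphas)), weights=thetas, k=n_choices)
    counts = [picks.count(i) for i in range(len(alphas))]
    return thetas, counts


rng = random.Random(0)
alphas = [2.0, 1.0, 0.5]  # hypothetical concentration parameters, one per option
thetas, counts = sample_dirichlet_multinomial(alphas, 20, rng)
print("thetas:", thetas)
print("counts:", counts)
```

Each call gives a different set of thetas, so repeated calls mimic a population in which every chooser has their own choice probabilities.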

Thorstein Veblen and Hard Coding

It is still quite common to hear the career progression of an analyst described as one upwards from the hard graft of coding and “getting your hands dirty” towards the enviable heights of people management and strategic thinking. Whenever I hear this it reminds me of the book Conspicuous Consumption by the American economist Thorstein … Continue reading »

Two Quick Recipes: Ubuntu and Hadoop

There are so many flavours of everything and things are changing so quickly that I find every task researched online ends up being a set of instructions stitched together from several blogs and forums. Here are a couple of recent ones. Ubuntu on AWS (50 mins): Was going to buy a new laptop but it made … Continue reading »

Lazy D3 on some astronomical data

I can’t claim to be anything near an expert on D3 (a JavaScript library for data visualisation) but being both greedy and lazy I wondered if I could get some nice results with minimum effort. In any case the hardest thing about D3 for a novice to the world of web design seems to be … Continue reading »

Graphing the history of philosophy

This one came about because I was searching for a data set on horror films (don’t ask) and ended up with one describing the links between philosophers. To cut a long story very short, I’ve extracted the information in the “influenced by” section for every philosopher on Wikipedia and used it to construct a network … Continue reading »

Visualising the Path of a Genetic Algorithm

We quite regularly use genetic algorithms to optimise over the ad-hoc functions we develop when trying to solve problems in applied mathematics. However, it’s a bit disconcerting to have your algorithm roam through a high-dimensional solution space while not being able to picture what it’s doing or how close one solution is to another. … Continue reading »

Non overlapping labels on a ggplot scatterplot

This is a very quick post to share a tip on how to add non-overlapping labels to a scatterplot in ggplot using a great package called directlabels. The trick is to make each point a single-member group using an aesthetic like colour and then apply the direct.label function with the first.qp … Continue reading »

Marketing Mix Lab: Visualising The Correlation Matrix

Following on from the previous post, here is an R function for visualising correlations between the explanatory variables in your data set. An interesting example is the North Carolina Crime data set that comes with the plm package. This has the following continuous variables: crmrte (crimes committed per person); prbarr (probability of arrest); prbarr probability … Continue reading »

Marketing Mix Lab: Multicollinearity and Ridge Regression

In marketing mix modelling you have to be very lucky not to run into problems with multicollinearity. It’s in the nature of marketing campaigns that everything tends to happen at once: the TV is supported by radio, both are timed to coincide with the relaunch of the website. One of the techniques that is often … Continue reading »

Marketing Mix Lab: Generating Artificial Sales Data

Our statistics lecturers would often end each session with a demonstration of the power of the statistical model under discussion. This would usually mean generating some artificial data and showing how good the tool was at recovering the parameters or correctly classifying the observations. It was highly artificial but had a very useful feature: you … Continue reading »
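The generate-then-recover idea can be sketched in a few lines of Python. This is only an illustration under assumed values: the linear sales model, the parameter values and the function names are all hypothetical and are not the method used in the post.

```python
import random


def simulate_sales(n, intercept, slope, noise_sd, rng):
    """Generate artificial (spend, sales) pairs from a known linear model plus noise."""
    spend = [rng.uniform(0, 100) for _ in range(n)]
    sales = [intercept + slope * s + rng.gauss(0, noise_sd) for s in spend]
    return spend, sales


def fit_ols(x, y):
    """Fit y = a + b * x by ordinary least squares and return (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


rng = random.Random(42)
# Hypothetical "true" parameters that the model should recover.
spend, sales = simulate_sales(500, intercept=200.0, slope=3.0, noise_sd=10.0, rng=rng)
est_intercept, est_slope = fit_ols(spend, sales)
print("estimated intercept:", est_intercept)
print("estimated slope:", est_slope)
```

Because the true parameters are known, the estimates can be compared with them directly, which is the useful feature of artificial data.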
