I’ve been interested in setting up an Amazon Web Services EC2 instance for a while – essentially a remote desktop in the cloud, which can be handy when you want an always-on machine (say, to run scripts at particular times, or to have easy access to a particular machine setup).
Over the next few weeks, I’m publishing a series of posts that aim to give a decent introduction to getting started on Amazon Web Services. There are a lot of great tutorials out there already (which I’ll link to rather than reinvent the wheel), but my aim here is to pull them all together and also share solutions to a few stumbling blocks that I’ve come across.
For a while now, Amazon have been offering a free Micro EC2 instance for the first year of being an Amazon customer. An instance is very similar to the computing bit of a PC, and in the case of a Micro Instance, this is like a fairly low powered machine – it isn’t designed for performance.
Having said that, now that my Instance has been running for over 500 hours, I’ve discovered I can do a lot more with it than I thought I’d be able to. Firstly, I’ve set up RStudio Server on the instance, so I can write code in the cloud very easily. Secondly, I can connect to it to query databases and the like. Thirdly, I can take an image of my Instance and load that up on a much more powerful machine.
One thing to note is that the Free Tier Instance is based on Amazon Linux – not Windows – so there’s quite a steep learning curve if you haven’t operated in a Linux environment before (more on that later).
I’ve set my EC2 machine up to run a number of Python scripts that go off to various websites and pull data from them into a set of MySQL and MongoDB databases. As the machine is always on, I don’t have to worry about someone turning off my machine or forgetting to run a script at a certain time.
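To give a flavour of what a script like this might look like, here is a minimal sketch. It is illustrative only and not one of the actual scripts: it uses Python’s built-in sqlite3 module as a stand-in for MySQL or MongoDB, and the URL, database file, table name and "name,value" line format are all assumptions made up for the example.

```python
import sqlite3
from datetime import datetime, timezone
from urllib.request import urlopen

# Hypothetical values for illustration only; not taken from the post.
SOURCE_URL = "https://example.com/data.csv"
DB_PATH = "scraped.db"


def fetch_text(url):
    """Download a page and return its body as text."""
    with urlopen(url, timeout=30) as response:
        return response.read().decode("utf-8")


def parse_rows(text):
    """Turn 'name,value' lines into (name, float) tuples, skipping bad lines."""
    rows = []
    for line in text.splitlines():
        parts = line.split(",")
        if len(parts) != 2:
            continue
        try:
            rows.append((parts[0].strip(), float(parts[1])))
        except ValueError:
            continue  # e.g. a header line such as "name,value"
    return rows


def store_rows(conn, rows, fetched_at):
    """Insert the parsed rows, stamping each with the fetch time."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS readings "
        "(name TEXT, value REAL, fetched_at TEXT)"
    )
    conn.executemany(
        "INSERT INTO readings VALUES (?, ?, ?)",
        [(name, value, fetched_at) for name, value in rows],
    )
    conn.commit()
    return len(rows)


def run(url=SOURCE_URL, db_path=DB_PATH):
    """One scheduled run: fetch, parse, store. Returns the number of rows saved."""
    conn = sqlite3.connect(db_path)
    try:
        fetched_at = datetime.now(timezone.utc).isoformat()
        return store_rows(conn, parse_rows(fetch_text(url)), fetched_at)
    finally:
        conn.close()
```

With a final `run()` call added and the file saved on the instance, a scheduler such as cron could take care of the timing. For example, the crontab line `0 6 * * * python3 /home/ec2-user/pull_data.py` (the path is hypothetical) would run it every day at 06:00.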
Also bear in mind that although Amazon offer the Free Tier for new AWS customers, there are lots of things that incur costs (for instance, choosing a Small Instance rather than a Micro Instance), so please keep an eye on the meter (which can be found in the top left corner – Account Activity under My Account / Console).
There are many good resources on getting set up with Amazon Web Services. There are some slight differences depending on what OS you are using.
Location, Location, Location
Your EC2 instance can be hosted in a number of locations around the globe. The location that you choose shouldn’t make much difference in terms of performance, but you should bear in mind that any services which screen your IP address (like Betfair) may not work in certain locations. It’s possible to move an instance between locations after you’ve set it up, but in my experience, it’s not the easiest thing to do.
In the next post, I’ll provide some links on how to get set up with the likes of R and Python, along with a few introductory lines of Linux.
A few handy terms
Getting started with EC2 can be quite a steep learning curve, so as I go through this series, I’ll try and explain the new terms that you might come across.
Security Group: The security settings for the machine, predominantly which ports to open (to allow users to access the machine via, say, the web or a database program).
Instance (or EC2, to give it its alternative name): This is essentially the remote machine that you’re connecting to. These can vary in size from Micro (cheapest and smallest), through Small, to Large and more.
AMI (Amazon Machine Image): A pre-configured image of an operating system and its software. For example, it might come with R, Python and MySQL already installed, which makes it much easier to get started with Amazon Web Services.
Keypair: I’ve come to think of this like a supercharged password. Essentially, it’s a file that contains a long string which is used in place of a password. Much more secure.
Elastic Block Store (EBS): Storage which is attached to the EC2 machine (so it’s very similar to the hard drive in your PC).
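To make the Security Group idea a little more concrete, here is a small sketch of how a set of "which ports are open to whom" rules behaves. This is a toy model written in plain Python, not the AWS API: the `IngressRule` class, the `is_allowed` function and the IP addresses are all made up for illustration. The port numbers themselves are the standard ones (22 for SSH, 80 for web traffic, 3306 for MySQL).

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class IngressRule:
    """One 'allow' rule: traffic to `port` is accepted from addresses in `cidr`."""
    port: int
    cidr: str
    note: str = ""


def is_allowed(rules, source_ip, port):
    """Return True if any rule opens `port` to `source_ip`.

    Anything not explicitly allowed is refused, which mirrors the
    'closed unless opened' behaviour described above.
    """
    addr = ip_address(source_ip)
    return any(
        rule.port == port and addr in ip_network(rule.cidr)
        for rule in rules
    )


# Hypothetical rule set; the addresses come from ranges reserved for documentation.
RULES = [
    IngressRule(22, "203.0.113.7/32", "SSH from a single home IP"),
    IngressRule(80, "0.0.0.0/0", "web traffic from anywhere"),
    IngressRule(3306, "203.0.113.0/24", "MySQL from an office range"),
]

print(is_allowed(RULES, "203.0.113.7", 22))   # the home IP reaching SSH
print(is_allowed(RULES, "198.51.100.1", 22))  # a stranger reaching SSH
```

The design point is that access is decided per port and per source address, so opening the web port to everyone does not open SSH or the database to everyone.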