
Amazon S3: Simple Storage Service


Amazon’s
Simple Storage Service provides virtually unlimited storage for any
conceivable kind of file or data. It provides a highly scalable,
secure, distributed storage network, accessible from any internet
connection, and can be used for everything from backing up personal
data to distributing multimedia content to millions of users.

Amazon offers a Service Level Agreement
which states that if their uptime dips below 99.9%, you can claim
service credits to offset the disruption. In practice, though, you have
to provide Amazon with detailed logs to prove that you suffered
disruption, and that the disruption was caused by S3. The SLA details can be found here:

http://www.amazon.com/gp/browse.html?node=379654011

Overview

One of the great things about the S3 service is its utter
simplicity. S3 has just 2 types of construct, namely buckets and
objects. Your S3 account can contain any number of buckets, your
buckets can contain any number of objects and an object can hold data
and metadata. That’s it. There are no folder structures, no renaming
support, no in place editing, nothing. Almost any application would
need to impose a layer on top of S3, and therein lies its brilliance.
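
To make the bucket/object model concrete, here is a toy in-memory sketch in Python. It is not the S3 API: the class names `Bucket` and `S3Object` and the helper `rename` are made up for illustration. It only shows what "a layer on top" has to do when the store itself offers nothing but put, get and delete on whole objects.

```python
from dataclasses import dataclass, field


@dataclass
class S3Object:
    """Hypothetical stand-in for an object: a blob of data plus metadata."""
    data: bytes
    metadata: dict = field(default_factory=dict)


class Bucket:
    """Hypothetical stand-in for a bucket: a flat key -> object mapping."""

    def __init__(self, name):
        self.name = name
        self._objects = {}

    def put(self, key, data, metadata=None):
        # A put always replaces the whole object; there is no in-place edit.
        self._objects[key] = S3Object(data, dict(metadata or {}))

    def get(self, key):
        return self._objects[key]

    def delete(self, key):
        del self._objects[key]

    def keys(self, prefix=""):
        # "Folders" exist only as a naming convention on keys.
        return sorted(k for k in self._objects if k.startswith(prefix))


def rename(bucket, old_key, new_key):
    """Emulate a rename in the layer above: copy to the new key, then delete."""
    obj = bucket.get(old_key)
    bucket.put(new_key, obj.data, obj.metadata)
    bucket.delete(old_key)
```

Calling `rename(bucket, "photos/cat.jpg", "photos/kitten.jpg")` on a bucket holding the first key leaves only the second, with the same data and metadata. Everything that feels like a file system here lives in your own code, not in the store.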

S3 allows you to set access controls on both buckets and objects,
governing who can do what. Each bucket and object is uniquely
addressable with its own URI, for example:

http://s3.amazonaws.com/mybucket/myobject

S3 also allows you to map your own domain, so you could publish the same resource at, say:

http://www.agilemicroisv.com/myobject
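
As a small sketch of the two addressing forms above, the Python below builds both URLs from a bucket name and an object key. The function names are my own, and the percent-encoding of keys is an assumption about what a client would normally want, not something taken from Amazon's documentation.

```python
from urllib.parse import quote

S3_HOST = "s3.amazonaws.com"


def path_style_url(bucket, key, host=S3_HOST):
    """Build a URL of the form http://s3.amazonaws.com/<bucket>/<key>."""
    # Keep "/" unescaped so keys that look like paths stay readable.
    return f"http://{host}/{quote(bucket)}/{quote(key, safe='/')}"


def custom_domain_url(domain, key):
    """Build the URL for the same object published under your own domain."""
    return f"http://{domain}/{quote(key, safe='/')}"
```

With the names used above, `path_style_url("mybucket", "myobject")` and `custom_domain_url("www.agilemicroisv.com", "myobject")` reproduce the two example addresses.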

S3 is hugely distributed. It is spread across multiple servers in
multiple locations. Currently, you can house data in either the US or
Europe. This is a double-edged sword. On the one hand, it provides huge
benefits in terms of reliability, scalability and redundancy, but on
the other hand, its distributed nature can cause the odd headache.

  • Objects are not files. S3 does not support any
    file system-like operations on your objects. You cannot rename, move or
    modify objects in place. You must get a local copy, make any changes,
    then upload the new version as a whole.
  • Propagation latency. Your files are distributed
    over multiple servers in multiple data centres. This can cause issues
    if you have users trying to access the same data at the same time from
    different locations. Let’s say I upload a file, and you try to grab it.
    The object might not appear to you straight away, which can cause all
    kinds of fun with missing and out of date objects. If concurrent access
    is important to your application, you will end up writing some sort of
    version control or intermediary layer.
  • S3 requests will fail. Occasionally. It’s not a
    bug, rather a deliberate consequence of the architecture. Any
    application will need to expect the occasional failure, handle it
    gracefully, and retry after a short pause.
  • S3 IP addresses will change. Occasionally. Again, because of the distributed architecture of the system, you shouldn’t employ any local DNS caching for more than a few minutes.
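
The "retry after a short pause" advice in the list above can be sketched in a few lines of Python. This is a generic pattern, not anything Amazon prescribes: `TransientError` is a hypothetical stand-in for whatever retryable error your S3 client raises, and doubling the pause on each attempt is my own choice of policy.

```python
import time


class TransientError(Exception):
    """Hypothetical stand-in for a retryable failure reported by an S3 client."""


def with_retries(operation, attempts=4, base_delay=0.5, sleep=time.sleep):
    """Call operation(); on TransientError, pause and try again.

    The pause doubles after each failed attempt (0.5s, 1s, 2s, ...).
    The last failure is re-raised so the caller can still see it.
    """
    for attempt in range(attempts):
        try:
            return operation()
        except TransientError:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
```

You would wrap each request in a function and pass it in, for example `with_retries(lambda: fetch("mybucket", "myobject"))`, where `fetch` is whatever call your client library provides. The `sleep` parameter is there only so the pauses can be swapped out in tests.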

It’s a great system, but if you’re going to use it, you need to have
a good grasp of what it is, and perhaps more importantly, what it isn’t.

Pricing

Users of Amazon’s web services are billed on 3 fronts.

  • Storage. Storage space in S3 is charged at 15¢ per gigabyte per
    month for data stored in the US, and 18¢ per gigabyte per month for
    data stored in Europe.
  • Data transfer. Data sent to S3 costs 10¢ per
    gigabyte uploaded. Data retrieved from S3 is charged on a sliding
    scale, depending on how much data was downloaded during the month: 18¢
    per gigabyte for the first 10 terabytes downloaded, 16¢ per gigabyte
    for the next 40 terabytes (between 10 TB and 50 TB), and 13¢ per
    gigabyte for any additional data (over 50 TB).
  • API requests.
    You are also charged based on the number of API request messages S3
    processes on your behalf. You must pay per-request fees for the
    requests performed by your own application, as well as requests made by
    others when they download data you have made available from your
    account.
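
To see how the storage and transfer rates above add up, here is a rough Python sketch of a monthly estimate. It leaves out the per-request fees, since no figures are quoted for them here, and it assumes 1 TB means 1024 GB, which the price list as quoted does not spell out. The function names are my own.

```python
GB_PER_TB = 1024  # assumption: could equally be 1000

STORAGE_CENTS_PER_GB = {"us": 15, "eu": 18}
UPLOAD_CENTS_PER_GB = 10
# (tier size in GB, cents per GB); a size of None means "everything above".
DOWNLOAD_TIERS = [(10 * GB_PER_TB, 18), (40 * GB_PER_TB, 16), (None, 13)]


def download_cost_cents(gb):
    """Walk the sliding scale, charging each slice of data at its tier rate."""
    total = 0
    remaining = gb
    for size, rate in DOWNLOAD_TIERS:
        in_tier = remaining if size is None else min(remaining, size)
        total += in_tier * rate
        remaining -= in_tier
        if remaining <= 0:
            break
    return total


def monthly_cost_dollars(stored_gb, uploaded_gb, downloaded_gb, region="us"):
    """Storage plus transfer for one month, excluding per-request fees."""
    cents = (stored_gb * STORAGE_CENTS_PER_GB[region]
             + uploaded_gb * UPLOAD_CENTS_PER_GB
             + download_cost_cents(downloaded_gb))
    return cents / 100
```

For example, storing 100 GB in the US, uploading 20 GB and serving 50 GB of downloads works out to $15.00 + $2.00 + $9.00 = $26.00 for the month. The same usage in Europe comes to $29.00.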

Amazon provides a calculator to estimate your usage costs:

http://calculator.s3.amazonaws.com/calc5.html

I’m not sure why the US is cheaper than Europe. It might be infrastructure costs, it might be the worthless dollar.

 

Spreading the love a little, here are a few applications that have built on top of the S3 APIs:

  1. April 9th, 2008 at 14:36 | #1

    Since I started to read your blog when you wrote about Zemanta, I might point out that Zemanta uses S3 too. And EC2.

    Quite good experiences with both, apart from some hiccups. And take a look at SimpleDB too.

    bye
    andraz

  2. April 9th, 2008 at 14:53 | #2

    @Andraz, awesome, I didn’t know that Zemanta backed on to AWS.
    I’ve got a series of AWS posts coming up over the next few days, I’ll
    get to SimpleDB :)

  3. April 9th, 2008 at 18:10 | #3

    yeah, check out http://aws.typepad.com/aws/2008/04/zemanta.html

    The interesting thing is that more and more startups use EC2 &
    S3, which means when Amazon goes down it takes a large part of the
    internet with it. However, the benefits are quite good.

    The lack of automatic self-management software for EC2, however, is
    the big obstacle. We had to write our own. Do you know any good
    (& cheap) solutions for that?

  4. April 10th, 2008 at 00:42 | #4

    >> I’m not sure why the US is cheaper than Europe. It might be infrastructure costs, it might be the worthless dollar.

    Their answer to this question is usually that “electricity in Europe
    is more expensive”. Think what you want about that answer :)

  5. April 10th, 2008 at 03:01 | #5

    On the subject of online backup and storage …

    Online backup is becoming common these days. It is estimated that
    70-75% of all PCs will be connected to online backup services within
    the next decade.

    Thousands of online backup companies exist, from one guy operating in his apartment to fortune 500 companies.

    Choosing the best online backup company will be very confusing and
    difficult. One website I find very helpful in making a decision to pick
    an online backup company is:

    http://www.BackupReview.info

    Have a look here, too:
    http://www.backupreview.info/index.php?pid=read_article&article_id=9

    This site lists more than 400 online backup companies in its directory and ranks the top 25 on a monthly basis.

  6. April 10th, 2008 at 09:36 | #6

    @Andraz, stay tuned :)

    @Jure, is there anything in Europe that isn’t more expensive? :)

    @Jenny, thanks, great resource!
