Amazon SimpleDB
Amazon’s SimpleDB service
is a reliable storage service for small pieces of textual data. It
lets you store, modify and retrieve datasets without having to maintain
a more traditional database server, which should be a major benefit for
a thinly spread micro ISV. It should be noted, however, that the SimpleDB
service is just that: simple. It does not have anything like the bells
and whistles of a full RDBMS.
Overview
Like the other Amazon Web Services, SimpleDB has a limited number of entities:
- Domains. A domain is what might be considered the equivalent of a database instance. Just as you might create several databases on a single RDBMS server, you can use domains to partition logically distinct datasets.
- Items. An item is a uniquely (within the domain) named collection of attributes that represents a data object. You can add, modify or delete an entire object in one go, or modify individual attributes.
- Attributes. An attribute is a uniquely named (within the item) category of information.
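To make the three entities concrete, here is a minimal in-memory sketch in Python. It does not call the SimpleDB API at all; the `members` dictionary, the helper names and the sample data are all hypothetical, and the model assumes one string value per attribute.

```python
# Hypothetical in-memory picture of one domain:
# item name -> {attribute name -> string value}.
# Items in the same domain need not share the same attributes.
members = {
    "member-0001": {"Name": "Alice Example", "MembershipEndDate": "2007-06-30"},
    "member-0002": {"Name": "Bob Example", "Email": "bob@example.com"},
}

def put_attribute(domain, item_name, attribute, value):
    """Add or overwrite one attribute, creating the item if it is missing."""
    domain.setdefault(item_name, {})[attribute] = str(value)

def delete_item(domain, item_name):
    """Remove an entire item in one go (no error if it does not exist)."""
    domain.pop(item_name, None)

# Modify an individual attribute, then delete a whole item.
put_attribute(members, "member-0002", "MembershipEndDate", "2008-01-31")
delete_item(members, "member-0001")
```

After these two calls the domain holds a single item, `member-0002`, with three attributes.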
The main difference between SimpleDB and a full RDBMS is that information is stored in hierarchical trees, not in tables.
There are also no predefined table schemas, so any item can have a
different set of attributes from any other item in the domain. Whilst
this provides enormous flexibility, you will need to keep your wits
about you: if you misspell an attribute name, you might find that
that particular nugget of information is lost forever.
There are no types in SimpleDB other than the string. All information is stored as text.
This means that SimpleDB can only perform case-sensitive string
comparisons. There are no integers, floating point numbers, dates
and so on. Again, its simplicity and flexibility mean that the
SimpleDB system won’t even raise an eyebrow should you provide, say, a floating point number for an attribute that should contain dates. Maintaining the integrity of your data becomes very much your concern.
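Since the service enforces neither attribute names nor value formats, one option is to check items on the client before they are written. The following Python sketch shows the idea; the `SCHEMA` table, the attribute names and the `validate` helper are illustrative assumptions, not part of SimpleDB.

```python
from datetime import datetime

DATE_FORMAT = "%Y-%m-%d"

def is_iso_date(text):
    """True only for a real calendar date written exactly as YYYY-MM-DD."""
    try:
        parsed = datetime.strptime(text, DATE_FORMAT)
    except ValueError:
        return False
    # Round-trip so that unpadded forms such as 2007-6-3 are rejected.
    return parsed.strftime(DATE_FORMAT) == text

# Hypothetical client-side schema: attribute name -> validator for its value.
SCHEMA = {
    "Name": lambda text: len(text) > 0,
    "MembershipEndDate": is_iso_date,
}

def validate(attributes):
    """Return a list of problems; an empty list means the item looks sane."""
    problems = []
    for name, value in attributes.items():
        if name not in SCHEMA:
            problems.append(f"unknown attribute: {name}")
        elif not isinstance(value, str) or not SCHEMA[name](value):
            problems.append(f"bad value for {name}: {value!r}")
    return problems
```

A misspelt name such as `MembershipEndDat` or a value such as `"3.14"` in the date attribute is then reported before anything is sent to the service.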
The SimpleDB query language is very limited. It is far simpler than SQL, for example. Queries take the form of:
['AttributeName' Operator 'Literal']
In addition to the Boolean operators NOT, AND and OR, you have the
comparison operators =, !=, <, <=, >, >= and starts-with. So
to find items with a membership expiry date before 1 July 2007,
we would run this query:
['MembershipEndDate' < '2007-07-01']
Note that this is just a string comparison, so you must encode your data in a way that makes for sensible comparisons.
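One common way to get sensible string comparisons is to zero-pad numbers to a fixed width and to write dates as YYYY-MM-DD. The Python sketch below illustrates this; the helper names and the width of 10 digits are assumptions chosen for the example, and negative numbers are deliberately not handled.

```python
from datetime import date

def encode_int(value, width=10):
    """Zero-pad a non-negative integer so string order matches numeric order."""
    if value < 0 or value >= 10 ** width:
        raise ValueError("value out of range for the chosen width")
    return str(value).zfill(width)

def encode_date(value):
    """ISO 8601 dates (YYYY-MM-DD) sort chronologically as plain strings."""
    return value.isoformat()

# Unencoded numbers compare badly as strings: "9" sorts after "10".
assert "9" > "10"
# Padded to the same width, the order is the numeric one.
assert encode_int(9) < encode_int(10)
# An encoded date can be compared directly with the literal in the query.
assert encode_date(date(2007, 6, 30)) < "2007-07-01"
```

The same care is needed for case: because comparisons are case sensitive, it may be worth storing a lower-cased copy of any value you intend to search on.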
As with Amazon’s other distributed systems, applications using SimpleDB will need to take propagation latency into account.
At any given moment, you cannot be sure that the data you read from a
domain are up to date. In practice, it takes only a few seconds for all
of the physical servers to achieve consistency, but any application that
requires constant integrity will need to employ some sort of caching
mechanism.
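One possible shape for such a caching mechanism is to remember your own recent writes locally and to prefer them over whatever the service returns until the propagation window has passed. The Python sketch below is only an illustration: the class name, the `fetch` callable and the 10 second window are assumptions, and it says nothing about writes made by other clients.

```python
import time

class ReadYourWritesCache:
    """Prefer locally remembered recent writes over possibly stale reads."""

    def __init__(self, fetch, ttl_seconds=10.0, clock=time.monotonic):
        self._fetch = fetch      # hypothetical callable: item name -> attributes
        self._ttl = ttl_seconds  # assumed upper bound on propagation delay
        self._clock = clock      # injectable so the behaviour can be tested
        self._recent = {}        # item name -> (expiry time, attributes)

    def record_write(self, item_name, attributes):
        """Call this after every successful write to the service."""
        self._recent[item_name] = (self._clock() + self._ttl, dict(attributes))

    def get(self, item_name):
        """Return the local copy while it is fresh, else ask the service."""
        entry = self._recent.get(item_name)
        if entry is not None:
            expires_at, attributes = entry
            if self._clock() < expires_at:
                return attributes
            del self._recent[item_name]  # window has passed; trust the service
        return self._fetch(item_name)
```

The injectable clock is a design choice for testability; in normal use the default monotonic clock is sufficient.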
Attributes in SimpleDB are severely limited in size: each is restricted
to 1024 bytes. This limitation is entirely deliberate. It is intended
that objects of any appreciable size will be stored in S3 rather than
SimpleDB. You should attempt to work with this rather than work around
it. Storage costs in SimpleDB are many times the storage costs of S3.
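Working with the limit might look like the following Python sketch: small values go into the attribute directly, and anything larger is replaced by a pointer to an object you store in S3 yourself. The `s3://` pointer convention and the function names are hypothetical, the upload itself is left out, and measuring the size as UTF-8 bytes is an assumption.

```python
MAX_ATTRIBUTE_BYTES = 1024  # limit quoted in the text

def fits_in_attribute(value):
    """Check the encoded size (assumed UTF-8), not the character count."""
    return len(value.encode("utf-8")) <= MAX_ATTRIBUTE_BYTES

def attribute_or_pointer(value, s3_key):
    """Return the value itself if it fits, otherwise a pointer to S3.

    The caller is expected to upload the full value to S3 under s3_key
    whenever a pointer is returned; that step is not shown here.
    """
    if fits_in_attribute(value):
        return value
    return "s3://" + s3_key
```

For example, a short note is stored as-is, while a 5000 character note would be stored as something like `s3://my-bucket/notes/42`, where the bucket and key are made up for illustration.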
Pricing
Storage space in SimpleDB is expensive: $1.50 per gigabyte.
Your space usage is calculated from every item, attribute name and
attribute value in your system. You are also charged for
an additional 45 bytes per item, attribute name and attribute value.
This is for the indexing that SimpleDB automatically performs.
Data received by SimpleDB is charged at $0.10 per gigabyte, and data
sent by SimpleDB is charged on a sliding scale from $0.13 to $0.18 per
gigabyte.
You are also charged $0.14 per hour of machine usage. The amount of
CPU usage depends on factors such as the volume of data returned by a
query and the amount of data uploaded.
Do you use SimpleDB? Does it work for you?
