

I see a lot of buzz about Cassandra, but I don't know when it's appropriate. In what use cases would I prefer Cassandra to other data stores?

What advantages and disadvantages should I be aware of?

asked Jul 07 '10 at 16:10


Richard Holmes


For me, the most beneficial features are:

  • High availability
  • Ability to handle high insert workloads
  • Straightforward (or at least easier) management compared to some other systems.

The main problems will be:

  • Lack of support for features you may wish to use:
      • Secondary indexes
      • Range scans (unless you use OrderPreservingPartitioner)
      • Convenient handling of blobs (it is possible to store binary data, but object sizes are limited by what can easily fit in RAM)
  • Difficult programming interface (Thrift is a little complex to build, and then only supports synchronous operation)
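
To illustrate the range-scan point, here is a minimal Python sketch of why the partitioner matters. It is a toy model, not Cassandra code: `md5_token`, `order_preserving_token` and `range_scan` are hypothetical names, and the MD5 hash only loosely stands in for a hash-based partitioner. The idea is that a scan walks rows in token order, so it only returns a contiguous key range when token order matches key order.

```python
import hashlib


def md5_token(key: str) -> int:
    """Hash-style token (assumption: loosely models a hash-based partitioner)."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


def order_preserving_token(key: str) -> str:
    """Order-preserving token: the key itself, so token order equals key order."""
    return key


def range_scan(keys, token_fn, start, end):
    """Return keys whose token lies between the tokens of start and end, in token order."""
    lo, hi = sorted([token_fn(start), token_fn(end)])
    ordered = sorted(keys, key=token_fn)
    return [k for k in ordered if lo <= token_fn(k) <= hi]


keys = [f"user{i:03d}" for i in range(100)]
wanted = [k for k in keys if "user010" <= k <= "user020"]

# With order-preserving tokens the scan returns exactly the requested key range.
ordered_scan = range_scan(keys, order_preserving_token, "user010", "user020")

# With hashed tokens the rows are scattered, so the same scan returns
# whatever keys happen to hash between the two endpoints.
hashed_scan = range_scan(keys, md5_token, "user010", "user020")

# True only if hashing happened to keep the keys in sorted order.
hash_keeps_key_order = sorted(keys, key=md5_token) == sorted(keys)
```

The trade-off is that hashing spreads rows evenly across nodes, while an order-preserving scheme keeps neighbouring keys together and can concentrate load on a few nodes.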

All of these systems behave differently and have rather subtle nuances which take time to understand; only proceed with development against something that your team are happy using. You will probably have to support it for some time, and migration will be very expensive.

answered Jul 12 '10 at 18:00


Mark R

If it is an important decision, but you haven't made this decision before and aren't familiar with the different options, how do you decide?

(Jul 12 '10 at 18:10) Joseph Turian ♦♦

Spend some time researching the available options. Get some machines, preferably reasonable-spec hardware (i.e. not VMs), and install the candidate systems on them. Write some simulators to load data and run queries, and get a feel for how each one "works" from a development and ops standpoint. Repeat until you think you have an option your team is happy to work with. You will need to maintain this for a long time, and choosing the right option over the wrong one could save you many man-years. Spending a month or two upfront should give a decent payback.
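
A simulator of the kind described above could start as small as the following Python sketch. Everything in it is an assumption made for illustration: `InMemoryStore` is a stand-in that you would replace with a thin wrapper around each candidate's real client, and `run_simulation` is a hypothetical helper name. A realistic test would also use your own data shapes, concurrency levels and failure scenarios.

```python
import random
import statistics
import time


class InMemoryStore:
    """Stand-in store; swap in a wrapper around each candidate system's client."""

    def __init__(self):
        self._rows = {}

    def put(self, key, value):
        self._rows[key] = value

    def get(self, key):
        return self._rows.get(key)


def run_simulation(store, n_writes=1000, n_reads=1000, seed=0):
    """Load n_writes rows, read n_reads random keys back, and report timings."""
    rng = random.Random(seed)  # fixed seed so runs are comparable across stores
    write_times, read_times = [], []

    for i in range(n_writes):
        key, value = f"key{i}", rng.random()
        started = time.perf_counter()
        store.put(key, value)
        write_times.append(time.perf_counter() - started)

    misses = 0
    for _ in range(n_reads):
        key = f"key{rng.randrange(n_writes)}"
        started = time.perf_counter()
        if store.get(key) is None:
            misses += 1
        read_times.append(time.perf_counter() - started)

    return {
        "writes": len(write_times),
        "reads": len(read_times),
        "misses": misses,
        "median_write_s": statistics.median(write_times),
        "median_read_s": statistics.median(read_times),
    }


if __name__ == "__main__":
    print(run_simulation(InMemoryStore()))
```

Running the same workload against each candidate gives numbers that are at least comparable, which is usually more informative than published benchmarks.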

(Jul 26 '10 at 07:48) Mark R

It might sound like a bit of a tautology, but you should use it when you need the specific features that it has and are not concerned about the features it lacks. First and foremost is well-proven distribution/replication from small scale to very large, without having to change your operational model (e.g. by having to add sharding or replication yourself). It also has a richer data model than some of the alternatives within that space but a somewhat simpler conflict-resolution model, and the importance of having an active community around it should not be dismissed.

As for disadvantages, the most obvious one is the learning curve. Cassandra is not transactional. It does not provide any atomicity for updates across rows, and even for updates within a row you have to be careful to avoid races. If you do need transactions and would end up implementing them in some form with Cassandra, you're probably Doing It Wrong. The same concern applies somewhat to secondary indices. To use Cassandra effectively you'll need to learn about columns and supercolumns, and about denormalizing your data, which is also going to increase the total amount of space you use compared to a traditional DB.
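
The kind of within-row race mentioned above can be shown with a small Python sketch. It is a toy model only: `ToyColumn` is a hypothetical class, and it assumes conflicts are resolved by keeping the write with the newest timestamp. Two clients each do a read-modify-write on the same counter, and one increment is silently lost.

```python
class ToyColumn:
    """One column value with a write timestamp; the newest timestamp wins."""

    def __init__(self):
        self.value = None
        self.timestamp = -1

    def write(self, value, timestamp):
        # A write carrying an older (or equal) timestamp is ignored.
        if timestamp > self.timestamp:
            self.value = value
            self.timestamp = timestamp

    def read(self):
        return self.value


counter = ToyColumn()
counter.write(0, timestamp=0)

# Both clients read the current value before either one writes.
seen_by_a = counter.read()
seen_by_b = counter.read()

# Each client writes back "what I read, plus one".
counter.write(seen_by_a + 1, timestamp=1)
counter.write(seen_by_b + 1, timestamp=2)

# Two increments were attempted, but the stored value is 1, not 2.
final_value = counter.read()
```

Nothing in the store reports an error here, which is why read-modify-write patterns need care when there are no transactions to fall back on.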

Basically, Cassandra solves a bunch of Very Hard distributed-system problems that very few other systems solve. If you have a large and growing data set, which fits Cassandra's data/consistency model better than (for example) Riak's or Voldemort's, then you should probably at least investigate how it solves those problems before you even consider solving them yourself.

answered Jul 08 '10 at 09:35


Obdurodon


Asked: Jul 07 '10 at 16:10

Seen: 2,002 times

Last updated: Mar 08 '11 at 02:40


User submitted content is under Creative Commons: Attribution - Share Alike; Other things copyright (C) 2010, MetaOptimize LLC.