Cassandra
Neutral
- it inherits part of Google's Bigtable architecture (the data file layout) and part of Amazon's Dynamo distribution model;
Advantages
- architectural simplicity:
- all nodes are equal in responsibility;
- there is no single point of failure;
- it is aware of how nodes are organized into racks and data centers;
- data consistency:
- data IO:
- network interface:
the Thrift interface is brittle and can easily bring a node down (ref);
- backups:
it is possible to back up and restore each node individually (ref);
- deployment:
- to deploy, just copy the JARs and the config file, then start the node;
Disadvantages
- data distribution:
- because of the ring topology and the initial token assignment, data can easily become unbalanced when using the order-preserving partitioner (by default it uses the random partitioner, which effectively turns the ring into a hash table);
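The imbalance noted above can be illustrated with a small sketch. It assumes a hypothetical 4-node ring with evenly spaced initial tokens and a 128-bit token space; `hashed_token`, `ordered_token`, and `owner` are names invented for this example and are not Cassandra's actual token computation. It only contrasts hash-based placement with order-preserving placement for keys that share a common prefix.

```python
import bisect
import hashlib
from collections import Counter

TOKEN_SPACE = 2**128  # illustrative token space (the size of an MD5 digest)
NODE_COUNT = 4        # hypothetical cluster size

# Evenly spaced initial tokens; node i owns every token up to and
# including NODE_TOKENS[i].
NODE_TOKENS = [(i + 1) * TOKEN_SPACE // NODE_COUNT - 1 for i in range(NODE_COUNT)]


def hashed_token(key: str) -> int:
    """Hash-based token, in the spirit of a random partitioner."""
    return int.from_bytes(hashlib.md5(key.encode()).digest(), "big")


def ordered_token(key: str) -> int:
    """Order-preserving token: the first 16 bytes of the key read as a number."""
    raw = key.encode()[:16].ljust(16, b"\x00")
    return int.from_bytes(raw, "big")


def owner(token: int) -> int:
    """Index of the first node whose token is >= the given token."""
    return bisect.bisect_left(NODE_TOKENS, token)


# Keys with a shared prefix, as is common for application-generated row keys.
keys = [f"user{i:04d}" for i in range(1000)]
hashed_load = Counter(owner(hashed_token(k)) for k in keys)
ordered_load = Counter(owner(ordered_token(k)) for k in keys)

print("hash-based placement:      ", sorted(hashed_load.items()))
print("order-preserving placement:", sorted(ordered_load.items()))
```

In this sketch every `user...` key lands on the same node under order-preserving placement, because all keys start with the same byte, while the hash-based tokens spread the keys over all four nodes.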
- data consistency:
- conflicts are resolved based on the client-provided timestamp;
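A minimal sketch of why timestamp-based resolution is listed as a disadvantage, assuming a simple "highest timestamp wins" rule. `Cell` and `reconcile` are hypothetical names for this example, and the tie-breaking choice is arbitrary here rather than taken from Cassandra.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    value: str
    timestamp: int  # supplied by the client, e.g. microseconds since the epoch


def reconcile(a: Cell, b: Cell) -> Cell:
    """Keep the version with the higher client-provided timestamp.

    On a tie this sketch simply keeps `a` (an assumption for illustration).
    """
    return a if a.timestamp >= b.timestamp else b


# Client A's clock runs 500 microseconds ahead; client B's clock is accurate.
# B writes *after* A in real time, but carries the lower timestamp.
write_from_a = Cell(value="written first", timestamp=1_000_500)
write_from_b = Cell(value="written second", timestamp=1_000_000)

winner = reconcile(write_from_a, write_from_b)
print(winner.value)
```

The later write loses because the outcome depends entirely on the clocks of the clients, not on the order in which the cluster received the writes.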
- data IO:
a read can go to any node, which then proxies it to the replicas (ref);
the proxy turns one read into multiple reads against multiple replicas (depending on the consistency level) (ref);
high read latency (ref);
- limited support for in-memory caching (it is unclear from the documentation how caching is done, but I suspect that it always reads from the file, on the assumption that the operating system handles file system caching);
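The read fan-out described above can be sketched as follows, assuming the usual meaning of the `ONE`, `QUORUM`, and `ALL` consistency levels (a quorum being a majority of the replicas). The function names and the latency figures are made up for illustration, and the model ignores the extra hop to the proxy node and any other work done during a read.

```python
def replicas_to_wait_for(consistency_level: str, replication_factor: int) -> int:
    """Number of replica responses the proxy node needs before answering."""
    required = {
        "ONE": 1,
        "QUORUM": replication_factor // 2 + 1,  # a majority of the replicas
        "ALL": replication_factor,
    }
    return required[consistency_level]


def read_latency_ms(replica_latencies_ms: list[float], consistency_level: str) -> float:
    """Latency of one read: the proxy waits for the k-th fastest replica."""
    k = replicas_to_wait_for(consistency_level, len(replica_latencies_ms))
    return sorted(replica_latencies_ms)[k - 1]


# Hypothetical response times of three replicas for the same read.
latencies = [4.0, 9.0, 30.0]
for level in ("ONE", "QUORUM", "ALL"):
    print(level, read_latency_ms(latencies, level))
```

Under these assumptions, a stricter consistency level makes each read wait for a slower replica, which is one way the fan-out contributes to read latency.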
- schema:
- it is unclear whether adding a new column family requires a cluster restart;
- Java heritage:
it is quite easy to trigger an OutOfMemory condition and bring down a node (ref);