Cassandra Key Value Service Configuration

Enabling Cassandra for your Application

To enable your application to be backed by cassandra, you just need to add Cassandra KVS as a runtime dependency. In gradle this looks like:

runtime 'com.palantir.atlasdb:atlasdb-cassandra:<atlas version>'

e.g.

runtime 'com.palantir.atlasdb:atlasdb-cassandra:0.7.0'

Configuring a Running Application to Use Cassandra

A minimal AtlasDB configuration for running against cassandra will look like the below.

Importantly - your lock creator must be consistent across all nodes. If you do not provide a lock creator, it will default to the first host in the leaders list. If you do not specify a lock creator, the leaders block should be exactly the same across all nodes.

atlasdb:
  keyValueService:
    type: cassandra
    servers:
      - cassandra:9160
    poolSize: 20
    keyspace: yourapp
    credentials:
      username: cassandra
      password: cassandra
    sslConfiguration:
      trustStorePath: var/security/truststore.jks
    replicationFactor: 1
    mutationBatchCount: 10000
    mutationBatchSizeBytes: 10000000
    fetchBatchCount: 1000
    safetyDisabled: false
    autoRefreshNodes: false

  leader:
    # This should be at least half the number of nodes in your cluster
    quorumSize: 2
    learnerLogDir: var/data/paxosLogs
    acceptorLogDir: var/data/paxosLogs
    # This should be different for every node. If ssl is not enabled, then the host must be specified as http
    localServer: https://<yourhost>:3828
    # This should be the same for every node. If ssl is not enabled, then the host must be specified as http
    lockCreator: https://host1:3828
    # This should be the same for every node
    leaders:
      - https://host1:3828 # If ssl is not enabled, then the hosts must be specified as http
      - https://host2:3828
      - https://host3:3828

For more details on the leader block, see Leader Configuration.

Communicating Over SSL

Atlas currently supports two different ways of specifying SSL options in the Cassandra KVS configuration: The sslConfiguration block and the ssl property. Both means are supported but ssl is preferred and will always be respected in favor of sslConfiguration when both are specified.

The following table summarizes whether SSL is enabled:

  sslConfiguration not present sslConfiguration present
ssl not present no yes
ssl is true yes yes
ssl is false no no

sslConfiguration

This object is specified according to the palantir/http-remoting library. It directly specifies all aspects of the ssl configuration, instead of reading them from system properties. The only required property is trustStorePath, as seen in the example above. In order to configure 2-way SSL, you would also have to set the optional properties keyStorePath and keyStorePassword.

ssl

This property is a boolean value saying whether or not to use ssl. When true, it will use java system properties that are passed in as jvm arguments to determine how to set up the ssl connection. For example, you would use the jvm option -Djavax.net.ssl.trustStore=<path-to-truststore> to tell atlas where to find the truststore to use.

Column Paging for Sweep (experimental)

If timestampsGetterBatchSize is set, the maximum number of entries loaded into memory for any Cassandra node during a Sweep will be limited.

Currently Cassandra does not provide a way to fetch columns and timestamps without also temporarily loading values into memory. Therefore, running a sweep job on a Cassandra-backed KVS with rows that (1) contain large (>1MB) values, and (2) are frequently updated, may cause the Cassandra node to run out of memory.

In such cases, limiting the value of timestampsGetterBatchSize (which is infinite by default) could result in greater reliability. On the other hand, more aggressive paging could lead to slower sweep performance.