Cassandra API Recommendations for Java

Avatar

Theo

The Cassandra API lets you have the maintenance and elasticity benefits of a powerful cloud-native database platform like Azure Cosmos DB, while still being able to consume the programmability layer of one of the most popular and widely used open-source databases: Apache Cassandra!

You can communicate with Cassandra API through any CQL Binary v4 wire protocol compliant open-source Apache Cassandra client driver. While you should not need to make any substantial code changes to existing apps using Apache Cassandra, there are some approaches and settings that we recommend for Cassandra API in Cosmos DB that will improve the experience.

In this blog we are going to be focusing on Java, specifically version 4 of the Cassandra client driver for Java. We have created an extension that you can implement without any code changes (just update pom.xml and application.conf) for a better overall experience. You may also want to try out this code sample which walks through an implementation of this extension, specifically to illustrate the retry policy and load balancing policy. When you take a dependency on the extension, the following settings and policies are applied within it’s reference.conf file (you can also override these settings in application.conf). In this blog we’ll talk you through these settings, why they are recommended, and in what circumstances you may want to override them.

 

Authentication

PlainTextAuthProvider is used by default. This is because the Cosmos DB Cassandra API requires authentication, and uses plain text authentication.

    auth-provider {
      class = PlainTextAuthProvider
    }

 

Connection

Cosmos DB load-balances requests against a large number of backend nodes. The default settings in the extension for local and remote node sizes work well in development, test, and low-volume production or staging environments. However, you should increase these values (i.e. override them within an application.conf file) based on the Request Units (RUs) provisioned for your database. See table below for suggested settings depending on the amount of throughput provisioned:

Request Units (RUs) local.size remote.size
100,000 50-100 50-100
200,000+ 100 100
    connection {
      pool {
        local {
          size = 10
        }
        remote {
          size = 10
        }
      }
    }

 

Token map

Token maps are disabled because they are not relevant to routing when the v4 Java Driver is used to access a Cosmos DB Cassandra instance.

    metadata {
      token-map {
        enabled = false
      }
    }

 

Reconnection Policy

The Java v4 driver provides two implementations out-of-the-box: ExponentialReconnectionPolicy and ConstantReconnectionPolicy. ExponentialReconnectionPolicy is the default policy. However, we recommend ConstantReconnectionPolicy for Cassandra API, with a base-delay of 2 seconds:

    reconnection-policy {
      class = ConstantReconnectionPolicy
      base-delay = 2 second
    }

 

Retry Policy

The retry policy handles errors such as OverLoadedException (which may occur due to rate limiting), and uses an exponential growing back-off scheme for retries. The time between retries is increased by a growing back off time (default: 1000 ms) on each retry, unless maxRetryCount is -1, in which case it backs off with a fixed duration.

The default retry policy in the Java Driver does not handle this exception, which is why we have prepared this custom policy for Cassandra API. It is important to handle rate limiting in Azure Cosmos DB to prevent errors when provisioned throughput has been exhausted. The parameters for the retry policy are defined within reference.conf of the Cosmos DB extension for Cassandra API, see below:

    retry-policy {
      class = com.azure.cosmos.cassandra.CosmosRetryPolicy
      max-retries = 5              
      fixed-backoff-time = 5000    
      growing-backoff-time = 1000  
    }

 

SSL Connection

The DefaultSslEngineFactory is used by default. This is because Cosmos Cassandra API requires SSL:

    ssl-engine-factory {
      class = DefaultSslEngineFactory
    }

 

Load balancing policy

The default load balancing policy in the v4 driver inhibits application-level failover, and specifying a single local datacenter for the CqlSession object is required by the policy. The native Apache Cassandra database is a multi-master system by default, and does not provide an option for single-master with multi-region replication for reads only. The concept of application-level failover to another region for writes is essentially redundant in Apache Cassandra as all nodes are independent and there is no single point of failure, which explains the default policy.

However, Azure Cosmos DB provides the out-of-box ability to configure either single master, or multi-master regions for writes. One of the advantages of having a single master region for writes is the avoidance of cross-region conflict scenarios, and the option of maintaining strong consistency across multiple regions, while still maintaining a level of high availability. Thus, unlike native Apache Cassandra, Azure Cosmos DB allows you to navigate the trade-off between Recovery-point-objective (RPO) and Recovery-time-objective (RTO) at a platform level. You can read more about RPO and RTO as it applies to the Cosmos DB platform here.

The class for the Cosmos DB custom load balancing policy (CosmosLoadBalancingPolicy) is referenced in reference.conf of the extension for Cassandra API, but the values for global-endpoint, read-datacenter, and write-datacenter should be overriden in an application.conf within your code. When global-endpoint is specified, you may specify a read-datacenter, but must not specify a write-datacenter. Writes will go to the default write region when a global-endpoint is specified. When global-endpoint is not specified, you must specify values for read-datacenter and write-datacenter.

    load-balancing-policy {
      global-endpoint = ""
      read-datacenter = "Australia East"
      write-datacenter = "UK South"
    }

Preferred regions

The load balancing policy mentioned above also include a “preferred regions” feature. This allows you to configure deterministic failover to specified regions in a multi-region deployment, in case of regional outages. The policy uses the regions in the list you specify, in priority order as determined by preferred-regions, to perform operations. If preferred-regions is null or not present, the policy uses the region specified in read-datacenter for reads, and either write-datacenter or global-endpoint to determine the region for writes. If neither preferred-regions or read-datacenter are present, the write region is the preferred location for all operations. When multi-region writes are enabled on the Cosmos DB account (and multi-region-write is set to true), the priority order for writes will be exactly the same as for reads.

    load-balancing-policy {
      multi-region-writes = false
      global-endpoint = ""
      read-datacenter = "Australia East"
      write-datacenter = "UK South"
      preferred-regions = ["Australia East", "UK West"]
    }

Timeouts

A request timeout of 60 seconds provides a better out-of-box experience than the default value of 2 seconds. Adjust this value up or down based on workload and Cosmos Cassandra throughput provisioning. The more throughput you provision, the lower you might set this value.

    request {
      timeout = "60 seconds"
    }

Get started

Create a new account using the Azure Portal, ARM template or Azure CLI and connect to it using your favorite tools. Check out the code sample which walks through an implementation of the extension referred to above, specifically to illustrate the retry policy and load balancing policy. We’d love to hear your feedback as well at askcosmosdbcassandra@microsoft.com. Stay up-to-date on the latest Azure #CosmosDB news and features by following us on Twitter @AzureCosmosDB. We are really excited to see what you will build with Azure Cosmos DB!

 

0 comments

Leave a comment