Steering ZEE5’s Migration to ScyllaDB

Eliminating cloud vendor lock-in and supporting rapid growth – with a 5X cost reduction

Kishore Krishnamurthy, CTO at ZEE5, recently shared his perspective on Zee’s massive, business-critical database migration as the closing keynote at ScyllaDB Summit. You can read highlights of his talk below.

All of the ScyllaDB Summit sessions, including Mr. Krishnamurthy’s session and the tech talk featuring Zee engineers, are available on demand.

Watch On Demand

About Zee

Zee is a 30-year-old publicly-listed Indian media and entertainment company. We have interests in broadcast, OTT, studio, and music businesses. ZEE5 is our premier OTT streaming service, available in over 190 countries. We have about 150M monthly active users. We are available on web, Android, iOS, smart TVs, and various other television and OEM devices.

Business Pressures on Zee

The media industry around the world is under a lot of bottom-line pressure. The broadcast business is now moving to OTT. While a lot of this business is being effectively captured on the OTT side, the business models on OTT are not scaling up similarly to the broadcast business. In this interim phase, as these business models stabilize, etc., there is a lot of pressure on us to run things very cost-efficiently.

This problem is especially pronounced in India. Media consumption is on the rise in India. The biggest challenge we face is in terms of monetizability: our revenue is in rupees while our technology expenses are in dollars. For example, what we make from one subscription customer in one year is what Netflix or Disney would make in a month.

We are on the lookout for technology and vendor partners who have a strong India presence, who have a cost structure that aligns with the kind of scale we provide, and who are able to provide the cost efficiency we want to deliver.

Technical Pressures on the OTT Platform

A lot of the costs we had on the platform scale linearly with usage and user base. That’s particularly true for some aspects of the platform, like our heartbeat API, which was the primary use case driving our consideration of ScyllaDB. The linear cost escalation limited the frequency at which we could run solutions like heartbeat. A lot of our other solutions – like our playback experience, our security, and our recommendation systems – leverage heartbeat in their core infrastructure. Given the cost limitation, we could never scale that up.

We also had challenges in terms of the distributed architecture of the solution we had. We were working towards a multi-tenant solution. We were exploring cloud-neutral solutions, etc.

What Was Driving the Database Migration

Sometime last year, we decided that we wanted to get rid of cloud vendor lock-in. Every solution we were looking for had to be cloud-neutral, and the database choice also needed to deliver on that. We were also in the midst of a large merger. We wanted to make sure that our stack was multitenant and ready to onboard multiple OTT platforms. So these were the reasons we were eagerly looking for a solution that was both cloud-neutral and multitenant.

Top Factors in Selecting ScyllaDB

We wanted to move away from the master-slave architecture we had. We wanted our solution to be infinitely scalable. We wanted the solution to be multi-region-ready. One of the requirements from a compliance perspective was to be ready for any kind of regional disaster. When we came up with a solution for multi-region, the cost became significantly higher.

We wanted a high-availability, multi-region solution, and ScyllaDB’s clustered architecture allowed us to do that and to move away from cloud vendor lock-in. ScyllaDB allowed us to cut dependencies on the cloud provider. Today, we can run a multi-cloud solution on top of ScyllaDB. We also wanted to make sure that the migration would be seamless. ScyllaDB’s clustered architecture across clouds helped us when we were doing our recent cloud migration, allowing us to make it very seamless.

From a support perspective, the ScyllaDB team was very responsive. They had local support in India, they had a local competency in terms of solution architects who held our hands along the way. So we were very confident we could deliver with their support.

We found that operationally, ScyllaDB was very efficient. We could significantly reduce the number of database nodes when we moved to ScyllaDB. That also meant that the costs came down. ScyllaDB also happened to be a drop-in replacement for the current incumbents like Cassandra and DynamoDB. All of this together made it an easy choice for us to select ScyllaDB over the other database choices we were looking at.

Migration Impact

The migration to ScyllaDB was seamless. I’m happy to say there was zero downtime. After the migration to ScyllaDB, we also did a very large-scale cloud migration. From what we heard from the cloud providers, nobody else in the world had attempted this kind of migration overnight. And our migration was extremely smooth. ScyllaDB was a significant part of all the components we migrated, and that second migration was very seamless as well.

After the migration, we moved about 525M users’ data, including their preferences, login details, session information, watch history, etc., to ScyllaDB. We now have hundreds of millions of heartbeats recorded on ScyllaDB. The overall data we store on ScyllaDB is in the tens of terabytes range at this point.

Our overall cost is a combination of the efficiency that ScyllaDB provides in terms of the reduction in the number of nodes we use, and the cost structure that ScyllaDB provides. Together, this has given us a 5x improvement in cost – that’s something our CFO is very happy with.

As I mentioned before, the support has been excellent. Through both the migrations – first the migration to ScyllaDB and subsequently the cloud migration – the ScyllaDB team was always available on demand during the peak periods. They were available on-prem to support us and hand-hold us through the whole thing. All in all, it’s a combination: the synergy that comes from the efficiency, the cost-effectiveness, and the scalability. The whole is more than the sum of the parts. That’s how I feel about our ScyllaDB migration.

Next Steps

ScyllaDB is clearly a favorite with the developers at Zee. I expect a lot more database workloads to move to ScyllaDB in the near future.

If you’re curious about the intricacies of using heartbeats to track video watch progress, then catch the talk by Srinivas and Jivesh, where they explained the phenomenally efficient system that we have built.

Watch the Zee Tech Talk

Inside a DynamoDB to ScyllaDB Migration

A detailed walk-through of an end-to-end DynamoDB to ScyllaDB migration

We previously discussed ScyllaDB Migrator’s ability to easily migrate data from DynamoDB to ScyllaDB – including capturing events. This ensures that your destination table is consistent and abstracts much of the complexity involved during a migration.

We’ve also covered when to move away from DynamoDB, exploring both technical and business reasons why organizations seek DynamoDB alternatives, and examined what a migration from DynamoDB looks like and how to accomplish it within the DynamoDB ecosystem.

Now, let’s switch to a more granular (and practical) level and walk through an end-to-end DynamoDB to ScyllaDB migration, building on what we discussed in the two previous articles.

Before we begin, a quick heads up: The ScyllaDB Spark Migrator should still be your tool of choice for migrating from DynamoDB to ScyllaDB Alternator – ScyllaDB’s DynamoDB compatible API. But there are some scenarios where the Migrator isn’t an option:

  • If you’re migrating from DynamoDB to CQL – ScyllaDB’s Cassandra compatible API
  • If you don’t require a full-blown migration and simply want to stream a particular set of events to ScyllaDB
  • If bringing up a Spark cluster is overkill at your current scale

You can follow along using the code in this GitHub repository.

End-to-End DynamoDB to ScyllaDB Migration

Let’s run through a migration exercise where:

  • Our source DynamoDB table contains Items with an unknown (but up to 100) number of Attributes
  • The application constantly ingests records to DynamoDB
  • We want to be sure both DynamoDB and ScyllaDB are in sync before fully switching

For simplicity, we will migrate to ScyllaDB Alternator, our DynamoDB compatible API. You can easily transform it to CQL as you go.

Since we don’t want to incur any impact to our live application, we’ll back-fill the historical data via an S3 data export. Finally, we’ll use AWS Lambda to capture and replay events to our destination ScyllaDB cluster.

Environment Prep – Source and Destination

We start by creating our source DynamoDB table and ingesting data to it. The create_source_and_ingest.py script will create a DynamoDB table called source, and ingest 50K records to it. By default, the table will be created in On-Demand mode, and records will be ingested using the BatchWriteItem call serially. We also do not check whether all Batch Items were successfully processed, but this is something you should watch out for in a real production application.
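The repository’s create_source_and_ingest.py is the authoritative version; as a rough illustration of the approach it describes, a minimal sketch might look like the following, assuming boto3, a single hypothetical `pk` hash key, and illustrative attribute values:

```python
import uuid
import boto3

# Hedged sketch only -- see the GitHub repository for the real script.
dynamodb = boto3.client('dynamodb', region_name='us-east-1')

# Create the source table in On-Demand mode (PAY_PER_REQUEST).
dynamodb.create_table(
    TableName='source',
    AttributeDefinitions=[{'AttributeName': 'pk', 'AttributeType': 'S'}],
    KeySchema=[{'AttributeName': 'pk', 'KeyType': 'HASH'}],
    BillingMode='PAY_PER_REQUEST',
)
dynamodb.get_waiter('table_exists').wait(TableName='source')

# Serially ingest 50K records, 25 items per BatchWriteItem call.
# Note: UnprocessedItems is not checked here -- do that in production.
items = [{'pk': {'S': str(i)}, 'attr1': {'S': uuid.uuid4().hex}} for i in range(50_000)]
for start in range(0, len(items), 25):
    batch = items[start:start + 25]
    dynamodb.batch_write_item(
        RequestItems={'source': [{'PutRequest': {'Item': item}} for item in batch]}
    )
```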

The output will look like this, and it should take a couple of minutes to complete:

Next, spin up a ScyllaDB cluster. For demonstration purposes, let’s spin a ScyllaDB container inside an EC2 instance:
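The exact command isn’t reproduced in this text, but spinning up an Alternator-enabled ScyllaDB container looks roughly like this (the image tag and write-isolation value are illustrative defaults):

```bash
# Start ScyllaDB with the Alternator (DynamoDB-compatible) API on port 8080,
# published to the host so the restore scripts and AWS Lambda can reach it.
docker run --name scylla -d -p 8080:8080 scylladb/scylla \
  --alternator-port=8080 \
  --alternator-write-isolation=always
```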

Beyond spinning up a ScyllaDB container, our Docker command line does two things:

  • Exposes port 8080 to the host OS to receive external traffic, which will be required later on to bring in historical data and consume events from AWS Lambda
  • Starts ScyllaDB Alternator – our DynamoDB compatible API, which we’ll be using for the rest of the migration

Once your cluster is up and running, the next step is to create your destination table. Since we are using ScyllaDB Alternator, simply run the create_target.py script to create your destination table:

NOTE: Both source and destination tables share the same Key Attributes. In this guided migration, we won’t be performing any data transformations. If you plan to change your target schema, this would be the time for you to do it. 🙂

Back-fill Historical Data

For back-filling, let’s perform a S3 Data Export. First, enable DynamoDB’s point-in-time recovery:
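With the AWS CLI, enabling point-in-time recovery on the source table looks roughly like this (table name as used throughout this walkthrough):

```bash
# Enable point-in-time recovery on the source table (required for S3 exports).
aws dynamodb update-continuous-backups \
  --table-name source \
  --point-in-time-recovery-specification PointInTimeRecoveryEnabled=true
```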

Next, request a full table export. Before running the below command, ensure the destination S3 bucket exists:
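A hedged example of such an export request, with the table ARN and bucket name shown as placeholders:

```bash
# Request a full export of the source table to an existing S3 bucket.
aws dynamodb export-table-to-point-in-time \
  --table-arn arn:aws:dynamodb:us-east-1:123456789012:table/source \
  --s3-bucket my-dynamodb-export-bucket
```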

Depending on the size of your actual table, the export may take enough time for you to step away to grab a coffee and some snacks. In our tests, this process took around 10-15 minutes to complete for our sample table.

To check the status of your full table export, replace your table ARN and execute:
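For example (the ARN below is a placeholder):

```bash
# List exports for the table and check their status.
aws dynamodb list-exports \
  --table-arn arn:aws:dynamodb:us-east-1:123456789012:table/source
```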

Once the process completes, modify the s3Restore.py script accordingly (our modified version of the LoadS3toDynamoDB sample code) and execute it to load the historical data:

NOTE: To prevent mistakes and potential overwrites, we recommend you use a different Table name for your destination ScyllaDB cluster. In this example, we are migrating from a DynamoDB Table called source to a destination named dest.
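Conceptually, the restore boils down to reading each exported item and writing it to the dest table. A hedged sketch is shown below, assuming the standard DynamoDB S3 export layout (gzipped JSON lines under a data/ prefix) and a hypothetical Alternator endpoint; the repository’s s3Restore.py is the real code:

```python
import gzip
import json
import boto3

# Hedged sketch -- see the repository's s3Restore.py for the real code.
s3 = boto3.client('s3')
alternator = boto3.client(
    'dynamodb',
    endpoint_url='http://10.0.0.10:8080',  # hypothetical ScyllaDB Alternator endpoint
    region_name='us-east-1',
)

bucket = 'my-dynamodb-export-bucket'                    # placeholder bucket
prefix = 'AWSDynamoDB/01234567890123-abcdefgh/data/'    # placeholder export prefix

paginator = s3.get_paginator('list_objects_v2')
for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
    for obj in page.get('Contents', []):
        body = s3.get_object(Bucket=bucket, Key=obj['Key'])['Body'].read()
        # Each export file contains gzipped JSON lines of the form {"Item": {...}}.
        for line in gzip.decompress(body).splitlines():
            item = json.loads(line)['Item']
            alternator.put_item(TableName='dest', Item=item)
```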

Remember that AWS allows you to request incremental exports after a full export. If you feel the full export took longer than expected, simply repeat the process in smaller steps with subsequent incremental exports. This can be yet another strategy to overcome the DynamoDB Streams 24-hour retention limit.

Consuming DynamoDB Changes

With the historical restore completed, let’s get your ScyllaDB cluster in sync with DynamoDB. Up to this point, we haven’t necessarily made any changes to our source DynamoDB table, so our destination ScyllaDB cluster should already be in sync.

Create a Lambda function

Name your Lambda function and select Python 3.12 as its Runtime:

Expand the Advanced settings option, and select the Enable VPC option. This is required, given that our Lambda will directly write data to ScyllaDB Alternator, currently running in an EC2 instance. If you omit this option, your Lambda function may be unable to reach ScyllaDB.

Once that’s done, select the VPC attached to your EC2 instance, and ensure that your Security Group allows Inbound traffic to ScyllaDB. In the screenshot below, we are simply allowing inbound traffic to ScyllaDB Alternator’s port:

Finally, create the function.

NOTE: If you are moving away from the AWS ecosystem, be sure to attach your Lambda function to a VPC with external traffic. Beware of the latency across different regions or when traversing the Internet, as it can greatly delay your migration time. If you’re migrating to a different protocol (such as CQL), ensure that your Security Group allows routing traffic to ScyllaDB relevant ports.

Grant Permissions

A Lambda function needs permissions to be useful. We need to be able to consume events from DynamoDB Streams and load them to ScyllaDB.

Within your recently created Lambda function, go to Configuration > Permissions. From there, click the IAM role defined for your Lambda:

This will take you to the IAM role page of your Lambda. Click Add permissions > Attach policies:

Lastly, proceed with attaching the AWSLambdaInvocation-DynamoDB policy.

Adjust the Timeout

By default, a Lambda function runs for only about 3 seconds before AWS kills the process. Since we expect to process many events, it makes sense to increase the timeout to something more meaningful.

Go to Configuration > General Configuration, and Edit to adjust its settings:

Increase the timeout to a high enough value (we left it at 15 minutes) to allow you to process a series of events. Also make sure that hitting the timeout doesn’t cause DynamoDB Streams to treat the batch as failed, which would effectively send you into an infinite retry loop.

You may also adjust other settings, such as Memory, as relevant (we left it at 1Gi).

Deploy

After the configuration steps, it is time to finally deploy our logic! The dynamodb-copy folder contains everything needed to help you do that.

Start by editing the dynamodb-copy/lambda_function.py file and replace the alternator_endpoint value with the IP address and port relevant to your ScyllaDB deployment.

Lastly, run the deploy.sh script and specify the Lambda function to update:
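A minimal version of such a deploy script, assuming only lambda_function.py needs to be bundled (the real deploy.sh is in the repository), could look like:

```bash
#!/usr/bin/env bash
# Hypothetical deploy script: package the Lambda code and push it.
set -euo pipefail

FUNCTION_NAME="$1"

(cd dynamodb-copy && zip -r ../lambda.zip lambda_function.py)

aws lambda update-function-code \
  --function-name "$FUNCTION_NAME" \
  --zip-file fileb://lambda.zip
```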

NOTE: The Lambda function in question simply issues PutItem calls to ScyllaDB in a serial way, and does nothing else. For a realistic migration scenario, you probably want to handle DeleteItem and UpdateItem API calls, as well as other aspects such as TTL and error handling, depending on your use case.
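For reference, a stripped-down handler in that spirit might look like the sketch below. It is a hedged illustration, not the repository’s exact code: the endpoint and table name are assumptions, and only INSERT/MODIFY events are handled.

```python
import boto3

# Hypothetical ScyllaDB Alternator endpoint -- replace with your own IP and port.
alternator_endpoint = 'http://10.0.0.10:8080'

client = boto3.client('dynamodb', endpoint_url=alternator_endpoint,
                      region_name='us-east-1')

def lambda_handler(event, context):
    # Each record carries the DynamoDB Streams image of a changed item.
    for record in event.get('Records', []):
        if record['eventName'] in ('INSERT', 'MODIFY'):
            new_image = record['dynamodb'].get('NewImage')
            if new_image:
                # NewImage is already DynamoDB JSON, so it can be passed
                # straight to PutItem against the Alternator endpoint.
                client.put_item(TableName='dest', Item=new_image)
    # DeleteItem/UpdateItem handling, TTL and error handling are omitted here.
    return {'status': 'ok'}
```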

Capture DynamoDB Changes

Remember that our application is continuously writing to DynamoDB, and our goal is to ensure that all records ultimately end up in ScyllaDB, without incurring any data loss.

At this step, we’ll enable DynamoDB Streams to capture change events as they happen. To accomplish that, simply turn on DynamoDB Streams to capture item-level changes:

In View Type, specify that you want a New Image capture, and proceed with enabling the feature:
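If you prefer the CLI over the console, enabling the stream with a New Image view type looks roughly like this:

```bash
# Enable DynamoDB Streams on the source table with the NEW_IMAGE view type.
aws dynamodb update-table \
  --table-name source \
  --stream-specification StreamEnabled=true,StreamViewType=NEW_IMAGE
```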

Create a Trigger

At this point, your Lambda is ready to start processing events from DynamoDB. Within the DynamoDB Exports and streams configuration, let’s create a Trigger to invoke our Lambda function every time an item gets changed:

Next, choose the previously created Lambda function, and adjust the Batch size as needed (we used 1000):
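The same trigger can also be created from the CLI; in the hedged example below, the function name and stream ARN are placeholders:

```bash
# Create the trigger (event source mapping) from the DynamoDB stream to the Lambda.
aws lambda create-event-source-mapping \
  --function-name dynamodb-to-scylladb \
  --event-source-arn arn:aws:dynamodb:us-east-1:123456789012:table/source/stream/2024-01-01T00:00:00.000 \
  --starting-position LATEST \
  --batch-size 1000
```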

Once you create the trigger, data should start flowing from DynamoDB Streams to ScyllaDB Alternator!

Generate Events

To simulate an application frequently updating records, let’s simply re-execute the initial create_source_and_ingest.py program: It will insert another 50K records into DynamoDB, whose Attributes and values will be very different from the ones already in ScyllaDB:

The program detects that the source table already exists and simply overwrites all of its existing records. The new records are then captured by DynamoDB Streams, which triggers our previously created Lambda function, which in turn streams them to ScyllaDB.

Comparing Results

It may take a few minutes for your Lambda to catch up and ingest all events into ScyllaDB (did we say coffee?), but ultimately both databases should end up in sync.

Our final step is simply to compare the records in both databases. Here, you can either compare everything or just a selected subset.

To assist you with that, here’s our final program: compare.py! Simply invoke it, and it will compare the first 10K records across both databases and report any mismatches it finds:
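The comparison logic boils down to scanning both tables and diffing by key. A hedged sketch is shown below; the key attribute name (`pk`), endpoint, and table names are assumptions, and compare.py in the repository is the authoritative version:

```python
import boto3

# Hedged sketch of the comparison logic -- see compare.py in the repository.
dynamodb = boto3.client('dynamodb', region_name='us-east-1')
alternator = boto3.client('dynamodb', endpoint_url='http://10.0.0.10:8080',
                          region_name='us-east-1')

def first_items(client, table, limit=10_000):
    # Scan up to `limit` items, keyed by the (assumed) 'pk' attribute.
    items, start_key = {}, None
    while len(items) < limit:
        kwargs = {'TableName': table, 'Limit': min(1000, limit - len(items))}
        if start_key:
            kwargs['ExclusiveStartKey'] = start_key
        page = client.scan(**kwargs)
        for item in page['Items']:
            items[item['pk']['S']] = item
        start_key = page.get('LastEvaluatedKey')
        if not start_key:
            break
    return items

source_items = first_items(dynamodb, 'source')
dest_items = first_items(alternator, 'dest')

mismatches = [pk for pk, item in source_items.items() if dest_items.get(pk) != item]
print(f'Compared {len(source_items)} items, found {len(mismatches)} mismatches')
```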

Congratulations! You moved away from DynamoDB! 🙂

Final Remarks

In this article, we explored one of the many ways to migrate a DynamoDB workload to ScyllaDB. Your mileage may vary, but the general migration flow should be ultimately similar to what we’ve covered here.

If you are interested in how organizations such as Digital Turbine or Zee migrated from DynamoDB to ScyllaDB, you may want to see their recent ScyllaDB Summit talks. Or perhaps you would like to learn more about different DynamoDB migration approaches? In that case, watch my talk in our NoSQL Data Migration Masterclass.

If you want to get your specific questions answered directly, talk to us!

Inside Natura &Co Global Commercial Platform with ScyllaDB

Filipe Lima and Fabricio Rucci share the central role that ScyllaDB plays in their Global Commercial Platform

Natura, a multi-brand global cosmetics group including Natura and Avon, spans 70 countries and an ever-growing human network of over 7 million sales consultants. Ranked as one of the world’s strongest cosmetics brands, Natura’s operations require processing an intensive amount of consumer data to drive campaigns, run predictive analytics, and support its commercial operations.

In this interview, Filipe Lima (Architecture Manager) and Fabricio Pinho Rucci (Data and Solution Architect) share their insights on where ScyllaDB fits inside Natura’s platform, how the database powers its growing operations, and why they chose ScyllaDB to drive their innovation at scale.

“Natura and ScyllaDB have a longstanding partnership. We consider them as an integral part of our team”, said Filipe, alluding to their previous talk covering their migration from Cassandra to ScyllaDB back in 2018. Natura operations have dramatically scaled since then and ScyllaDB scaled alongside them, supporting their growth.

Here are some key moments from the interview…

Who is Natura?

Filipe: Wherever you go inside ANY Brazilian house, you will find Natura cosmetics there. This alone should give you an idea of the magnitude and reach we have. We are really proud to be present in every Brazilian house! To put that into perspective, Brazil is currently the seventh most populous country, with over 200 million people.

Natura &Co is today one of the largest cosmetics companies in the world. We are made up of two iconic beauty brands:

  • Avon, which should be well-known globally
  • Natura, which has a very strong presence in LATAM

Today Natura’s operations span over a hundred countries, and most of our IT infrastructure is entirely cloud native. Operating at such a large scale does come with its own set of challenges, given our multi-channel presence – such as e-commerce, retail, and mainly direct sales – plus managing over seven million sales consultants and brand representatives.

Natura strongly believes in challenging the status quo in order to promote a real and positive socio-economic impact. Our main values are Cooperation, Co-creation, and Collaboration. Hence, we are Natura &Co.

Given the scale of our operations, it becomes evident that we have several hundreds of different applications and integrations to manage. That brings complexity and challenges to running our 24/7, mission-critical operations.

What is the Global Commercial Platform? Why do you need it?

Filipe: Before we discuss the Global Commercial Platform (GCP), let me provide you with context on why we needed it.

We started this journey around 5 years ago, when we decided to create a single platform to manage all of our direct selling. At the time, we were facing scaling challenges. We lacked a centralized system interface for keeping and managing our business and data rules. And we relied on a loosely coupled infrastructure that had multiple points of failure. All of this could affect our sales process as we grew.

The main reason we decided to build our own platform, rather than purchase an existing one from the market, is that Natura’s business model is very specific and unique. On top of that, given our large product portfolio, integrating and re-architecting all of our existing applications to work with and complement a third-party solution could become a very time-consuming process.

At the time, we called that program LEGO, with 5 main components to manage our sales force, in addition to e-commerce that serves the end consumer.

The LEGO program is defined as five components, each one covering specific parts of the direct selling process – including structuring, channel control, performance and payment of the sales force.

Our five components are as follows:

  • People/Registry (GPP)
  • Relationships (GRP)
  • Order Capture (GSP)
  • Direct Sales Management (GCP)
  • Data Operations (GDP)

The platform responsible for generating data and sales indicators for the other platforms is GCP (Global Commercial Platform). GCP manages, integrates, and handles all rules related to Natura’s commercial models and their relationships, processes KPIs, and processes all intrinsic aspects related to direct selling, such as profits and commission bonuses.

Why and where does ScyllaDB fit?

Fabricio: We have been proud and happy users of ScyllaDB for many years now. Our journey with ScyllaDB started back in 2018. Back then, our old systems were very hard to scale, to the point where it became an impediment to managing our own operations and keeping up with our ongoing innovation.

In 2018 we started this journey of migrating from our previous solution to ScyllaDB. After that, we shifted to AWS, and since then we have been expanding the reach of our platform to other business areas. For example, last year we started using ScyllaDB CDC, and we are currently evaluating multi-region deployments for some of our applications.


The main reason why we decided to shift to ScyllaDB was because of its impressive scaling power.

Our indicator processing requires real-time execution, with the lowest latency possible. We receive several events per second, and the inability to process them in a timely manner would result in a backlog of requests, ruining our users’ experience.

The fact that ScyllaDB scales linearly, both up and out, was also a key decision factor. We started small and later migrated more workloads to it gradually. Whenever we required more capacity, we simply added more nodes, in a planned and conscious way. “Bill shock” was never a problem for us with ScyllaDB.

Our applications are Global (hence the platform’s acronym), and currently span several countries. Therefore, we could no longer work with maintenance windows incurring downtime. We needed a solution that would be always on and process our workloads in an uninterruptible way.

ScyllaDB’s active-active architecture perfectly fits what we were looking for. We plan to cover the Northern Virginia and São Paulo regions on AWS in the near future with a multi-datacenter cluster, and so we can easily ensure strong consistency for our users thanks to ScyllaDB’s tunable consistency.

What else can you tell us about your KPIs and their numbers?

Filipe: One aspect to understand before we talk about the numbers ScyllaDB delivers to us is how our business model works.

In a nutshell, Natura is made by people, and for people. We have Beauty Consultants all around the world bringing our products to the consumer market. The reason the Natura brand is so strong (especially within Brazil) is primarily because we have a culture of consulting the people we trust before making important decisions, such as buying a car or a house.

What typically happens is this: You have a friend who is one of our Beauty Consultants. This friend offers you her products. Since you trust your friend and you like the products, you eventually end up trying them. You realize that you’ve fallen in love with them and decide to keep checking in with your friend as time goes by. Ultimately, you also refer this friend to other friends as people ask which lotion or perfume you’re using, and that’s how it goes.

Now imagine that same situation I described on a much larger scale. Remember: We have over 7 million consultants in our network of people.

Therefore, we need to provide these consultants with incentives and campaigns for them to keep on doing the great job they are doing today. This involves, for example, checking whether they are active, if they recovered after a bad period, or whether they simply ceased engaging with us. If the consultant is a new recruit, it is important that we know this as well because every one of them is treated differently in a personalized way.

That way, by treating our consultants with respect and appreciation, we leverage our platform to help us and them make the best decisions.

Today ScyllaDB powers over 73K indicators, involving data of over 4 million consultants within 6 countries of Latin America. This includes over USD 120M just in orders and transactions. All of this is achieved on top of a ScyllaDB cluster delivering an average throughput of 120K operations per second, with single-digit millisecond latencies of 6 milliseconds for both reads and writes.

How complex is Natura’s Commercial Platform architecture today?

Fabricio: Very complex, as you can imagine for a business of that size!

GCP is primarily deployed within AWS (heh!). We have several input sources coming from our data producers. These involve Sales, our Commercial Structures, Consultants, Orders, Final Customers, etc.

Once these producers send us requests, their submissions enter our data pipelines. The information arrives in queues (MSK) and is consumed using Spark (EMR), some jobs streaming and others batch. The data is transformed according to our business logic and eventually reaches our database layer, which is where ScyllaDB sits.

We of course have other databases in our stack, but for APIs and applications requiring real-time performance and low latency, we end up choosing ScyllaDB as the main datastore.

For querying ScyllaDB we developed a centralized layer for our microservices using AWS Lambda and API Gateway. This layer consults ScyllaDB and then provides the requested information to all consumers that require it.

As for more details about our ScyllaDB deployment, we currently have 12 nodes running on top of AWS i3.4xlarge EC2 instances. Out of the 120K operations I previously mentioned, 35K are writes, with an average latency of 3 milliseconds. The rest are reads, with an average latency of 6 milliseconds.

Enhanced Cluster Scaling for Apache Cassandra®

The Instaclustr Managed Platform now supports the zero-downtime scaling-up of local storage (also known as instance store or ephemeral storage) nodes running Apache Cassandra on AWS, Azure and GCP!

Enhanced Vertical Scaling 

NetApp has released an extension to the Cluster Scaling feature on the Instaclustr Managed Platform, which now supports scaling up the local storage of the Cassandra nodes. This enhancement builds upon the existing network block storage scaling option, offering greater flexibility and control over cluster configurations.

Customers can now easily scale up their Cassandra clusters on demand to respond to growing data needs. This development not only provides unparalleled flexibility and performance, but also distinguishes Instaclustr from competitors who lack support for local storage nodes. 

Local Storage 

Local storage refers to the physical storage directly attached to the node or instance. Unlike network-backed storage (such as EBS), local storage eliminates the need for data to travel over the network, leading to lower latency, higher throughput, and improved performance in data-intensive applications.

Moreover, the cost of local storage is included in the instance pricing, which can make local storage, when used in conjunction with Reserved Instances and similar concepts, the most cost-effective infrastructure choice for many use cases.

Whether you need to store large volumes of data or run complex computational tasks, the ability to scale up local storage nodes gives you the flexibility to manage node sizes based on your requirements. Scaling local storage nodes with minimal disruption is complex. Instaclustr leverages advanced internal tools for on-demand vertical scaling. Our updated replace tool with “copy data” mode streamlines the process without compromising data integrity or cluster health. 

Additionally, this enhancement gives our customers the capability to switch between local storage (ephemeral) and network-based storage (persistent) while scaling up their clusters. Customer storage needs vary over time, ranging from the high I/O performance of local storage to the cost-effectiveness and durability of network-based storage. Instaclustr provides our customers with a variety of storage options to scale up their workloads based on performance and cost requirements.

Not sure where to start with your cluster scaling needs? Read about the best way to add capacity to your cluster here. 

Self-Service 

With this enhancement to the cluster scaling feature, we are expanding Instaclustr’s self-service capabilities, empowering customers with greater control and flexibility over their infrastructure. Scaling up Cassandra clusters has become more intuitive and is just a few clicks away.

This move towards greater autonomy is supported by production SLAs, ensuring scaling operations are completed without data loss or downtime. Cassandra nodes can be scaled up using the cluster scaling functionality available through the Instaclustr console, API, or Terraform provider. Visit our documentation for guidance on seamlessly scaling your storage for the Cassandra cluster. 

While this enhancement allows the scaling up of local storage, downscaling operations are not yet supported via self-service. Should you need to scale down the storage capacity of your cluster, our dedicated Support team is ready to assist.

Scaling up improves performance and operational flexibility, but it will also result in increased costs, so it is important to consider the cost implications. You can review the pricing of the available nodes with the desired storage on the resize page of the console. Upon selection, a summary of current and new costs will be provided for easy comparison. 

Leverage the latest enhancements to our Cluster Scaling feature via the console, API or Terraform provider, and if you have any questions about this feature, please contact Instaclustr Support at any time. 

Unlock the flexibility and ease of scaling Cassandra clusters at the click of a button and sign in now! 

The post Enhanced Cluster Scaling for Apache Cassandra® appeared first on Instaclustr.

ScyllaDB Cloud Network Connectivity: Transit Gateway Connection

ScyllaDB Cloud is a managed NoSQL Database-as-a-Service based on the leading open-source database engine ScyllaDB. It is designed for extreme performance, low latency, and high availability. It is compatible with the Cassandra Query Language (CQL) and Amazon DynamoDB APIs, making it a possible replacement for many solution architects pursuing performance.

This article is about ScyllaDB Cloud network connectivity management. It shows how to connect ScyllaDB Cloud to a customer application using a transit gateway.

ScyllaDB Cloud Network Connectivity

In response to customer demand, ScyllaDB Cloud now offers two solutions for managing connectivity between ScyllaDB Cloud and customer application environments running in Amazon Web Services (AWS) or even in hybrid cloud setups. In addition to VPC peering, we now support Transit Gateway Connection.

Two Connectivity Options

VPC peering connection has been supported for years. It is an excellent solution for connectivity in non-complex network setups. However, managing many VPC peering connections can be cumbersome, error-prone, and challenging to audit.

This problem became more severe at scale, and another tool from the arsenal of AWS became necessary.

The Transit Gateway Connectivity feature was added to ScyllaDB Cloud in February 2024. It enables our customers to use Transit Gateways as part of their configuration. It adds another way to connect to ScyllaDB Clusters. The feature is available in clusters deployed in ScyllaDB Cloud as well as clusters deployed on the customer’s own AWS accounts, via the bring-your-own-account feature (BYOA).

Transit Gateway Connection allows connection using a centralized hub that simplifies the connectivity and routing. This helps organizations connect multiple VPCs, Load Balancers, and VPNs within a single, scalable gateway. It acts as a transit hub for routing traffic between various network endpoints, providing a unified solution for managing connectivity across complex or distributed environments.

The Transit Gateway simplifies the management of intricate networks by introducing a more centralized hub-and-spoke model.

 

That simplifies configuring and monitoring the network connectivity, streamlines operations, and improves visibility.

Moreover, it acts as a central place to apply security policies and access controls – thereby strengthening network security and improving auditability.

Configure Transit Gateway Connection in ScyllaDB Cloud

Configuring the Transit Gateway in ScyllaDB Cloud is straightforward.

Prerequisites

  • ScyllaDB Cloud account. If you don’t have an account, you can use a free account. Get one here: ScyllaDB Cloud.
  • Basic understanding of AWS networking concepts (VPCs, CIDR subnets, Resource Shares, route tables).
  • AWS Account with sufficient permissions to create network configurations.
  • 30 minutes of your time.

Step 1: ScyllaDB Cloud Cluster

Login to ScyllaDB Cloud and create a New Cluster. Depending on your account, you can use either the Free Trial or the Dedicated VM option. For this walkthrough, I am using a Free Trial; if you choose a Dedicated VM, your screens might look slightly different.

Please select AWS as your Cloud Provider and click Next.

As of the time of this blog, we support only the AWS Transit Gateway connection. Support for Network Connectivity Center (NCC) or similar technologies for Google Cloud Platform (GCP) or other clouds will be added over time.

 

On the next screen, Cluster Properties, under Network Type you should select VPC Peering / Transit Gateway to enable network connectivity, and then confirm with the Create Cluster button below.

The ScyllaDB cluster will be created shortly. Once it’s created, select the cluster.

Step 2: Provision AWS Transit Gateway

The following steps for provisioning the transit gateway have to be done in your AWS account. Although a transit gateway can connect to transit gateways in other regions (inter-region peering), it can only attach VPCs from the same region.

The transit gateway should be deployed in the same region as the ScyllaDB cluster. In this case, this is us-east-1.

If you are not familiar with creating a transit gateway, the following guide can help:
Configure AWS Transit Gateway (TGW) VPC Attachment Connection.

It is a good idea to set Auto accept shared attachments. It will make your TGW automatically accept attachment requests from other accounts, including ScyllaDB Cloud.

For the next step, we will need the Transit Gateway ID and the RAM Resource Share ARN.
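If you script your infrastructure instead of using the console, the equivalent AWS CLI calls look roughly like the following; the description, ARNs, and the principal to share with are placeholders that depend on your setup (the linked guide covers the exact values):

```bash
# Create a transit gateway that auto-accepts shared attachments.
aws ec2 create-transit-gateway \
  --description "ScyllaDB Cloud connectivity" \
  --options AutoAcceptSharedAttachments=enable

# Share the transit gateway via AWS RAM (ARN and principal are placeholders).
aws ram create-resource-share \
  --name scylladb-tgw-share \
  --resource-arns arn:aws:ec2:us-east-1:123456789012:transit-gateway/tgw-0123456789abcdef0 \
  --principals 111122223333
```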

Step 3: Configuring Transit Gateway Connection in ScyllaDB Cloud

The next step is to configure the network connection.

Select the Connections tab. You will be navigated to the Connections screen below.

Select Add Transit Gateway Connection and when prompted, provide a custom Name. This is how the connection will be visualized in ScyllaDB Cloud.

The Data Center should be the region where the TGW is located.

Get the AWS RAM ARN from your AWS Resource Share, as configured in the AWS setup in the previous step, and the Transit Gateway ID from AWS Transit Gateways (12-digit ID).

Provide the chosen VPC CIDR to route the traffic and proceed with Add Transit Gateway.

The processing will take some time.

If you followed my advice above, the connection will be accepted automatically. Otherwise, you will have to go to your AWS account to accept the connection.

Once the connection is active, the attached networks from the customer AWS account and the ScyllaDB Cloud should see each other over the networks as defined.

You can go to the ScyllaDB Documentation if you need help verifying the connection.

Cost & Billing

Amazon bills each network component to the AWS account associated with that component. This might be confusing at first: it means that, when connecting components from different accounts, the bills will be sent to different accounts.

In our cost example below, the database clusters are in ScyllaDB Cloud, but the applications are in the customer AWS account. For illustration purposes, we have databases and applications in different regions, with multiple transit gateways and peering. It is a complicated setup designed to show the different categories of charges.

The expenses that occur using Transit Gateway Connection are listed and explained below.

AWS Transit Gateway attachment hourly charges

AWS charges for each network component attached to the transit gateway. The typical cost is $0.05 per attachment per hour (us-east-1). The price varies by region; for ap-northeast-2 it is $0.07.

All VPCs from the diagram above (1, 2, 3, 4, 5, 6, 7, 8) and both transit gateways, TGW1 and TGW2, count as attachments. Other network components, such as VPN connections, VPC peering, and gateways, also count as attachments, but they are excluded from this setup.

In our example, we have the following components:

Components connected to TGW1 (price for attachment $0.05 us-east-1):

VPC 1,2,3,4 will be billed to ScyllaDB Cloud

4 x $0.05 = $0.20 per hour

VPC 5,6,7 and TGW2 will be billed to Customer AWS Account

5 x $0.05 = $0.25 per hour

Components connected to TGW2 (price for attachment $0.07 ap-northeast-2)

VPC 8 and TGW1 will be billed to Customer AWS Account

2 x $0.07 = $0.14 per hour

The monthly charges will be as follows:

 

Transit Gateway data processing charges

Transit Gateway Data Processing costs $0.02 per 1 GB in most regions.

Data processing is charged to the AWS account that owns the VPC. Assuming 2 TB of symmetrical traffic for simplicity.

VPC 1,2,3,4 will be billed to ScyllaDB Cloud 4 x 2048 GB x $0.02 = $163.84

VPC 5,6,7,8 will be billed to Customer AWS Account 4 x 2048 GB x $0.02 = $163.84

The monthly charges will be as follows:

 

Transit Gateway data processing charge across peering attachments

These are the peering fees between two or more Transit Gateways. Only outbound traffic will be charged as standard traffic between regions.

Charges differ by region, but the typical cost is $0.02 per GB.

The monthly charges will be as follows:

 

All charges in the ScyllaDB Cloud column, totaling $324.44, will be included in ScyllaDB Cloud’s monthly billing report and passed back to the customer.

All charges in the Customer Account column will appear in the customer’s AWS bill.

Bring your own account (BYOA)

Customers can use ScyllaDB Cloud to deploy databases in their AWS account.

In this case, all charges will be applied to the customer’s account or accounts. Since there will be no charges in ScyllaDB Cloud, nothing will be passed with the monthly bill.

Combining Transit Gateways, VPC peering, and VPN connections in the same network setup is possible. ScyllaDB Cloud supports this configuration to provide flexibility and cost optimization.

ScyllaDB Cloud Network Connectivity Overview

ScyllaDB Cloud offers two network connectivity features. Both VPC Peering Connection and Transit Gateway Connection enable seamless and efficient connectivity between customer applications and highly performant ScyllaDB database clusters.

We invite you to explore ScyllaDB Cloud and experiment with the network connections to match your unique application requirements and budget.

Cadence® Performance Benchmarking Using Apache Cassandra® Paxos V2

Overview 

Originally developed and open sourced by Uber, Cadence® is a workflow engine that greatly simplifies microservices orchestration for executing complex business processes at scale. 

Instaclustr has offered Cadence on the Instaclustr Managed Platform for more than 12 months now. Having made open source contributions to Cadence to allow it to work with Apache Cassandra® 4, we were keen to test Cadence’s performance with the new features of Cassandra 4.1. 

Paxos is the consensus protocol used in Cassandra to implement lightweight transactions (LWT) that can handle concurrent operations. However, the initial version of Paxos—Paxos v1—achieves linearizable consistency at the high cost of 4 round trips for write operations. This could potentially affect Cadence performance, considering Cadence’s reliance on LWT for executing multi-row, single-shard conditional writes to Cassandra, according to this documentation. 

With the release of Cassandra 4.1, Paxos v2 was introduced, promising a 50% improvement in LWT performance, reduced latency, and a reduction in the number of roundtrips needed for consensus, as per the release notes. Consequently, we wanted to test the impact on Cadence’s performance with the introduction of Cassandra Paxos v2.  

This blog post focuses on benchmarking the performance impact of Cassandra Paxos v2 on Cadence, with a primary performance metric being the rate of successful workflow executions per second. 

Benchmarking Setup 

We used the cadence-bench tool to generate bench loads on the Cadence test clusters; to reduce variables in benchmarking, we only used the basic loads that don’t require the Advanced Visibility feature. 

Cadence bench tests require a Cadence server and bench workers. The following are their setups in this benchmarking: 

Cadence Server 

At Instaclustr, a managed Cadence cluster depends on a managed Cassandra cluster as the persistence layer. The test Cadence and Cassandra clusters were provisioned in their own VPCs and used VPC Peering for inter-cluster communication. For comparison, we provisioned 2 sets of Cadence and Cassandra clusters—Baseline set and Paxos v2 set.  

  • Baseline set: A Cadence cluster dependent on a Cassandra cluster with the default Paxos v1 
  • Paxos v2 set: A Cadence cluster dependent on a Cassandra cluster with Paxos upgraded to v2 

We provisioned the test clusters in the following configurations: 

Application | Version | Node Size | Number of Nodes
Cadence | 1.2.2 | CAD-PRD-m5ad.xlarge-150 (4 CPU cores + 16 GiB RAM) | 3
Cassandra | 4.1.3 | CAS-PRD-r6g.xlarge-400 (4 CPU cores + 32 GiB RAM) | 3

Bench Workers 

We used AWS EC2 instances as the stressor boxes to run bench workers that generate bench loads on the Cadence clusters. The stressor boxes were provisioned in the VPC of the corresponding Cadence cluster to reduce network latency between the Cadence server and bench workers. On each stressor box, we ran 20 bench worker processes. 

 These are the configurations of the EC2 instances used in this benchmarking: 

EC2 Instance Size | Number of Instances
c4.xlarge (4 CPU cores + 7.5 GiB RAM) | 3

Upgrade Paxos 

For the test Cassandra cluster in the Paxos v2 set, we upgraded Paxos to v2 after the cluster reached the running state. 

According to the section Steps for upgrading Paxos in NEWS.txt, upgrading Paxos to v2 on a Cassandra cluster requires setting 2 configuration properties and ensuring Paxos repairs run regularly. We also planned to change the consistency level used for LWT in Cadence to fully benefit from the Paxos v2 improvements. 

We took the following actions to upgrade Paxos on the test Paxos v2 Cassandra cluster: 

Added these configuration properties to the Cassandra configuration file cassandra.yaml:

Configuration Property | Value
paxos_variant | v2
paxos_state_purging | repaired
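In cassandra.yaml itself, those two properties simply read:

```yaml
# Paxos v2 settings added on the Paxos v2 cluster
paxos_variant: v2
paxos_state_purging: repaired
```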

For Instaclustr managed Cassandra clusters, we use an automated service on Cassandra nodes to run Cassandra repairs regularly. 

Cadence sets LOCAL_SERIAL as the consistency level for conditional writes (i.e., LWT) to Cassandra as specified here. Because it’s hard-coded, we could not change it to ANY or LOCAL_QUORUM as suggested in the Steps for upgrading Paxos.  

The Baseline Cassandra cluster was set to use the default Paxos v1, so no configuration changes were required. 

Bench Loads 

We used the following configurations for the basic bench loads to be generated on both the Baseline and Paxos v2 Cadence clusters: 

{
  "useBasicVisibilityValidation": true,
  "contextTimeoutInSeconds": 3,
  "failureThreshold": 0.01,
  "totalLaunchCount": 100000,
  "routineCount": 15 or 20 or 25,
  "waitTimeBufferInSeconds": 300,
  "chainSequence": 12,
  "concurrentCount": 1,
  "payloadSizeBytes": 1024,
  "executionStartToCloseTimeoutInSeconds": 300
}

All the configuration properties except routineCount were kept constant. routineCount defines the number of parallel launch activities that start the stress workflows. In other words, it controls the rate at which concurrent test workflows are generated, and hence can be used to test Cadence’s capability to handle concurrent workflows. 

We ran 3 sets of bench loads with different routineCounts (i.e., 15, 20, and 25) in this benchmarking so we could observe the impacts of routineCount on Cadence performance.  

We used cron jobs on the stressor boxes to automatically trigger bench loads with different routineCount on the following schedules: 

  • Bench load 1 with routineCount=15: Runs at 0:00, 4:00, 8:00, 12:00, 16:00, 20:00 
  • Bench load 2 with routineCount=20: Runs at 1:15, 5:15, 9:15, 13:15, 17:15, 21:15 
  • Bench load 3 with routineCount=25: Runs at 2:30, 6:30, 10:30, 14:30, 18:30, 22:30 
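A crontab along these lines on each stressor box would produce that schedule; the wrapper script path and its routineCount argument are hypothetical:

```bash
# Hypothetical crontab entries on a stressor box; the wrapper script name and
# argument are assumptions, while the schedules match the ones listed above.
0  0,4,8,12,16,20  * * * /opt/cadence-bench/run_bench.sh 15
15 1,5,9,13,17,21  * * * /opt/cadence-bench/run_bench.sh 20
30 2,6,10,14,18,22 * * * /opt/cadence-bench/run_bench.sh 25
```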

Results 

At the beginning of the benchmarking, we observed an aging effect on Cadence performance. Specifically, the key metric—Average Workflow Success/Sec—gradually decreased for both the Baseline and Paxos v2 Cadence clusters as more bench loads were run. 

Why did this occur? Most likely because Cassandra serves requests from its in-memory cache when only a small amount of data is stored, meaning read and write latencies are lower while all data lives in memory. After more rounds of bench loads were executed, Cadence performance became more stable and consistent. 

Cadence Workflow Success/Sec  

Cadence Average Workflow Success/Sec is the key metric we use to measure Cadence performance.  

As demonstrated in the graphs below, and contrary to what we expected, the Baseline Cadence cluster achieved around 10% higher Average Workflow Success/Sec than the Paxos v2 Cadence cluster across a sample of 10 rounds of bench loads. 

Baseline Cadence cluster successfully processed 36.2 workflow executions/sec on average: 

Paxos v2 Cadence cluster successfully processed 32.9 workflow executions/sec on average: 

Cadence History Latency 

Cadence history service is the internal service that communicates with Cassandra to read and write workflow data in the persistence store. Therefore, Average History Latency should in theory be reduced if Paxos v2 truly benefits Cadence performance.  

However, as shown below, Paxos v2 Cadence cluster did not consistently perform better than Baseline Cadence cluster on this metric: 

Baseline Cadence cluster 

Paxos v2 Cadence cluster 

Cassandra CAS Operations Latency 

Cadence only uses LWT for conditional writes (LWT is also called Compare and Set [CAS] operation in Cassandra). Measuring the latency of CAS operations reported by Cassandra clusters should reveal if Paxos v2 reduces latency of LWT for Cadence. 

As indicated in the graphs below and by the mean value of this metric, we did not observe consistently significant differences in Average CAS Write Latency between Baseline and Paxos v2 Cassandra clusters.  

The mean value of Average CAS Write Latency reported by Baseline Cassandra cluster is 105.007 milliseconds. 

The mean value of Average CAS Write Latency reported by Paxos v2 Cassandra cluster is 107.851 milliseconds. 

Cassandra Contended CAS Write Count 

One of the improvements introduced by Paxos v2 is reduced contention. Therefore, to ensure that Paxos v2 was truly running on the Paxos v2 Cassandra cluster, we measured and compared the Contended CAS Write Count metric for the Baseline and Paxos v2 Cassandra clusters. As clearly illustrated in the graphs below, the Baseline Cassandra cluster experienced contended CAS writes while the Paxos v2 Cassandra cluster did not, confirming that Paxos v2 was in effect on the Paxos v2 cluster. 

Baseline Cassandra cluster 

Paxos v2 Cassandra cluster 

Conclusion 

During our benchmarking test, there was no noticeable improvement in the performance of Cadence attributable to Cassandra Paxos v2.   

This may be because we were unable to modify the consistency level in the Lightweight Transactions (LWT) as required to fully leverage Cassandra Paxos v2, or may simply be that other factors in a complex, distributed application like Cadence mask any improvement from Paxos v2.  

Future Work 

Although we did not see promising results in this benchmarking, future investigations could focus on assessing the latency of LWTs in Cassandra clusters with Paxos v2 both enabled and disabled and with a less complex client application. Such exploration would be valuable for directly evaluating the impact of Paxos v2 on Cassandra’s performance. 

The post Cadence® Performance Benchmarking Using Apache Cassandra® Paxos V2 appeared first on Instaclustr.

New Google Cloud Z3 Instances: Early Performance Benchmarks on ScyllaDB Show up to 24% Better Throughput

ScyllaDB, a high-performance NoSQL database with a close-to-the-metal architecture, had the privilege of examining Google Cloud’s Z3 GCE instances in an early preview. The Z3 machine series is the first of the Storage Optimized VM GCE offerings. It boasts a remarkable 36 TB of Local SSD. The Z3 series is powered by the 4th Gen Intel Xeon Scalable processor (alias Sapphire Rapids) and DDR5 memory, as well as Google’s custom-built Infrastructure Processing Unit (IPU) that supports Hyperdisk. The Z3 amalgamates the most recent advancements in compute, networking, and storage technologies into a single platform, with a distinct emphasis on a new breed of high-density, high-performance local SSD.

The Z3 series is optimized for workloads that require low latency and high performance access to large data sets. Likewise, ScyllaDB is engineered to deliver predictable low latency, even with workloads exceeding 1M OPS per machine. Google, Intel, and ScyllaDB partnered to test ScyllaDB on the new instances because we were all curious to see how these innovations translated to performance gains with data-intensive use cases.

TL;DR When we tested ScyllaDB on the new Z3 instances, ScyllaDB exhibited a significant throughput improvement across workloads versus the previous generation of N2 instances. We observed a 23% increase in write throughput, 24% for mixed workloads, and 14% for reads per vCPU (z3-highmem-88 vs n2-highmem-96), at a lower cost when compared to N2 with additional fast disks of the same size. On these new instances, a cluster of just 3 ScyllaDB nodes can achieve around 2.2M OPS for writes and mixed workloads and around 3.6M OPS for reads.

Instance Selection: Google Cloud Z3 versus N2

Z3 instances come in 2 shapes: z3-highmem-88 and z3-highmem-176, each boasting 88 and 176 4th Gen Intel(R) Xeon(R) Scalable vCPUs respectively. Each vCPU is bolstered with 8GB memory, culminating in a staggering 1,408 GB for the larger variant.

We conducted a comparative analysis between the Z3 instance and the N2 memory-optimized instances. The N2 instances were our standard choice until now.

The N2 instances are available in a variety of sizes and are designed around two Intel CPU architectures: 2nd and 3rd Gen Intel(R) Xeon(R) Scalable. The 3rd Gen Intel(R) Xeon(R) Scalable architecture is the default choice for larger machines (with 96 vCPUs or more). The n2-highmem also incorporates 8GB per vCPU memory.

The N2 instance reaches its maximum size at 128 vCPUs. Thus, for an equitable comparison, we selected the n2-highmem-96, the closest N2 instance to the smaller Z3 instance, and equipped it with the maximum attachable 24 fast local NVMe disks.

ScyllaDB Benchmark Results: Z3 versus N2 Throughput

Setup and Configuration

Benchmarking such powerful machines requires considerable effort. To mimic user processes on this grand scale, we equipped 30 client instances, each with 30 processing units, to optimize outcomes. This necessitated the development of appropriate scripts for executing load and accumulating results. However, the scylla-cluster-tests testing framework facilitated this process, allowing us to execute all tests with remarkable efficiency.

We measured maximum throughput using the cassandra-stress benchmark tool. To make the workload more realistic, we tuned the row size to 1KB each and set the replication factor to 3. Also, to measure the performance impact of the new generation CPUs, we included workloads that read from cache – removing the influence of disk speed disparities across the different instance types. All results show client-side values, so we measured the complete round trip and confirmed ScyllaDB-side metrics values.
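The exact invocations used in this benchmark aren’t published here, but a cassandra-stress write run along these lines illustrates the shape of the workload (node addresses, operation counts, and thread counts are placeholders; the 1KB column size and RF=3 match the setup described above):

```bash
# Illustrative cassandra-stress write workload: ~1KB rows, RF=3, QUORUM reads/writes.
cassandra-stress write n=10000000 cl=QUORUM \
  -schema "replication(factor=3)" \
  -col "size=FIXED(1024) n=FIXED(1)" \
  -mode native cql3 \
  -rate threads=300 \
  -node 10.0.0.1,10.0.0.2,10.0.0.3
```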

Results

Because of ScyllaDB’s shard-per-core architecture, it is more suitable to show results normalized by vCPU to provide a better sense of the new CPU platform’s capabilities. ScyllaDB exhibited a significant 23% increase in write throughput and a 24% increase in throughput for a mixed workload. Additionally, ScyllaDB achieved a 14% improvement in read throughput.

 

Workload | 4th Gen Intel Xeon [op/s per vCPU] | 3rd Gen Intel Xeon [op/s per vCPU] | diff
Write only | 8.45K op/s | 6.85K op/s | +23%
Read Only (Entirely from Cache) | 13.59K op/s | 11.93K op/s | +14%
Mixed (50/50 Reads/Writes) | 8.63K op/s | 6.94K op/s | +24%

The metrics showed a sustainable number of served requests:

Requests Served per shard (Z3)

Careful readers will notice the graph shows 15K OPS/shard, which is higher than the numbers in the table. This is because 8 vCPUs are reserved exclusively for work with network and disk IRQ; they are not serving requests as part of the ScyllaDB node.

Overall, a cluster of just 3 nodes can achieve around 2.2M OPS for write and mixed workloads and around 3.6M OPS for reads (all measured with QUORUM consistency level). Despite the Z3 instances having 8 fewer vCPUs than the N2 ones, we achieved better performance in all tested workloads, which is an extraordinary accomplishment.

Workload | z3-highmem-88 | n2-highmem-96 | diff
Write only | 2.23M op/s | 1.97M op/s | +13%
Read Only (Entirely from Cache) | 3.6M op/s | 3.43M op/s | +5%
Mixed (50/50 Reads/Writes) | 2.28M op/s | 2.00M op/s | +14%

And this is how the Z3 read workload looks in ScyllaDB Monitoring:

Closing Thoughts

The results of this benchmark highlight how Google Cloud’s new 4th Gen Intel(R) Xeon(R) Scalable-based Z3 platform family brings significant enhancements in terms of CPU, disk, memory, and network performance. The included 36 TB of local SSD capacity makes it more cost-effective than N2 with additional fast disks of the same size. For ScyllaDB users, this translates to substantial gains in throughput while reducing costs for a variety of workloads. We recommend that ScyllaDB users adopt these instances to further reduce their infrastructure footprint while improving performance.

Next steps:

Notices & Disclaimers
Performance varies by use, configuration and other factors. Learn more at intel.com/performanceindex. Your costs and results may vary. Intel technologies may require enabled hardware, software or service activation. © Intel Corporation. Intel, the Intel logo, and other Intel marks are trademarks of Intel Corporation or its subsidiaries. Other names and brands may be claimed as the property of others.

Preview Release of Apache Cassandra® version 5.0 on the Instaclustr Managed Platform

Apache Cassandra® version 5.0 Beta 1.0 is now available in public preview on the Instaclustr Managed Platform!  

This follows on the heels of the project release of Cassandra version 5.0 Beta 1.0. Instaclustr is proud to be the first managed service platform to release this version for deployment on all major cloud providers or on-premises.  

This release is designed to allow anyone to easily undertake any application-specific testing of Apache Cassandra 5.0 (Beta 1.0) in preparation for the forthcoming GA release of Apache Cassandra 5.0. 

Apache Cassandra 5.0 

The last major version of Apache Cassandra was released about three years ago, bringing numerous advantages with it. Cassandra 5.0 is another significant iteration with exciting features that will revolutionize the future of NoSQL databases.  

Cassandra 5.0 brings enhanced efficiency and scalability, performance, and memory optimizations to your applications. Additionally, it expands functionality support, accelerating your AI/ML journey and playing a pivotal role in the development of AI applications. 

Some of the key new features in Cassandra 5.0 include: 

  • Storage-Attached Indexes (SAI): A highly scalable, globally distributed index for Cassandra databases. With SAI, column-level indexes can be added, leading to unparalleled I/O throughput for searches across different data types, including vectors. SAI also enables lightning-fast data retrieval through zero-copy streaming of indices, resulting in unprecedented efficiency. 
  • Vector Search: This is a powerful technique for searching relevant content or discovering connections by comparing similarities in large document collections, and is particularly useful for AI applications. It uses storage-attached indexing and dense indexing techniques to enhance data exploration and analysis (see the CQL sketch after this list). 
  • Unified Compaction Strategy: This unifies compaction approaches, including leveled, tiered, and time-windowed strategies. The strategy leads to a major reduction in SSTable sizes. Smaller SSTables mean better read and write performance, reduced storage requirements, and improved overall efficiency. 
  • Numerous stability and testing improvements: You can read all about these changes here. 
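To make SAI and vector search more concrete, here is a hedged CQL sketch of what they look like in Cassandra 5.0. The keyspace, table, column names, and vector dimension are hypothetical examples, not part of the release notes:

```sql
-- Hypothetical table illustrating SAI and vector search in Cassandra 5.0
-- (assumes a keyspace named shop already exists).
CREATE TABLE shop.products (
    id        uuid PRIMARY KEY,
    category  text,
    embedding vector<float, 3>
);

-- Storage-Attached Indexes on a regular column and on the vector column
CREATE INDEX products_category_sai  ON shop.products (category)  USING 'sai';
CREATE INDEX products_embedding_sai ON shop.products (embedding) USING 'sai';

-- Approximate-nearest-neighbour search over the vector column
SELECT id, category
FROM shop.products
ORDER BY embedding ANN OF [0.1, 0.2, 0.3]
LIMIT 5;
```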

Important: Limitations of Cassandra 5.0 Beta 1 on the Instaclustr Managed Platform 

Customers can provision a cluster on the Instaclustr Platform with Cassandra version 5.0 for testing purposes. Note that while Cassandra version 5.0 Beta 1.0 is supported on our platform, there are some limitations in the functionality available, including:   

  • KeySpace/Table-level monitoring and metrics are not available when the vector data type is used. 
  • Trie-indexed SSTables and memtables are not yet supported since we do not yet support using the BTI format for SSTables and memtables. 

Apart from the application-specific limitations (read more about them here), the public preview release comes with the following conditions: 

  • It is not supported for production usage and is not covered by SLAs. The release should be used for testing purposes only. 
  • There is no support for add-ons such as Apache Lucene™, Continuous Backups and others. 
  • No PCI-compliant mode. 

What’s Next? 

Instaclustr will continue to conduct performance baselining and additional testing to offer industry-leading support and work on removing existing limitations in preparation for Cassandra 5.0’s GA release. 

How to Get Started 

Ready to try Cassandra 5.0 out yourself? Set up a non-production cluster using the Instaclustr console by following the easy instructions available here. Choose the Apache Cassandra 5.0 Preview version from the public preview category on the Cassandra setup page to begin your exploration. 

Don’t have an Instaclustr account yet? Sign up for a trial or reach out to our sales team to start exploring Cassandra 5.0. With over 300 million node-hours of management experience, Instaclustr offers unparalleled expertise. Visit our website to learn more about the Instaclustr Managed Platform for Apache Cassandra. 

If you have any issues or questions about provisioning your cluster, please contact Instaclustr Support at any time. 

The post Preview Release of Apache Cassandra® version 5.0 on the Instaclustr Managed Platform appeared first on Instaclustr.