ScyllaDB’s Safe Topology and Schema Changes on Raft

How ScyllaDB is using Raft for all topology and schema metadata – and the impacts on elasticity, operability, and performance

ScyllaDB recently completed the transition to strong consistency for all cluster metadata. This transition involved moving schema and topology metadata to Raft, implementing a centralized topology coordinator for driving topology changes, and several other changes related to our commit logs, schema versioning, authentication, and other aspects of database internals.

With all topology and schema metadata now under Raft, ScyllaDB officially supports safe, concurrent, and fast bootstrapping with versions 6.0 and higher. We can have dozens of nodes start concurrently. Rapidly assembling a fresh cluster, performing concurrent topology and schema changes, and quickly restarting a node with a different IP address or configuration – all of this is now possible.

This article shares why and how we moved to a new algorithm providing centralized (yet fault-tolerant) topology change coordination for metadata, as well as its implications for elasticity, operability, and performance.

A Quick Consistency Catchup

Since ScyllaDB was born as a Cassandra-compatible database, we started as an eventually consistent system. That made perfect business sense for storing user data. In a large cluster, we want our writes to be available even if a link to the other data center is down.

[For more on the differences between eventually consistent and strongly consistent systems, see the blog ScyllaDB’s Path to Strong Consistency: A New Milestone.]

But beyond storing user data, the database maintains additional information, called metadata, that describes:

Topology (nodes, data distribution…)
Schema (table format, column names, indexes…)

There’s minimal business value in using the eventually consistent model for metadata. Metadata changes are infrequent, so we do not need extreme availability or performance for them. Yet we do want to change metadata reliably and automatically in order to bring elasticity. That’s difficult to achieve with an eventually consistent model. Having metadata consistently replicated to every node in the cluster allows us to bridge the gap to elasticity, enabling us to fully automate node operations and cluster scaling.

So, back in 2021, we embarked on a journey to bring in Raft: an algorithm and a library that we implemented to replicate any kind of information across multiple nodes. Since then, we’ve been rolling out the implementation incrementally.

Our Move to Schema and Topology Changes on Raft

In ScyllaDB 5.2, we put the schema into a Raft-replicated state. That involved replicating keyspace, table, and column information through Raft. Raft provides a replicated log across all the nodes in the cluster. Everything that’s updated through Raft first gets appended to that log, then gets applied on the nodes – in exactly the same order on all of them.

Now, in ScyllaDB 6.0, we greatly expanded the amount of information we store in this Raft-based replicated state machine. We also include new schema tables for authentication and service levels. And more interestingly, we moved topology over to Raft and implemented a new centralized topology coordinator that’s instrumental for our new tablets architecture (more on that at the end of this article). We also maintain backward compatibility tables so that old drivers and old clients can still get information about the cluster in the same way.
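To make the replicated-log idea concrete, here is a minimal, purely illustrative sketch (not ScyllaDB's actual C++ internals; the command names are hypothetical) of a deterministic state machine applying metadata commands in log order, so that every node ends up with identical metadata:

```python
# Illustrative only: a toy "replicated state machine" for cluster metadata.
# In ScyllaDB the log is managed by Raft (group 0) and the commands are real
# schema/topology mutations; the names below are hypothetical.

class MetadataStateMachine:
    def __init__(self):
        self.schema = {}          # e.g., table name -> definition
        self.topology = {}        # e.g., host id -> node state
        self.applied_index = 0    # last log index applied locally

    def apply(self, index, command):
        # Commands are applied strictly in log order, exactly once.
        assert index == self.applied_index + 1
        kind, payload = command
        if kind == "create_table":
            self.schema[payload["name"]] = payload
        elif kind == "add_node":
            self.topology[payload["host_id"]] = payload
        self.applied_index = index

# Every replica replays the same committed log...
log = [(1, ("add_node", {"host_id": "a1"})),
       (2, ("create_table", {"name": "ks.users"}))]

node = MetadataStateMachine()
for index, command in log:
    node.apply(index, command)
# ...so every node deterministically reaches the same metadata state.
```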
Driving Topology Changes from a Centralized Topology Coordinator

Let’s take a closer look at that centralized topology coordinator. Previously, the node joining the cluster would drive the topology change forward. If a node was being removed, another node would drive its removal. If something happened to the node driving these operations, the database operator had to intervene and restart the operation from scratch.

Now, there’s a centralized process (which we call the topology change coordinator) that runs alongside the Raft cluster leader node and drives all topology changes. If the leader node is down, a new leader is automatically elected. Since the coordinator state is stored in the deterministic state machine (which is replicated across the entire cluster), the new coordinator can continue the topology work from the state where the previous coordinator left off. No human intervention is required.

Every topology operation registers itself in a work queue, and the coordinator works off that queue. Multiple operations can be queued at the same time, providing an illusion of concurrency while preserving operation safety. It’s possible to build a deterministic schedule that optimizes the execution of multiple operations. For example, it lets us migrate multiple tablets at once, call cleanups for multiple nodetool operations, and so on.

Since information about the cluster members is now propagated through Raft instead of Gossip, it’s quickly replicated to all nodes and is strongly consistent. A snapshot of this data is always available locally. That allows a starting node to quickly obtain the topology information without reaching out to the majority of the cluster.

Practical Applications of this Design

Next, let’s go over some practical applications of this design, beginning with the improvements in schema changes that we introduced in ScyllaDB 6.0.

Dedicated Metadata Commit Log on shard 0

The ScyllaDB schema commit log, introduced in ScyllaDB 5.0 and now mandatory in ScyllaDB 6.0, is a dedicated write-ahead log for schema tables. With ScyllaDB 6.0, we started using the same log for schema and topology changes. That brings both linearizability and durability.

This commit log runs on shard 0 and has different properties than the data commit log. It’s always durable: it is synced to disk immediately after every write. There’s no need to sync the system tables to disk when performing schema changes, which leads to faster schema changes. And this commit log has a different segment size, allowing larger chunks of data (e.g., very large table definitions) to fit into a single segment. This log is not impacted by the tuning you might do for the data commit log, such as max size on disk or flush settings. It also has its own priority, so that data writes don’t stall metadata changes and there is no priority inversion.

Linearizable schema version

Another important update is the change to how we build schema versions. A schema version is a table-level identifier that we use internally in intra-cluster RPC to verify that every node has the same version of the metadata. Whenever a table definition changes, the identifier must be rebuilt. Before, with eventual consistency allowing concurrent schema modifications, we used to rehash all the system tables to create a new version on each schema change. Now, since schema changes are linearized, only one schema change occurs at a time – making a monotonic timestamp just as effective.
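As a rough illustration of the difference (a hedged sketch, not ScyllaDB's actual implementation, which computes these versions internally in C++), the old approach had to digest the full schema on every change, while the linearized approach can simply issue a new monotonically increasing version:

```python
import hashlib
import time
import uuid

# Old approach (conceptually): concurrent schema changes forced every node to
# re-hash all schema tables to derive a convergent version identifier.
def version_by_hashing(all_schema_rows):
    digest = hashlib.sha256()
    for row in sorted(all_schema_rows):   # cost grows with total schema size
        digest.update(row.encode())
    return digest.hexdigest()

# New approach (conceptually): schema changes are linearized through Raft,
# so a single strictly monotonic identifier per change is enough.
_last = 0
def version_by_timestamp():
    global _last
    now = max(int(time.time() * 1e6), _last + 1)  # strictly monotonic
    _last = now
    return uuid.UUID(int=now)  # illustrative; any monotonic id would do

schema_rows = ["ks.users: id uuid", "ks.events: ts timestamp"]
print(version_by_hashing(schema_rows))
print(version_by_timestamp())
```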
It turns out that schema hash calculation is a major performance hog when creating, propagating, or applying schema changes. Moving away from it enables a nice speed boost. With this change, we were able to dramatically improve schema operation performance (e.g., create table, drop table) from one schema change per 10-20 seconds (in large clusters) to one schema change per second or faster. We also removed the quadratic dependency of this algorithm’s cost on the size of the schema. It used to be that the more tables you had, the longer it took to add a new table. That’s no longer the case. We plan to continue improving schema change performance until we can achieve at least several changes per second and increase the practical ceiling for the number of tables a ScyllaDB installation can hold.

Authentication and service levels on Raft

We moved the internal tables for authentication and service levels to Raft as well. Now, they are globally replicated (i.e., present on every node in the cluster). This means users no longer need to adjust the replication factor for authentication after adding or removing nodes.

Previously, authentication information was partitioned across the entire cluster. If a part of the cluster was down and the role definition was on one of the unavailable nodes, there was a risk that this role couldn’t connect to the cluster at all. This posed a serious denial-of-service problem. Now that we’re replicating this information to all nodes using Raft, there’s higher reliability since the data is present on all nodes. Additionally, there’s improved performance since the data is available locally (and no denial-of-service risk, for the same reason).

For service levels, we moved from a polling model to a triggering model. Now, service level information is rebuilt automatically whenever it’s updated, and it’s also replicated onto every node via Raft.

Additional Metadata Consistency in ScyllaDB 6.0

Now, let’s shift focus to other parts of the metadata that we converted to strong consistency in ScyllaDB 6.0. With all this metadata under Raft, ScyllaDB now officially supports safe, concurrent, and fast bootstrap. We can have dozens of nodes start concurrently.

Feature Negotiation on Raft

To give you an idea of some of the low-level challenges involved in moving to Raft, consider how we moved a little-known ScyllaDB mechanism called “feature negotiation.” Essentially, this is a feature that describes other features. To ensure smooth upgrades, ScyllaDB runs a negotiation protocol between cluster nodes: new functionality is only enabled when all of the nodes in the cluster can support it. But how does a cluster know that all of the nodes support a feature?

Prior to Raft, this was accomplished with Gossip. The nodes gossiped about their supported features and eventually decided that it was safe to enable them (after every node sees that every other node sees the feature). However, remember that our goal was to make ScyllaDB bootstraps safe, concurrent, and fast. We couldn’t afford to keep waiting for Gossip to learn that a feature is supported by the whole cluster.

We decided to propagate features through Raft. But we needed a way to quickly determine whether the cluster supported the feature of feature propagation through Raft. It’s a classic “chicken or the egg” problem. The solution: in 6.0, when joining a node, we offload its feature verification to an existing cluster member.
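Conceptually, that verification step boils down to a set comparison like the sketch below (hypothetical names and feature strings; the real check runs inside ScyllaDB and also covers items such as the snitch and cluster name, as described next):

```python
# Illustrative sketch of join-time feature verification.
# Function and feature names here are hypothetical examples.

def verify_joining_node(cluster_features: set[str],
                        joining_features: set[str]) -> tuple[bool, set[str]]:
    """An existing cluster member checks whether a joining node
    supports every feature the cluster has already enabled."""
    missing = cluster_features - joining_features
    return (len(missing) == 0, missing)

enabled_in_cluster = {"tablets", "consistent-topology", "uda"}
supported_by_new_node = {"consistent-topology", "uda"}

ok, missing = verify_joining_node(enabled_in_cluster, supported_by_new_node)
if not ok:
    # The topology coordinator would reject the join request.
    print(f"rejecting join: node lacks features {missing}")
```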
The joining node sends its supported feature set to the cluster member, which then verifies whether the node is compatible with the current cluster. Beyond the features that the node supports, this also covers such things as the snitch used and the cluster name. All that node information is then persisted in Raft. Then, the topology coordinator decides whether to accept the node or to reject it (because it doesn’t support some of the features). The most important thing to note here is that the enablement of any cluster feature is now serialized with the addition of nodes. There is no race: it’s impossible to concurrently add a feature and add a node that doesn’t support that feature.

CDC Stream Details on Raft

We also moved information about CDC stream generation to Raft. Moving this CDC metadata was required in order for us to stop relying on Gossip and sleeps during boot. We use this metadata to tell drivers that the current distribution of CDC streams has changed because the cluster topology changed – and it needs to be refreshed. Again, Gossip was previously used to safely propagate this metadata through the cluster, and the nodes had to wait for Gossip to settle. That’s no longer the case for CDC metadata. Moving this data over to Raft on group 0, with its dedicated commit log, also improved data availability and durability.

Additional Updates

Moreover, we implemented a number of additional updates as part of this shift:

Automated SSTable cleanup: In ScyllaDB 6.0, we also automated the SSTable cleanup that needs to run between (some) topology changes to avoid data resurrection. Sometimes even a failed topology change may require cleanup. Previously, users had to remember to run this cleanup. Now, each node tracks its own cleanup status (whether cleanup is needed or not) and performs the cleanup itself. The topology coordinator automatically coordinates the next topology change with the cluster cleanup status.

Raft-based UUID host identification: Internally, we switched most ScyllaDB subsystems to Raft-based UUID host identification. These are the same identifiers that Raft uses for cluster membership. The host ID is now part of every node’s handshake, which allows us to ensure that a node removed from the cluster cannot corrupt cluster data with its write RPCs. We also provide a safety net for database operators: if they mistakenly try to remove a live node from the cluster, they get an error. Live nodes can be decommissioned, but not removed.

Improved manageability of the Raft subsystem: We improved the manageability of the Raft subsystem in ScyllaDB 6.0 with the following:

A new system table for Raft state inspection allows users to see the current Raft identifiers, the relation of the node to the Raft cluster, and so on. It’s useful for checking whether a node is in a good state – and for troubleshooting if it is not.
New REST APIs allow users to manually trigger Raft internal operations. This is mainly useful for troubleshooting clusters.
A new maintenance mode lets you start a node even if it’s completely isolated from the rest of the cluster. It also lets you manipulate the data on that local node (for example, to fix it or allow it to join a different cluster). Again, this is useful for troubleshooting.

We plan to continue this work going forward.

Raft is Enabled – and It Enables Extreme Elasticity with Our New Tablets Architecture

To sum up, the state of the topology, such as tokens, was previously propagated through Gossip and was only eventually consistent.
Now, the state is propagated through Raft, replicated to all nodes, and strongly consistent. A snapshot of this data is always available locally, so starting nodes can quickly obtain the topology information without reaching out to the leader of the Raft group. Even if nodes start concurrently, token metadata changes are now linearized. Also, each node’s view of the token metadata does not depend on the availability of the owner of the tokens: a bootstrapping node C is fully aware of node B’s tokens even if node B is down.

Raft is enabled by default, for both schema and topology, in ScyllaDB 6.0 and higher. And it now serves as the foundation for our tablets replication implementation: the tablet load balancer could not exist without it. Learn more about our tablets initiative overall, as well as its load balancer implementation, in the following blogs:

Smooth Scaling: Why ScyllaDB Moved to “Tablets” Data Distribution
How We Implemented ScyllaDB’s “Tablets” Data Distribution

How We Implemented ScyllaDB’s “Tablets” Data Distribution

How ScyllaDB implemented its new Raft-based tablets architecture, which enables teams to quickly scale out in response to traffic spikes

ScyllaDB just announced the first GA release featuring our new tablets architecture. In case this is your first time hearing about it, tablets are the smallest replication unit in ScyllaDB. Unlike static, all-or-nothing topologies, tablets provide dynamic, parallel, and near-instant scaling operations. This approach allows for autonomous and flexible data balancing and ensures that data is sharded and replicated evenly across the cluster. That, in turn, optimizes performance, avoids imbalances, and increases efficiency.

The first blog in this series covered why we shifted from tokens to tablets and outlined the goals we set for this project. In this blog, let’s take a deeper look at how we implemented tablets via:

Indirection and abstraction
Independent tablet units
A Raft-based load balancer
Tablet-aware drivers

Indirection and Abstraction

There’s a saying in computer science that every problem can be solved with a new layer of indirection. We tested that out here, with a so-called tablets table. 😉

We added a tablets table that stores all of the tablets metadata and serves as the single source of truth for the cluster. Each tablet has its own token-range-to-nodes-and-shards mapping, and those mappings can change independently of node addition and removal. That enables the shift from static to dynamic data distribution. Each node controls its own copy, but all of the copies are synchronized via Raft.

The tablets table tracks the details as the topology evolves. It always knows the current topology state, including the number of tablets per table, the token boundaries for each tablet, and which nodes and shards hold the replicas. It also dynamically tracks tablet transitions (e.g., is the tablet currently being migrated or rebuilt?) and what the topology will look like when the transition is complete.

Independent Tablet Units

Tablets dynamically distribute each table, based on its size, across a subset of nodes and shards. This is a much different approach than having vNodes statically distribute all tables across all nodes and shards based only on the token ring. With vNodes, each shard has its own set of SSTables and memtables that contain the portion of the data allocated to that node and shard. With the new approach, each tablet is isolated into its own mini memtable and its own mini SSTables. Each tablet runs the entire log-structured merge (LSM) tree independently of the other tablets on the same shard.

The advantage of this approach is that everything (SSTables + memtable + LSM tree) can be migrated as a unit. We just flush the memtables and copy the SSTables before streaming (because it’s easier for us to stream SSTables than memtables). This enables very fast and very efficient migration.

Another benefit: users no longer need to worry about manual cleanup operations. With vNodes, it can take quite a while to complete a cleanup since it involves rewriting all of the data on the node. With tablets, we migrate each tablet as a unit and can simply delete the unit when we finish streaming it.

When a new node is added to the cluster, it doesn’t yet own any data. A new component that we call the load balancer (more on this below) notices an imbalance among nodes and automatically starts moving data from the existing nodes to the new node. This is all done in the background with no user intervention required.
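To make the tablets table described above more concrete, here is a hypothetical sketch of the kind of per-tablet record it might hold (illustrative field names only, not ScyllaDB's actual system table schema):

```python
from dataclasses import dataclass, field

# Hypothetical, simplified model of one row of tablet metadata.
# Field names are illustrative; the real table lives inside ScyllaDB
# and is replicated to every node via Raft.

@dataclass
class Replica:
    host_id: str   # Raft-based UUID host identifier
    shard: int     # shard within that node

@dataclass
class Tablet:
    table: str                     # which user table this tablet belongs to
    last_token: int                # upper bound of the token range it owns
    replicas: list[Replica]        # current replica set (nodes and shards)
    transition: str | None = None  # e.g. "migration" or "rebuild", else None
    new_replicas: list[Replica] = field(default_factory=list)  # target set

# A tablet of ks.users being migrated from shard 3 on node a1
# to shard 0 on a freshly added node d4:
tablet = Tablet(
    table="ks.users",
    last_token=7_301_122_334,
    replicas=[Replica("a1", 3), Replica("b2", 1), Replica("c3", 5)],
    transition="migration",
    new_replicas=[Replica("d4", 0), Replica("b2", 1), Replica("c3", 5)],
)
```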
For decommissioning nodes, there’s a similar process, just in the other direction. The load balancer is given the goal of zero data on the decommissioned node; it shifts tablets to make that happen, and the node can be removed once all of its tablets are migrated.

Each tablet holds approximately 5 GB of data. Different tables might have different tablet counts and involve a different number of nodes and shards. A large table will be divided into a large number of tablets that are spread across many nodes and many shards, while a smaller table will involve fewer tablets, nodes, and shards. The ultimate goal is to spread tables across nodes and shards evenly, in a way that lets them effectively tap the cluster’s combined CPU power and I/O bandwidth.

Load Balancer

All tablet transitioning is globally controlled by the load balancer. This includes moving data from node to node or across shards within a node, running rebuild and repair operations, etc. This means the human operator doesn’t have to perform those tasks.

The load balancer moves tablets around with the goal of striking a delicate balance between overloading the nodes and underutilizing them. We want to maximize the available throughput (saturate the CPU and network on each node), but at the same time we need to avoid overloading the nodes so we can keep migrations as fast as possible. To do this, the load balancer runs a loop that collects statistics on tables and tablets. It looks at which nodes have too little free space and which have too much, and it works to balance free space. It also rebalances data when we want to decommission a node.

The load balancer’s other core function is maintaining the serialization of transitions. That’s all managed via Raft, specifically the Raft Group 0 leader. For example, we rely on Raft to prevent migration during a tablet rebuild and to prevent conflicting topology changes. If the human operator happens to increase the replication factor, we will rebuild tablets for the additional replicas – and we will not allow yet another RF change until the automatic rebuild of the new replicas completes.

The load balancer is hosted on a single node, but not a designated node. If that node goes down for maintenance or crashes, the load balancer will just get restarted on another node. And since we have all the tablets metadata in the tablets table, the new load balancer instance can pick up wherever the last one left off.

Tablet-aware drivers

Finally, it’s important to note that our older drivers will work with tablets, but they will not work as well. We just released new tablet-aware drivers that provide a nice performance and latency boost. We decided that the driver should not read from the tablets table because it could take too long to scan the table, plus that approach doesn’t work well with things like lambdas or cloud functions. Instead, the driver learns tablet information lazily.

The driver starts without any knowledge of where tokens are located. It makes a request to a random node. If that’s the incorrect node, the node will see that the driver missed the correct host. When it returns the data, it will also add extra routing information that indicates the correct location. The next time, the driver will know where that particular token lives, so it will send the request directly to the node that hosts the data. This avoids an extra hop. If the tablets get migrated later on, the “lazy learning” process repeats.
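A rough sketch of that lazy-learning routing logic (hypothetical names and data shapes, not the actual driver code) might look like this:

```python
# Illustrative client-side routing cache for a tablet-aware driver.
# Names like "routing_hint" are hypothetical; real drivers get this
# information from the server's response metadata.

import random

class TabletRoutingCache:
    def __init__(self, nodes):
        self.nodes = nodes
        # Ownership learned so far:
        # maps (table, last_token) -> list of (host, shard) replicas
        self.cache = {}

    def pick_target(self, table, token):
        # Use learned routing if we have it; otherwise ask any node.
        candidates = [(last, replicas)
                      for (t, last), replicas in self.cache.items()
                      if t == table and token <= last]
        if candidates:
            _, replicas = min(candidates)   # tightest covering range
            return replicas[0]
        return (random.choice(self.nodes), 0)

    def learn(self, routing_hint):
        # Called when a response carries corrected routing information.
        key = (routing_hint["table"], routing_hint["last_token"])
        self.cache[key] = routing_hint["replicas"]

cache = TabletRoutingCache(nodes=["a1", "b2", "c3"])
host, shard = cache.pick_target("ks.users", token=42)   # first try: random node
cache.learn({"table": "ks.users", "last_token": 100,
             "replicas": [("b2", 1), ("c3", 5)]})
host, shard = cache.pick_target("ks.users", token=42)   # now routed directly
```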
How Does this All Play Out?

Let’s take a deeper look into monitoring metrics – and even some mesmerizing tablet visualization – to see how all the components come together to achieve the elasticity and speed goals laid out in the previous blog.

Conclusion

We have seen how tablets make ScyllaDB more elastic. With tablets, ScyllaDB scales out faster, scaling operations are independent, and the process requires less attention and care from the operator. We feel we haven’t yet exhausted the potential of tablets. Future ScyllaDB versions will bring more innovation in this space.

Smooth Scaling: Why ScyllaDB Moved to “Tablets” Data Distribution

The rationale behind ScyllaDB’s new “tablets” replication architecture, which builds upon a multiyear project to implement and extend Raft

ScyllaDB 6.0 is the first release featuring ScyllaDB’s new tablets architecture. Tablets are designed to support flexible and dynamic data distribution across the cluster. Based on Raft, this new approach provides new levels of elasticity with near-instant bootstrap and the ability to add new nodes in parallel – even doubling an entire cluster at once. Since new nodes begin serving requests as soon as they join the cluster, users can spin up new nodes that start serving requests almost instantly. This means teams can quickly scale out in response to traffic spikes – satisfying latency SLAs without needing to overprovision “just in case.”

This blog post shares why we decided to take on this massive project and the goals that we set for ourselves. Part 2 (publishing tomorrow) will focus on the technical requirements and implementation.

Tablets Background

First off, let’s be clear: ScyllaDB didn’t invent the idea of using tablets for data distribution. Previous tablets implementations can be seen across Google Bigtable, Google Spanner, and YugabyteDB.

The 2006 Google Bigtable paper introduced the concept of splitting table rows into dynamic sections called tablets. Bigtable nodes don’t actually store data; they store pointers to tablets that are stored on Colossus (Google’s internal, highly durable file system). Data can be rebalanced across nodes by changing metadata rather than actually copying data – so it’s much faster to dynamically redistribute data to balance the load. For the same reason, node failure has a minimal impact and node recovery is fast. Bigtable automatically splits busier or larger tablets in half and merges less-accessed/smaller tablets together – redistributing them between nodes as needed for load balancing and efficient resource utilization. The Bigtable tablets implementation uses Paxos.

Tablets are also discussed in the 2012 Google Spanner paper (in section 2.1) and implemented in Spanner. Paxos is used here as well.

[Spanserver software stack – image from the Google Spanner paper]

Another implementation of tablets can be found in DocDB, which serves as YugabyteDB’s underlying document storage engine. Here, data is distributed by splitting the table rows and index entries into tablets according to the selected sharding method (range or hash) or auto-splitting. The YugabyteDB implementation uses Raft. Each tablet has its own Raft group, with its own LSM-tree datastore (including a memtable, in RAM, and many SSTable files on disk).

[YugabyteDB hash-based data partitioning – image from the YugabyteDB blog]
[YugabyteDB range-based data partitioning – image from the YugabyteDB blog]

Why Did ScyllaDB Consider Tablets?

Why did ScyllaDB consider a major move to tablets-based data distribution? Basically, several elements of our original design eventually became limiting as infrastructure and the shape of data evolved – and as the sheer volume of data and size of deployments spiked. More specifically:

Node storage: ScyllaDB streaming started off quite fast back in 2015, but storage volumes grew faster. The shapes of nodes changed: nodes got more storage per vCPU. For example, compare AWS i4i nodes to i3en ones, which have about twice the amount of storage per vCPU. As a result, each vCPU needs to stream more data. The immediate effect is that streaming takes longer.

Schema shapes: The rate of streaming depends on the shape of the schema.
If you have relatively large cells, then streaming is not all that CPU-intensive. However, if you have tiny cells (e.g., the numerical data common in time series), then ScyllaDB will spend CPU time parsing and serializing – and then deserializing and writing – each cell.

Eventually consistent leaderless architecture: ScyllaDB’s eventually consistent leaderless architecture, without the notion of a primary node, meant that the database operator (you) had to bear the burden of coordinating operations. Everything had to be serialized because the nodes couldn’t reliably communicate about what you were doing. That meant that you could not run bootstraps and decommissions in parallel.

Static token-based distribution: Static data distribution is another design aspect that eventually became limiting. Once a node was added, it was assigned a range of tokens, and those token assignments couldn’t change between the point when the node was added and when it was removed. As a result, data couldn’t be moved dynamically.

This architecture – rooted in the Cassandra design – served us well for a while. However, the more we worked with larger deployments and workloads that required faster scaling, the clearer it became that it was time for something new. So we launched a multiyear project to implement tablets-based data distribution in ScyllaDB.

Our Goals for the ScyllaDB Tablets Implementation Project

Our tablets project targeted several goals stemming from the above limitations:

Fast bootstrap/decommission: The top project goal was to improve bootstrap and decommission speed. Bootstrapping large nodes in a busy cluster could take many hours, sometimes a day in massive deployments. Bootstrapping is often done at critical times: when you’re running out of space or CPU capacity, or you’re trying to support an expected workload increase. Understandably, users in such situations want bootstrapping to complete as fast as feasible.

Incremental bootstrap: Previously, a bootstrapping node couldn’t start serving read requests until all of the data was streamed. That means you’re still starved for CPU and potentially I/O until the end of the bootstrapping process. With incremental bootstrap, a node can start shouldering the load – little by little – as soon as it’s added to the cluster. That brings immediate relief.

Parallel bootstrap: Previously, you could only add one node at a time. And given how long it took to add a node, increasing cluster size took hours, sometimes days in our larger deployments. With parallel bootstrap, you can add multiple nodes in parallel if you urgently need fast relief.

Decouple topology operations: Another goal was to decouple changes to the cluster. Before, we had to serialize every operation. A node failure while bootstrapping or decommissioning nodes would force you to restart everything from scratch. With topology operations decoupled, you can remove a dead node while bootstrapping two new nodes. You don’t have to schedule everything and have it all wait on some potentially slow operation to complete.

Improve support for many small tables: ScyllaDB was historically optimized for a small number of large tables. However, our users have also been using it for workloads with a large number of small tables – so we wanted to equalize performance across all kinds of workloads.
Tablets in Action

To see how tablets achieve those goals, let’s look at the following scenario:

Preload a three-node cluster with 650 GB per replica
Run a moderate mixed read/write workload
Bootstrap three nodes to add more storage and CPU
Decommission three nodes

We ran this with the scylla-cluster-tests (open source) test harness that we use for our weekly regression tests. With tablets, the new nodes start gradually relieving the workload as soon as they’re bootstrapped, and existing nodes start shedding load incrementally. This offers fast relief for performance issues. In the write scenario here, bootstrapping was roughly 4X faster. We’ve tested other scenarios where bootstrapping was up to 30X faster.

Next Up: Implementation

The follow-up blog looks at how we implemented tablets. Specifically:

Indirection and abstraction
Independent tablet units
A Raft-based load balancer
Tablet-aware drivers

Finally, we wrap it up with a more extensive demo that shows the impact of tablets from the perspective of coordinator requests, CPU load, and disk bandwidth across operations.

ScyllaDB 6.0: with Tablets & Strongly-Consistent Topology Updates

The ScyllaDB team is pleased to announce ScyllaDB Open Source 6.0, a production-ready major release. ScyllaDB 6.0 introduces two major features that change the way ScyllaDB works:

Tablets, a dynamic way to distribute data across nodes that significantly improves scalability
Strongly consistent topology, Auth, and Service Level updates

Note: Join ScyllaDB co-founder Dor Laor on June 27 to explore what this architectural shift means for elasticity and operational simplicity.

In addition, ScyllaDB 6.0 includes many other improvements in functionality, stability, UX and performance. Only the latest two minor releases of the ScyllaDB Open Source project are supported. With this release, only ScyllaDB Open Source 6.0 and 5.4 are supported. Users running earlier releases are encouraged to upgrade to one of these two releases.

Related Links

Get ScyllaDB Open Source 6.0 as binary packages (RPM/DEB), AWS AMI, GCP Image and Docker Image
Upgrade from ScyllaDB 5.4 to ScyllaDB 6.0
Report an issue

New features

Tablets

In this release, ScyllaDB enables Tablets, a new data distribution algorithm to replace the legacy vNodes approach inherited from Apache Cassandra. While the vNodes approach statically distributes all tables across all nodes and shards based on the token ring, the Tablets approach dynamically distributes each table to a subset of nodes and shards based on its size. In the future, distribution will use CPU, OPS, and other information to further optimize the distribution. Read Avi Kivity’s blog series on tablets.

In particular, Tablets provide the following:

Faster scaling and topology changes. New nodes can start serving reads and writes as soon as the first Tablet is migrated. Together with Strongly Consistent Topology Updates (below), this also allows users to add multiple nodes simultaneously and scale out or down much faster.
Automatic support for mixed clusters with different core counts. Manual vNode updates are not required.
More efficient operations on small tables, since such tables are placed on a small subset of nodes and shards.

Read more about Tablets in the docs.

Using Tablets

Tablets are enabled by default for new clusters. No action is required. To disable Tablets for a Keyspace, use:

CREATE KEYSPACE ... WITH TABLETS = { 'enabled': false }

For Tablets limitations in 6.0, see the discussion in the docs.

Monitoring Tablets

To monitor Tablets in real time, upgrade the ScyllaDB Monitoring Stack to release 4.7 and use the new dynamic Tablet panels.

Driver Support

The following drivers support Tablets:

Java driver 4.x, from 4.18.0.2 (to be released soon)
Java driver 3.x, from 3.11.5.2
Python driver, from 3.26.6
Gocql driver, from 1.13.0
Rust driver, from 0.13.0

Legacy ScyllaDB and Apache Cassandra drivers will continue to work with ScyllaDB, but will be less efficient when working with tablet-based Keyspaces.

Strongly Consistent Topology Updates

With Raft-managed topology enabled, all topology operations are internally sequenced in a consistent way. A centralized coordination process ensures that topology metadata is synchronized across the nodes on each step of a topology change procedure. This makes topology updates fast and safe, as the cluster administrator can trigger many topology operations concurrently and the coordination process will safely drive all of them to completion. For example, multiple nodes can be bootstrapped concurrently, which couldn’t be done with the previous gossip-based topology.
Strongly Consistent Topology Updates is now the default for new clusters, and should be enabled after upgrade for existing clusters.

In addition to Topology Updates, more cluster metadata elements are now strongly consistent:

Strongly Consistent Auth Updates. Role-Based Access Control (RBAC) commands like create role or grant permission are safe to run in parallel. As a result, there is no need to update the system_auth RF or run repair when adding a DataCenter.
Strongly Consistent Service Levels. Service Levels allow you to define attributes such as timeout per workload. Service levels are now strongly consistent.

Native Nodetool

The nodetool utility provides simple command-line interface operations and attributes. ScyllaDB inherited the Java-based nodetool from Apache Cassandra. In this release, the Java implementation was replaced with a backward-compatible native nodetool. The native nodetool works much faster. Unlike the Java version, the native nodetool is part of the ScyllaDB repo, which allows easier and faster updates. With the native nodetool, the JMX server has become redundant: it will no longer be part of the default ScyllaDB installation or image, and can be installed separately.

Maintenance Mode and Socket

Maintenance mode is a new mode in which the node does not communicate with clients or other nodes and only listens to the local maintenance socket and the REST API. It can be used to fix damaged nodes – for example, by using nodetool compact or nodetool scrub. In maintenance mode, ScyllaDB skips loading tablet metadata if it is corrupted, to allow an administrator to fix it. The new maintenance socket provides a way to interact with ScyllaDB, only from within the node it runs on, while in maintenance mode. See the Maintenance Socket docs.

Improvements and Bug Fixes

The latest release also features extensive improvements to:

Bloom Filters
Stability and performance
Compaction
Commitlog
Cluster operations
Materialized views
Performance
Edge cases
Guardrails
CQL
Alternator (DynamoDB compatible API)
REST API
Tracing
Monitoring
Packaging and configuration

For details, see the detailed release notes on the ScyllaDB Community Forum.

Book Excerpt: ScyllaDB, a Different Database

What’s so distinctive about ScyllaDB? Read what Bo Ingram (Staff Engineer at Discord) has to say – in this excerpt from the book “ScyllaDB in Action.”

Editor’s note: We’re thrilled to share the following excerpt from Bo Ingram’s informative – and fun! – new book on ScyllaDB: ScyllaDB in Action. You might have already experienced Bo’s expertise and engaging communication style in his blog How Discord Stores Trillions of Messages or his ScyllaDB Summit talks How Discord Migrated Trillions of Messages from Cassandra to ScyllaDB and So You’ve Lost Quorum: Lessons From Accidental Downtime. If not, you should 😉

You can purchase the full 325-page book from Manning.com. You can also access a 122-page early-release digital copy for free, compliments of ScyllaDB. The book excerpt includes a discount code for 45% off the complete book.

The following is an excerpt from Chapter 1; it’s reprinted here with permission of the publisher.

***

ScyllaDB is a database — it says it in its name! Users give it data; the database gives it back when asked. This very basic and oversimplified interface isn’t too dissimilar from popular relational databases like PostgreSQL and MySQL. ScyllaDB, however, is not a relational database, eschewing joins and relational data modeling to provide a different set of benefits. To illustrate these, let’s take a look at a fictitious example.

Hypothetical databases

Let’s imagine you’ve just moved to a new town, and as you go to new restaurants, you want to remember what you ate so that you can order it or avoid it next time. You could write it down in a journal or save it in the notes app on your phone, but you hear about a new business model where people remember information you send them. Your friend Robert has just started a similar venture: Robert’s Rememberings.

ROBERT’S REMEMBERINGS

Robert’s business (figure 1.2) is straightforward: you can text Robert’s phone number, and he will remember whatever information you send him. He’ll also retrieve information for you, so you won’t need to remember everything you’ve eaten in your new town. That’s Robert’s job.

[Figure 1.2 Robert’s Rememberings has a seemingly simple plan.]

The plan works swimmingly at first, but issues begin to appear. Once, you text him, and he doesn’t respond. He apologizes later and says he had a doctor’s appointment. Not unreasonable; you want your friend to be healthy. Another time, you text him about a new meal, and it takes him several minutes to reply instead of his usual instant response. He says that business is booming, and he’s been inundated with requests — response time has suffered. He reassures you and says not to worry, he has a plan.

[Figure 1.3 Robert adds a friend to his system to solve problems, but it introduces complications.]

Robert has hired a friend to help him out. He sends you the new updated rules for his system. If you only want to ask a question, you can text his friend, Rosa. All updates are still sent to Robert; he will send everything you save to her, so she’ll have an up-to-date copy. At first, you slip up a few times and still ask Robert questions, but it seems to work well. No longer is Robert overwhelmed, and Rosa’s responses are prompt.

One day, you realize that when you asked Rosa a question, she texted back an old review that you had previously overwritten. You message Robert about this discrepancy, worried that your review of the much-improved tacos at Main Street Tacos is lost forever.
Robert tells you there was an issue within the system: Rosa hadn’t been receiving messages from Robert but was still able to get requests from customers. Your request hasn’t been lost, and they’re reconciling to get back in sync. You wanted to be able to answer one question: is the food here good or not? Now, you’re worrying about contacting multiple people depending on whether you’re reading a review or writing a review, whether data is in sync, and whether your friend’s system can scale to satisfy all of their users’ requests. What happens if Robert can’t handle people only saving their information? When you begin brainstorming intravenous energy drink solutions, you realize that it’s time to consider other options.

ABC DATA: A DIFFERENT APPROACH

Your research leads you to another business – ABC Data. They tell you that their system is a little different: they have three people – Alice, Bob, and Charlotte – and any of them can save information or answer questions. They communicate with each other to ensure each of them has the latest data, as shown in figure 1.4. You’re curious what happens if one of them is unavailable, and they say they provide a cool feature: because there are multiple of them, they coordinate within themselves to provide redundancy for your data and increased availability. If Charlotte is unavailable, Alice and Bob will receive the request and answer. If Charlotte returns later, Alice and Bob will get Charlotte back up to speed on the latest changes.

[Figure 1.4 ABC Data’s approach is designed to meet the scaling challenges that Robert encountered.]

This setup is impressive, but because each request can lead to additional requests, you’re worried this system might get overwhelmed even more easily than Robert’s. This, they tell you, is the beauty of their system. They take the data set and create multiple copies of it. They then divide this redundant data amongst themselves. If they need to expand, they only need to add additional people, who take over some of the existing slices of data. When a hypothetical fourth person, Diego, joins, one customer’s data might be owned by Alice, Charlotte, and Diego, whereas Bob, Charlotte, and Diego might own other data.

Because they allow you to choose how many people should respond internally for a successful request, ABC Data gives you control over availability and correctness. If you want to always have the most up-to-date data, you can require all three holders to respond. If you want to prioritize getting an answer, even if it isn’t the most recent one, you can require only one holder to respond. You can balance these properties by requiring two holders to respond — you can tolerate the loss of one, but you can ensure that a majority of them have seen the most up-to-date data, so you should get the most recent information.

[Figure 1.5 ABC Data’s approach gives us control over availability and correctness.]

You’ve learned about two imaginary databases here — one that seems straightforward but introduces complexity as requests grow, and another with a more complex implementation that attempts to handle the drawbacks of the first system. Before beginning to contemplate the awkwardness of telling a friend you’re leaving his business for a competitor, let’s snap back to reality and translate these hypothetical databases to the real world.

Real-world databases

Robert’s database is a metaphorical relational database, such as PostgreSQL or MySQL.
They’re relatively straightforward to run, fit a multitude of use cases, and are quite performant, and their relational data model has been used in practice for more than 50 years. Very often, a relational database is a safe and strong option. Accordingly, developers tend to default toward these systems. But, as demonstrated, they also have their drawbacks. Availability is often all-or-nothing. Even if you run with a read replica, which in Robert’s database would be his friend Rosa, you would potentially only be able to do reads if you had lost your primary instance. Scalability can also be tricky – a server has a maximum amount of compute resources and memory. Once you hit that, you’re out of room to grow.

It is through these drawbacks that ScyllaDB differentiates itself. The ABC Data system is ScyllaDB. Like ABC Data, ScyllaDB is a distributed database that replicates data across its nodes to provide both scalability and fault tolerance. Scaling is straightforward – you add more nodes. This elasticity in node count extends to queries. ScyllaDB lets you decide how many replicas are required to respond for a successful query, giving your application room to handle the loss of a server.

***

Want to read more from Bo? You can purchase the full 325-page book from Manning.com. Also, you can access a 122-page early-release digital copy for free, compliments of ScyllaDB.

Focus on Creativity, Not Clusters: DataStax Mission Control in Action!

While many large enterprises have made use of managed databases, several still have significant workloads being served by self-managed solutions (both on-premises and in the cloud) like DataStax Enterprise (DSE) and Apache Cassandra®. Although a significant amount of those workloads will eventually...

Why Teams are Eliminating External Database Caches

Often-overlooked risks related to external caches — and how 3 teams are replacing their core database + external cache with a single solution (ScyllaDB)

Teams often consider external caches when the existing database cannot meet the required service-level agreement (SLA). This is a clear performance-oriented decision. Putting an external cache in front of the database is commonly used to compensate for subpar latency stemming from various factors, such as inefficient database internals, driver usage, infrastructure choices, traffic spikes and so on.

Caching might seem like a fast and easy solution because the deployment can be implemented without tremendous hassle and without incurring the significant cost of database scaling, database schema redesign or even a deeper technology transformation. However, external caches are not as simple as they are often made out to be. In fact, they can be one of the more problematic components of a distributed application architecture.

In some cases, it’s a necessary evil, such as when you require frequent access to transformed data resulting from long and expensive computations, and you’ve tried all the other means of reducing latency. But in many cases, the performance boost just isn’t worth it. You solve one problem, but create others.

Here are some often-overlooked risks related to external caches and ways three teams have achieved a performance boost plus cost savings by replacing their core database and external cache with a single solution. Spoiler: They adopted ScyllaDB, a high-performance database that achieves improved long-tail latencies by tapping a specialized internal cache.

Why Not Cache

At ScyllaDB, we’ve worked with countless teams struggling with the costs, hassles and limits of traditional attempts to improve database performance. Here are the top struggles we’ve seen teams experience with putting an external cache in front of their database.

An External Cache Adds Latency

A separate cache means another hop on the way. When a cache surrounds the database, the first access occurs at the cache layer. If the data isn’t in the cache, then the request is sent to the database. This adds latency to an already slow path of uncached data. One may claim that when the entire data set fits the cache, the additional latency doesn’t come into play. However, unless your data set is considerably small, storing it entirely in memory considerably magnifies costs and is thus prohibitively expensive for most organizations.

An External Cache is an Additional Cost

Caching means expensive DRAM, which translates to a higher cost per gigabyte than solid-state disks (see this P99 CONF talk by Grafana’s Danny Kopping for more details on that). Rather than provisioning an entirely separate infrastructure for caching, it is often best to use the existing database memory, and even increase it for internal caching. Modern database caches can be just as efficient as traditional in-memory caching solutions when sized correctly. When the working set size is too large to fit in memory, then databases often shine in optimizing I/O access to flash storage, making databases alone (no external cache) a preferred and cheaper option.

External Caching Decreases Availability

No cache’s high availability solution can match that of the database itself. Modern distributed databases have multiple replicas; they also are topology-aware and speed-aware and can sustain multiple failures without data loss.
For example, a common replication pattern is three local replicas, which generally allows reads to be balanced across those replicas to efficiently make use of your database’s internal caching mechanism. Consider a nine-node cluster with a replication factor of three: essentially, every node will hold roughly a third of your total data set. As requests are balanced among different replicas, this gives you more room for caching your data, which could completely eliminate the need for an external cache. Conversely, if an external cache happens to invalidate entries right before a surge of cold requests, availability could be impeded for a while since the database won’t have that data in its internal cache (more on this below).

Caches often lack high availability properties and can easily fail or invalidate records depending on their heuristics. Partial failures, which are more common, are even worse in terms of consistency. When the cache inevitably fails, the database will get hit by the unmitigated firehose of queries and likely wreck your SLAs. In addition, even if a cache itself has some high availability features, it can’t coordinate handling such a failure with the persistent database it sits in front of. The bottom line: rely on the database, rather than making your latency SLAs dependent on a cache.

Application Complexity — Your Application Needs to Handle More Cases

External caches introduce application and operational complexity. Once you have an external cache, it is your responsibility to keep the cache up to date with the database. Irrespective of your caching strategy (such as write-through, cache-aside, etc.), there will be edge cases where your cache can run out of sync with your database, and you must account for these during application development. Your client settings (such as failover, retry and timeout policies) need to match the properties of both the cache and your database to function when the cache is unavailable or goes cold. Usually such scenarios are hard to test and implement.

External Caching Ruins the Database Caching

Modern databases have embedded caches and complex policies to manage them. When you place a cache in front of the database, most read requests will reach only the external cache and the database won’t keep these objects in its memory. As a result, the database cache is rendered ineffective. When requests eventually reach the database, its cache will be cold and the responses will come primarily from the disk. As a result, the round trip from the cache to the database and then back to the application is likely to add latency.

External Caching Might Increase Security Risks

An external cache adds a whole new attack surface to your infrastructure. Encryption, isolation and access control on data placed in the cache are likely to be different from the ones at the database layer itself.

External Caching Ignores the Database Knowledge and Database Resources

Databases are quite complex and built for specialized I/O workloads on the system. Many of the queries access the same data, and some amount of the working set size can be cached in memory to save disk accesses. A good database should have sophisticated logic to decide which objects, indexes and accesses it should cache. The database also should have eviction policies that determine when new data should replace existing (older) cached objects. An example is scan-resistant caching.
When scanning a large data set, say a large range or a full-table scan, a lot of objects are read from the disk. The database can realize this is a scan (not a regular query) and choose to leave these objects outside its internal cache. However, an external cache (following a read-through strategy) would treat the result set just like any other and attempt to cache the results. The database automatically synchronizes the content of the cache with the disk according to the incoming request rate, so the user and the developer do not need to do anything to make sure that lookups to recently written data are performant and consistent. Therefore, if, for some reason, your database doesn’t respond fast enough, it means that:

The cache is misconfigured
It doesn’t have enough RAM for caching
The working set size and request pattern don’t fit the cache
The database cache implementation is poor

A Better Option: Let the Database Handle It

How can you meet your SLAs without the risks of external database caches? Many teams have found that by moving to a faster database such as ScyllaDB with a specialized internal cache, they’re able to meet their latency SLAs with less hassle and lower costs. Results vary based on workload characteristics and technical requirements, of course. But for an idea of what’s possible, consider what these teams were able to achieve.

SecurityScorecard Achieves 90% Latency Reduction with $1 Million Annual Savings

SecurityScorecard aims to make the world a safer place by transforming the way thousands of organizations understand, mitigate and communicate cybersecurity. Its rating platform is an objective, data-driven and quantifiable measure of an organization’s overall cybersecurity and cyber risk exposure.

The team’s previous data architecture served them well for a while, but couldn’t keep up with their growth. Their platform API queried one of three data stores: Redis (for faster lookups of 12 million scorecards), Aurora (for storing 4 billion measurement stats across nodes), or a Presto cluster on Hadoop Distributed File System (for complex SQL queries on historical results). As data and requests grew, challenges emerged. Aurora and Presto latencies spiked under high throughput. The largest possible instance of Redis still wasn’t sufficient, and they didn’t want the complexity of working with a Redis Cluster.

To reduce latencies at the new scale that their rapid business growth required, the team moved to ScyllaDB Cloud and developed a new scoring API that routed less latency-sensitive requests to Presto and S3 storage.

[Diagram: the new – and considerably simpler – architecture]

The move resulted in:

90% latency reduction for most service endpoints
80% fewer production incidents related to Presto/Aurora performance
$1 million infrastructure cost savings per year
30% faster data pipeline processing
Much better customer experience

Read more about the SecurityScorecard use case.

IMVU Reins in Redis Costs at 100X Scale

A popular social community, IMVU enables people all over the world to interact with each other using 3D avatars on their desktops, tablets and mobile devices. To meet growing requirements for scale, IMVU decided it needed a more performant solution than its previous database architecture of Memcached in front of MySQL and Redis. The team looked for something that would be easier to configure, easier to extend and, if successful, easier to scale.
“Redis was fine for prototyping features, but once we actually rolled it out, the expenses started getting hard to justify,” said Ken Rudy, senior software engineer at IMVU. “ScyllaDB is optimized for keeping the data you need in memory and everything else on disk. ScyllaDB allowed us to maintain the same responsiveness for a scale a hundred times what Redis could handle.”

Comcast Reduces Long Tail Latencies 95% with $2.5 Million Annual Savings

Comcast is a global media and technology company with three primary businesses: Comcast Cable, one of the United States’ largest video, high-speed internet and phone providers to residential customers; NBCUniversal; and Sky. Comcast’s Xfinity service serves 15 million households with more than 2 billion API calls (reads/writes) and over 200 million new objects per day. Over seven years, the project expanded from supporting 30,000 devices to more than 31 million.

Cassandra’s long tail latencies proved unacceptable at the company’s rapidly increasing scale. To mask Cassandra’s latency issues from users, the team placed 60 cache servers in front of their database. Keeping this cache layer consistent with the database was causing major admin headaches. Since the cache and related infrastructure had to be replicated across data centers, Comcast needed to keep caches warm. They implemented a cache warmer that examined write volumes, then replicated the data across data centers.

After struggling with the overhead of this approach, Comcast soon moved to ScyllaDB. Designed to minimize latency spikes through its internal caching mechanism, ScyllaDB enabled Comcast to eliminate the external caching layer, providing a simple framework in which the data service connects directly to the data store. Comcast was able to replace 962 Cassandra nodes with just 78 nodes of ScyllaDB. They improved overall availability and performance while completely eliminating the 60 cache servers. The result: 95% lower P99, P999 and P9999 latencies, with the ability to handle over twice the requests – at 60% of the operating costs. This ultimately saved them $2.5 million annually in infrastructure costs and staff overhead.

Closing Thoughts

Although external caches are a great companion for reducing latencies (such as serving static content and personalization data not requiring any level of durability), they often introduce more problems than benefits when placed in front of a database. The top tradeoffs include elevated costs, increased application complexity, additional round trips to your database and an additional security surface area. By rethinking your existing caching strategy and switching to a modern database that provides predictable low latencies at scale, teams can simplify their infrastructure and minimize costs. And at the same time, they can still meet their SLAs without the extra hassles and complexities introduced by external caches.

ScyllaDB as a DynamoDB Alternative: Frequently Asked Questions

A look at the top questions engineers are asking about moving from DynamoDB to ScyllaDB to reduce cost, avoid throttling, and avoid cloud vendor lock-in

A great thing about working closely with our community is that I get a chance to hear a lot about their needs and – most importantly – listen to and take in their feedback. Lately, we’ve seen a growing interest from organizations considering ScyllaDB as a means to replace their existing DynamoDB deployments and, as happens with any new tech stack, some frequently recurring questions. 🙂

ScyllaDB provides you with multiple ways to get started: you can choose between its CQL protocol and ScyllaDB Alternator. CQL refers to the Cassandra Query Language, a NoSQL interface that is intentionally similar to SQL. ScyllaDB Alternator is ScyllaDB’s DynamoDB-compatible API, aiming at full compatibility with the DynamoDB protocol. Its goal is to provide a seamless transition from AWS DynamoDB to a cloud-agnostic or on-premises infrastructure while delivering predictable performance at scale.

But which protocol should you choose? What are the main differences between ScyllaDB and DynamoDB? And what does a typical migration path look like? Fear no more, young sea monster! We’ve got you covered. I personally want to answer some of these top questions right here and right now.

Why switch from DynamoDB to ScyllaDB?

If you are here, chances are that you fall under at least one of the following categories:

Costs are running out of control
Latency and/or throughput are suboptimal
You are currently locked in and would like a bit more flexibility

ScyllaDB delivers predictable low latency at scale with less infrastructure required. For DynamoDB specifically, we have an in-depth article covering which pain points we address.

Is ScyllaDB Alternator a DynamoDB drop-in replacement?

In the strict sense of the term, it is not: there are notable differences between the two solutions. DynamoDB is closed source and its development is driven by AWS (with which ScyllaDB is not affiliated), which means that some features launched in DynamoDB may take some time to land in ScyllaDB. A more accurate description is an almost drop-in replacement. Whenever you migrate to a different database, some degree of change will always be required to get started with the new solution. We try to keep the level of change to a minimum to make the transition as seamless as possible. For example, Digital Turbine easily migrated from DynamoDB to ScyllaDB within just a single two-week sprint, with results showing significant performance improvements and cost savings.

What are the main differences between ScyllaDB Alternator and AWS DynamoDB?

Provisioning: In ScyllaDB, you provision nodes, not tables. In other words, a single ScyllaDB deployment is able to host several tables and serve traffic for multiple workloads combined.
Load Balancing: Application clients do not route traffic through a single endpoint as in AWS DynamoDB (dynamodb.<region_name>.amazonaws.com). Instead, clients may use one of our load balancing libraries or implement a server-side load balancer.
Limits: ScyllaDB does not impose a 400KB limit per item, nor any partition access limits.
Metrics and Integration: Since ScyllaDB is not a “native AWS service,” it naturally does not integrate with other AWS services (such as CloudWatch) in the same way DynamoDB does. For metrics specifically, ScyllaDB provides the ScyllaDB Monitoring Stack with specific dashboards for DynamoDB deployments. 
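To make the “almost drop-in” point concrete, here is a minimal sketch of what pointing an existing DynamoDB application at ScyllaDB Alternator can look like in Python with boto3. Everything specific in it is a placeholder or assumption: the node address, the port (assuming Alternator is enabled and listening on port 8000), the table name, and the dummy credentials. In production you would typically spread requests across several node endpoints or use one of the load balancing libraries mentioned above.

```python
import boto3

# Same boto3 code you would use against DynamoDB; only the endpoint changes.
# "node-1.scylla.example.com:8000" is a placeholder for one of your ScyllaDB
# nodes with Alternator enabled (the port is configurable).
dynamodb = boto3.resource(
    "dynamodb",
    endpoint_url="http://node-1.scylla.example.com:8000",
    region_name="us-east-1",          # required by boto3, not used for routing here
    aws_access_key_id="none",         # placeholder credentials for this sketch
    aws_secret_access_key="none",
)

table = dynamodb.Table("user_profiles")   # hypothetical table name

# Reads and writes keep the familiar DynamoDB API shape.
table.put_item(Item={"user_id": "42", "plan": "premium"})
response = table.get_item(Key={"user_id": "42"})
print(response.get("Item"))
```

Because the wire protocol is the same, most existing DynamoDB SDK code should keep working as-is; the main operational difference is that the client talks to your own nodes rather than to a single regional AWS endpoint.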
When should I use the DynamoDB API instead of CQL?

Whenever you’re interested in moving away from DynamoDB (whether to remain in AWS or to move to another cloud), and you either:

Have zero interest in refactoring your code to a new API, or
Plan to get started with or evaluate ScyllaDB prior to major code refactoring.

For example, you would want to use the DynamoDB API when hundreds of independent Lambda services communicate with DynamoDB and refactoring all of them would require considerable effort, or when you rely on a connector that doesn’t provide compatibility with the CQL protocol. For all other cases, CQL is likely to be the better option. Check out our protocol comparison for more details.

What is the level of effort required to migrate to ScyllaDB?

Assuming that all features required by the application are supported by ScyllaDB (irrespective of which API you choose), the level of effort should be minimal. The process typically involves lifting your existing DynamoDB tables’ data and then replaying changes from DynamoDB Streams to ScyllaDB (a minimal sketch of the change-replay step appears after these questions). Once that is complete, you update your application to connect to ScyllaDB.

I once worked with an AdTech company that chose CQL as their protocol, which obviously required code refactoring to adhere to the new query language. On the other hand, a mobile platform company decided to go with ScyllaDB Alternator, eliminating the need for data transformations during the migration as well as for application code changes.

Is there a tool to migrate from DynamoDB to ScyllaDB Alternator?

Yes. The ScyllaDB Migrator is a Spark-based tool available to perform end-to-end migrations. We also provide relevant material and hands-on assistance for migrating to ScyllaDB using alternative methods, as relevant to your use case.

I currently rely on DynamoDB autoscaling; how does that translate to ScyllaDB?

More often than not, you shouldn’t need it. Autoscaling is not free (there’s idle infrastructure reserved for you), and it requires considerable “time to scale”, which may end up ruining end users’ experience. A small ScyllaDB cluster alone should be sufficient to deliver tens to hundreds of thousands of operations per second, and a moderately sized one can easily achieve over a million operations per second. That being said, the best practice is to be provisioned for the peak.

What about DynamoDB Accelerator (DAX)?

ScyllaDB implements a row-based cache, which is just as fast as DAX. We follow a read-through caching strategy (unlike DAX’s write-through strategy), resulting in less write amplification, simplified cache management and lower latencies. In addition, ScyllaDB’s cache is bundled within the core database, not offered as a separate add-on like DynamoDB Accelerator.

Which features are (not) available?

The ScyllaDB Alternator Compatibility page contains a detailed breakdown of not yet implemented features. Keep in mind that some features might just be missing the DynamoDB API implementation; you might be able to achieve the same functionality in ScyllaDB in other ways. If any particular missing feature is critical for your ScyllaDB adoption, please let us know.

How safe is this? Really?

ScyllaDB Alternator has been production-ready since 2020, with leading organizations running it in production both on-premises and in ScyllaDB Cloud. Our DynamoDB-compatible API is extensively tested and its code is open source. 
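Since the migration path above mentions replaying changes from DynamoDB Streams, here is a hedged sketch of what that replay loop can look like in Python with boto3. The stream ARN, table name, and ScyllaDB endpoint are placeholders, and it assumes the stream is configured with a view type that includes new images. It is deliberately simplified: a real migration would add checkpointing, parallel shard workers, retries, and a stopping condition for open shards, or would simply use the ScyllaDB Migrator, which handles this end to end.

```python
import boto3

# Placeholders: your DynamoDB Stream ARN, the target table name, and a ScyllaDB
# node with Alternator enabled.
STREAM_ARN = "arn:aws:dynamodb:us-east-1:123456789012:table/user_profiles/stream/..."
TARGET_TABLE = "user_profiles"

streams = boto3.client("dynamodbstreams", region_name="us-east-1")
target = boto3.client(
    "dynamodb",
    endpoint_url="http://node-1.scylla.example.com:8000",  # placeholder Alternator endpoint
    region_name="us-east-1",
    aws_access_key_id="none",
    aws_secret_access_key="none",
)

# Walk each shard of the stream and apply every buffered change to ScyllaDB.
stream = streams.describe_stream(StreamArn=STREAM_ARN)["StreamDescription"]
for shard in stream["Shards"]:
    iterator = streams.get_shard_iterator(
        StreamArn=STREAM_ARN,
        ShardId=shard["ShardId"],
        ShardIteratorType="TRIM_HORIZON",  # start from the oldest retained change
    )["ShardIterator"]

    while iterator:
        batch = streams.get_records(ShardIterator=iterator)
        for record in batch["Records"]:
            change = record["dynamodb"]
            if record["eventName"] in ("INSERT", "MODIFY"):
                # NewImage is already in the low-level attribute-value format
                # that put_item expects, so it can be forwarded as-is.
                target.put_item(TableName=TARGET_TABLE, Item=change["NewImage"])
            else:  # REMOVE
                target.delete_item(TableName=TARGET_TABLE, Key=change["Keys"])
        iterator = batch.get("NextShardIterator")
```

The point of the sketch is simply that the change feed and the Alternator target speak the same attribute-value format, so replaying writes does not require any data transformation.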
Next Steps

If you’d like to learn more about how to succeed during your DynamoDB to ScyllaDB journey, I highly encourage you to check out our recent ScyllaDB Summit talk. For a detailed performance comparison across both solutions, check out our ScyllaDB Cloud vs DynamoDB benchmark. If you are still unsure whether a change makes sense, you might want to read the top reasons why teams decide to abandon the DynamoDB ecosystem. If you’d like a high-level overview of how to move away from DynamoDB, refer to our DynamoDB: How to Move Out? article. And when you are ready for your actual migration, check out our in-depth walkthrough of an end-to-end DynamoDB to ScyllaDB migration.

Chances are I did not address some of your more specific questions (sorry about that!), in which case you can always book a 1:1 Technical Consultation with me so we can discuss your specific situation thoroughly. I’m looking forward to it!