ScyllaDB User Tech Talks: Dor Laor’s Takeaways

Somehow two months have passed since ScyllaDB Summit 24. Thinking back on the event, so much has changed since our inaugural ScyllaDB Summit, when ~100 people gathered in San Jose, CA in 2016. The ScyllaDB database, company, and community have all developed quite substantially. But one thing remains the same: hearing about what our users are achieving with ScyllaDB is still my favorite part of the conference. I want to share some of the highlights from these great tech talks by our users and encourage you to watch the ones that are most relevant to your situation.

As you read/watch, you will notice that some of the same themes appear across multiple talks:

  • Moving to ScyllaDB to simplify the tech stack and operations
  • Growing demand for a database that’s not tied to a specific cloud vendor
  • Users unwilling to compromise between low latency, high throughput, and cost-effectiveness

ReversingLabs

The Strategy Behind ReversingLabs’ Massive Key-Value Migration
Martina Alilović Rojnić, Software Architect, ReversingLabs

About: ReversingLabs offers a complete software supply chain security and malware analysis platform. ReversingLabs data is used by more than 65 of the world’s most advanced security vendors and their tens of thousands of security professionals.

Takeaways: Martina is a fantastic storyteller, and she had a rather impressive story to tell. ReversingLabs collects a massive amount of file samples daily and analyzes them both statically and dynamically. They currently have around 20B samples in their data set, mostly malware. These samples generate over 300TB of metadata, which is then exposed to around 400 services and various feeds. Not surprisingly, the system architecture they designed when they were a scrappy startup well over a decade ago doesn’t suit their 2024 scale. They recognized that their initial Postgres-based architecture was not sustainable back in 2011, and they built their own simple key-value store that met their specific needs (being able to handle a ridiculous amount of data extremely efficiently). Ten years later, the company had grown significantly and they recognized they were reaching the limits of that custom key-value implementation. They selected ScyllaDB, as you can guess. But what’s most interesting is how they pulled off this massive migration flawlessly, and with zero downtime. As I said at the start, Martina is a great storyteller – so I strongly encourage you to learn about their strategy directly from her.

Watch ReversingLabs

Supercell

Real-Time Persisted Events at Supercell
Edvard Fagerholm, Senior Server Engineer, Supercell

About: Supercell is the brand behind the popular global games Hay Day, Clash of Clans, Boom Beach, Clash Royale, and Brawl Stars. Supercell offers free-to-play games that yield profits through in-game microtransactions.

Takeaways: Supercell recently started on an application that goes along with wildly popular video games which each have generated over $1B in revenue. Basically, they transformed an existing Supercell ID implementation to support an in-game social network that lets players communicate with each other and see what their friends are currently doing within the games. Every player update generates an event, which is immediately sent to an API endpoint, broadcast out to the appropriate parties, and saved to ScyllaDB. In his tech talk, Edvard shared the story behind this social network’s implementation.
I think we were all shocked to learn that the social network component of this system, used by over 200M players per month, is developed and run by a single backend engineer. That’s why the team wanted the implementation to be as efficient, flexible, and simple as possible, with a high level of abstraction. And to store the many events generated by all the players changing levels, updating avatars, etc., they needed a database that handled many small writes, easily supported a hierarchical data model, had low latency, and relieved that one backend engineer from having to deal with all the cluster management. We’re honored that they selected ScyllaDB Cloud for this task. See the video for examples of how they modeled data for ScyllaDB as well as a detailed look at how they built the broader system architecture that ScyllaDB plugs into.

Watch Supercell

Tractian

MongoDB vs ScyllaDB: Tractian’s Experience with Real-Time ML
JP Voltani, Director of Engineering, Tractian

About: Tractian is an industrial asset monitoring company that uses AI to predict mechanical failures. They use sensors, edge computing hardware, and AI models to monitor industrial machines and identify potential failures based on vibrations and frequency patterns.

Takeaways: The team at Tractian does a first-class job across the board: database POCs, benchmarking, migrations, and even award-worthy video production. Kudos to them for amazing skill and initiative, all around! With over 6 languages and 8 databases in the young and growing organization’s tech stack, they are clearly committed to adopting whatever technology is the best match for each of their use cases. And that’s also why they moved away from MongoDB when their AI model workloads increased over 2X in a single year and latencies also spiked beyond the point where scaling and tuning could help. They evaluated Postgres and ScyllaDB, with extensive benchmarking across all three options. Ultimately, they moved to ScyllaDB Cloud because it met their needs for latency at scale. Interestingly, they saw the data modeling rework required for moving from MongoDB to ScyllaDB as an opportunity, not a burden. They gained 10x better throughput with 10x better latency, together. Watch the session to get all the details on their MongoDB to ScyllaDB migration blueprint.

Watch TRACTIAN

Discord

So You’ve Lost Quorum: Lessons From Accidental Downtime
Bo Ingram, Staff Software Engineer, Persistence Infrastructure, Discord

About: Discord is “a voice, video, and text app that helps friends and communities come together to hang out and explore their interests — from artists and activists, to study groups, sneakerheads, plant parents, and more. With 150 million monthly users across 19 million active communities, called servers, Discord has grown to become one of the most popular communications services in the world.”

Takeaways: Last year, Bo Ingram shared why and how Discord’s persistence team recently completed their most ambitious migration yet: moving their massive set of trillions of messages from Cassandra to ScyllaDB. This year, we were honored to have him back to share lessons from a very stressful Monday at Discord. Bo shared the inside perspective on an outage with one of their ScyllaDB clusters, showing how a stressed ScyllaDB cluster looks and behaves during an incident. If you watch this session, you will learn how to diagnose issues in your clusters, see how external failure modes manifest in ScyllaDB, and find out how you can avoid making a fault too big to tolerate.
Bonus: Bo just wrote a great book on using ScyllaDB! If you plan to run ScyllaDB at scale, make sure you read this book before going to production. Bo captured years of high scalability practices, wrapped in friendly and fun packaging. You can get early access to a preview (free) at https://lp.scylladb.com/scylladb-in-action-book-offer. That includes a 45% discount on the complete book, which is being released quite soon.

Watch Discord

Zee

CTO Insights: Steering a High-Stakes Database Migration
Kishore Krishnamurthy, CTO, ZEE5

Tracking Millions of Heartbeats on Zee’s OTT Platform
Srinivas Shanmugam, Principal Architect, ZEE5
Jivesh Threja, Senior Software Engineer, ZEE5

About: Zee is a 30-year-old publicly-listed Indian media and entertainment company. ZEE5 is their premier “over the top” (OTT) streaming service, available in over 190 countries with 150M monthly active users.

Takeaways: It’s honestly hard to believe that we just started working with Zee last year. They fast became power users, especially after navigating two major migrations – from DynamoDB, RDS (PostgreSQL), Redis, and Solr to ScyllaDB, as well as a parallel migration to Google Cloud. The fact that these migrations went so smoothly is a great testament to the leadership of CTO Kishore Krishnamurthy. His keynote provides rapid-fire insight into what was going through his head before, during, and after those migration processes. Zee engineers also provided an insightful tech talk on Zee’s heartbeat API, which was the primary use case driving their consideration of ScyllaDB. Heartbeat lies at the core of Zee’s playback experience, security, and recommendation systems. They currently process a whopping 100B+ heartbeats per day! Scaling up their previous heartbeats architecture to suit their rapid growth was cost-prohibitive with their previous databases. Jivesh and Srinivas outlined the technical requirements for the replacement (cloud neutrality, multi-tenant readiness, simplicity of onboarding new use cases, and high throughput and low latency at optimal costs) and how that led to ScyllaDB. Then, they explained how they achieved their goals through a new stream processing pipeline, new API layer, and data (re)modeling that ultimately added up to 5X cost savings (from 744K to 144K annually) and single-digit millisecond P99 read latency. It all wraps up with an important set of lessons learned that anyone using or considering ScyllaDB should take a look at.

Watch Kishore

Watch the Zee5 Tech Talk

Digital Turbine (with SADA)

Radically Outperforming DynamoDB @ Digital Turbine with SADA and Google Cloud
Joseph Shorter, VP of Platform Architecture, Digital Turbine
Miles Ward, CTO, SADA

About: Digital Turbine is a mobile advertising, app delivery, and monetization ecosystem adopted by mobile carriers and OEMs. Their technology is deployed on ~100M mobile phones and drives 5B app installs.

Takeaways: First off, I think Miles and Joe deserve their own talk show. These are two great technologists and really cool people; it’s fun to watch them having so much fun talking about serious business-critical technology challenges. After some fun intros, Miles and Joe walked through why DynamoDB was no longer meeting their needs. Basically, Digital Turbine’s write performance wasn’t where they hoped it would be, and getting the performance they really wanted would have required even higher costs. On top of that, Digital Turbine had already decided to move to GCP.
They approached SADA, looking for a way to get better performance and migrate without radically refactoring their platform. Joe and team were sold on the fact that ScyllaDB offered Alternator, a DynamoDB-compatible API that let them move with minimal code changes (less than one sprint). They were pleasantly surprised to find that it not only improved their performance, but also came at a much lower cost due to ScyllaDB’s inherent efficiency. While DynamoDB was throttling them at 1,400 OPS, ScyllaDB wasn’t even breaking a sweat. That’s great to hear!

Watch Digital Turbine and SADA

JioCinema

Discover the Unseen: Tailored Recommendation of Unwatched Content
Charan Kamal, Back End Developer, JioCinema
Harshit Jain, Software Engineer, JioCinema

About: JioCinema is an Indian over-the-top media streaming service owned by Viacom18, a joint venture of Reliance Industries and Paramount Global. They feature top Indian and international entertainment, including live sports (such as IPL cricket and NBA basketball), new movies, HBO originals, and more.

Takeaways: It was interesting to see how JioCinema tackled the challenge of ensuring that the “prime real estate” of recommended content lists doesn’t include content that the viewer has already watched. This is a common concern across media streamers – or at least should be – and I’m impressed by the innovative way that this team approached it. Specifically, they used Bloom filters. They explored different options, but discovered that the most common ones required significant tradeoffs. Building their own in-memory Bloom filters offered low latency, but implementing it at the scale required was going to require too much work given their core focus. Redis was ruled out due to the deployment complexity required to maintain high availability and because Redis’ cost structure (pay for every operation) was going to be prohibitive given their 10M daily users and big events with 20M concurrent users. ScyllaDB also worked well, and with a cost structure that worked better for this rapidly growing service. In the video, Charan shares some good technical nuggets about how they implemented the Bloom filter solution with ScyllaDB. He also covers other features they liked in ScyllaDB, such as multi-region support, local quorum, TTL for auto expiration, and high cardinality of partition keys.

Watch JioCinema

Expedia

Inside Expedia’s Migration to ScyllaDB for Change Data Capture
Jean Carlo Rivera Ura, NoSQL Database Engineer III, Expedia
Mani Rangu, Database Administrator III, Expedia

About: Expedia is one of the world’s leading full-service online travel brands, helping travelers easily plan and book their whole trip with a wide selection of vacation packages, flights, hotels, vacation rentals, rental cars, cruises, activities, attractions, and services.

Takeaways: Expedia shared their first ScyllaDB use case back in 2021. That was for an application that aggregates data from multiple systems, like hotel location info, third-party data, etc., with the goal of single-digit millisecond P99 read response time. The new use case presented at this year’s ScyllaDB Summit was related to their identity cluster, which numerous Expedia applications rely on for user authentication. If it goes down, users can’t log in. They were previously running on Cassandra and using Cassandra’s CDC to capture changes and stream them to a Kafka cluster using the Debezium Connector.
They grew frustrated by the fact that writes to the CDC-enabled tables were rejected whenever the Debezium Connector stopped consuming events (which happened all too often for their liking). Upon learning that ScyllaDB offered a straightforward and uncomplicated CDC implementation, they planned their move. Watch their talk to hear their breakdown of the two migration options they considered (SSTableLoader and the Spark-based ScyllaDB Migrator), get a step-by-step look at how they migrated, and learn what they’ll do differently when they migrate additional use cases over from Cassandra to ScyllaDB.

Watch Expedia

ShareChat

Getting the Most Out of ScyllaDB Monitoring: ShareChat’s Tips
Andrei Manakov, Staff Software Engineer, ShareChat

About: ShareChat is the leading social media platform in India. Between the ShareChat app and Moj (short-form video), they serve a user base of 400M users – and rapidly growing!

Takeaways: ShareChat is now starting their second year as a ScyllaDB user, which is pretty amazing given that their best practices are among the most sophisticated of all of our users. ShareChat has been a generous contributor to our community across ScyllaDB Summit as well as P99 CONF; I personally appreciate that, and I know our community is always eager to learn what they’re sharing next. The latest addition to the ShareChat library of best practices is Andrei Manakov’s talk on how they’ve really customized monitoring to their specific needs and preferences for Moj, their TikTok-like short video app with 20M daily active users and 100M monthly active users. Here, ScyllaDB is used for the ML feature store that powers users’ feeds. Andrei walks us through how they’ve used the ScyllaDB Monitoring Stack to optimize driver usage, measure the impact of changing compaction strategies, monitor cluster capacity, and more.

Watch ShareChat’s Tips

Bonus: In case anyone wants background on why and how ShareChat moved to ScyllaDB, we’ve also added the great introductory talk by Geetish Nayak to the on-demand library. He covers the pressure driving their database migration, why and how they implemented ScyllaDB for a still-growing number of use cases, their migration strategy, and the best practices they developed for working with ScyllaDB.

See the ShareChat Story

Bonus Session for Live Attendees Only

Speaking of bonuses – live attendees were also treated to an additional user tech talk by a major media streaming provider. We don’t have permission to share it following that live broadcast, but here are some high-level takeaways:

  • The team previously used Cassandra for their “Continue Watching” use case, but went looking for alternatives due to GC spikes, complex CDC, and its inability to take advantage of the modern hardware they had invested in
  • They considered DynamoDB because of its similar data structure, but based on their data size (>10TB) and throughput (170k+ WPS and 78k+ RPS), they felt DynamoDB was “like just burning money”
  • They conducted a POC of ScyllaDB, led by the ScyllaDB team, using 6x i4i.4xlarge nodes with RF 3 and 3B records preloaded
  • They were able to hit the combined load with no errors, P99 read latency of 9ms, and P99 write latency < 1ms

Watch All Sessions On Demand

Retaining Database Goodput Under Stress with Per-Partition Query Rate Limiting

How we implemented ScyllaDB’s per-partition rate limit feature and the impact on “goodput”

Distributed database clusters operate best when the data is spread across a large number of small partitions, and reads/writes are spread uniformly across all shards and nodes. But misbehaving client applications (e.g., triggered by bugs, malicious actors, or unpredictable events sending something “viral”) could suddenly upset a previously healthy balance, causing one partition to receive a disproportionate number of requests. In turn, this usually overloads the owning shards, creating a scenario called a “hot partition” and causing the total cluster latency to worsen.

To prevent this, ScyllaDB’s per-partition rate limit feature allows users to limit the rate of accepted requests on a per-partition basis. When a partition exceeds the configured limit of operations of a given type (reads/writes) per second, the cluster will then start responding with errors to some of the operations for that partition. That way, the rate of accepted requests is statistically kept at the configured limit. Since rejected operations use fewer resources, this alleviates the “hot partition” situation.

This article details why and how we implemented this per-partition rate limiting and measures the impact on “goodput”: the amount of useful data being transferred between clients and servers.

Background: The “Hot Partition” Problem

To better understand the problem we were addressing, let’s first take a quick look at ScyllaDB’s architecture. ScyllaDB is a distributed database. A single cluster is composed of multiple nodes, and each node is responsible for storing and handling a subset of the total data set. Furthermore, each node is divided into shards. A single shard is basically a CPU core with a chunk of RAM assigned to it. Each of a node’s shards handles a subset of the data the node is responsible for.

Client applications that use our drivers open a separate connection to each shard in the cluster. When the client application wants to read or write some data, the driver determines which shard owns this data and sends the request directly to that shard on its dedicated connection. This way, the CPU cores rarely have to communicate with each other, minimizing the cross-core synchronization cost. All of this enables ScyllaDB to scale vertically as well as horizontally.

The most granular unit of partitioning is called a partition. A partition in a table is a set of all rows that share the same partition key. For each partition, a set of replica nodes is chosen. Each of the replicas is responsible for storing a copy of the partition. While read and write requests can be handled by non-replica nodes as well, the coordinator needs to contact the replicas in order to execute the operation. Replicas are equal in importance and not all have to be available in order to handle the operation (the number required depends on the consistency level that the user set).

This data partitioning scheme works well when the data and load are evenly distributed across all partitions in the cluster. However, a problem can occur when one partition suddenly starts receiving a disproportionate amount of traffic compared to other partitions. Because ScyllaDB assigns only a part of its computing resources to each partition, the shards on replicas responsible for that partition can easily become overloaded.
The architecture doesn’t really allow other shards or nodes to “come to the rescue,” so even a powerful cluster will struggle to serve requests to that partition. We call this situation a “hot partition” problem. Moreover, the negative effects aren’t restricted to a single partition. Each shard is responsible for handling many partitions. If one of those partitions becomes hot, then any partitions that share at least a single replica shard with that overloaded partition will be impacted as well.

It’s the Schema’s Fault (Maybe?)

Sometimes, hot partitions occur because the cluster’s schema isn’t a great fit for your data. You need to be aware of how your data is distributed and how it is accessed and used, then model it appropriately so that it fits ScyllaDB’s strengths and avoids its weaknesses. That responsibility lies with you as the system designer; ScyllaDB won’t automatically re-partition the data for you.

It makes sense to optimize your data layout for the common case. However, the uncommon case often occurs because end-users don’t always behave as expected. It’s essential that unanticipated user behavior cannot bring the whole system to a grinding halt. What’s more, something inside your system itself could also stir up problems. No matter how well you test your code, bugs are an inevitable reality. Lurking bugs might cause erratic system behavior, resulting in highly unbalanced workloads. As a result, it’s important to think about overload protection at the per-component level as well as at the system level.

What is “Goodput” and How Do You Maintain It?

My colleague Piotr Sarna explained goodput as follows in the book Database Performance at Scale: “A healthy distributed database cluster is characterized by stable goodput, not throughput. Goodput is an interesting portmanteau of good + throughput, and it’s a measure of useful data being transferred between clients and servers over the network, as opposed to just any data. Goodput disregards errors and other churn-like redundant retries, and is used to judge how effective the communication actually is. This distinction is important. Imagine an extreme case of an overloaded node that keeps returning errors for each incoming request. Even though stable and sustainable throughput can be observed, this database brings no value to the end user.”

With hot partitions, requests arrive faster than the replica shards can process them. Those requests form a queue that keeps growing. As the queue grows, requests need to wait longer and longer, ultimately reaching a point where most requests fail because they time out before processing even begins. The system has good throughput because it accepts a lot of work, but poor goodput because most of the work will ultimately be wasted.

That problem can be mitigated by rejecting some of the requests when we have reason to believe that we won’t be able to process all of them. Requests should be rejected as early as possible and as cheaply as possible in order to leave the most computing resources for the remaining requests.

Enter Per-Partition Rate Limiting

In ScyllaDB 5.1 we implemented per-partition rate limiting, which follows that idea. For a given table, users can specify a per-partition rate limit: one for reads, another for writes. If the cluster detects that the reads or writes for a given partition are starting to exceed that user-defined limit, the database will begin rejecting some of the requests in an attempt to keep the rate of unrejected requests below the limit.
For example, the limit could be set as follows:

ALTER TABLE ks.tbl WITH per_partition_rate_limit = {
    'max_writes_per_second': 100,
    'max_reads_per_second': 200
};

We recommend that users set the limit high enough to avoid reads/writes being rejected during normal usage, but also low enough so that ScyllaDB will reject most of the reads/writes to a partition when it becomes hot. A simple guideline is to estimate the maximum expected reads/writes per second and apply a limit that is an order of magnitude higher.

Although this feature tries to accurately apply the defined limit, the actual rate of accepted requests may be higher due to the distributed nature of ScyllaDB. Keep in mind that this feature is not meant for enforcing strict limits (e.g. for API purposes). Rather, it was designed as an overload protection mechanism. The following sections will provide a bit more detail about why there’s sometimes a discrepancy.

How Per-Partition Rate Limiting Works

To determine how many requests to reject, ScyllaDB needs to estimate the request rate for each partition. Because partitions are replicated on multiple nodes and not all replicas are usually required to perform an operation, there is no obvious single place to track the rate. Instead, we perform measurements on each replica separately and use them to estimate the rate.

Let’s zoom in on a single replica and see what is actually being measured. Each shard keeps a map of integer counters indexed by token, table, and operation type. When a partition is accessed, the counter relevant to the operation is increased by one. On the other hand, every second we cut every counter in half, rounding towards zero. Due to this, it is a simple mathematical exercise to show that if a partition has a steady rate of requests, then the counter value will eventually oscillate between X and 2*X, where X is the rate in terms of requests per second.

We managed to implement the counter map as a hashmap that uses a fixed, statically allocated region of memory that is independent from the number of tables or the size of the data set. We employ several tricks to make it work without losing too much accuracy. You can review the implementation here.

Case 1: When the coordinator is a replica

Now, let’s see how these counters are used in the context of the whole operation. Let’s start with the case where the coordinator is also a replica. This is almost always the case when an application uses a properly configured shard-aware driver for ScyllaDB [learn more about shard-aware drivers]. Because the coordinator is a replica, it has direct access to one of the counters for the partition relevant to the operation. We let the coordinator make a decision based only on that counter. If it decides to accept the operation, it then increments its counter and tells other replicas to do the same. If it decides to reject it, then the counter is only incremented locally and replicas are not contacted at all. Although skipping the replicas causes some undercounting, the discrepancy isn’t significant in practice – and it saves processing time and network resources.
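To make the counter math above more concrete, here is a minimal, illustrative Python sketch of a per-partition rate estimator with probabilistic rejection. It is a simplification for intuition only: ScyllaDB's actual implementation is in C++, keyed by token, table, and operation type, and stored in a fixed-size hashmap, and the exact rejection formula below is an assumption.

import random
import time

class PartitionRateLimiter:
    """Illustrative sketch only (not ScyllaDB's actual code).

    A counter is incremented on every access and halved once per second,
    so for a steady rate of X ops/s it oscillates between X and 2*X.
    """

    def __init__(self, limit_per_second: int):
        self.limit = limit_per_second
        self.counters = {}                  # partition key -> integer counter
        self.last_halving = int(time.time())

    def _maybe_halve(self) -> None:
        now = int(time.time())
        if now > self.last_halving:
            for key in self.counters:
                # Halve once per elapsed second, rounding towards zero
                self.counters[key] >>= (now - self.last_halving)
            self.last_halving = now

    def should_accept(self, partition_key) -> bool:
        self._maybe_halve()
        count = self.counters.get(partition_key, 0)
        # The counter oscillates between rate and 2*rate, so take a midpoint estimate.
        estimated_rate = count / 1.5
        # Reject with a probability that grows as the estimate exceeds the limit
        # (assumed formula, chosen for illustration).
        reject_probability = max(0.0, 1.0 - self.limit / estimated_rate) if estimated_rate > 0 else 0.0
        accepted = random.random() >= reject_probability
        # Both accepted and rejected operations bump the local counter,
        # mirroring the coordinator-as-replica case described above.
        self.counters[partition_key] = count + 1
        return accepted

With limit_per_second=100, a partition receiving roughly 1,000 requests per second would see most of them rejected, while partitions staying under the limit are untouched.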
Case 2: When the coordinator is not a replica

When the coordinator is not a replica, it doesn’t have direct access to any of the counters. We cannot let the coordinator ask a replica for a counter because it would introduce additional latency – and that’s unacceptable for a mechanism that is supposed to be low cost. Instead, the coordinator proceeds with sending the request to replicas and lets them decide whether to accept or reject the operation. Unfortunately, we cannot guarantee that all replicas will make the same decision. If they don’t, it’s not the end of the world, but some work will go to waste. Here’s how we try to reduce the chance of that happening.

First, cutting counters in half is scheduled to happen at the start of every second, according to the system clock. ScyllaDB depends on an accurate system clock, so cutting counters on each node should be synchronized relatively well. Assuming this, and assuming that nodes accept a similar, steady rate of replica work, the counter values should be close to each other most of the time.

Second, the coordinator chooses a random number from the range 0..1 and sends it along with the request. Each replica computes the probability of rejection (based on its counter value) and rejects the operation if the number sent by the coordinator falls below it. Because the counters are supposed to be close to each other, the replicas will usually agree on the decision.

Read vs Write Accuracy

There is one important difference in measurement accuracy between reads and writes that is worth mentioning. When a write operation occurs, the coordinator asks all live replicas to perform the operation. The consistency level only affects the number of replicas the coordinator is supposed to wait for. All live replicas increment their counters for every write operation, so our current calculations are not affected. However, in the case of reads, the consistency level also affects the number of replicas contacted. It can lead to the shard counters being incremented fewer times than in the case of writes. For example, with replication factor 3 and a quorum consistency level, only 2 out of 3 replicas will be contacted for each read. The cluster will think that the real read rate is proportionally smaller, which can lead to a higher rate of requests being accepted than the user limit allows. We found this acceptable for an overload prevention solution, where it’s more important to prevent cluster performance from collapsing than to enforce a strict quota.

Results: Goodput Restored after Enabling Rate Limit

Here is a benchmark that demonstrates the usefulness of per-partition rate limiting. We set up a cluster on 3 i3.4xlarge AWS nodes with ScyllaDB 5.1.0-rc1 and pre-populated it with a small data set that fits in memory.

First, we ran a uniform read workload of 80k reads per second from a c5.4xlarge instance – this represents the first section on the following charts. (Note that the charts show per-shard measurements.)

Next, we started another loader instance. Instead of reading uniformly, this loader performed heavy queries on a single partition only – 8 times as much concurrency and 10 times the data fetched for each read. As expected, three shards became overloaded. Because the benchmarking application uses fixed concurrency, it became bottlenecked on the overloaded shards and its read rate dropped from 80k to 26k requests per second.

Finally, we applied a per-partition limit of 10 reads per second. The chart shows that the read rate recovered. Even though the shards that were previously overloaded now coordinate many more requests per second, nearly all of the problematic requests are rejected. This is much cheaper than trying to actually handle them, and it’s still within the capabilities of the shards.
Summary

This article explored ScyllaDB’s solution to the “hot partition” problem in distributed database clusters: per-partition rate limiting. The capability was designed to maintain stable goodput under stress by rejecting excess requests before they consume significant resources. The implementation works by estimating request rates for each partition and making accept/reject decisions based on those estimates. Benchmarks confirmed that rate limiting restored goodput, even under stressful conditions, by rejecting problematic requests.

You can read the detailed design notes at https://github.com/scylladb/scylladb/blob/master/docs/dev/per-partition-rate-limit.md

IoT Overdrive Part 2: Where Can We Improve?

In our first blog, “IoT Overdrive Part 1: Compute Cluster Running Apache Cassandra® and Apache Kafka®”, we explored the project’s inception and Version 1.0. Now, let’s dive into the evolution of the project via a walkthrough of Version 1.4 and our plans with Version 2.0.

Some of our initial challenges that need to be addressed include:

  • Insufficient hardware
  • Unreliable WiFi
  • Finding a way to power all the new and existing hardware without as many cables
  • Automating the setup of the compute nodes

Upgrade: Version 1.4

Version 1.4 served as a stepping stone to Version 2.0, codenamed Ericht, which needed to be portable for shows and talks. Scaling up and out was a big necessity for this version so that it could be set up and demonstrated in a timely manner.

Why the jump to 1.4? We made a few changes in between that were relatively minor, and so we rolled v1.1–1.3 into v1.4.

Let’s walk through the major changes:

Hardware

To resolve the insufficient hardware issue, we enhanced the hardware, implementing both vertical and horizontal scaling, and introduced a new method to power the Pi computers. This includes setting the baseline model for a node at a 4GB quad-core computer (for vertical scaling) and increasing the number of nodes (for horizontal scaling).

These nodes are comparable to smaller AWS instances, and the cluster could be compared to a free cluster on the Instaclustr managed service.

Orange Pi 5

We upgraded the 4 Orange Pi worker nodes to Orange Pi 5s with 8GB RAM and added 256GB M.2 storage drives to each Orange Pi for storing database and Kafka data.

Raspberry Pi Workers

The 4 Raspberry Pi workers were upgraded to 8GB RAM Raspberry Pi 4bs, and the 4GB models were reassigned as Application Servers. We added 256GB USB drives to the Raspberry Pis for storing database and Kafka data.

After all that, the cluster plan looked like this:

Source: Kassian Wren

We needed a way to power all of this; luckily there’s a way to kill two birds with one stone here.

Power Over Ethernet (PoE) Switch

To eliminate power cable clutter and switch to Ethernet connections, I used a Netgear 16-port PoE switch. Each Pi has its own 5V/3A PoE adapter, which splits the connection out into a USB-C power connector and an Ethernet connector.

This setup allowed for a single power plug to be used for the entire cluster instead of one for each Pi, greatly reducing the need for power strips and slimming down the cluster considerably.

The entire cluster plus the switch consume about 20W of power, which is not small but not unreasonable to power with a large solar panel.

Source: Kassian Wren

Software

I made quite a few tweaks on the software side in v1.4. However, the major software enhancement was the integration of Ansible.

Ansible automated the setup process for both Raspberry and Orange Pi computers, from performing apt updates to installing Docker and starting the swarm.

Making it Mobile

We made the enclosure more transport-friendly with configurations for both personal travel and secure shipping. An enclosure was designed from acrylic plates with screw holes to match the new heat sink/fan cases for the Pi computers.

Using a laser cutter/engraver, we cut the plates out of acrylic and used motherboard standoff screws to stack the racks two high; the cluster can be configured four high, but it gets tall and ungainly.

The next challenge was coming up with a way to ship it for events.

There are two configurations: one where the cluster travels with me, and the other where the cluster is shipped to my destination.

For shipping, I use one large pelican case (hard-sided suitcase) with the pluck foam layers laid out to cradle the enclosures, with the switch in the layer above.

The “travel with me” configuration is still a work in progress; fitting everything into a case small enough to travel with is interesting! I use a carry-on pelican case and squish the cluster in. You can see the yellow case in this photo:

Source: Kassian Wren

More Lessons Learned

While Version 1.4 was much better, it did bring to light the need for more services and capabilities, and showed how my implementation of the cluster was pushing the limits of Docker.

Now that we’ve covered 1.4 and all of the work we’ve done so far, it’s time to look to the future with version 2.0, codenamed “Ericht.”

Next: Version 2.0 (Ericht)

With 2.0, I needed to build out a mobile, easy-to-move version of the cluster.

Version 2.0 focuses on creating a mobile, easily manageable version of the cluster, with enhanced automation for container deployment and management. It also implements a service that monitors the health of the cluster and passes and stores that data using Kafka and Cassandra.

The problems we’re solving with 2.0 are:

  • Overhaul container management
  • Make the cluster mobile
  • Give the cluster something to do and show
  • Monitor the health and status of the cluster

Hardware for 2.0

A significant enhancement in this new version is the integration of INA226 voltage/current sensors into each Pi. This addition is key to providing advanced power monitoring and detailed insights into each unit’s energy consumption.
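As a rough illustration of how such a sensor can be read from a Pi, here is a minimal Python sketch using the smbus2 library. The I2C bus number, device address (0x40), and shunt resistor value are assumptions that depend on the actual wiring; the register addresses and LSB values follow the INA226 datasheet.

from smbus2 import SMBus

I2C_BUS = 1          # assumed: /dev/i2c-1 on a Raspberry/Orange Pi
INA226_ADDR = 0x40   # assumed: default address strapping
SHUNT_OHMS = 0.1     # assumed: shunt resistor value on the breakout board

# INA226 register addresses (per the datasheet)
REG_SHUNT_VOLTAGE = 0x01
REG_BUS_VOLTAGE = 0x02

def _read_register(bus: SMBus, reg: int) -> int:
    # The INA226 returns registers MSB-first, while SMBus word reads are
    # LSB-first, so swap the bytes.
    raw = bus.read_word_data(INA226_ADDR, reg)
    return ((raw & 0xFF) << 8) | (raw >> 8)

def read_power() -> dict:
    with SMBus(I2C_BUS) as bus:
        shunt_raw = _read_register(bus, REG_SHUNT_VOLTAGE)
        bus_raw = _read_register(bus, REG_BUS_VOLTAGE)

    # Shunt voltage LSB is 2.5 uV; the register is a signed 16-bit value.
    if shunt_raw > 0x7FFF:
        shunt_raw -= 0x10000
    shunt_voltage = shunt_raw * 2.5e-6
    bus_voltage = bus_raw * 1.25e-3        # bus voltage LSB is 1.25 mV

    current = shunt_voltage / SHUNT_OHMS   # Ohm's law, skipping the calibration register
    return {
        "bus_voltage_v": round(bus_voltage, 3),
        "current_a": round(current, 3),
        "power_w": round(bus_voltage * current, 3),
    }

if __name__ == "__main__":
    print(read_power())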

Health Monitoring

I will be implementing an advanced health monitoring system that tracks factors such as CPU and RAM usage, Docker status, and more. These will be passed into the Cassandra database using Kafka consumers and producers.
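Here is a minimal sketch of what such a health-reporting producer might look like in Python, assuming psutil and kafka-python are installed and a Kafka broker is reachable; the broker address and topic name are assumptions, Docker status checks are omitted for brevity, and a separate consumer would read this topic and write the rows into Cassandra.

import json
import socket
import time

import psutil
from kafka import KafkaProducer

KAFKA_BROKER = "localhost:9092"   # assumed broker address
TOPIC = "cluster-health"          # assumed topic name

producer = KafkaProducer(
    bootstrap_servers=KAFKA_BROKER,
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

def collect_metrics() -> dict:
    """Gather basic node health metrics with psutil."""
    return {
        "node": socket.gethostname(),
        "timestamp": int(time.time()),
        "cpu_percent": psutil.cpu_percent(interval=1),
        "ram_percent": psutil.virtual_memory().percent,
        "disk_percent": psutil.disk_usage("/").percent,
    }

if __name__ == "__main__":
    while True:
        producer.send(TOPIC, collect_metrics())
        producer.flush()
        time.sleep(30)   # report every 30 seconds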

Software Updates

Significant updates are planned on the software side of things. Since the Docker swarm requires a lot of manual maintenance, I wanted a better way to orchestrate the containers. One option we considered was Kubernetes.

Kubernetes (K8s)

We’re transitioning to Kubernetes, an open source container orchestration system, to manage our containers. Though Kubernetes is often used for more transient containers than Kafka and Cassandra, operators such as Strimzi for Kafka and K8ssandra for Cassandra help make this integration more feasible and effective.

Moving Ahead

As we continue to showcase and demo this cluster more and more often (we were at Current 2023 and Community Over Code NA!), we are learning a lot. The project’s potential for broader technological applications is becoming increasingly evident.

As I move forward, I’ll continue to document my progress and share my findings here on the blog. Our next post will go into getting Apache Cassandra and Apache Kafka running on the cluster in Docker Swarm.


Gemini Code Assist + Astra DB: Build Generative AI Apps Faster

Today at Google Cloud Next, DataStax is showcasing its integration with Gemini Code Assist to allow for automatic generation of Apache Cassandra-compliant CQL. This integration, a collaboration with Google Cloud, will enable developers to build applications that use Astra DB faster. Developers...

Unlearning Old Habits: From Postgres to NoSQL

Where an RDBMS pro’s intuition led him astray – and what he learned when our database performance expert helped him get up and running with ScyllaDB NoSQL

Recently, I was asked by the ScyllaDB team if I could document some of my learnings moving from relational to NoSQL databases. Of course, there are many (bad) habits to break along the way, and instead of just documenting these, I ended up planning a 3-part livestream series with my good friend Felipe from ScyllaDB.

ScyllaDB is a NoSQL database that is designed from the ground up with performance in mind. And by that, I mean optimizing for speed and throughput. I won’t go into the architecture details here; there’s a lot of material that you might find interesting in the resources section of this site if you want to read more.

Scaling your database is a nice problem to have – it’s a good indicator that your business is doing well. Many customers of ScyllaDB have been on that exact journey: some compelling event has driven them to consider moving from one database to another, all with the goal of driving down latency and increasing throughput. I’ve been through this journey myself, many times in the past, and I thought it would be great if I could re-learn, or perhaps just ditch some bad habits, by documenting the trials and tribulations of moving from a relational database to ScyllaDB. Even better if someone else benefits from these mistakes!

Fictional App, Real Scaling Challenges

So we start with a fictional application I have built. It’s the Rust-Eze Racing application, which you can find on my personal GitHub here: https://github.com/timkoopmans/rust-eze-racing

If you have kids, or are just a fan of the Cars movies, you’re no doubt familiar with Rust-Eze, which is one of the Piston Cup teams – and also a sponsor of the famous Lightning McQueen… Anyway, I digress.

The app is built around telemetry, a key factor in modern motor racing. Telemetry allows race engineers to interpret data that is captured from car systems so that they can tune for optimum performance. I created a simple Rust application that could simulate these metrics being written to the database, while at the same time having queries that read different metrics in real time.

For the proof of concept, I used Postgres for data storage and was able to get decent read and write throughput from it on my development machine. Since we’re still in the world of make-believe and cars that can talk, I want you to imagine that my PoC was massively successful and I have been hired to write the whole back-end system for the Piston Cup.

At this point in the scenario, with overwhelming demands from the business, I start to get nervous about my choice of database: Will it scale? What happens when I have millions or perhaps billions of rows? What happens when I add many more columns and rows with high cardinality to the database? How can I achieve the highest write throughput while maintaining predictable low latency? All the usual questions a real full-stack developer might start to consider in a production scenario…

This leads us to the mistakes I made and the journey I went through, spiking ScyllaDB as my new database of choice.

Getting my development environment up and running, and connecting to the database for the first time

In my first 1:1 livestreamed session with Felipe, I got my development environment up and running, using docker-compose to set up my first ScyllaDB node.
I was a bit over-ambitious and set up a 3-node cluster, since a lot of the training material references this as a minimum requirement for production. Felipe suggested it’s not really required for development and that one node is perfectly fine. There are some good lessons in there, though…

I got a bit confused with port mapping and the way that works in general. In Docker, you can map a published port on the host to a port in the container – so I naturally thought: let’s map each of the container’s 9042 ports to a unique number on the host to facilitate client communication. I did something like this:

ports:
  - "9041:9042"

And I changed the port number for the 3 nodes such that 9041 was node 1, 9042 was node 2, and 9043 was node 3. Then in my Rust code, I did something like this:

SessionBuilder::new()
    .known_nodes(vec!["0.0.0.0:9041", "0.0.0.0:9042", "0.0.0.0:9043"])
    .build()
    .await

I did this thinking that the client would then know how to reach each of the nodes. As it turns out, that’s not quite true, depending on what system you’re working on. On my Linux machine, there didn’t seem to be any problem with this, but on macOS there are problems. Docker runs differently on macOS than on Linux: Docker uses the Linux kernel, so these routes always work on Linux, but macOS doesn’t have a Linux kernel, so Docker has to run inside a Linux virtual machine.

When the client app connects to a node in ScyllaDB, part of the discovery logic in the driver is to ask for other nodes it can communicate with (read more on shard-aware drivers here). Since ScyllaDB will advertise the other nodes on 9042, they simply won’t be reachable. So on Linux, it looks like it established TCP comms with three nodes:

❯ netstat -an | grep 9042 | grep 172.19
tcp   0   0   172.19.0.1:52272   172.19.0.2:9042   ESTABLISHED
tcp   0   0   172.19.0.1:36170   172.19.0.3:9042   ESTABLISHED
tcp   0   0   172.19.0.1:40718   172.19.0.4:9042   ESTABLISHED

But on macOS it looks a little different, with only one of the nodes having established TCP comms and the others stuck in SYN_SENT.

The short of it is, you don’t really need to do this in development if you’re just using a single node! I was able to simplify my docker-compose file and avoid this problem. In reality, production nodes would most likely be on separate hosts/pods, so there’s no need to map ports anyway.

The nice thing about running with one node instead of three is that you’ll also avoid this type of problem, depending on which platform you’re using:

std::runtime_error The most common cause is not enough request capacity in /proc/sys/fs/aio-max-nr

I was able to circumvent this issue by using this flag:

--reactor-backend=epoll

This option switches Seastar threads to use epoll for event polling, as opposed to the default linux-aio implementation. This may be necessary for development workstations (in particular macOS deployments) where increasing the value for fs.aio-max-nr on the host system may not turn out to be so easy. Note that linux-aio (the default) is still the recommended option for production deployments.

Another mistake I made was using this argument:

--overprovisioned

As it turns out, this argument needs a value, e.g. 1, to set the flag.
However, it’s not necessary, since this argument and the following are already set when you’re using ScyllaDB in Docker:

--developer-mode

The refactored code looks something like this:

scylladb1:
  image: scylladb/scylla
  container_name: scylladb1
  expose:
    - "19042"
    - "7000"
  ports:
    - "9042:9042"
  restart: always
  command:
    - --smp 1
    - --reactor-backend=epoll

The only additional argument worth using in development is the --smp command line option to restrict ScyllaDB to a specific number of CPUs. You can read more about that argument and other recommendations when using Docker to run ScyllaDB here.

As I write all this out, I think to myself: this seems like pretty basic advice. But you will see from our conversation that these learnings are pretty typical for a developer new to ScyllaDB. So hopefully you can take something away from this and avoid making the same mistakes as me.

Watch the First Livestream Session On Demand

Next Up: NoSQL Data Modeling

In the next session with Felipe, we’ll dive deeper into my first queries using ScyllaDB and get the app up and running for the PoC. Join us as we walk through it on April 18 in Developer Data Modeling Mistakes: From Postgres to NoSQL.

Benchmarking MongoDB vs ScyllaDB: IoT Sensor Workload Deep Dive

benchANT’s comparison of ScyllaDB vs MongoDB in terms of throughput, latency, scalability, and cost for an IoT sensor workload

benchANT recently benchmarked the performance and scalability of the market-leading general-purpose NoSQL database MongoDB and its performance-oriented challenger ScyllaDB. You can read a summary of the results in the blog Benchmarking MongoDB vs ScyllaDB: Performance, Scalability & Cost, see the key takeaways for various workloads in this technical summary, and access all results (including the raw data) from the benchANT site. This blog offers a deep dive into the tests performed for the IoT sensor workload.

The IoT sensor workload is based on YCSB and its default data model, but with an operation distribution of 90% insert operations and 10% read operations that simulates a real-world IoT application. The workload is executed with the latest request distribution pattern. This workload is executed against the small database scaling size with a data set of 250GB and against the medium scaling size with a data set of 500GB.

Before we get into the benchmark details, here is a summary of key insights for this workload:

  • ScyllaDB outperforms MongoDB with higher throughput and lower latency results for the sensor workload, except for the read latency in the small scaling size
  • ScyllaDB provides consistently higher throughput that increases with growing data sizes, up to 19 times higher
  • ScyllaDB provides lower (down to 20 times lower) update latency results compared to MongoDB
  • MongoDB provides lower read latency for the small scaling size, but ScyllaDB provides lower read latencies for the medium scaling size

Throughput Results for MongoDB vs ScyllaDB

The throughput results for the sensor workload show that the small ScyllaDB cluster is able to serve 60 kOps/s with a cluster utilization of ~89%, while the small MongoDB cluster serves only 8 kOps/s under a comparable cluster utilization of 85-90%. For the medium cluster sizes, ScyllaDB achieves an average throughput of 236 kOps/s with ~88% cluster utilization and MongoDB 21 kOps/s with a cluster utilization of 75%-85%.

Scalability Results for MongoDB vs ScyllaDB

Analogous to the previous workloads, the throughput results allow us to compare the theoretical scale-up factor for throughput with the actually achieved scalability. For ScyllaDB, the maximal theoretical throughput scaling factor is 400% when scaling from small to medium. For MongoDB, the theoretical maximal throughput scaling factor is 600% when scaling from small to medium. The ScyllaDB scalability results show that ScyllaDB is able to nearly achieve linear scalability, achieving a throughput scalability of 393% of the theoretically possible 400%. The scalability results for MongoDB show that it achieves a throughput scalability factor of 262% out of the theoretically possible 600%.

Throughput per Cost Ratio

In order to compare the costs/month in relation to the provided throughput, we take the MongoDB Atlas throughput/$ as the baseline (i.e. 100%) and compare it with the provided ScyllaDB Cloud throughput/$. The results show that ScyllaDB provides 6 times more operations/$ compared to MongoDB Atlas for the small scaling size and 11 times more operations/$ for the medium scaling size. Similar to the caching workload, MongoDB is able to scale the throughput with growing instance/cluster sizes, but the preserved operations/$ are decreasing.
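As a quick sanity check, the scalability percentages quoted above follow directly from the reported throughput numbers; the short calculation below only restates figures already given in this post.

# Reported average throughput (kOps/s) for the IoT sensor workload
scylladb = {"small": 60, "medium": 236}
mongodb = {"small": 8, "medium": 21}

# Achieved scale-up factor when going from the small to the medium cluster size
scylla_scaleup = scylladb["medium"] / scylladb["small"] * 100   # ~393%
mongo_scaleup = mongodb["medium"] / mongodb["small"] * 100      # ~262%

# Compare against the theoretical maximums stated above (400% and 600%)
print(f"ScyllaDB: {scylla_scaleup:.0f}% achieved of a theoretical 400%")
print(f"MongoDB:  {mongo_scaleup:.0f}% achieved of a theoretical 600%")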
Latency Results for MongoDB vs ScyllaDB

The P99 latency results for the sensor workload show that ScyllaDB and MongoDB provide consistently low P99 read latencies for the small and medium scaling sizes. MongoDB provides the lowest read latency for the small scaling size, while ScyllaDB provides the lowest read latency for the medium scaling size. For the insert latencies, the results show a similar trend as for the previous workloads: ScyllaDB provides stable and low insert latencies, while MongoDB experiences up to 21 times higher update latencies.

Technical Nugget – Performance Impact of the Data Model

The default YCSB data model is composed of a primary key and a data item with 10 fields of strings, which results in documents with 10 attributes for MongoDB and a table with 10 columns for ScyllaDB. We analyze how performance changes if a pure key-value data model is applied for both databases: a table with only one column for ScyllaDB and a document with only one field for MongoDB, keeping the same record size of 1 KB. Compared to the data model impact for the social workload, the throughput improvements for the sensor workload are clearly lower. ScyllaDB improves throughput by 8%, while for MongoDB there is no throughput improvement. In general, this indicates that a pure key-value model improves the performance of read-heavy workloads rather than write-heavy workloads.

Continue Comparing ScyllaDB vs MongoDB

Here are some additional resources for learning about the differences between MongoDB and ScyllaDB:

  • Benchmarking MongoDB vs ScyllaDB: Results from benchANT’s complete benchmarking study that comprises 133 performance and scalability measurements comparing MongoDB against ScyllaDB.
  • Benchmarking MongoDB vs ScyllaDB: Caching Workload Deep Dive: benchANT’s comparison of ScyllaDB vs MongoDB in terms of throughput, latency, scalability, and cost for a caching workload.
  • A Technical Comparison of MongoDB vs ScyllaDB: benchANT’s technical analysis of how MongoDB and ScyllaDB compare with respect to their features, architectures, performance, and scalability.
  • ScyllaDB’s MongoDB vs ScyllaDB page: Features perspectives from users – like Discord – who have moved from MongoDB to ScyllaDB.

Steering ZEE5’s Migration to ScyllaDB

Eliminating cloud vendor lock-in and supporting rapid growth – with a 5X cost reduction

Kishore Krishnamurthy, CTO at ZEE5, recently shared his perspective on Zee’s massive, business-critical database migration as the closing keynote at ScyllaDB Summit. You can read highlights of his talk below. All of the ScyllaDB Summit sessions, including Mr. Krishnamurthy’s session and the tech talk featuring Zee engineers, are available on demand.

Watch On Demand

About Zee

Zee is a 30-year-old publicly-listed Indian media and entertainment company. We have interests in broadcast, OTT, studio, and music businesses. ZEE5 is our premier OTT streaming service, available in over 190 countries. We have about 150M monthly active users. We are available on web, Android, iOS, smart TVs, and various other television and OEM devices.

Business Pressures on Zee

The media industry around the world is under a lot of bottom-line pressure. The broadcast business is now moving to OTT. While a lot of this business is being effectively captured on the OTT side, the business models on OTT are not scaling up similarly to the broadcast business. In this interim phase, as these business models stabilize, there is a lot of pressure on us to run things very cost-efficiently.

This problem is especially pronounced in India. Media consumption is on the rise in India. The biggest challenge we face is in terms of monetizability: our revenue is in rupees while our technology expenses are in dollars. For example, what we make from one subscription customer in one year is what Netflix or Disney would make in a month. We are on the lookout for technology and vendor partners who have a strong India presence, who have a cost structure that aligns with the kind of scale we provide, and who are able to provide the cost efficiency we want to deliver.

Technical Pressures on the OTT Platform

A lot of the costs we had on the platform scale linearly with usage and user base. That’s particularly true with some aspects of the platform like our heartbeat API, which was the primary use case driving our consideration of ScyllaDB. The linear cost escalation limited the frequency at which we could run solutions like heartbeat. A lot of our other solutions – like our playback experience, our security, and our recommendation systems – leverage heartbeat in their core infrastructure. Given the cost limitation, we could never scale that up. We also had challenges in terms of the distributed architecture of the solution we had. We were working towards a multi-tenant solution. We were exploring cloud-neutral solutions, etc.

What Was Driving the Database Migration

Sometime last year, we decided that we wanted to get rid of cloud vendor lock-in. Every solution we were looking for had to be cloud-neutral, and the database choice also needed to deliver on that. We were also in the midst of a large merger. We wanted to make sure that our stack was multi-tenant and ready to onboard multiple OTT platforms. So these were the reasons why we were eagerly looking for a solution that was both cloud-neutral and multi-tenant.

Top Factors in Selecting ScyllaDB

We wanted to move away from the master-slave architecture we had. We wanted our solution to be infinitely scalable. We wanted the solution to be multi-region-ready. One of the requirements from a compliance perspective was to be ready for any kind of regional disaster. When we came up with a solution for multi-region, the cost became significantly higher.
We wanted a high-availability, multi-region solution, and ScyllaDB’s clustered architecture allowed us to do that and to move away from cloud vendor lock-in. ScyllaDB allowed us to cut dependencies on the cloud provider. Today, we can run a multi-cloud solution on top of ScyllaDB.

We also wanted to make sure that the migration would be seamless. ScyllaDB’s clustered architecture across clouds helped us when we were doing our recent cloud migration; it allowed us to make it very seamless. From a support perspective, the ScyllaDB team was very responsive. They had local support in India, and they had local competency in terms of solution architects who held our hands along the way. So we were very confident we could deliver with their support.

We found that, operationally, ScyllaDB was very efficient. We could significantly reduce the number of database nodes when we moved to ScyllaDB. That also meant that the costs came down. ScyllaDB also happened to be a drop-in replacement for the current incumbents like Cassandra and DynamoDB. All of this together made it an easy choice for us to select ScyllaDB over the other database choices we were looking at.

Migration Impact

The migration to ScyllaDB was seamless. I’m happy to say there was zero downtime. After the migration to ScyllaDB, we also did a very large-scale cloud migration. From what we heard from the cloud providers, nobody else in the world had attempted this kind of migration overnight. And our migration was extremely smooth. ScyllaDB was a significant part of all the components we migrated, and that second migration was very seamless as well.

After the migration, we moved about 525M users’ data, including their preferences, login details, session information, watch history, etc., to ScyllaDB. We now have hundreds of millions of heartbeats recorded on ScyllaDB. The overall data we store on ScyllaDB is in the tens of terabytes range at this point.

Our overall cost is a combination of the efficiency that ScyllaDB provides in terms of the reduction in the number of nodes we use, and the cost structure that ScyllaDB provides. Together, this has given us a 5x improvement in cost – that’s something our CFO is very happy with.

As I mentioned before, the support has been excellent. Through both of the migrations – first the migration to ScyllaDB and subsequently the cloud migration – the ScyllaDB team was always available on demand during the peak periods. They were available on-prem to support us and hand-hold us through the whole thing.

All in all, it’s a combination: the synergy that comes from the efficiency, the cost-effectiveness, and the scalability. The whole is more than the sum of the parts. That’s how I feel about our ScyllaDB migration.

Next Steps

ScyllaDB is clearly a favorite with the developers at Zee. I expect a lot more database workloads to move to ScyllaDB in the near future. If you’re curious about the intricacies of using heartbeats to track video watch progress, catch the talk by Srinivas and Jivesh, where they explain the phenomenally efficient system we have built.

Watch the Zee Tech Talk

Inside a DynamoDB to ScyllaDB Migration

A detailed walk-through of an end-to-end DynamoDB to ScyllaDB migration

We previously discussed ScyllaDB Migrator's ability to easily migrate data from DynamoDB to ScyllaDB – including capturing events – which ensures that your destination table stays consistent and abstracts away much of the complexity involved in a migration. We've also covered when to move away from DynamoDB, exploring both the technical and business reasons why organizations seek DynamoDB alternatives, and examined what a migration from DynamoDB looks like and how to accomplish it within the DynamoDB ecosystem.

Now, let's switch to a more granular (and practical) level and walk through an end-to-end DynamoDB to ScyllaDB migration, building on what we discussed in the two previous articles.

Before we begin, a quick heads-up: the ScyllaDB Spark Migrator should still be your tool of choice for migrating from DynamoDB to ScyllaDB Alternator – ScyllaDB's DynamoDB-compatible API. But there are some scenarios where the Migrator isn't an option:

- If you're migrating from DynamoDB to CQL – ScyllaDB's Cassandra-compatible API
- If you don't require a full-blown migration and simply want to stream a particular set of events to ScyllaDB
- If bringing up a Spark cluster is overkill at your current scale

You can follow along using the code in this GitHub repository.

End-to-End DynamoDB to ScyllaDB Migration

Let's run through a migration exercise where:

- Our source DynamoDB table contains Items with an unknown (but up to 100) number of Attributes
- The application constantly ingests records into DynamoDB
- We want to be sure both DynamoDB and ScyllaDB are in sync before fully switching over

For simplicity, we will migrate to ScyllaDB Alternator, our DynamoDB-compatible API; you can easily adapt the steps to CQL as you go. Since we don't want to cause any impact to our live application, we'll back-fill the historical data via an S3 data export. Finally, we'll use AWS Lambda to capture and replay events to our destination ScyllaDB cluster.

Environment Prep – Source and Destination

We start by creating our source DynamoDB table and ingesting data into it. The create_source_and_ingest.py script will create a DynamoDB table called source and ingest 50K records into it. By default, the table is created in On-Demand mode, and records are ingested serially using the BatchWriteItem call. We also do not check whether all batch items were successfully processed, but this is something you should watch out for in a real production application. (The first sketch after this section illustrates this kind of ingestion.) The output will look like this, and it should take a couple of minutes to complete:

Next, spin up a ScyllaDB cluster. For demonstration purposes, let's run a ScyllaDB container inside an EC2 instance:

Beyond spinning up a ScyllaDB container, our Docker command line does two things:

- Exposes port 8080 to the host OS to receive external traffic, which will be required later on to bring in the historical data and consume events from AWS Lambda
- Starts ScyllaDB Alternator – our DynamoDB-compatible API, which we'll be using for the rest of the migration

Once your cluster is up and running, the next step is to create your destination table. Since we are using ScyllaDB Alternator, simply run the create_target.py script (sketched in the second example below):

NOTE: Both source and destination tables share the same Key Attributes. In this guided migration, we won't be performing any data transformations.
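For reference, here is a minimal sketch of the kind of ingestion create_source_and_ingest.py performs. It is not the repository's actual script: the region, partition key name, and attribute generation are assumptions, and boto3's batch_writer (which wraps BatchWriteItem) retries unprocessed items for you, unlike the bare serial calls described above.

```python
import random
import string

import boto3

# Assumes the "source" table already exists (the real script creates it first,
# in On-Demand mode) and that your AWS credentials/region are configured.
dynamodb = boto3.resource("dynamodb", region_name="us-east-1")
table = dynamodb.Table("source")


def random_item(i: int) -> dict:
    # Partition key name "id" is illustrative; match your real schema.
    item = {"id": str(i)}
    # An unknown (but up to 100) number of extra attributes per item.
    for n in range(random.randint(1, 100)):
        item[f"attr_{n}"] = "".join(random.choices(string.ascii_letters, k=8))
    return item


# batch_writer groups puts into BatchWriteItem calls under the hood.
with table.batch_writer() as batch:
    for i in range(50_000):
        batch.put_item(Item=random_item(i))
```
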
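And here is a hedged sketch of the create_target.py step: pointing boto3 at the Alternator endpoint we just exposed and creating the destination table. The endpoint address, key name, and key type are placeholders; mirror whatever key attributes your real source table uses.

```python
import boto3

# Point the regular DynamoDB client at ScyllaDB Alternator instead of AWS.
alternator = boto3.resource(
    "dynamodb",
    endpoint_url="http://<scylla-node-ip>:8080",  # the port exposed on the container
    region_name="us-east-1",                      # required by boto3, not used for routing
    aws_access_key_id="none",                     # dummy credentials; Alternator ignores
    aws_secret_access_key="none",                 # them unless authorization is enforced
)

# Create the destination table with the same key attributes as the source.
alternator.create_table(
    TableName="dest",
    KeySchema=[{"AttributeName": "id", "KeyType": "HASH"}],
    AttributeDefinitions=[{"AttributeName": "id", "AttributeType": "S"}],
    BillingMode="PAY_PER_REQUEST",
)
```
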
If you plan to change your target schema, this would be the time to do it. 🙂

Back-fill Historical Data

For back-filling, let's perform an S3 data export. First, enable DynamoDB's point-in-time recovery:

Next, request a full table export. Before running the command below, ensure the destination S3 bucket exists:

Depending on the size of your table, the export may take enough time for you to step away and grab a coffee and some snacks. In our tests, this process took around 10-15 minutes to complete for our sample table. To check the status of your full table export, replace your table ARN and execute:

Once the process completes, modify the s3Restore.py script accordingly (our modified version of the LoadS3toDynamoDB sample code) and execute it to load the historical data:

NOTE: To prevent mistakes and potential overwrites, we recommend using a different table name for your destination ScyllaDB cluster. In this example, we are migrating from a DynamoDB table called source to a destination named dest.

Remember that AWS allows you to request incremental backups after a full export. If that process took longer than expected, simply repeat it in smaller steps with later incremental backups. This can be yet another strategy to overcome the DynamoDB Streams 24-hour retention limit.

Consuming DynamoDB Changes

With the historical restore completed, let's get your ScyllaDB cluster in sync with DynamoDB. Up to this point, we haven't necessarily made any changes to our source DynamoDB table, so our destination ScyllaDB cluster should already be in sync.

Create a Lambda function

Name your Lambda function and select Python 3.12 as its Runtime:

Expand the Advanced settings option and select the Enable VPC option. This is required because our Lambda will write data directly to ScyllaDB Alternator, which is currently running in an EC2 instance; if you omit this option, your Lambda function may be unable to reach ScyllaDB. Once that's done, select the VPC attached to your EC2 instance, and ensure that your Security Group allows inbound traffic to ScyllaDB. In the screenshot below, we are simply allowing inbound traffic to ScyllaDB Alternator's port:

Finally, create the function.

NOTE: If you are moving away from the AWS ecosystem, be sure to attach your Lambda function to a VPC with external traffic. Beware of the latency across different regions or when traversing the Internet, as it can greatly delay your migration. If you're migrating to a different protocol (such as CQL), ensure that your Security Group allows routing traffic to ScyllaDB's relevant ports.

Grant Permissions

A Lambda function needs permissions to be useful. Ours needs to consume events from DynamoDB Streams and load them into ScyllaDB. Within your recently created Lambda function, go to Configuration > Permissions. From there, click the IAM role defined for your Lambda:

This takes you to the IAM role page of your Lambda. Click Add permissions > Attach policies:

Lastly, attach the AWSLambdaInvocation-DynamoDB policy.

Adjust the Timeout

By default, a Lambda function runs for only 3 seconds before AWS kills the process. Since we expect to process many events, it makes sense to increase the timeout to something more meaningful. Go to Configuration > General configuration and click Edit to adjust the settings:

Increase the timeout to a value high enough to let you process a series of events (we left it at 15 minutes). You can also make this change programmatically, as the sketch below shows.
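If you prefer to script this instead of clicking through the console, the same change can be made with boto3; the function name below is hypothetical, so substitute your own.

```python
import boto3

lambda_client = boto3.client("lambda", region_name="us-east-1")

# Raise the timeout and memory of the copy function created earlier.
lambda_client.update_function_configuration(
    FunctionName="dynamodb-to-scylladb-copy",  # hypothetical name; use your own
    Timeout=900,       # seconds; 15 minutes is the Lambda maximum
    MemorySize=1024,   # MB; roughly the 1 GB used in this walkthrough
)
```
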
Be careful with the timeout, though: if your function gets killed before it finishes processing a batch, DynamoDB Streams will treat that batch as failed and retry it, which can effectively send you into an infinite loop. You may also adjust other settings, such as Memory, as relevant (we left it at 1 GB).

Deploy

After the configuration steps, it is time to finally deploy our logic! The dynamodb-copy folder contains everything needed to help you do that. Start by editing the dynamodb-copy/lambda_function.py file and replacing the alternator_endpoint value with the IP address and port of your ScyllaDB deployment. Lastly, run the deploy.sh script and specify the Lambda function to update:

NOTE: The Lambda function in question simply issues PutItem calls to ScyllaDB in a serial way, and does nothing else (a simplified sketch of this kind of handler appears at the end of this walkthrough). For a realistic migration scenario, you probably want to handle DeleteItem and UpdateItem API calls, as well as other aspects such as TTL and error handling, depending on your use case.

Capture DynamoDB Changes

Remember that our application is continuously writing to DynamoDB, and our goal is to ensure that every record ultimately exists within ScyllaDB, without any data loss. At this step, we'll turn on DynamoDB Streams to capture item-level change events as they happen:

For the View type, specify that you want a New image capture, and proceed with enabling the feature:

Create a Trigger

At this point, your Lambda is ready to start processing events from DynamoDB. Within the DynamoDB Exports and streams configuration, create a Trigger to invoke our Lambda function every time an item gets changed:

Next, choose the previously created Lambda function and adjust the Batch size as needed (we used 1000):

Once you create the trigger, data should start flowing from DynamoDB Streams to ScyllaDB Alternator!

Generate Events

To simulate an application frequently updating records, let's simply re-execute the initial create_source_and_ingest.py program:

It will insert another 50K records into DynamoDB, whose Attributes and values will be very different from the ones already in ScyllaDB:

The program detects that the source table already exists and simply overwrites all of its existing records. The new records are then captured by DynamoDB Streams, which triggers our previously created Lambda function, which in turn streams the records to ScyllaDB.

Comparing Results

It may take a few minutes for your Lambda to catch up and ingest all events into ScyllaDB (did we mention coffee?), but ultimately both databases should get in sync. Our last and final step is simply to compare the records in both databases. Here, you can either compare everything or just a few selected items. To assist you with that, here's our final program: compare.py! Simply invoke it, and it will compare the first 10K records across both databases and report any mismatches it finds:

Congratulations! You moved away from DynamoDB! 🙂

Final Remarks

In this article, we explored one of the many ways to migrate a DynamoDB workload to ScyllaDB. Your mileage may vary, but the general migration flow should be broadly similar to what we've covered here.

If you are interested in how organizations such as Digital Turbine or Zee migrated from DynamoDB to ScyllaDB, you may want to see their recent ScyllaDB Summit talks. Or perhaps you would like to learn more about different DynamoDB migration approaches? In that case, watch my talk in our NoSQL Data Migration Masterclass.
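As promised above, here is a simplified sketch of the kind of event-forwarding handler that dynamodb-copy/lambda_function.py implements – a rough approximation rather than the repository's actual code. The Alternator endpoint, table name, and dummy credentials are placeholders, and a production handler would also cover deletes, TTL, and error handling.

```python
import boto3
from boto3.dynamodb.types import TypeDeserializer

# Placeholder endpoint; point this at your ScyllaDB Alternator node and port.
ALTERNATOR_ENDPOINT = "http://<scylla-node-ip>:8080"

dynamodb = boto3.resource(
    "dynamodb",
    endpoint_url=ALTERNATOR_ENDPOINT,
    region_name="us-east-1",
    aws_access_key_id="none",      # dummy credentials; Alternator ignores them
    aws_secret_access_key="none",  # unless authorization is enforced
)
table = dynamodb.Table("dest")
deserializer = TypeDeserializer()


def lambda_handler(event, context):
    # Each invocation receives a batch of DynamoDB Streams records.
    for record in event["Records"]:
        # With the "New image" view type, inserts and updates carry the full item.
        image = record.get("dynamodb", {}).get("NewImage")
        if image is None:
            continue  # e.g. REMOVE events; handle deletes separately if needed
        # Convert the typed stream format ({"S": "..."}...) into plain values.
        item = {k: deserializer.deserialize(v) for k, v in image.items()}
        table.put_item(Item=item)
    return {"processed": len(event["Records"])}
```
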
If you want to get your specific questions answered directly, talk to us!