How Cassandra Streaming, Performance, Node Density, and Cost Are All Related
This is the first post of several I have planned on optimizing Apache Cassandra for maximum cost efficiency. I’ve spent over a decade working with Cassandra, including tens of thousands of hours data modeling, fixing issues, writing tools for it, and analyzing its performance. I’ve always been fascinated by database performance tuning, even before Cassandra.
A decade ago I filed one of my first issues with the project, laying out a target of 20TB of data per node. This wasn’t possible for most workloads at the time, but I’ve kept that target in my sights.
Why TRACTIAN Migrated from MongoDB to ScyllaDB for Real-Time ML
TRACTIAN’s ML model workloads increased over 2X in a year. Here’s why they changed databases and their lessons learned.
What happens when you hit a database scaling wall? Since TRACTIAN, an AI-driven industrial monitoring company, is all about preventing problems, they didn’t want to wait and see. After the company’s ML workloads doubled in a year, their industrial IoT platform was experiencing unsolvable performance degradation. With more rapid growth on the horizon, their engineering leaders decided to rethink their distributed data system before they hit MongoDB’s breaking point.
JP Voltani, TRACTIAN’s Director of Engineering, recently shared the team’s experiences at ScyllaDB Summit. If we gave out Academy Awards for production, this one would have been the clear winner (all credit to the TRACTIAN team). So, be sure to watch this quick look at some impressive scaling work.
Enjoy engineering case studies like this? Choose your own adventure through 60+ tech talks at Monster Scale Summit (free + virtual). You can learn from experts like Martin Kleppmann, Kelsey Hightower and Gwen Shapira, plus engineers from Discord, Disney+, Slack, Atlassian, Uber, Canva, Medium, Cloudflare, and more. Get a free conference pass.
Key Takeaways
A few key takeaways:
- TRACTIAN was reaching a critical inflection point when their sensor network grew more than 2x in a single year. MongoDB struggled, even after the team’s valiant optimization and scaling attempts. The constant stream of time-series sensor data (vibration, temperature, energy consumption) caused performance degradation that could compromise their latency targets.
- The team wanted a database architecture specifically designed for high-throughput, time-partitioned data workloads, which led them to ScyllaDB. They benchmarked ScyllaDB vs Cassandra, Postgres, and MongoDB. The results showed a 10x performance improvement with ScyllaDB, and they appreciated its operational simplicity compared to Cassandra.
- The TRACTIAN team moved their most performance-critical workloads to ScyllaDB while maintaining MongoDB for other use cases, exemplifying their “right tool for the job” philosophy. They experienced a 10x improvement in throughput and latency with ScyllaDB.
- TRACTIAN applied a four-phase migration process (dual writes → historical backfill → read switching → final validation). This phased approach maintained 99.95% availability while transitioning critical industrial IoT data pipelines.
- The team mapped their IoT workload to ScyllaDB by partitioning data by sensor ID and clustering by timestamp. This data modeling change improved query performance for time-window searches and eliminated the hotspot issues that had plagued their MongoDB implementation.
Here’s a lightly edited transcript…
Intro
Hello, everyone. My name is JP, and I’m the Director of Engineering at TRACTIAN. Today, I’m going to talk about our experience with real-time machine learning using ScyllaDB. I will start by talking about what TRACTIAN is and what we do, what our infrastructure looks like, why we migrated away from MongoDB for some workloads, our ScyllaDB migration process, and what is next for us.
At TRACTIAN, we build solutions for industrial maintenance. We want to empower maintenance teams around the globe with best-in-class hybrid and AI-assisted software. We have three products:
- The Smart Trac is a vibration and temperature sensor that is able to detect more than 70 types of failures in rotating machines.
- The TracOS is a system with everything needed to manage the operations of maintenance teams on the plant floor, enabling mobile and offline operations.
- The Energy Trac is a sensor that is able to monitor energy consumption, efficiency, and electrical quality.
Together, these products form a very concise solution that works seamlessly with one another – bringing a very Apple-like experience to industrial maintenance. We have already raised over $100M through VC funding, establishing a global footprint with customers across the Americas. We have three different headquarters: one in Brazil, one in Mexico, one in the USA. We have employees worldwide.
The TRACTIAN Tech Stack
Let’s talk about our tech stack. We have a very straightforward approach to adopting new technologies: if it helps solve a real problem, we embrace it. For this reason, our tech stack is very modern and extensive. We use more than 80 databases and 6 different languages for our services. That allows us to leverage the strengths of each technology. We have a microservices architecture with more than 30 services, ranging from APIs, consumers, producers and batch processes. They all handle more than 1500 events per second from different sources. And they do so with an average latency lower than 200 milliseconds and with 99.95% availability.
Here’s what our infrastructure looked like before ScyllaDB. The sensors sent data to our APIs, and the APIs put the sensor data into Kafka topics. We had different services that would consume these topics to process the data – saving into MongoDB, into different collections. After that, we sent triggers to the AI pipeline to process the data. We start with a binary blob from the sensor, and the processing services expand the data into different tables. Some use it for client visualizations, others as vectors for AI (training and inference).
Why They Evolved
As the company grew, the number of samples arriving in the system also grew. We saw the workloads increase over 2x in a single year, and the database needed to deal with that increase. Unfortunately, even after upscale operations and optimizations, that was not the case with MongoDB. Performance degradation made us look for alternative solutions for our warehouse and AI workloads.
Why ScyllaDB
Why ScyllaDB? At the time, we had already tested Cassandra. The results were promising, but some database operations, like upscaling, had some aspects that were not attractive to us. MongoDB was not handling the IoT workload very well, and we wanted something that was easier to scale. ScyllaDB showed itself to be a light at the end of the tunnel. We were searching for something really specific, and luckily ScyllaDB had a data model that fit our problem very well. Also, ScyllaDB’s database operations were way better than Cassandra’s.
This is just one example of how ScyllaDB’s data model works in favor of our workloads. In this case, we have some binary data that we want to partition by sensor ID and order by timestamp. ScyllaDB will make this query for a specific ID in a time window very fast.
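To make that concrete, here is a minimal CQL sketch of the kind of table the talk describes, partitioned by sensor ID and clustered by timestamp. The keyspace, table, and column names are my own illustrations, not TRACTIAN’s actual schema:

```cql
-- Illustrative only: one partition per sensor, samples ordered by time within it
CREATE TABLE IF NOT EXISTS telemetry.sensor_samples (
    sensor_id  text,
    sampled_at timestamp,
    payload    blob,                      -- raw binary blob sent by the sensor
    PRIMARY KEY (sensor_id, sampled_at)
) WITH CLUSTERING ORDER BY (sampled_at DESC);

-- "Data for one sensor in a time window" becomes a single-partition range scan
SELECT payload
FROM telemetry.sensor_samples
WHERE sensor_id = ? AND sampled_at >= ? AND sampled_at < ?;
```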
We had a plan on our hands. First, we created a new DSL. What would the tables on ScyllaDB look like? How would MongoDB data map to the new tables? After that, we did a bunch of theoretical benchmarks, which is basically testing with synthetic data. This is an easy and fast way to validate an idea. Then we did it all over again, but with real data. Sometimes synthetic tests fail to capture some nuances of real data and miss things like partition imbalances and hot spots. Other times, they fail to create a good mapping, and this only becomes visible when you test with real data. So, it’s important not to skip this step. Next, we went into the weeds and refactored all the existing application code to use the new database. It’s important to have very, very clear success criteria. What are you trying to achieve with this migration? We had a very clear number of devices in mind that the new infrastructure should be able to handle. The test results came in favor of ScyllaDB. In some workloads, we saw a 10x improvement in throughput and latency.
Migration Strategy
Next, let’s talk about the migration game plan. We did everything live and without downtime. Initially, all the data was being written to MongoDB. After that, we started to write to both databases. This was the first checkpoint of the migration. At this step, we checked whether both databases agreed that the data was correct and whether the initial performance tests agreed with the benchmark ones. After that, we started our migration script that would backfill ScyllaDB with the historical data from MongoDB and check that no data was missing. Then, we switched the reads to occur on ScyllaDB, while continuing to write to MongoDB as a backup in case any problems occurred. This is how we did our online, no-downtime migration. The results speak for themselves.
Results
We have great write/read latency after migration, and ScyllaDB has scaled very well with our increasing workload. Our infrastructure now has ScyllaDB as one of its backbones, and we still use MongoDB for other types of workloads – and also a bunch of other databases for other challenges. Read more about TRACTIAN’s comparison of ScyllaDB vs MongoDB and PostgreSQL in ScyllaDB vs MongoDB vs PostgreSQL.
IBM acquires DataStax: What that means for customers–and why Instaclustr is a smart alternative
IBM’s recent acquisition of DataStax has certainly made waves in the tech industry. With IBM’s expanding influence in data solutions and DataStax’s reputation for advancing Apache Cassandra® technology, this acquisition could signal a shift in the database management landscape.
For businesses currently using DataStax, this news might have sparked questions about what the future holds. How does this acquisition impact your systems, your data, and, most importantly, your goals?
While the acquisition offers promising prospects for integrating IBM’s cloud capabilities with high-performance NoSQL solutions, there’s uncertainty too. Transition periods for acquisitions often involve changes in product development priorities, pricing structures, and support strategies.
However, one thing is certain: customers want reliable, scalable, and transparent solutions. If you’re re-evaluating your options amid these changes, here’s why NetApp Instaclustr offers an excellent path forward.
Decoding the IBM-DataStax link-up
DataStax is a provider of enterprise solutions for Apache Cassandra, a powerful NoSQL database trusted for its ability to handle massive amounts of distributed data. IBM’s acquisition reflects its growing commitment to strengthening data management and expanding its footprint in the open source ecosystem.
While the acquisition promises an infusion of IBM’s resources and reach, IBM’s strategy often leans into long-term integration into its own cloud services and platforms. This could potentially reshape DataStax’s roadmap to align with IBM’s broader cloud-first objectives. Customers who don’t rely solely on IBM’s ecosystem—or want flexibility in their database management—might feel caught in a transitional limbo.
This is where Instaclustr comes into the picture as a strong, reliable alternative solution.
Why consider Instaclustr?
Instaclustr is purpose-built to empower businesses with a robust, open source data stack. For businesses relying on Cassandra or DataStax, Instaclustr delivers an alternative that’s stable, high-performing, and highly transparent.
Here’s why Instaclustr could be your best option moving forward:
1. 100% open source commitment
We’re firm believers in the power of open source technology. We offer pure Apache Cassandra, keeping it true to its roots without hidden limitations. Unlike proprietary solutions, a commitment to pure open source ensures flexibility, freedom, and no vendor lock-in. You maintain full ownership and control.
2. Platform agnostic
One of the things that sets our solution apart is our platform-agnostic approach. Whether you’re running your workloads on AWS, Google Cloud, Azure, or on-premises environments, we make it seamless for you to deploy, manage, and scale Cassandra. This differentiates us from vendors tied deeply to specific clouds—like IBM.
3. Transparent pricing
Worried about the potential for a pricing overhaul under IBM’s leadership of DataStax? At Instaclustr, we pride ourselves on simplicity and transparency. What you see is what you get—predictable costs without hidden fees or confusing licensing rules. Our customer-first approach ensures that you remain in control of your budget.
4. Expert support and services
With Instaclustr, you’re not just getting access to technology—you’re also gaining access to a team of Cassandra experts who breathe open source. We’ve been managing and optimizing Cassandra clusters across the globe for years, with a proven commitment to providing best-in-class support.
Whether it’s data migration, scaling real-world workloads, or troubleshooting, we have you covered every step of the way. And our reliable SLA-backed managed Cassandra services mean businesses can focus less on infrastructure stress and more on innovation.
5. Seamless migrations
Concerned about the transition process? If you’re currently on DataStax and contemplating a move, our solution provides tools, guidance, and hands-on support to make the migration process smooth and efficient. Our experience in executing seamless migrations ensures minimal disruption to your operations.
Customer-centric focus
At the heart of everything we do is a commitment to your success. We understand that your data management strategy is critical to achieving your business goals, and we work hard to provide adaptable solutions.
Instaclustr comes to the table with over 10 years of experience in managing open source technologies including Cassandra, Apache Kafka®, PostgreSQL®, OpenSearch®, Valkey®, ClickHouse®, and more, backed by over 400 million node hours and 18+ petabytes of data under management. Our customers trust and rely on us to manage the data that drives their critical business applications.
With a focus on fostering an open source future, our solutions aren’t tied to any single cloud, ecosystem, or bit of red tape. Simply put: your open source success is our mission.
Final thoughts: Why Instaclustr is the smart choice for this moment
IBM’s acquisition of DataStax might open new doors—but close many others. While the collaboration between IBM and DataStax might appeal to some enterprises, it’s important to weigh alternative solutions that offer reliability, flexibility, and freedom.
With Instaclustr, you get a partner that’s been empowering businesses with open source technologies for years, providing the transparency, support, and performance you need to thrive.
Ready to explore a stable, long-term alternative to DataStax? Check out Instaclustr for Apache Cassandra.
Contact us and learn more about Instaclustr for Apache Cassandra or request a demo of the Instaclustr platform today!
Build an RPG Using the Bluesky Jetstream, ScyllaDB, and Rust
Learn how to build a Rust application that tracks Bluesky user experiences and events.
Let’s build a high-performance, scalable, and reliable application that can:
- Fetch and process public events from the Bluesky platform.
- Track user events and experiences.
- Implement a leveling system with experience points (XP).
- Display user levels and progress based on XP via a REST API.
1. Background
Bluesky, which uses a mix of SQLite and ScyllaDB to store data, has a really cool feature called Firehose. Firehose is an aggregated stream of all the public data updates in the network. You can get a feel for it by accessing FireSky.tv, an app that implements this stream and serves it directly in the browser. Implementing it from scratch requires deep knowledge of the AT Protocol. But a Bluesky engineer built Jetstream: a Firehose aggregator. With Jetstream, you can just listen on a websocket and get a JSON stream of selected events. Here’s a sample of an event payload from Jetstream: Just listening to one of these streams without any issues is amazing. And it turns out that you can even select which type of event you want to listen to, like:
- app.bsky.graph.follow
- app.bsky.feed.post
- app.bsky.feed.like
- app.bsky.feed.repost
and many more! But how can we turn it into an application? Well, it depends on your needs. The data is there; just consume it and do your magic! In my case, I like to transform data into games.
2. Gamifying Jetstream
I’m not a game developer, but games follow an Event-Driven Development approach, right? Every time that you earn some points in something, you level up or learn a new skill. But to earn experience points, users need to take actions. And that’s what you do inside a Social Network: actions! Imagine that every time you:
- Post: just text? Earns 50 experience. Has media? Earns 60 experience. Has media with alt text? Earns 70 experience.
- Like: earns 10 experience.
- Repost: just text? Earns 50 experience. Has media? Earns 60 experience. Has media with alt text? Earns 70 experience!
There are plenty of other abstractions that can be done, but that’s the idea. The experience will be calculated using an arithmetic progression, and should follow this simple rule: With that, we can now talk about the technologies used in this project.
3. Meet the Stack
Bluesky uses ScyllaDB to serve the whole AppView layer with high availability and throughput in mind, so we’re going to do the same! Also, I’ve been using Rust extensively (and always learning more!), so I decided to implement this project with Rust. Here’s the tech stack in a nutshell:
- Language: Rust
- Database: ScyllaDB
- Packages: HTTP Server: actix-web; ORM: charybdis; Jetstream Client: jetstream-oxide; Bluesky Client: atrium-api
My goal is to build something that, besides creating cool charts on Grafana, can also display something via REST API. First, let’s explore our data modeling strategy.
4. What about the Data Modeling?
Initially, the idea was to just store the events and test how stressed the app/database would become. But, at this point, we can go a little bit further. ScyllaDB follows a Query Driven Development approach because it’s a Wide-Column NoSQL Database. Let’s think about that. First, it’s an RPG focused on a timeline profile, so it will have heavy read operations on top of the “characters”: Since we only have one item in the WHERE clause, it means that our query is a key-value lookup. But wait…we also need to
store the current experience of this user. For that, I would
use the Counter type
to atomically store it using
key-value
pairs: It’s supposed to be simple, just like
this! But it also has to be fast enough to serve 1M requests/s with
ease. WARNING: Counter types can’t be used as clustering or partition keys. Also, if you use them in a table, every column outside the primary key must be a Counter! I
also want to track all possible events happening in a user’s
account and list them in our extension to show how that person can
be a better Bluesky user. So, the queries would be around users and
they must be clustered in descending order: Alright, that should
be enough for an MVP. Now let’s model each part showing some Rust
and Charybdis ORM!
4.1 Modeling: Leveling State UDT
Since we’re
using ScyllaDB, we can use UDTs (User Defined Types). Keeping track
of operations can be a pain. However, if you’re making this a
pattern across all tables, UDTs can be useful when you don’t want
to recreate the same fields every time.
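As a rough sketch, a leveling-state UDT along these lines would do the job. The keyspace and field names are assumptions for illustration, not necessarily the post’s exact definition:

```cql
-- Illustrative leveling-state UDT, reused across tables
CREATE TYPE IF NOT EXISTS bsky_rpg.leveling_state (
    level            int,
    current_xp       bigint,
    xp_to_next_level bigint
);
```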
Now we can just use it around the other tables, whether it’s related to events or
characters.
4.2 Modeling: Characters Table
This will be the most
accessed table inside our project via REST API. And the modeling
(at this moment) is simple since we only want the user_handle and
the leveling state (udt). Check it out:
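A minimal CQL sketch of such a characters table (keyspace and column names are assumptions) could be:

```cql
-- Illustrative: one row per user, fetched with a simple key-value lookup
CREATE TABLE IF NOT EXISTS bsky_rpg.characters (
    user_handle text,
    leveling    frozen<leveling_state>,   -- the UDT from section 4.1
    PRIMARY KEY (user_handle)
);

SELECT user_handle, leveling
FROM bsky_rpg.characters
WHERE user_handle = ?;
```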
With the UDT, we can serve exactly the latest leveling state to build a UI later on. We can
also add new fields since none of them will be part of the
Partition Key.
4.3 Modeling: Characters Experience Table
As
mentioned earlier, we should store the experience so that it won’t
become a race condition. Why? ScyllaDB is a highly available
database that can replicate your data across multiple nodes. To
avoid race conditions, we need to use the only Atomic Type
available: the Counter type. With that, we will ensure that every
write/read will be the latest there. Yes, it impacts performance.
However, Counters are planned and optimized for this type of
operation. The modeling would be:
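A plausible shape for that counter table, again with illustrative names, is:

```cql
-- Illustrative: counters must be the only columns outside the primary key
CREATE TABLE IF NOT EXISTS bsky_rpg.characters_experience (
    user_handle text,
    experience  counter,
    PRIMARY KEY (user_handle)
);

-- Each processed event bumps the user's XP atomically (50 is the plain-post reward)
UPDATE bsky_rpg.characters_experience
SET experience = experience + 50
WHERE user_handle = ?;
```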
Now the last one, the events table!
4.4 Modeling: Events Table and MV
This is the most
“complicated” part, but it’s not that hard. As mentioned before,
there are plenty of events around ATProto Bluesky, and I want to
give all the possible events for each user. Displaying data in
descending order is a must. ScyllaDB can provide this functionality
if you include a Clustering Key in your table. Check it out:
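An events table along those lines could be defined as follows. Column names are illustrative, and a real table would likely add a tie-breaker such as a timeuuid to the clustering key:

```cql
-- Illustrative: one partition per user, newest events first
CREATE TABLE IF NOT EXISTS bsky_rpg.events (
    user_handle text,
    event_at    timestamp,
    event_type  text,
    PRIMARY KEY (user_handle, event_at)
) WITH CLUSTERING ORDER BY (event_at DESC);
```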
With the CLUSTERING ORDER BY (event_at DESC)
I’m basically
telling it that every time I fetch a chunk of data from this table,
it ALWAYS will be the recent inserts. However, now we have a
problem. Imagine that we want to list all events from a specific
type. With this table, we’re not able to do that. Why? Because you can only filter in the WHERE clause on columns that are part of your partition or clustering keys. However, we can get around this by
creating a Materialized View! Materialized Views are tables created
based on a parent table. Every time that this parent table receives
a write, your view will also receive it. You can then play
with the partition/clusterization. Check it out:
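A materialized view that repartitions those events by user and event type might look like this, following the illustrative table above:

```cql
-- Illustrative: every write to the base events table is propagated to this view
CREATE MATERIALIZED VIEW IF NOT EXISTS bsky_rpg.events_by_type AS
    SELECT user_handle, event_type, event_at
    FROM bsky_rpg.events
    WHERE user_handle IS NOT NULL
      AND event_type  IS NOT NULL
      AND event_at    IS NOT NULL
    PRIMARY KEY ((user_handle, event_type), event_at)
    WITH CLUSTERING ORDER BY (event_at DESC);

-- "All likes by this user, newest first" is now a direct single-partition query
SELECT event_at
FROM bsky_rpg.events_by_type
WHERE user_handle = ? AND event_type = 'app.bsky.feed.like';
```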
Now, we have different partitions for the same user, storing different types of
events that we’re able to query directly. With that, our data
modeling is finally DONE! Let’s jump into some business rules
implementation.
5. Hands-on: Application Flow
With the basics taken
care of, let’s explain how everything works under the hood.
5.1 App: Jetstream Oxide
At the Websocket layer, we’re using the
Jetstream Oxide package to receive all the events in an elegantly
structured way. The boilerplate can be like: For each type of
event, we’ll receive a specific amount of experience and a different asynchronous response. With that, the goal was to
make an OCP integration where we only need to add new events when
possible: That takes us to the last step, which sets up the event
default behavior at the Trait. We have three types of event
actions: Create, Update, and Delete. The Handler will take care of
the whole Action/Communication with ScyllaDB through Charybdis ORM.
In this example, you can check how the CreateEventHandler works: We
can implement other types of events by only extending the trait to
the new Dynamic Struct, and it will be working fine. 5.2 App: Actix
Web For serving this data, there’s a simple implementation of an
endpoint using Actix. Since the long-term goal is to build a
browser extension, we need to serve an endpoint with the
character/user information:
6. Conclusion
This exploration of
Bluesky Jetstream and its potential for gamification showcases the
power of leveraging cutting-edge technologies like ScyllaDB and
Rust to build scalable, high-performance applications. By focusing
on event-driven development, we successfully demonstrated how to
create an interactive system that transforms social media
activities into measurable, gamified metrics. You can check out
the project here.
How JioCinema Uses ScyllaDB Bloom Filters for Personalization
Why they used ScyllaDB Bloom Filters instead of building their own using common solutions like Redis Bloom filters When you log in to your favorite streaming service, first impressions matter. The featured content should instantly lure you into binge-watching mode. If it’s full of shows and movies you’ve already seen, your brain quickly shifts into “Hmmm, is it time to cancel this service?” mode. At a technical level, ensuring fresh recommendations is something that every streaming platform faces. But the standard solutions weren’t a good fit for JioCinema, a prominent Indian streaming service known for its affordability and extensive content library – and currently experiencing explosive growth (e.g., with world-record-breaking 620M IPL viewers, peaking over 20M concurrent viewers). Instead of building their own Bloom filters or using common solutions like Redis Bloom filters, they took a different path: using ScyllaDB’s built-in Bloom filter support to check watch status in real time. JioCinema’s Charan Kamal (Back-End Developer) and Harshit Jain (Software Engineer) recently shared why they took this unconventional path, including the tradeoffs of the more obvious solutions and the logistics of implementing this with ScyllaDB. Watch their complete tech talk below, or read on to skim the highlights. Enjoy engineering case studies like this? Choose your own adventure through 60+ tech talks at Monster Scale Summit (free + virtual). You can learn from experts like Martin Kleppmann, Kelsey Hightower and Gwen Shapira, plus engineers from Discord, Disney+, Slack, Atlassian, Uber, Canva, Medium, Cloudflare, and more. Get a free conference pass The Challenge: “Watch Discounting” for Fresh Recommendations JioCinema is a leading “over the top” (OTT) streaming platform. The service features top Indian and international entertainment, including live sports (from IPL cricket, to LaLiga European football, to NBA basketball), new movies, HBO originals, and more. Their massive content library spans 10 Indian languages. The JioCinema app uses customized content trays like “Because You Watched” to keep users engaged and help them discover new content. For example, after a user completes “Game of Thrones,” the platform might commonly recommend “House of the Dragon” – but if the user already watched it, it will suggest something else. As Harshit Jain put it, “These personalized trays not only keep the customers engaged but also enhance content discoverability, fostering long-term engagement and reducing churn rates. However, personalization comes with its own challenges, particularly the issue of recommending content that the customers have already watched. To address this, we have implemented a solution and termed it ‘Watch Discounting.’” This service must cost-efficiently satisfy low-latency requirements at an impressive scale (e.g., 10M daily active users consuming hundreds of thousands of shows and films per day). Charan Kamal explained, “Keeping the sheer size of our customer base and catalog in mind, we had to use a data structure which was space-efficient as well as blazing fast. While we want to keep our recommendations fresh, we also want to avoid over-engineering and making the system overly complex. We could tolerate occasional false positives here. So this led us to Bloom filters – space-efficient probabilistic data structures designed for rapid membership lookup in a set.” The Problem with Custom and Redis Bloom Filters Okay, but which Bloom filters were the best fit here? 
The team first considered building a custom Bloom filter to store and serve content. Although this “fun exercise” would have provided complete control over the implementation, it presented significant scaling challenges. They didn’t trust that a simple map of Bloom filters would scale vertically to keep pace with JioCinema’s massive (and rapidly growing) user base. Horizontal scaling would have required either implementing sticky sessions (where specific pods would hold Bloom filters for particular users) or replicating data across every pod in the system. While this would have been an interesting engineering challenge, it just didn’t make sense from a business perspective. The next option they explored was Redis with Bloom filter capabilities. Their initial testing with an open-source Redis cluster revealed an interesting issue with Redis’ cluster topology. During high load, nodes would occasionally get hot and trigger failovers, promoting replicas to primary nodes. This created a cascade effect where entire nodes within the cluster became unusable while primary-replica promotions continued in an endless loop. Looking to avoid that risk, they explored Redis Enterprise. This approach showed significant performance improvements and indeed met their SLA requirements. However, there was a catch. JioCinema’s business requires millisecond-level latency across multiple regions. According to Charan, “Even with Redis Enterprise, we faced a choice between an active-active deployment to maintain low latency or compromising the customer experience in certain regions. The latter was unacceptable for our use case. Additionally, Redis Enterprise imposed substantial charges for each operation and replication, making it challenging to justify the return on investment of this feature for our business. This led us to explore ScyllaDB.” Implementing Watch Discounting with ScyllaDB Charan continued, “ScyllaDB not only supported Bloom filters out of the box, but we also had prior experience using it for different personalization use cases. Knowing its exceptional speed with the ability to replicate data into multiple regions and serve customers from locations close to their origin, ScyllaDB seemed like a comprehensive solution. This allowed us to concentrate on developing what mattered most for our customers and enabled us to go to market fast.” As the following diagram shows, the Watch Discounting feature was powered by two ingestion pipelines. Batch processes compute users’ watch history within a specified time window, determining if a piece of content meets the completion criteria to be considered “watched.” If so, the system updates the ScyllaDB table with a time-to-live (TTL) attribute, ensuring content only becomes rediscoverable after a specified amount of time. Short videos (e.g., 30-60 second videos that drive high engagement) require a different treatment. Here, content must be marked as “viewed” immediately, so real-time event streaming is used to update the watch discounting repository. Why ScyllaDB Charan concluded, “As mentioned earlier, adopting ScyllaDB enabled us to prioritize developing new functionality over creating data structures. This approach allowed us to keep our nodes small and maintain a clear separation of concerns between Bloom filters and filtering watched content. The unmatched performance of ScyllaDB became evident, especially when dealing with high cardinality of partitions and small data sizes—precisely the characteristics of our dataset. 
TTLs made cleanups easy and permitted the discovery of watched content after a specified period. Moreover, reading from LOCAL_QUORUM ensured that we could access data from the closest region to the user, resulting in high throughput and lower latencies. This strategic combination of features in ScyllaDB significantly contributed to the efficacy and effectiveness of our system.”
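Putting the pieces described above together, a watch-discounting table could look roughly like the following sketch. All names and the TTL value are assumptions for illustration, not JioCinema’s actual schema:

```cql
-- Illustrative: one partition per user; a row's presence means "already watched"
CREATE TABLE IF NOT EXISTS personalization.watched_content (
    user_id    text,
    content_id text,
    watched_at timestamp,
    PRIMARY KEY (user_id, content_id)
);

-- Mark content as watched; the TTL makes it rediscoverable later (30 days here)
INSERT INTO personalization.watched_content (user_id, content_id, watched_at)
VALUES (?, ?, toTimestamp(now())) USING TTL 2592000;

-- Filter candidate recommendations (the application reads at LOCAL_QUORUM via the driver)
SELECT content_id
FROM personalization.watched_content
WHERE user_id = ? AND content_id IN ?;
```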
Charybdis: Building High-Performance Distributed Rust Backends with ScyllaDB
Build a high-performance distributed Rust backend—without losing the expressiveness and ease of use of Ruby on Rails and SQL
Editor’s note: This post was originally published on Goran’s blog.
Ruby on Rails (RoR) is one of the most renowned web frameworks. When combined with SQL databases, RoR transforms into a powerhouse for developing back-end (or even full-stack) applications. It resolves numerous issues out of the box, sometimes without developers even realizing it. For example, with the right callbacks, complex business logic for a single API action is automatically wrapped within a transaction, ensuring ACID (Atomicity, Consistency, Isolation, Durability) compliance. This removes many potential concerns from the developer’s plate. Typically, developers only need to define a functional data model and adhere to the framework’s conventions — sounds easy, right?
However, as with all good things, there are trade-offs. In this case, it’s performance. While the RoR and RDBMS combination is exceptional for many applications, it struggles to provide a suitable solution for large-scale systems. Additionally, using frameworks like RoR alongside standard relational databases introduces another pitfall: it becomes easy to develop poor data models. Why? Simply because SQL databases are highly flexible, allowing developers to make almost any data model work. We just utilize more indexing, joins, and preloading to avoid the dreaded N+1 query problem. We’ve all fallen into this trap at some point.
What if we could build a high-performance, distributed, Rust-based backend while retaining some of the expressiveness and ease of use found in RoR and SQL? This is where ScyllaDB and the Charybdis ORM come into play. Before diving into these technologies, it’s essential to understand the fundamental differences between traditional Relational Database Management Systems (RDBMS) and ScyllaDB NoSQL.
LSM vs. B+ Tree
ScyllaDB, like Cassandra, employs a Log-Structured Merge Tree (LSM) storage engine, which optimizes write operations by appending data to in-memory structures called memtables and periodically flushing them to disk as SSTables. This approach allows for high write throughput and efficient handling of large volumes of data. By using a partition key and a hash function, ScyllaDB can quickly locate the relevant SSTables and memtables, avoiding global index scans and focusing operations on specific data segments. However, while LSM trees excel at write-heavy workloads, they can introduce read amplification since data might be spread across multiple SSTables. To mitigate this, ScyllaDB/Cassandra uses Bloom filters and optimized indexing strategies. Read performance may occasionally be less predictable compared to B+ trees, especially for certain read patterns.
Traditional SQL Databases: B+ Tree Indexing
In contrast, traditional SQL databases like PostgreSQL and MySQL (InnoDB) use B+ Tree indexing, which provides O(log N) read operations by traversing the tree from root to leaf nodes to locate specific rows. This structure is highly effective for read-heavy applications and supports complex queries, including range scans and multi-table joins. While B+ trees offer excellent read performance, write operations are slower compared to LSM trees due to the need to maintain tree balance, which may involve node splits and more random I/O operations. Additionally, SQL databases benefit from sophisticated caching mechanisms that keep frequently accessed index pages in memory, further enhancing read efficiency.
Horizontal Scalability ScyllaDB/Cassandra: Designed for Seamless Horizontal Scaling ScyllaDB/Cassandra are inherently built for horizontal scalability through their shared-nothing architecture. Each node operates independently, and data is automatically distributed across the cluster using consistent hashing. This design ensures that adding more nodes proportionally increases both storage capacity and compute resources, allowing the system to handle growing workloads efficiently. The automatic data distribution and replication mechanisms provide high availability and fault tolerance, ensuring that the system remains resilient even if individual nodes fail. Furthermore, ScyllaDB/Cassandra offer tunable consistency levels, allowing developers to balance between consistency and availability based on application requirements. This flexibility is particularly advantageous for distributed applications that need to maintain performance and reliability at scale. Traditional SQL Databases: Challenges with Horizontal Scaling Traditional SQL databases, on the other hand, were primarily designed for vertical scalability, relying on enhancing a single server’s resources to manage increased load. While replication (primary-replica or multi-primary) and sharding techniques enable horizontal scaling, these approaches often introduce significant operational complexity. Managing data distribution, ensuring consistency across replicas, and handling failovers require careful planning and additional tooling. Moreover, maintaining ACID properties across a distributed SQL setup can be resource-intensive, potentially limiting scalability compared to NoSQL solutions like ScyllaDB/Cassandra. Data Modeling To harness ScyllaDB’s full potential, there is one fundamental rule: data modeling should revolve around queries. This means designing your data structures based on how you plan to access and query them. At first glance, this might seem obvious, prompting the question: Aren’t we already doing this with traditional RDBMSs? Not entirely. The flexibility of SQL databases allows developers to make nearly any data model work by leveraging joins, indexes, and preloading techniques. This often masks underlying inefficiencies, making it easy to overlook suboptimal data designs. In contrast, ScyllaDB requires a more deliberate approach. You must carefully select partition and clustering keys to ensure that queries are scoped to single partitions and data is ordered optimally. This eliminates the need for extensive indexing and complex joins, allowing ScyllaDB’s Log-Structured Merge (LSM) engine to deliver high performance. While this approach demands more upfront effort, it leads to more efficient and scalable data models. To be fair, it also means that, as a rule, you usually have to provide more information to locate the desired data. Although this can initially appear challenging, the more you work with it, the more you naturally develop the intuition needed to create optimal models. Charybdis Now that we have grasped the fundamentals of data modeling in ScyllaDB, we can turn our attention to Charybdis. Charybdis is a Rust ORM built on top of the ScyllaDB Rust Driver, focusing on ease of use and performance. Out of the box, it generates nearly all available queries for a model and provides helpers for custom queries. It also supports automatic migrations, allowing you to run commands to migrate the database structure based on differences between model definitions in your code and the database. 
Additionally, Charybdis supports partial models, enabling developers to work seamlessly with subsets of model fields while implementing all traits and functionalities that are present in the main model.
Sample User Model
Note: We will always query a user by id, so we simply added id as the partition key, leaving the clustering key empty.
Installing and Running Migrations
First, install the
migration tool: cargo install charybdis-migrate
Within your src/ directory, run the migration:
migrate --hosts <host> --keyspace <your_keyspace> --drop-and-replace (optional)
This command will create the
users
table with fields defined in your model.
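For instance, assuming a User model keyed only by id, the generated table would be equivalent to CQL along these lines. The non-key columns are placeholders, since the post’s exact model fields are not spelled out here:

```cql
-- Illustrative schema for a users model whose partition key is just [id]
CREATE TABLE IF NOT EXISTS users (
    id         uuid PRIMARY KEY,   -- partition key; no clustering key
    username   text,
    email      text,
    created_at timestamp,
    updated_at timestamp
);
```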
Note that for migrations to work, you need to use types or aliases
defined within charybdis::types::*.
Basic Queries for the User Model
Sample Models for a Reddit-Like Application
In a
Reddit-like application, we have communities that have posts, and
posts have comments. Note that the following sample is available
within the
Charybdis examples repository.
Community Model
Post Model
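The Rust models themselves live in the Charybdis examples repository. As a hedged sketch, the underlying CQL tables would be shaped roughly like this, with the post table keyed by (community_id, created_at, id) as the post notes later; the non-key columns are illustrative:

```cql
-- Illustrative community table: simple lookups by id
CREATE TABLE IF NOT EXISTS communities (
    id          uuid PRIMARY KEY,
    title       text,
    description text,
    created_at  timestamp
);

-- Illustrative post table: partitioned by community, newest posts clustered first
CREATE TABLE IF NOT EXISTS posts (
    community_id uuid,
    created_at   timestamp,
    id           uuid,
    title        text,
    description  text,
    PRIMARY KEY (community_id, created_at, id)
) WITH CLUSTERING ORDER BY (created_at DESC, id ASC);
```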
Actix-Web Services for Posts
Note: The insert_cb
method triggers the before_insert
callback within our
trait, assigning a new id
and created_at
to the post.
Retrieving All Posts for a Community
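In CQL terms, that read is a single-partition query against the sketch above, with ordering supplied by the clustering definition:

```cql
-- Illustrative: all posts for one community, newest first
SELECT id, title, description, created_at
FROM posts
WHERE community_id = ?;
```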
Updating a Post’s Description
To avoid potential inconsistency issues, such as
concurrent requests to update a post’s description and other
fields, we use the automatically generated
partial_<model>! macro. The partial_post!
is
automatically generated by the charybdis_model
macro.
The first argument is the new struct name of the partial model, and
the others are a subset of the main model fields that we want to
work with. In this case, UpdateDescriptionPost
behaves
just like the standard Post
model but operates on
a subset of model fields. For each partial model, we must provide
the complete primary key, and the main model must implement the
Default
trait. Additionally, all traits on the main
model that are defined below the charybdis_model
will
automatically be included for all partial models. Now we can have
an Actix service dedicated to updating a post’s description:
Note: To update a post, you must provide all components of the primary key (community_id, created_at, id).
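Under the hood, such a partial-model update boils down to a statement that touches only description while supplying the full primary key, roughly this CQL against the sketch earlier:

```cql
-- Illustrative: only description changes, but the whole primary key is required
UPDATE posts
SET description = ?
WHERE community_id = ? AND created_at = ? AND id = ?;
```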
Final Notes
A full working sample is available within the
Charybdis examples repository. Note: We defined our models
somewhat differently than in typical SQL scenarios by using three
columns to define the primary key. This is because, in designing
models, we also determine where and how data will be stored for
querying and data transformation. ACID Compliance Considerations
While ScyllaDB offers exceptional performance and seamless
horizontal scalability for many applications, it is not suitable
for scenarios where ACID (Atomicity, Consistency, Isolation,
Durability) properties are required.
Sample Use Cases Requiring ACID Integrity
Bank Transactions: Ensuring that fund transfers are processed atomically to prevent discrepancies and maintain financial accuracy.
Seat Reservations: Guaranteeing that seat allocations in airline bookings or event ticketing systems are handled without double-booking.
Inventory Management: Maintaining accurate stock levels in e-commerce platforms to avoid overselling items.
For some critical applications, the lack of inherent ACID
guarantees in ScyllaDB means that developers must implement
additional safeguards to ensure data integrity. In cases where
absolute transactional reliability is non-negotiable, integrating
ScyllaDB with a traditional RDBMS that provides full ACID
compliance might be necessary. In upcoming articles, we will
explore how to handle additional scenarios and how to leverage
eventual consistency effectively for the majority of your web
application, as well as strategies for maintaining strong
consistency when required by your data models in ScyllaDB.
Books by Monster SCALE Summit 25 Speakers: Distributed Data Systems & Beyond
Monster SCALE Summit speakers have amassed a rather impressive list of publications, including quite a few books. This blog highlights 10+ of them. If you’ve seen the Monster SCALE Summit agenda, you know that the stars have aligned nicely. In just two half days, from anywhere you like, you can learn from 60+ outstanding speakers – all exploring extreme scale engineering challenges from a variety of angles. Distributed databases, event streaming, AI/ML, Kubernetes, Rust…it’s all on the agenda. If you read the bios of our speakers, you’ll note that many have written books. This blog highlights eleven of those Monster SCALE Summit speakers’ books – plus two new books by past conference speakers. Once you register for the conference (it’s free + virtual), you’ll gain 30-day full access to the complete O’Reilly library (thanks to O’Reilly, a conference media sponsor). And Manning Publications is also a media sponsor. They are offering the Monster SCALE community a nice 50% discount on all Manning books . One more bonus: conference attendees who participate in the speaker chat will be eligible to win book bundles, courtesy of Manning. See the agenda and register – it’s free Designing Data-Intensive Applications, 2nd Edition By Martin Kleppmann and Chris Riccomini O’Reilly ETA: December 2025 Data is at the center of many challenges in system design today. Difficult issues such as scalability, consistency, reliability, efficiency, and maintainability need to be resolved. In addition, there’s an overwhelming variety of tools and analytical systems, including relational databases, NoSQL datastores, plus data warehouses and data lakes. What are the right choices for your application? How do you make sense of all these buzzwords? In this second edition, authors Martin Kleppmann and Chris Riccomini build on the foundation laid in the acclaimed first edition, integrating new technologies and emerging trends. You’ll be guided through the maze of decisions and trade-offs involved in building a modern data system, from choosing the right tools like Spark and Flink to understanding the intricacies of data laws like the GDPR. Peer under the hood of the systems you already use, and learn to use them more effectively Make informed decisions by identifying the strengths and weaknesses of different tools Navigate the trade-offs around consistency, scalability, fault tolerance, and complexity Understand the distributed systems research upon which modern databases are built Peek behind the scenes of major online services, and learn from their architectures Martin and Chris are presenting “Designing Data-Intensive Applications in 2025” Think Distributed Systems Dominik Tornow ETA: Fall 2025 Manning (use code SCALE2025 for 50% off) All modern software is distributed. Let’s say that again—all modern software is distributed. Whether you’re building mobile utilities, microservices, or massive cloud native enterprise applications, creating efficient distributed systems requires you to think differently about failure, performance, network services, resource usage, latency, and much more. This clearly-written book guides you into the mindset you’ll need to design, develop, and deploy scalable and reliable distributed systems. 
In Think Distributed Systems you’ll find a beautifully illustrated collection of mental models for: Correctness, scalability, and reliability Failure tolerance, detection, and mitigation Message processing Partitioning and replication Consensus Dominik is presenting “The Mechanics of Scale” Latency: Reduce Delay in Software Systems Pekka Enberg ETA: Summer 2025 Manning (use code SCALE2025 for 50% off) Slow responses can kill good software. Whether it’s recovering microseconds lost while routing messages on a server or speeding up page loads that keep users waiting, finding and fixing latency can be a frustrating part of your work as a developer. This one-of-a-kind book shows you how to spot, understand, and respond to latency wherever it appears in your applications and infrastructure. This book balances theory with practical implementations, turning academic research into useful techniques you can apply to your projects. In Latency you’ll learn: What latency is—and what it is not How to model and measure latency Organizing your application data for low latency Making your code run faster Hiding latency when you can’t reduce it Pekka presented “Patterns of Low Latency” at P99 CONF 2024. And his Turso co-founder Glauber Costa will be presenting “Who Needs One Database Anyway?” at Monster SCALE Summit Writing for Developers: Blogs That Get Read By Piotr Sarna and Cynthia Dunlop January 2025 Amazon | Manning (use code SCALE2025 for 50% off) This book is a practical guide to writing more compelling engineering blog posts. We discuss strategies for nailing all phases of the technical blogging process. And we have quite a bit of fun exploring the core blog post patterns that are most common across engineering blogs today, like “The Bug Hunt,” “How We Built It,” “Lessons Learned,” “We Rewrote It in X,” “Thoughts on Trends,” etc. Each “pattern” chapter includes an analysis of real-world examples as well as specific dos/don’ts for that particular pattern. There’s a section on moving from blogging into opportunities such as article writing, conference speaking, and book writing. Finally, we wrap with a critical (and often amusing) look at generative AI blogging uses and abuses. Oh…and there’s also a foreword by Bryan Cantrill and an afterword by Scott Hanselman! Readers will learn how to: Pinpoint topics that make intriguing posts Apply popular blog post design patterns Rapidly plan, draft, and optimize blog posts Make your content clearer and more convincing to technical readers Tap AI for revision while avoiding misuses and abuses Increase the impact of all your technical communications Piotr is presenting “A Dist Sys Programmer’s Journey Into AI” ScyllaDB in Action Bo Ingram October 2024 Amazon | Manning (use code SCALE2025 for 50% off) | ScyllaDB (free chapters) ScyllaDB in Action is your guide to everything you need to know about ScyllaDB, from your very first queries to running it in a production environment. It starts you with the basics of creating, reading, and deleting data and expands your knowledge from there. You’ll soon have mastered everything you need to build, maintain, and run an effective and efficient database. This book teaches you ScyllaDB the best way—through hands-on examples. 
Dive into the node-based architecture of ScyllaDB to understand how its distributed systems work, how you can troubleshoot problems, and how you can constantly improve performance.You’ll learn how to: • Read, write, and delete data in ScyllaDB • Design database schemas for ScyllaDB • Write performant queries against ScyllaDB • Connect and query a ScyllaDB cluster from an application • Configure, monitor, and operate ScyllaDB in production Bo’s colleagues Ethan Donowitz and Vicki Niu are both presenting at Monster SCALE Summit Data Virtualization in the Cloud Era Dr. Daniel Abadi and Andrew Mott July 2024 O’Reilly Data virtualization had been held back by complexity for decades until recent advances in cloud technology, data lakes, networking hardware, and machine learning transformed the dream into reality. It’s becoming increasingly practical to access data through an interface that hides low-level details about where it’s stored, how it’s organized, and which systems are needed to manipulate or process it. You can combine and query data from anywhere and leave the complex details behind. In this practical book, authors Dr. Daniel Abadi and Andrew Mott discuss in detail what data virtualization is and the trends in technology that are making data virtualization increasingly useful. With this book, data engineers, data architects, and data scientists will explore the architecture of modern data virtualization systems and learn how these systems differ from one another at technical and practical levels. By the end of the book, you’ll understand: The architecture of data virtualization systems Technical and practical ways that data virtualization systems differ from one another Where data virtualization fits into modern data mesh and data fabric paradigms Modern best practices and case study use cases Daniel is presenting “Two Leading Approaches to Data Virtualization: Which Scales Better?” Bonus: Read Daniel Abadi’s article on the PACELC theorem. Database Performance at Scale By Felipe Cardeneti Mendes, Piotr Sarna, Pavel Emelyanov, and Cynthia Dunlop October 2023 Amazon | ScyllaDB (free) Discover critical considerations and best practices for improving database performance based on what has worked, and failed, across thousands of teams and use cases in the field. This book provides practical guidance for understanding the database-related opportunities, trade-offs, and traps you might encounter while trying to optimize data-intensive applications for high throughput and low latency. Whether you’re building a new system from the ground up or trying to optimize an existing use case for increased demand, this book covers the essentials. The ultimate goal of the book is to help you discover new ways to optimize database performance for your team’s specific use cases, requirements, and expectations. 
Understand often overlooked factors that impact database performance at scale Recognize data-related performance and scalability challenges associated with your project Select a database architecture that’s suited to your workloads, use cases, and requirements Avoid common mistakes that could impede your long-term agility and growth Jumpstart teamwide adoption of best practices for optimizing database performance at scale Felipe is presenting “ScyllaDB is No Longer “Just a Faster Cassandra” Piotr is presenting “A Dist Sys Programmer’s Journey Into AI” Algorithms and Data Structures for Massive Datasets Dzejla Medjedovic, Emin Tahirovic, and Ines Dedovic May 2022 Amazon | Manning (use code SCALE2025 for 50% off) Algorithms and Data Structures for Massive Datasets reveals a toolbox of new methods that are perfect for handling modern big data applications. You’ll explore the novel data structures and algorithms that underpin Google, Facebook, and other enterprise applications that work with truly massive amounts of data. These effective techniques can be applied to any discipline, from finance to text analysis. Graphics, illustrations, and hands-on industry examples make complex ideas practical to implement in your projects—and there’s no mathematical proofs to puzzle over. Work through this one-of-a-kind guide, and you’ll find the sweet spot of saving space without sacrificing your data’s accuracy. Readers will learn: Probabilistic sketching data structures for practical problems Choosing the right database engine for your application Evaluating and designing efficient on-disk data structures and algorithms Understanding the algorithmic trade-offs involved in massive-scale systems Deriving basic statistics from streaming data Correctly sampling streaming data Computing percentiles with limited space resources Dzejla is presenting “Read- and Write-Optimization in Modern Database Infrastructures” Kafka: The Definitive Guide, 2nd Edition By Gwen Shapira, Todd Palino, Rajini Sivaram, Krit Petty November 2021 Amazon | O’Reilly Engineers from Confluent and LinkedIn responsible for developing Kafka explain how to deploy production Kafka clusters, write reliable event-driven microservices, and build scalable stream processing applications with this platform. Through detailed examples, you’ll learn Kafka’s design principles, reliability guarantees, key APIs, and architecture details, including the replication protocol, the controller, and the storage layer. You’ll learn: Best practices for deploying and configuring Kafka Kafka producers and consumers for writing and reading messages Patterns and use-case requirements to ensure reliable data delivery Best practices for building data pipelines and applications with Kafka How to perform monitoring, tuning, and maintenance tasks with Kafka in production The most critical metrics among Kafka’s operational measurements Kafka’s delivery capabilities for stream processing systems Gwen is presenting “The Nile Approach: Re-engineering Postgres for Millions of Tenants” The Missing README: A Guide for the New Software Engineer by Chris Riccomini and Dmitriy Ryaboy Amazon | O’Reilly August 2021 For new software engineers, knowing how to program is only half the battle. You’ll quickly find that many of the skills and processes key to your success are not taught in any school or bootcamp. 
The Missing README fills in that gap—a distillation of workplace lessons, best practices, and engineering fundamentals that the authors have taught rookie developers at top companies for more than a decade. Early chapters explain what to expect when you begin your career at a company. The book’s middle section expands your technical education, teaching you how to work with existing codebases, address and prevent technical debt, write production-grade software, manage dependencies, test effectively, do code reviews, safely deploy software, design evolvable architectures, and handle incidents when you’re on-call. Additional chapters cover planning and interpersonal skills such as Agile planning, working effectively with your manager, and growing to senior levels and beyond. You’ll learn: How to use the legacy code change algorithm, and leave code cleaner than you found it How to write operable code with logging, metrics, configuration, and defensive programming How to write deterministic tests, submit code reviews, and give feedback on other people’s code The technical design process, including experiments, problem definition, documentation, and collaboration What to do when you are on-call, and how to navigate production incidents Architectural techniques that make code change easier Agile development practices like sprint planning, stand-ups, and retrospectives Chris and Martin Kleppmann are presenting “Designing Data-Intensive Applications in 2025”
The DynamoDB Book
By Alex Debrie
April 2020
Amazon | Direct
DynamoDB is a highly available, infinitely scalable NoSQL database offering from AWS. But modeling with a NoSQL database like DynamoDB is different than modeling with a relational database. You need to intentionally design for your access patterns rather than creating a normalized model that allows for flexible querying later. The DynamoDB Book is the authoritative resource in the space, and it’s the recommended resource within Amazon for learning DynamoDB. Rick Houlihan, the former head of the NoSQL Blackbelt team at AWS, said The DynamoDB Book is “definitely a must read if you want to understand how to correctly model data for NoSQL apps.” The DynamoDB Book takes a comprehensive approach to teaching DynamoDB, including: Discussion of key concepts, underlying infrastructure components, and API design; Explanations of core strategies for data modeling, including one-to-many and many-to-many relationships, filtering, sorting, aggregations, and more; 5 full walkthrough examples featuring complex data models and a large number of access patterns. Alex is presenting “DynamoDB Cost Optimization Considerations and Strategies”
RESTful Java Patterns and Best Practices: Learn Best Practices to Efficiently Build Scalable, Reliable, and Maintainable High Performance Restful Services
By Bhakti Mehta
September 2014
Amazon
This book provides an overview of the REST architectural style and then dives deep into best practices and commonly used patterns for building RESTful services that are lightweight, scalable, reliable, and highly available. It’s designed to help application developers get familiar with REST. The book explores the details, best practices, and commonly used REST patterns as well as gives insights on how Facebook, Twitter, PayPal, GitHub, Stripe, and other companies are implementing solutions with RESTful services.
Tracking Millions of Heartbeats on ZEE’s Streaming Platform
How strategic database migration + data (re)modeling improved latencies and cut database costs 5X ZEE is India’s largest media and entertainment business, covering broadcast TV, films, streaming media, and music. ZEE5 is their premier OTT streaming service, available in over 190 countries with ~150M monthly active users. And every user’s playback experience, security, and recommendations rely upon a “heartbeat API” that processes a whopping 100B+ heartbeats per day. The engineers behind the system knew that continued business growth would stress their infrastructure (as well as the people reviewing the database bills). So, the team decided to rethink the system before it inflicted any heart attacks. TL;DR, they designed a system that’s loved internally and by users. And Jivesh Threja (Tech Lead) and Srinivas Shanmugam (Principal Architect) joined us on Valentine’s Day last year to share their experiences. They outlined the technical requirements for the replacement (cloud neutrality, multi-tenant readiness, simplicity of onboarding new use cases, and high throughput and low latency at optimal costs) and how that led to ScyllaDB. Then, they explained how they achieved their goals through a new stream processing pipeline, new API layer, and data (re)modeling. The initial results of their optimization: 5X cost savings (from $744K to $144K annually) and single-digit millisecond P99 read latency. Wrapping up, they shared lessons learned that could benefit anyone considering or using ScyllaDB. Here are some highlights from that talk… What’s a Heartbeat? “Heartbeat” refers to a request that’s fired at regular intervals during video playback on the ZEE5 OTT platform. These simple requests track what users are watching and how far they’ve progressed in each video. They’re essential for ZEE5’s “continue watching” functionality, which lets users pause content on one device then resume it on any device. They’re also instrumental for calculating key metrics, like concurrent viewership for a big event or the top shows this week. Why Change? ZEE5’s original heartbeat system was a web of different databases, each handling a specific part of the streaming experience. Although it was technically functional, this approach was expensive and locked them into a specific vendor ecosystem. The team recognized an opportunity to streamline their infrastructure– and they went for it. They wanted a system that wasn’t locked into any particular cloud provider, would cost less to operate, and could handle their massive scale with consistently fast performance – specifically, single-digit millisecond responses. Plus, they wanted the flexibility to add new features easily and the ability to offer their system to other streaming platforms. As Srinivas put it: “It needed to be multi-tenant ready so it could be reused for any OTT provider. And it needed to be easily extensible to new use cases without major architectural changes.” System Architecture, Before and After Here’s a look at their original system architecture with multiple databases: DynamoDB to store the basic heartbeat data Amazon RDS to store next and previous episode information Apache Solr to store persistent metadata One Redis instance to cache metadata Another Redis instance to store viewership details Click for a detailed view The ZEE5 team considered four main database options for this project: Redis, Cassandra, Apache Ignite, and ScyllaDB. After evaluation and benchmarking, they chose ScyllaDB. 
Some of the reasons Srinivas cited for this decision: “We don’t need an extra cache layer on top of the persistent database. ScyllaDB manages both the cache layer and the persistent database within the same infrastructure, ensuring low latency across regions, replication, and multi-cloud readiness. It works with any cloud vendor, including Azure, AWS, and GCP, and now offers managed support with a turnaround time of less than one hour.” The new architecture simplifies and flattens the previous system architecture structure. Click for a detailed view Now, all heartbeat events are pushed into their heartbeat topic, processed through stream processing, and ingested into ScyllaDB Cloud using ScyllaDB connectors. Whenever content is published, it’s ingested into their metadata topic and then inserted into ScyllaDB Cloud via metadata connectors. Srinivas concludes: “With this new architecture, we successfully migrated workloads from DynamoDB, RDS, Redis, and Solr to ScyllaDB. This has resulted in a 5x cost reduction, bringing our monthly expenses down from $62,000 to around $12,000.” Deeper into the Design Next Jivesh shared more about their low-level design… Real-time stream processing pipeline In the real-time stream processing pipeline, heartbeats are sent to ScyllaDB at regular intervals. The heartbeat interval is set to 60 seconds, meaning that every frontend client sends a heartbeat every 60 seconds while a user is watching a video. These heartbeats pass through the playback stream processing system, business logic consumers transform that data into the required format – then the processed data is stored in ScyllaDB. Scalable API layer The first component in the scalable API layer is the heartbeat service, which is responsible for handling large volumes of data ingestion. Topics process the data, then it passes through a connector service and is stored in ScyllaDB. Another notable API layer service is the Concurrent Viewership Count service. This service uses ScyllaDB to retrieve concurrent viewership data – either per user or per asset (e.g., per ID). For example, if a movie is released, this service can tell how many users are watching the movie at any given moment. Metadata management use case One of the first major challenges ZEE5 faced was managing metadata for their massive OTT platform. Initially, they relied on a combination of three different databases – Solr, Redis, and Postgres – to handle their extensive metadata needs. Looking to optimize and simplify, they redesigned their data model to work with ScyllaDB instead – using ID as the partition key, along with materialized views. Here’s a look at their metadata model:
CREATE TABLE keyspace.meta_data (
  id text,
  title text,
  show_id text,
  …,
  …,
  PRIMARY KEY ((id), show_id)
) WITH compaction = {'class': 'LeveledCompactionStrategy'};
In
this model, the ID serves as the partition key. Since this table
experiences relatively few writes (a write occurs only when a new
asset is released) but significantly more reads, they used Leveled
Compaction Strategy to optimize performance. And, according to
Jivesh, “Choosing the right partition and clustering keys helped us
get a single-digit millisecond latency.” Viewership count use case
Viewership Count is another use case that they moved to ScyllaDB.
Viewership count can be tracked per user or per asset ID. ZEE5
decided to design a table where the user ID served as the partition
key and the asset ID as the sort key – allowing viewership data to
be efficiently queried. They set ScyllaDB’s TTL to match the
60-second heartbeat interval, ensuring that data automatically
expires after the designated time. Additionally, they used
ScyllaDB’s Time-Window Compaction Strategy to efficiently manage
data in memory, clearing expired records based on the configured
TTL. Jivesh explained, “This table is continuously updated with
heartbeats from every front end and every user. As heartbeats
arrive, viewership counts are tracked in real time and
automatically cleared when the TTL expires. That lets us
efficiently retrieve live viewership data using ScyllaDB.” Here’s
their viewership count data model:
CREATE TABLE keyspace.USER_SESSION_STREAM (
  USER_ID text,
  DEVICE_ID text,
  ASSET_ID text,
  TITLE text,
  …,
  PRIMARY KEY ((USER_ID), ASSET_ID)
) WITH default_time_to_live = 60
  AND compaction = {'class': 'TimeWindowCompactionStrategy'};
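Given that model, a live-session lookup stays within a single partition. The talk didn't show the actual queries, so the following is only a rough, hypothetical illustration of what such a read might look like:
-- single-partition read; rows older than the 60-second TTL have already expired
SELECT ASSET_ID, TITLE
FROM keyspace.USER_SESSION_STREAM
WHERE USER_ID = 'user-123';
Because expired rows are dropped by the TTL (and cleaned up by TWCS compaction), a query like this only ever returns sessions with a recent heartbeat.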
ScyllaDB Results and
Lessons Learned The following load test report shows a throughput
of 41.7K requests per second. This benchmark was conducted during
the database selection process to evaluate performance under high
load. Jivesh remarked, “Even with such a high throughput, we could
achieve a microsecond write latency and average microsecond read
latency. This really gave us a clear view of what ScyllaDB could do
– and that helped us decide.” He then continued to share some facts
that shed light on the scale of ZEE5’s ScyllaDB deployment: “We
have around 9TB on ScyllaDB. Even with such a large volume of data,
it’s able to handle latencies within microseconds and a
single-digit millisecond, which is quite tremendous. We have a
daily peak concurrent viewership count of 1 million. Every second,
we are writing so much data into ScyllaDB and getting so much data
out of it. We process more than 100 billion heartbeats in a day.
That’s quite huge.” The talk wrapped with the following lessons
learned: Data modeling is the single most critical factor in
achieving single-digit millisecond latencies. Choose the right
quorum setting and compaction strategy. For example, does a
heartbeat need to be written to every node before it can be read,
or is a local quorum sufficient? Selecting the right quorum ensures
the best balance between latency and SLA requirements. Choose
Partition and Clustering Keys wisely – it’s not easy to modify them
later. Use Materialized Views for faster lookups and avoid filter
queries. Querying across partitions can degrade performance. Use
prepared statements to improve efficiency. Use asynchronous queries
for faster query processing. For instance, in the metadata model,
20 synchronous queries were executed in parallel, and ScyllaDB
handled them within milliseconds. Zone-aware ScyllaDB clients help
reduce cross-AZ (Availability Zone) network costs. Fetching data
within the same AZ minimizes latency and significantly reduces
network expenses.
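To make a couple of those lessons concrete, here is a hedged sketch (the view name and values are illustrative, not ZEE5's actual schema) of a materialized view that adds an alternate lookup path on the metadata table shown earlier, followed by the difference between a partition-key lookup and a filtering query:
-- illustrative only: alternate lookup path keyed by show_id
CREATE MATERIALIZED VIEW keyspace.meta_data_by_show AS
  SELECT * FROM keyspace.meta_data
  WHERE show_id IS NOT NULL AND id IS NOT NULL
  PRIMARY KEY ((show_id), id);

-- efficient: single-partition read on the base table
SELECT title FROM keyspace.meta_data WHERE id = 'asset-123';

-- avoid: forces a scan across partitions
SELECT title FROM keyspace.meta_data WHERE title = 'Some Show' ALLOW FILTERING;
Prepared statements and asynchronous execution – the 20 parallel metadata queries mentioned above – are handled on the driver side and aren't shown here.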
First Look at the Monster SCALE Summit Agenda
Given that we’re hosting Monster SCALE Summit…with tech talks on extreme-scale engineering…many of which feature our monstrously fast and scalable database, a big announcement is probably expected? We hope this meets your super-sized expectations. Monster SCALE Summit 2025 will be featuring 60+ tech talks including:
- Just-added keynotes by Kelsey Hightower and Rachel Stephens + Adam Jacob
- Previously-teased keynotes by Avi Kivity, Martin Kleppmann + Chris Riccomini, Gwen Shapira, and Dor Laor
- Engineering talks by gamechangers like Uber, Slack, Canva, Atlassian, Wise, and Booking.com
- 14 talks by ScyllaDB users such as Medium, Disney+, ShareChat, Yieldmo, Clearview AI, and more – plus two talks by Discord
- The latest from ScyllaDB engineering: including object storage, vector search, and “ScyllaDB X Cloud”
Like other ScyllaDB-hosted conferences (e.g., P99 CONF), this conference will be free and virtual so that everyone can participate. See the agenda and register – it’s free. Mark your calendar for March 11 and 12 because – in addition to all those great talks – you can…
- Chat directly with speakers and connect with ~20K of your peers
- Participate in some monster scale global distributed system challenges – with prizes for winners, of course
- Learn from ScyllaDB’s top experts, who are eager to answer your toughest database performance questions in our lively lounge – and are preparing special interactive training courses for the occasion
- Win conference swag, sea monster plushies, book bundles, and other cool giveaways
It’s a lot. But hey, it’s Monster SCALE Summit. 🙂
Details, Details
Beyond what’s on the agenda, here’s some additional detail on a few recently-added sessions (see more in our “tiny peek” blog post).
How Discord Performs Database Upgrades at Scale
Ethan Donowitz, Senior Software Engineer, Persistence Infrastructure at Discord
Database upgrades are high-risk but high-reward. Upgrading to a newer version can make your database faster, cheaper, and more reliable; however, without thorough planning and testing, upgrades can be risky. Because databases are stateful, it is often not possible to roll back if you encounter problems after the upgrade due to backwards incompatible changes across versions. While new versions typically mean improved query latencies, changes in query planning or cache behavior across versions can cause unexpected differences in performance in places one might not expect. Discord relies on ScyllaDB to serve millions of reads per second across many clusters, so we needed a comprehensive strategy to sufficiently de-risk upgrades to avoid impact to our users. To accomplish this, we use what we call “shadow clusters.” A shadow cluster contains roughly the same data as its corresponding cluster in production, and traffic to the primary cluster is mirrored to the shadow cluster. Running a real production workload on a shadow cluster can expose differences in performance and resource usage across versions. When mirroring reads, we also have the ability to perform “read validations,” where the results for a query issued to the primary cluster and the shadow cluster are checked for equality. This gives us confidence that data has not been corrupted due to differences in behavior across versions. Testing with shadow clusters has been paramount to de-risking complicated upgrades for one of the most important pieces of infrastructure at Discord.
How Discord Indexes Trillions of Messages: Scaling Search Infrastructure
Vicki Niu, Senior Software Engineer at Discord
When Discord first built messages search in 2017, we designed our infrastructure to handle billions of messages sent by millions of users. As our platform grew to trillions of messages, our search system failed to keep up. We thus set out to rebuild our message search platform to meet these new scaling needs using our learnings and some new technologies. This talk will share how we scaled Discord’s message search infrastructure using Rust, Kubernetes, and a multi-cluster Elasticsearch architecture to achieve better performance, operability, and reliability, while also enabling new search features for Discord users.
Route It Like It’s Hot: Scaling Payments Routing at American Express
Benjamin Cane, Distinguished Engineer at American Express
In 2023, there were over 723 billion credit card transactions. Whenever someone taps, swipes, dips, or clicks a credit or debit card, a payment switch ensures the transaction arrives safely and securely at the correct financial institution. These payment switches are the backbone of the worldwide payments ecosystem. Join the American Express Payment Acquiring and Network team as they share their experiences from building their Global Transaction Router, which is responsible for switching and routing payments at the scale of American Express. They will explore how they’ve designed, built, and operated this Global Transaction Router to perform during record-breaking shopping holidays, ticket sales, and unexpected customer behavior. The audience will leave with a deep understanding of the unique challenges of a payments switch (e.g., routing ISO 8583 transactions as fast as possible), some of our design choices (e.g., using containers and avoiding logging), and a deep dive into a few implementation challenges (e.g., inefficient use of Goroutines and Channels) we found along the way.
How Yieldmo Cut Database Costs and Cloud Dependencies Fast
Todd Coleman, Chief Architect and Co-founder at Yieldmo
Yieldmo’s business relies on processing hundreds of billions of daily ad requests with subsecond latency responses. Our services initially depended on DynamoDB, and we valued its simplicity and stability. However, DynamoDB costs were becoming unsustainable, latencies were not ideal, and we sought greater flexibility in deploying services to other cloud providers. In this session, we’ll walk you through the various options we considered to address these challenges and share why and how we ultimately moved forward with ScyllaDB’s DynamoDB-compatible API.
See more session details
Real-Time Write Heavy Database Workloads: Considerations & Tips
Let’s look at the performance-related complexities that teams commonly face with write-heavy workloads and discuss your options for tackling them Write-heavy database workloads bring a distinctly different set of challenges than read-heavy ones. For example: Scaling writes can be costly, especially if you pay per operation and writes are 5X more costly than reads Locking can add delays and reduce throughput I/O bottlenecks can lead to write amplification and complicate crash recovery Database backpressure can throttle the incoming load While cost matters – quite a lot, in many cases – it’s not a topic we want to cover here. Rather, let’s focus on the performance-related complexities that teams commonly face and discuss your options for tackling them. What Do We Mean by “a Real-Time Write Heavy Workload”? First, let’s clarify what we mean by a “real-time write-heavy” workload. We’re talking about workloads that: Ingest a large amount of data (e.g., over 50K OPS) Involve more writes than reads Are bound by strict latency SLAs (e.g., single-digit millisecond P99 latency) In the wild, they occur across everything from online gaming to real-time stock exchanges. A few specific examples: Internet of Things (IoT) workloads tend to involve small but frequent append-only writes of time series data. Here, the ingestion rate is primarily determined by the number of endpoints collecting data. Think of smart home sensors or industrial monitoring equipment constantly sending data streams to be processed and stored. Logging and Monitoring systems also deal with frequent data ingestion, but they don’t have a fixed ingestion rate. They may not necessarily append only, as well as may be prone to hotspots, such as when one endpoint misbehaves. Online Gaming platforms need to process real-time user interactions, including game state changes, player actions, and messaging. The workload tends to be spiky, with sudden surges in activity. They’re extremely latency sensitive since even small delays can impact the gaming experience. E-commerce and Retail workloads are typically update-heavy and often involve batch processing. These systems must maintain accurate inventory levels, process customer reviews, track order status, and manage shopping cart operations, which usually require reading existing data before making updates. Ad Tech and Real-time Bidding systems require split-second decisions. These systems handle complex bid processing, including impression tracking and auction results, while simultaneously monitoring user interactions such as clicks and conversions. They must also detect fraud in real time and manage sophisticated audience segmentation for targeted advertising. Real-time Stock Exchange systems must support high-frequency trading operations, constant stock price updates, and complex order matching processes – all while maintaining absolute data consistency and minimal latency. Next, let’s look at key architectural and configuration considerations that impact write performance. Storage Engine Architecture The choice of storage engine architecture fundamentally impacts write performance in databases. Two primary approaches exist: LSM trees and B-Trees. Databases known to handle writes efficiently – such as ScyllaDB, Apache Cassandra, HBase, and Google BigTable – use Log-Structured Merge Trees (LSM). This architecture is ideal for handling large volumes of writes. Since writes are immediately appended to memory, this allows for very fast initial storage. 
Once the “memtable” in memory fills up, the recent writes are flushed to disk in sorted order. That reduces the need for random I/O. For example, here’s what the ScyllaDB write path looks like: With B-tree structures, each write operation requires locating and modifying a node in the tree – and that involves both sequential and random I/O. As the dataset grows, the tree can require additional nodes and rebalancing, leading to more disk I/O, which can impact performance. B-trees are generally better suited for workloads involving joins and ad-hoc queries. Payload Size Payload size also impacts performance. With small payloads, throughput is good but CPU processing is the primary bottleneck. As the payload size increases, you get lower overall throughput and disk utilization also increases. Ultimately, a small write usually fits in all the buffers and everything can be processed quite quickly. That’s why it’s easy to get high throughput. For larger payloads, you need to allocate larger buffers or multiple buffers. The larger the payloads, the more resources (network and disk) are required to service those payloads. Compression Disk utilization is something to watch closely with a write-heavy workload. Although storage is continuously becoming cheaper, it’s still not free. Compression can help keep things in check – so choose your compression strategy wisely. Faster compression speeds are important for write-heavy workloads, but also consider your available CPU and memory resources. Be sure to look at the compression chunk size parameter. Compression basically splits your data into smaller blocks (or chunks) and then compresses each block separately. When tuning this setting, realize that larger chunks are better for reads while smaller ones are better for writes, and take your payload size into consideration. Compaction For LSM-based databases, the compaction strategy you select also influences write performance. Compaction involves merging multiple SSTables into fewer, more organized files, to optimize read performance, reclaim disk space, reduce data fragmentation, and maintain overall system efficiency. When selecting compaction strategies, you could aim for low read amplification, which makes reads as efficient as possible. Or, you could aim for low write amplification by avoiding compaction from being too aggressive. Or, you could prioritize low space amplification and have compaction purge data as efficiently as possible. For example, ScyllaDB offers several compaction strategies (and Cassandra offers similar ones): Size-tiered compaction strategy (STCS): Triggered when the system has enough (four by default) similarly sized SSTables. Leveled compaction strategy (LCS): The system uses small, fixed-size (by default 160 MB) SSTables distributed across different levels. Incremental Compaction Strategy (ICS): Shares the same read and write amplification factors as STCS, but it fixes its 2x temporary space amplification issue by breaking huge sstables into SSTable runs, which are comprised of a sorted set of smaller (1 GB by default), non-overlapping SSTables. Time-window compaction strategy (TWCS): Designed for time series data. For write-heavy workloads, we warn users to avoid leveled compaction at all costs. That strategy is designed for read-heavy use cases. Using it can result in a regrettable 40x write amplification. Batching In databases like ScyllaDB and Cassandra, batching can actually be a bit of a trap – especially for write-heavy workloads. 
If you’re used to relational databases, batching might seem like a good option for handling a high volume of writes. But it can actually slow things down if it’s not done carefully. Mainly, that’s because large or unstructured batches end up creating a lot of coordination and network overhead between nodes. However, that’s really not what you want in a distributed database like ScyllaDB. Here’s how to think about batching when you’re dealing with heavy writes: Batch by the Partition Key: Group your writes by the partition key so the batch goes to a coordinator node that also owns the data. That way, the coordinator doesn’t have to reach out to other nodes for extra data. Instead, it just handles its own, which cuts down on unnecessary network traffic. Keep Batches Small and Targeted: Breaking up large batches into smaller ones by partition keeps things efficient. It avoids overloading the network and lets each node work on only the data it owns. You still get the benefits of batching, but without the overhead that can bog things down. Stick to Unlogged Batches: Considering you follow the earlier points, it’s best to use unlogged batches. Logged batches add extra consistency checks, which can really slow down the write. So, if you’re in a write-heavy situation, structure your batches carefully to avoid the delays that big, cross-node batches can introduce. Wrapping Up We offered quite a few warnings, but don’t worry. It was easy to compile a list of lessons learned because so many teams are extremely successful working with real-time write-heavy workloads. Now you know many of their secrets, without having to experience their mistakes. 🙂 If you want to learn more, here are some firsthand perspectives from teams who tackled quite interesting write-heavy challenges: Zillow: Consuming records from multiple data producers, which resulted in out-of-order writes that could result in incorrect updates Tractian: Preparing for 10X growth in high-frequency data writes from IoT devices Fanatics: Heavy write operations like handling orders, shopping carts, and product updates for this online sports retailer Also, take a look at the following video, where we go into even greater depth on these write-heavy challenges and also walk you through what these workloads look like on ScyllaDB.Inside Tripadvisor’s Real-Time Personalization with ScyllaDB + AWS
See the engineering behind real-time personalization at Tripadvisor’s massive (and rapidly growing) scale What kind of traveler are you? Tripadvisor tries to assess this as soon as you engage with the site, then offer you increasingly relevant information on every click—within a matter of milliseconds. This personalization is powered by advanced ML models acting on data that’s stored on ScyllaDB running on AWS. In this article, Dean Poulin (Tripadvisor Data Engineering Lead on the AI Service and Products team) provides a look at how they power this personalization. Dean shares a taste of the technical challenges involved in delivering real-time personalization at Tripadvisor’s massive (and rapidly growing) scale. It’s based on the following AWS re:Invent talk: Pre-Trip Orientation In Dean’s words … Let’s start with a quick snapshot of who Tripadvisor is, and the scale at which we operate. Founded in 2000, Tripadvisor has become a global leader in travel and hospitality, helping hundreds of millions of travelers plan their perfect trips. Tripadvisor generates over $1.8 billion in revenue and is a publicly traded company on the NASDAQ stock exchange. Today, we have a talented team of over 2,800 employees driving innovation, and our platform serves a staggering 400 million unique visitors per month – a number that’s continuously growing. On any given day, our system handles more than 2 billion requests from 25 to 50 million users. Every click you make on Tripadvisor is processed in real time. Behind that, we’re leveraging machine learning models to deliver personalized recommendations – getting you closer to that perfect trip. At the heart of this personalization engine is ScyllaDB running on AWS. This allows us to deliver millisecond-latency at a scale that few organizations reach. At peak traffic, we hit around 425K operations per second on ScyllaDB with P99 latencies for reads and writes around 1-3 milliseconds. I’ll be sharing how Tripadvisor is harnessing the power of ScyllaDB, AWS, and real-time machine learning to deliver personalized recommendations for every user. We’ll explore how we help travelers discover everything they need to plan their perfect trip: whether it’s uncovering hidden gems, must-see attractions, unforgettable experiences, or the best places to stay and dine. This [article] is about the engineering behind that – how we deliver seamless, relevant content to users in real time, helping them find exactly what they’re looking for as quickly as possible. Personalized Trip Planning Imagine you’re planning a trip. As soon as you land on the Tripadvisor homepage, Tripadvisor already knows whether you’re a foodie, an adventurer, or a beach lover – and you’re seeing spot-on recommendations that seem personalized to your own interests. How does that happen within milliseconds? As you browse around Tripadvisor, we start to personalize what you see using Machine Learning models which calculate scores based on your current and prior browsing activity. We recommend hotels and experiences that we think you would be interested in. We sort hotels based on your personal preferences. We recommend popular points of interest near the hotel you’re viewing. These are all tuned based on your own personal preferences and prior browsing activity. Tripadvisor’s Model Serving Architecture Tripadvisor runs on hundreds of independently scalable microservices in Kubernetes on-prem and in Amazon EKS. Our ML Model Serving Platform is exposed through one of these microservices. 
This gateway service abstracts over 100 ML Models from the Client Services – which lets us run A/B tests to find the best models using our experimentation platform. The ML Models are primarily developed by our Data Scientists and Machine Learning Engineers using Jupyter Notebooks on Kubeflow. They’re managed and trained using ML Flow, and we deploy them on Seldon Core in Kubernetes. Our Custom Feature Store provides features to our ML Models, enabling them to make accurate predictions. The Custom Feature Store The Feature Store primarily serves User Features and Static Features. Static Features are stored in Redis because they don’t change very often. We run data pipelines daily to load data from our offline data warehouse into our Feature Store as Static Features. User Features are served in real time through a platform called Visitor Platform. We execute dynamic CQL queries against ScyllaDB, and we do not need a caching layer because ScyllaDB is so fast. Our Feature Store serves up to 5 million Static Features per second and half a million User Features per second. What’s an ML Feature? Features are input variables to the ML Models that are used to make a prediction. There are Static Features and User Features. Some examples of Static Features are awards that a restaurant has won or amenities offered by a hotel (like free Wi-Fi, pet friendly or fitness center). User Features are collected in real time as users browse around the site. We store them in ScyllaDB so we can get lightning fast queries. Some examples of user features are the hotels viewed over the last 30 minutes, restaurants viewed over the last 24 hours, or reviews submitted over the last 30 days. The Technologies Powering Visitor Platform ScyllaDB is at the core of Visitor Platform. We use Java-based Spring Boot microservices to expose the platform to our clients. This is deployed on AWS ECS Fargate. We run Apache Spark on Kubernetes for our daily data retention jobs, our offline to online jobs. Then we use those jobs to load data from our offline data warehouse into ScyllaDB so that they’re available on the live site. We also use Amazon Kinesis for processing streaming user tracking events. The Visitor Platform Data Flow The following graphic shows how data flows through our platform in four stages: produce, ingest, organize, and activate. Data is produced by our website and our mobile apps. Some of that data includes our Cross-Device User Identity Graph, Behavior Tracking events (like page views and clicks) and streaming events that go through Kinesis. Also, audience segmentation gets loaded into our platform. Visitor Platform’s microservices are used to ingest and organize this data. The data in ScyllaDB is stored in two keyspaces: The Visitor Core keyspace, which contains the Visitor Identity Graph The Visitor Metric keyspace, which contains Facts and Metrics (the things that the people did as they browsed the site) We use daily ETL processes to maintain and clean up the data in the platform. We produce Data Products, stamped daily, in our offline data warehouse – where they are available for other integrations and other data pipelines to use in their processing. Here’s a look at Visitor Platform by the numbers: Why Two Databases? Our online database is focused on the real-time, live website traffic. ScyllaDB fills this role by providing very low latencies and high throughput. 
We use short term TTLs to prevent the data in the online database from growing indefinitely, and our data retention jobs ensure that we only keep user activity data for real visitors. Tripadvisor.com gets a lot of bot traffic, and we don’t want to store their data and try to personalize bots – so we delete and clean up all that data. Our offline data warehouse retains historical data used for reporting, creating other data products, and training our ML Models. We don’t want large-scale offline data processes impacting the performance of our live site, so we have two separate databases used for two different purposes. Visitor Platform Microservices We use 5 microservices for Visitor Platform: Visitor Core manages the cross-device user identity graph based on cookies and device IDs. Visitor Metric is our query engine, and that provides us with the ability for exposing facts and metrics for specific visitors. We use a domain specific language called visitor query language, or VQL. This example VQL lets you see the latest commerce click facts over the last three hours. Visitor Publisher and Visitor Saver handle the write path, writing data into the platform. Besides saving data in ScyllaDB, we also stream data to the offline data warehouse. That’s done with Amazon Kinesis. Visitor Composite simplifies publishing data in batch processing jobs. It abstracts Visitor Saver and Visitor Core to identify visitors and publish facts and metrics in a single API call. Roundtrip Microservice Latency This graph illustrates how our microservice latencies remain stable over time. The average latency is only 2.5 milliseconds, and our P999 is under 12.5 milliseconds. This is impressive performance, especially given that we handle over 1 billion requests per day. Our microservice clients have strict latency requirements. 95% of the calls must complete in 12 milliseconds or less. If they go over that, then we will get paged and have to find out what’s impacting the latencies. ScyllaDB Latency Here’s a snapshot of ScyllaDB’s performance over three days. At peak, ScyllaDB is handling 340,000 operations per second (including writes and reads and deletes) and the CPU is hovering at just 21%. This is high scale in action! ScyllaDB delivers microsecond writes and millisecond reads for us. This level of blazing fast performance is exactly why we chose ScyllaDB. Partitioning Data into ScyllaDB This image shows how we partition data into ScyllaDB. The Visitor Metric Keyspace has two tables: Fact and Raw Metrics. The primary key on the Fact table is Visitor GUID, Fact Type, and Created At Date. The composite partition key is the Visitor GUID and Fact Type. The clustering key is Created At Date, which allows us to sort data in partitions by date. The attributes column contains a JSON object representing the event that occurred there. Some example Facts are Search Terms, Page Views, and Bookings. We use ScyllaDB’s Leveled Compaction Strategy because: It’s optimized for range queries It handles high cardinality very well It’s better for read-heavy workloads, and we have about 2-3X more reads than writes Why ScyllaDB? Our solution was originally built using Cassandra on-prem. But as the scale increased, so did the operational burden. It required dedicated operations support in order for us to manage the database upgrades, backups, etc. Also, our solution requires very low latencies for core components. 
Our User Identity Management system must identify the user within 30 milliseconds – and for the best personalization, we require our Event Tracking platform to respond in 40 milliseconds. It’s critical that our solution doesn’t block rendering the page so our SLAs are very low. With Cassandra, we had impacts to performance from garbage collection. That was primarily impacting the tail latencies, the P999 and P9999 latencies. We ran a Proof of Concept with ScyllaDB and found the throughput to be much better than Cassandra and the operational burden was eliminated. ScyllaDB gave us a monstrously fast live serving database with the lowest possible latencies. We wanted a fully-managed option, so we migrated from Cassandra to ScyllaDB Cloud, following a dual write strategy. That allowed us to migrate with zero downtime while handling 40,000 operations or requests per second. Later, we migrated from ScyllaDB Cloud to ScyllaDB’s “Bring your own account” model, where you can have the ScyllaDB team deploy the ScyllaDB database into your own AWS account. This gave us improved performance as well as better data privacy. This diagram shows what ScyllaDB’s BYOA deployment looks like. In the center of the diagram, you can see a 6-node ScyllaDB cluster that is running on EC2. And then there’s two additional EC2 instances. ScyllaDB Monitor gives us Grafana dashboards as well as Prometheus metrics. ScyllaDB Manager takes care of infrastructure automation like triggering backups and repairs. With this deployment, ScyllaDB could be co-located very close to our microservices to give us even lower latencies as well as much higher throughput and performance. Wrapping up, I hope you now have a better understanding of our architecture, the technologies that power the platform, and how ScyllaDB plays a critical role in allowing us to handle Tripadvisor’s extremely high scale.Innovative data compression for time series: An open source solution
Introduction
There’s no escaping the role that monitoring plays in our everyday lives, whether it’s the weather, the number of steps we take in a day, computer systems, or the ever-popular IoT devices.
Practically any activity can be monitored in one form or another these days. This generates increasing amounts of data to be pored over and analyzed–but storing all this data adds significant costs over time. Given this huge amount of data that only increases with each passing day, efficient compression techniques are crucial.
Here at NetApp® Instaclustr we saw a great opportunity to improve the current compression techniques for our time series data. That’s why we created the Advanced Time Series Compressor (ATSC) in partnership with the University of Canberra through the OpenSI initiative.
ATSC is a groundbreaking compressor designed to address the challenges of efficiently compressing large volumes of time-series data. Internal test results with production data from our database metrics showed that ATSC compresses, on average across the dataset, ~10x more than LZ4 and ~30x more than the default Prometheus compression. Check out ATSC on GitHub.
There are so many compressors already, so why develop another one?
While other compression methods like LZ4, DoubleDelta, and ZSTD are lossless, most of our timeseries data is already lossy. Timeseries data can be lossy from the beginning due to under-sampling or insufficient data collection, or it can become lossy over time as metrics are rolled over or averaged. Because of this, the idea of a lossy compressor was born.
ATSC is a highly configurable, lossy compressor that leverages the characteristics of time-series data to create function approximations. ATSC finds a fitting function and stores the parametrization of that function—no actual data from the original timeseries is stored. When the data is decompressed, it isn’t identical to the original, but it is still sufficient for the intended use.
Here’s an example: for a temperature change metric—which mostly varies slowly (as do a lot of system metrics!)—instead of storing all the points that have a small change, we fit a curve (or a line) and store that curve/line achieving significant compression ratios.
Image 1: ATSC data for temperature
How does ATSC work?
ATSC looks at the actual time series, in whole or in parts, to find how to better calculate a function that fits the existing data. For that, a quick statistical analysis is done, but if the results are inconclusive a sample is compressed with all the functions and the best function is selected.
By default, ATSC will segment the data—this guarantees better local fitting, more and smaller computations, and less memory usage. It also ensures that decompression targets a specific block instead of the whole file.
In each fitting frame, ATSC will create a function from a pre-defined set and calculate the parametrization of said function.
ATSC currently uses one of the following functions per frame:
- FFT (Fast Fourier Transforms)
- Constant
- Interpolation – Catmull-Rom
- Interpolation – Inverse Distance Weight
Image 2: Polynomial fitting vs. Fast-Fourier Transform fitting
These methods allow ATSC to compress data with a fitting error within 1% (configurable!) of the original time-series.
For a more detailed insight into ATSC internals and operations check our paper!
Use cases for ATSC and results
ATSC draws inspiration from established compression and signal analysis techniques, achieving compression ratios ranging from 46x to 880x with a fitting error within 1% of the original time-series. In some cases, ATSC can produce highly compressed data without losing any meaningful information, making it a versatile tool for various applications (please see use cases below).
Our internal tests comparing ATSC to LZ4 and the default Prometheus compression yielded the following results:
Method | Compressed size (bytes) | Compression Ratio |
Prometheus | 454,778,552 | 1.33 |
LZ4 | 141,347,821 | 4.29 |
ATSC | 14,276,544 | 42.47 |
Another characteristic is the trade-off between compression speed and decompression speed: compression is about 30x slower than decompression. This suits the typical access pattern, where a time series is compressed once but decompressed several times.
Image 3: A better fitting (purple) vs. a loose fitting (red). Purple takes twice as much space.
ATSC is versatile and can be applied in various scenarios where space reduction is prioritized over absolute precision. Some examples include:
- Rolled-over time series: For metrics data that are rolled over and stored long term, ATSC offers significant space savings without meaningful loss in precision – the same or greater savings than rolling over, but with minimal information loss.
- Under-sampled time series: Systems with very low sampling rates (30 seconds or more) make it difficult to identify actual events. ATSC provides the space savings needed to increase the sample rate while keeping the information about the events.
- Long, slow-moving data series: Ideal for patterns that are easy to fit, such as weather data.
- Human visualization: Data meant for human analysis, with minimal impact on accuracy, such as historic views into system metrics (CPU, Memory, Disk, etc.)
Image 4: ATSC data (green) with an 88x compression vs. the original data (yellow)
Using ATSC
ATSC is written in Rust and is available on GitHub. You can build and run it yourself by following these instructions.
Future work
Currently, we are planning to evolve ATSC in two ways (check our open issues):
- Adding features to the core compressor, focused on these functionalities:
  - Frame expansion for appending new data to existing frames
  - Dynamic function loading to add more functions without altering the codebase
  - Global and per-frame error storage
  - Improved error encoding
- Integrations with additional technologies (e.g. databases):
  - We are currently looking into integrating ATSC with ClickHouse® and Apache Cassandra®
CREATE TABLE sensors_poly (
  sensor_id UInt16,
  location UInt32,
  timestamp DateTime,
  pressure Float64 CODEC(ATSC('Polynomial', 1)),
  temperature Float64 CODEC(ATSC('Polynomial', 1))
) ENGINE = MergeTree
ORDER BY (sensor_id, location, timestamp);
Image 5: Currently testing ClickHouse integration
Sound interesting? Try it out and let us know what you think.
ATSC represents a significant advancement in time-series data compression, offering high compression ratios with a configurable accuracy loss. Whether for long-term storage or efficient data visualization, ATSC is a powerful open source tool for managing large volumes of time-series data.
But don’t just take our word for it—download and run it!
Check our documentation for any information you need and submit ideas for improvements or issues you find using GitHub issues. We also have easy first issues tagged if you’d like to contribute to the project.
Want to integrate this with another tool? You can build and run our demo integration with ClickHouse.
New cassandra_latest.yaml configuration for a top performant Apache Cassandra®
Welcome to our deep dive into the latest advancements in Apache Cassandra® 5.0, specifically focusing on the cassandra_latest.yaml configuration that is available for new Cassandra 5.0 clusters.
This blog post will walk you through the motivation behind these changes, how to use the new configuration, and the benefits it brings to your Cassandra clusters.
Motivation
The primary motivation for introducing cassandra_latest.yaml is to bridge the gap between maintaining backward compatibility and leveraging the latest features and performance improvements. The yaml addresses the following varying needs for new Cassandra 5.0 clusters:
- Cassandra Developers: who want to push new features but face challenges due to backward compatibility constraints.
- Operators: who prefer stability and minimal disruption during upgrades.
- Evangelists and New Users: who seek the latest features and performance enhancements without worrying about compatibility.
Using cassandra_latest.yaml
Using cassandra_latest.yaml is straightforward. It involves copying the cassandra_latest.yaml content to your cassandra.yaml or pointing the cassandra.config JVM property to the cassandra_latest.yaml file.
This configuration is designed for new Cassandra 5.0 clusters (or those evaluating Cassandra), ensuring they get the most out of the latest features in Cassandra 5.0 and performance improvements.
Key changes and features
Key Cache Size
- Old: Evaluated as a minimum from 5% of the heap or 100MB
- Latest: Explicitly set to 0
Impact: Setting the key cache size to 0 in the latest configuration avoids performance degradation with the new SSTable format. This change is particularly beneficial for clusters using the new SSTable format, which doesn’t require key caching in the same way as the old format. Key caching was used to reduce the time it takes to find a specific key in Cassandra storage.
Commit Log Disk Access Mode
- Old: Set to legacy
- Latest: Set to auto
Impact: The auto setting optimizes the commit log disk access mode based on the available disks, potentially improving write performance. It can automatically choose the best mode (e.g., direct I/O) depending on the hardware and workload, leading to better performance without manual tuning.
Memtable Implementation
- Old: Skiplist-based
- Latest: Trie-based
Impact: The trie-based memtable implementation reduces garbage collection overhead and improves throughput by moving more metadata off-heap. This change can lead to more efficient memory usage and higher write performance, especially under heavy load.
create table … with memtable = {'class': 'TrieMemtable', … }
Memtable Allocation Type
- Old: Heap buffers
- Latest: Off-heap objects
Impact: Using off-heap objects for memtable allocation reduces the pressure on the Java heap, which can improve garbage collection performance and overall system stability. This is particularly beneficial for large datasets and high-throughput environments.
Trickle Fsync
- Old: False
- Latest: True
Impact: Enabling trickle fsync improves performance on SSDs by periodically flushing dirty buffers to disk, which helps avoid sudden large I/O operations that can impact read latencies. This setting is particularly useful for maintaining consistent performance in write-heavy workloads.
SSTable Format
- Old: big
- Latest: bti (trie-indexed structure)
Impact: The new BTI format is designed to improve read and write performance by using a trie-based indexing structure. This can lead to faster data access and more efficient storage management, especially for large datasets.
sstable: selected_format: bti default_compression: zstd compression: zstd: enabled: true chunk_length: 16KiB max_compressed_length: 16KiB
Default Compaction Strategy
- Old: STCS (Size-Tiered Compaction Strategy)
- Latest: Unified Compaction Strategy
Impact: The Unified Compaction Strategy (UCS) is more efficient and can handle a wider variety of workloads compared to STCS. UCS can reduce write amplification and improve read performance by better managing the distribution of data across SSTables.
default_compaction:
  class_name: UnifiedCompactionStrategy
  parameters:
    scaling_parameters: T4
    max_sstables_to_compact: 64
    target_sstable_size: 1GiB
    sstable_growth: 0.3333333333333333
    min_sstable_size: 100MiB
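UCS can also be applied per table rather than only as the yaml-wide default. A hedged sketch (the keyspace and table names here are made up):
ALTER TABLE ks.events
WITH compaction = {
  'class': 'UnifiedCompactionStrategy',
  'scaling_parameters': 'T4'   -- T-values give STCS-like tiered behavior; L-values give leveled behavior
};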
Concurrent Compactors
- Old: Defaults to the smaller of the number of disks and cores
- Latest: Explicitly set to 8
Impact: Setting the number of concurrent compactors to 8 ensures that multiple compaction operations can run simultaneously, helping to maintain read performance during heavy write operations. This is particularly beneficial for SSD-backed storage where parallel I/O operations are more efficient.
Default Secondary Index
- Old: legacy_local_table
- Latest: sai
Impact: SAI is a new index implementation that builds on the advancements made with the SSTable Attached Secondary Index (SASI). It provides a solution that lets users index multiple columns on the same table without suffering scaling problems, especially at write time.
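As a hedged illustration (table and column names are made up), two columns on the same table can each get an SAI index and then be combined in a single query:
-- with default_secondary_index: sai, a plain CREATE INDEX also builds an SAI index
CREATE INDEX IF NOT EXISTS users_country_idx ON ks.users (country) USING 'sai';
CREATE INDEX IF NOT EXISTS users_age_idx ON ks.users (age) USING 'sai';

-- both indexed columns can be combined in one query
SELECT * FROM ks.users WHERE country = 'AU' AND age > 30;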
Stream Entire SSTables
- Old: implicitly set to True
- Latest: explicitly set to True
Impact: When enabled, this permits Cassandra to zero-copy stream entire eligible SSTables between nodes, including every component. This speeds up network transfer significantly, subject to the throttling specified by entire_sstable_stream_throughput_outbound and, for inter-DC transfers, entire_sstable_inter_dc_stream_throughput_outbound.
UUID SSTable Identifiers
- Old: False
- Latest: True
Impact: Enabling UUID-based SSTable identifiers ensures that each SSTable has a unique name, simplifying backup and restore operations. This change reduces the risk of name collisions and makes it easier to manage SSTables in distributed environments.
Storage Compatibility Mode
- Old: Cassandra 4
- Latest: None
Impact: Setting the storage compatibility mode to none enables all new features by default, allowing users to take full advantage of the latest improvements, such as the new sstable format, in Cassandra. This setting is ideal for new clusters or those that do not need to maintain backward compatibility with older versions.
Testing and validation
The cassandra_latest.yaml configuration has undergone rigorous testing to ensure it works seamlessly. Currently, the Cassandra project CI pipeline tests both the standard (cassandra.yaml) and latest (cassandra_latest.yaml) configurations, ensuring compatibility and performance. This includes unit tests, distributed tests, and DTests.
Future improvements
Future improvements may include enforcing password strength policies and other security enhancements. The community is encouraged to suggest features that could be enabled by default in cassandra_latest.yaml.
Conclusion
The cassandra_latest.yaml configuration for new Cassandra 5.0 clusters is a significant step forward in making Cassandra more performant and feature-rich while maintaining the stability and reliability that users expect. Whether you are a developer, an operations professional, or an evangelist/end user, cassandra_latest.yaml offers something valuable for everyone.
Try it out
Ready to experience the incredible power of the cassandra_latest.yaml configuration on Apache Cassandra 5.0? Spin up your first cluster with a free trial on the Instaclustr Managed Platform and get started today with Cassandra 5.0!
Cassandra 5 Released! What's New and How to Try it
Apache Cassandra 5.0 has officially landed! This highly anticipated release brings a range of new features and performance improvements to one of the most popular NoSQL databases in the world. Having recently hosted a webinar covering the major features of Cassandra 5.0, I’m excited to give a brief overview of the key updates and show you how to easily get hands-on with the latest release using easy-cass-lab.
You can grab the latest release on the Cassandra download page.
Instaclustr for Apache Cassandra® 5.0 Now Generally Available
NetApp is excited to announce the general availability (GA) of Apache Cassandra® 5.0 on the Instaclustr Platform. This follows the release of the public preview in March.
NetApp was the first managed service provider to release the beta version, and now the Generally Available version, allowing the deployment of Cassandra 5.0 across the major cloud providers: AWS, Azure, and GCP, and on-premises.
Apache Cassandra has been a leader in NoSQL databases since its inception and is known for its high availability, reliability, and scalability. The latest version brings many new features and enhancements, with a special focus on building data-driven applications through artificial intelligence and machine learning capabilities.
Cassandra 5.0 will help you optimize performance, lower costs, and get started on the next generation of distributed computing by:
- Helping you build AI/ML-based applications through Vector Search
- Bringing efficiencies to your applications through new and enhanced indexing and processing capabilities
- Improving flexibility and security
With the GA release, you can use Cassandra 5.0 for your production workloads, which are covered by NetApp's industry-leading SLAs. NetApp has conducted performance benchmarking and extensive testing while removing the limitations that were present in the preview release to offer a more reliable and stable version. Our GA offering is suitable for all workload types as it contains the most up-to-date range of features, bug fixes, and security patches.
Support for continuous backups and private network add-ons is available. Currently, Debezium is not yet compatible with Cassandra 5.0. NetApp will work with the Debezium community to add support for Debezium on Cassandra 5.0, and it will be available on the Instaclustr Platform as soon as it is supported.
Some of the key new features in Cassandra 5.0 include:
- Storage-Attached Indexes (SAI): A highly scalable, globally distributed index for Cassandra databases. With SAI, column-level indexes can be added, leading to unparalleled I/O throughput for searches across different data types, including vectors. SAI also enables lightning-fast data retrieval through zero-copy streaming of indices, resulting in unprecedented efficiency.
- Vector Search: This is a powerful technique for searching relevant content or discovering connections by comparing similarities in large document collections, and it is particularly useful for AI applications. It uses storage-attached indexing and dense indexing techniques to enhance data exploration and analysis. (A short CQL sketch follows this list.)
- Unified Compaction Strategy: This strategy unifies compaction approaches, including leveled, tiered, and time-windowed strategies. It leads to a major reduction in SSTable sizes. Smaller SSTables mean better read and write performance, reduced storage requirements, and improved overall efficiency.
- Numerous stability and testing improvements: You can read all about these changes here.
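To make the Vector Search item above concrete, here is a minimal CQL sketch of the vector type, an SAI index over it, and an ANN query (the schema, the three-dimensional vectors, and the values are purely illustrative):
CREATE TABLE ks.docs (
  id int PRIMARY KEY,
  content text,
  embedding vector<float, 3>   -- real embeddings typically have hundreds of dimensions
);

CREATE INDEX IF NOT EXISTS docs_ann_idx ON ks.docs (embedding) USING 'sai';

-- approximate nearest-neighbour search over the embeddings
SELECT id, content
FROM ks.docs
ORDER BY embedding ANN OF [0.1, 0.4, 0.8]
LIMIT 2;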
All these new features are available out-of-the-box in Cassandra 5.0 and do not incur additional costs.
Our Development team has worked diligently to bring you a stable release of Cassandra 5.0. Substantial preparatory work was done to ensure you have a seamless experience with Cassandra 5.0 on the Instaclustr Platform. This includes updating the Cassandra YAML and Java environment and enhancing the monitoring capabilities of the platform to support new data types.
We also conducted extensive performance testing and benchmarked version 5.0 with the existing stable Apache Cassandra 4.1.5 version. We will be publishing our benchmarking results shortly; the highlight so far is that Cassandra 5.0 improves responsiveness by reducing latencies by up to 30% during peak load times.
Through our dedicated Apache Cassandra committer, NetApp has contributed to the development of Cassandra 5.0 by enhancing the documentation for new features like Vector Search (Cassandra-19030), enabling Materialized Views (MV) with only partition keys (Cassandra-13857), fixing numerous bugs, and contributing to the improvements for the unified compaction strategy feature, among many other things.
Lifecycle Policy Updates
As previously communicated, the project will no longer maintain Apache Cassandra 3.0 and 3.11 versions (full details of the announcement can be found on the Apache Cassandra website).
To help you transition smoothly, NetApp will provide extended support for these versions for an additional 12 months. During this period, we will backport any critical bug fixes, including security patches, to ensure the continued security and stability of your clusters.
Cassandra 3.0 and 3.11 versions will reach end-of-life on the Instaclustr Managed Platform within the next 12 months. We will work with you to plan and upgrade your clusters during this period.
Additionally, the Cassandra 5.0 beta version and the Cassandra 5.0 RC2 version, which were released as part of the public preview, are now end-of-life. You can check the lifecycle status of different Cassandra application versions here.
You can read more about our lifecycle policies on our website.
Getting Started
Upgrading to Cassandra 5.0 will allow you to stay current and start taking advantage of its benefits. The Instaclustr by NetApp Support team is ready to help customers upgrade clusters to the latest version.
- Wondering if it’s possible to upgrade your workloads from Cassandra 3.x to Cassandra 5.0? Find the answer to this and other similar questions in this detailed blog.
- Click here to read about Storage Attached Indexes in Apache Cassandra 5.0.
- Learn about 4 new Apache Cassandra 5.0 features to be excited about.
- Click here to learn what you need to know about Apache Cassandra 5.0.
Why Choose Apache Cassandra on the Instaclustr Managed Platform?
NetApp strives to deliver the best experience for its supported applications. Whether it’s the newest application versions available on the platform or additional platform enhancements, we ensure high quality through thorough testing before anything enters General Availability.
NetApp customers have the advantage of accessing the latest versions—not just the major version releases but also minor version releases—so that they can benefit from any new features and are protected from any vulnerabilities.
Don’t have an Instaclustr account yet? Sign up for a trial or reach out to our Sales team and start exploring Cassandra 5.0.
With more than 375 million node hours of management experience, Instaclustr offers unparalleled expertise. Visit our website to learn more about the Instaclustr Managed Platform for Apache Cassandra.
If you would like to upgrade your Apache Cassandra version or have any issues or questions about provisioning your cluster, please contact Instaclustr Support at any time.
The post Instaclustr for Apache Cassandra® 5.0 Now Generally Available appeared first on Instaclustr.
Apache Cassandra® 5.0: Behind the Scenes
Here at NetApp, our Instaclustr product development team has spent nearly a year preparing for the release of Apache Cassandra 5.
What started with one engineer tinkering at night with the Apache Cassandra 5 Alpha branch grew to five engineers working on various monitoring, configuration, testing, and functionality improvements to integrate the release with the Instaclustr Platform.
It’s been a long journey to the point we are at today, offering Apache Cassandra 5 Release Candidate 1 in public preview on the Instaclustr Platform.
Note: the Instaclustr team has a dedicated open source committer to the Apache Cassandra project. His changes are not covered in this document, as there were simply too many to list. Instead, this blog primarily focuses on the engineering effort to release Cassandra 5.0 onto the Instaclustr Managed Platform.
August 2023: The Beginning
We began experimenting with the Apache Cassandra 5 Alpha 1 branches using our build systems. There were several tools we built into our Apache Cassandra images that were not working at this point, but we managed to get a node to start even though it immediately crashed with errors.
One of our early achievements was identifying and fixing a bug that impacted our packaging solution; this resulted in a small contribution to the project allowing Apache Cassandra to be installed on Debian systems with non-OpenJDK Java.
September 2023: First Milestone
The release of the Alpha 1 version allowed us to achieve our first running Cassandra 5 cluster in our development environments (without crashing!).
Basic core functionalities like user creation, data writing, and backups/restores were tested successfully. However, several advanced features, such as repair and replace tooling, monitoring, and alerting were still untested.
At this point we had to pause our Cassandra 5 efforts to focus on other priorities and planned to get back to testing Cassandra 5 after Alpha 2 was released.
November 2023: Further Testing and Internal Preview
The project released Alpha 2, and we repeated the same build and test process we had used for Alpha 1. We also tested some more advanced procedures, like cluster resizes, with no issues.
We also started testing with some of the new 5.0 features: Vector Data types and Storage-Attached Indexes (SAI), which resulted in another small contribution.
We launched Apache Cassandra 5 Alpha 2 for internal preview (basically for internal users). This allowed the wider Instaclustr team to access and use the Alpha on the platform.
During this phase we found a bug in our metrics collector that was triggered when vector data types were encountered; fixing it ended up being a major project for us.
If you see errors like the one below, it’s time to upgrade your Java Cassandra driver to 4.16 or newer:
java.lang.IllegalArgumentException: Could not parse type name vector<float, 5>
Nov 15 22:41:04 ip-10-0-39-7 process[1548]: at com.datastax.driver.core.DataTypeCqlNameParser.parse(DataTypeCqlNameParser.java:233)
Nov 15 22:41:04 ip-10-0-39-7 process[1548]: at com.datastax.driver.core.TableMetadata.build(TableMetadata.java:311)
Nov 15 22:41:04 ip-10-0-39-7 process[1548]: at com.datastax.driver.core.SchemaParser.buildTables(SchemaParser.java:302)
Nov 15 22:41:04 ip-10-0-39-7 process[1548]: at com.datastax.driver.core.SchemaParser.refresh(SchemaParser.java:130)
Nov 15 22:41:04 ip-10-0-39-7 process[1548]: at com.datastax.driver.core.ControlConnection.refreshSchema(ControlConnection.java:417)
Nov 15 22:41:04 ip-10-0-39-7 process[1548]: at com.datastax.driver.core.ControlConnection.refreshSchema(ControlConnection.java:356)
<Rest of stacktrace removed for brevity>
December 2023: Focus on new features and planning
As the project released Beta 1, we began focusing on the features in Cassandra 5 that we thought were the most exciting and would provide the most value to customers. There are a lot of awesome new features and changes, so it took a while to find the ones with the largest impact.
The final list of high impact features we came up with was:
- A new data type – Vectors
- Trie memtables/Trie Indexed SSTables (BTI Formatted SStables)
- Storage-Attached Indexes (SAI)
- Unified Compaction Strategy
A major new feature we considered deploying was support for JDK 17. However, due to its experimental nature, we opted to postpone adoption; we plan to support running Apache Cassandra on JDK 17 once it is out of the experimental phase.
Once the holiday season arrived, it was time for a break, and we were back in force in February next year.
February 2024: Intensive testing
In February, we released Beta 1 into internal preview so we could start testing it on our Preproduction test environments. As we started to do more intensive testing, we discovered issues in the interaction with our monitoring and provisioning setup.
We quickly fixed the issues identified as showstoppers for launching Cassandra 5. By the end of February, we initiated discussions about a public preview release. We also started to add more resourcing to the Cassandra 5 project; until then, only one person had been working on it.
Next, we broke down the work we needed to do. This included identifying monitoring agents that required upgrades and configuration defaults that needed to change.
From this point, the project split into 3 streams of work:
- Project Planning – Deciding how all this work gets pulled together cleanly, ensuring other work streams have adequate resourcing to hit their goals, and informing product management and the wider business of what’s happening.
- Configuration Tuning – Focusing on the new features of Apache Cassandra to include, how to approach the transition to JDK 17, and how to use BTI formatted SSTables on the platform.
- Infrastructure Upgrades – Identifying what to upgrade internally to handle Cassandra 5, including Vectors and BTI formatted SSTables.
A Senior Engineer was responsible for each workstream to ensure planned timeframes were achieved.
March 2024: Public Preview Release
In March, we launched Beta 1 into public preview on the Instaclustr Managed Platform. The initial release did not contain any opt-in features like Trie-indexed SSTables.
However, this gave us a consistent base to test in our development, test, and production environments, and proved our release pipeline for Apache Cassandra 5 was working as intended. This also gave customers the opportunity to start using Apache Cassandra 5 with their own use cases and environments for experimentation.
See our public preview launch blog for further details.
There was not much time to celebrate as we continued working on infrastructure and refining our configuration defaults.
April 2024: Configuration Tuning and Deeper Testing
The first configuration updates were completed for Beta 1, and we started performing deeper functional and performance testing. We identified a few issues from this effort and remediated them. This default configuration was applied to all Beta 1 clusters moving forward.
This allowed users to start testing Trie Indexed SSTables and Trie memtables in their environment by default.
"memtable": { "configurations": { "skiplist": { "class_name": "SkipListMemtable" }, "sharded": { "class_name": "ShardedSkipListMemtable" }, "trie": { "class_name": "TrieMemtable" }, "default": { "inherits": "trie" } } }, "sstable": { "selected_format": "bti" }, "storage_compatibility_mode": "NONE",
The configuration above shows Apache Cassandra settings where BTI-formatted SSTables are used by default (which enables Trie-indexed SSTables) and Trie is the default memtable implementation. You can override the memtable per table by referencing one of the named configurations (the column definition below is just a placeholder):
CREATE TABLE test (id int PRIMARY KEY) WITH memtable = 'sharded';
Note that you need to set storage_compatibility_mode to NONE to use BTI formatted sstables. See Cassandra documentation for more information.
You can also reference the cassandra_latest.yaml file for the latest settings (please note you should not apply these to existing clusters without rigorous testing).
May 2024: Major Infrastructure Milestone
We hit a very large infrastructure milestone when we released an upgrade to some of our core agents that were reliant on an older version of the Apache Cassandra Java driver. The upgrade to version 4.17 allowed us to start supporting vectors in certain keyspace-level monitoring operations.
At the time, this was considered the riskiest part of the entire project, as we had thousands of nodes to upgrade across many different customer environments. The upgrade took a few weeks, finishing in June. We broke the release up into 4 separate rollouts to reduce the risk of introducing issues into our fleet, focusing each rollout on a single key component of our architecture. Each release had quality gates and tested rollback plans, which in the end were not needed.
June 2024: Successful Rollout of the New Cassandra Driver
The Java driver upgrade was rolled out to all nodes in our fleet with no issues encountered. At this point we had hit all the major milestones ahead of the Release Candidate builds becoming available, and we started looking at which testing systems to update to use Apache Cassandra 5 by default.
July 2024: Path to Release Candidate
We upgraded our internal testing systems to use Cassandra 5 by default, meaning our nightly platform tests began running against Cassandra 5 clusters and our production releases would be smoke tested using Apache Cassandra 5. We also started testing the upgrade path for clusters from 4.x to 5.0, which resulted in another small contribution to the Cassandra project.
The Apache Cassandra project released Apache Cassandra 5 Release Candidate 1 (RC1), and we launched RC1 into public preview on the Instaclustr Platform.
The Road Ahead to General Availability
We’ve just launched Apache Cassandra 5 Release Candidate 1 (RC1) into public preview, and there’s still more to do before we reach General Availability for Cassandra 5, including:
- Upgrading our own preproduction Apache Cassandra for internal use to Apache Cassandra 5 Release Candidate 1. This means we’ll be testing using our real-world use cases and testing our upgrade procedures on live infrastructure.
At Launch:
When Apache Cassandra 5.0 launches, we will perform another round of testing, including performance benchmarking. We will also upgrade our internal metrics storage production Apache Cassandra clusters to 5.0, and, if the results are satisfactory, we will mark the release as generally available for our customers. We want to have full confidence in running 5.0 before we recommend it for production use to our customers.
For more information about our own usage of Cassandra for storing metrics on the Instaclustr Platform check out our series on Monitoring at Scale.
What Have We Learned From This Project?
- Releasing limited, small, and frequent changes has resulted in a smooth project, even if frequent releases do not always feel smooth. Some thoughts:
- Releasing to a small subset of internal users allowed us to take risks and break things more often so we could learn from our failures safely.
- Releasing small changes allowed us to more easily understand and predict the behaviour of our changes: what to look out for in case things went wrong, how to more easily measure success, etc.
- Releasing frequently built confidence within the wider Instaclustr team, which in turn meant we would be happier taking more risks and could release more often.
- Releasing to internal and public preview helped create momentum within the Instaclustr business and teams:
- This turned the Apache Cassandra 5.0 release from something that “was coming soon and very exciting” to “something I can actually use.”
- Communicating frequently, transparently, and efficiently is the foundation of success:
- We used a dedicated Slack channel (very creatively named #cassandra-5-project) to discuss everything.
- It was quick and easy to go back to see why we made certain decisions or revisit them if needed. This had a bonus of allowing a Lead Engineer to write a blog post very quickly about the Cassandra 5 project.
This has been a long-running but very exciting project for the entire team here at Instaclustr. The Apache Cassandra community is on the home stretch for this massive release, and we couldn’t be more excited to start seeing what everyone will build with it.
You can sign up today for a free trial and test Apache Cassandra 5 Release Candidate 1 by creating a cluster on the Instaclustr Managed Platform.
More Readings
- The Top 5 Questions We’re Asked about Apache Cassandra 5.0
- Vector Search in Apache Cassandra 5.0
- How Does Data Modeling Change in Apache Cassandra 5.0?
The post Apache Cassandra® 5.0: Behind the Scenes appeared first on Instaclustr.
Will Your Cassandra Database Project Succeed?: The New Stack
Open source Apache Cassandra® continues to stand out as an enterprise-proven solution for organizations seeking high availability, scalability and performance in a NoSQL database. (And hey, the brand-new 5.0 version is only making those statements even more true!) There’s a reason this database is trusted by some of the world’s largest and most successful companies.
That said, effectively harnessing the full spectrum of Cassandra’s powerful advantages can mean overcoming a fair share of operational complexity. Some folks will find a significant learning curve, and knowing what to expect is critical to success. In my years of experience working with Cassandra, it’s when organizations fail to anticipate and respect these challenges that they set the stage for their Cassandra projects to fall short of expectations.
Let’s look at the key areas where strong project management and following proven best practices will enable teams to evade common pitfalls and ensure a Cassandra implementation is built strong from Day 1.
Accurate Data Modeling Is a Must
Cassandra projects require a thorough understanding of its unique data model principles. Teams that approach Cassandra like a relational database are unlikely to model data properly. This can lead to poor performance, excessive use of secondary indexes and significant data consistency issues.
On the other hand, teams that develop familiarity with Cassandra’s specific NoSQL data model will understand the importance of including partition keys, clustering keys and denormalization. These teams will know to closely analyze query and data access patterns associated with their applications and know how to use that understanding to build a Cassandra data model that matches their application’s needs step for step.
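For illustration only (the table and column names here are hypothetical), a time-series table modeled around the query “fetch one sensor’s readings for a time window” might look like this minimal CQL sketch:

-- Partition by sensor_id so each sensor's data lives together and spreads
-- evenly across the cluster; cluster by reading_ts so a time window is a
-- single, ordered slice of one partition
CREATE TABLE sensor_readings (
    sensor_id   uuid,
    reading_ts  timestamp,
    temperature double,
    PRIMARY KEY ((sensor_id), reading_ts)
) WITH CLUSTERING ORDER BY (reading_ts DESC);

SELECT reading_ts, temperature
FROM sensor_readings
WHERE sensor_id = 123e4567-e89b-12d3-a456-426614174000
  AND reading_ts >= '2024-06-01' AND reading_ts < '2024-06-02';

The table is shaped by the application’s access pattern first, with denormalization accepted as the cost of fast, predictable reads.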
Configure Cassandra Clusters the Right Way
Accurate, expertly managed cluster configurations are pivotal to the success of Cassandra implementations. Get those cluster settings wrong and Cassandra can suffer from data inconsistencies and performance issues due to inappropriate node capacities, poor partitioning or replication strategies that aren’t up to the task.
Teams should understand the needs of their particular use case and how each cluster configuration setting affects Cassandra’s abilities to serve that use case. Attuning configurations to best support your application — including the right settings for node capacity, data distribution, replication factor and consistency levels — will ensure that you can harness the full power of Cassandra when it counts.
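As one small sketch of this (the keyspace and data center names are hypothetical), a keyspace created with an explicit per-datacenter replication factor looks like:

-- NetworkTopologyStrategy places replicas with awareness of racks and data centers;
-- a replication factor of 3 per DC is a common production starting point
CREATE KEYSPACE IF NOT EXISTS telemetry
WITH replication = {
    'class': 'NetworkTopologyStrategy',
    'DC1': 3,
    'DC2': 3
};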
Take Advantage of Tunable Consistency
Cassandra gives teams the option to leverage the best balance of data consistency and availability for their use case. While these tunable consistency levels are a valuable tool in the right hands, teams that don’t understand the nuances of these controls can saddle their applications with painful latency and troublesome data inconsistencies.
Teams that learn to operate Cassandra’s tunable consistency levels properly and carefully assess their application’s needs — especially with read and write patterns, data sensitivity and the ability to tolerate eventual consistency — will unlock far more beneficial Cassandra experiences.
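As a small cqlsh sketch (reusing the hypothetical sensor_readings table above), the session-level consistency command shows the trade-off involved:

-- With RF=3 per DC, LOCAL_QUORUM waits for 2 replicas in the local DC,
-- trading a little latency for stronger consistency than ONE
CONSISTENCY LOCAL_QUORUM;

SELECT reading_ts, temperature
FROM sensor_readings
WHERE sensor_id = 123e4567-e89b-12d3-a456-426614174000
LIMIT 1;

Drivers expose the same setting per statement, so read-heavy and write-heavy paths can be tuned independently.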
Perform Regular Maintenance
Regular Cassandra maintenance is required to stave off issues such as data inconsistencies and performance drop-offs. Within their Cassandra operational procedures, teams should routinely perform compaction, repair and node-tool operations to prevent challenges down the road, while ensuring cluster health and performance are optimized.
Anticipate Capacity and Scaling Needs
By its nature, success will yield new needs. Be prepared for your Cassandra cluster to grow and scale well into the future — that is what this database is built to do. Starving your Cassandra cluster for CPU, RAM and storage resources because you don’t have a plan to seamlessly add capacity is a way of plucking failure from the jaws of success. Poor performance, data loss and expensive downtime are the rewards for growing without looking ahead.
Plan for growth and scalability from the beginning of your Cassandra implementation. Practice careful capacity planning. Look at your data volumes, write/read patterns and performance requirements today and tomorrow. Teams whose clusters are built for growth will be able to scale far more easily and affordably.
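To make that concrete with purely hypothetical numbers: a workload of 20,000 writes per second at roughly 1 KB each produces about 1.7 TB of new data per day, and with a replication factor of 3 that is roughly 5 TB per day across the cluster before compression and compaction overhead. Working through this kind of arithmetic against your own volumes makes it far easier to decide when and how you will add nodes.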
Make Changes With a Careful Testing/Staging/Prod Process
Teams that think they’re streamlining their process efficiency by putting Cassandra changes straight into production actually enable a pipeline for bugs, performance roadblocks and data inconsistencies. Testing and staging environments are essential for validating changes before putting them into production environments and will save teams countless hours of headaches.
At the end of the day, running all data migrations, changes to schema and application updates through testing and staging environments is far more efficient than putting them straight into production and then cleaning up myriad live issues.
Set Up Monitoring and Alerts
Teams implementing monitoring and alerts to track metrics and flag anomalies can mitigate trouble spots before they become full-blown service interruptions. The speed at which teams become aware of issues can mean the difference between a behind-the-scenes blip and a downtime event.
Have Backup and Disaster Recovery at the Ready
In addition to standing up robust monitoring and alerting, teams should regularly test and run practice drills on their procedures for recovering from disasters and using data backups. Don’t neglect this step; these measures are absolutely essential for ensuring the safety and resilience of systems and data.
The less prepared an organization is to recover from issues, the longer and more costly and impactful downtime will be. Incremental or snapshot backup strategies, replication that’s based in the cloud or across multiple data centers and fine-tuned recovery processes should be in place to minimize downtime, stress and confusion whenever the worst occurs.
Nurture Cassandra Expertise
The expertise required to optimize Cassandra configurations, operations and performance will only come with a dedicated focus. Enlisting experienced talent, instilling continuous training regimens that keep up with Cassandra updates, turning to external support and ensuring available resources — or all of the above — will position organizations to succeed in following the best practices highlighted here and achieving all of the benefits that Cassandra can deliver.
The post Will Your Cassandra Database Project Succeed?: The New Stack appeared first on Instaclustr.
Use Your Data in LLMs With the Vector Database You Already Have: The New Stack
Open source vector databases are among the top options out there for AI development, including some you may already be familiar with or even have on hand.
Vector databases allow you to enhance your LLM models with data from your internal data stores. Prompting the LLM with local, factual knowledge can allow you to get responses tailored to what your organization already knows about the situation. This reduces “AI hallucination” and improves relevance.
You can even ask the LLM to add references to the original data it used in its answer so you can check yourself. No doubt vendors have reached out with proprietary vector database solutions, advertised as a “magic wand” enabling you to assuage any AI hallucination concerns.
But, ready for some good news?
If you’re already using Apache Cassandra 5.0, OpenSearch or PostgreSQL, your vector database success is already primed. That’s right: there’s no need for costly proprietary vector database offerings. If you’re not (yet) using these free and fully open source database technologies, your generative AI aspirations make now a good time to migrate; they are all enterprise-ready and avoid the pitfalls of proprietary systems.
For many enterprises, these open source vector databases are the most direct route to implementing LLMs — and possibly leveraging retrieval augmented generation (RAG) — that deliver tailored and factual AI experiences.
Vector databases store embedding vectors, which are lists of numbers representing spatial coordinates corresponding to pieces of data. Related data will have closer coordinates, allowing LLMs to make sense of complex and unstructured datasets for features such as generative AI responses and search capabilities.
RAG, a process skyrocketing in popularity, involves using a vector database to translate the words in an enterprise’s documents into embeddings to provide highly efficient and accurate querying of that documentation via LLMs.
Let’s look closer at what each open source technology brings to the vector database discussion:
Apache Cassandra 5.0 Offers Native Vector Indexing
With its latest version (currently in preview), Apache Cassandra has added to its reputation as a highly available and scalable open source database by including everything that enterprises developing AI applications require.
Cassandra 5.0 adds native vector indexing and vector search, as well as a new vector data type for embedding vector storage and retrieval. The new version has also added specific Cassandra Query Language (CQL) functions that enable enterprises to easily use Cassandra as a vector database. These additions make Cassandra 5.0 a smart open source choice for supporting AI workloads and executing enterprise strategies around managing intelligent data.
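A minimal CQL sketch of what this looks like in practice (the table, index, and column names are hypothetical, and the vector dimension is kept tiny purely for illustration):

-- New vector data type for storing embeddings
CREATE TABLE docs (
    id        uuid PRIMARY KEY,
    body      text,
    embedding vector<float, 3>
);

-- Storage-attached index enabling approximate nearest-neighbor search
CREATE INDEX docs_embedding_ann ON docs (embedding) USING 'sai';

-- Return the five documents whose embeddings are closest to the query vector
SELECT id, body FROM docs
ORDER BY embedding ANN OF [0.12, 0.07, 0.33]
LIMIT 5;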
OpenSearch Provides a Combination of Benefits
Like Cassandra, OpenSearch is another highly popular open source solution, one that many folks on the lookout for a vector database happen to already be using. OpenSearch offers a one-stop shop for search, analytics and vector database capabilities, with exceptional nearest-neighbor search that supports vector, lexical, and hybrid queries.
With OpenSearch, teams can put the pedal down on developing AI applications, counting on the database to deliver the stability, high availability and minimal latency it’s known for, along with the scalability to account for vectors into the tens of billions. Whether developing a recommendation engine, generative AI agent or any other solution where the accuracy of results is crucial, those using OpenSearch to leverage vector embeddings and stamp out hallucinations won’t be disappointed.
The pgvector Extension Makes Postgres a Powerful Vector Store
Enterprises are no strangers to Postgres, which ranks among the most used databases in the world. Given that the database only needs the pgvector extension to become a particularly performant vector database, countless organizations are just a simple deployment away from harnessing an ideal infrastructure for handling their intelligent data.
pgvector is especially well-suited to exact nearest-neighbor search, approximate nearest-neighbor search and distance-based embedding search, using cosine distance (as recommended by OpenAI), L2 distance and inner product to recognize semantic similarities. Efficiency with those capabilities makes pgvector a powerful and proven open source option for training accurate LLMs and RAG implementations, while positioning teams to deliver trustworthy AI applications they can be proud of.
Was the Answer to Your AI Challenges in Front of You All Along?
The solution to tailored LLM responses isn’t investing in some expensive proprietary vector database and then trying to dodge the very real risks of vendor lock-in or a bad fit. At least it doesn’t have to be. Recognizing that available open source vector databases are among the top options out there for AI development — including some you may already be familiar with or even have on hand — should be a very welcome revelation.
The post Use Your Data in LLMs With the Vector Database You Already Have: The New Stack appeared first on Instaclustr.
easy-cass-lab v5 released
I’ve got some fun news to start the week off for users of easy-cass-lab: I’ve just released version 5. There are a number of nice improvements and bug fixes in here that should make it more enjoyable and more useful, and that lay the groundwork for some future enhancements.
- When the cluster starts, we wait for the storage service to reach NORMAL state, then move to the next node. This is in contrast to the previous behavior where we waited for 2 minutes after starting a node. This queries JMX directly using Swiss Java Knife and is more reliable than the 2-minute method. Please see packer/bin-cassandra/wait-for-up-normal to read through the implementation.
- Trunk now works correctly. Unfortunately, AxonOps doesn’t support trunk (5.1) yet, and using the agent was causing a startup error. You can test trunk out, but for now the AxonOps integration is disabled.
- Added a new repl mode. This saves keystrokes, provides some auto-complete functionality, and keeps SSH connections open. If you’re going to do a lot of work with ECL this will help you be a little more efficient. You can try this out with ecl repl.
- Power user feature: Initial support for profiles in AWS regions other than us-west-2. We only provide AMIs for us-west-2, but you can now set up a profile in an alternate region and build the required AMIs using easy-cass-lab build-image. This feature is still under development and requires using an easy-cass-lab build from source. Credit to Jordan West for contributing this work.
- Power user feature: Support for multiple profiles. Setting the EASY_CASS_LAB_PROFILE environment variable allows you to configure alternate profiles. This is handy if you want to use multiple regions or have multiple organizations.
- The project now uses Kotlin instead of Groovy for Gradle configuration.
- Updated Gradle to 8.9.
- When using the list command, don’t show the alias “current”.
- Project cleanup: removed the old, unused pssh, cassandra build, and async profiler subprojects.
The release is available on the project’s GitHub page and via Homebrew. The project is largely driven by my own consulting needs and my training. If you’re looking to have some features prioritized, please reach out and we can discuss a consulting engagement.
easy-cass-lab updated with Cassandra 5.0 RC-1 Support
I’m excited to announce that the latest version of easy-cass-lab now supports Cassandra 5.0 RC-1, which was just made available last week! This update marks a significant milestone, providing users with the ability to test and experiment with the newest Cassandra 5.0 features in a simplified manner. This post will walk you through how to set up a cluster, SSH in, and run your first stress test.
For those new to easy-cass-lab, it’s a tool designed to streamline the setup and management of Cassandra clusters in AWS, making it accessible for both new and experienced users. Whether you’re running tests, developing new features, or just exploring Cassandra, easy-cass-lab is your go-to tool.
easy-cass-lab now available in Homebrew
I’m happy to share some exciting news for all Cassandra enthusiasts! My open source project, easy-cass-lab, is now installable via a homebrew tap. This powerful tool is designed to make testing any major version of Cassandra (or even builds that haven’t been released yet) a breeze, using AWS. A big thank-you to Jordan West who took the time to make this happen!
What is easy-cass-lab?
easy-cass-lab is a versatile testing tool for Apache Cassandra. Whether you’re dealing with the latest stable releases or experimenting with unreleased builds, easy-cass-lab provides a seamless way to test and validate your applications. With easy-cass-lab, you can ensure compatibility and performance across different Cassandra versions, making it an essential tool for developers and system administrators. easy-cass-lab is used extensively for my consulting engagements, my training program, and to evaluate performance patches destined for open source Cassandra. Here are a few examples:
Cassandra Training Signups For July and August Are Open!
I’m pleased to announce that I’ve opened training signups for Operator Excellence to the public for July and August. If you’re interested in stepping up your game as a Cassandra operator, this course is for you. Head over to the training page to find out more and sign up for the course.
Streaming My Sessions With Cassandra 5.0
As a long time participant with the Cassandra project, I’ve witnessed firsthand the evolution of this incredible database. From its early days to the present, our journey has been marked by continuous innovation, challenges, and a relentless pursuit of excellence. I’m thrilled to share that I’ll be streaming several working sessions over the next several weeks as I evaluate the latest builds and test out new features as we move toward the 5.0 release.
Streaming Cassandra Workloads and Experiments
Streaming
In the world of software engineering, especially within the realm of distributed systems, continuous learning and experimentation are not just beneficial; they’re essential. As a software engineer with a focus on distributed systems, particularly Apache Cassandra, I’ve taken this ethos to heart. My journey has led me to not only explore the intricacies of Cassandra’s distributed architecture but also to share my experiences and findings with a broader audience. This is why my YouTube channel has become an active platform where I stream at least once a week, engaging with viewers through coding sessions, trying new approaches, and benchmarking different Cassandra workloads.