Scylla Open Source Release 3.3

Scylla Open Source Release Notes

The Scylla team is pleased to announce the availability of Scylla Open Source 3.3.0, a production-ready release of our open source NoSQL database. Scylla 3.3 is focused on bug fixes and stability.

Scylla 3.3 includes major new or modified experimental features, including Lightweight Transactions (LWT), Change Data Capture (CDC), our Amazon DynamoDB-compatible API (Alternator), and Lua-based User Defined Functions (UDFs). These features are experimental; you are welcome to try them out and share feedback. Once stabilized, they will be promoted to General Availability (GA) in a follow-up Scylla release (see more below).

Scylla is an open source, Apache-Cassandra-compatible NoSQL database with superior performance and consistently low latencies. Find the Scylla Open Source 3.3 repository for your Linux distribution here.

Please note that only the last two minor releases of the Scylla Open Source project are supported. Starting today, only Scylla Open Source 3.3 and 3.2 will be supported; Scylla Open Source 3.1 will no longer be supported.


New features in Scylla 3.3

  • Snapshot enhancement: a table schema file, schema.cql, is now included in each Scylla snapshot created with “nodetool snapshot”. The schema is required as part of the Scylla backup restore procedure. #4192
  • Connection virtual table #4820
    The new system.clients virtual table provides information about the CQL clients currently connected to Scylla (see the example query after this list).
    Client information includes the address, port, type, shard, protocol_version, and username.
  • Stability: Large collections are now more resistant to memory fragmentation
  • Stability: the scylla-kernel-conf package tunes the kernel for Scylla’s needs. It now also tunes vm.swappiness, to reduce the probability of the kernel swapping out Scylla memory and introducing stalls.
  • Export system uptime via REST endpoint /system/uptime_ms
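
As a quick illustration (a minimal sketch; the exact columns are listed in the connection virtual table item above), the new system.clients table can be queried like any other system table:

SELECT * FROM system.clients;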

The full list of bug fixes is available in the Git log and in the RC1, RC2, and RC3 notes.

Metrics changes from Scylla 3.2 to Scylla 3.3

Full list of metrics is available here

For Scylla 3.3 Dashboard, make sure to use Scylla Monitoring Stack release 3.2

Experimental features in Scylla 3.3

User Defined Functions (UDFs) #2204

Scylla now has basic support for Lua-based UDFs. Like Apache Cassandra’s UDFs, Scylla UDFs provide server-side calculated functions, but they are implemented in Lua.

Example:

CREATE FUNCTION twice(val int)
RETURNS NULL ON NULL INPUT
RETURNS int
LANGUAGE Lua
AS 'return 2 * val';
SELECT twice(key) from tbl;

Please note that UDFs are under development; they are still missing integration with RBAC, Scylla tracing, and Scylla monitoring, as well as further performance improvements. More on UDAFs and UDAs here.

The following experimental features were already available in Scylla 3.2, and we continue to improve and stabilize them toward production in one of the following Scylla releases:

Lightweight Transactions (LWT) (Experimental) #1359

Lightweight transactions (LWT), also known as Compare and Set (CAS), add support for conditional INSERT and UPDATE CQL commands. Scylla supports both equality and non-equality conditions for lightweight transactions (i.e., you can use <, <=, >, >=, != and IN operators in an IF clause).
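
For example, here is a minimal sketch of a conditional insert and a conditional update against a hypothetical employees table (the table and column names are illustrative only):

INSERT INTO employees (id, name, salary) VALUES (1, 'Ann', 100000) IF NOT EXISTS;
UPDATE employees SET salary = 110000 WHERE id = 1 IF salary < 105000;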

Updates from Scylla 3.2:

  • LWT performance optimization: the lightweight transaction query phase was optimized to reduce the number of round trips.
  • LWT safety: the commitlog can now wait for specific entries to be flushed to disk.

You can learn more about LWT in Scylla and LWT optimizations from the latest LWT webinar (registration required).

Change Data Capture (CDC) (Experimental) #4985

Change Data Capture is a feature that enables users to monitor data in a selected table and asynchronously consume the change events. You can enable CDC per table.
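
As a rough sketch (assuming a hypothetical ks.orders table; the CDC options and log-table naming are experimental and may change), CDC is enabled through a table option, and the change log can then be read with regular CQL:

ALTER TABLE ks.orders WITH cdc = {'enabled': true, 'preimage': true};
SELECT * FROM ks.orders_scylla_cdc_log;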

Updates from Scylla 3.2:

  • Better support for schema changes
  • Support for rolling upgrade

Note that CDC has an impact on performance, as each update to a CDC enabled table is written to two tables (the original table and the CDC table). Enabling the preimage option affects performance further, as it includes read-before-update. CDC is still under continuous development, and the API and syntax are likely to change.

To learn more about the CDC capabilities register for our upcoming webinar on March 26th, 2020.

Scylla Alternator: The Open Source DynamoDB-compatible API (Experimental)

Project Alternator is an open-source implementation of an Amazon DynamoDB™-compatible API. The goal of this project is to deliver an open source alternative to Amazon’s DynamoDB, deployable wherever a user would want: on-premises, on other public clouds like Microsoft Azure or Google Cloud Platform, or on AWS itself (for users who wish to take advantage of other aspects of Amazon’s market-leading cloud ecosystem, such as the high-density i3en instances). DynamoDB users can keep their client code unchanged. Alternator is written in C++ and is included in Scylla 3.3.

Updates from Scylla 3.2:

  • Correct support for all operators for the “Expected” parameter (previously some were unimplemented, some cases had bugs). #5034
  • The DescribeTable operation now returns the table’s schema.
  • Produce clear errors on unimplemented features (previously, some of them were silently ignored).

For the latest Alternator status: https://github.com/scylladb/scylla/blob/master/docs/alternator/alternator.md

Want to learn more about what you can do with Scylla? Tune in to discover the capabilities of Change Data Capture (CDC) in Scylla in our upcoming webinar on March 26th, 2020:

REGISTER NOW FOR OUR CDC WEBINAR


Scylla Enterprise Release 2019.1.6

Scylla Enterprise Release Notes

The ScyllaDB team announces the release of Scylla Enterprise 2019.1.6, a production-ready Scylla Enterprise patch release. As always, Scylla Enterprise customers are encouraged to upgrade to Scylla Enterprise 2019.1.6 in coordination with the Scylla support team.

The focus of Scylla Enterprise 2019.1.6 is improving stability and bug fixes. More below.


Fixed issues in this release are listed below, with open source references, if present:

  • Stability: a new option allows the scrub operation to purge corrupt data from its output by skipping over it.
  • Stability: the write-path validator adds more checks to the Scylla write path, identifying potential file format issues as soon as possible.
  • Stability: Possible heap-buffer-overflow when stopping the gossip service #5701
  • Stability: a rare race condition in range scans might cause the scan to read some data twice, triggering data validation errors which cause Scylla to exit.


Scylla Monitoring Stack 3.2

Scylla Monitoring Stack Release Notes

The Scylla team is pleased to announce the release of Scylla Monitoring Stack 3.2.

Scylla Monitoring Stack is an open-source stack for monitoring Scylla Enterprise and Scylla Open Source, based on Prometheus and Grafana. Scylla Monitoring Stack 3.2 supports:

  • Scylla Open Source versions 3.1, 3.2 and the upcoming 3.3
  • Scylla Enterprise versions 2018.x and 2019.x
  • Scylla Manager 1.4.x and Scylla Manager 2.0.x


New in Scylla Monitoring Stack 3.2

  • Add annotations for Scylla Manager tasks – The Manager dashboard now has annotations for the beginning and termination of its tasks (repair and backup)

  • Show all possible I/O queue classes: the I/O dashboard is now split into sections, each containing all possible I/O queue classes. This will support dynamic classes that are being added to the Enterprise release.

  • New Scylla Open Source 3.3 dashboards
  • Move to Grafana 6.6. More on the Grafana release can be found here
  • Move to Prometheus 2.15.1. More on the Prometheus release can be found here
  • Grafana security – support for specifying the default role. This allows you to start the monitoring stack with a restricted default role.
    For example: start-all.sh -Q Viewer will start Grafana with the default role set to Viewer.
    This is useful when your dashboards are open to a public audience, in which case you prefer that users not be able to make changes to the dashboards.
  • A new BYPASS CACHE panel in the CQL dashboard, showing the number of queries that use BYPASS CACHE (see the example query after this list)

  • New Hinted Handoff panel in the detailed dashboard.
  • A new annotation for reactor stalls. Scylla reactors report any stalls larger than a preset value (command line parameter blocked-reactor-notify-ms, default 100ms)

  • A new panel for ignored-future metrics in the Error dashboard.
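
For reference, a query using BYPASS CACHE looks like the following minimal sketch (the table name is hypothetical); the new panel counts such queries:

SELECT * FROM ks.events WHERE id = 1 BYPASS CACHE;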

Bug Fixes

  • Bring back help strings that were removed by mistake #841
  • Compaction shares graph showed wrong results #817


What’s New at Scylla University? March 2020 Update

A lot has happened at Scylla University since our last update. Here’s what’s new.

Summit Training Day

For the first time ever, all the material covered at our Scylla Summit training day, in both the Novice and Advanced tracks, is available online at Scylla University. Using hands-on exercises and quiz questions, trainees were able to improve their Scylla and NoSQL skills. You can read more about it here.

Tomer Sandler, ScyllaDB Technical Customer Success Manager, talks about migration strategies with attendees at Scylla Summit 2019.

New Content

A reminder, here are the courses we’ve previously announced:

  • Scylla Essentials – Overview of Scylla and NoSQL Basics
  • The Mutant Monitoring System (MMS)
  • Data Modeling

Now, two new courses are available, Scylla Operations and Using Scylla Drivers!

Scylla Operations Course

This new course focuses on database administration and operations. It’s designed with Administrators and Architects in mind. It will also be useful for Developers and System Engineers who would like to gain in-depth knowledge of Scylla administration. By the end of this course, participants will have a profound understanding of building, administering, and monitoring Scylla clusters, as well as how to troubleshoot Scylla.

The major lessons in the course are:

  • Admin Procedures and Monitoring: The tools used to operate, test, and monitor Scylla nodes and check cluster performance. Among them: Nodetool, logging, CQLsh, the Scylla Monitoring Stack, Cassandra-stress, and tracing. The lesson also covers some basic procedures, such as removing a node and checking the cluster status, as well as an example of how to investigate slow queries and solve some common issues.
  • Repair, Tombstones and Scylla Manager: The different kinds of repairs, and why and when they are needed. Tombstones are created when data is deleted; they generally disappear after compaction or after gc_grace_seconds. Data resurrection can be an issue, but there are steps to prevent it, which are also covered. This lesson also covers Scylla Manager, which enables centralized cluster administration and recurrent task automation, for tasks such as repairs and backups.
  • Migrating to Scylla: Offline and online migrations and actual steps with an example of how to perform each one. The lesson also discusses how to use Kafka for the migration, the Scylla Spark Migrator, different options for migrating existing data and some best practices for migration.
  • Compaction Strategies: The lesson begins with the Scylla storage write path and some general concepts required to understand the different compaction strategies. It then goes on to explain Size-Tiered Compaction Strategy (STCS), Leveled Compaction Strategy (LCS), Time-Window Compaction Strategy (TWCS), and the new Incremental Compaction Strategy (ICS), which is unique to Scylla. Finally, it goes over different use cases and which strategy to use in each case.
  • Cluster Management, Repair, and Scylla Manager: Starts with an overview of the Scylla Monitoring Stack, how it can be deployed (Docker / Native), the alerts it generates, and how to identify common Scylla pitfalls using the monitoring stack. Next, it talks about the Scylla Manager, starting from an intro and how it works, then covering different repair types, why they are needed and different approaches to repair.
  • Kubernetes Operator: Shows how it’s possible to leverage Kubernetes to write a great management layer for Scylla. The lesson explains some core Kubernetes principles, the design, and features of the Scylla Operator. It then presents a hands-on example in a playground environment. Finally, it discusses how to achieve high performance in the production environment.
  • Advanced Monitoring and how to Maximize Performance: Based on our experience in discovering and solving performance and other issues and pitfalls in running clusters. Some of the topics covered are: How to monitor Scylla, the Monitoring and Manager dashboards, how to debug an issue, stalls, memory management, scheduling, disk scheduling, CPU scheduling, Workload Prioritization, Controllers and Backpressure vs Overload.

Additional topics that are covered include installing Scylla, security features, onboarding, and troubleshooting common issues.

Using the Scylla Drivers Course

This new course focuses on Scylla drivers and how to use them for application development.
It was designed with Application Developers and Architects in mind. By the end of this course, you will know how to use drivers in different languages to interact with a Scylla cluster. The drivers covered are Java, Python, Go, Go driver extension (gocqlx), and Node.js.

Some of the topics covered include connecting to a cluster and performing simple queries, using prepared statements to improve performance and write better applications, and storing different data types in a table.

The course is still a work in progress and we’re continuing to add more content to it.

Scylla Materialized Views and Secondary Indexes (MV + 2i)

This new lesson has been added to the Data Modeling course. It covers an advanced topic that gets a lot of interest but that users sometimes get mixed up about. A Materialized View is a table containing a copy of the results of some query performed on a base table. Some common use cases are indexing with denormalization, different sort orders, and filtering (pre-computed queries). The lesson goes over an example of an MV, discusses what actually happens when an MV is created, and explains how Materialized Views are implemented in Scylla before moving on to the next topic, Global Secondary Indexes. A Global Secondary Index is a table containing a copy of the base table’s key, with a single partition key corresponding to the indexed column. An example use case is given, followed by an explanation of how they are implemented under the hood. Finally, a new feature is discussed, Local Secondary Indexes, with guidance on when to use each of the above.
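
As a rough sketch of the difference (assuming a hypothetical base table users with primary key user_id), a Materialized View pre-computes a query under a new primary key, while a Global Secondary Index is created directly on a column of the base table:

CREATE MATERIALIZED VIEW users_by_city AS
    SELECT * FROM users
    WHERE city IS NOT NULL AND user_id IS NOT NULL
    PRIMARY KEY (city, user_id);

CREATE INDEX ON users (city);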

Recommended Learning Paths

A learning path is a sequence of recommended courses related to your role and to your experience. We recommend the following:

Certificates

You can now get official ScyllaDB certificates for the courses you completed! To see your certificates go to your profile page.

Certificates can be printed and shared on your LinkedIn profile to show off your achievements!

Next Steps

If you haven’t yet tried out Scylla University, just register as a user and start learning. You can have a look at the available courses here. All the material is available for free. Courses are completely online, self-paced, and include practical examples, quizzes, and hands-on labs.

Join the #scylla-university channel on Slack for more training related discussions. If you have any ideas for new content or if you’ve got any other feedback, just let us know.

REGISTER NOW FOR SCYLLA UNIVERSITY


Apache Cassandra 4.0 – Audit

Apache Cassandra 4.0 brings a long-awaited feature for tracking and logging database user activity. Primarily aimed at providing a robust set of audit capabilities that allow Cassandra operators to meet external compliance obligations, it brings yet another enterprise feature into the database. Building on the work for the full query log capability, the audit log capability provides operators with the ability to audit all DML, DDL, and DCL changes to either a binary file or a user-configurable source (including the new diagnostic events).

This capability will go a long way toward helping Cassandra operators meet their SOX and PCI requirements.  If you are interested in reading about the development of the feature you can follow along here: https://issues.apache.org/jira/browse/CASSANDRA-12151

From a performance perspective, the changes appear to have only a fairly minor impact on throughput and latency when enabled, and no discernible impact when disabled. Expect a 10% to 15% impact on mixed-workload throughput and p99 latency.

By default, audit logs are written in the BinLog format, and Cassandra comes with tools for parsing and processing them into human-readable formats. Cassandra also supports executing an archive command for simple processing of audit logs. Audited keyspaces, users, and command categories can be whitelisted and blacklisted. Audit logging can be enabled in cassandra.yaml.

What’s the Difference Between Audit Logging, Full Query Logging and Diagnostic Events? 

Both Audit logging (BinAuditLogger) and Full Query logging are managed internally by Apache Cassandra’s AuditLogManager. Both implement IAuditLogger, but are predefined in Apache Cassandra. The main difference is that the full query log receives AuditLogEntries before being processed by the AuditLogFilter. Both the FQL and BAL leverage the same BinLog format and share a common implementation of it. 

Diagnostic events are effectively a queue of internal events that happen in the node. There is an IAuditLogger implementation that publishes filtered LogEntries to the Diagnostics queue if users choose to consume audit records this way.

So think of it this way: Cassandra has an audit facility that enables both configurable auditing of actions and a full query log, and you can have as many AuditLoggers enabled as you want. Diagnostic events are a way of pushing events to client drivers using the CQL protocol, and you can pipe AuditEvents to the diagnostics system!

How Is This Different From Cassandra’s Change Data Capture (CDC) Mechanism?

Apache Cassandra has supported CDC on tables for some time now; however, the implementation has always been a fairly low-level and hard-to-consume mechanism. CDC in Cassandra is largely just an index into commitlog files that points to data relevant to the table with CDC enabled. It was then up to the consumer to read the commitlog format and do something with it. It also captured only mutations that were persisted to disk.

The audit logging capability will log all reads, writes, login attempts, schema changes, etc. Both features could be leveraged to build a proper CDC stream. I would hazard a guess that it’s probably easier to do with the IAuditLogger interface than by consuming the CDC files!


Instaclustr Announces Preview Release of Apache Cassandra 4.0 on Instaclustr Managed Platform

Instaclustr announces immediate availability of a preview release of Apache Cassandra 4.0 on the Instaclustr Managed Platform. This release is designed to allow Instaclustr customers to easily undertake any application-specific testing of Apache Cassandra 4.0 (alpha 3) in preparation for the forthcoming GA release of Apache Cassandra 4.0.

Apache Cassandra 4.0, as the first major release of Apache Cassandra for more than 2 years, is a major step forward for the Apache Cassandra community. Key features in Apache Cassandra 4.0 include:

  • non-blocking IO for internode communication—this has provided significant performance improvements in both user query tail-end latency and streaming operations;
  • virtual tables—providing the ability to retrieve performance metrics and other metadata directly via CQL rather than requiring external tools;
  • audit logging—the ability to log queries that are executed for compliance purposes; and
  • many (generally minor but adding up) stability and testing improvements.

While Cassandra 4.0 is still in an alpha release phase and not yet ready for production usage, Instaclustr is making this preview release available for several reasons:

  1. Building it into our automated provisioning system was a prerequisite to beginning our own validation testing of the 4.0 stream using our Certified Cassandra Framework, thus allowing us to be ready to release a full GA release of Cassandra 4.0 on our managed platform as soon as Cassandra 4.0 is ready.
  2. This preview release provides customers with an easy path to start undertaking their own application-specific validation of Cassandra 4.0 and to interact with Instaclustr Support to debug any specific issues they may come across.

Being a preview release (of an alpha release of Apache Cassandra), this release has several limitations:

  • It is not supported for production usage and not covered by SLAs. The release should be used for testing purposes only;
  • It does not support Password Authentication, User Authorization, and Client to Node encryption (TLS);
  • CounterMutationStage, ViewMutationStage and DroppedMessage Monitoring does not report results;
  • The following add-ons are not supported:
    • Zeppelin
    • Spark
    • Lucene Index
    • Continuous Backup

We will be working to remove these limitations prior to a final release of Cassandra 4.0.

We look forward to working with our customers in this validation phase of Cassandra 4.0. Should you have any issues or questions please contact support@instaclustr.com.


The Last Pickle Joining DataStax

Today is a very emotional day: I’m happy, excited, and extremely proud to announce The Last Pickle has been acquired by DataStax.

I started contributing to the Apache Cassandra project in 2010, working by myself in my spare time. In March 2011, I left my job at Weta Digital and “went pro” becoming one of the first Apache Cassandra Consultants in the world. In 2013, Nate McCall joined me at The Last Pickle and we realised we could have a bigger impact on the world by working together. As the team at TLP grew over the years, so did our impact on the world. And with joining DataStax we are going to have the biggest impact possible.

We are at DataStax because we want to be. Because we have a shared passion to make Apache Cassandra the best database in the world, to make it easier to use, and to make it the first database people choose to use when starting a new project. Cassandra made large-scale, highly available databases an achievable goal for many companies around the world. For the first few years this was enough; it worked well enough if you knew how to take care of it. The addition of the Cassandra Query Language and improvements in operations expanded the user base as things got easier. But there is more work to do. We want to make open source Apache Cassandra easier to use at any scale, from 1 node to 1000, and to make it a realistic choice for every developer in the world.

The great level of technical understanding and kindness that the TLP team has given to the open source community, and importantly our customers, is not going away. The team is going to continue to help our customers as we always have, and will now be able to bring more resources when needed. Our contributions to the community are going to continue, and hopefully increase.

I’m extremely thankful to everyone that has worked at TLP over the years, everyone who has said nice things about us and helped us, and all of the customers who put their trust in my little company from New Zealand.

Thanks, Aaron

Instaclustr Announces PCI-DSS Certification

Instaclustr is very pleased to announce that we have achieved PCI-DSS certification for our Managed Apache Cassandra and Managed Apache Kafka offerings running in AWS. PCI-DSS (Payment Card Industry – Data Security Standard) is a mandated standard for many financial applications and we increasingly see the PCI-DSS controls adopted as the “gold standard” in other industries where the highest standards of security are crucial. PCI-DSS certification adds to our existing SOC2 accreditation to provide the levels of security assurance required by even the most demanding business requirements. 

Overall, this certification effort was the most significant single engineering project in the history of Instaclustr, requiring several person-years of engineering effort to implement well over 100 changes touching every aspect of our systems over the course of several months. We’re very proud that, despite this level of change, the impact on our customers has been absolutely minimal, and we’re able to deliver another very significant piece of background infrastructure, allowing a wider range of customers to focus their efforts on building innovative business applications based on open source data technologies.

While PCI-DSS compliance may not be required by all customers, and is only supported on selected Instaclustr products, most of the security enhancements we have implemented will result in improved levels of security for all our Managed Service customers, regardless of product or platform. The most significant of these changes are:

  • Tightening of our admin access environment with technical controls to prevent egress of data via our admin systems.
  • Improved logging and auditing infrastructure.
  • Tightened operating system hardening and crypto standards.
  • Addition of a WAF (Web Application Firewall) in front of our console and APIs.
  • More automated scanning, and tightened resolution policies, for code dependency vulnerabilities.
  • More frequent security scanning of our central management systems.
  • More developer security training.

Customers wishing to achieve full PCI-DSS compliance will need to opt in when creating a cluster, as achieving PCI compliance enforces a range of more restrictive security options (for example, password complexity in the Instaclustr console and the use of Private Network Clusters), and enabling the required additional logging on the cluster incurs a performance penalty of approximately 5%. There is also a set of customer responsibilities that customers must implement for full compliance. Additional technical controls activated for PCI-compliant clusters include:

  • Logging of all user access to the managed applications (Cassandra, Kafka)
  • Locked-down outbound firewall rules
  • Second approver system for sudo access for our admins

For full details please see our support page.

Customers with existing clusters who wish to move to full PCI compliance should contact support@instaclustr.com who will arrange a plan to apply the new controls to your cluster.

We will be publishing more detail on many of these controls in the coming weeks and holding webinars to cover the Cassandra and Kafka specific implementation details which we expect will be of broad interest. In the meantime, should you have any interest in any further information please contact your Instaclustr Customer Success representative or sales@instaclustr.com who will be able to arrange technical briefings.


Apache Cassandra 4.0 – Netty Transport

In this series of blog posts, we’ll take a meandering tour of some of the important changes, cool features, and nice ergonomics that are coming with Apache Cassandra 4.0. Part 1 focused on Cassandra 4.0 stability and testing, and part 2 on virtual tables. In this part we will learn about the Netty transport framework.

One of the headline features for Apache Cassandra 4.0 is the refactor of internode messaging to use Java’s non-blocking NIO capability (https://issues.apache.org/jira/browse/CASSANDRA-8457 and https://issues.apache.org/jira/browse/CASSANDRA-15066) via the Netty library.

This allows Cassandra to move away from an N-threads-per-peer model to a single thread pool for all connections. This dramatically reduces performance issues related to thread signalling, coordination, and context switching.

Moving to the Netty framework has also enabled other features like zero copy streaming for SSTables.

Performance improvements have been apparent from day one, both in terms of throughput and latency. The big win, however, is the significant reduction in tail latency, with reductions of 40% or more in p99 latencies seen in initial testing.

Of course patches and improvements are being made all the time during the beta process, so benchmarking will need to be revalidated against each workload and within your own environment, but progress is promising!

https://issues.apache.org/jira/browse/CASSANDRA-14746


Apache Cassandra 4.0 – Virtual Tables

The last major version release of Apache Cassandra was 3.11.0, and that was more than 2 years ago in 2017. So what has the Cassandra developer community been doing over the last 2 years? Well let me tell you, it’s good, real good. It’s Apache Cassandra 4.0! It’s also close, and with the release of the first alpha version, we now have a pretty solid idea of the features and capabilities that will be included in the final release.

In this series of blog posts, we’ll take a meandering tour of some of the important changes, cool features, and nice ergonomics that are coming with Apache Cassandra 4.0. In part 1 we focussed on Cassandra 4.0 stability and testing, in this part we will learn about Virtual Tables.

Implementation of Virtual Tables

Among the many exciting new features Cassandra 4.0 boasts is the implementation of Virtual Tables. Up until now, JMX access has been required for revealing Cassandra details such as running compactions, metrics, clients, and various configuration settings. With Virtual Tables, users will be able to easily query this data as CQL rows from a read-only system table. Let’s briefly discuss the changes associated with these Virtual Tables below.

Previously if a user wanted to look up the compaction status of a given node in a cluster, they would first require a JMX connection to be established in order to run nodetool compactionstats on the node. This alone presents a number of considerations: configuring your client for JMX access, configuring your nodes and firewall to allow for JMX access, and ensuring the necessary security and auditing measures are in place, just to name a few. 

Virtual Tables eliminate this overhead by allowing the user to query this information via the driver they already have configured. There are two new keyspaces created for this purpose: system_views and system_virtual_schema. The system_virtual_schema keyspace is as it sounds; it contains the schema information for the Virtual Tables themselves. All of the pertinent information we want is housed in the system_views keyspace which contains a number of useful tables. 

cqlsh> select * from system_virtual_schema.tables;

 keyspace_name         | table_name                | comment
-----------------------+---------------------------+------------------------------
          system_views |                    caches |                system caches
          system_views |                   clients |  currently connected clients
          system_views |  coordinator_read_latency |                             
          system_views |  coordinator_scan_latency |                             
          system_views | coordinator_write_latency |                             
          system_views |                disk_usage |                             
          system_views |         internode_inbound |                             
          system_views |        internode_outbound |                             
          system_views |        local_read_latency |                             
          system_views |        local_scan_latency |                             
          system_views |       local_write_latency |                             
          system_views |        max_partition_size |                             
          system_views |             rows_per_read |                             
          system_views |                  settings |             current settings
          system_views |             sstable_tasks |        current sstable tasks
          system_views |              thread_pools |                             
          system_views |       tombstones_per_read |                             
 system_virtual_schema |                   columns |   virtual column definitions
 system_virtual_schema |                 keyspaces | virtual keyspace definitions
 system_virtual_schema |                    tables |    virtual table definitions

Before looking at an example, it’s important to touch upon the scope of these Virtual Tables. All Virtual Tables are restricted in scope to their node, and therefore all queries on these tables return data valid only for the node acting as coordinator regardless of consistency. As a result, support for specifying the coordinator node for such queries has been added to several drivers including the Python and Datastax Java drivers. 

Let’s take a look at a Virtual Table, in this case sstable_tasks. This table shows all operations on SSTables such as compactions, cleanups, and upgrades. 

cqlsh> select * from system_views.sstable_tasks;

 keyspace_name | table_name  | task_id                              | kind       | progress | total     | unit
---------------+-------------+--------------------------------------+------------+----------+-----------+-------
     keyspace1 |  standard1  | 09e00960-064c-11ea-a48a-87683fec5884 | compaction | 15383452 | 216385920 | bytes

This is the same information we would expect out of running nodetool compactionstats. We can see that there is currently one active compaction on the node, what its progress is, as well as its keyspace and table. Being able to quickly and efficiently view this information is often key in understanding and diagnosing cluster health. 

While there are still some metrics for which JMX is the only means of querying, having the ability to use CQL to pull important metrics on a cluster is a very nice feature. With Virtual Tables offering a convenient means of querying metrics, less focus needs to be placed on building JMX tools, such as Reaper, and more time can be spent working within Cassandra. We may start to see a rise in client-side tooling that takes advantage of Virtual Tables as well.


Running tlp-stress in Kubernetes

Performance tuning and benchmarking is key to the successful operation of Cassandra. We have a great tool in tlp-stress that makes benchmarking a lot easier. I have been exploring running Cassandra in Kubernetes for a while now. At one point I thought to myself, it would be nice to be able to utilize tlp-stress in Kubernetes. After a bit of prototyping, I decided that I would write an operator. This article introduces the Kubernetes operator for tlp-stress, stress-operator.

Before getting into the details of stress-operator, let’s consider the following question: What exactly is a Kubernetes operator?

Kubernetes has a well-defined REST API with lots of built-in resource types like Pods, Deployments, and Jobs. The API for creating these built-in objects is declarative. Users typically create objects using the tool kubectl and YAML files. A controller is code that executes a control loop watching one or more of these resource types. A controller’s job is to ensure that an object’s actual state matches its expected state.

An operator extends Kubernetes with custom resource definitions (CRDs) and custom controllers. A CRD provides domain specific abstractions. The custom controller provides automation that is tailored around those abstractions.

If the concept of an operator is still a bit murky, don’t worry. It will get clearer as we look at examples of using stress-operator that highlight some of its features, including:

  • Configuring and deploying tlp-stress
  • Provisioning Cassandra
  • Monitoring with Prometheus and Grafana

Installing the Operator

You need to have kubectl installed. Check out the official Kubernetes docs if you do not already have it installed.

Lastly, you need access to a running Kubernetes cluster. For local development, my tool of choice is kind.

Download the following manifests:

stress-operator.yaml declares all of the resources necessary to install and run the operator. The other files are optional dependencies.

casskop.yaml installs the Cassandra operator casskop which stress-operator uses to provision Cassandra.

grafana-operator.yaml and prometheus-operator.yaml install grafana-operator and prometheus-operator respectively. stress-operator uses them to install, configure, and monitor tlp-stress.

Install the operator along with the optional dependencies as follows:

$ kubectl apply -f stress-operator.yaml

$ kubectl apply -f casskop.yaml

$ kubectl apply -f grafana-operator.yaml

$ kubectl apply -f prometheus-operator.yaml

The above commands install CRDs as well as the operators themselves. There should be three CRDs installed for stress-operator. We can verify this as follows:

$ kubectl get crds | grep thelastpickle
cassandraclustertemplates.thelastpickle.com   2020-02-26T16:10:00Z
stresscontexts.thelastpickle.com              2020-02-26T16:10:00Z
stresses.thelastpickle.com                    2020-02-26T16:10:00Z

Lastly, verify that each of the operators is up and running:

$ kubectl get deployments
NAME                     READY   UP-TO-DATE   AVAILABLE   AGE
cassandra-k8s-operator   1/1     1            1           6h5m
grafana-deployment       1/1     1            1           4h35m
grafana-operator         1/1     1            1           6h5m
stress-operator          1/1     1            1           4h51m

Note: The prometheus-operator is currently installed with cluster-wide scope in the prometheus-operator namespace.

Configuring and Deploying a Stress Instance

Let’s look at an example of configuring and deploying a Stress instance. First, we create a KeyValue workload in a file named key-value-stress.yaml:

apiVersion: thelastpickle.com/v1alpha1
kind: Stress
metadata:
  name: key-value
spec:
  stressConfig:
    workload: KeyValue
    partitions: 25m
    duration: 60m
    readRate: "0.3"
    consistencyLevel: LOCAL_QUORUM
    replication:
      networkTopologyStrategy:
        dc1: 3
    partitionGenerator: sequence
  cassandraConfig:
    cassandraService: stress

Each property under stressConfig corresponds to a command line option for tlp-stress.

The cassandraConfig section is Kubernetes-specific. When you run tlp-stress (outside of Kubernetes) it will try to connect to Cassandra on localhost by default. You can override the default behavior with the --host option. See the tlp-stress docs for more information about all its options.

In Kubernetes, Cassandra should be deployed using StatefulSets. A StatefulSet requires a headless Service. Among other things, a Service maintains a list of endpoints for the pods to which it provides access.

The cassandraService property specifies the name of the Cassandra cluster headless service. It is needed in order for tlp-stress to connect to the Cassandra cluster.

Now we create the Stress object:

$ kubectl apply -f key-value-stress.yaml
stress.thelastpickle.com/key-value created

# Query for Stress objects to verify that it was created
$ kubectl get stress
NAME       AGE
key-value   4s

Under the hood, stress-operator deploys a Job to run tlp-stress.

$ kubectl get jobs
NAME       COMPLETIONS   DURATION   AGE
key-value   0/1           4s         4s

We can use a label selector to find the pod that is created by the job:

$ kubectl get pods -l stress=key-value,job-name=key-value
NAME             READY   STATUS    RESTARTS   AGE
key-value-pv6kz   1/1     Running   0          3m20s

We can monitor the progress of tlp-stress by following the logs:

$ kubectl logs -f key-value-pv6kz

Note: If you are following the steps locally, the Pod name will have a different suffix.

Later we will look at how we monitor tlp-stress with Prometheus and Grafana.

Cleaning Up

When you are ready to delete the Stress instance, run:

$ kubectl delete stress key-value

The above command deletes the Stress object as well as the underlying Job and Pod.

Provisioning a Cassandra Cluster

stress-operator provides the ability to provision a Cassandra cluster using casskop. This is convenient when you want to quickly spin up a cluster for some testing.

Let’s take a look at another example, time-series-casskop-stress.yaml:

apiVersion: thelastpickle.com/v1alpha1
kind: Stress
metadata:
  name: time-series-casskop
spec:
  stressConfig:
    workload: BasicTimeSeries
    partitions: 50m
    duration: 60m
    readRate: "0.45"
    consistencyLevel: LOCAL_QUORUM
    replication:
      networkTopologyStrategy:
        dc1: 3
    ttl: 300
  cassandraConfig:
    cassandraClusterTemplate:
      metadata:
        name: time-series-casskop
      spec:
        baseImage: orangeopensource/cassandra-image
        version: 3.11.4-8u212-0.3.1-cqlsh
        runAsUser: 1000
        dataCapacity: 10Gi
        imagepullpolicy: IfNotPresent
        deletePVC: true
        maxPodUnavailable: 0
        nodesPerRacks: 3
        resources:
          requests:
            cpu: '1'
            memory: 1Gi
          limits:
            cpu: '1'
            memory: 1Gi
        topology:
          dc:
            - name: dc1
              rack:
                - name: rack1       

This time we are running a BasicTimeSeries workload with a TTL of five minutes.

In the cassandraConfig section we declare a cassandraClusterTemplate instead of a cassandraService. CassandraCluster is a CRD provided by casskop. With this template we are creating a three-node cluster in a single rack.

We won’t go into any more detail about casskop for now. It is beyond the scope of this post.

Here is what happens when we run kubectl apply -f time-series-casskop-stress.yaml:

  • We create the Stress object
  • stress-operator creates the CassandraCluster object specified in cassandraClusterTemplate
  • casskop provisions the Cassandra cluster
  • stress-operator waits for the Cassandra cluster to be ready (in a non-blocking manner)
  • stress-operator creates the Job to run tlp-stress
  • tlp-stress runs against the Cassandra cluster

There is another benefit of this approach in addition to being able to easily spin up a cluster. We do not have to implement any steps to wait for the cluster to be ready before running tlp-stress. The stress-operator takes care of this for us.

Cleaning Up

When you are ready to delete the Stress instance, run:

$ kubectl delete stress time-series-casskop

The deletion does not cascade to the CassandraCluster object. This is by design. If you want to rerun the same Stress instance (or a different one that reuses the same Cassandra cluster), the stress-operator reuses the Cassandra cluster if it already exists.

Run the following to delete the Cassandra cluster:

$ kubectl delete cassandracluster time-series-casskop

Monitoring with Prometheus and Grafana

The stress-operator integrates with Prometheus and Grafana to provide robust monitoring. Earlier we installed grafana-operator and prometheus-operator. They, along with casskop, are optional dependencies. It is entirely possible to integrate with Prometheus and Grafana instances that were installed by means other than the respective operators.

If you want stress-operator to provision Prometheus and Grafana, then the operators must be installed.

There is an additional step that is required for stress-operator to automatically provision Prometheus and Grafana. We need to create a StressContext. Let’s take a look at stresscontext.yaml:

# There should only be one StressContext per namespace. It must be named
# tlpstress; otherwise, the controller will ignore it.
#
apiVersion: thelastpickle.com/v1alpha1
kind: StressContext
metadata:
  name: tlpstress
spec:
  installPrometheus: true
  installGrafana: true

Let’s create the StressContext:

$ kubectl apply -f stresscontext.yaml

Creating the StressContext causes stress-operator to perform several actions including:

  • Configure RBAC settings so that Prometheus can scrape metrics
  • Create a Prometheus custom resource
  • Expose Prometheus with a Service
  • Create a ServiceMonitor custom resource which effectively tells Prometheus to monitor tlp-stress
  • Create a Grafana custom resource
  • Expose Grafana with a Service
  • Create and configure a Prometheus data source in Grafana

Now when we create a Stress instance, stress-operator will also create a Grafana dashboard for the tlp-stress job. We can test this with time-series-casskop-stress.yaml. The dashboard name will be the same as the name of the Stress instance, which in this example is time-series-casskop.

Note: To rerun the job we need to delete and recreate the Stress instance.

$ kubectl delete stress time-series-casskop

$ kubectl apply -f time-series-casskop-stress.yaml

Note: You do not need to delete the CassandraCluster. The stress-operator will simply reuse it.

Now we want to check out the Grafana dashboard. There are different ways of accessing a Kubernetes service from outside the cluster. We will use kubectl port-forward.

Run the following to make Grafana accessible locally:

$ kubectl port-forward svc/grafana-service 3000:3000
Forwarding from 127.0.0.1:3000 -> 3000
Forwarding from [::1]:3000 -> 3000
Handling connection for 3000

Then in your browser go to http://localhost:3000/. It should direct you to the Home dashboard. Click on the Home label in the upper left part of the screen. You should see a row for the time-series-casskop dashboard. Click on it. The dashboard should look something like this:

tlp-stress Grafana dashboard

Cleaning Up

Delete the StressContext with:

$ kubectl delete stresscontext tlpstress

The deletion does not cascade to the Prometheus or Grafana objects. To delete them and their underlying resources run:

$ kubectl delete prometheus stress-prometheus

$ kubectl delete grafana stress-grafana

Wrap Up

This concludes the brief introduction to stress-operator. The project is still in early stages of development and as such undergoing lots of changes and improvements. I hope that stress-operator makes testing Cassandra in Kubernetes a little easier and more enjoyable!

Version 4.0 of tlp-stress for Cassandra released

The Last Pickle is very pleased to announce that version 4.0 of tlp-stress for Apache Cassandra has been released.

Here is what you will find in the release:

  • The biggest addition is support for including DELETE statements in a workload via the --delete option. This option works similarly to the --reads option, where a value between 0 and 1 is supplied with the argument. Each workload is responsible for creating the delete requests, so they naturally fit in with each of the workload patterns. As a result of this new feature, the BasicTimeSeries workload now works only with Cassandra versions 3.0 and above. See the tlp-stress documentation for more information.

  • Support for adjusting the maximum requests per connection via the --max-request option.

  • A new Sets workload to exercise the set collection type. This was useful for investigating the root cause of slow inserts to set<text>. See CASSANDRA-15464.

  • Improved error handling for non-existent workload parameters, non-existent workloads, and when the read rate plus the delete rate is greater than 1.0.

  • Documentation updates to reflect the new option and workload additions.

A complete list of the items covered in the release can be found in the tlp-stress GitHub issues under the 4.0 milestone.

As always binaries for the release can be found in The Last Pickle Bintray and Docker Hub repositories.

If you have any questions or suggestions please let us know on the tlp-dev-tools mailing list. We would love to hear from you if you are using the tool!

Apache Cassandra 4.0 – Stability and Testing

The last major version release of Apache Cassandra was 3.11.0 and that was more than 2 years ago in 2017. So what has the Cassandra developer community been doing over the last 2 years? Well let me tell you, it’s good, real good. It’s Apache Cassandra 4.0! The final release is not here as yet, but with the release of the first alpha version, we now have a pretty solid idea of the features and capabilities that will be included in the final release.

In this series of blog posts, we’ll take a meandering tour of some of the important changes, cool features, and nice ergonomics that are coming with Apache Cassandra 4.0.

The first blog of this series focuses on stability and testing.

Apache Cassandra 4.0: Stability and Testing

One of the explicit goals for Apache Cassandra 4.0 was to be the “most stable major release of Cassandra ever” (https://instac.io/37KfiAb).

As those who’ve run Cassandra in production know it was generally advisable to wait for up to 5 or 6 minor versions before switching production clusters to a new major version. This resulted in adoption only occurring later in the supported cycle for a given major version. All in all, this was not a great user experience, and frankly a pretty poor look for a database which is the one piece of your infrastructure that really needs to operate correctly. 

In order to support a stable and safe major release, a significant amount of effort was put into improving Apache Cassandra testing. 

The first of these is the ability to run multi-node/coordinator tests in a single JVM (https://issues.apache.org/jira/browse/CASSANDRA-14821). This allows us to test distributed behavior with Java unit tests for quicker, more immediate feedback, rather than having to leverage the longer-running, more intensive DTests. This paid off immediately, identifying typically hard-to-catch distributed bugs such as https://issues.apache.org/jira/browse/CASSANDRA-14807 and https://issues.apache.org/jira/browse/CASSANDRA-14812. It also resulted in a number of folks backporting this to earlier versions to assist in debugging tricky issues.

See here for an example of one of the new tests, checking that a write fails properly during a schema disagreement:(https://instac.io/2ulqNQ7)

Interestingly, the implementation is a nice use of distinct Java class loaders to get around Cassandra’s horrid use of singletons everywhere and allows it to fire up multiple Cassandra Instances in a single JVM. 

From the ticket: “In order to be able to pass some information between the nodes, a common class loader is used that loads up Java standard library and several helper classes. Tests look a lot like CQLTester tests would usually look like.

Each Cassandra Instance, with its distinct class loader is using serialization and class loading mechanisms in order to run instance-local queries and execute node state manipulation code, hooks, callbacks etc.”

On top of this, the community has started adopting Quick Theories as a library for introducing property-based testing. Property-based testing is a nice middle ground between unit tests and fuzzing. It allows you to define a range of inputs and test the test space (and beyond) in a repeatable and reproducible manner.

Currently, in trunk there are two test classes that have adopted property-based testing: EncodingStatsTest and ChecksummingTransformerTest. However, community members are using it in their own internal validation test frameworks for Cassandra and have been contributing bugs and patches back to the community as well.

Moving beyond correctness testing, a significant amount of effort has gone into performance testing, especially with the change to adopt Netty as the framework for internode messaging. So far testing has included, but definitely has not been limited to:

Probably the best indication of the amount of work that has gone into testing of the Netty rewrite can be seen in https://issues.apache.org/jira/browse/CASSANDRA-15066, which is well worth a read if you are into that kind of thing.


Scylla Summit 2019

I’ve had the pleasure of attending and presenting again at the Scylla Summit in San Francisco, and the honor of being awarded the Most Innovative Use Case of Scylla.

It was a great event, full of friendly people and passionate conversations. Peter did a great full write-up of it already so I wanted to share some of my notes instead…

This is a curated set of topics that I happened to question or discuss in depth, so this post is not meant to be taken as full coverage of the conference.

Scylla Manager version 2

The upcoming version of scylla-manager drops its dependency on an SSH setup, which will be replaced by an agent, most likely shipped as a separate package.

On the features side, I was a bit puzzled by the fact that ScyllaDB is advertising that its manager will provide a repair scheduling window so that you can control when it’s running or not.

Why did it strike me, you ask?

Because MongoDB does the same thing within its balancer process and I always thought of this as a patch to a feature that the database should be able to cope with by itself.

And that database-does-it-better-than-you motto is exactly one of the promises of Scylla, the boring database, so smart at handling workload impacts on performance that you shouldn’t have to start playing tricks to mitigate them… I don’t want this time window feature in scylla-manager to be a trojan horse for the demise of that promise!

Kubernetes

They were almost late on this one, but they are working hard to play well with the new toy of every tech company around the world. Helm charts are also being worked on!

The community-developed Scylla operator by Yannis is now being worked on and backed by ScyllaDB. It can deploy a cluster and scale it up and down.

Few things to note:

  • it’s using a configmap to store the scylla config
  • no TLS support yet
  • no RBAC support yet
  • Kubernetes networking takes a lighter performance hit than the one that was seen on Docker
  • use placement strategies to dedicate kubernetes nodes to scylla!

Change Data Capture

Oh boy this one was awaited… but it’s now coming soon!

I inquired about its performance impact, since every operation will be written to a table. Clearly my questioning was a bit alpha, since CDC is still being worked on.

I had the chance to discuss ideas with Kamil, Tzach and Dor: one of the things my colleague Julien asked for was the ability for CDC to generate an event when a tombstone is written, so we could actually know when a specific piece of data expired!

I want to stress a few other things too:

  • default TTL on CDC table is 24H
  • expect I/O impact (logical)
  • TTL tombstones can have a hidden disk space cost, and nobody was able to tell me if the CDC table was going to be configured with a lower gc_grace_seconds than the default 10 days, so that’s something we need to keep in mind and check for
  • there was no plan to add user information that would allow us to know who actually did the operation, so that’s something I asked for because it could be used as a cheap and open source way to get auditing!

LightWeight Transactions

Another long-awaited feature is also coming, from the amazing work and knowledge of Konstantin. We had a great conversation about the differences between the Paxos-based LWT implementation currently being worked on and a possible later Raft-based one.

So yes, the first LWT implementation will use Paxos as the consensus algorithm. This will make the LWT feature very consistent, while having it slower than what could be achieved using Raft. That’s why ScyllaDB has plans for another implementation that could be faster, with weaker data consistency guarantees.

User Defined Functions / Aggregations

This one is bringing the Lua language inside Scylla!

To be precise, it will be a Lua JIT, as its footprint is low and Lua can be cooperative enough, but the ScyllaDB people made sure to monitor its violations (when it should yield but does not) and act strongly upon them.

I got into implementation details with Avi, this is what I noted:

  • a Lua function’s return type is not checked at creation but at execution, so expect runtime errors if your Lua code is bad
  • since Lua is lightweight, there’s no need to dedicate a core to Lua execution
  • I found UDA examples, like top-k rows, to be very similar to Map/Reduce logic (see the sketch after this list)
  • UDFs will allow simpler token range full table scans thanks to syntactic sugar
  • there will be memory limits applied to result sets from UDAs, and they will be tunable
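
To make the UDA idea a bit more concrete, here is a rough, hypothetical sketch of a Lua-based aggregate using the Cassandra-style CREATE AGGREGATE syntax; the exact Scylla syntax was still being worked on at the time, so treat the clauses and names as illustrative only:

-- Hypothetical sketch: average string length as a Lua-based UDA
CREATE FUNCTION accumulate_len(state tuple<bigint, bigint>, val text)
RETURNS NULL ON NULL INPUT
RETURNS tuple<bigint, bigint>
LANGUAGE Lua
AS 'return {state[1] + 1, state[2] + #val}';

CREATE FUNCTION div_len(state tuple<bigint, bigint>)
RETURNS NULL ON NULL INPUT
RETURNS double
LANGUAGE Lua
AS 'return state[2] / state[1]';

CREATE AGGREGATE avg_length(text)
SFUNC accumulate_len
STYPE tuple<bigint, bigint>
FINALFUNC div_len
INITCOND (0, 0);

The state function is applied to every row and the final function reduces the accumulated state, which is what makes the Map/Reduce comparison above feel natural.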

Text search

Dejan is the text search guy at ScyllaDB, and the one who kindly implemented the LIKE feature we asked for, which will be released in the upcoming 3.2 version.

We discussed ideas and projected use cases to make sure that what’s going to be worked on will be used!
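
For reference, the LIKE feature adds simple pattern matching to CQL filtering; here is a minimal sketch on a made-up table (this kind of query filters on the server side, so it should be used with care on large tables):

-- Match all users whose name starts with 'jul'
SELECT id, name FROM ks.users WHERE name LIKE 'jul%' ALLOW FILTERING;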

Redis API

I’ve always been frustrated with Redis because, while I love the technology, I never trusted its clustering and scaling capabilities.

What if you could scale your Redis like Scylla without giving up on performance? That’s what the implementation of the Redis API backed by Scylla will get us!

I’m desperately looking forward to seeing this happen!

Observability in Apache Cassandra 4.0 with Event Diagnostics

Several new observability features will be part of the next major Apache Cassandra 4.0 release. We covered virtual tables in an earlier blog post and now would like to give a preview of another new feature, Diagnostic Events, which provide real time insight into your Cassandra internals.

Observability is key to successfully operating Apache Cassandra, as it allows users and developers to find bugs and identify runtime issues. Log files and metrics are a very popular way to get insights into a cluster’s behaviour, but they are limited to small text representations or time series data. Often important information is missing from log files and can’t be added without changing the source code and rebuilding Cassandra. In addition, log text output is tailored to be readable by humans and is not designed to be machine readable without creating custom parsers.

Diagnostic Events have been designed to fill this gap by providing a way to observe all the different types of changes that occur inside Cassandra as they happen. For example, testing frameworks can subscribe listeners that block the server when change events happen, thus providing continuous evaluation of invariants across testing scenarios and environments. Meanwhile, other observability tools can subscribe listeners without blocking the Cassandra server, giving third parties visibility into how Cassandra navigates a variety of changes and surfacing critical information about your cluster.

The idea of Event Diagnostics was first proposed in 2016 by Stefan Podkowinski and then implemented as part of CASSANDRA-12944.

Types of Diagnostic Events

Currently, the event types implemented by Diagnostic Events are:

  • AuditEvents, capturing all audit log entries
  • BootstrapEvents, when a new node is bootstrapping into a cluster
  • GossiperEvents, state changes that are announced over gossip
  • HintEvents, the storage and delivery of hints
  • TokenMetadataEvents and PendingRangeCalculatorServiceEvents, lifecycle changes and tasks when token ownership ranges are being calculated
  • ReadRepairEvents and PartitionRepairEvents, different types of repair events
  • SchemaAnnouncementEvents, schema changes proposed, received, and accepted
  • SchemaMigrationEvents, schema propagation across a cluster
  • SchemaEvents, high-level schema changes
  • TokenAllocatorEvents, allocation of token range ownerships, random or via an allocation strategy.

A little about the implementation

Diagnostic Events have been implemented with a JMX interface consisting of two MBeans: the DiagnosticEventService and the LastEventIdBroadcaster. The DiagnosticEventService provides methods to enable diagnostics per event type and to bulk read new events. The LastEventIdBroadcaster provides attributes for the last published offsets of each event type. Importantly, the LastEventIdBroadcaster avoids the use of JMX notifications, a mechanism that too easily loses events, by maintaining an offset for each event type’s queue. Behind these JMX interfaces, the persistence of events is regulated by a DiagnosticEventStore; although an in-memory store is currently used, an implementation based on Chronicle Queue is planned.

Monitoring Cassandra Diagnostic Events

Reaper is one of the tools that has the ability to listen to and display Cassandra’s emitted Diagnostic Events in real time. The following section outlines how to enable and use this feature and gain better visibility into your cluster.

Enabling Diagnostic Events server-side in Apache Cassandra 4.0

Diagnostic Events are not enabled (published) by default in Apache Cassandra version 4.0, but can be manually enabled. To activate the publishing of diagnostic events, enable the diagnostic_events_enabled flag on the Cassandra node:

# Diagnostic Events #
# If enabled, diagnostic events can be helpful for troubleshooting operational issues. Emitted events contain details
# on internal state and temporal relationships across events, accessible by clients via JMX.
diagnostic_events_enabled: true

Restarting the node is required after this change.

Using Reaper to display Event Diagnostics

In Reaper go to the “Live Diagnostics” page.

Select the cluster that is running Cassandra version 4.0 with diagnostic_events_enabled: true, using the “Filter cluster” field.

Expand the “Add Events Subscription” section and type in a description for the subscription to be created.

Select the node whose diagnostic events you want to observe, then select the event types you are interested in. Check “Enable Live View”.

Add Diagnostic Subscription

Press “Save”. In the list of subscriptions there should now be a row that displays the information entered above.

Press “View”. A green bar will appear. Underneath this green bar, the selected diagnostic event types and the subscribed nodes will be displayed. Press the green bar to stop showing live events.

Add Diagnostic Subscription

Event subscriptions are a way to understand what goes on inside Apache Cassandra, particularly at the cluster coordination level. Enabling Diagnostic Events has no performance overhead on the Cassandra node until subscriptions are also enabled. Reaper’s subscription may not be able to keep up and receive all events when traffic or load on the Cassandra node is significantly high, as the events are kept in an in-memory FIFO queue with a maximum length of 200 per event type. The AuditEvent type can impose additional overhead on a cluster, and for that reason requires additional configuration in the auditing section of Cassandra’s yaml configuration file.

Happy diagnosing! We hope you have some fun with this extra real-time insight into Cassandra internals. And if you would like to see more diagnostic event types added, reach out, make your suggestions, or even better, throw us a patch and get the ball rolling…

Cassandra Reaper 2.0 was released

Cassandra Reaper 2.0 was released a few days ago, bringing the (long awaited) sidecar mode along with a refreshed UI. It also features support for Apache Cassandra 4.0 and diagnostic events, and, thanks to our new committer, Saleil Bhat, Postgres can now be used for all distributed modes of Reaper deployments, including sidecar.

Sidecar mode

By default and for security reasons, Apache Cassandra restricts JMX access to the local machine, blocking any external request.
Reaper relies heavily on JMX communications when discovering a cluster, starting repairs and monitoring metrics. In order to use Reaper, operators have to change Cassandra’s default configuration to allow external JMX access, potentially breaking existing company security policies. All this places an unnecessary burden on Reaper’s out-of-the-box experience.

With its 2.0 release, Reaper can now be installed as a sidecar to the Cassandra process and communicate locally only, coordinating with other Reaper instances through the storage backend exclusively.

At the risk of stating the obvious, this means that all nodes in the cluster must have a Reaper sidecar running so that repairs can be processed.

In sidecar mode, several Reaper instances are likely to be started at the same time, which could lead to schema disagreement. We’ve contributed to the migration library Reaper uses and added a consensus mechanism based on LWTs to only allow a single process to migrate a keyspace at once.
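
Conceptually, this kind of LWT-based mutual exclusion boils down to a conditional insert acting as a short-lived lease; here is a minimal sketch with made-up keyspace, table and values, not Reaper’s actual schema:

-- Hypothetical lock table: one row per resource to protect
CREATE TABLE IF NOT EXISTS migrations.lock (resource text PRIMARY KEY, owner text);

-- Each instance tries to take the lease; only the first INSERT is applied,
-- the others see [applied] = false and back off until the TTL expires
INSERT INTO migrations.lock (resource, owner) VALUES ('reaper_db', 'instance-1')
IF NOT EXISTS USING TTL 300;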

Also, since Reaper can only communicate with a single node in this mode, clusters running in sidecar mode are automatically added to Reaper upon startup. This allowed us to seamlessly deploy Reaper in clusters generated by the latest versions of tlp-cluster.

A few limitations and caveats of the sidecar in 2.0:

  • Reaper clusters are isolated and you cannot manage several Cassandra clusters with a single Reaper cluster.
  • Authentication to the Reaper UI/backend cannot be shared among Reaper instances, which will make load balancing hard to implement.
  • Snapshots are not supported.
  • The default memory settings of Reaper will probably be too high (2G heap) for the sidecar and should be lowered in order to limit the impact of Reaper on the nodes.

Postgres for distributed modes

So far, we had only implemented the possibility of starting multiple Reaper instances at once when using Apache Cassandra as the storage backend.
We were happy to receive a contribution from Saleil Bhat allowing Postgres to be used for deployments with multiple Reaper instances, which also makes it suitable for sidecar setups.
In recognition of the hard work on this feature, we welcome Saleil as a committer on the project.

Apache Cassandra 4.0 support

Cassandra 4.0 is now available as an alpha release, and there have been many changes we needed to support in Reaper. It is now fully operational, and we will keep working on embracing 4.0’s new features and enhancements.
Reaper can now listen to and display, in real time, live diagnostic events transmitted by Cassandra nodes. More background information can be found in CASSANDRA-12944, and stay tuned for an upcoming TLP blog post on this exciting feature.

Refreshed UI look

While the UI look of Reaper is not as important as its core features, we’re trying to make Reaper as pleasant to use as possible. Reaper 2.0 now brings five UI themes that can be switched from the dropdown menu. You’ll have the choice between 2 dark themes and 3 light themes, which were all partially generated using this online tool.

Superhero reaper theme

Solarized reaper theme

Yeti reaper theme

Flatly reaper theme

United reaper theme

And more

The docker image was improved to avoid running Reaper as root and to allow disabling authentication, thanks to contributions from Miguel and Andrej.
The REST API and spreaper can now forcefully delete a cluster that still has schedules or repair runs in history, making it easier to remove obsolete clusters without having to delete each run individually. Metrics naming was adjusted to make it easier to track repair state on a keyspace. These metrics are now displayed in the UI, in the repair run detail panel:

United reaper theme

Also, inactive/unreachable clusters will now appear last in the cluster list, ensuring active clusters display quickly. Lastly, we brought various performance improvements, especially for Reaper installations with many registered clusters.

Upgrade now

In order to reduce Reaper’s initial startup time, and as we were starting to accumulate a lot of small schema migrations, we collapsed the initial migrations into a single one covering the schema up to Reaper 1.2.2. This means upgrades to Reaper 2.0 are only possible from Reaper 1.2.2 onwards if you are using Apache Cassandra as the storage backend.

The binaries for Reaper 2.0 are available from yum, apt-get, Maven Central, Docker Hub, and are also downloadable as tarball packages. Remember to backup your database before starting the upgrade.

All instructions to download, install, configure, and use Reaper 2.0 are available on the Reaper website.

Note: the docker image in 2.0 seems to be broken currently and we’re actively working on a fix. Sorry for the inconvenience.