ScyllaDB’s community is growing faster than ever. From our experienced OSS sea monsters to the recent surge of users adopting ScyllaDB’s database-as-a-service offering, more and more teams are working with ScyllaDB around the globe. ScyllaDB users have questions to ask and a ton of valuable experience to share. That’s why we’re taking our community-building to a new level with the ScyllaDB Community Forum.
The ScyllaDB Community Forum is an open platform where users can learn from one another’s experiences with ScyllaDB. We hope that the forum’s quick access to commonly asked questions – plus its rich searchability – will make it even easier for you to get started with ScyllaDB, integrate it with your evolving ecosystem and pipeline, and try it out with additional use cases within your organization.
The community forum is powered by you, our community. Please help make it a valuable place to share knowledge, skills, experience, and tips through ongoing discussion.
For example, the new community forum is a great place to:
- Get quick access to the most common getting started questions
- Troubleshoot any issues you come across
- Engage in in-depth discussions about new features, configuration tradeoffs, and deployment options
- Search the archives to see how your peers are setting up similar integrations (e.g., ScyllaDB + JanusGraph + TinkerPop)
- Propose a new topic for us to cover in ScyllaDB University
- Share your perspective on a ScyllaDB blog, ask questions about on-demand videos, or tell us more about what types of resources your team is looking for
- Engage with the community: share how you’re using ScyllaDB and what you learned along the way, and get ideas from your peers
To be clear, this is an additional channel that we’re hosting to help connect and foster our community. It is not replacing any of the other communication platforms that you might already be using. Specifically, it is NOT…
- Designed for real-time chat
- A means of accessing our 24/7 Enterprise Support
- A replacement for the ScyllaDB-User Slack, which remains a great way to ask and answer quick questions (though not a great way to search through existing Q&A)
We love Slack, and we truly enjoy the interactions that occur on the ScyllaDB-User Slack. But as the ScyllaDB community grows, it’s all too easy for discussions to get buried – resulting in the same question being asked over and over again in slightly different ways. We hope that the community forum will help persist the most valuable ScyllaDB tips in a way that promotes easier discovery as well as deeper discussion.
A Quick Tour
Anyone can visit and browse the forum, but you need an account in order to post questions or comments. To get started, sign up at https://forum.scylladb.com/, create a profile, and share how you’re using ScyllaDB. Then, have a look around!
We started off by creating some core categories:
- Knowledge Base: Topics for understanding and troubleshooting ScyllaDB. These are frequently asked questions and general topics.
- ScyllaDB: For general questions related to ScyllaDB products. Topics include Troubleshooting, Benchmarks, Data modeling, Drivers, 3rd party integrations, etc.
- Community: Have fun and engage with your fellow community members. Feel free to introduce yourself here, share how you’re using ScyllaDB, add feature requests, and provide feedback.
- University and Training: Questions regarding specific ScyllaDB University courses, lessons, and training events.
- Blog Posts: A place to discuss the latest blog posts.
- Announcements: News and information related to products, the ScyllaDB community, events, webinars, and so on.
Beyond browsing, you can search for something specific. And if a simple search returns too much information, fine-tune it with the advanced search filters.
If you can’t find what you’re looking for, go ahead and create a new topic.
Want to know right away when someone quotes your post, mentions your name, replies to your post, etc.? Just enable notifications and set the preferences to your liking.
The more you engage with the ScyllaDB community, the more badges you’ll earn. There might even be some swag giveaways for top badge holders in the near future. Stay tuned to the forum for updates.
Come Say Hello!
We hope that having a community forum will make it even easier for new and experienced sea monsters to connect and learn from one another. Be sure to visit https://forum.scylladb.com and say hello!
The last week of October was a fantastic time for ScyllaDB’s Research & Development team. The ScyllaDB developers from all over the world gathered in Tbilisi, the beautiful capital of Georgia, to share knowledge, learn from each other, and have fun together.
A Global Gathering
ScyllaDB is a remote-first company with a development team working in distributed locations around the globe. A strong company culture, effective communication, and shared values let us achieve our goals and develop our products smoothly and efficiently. Still, in-person events hold a special place in our hearts. Real-time discussions during technical sessions, eye-to-eye chatter during breaks, and hanging out after hours – that’s what we were missing when the COVID-19 pandemic struck in 2020. That’s why, from now on, we’re planning to hold an R&D summit twice a year. Fingers crossed!
This year, over 50 ScyllaDB developers arrived in Tbilisi from different countries, such as Brazil, the Czech Republic, Denmark, Israel, Japan, Poland, Spain, and Sweden.
Learning, Sharing, and Planning
The summit was filled with technical sessions. If you think that sounds boring, you couldn’t be more wrong. Before the summit, everyone could send in their ideas for session topics. Next, the topics were put to a vote to make sure that the agenda was interesting and meaningful to all the participants. We chose a wide range of topics, including:
- Future user-visible features
- Future of ScyllaDB Cloud
- ScyllaDB drivers
- Raft consensus algorithm
- Serverless architecture
- Distributed transactions in NoSQL
Each session was scheduled for 30 minutes. Here at ScyllaDB, we’re really good at punctuality, but this didn’t quite hold true for the Q&A parts of the summit sessions, which sometimes took longer than the presentation itself. That’s what happens when many people knowledgeable and passionate about the same subject meet in one room.
Together to the Top
While the technical sessions were an excellent opportunity to train our brains, the time came to train our muscles. The trip to the Caucasus Mountains was challenging, especially the hike to see the Gergeti Trinity Church, situated at an elevation of 2170 meters. But it was worth it! We were breathless when we reached the top – and not just because of the hard hike. It was the spectacular view and the magnificent Mount Kazbek that took our breath away.
On our way back to Tbilisi, after a company dinner, we had an opportunity to study the arcana of Georgian cuisine and winemaking. It was a hands-on workshop, which turned the ScyllaDB folks into khinkali and churchkhela masters.
We also enjoyed many other activities that let us bond together, immersed in the historical atmosphere of Tbilisi. Sightseeing tours, company dinners, and group activities (such as ScyllaDB LEGO building) helped us get to know each other better and genuinely feel that we are all one team that shares the same goals, attitudes, and values.
ScyllaDB has a number of open positions directly in Israel and other locations (remote work).
Check out our complete list of the latest openings for developers and other positions on our Careers page.
I’m happy to announce our next ScyllaDB University LIVE event, taking place on December 1, 2022. For those of you who are not familiar with it, ScyllaDB University LIVE is a free half-day virtual event, with instructor-led training sessions from our top ScyllaDB engineers and experts.
For the next ScyllaDB University LIVE, we will offer two parallel tracks – ScyllaDB Essentials and Advanced Topics. We’re excited to offer lots of new content, and even completely new sessions – especially in the Advanced track. You can bounce back and forth between tracks or drop in for any individual session that interests you.
Don’t miss out! Sessions are live, and there will not be an on-demand equivalent, so mark your calendars to attend the live event.
I’ll start by welcoming you on the Main stage. I’ll give a quick overview of the different sessions and what you can expect.
Afterwards, also on the Main stage, Tzach, our VP of Product, will talk about our latest release, ScyllaDB 5.1, new features, changes, and what’s on the roadmap.
|Essentials Track||Advanced Track|
|Getting Started with ScyllaDB
Learn how ScyllaDB works, assess if it’s a good fit for your use case, and see what’s involved in spinning up your first cluster and running some basic queries.
|Advanced Topics in ScyllaDB
Learn how to increase performance and efficiency by mastering the usage of collections, user-defined types, materialized views, secondary indexes, prepared statements, paging, retries, and more.
This session is a “bootcamp” for getting started with ScyllaDB. It covers best practices for NoSQL data modeling, selecting the right compaction strategy for your workload type, selecting and working with drivers, and more.
|Interactive Troubleshooting in ScyllaDB
This session shares the process that our experts use to diagnose and resolve emerging database issues. It will include (anonymized) real-world examples based on what we have seen working with ScyllaDB users.
|Build Your First ScyllaDB-Powered
This session is a hands-on demonstration of how to create a full-stack app powered by ScyllaDB Cloud.
|Leveraging ScyllaDB’s DynamoDB API
This session is a hands-on demonstration of how to build a DynamoDB-compatible application that can be deployed wherever you want: on-premises or on any public cloud. It also covers an example of data migration and the different ways of performing it.
Following the training sessions, we will host an expert panel with special guests ready to answer your most pressing questions about NoSQL, ScyllaDB, and distributed data systems.
You’ll also be invited to complete quizzes, take our hands-on labs and receive certificates of completion (and exclusive ScyllaDB swag!).
Get Started on ScyllaDB University
We recommend that before the event, you complete the ScyllaDB Essentials course on ScyllaDB University to better understand ScyllaDB and how the technology works.
Hope to see you at ScyllaDB University LIVE!
How do you achieve microsecond P99 latency with 1.2M op/sec – for ~180M monthly active users expecting real-time engagement with billions of posts per month? And how do you maintain that for a rapidly-growing service while actually reducing cost? These were the challenges that ShareChat, India’s top social media platform, recently faced. And that’s exactly what Geetish Nayak, Staff Engineer/Architect of Platforms at ShareChat, talked about in ShareChat’s Path to High-Performance NoSQL, a talk that is available on demand.
Spoiler: Geetish and team found that modernizing the NoSQL database powering their services was the key to staying ahead of these challenges. They were able to improve performance 3-5x while reducing costs 50-80%. But the devil is always in the details. How did they orchestrate a migration without disrupting their massive business – and what best practices and strategies did they apply to achieve such impressive results?
About ShareChat, India’s Top Social Media Platform
First, let’s take a step back. In case you’re not one of the many millions using India’s leading multilingual social media platform (ShareChat) or India’s biggest short video platform (Moj), here’s Geetish’s introduction:
The Pressures Driving their Database Migration
With such impressive growth and scale, the team started hitting a number of limitations with their existing NoSQL database as a service (DBaaS). They needed to uplevel performance to support users’ high expectations. “We wanted better performance, with lower latency and higher throughput, so that our app experience was better,” Geetish explained. “With social media, everything is expected fast, so we wanted single-digit millisecond latencies.”
Multi-region disaster recovery was another top concern: services need to remain highly available for their massive user base, especially when there are disaster scenarios. Turning the focus inward, there were two main things that would make the engineering team’s lives easier. First, they needed more insight into database KPIs to facilitate their debugging. And second, they sought greater control over their DBaaS. They wanted the option to change the compaction strategy on the fly for a given use case, adjust how much data to cache on the database, change replication factor and consistency levels, and so on. The more they tried to fine-tune the database for their needs, the more frustrated they grew with the lack of visibility and control.
Exploring Fast NoSQL Alternatives
So, the team started exploring other fast NoSQL database options. Someone shared a ScyllaDB white paper in the team’s Slack, and that sparked Geetish’s interest: “I did a lot more research on aspects like the Seastar framework and the shard-per-core architecture. I saw the NoSQL benchmarks comparing ScyllaDB to the database we were using, as well as other popular NoSQL databases. The numbers looked too good to be true, so we decided to put ScyllaDB to the test with one of our production workloads.”
Using ScyllaDB Operator, they quickly spun up a ScyllaDB cluster on Kubernetes, and tried it out against one of their production workloads. The throughput and latency they were able to squeeze out of a relatively small number of nodes with minimal configuration was impressive. After testing ScyllaDB with additional workloads and achieving even more satisfying results, they decided to “go full throttle” on deploying ScyllaDB across ShareChat.
Geetish and his team then set out to migrate ShareChat’s core use cases to ScyllaDB. For example, ScyllaDB is now powering their chat application, real-time notification framework, counters (for views, likes, shares, comments, and 10+ others), ads data management platform, in-memory database use cases, and data science feature store.
A Peek Into ShareChat’s System Architecture
Here’s a peek at how one of those use cases is architected. The counters powering all the views, likes, shares, and other interactions for 50 million users per day rely on an Apache Kafka cluster that’s hosted internally.
All of the views for a particular post get pushed to a Kafka cluster and aggregated with Kafka Streams according to ShareChat’s business logic. After that aggregation completes, the aggregated counts are written to ScyllaDB using the “atomic counters” data type.
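As a rough illustration of the flow described above (not ShareChat’s actual code), the following Python sketch counts a batch of view events per post, then builds one CQL counter update per aggregate. The keyspace, table, and column names (`engagement.post_counters`, `post_id`, `views`) are hypothetical, and the in-memory aggregation is only a stand-in for the Kafka Streams step.

```python
from collections import Counter

# Hypothetical table, assumed to be defined as:
#   CREATE TABLE engagement.post_counters (post_id text PRIMARY KEY, views counter);
UPDATE_CQL = "UPDATE engagement.post_counters SET views = views + %s WHERE post_id = %s"


def aggregate_views(events):
    """Count view events per post ID (stand-in for the stream aggregation step)."""
    return Counter(event["post_id"] for event in events)


def build_counter_updates(aggregates):
    """Return one (statement, parameters) pair per post, sorted by post ID."""
    return [(UPDATE_CQL, (count, post_id)) for post_id, count in sorted(aggregates.items())]


events = [{"post_id": "p1"}, {"post_id": "p2"}, {"post_id": "p1"}]
updates = build_counter_updates(aggregate_views(events))
for statement, params in updates:
    print(statement, params)
```

Writing one increment per aggregated batch, rather than one per view, is what keeps the write rate to the database well below the raw event rate.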
Go Deeper Into ShareChat’s System Architecture, Migration Strategy, ScyllaDB Best Practices, and DBaaS Experiences
Watch the complete video to hear Geetish share:
- The architecture behind other ShareChat use cases – including their data science feature and real-time communication framework
- Details of how they onboarded ~80TB of data and ~40 services to ScyllaDB – including one cluster that’s 20 TB with a throughput of 1.2M op/sec
- Their performance and cost savings results so far, including a look at how they are tracking KPIs
- The strategy they devised to migrate with zero downtime
- Core ScyllaDB best practices they have adopted
- Their experience working with ScyllaDB as a fully-managed database-as-a-service
Q & A with Geetish Nayak, Staff Engineer/Architect – Platforms at ShareChat
We hosted this webinar in two time zones to better accommodate the global community. However, this meant viewers in one time zone did not have access to the questions answered in the other. Here is a recap of some of the top audience questions, along with Geetish’s responses.
About their ScyllaDB Deployment
Can you talk about optimizing for cost, and also for performance?
We use a lot of best practices. For example, with ScyllaDB, the application connects directly to the database node that contains the sharded data. Within the node, it connects directly to the vCPU owning the data, cutting latency and routing, which optimizes performance and decreases costs. There are savings at both ends: the database is performing nicely, the apps are performing nicely, you don’t require an in-memory cache in between, and you don’t require an in-memory cache in your app. I think all of these things help.
For the counter nodes, what was the RAM & CPU per node?
Three n2-highmem-48 nodes. They easily handle 300K ops (with a read/write ratio of 2:1) at 50 percent CPU.
When building a new use case, what’s the process for the design of the ScyllaDB cluster? Is redundancy in data expected in order to improve latency?
Redundancy is required for both high availability and latency. We use local disks to store data, and when a node goes down the data might get lost. Redundancy helps with that. Because of local disks, our latency numbers are much better.
With the distribution of content in social media, you’re going to have some hot posts where you could have hundreds of thousands or millions of likes. Do you run into hot shards or partitions when dealing with that?
Yes, absolutely. We do run into hot shards. I think one of the diagrams that I shared actually shows the hot partitions when one of the vCPUs was very much bombarded. We ask ourselves: how often are we getting these kinds of queries? Do we need to further partition our data? That’s a call that we have to make. Also, the live migration framework helps us; if you want to migrate from one table in ScyllaDB to another table in ScyllaDB by changing the partition key, you can also do that.
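One common way to “further partition” a hot key, shown here as a general technique rather than ShareChat’s own approach, is to add a bucket number to the partition key so that writes for a single post spread across several partitions, while reads sum over all buckets. The bucket count and key layout below are illustrative assumptions, and a plain dictionary stands in for the table.

```python
import random

NUM_BUCKETS = 8  # assumed bucket count; tune per workload


def write_key(post_id, rng=random):
    """Pick a random bucket so writes for one hot post spread over NUM_BUCKETS partitions."""
    return (post_id, rng.randrange(NUM_BUCKETS))


def read_keys(post_id):
    """A read must visit every bucket and sum the per-bucket counts."""
    return [(post_id, bucket) for bucket in range(NUM_BUCKETS)]


# Simulate 10,000 likes on one hot post against an in-memory stand-in for the table.
table = {}
rng = random.Random(42)
for _ in range(10_000):
    key = write_key("hot_post", rng)
    table[key] = table.get(key, 0) + 1

total = sum(table.get(key, 0) for key in read_keys("hot_post"))
print(total, len(table))
```

The tradeoff is that each read now touches up to `NUM_BUCKETS` partitions, so this tends to suit write-heavy hot keys better than read-heavy ones.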
What is the total size of the data that you migrated from your previous database to ScyllaDB?
In total, 80TB – [and we] plan to add an additional 50 TB. But these are spread across different clusters.
What compaction strategy did you guys use?
We have a default Incremental Compaction Strategy for most of our clusters. We will be moving to Leveled Compaction Strategy as we see that most of our workloads are very read heavy.
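For reference, the compaction strategy is a per-table setting in CQL, so a switch like the one described here comes down to an `ALTER TABLE` statement. The sketch below only builds that statement as a string; the keyspace and table names are hypothetical.

```python
def alter_compaction_cql(keyspace, table, strategy_class):
    """Build the CQL statement that sets a table's compaction strategy."""
    return (
        f"ALTER TABLE {keyspace}.{table} "
        f"WITH compaction = {{'class': '{strategy_class}'}};"
    )


# Hypothetical table name; LeveledCompactionStrategy is the class named in the answer above.
statement = alter_compaction_cql("app", "feed_items", "LeveledCompactionStrategy")
print(statement)
```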
ScyllaDB is a wide column database… does it support aggregation?
Yes, it does support aggregation functions: https://docs.scylladb.com/stable/cql/functions.html#aggregate-functions
About NoSQL Database Comparisons
What was the existing database you were using?
We were using databases available in GCP and also have other vendors for in-memory databases.
Did you consider any other databases?
We were looking for a pure NoSQL database. I had prior experience with Cassandra and knew the challenges there. We already were using some NoSQL databases to benchmark against. ScyllaDB’s website has detailed comparisons against every other major NoSQL database in the market.
Does ScyllaDB give better performance for caching compared to Redis?
I would answer this in 3 points:
- For pure Redis use cases where you are using far more involved data structures provided by Redis, ScyllaDB is not a good choice.
- If you are using Redis as a key-value store, then yes, you can achieve numbers close to those of Redis.
- If you are using Redis to cache some portion of data because your primary database is slow, you can get rid of Redis. With ScyllaDB your primary database is super fast – so why do you need Redis?
In ShareChat we have done #2 and #3.
Editor’s Note: This article was originally published on The New Stack.
GumGum is a company whose platform serves up online ads related to the context in which potential customers are already shopping or searching. (For instance: it will send ads for Zurich restaurants to someone who’s booked travel to Switzerland.) To handle that granular targeting, it relies on its proprietary machine-learning platform, Verity.
“For all of our publishers, we send a list of URLs to Verity,” according to Keith Sader, GumGum’s director of engineering. “Verity goes in and basically categorizes those URLs as different [internal bus] categories. So the IB has tons of taxonomies, based on autos, based upon clothing, based upon entertainment. And then that’s how we do our targeting.”
Verity’s targeting data is stored in DynamoDB, but the rest of GumGum’s data is stored in managed MySQL and its daily tracking data is stored in ScyllaDB, a database designed for data-intensive applications. ScyllaDB, Sader said, helps his company avoid serving audiences the same ads over and over again, by keeping track of which ads customers have already seen.
“That’s where ScyllaDB comes into the picture for us,” he said. “ScyllaDB is our rate limiter on ad serving.”
In this episode of The New Stack’s Makers podcast, Sader and Dor Laor, CEO and co-founder of ScyllaDB, told how GumGum has used ScyllaDB to shift more IT resources to its core business and keep it from repeating ads to audiences that have already seen them, no matter where they travel.
This case study episode of Makers, hosted by Heather Joslyn, TNS features editor, was sponsored by ScyllaDB.
“With ScyllaDB, we have pretty much reduced our entire operations effort to almost nothing… The toughest thing to do in this industry is to make things look easy. And ScyllaDB helped us make ad serving look easy.” Keith Sader, GumGum’s director of engineering
‘Where Do We Spend Our Limited Funds?’
Before adding ScyllaDB to its stack, Sader said, “We had a Cassandra-based system that some very smart people put in. But Cassandra relies upon you to have an engineering staff to support it.
“That’s great. But like many types of systems, managing Cassandra databases is not really what our business makes money at.”
GumGum was hosting its Cassandra database, installed on Amazon Web Services, by itself — and the drain on resources brought the company’s teams to a crossroads, Sader said. “Where do we spend our limited funds? Do we spend it on Cassandra maintenance? Or do we hire someone to do it for us? And that’s really what determined the switch away from a sort of self-installed, self-managed Cassandra to another provider.”
A core issue for GumGum, Sader said, was making sure that it wasn’t over-serving consumers, even as they moved around the globe. “If you see an ad in one place, we need to make sure, if you fly across the country, you don’t see it again,” he said.
That’s an issue Cassandra previously solved for his company, he said. Because ScyllaDB is an API-compatible replacement for Apache Cassandra, it also helped prevent over-serving in all regions of the globe — thus preventing GumGum from losing money.
In addition to managing its database for GumGum and other customers, Laor said that an advantage ScyllaDB brings is an “always on” guarantee.
“We have a big legacy of infrastructure that’s supposed to be resilient,” he said. “For example, every implementation of ours has consistent configurable consistency, so you can have multiple replicas.”
Laor added, “Many many times organizations have multiple data centers. Sometimes it’s for disaster recovery, sometimes it’s also to shorten the latency and be closer to the client.” Replica databases located in data centers that are geographically distributed, he said, protect against failure in any one data center.
Bringing ScyllaDB to GumGum was not without challenges, both Sader and Laor said. When ScyllaDB is added to an organization’s stack, Laor said, it likes to start with as small a deployment as possible.
“But in the GumGum case, all of these clients were new processes,” Laor said. “So hundreds or thousands of processes, all trying to connect to the database, it’s really a connection storm.”
ScyllaDB’s team created a private version of its database to work on the problem and eventually solved it: “We had to massage the algorithm and make sure that all of the [open source] code committers upstream are summing it up.”
It ultimately designed an admission control mechanism that measures the number of parallel requests that the distributed database is handling, and slows down requests that arrived for the first time from a new process. “We tried to have the complexity on our end,” Laor said.
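To make the idea concrete, here is a toy Python model of that kind of admission control. It is a sketch under simplifying assumptions, not ScyllaDB’s implementation: the class name, the fixed in-flight threshold, and the fixed delay are all hypothetical. It tracks how many requests are in flight and, when the system is busy, returns a delay for the first request from a client it has not seen before.

```python
class AdmissionController:
    """Toy admission control: count in-flight requests and slow down first-time clients."""

    def __init__(self, max_in_flight, new_client_delay_s):
        self.max_in_flight = max_in_flight
        self.new_client_delay_s = new_client_delay_s
        self.in_flight = 0
        self.known_clients = set()

    def admit(self, client_id):
        """Return the delay (in seconds) to apply before serving this request."""
        is_new = client_id not in self.known_clients
        self.known_clients.add(client_id)
        overloaded = self.in_flight >= self.max_in_flight
        self.in_flight += 1
        return self.new_client_delay_s if (is_new and overloaded) else 0.0

    def complete(self):
        """Call when a request finishes so the in-flight count stays accurate."""
        self.in_flight -= 1


controller = AdmissionController(max_in_flight=2, new_client_delay_s=0.05)
delays = [controller.admit(client) for client in ["a", "b", "c", "a"]]
print(delays)
```

In this model, established clients are never delayed, so a burst of new processes connecting at once does not slow down traffic that is already flowing.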
GumGum has seen the results of handing off that complexity and toil to a managed database. “With ScyllaDB, we have pretty much reduced our entire operations effort to almost nothing,” Sader said.
He added, “We’re coming into our busy point of the year, ads really get picked up in Q4. So we reach out so we go, ‘Hey, we need more nodes in these regions, can you make that happen for us?’ They go, ‘Yep.’ Give us the things, we pay the money. And it happens.”
In 2021, Sader said, “we increased our volume by probably 75% plus 50%, over our standard. The toughest thing to do in this industry is to make things look easy. And ScyllaDB helped us make ad serving look easy.”
Listen to — or watch — the complete podcast (above) to get more detail about GumGum’s move to a managed database.
ScyllaDB is partnering with our database expert peers at Pythian to provide the community with a free, online High Performance NoSQL Systems Masterclass.
High Performance NoSQL Systems Masterclass
November 9th, 2022, from 08:00 AM to 02:00 PM PST
The Era of NoSQL Choice
This is the era of choice, especially when it comes to databases. A decade ago the questions all revolved around “SQL vs. NoSQL.” These days, that’s no longer the argument. You use SQL when you need tables and JOINs. And you use NoSQL when your data structure or queries would be hampered by sticking with a traditional SQL RDBMS. So it’s not a question of “whether” NoSQL. Instead, it’s “which” NoSQL database you want to apply to your use case. Which NoSQL database will provide the best scalability, reliability, and performance? Which will be flexible enough for your queries? Which will give you the better ROI?
At last count, DB-Engines.com tracks nearly four hundred different database systems — SQL, NoSQL, and entirely different fish altogether. Of all the NoSQL systems, it tracks and ranks 65 key-value databases, 55 document model databases, 38 graph databases, and 13 wide column databases. What complicates matters more is that many modern NoSQL systems are “multi-model,” appearing on more than one of those rankings.
So how do you choose which one is right for you? Do you try to use a single multi-model database to tackle myriad and varied use cases? Or do you try to select a different database to be best-in-breed for each use case you have? Even more saliently, will the database you use in your MVP be able to scale and keep performing for your anticipated tens or hundreds of millions of users — all within budget? Or will you have to rip-and-replace half your tech stack as your traffic starts scaling?
I’ll begin the Masterclass with a session offering a quick survey of the current state of the NoSQL industry, with a special focus on wide column NoSQL (e.g., Apache Cassandra and ScyllaDB). What use cases are they best for? What attributes do they have that make them suitable for certain kinds of workloads and data distribution models over others?
Best Practices for Wide Column Databases
For those not familiar, a wide column NoSQL database is actually a row-based store. You could call it a “key-key-value” store, since it has a partitioning key as well as a clustering key for sorting data within partitions. But how does it actually work? And what are some lessons learned that will help you optimize its performance?
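The “key-key-value” idea can be sketched with a toy in-memory model in Python. This is only an illustration of the data layout, not how any real database stores data; the class and method names are made up for the example. Each partition key maps to a list of rows kept sorted by clustering key, which is what makes range queries within a partition cheap.

```python
from bisect import insort
from collections import defaultdict


class WideColumnTable:
    """Toy "key-key-value" store: partition key -> rows kept sorted by clustering key."""

    def __init__(self):
        self.partitions = defaultdict(list)

    def insert(self, partition_key, clustering_key, value):
        # insort keeps each partition's rows ordered by clustering key.
        insort(self.partitions[partition_key], (clustering_key, value))

    def range_query(self, partition_key, start, end):
        """Return values whose clustering key falls in [start, end], in sorted order."""
        rows = self.partitions.get(partition_key, [])
        return [value for key, value in rows if start <= key <= end]


table = WideColumnTable()
table.insert("sensor-1", "2022-11-09T10:00", 21.5)
table.insert("sensor-1", "2022-11-09T08:00", 19.0)
table.insert("sensor-2", "2022-11-09T09:00", 30.1)
result = table.range_query("sensor-1", "2022-11-09T00:00", "2022-11-09T23:59")
print(result)
```

In a real cluster, the partition key also determines which nodes hold the data, while the clustering key only orders rows inside a partition.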
In the Masterclass, Allan Mason, Lead Database Consultant at Pythian, will focus on the wide-column NoSQL database architecture, using ScyllaDB as an example. For those not familiar, ScyllaDB is based on Apache Cassandra and is similar to many other wide column NoSQL systems, so these skills will be transferable to other databases on the market. He’ll highlight why and how to take advantage of distributed, leaderless active-active high availability clustering models, the robust and flexible Cassandra Query Language (CQL), and the data modeling and queries it supports.
Scaling for Performance
The third session in the Masterclass will be hosted by Felipe Cardeneti Mendes, Solutions Architect at ScyllaDB. He will focus on scaling NoSQL for performance and reliability. He’ll go more into attributes of distributed database operations, deployment, and observability. There are tricks of the trade that all good NoSQL database practitioners should know, and you’ll get an insider’s view of how you can get the most out of your own systems in production.
After the three separate sessions, all three presenters will join together for a lively conversation that ties these concepts together. It also serves as a question and answer session for attendees, to reinforce the lessons and clarify any of the points made.
Testing, Testing, 1, 2, 3!
What separates a Masterclass from simply attending a webinar or conference is this section of the program which will test you on what you learned from each of the Masterclass presentations. Yes, there will be a test, so make sure you pay attention and don’t bury the browser tab. Though don’t worry, there are no “zingers” — all of the questions and correct answers are drawn from the materials presented during the sessions. Also, don’t sweat it too bad if you don’t pass on your first go-around; we’ll offer a retest to our attendees.
In Case You Missed It
This is the third Masterclass in our series with industry experts. In case you missed the prior events, you can watch them on-demand now:
- Our first was the Distributed Data Systems Masterclass, which makes the case for having high performance event streaming systems and a NoSQL database that can keep up with the massive real-time data pipelines they enable. For that, we partnered with industry expert Maheedhar Gunturu and our friends at StreamNative.
- The second was the Performance Engineering Masterclass, which we hosted with our friends at Dynatrace and Grafana k6. This Masterclass focused on operational requirements for observability, tracing and load testing on modern distributed
Sign Up Today!
This is going to be a great opportunity to deepen your professional skills and knowledge. The teams at both ScyllaDB and Pythian are looking forward to hosting you on November 9th! Don’t hesitate. Sign up today.
Right on the heels of an amazing P99 CONF, we now turn our attention to ScyllaDB Summit 2023, our free online annual user conference, which will be held February 15–16, 2023. It’s two days dedicated to the high-performance, low-latency distributed applications driving this next tech cycle. The Call for Speakers (CfS) is now open and we invite you to submit your own proposals. You can find the CfS application at https://sessionize.com/scylladb-summit-2023/.
ScyllaDB founders Dor Laor and Avi Kivity will be joined by our engineering team to highlight the latest product and service announcements, reveal a few surprises, as well as provide detailed dives into our technical capabilities and advanced features.
The most critical part of every ScyllaDB Summit comes from you, our user base: your innovations, achievements, integrations, and journeys to production. We’d love for you to share your stories about building scalable, data intensive applications using ScyllaDB.
Making a Great Submission
There are two parts to making a great talk submission: understanding what our audience wants to hear, and framing the story you want to tell in the best light.
What Our Attendees Want to Hear Most
- Building for this Next Tech Cycle — The world is undergoing a massive shift to cloud-native, blink-of-an-eye response, petabyte-scale applications. How is your organization driving change?
- Real-world ScyllaDB use cases — What are you using ScyllaDB for? On-demand services? Streaming media? AI/ML-driven applications? Shopping carts or customer profiles? Cybersecurity and fraud detection? Time series data or IoT?
- War stories — Did you survive a major migration or a datacenter disaster? From design and architecture considerations to POCs to production deployments, our community loves to hear lessons learned
- Integrations into your data ecosystem — Share your stack! Kafka, Spark, AI/ML pipelines, other databases?
- “Built on ScyllaDB” — Are you embedding ScyllaDB within your own products and services? Are you building amazing graph data systems on top of ScyllaDB and JanusGraph?
- API-first implementations — Did you make a wrapper for CQL? Implement REST or GraphQL? What’s your microservice architecture leveraging ScyllaDB?
- Computer languages and development methods — How are you getting the most from your favorite languages, frameworks and toolkits? What are you re-engineering in Rust? Are you a Pythonista?
- Operational insights — What are your intraday traffic patterns like? Are you deploying via Kubernetes? What observability and tracing tools are you using? Running multi-cloud?
- Open source projects — Are you integrating ScyllaDB with an open source project? Got a GitHub repo to share? Our attendees would love to walk your code
- Hard numbers — Our users love learning specifics of your clusters: nodes, CPUs, RAM and disk, data size, replication factors, IOPS, throughput, latencies, benchmark, stress test results and ROI. Trot out your charts & graphs
- Tips & tricks — We’d love to hear your best ideas, from data modeling to performance tuning to unleashing your inner chaos monkey
- Next steps — What are your future plans?
Tips for Submitting a Successful Proposal
Help us understand why your presentation is perfect for ScyllaDB Summit 2023. Please keep in mind this event is made by and for deeply technical professionals. All presentations and supporting materials must be respectful and inclusive (take a moment to read our Code of Conduct and Diversity & Inclusion Statement).
- Be authentic — Your peers need original ideas with real-world scenarios, relevant examples, and knowledge transfer
- Be catchy — Give your proposal a simple and straightforward title that’ll hook them
- Be interesting — Make sure the subject will be of interest to others; explain why people will want to attend and what they’ll take away from it
- Be complete — Include as much detail about the presentation as possible
- Don’t be “pitchy” — Keep proposals free of marketing and sales. We tend to ignore proposals submitted by PR agencies and require that we can reach the suggested participant directly.
- Be understandable — While you can certainly cite industry terms, try to write a jargon-free proposal that contains clear value for attendees
- Be deliverable — Sessions have a fixed length, and you will not be able to cover everything. The best sessions are concise and focused. Overviews aren’t great in this format; the narrower your topic is, the deeper you can dive into it, giving the audience more to take home
Lessons Learned for Virtual Conferences
We’ve learned a lot making the transition from in-person to virtual event hosting, and we’d like to share the portions of our process that are most relevant to you as a potential speaker. These processes also explain why we have a schedule set well in advance of the conference itself.
- Welcoming speakers of all experience — ScyllaDB Summit will showcase everyone from seasoned pros to first-time speakers, and can span all stages of adoption of our technology. We especially encourage submissions from voices that have been traditionally underrepresented in the tech industry.
- Speaker support — If you are accepted to speak, our team will help you by reviewing and providing feedback on your title and abstract, your content (from your first draft to final slides), and can even coach you on developing your best video recording and speaking techniques.
- Social media support — We’ll provide all speakers with a social media graphic you can share out personally — or provide to your marketing team — to let your colleagues and communities know you’ll be a featured speaker at ScyllaDB Summit 2023.
- All sessions are pre-recorded — We’ll schedule individual recording appointments about a month before the event. Why so early? This helps in a variety of ways. We can ensure your talk will fit the proper session length. We can edit out small goofs, or even do more than one take if needed. You’ll never have to worry if a live demo is going to fail spectacularly! Plus, it helps us spend more time ahead of the event promoting your talk. Speaking of which…
- Video teasers — While we have you recording your session with our production team, we’ll also take the opportunity to get a short (minute or so) promotional video of you saying you’ll be speaking at ScyllaDB Summit. The sooner we can get that out into the world, the more people we can attract to see your talk at the event.
- You get to interact with the audience live throughout the session — It does feel a bit odd to see yourself on the main stage while you’re a member of the audience, but this gives you the chance to chat and interact with other attendees live the whole length of your session.
- Speaker’s Lounge — In face-to-face conferences, we all love that chance to gaggle in the hallways after a particularly awesome session. We’ve captured the spirit of that live event experience in our virtual Speaker’s Lounge — our virtual talk show. Once you wrap your scheduled session, as a speaker you’ll make your way to the lounge along with any interested attendees who wish to follow. We’ll have prepared some questions. The audience can ask their own, too. In fact, since you’ll be in the lounge along with other speakers, be prepared to pepper each other with questions! We’ve found the interchanges quite lively, and many speakers love to stick around in the lounge long after their scheduled talks are done.
The next part is up to you! Take a day or two to think about what you’d like to talk about. Bounce some ideas off your teammates and professional colleagues if you wish. Just don’t take too long and miss our November 23rd deadline! We’re looking forward to reviewing all your great ideas. And if you have any questions which we haven’t answered above, we welcome you to send them to us at firstname.lastname@example.org.
Editor’s Note: The deadline for submissions has been extended from November 11th to November 23rd, 2022.
P99 CONF 2022 is a wrap! If you were among the thousands of engineers in attendance, thank you. P99 CONF was designed to connect and advance the community of engineers obsessed with all things performance and P99 – and yes, that might involve questioning whether P99 itself remains a valuable metric. Your sharp speaker questions (and sometimes snarky chat) are vital to making the conference a compelling community resource.
Also, a special thanks to the many speakers who joined us far outside of their normal working hours. From New Zealand, across Oceania, Asia, Europe and the UK, Africa, South America, and North America all the way to the north shore of Kauai, it’s fair to assume that much coffee was consumed.
The conference is over, but you can keep the P99 conversation going with your team and engineering peers across the industry. Here are a few options…
(Re)Watch the Talks On Demand
All talks are now available on demand, with the accompanying slide decks. Rewatch anything that you missed, and share what you found interesting with your colleagues and social network. We provided a run-down of Day 1 sessions in yesterday’s blog. Here’s what happened on Day 2:
- Charity Majors shared the performance lessons we can all learn from game developers, who were among the first to run up against the limits of low-cardinality tools.
- Bryan Cantrill weighed in on allowing engineers to make their own tools, resulting in better systems delivered faster and with greater confidence.
- Avi Kivity creatively applied tools to take us on a unique journey into what I/O-heavy database workloads look like from the perspective of a fast NVMe SSD.
- And over 30 more engineering talks covering topics like Rust rewrites, Go performance tuning, Linux tracing, Linux kernel vs DPDK, and more.
Speakers Lounge Highlights
Without leaving your work area of choice, you could instantly connect with a dizzying array of expert speakers, as well as tons of insightful fellow attendees. After the real-time chat during each session, any attendee could follow speakers into the Speakers Lounge, which was a lively venue for casual chat with host Peter Corless, plus the other speakers in the lounge at the time.
Here’s Peter’s take:
“The Speakers Lounge was a great opportunity to dig deeper into the technology, development methodologies, use cases, as well as the backgrounds of the speakers and their organizations.
It also allowed our audience to further engage with their favorite speakers. To ask the deeper questions they had on their mind beyond what was covered in the sessions themselves.”
The lounge was recorded this year. If you want to catch up on the moments you missed, stay tuned to the P99 CONF Twitter handle.
Consume – or Contribute – More Technical Content
ScyllaDB, the host of P99 CONF, is known for its impressive engineering feats. We’re proud of our deeply technical engineering blogs and tech talks detailing how we approached these challenges – what worked, what didn’t, and what we learned along the way. Start browsing here.
Also, we’d love to help you share your own low-latency strategies across the P99 CONF community. If you have submissions or want to brainstorm ideas, ping us at email@example.com.
Join the High Performance NoSQL Masterclass
Is your project’s database performance holding you back? Then join – or invite your teammates to join – our upcoming masterclass on best practices for highly scalable, high performance NoSQL. The format is a lot like P99 CONF: interactive with in-session chats, follow-up speaker interviews, lounges, and of course some fun contests and giveaways.
Experts from Pythian and ScyllaDB will walk you through best practices for supporting real-time operational workloads at massive scale and speed. Specifically, we’ll start with an overview of core NoSQL options and tradeoffs, demonstrate principles to keep in mind when adopting NoSQL, and then explore advanced strategies to squeeze additional performance out of your NoSQL system. At the end, you will have the opportunity to demonstrate what you learned and earn a certificate that shows your achievement.
You’ll learn how to:
- Match database models and options to your specific use case
- Establish a solid foundation for highly scalable, high performance NoSQL
- Proactively identify and resolve emerging performance issues
- Optimize the system to support continued growth
P99 CONF by the Tweet
Here’s an overview of conference highlights from attendees’ points of view. Be sure to follow @P99CONF on Twitter for additional insights from speakers, technical discussion of P99 CONF themes, and news about P99 CONF 2023 – which we’re already starting to plan!
What a day! Not only I will get the chance to present #cachegrand at the #p99conf later on, but I have just received the new HW (2xAMD EPYC 7501 and 256GB of ram) I was waiting for a new build to better #benchmark cachegrand! Waiting for the motherboad, then will be build time! pic.twitter.com/Oqj4ea4KuM
— Daniele Albano (@daniele_dll) October 19, 2022
Any good talk on "misery metrics" (and the consequences) will lead with a slide like this.#ScyllaDB #P99Conf opens with a @giltene, CTO & Co-Founder, @AzulSystems keynote, about the trickiness of measuring latency in data pipelines, with lessons learned. pic.twitter.com/PJvHKUjqPD
— Paul Philleo (@philpauleo) October 19, 2022
.@giltene dishing out the truth about how percentiles lie at #p99conf. Take the red pill folks. "These charts are wasting a lot of your life. The 5% of things worse than p95 aren't shown" #scylladb pic.twitter.com/0buLjGdPvJ
— Fred Moyer (@phredmoyer) October 19, 2022
#p99conf with @giltene Misery Metrics and Consequences. "This is the chart to show if you want to hide reality… Percentiles are an act of hiding reality. It might give you a sense of avg performance, but if you care about the behavior of your systems you need to look deeper." pic.twitter.com/XLJDk554nG
— Rachel Stephens (@rstephensme) October 19, 2022
I have been ranting and raving for decades about all the topics #p99conf #scylladb is covering today. I have coped with too many folk think that 95% was *good* and I'd explain – if your steering wheel failed one time in 20, how long would you live? https://t.co/CPvXzM65Ob
— Dave Taht (guy/grumpy) (@mtaht) October 19, 2022
No matter how perfect the environment, the measurement, the fine-tuning, it's wise not to let perfection become the enemy of the good. And to focus on tracking meaningful metrics. #LatencyTipoftheDay pic.twitter.com/BKqKm3OfQw
— Paul Philleo (@philpauleo) October 19, 2022
I'm thoroughly enjoying #P99CONF from #ScyllaDB. Amazing knowledge, lessons, and tools about all aspects of high performance that drive our technological world. As we educators work to prepare students to work in this sector, it's great that conferences like this exist.
— Cam Macdonell (@cjmacdonell) October 19, 2022
#p99conf #scylladb as much as I am liking this conference, and there seems to be lofts of e2e principle thinking here – a bit more deep thought on more intelligently shedding load than retries and exponential backoff, IMHO. https://t.co/l4apVD50Fl
— Dave Taht (guy/grumpy) (@mtaht) October 19, 2022
.@petercorless and @mikeb2701 in the Speaker's Lounge, digging a little deeper, with @mitsuhiko after the mid-day day 1 keynote (about building ingestion and processing pipelines to accommodate complex data payloads and events) at #scylladb #P99Conf. pic.twitter.com/0JRH064ZiJ
— Paul Philleo (@philpauleo) October 19, 2022
I still find remote conferences weird but @P99CONF was quite fun so far i have to admit.
— Armin Ronacher (@mitsuhiko) October 19, 2022
I really enjoyed the talks (and hallway track!) yesterday at #p99conf, and I am looking forward to giving my own this morning on the primacy of toolmaking. The talks are prerecorded and registration is free — so you can join me in heckling myself! https://t.co/MY15cF0S1I
— Bryan Cantrill (@bcantrill) October 20, 2022
In @mipsytipsy's (CTO, @honeycombio) day 2 #ScyllaDB #P99Conf keynote, this "O.D.D." (#observability driven development) advice got my attention — about how to write & deploy code with instrumentation, given increasingly complex, highly monitored, user-oriented environments. pic.twitter.com/VD1Vq8YSRV
— Paul Philleo (@philpauleo) October 20, 2022
.@bcantrill @oxidecomputer bringing the fire in passionate support of #toolmaking as a source of #innovation and #investment, even if it may be risky. As the last point says, is it's always paying off, maybe taking a little more risk is the way to go. #p99conf #scylladb pic.twitter.com/xbu7m5UE8d
— Paul Philleo (@philpauleo) October 20, 2022
Day 2 @P99CONF
Very informative and interesting q&a session @P99CONF #P99CONF #ScyllaDB
Learn lots of concept about #observability and #SLOs , #toolmaking by
Co-founder/CTO of @honeycombio @mipsytipsy
And rocking @oxidecomputer's @bcantrill pic.twitter.com/WwVOHEmWfU
— ADITYA DAS (@ADITYA90546170) October 20, 2022
When looking to optimize servers for high throughput and low latency, high level optimizations are a strong starting point — Alexey Ivanov, Software Engineer, @dapperlabs. (Good examples are included on the slide as well.) #P99Conf #Scylladb #server #optimization pic.twitter.com/BqbB83W55I
— Paul Philleo (@philpauleo) October 20, 2022
Amazing atmosphere pic.twitter.com/qVDmeOHiTt
— Bartłomiej Płotka (@bwplotka) October 20, 2022
Here is my @P99CONF talk! Some of you might have seen me give this talk before, but this was a really good recording of it.
I've been joining the conference as I can and everything has been high quality and fun!https://t.co/CZhcWqzUM2
— William (Bill) Kennedy (@goinggodotnet) October 20, 2022
As the end of day 2 of #P99Conf nears, the event is finishing strong with talks such as this one: how #MachineLearning can optimize #cloudnative reliability, efficiency and performance — by @liko9, Global Director of Solutions Architecture at @stormforge. #kubernetes pic.twitter.com/wz4n9iyYkp
— Paul Philleo (@philpauleo) October 20, 2022