Self-Driving Databases: Current and Future
Treasure Data is a fully-managed DBMS service that frees our customers from operating DBMS by themselves. This empowers non-engineers (e.g., executives, marketers, etc.) to collect data and get insights through data analysis in a self-service manner. In October 2017, Oracle unveiled the world-first self-driving database as a service that even supports automated performance tuning, self-scaling and self-repairing. This is a dream database engine that database researchers have been longing for.
Twenty years ago (1997) the keyword was self-tuning database systems. Surajit Chaudhuri, a distinguished scientist at Microsoft Research, had been studying this subject for a long time and some of the results have been implemented in SQL Server as well as in other DBMS products. His self-tuning database paper published in VLDB 1997 was selected as the 10-year best paper at VLDB 2007. Still, most of the major DBMSs need expert tuning and careful schema design to maximize the query performance.
At the biennial Conference on Innovative Data Systems Research (CIDR) 2017, Andrew Pavlo’s group revived this subject giving it a catchy new name: Self-Driving Database Management Systems. If DBMSs can operate themselves and automatically optimize the physical storage for query processing, our systems will no longer be susceptible to sudden traffic changes or poor query performance. It also means our reliance on DBA’s work will be greatly reduced and data engineering work will be much easier since neither tuning nor sophisticated schema design will be required.
There are several major changes that happened to DBMSs in the last 10 years. First was a shift to columnar storage for processing massive amounts of data with distributed processing – major open-source data processing engines, such as Hive, Presto, Spark, etc. have already incorporated columnar storage in their core designs for distributed OLAP workloads. Second was the adoption of in-memory storage for transaction processing – recent development of non-volatile memory (NVM) storage also accelerated this trend because NVM supports fine-grained data writes and can be used complementarily with ensuring the persistency of in-memory DBMSs.
A potential third major development still to come would be to transform a DBMS itself for supporting OLAP and OLTP workloads at the same time. This type of workload is called Hybrid Transactional Analytical Processing (HTAP), a term coined by Gartner. To date, most DBMS products still need to choose a single DBMS engine suited to individual workloads, such as OLAP and OLTP. A true self-driving DBMS should be able to adjust its internals to accommodate different types of workloads at runtime.
Peloton is the prototype of just such a new DBMS, which utilizes machine learning techniques to classify current workloads and deep-learning technologies (e.g., TensorFlow) for forecasting the future ones. Based on these predictions, Peloton controls the number of CPUs to use to optimize the query performance. Peloton also customizes the hybrid storage format layout of rows and columns to reduce the latency of OLAP and OLTP workloads.
To be fully self-driving, however, DBMSs require further automation. Through more than four years of experience in operating database services 24/7 at Treasure Data, I have learned several challenges not mentioned in these research papers:
- Upgrading DBMS systems. To operate a database service in the cloud we must avoid long service downtime. What makes for the biggest challenge though is maintaining continuity of query behavior between versions. Our system is now processing more than a half-million queries / day and checking the compatibility of all these queries is a hard and complex task. Presto, a distributed query engine open-sourced by Facebook, is evolving rapidly, with up to three new releases every month, bringing new features, bug fixes, and performance improvements as well as changes to its internals. It would be impossible for on-premise database systems to upgrade the systems at this pace.
- Multi-tenant deployment. Deploying DBMS systems is still challenging even after the introduction of application container systems like Docker, Kubernetes, YARN, etc. These container-based deployments are too coarse-grained for database applications and fail at optimally utilizing all the system resources. To fully utilize resources to minimize the cost, we need to pack query workloads of multiple users into a shared cluster (multi-tenancy). Unlike containers, which can be mapped to a fixed amount of CPU and memory resources, estimating the required computing resources for multi-tenancy systems under the mixed query workloads is not straightforward.
- Physical storage format change. To optimize the query performance according to the workload, we need to adopt a variety of compression techniques, different partitioning, data formats, etc., but finding the optimal data compression strategy is hard. And also migrating existing data files to the new format will require significant amount of computing resources at scale.
- Metadata management. There has been no silver bullet developed for distributed data storage systems. AWS S3 works well for storing massive amounts of data and is capable of handling high-frequency read requests but its inherent key value store design makes it inefficient at listing files. We need to operate multiple systems for data files (e.g. S3, KVS) and metadata management (e.g. indexes for file lists). For example, Google also ended up managing multiple layers of metadata stores to support their various types of workloads.
- Collecting and analyzing workload logs. Big-data management systems handle massive amounts of workloads, and analyzing the logs produced by these workloads is no small processing task. We are lucky to be able to dog food our own product to collect and analyze large volumes of query logs; if we did not, we would have to run yet another large scale data processing system just to analyze the telemetry data necessary to get the insights for optimizing the system.
- Guiding users to more efficient query writing. For processing large volumes of data, we also need to employ state-of-the-art techniques in computer sciences: one of them is data sketching. Presto has approx_distinct functions, which can accelerate count(distinct x) queries while preserving reasonable accuracy. We can also apply incremental processing to reduce duplicated work in data processing by introducing workflows. Although rewriting the queries in this manner slightly changes its semantics, it does accelerate the speed dramatically. It’s a task that resembles suggesting a better route to a car driver.
If we include these points into considerations, we are still far from a true self-driving DBMS. These systems still need to change themselves to optimize the data processing, while designing and developing database systems still requires considerable human efforts. The extent of what we can do now is to architect the DBMSs as building blocks that the system can compose as optimally as possible to provide the best performance for any given workload. Algorithms and systems to analyze the query workload also have not been established yet, but machine learning approaches are definitely effective in this context.
Some of the challenges and our approaches at Treasure Data are described in the following slides. As our system keeps growing, some numbers are already outdated, for example, 150k queries / day in June 2017 are now 500k queries / day in April 2018:
A self-driving DBMS is a system capable of collecting its own statistics and workload logs and use it to optimize itself. Data sketching algorithms might be useful to reduce the data size. We believe the future of a self-driving DBMS will be a system that is more self-contained as a service and includes its own data ingestion subsystem for collecting both user data and workload logs, query, and transaction processing engines, and machine learning components for self-optimization. Now that deep learning technologies can beat human professionals without requiring any prior knowledge, it might even be possible to build DBMSs without requiring human help.