skip to main content
research-article
Open access

PolarDB-IMCI: A Cloud-Native HTAP Database System at Alibaba

Published: 20 June 2023 Publication History
  • Get Citation Alerts
  • Abstract

    Cloud-native databases have become the de-facto choice for mission-critical applications on the cloud due to the need for high availability, resource elasticity, and cost efficiency. Meanwhile, driven by the increasing connectivity between data generation and analysis, users prefer a single database to efficiently process both OLTP and OLAP workloads, which enhances data freshness and reduces the complexity of data synchronization and the overall business cost.
    In this paper, we summarize five crucial design goals for a cloud-native HTAP database based on our experience and customers' feedback, i.e., transparency, competitive OLAP performance, minimal perturbation on OLTP workloads, high data freshness, and excellent resource elasticity. As our solution to realize these goals, we present PolarDB-IMCI, a cloud-native HTAP database system designed and deployed at Alibaba Cloud. Our evaluation results show that PolarDB-IMCI is able to handle HTAP efficiently on both experimental and production workloads; notably, it speeds up analytical queries up to ×149 on TPC-H (100GB). PolarDB-IMCI introduces low visibility delay and little performance perturbation on OLTP workloads (<5%), and resource elasticity can be achieved by scaling out in tens of seconds.

    Supplemental Material

    MP4 File
    Presentation video for SIGMOD 2023

    References

    [1]
    Cagri Balkesen, Jens Teubner, Gustavo Alonso, and M. Tamer Ö zsu. 2013. Main-memory hash joins on multi-core CPUs: Tuning to the underlying hardware. In 29th IEEE International Conference on Data Engineering, ICDE 2013, Brisbane, Australia, April 8--12, 2013. IEEE Computer Society, 362--373.
    [2]
    Ronald Barber, Matt Huras, Guy Lohman, C Mohan, Rene Mueller, Fatma Özcan, Hamid Pirahesh, Vijayshankar Raman, Richard Sidle, Oleg Sidorkin, et al. 2016. Wildfire: Concurrent blazing data ingest and analytics. In Proceedings of the 2016 International Conference on Management of Data. 2077--2080.
    [3]
    Ronald Barber, Guy M. Lohman, Ippokratis Pandis, Vijayshankar Raman, Richard Sidle, Gopi K. Attaluri, Naresh Chainani, Sam Lightstone, and David Sharpe. 2014. Memory-Efficient Hash Joins. Proc. VLDB Endow., Vol. 8, 4 (2014), 353--364.
    [4]
    Peter A. Boncz, Marcin Zukowski, and Niels Nes. 2005. MonetDB/X100: Hyper-Pipelining Query Execution. In Second Biennial Conference on Innovative Data Systems Research, CIDR 2005, Asilomar, CA, USA, January 4--7, 2005, Online Proceedings. www.cidrdb.org, 225--237.
    [5]
    Dhruba Borthakur, Jonathan Gray, Joydeep Sen Sarma, Kannan Muthukkaruppan, Nicolas Spiegelberg, Hairong Kuang, Karthik Ranganathan, Dmytro Molkov, Aravind Menon, Samuel Rash, Rodrigo Schmidt, and Amitanand Aiyer. 2011. Apache Hadoop Goes Realtime at Facebook. In Proceedings of the 2011 ACM SIGMOD International Conference on Management of Data (SIGMOD '11). Association for Computing Machinery, New York, NY, USA, 1071--1080.
    [6]
    Dennis Butterstein, Daniel Martin, Knut Stolze, Felix Beier, Jia Zhong, and Lingyun Wang. 2020. Replication at the speed of change: a fast, scalable replication solution for near real-time HTAP processing. Proceedings of the VLDB Endowment, Vol. 13, 12 (2020), 3245--3257.
    [7]
    Shaosheng Cao, XinXing Yang, Cen Chen, Jun Zhou, Xiaolong Li, and Yuan Qi. 2019. TitAnt: Online Real-Time Transaction Fraud Detection in Ant Financial. Proc. VLDB Endow., Vol. 12, 12 (aug 2019), 2082--2093.
    [8]
    Wei Cao, Zhenjun Liu, Peng Wang, Sen Chen, Caifeng Zhu, Song Zheng, Yuhui Wang, and Guoqing Ma. 2018. PolarFS: An ultralow latency and failure resilient distributed file system for shared storage cloud database. Proceedings of the VLDB Endowment, Vol. 11, 12 (2018), 1849--1862.
    [9]
    Wei Cao, Yingqiang Zhang, Xinjun Yang, Feifei Li, Sheng Wang, Qingda Hu, Xuntao Cheng, Zongzhi Chen, Zhenjun Liu, Jing Fang, Bo Wang, Yuhui Wang, Haiqing Sun, Ze Yang, Zhushi Cheng, Sen Chen, Jian Wu, Wei Hu, Jianwei Zhao, Yusong Gao, Songlu Cai, Yunyang Zhang, and Jiawang Tong. 2021. PolarDB Serverless: A Cloud Native Database for Disaggregated Data Centers. In SIGMOD '21: International Conference on Management of Data, Virtual Event, China, June 20--25, 2021. ACM, 2477--2489.
    [10]
    Surajit Chaudhuri, Gautam Das, and Utkarsh Srivastava. 2004. Effective Use of Block-Level Sampling in Statistics Estimation. In Proceedings of the ACM SIGMOD International Conference on Management of Data, Paris, France, June 13--18, 2004. ACM, 287--298.
    [11]
    Surajit Chaudhuri, Rajeev Motwani, and Vivek R. Narasayya. 1998. Random Sampling for Histogram Construction: How much is enough?. In SIGMOD 1998, Proceedings ACM SIGMOD International Conference on Management of Data, June 2--4, 1998, Seattle, Washington, USA. ACM Press, 436--447.
    [12]
    Jianjun Chen, Yonghua Ding, Ye Liu, Fangshi Li, Li Zhang, Mingyi Zhang, Kui Wei, Lixun Cao, Dan Zou, Yang Liu, et al. 2022a. ByteHTAP: bytedance's HTAP system with high data freshness and strong data consistency. Proceedings of the VLDB Endowment, Vol. 15, 12 (2022), 3411--3424.
    [13]
    Shimin Chen, Anastassia Ailamaki, Phillip B. Gibbons, and Todd C. Mowry. 2004. Improving Hash Join Performance through Prefetching. In Proceedings of the 20th International Conference on Data Engineering, ICDE 2004, 30 March - 2 April 2004, Boston, MA, USA. IEEE Computer Society, 116--127.
    [14]
    Zongzhi Chen, Xinjun Yang, Feifei Li, Xuntao Cheng, Qingda Hu, Zheyu Miao, Rongbiao Xie, Xiaofei Wu, Kang Wang, Zhao Song, et al. 2022b. CloudJump: optimizing cloud databases for cloud storages. Proceedings of the VLDB Endowment, Vol. 15, 12 (2022), 3432--3444.
    [15]
    Inc. ClickHouse. 2022. ClickHouse - open source distributed column-oriented DBMS. https://github.com/ClickHouse/ClickHouse/tree/22.6.
    [16]
    Inc. ClickHouse. 2023. ClickHouse - Roadmap 2023. https://github.com/ClickHouse/ClickHouse/issues/44767.
    [17]
    Richard L. Cole, Florian Funke, Leo Giakoumakis, Wey Guy, Alfons Kemper, Stefan Krompass, Harumi A. Kuno, Raghunath Othayoth Nambiar, Thomas Neumann, Meikel Poess, Kai-Uwe Sattler, Michael Seibold, Eric Simon, and Florian Waas. 2011. The mixed workload CH-benCHmark. In Proceedings of the Fourth International Workshop on Testing Database Systems, DBTest 2011, Athens, Greece, June 13, 2011. ACM, 8.
    [18]
    Apache Community. 2023. Apache Flink. https://flink.apache.org/.
    [19]
    THE TRANSACTION PROCESSING COUNCIL. 2014. TPC-C. http://www.tpc.org/tpcc/.
    [20]
    THE TRANSACTION PROCESSING COUNCIL. 2023. TPC-H. http://www.tpc.org/tpch/.
    [21]
    Benoit Dageville, Thierry Cruanes, Marcin Zukowski, Vadim Antonov, Artin Avanes, Jon Bock, Jonathan Claybaugh, Daniel Engovatov, Martin Hentschel, Jiansheng Huang, et al. 2016. The snowflake elastic data warehouse. In Proceedings of the 2016 International Conference on Management of Data. 215--226.
    [22]
    Lei Deng, Jerry Gao, and Chandrasekar Vuppalapati. 2015. Building a Big Data Analytics Service Framework for Mobile Advertising and Marketing. In First IEEE International Conference on Big Data Computing Service and Applications, BigDataService 2015, Redwood City, CA, USA, March 30 - April 2, 2015. IEEE Computer Society, 256--266.
    [23]
    Alex Depoutovitch, Chong Chen, Jin Chen, Paul Larson, Shu Lin, Jack Ng, Wenlin Cui, Qiang Liu, Wei Huang, Yong Xiao, et al. 2020. Taurus database: How to be fast, available, and frugal in the cloud. In Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data. 1463--1478.
    [24]
    James R Driscoll, Neil Sarnak, Daniel D Sleator, and Robert E Tarjan. 1989. Making data structures persistent. Journal of computer and system sciences, Vol. 38, 1 (1989), 86--124.
    [25]
    Franz F"a rber, Norman May, Wolfgang Lehner, Philipp Große, Ingo Mü ller, Hannes Rauhe, and Jonathan Dees. 2012. The SAP HANA Database -- An Architecture Overview. IEEE Data Eng. Bull., Vol. 35, 1 (2012), 28--33.
    [26]
    Anurag Gupta, Deepak Agarwal, Derek Tan, Jakub Kulesza, Rahul Pathak, Stefano Stefani, and Vidhya Srinivasan. 2015. Amazon redshift and the case for simpler data warehouses. In Proceedings of the 2015 ACM SIGMOD international conference on management of data. 1917--1923.
    [27]
    Peter J Haas and Lynne Stokes. 1998. Estimating the number of classes in a finite population. J. Amer. Statist. Assoc., Vol. 93, 444 (1998), 1475--1487.
    [28]
    Dongxu Huang, Qi Liu, Qiu Cui, Zhuhe Fang, Xiaoyu Ma, Fei Xu, Li Shen, Liu Tang, Yuxing Zhou, Menglong Huang, et al. 2020. TiDB: a Raft-based HTAP database. Proceedings of the VLDB Endowment, Vol. 13, 12 (2020), 3072--3084.
    [29]
    Gui Huang, Xuntao Cheng, Jianying Wang, Yujie Wang, Dengcheng He, Tieying Zhang, Feifei Li, Sheng Wang, Wei Cao, and Qiang Li. 2019. X-engine: An optimized storage engine for large-scale e-commerce transaction processing. In Proceedings of the ACM SIGMOD International Conference on Management of Data. Association for Computing Machinery, 651--665.
    [30]
    Shiva Jahangiri, Michael J. Carey, and Johann-Christoph Freytag. 2022. Design Trade-offs for a Robust Dynamic Hybrid Hash Join. Proc. VLDB Endow., Vol. 15, 10 (2022), 2257--2269.
    [31]
    Tirthankar Lahiri, Shasank Chavan, Maria Colgan, Dinesh Das, Amit Ganesh, Mike Gleeson, Sanket Hase, Allison Holloway, Jesse Kamp, Teck-Hua Lee, et al. 2015. Oracle database in-memory: A dual format in-memory database. In 2015 IEEE 31st International Conference on Data Engineering. IEEE, 1253--1258.
    [32]
    Andrew Lamb, Matt Fuller, Ramakrishna Varadarajan, Nga Tran, Ben Vandiver, Lyric Doshi, and Chuck Bear. 2012. The Vertica Analytic Database: C-Store 7 Years Later. Proc. VLDB Endow., Vol. 5, 12 (2012), 1790--1801.
    [33]
    Per-Åke Larson, Adrian Birka, Eric N Hanson, Weiyun Huang, Michal Nowakiewicz, and Vassilis Papadimos. 2015. Real-time analytical processing with SQL server. Proceedings of the VLDB Endowment, Vol. 8, 12 (2015), 1740--1751.
    [34]
    Per-Åke Larson, Cipri Clinciu, Eric N Hanson, Artem Oks, Susan L Price, Srikumar Rangarajan, Aleksandras Surna, and Qingqing Zhou. 2011. SQL server column store indexes. In Proceedings of the 2011 ACM SIGMOD International Conference on Management of data. 1177--1184.
    [35]
    Juchang Lee, SeungHyun Moon, Kyu Hwan Kim, Deok Hoe Kim, Sang Kyun Cha, and Wook-Shin Han. 2017. Parallel replication across formats in SAP HANA for scaling out mixed OLTP/OLAP workloads. Proceedings of the VLDB Endowment, Vol. 10, 12 (2017), 1598--1609.
    [36]
    Viktor Leis, Peter A. Boncz, Alfons Kemper, and Thomas Neumann. 2014. Morsel-driven parallelism: a NUMA-aware query evaluation framework for the many-core age. In International Conference on Management of Data, SIGMOD 2014, Snowbird, UT, USA, June 22--27, 2014. ACM, 743--754.
    [37]
    Guoliang Li, Haowen Dong, and Chao Zhang. 2022. Cloud Databases: New Techniques, Challenges, and Opportunities. Proc. VLDB Endow., Vol. 15, 12 (2022), 3758--3761.
    [38]
    Meng Li, Zheyu Miao, Di Wu, Feifei Li, Sheng Wang, Wei Cao, Zhi Qiao, Yubin Ruan, Yukun Liang, Jimmy Yang, Haipeng Dai, and Guihai Chen. 2023. ROVEC: Runtime Optimization of Vectorized Expression Evaluation for Column Store. IEEE Trans. Knowl. Data Eng., Vol. 35, 3 (2023), 3045--3058.
    [39]
    Gilad Mishne, Jeff Dalton, Zhenghua Li, Aneesh Sharma, and Jimmy Lin. 2013. Fast data in the era of big data: Twitter's real-time related query suggestion architecture. In Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data. 1147--1158.
    [40]
    Guido Moerkotte and Thomas Neumann. 2008. Dynamic programming strikes back. In Proceedings of the ACM SIGMOD International Conference on Management of Data, SIGMOD 2008, Vancouver, BC, Canada, June 10--12, 2008. ACM, 539--552.
    [41]
    MySQL. 2019. MySQL 8.0.18 (2019--10--14, General Availability). https://dev.mysql.com/doc/relnotes/mysql/8.0/en/news-8-0--18.html.
    [42]
    Diego Ongaro and John Ousterhout. 2014. In search of an understandable consensus algorithm. In 2014 USENIX Annual Technical Conference (Usenix ATC 14). 305--319.
    [43]
    Oracle. 2018. Database-Level Supplemental Logging. https://docs.oracle.com/database/121/SUTIL/GUID-D2DDD67C-E1CC-45A6-A2A7--198E4C142FA3.htm.
    [44]
    Sukhada Pendse, Vasudha Krishnaswamy, Kartik Kulkarni, Yunrui Li, Tirthankar Lahiri, Vivekanandhan Raja, Jing Zheng, Mahesh Girkar, and Akshay Kulkarni. 2020. Oracle database in-memory on active data guard: Real-time analytics on a standby database. In 2020 IEEE 36th International Conference on Data Engineering (ICDE). IEEE, 1570--1578.
    [45]
    Adam Prout, Szu-Po Wang, Joseph Victor, Zhou Sun, Yongzhu Li, Jack Chen, Evan Bergeron, Eric N. Hanson, Robert Walzer, Rodrigo Gomes, and Nikita Shamgunov. 2022. Cloud-Native Transactions and Analytics in SingleStore. In SIGMOD '22: International Conference on Management of Data, Philadelphia, PA, USA, June 12 - 17, 2022. ACM, 2340--2352.
    [46]
    Vijayshankar Raman, Gopi Attaluri, Ronald Barber, Naresh Chainani, David Kalmuk, Vincent KulandaiSamy, Jens Leenstra, Sam Lightstone, Shaorong Liu, Guy M Lohman, et al. 2013. DB2 with BLU acceleration: So much more than just a column store. Proceedings of the VLDB Endowment, Vol. 6, 11 (2013), 1080--1091.
    [47]
    Sijie Shen, Rong Chen, Haibo Chen, and Binyu Zang. 2021. Retrofitting High Availability Mechanism to Tame Hybrid Transaction/Analytical Processing. In 15th $$USENIX$$ Symposium on Operating Systems Design and Implementation ($$OSDI$$ 21). 219--238.
    [48]
    Vishal Sikka, Franz F"a rber, Wolfgang Lehner, Sang Kyun Cha, Thomas Peh, and Christof Bornhö vd. 2012. Efficient transaction processing in SAP HANA database: the end of a column store myth. In Proceedings of the ACM SIGMOD International Conference on Management of Data, SIGMOD 2012, Scottsdale, AZ, USA, May 20--24, 2012. ACM, 731--742.
    [49]
    SysBench. 2023. SysBench. https://github.com/akopytov/sysbench.
    [50]
    Ben Vandiver, Shreya Prasad, Pratibha Rana, Eden Zik, Amin Saeidi, Pratyush Parimal, Styliani Pantela, and Jaimin Dave. 2018. Eon Mode: Bringing the Vertica Columnar Database to the Cloud. In Proceedings of the 2018 International Conference on Management of Data, SIGMOD Conference 2018, Houston, TX, USA, June 10--15, 2018. ACM, 797--809.
    [51]
    Panos Vassiliadis. 2009. A survey of extract--transform--load technology. International Journal of Data Warehousing and Mining (IJDWM), Vol. 5, 3 (2009), 1--27.
    [52]
    Alejandro Vera-Baquero, Ricardo Colomo-Palacios, and Owen Molloy. 2016. Real-time business activity monitoring and analysis of process performance on big-data domains. Telematics and Informatics, Vol. 33, 3 (2016), 793--807.
    [53]
    Alexandre Verbitski, Anurag Gupta, Debanjan Saha, Murali Brahmadesam, Kamal Gupta, Raman Mittal, Sailesh Krishnamurthy, Sandor Maurice, Tengiz Kharatishvili, and Xiaofeng Bao. 2017. Amazon Aurora: Design Considerations for High Throughput Cloud-Native Relational Databases. In Proceedings of the 2017 ACM International Conference on Management of Data, SIGMOD Conference 2017, Chicago, IL, USA, May 14--19, 2017. ACM, 1041--1052.
    [54]
    Alexandre Verbitski, Anurag Gupta, Debanjan Saha, James Corey, Kamal Gupta, Murali Brahmadesam, Raman Mittal, Sailesh Krishnamurthy, Sandor Maurice, Tengiz Kharatishvilli, et al. 2018. Amazon aurora: On avoiding distributed consensus for i/os, commits, and membership changes. In Proceedings of the 2018 International Conference on Management of Data. 789--796.
    [55]
    Jiacheng Yang, Ian Rae, Jun Xu, Jeff Shute, Zhan Yuan, Kelvin Lau, Qiang Zeng, Xi Zhao, Jun Ma, Ziyang Chen, et al. 2020. F1 Lightning: HTAP as a Service. Proceedings of the VLDB Endowment, Vol. 13, 12 (2020), 3313--3325.
    [56]
    Chaoqun Zhan, Maomeng Su, Chuangxian Wei, Xiaoqiang Peng, Liang Lin, Sheng Wang, Zhe Chen, Feifei Li, Yue Pan, Fang Zheng, and Chengliang Chai. 2019. AnalyticDB: Real-time OLAP Database System at Alibaba Cloud. Proc. VLDB Endow., Vol. 12, 12 (2019), 2059--2070.
    [57]
    Jun Zhou, Xiaolong Li, Peilin Zhao, Chaochao Chen, Longfei Li, Xinxing Yang, Qing Cui, Jin Yu, Xu Chen, Yi Ding, and Yuan Alan Qi. 2017. KunPeng: Parameter Server Based Distributed Learning Systems and Its Applications in Alibaba and Ant Financial. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '17). Association for Computing Machinery, New York, NY, USA, 1693--1702.

    Cited By

    View all
    • (2024)Visualization-Aware Time Series Min-Max Caching with Error Bound GuaranteesProceedings of the VLDB Endowment10.14778/3659437.365946017:8(2091-2103)Online publication date: 31-May-2024
    • (2024)Performance-Based Pricing for Federated Learning via AuctionProceedings of the VLDB Endowment10.14778/3648160.364816917:6(1269-1282)Online publication date: 3-May-2024
    • (2024)Hybrid Prompt Learning for Generating Justifications of Security Risks in Automation RulesACM Transactions on Intelligent Systems and Technology10.1145/3675401Online publication date: 29-Jun-2024
    • Show More Cited By

    Recommendations

    Comments

    Information & Contributors

    Information

    Published In

    cover image Proceedings of the ACM on Management of Data
    Proceedings of the ACM on Management of Data  Volume 1, Issue 2
    PACMMOD
    June 2023
    2310 pages
    EISSN:2836-6573
    DOI:10.1145/3605748
    Issue’s Table of Contents
    This work is licensed under a Creative Commons Attribution International 4.0 License.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 20 June 2023
    Published in PACMMOD Volume 1, Issue 2

    Author Tags

    1. cloud databases
    2. hybrid transactional and analytical processing

    Qualifiers

    • Research-article

    Contributors

    Other Metrics

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months)895
    • Downloads (Last 6 weeks)77

    Other Metrics

    Citations

    Cited By

    View all
    • (2024)Visualization-Aware Time Series Min-Max Caching with Error Bound GuaranteesProceedings of the VLDB Endowment10.14778/3659437.365946017:8(2091-2103)Online publication date: 31-May-2024
    • (2024)Performance-Based Pricing for Federated Learning via AuctionProceedings of the VLDB Endowment10.14778/3648160.364816917:6(1269-1282)Online publication date: 3-May-2024
    • (2024)Hybrid Prompt Learning for Generating Justifications of Security Risks in Automation RulesACM Transactions on Intelligent Systems and Technology10.1145/3675401Online publication date: 29-Jun-2024
    • (2024)Databases in Edge and Fog Environments: A SurveyACM Computing Surveys10.1145/366600156:11(1-40)Online publication date: 8-Jul-2024
    • (2024)RaBitQ: Quantizing High-Dimensional Vectors with a Theoretical Error Bound for Approximate Nearest Neighbor SearchProceedings of the ACM on Management of Data10.1145/36549702:3(1-27)Online publication date: 30-May-2024
    • (2024)Convolution and Cross-Correlation of Count Sketches Enables Fast Cardinality Estimation of Multi-Join QueriesProceedings of the ACM on Management of Data10.1145/36549322:3(1-26)Online publication date: 30-May-2024
    • (2024)Time Series Representation for Visualization in Apache IoTDBProceedings of the ACM on Management of Data10.1145/36392902:1(1-26)Online publication date: 26-Mar-2024
    • (2024)A survey on hybrid transactional and analytical processingThe VLDB Journal10.1007/s00778-024-00858-9Online publication date: 4-Jun-2024
    • (2024)Secure Federated Boolean Count Queries Using Fully-Homomorphic CryptographyResearch in Computational Molecular Biology10.1007/978-1-0716-3989-4_4(54-67)Online publication date: 29-Apr-2024
    • (2023)Modernization of Databases in the Cloud Era: Building Databases that Run Like LegosProceedings of the VLDB Endowment10.14778/3611540.361163916:12(4140-4151)Online publication date: 1-Aug-2023
    • Show More Cited By

    View Options

    View options

    PDF

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader

    Get Access

    Login options

    Full Access

    Media

    Figures

    Other

    Tables

    Share

    Share

    Share this Publication link

    Share on social media