Author: Ke’ao Yang (Software Engineer at PingCAP)

Transcreator: Ran Huang; Editor: Tom Dewan

At the heart of modern software is layered abstraction. In abstraction, each layer hides details that are not relevant to other layers and provides a simple, functioning interface.

However, because abstraction brings the extra overhead of system calls, its simplicity comes at the cost of performance. Thus, for performance-sensitive software such as databases, abstraction might bring unwanted consequences. How can we boost performance for these applications?

In this article, I’ll talk about the pitfalls of abstraction in the modern storage structure and how we tackled the problem…


Authors:
  • Ruoxi Sun (Tech Lead of Analytical Computing Team at PingCAP)
  • Fei Xu (Software Engineer at PingCAP)

Editors: Tom Dewan, Caitin Chen

TiDB is a Hybrid Transactional/Analytical Processing (HTAP) database that can efficiently process analytical queries. However, when large amounts of data are involved, the CPU becomes the bottleneck for processing queries that include JOIN statements or aggregation functions.

At the same time, the GPU is rapidly gaining popularity in areas of scientific computing, AI, data processing, and so on. It outperforms the CPU by orders of magnitude in such areas. …


Authors:
  • Yueyue Zhou (Product Expert at PingCAP)
  • Dan Su (Product Expert at PingCAP)

Transcreator: Ran Huang; Editor: Tom Dewan

In recent years, big data has become not just a buzzword but a growing need for ambitious companies. With skyrocketing data scale and stringent requirements on data freshness, big data scenarios are becoming complicated and multi-dimensional. Thus, many companies use real-time data warehouses to meet their business demands.

But data warehouses are not the only option. An emerging category of databases, Hybrid Transactional/Analytical Processing (HTAP) databases, can serve you just as well as data warehouses, if not better. …

Author: Jun Yu (solution architect at PingCAP)

Transcreator: Ran Huang; Editor: Tom Dewan

The financial industry, especially banking, needs robust systems to support its complicated transactions. As technology has developed, the core banking system has evolved from a traditional centralized system to a distributed, service-oriented architecture (SOA).

As an open-source, distributed SQL database, TiDB is widely adopted by banks, securities firms, insurers, online payment providers, and FinTech companies, and it supports our users in over twenty mission-critical business scenarios.

In this article, I will introduce these critical business scenarios of the financial industry and the pain points they…

Author: Shawn Ma (Tech Lead of the Real-time Analytics team at PingCAP)

Transcreator: Ran Huang

Many companies need to analyze large-scale data in real time. Doing so allows them to identify potential risks, efficiently allocate their resources, and serve their customers quickly. However, the more data you hold in your DBMS, the longer it takes to retrieve and process that data. As you amass more data over time, it becomes harder and harder to process it in real time.

To deliver a better one-stop Hybrid Transactional/Analytical Processing (HTAP) service, TiDB, an open-source, distributed SQL database, has released version 5.0. …

Author: Xuanwo

Transcreator: Caitin Chen; Editor: Tom Dewan

In September 2020, I was honored to participate in the Linux Foundation (LFX) Mentorship Program for TiKV. I worked on the "Coprocessor support ENUM/SET" project to help enable the TiKV Coprocessor to support ENUM and SET calculations. This helped improve TiKV's calculation performance.

In this post, I’ll share with you some background information about my project, how I implemented the project, my lessons learned during community cooperation, and my future plans for this project.

Background information

What is the LFX Mentorship Program?

The LFX Mentorship Programs teach developers — many of whom are first-time open source contributors — to effectively experiment, learn…

Author: Yiwen Chen (Committer of TiDB Operator)

Transcreator: Ran Huang; Editor: Tom Dewan

In my last article, I introduced TiDB Operator’s architecture and what it is capable of. But how does TiDB Operator code run? How does TiDB Operator manage the lifecycle of each component in the TiDB cluster?

In this post, I’ll present Kubernetes’s Operator pattern and how it is implemented in TiDB Operator. More specifically, we’ll go through TiDB Operator’s major control loop, from its entry point to the trigger of the lifecycle management.

From Controller to Operator

Because TiDB Operator learns from the kube-controller-manager, understanding the design of kube-controller-manager helps you…

Author: Yiwen Chen (Committer of TiDB Operator)

Transcreator: Ran Huang; Editor: Tom Dewan

TiDB Operator is an automatic operation system for TiDB in Kubernetes. As a Kubernetes Operator tailored for TiDB, it is widely used by TiDB users to manage their clusters throughout the entire lifecycle, and it has an active developer community.

However, both Kubernetes and TiDB Operator are rather complex, so contributing to TiDB Operator requires some effort and preparation. To help our community grasp this essential knowledge, we’re creating a series of articles that walk you through the TiDB Operator source code. …

Industry: Logistics


Authors:
  • Shaun Chong (Co-founder and CTO at Ninja Van)
  • Mengnan Gong (Sr. Software Engineer at Ninja Van)

Editors: Tom Dewan, Caitin Chen

Ninja Van is Southeast Asia’s fastest-growing last-mile logistics company. We are now in six countries in Southeast Asia. Our customers include Amazon, Shopee, Lazada, Lineman, GradExpress, Zilingo, Tokopedia, and Sendo.

We deliver upwards of 1.5 million parcels a day. As our data size rapidly grew, our databases faced great pressure, and we had significant issues in ProxySQL, sharding, and Galera. After we compared Vitess, CockroachDB (CRDB), and TiDB, an open-source, MySQL-compatible, distributed SQL database, we found…

Authors: Jinpeng Zhang (TiKV maintainer), Bokang Zhang (TiKV committer)

Transcreator: Ran Huang; Editor: Tom Dewan

In the cloud, databases are often deployed across more than one availability zone (AZ). That way, if one AZ fails, the database service can continue uninterrupted. However, cloud service providers charge more for cross-AZ traffic, so clusters with heavy traffic can become very expensive. In extreme cases, these charges can make up 30% to 40% of the total hardware cost.

Can we do something to lower this expense? Now we can. At TiDB Hackathon 2020, our team focused on reducing cross-AZ traffic for…


PingCAP is the team behind TiDB, an open-source, MySQL-compatible NewSQL HTAP database.
